
Veda Anand - Sr Big Data Engineer Cloudera, HWX, MapR, AWS, Azure and GCP Consultant
[email protected]
Location: Whitmore Lake, Michigan, USA
Relocation: No
Visa: GC
RAVIKANTH MACHAPUR
Sr Big Data Engineer
Email: [email protected]

Cloudera, HWX, MapR, AWS, Azure and GCP Consultant

PROFESSIONAL SUMMARY:
Over 16 years of experience in the design, development, and implementation of software applications and BI/DWH solutions. Experienced in data discovery and advanced analytics and in building business solutions, with knowledge of developing strategic approaches for deploying Big Data solutions in both cloud and on-premises environments to efficiently meet Big Data processing requirements.
Built advanced analytics applications on different ecosystems: MapR, Cloudera, HWX, GCP, Azure, and AWS.
Strong understanding of distributed systems, RDBMS, large- and small-scale non-relational data stores, map-reduce systems, database performance, data modeling, and multi-terabyte data warehouses.
Extensively used Hadoop open-source tools such as Hive, HBase, Sqoop, and Spark for ETL on Hadoop clusters.
Worked with different clients across the health care insurance domain (BCBS, KP, Molina Healthcare).
Worked with several data integration and replication tools such as Informatica BDM, SAP BODS, and Attunity Replicate.
Strong knowledge of system development life cycles and project management for BI implementations.
Extensively used RDBMS like Oracle and SQL Server for developing different applications.
Built several data lakes to help clients perform advanced analysis on big data.
Worked with data science teams to provide and feed data for AI, ML, and deep learning projects.
Real-time experience with the Hadoop Distributed File System, the Hadoop framework, and parallel-processing implementations (MapR, AWS EMR, Cloudera), with hands-on experience in HDFS, MapReduce, Pig/Hive, HBase, YARN, Sqoop, Spark, Java, RDBMS, Linux/Unix shell scripting, and Linux internals.
Experience writing UDFs and MapReduce programs in Java for Hive and Pig.
Procedural knowledge of cleansing and analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
Created Kafka data pipelines for the Google Ads platform to consume the latest customer profiles.
Experience in data visualization using the Oracle Big Data Discovery tool and IBM Cognos.
Experience in Object Oriented Analysis Design (OOAD) and development of software using
UML Methodology, good knowledge of J2EE design patterns and Core Java design patterns.
Experience in creating scripts and macros using Microsoft Visual Studio to automate tasks.
Experience in working with GitHub Repository.
Experienced in designing secure software systems that enforce authentication, authorization, confidentiality, data integrity, accountability, availability, and non-repudiation.
Have experience in web designing, web hosting and DNS configurations.
Other Experiences:
Have experience working with web design tools such as Adobe Dreamweaver CC, WordPress, and Joomla.
Proficient in Manual, Functional and Automation testing.
Also experienced in smoke, integration, regression, functional, front-end, and back-end testing.
Capable of developing/writing test plans, test cases, and test scripts based on user requirements and SAD documentation.
Highly experienced in writing test cases and executing in HP Interactive Testing Tools: Quality
Center, Quick Test Professional (QTP).


Technical Skills:
Reporting Tools: Tableau 8.1
Big Data Ecosystem: HDFS, MapReduce, Oozie, Hive, Pig, Sqoop, Flume, ZooKeeper, HBase, CAWA, Spark, Spark SQL, Impala, MapR-DB, Azure, VOCI, Oracle Big Data Discovery, Kafka, NiFi
Hadoop Ecosystems: MapR, Cloudera, AWS EMR, HortonWorks.
Servers: Application Servers (WAS, Tomcat), Web Servers (IIS6, 7, IHS).
Operating Systems: Windows 2003 Enterprise Server, XP, 2000, UNIX, Red Hat Enterprise
Linux Server release 6.7
Databases: SQL Server 2005, SQL Server 2008, Oracle 9i/10g, DB2, MS Access 2003, Teradata, PostgreSQL, MySQL, MS SQL
Languages: C, C++, Java, XML, JSP/Servlets, Struts, Spring, HTML, Python, PHP, JavaScript, jQuery, Web services, Scala
Data Modeling: Star-Schema and Snowflake-schema.
ETL Tools: Knowledge of Informatica, IBM DataStage 8.1, SSIS
EDUCATION:
Title of the Degree with Branch | College/University | Year of Passing
Master of Science (Computer Science) | California State University, Long Beach, CA, USA | 2015
Bachelor of Engineering (Information Technology) | Vasavi College of Engineering/Osmania University, Andhra Pradesh, India | 2011
Intermediate, Board of Intermediate Education, Andhra Pradesh, India | Sri Chaitanya Junior Kalasala, ECIL, Telangana, India | 2007
Secondary, Board of Secondary Education, Andhra Pradesh, India | St. Ann's Grammar High School, Malkajgiri, Hyderabad, Telangana, India | 2005

TRAININGS:
Azure Event Hub (Big Bets)
Semantic Modeling
Cognitive Services
HDInsight
Digital & IoT, Advanced SPARQL
WORK EXPERIENCE:
Fandango, Beverly Hills, CA    March 2023 - Present
Sr Big Data Architect/Engineer
Client Description:
Fandango Media, LLC is an American ticketing company that sells movie tickets via their website
as well as through their mobile app.


Roles and Responsibilities:
Designed, planned, implemented, and owned Sales and Finance data pipelines into the data lake (S3) and data warehouse (Redshift).
Processed data using EMR, AWS Lambda, Step Functions, Talend jobs, and Hadoop jobs.
Participated in the design, architecture, and implementation of CCPA compliance on S3 and Redshift.
Wrote Java code for pipelines ingesting data from different sources into S3 and Redshift.
Designed multiple APIs to orchestrate microservices using AWS API Gateway and Python Flask, and deployed Docker containers on AWS EKS.
Designed and developed dynamic ETL/analytical data pipelines using Kubernetes on AWS and Python Flask to meet real-time client needs.
Contributed to the DBG Optimization client project, which involved migrating to a new tech stack including AWS, Snowflake, Kubernetes, Docker, and Bitbucket.
Led the migration of existing pipelines from AWS Glue/Lambda and step functions to GCP
using GCP cloud functions. Utilized Airflow to create DAGs and interconnect different GCP
functions.
Orchestrated the migration of 780 tables from Teradata to Snowflake using AWS Glue, AWS
Lambda, and AWS step functions. Leveraged the in-built orchestration tool to streamline the
process.
Designed, created, and implemented various AWS Lambda functions with SNS and SQS to send emails and to subscribe to and trigger events, improving the efficiency of data processing workflows.
Developed AWS Glue jobs to submit SQL scripts to Snowflake, ensuring smooth data
integration between different systems.
Created EXECUTE IMMEDIATE scripts to optimize SQL code and control the flow of logic.
Successfully migrated over 150 BTEQ scripts to Snowflake scripts, ensuring data continuity and
accuracy.
Collaborated with SRE to create different projects on Bitbucket and designed CI/CD pipelines
on Bamboo, which improved the speed and quality of code deployment.
Worked with an offshore team of 15 to coordinate issues and manage the development
progress, ensuring timely delivery of projects.
Loaded, processed, and analyzed Adobe Omniture clickstream data from different devices for both the Fandango and Vudu data pipelines.
Worked on several data pipelines using AWS Lambda, AWS Glue, AWS Step Functions, AWS Glue DataBrew, and Apache Iceberg.
Monitored and debugged cloud data pipelines using job runs in Glue, Lambda, and CloudWatch logs.
Single-handedly managed ETL pipelines for multiple lines of business, including Rotten Tomatoes, Fandango, and Vudu.
Moved existing legacy ETL pipelines to the cloud.
Loaded third-party data by calling different API endpoints, building a single source of raw data for downstream reporting and applications.
Deployed code to dev, int, and prod environments using CI/CD pipelines built in Jenkins.
Collaborated with different teams to communicate, negotiate, and implement end-to-end solutions.
Developed several AWS Lambda and AWS Glue jobs in Python and deployed them using Terraform and CloudFormation.
Used EMR for heavy batch-processing workloads.
Used Hive, PySpark, Oozie, Kafka, and Python to orchestrate jobs on EMR.
Managed code repositories in Bitbucket. Managed and supported about 43 data pipelines on JAWS, AWS, Lambda, and Talend.


Supported the BI team with analytics reports.
Extensively used Postman and Insomnia for API testing.
Worked with DevOps to build Jenkins pipelines and deploy CI/CD pipelines.
Environment: AWS Redshift, S3, Java Spring, PostgreSQL, MS SQL, Python, AWS EMR, GitHub, Jenkins, Veracode, Scala, Talend, JAWS, GCP, Snowflake, Bamboo
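The event-driven ingestion described above (S3 objects landing and triggering downstream Lambda/Step Functions work) can be sketched as a minimal Lambda handler. This is an illustrative sketch, not the production code: the prefixes and step names in PIPELINE_ROUTES are hypothetical.

```python
import json
import urllib.parse

# Hypothetical routing table: S3 key prefix -> downstream pipeline step.
PIPELINE_ROUTES = {
    "sales/": "sales-to-redshift",
    "finance/": "finance-to-redshift",
}

def handler(event, context=None):
    """Route S3 ObjectCreated events to the matching pipeline step."""
    dispatched = []
    for record in event.get("Records", []):
        # S3 event keys are URL-encoded; decode before matching.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        for prefix, step in PIPELINE_ROUTES.items():
            if key.startswith(prefix):
                dispatched.append({"key": key, "step": step})
                break
    return {"statusCode": 200, "body": json.dumps(dispatched)}
```

In a real deployment the handler would start the matching Step Functions execution or Glue job; here it only returns the routing decision so the logic is testable in isolation.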
Deloitte Contingent Worker
Carelon (Anthem), Remote    Oct 2021 - March 2023
Sr Big Data Cloud Engineer
Client Description:
Carelon (Elevance Health-Anthem) is an American health insurance provider. The company's
services include medical, pharmaceutical, dental, behavioral health, long-term care, and disability
plans through affiliated companies such as Anthem Blue Cross and Blue Shield, Empire BlueCross
BlueShield in New York State, Anthem Blue Cross in California, Wellpoint, and Carelon. It is the
largest for-profit managed health care company in the Blue Cross Blue Shield Association.
Roles and Responsibilities:
Designed, planned, implemented, and owned Sales and Finance data pipelines into the data lake (S3) and data warehouse (Redshift).
Processed data using EMR, AWS Lambda, Step Functions, Talend jobs, and Hadoop jobs.
Participated in the design, architecture, and implementation of CCPA compliance on S3 and Redshift.
Wrote Java code for pipelines ingesting data from different sources into S3 and Redshift.
Designed multiple APIs to orchestrate microservices using AWS API Gateway and Python Flask, and deployed Docker containers on AWS EKS.
Designed and developed dynamic ETL/analytical data pipelines using Kubernetes on AWS and Python Flask to meet real-time client needs.
Contributed to the DBG Optimization client project, which involved migrating to a new tech stack including AWS, Snowflake, Kubernetes, Docker, and Bitbucket.
Led the migration of existing pipelines from AWS Glue/Lambda and step functions to GCP
using GCP cloud functions. Utilized Airflow to create DAGs and interconnect different GCP
functions.
Orchestrated the migration of 780 tables from Teradata to Snowflake using AWS Glue, AWS
Lambda, and AWS step functions. Leveraged the in-built orchestration tool to streamline the
process.
Designed and developed interactive dashboards and reports using Oracle Orbit Analytics to
provide business insights.
Integrated data sources from Oracle databases to create customized visualizations and KPI
dashboards.
Created and maintained metadata models to support ad-hoc reporting and analysis.
Optimized performance of Orbit Analytics queries and dashboards for better response time and
scalability.
Implemented role-based access controls (RBAC) to secure data and reports.


Developed and maintained Oracle Discoverer workbooks, reports, and dashboards to support
business intelligence needs.
Designed and optimized queries for performance improvement and faster report generation.
Worked with Oracle E-Business Suite (EBS) to extract operational and financial data for
reporting.
Managed end-user access, permissions, and security settings for Discoverer reports.
Provided troubleshooting and debugging support for Discoverer reports and workbooks.
Assisted in migrating reports from Oracle Discoverer to modern BI tools due to deprecation.
Designed, created, and implemented various AWS Lambda functions with SNS and SQS to send emails and to subscribe to and trigger events, improving the efficiency of data processing workflows.
Developed AWS Glue jobs to submit SQL scripts to Snowflake, ensuring smooth data
integration between different systems.
Created EXECUTE IMMEDIATE scripts to optimize SQL code and control the flow of logic.
Successfully migrated over 150 BTEQ scripts to Snowflake scripts, ensuring data continuity
and accuracy.
Collaborated with SRE to create different projects on Bitbucket and designed CI/CD pipelines
on Bamboo, which improved the speed and quality of code deployment.
Worked with an offshore team of 15 to coordinate issues and manage the development
progress, ensuring timely delivery of projects.
Deployed more than 100 data pipelines using Talend ETL tool.
Loaded, processed, and analyzed Adobe Omniture clickstream data from different devices for both the Fandango and Vudu data pipelines.
Worked on several data pipelines using AWS Lambda, AWS Glue, AWS Step Functions, AWS Glue DataBrew, and Apache Iceberg.
Monitored and debugged cloud data pipelines using job runs in Glue, Lambda, and CloudWatch logs.
Single-handedly managed ETL pipelines for multiple lines of business, including Rotten Tomatoes, Fandango, and Vudu.
Moved existing legacy ETL pipelines to the cloud.
Loaded third-party data by calling different API endpoints, building a single source of raw data for downstream reporting and applications.
Deployed code to dev, int, and prod environments using CI/CD pipelines built in Jenkins.
Collaborated with different teams to communicate, negotiate, and implement end-to-end solutions.
Developed several AWS Lambda and AWS Glue jobs in Python and deployed them using Terraform and CloudFormation.
Used EMR for heavy batch-processing workloads.
Used Hive, PySpark, Oozie, Kafka, and Python to orchestrate jobs on EMR.
Managed code repositories in Bitbucket. Managed and supported about 43 data pipelines on JAWS, AWS, Lambda, and Talend.
Supported the BI team with analytics reports.
Extensively used Postman and Insomnia for API testing.
Worked with DevOps to build Jenkins pipelines and deploy CI/CD pipelines.


Environment: AWS, S3, Python, Teradata, PostgreSQL, GCP, AWS EMR, Snowflake, Bamboo, Glue, AWS Lambda, PySpark, GCP Cloud Functions, GCS, Airflow
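The 780-table Teradata-to-Snowflake migration described above has to be fanned out in manageable batches by the orchestrator (Step Functions in this case). A minimal sketch of that batching step, with an illustrative wave size of 50:

```python
def migration_waves(tables, wave_size=50):
    """Yield lists of at most wave_size table names, preserving order.

    Each wave could be handed to one Step Functions map iteration or
    one Glue job run; wave_size here is an assumption for illustration.
    """
    for i in range(0, len(tables), wave_size):
        yield tables[i:i + wave_size]
```

With 780 tables and waves of 50, this produces 16 waves, the last one partial, so failures can be retried per wave instead of restarting the whole migration.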
Bitwise
Beverly Hills, CA    June 2019 - Oct 2021
Sr Big Data Architect/Engineer
Client Description:
Fandango Media, LLC is an American ticketing company that sells movie tickets via their website
as well as through their mobile app.
Roles and Responsibilities:
Designed, planned, implemented, and owned Sales and Finance data pipelines into the data lake (S3) and data warehouse (Redshift).
Processed data using EMR, AWS Lambda, Step Functions, Talend jobs, and Hadoop jobs.
Participated in the design, architecture, and implementation of CCPA compliance on S3 and Redshift.
Wrote Java code for pipelines ingesting data from different sources into S3 and Redshift.
Loaded, processed, and analyzed Adobe Omniture clickstream data from different devices for both the Fandango and Vudu data pipelines.
Worked on several data pipelines using AWS Lambda, AWS Glue, AWS Step Functions, and AWS Glue DataBrew.
Monitored and debugged cloud data pipelines using job runs in Glue, Lambda, and CloudWatch logs.
Single-handedly managed ETL pipelines for multiple lines of business, including Rotten Tomatoes, Fandango, and Vudu.
Moved existing legacy ETL pipelines to the cloud.
Loaded third-party data by calling different API endpoints, building a single source of raw data for downstream reporting and applications.
Deployed code to dev, int, and prod environments using CI/CD pipelines built in Jenkins.
Collaborated with different teams to communicate, negotiate, and implement end-to-end solutions.
Developed several AWS Lambda and AWS Glue jobs in Python and deployed them using Terraform and CloudFormation.
Used EMR for heavy batch-processing workloads.
Used Hive, PySpark, Oozie, Kafka, and Python to orchestrate jobs on EMR.
Managed code repositories in Bitbucket. Managed and supported about 43 data pipelines on JAWS, AWS, Lambda, and Talend.
Supported the BI team with analytics reports.
Extensively used Postman and Insomnia for API testing.
Environment: AWS Redshift, S3, Java Spring, PostgreSQL, MS SQL, Python, AWS EMR, GitHub, Jenkins, Veracode, Scala, Talend, JAWS
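Monitoring and debugging dozens of pipelines via Glue job runs and CloudWatch logs, as described above, usually starts with triaging run states. A small stdlib sketch of that triage step; the run records are hypothetical, shaped loosely like Glue's JobRun output:

```python
from collections import Counter

def summarize_runs(runs):
    """Summarize pipeline job runs.

    Returns (state counts, names of failed runs) so an on-call engineer
    can see at a glance which of the ~43 pipelines need attention.
    """
    counts = Counter(run["JobRunState"] for run in runs)
    failed = [run["JobName"] for run in runs if run["JobRunState"] == "FAILED"]
    return counts, failed
```

In practice the run records would come from a paginated `get_job_runs` style API call; only the summarization logic is shown here.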
Molina Health Insurance, Long Beach, CA    Nov 2018 - May 2019
Sr Big Data Developer / Independent Consultant


Client Description:
Molina Healthcare, a FORTUNE 500, multi-state health care organization, arranges for the delivery
of health care services and offers health information management solutions to nearly five million
individuals and families who receive their care through Medicaid, Medicare and other government-
funded programs in fifteen states.
Project Description:
Optum Data Exchange
HCG Grouper Pipeline
Med Insights Pipeline Executive Dashboard
Roles and Responsibilities:
Helped client to understand performance issues on the cluster by analyzing the
Cloudera stats.
Designed and implemented ETL pipelines using Azure Data Factory for data
integration across diverse sources.
Developed and maintained robust data workflows in Azure Data Factory to
ensure seamless data flow and transformation.
Utilized Azure Data Factory's mapping data flows for complex data transformations, ensuring data accuracy and consistency.
Integrated Azure Data Factory with various data storage solutions, including
Azure Blob Storage, Azure SQL Database, and Data Lake Storage.
Scheduled and monitored pipeline activities in Azure Data Factory to ensure
timely data processing and availability.
Implemented Azure Data Factory's Linked Services and Datasets to streamline data connections and dataset definitions.
Created custom activities in Azure Data Factory using Azure Functions and
Databricks for specialized data processing tasks.
Ensured data security and compliance by implementing Azure Data Factory's access controls and data encryption features.
Automated deployment and versioning of Azure Data Factory pipelines using
Azure DevOps CI/CD pipelines.
Developed data models and transformations in dbt to standardize and optimize
data structures for analytics.
Created reusable macros and Jinja templates in dbt to enhance productivity
and maintain consistency across projects.
Implemented rigorous data testing and validation strategies in dbt to ensure
data quality and reliability.
Collaborated with data analysts and engineers to design and build scalable dbt
models that support business intelligence needs.
Optimized SQL queries and transformations in dbt for performance and
efficiency, reducing data processing times.
Leveraged dbt documentation and lineage features to provide clear and
comprehensive data model documentation for stakeholders.


Designed and implemented Optum Data Extracts and HCG Grouper Extracts.
Improved memory and time performances for several existing pipelines.
Improved Solr Data Ingestion, data quality for Medley Pipeline.
Owned Member Sphere, Mosaic, designed and developed Optum and HCG
pipelines.
Built pipelines using Scala, Spark, Spark SQL, Hive, and HBase.
Loaded processed data into different consumption points such as Apache Solr, HBase, and AtScale cubes for visualization and search.
Automated the workflow using Talend Big Data.
Scheduled jobs using Autosys.
Used Bash shell scripting, Sqoop, Avro, Hive, HDP, Redshift, Pig, Java, and MapReduce daily to develop ETL, batch processing, and data storage functionality.
Environment: Attunity, Oracle SQL, Cloudera, Spark, Talend workload automation, Jenkins, Git
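The "rigorous data testing and validation strategies in dbt" described above boil down to schema tests such as not_null and unique. A pure-Python illustration of what those two checks verify (the column name and rows are hypothetical; in dbt these would be declared in a schema.yml and compiled to SQL):

```python
def not_null(rows, column):
    """Return the rows where the column is missing or None (dbt: not_null)."""
    return [r for r in rows if r.get(column) is None]

def unique(rows, column):
    """Return values that appear more than once in the column (dbt: unique)."""
    seen, dupes = set(), set()
    for r in rows:
        value = r.get(column)
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return sorted(dupes, key=repr)
```

A test passes when the returned list is empty; anything returned is a failing record, which mirrors how dbt reports test failures as offending rows.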
Cognizant Technology Solutions    Sep 2017 - Oct 2018
AAA Auto Club of Southern California, Costa Mesa, CA
Sr Big Data Consultant / Digital Transformation (Cloudera)
Client Description:
The Automobile Club of Southern California is the Southern California affiliate of the American Automobile Association (AAA) federation of motor clubs. The Auto Club was founded in 1900 in Los Angeles as one of the nation's first motor clubs dedicated to improving roads, proposing traffic laws, and improving overall driving conditions.
Project Description:
HortonWorks to Cloudera Migration.
Teradata Performance Optimization.
Digital Integration: Google AdWords API.
Undisputed Leader Call Forecast, ETR Forecast
Attunity Replicate Solution Design
Speech Analytics
Sqoop Mainframe DB2, VSAM Solution design.
Roles and Responsibilities:
Responsible for moving all the production jobs from HortonWorks to Cloudera.
Led a team of 2 onsite and 4 offshore engineers.
Improved the performance of Teradata queries wherever needed during migration.
Built a Java API to automate Google AdWords campaigns.
Maintained weekly cube refreshes for TM1 to populate the latest data from the PROD environment.
Built models containing query subjects, query items, and namespaces from imported metadata.
Created ad-hoc reports using Query Studio.
Fine-tuned and enhanced queries to improve the performance of reports and models.
Able to work under stringent deadlines with teams as well as independently.
Led the Undisputed Leader project.


Created Kafka streaming for the Google Ads platform to stream real-time customer-profile changes to HBase, enabling the Google Ads app to target customers based on the latest profile.
Worked on data pipelines to perform transformations on Teradata.
Environment: SAP, Teradata 12.0, Hortonworks, Cloudera, IBM mainframe, Oracle DB, SQL DB, Control-M workload automation
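Streaming customer-profile changes to HBase keyed by customer, as described above, depends on producing Kafka messages with a stable key so consumers always converge on the latest profile per customer. A sketch of that serialization step; field names are illustrative, and a real producer (e.g. a Kafka client library) would send these bytes to the topic:

```python
import json

def profile_event(customer_id, changes):
    """Build (key, value) bytes for a keyed profile-update topic.

    Keying by customer_id means a log-compacted topic retains only the
    latest profile per customer; sort_keys keeps the payload deterministic.
    """
    key = str(customer_id).encode("utf-8")
    value = json.dumps(
        {"customer_id": customer_id, "changes": changes},
        sort_keys=True,
    ).encode("utf-8")
    return key, value
```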
Cognizant Technology Solutions    Mar 2017 - Sep 2017
Puget Sound Energy, Bellevue, WA
Sr Cloud Architect (AWS EMR)/BI Lead
Client Description:
Puget Sound Energy (PSE) is a Washington state energy utility providing electrical power
and natural gas primarily in the Puget Sound region of the northwest United States. The utility
serves electricity to more than 1.1 million customers in Island, King, Kitsap, Kittitas, Pierce, Skagit,
Thurston, and Whatcom counties; and provides natural gas to 750,000 customers in King, Kittitas,
Lewis, Pierce, Snohomish and Thurston counties. The company has a 6,000-square-mile (16,000
km2) electric and natural gas service area. PSE owns coal, hydroelectric, natural gas and wind
power-generating facilities, with more than 2,900 MW of capacity. Roughly one-third each of PSE
generation comes from coal, hydroelectric, and natural gas facilities, with a small remainder coming
from wind and energy efficiency programs.
Project Description:
PSE has embarked on a program called Get to Zero (GTZ), which will act as the foundational layer for
bringing about Digital transformation and enhanced customer centricity in the organization. The solution is
envisioned to integrate people, processes, and technology in Customer Service, Operations, Supply Chain,
Energy Efficiency, Workforce Management, and all support organizations. As part of this program, PSE is looking for the right specialized partner for creating a culture where data is treated as a corporate asset, business decisions are data-driven, and analytics are used to make better-informed decisions.
Roles and Responsibilities:
Worked with the business to define, identify, and implement quick wins for the Get to Zero (GTZ) program that deliver incremental value, in collaboration with PSE.
Gathered and documented detailed business requirements to identify and prioritize quick wins.
Engaged with the PSE team to determine the exact scope of quick wins to be delivered.
Assessed and requested any infrastructure and environments required to implement the prioritized quick wins.
Worked with the business on requirements gathering and prepared the functional requirements document. Analyzed the requirements and provided project estimates based on the business request.
Designed cloud architecture on AWS; spun up clusters for developers during data processing, cleaning, and analysis.
Worked on the data model, technical design, and implementation for Hive ETL and Big Data Hadoop projects.
Took part in critical modeling, design, software development, and code reviews to support decision-making and maintain quality standards.
Worked with Infrastructure teams DBA, SAP BW, Middleware, and UNIX in setting up
environments during different levels of software lifecycle.
Worked on performance tuning of the Big Data components to meet SLAs critical to the customer.


Installed and configured tools such as Jupyter Notebook, Redshift, Python libraries, and Spark.
Prepared data for consumption in the Tableau visualization layer.
Developed an AWS Data Pipeline with SNS to automate the dunning process in the cloud.
Managed onsite and offshore teams, assigning tasks and report development work.
Environment: HP Quality Center 10.2.0 bug-tracking tool, Teradata 12.0, SAP BW, AWS EMR, Tableau, Apache Zeppelin, Jupyter Notebook, Anaconda, SAP, SAP BODS
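The automated dunning process mentioned above starts with a selection step: pick accounts whose balance is past due beyond a grace period, then notify via SNS. A stdlib sketch of that selection logic only; the grace period, field names, and sample accounts are all hypothetical:

```python
from datetime import date, timedelta

def accounts_to_dun(accounts, today, grace_days=30):
    """Return ids of accounts with an outstanding balance overdue by
    more than grace_days; these would then be published to SNS."""
    cutoff = today - timedelta(days=grace_days)
    return [
        a["id"]
        for a in accounts
        if a["balance"] > 0 and a["due_date"] < cutoff
    ]
```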
Cognizant Technology Solutions
Schneider National, Green Bay, WI    Mar 2016 - Apr 2016
Sr Big Data Advanced Analytics Consultant
Client Description:
Schneider National Inc. is a leading provider of premium truckload and intermodal services with about 60 years of transportation industry experience. It provides expert transportation solutions and is the number-one carrier in dry freight, industrial glass, and bulk motor carriage. Schneider is one of the largest truckload carriers in North America, hauling 16,275 loads per day with 11,300 company drivers, 9,600 company trucks, and 31,000 trailers on the road. The company conducts business worldwide with 168 facilities, including a presence in the United States, Canada, Mexico, and China. Schneider's customers include more than two-thirds of the FORTUNE 500 companies.
Project Description:
Schneider decided to move to the MapR ecosystem to continue its data analytics and visualization work. To process the huge volume of data generated daily, it needed a distributed platform such as Hadoop. Schneider initially started with a single-node Cloudera setup and ran a few POCs. To establish a robust environment, Schneider purchased services from MapR to be deployed on its systems and moved the existing projects from Cloudera to MapR. Worked on turndown analytics, sentiment analysis, structured content extraction, voice-to-text analytics, and image-to-text analytics.
Responsibilities:
Worked collaboratively with MapR vendor and client to manage and build out of large data
clusters.
Helped design big data clusters and administered them.
Worked both independently and as an integral part of the development team.
Communicated all issues and participated in weekly strategy meetings.
Administered back end services and databases in the virtual environment.
Ran several benchmark tests on Hadoop SQL engines (Hive, Spark SQL, Impala) and on different data formats (Avro, SequenceFile, Parquet) using different compression codecs such as Gzip and Snappy.
Worked on extracting text from Emails, Images and voice and created data pipelines.
Worked on sentiment analysis and structured content programs for creating text analytics app.
Created and Implemented applications on Oracle Big Data Discovery for Data visualization,
Dashboard and Reports.
Implemented system wide monitoring and alerts.
Installed and configured Hive, Impala, Oracle Big Data Discovery, Hue, Apache Spark, Tika, Tesseract, Sqoop, Spark SQL, etc.
Importing and exporting data into MapRFS and Hive using Sqoop.
Used Bash shell scripting, Sqoop, Avro, Hive, Impala, HDP, Pig, Java, and MapReduce daily to develop ETL, batch processing, and data storage functionality.


Responsible for developing data pipeline using Flume, Sqoop and Pig to extract the data from
weblogs and store in HDFS.
Worked on loading all tables from the reference source database schema through Sqoop.
Designed, coded, and configured server-side J2EE components such as JSP, AWS, and Java.
Collected data from different databases (Oracle, MySQL) into Hadoop. Used CA Workload Automation for workflow scheduling and monitoring.
Worked on designing and developing ETL workflows in Java for processing data in MapRFS/HBase using Oozie.
Experienced in managing and reviewing Hadoop log files. Involved in moving all log files
generated from various sources to HDFS for further processing through Flume.
Involved in loading and transforming large sets of structured, semi structured and unstructured
data from relational databases into HDFS using Sqoop imports.
Developed Sqoop scripts to import/export data from relational sources (Teradata) and handled incremental loading of customer and transaction data by date.
Developed simple and complex MapReduce programs in Java for Data Analysis on different
data formats.
Optimized MapReduce Jobs to use HDFS efficiently by using various compression
mechanisms.
Worked on partitioning Hive tables and running the scripts in parallel to reduce script run time. Worked on data serialization formats for converting complex objects into sequences of bits using Avro, Parquet, JSON, and CSV formats.
Responsible for analyzing and cleansing raw data by running Hive queries and Pig scripts. Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
Implemented business logic by writing Pig UDFs in Java, and used various UDFs from Piggybank and other sources.
Environment: MapR ecosystem, ODI, Oracle Endeca, Oracle Big Data Discovery, CA Workload Automation
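The format-and-codec benchmarks described above come down to measuring compressed size (and time) against raw size for each combination. A tiny stdlib illustration of the measurement idea using gzip; the real benchmarks covered Avro/SequenceFile/Parquet with Gzip and Snappy on the cluster, which this does not reproduce:

```python
import gzip

def compression_ratio(payload: bytes) -> float:
    """Return compressed_size / raw_size for gzip at its default level.

    Lower is better; highly repetitive data compresses far below 1.0,
    which is the effect the cluster-level benchmarks quantified per codec.
    """
    return len(gzip.compress(payload)) / len(payload)
```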

SilverSpur Corporation, Cerritos, CA    Jan 2015 - Oct 2015
Sr Architect/Big Data Consultant
Client Description:
Silver Spur Corporation is a packaging supply company that was founded in Cerritos, California in
1978. Since inception, we have been best known for our amber glass bottles, however today, with
access to more than 45 furnaces, we can accommodate orders of all shapes, sizes, and colors at
large volumes year-round. This enables us to serve many different industries including
nutraceutical, pharmaceutical, food & beverage, cosmetic, and wine, beer, and liquor. We assure
that both our custom and stock items are manufactured to the highest quality standards and are
regularly available in Amber, Green, Flint, and Cobalt Blue.
Responsibilities:
Developed MapReduce jobs in java for data cleaning and preprocessing.
Importing and exporting data into HDFS and Hive using Sqoop.
Responsible for developing data pipeline using Flume, Sqoop and Pig to extract the data from
weblogs and store in HDFS.
Worked on loading all tables from the reference source database schema through Sqoop.
Collected data from different databases (Oracle, MySQL) into Hadoop.
Used Oozie and ZooKeeper for workflow scheduling and monitoring.


Worked on Designing and Developing ETL Workflows using Java for processing data in
HDFS/Hbase using Oozie.
Experienced in managing and reviewing Hadoop log files.
Involved in moving all log files generated from various sources to HDFS for further processing
through Flume.
Involved in loading and transforming large sets of structured, semi-structured and unstructured
data from relational databases into HDFS using Sqoop imports.
Developed Sqoop scripts to import/export data from relational sources and handled incremental
loads of customer and transaction data by date.
Developed simple and complex MapReduce programs in Java for Data Analysis on different
data formats.
Optimized MapReduce Jobs to use HDFS efficiently by using various compression
mechanisms.
Worked on partitioning Hive tables and running the scripts in parallel to reduce their run-time.
Worked on data serialization formats, converting complex objects into byte sequences using
AVRO, PARQUET, JSON and CSV formats.
Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig
scripts on data.
Used Oozie operational services for batch processing and scheduling workflows dynamically.
Extensively worked on creating end-to-end data pipeline orchestration using Oozie.
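The MapReduce bullets above (hand-written Java jobs for data analysis) can be illustrated with a plain-Java sketch of the classic word-count pattern. This is an illustrative example, not project code: on the cluster Hadoop handles the map/shuffle/reduce split across nodes, while here both phases are collapsed into a single in-memory pass, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the word-count logic behind a simple MapReduce job:
// the "map" phase emits (word, 1) pairs and the "reduce" phase sums the
// counts per key. On Hadoop the framework shuffles pairs to reducers by key.
public class WordCountSketch {

    public static Map<String, Integer> wordCount(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) continue;           // skip empty splits
            counts.merge(token, 1, Integer::sum);    // reduce step: sum per key
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(wordCount("the quick fox jumps over the lazy dog the end"));
    }
}
```

On a real cluster the same per-key summation would run in a `Reducer`, with compression of the intermediate and final output configured on the job, as the compression bullet above describes.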
California State University, Long Beach CA. Mar 2014 - Jan 2015
Graduate Assistant (under: Prof. Mehrdad Aliasgari)
Responsibilities:
Was part of developing a website for purchasing and managing parking permits for students and
employees.
Implemented secure online payment over HTTPS using OpenSSL.
Direct access and validation from the SKIDATA parking machines to the internal database.
Worked on NetBeans IDE 8.0.1 for implementing project in JAVA.
Implemented strong encryption using the AES algorithm (AES/CBC/PKCS7 padding) between the
parking machines and the database.
Implemented hybrid encryption using the AES and ElGamal algorithms between the internal server
and the payment gateway.
Achieved message integrity using cryptographic hash functions (HMAC-SHA256).
Deployed 2 separate servers to handle parking system and internal access to database using
Apache Tomcat and Glassfish server respectively.
Created an intermediate database to store transactions using MySQL Workbench 6.2 CE.
Used Hibernate to connect the internal database and the server.
Created digital certificates for the website using the SHA-256 signing algorithm.
Generated the public key using RSA 2048 bits.
Used KeyStore Explorer 5.1 to get the certificates signed by the Certificate Authority.
Implemented password-based encryption in Java using a salt and key derivation.
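The AES encryption and HMAC-SHA256 integrity bullets above can be sketched with stdlib Java (`javax.crypto`). This is an illustrative sketch, not the project's actual code: the class name, key sizes, and sample message are hypothetical, and the JDK spells the padding `PKCS5Padding`, which for AES's 16-byte blocks is the same scheme as the PKCS7 padding cited above.

```java
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;

// Illustrative sketch: AES/CBC encryption plus an HMAC-SHA256 integrity tag
// (encrypt-then-MAC), as named in the bullets above.
public class PaymentCrypto {

    public static byte[] encrypt(byte[] key, byte[] iv, byte[] plain) throws Exception {
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding"); // PKCS5 == PKCS7 for AES blocks
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return c.doFinal(plain);
    }

    public static byte[] decrypt(byte[] key, byte[] iv, byte[] cipherText) throws Exception {
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return c.doFinal(cipherText);
    }

    // HMAC-SHA256 tag over the ciphertext.
    public static byte[] hmac(byte[] macKey, byte[] data) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(macKey, "HmacSHA256"));
        return mac.doFinal(data);
    }

    public static boolean verify(byte[] macKey, byte[] data, byte[] tag) throws Exception {
        // Constant-time comparison to avoid timing side channels.
        return MessageDigest.isEqual(hmac(macKey, data), tag);
    }

    public static void main(String[] args) throws Exception {
        SecureRandom rng = new SecureRandom();
        byte[] key = new byte[16], macKey = new byte[32], iv = new byte[16];
        rng.nextBytes(key); rng.nextBytes(macKey); rng.nextBytes(iv);

        byte[] msg = "permit #1234 paid".getBytes(StandardCharsets.UTF_8);
        byte[] ct  = encrypt(key, iv, msg);
        byte[] tag = hmac(macKey, ct);

        // Receiver: check integrity first, then decrypt.
        if (verify(macKey, ct, tag)) {
            System.out.println(new String(decrypt(key, iv, ct), StandardCharsets.UTF_8));
        }
    }
}
```

A real deployment would derive separate encryption and MAC keys (e.g. via the password-based key derivation mentioned above) and send the IV alongside the ciphertext and tag.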
Blue Cross Blue Shield
Cognizant Technology Solutions, Hyderabad India. Jan 2013 - Nov 2013
Associate Data Engineer

Project # 1: BCBSMN MembersEdge Application, Jan 2013- Apr 2013
Description: As a part of Claims Modernization Program BCBSMN would depend on
MembersEdge as the source of record for billing and billing finance transactions for the Individual
business migration. BCBSMN, for the Claims Modernization Program, has chosen NASCO Model
Office as its test region. NASCO is upgrading its current MembersEdge 2.5 to MembersEdge 3.0,
which requires ensuring that all the functionality covered in 2.5 is also covered in 3.0 for smooth
processing.
Project # 2: BCBSMN Health Reform & State Exchange, Feb 2013 - Nov 2013
Description: As part of the Administrative Simplification provisions of the Affordable Care Act of
2010 (ACA), which build on the Health Insurance Portability and Accountability Act of 1996 (HIPAA),
the state of Minnesota wants to participate in the exchange program, which requires updating the
existing systems to the new business rules.
Responsibilities:
Developed the application using the Struts Framework, which leverages the classical Model View
Controller (MVC) architecture. UML diagrams such as use cases, class diagrams, interaction
diagrams (sequence and collaboration) and activity diagrams were used.
Gathered business requirements and wrote functional specifications and detailed design documents
Extensively used Core Java, Servlets, JSP and XML
Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for
Oracle 9i database.
Implemented an Enterprise Logging Service (ELS) using JMS and Apache CXF.
Developed unit test cases and used JUnit for unit testing of the application.
Implemented Framework Component to consume ELS service.
Involved in designing user screens and validations using HTML, jQuery, Ext JS and JSP as per user
requirements.
Implemented JMS producer and Consumer using Mule ESB.
Wrote SQL queries, stored procedures, and triggers to perform back-end database operations
Designed Low Level design documents for ELS Service.
Worked closely with QA, Business and Architects to resolve defects quickly and meet
deadlines.
Worked on Python scripting.
Worked on different modules in HealthCare like Billing and Finance.
Analyzed business requirements and identified inconsistencies in the initial stages to minimize
cost and time.
Designed Test Plan, Test cases, test scenarios, expected results and prioritizing tests for the whole
project.
Identifying the areas that can be automated and defining the scope of automation.
Designed automation framework, developed test cases, executed and maintained the same.
Executed test cases using automation tools such as Silk Test, Zeenyx and Selenium.
Raised and tracked defects using Borland's StarTeam test management tool.
Led a team of four members in this project, assigning tasks on a daily basis and coordinating
between the team, onsite, and higher management regarding issues faced by the team.
Calculated effort estimation, schedule variance, and deviation reports and metrics for the project.
Worked on Test scenarios for GUI, Functionality, Security, Database and Regression Testing.
Executed the test cases and compared the expected results with actual results.
Strong command of database restoration and backup maintenance on SQL Server 2000/2005.
Expertise on running database scripts, writing SQL queries, conducting test case reviews with onsite
and exclusively with clients.
Tested RESTful web services.

Kaiser Permanente
Cognizant Technology Solutions, Hyderabad India. Jun 2011 - Dec 2012
Associate Data Analyst
Project # 1: MON/ROC Implementation, Jun 2011 - Dec 2012
Description: Kaiser Permanente has undertaken an initiative to modernize and implement a new
claims platform (Dell's Xcelys) as part of their Claims and Encounter Strategy. This initiative is
expected to replace end-of-cycle legacy claims systems, thereby contributing to process
standardization, automation of claims processing, accuracy in payment, and reduction of
administrative burdens.
Project # 2: CCES Implementation, Mar 2012 - Dec 2012
Description: Kaiser Permanente is implementing a National Claims Platform to reduce the cost
for processing a claim, improve auto-adjudication rates, and deliver an extensible solution that can
adapt to emerging and future business needs and regulatory and compliance needs.
Responsibilities:
Worked on Various Tracks in HealthCare like Membership, Benefits, Billing and Finance.
Analysis of Business Requirements and identifying discrepancies in the initial stages to ensure
minimization of cost and time.
Strong knowledge in Developing Test Plan, Test cases, test scenarios, expected results and
prioritizing tests for various modules like membership, 834 EDI, finance, benefits.
Wrote test cases, test conditions and test scripts in MS-Excel and exported to Quality Center.
Hands-on experience in maintaining the list of Change Requests and updating the testing process.
Good understanding of the physical and logical data modeling, dimensional and relational schemas.
Actively participated in validation of transformations applied on source data to load target tables.
Extensively used SQL for retrieving data used for the data warehouse, Data Driven Tests to validate
the same scenario with different test data.
Designed Test Plan and Test Strategy by studying and analyzing Business Requirements of the
Project in detail.
Analyzing requirement specifications and SAD documentation to design Test Scenarios and Test
Cases.
Identified, raised and tracked defects.
Responsible for closing defects once they were fixed.
Tested Web services on SoapUI.
Regular interaction with the onsite and development team to ensure quality and speedy recovery of
defects.
Worked on Complete Integration Testing between several third party Systems and applications like
BETS, CM, Xcelys, TMS and FS.
Presented functional demos to the client regarding the defects and the working of the application.
FreshDirect, Hyderabad India. Jan 2009 - Dec 2010
Java Developer (IT Intern)
Responsibilities:
Designed and developed web services using Java/J2EE in a WebLogic environment. Developed web
pages using Java Servlets, JSP, CSS, JavaScript, DHTML, HTML5 and HTML. Added extensive
Struts validation.
Involved in the analysis, design, development and testing of business requirements.
Developed business logic in JAVA/J2EE technology.
Implemented business logic and generated WSDL for those web services using SOAP.
Worked on Developing JSP pages

Implemented Struts Framework
Developed Business Logic using Java/J2EE
Modified stored procedures in the MySQL database.
Developed the application using Spring Web MVC framework.
Worked with Spring Configuration files to add new content to the website.
Worked on the Spring DAO module and ORM using Hibernate. Used HibernateTemplate and
HibernateDaoSupport for Spring-Hibernate communication.
Configured association mappings such as one-to-one and one-to-many in Hibernate.
Worked with JavaScript calls, as the search is triggered through JS calls when a search key is
entered in the search window.
Worked on analyzing other Search engines to make use of best practices.
Collaborated with the Business team to fix defects.
Worked on XML, XSL and XHTML files.
Interacted with project management to understand, learn and to perform analysis of the Search
Techniques.
Used Ivy for dependency management.