
Veda Anand - Sr Big Data Engineer Cloudera, HWX, MapR, AWS, Azure and GCP Consultant
[email protected]
Location: Whitmore Lake, Michigan, USA
Relocation: No
Visa: GC
RAVIKANTH MACHAPUR
Email:[email protected] Sr Big Data Engineer

Cloudera, HWX, MapR, AWS, Azure and GCP Consultant

Page 1 of 16

PROFESSIONAL SUMMARY:
Over 16 years of experience in the design, development, and implementation of software applications and BI/DWH
solutions. Experienced in data discovery and advanced analytics and in building business solutions, with
knowledge of developing strategic approaches for deploying Big Data solutions in both cloud and on-premises
environments to efficiently meet Big Data processing requirements.
Built advanced analytics applications on different ecosystems: MapR, Cloudera, HWX, GCP, Azure and
AWS.
Strong understanding of distributed systems, RDBMS, large-scale and small-scale non-relational data
stores, map-reduce systems, database performance, data modeling, and multi-terabyte data warehouses.
Extensively used Hadoop open-source tools like Hive, HBase, Sqoop, and Spark for ETL on Hadoop clusters.
Worked with different clients across the health care insurance domain (BCBS, KP, Molina Healthcare).
Worked with several data integration and replication tools such as Informatica BDM, SAP BODS, and
Attunity Replicate.
Experience with Google BigQuery to store large datasets (TB to PB scale), run SQL queries efficiently, and
power dashboards, ML, and real-time analytics; key benefits include no infrastructure management,
low-latency analytics, built-in ML (BigQuery ML), and integration with Dataflow, Pub/Sub, and Composer.
Used Cloud Composer, Google Cloud's managed Apache Airflow service, which removes the overhead of
managing Airflow infrastructure.
Used Dataflow, a serverless service for running Apache Beam pipelines, with no clusters to manage and
built-in autoscaling; it provides a single programming model for both batch and streaming data and is
ideal for real-time analytics, ETL, and event processing.
Applied data quality practices to ensure data is accurate, complete, consistent, timely, and valid, by
monitoring quality metrics and applying rules (constraints, checks).
Worked with centralized metadata catalogs (e.g., Dataplex, Data Catalog) that store information like
schema, lineage, ownership, and classifications (PII tags).
Implemented schema validation to ensure incoming data matches the expected schema (fields, types,
lengths) and to detect schema drift early (missing fields, type mismatches).
Supported data governance by defining roles (Data Owners, Data Stewards, Custodians) to ensure
accountability and proper governance processes.
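A minimal Python sketch of the schema validation and drift detection described above; the expected schema and field names here are hypothetical examples, not taken from any specific project:

```python
# Minimal schema-drift check: compare an incoming record's fields and types
# against an expected schema and report missing fields and type mismatches.
# The schema and field names are hypothetical examples.
EXPECTED_SCHEMA = {"member_id": str, "claim_amount": float, "service_date": str}

def validate_record(record: dict) -> list[str]:
    """Return a list of human-readable schema violations (empty list = valid)."""
    issues = []
    for field, expected_type in EXPECTED_SCHEMA.items():
        if field not in record:
            issues.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            issues.append(f"type mismatch: {field} is {type(record[field]).__name__}, "
                          f"expected {expected_type.__name__}")
    return issues
```

In a real pipeline a check like this would run at ingestion time, with violations routed to a quarantine area or alerting channel rather than silently dropped.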
Strong knowledge of system development life cycles and project management on BI implementations.
Extensively used RDBMS like Oracle and SQL Server for developing different applications.
Built several data lakes to help different clients perform advanced analysis on big data.
Worked with data science teams to provide and feed data for AI, ML, and deep learning projects.
Real-time experience with the Hadoop Distributed File System, the Hadoop framework, and parallel processing
implementations (MapR, AWS EMR, Cloudera), with hands-on experience in HDFS, MapReduce,
Pig/Hive, HBase, YARN, Sqoop, Spark, Java, RDBMS, Linux/Unix shell scripting, and Linux internals.
Experience in writing UDFs and MapReduce programs in Java for Hive and Pig.
Procedural knowledge in cleansing and analyzing data using HiveQL, Pig Latin, and custom MapReduce
programs in Java.
Created Kafka data pipelines for the Google Ads platform to consume the latest customer profiles.
Experience in data visualization using the Oracle Big Data Discovery tool and IBM Cognos.
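The custom MapReduce cleansing pattern mentioned above can be sketched in pure Python; the production versions described in this resume were written in Java, and the cleansing rule here (lowercase plus punctuation stripping) is a simplified example:

```python
from itertools import groupby
from operator import itemgetter

# Pure-Python sketch of the map / shuffle-sort / reduce pattern used for
# cleansing and aggregation. The cleansing rule is a simplified example.
def mapper(line: str):
    """Emit (word, 1) pairs after basic cleansing (lowercase, strip punctuation)."""
    for token in line.lower().split():
        word = token.strip(".,;:!?")
        if word:
            yield (word, 1)

def reducer(pairs):
    """Sum counts per key, mimicking the reduce phase."""
    pairs = sorted(pairs, key=itemgetter(0))  # shuffle/sort by key
    return {key: sum(c for _, c in group)
            for key, group in groupby(pairs, key=itemgetter(0))}

counts = reducer(p for line in ["Big data, big wins!", "Data wins."]
                 for p in mapper(line))
```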


Experience in Object-Oriented Analysis and Design (OOAD) and development of software using UML
methodology; good knowledge of J2EE design patterns and core Java design patterns.
Experience in creating scripts and macros using Microsoft Visual Studio to automate tasks.
Experience in working with GitHub Repository.
Experienced in designing secure systems that enforce authentication, authorization, confidentiality,
data integrity, accountability, availability, and non-repudiation.
Have experience in web designing, web hosting and DNS configurations.
Other Experiences:
Have experience working with web design tools like Adobe Dreamweaver CC, WordPress, and Joomla.
Proficient in Manual, Functional and Automation testing.
Also experienced in Smoke, Integration, Regression, Functional, Front End and Back End Testing.
Capable of developing/writing test plans, test cases, and test scripts based on user requirements and
SAD documentation.
Highly experienced in writing test cases and executing in HP Interactive Testing Tools: Quality Center,
Quick Test Professional (QTP).
Technical Skills:
Reporting Tools: Tableau 8.1
Big Data Ecosystem: HDFS, MapReduce, Oozie, Hive, Pig, Sqoop, Flume, ZooKeeper, HBase,
CAWA, Spark, Spark SQL, Impala, MapR-DB, Azure, VOCI, Oracle Big Data Discovery, Kafka, NiFi.
Hadoop Ecosystems: MapR, Cloudera, AWS EMR, HortonWorks.
Servers: Application Servers (WAS, Tomcat), Web Servers (IIS6, 7, IHS).
Operating Systems: Windows 2003 Enterprise Server, XP, 2000, UNIX, Red Hat Enterprise Linux
Server release 6.7
Databases: SQL Server 2005, SQL Server 2008, Oracle 9i/10g, DB2, MS Access 2003, Teradata,
PostgreSQL, MySQL, MSSQL.
Languages: C, C++, Java, XML, JSP/Servlets, Struts, Spring, HTML, Python, PHP, JavaScript, jQuery,
Web services, Scala.
Data Modeling: Star-Schema and Snowflake-schema.
ETL Tools: Knowledge of Informatica, IBM DataStage 8.1, and SSIS.
EDUCATION:
Title of the Degree with Branch | College/University | Year of Passing
Master of Science (Computer Science) | California State University Long Beach, CA, USA | 2015
Bachelor of Engineering (Information Technology) | Vasavi College of Engineering/Osmania University, Andhra Pradesh, India | 2011
Board of Intermediate Education, Andhra Pradesh, India | Sri Chaitanya Junior Kalasala, ECIL, Telangana, India | 2007
Board of Secondary Education, Andhra Pradesh, India | St. Ann's Grammar High School, Malkajgiri, Hyderabad, Telangana, India | 2005

TRAININGS:
Azure Event Hub (Big Bets)
Semantic Modeling


Cognitive Services
HDInsight
Digital & IOT, Advanced SPARQL
WORK EXPERIENCE:
Fandango, Beverly Hills CA March 2023 till date
Sr Big Data Architect/ Engineer
Client Description:
Fandango Media, LLC is an American ticketing company that sells movie tickets via their website as well
as through their mobile app.
Roles and Responsibilities:
Designed, planned, implemented, and owned Sales and Finance data pipelines to the data lake (S3) and data
warehouse (Redshift).
Processed data using EMR, AWS Lambda, Step Functions, Talend jobs, and Hadoop jobs.
Participated in the design, architecture, and implementation of CCPA compliance on S3 and Redshift.
Write code in Java for the pipelines to ingest data from different sources to flow through S3 and Redshift.
Designed multiple APIs to orchestrate microservices using AWS API Gateway and Python Flask, and
deployed Docker containers on AWS EKS.
Designed and developed dynamic data ETL/analytical pipelines using AWS Kubernetes and Python
Flask to meet real-time client needs.
Contributed to the DBG Optimization client project, which involved migrating to a new tech stack
including AWS, Snowflake, Kubernetes, Docker, Bitbucket, and other newer technologies.
Led the migration of existing pipelines from AWS Glue/Lambda and step functions to GCP using GCP
cloud functions. Utilized Airflow to create DAGs and interconnect different GCP functions.
Orchestrated the migration of 780 tables from Teradata to Snowflake using AWS Glue, AWS Lambda,
and AWS step functions. Leveraged the in-built orchestration tool to streamline the process.
Designed, created, and implemented various AWS Lambda functions, SNS topics, and SQS queues to send
emails and to subscribe and trigger workflows, improving the efficiency of data processing pipelines.
Developed AWS Glue jobs to submit SQL scripts to Snowflake, ensuring smooth data integration
between different systems.
Created EXECUTE IMMEDIATE scripts to optimize SQL code and control the flow of logic.
Successfully migrated over 150 BTEQ scripts to Snowflake scripts, ensuring data continuity and
accuracy.
Collaborated with SRE to create different projects on Bitbucket and designed CI/CD pipelines on
Bamboo, which improved the speed and quality of code deployment.
Worked with an offshore team of 15 to coordinate issues and manage the development progress, ensuring
timely delivery of projects.
Worked on loading, processing, and analyzing Adobe Omniture clickstream data coming from
different devices for both the Fandango and Vudu data pipelines.
Worked on several data pipelines using AWS Lambda, AWS Glue, AWS Step Functions, AWS Glue
DataBrew, and Apache Iceberg.
Monitored and debugged cloud data pipelines using job runs on Glue and Lambda and CloudWatch logs.
Single-handedly managed ETL pipelines for two different lines of business, covering Rotten Tomatoes,
Fandango, and Vudu.
Moved existing legacy ETL pipelines to the cloud.


Loaded third-party data by calling different API endpoints, building a single source of raw data for
downstream reporting and applications.
Deployed code to dev, int, and prod using CI/CD pipelines built in Jenkins.
Collaborate with different teams to communicate, negotiate and implement end to end solutions.
Developed several AWS Lambda and AWS Glue jobs in Python and deployed them using Terraform and
CloudFormation tools.
Used EMR for heavy batch processing workloads.
Used Hive, PySpark, Oozie, Kafka, and Python to orchestrate jobs on EMR.
Managed code repositories in Bitbucket. Managed and supported about 43 data pipelines on JAWS, AWS,
Lambda, and Talend.
Support BI team for analytics reports.
Extensively Used Postman and Insomnia tools for API testing.
Worked with DevOps to build Jenkins pipelines and deploy CI/CD pipelines.
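The Lambda/SQS work described above follows a standard pattern: the function receives an SQS event, parses each record's JSON body, and hands it to downstream logic. A hedged sketch, where `process_order` and the message fields are hypothetical stand-ins for the real processing step:

```python
import json

# Sketch of an SQS-triggered Lambda handler. The event shape follows the
# standard SQS event structure; process_order and the "order_id" field are
# hypothetical examples of the downstream processing step.
def process_order(payload: dict) -> str:
    return payload.get("order_id", "unknown")

def handler(event: dict, context=None) -> dict:
    """Parse SQS records, process each message body, and report results."""
    processed = []
    for record in event.get("Records", []):
        body = json.loads(record["body"])  # SQS delivers the body as a JSON string
        processed.append(process_order(body))
    return {"statusCode": 200, "processed": processed}
```

In production the handler would also handle malformed bodies and partial batch failures rather than assuming every record parses cleanly.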
Environment: AWS Redshift, S3, Java Spring, PostgreSQL, MS SQL, Python, AWS EMR, GitHub, Jenkins,
Veracode, Scala, Talend, JAWS, GCP, Snowflake, Bamboo.
Deloitte Contingent Worker
Carelon (Anthem), Remote Oct 2021 to March 2023
Sr Big Data Cloud Engineer
Client Description:
Carelon (Elevance Health-Anthem) is an American health insurance provider. The company's services
include medical, pharmaceutical, dental, behavioral health, long-term care, and disability plans through
affiliated companies such as Anthem Blue Cross and Blue Shield, Empire BlueCross BlueShield in New
York State, Anthem Blue Cross in California, Wellpoint, and Carelon. It is the largest for-profit managed
health care company in the Blue Cross Blue Shield Association.
Roles and Responsibilities:
Designed, planned, implemented, and owned Sales and Finance data pipelines to the data lake (S3)
and data warehouse (Redshift).
Processed data using EMR, AWS Lambda, Step Functions, Talend jobs, and Hadoop jobs.
Participated in the design, architecture, and implementation of CCPA compliance on S3 and Redshift.
Write code in Java for the pipelines to ingest data from different sources to flow through S3 and
Redshift.
Designed multiple APIs to orchestrate microservices using AWS API Gateway and Python
Flask, and deployed Docker containers on AWS EKS.
Designed and developed dynamic data ETL/analytical pipelines using AWS Kubernetes and
Python Flask to meet real-time client needs.
Contributed to the DBG Optimization client project, which involved migrating to a new tech
stack including AWS, Snowflake, Kubernetes, Docker, Bitbucket, and other newer technologies.
Led the migration of existing pipelines from AWS Glue/Lambda and step functions to GCP
using GCP cloud functions. Utilized Airflow to create DAGs and interconnect different GCP
functions.
Orchestrated the migration of 780 tables from Teradata to Snowflake using AWS Glue, AWS
Lambda, and AWS step functions. Leveraged the in-built orchestration tool to streamline the
process.


Designed and developed interactive dashboards and reports using Oracle Orbit Analytics to
provide business insights.
Integrated data sources from Oracle databases to create customized visualizations and KPI
dashboards.
Created and maintained metadata models to support ad-hoc reporting and analysis.
Optimized performance of Orbit Analytics queries and dashboards for better response time and
scalability.
Implemented role-based access controls (RBAC) to secure data and reports.
Developed and maintained Oracle Discoverer workbooks, reports, and dashboards to support
business intelligence needs.
Designed and optimized queries for performance improvement and faster report generation.
Worked with Oracle E-Business Suite (EBS) to extract operational and financial data for
reporting.
Managed end-user access, permissions, and security settings for Discoverer reports.
Provided troubleshooting and debugging support for Discoverer reports and workbooks.
Assisted in migrating reports from Oracle Discoverer to modern BI tools due to deprecation.
Designed, created, and implemented various AWS Lambda functions, SNS topics, and SQS queues to send
emails and to subscribe and trigger workflows, improving the efficiency of data processing pipelines.
Developed AWS Glue jobs to submit SQL scripts to Snowflake, ensuring smooth data
integration between different systems.
Created EXECUTE IMMEDIATE scripts to optimize SQL code and control the flow of logic.
Successfully migrated over 150 BTEQ scripts to Snowflake scripts, ensuring data continuity
and accuracy.
Collaborated with SRE to create different projects on Bitbucket and designed CI/CD pipelines
on Bamboo, which improved the speed and quality of code deployment.
Worked with an offshore team of 15 to coordinate issues and manage the development
progress, ensuring timely delivery of projects.
Deployed more than 100 data pipelines using Talend ETL tool.
Worked on loading, processing, and analyzing Adobe Omniture clickstream data coming
from different devices for both the Fandango and Vudu data pipelines.
Worked on several data pipelines using AWS Lambda, AWS Glue, AWS Step Functions, AWS Glue
DataBrew, and Apache Iceberg.
Monitored and debugged cloud data pipelines using job runs on Glue and Lambda and CloudWatch
logs.
Single-handedly managed ETL pipelines for two different lines of business, covering Rotten
Tomatoes, Fandango, and Vudu.
Moved existing legacy ETL pipelines to the cloud.
Loaded third-party data by calling different API endpoints, building a single source of raw data
for downstream reporting and applications.
Deployed code to dev, int, and prod using CI/CD pipelines built in Jenkins.
Collaborate with different teams to communicate, negotiate and implement end to end
solutions.
Developed several AWS Lambda and AWS Glue jobs in Python and deployed them using Terraform
and CloudFormation tools.

Used EMR for heavy batch processing workloads.
Used Hive, PySpark, Oozie, Kafka, and Python to orchestrate jobs on EMR.
Managed code repositories in Bitbucket. Managed and supported about 43 data pipelines on
JAWS, AWS, Lambda, and Talend.
Support BI team for analytics reports.
Extensively Used Postman and Insomnia tools for API testing.
Worked with DevOps to build Jenkins pipelines and deploy CI/CD pipelines
Deployed more than 100 data pipelines using Talend ETL tool
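The EXECUTE IMMEDIATE scripts mentioned above are Snowflake Scripting blocks that run dynamic SQL. A hedged illustration of assembling such a block as text (the table name and filter are hypothetical; a Glue job would submit the resulting string to Snowflake):

```python
# Assembles a Snowflake Scripting block that runs dynamic SQL via
# EXECUTE IMMEDIATE. Table and filter names are hypothetical examples;
# this only builds the SQL text, it does not connect to Snowflake.
def execute_immediate_block(table: str, where: str) -> str:
    """Build a Snowflake anonymous block that deletes rows matching a filter."""
    stmt = f"DELETE FROM {table} WHERE {where}"
    return (
        "EXECUTE IMMEDIATE $$\n"
        "BEGIN\n"
        f"  {stmt};\n"
        "  RETURN 'done';\n"
        "END;\n"
        "$$;"
    )
```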
Environment: AWS, S3, Python, Teradata, PostgreSQL, GCP, AWS EMR, Snowflake, Bamboo, Glue, AWS
Lambda, PySpark, GCP Cloud Functions, GCS, Airflow.
Bitwise
Beverly Hills, CA June 2019 to Oct 2021
Sr Big Data Architect/ Engineer
Client Description:
Fandango Media, LLC is an American ticketing company that sells movie tickets via their website as well
as through their mobile app.
Roles and Responsibilities:
Designed, planned, implemented, and owned Sales and Finance data pipelines to the data lake (S3)
and data warehouse (Redshift).
Processed data using EMR, AWS Lambda, Step Functions, Talend jobs, and Hadoop jobs.
Participated in the design, architecture, and implementation of CCPA compliance on S3 and Redshift.
Write code in Java for the pipelines to ingest data from different sources to flow through S3 and
Redshift.
Worked on loading, processing, and analyzing Adobe Omniture clickstream data coming
from different devices for both the Fandango and Vudu data pipelines.
Worked on several data pipelines using AWS Lambda, AWS Glue, AWS Step Functions, and AWS Glue
DataBrew.
Monitored and debugged cloud data pipelines using job runs on Glue and Lambda and CloudWatch
logs.
Single-handedly managed ETL pipelines for two different lines of business, covering Rotten
Tomatoes, Fandango, and Vudu.
Moved existing legacy ETL pipelines to the cloud.
Loaded third-party data by calling different API endpoints, building a single source of raw data
for downstream reporting and applications.
Deployed code to dev, int, and prod using CI/CD pipelines built in Jenkins.
Collaborate with different teams to communicate, negotiate and implement end to end
solutions.
Developed several AWS Lambda and AWS Glue jobs in Python and deployed them using Terraform
and CloudFormation tools.
Used EMR for heavy batch processing workloads.
Used Hive, PySpark, Oozie, Kafka, and Python to orchestrate jobs on EMR.
Managed code repositories in Bitbucket. Managed and supported about 43 data pipelines on JAWS,
AWS, Lambda, and Talend.


Support BI team for analytics reports.
Extensively Used Postman and Insomnia tools for API testing.
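Submitting the Hive/PySpark jobs mentioned above to EMR is typically done by adding a step that runs `spark-submit` through `command-runner.jar`. A sketch of building that step definition (the script path and step name are hypothetical; the resulting dict is what one would pass to boto3's `add_job_flow_steps`):

```python
# Builds an EMR step definition for running a PySpark script via
# spark-submit and command-runner.jar. The script path and step name are
# hypothetical; this only constructs the step dict, it does not call AWS.
def spark_step(name: str, script_s3_path: str, extra_args=None) -> dict:
    args = ["spark-submit", "--deploy-mode", "cluster", script_s3_path]
    args += list(extra_args or [])
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {"Jar": "command-runner.jar", "Args": args},
    }
```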
Environment: AWS Redshift, S3, Java Spring, PostgreSQL, MS SQL, Python, AWS EMR, GitHub, Jenkins,
Veracode, Scala, Talend, JAWS.
Molina Health Insurance, Long Beach CA Nov 2018 to May 2019
Sr Big Data Developer/ Independent Consultant
Client Description:
Molina Healthcare, a FORTUNE 500, multi-state health care organization, arranges for the delivery of health
care services and offers health information management solutions to nearly five million individuals and
families who receive their care through Medicaid, Medicare and other government-funded programs in fifteen
states.
Project Description:
Optum Data Exchange
HCG Grouper Pipeline
Med Insights Pipeline Executive Dash Board
Roles and Responsibilities:
Helped the client understand performance issues on the cluster by analyzing
Cloudera stats.
Designed and implemented ETL pipelines using Azure Data Factory for data
integration across diverse sources.
Developed and maintained robust data workflows in Azure Data Factory to ensure
seamless data flow and transformation.
Utilized Azure Data Factory's mapping data flows for complex data transformations,
ensuring data accuracy and consistency.
Integrated Azure Data Factory with various data storage solutions, including Azure
Blob Storage, Azure SQL Database, and Data Lake Storage.
Scheduled and monitored pipeline activities in Azure Data Factory to ensure timely
data processing and availability.
Implemented Azure Data Factory's Linked Services and Datasets to streamline data
connections and dataset definitions.
Created custom activities in Azure Data Factory using Azure Functions and Databricks
for specialized data processing tasks.
Ensured data security and compliance by implementing Azure Data Factory's access
controls and data encryption features.
Automated deployment and versioning of Azure Data Factory pipelines using Azure
DevOps CI/CD pipelines.
Developed data models and transformations in dbt to standardize and optimize data
structures for analytics.
Created reusable macros and Jinja templates in dbt to enhance productivity and
maintain consistency across projects.


Implemented rigorous data testing and validation strategies in dbt to ensure data quality
and reliability.
Collaborated with data analysts and engineers to design and build scalable dbt models
that support business intelligence needs.
Optimized SQL queries and transformations in dbt for performance and efficiency,
reducing data processing times.
Leveraged dbt documentation and lineage features to provide clear and comprehensive
data model documentation for stakeholders.
Designed and implemented Optum Data Extracts and HCG Grouper Extracts.
Improved memory and time performances for several existing pipelines.
Improved Solr Data Ingestion, data quality for Medley Pipeline.
Owned Member Sphere, Mosaic, designed and developed Optum and HCG pipelines.
Built pipelines using Scala, Spark, Spark SQL, Hive, and HBase.
Loaded processed data into different consumption points like Apache Solr, HBase, and
AtScale cubes for visualization and search.
Automated the workflow using Talend Big Data.
Scheduled jobs using Autosys.
Used Bash Shell Scripting, Sqoop, AVRO, Hive, HDP, Redshift, Pig, Java, Map/Reduce daily to develop
ETL, batch processing, and data storage functionality.
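The dbt data tests mentioned above boil down to simple column-level assertions such as `not_null` and `unique`. A pure-Python analogue of those checks (the column and row shapes are illustrative, not the actual dbt implementation):

```python
# Pure-Python analogue of dbt-style not_null and unique column tests.
# Row and column names are illustrative examples.
def not_null(rows: list[dict], column: str) -> bool:
    """True if no row has a NULL (None) in the column."""
    return all(row.get(column) is not None for row in rows)

def unique(rows: list[dict], column: str) -> bool:
    """True if every present value in the column occurs exactly once."""
    values = [row[column] for row in rows if column in row]
    return len(values) == len(set(values))
```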
Environment: Attunity, Oracle SQL, Cloudera, Spark, Talend workload automation, Jenkins, Git
Cognizant Technology Solutions. Sep 2017 to Oct 2018
AAA Auto Club Of Southern California, Costa Mesa, CA.
Sr Big Data Consultant/ Digital Transformation (Cloudera)
Client Description:
The Automobile Club of Southern California is the Southern California affiliate of the American Automobile
Association (AAA) federation of motor clubs. The Auto Club was founded in 1900 in Los Angeles as one of
the nation's first motor clubs dedicated to improving roads, proposing traffic laws, and improving
overall driving conditions.
Project Description:
HortonWorks to Cloudera Migration.
Teradata Performance Optimization.
Digital Integration Google AdWords API.
Undisputed Leader Call Forecast, ETR Forecast
Attunity Replicate Solution Design
Speech Analytics
Sqoop Mainframe DB2, VSAM Solution design.
Roles and Responsibilities:
Responsible for moving all the production jobs from HortonWorks to Cloudera.
Leading a team of 2 Onsite and 4 Offshore.
Improving the Performance of Teradata queries wherever needed while migration.
Built a Java API to automate Google AdWords campaigns.


Maintaining weekly cubes refresh for TM1 to populate the latest data from the PROD screenshots.
Built models containing query subjects, query items, and namespaces from imported metadata.
Created Ad-hoc reports using Query Studio.
Fine-tuned and enhanced queries for the performance of the reports and Models.
Ability to work under stringent deadlines, both in teams and independently.
Lead the Undisputed Leader project.
Created Kafka streaming for our Google Ads platform to stream real-time changes to customer profiles into
HBase, enabling the Google Ads app to target customers based on the latest profile.
Worked on Data pipelines to do transformations on Teradata.
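The "latest customer profile" semantics of the Kafka-to-HBase stream described above amount to compaction: for each customer key, only the newest update survives as the HBase row. A minimal sketch of that logic (the message fields `customer_id`, `ts`, and `segment` are hypothetical):

```python
# Sketch of latest-profile compaction: messages keyed by customer id are
# reduced so only the newest update (by timestamp) survives, mirroring
# what the HBase row ends up holding. Field names are hypothetical.
def compact_latest(messages: list[dict]) -> dict:
    """Keep the latest message per customer_id, ordered by 'ts'."""
    latest: dict = {}
    for msg in messages:
        key = msg["customer_id"]
        if key not in latest or msg["ts"] > latest[key]["ts"]:
            latest[key] = msg
    return latest
```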
Environment: SAP, Teradata 12.0, HortonWorks, Cloudera, IBM Mainframe, Oracle DB, SQL DB, Control-M
workload automation
Cognizant Technology Solutions. Mar 2017 to Sep 2017
Puget Sound Energy, Bellevue, WA.
Sr Cloud Architect (AWS EMR)/BI Lead
Client Description:
Puget Sound Energy (PSE) is a Washington state energy utility providing electrical power and natural
gas primarily in the Puget Sound region of the northwest United States. The utility serves electricity to more
than 1.1 million customers in Island, King, Kitsap, Kittitas, Pierce, Skagit, Thurston, and Whatcom counties;
and provides natural gas to 750,000 customers in King, Kittitas, Lewis, Pierce, Snohomish and Thurston
counties. The company has a 6,000-square-mile (16,000 km2) electric and natural gas service area. PSE owns
coal, hydroelectric, natural gas and wind power-generating facilities, with more than 2,900 MW of capacity.
Roughly one-third each of PSE generation comes from coal, hydroelectric, and natural gas facilities, with a
small remainder coming from wind and energy efficiency programs.
Project Description:
PSE has embarked on a program called Get to Zero (GTZ), which will act as the foundational layer for
bringing about Digital transformation and enhanced customer centricity in the organization. The solution is
envisioned to integrate people, processes, and technology in Customer Service, Operations, Supply Chain,
Energy Efficiency, Workforce management and all support organizations. Consultant understands that as
part of this program, PSE is looking for the right specialized partner for creating a culture where data is
treated as a corporate asset, business decisions are data-driven, and analytics are used to make better-
informed decisions.
Roles and Responsibilities:
Worked with the business to define, identify, and implement quick wins for the Get to Zero (GTZ) program
that would deliver incremental value to the business, in collaboration with PSE.
Gathered and documented detailed business requirements to identify, and prioritize quick wins.
Engaged with the PSE team to determine the exact scope of quick wins to be delivered.
Assessed and requested for any infrastructure and environments required to implement the prioritized
quick wins.
Worked with the business on requirements gathering and prepared the functional requirements document.
Analyzed the requirements and provided estimations for the project based on the business request.
Design Cloud Architecture on AWS, Spin up cluster for developers during data processing, cleaning and
analysis.
Working on the Data Model & technical design & implementation for Hive ETL & Big Data Hadoop
projects.
Being part of critical model, design, software development and code reviews for decision making and
maintain quality standards.


Worked with Infrastructure teams DBA, SAP BW, Middleware, and UNIX in setting up environments
during different levels of software lifecycle.
Working on performance tuning of the Big Data components to meet the SLA which is critical for the
customer.
Install and configure different tools like Jupyter notebook, Redshift, python libraries, Spark etc. Prepare
data for consumption into tableau visualization layer.
Developed AWS data pipeline, SNS for automating the dunning process on cloud.
Project Management for Onsite, Offshore team for assigning tasks and report development work.
Environment: HP Quality Center 10.2.0 bug tracking tool, Teradata 12.0, SAP BW, AWS EMR, Tableau, Apache
Zeppelin, Jupyter Notebook, Anaconda, SAP, SAP BODS
Cognizant Technology Solutions,
Schneider National, Green Bay WI. Mar 2016 to Apr 2016
Sr Big Data Advance Analytics Consultant
Client Description:
Schneider National Inc. is a leading provider of premium truckload and intermodal services with about 60 years
of transportation industry experience. It provides expert transportation solutions and is the number one carrier
in dry-freight, industrial glass, and bulk motor carriers. Schneider is one of the largest truckload carriers in
North America, hauling 16,275 loads per day, with 11,300 company drivers, 9,600 company trucks and 31,000
trailers on the road. The company conducts business worldwide with 168 facilities, including a presence in
the United States, Canada, Mexico and China. Schneider's customers include more than two-thirds of the
FORTUNE 500 companies.
Project Description:
Schneider decided to move to the MapR ecosystem to continue their data analytics and visualizations. To
process the huge volume of data generated daily, they needed a distributed platform such as Hadoop. Schneider
initially started with Cloudera on a single node and did a few POCs. To set up a robust environment,
Schneider bought services from MapR, to be deployed in their systems, and moved the existing projects
from Cloudera to MapR. Worked on Turndown Analytics, sentiment analysis, structured content extraction,
voice-to-text analytics, and image-to-text analytics.
Responsibilities:
Worked collaboratively with MapR vendor and client to manage and build out of large data clusters.
Helped design big data clusters and administered them.
Worked both independently and as an integral part of the development team.
Communicated all issues and participated in weekly strategy meetings.
Administered back end services and databases in the virtual environment.
Did several benchmark tests on Hadoop SQL engines (Hive, Spark SQL, Impala) and on different data
formats (Avro, Sequence, Parquet) using different compression codecs like Gzip, Snappy, etc.
Worked on extracting text from Emails, Images and voice and created data pipelines.
Worked on sentiment analysis and structured content programs for creating text analytics app.
Created and Implemented applications on Oracle Big Data Discovery for Data visualization, Dashboard
and Reports.
Implemented system wide monitoring and alerts.
Installed and configured Hive, Impala, Oracle Big Data Discovery, Hue, Apache Spark, Tika,
Tesseract, Sqoop, Spark SQL, etc.
Importing and exporting data into MapRFS and Hive using Sqoop.
Used Bash Shell Scripting, Sqoop, AVRO, Hive, Impala, HDP, Pig, Java, Map/Reduce daily to develop
ETL, batch processing, and data storage functionality.


Responsible for developing data pipeline using Flume, Sqoop and Pig to extract the data from weblogs
and store in HDFS.
Worked on loading all tables from the reference source database schema through Sqoop. Worked on
designing, coding, and configuring server-side J2EE components like JSP, AWS, and Java.
Collected data from different databases (i.e., Oracle, MySQL) into Hadoop. Used CA Workload Automation
for workflow scheduling and monitoring.
Worked on Designing and Developing ETL Workflows using Java for processing data in MapRFS/Hbase
using Oozie.
Experienced in managing and reviewing Hadoop log files. Involved in moving all log files generated
from various sources to HDFS for further processing through Flume.
Involved in loading and transforming large sets of structured, semi structured and unstructured data from
relational databases into HDFS using Sqoop imports.
Developed Sqoop scripts to import/export data from relational sources like Teradata and handled incremental
loading on the customer and transaction data by date.
Developed simple and complex MapReduce programs in Java for Data Analysis on different data
formats.
Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
Worked on partitioning HIVE tables and running the scripts in parallel to reduce run-time of the scripts.
Worked on data serialization formats for converting complex objects into sequences of bits using
AVRO, Parquet, JSON, and CSV formats.
Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on
data. Created Hive tables, loaded data and wrote Hive queries that run within the map.
Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and
other sources.
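The incremental Sqoop loads described above are driven by a check column and a last-imported value. A sketch of assembling such a command line (the JDBC URL, table, column, and target directory are hypothetical examples; this only builds the argument list, it does not run Sqoop):

```python
# Assembles a Sqoop incremental-import command line of the kind described
# above. Connection string, table, and column names are hypothetical.
def sqoop_incremental_import(jdbc_url: str, table: str, check_column: str,
                             last_value: str) -> list[str]:
    """Build argv for a 'lastmodified' incremental import into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--incremental", "lastmodified",
        "--check-column", check_column,
        "--last-value", last_value,
        "--target-dir", f"/data/raw/{table}",
    ]
```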
Environment: MapR ecosystem, ODI, Oracle Endeca, Oracle Big Data Discovery, CA Workload Automation

SilverSpur Corporation, Cerritos CA. Jan 2015 to Oct 2015
SR Architect/Big Data Consultant
Client Description:
Silver Spur Corporation is a packaging supply company founded in Cerritos, California in 1978. Since
inception, it has been best known for its amber glass bottles; today, with access to more than 45 furnaces,
it can accommodate orders of all shapes, sizes, and colors at large volumes year-round. This enables it to
serve many different industries including nutraceutical, pharmaceutical, food & beverage, cosmetic, and
wine, beer, and liquor. Both its custom and stock items are manufactured to the highest quality standards
and are regularly available in Amber, Green, Flint, and Cobalt Blue.
Responsibilities:
Developed MapReduce jobs in java for data cleaning and preprocessing.
Importing and exporting data into HDFS and Hive using Sqoop.
Responsible for developing data pipeline using Flume, Sqoop and Pig to extract the data from weblogs
and store in HDFS.
Worked on loading all tables from the reference source database schema through Sqoop.
Collected data from different databases (e.g. Oracle, MySQL) into Hadoop.
Used Oozie and Zookeeper for workflow scheduling and monitoring.
Worked on designing and developing ETL workflows using Java for processing data in HDFS/HBase
using Oozie.
Experienced in managing and reviewing Hadoop log files.

Involved in moving all log files generated from various sources to HDFS for further processing through
Flume.
Involved in loading and transforming large sets of structured, semi structured and unstructured data from
relational databases into HDFS using Sqoop imports.
Developed Sqoop scripts to import/export data from relational sources and handled incremental loading
of customer and transaction data by date.
Developed simple and complex MapReduce programs in Java for Data Analysis on different data
formats.
Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
Worked on partitioning Hive tables and running the scripts in parallel to reduce their run-time.
Worked on data serialization formats for converting complex objects into byte sequences using
AVRO, PARQUET, JSON and CSV formats.
Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on
data.
Used Oozie Operational Services for batch processing and scheduling workflows dynamically.
Extensively worked on creating end-to-end data pipeline orchestration using Oozie.
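An incremental Sqoop load of the kind described above is typically driven by the --incremental, --check-column and --last-value flags. The sketch below only assembles the argument list in Java rather than invoking Sqoop; the connection string, table and column names are hypothetical placeholders.

```java
import java.util.List;

public class SqoopIncrementalImport {
    // Builds the argument list for an incremental "lastmodified" import.
    // All connection details and names below are hypothetical placeholders.
    static List<String> buildCommand(String lastValue) {
        return List.of(
            "sqoop", "import",
            "--connect", "jdbc:oracle:thin:@//dbhost:1521/ORCL",
            "--table", "TRANSACTIONS",
            "--target-dir", "/data/raw/transactions",
            "--incremental", "lastmodified",   // pick up only new or changed rows
            "--check-column", "UPDATED_AT",    // column that tracks row changes
            "--last-value", lastValue,         // high-water mark from the previous run
            "--merge-key", "TXN_ID");          // reconcile updated rows with prior output
    }

    public static void main(String[] args) {
        // In a real pipeline this command would be launched via ProcessBuilder
        // or wrapped in an Oozie <sqoop> action inside the workflow XML.
        System.out.println(String.join(" ", buildCommand("2015-06-01 00:00:00")));
    }
}
```

Persisting the returned --last-value after each run is what makes the date-based incremental loading repeatable.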
California State University, Long Beach CA. Mar 2014 - Jan 2015
Graduate Assistant (under: Prof. Mehrdad Aliasgari)
Responsibilities:
Was part of developing a website for purchasing and managing parking permits for students and
employees.
Implemented secure online payment over HTTPS using OpenSSL.
Enabled direct access and validation from the SKIDATA parking machines to the internal database.
Worked on NetBeans IDE 8.0.1 for implementing the project in Java.
Implemented strong Encryption using AES algorithm (AES/CBC/PKCS7 Padding) between parking
machines and the database.
Implemented Hybrid Encryption using AES and ElGamal algorithm between internal server and the
payment gateway.
Achieved message integrity using cryptographic hash functions (HMAC-SHA256).
Deployed 2 separate servers to handle parking system and internal access to database using Apache
Tomcat and Glassfish server respectively.
Created intermediate Database to store transactions using MySQL workbench 6.2 CE.
Used Hibernate to connect between the internal database and the server.
Created Digital certificates for the website using SHA256 signing algorithm.
Generated public key using RSA with a 2048-bit modulus.
Used KeyStore Explorer 5.1 to get the certificates signed by a Certificate Authority.
Implemented password-based encryption in Java using salt and key derivation.
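The cryptography described above (AES/CBC encryption, HMAC-SHA256 integrity, and salted password-based key derivation) can be sketched with the standard JCE APIs alone. Note the JCE names AES's CBC padding "PKCS5Padding", which for AES's 16-byte blocks behaves as PKCS#7; the key size, iteration count, and sample messages below are illustrative, not the project's actual parameters.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

public class CryptoSketch {
    // AES in CBC mode; "PKCS5Padding" is the JCE name for PKCS#7-style padding.
    static byte[] aesCbc(int mode, byte[] key, byte[] iv, byte[] data) {
        try {
            Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
            c.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return c.doFinal(data);
        } catch (Exception e) { throw new IllegalStateException(e); }
    }

    // Message integrity via HMAC-SHA256 (32-byte tag).
    static byte[] hmacSha256(byte[] key, byte[] msg) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(msg);
        } catch (Exception e) { throw new IllegalStateException(e); }
    }

    // Password-based key derivation: salt + PBKDF2 (iteration count illustrative).
    static byte[] deriveKey(char[] password, byte[] salt) {
        try {
            SecretKeyFactory f = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            return f.generateSecret(new PBEKeySpec(password, salt, 65536, 128)).getEncoded();
        } catch (Exception e) { throw new IllegalStateException(e); }
    }

    public static void main(String[] args) {
        SecureRandom rnd = new SecureRandom();
        byte[] salt = new byte[16], iv = new byte[16];
        rnd.nextBytes(salt);
        rnd.nextBytes(iv);

        byte[] key = deriveKey("parking-secret".toCharArray(), salt);  // hypothetical password
        byte[] msg = "permit #1234 issued".getBytes(StandardCharsets.UTF_8);

        byte[] ct = aesCbc(Cipher.ENCRYPT_MODE, key, iv, msg);
        byte[] pt = aesCbc(Cipher.DECRYPT_MODE, key, iv, ct);
        System.out.println("round-trip ok: " + Arrays.equals(msg, pt));
    }
}
```

A fresh random salt and IV per message, as generated here, is what keeps repeated plaintexts from producing identical ciphertexts.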
Blue Cross Blue Shield
Cognizant Technology Solutions, Hyderabad, India. Jan 2013 - Nov 2013
Associate Data Engineer
Project # 1: BCBSMN MembersEdge Application, Jan 2013 - Apr 2013
Description: As a part of Claims Modernization Program BCBSMN would depend on MembersEdge as the
source of record for billing and billing finance transactions for the Individual business migration. BCBSMN,
for the Claims Modernization Program, has chosen NASCO Model Office as its test region. NASCO is
upgrading its current MembersEdge 2.5 to MembersEdge 3.0, which requires ensuring that all the
functionality covered in 2.5 is also covered in 3.0 for smooth processing.

Project # 2: BCBSMN Health Reform & State Exchange, Feb 2013 - Nov 2013
Description: As part of the Administrative Simplification provisions of the Affordable Care Act of 2010
(ACA), which build on the Health Insurance Portability and Accountability Act of 1996 (HIPAA), the state of
Minnesota wants to participate in the exchange program, which requires updating the existing systems to the
new business rules.
Responsibilities:
Developed the application using the Struts Framework, which leverages the classical Model-View-Controller
(MVC) architecture. UML diagrams such as use cases, class diagrams, interaction diagrams (sequence and
collaboration) and activity diagrams were used.
Gathered business requirements and wrote functional specifications and detailed design documents
Extensively used Core Java, Servlets, JSP and XML
Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for Oracle 9i
database.
Implemented Enterprise Logging service using JMS and Apache CXF.
Developed Unit Test Cases, and used JUNIT for unit testing of the application
Implemented Framework Component to consume ELS service.
Involved in designing user screens and validations using HTML, jQuery, Ext JS and JSP as per user
requirements.
Implemented JMS producer and Consumer using Mule ESB.
Wrote SQL queries, stored procedures, and triggers to perform back-end database operations
Designed Low Level design documents for ELS Service.
Closely worked with QA, Business and Architects to resolve various defects quickly to meet deadlines.
Worked on Python scripting.
Worked on different modules in HealthCare like Billing and Finance.
Examined business requirements and identified inconsistencies in the initial stages to ensure
minimization of cost and time.
Designed Test Plan, Test cases, test scenarios, expected results and prioritizing tests for the whole project.
Identifying the areas that can be automated and defining the scope of automation.
Designed automation framework, developed test cases, executed and maintained the same.
Execution of test cases using Automation tools like SILK TEST, ZEENYX and SELENIUM.
Raised and tracked defects using Borland's StarTeam test management tool.
Led a team of four members in this project by assigning tasks on a daily basis and coordinating between the
team, onsite and higher management regarding the issues faced by the team.
Calculating Effort Estimation, Schedule Variance and Deviation Reports and Metrics for the project.
Worked on Test scenarios for GUI, Functionality, Security, Database and Regression Testing.
Executed the test cases and compared the expected results with actual results.
Strong command over database restoration and maintaining backups on SQL Server 2000/2005.
Expertise on running database scripts, writing SQL queries, conducting test case reviews with onsite and
exclusively with clients.
Used REST for testing Web services.
Kaiser Permanente
Cognizant Technology Solutions, Hyderabad, India. Jun 2011 - Dec 2012
Associate Data Analyst
Project # 1: MON/ROC Implementation, Jun 2011 - Dec 2012
Description: Kaiser Permanente has undertaken an initiative to modernize and implement a new claims
platform (Dell's Xcelys) as part of their Claims and Encounter Strategy. This initiative is expected to replace
end-of-cycle legacy claims thereby contributing to process standardization, automation of claims processing,
accuracy in payment and reduction of administrative burdens.
Project # 2: CCES Implementation, Mar 2012-Dec 2012

Description: Kaiser Permanente is implementing a National Claims Platform to reduce the cost for
processing a claim, improve auto-adjudication rates, and deliver an extensible solution that can adapt to
emerging and future business needs and regulatory and compliance needs.
Responsibilities:
Worked on Various Tracks in HealthCare like Membership, Benefits, Billing and Finance.
Analysis of Business Requirements and identifying discrepancies in the initial stages to ensure minimization of
cost and time.
Strong knowledge in Developing Test Plan, Test cases, test scenarios, expected results and prioritizing tests for
various modules like membership, 834 EDI, finance, benefits.
Wrote test cases, test conditions and test scripts in MS-Excel and exported to Quality Center.
Hands-on experience in maintaining the Change Request list and updating the testing process.
Good understanding of the physical and logical data modeling, dimensional and relational schemas.
Actively participated in validation of transformations applied on source data to load target tables.
Extensively used SQL for retrieving data used for the data warehouse, Data Driven Tests to validate the same
scenario with different test data.
Designed Test Plan and Test Strategy by studying and analyzing Business Requirements of the Project in
detail.
Analyzing requirement specifications and SAD documentation to design Test Scenarios and Test Cases.
Identifying, Raising and tracking of the defect.
Responsible for closing the defects being fixed.
Tested Web services on SoapUI.
Regular interaction with the onsite and development team to ensure quality and speedy recovery of defects.
Worked on Complete Integration Testing between several third party Systems and applications like BETS,
CM, Xcelys, TMS and FS.
Presented functional demos to the client regarding the defects and the working of the application.
FreshDirect, Hyderabad, India. Jan 2009 - Dec 2010
Java Developer (IT Intern)
Responsibilities:
Designed and developed Web Services using Java/J2EE in WebLogic environment. Developed web pages
using Java Servlet, JSP, CSS, Java Script, DHTML, HTML5, and HTML. Added extensive Struts validation.
Involved in the analysis, design, development and testing of business requirements.
Developed business logic in JAVA/J2EE technology.
Implemented business logic and generated WSDL for those web services using SOAP.
Worked on Developing JSP pages
Implemented Struts Framework
Developed Business Logic using Java/J2EE
Modified Stored Procedures in the MySQL Database.
Developed the application using Spring Web MVC framework.
Worked with Spring Configuration files to add new content to the website.
Worked on the Spring DAO module and ORM using Hibernate. Used Hibernate Template and
HibernateDaoSupport for Spring-Hibernate Communication.
Configured Association Mappings such as one-one and one-many in Hibernate
Worked with JavaScript calls as the Search is triggered through JS calls when a Search key is entered in the
Search window
Worked on analyzing other Search engines to make use of best practices.
Collaborated with the Business team to fix defects.
Worked on XML, XSL and XHTML files.
Interacted with project management to understand, learn and to perform analysis of the Search Techniques.

Used Ivy for dependency management.