| Veda Anand - Sr Big Data Engineer Cloudera, HWX, MapR, AWS, Azure and GCP Consultant |
| [email protected] |
| Location: Whitmore Lake, Michigan, USA |
| Relocation: No |
| Visa: GC |
| Resume file: Ravikanth_Machapur_Resume_D2024 (1)_1763044647253.docx |
RAVIKANTH MACHAPUR
Email: [email protected]
Sr Big Data Engineer | Cloudera, HWX, MapR, AWS, Azure and GCP Consultant

PROFESSIONAL SUMMARY:
Over 16 years of experience in the design, development, and implementation of software applications and BI/DWH solutions.
Experience in data discovery and advanced analytics, and in building business solutions, with knowledge of developing strategic ideas for deploying Big Data solutions in both cloud and on-premise environments to efficiently solve Big Data processing requirements.
Built advanced analytics applications on different ecosystems: MapR, Cloudera, HWX, GCP, Azure and AWS.
Strong understanding of distributed systems, RDBMS, large- and small-scale non-relational data stores, map-reduce systems, database performance, data modeling, and multi-terabyte data warehouses.
Extensively used Hadoop open-source tools like Hive, HBase, Sqoop and Spark for ETL on Hadoop clusters.
Worked with different clients across the Health Care Insurance domain (BCBS, KP, Molina Health Care).
Worked with several data integration and replication tools like Informatica BDM, SAP BODS, Attunity Replicate, etc.
Strong knowledge of system development lifecycles and project management on BI implementations.
Extensively used RDBMSs like Oracle and SQL Server for developing different applications.
Built several data lakes to help different clients perform advanced analysis on big data.
Worked with data science teams to provide and feed data for AI, ML and deep learning projects.
Real-time experience with the Hadoop Distributed File System, the Hadoop framework and parallel processing implementations (MapR, AWS EMR, Cloudera), with hands-on experience in HDFS, MapReduce, Pig/Hive, HBase, YARN, Sqoop, Spark, Java, RDBMS, Linux/Unix shell scripting and Linux internals.
Experience in writing UDFs and MapReduce programs in Java for Hive and Pig.
Procedural knowledge in cleansing and analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
Created Kafka data pipelines for the Google Ads platform to consume the latest customer profiles.
Experience in data visualization using the Oracle Big Data Discovery tool & IBM Cognos.
Experience in Object Oriented Analysis and Design (OOAD) and development of software using UML methodology, with good knowledge of J2EE design patterns and core Java design patterns.
Experience in creating scripts and macros using Microsoft Visual Studio to automate tasks.
Experience in working with GitHub repositories.
Experienced in designing secure software that enforces authentication, authorization, confidentiality, data integrity, accountability, availability and non-repudiation.
Experience in web design, web hosting and DNS configuration.

Other Experiences:
Experience working with web design tools like Adobe Dreamweaver CC, WordPress & Joomla.
Proficient in manual, functional and automation testing; also experienced in smoke, integration, regression, functional, front-end and back-end testing.
Capable of developing/writing test plans, test cases, and test scripts based on user requirements and SAD documentation.
Highly experienced in writing and executing test cases in HP interactive testing tools: Quality Center, Quick Test Professional (QTP).

Technical Skills:
Reporting Tools: Tableau 8.1
Big Data Ecosystem: HDFS, MapReduce, Oozie, Hive, Pig, Sqoop, Flume, Zookeeper, HBase, CAWA, Spark, Spark SQL, Impala, MapR-DB, Azure, VOCI, Oracle Big Data Discovery, Kafka, NiFi
Hadoop Ecosystems: MapR, Cloudera, AWS EMR, HortonWorks
Servers: Application Servers (WAS, Tomcat), Web Servers (IIS 6/7, IHS)
Operating Systems: Windows 2003 Enterprise Server, XP, 2000, UNIX, Red Hat Enterprise Linux Server release 6.7
Databases: SQL Server 2005, SQL Server 2008, Oracle 9i/10g, DB2, MS Access 2003, Teradata, PostgreSQL, MySQL, MS SQL
Languages: C, C++, Java, XML, JSP/Servlets, Struts, Spring, HTML, Python, PHP, JavaScript, jQuery, Web Services, Scala
Data Modeling: Star schema and snowflake schema
ETL Tools: Knowledge of Informatica, IBM DataStage 8.1, SSIS

EDUCATION:
Master of Science (Computer Science), California State University, Long Beach, CA, USA, 2015
Bachelor of Engineering (Information Technology), Vasavi College of Engineering/Osmania University, Andhra Pradesh, India, 2011
Board of Intermediate Education, Andhra Pradesh, India: Sri Chaitanya Junior Kalasala, ECIL, Telangana, India, 2007
Board of Secondary Education, Andhra Pradesh, India: St. Ann's Grammar High School, Malkajgiri, Hyderabad, Telangana, India, 2005

TRAININGS:
Azure Event Hub (Big Bets), Semantic Modeling, Cognitive Services, HDInsight, Digital & IoT, Advanced SPARQL

WORK EXPERIENCE:

Fandango, Beverly Hills, CA, March 2023 to date
Sr Big Data Architect/Engineer
Client Description: Fandango Media, LLC is an American ticketing company that sells movie tickets via its website as well as through its mobile app.
Roles and Responsibilities:
Design, plan, implement and be responsible for Sales and Finance data pipelines to the Data Lake (S3) and Data Warehouse (Redshift); process data using EMR, AWS Lambda, Step Functions, Talend jobs and Hadoop jobs.
Participate in the design, architecture and implementation of CCPA on S3 and Redshift.
Write code in Java for the pipelines to ingest data from different sources to flow through S3 and Redshift.
Designed multiple APIs to orchestrate microservices using AWS API Gateway and Python Flask, and deployed Docker containers on AWS EKS.
Designed and developed dynamic ETL/analytical data pipelines using AWS Kubernetes and Python Flask to meet real-time client needs.
Contributed to the DBG Optimization client project, which involved migrating to a new tech stack including AWS, Snowflake, Kubernetes, Docker, Bitbucket, and other newer technologies.
Led the migration of existing pipelines from AWS Glue/Lambda and Step Functions to GCP using GCP Cloud Functions; utilized Airflow to create DAGs and interconnect different GCP functions.
Orchestrated the migration of 780 tables from Teradata to Snowflake using AWS Glue, AWS Lambda, and AWS Step Functions, leveraging the built-in orchestration tooling to streamline the process.
Designed, created, and implemented various AWS Lambda functions, SNS topics, and SQS queues to send emails and to subscribe and trigger, improving the efficiency of data processing workflows.
Developed AWS Glue jobs to submit SQL scripts to Snowflake, ensuring smooth data integration between different systems.
Created Execute Immediate scripts to optimize SQL code and control the flow of logic.
Successfully migrated over 150 BTEQ scripts to Snowflake scripts, ensuring data continuity and accuracy.
Collaborated with SRE to create different projects on Bitbucket and designed CI/CD pipelines on Bamboo, which improved the speed and quality of code deployment.
Worked with an offshore team of 15 to coordinate issues and manage development progress, ensuring timely delivery of projects.
Worked on loading, processing and analyzing Adobe Omniture clickstream data coming from different devices for both the Fandango and Vudu data pipelines.
Worked on several data pipelines using AWS Lambda, AWS Glue, AWS Step Functions, AWS DataBrew and Apache Iceberg.
Monitor and debug cloud data pipelines using job runs on Glue and Lambda and CloudWatch logs.
Single-handedly managed ETL pipelines for two different lines of business across Rotten Tomatoes, Fandango and Vudu.
Moved existing legacy ETL pipelines to the cloud.
Loaded 3rd-party data by calling different API endpoints, building a single source of raw data for downstream reporting and applications.
Deploy code to dev, int and prod using CI/CD pipelines built in Jenkins.
Collaborate with different teams to communicate, negotiate and implement end-to-end solutions.
Developed several AWS Lambda and AWS Glue jobs in Python and deployed them using Terraform and CloudFormation tools.
Use EMR for heavy batch processing loads; used Hive, PySpark, Oozie, Kafka and Python to orchestrate jobs on EMR.
Managed code repositories in Bitbucket.
Manage and support about 43 data pipelines on JAWS, AWS Lambda and Talend.
Support the BI team with analytics reports.
Extensively used Postman and Insomnia tools for API testing.
Worked with DevOps to build Jenkins pipelines and deploy CI/CD pipelines.
Environment: AWS Redshift, S3, Java Spring, PostgreSQL, MS SQL, Python, AWS EMR, GitHub, Jenkins, Veracode, Scala, Talend, JAWS, GCP, Snowflake, Bamboo

Carelon (Anthem), Remote (Deloitte Contingent Worker), Oct 2021 to March 2023
Sr Big Data Cloud Engineer
Client Description: Carelon (Elevance Health, formerly Anthem) is an American health insurance provider. The company's services include medical, pharmaceutical, dental, behavioral health, long-term care, and disability plans through affiliated companies such as Anthem Blue Cross and Blue Shield, Empire BlueCross BlueShield in New York State, Anthem Blue Cross in California, Wellpoint, and Carelon. It is the largest for-profit managed health care company in the Blue Cross Blue Shield Association.
Roles and Responsibilities:
Design, plan, implement and be responsible for Sales and Finance data pipelines to the Data Lake (S3) and Data Warehouse (Redshift); process data using EMR, AWS Lambda, Step Functions, Talend jobs and Hadoop jobs.
Participate in the design, architecture and implementation of CCPA on S3 and Redshift.
Write code in Java for the pipelines to ingest data from different sources to flow through S3 and Redshift.
Designed multiple APIs to orchestrate microservices using AWS API Gateway and Python Flask, and deployed Docker containers on AWS EKS.
Designed and developed dynamic ETL/analytical data pipelines using AWS Kubernetes and Python Flask to meet real-time client needs.
Contributed to the DBG Optimization client project, which involved migrating to a new tech stack including AWS, Snowflake, Kubernetes, Docker, Bitbucket, and other newer technologies.
Led the migration of existing pipelines from AWS Glue/Lambda and Step Functions to GCP using GCP Cloud Functions; utilized Airflow to create DAGs and interconnect different GCP functions.
Orchestrated the migration of 780 tables from Teradata to Snowflake using AWS Glue, AWS Lambda, and AWS Step Functions, leveraging the built-in orchestration tooling to streamline the process.
Designed and developed interactive dashboards and reports using Oracle Orbit Analytics to provide business insights.
Integrated data sources from Oracle databases to create customized visualizations and KPI dashboards.
Created and maintained metadata models to support ad-hoc reporting and analysis.
Optimized performance of Orbit Analytics queries and dashboards for better response time and scalability.
Implemented role-based access controls (RBAC) to secure data and reports.
Developed and maintained Oracle Discoverer workbooks, reports, and dashboards to support business intelligence needs.
Designed and optimized queries for performance improvement and faster report generation.
Worked with Oracle E-Business Suite (EBS) to extract operational and financial data for reporting.
Managed end-user access, permissions, and security settings for Discoverer reports.
Provided troubleshooting and debugging support for Discoverer reports and workbooks.
Assisted in migrating reports from Oracle Discoverer to modern BI tools due to its deprecation.
Designed, created, and implemented various AWS Lambda functions, SNS topics, and SQS queues to send emails and to subscribe and trigger, improving the efficiency of data processing workflows.
Developed AWS Glue jobs to submit SQL scripts to Snowflake, ensuring smooth data integration between different systems.
Created Execute Immediate scripts to optimize SQL code and control the flow of logic.
Successfully migrated over 150 BTEQ scripts to Snowflake scripts, ensuring data continuity and accuracy.
Collaborated with SRE to create different projects on Bitbucket and designed CI/CD pipelines on Bamboo, which improved the speed and quality of code deployment.
Worked with an offshore team of 15 to coordinate issues and manage development progress, ensuring timely delivery of projects.
Deployed more than 100 data pipelines using the Talend ETL tool.
Worked on loading, processing and analyzing Adobe Omniture clickstream data coming from different devices for both the Fandango and Vudu data pipelines.
Worked on several data pipelines using AWS Lambda, AWS Glue, AWS Step Functions, AWS DataBrew and Apache Iceberg.
Monitored and debugged cloud data pipelines using job runs on Glue and Lambda and CloudWatch logs.
Single-handedly managed ETL pipelines for two different lines of business across Rotten Tomatoes, Fandango and Vudu.
Moved existing legacy ETL pipelines to the cloud.
Loaded 3rd-party data by calling different API endpoints, building a single source of raw data for downstream reporting and applications.
Deploy code to dev, int and prod using CI/CD pipelines built in Jenkins.
Collaborate with different teams to communicate, negotiate and implement end-to-end solutions.
Developed several AWS Lambda and AWS Glue jobs in Python and deployed them using Terraform and CloudFormation tools.
Use EMR for heavy batch processing loads; used Hive, PySpark, Oozie, Kafka and Python to orchestrate jobs on EMR.
Managed code repositories in Bitbucket.
Manage and support about 43 data pipelines on JAWS, AWS Lambda and Talend.
Support the BI team with analytics reports.
Extensively used Postman and Insomnia tools for API testing.
Worked with DevOps to build Jenkins pipelines and deploy CI/CD pipelines.
Environment: AWS, S3, Python, Teradata, PostgreSQL, GCP, AWS EMR, Snowflake, Bamboo, Glue, AWS Lambda, PySpark, GCP Cloud Functions, GCS, Airflow

Bitwise, Beverly Hills, CA, June 2019 to Oct 2021
Sr Big Data Architect/Engineer
Client Description: Fandango Media, LLC is an American ticketing company that sells movie tickets via its website as well as through its mobile app.
Roles and Responsibilities:
Design, plan, implement and be responsible for Sales and Finance data pipelines to the Data Lake (S3) and Data Warehouse (Redshift); process data using EMR, AWS Lambda, Step Functions, Talend jobs and Hadoop jobs.
Participate in the design, architecture and implementation of CCPA on S3 and Redshift.
Write code in Java for the pipelines to ingest data from different sources to flow through S3 and Redshift.
Worked on loading, processing and analyzing Adobe Omniture clickstream data coming from different devices for both the Fandango and Vudu data pipelines.
Worked on several data pipelines using AWS Lambda, AWS Glue, AWS Step Functions, AWS DataBrew.
Monitor and debug cloud data pipelines using job runs on Glue and Lambda and CloudWatch logs.
Single-handedly managed ETL pipelines for two different lines of business across Rotten Tomatoes, Fandango and Vudu.
Moved existing legacy ETL pipelines to the cloud.
Loaded 3rd-party data by calling different API endpoints, building a single source of raw data for downstream reporting and applications.
Deploy code to dev, int and prod using CI/CD pipelines built in Jenkins.
Collaborate with different teams to communicate, negotiate and implement end-to-end solutions.
Developed several AWS Lambda and AWS Glue jobs in Python and deployed them using Terraform and CloudFormation tools.
Use EMR for heavy batch processing loads; used Hive, PySpark, Oozie, Kafka and Python to orchestrate jobs on EMR.
Manage code repositories in Bitbucket.
Manage and support about 43 data pipelines on JAWS, AWS Lambda and Talend.
Support the BI team with analytics reports.
Extensively used Postman and Insomnia tools for API testing.
Environment: AWS Redshift, S3, Java Spring, PostgreSQL, MS SQL, Python, AWS EMR, GitHub, Jenkins, Veracode, Scala, Talend, JAWS

Molina Health Insurance, Long Beach, CA, Nov 2018 to May 2019
Sr Big Data Developer/Independent Consultant
Client Description: Molina Healthcare, a FORTUNE 500, multi-state health care organization, arranges for the delivery of health care services and offers health information management solutions to nearly five million individuals and families who receive their care through Medicaid, Medicare and other government-funded programs in fifteen states.
Project Description: Optum Data Exchange, HCG Grouper Pipeline, Med Insights Pipeline, Executive Dashboard
Roles and Responsibilities:
Helped the client understand performance issues on the cluster by analyzing Cloudera stats.
Designed and implemented ETL pipelines using Azure Data Factory for data integration across diverse sources.
Developed and maintained robust data workflows in Azure Data Factory to ensure seamless data flow and transformation.
Utilized Azure Data Factory's mapping data flows for complex data transformations, ensuring data accuracy and consistency.
Integrated Azure Data Factory with various data storage solutions, including Azure Blob Storage, Azure SQL Database, and Data Lake Storage.
Scheduled and monitored pipeline activities in Azure Data Factory to ensure timely data processing and availability.
Implemented Azure Data Factory's Linked Services and Datasets to streamline data connections and dataset definitions.
Created custom activities in Azure Data Factory using Azure Functions and Databricks for specialized data processing tasks.
Ensured data security and compliance by implementing Azure Data Factory's access controls and data encryption features.
Automated deployment and versioning of Azure Data Factory pipelines using Azure DevOps CI/CD pipelines.
Developed data models and transformations in dbt to standardize and optimize data structures for analytics.
Created reusable macros and Jinja templates in dbt to enhance productivity and maintain consistency across projects.
Implemented rigorous data testing and validation strategies in dbt to ensure data quality and reliability.
Collaborated with data analysts and engineers to design and build scalable dbt models that support business intelligence needs.
Optimized SQL queries and transformations in dbt for performance and efficiency, reducing data processing times.
Leveraged dbt documentation and lineage features to provide clear and comprehensive data model documentation for stakeholders.
Designed and implemented Optum Data Extracts and HCG Grouper Extracts.
Improved memory and time performance for several existing pipelines.
Improved Solr data ingestion and data quality for the Medley pipeline.
Owned Member Sphere and Mosaic; designed and developed the Optum and HCG pipelines.
Built pipelines using Scala, Spark, Spark SQL, Hive and HBase tools.
Loaded processed data into different consumption points like Apache Solr, HBase and AtScale cubes for visualization and search.
Automated the workflow using Talend Big Data.
Scheduled jobs using Autosys.
Used Bash shell scripting, Sqoop, Avro, Hive, HDP, Redshift, Pig, Java and Map/Reduce daily to develop ETL, batch processing, and data storage functionality.
Environment: Attunity, Oracle SQL, Cloudera, Spark, Talend workload automation, Jenkins, Git

Cognizant Technology Solutions, Sep 2017 to Oct 2018
AAA Auto Club of Southern California, Costa Mesa, CA
Sr Big Data Consultant/Digital Transformation (Cloudera)
Client Description: The Automobile Club of Southern California is the Southern California affiliate of the American Automobile Association (AAA) federation of motor clubs. The Auto Club was founded in 1900 in Los Angeles as one of the nation's first motor clubs dedicated to improving roads, proposing traffic laws, and improving overall driving conditions.
Project Description: HortonWorks to Cloudera migration; Teradata performance optimization; Digital Integration (Google AdWords API); Undisputed Leader (Call Forecast, ETR Forecast); Attunity Replicate solution design; Speech Analytics; Sqoop Mainframe DB2/VSAM solution design.
Roles and Responsibilities:
Responsible for moving all production jobs from HortonWorks to Cloudera.
Led a team of 2 onsite and 4 offshore.
Improved the performance of Teradata queries wherever needed during migration.
Built a Java API to automate Google AdWords campaigns.
Maintained weekly cube refreshes for TM1 to populate the latest data from PROD.
Built models containing query subjects, query items, and namespaces from imported metadata.
Created ad-hoc reports using Query Studio.
Fine-tuned and enhanced queries for the performance of the reports and models.
Ability to work under stringent deadlines, both in teams and independently.
Led the Undisputed Leader project.
Created Kafka streaming for our Google Ads platform to stream real-time changes of the customer profile to HBase, enabling the Google Ads app to target customers based on the latest profile.
Worked on data pipelines to perform transformations on Teradata.
Environment: SAP, Teradata 12.0, HortonWorks, Cloudera, IBM Mainframe, Oracle DB, SQL DB, Control-M workload automation

Cognizant Technology Solutions, Mar 2017 to Sep 2017
Puget Sound Energy, Bellevue, WA
Sr Cloud Architect (AWS EMR)/BI Lead
Client Description: Puget Sound Energy (PSE) is a Washington state energy utility providing electrical power and natural gas primarily in the Puget Sound region of the northwest United States. The utility serves electricity to more than 1.1 million customers in Island, King, Kitsap, Kittitas, Pierce, Skagit, Thurston, and Whatcom counties, and provides natural gas to 750,000 customers in King, Kittitas, Lewis, Pierce, Snohomish and Thurston counties. The company has a 6,000-square-mile (16,000 km2) electric and natural gas service area. PSE owns coal, hydroelectric, natural gas and wind power-generating facilities, with more than 2,900 MW of capacity. Roughly one-third each of PSE generation comes from coal, hydroelectric, and natural gas facilities, with a small remainder coming from wind and energy efficiency programs.
Project Description: PSE has embarked on a program called Get to Zero (GTZ), which will act as the foundational layer for bringing about digital transformation and enhanced customer centricity in the organization.
The solution is envisioned to integrate people, processes, and technology in Customer Service, Operations, Supply Chain, Energy Efficiency, Workforce Management and all support organizations. As part of this program, PSE is looking for the right specialized partner to create a culture where data is treated as a corporate asset, business decisions are data-driven, and analytics are used to make better-informed decisions.
Roles and Responsibilities:
Worked with the business to define, identify and implement quick wins for the Get to Zero (GTZ) program that deliver incremental value, in collaboration with PSE.
Gathered and documented detailed business requirements to identify and prioritize quick wins.
Engaged with the PSE team to determine the exact scope of quick wins to be delivered.
Assessed and requested any infrastructure and environments required to implement the prioritized quick wins.
Worked with the business on requirements gathering and prepared the functional requirements document.
Analyzed the requirements and provided project estimates based on the business request.
Designed cloud architecture on AWS; spun up clusters for developers during data processing, cleaning and analysis.
Worked on the data model, technical design and implementation for Hive ETL and Big Data Hadoop projects.
Took part in critical model, design, software development and code reviews for decision making, maintaining quality standards.
Worked with infrastructure teams (DBA, SAP BW, Middleware, and UNIX) to set up environments during different levels of the software lifecycle.
Worked on performance tuning of the Big Data components to meet SLAs critical for the customer.
Installed and configured different tools like Jupyter Notebook, Redshift, Python libraries, Spark etc.
Prepared data for consumption in the Tableau visualization layer.
Developed AWS Data Pipeline and SNS for automating the dunning process on the cloud.
Performed project management for onsite and offshore teams, assigning tasks and report development work.
Environment: HP Quality Center 10.2.0 bug tracking tool, Teradata 12.0, SAP BW, AWS EMR, Tableau, Apache Zeppelin, Jupyter Notebook, Anaconda, SAP, SAP BODS

Cognizant Technology Solutions, Schneider National, Green Bay, WI, Mar 2016 to Apr 2016
Sr Big Data Advanced Analytics Consultant
Client Description: Schneider National Inc is a leading provider of premium truckload and intermodal services with about 60 years of transportation industry experience. It provides expert transportation solutions and is the number one carrier in dry-freight, industrial glass and bulk motor carriers. Schneider is one of the largest truckload carriers in North America, hauling 16,275 loads per day, with 11,300 company drivers, 9,600 company trucks and 31,000 trailers on the road. The company conducts business worldwide with 168 facilities, including a presence in the United States, Canada, Mexico and China. Schneider's customers include more than two-thirds of the FORTUNE 500 companies.
Project Description: Schneider decided to move to the MapR ecosystem to continue its data analytics and visualizations. To process the huge volume of data generated daily, it needed a distributed system such as Hadoop. Schneider initially started with Cloudera on a single node and did a few POCs. To set up a robust environment, Schneider bought services from MapR, to be deployed in its systems, and moved the existing projects from Cloudera to MapR. Worked on Turndown Analytics, sentiment analysis, structured content extraction, voice-to-text analytics, and image-to-text analytics.
Responsibilities:
Worked collaboratively with the MapR vendor and the client to manage and build out large data clusters.
Helped design big data clusters and administered them.
Worked both independently and as an integral part of the development team.
Communicated all issues and participated in weekly strategy meetings.
Administered back-end services and databases in the virtual environment.
Ran several benchmark tests on Hadoop SQL engines (Hive, Spark SQL, Impala) and on different data formats (Avro, Sequence, Parquet) using different compression codecs like Gzip, Snappy etc.
Worked on extracting text from emails, images and voice, and created data pipelines.
Worked on sentiment analysis and structured content programs for creating a text analytics app.
Created and implemented applications on Oracle Big Data Discovery for data visualization, dashboards and reports.
Implemented system-wide monitoring and alerts.
Installed and configured Hive, Impala, Oracle Big Data Discovery, Hue, Apache Spark, Tika, Tika Tesseract, Sqoop, Spark SQL etc.
Imported and exported data into MapR-FS and Hive using Sqoop.
Used Bash shell scripting, Sqoop, Avro, Hive, Impala, HDP, Pig, Java and Map/Reduce daily to develop ETL, batch processing, and data storage functionality.
Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract the data from weblogs and store it in HDFS.
Worked on loading all tables from the reference source database schema through Sqoop.
Worked on designing, coding and configuring server-side J2EE components like JSP, AWS and Java.
Collected data from different databases (i.e. Oracle, MySQL) into Hadoop.
Used CA Workload Automation for workflow scheduling and monitoring.
Worked on designing and developing ETL workflows using Java for processing data in MapR-FS/HBase using Oozie.
Experienced in managing and reviewing Hadoop log files.
Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
Involved in loading and transforming large sets of structured, semi-structured and unstructured data from relational databases into HDFS using Sqoop imports.
Developed Sqoop scripts to import/export data from relational sources (Teradata) and handled incremental loading of customer and transaction data by date.
Developed simple and complex MapReduce programs in Java for data analysis on different data formats.
Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
Worked on partitioning Hive tables and running the scripts in parallel to reduce the run-time of the scripts.
Worked on data serialization formats for converting complex objects into sequences of bits using Avro, Parquet, JSON and CSV formats.
Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on the data.
Created Hive tables, loaded data, and wrote Hive queries that run within the map.
Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
Environment: MapR ecosystem, ODI, Oracle Endeca, Oracle Big Data Discovery, CA Workload Automation

SilverSpur Corporation, Cerritos, CA, Jan 2015 to Oct 2015
Sr Architect/Big Data Consultant
Client Description: Silver Spur Corporation is a packaging supply company founded in Cerritos, California in 1978. Since inception, it has been best known for its amber glass bottles; today, with access to more than 45 furnaces, it can accommodate orders of all shapes, sizes, and colors at large volumes year-round. This enables it to serve many different industries including nutraceutical, pharmaceutical, food & beverage, cosmetic, and wine, beer, and liquor. Both its custom and stock items are manufactured to the highest quality standards and are regularly available in Amber, Green, Flint, and Cobalt Blue.
Responsibilities:
Developed MapReduce jobs in Java for data cleaning and preprocessing.
Imported and exported data into HDFS and Hive using Sqoop.
Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract the data from weblogs and store it in HDFS.
Worked on loading all tables from the reference source database schema through Sqoop.
Collected data from different databases (i.e. Oracle, MySQL) into Hadoop.
Used Oozie and Zookeeper for workflow scheduling and monitoring.
Worked on designing and developing ETL workflows using Java for processing data in HDFS/HBase using Oozie.
Experienced in managing and reviewing Hadoop log files.
Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
Involved in loading and transforming large sets of structured, semi-structured and unstructured data from relational databases into HDFS using Sqoop imports.
Developed Sqoop scripts to import/export data from relational sources and handled incremental loading of customer and transaction data by date.
Developed simple and complex MapReduce programs in Java for data analysis on different data formats.
Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
Worked on partitioning Hive tables and running the scripts in parallel to reduce the run-time of the scripts.
Worked on data serialization formats for converting complex objects into sequences of bits using Avro, Parquet, JSON and CSV formats.
Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on the data.
Used Oozie Operational Services for batch processing and scheduling workflows dynamically.
Extensively worked on creating end-to-end data pipeline orchestration using Oozie.

California State University, Long Beach, CA, Mar 2014 to Jan 2015
Graduate Assistant (under Prof.
Mehrdad Aliasgari)

Responsibilities:
- Was part of developing a website for purchasing and managing parking permits for students and employees.
- Implemented secure online payment over HTTPS using OpenSSL.
- Enabled direct access and validation from the SkiData parking machines to the internal database.
- Worked in NetBeans IDE 8.0.1 to implement the project in Java.
- Implemented strong encryption using the AES algorithm (AES/CBC/PKCS7Padding) between the parking machines and the database.
- Implemented hybrid encryption using the AES and ElGamal algorithms between the internal server and the payment gateway.
- Achieved message integrity using cryptographic hash functions (HMAC-SHA256).
- Deployed two separate servers to handle the parking system and internal access to the database, using Apache Tomcat and GlassFish respectively.
- Created an intermediate database to store transactions using MySQL Workbench 6.2 CE.
- Used Hibernate to connect the internal database and the server.
- Created digital certificates for the website using the SHA256 signing algorithm; generated the public key using RSA 2048 bits.
- Used KeyStore Explorer 5.1 to get the certificates signed by the Certificate Authority.
- Implemented password-based encryption in Java using a salt and key derivation.

Blue Cross Blue Shield - Cognizant Technology Solutions, Hyderabad, India. Jan 2013 - Nov 2013
Associate Data Engineer

Project # 1: BCBSMN MembersEdge Application, Jan 2013 - Apr 2013
Description: As part of the Claims Modernization Program, BCBSMN would depend on MembersEdge as the source of record for billing and billing finance transactions for the Individual business migration. BCBSMN, for the Claims Modernization Program, has chosen NASCO Model Office as its test region.
NASCO is upgrading its current MembersEdge 2.5 to MembersEdge 3.0, which requires ensuring that all the functionality covered in 2.5 is carried over to 3.0 for smooth processing.

Project # 2: BCBSMN Health Reform & State Exchange, Feb 2013 - Nov 2013
Description: As part of the Administrative Simplification provisions of the Affordable Care Act of 2010 (ACA), which builds on the Health Insurance Portability and Accountability Act of 1996 (HIPAA), the state of Minnesota wants to participate in the exchange program, which requires updating the existing systems to the new business rules.

Responsibilities:
- Developed the application using the Struts framework, which leverages the classical Model-View-Controller (MVC) architecture.
- Used UML diagrams such as use cases, class diagrams, interaction diagrams (sequence and collaboration) and activity diagrams.
- Gathered business requirements and wrote functional specifications and detailed design documents.
- Extensively used Core Java, Servlets, JSP and XML.
- Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for an Oracle 9i database.
- Implemented an Enterprise Logging Service (ELS) using JMS and Apache CXF.
- Developed unit test cases and used JUnit for unit testing of the application.
- Implemented a framework component to consume the ELS service.
- Involved in designing user screens and validations using HTML, jQuery, Ext JS and JSP as per user requirements.
- Implemented a JMS producer and consumer using Mule ESB.
- Wrote SQL queries, stored procedures, and triggers to perform back-end database operations.
- Designed low-level design documents for the ELS service.
- Worked closely with QA, Business and Architects to resolve defects quickly and meet deadlines.
- Worked on Python scripting.
- Worked on different healthcare modules such as Billing and Finance.
- Examined business requirements and identified discrepancies in the initial stages to minimize cost and time.
- Designed the test plan, test cases, test scenarios, expected results and test prioritization for the whole project.
- Identified the areas that could be automated and defined the scope of automation.
- Designed an automation framework; developed, executed and maintained test cases for it.
- Executed test cases using automation tools such as Silk Test, Zeenyx and Selenium.
- Raised and tracked defects using Borland's StarTeam as the test management tool.
- Led a team of four members in this project, assigning tasks on a daily basis and coordinating between the team, onsite and higher management on the issues faced by the team.
- Calculated effort estimation, schedule variance and deviation reports and metrics for the project.
- Worked on test scenarios for GUI, functionality, security, database and regression testing.
- Executed the test cases and compared the expected results with actual results.
- Strong command of database restoration and backup maintenance on SQL Server 2000/2005.
- Expertise in running database scripts, writing SQL queries, and conducting test case reviews with onsite teams and directly with clients.
- Used REST for testing web services.

Kaiser Permanente - Cognizant Technology Solutions, Hyderabad, India. Jun 2011 - Dec 2012
Associate Data Analyst

Project # 1: MON/ROC Implementation, Jun 2011 - Dec 2012
Description: Kaiser Permanente has undertaken an initiative to modernize and implement a new claims platform (Dell's Xcelys) as part of its Claims and Encounter Strategy. This initiative is expected to replace end-of-cycle legacy claims, thereby contributing to process standardization, automation of claims processing, accuracy in payment and reduction of administrative burdens.
Project # 2: CCES Implementation, Mar 2012 - Dec 2012
Description: Kaiser Permanente is implementing a National Claims Platform to reduce the cost of processing a claim, improve auto-adjudication rates, and deliver an extensible solution that can adapt to emerging and future business, regulatory and compliance needs.

Responsibilities:
- Worked on various healthcare tracks such as Membership, Benefits, Billing and Finance.
- Analyzed business requirements and identified discrepancies in the initial stages to minimize cost and time.
- Strong knowledge of developing test plans, test cases, test scenarios, expected results and test prioritization for modules such as membership, 834 EDI, finance and benefits.
- Wrote test cases, test conditions and test scripts in MS Excel and exported them to Quality Center.
- Hands-on experience in maintaining the Change Request list and updating the testing process.
- Good understanding of physical and logical data modeling, and of dimensional and relational schemas.
- Actively participated in validating the transformations applied to source data to load target tables.
- Extensively used SQL for retrieving data from the data warehouse, and data-driven tests to validate the same scenario with different test data.
- Designed the test plan and test strategy by studying and analyzing the business requirements of the project in detail.
- Analyzed requirement specifications and SAD documentation to design test scenarios and test cases.
- Identified, raised and tracked defects; responsible for closing defects once fixed.
- Tested web services in SoapUI.
- Interacted regularly with the onsite and development teams to ensure quality and speedy resolution of defects.
- Worked on complete integration testing between several third-party systems and applications such as BETS, CM, Xcelys, TMS and FS.
- Presented functional demos to the client on the defects and the working of the application.

FreshDirect, Hyderabad, India.
Jan 2009 - Dec 2010
Java Developer (IT Intern)

Responsibilities:
- Designed and developed web services using Java/J2EE in a WebLogic environment.
- Developed web pages using Java Servlets, JSP, CSS, JavaScript, DHTML, HTML5, and HTML.
- Added extensive Struts validation.
- Involved in the analysis, design, development and testing of business requirements.
- Developed business logic in Java/J2EE technology.
- Implemented business logic and generated WSDL for those web services using SOAP.
- Worked on developing JSP pages.
- Implemented the Struts framework.
- Modified stored procedures in the MySQL database.
- Developed the application using the Spring Web MVC framework.
- Worked with Spring configuration files to add new content to the website.
- Worked on the Spring DAO module and ORM using Hibernate.
- Used HibernateTemplate and HibernateDaoSupport for Spring-Hibernate communication.
- Configured association mappings such as one-to-one and one-to-many in Hibernate.
- Worked with JavaScript calls, as the search is triggered through JS calls when a search key is entered in the search window.
- Worked on analyzing other search engines to make use of best practices.
- Collaborated with the business team to fix defects.
- Worked on XML, XSL and XHTML files.
- Interacted with project management to understand, learn and perform analysis of the search techniques.
- Used Ivy for dependency management.