Veda Anand - Sr AWS Data Engineer (Cloudera, HWX, AWS, GCP, Snowflake)
[email protected]
Location: Whitmore Lake, Michigan, USA
Relocation: No
Visa: H1B
Uma Kanth Machapur
Sr AWS Data Engineer | Cloudera, HWX, AWS, GCP, Snowflake
Email: [email protected] | Mobile: 248-497-3001

Professional Summary:
Over 12 years of experience in the design, development, and implementation of software applications and BI/DWH solutions. Experienced in data discovery and advanced analytics, and in building business solutions, with knowledge of developing strategies for deploying Big Data solutions in both cloud and on-premise environments to efficiently meet Big Data processing requirements.
Built advanced analytics applications on different ecosystems: Cloudera, HWX, GCP, Snowflake, and AWS.
Strong understanding of distributed systems, RDBMS, large- and small-scale non-relational data stores, map-reduce systems, database performance, data modelling, NiFi, and multi-terabyte data warehouses.
Extensively used open-source Hadoop tools such as Hive, HBase, Sqoop, and Spark for ETL on Hadoop clusters.
Detail-oriented Data Analyst with a strong analytical background and a proven track
record of transforming data into actionable insights.
Proficient in ETL (Extract, Transform, Load) processes, Master Data Management
(MDM), Data Security, and Data Governance.
Proficient in analysing and interpreting complex data sets to identify trends, patterns,
and actionable insights.
Skilled in using statistical and data visualization tools to communicate findings
effectively.
Worked with several data integration and replication tools, such as Attunity Replicate.
Strong knowledge of system development lifecycles and project management on BI implementations.
Extensively used RDBMS like Oracle and SQL Server for developing different
applications.
Built several data lakes on top of S3 and HDFS to help different clients perform advanced analysis on big data.
Worked with data science teams to provide and feed data for AI, ML, and deep learning projects.
Real-time experience with the Hadoop Distributed File System, the Hadoop framework, and parallel processing implementations (AWS EMR, Cloudera), with hands-on experience in HDFS, MapReduce, Pig/Hive, HBase, YARN, Sqoop, Spark, PySpark, RDBMS, Linux/Unix shell scripting, and Linux internals.
Experience in writing UDFs and MapReduce programs in Java for Hive and Pig.
Created Kafka data pipelines with producer and consumer applications for log stream data.
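A minimal sketch of such a Kafka log pipeline, assuming the kafka-python package and a local broker; the topic name, broker address, and record fields are illustrative, not taken from the original projects:

```python
import json
from datetime import datetime, timezone


def encode_log_record(record: dict) -> bytes:
    """Serialize a log record to UTF-8 JSON bytes for use as a Kafka message value."""
    stamped = {**record}
    stamped.setdefault("ingested_at", datetime.now(timezone.utc).isoformat())
    return json.dumps(stamped, sort_keys=True).encode("utf-8")


def run_pipeline() -> None:
    """Producer/consumer wiring; requires a running broker and kafka-python,
    so it is defined here but not invoked. Topic and broker are placeholders."""
    from kafka import KafkaConsumer, KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    producer.send("app-logs", encode_log_record({"level": "INFO", "msg": "started"}))
    producer.flush()

    consumer = KafkaConsumer("app-logs", bootstrap_servers="localhost:9092",
                             auto_offset_reset="earliest")
    for message in consumer:
        print(json.loads(message.value.decode("utf-8")))
```

The serializer is kept separate from the transport so it can be unit-tested without a broker.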
Experience in data visualization tools like Tableau and Looker.
Experience in creating scripts and macros using Microsoft Visual Studio to automate tasks.
Strong expertise in Master Data Management, ensuring data accuracy, consistency, and
reliability across the organization.


Capable of designing and implementing MDM solutions to maintain a single source of
truth for critical business data.
Knowledgeable in data security best practices, ensuring the confidentiality, integrity,
and availability of sensitive information.
Proficient in implementing data security measures, including encryption, access
controls, and data classification.
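One of the data-security measures mentioned above, irreversible pseudonymization of sensitive fields, can be sketched with the standard library; the field value and key below are hypothetical, and in a real deployment the key would come from a secrets manager, not the code:

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Replace a sensitive identifier with an irreversible keyed token (HMAC-SHA256).

    Keyed hashing (rather than a bare hash) prevents dictionary attacks on
    low-entropy identifiers such as SSNs or patient IDs.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical usage: same input and key always yield the same token,
# so the column remains joinable after masking.
token = pseudonymize("patient-12345", key=b"demo-key")
```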
Well-versed in establishing and maintaining data governance frameworks to manage
data assets effectively.
Skilled in defining data policies, standards, and processes to ensure data quality,
compliance, and accountability.
Other Experiences:
Have experience working with web design tools like Adobe Dreamweaver CC, WordPress, and Joomla.
Proficient in Manual, Functional and Automation testing.
Also experienced in Smoke, Integration, Regression, Functional, Front End and Back
End Testing.
Capable in developing/writing Test Plans, Test Cases, and Test Scripts based on User
Requirements, and SAD documentation.
Highly experienced in writing test cases and executing them in HP testing tools: Quality Centre and Quick Test Professional (QTP).
Technical Skills:
Reporting Tools: Tableau and Looker
Big Data Ecosystem: HDFS, MapReduce, Oozie, Hive, Pig, Sqoop, Flume, Zookeeper, HBase, CAWA, Spark, Spark SQL, Impala, MapR-DB, Azure, Oracle Big Data Discovery, Kafka, NiFi
Hadoop Ecosystems: MapR, Cloudera, AWS EMR, Hortonworks.
Cloud Platforms: AWS, GCP, Azure
Servers: Application Servers (WAS, Tomcat), Web Servers (IIS6, 7, IHS).
Operating Systems: Windows 2003 Enterprise Server, XP, 2000, UNIX, Red Hat
Enterprise Linux Server release 6.7
Palantir: Foundry/Gotham (SQL, data integration pipelines, ontology modelling); oversee daily platform operations, ensure uptime, scalability, and reliability of services, manage service-level agreements (SLAs) and incident resolution, and translate business needs into Palantir-driven solutions.
Databases: SQL Server 2005, SQL Server 2008, Oracle 9i/10g, DB2, MS Access 2003, Teradata, PostgreSQL
Languages: Python, Bash, SQL, XML, JSP/Servlets, Struts, Spring, HTML, PHP, JavaScript, jQuery, Web services, Scala.
Data Modelling: Star-Schema and Snowflake-schema.
Cost Management: closely monitor cost dashboards to track savings from Foundry.


ETL Tools: Knowledge of Informatica, IBM DataStage 8.1, and SSIS


Education:
Master of Information Technology & Management Studies, University of Ballarat, Vic, Australia (2013)
Bachelor of Information Technology, University of Ballarat, Vic, Australia (2011)
Board of Intermediate Education, Narayana Jr. College, Telangana, India (2008)
Board of Secondary Education, St. Ann's Grammar High School, Malkajgiri, Hyd, India (2006)

Work Experience:
iTech-Go, Clarkston, MI Apr 2022 - Till Date
Client: Palo Alto Networks
Data Architect
Responsibilities:
Led data ingestion projects in BigQuery and Databricks, incorporating API, Gsheets,
file, RDBMS, and SFTP sources.
Developed and maintained yearly and quarterly reports on BigQuery, contributing to
the creation of intricate executive dashboards.
Contributed to various agile projects, including Smart Recruiters pipeline, HR
Dashboards, Tableau Automation through ServiceNow, IT360, SR Dashboards, and
Internal Mobility, leveraging Databricks for enhanced data processing.
Acquired extensive knowledge in the HR domain, enhancing recruiting capabilities.
Utilized GCP technologies, including BigQuery, Kubernetes Engine, GCP buckets, and
Cloud Functions, alongside Databricks for advanced data processing and analytics.
Built and orchestrated hundreds of pipelines on Airflow and Databricks, ensuring near
real-time availability of data and dashboard reporting.
Designed and exposed APIs in Java Spring Boot and Flask for ServiceNow Tableau
access requests, facilitating data processing into the Tableau system.
Proficient in Python, SQL, Spark, and Databricks for data engineering tasks.
Implemented debugging and monitoring solutions using Airflow, Datadog, Grafana, Kibana, and Google Cloud Monitoring, with notifications via Datadog to Slack and email.


Played a pivotal role in designing and planning new solutions for data pipelines,
ensuring seamless communication with Business, Data Analysts, Business Intelligence,
Technical Directors, and Product teams.
Provided production job support, addressing issues, and enhancing features in an agile
environment.
Demonstrated expertise in building and enhancing data pipelines, aligning with
business requirements.
Engineered scalable systems that effectively meet project requirements, guaranteeing
efficient data processing and handling.
Designed and implemented multiple ETL solutions covering more than 50 data sources through extensive SQL scripting, ETL tools, Python, shell scripting, and scheduling tools, including Databricks.
Wrote scripts in BigQuery SQL and Spark for creating complex tables with performance features such as partitioning, clustering, and skew handling.
Worked with Google Data Catalogue, Databricks, and other Google Cloud APIs for
monitoring, query, and HR-related analysis for BigQuery and Databricks usage.
Created BigQuery authorized views for row-level security or exposing the data to other
teams.
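The partitioned, clustered tables and row-filtered authorized views described above can be sketched as DDL generators; the project, dataset, table, and column names below are hypothetical, and in practice the rendered statements would be submitted through the BigQuery client or console (with view authorization granted separately on the source dataset):

```python
def build_table_ddl(table: str, columns: dict, partition_col: str,
                    cluster_cols: list) -> str:
    """Render CREATE TABLE DDL with date partitioning and clustering (BigQuery syntax)."""
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns.items())
    return (
        f"CREATE TABLE `{table}` (\n  {cols}\n)\n"
        f"PARTITION BY DATE({partition_col})\n"
        f"CLUSTER BY {', '.join(cluster_cols)}"
    )


def build_row_filtered_view(view: str, source: str, user_col: str) -> str:
    """Render a view that limits rows to the querying user via SESSION_USER()."""
    return (
        f"CREATE VIEW `{view}` AS\n"
        f"SELECT * FROM `{source}`\n"
        f"WHERE {user_col} = SESSION_USER()"
    )


# Example with hypothetical HR tables
table_ddl = build_table_ddl(
    "proj.hr.recruiting_events",
    {"event_ts": "TIMESTAMP", "dept": "STRING", "req_id": "STRING"},
    partition_col="event_ts",
    cluster_cols=["dept", "req_id"],
)
view_ddl = build_row_filtered_view(
    "proj.hr_shared.my_events", "proj.hr.recruiting_events", "recruiter_email"
)
```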
Cross Sense Analytics, Farmington Hills, MI Aug 2021 - Mar 2022
Client: State of Ohio (Bureau of Workers' Compensation)
Lead GCP Data Architect
Responsibilities:
Implemented data ingestion processes from diverse internal and external sources into
GCP data lake Cloud Storage buckets using GCP Dataflow, Cloud Functions, Transfer
Service, and SFTP.
Ensured efficient and secure data transfer mechanisms to maintain data integrity and
confidentiality. Writing scripts for data cleansing, data validation, data transformation
for the data coming from different source systems.
Worked on Google Cloud Dataproc and BigQuery for processing and querying data,
both batch and real-time processing using Apache Spark and Apache Kafka with
Python.
Scripted using Python and PowerShell for setting up baselines, branching, merging, and
automation processes across the process using Git.
Worked with different file formats like Parquet files and also BigQuery using PySpark
for accessing the data and performed real-time processing with Dataflow and Pub/Sub.
Worked on Data Integration for extracting, transforming, and loading processes for the
designed packages using Google Cloud Dataflow.
Designed and deployed automated ETL workflows using Cloud Functions, organized
and cleansed the data in Cloud Storage buckets using Cloud Dataprep, and processed
the data using BigQuery.
Used Informatica admin tools to manage logs, user permissions, and domain reports.
Generate and upload node diagnostics. Monitor Data Integration Service jobs and applications. Domain objects include application services, nodes, grids, folders, database connections, operating system profiles, etc.
Developed infrastructure as code using Google Cloud Deployment Manager and Cloud
Build, ensuring consistent and scalable deployment of GCP resources.
Collaborated with cross-functional teams to design and implement robust infrastructure
solutions on Google Cloud Platform.
GreenByte Technologies, Hyderabad, India Mar 2017 - Jun 2021
Sr Big Data Developer / Lead
Responsibilities:
Helped the client understand performance issues on the cluster by analysing Cloudera stats.
Designed and implemented Optum Data Extracts and HCG Grouper Extracts on AWS.
Improved memory and time performances for several existing pipelines.
Developed data ingestion modules (both real-time and batch data load) to load data into various layers in S3, Redshift, and Snowflake using AWS Kinesis, AWS Glue, AWS Lambda, and AWS Step Functions.
Performed data cleaning, feature scaling, and feature engineering using the pandas and NumPy packages in Python.
Used Bash Shell Scripting, Sqoop, AVRO, Hive, Impala, HDP, Pig, Python,
Map/Reduce daily to develop ETL, batch processing, and data storage functionality.
Built pipelines using Spark, Spark SQL, Hive, and HBase, orchestrated them with Airflow on AWS, and explored the power of distributed computing on AWS EMR.
Loaded processed data into different consumption points like Apache Solr, HBase, and AtScale cubes for visualization and search.
Automated the workflow using Talend Big Data.
Scheduled jobs using Autosys.
Experienced in managing and reviewing Hadoop log files.
Involved in moving all log files generated from various sources to HDFS for further
processing through Flume.
Involved in loading and transforming large sets of structured, semi structured and
unstructured data from relational databases into HDFS using Sqoop imports.
Developed Sqoop scripts to import/export data from relational sources and handled incremental loading of customer and transaction data by date.
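The incremental date-based loading above maps to Sqoop's `--incremental` mode; a sketch that assembles the command line in Python (the flags are standard Sqoop options, while the JDBC URL, table, and paths are placeholders):

```python
def sqoop_incremental_import(jdbc_url: str, table: str, check_column: str,
                             last_value: str, target_dir: str) -> list:
    """Assemble argv for an incremental Sqoop import.

    'lastmodified' mode pulls only rows whose check column exceeds the last
    recorded value, which is how date-driven incremental loads are expressed.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--incremental", "lastmodified",
        "--check-column", check_column,
        "--last-value", last_value,
        "--target-dir", target_dir,
    ]


# Hypothetical invocation; in a scheduler the list would go to subprocess.run().
cmd = sqoop_incremental_import(
    "jdbc:oracle:thin:@//db-host:1521/ORCL", "TRANSACTIONS",
    check_column="TXN_DATE", last_value="2021-01-01",
    target_dir="/data/raw/transactions",
)
```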
Designed and implemented Kerberos authentication for securing access to Hadoop
cluster, enhancing overall system security and user authentication.
Configured and maintained Transport Layer Security (TLS) protocols to ensure secure
data transmission within the Hadoop ecosystem, preventing unauthorized access during
data transfer.
Deployed and configured Apache Ranger to enforce fine-grained access control
policies, enabling role-based access to Hadoop resources and ensuring data
confidentiality.


Spearheaded data encryption initiatives, implementing robust encryption algorithms for
sensitive data stored within Hadoop, ensuring compliance with security standards and
regulations.
Managed SSL/TLS certificates for Hadoop components, ensuring their proper
installation, renewal, and adherence to industry best practices for securing
communication channels.
Environment: AWS services, AWS S3, AWS Glue, Lambda, Oracle SQL, Cloudera, Spark, Python, SQL, Talend workload automation, Jenkins, Git, PostgreSQL
The Australian health system, Melbourne, Australia Mar 2015 - Jan 2017
Sr Big Data Developer / Digital Transformation (Cloudera)
Responsibilities:
Designed and implemented data integration solutions to extract, transform, and load (ETL)
Epic Electronic Health Record (EHR) data into data warehouses, enabling comprehensive
reporting and analytics.
Developed data models to ensure accurate representation of clinical and operational data
from Epic Systems, facilitating a better understanding of patient care and hospital
performance.
Established robust data quality checks and validation procedures to ensure the integrity and
accuracy of clinical and operational data transferred from Epic Systems to Foundry.
Ensured the secure handling of sensitive patient data by implementing data encryption,
access controls, and adherence to healthcare compliance standards, such as HIPAA.
Tuned ETL workflows to improve data processing efficiency and performance, reducing
data latency and ensuring timely access to healthcare data for analytics.
Established and maintained data warehousing infrastructure specifically tailored to Epic
EHR data, optimizing data storage and retrieval for reporting and analysis.
Built and maintained data extraction processes from Epic Systems, including Clarity,
Caboodle, Chronicles, and other Epic modules, ensuring data accuracy and consistency.
Automated ETL workflows using tools such as Informatica, Talend, or custom scripts,
streamlining data processing from Epic sources to the data warehouse.
Implemented data quality checks and validation processes to ensure the accuracy and
integrity of clinical and operational data from Epic Systems.
Designed and developed Epic-specific reports and dashboards for clinical and
administrative teams using BI tools like Tableau, Power BI, or Cognos.
Tuned ETL processes and data warehouse structures to enhance query performance,
reducing report generation time and improving user experience.
Implemented data governance policies, including data lineage, data dictionary, and data
access controls, to maintain data consistency and ensure compliance with healthcare
regulations.
Leveraged advanced analytics and machine learning techniques to extract insights from
Epic data, aiding in clinical decision support, patient outcomes analysis, and operational
improvement.
Ensured the security and privacy of patient data by implementing robust data encryption,
access controls, and compliance with HIPAA regulations.


Worked closely with healthcare professionals and clinicians to understand their reporting
and analytics needs, translating them into actionable data solutions.
Successfully managed data migration and transformation during Epic EHR system
upgrades, ensuring continuity of data access and reporting capabilities.
Created comprehensive documentation and conducted training sessions for end-users and
IT staff on the use of Epic data and BI tools.
Provided technical support and troubleshooting for Epic-related data issues and assisted in
problem resolution, ensuring minimal disruptions to clinical operations.
Implemented data transformation processes to standardize and cleanse Epic data, making it
ready for analysis and reporting within the Foundry environment.
Successfully managed data migration and transformation during upgrades to Epic EHR and
Foundry data platform, maintaining data accessibility and reporting capabilities.

Telstra, Melbourne, Australia Dec 2012 - Jan 2015
Sr Big Data Advance Analytics Consultant
Responsibilities:
Worked collaboratively with MapR vendor and client to manage and build out of large
data clusters.
Helped design big data clusters and administered them.
Worked both independently and as an integral part of the development team.
Communicated all issues and participated in weekly strategy meetings.
Administered back-end services and databases in the virtual environment.
Did several benchmark tests on Hadoop SQL engines (Hive, Spark SQL, Impala) and on different data formats (Avro, Sequence, Parquet) using different compression codecs such as Gzip and Snappy.
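A toy version of such a codec comparison, in the spirit of those benchmarks but far simpler: the real tests ran on Hive/Spark over Avro and Parquet, whereas this sketch compares only the stdlib Gzip and zlib codecs on a synthetic payload (Snappy would need the third-party python-snappy package):

```python
import gzip
import time
import zlib

# Repetitive JSON-ish payload standing in for log data; highly compressible.
payload = b'{"id": 1, "event": "benchmark", "status": "ok"}' * 10_000


def bench(name, compress):
    """Return (codec name, compression ratio, elapsed seconds) for one codec."""
    start = time.perf_counter()
    out = compress(payload)
    return name, len(out) / len(payload), time.perf_counter() - start


results = [bench("gzip", gzip.compress), bench("zlib", zlib.compress)]
```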
Worked on sentiment analysis and structured content programs for creating text
analytics app.
Created and Implemented applications on Oracle Big Data Discovery for Data
visualization, Dashboard and Reports.
Collected data from different databases (i.e., Oracle, MySQL) into Hadoop. Used CA Workload Automation for workflow scheduling and monitoring.
Worked on designing and developing ETL workflows using Java for processing data in MapR-FS/HBase using Oozie.
Experienced in managing and reviewing Hadoop log files. Involved in moving all log
files generated from various sources to HDFS for further processing through Flume.
Involved in loading and transforming large sets of structured, semi structured and
unstructured data from relational databases into HDFS using Sqoop imports.
Developed Sqoop scripts to import export data from relational sources Teradata and
handled incremental loading on the customer, transaction data by date.
Developed simple and complex MapReduce programs in Java for Data Analysis on
different data formats.


Optimized MapReduce Jobs to use HDFS efficiently by using various compression
mechanisms.
Worked on partitioning Hive tables and running the scripts in parallel to reduce the run-time of the scripts. Worked on data serialization formats for converting complex objects into sequences of bits using the Avro, Parquet, JSON, and CSV formats.
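A minimal serialization round-trip in the same spirit, limited to the stdlib JSON and CSV formats (Avro and Parquet would need third-party packages such as fastavro or pyarrow); the records are hypothetical:

```python
import csv
import io
import json

# Hypothetical records standing in for the complex objects being serialized.
records = [{"id": "1", "fmt": "avro"}, {"id": "2", "fmt": "parquet"}]

# JSON round-trip: object -> text -> object
json_blob = json.dumps(records)
from_json = json.loads(json_blob)

# CSV round-trip through an in-memory buffer instead of a file
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "fmt"])
writer.writeheader()
writer.writerows(records)
buf.seek(0)
from_csv = list(csv.DictReader(buf))
```

Note that CSV is untyped: every field comes back as a string, which is one reason schema-carrying formats like Avro and Parquet are preferred on Hadoop.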
Responsible for analysing and cleansing raw data by performing Hive queries and running Pig scripts on the data. Created Hive tables, loaded data, and wrote Hive queries that run within MapReduce.
Environment: MapR ecosystem, ODI, Oracle Endeca, Oracle Big Data Discovery, CA Workload Automation
Origin, Melbourne, Australia Jan 2011 - Oct 2011
Java Developer (Contract)
Responsibilities:
Designed and developed Web Services using Java/J2EE in a WebLogic environment.
Developed web pages using Java Servlets, JSP, CSS, JavaScript, DHTML, HTML5, and HTML. Added extensive Struts validation.
Involved in the analysis, design, development, and testing of business requirements.
Developed business logic in JAVA/J2EE technology.
Implemented business logic and generated WSDL for those web services using SOAP.
Worked on developing JSP pages.
Implemented the Struts Framework.
Developed business logic using Java/J2EE.
Modified stored procedures in the MySQL database.
Developed the application using Spring Web MVC framework.
Worked with Spring Configuration files to add new content to the website.
Worked on the Spring DAO module and ORM using Hibernate. Used Hibernate
Template and HibernateDaoSupport for Spring-Hibernate Communication.
Configured Association Mappings such as one-one and one-many in Hibernate
Worked with JavaScript calls as the Search is triggered through JS calls when a Search
key is entered in the Search window
Worked on analyzing other Search engines to make use of best practices.
Collaborated with the Business team to fix defects.
Worked on XML, XSL and XHTML files.
Interacted with project management to understand, learn and to perform analysis of the
Search Techniques.
Used Ivy for dependency management.