
Venkata Jyoti Indala
Data Engineer
E-mail: [email protected]
Phone: +1 (469) 334-4999
Location: Lake Wildwood, California, USA
Relocation: Texas/California
Visa: H-1B

Professional Summary

Experienced IT professional with 12+ years designing, developing, and optimizing data-driven solutions. Specialized in Big Data, Cloud, DevOps, and Data Engineering, with expertise in Hadoop, Spark, Python, Scala, Kafka, and SQL, using Azure Databricks, Azure Data Factory, ADLS, and Google Cloud Platform for end-to-end data transformation, storage, and processing.
Strong leadership experience as a Team/Tech Lead, managing cross-functional and offshore teams to deliver complex data solutions. Skilled in building CI/CD pipelines, optimizing SQL queries, and driving project success using Agile and Waterfall methodologies. Adept at collaborating across teams and ensuring smooth execution with tools such as JIRA and ServiceNow.

TECHNICAL SKILLS

Cloud Platforms: Azure Data Lake Storage (ADLS), Azure Databricks, Azure Data Factory, Google Cloud Platform/Storage (GCP/GCS)
Hadoop Ecosystem: Spark, HDFS, MapReduce, Sqoop, Hive, Flume, Ambari, Hue, YARN, Cloudera, HBase, Solr, JanusGraph, Kafka
CI/CD Tools: Git, Jenkins, Concord, Airflow
IDEs & Utilities: IntelliJ, Databricks Notebook, Jupyter, JIRA, ServiceNow, QC
Programming/Scripting Languages: C/C++, Shell/Unix scripting, Scala, Python, basic Java, SQL, Spark, PySpark
RDBMS/Databases: BigQuery, SQL Server, MySQL

PROFESSIONAL EXPERIENCE

Walmart, Bentonville, AR | Sep 2021 – Present | Data Engineer/Tech Lead

Working at Walmart in multiple roles. As a Data Engineer, designing and building data pipelines that process large volumes of financial and retail data to ensure its accessibility, accuracy, and reliability for analysis and decision-making. As a Tech Lead, gathering requirements from the business, designing the ETL process for finance item-level ledger data, implementing pipelines, and ensuring seamless delivery of processed data to downstream systems for reporting and analytics.

Responsibilities:
Design, develop, and maintain scalable data pipelines for processing financial ledger data and retail supply chain data to ensure high-quality, accurate data delivery for analytics and reporting.
Implement and manage ETL (Extract, Transform, Load) processes using Azure Data Factory (ADF) to extract data from multiple sources, transform it, load it into GCS buckets and ADLS, and provide final BigQuery views for downstream analysis.
Build and optimize complex SQL queries in BigQuery to handle large datasets, creating efficient views and tables for data storage and reporting. Configured Azure Pipelines to automate build and release processes, improving deployment efficiency.
Created end-to-end Azure pipelines and triggers, saving final data into SQL tables and creating views for business and analytical teams.
Collaborated with cross-functional teams to define data requirements, improving data quality and reporting accuracy.
Developed transformations and actions on Spark Data Frames, utilized Spark SQL for efficient data processing, and wrote Scala test suites.
Used Power BI for data visualization, providing actionable insights to stakeholders.
As a DevOps Lead Engineer, gathered requirements, developed, tested, and deployed features using CI/CD methodologies.
Engineered a data platform template for loading financial data into GCS buckets, handling large volumes of structured, semi-structured, and unstructured data using Spark with Scala and Python, and delivering final data to downstream systems via SFTP.
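As an illustrative sketch of the kind of final BigQuery reporting view described above (table and column names are hypothetical, not taken from the actual project):

```sql
-- Hypothetical BigQuery view exposing item-level ledger data to reporting teams.
-- Dataset, table, and column names are illustrative only.
CREATE OR REPLACE VIEW finance.v_item_ledger_daily AS
SELECT
  item_id,
  ledger_date,
  SUM(debit_amount)  AS total_debits,
  SUM(credit_amount) AS total_credits,
  SUM(debit_amount) - SUM(credit_amount) AS net_amount
FROM finance.item_ledger
GROUP BY item_id, ledger_date;
```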

Monster Government Solutions | Oct 2020 – Aug 2021 | Sr. Big Data Engineer

As a Big Data Engineer on the Monster Government Solutions migration project, extracted data from an Oracle RDBMS and processed it through Spark ETL into MySQL on Amazon RDS via EMR. This data was then used by the analytics and reporting teams.

Responsibilities:
Analyzed the data flow of the existing ETL system and evaluated transactional tables to define source-target mappings.
Developed comprehensive Data Lineage documentation to trace the flow of data from source to target tables.
Extracted large datasets from Oracle Database into S3 storage using Apache Spark on EMR.
Designed and implemented SQL views for multi-level data transformations.
Developed Spark Scala code to extract data from Oracle DB, apply transformations, load to Hive external tables, and refine data into AWS RDS MySQL.
Led the development of data for 15 critical reports, ensuring accurate transformation, aggregation, and loading using Spark Scala and shell scripts.
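A minimal sketch of the multi-level SQL view transformations mentioned above (a staging view feeding a reporting view; all names are hypothetical, not from the actual migration):

```sql
-- Level 1: hypothetical staging view that cleans raw source rows.
CREATE VIEW stg_applicants AS
SELECT applicant_id, UPPER(TRIM(status)) AS status, applied_at
FROM raw_applicants
WHERE applicant_id IS NOT NULL;

-- Level 2: hypothetical reporting view built on the staging view.
CREATE VIEW rpt_applicants_by_status AS
SELECT status, COUNT(*) AS applicant_count
FROM stg_applicants
GROUP BY status;
```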


MetLife, Cary, NC | Oct 2018 – Sep 2020 | Sr. Big Data Engineer

Worked on the Enterprise Online Storing System, which ingests real-time data from various insurance entities and vendors into Big Data Real-Time Systems to support enterprise 360 call center services. Worked on the new Disability Platform, importing and processing claims data using Hadoop architecture and loading the processed data into Hadoop real-time ingesting systems for further analytical and reporting needs.

Responsibilities:
Built a data flow ingestion framework (DFE) from external sources (DB2/IBM MQ/Informatica/SQL servers) into Big Data ecosystems (Hive/HBase/Solr/Titan/Janus) using tools like Apache Spark, Spring Boot, and Spark SQL.
Collaborated with connecting partners (GSSP/SPI or API) to gather requirements and support the framework.
Created HBase/NoSQL designs, including Slowly Changing Dimension (SCD) to maintain history of updates.
Developed Hive designs, including Slowly Changing Dimension (Type-2 and Type-4) to maintain history of updates.
Developed Spark Scala scripts to extract large historical and incremental data from legacy applications (Oracle, DB2, and SQL Server) into Big Data for data and analytics.
Designed and implemented the ingestion, transformation, and processing of daily business datasets using Sqoop, Hive, Spark, IBM Big SQL, and IBM Maestro.
Created various database objects and fine-tuned SQL tables and queries.
Implemented dynamic partitions and bucketing in Hive for efficient data access.
Gained hands-on experience in data deduplication and data profiling for many production tables.
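The dynamic partitioning and bucketing pattern mentioned above can be sketched in HiveQL as follows (table and column names are hypothetical):

```sql
-- Illustrative HiveQL: dynamic partitioning plus bucketing for efficient access.
-- Names are hypothetical, not from the actual project.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

CREATE TABLE claims_by_day (
  claim_id STRING,
  amount   DECIMAL(12,2)
)
PARTITIONED BY (claim_date STRING)
CLUSTERED BY (claim_id) INTO 32 BUCKETS
STORED AS ORC;

-- Dynamic partitioning: Hive routes each row to its claim_date partition.
INSERT OVERWRITE TABLE claims_by_day PARTITION (claim_date)
SELECT claim_id, amount, claim_date FROM claims_staging;
```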


State Farm Insurance, TX | Feb 2018 – Oct 2018 | Big Data Production Support Engineer

At State Farm Insurance, the project focused on enabling the enterprise to lead with analytics and operate in a Big Data environment by sunsetting financial legacy systems, data, and reports. We received policies from our Guidewire Policy center, and FRR and Non-FRR feeds data were landed in HDFS. Processed data according to business requirements using Spark Scala.
Responsibilities:
Improved performance and optimized existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
Developed Spark data transformation programs based on Data Mapping.
Provided production support and resolved issues to ensure timely data flow to downstream systems.
Developed scripts for build, deployment, maintenance, and related tasks to implement CI (Continuous Integration) system using Jenkins, Docker, Maven, Python, and Bash.
Built a proof of concept (POC) using NiFi, Bamboo, and Docker for continuous integration and end-to-end automation of all builds and deployments.

Other Experience:
Emerson, Austin, TX | Apr 2017 – Feb 2018 | Software Engineer
Worked as a Software Engineer on a Distributed Control System (DCS) and designed an internal web portal for statistical analysis, which included functionalities such as data ingestion, ETL processing, and data exposure through Hadoop. Responsibilities included managing Hadoop clusters, creating and managing Hive tables, managing the YARN cluster, and developing MapReduce jobs for data processing. Also created technical documentation and used tools such as Hive and Sqoop for data extraction and transformation.

TEK Systems | Apr 2012 – Jan 2015 | Software/QA Engineer
Followed test-driven development in an Agile environment, automating most tests and using Cruise Control for Continuous Integration. Executed over a million automated tests daily, using simulators and emulators, and performed manual testing for cases not covered by automation. Responsibilities included designing UIs, database design, performance testing, and triaging firmware issues.

Value Labs, India | Apr 2007 – Feb 2009 | RF Engineer
At NIELSEN Mobile, provided syndicated network performance information to major mobile operators in the US, Canada, and Europe. Assessed mobile voice and data network performance across various technologies, analyzed RF drive test data, and validated call drops through Layer-3 analysis. Designed RF post-processing tools, optimized system performance, and responded to customer queries regarding network performance.

EDUCATION
Master's in Computer Science, Texas A&M University-Commerce, Texas (2016)
Bachelor of Technology in Electrical Engineering, Andhra University, India (2006)
