| Naveen Krishna Devulapally - GCP Data Engineer |
| [email protected] |
| Location: , , |
| Relocation: |
| Visa: |
|
SUMMARY
IT professional with over 6 years of experience as a Data Engineer specializing in cloud-based data solutions and software development. Proven expertise in building scalable ETL/streaming pipelines, data migration, and automated data processing using Python, SQL, and cloud platforms. Strong background in data modeling, performance optimization, and advanced analytics, with experience enabling reliable reporting, real-time insights, and data-driven decision-making. Seeking to apply deep data engineering expertise to solve complex data challenges at scale.

TECHNICAL SKILLS
Programming Languages: Python, R, SQL
Cloud Technologies: GCP (Cloud Storage, BigQuery, Cloud Functions, Dataflow, Dataproc, Composer (Airflow), Vertex AI integration, Pub/Sub), AWS (S3, Glue, EMR, Kinesis, Lambda, DynamoDB, API Gateway, SNS/SQS, Athena, Redshift), Azure (Data Factory, Data Lake Storage Gen2, Synapse Analytics, Databricks, Blob Storage, Event Hubs, Stream Analytics, Functions, Purview)
ETL and Data Integration Tools: Informatica, Talend, SSIS, Azure Data Factory (ADF)
Big Data Technologies: Hadoop Ecosystem (HDFS, Sqoop, HBase, Hive, MapReduce), Apache Spark, Kafka
Reporting Tools: Tableau, QlikView, Power BI
Database: PostgreSQL, MS SQL, MySQL, Cassandra, DynamoDB
Scripting and Development: HTML, CSS, JavaScript, Unix Shell Scripting
Packages: NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, Seaborn, TensorFlow
Version Control: Git, GitHub
Tools: Alation, ThoughtSpot, Striim, Control-M, Airflow, TOAD, SecureFX, SecureCRT, SQL Server Management Studio, Oracle SQL Developer, Splunk
IDEs: VS Code, PyCharm, Jupyter Notebook

PROFESSIONAL EXPERIENCE
Jenius Bank (client), Phoenix, AZ, USA
Data Engineer, Aug 2024 - Current
Developed and deployed GCP ingestion pipelines processing 15 GB/hour from REST API and SFTP sources using Cloud Composer-orchestrated PySpark jobs on Dataproc and Apache Beam jobs on Dataflow.
Built a YAML-driven configuration framework that reduced development effort by 30% by standardizing pipeline behavior and eliminating repetitive Spark and orchestration code across datasets.
Designed and developed high-performance ETL pipelines using Teradata SQL and BTEQ, processing large-scale enterprise data.
Built enterprise data warehouse pipelines managing 2 TB of data with Spark transformations, implementing SCD Type 1/2 models and snapshot tables in BigQuery.
Improved query performance by 40% and reduced storage costs by 25% through partitioning, clustering, and data pruning.
Engineered Spark-based analytical datasets processing millions of daily records using complex joins, window functions, and data validation to deliver vendor-level performance metrics.
Optimized batch and near-real-time workloads with efficient writes to partitioned GCS layers and BigQuery, reducing scan costs and improving downstream analytical query performance.

KeyGlee, Tempe, AZ, USA
Data Engineer, Feb 2024 - July 2024
Designed and implemented a multi-cloud data pipeline integrating AWS services with Google Cloud Platform to optimize data storage and processing workflows.
Configured AWS API Gateway for secure access to backend services and automated data ingestion workflows using Python with AWS Lambda, integrating Amazon SNS, SQS, and Kinesis Data Firehose for improved scalability and reliability.
Maintained Amazon DynamoDB tables for real-time data storage, ensuring low-latency data access and high throughput.
Orchestrated cross-cloud data transfer from AWS to GCP using AWS-native tools and Google's Data Transfer Service, ensuring data consistency and integrity.
Created and managed Cloud Storage buckets in Google Cloud (data-lake-bronze) as part of a tiered data storage strategy, facilitating efficient data lifecycle management.
Implemented Cloud Functions in Google Cloud to automate ETL tasks and data transformations, reducing manual effort and speeding up data availability for analytics.

University of Illinois Springfield, Springfield, IL, USA
Graduate Student Assistant, Jan 2023 - Dec 2023
Developed an AI-based system to predict student performance and identify at-risk students so that timely interventions could be provided.
Used Scikit-learn and TensorFlow to create predictive models based on student demographic data, academic records, and engagement metrics.
Collaborated with the university's administration to gather and preprocess data from various sources, including student information systems, learning management systems, and survey results.
Deployed machine learning models into production using Azure Functions and integrated them with the university's existing systems for real-time predictions.
Identified at-risk students with an accuracy rate of over 85%, enabling the university to implement targeted support measures and improve overall student retention rates.
Developed a recommendation system to suggest personalized interventions and resources for at-risk students, improving their academic outcomes.
Worked closely with faculty members and academic advisors to refine models and ensure their relevance and accuracy in predicting student performance.
Ensured the privacy and security of student data by adhering to FERPA guidelines and implementing robust data handling practices.

Accenture/CNA Insurance
Data Engineer, Jun 2020 - Jul 2022
Built and maintained 20+ ETL/ELT pipelines using Cloud Composer (Airflow) and Dataflow to migrate legacy data sources into GCP, processing 500+ GB of data daily.
Developed Python-based data transformation jobs in Dataflow to cleanse, standardize, and load data from legacy systems into BigQuery tables.
Created and optimized BigQuery datasets with partitioning and clustering strategies, improving query performance and reducing costs.
Supported Cloud Storage data lake architecture with folder structures and lifecycle policies for raw, staging, and curated data layers.
Engineered a Raw Data Factory solution on GCP to consolidate 15+ legacy data sources, establishing standardized ingestion patterns and data quality checks.
Wrote SQL scripts and stored procedures to extract, transform, and validate data during migration from on-premises systems to BigQuery.
Created automated data reconciliation scripts using Python and BigQuery to verify row counts, data types, and business logic alignment between legacy and migrated systems.
Implemented pipeline monitoring and alerting using Cloud Monitoring and logging to track job failures, data quality issues, and SLA breaches.
Debugged and resolved data pipeline failures, reducing average incident resolution time.

Tech Tycoon, Hyderabad, India
Web Development Intern, Dec 2019 - April 2020
Assisted in designing and developing a responsive website using HTML, CSS, and JavaScript, resulting in improved SEO and increased website traffic.
Collaborated with cross-functional team members to develop and launch new features and to identify and resolve software issues.
Prepared documentation and completed deliverables on time by following Agile methodologies.

Salesforce Cognizant Internship, Hyderabad, India
Student Intern, April 2019 - Sep 2020
Applied Salesforce.com administration hands-on, assisting in the customization of the CRM platform to streamline processes for sales and marketing teams.
Supported the development and maintenance of Apex classes and triggers, contributing to database management tasks.
Used the Trailhead learning platform to continuously build Salesforce skills, with a focus on Apex programming.

EDUCATION
University of Illinois - Springfield, Springfield, IL
Master's, Data Analytics | GPA: 3.79/4 | Aug 2022 - Dec 2023
Sreenidhi Institute of Science & Technology, Hyderabad, India
Bachelor's, Computer Science | GPA: 9.02/10 | Aug 2016 - May 2020

PROJECTS
Graduate Admission Prediction (Associated with University of Illinois Springfield)
Car Make and Model Detection (Associated with University of Illinois Springfield)
Food Calorie Estimation Model (Associated with Sreenidhi Institute of Science & Technology)
Data Encryption and Decryption with AES (Associated with Sreenidhi Institute of Science & Technology)

Keywords: artificial intelligence business intelligence sthree rlang information technology microsoft mississippi Arizona Illinois |