
Priyanka Yadagiri - Data Engineer
[email protected]
Location: Dallas, Texas, USA
Relocation: Any Location
Visa: OPT
4+ years of hands-on experience following SDLC methodologies to deliver robust data engineering solutions.
Proficient in Python, Scala, SQL, and PL/SQL for data manipulation and analysis.
Extensive expertise in cloud platforms including AWS (S3, EC2, Redshift, EMR, Lambda, API Gateway, IAM, Kinesis), GCP (Dataflow, Pub/Sub, BigQuery, Google Analytics, GCS), and Microsoft Azure (ADLS, Event Hubs, ADF, DevOps, Azure AD).
Experienced with data integration tools such as AWS Glue, AWS Data Pipeline, Talend, Apache NiFi, Informatica, Databricks, and Apache Sqoop.
Skilled in utilizing data visualization tools like Tableau, Power BI, and Google Data Studio to generate informative reports and dashboards.
Hands-on experience with data processing libraries and frameworks, including Pandas, NumPy, Apache Spark, Spark MLlib, Spark SQL, PySpark, Mahout, and Scikit-Learn.
Strong knowledge of infrastructure automation tools like AWS CloudFormation, Terraform, Ansible, Jenkins, and Maven for efficient deployment and management of data solutions.
Experienced in monitoring and logging tools such as AWS CloudWatch, the ELK stack (Elasticsearch, Logstash, Kibana), Datadog, and Splunk for performance optimization and troubleshooting.
Working experience with a wide range of databases, including MySQL, MongoDB, PostgreSQL, Redis, Oracle SQL, SQL Server, Cosmos DB, Hive, and HBase.
Familiarity with cloud data warehousing platforms such as Snowflake, BigQuery, and Redshift for scalable and high-performance data storage and analytics.
Skilled in data serialization formats like JSON, XML, XSD, and XSLT for efficient data exchange and transformation.
Experience with containerization technologies like Docker and orchestration tools like Kubernetes for scalable and portable deployment of data applications.
Strong version control and collaboration practices using Git, GitLab CI, GitHub, JIRA, Confluence, and Slack for seamless teamwork and project management.
Proficient in implementing data security measures using AWS IAM, AWS KMS, Azure AD, and OAuth to ensure data confidentiality and integrity.
Strong analytical thinking, problem-solving, collaboration, and communication skills, enabling productive engagement with cross-functional teams and stakeholders to deliver meaningful data solutions.

Technical Skills:

Databases: MySQL, Oracle, PostgreSQL, Microsoft Azure SQL Database, MongoDB
Big Data Technologies: Hadoop, Hive, HDFS, MapReduce, Spark, Spark SQL, Sqoop, Kafka, Airflow, NiFi
Scripting Languages: Python, Shell Scripting, Scala
Cloud Platforms: Amazon Web Services, Microsoft Azure
Marketing and Analytics Tools: Google Ads, Facebook Ads, Google Analytics, Looker Studio, Tableau, Power BI
Operating Systems: Windows, macOS
Version Control and Other Tools: Git, Jenkins, Jira
Cloud DW and Storage: EC2, S3, CloudWatch, Redshift, Azure Blob Storage, Azure Synapse, Snowflake, SnowSQL, Snowpipe
Data Governance and Compliance: GDPR, CCPA, HIPAA, Metadata Management


Data Engineer
AT&T | Duration: Mar 2024 to Present

Responsibilities:
Conducted regular stakeholder meetings to present project updates, address concerns, and incorporate feedback into data initiatives.
Developed ETL jobs using SQL Server Integration Services (SSIS) to extract, clean, transform, and load data into the data warehouse.
Implemented comprehensive error handling and logging mechanisms at both the package and task levels in SSIS and T-SQL scripts.
Developed complex business logic using T-SQL stored procedures, functions, views, and advanced query concepts to meet data transformation needs.
Conceived, designed, and implemented ETL solutions for healthcare data processing using SSIS.
Integrated CDP data with downstream platforms like Google Ads and Facebook Ads for targeted ad campaigns and audience synchronization.
Utilized Power BI Desktop to generate interactive visualizations, dashboards, and KPI scorecards to monitor healthcare data metrics.
Implemented data pipelines on GCP, leveraging BigQuery for data warehousing and Cloud Dataflow for real-time and batch processing of banking datasets.
Created robust data ingestion solutions using Cloud Storage and Pub/Sub, ensuring secure and efficient data handling across multiple systems.
Integrated GCP services with Snowflake and Azure for seamless hybrid cloud data workflows.
Built and optimized ETL pipelines to aggregate marketing and conversion data, ensuring accurate tracking and seamless integration with Google Ads and Microsoft Ads.
Developed monitoring and alerting mechanisms for GCP-based workflows using Stackdriver, ensuring high availability and reliability.
Collaborated with cross-functional teams to leverage CDP insights for improving customer acquisition strategies and retention rates.
Ensured compliance with GDPR and CCPA regulations while processing and storing customer data in CDPs, maintaining high data security standards.
Developed interactive dashboards in Looker Studio to visualize campaign KPIs, including ROAS (return on ad spend), CAC (customer acquisition cost), and CLV (customer lifetime value), providing actionable insights to stakeholders.
Developed pipelines to extract and transform customer behavior data for use in CDPs, ensuring compliance with GDPR and CCPA regulations.
Designed and implemented silver and gold data aggregation layers for processing user behavior and marketing data (see the PySpark layering sketch after this list).
Developed real-time streaming data pipelines using Pub/Sub and Dataflow, ensuring low-latency processing for live events and analytics (a minimal Beam sketch follows this list).
Built scalable data pipelines to process, cleanse, and transform raw tracking data such as Custom Events, Commerce Events, and User Attributes.
Automated ingestion and transformation of customer data from various touchpoints into CDPs, ensuring data accuracy and consistency across marketing channels.
Collaborated with product managers, scrum masters, and engineers to build agile processes, creating documentation for retrospectives, backlogs, and project meetings.
Developed unified data transformation logic using T-SQL and Spark SQL, ensuring consistent processing across different data sources.
Worked with stakeholders to define customer attributes and segmentation rules for use in CDPs, aligning strategies with business goals.
Optimized performance of Looker Studio dashboards by implementing efficient data queries and caching mechanisms.
Generated Tableau ad-hoc reports from Excel sheets, flat files, and CSV files, publishing interactive data visualizations and dashboards for healthcare stakeholders.
Scripted data analysis workflows using the Python and Spark APIs, with hands-on experience in Spark Core, Spark SQL, and Spark Streaming, and managed DataFrames in Scala.
Ensured compliance with HIPAA regulations in all data handling procedures, including secure data transfer and storage.
Converted healthcare data from multiple sources into FHIR-compliant formats for interoperability across systems.
Applied FHIR standards to integrate diverse healthcare systems and applications, enabling seamless data exchange and improving patient data management.
Worked with healthcare data interchange standards such as FHIR, CDA, and HL7 in conjunction with APIs to enable smooth data integration and interoperability.
Demonstrated excellence in Data Analysis, Data Profiling, Data Validation, Data Cleansing, and Data Verification, with strong skills in identifying data mismatches and anomalies. Applied proper database standards and processes for enterprise data hierarchy design.
Designed Star and Snowflake schemas for healthcare data warehouses using tools like Erwin Data Modeler and Power Designer. Experience with IBM InfoSphere in managing MDM, data profiling, and data modeling.
Skilled in designing data marts using Ralph Kimball's dimensional modeling techniques and Bill Inmon's enterprise data warehouse approach. Experience working with both OLTP/OLAP systems and Kimball data warehousing environments.
Engaged with stakeholders to clarify report requirements and resolve data issues in Power BI. Designed, developed, and tracked Key Performance Indicators (KPIs), creating dashboards to monitor healthcare performance metrics.
Automated data extraction from Google Analytics using APIs to feed reporting dashboards and measure campaign performance (a minimal GA4 Data API sketch follows this list).
Designed workflows for integrating Google Analytics data with data warehouses to enable advanced user behavior analysis.
Managed complex, end-to-end data migration, conversion, and data modeling projects using tools like Alteryx and SQL, while creating high-quality dashboards with Tableau to deliver actionable insights.
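
A minimal Apache Beam sketch of the Pub/Sub-to-BigQuery streaming pattern referenced in this list. The project, topic, table, and schema names are hypothetical placeholders, not the actual project resources.

    import json
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def run():
        # Streaming pipeline; submit to Dataflow with --runner=DataflowRunner.
        options = PipelineOptions(streaming=True)
        with beam.Pipeline(options=options) as p:
            (
                p
                # Read raw messages from a (hypothetical) Pub/Sub topic.
                | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
                    topic="projects/my-project/topics/user-events")
                # Decode and parse each message as JSON.
                | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
                # Append parsed rows to a (hypothetical) BigQuery table.
                | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                    "my-project:marketing.events",
                    schema="event_name:STRING,user_id:STRING,event_ts:TIMESTAMP",
                    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
            )

    if __name__ == "__main__":
        run()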
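
A minimal PySpark sketch of the silver/gold layering described above, assuming a medallion-style lake layout; paths, event names, and columns are hypothetical placeholders.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("marketing-aggregation").getOrCreate()

    # Silver: cleanse and conform raw (bronze) tracking events.
    bronze = spark.read.json("/lake/bronze/tracking_events/")  # hypothetical path
    silver = (
        bronze
        .filter(F.col("event_name").isNotNull())            # drop malformed events
        .withColumn("event_ts", F.to_timestamp("event_timestamp"))
        .dropDuplicates(["event_id"])                       # de-duplicate replays
    )
    silver.write.mode("overwrite").parquet("/lake/silver/tracking_events/")

    # Gold: aggregate to campaign-level daily metrics for reporting.
    gold = (
        silver
        .groupBy(F.to_date("event_ts").alias("event_date"), "campaign_id")
        .agg(
            F.countDistinct("user_id").alias("unique_users"),
            F.count("*").alias("events"),
        )
    )
    gold.write.mode("overwrite").parquet("/lake/gold/campaign_daily/")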
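
A minimal sketch of pulling campaign metrics from the GA4 Data API, as one way to automate the Google Analytics extraction mentioned above. The property ID is a hypothetical placeholder; credentials are assumed to come from GOOGLE_APPLICATION_CREDENTIALS in the environment.

    from google.analytics.data_v1beta import BetaAnalyticsDataClient
    from google.analytics.data_v1beta.types import (
        DateRange, Dimension, Metric, RunReportRequest,
    )

    PROPERTY_ID = "123456789"  # hypothetical placeholder

    client = BetaAnalyticsDataClient()
    request = RunReportRequest(
        property=f"properties/{PROPERTY_ID}",
        dimensions=[Dimension(name="sessionCampaignName")],
        metrics=[Metric(name="sessions"), Metric(name="conversions")],
        date_ranges=[DateRange(start_date="7daysAgo", end_date="today")],
    )
    response = client.run_report(request)

    # Feed rows into the reporting dashboard's staging table.
    for row in response.rows:
        print(row.dimension_values[0].value, row.metric_values[0].value)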

Environment: BigQuery, Cloud Dataflow, Pub/Sub, Snowflake, Informatica, Erwin, SQL, Jenkins, SSIS, Google Ads API, Python, PL/SQL, T-SQL, SSRS, MDM, Power BI, Kimball.

Azure Data Engineer
Infosys | Duration: Feb 2020 to Jun 2023

Responsibilities:
Collaborated with marketing, analytics, and engineering teams to define KPIs and implement solutions for improved campaign performance tracking.
Worked extensively with Azure BLOB Storage and Azure Data Lake Storage to manage and load large datasets into Snowflake Data Warehouse for secure and scalable data management.
Utilized SQL Server Integration Services (SSIS) packages to design and implement comprehensive ETL processes, loading data into databases used by Reporting Services for real-time and batch analytics.
Developed and optimized ETL processes using T-SQL and SSIS, handling various data loads, including transactional data, customer data, and compliance data critical to the banking industry.
Built ETL jobs to ingest JSON and server data into MongoDB, and further transported the data into Snowflake for secure and efficient data warehousing.
Orchestrated ETL data pipelines using Azure Data Factory (ADF), setting up custom alerts and monitoring mechanisms to ensure smooth data flow and processing.
Developed PySpark programs to process streaming data from Azure Event Hubs, loading real-time transactional data into Azure Databricks and Snowflake.
Designed and developed API-driven pipelines to integrate data with external platforms like Google Ads and Facebook Ads.
Established data governance policies and conducted periodic audits to maintain data quality and regulatory compliance (GDPR, CCPA).
Implemented custom Airflow operators in Python to streamline data extraction, transformation, and loading (ETL) from various data sources, including financial databases and third-party APIs (a minimal operator sketch follows this list).
Created and managed Airflow scheduling scripts in Python to automate and monitor ETL jobs, ensuring efficient and timely data availability for banking operations.
Implemented ETL processes for loading data from diverse sources into Azure Databricks tables and Azure Synapse Analytics, facilitating real-time reporting and data analysis for financial services.
Worked in an Agile development environment, integrating DevOps practices with CI/CD pipelines to automate and streamline data workflows and deployments.
Monitored and ensured compliance with GDPR and CCPA regulations in all marketing data handling processes.
Used JIRA for bug tracking, issue tracking, and project management, enhancing development efficiency and transparency in delivering critical data solutions.
Maintained and automated various Azure Services using PowerShell and Azure CLI, streamlining cloud resource management and operational efficiency.
Applied Azure Cloud Services (PaaS & IaaS), including Azure Data Factory, Azure Data Lake, Azure Monitoring, and Azure SQL, developing and optimizing data pipelines for ingestion, storage, and processing.
Implemented real-time data validation, error handling, and flow monitoring within NiFi, ensuring data consistency and reliability.
Configured role-based access controls (RBAC), data encryption, and audit trails in NiFi to maintain data security and compliance with organizational policies.
Maintained comprehensive API documentation for GraphQL schemas and queries.
Managed schema versions to support backward compatibility during iterative API updates.
Configured Spark Streaming and Kafka for real-time data processing, ensuring timely and accurate delivery of critical financial data for compliance and business reporting (see the Structured Streaming sketch after this list).
Implemented dynamic mapping and schema handling in data pipelines, allowing seamless adaptability to changes in banking data structures and source formats.
Implemented delta logic extractions from various data sources with control table mechanisms to handle deadlocks, error recovery, and data logging, ensuring robust and resilient data pipelines (a minimal watermark sketch follows this list).
Configured Informatica repositories and created mappings, mapplets, sessions, worklets, and workflows in Informatica Designer to move data between multiple source systems and target destinations.
Collaborated with cross-functional teams, including financial analysts and compliance officers, to ensure accurate and timely data processing for regulatory reporting, fraud detection, and business intelligence.
Ensured data governance and security standards, including GDPR and PCI-DSS compliance, in all data processing activities, adhering to strict data privacy regulations in the banking sector.
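
A minimal sketch of a custom Airflow operator along the lines described above. The endpoint and output path are hypothetical; a production operator would resolve them through Airflow connections and hooks.

    import json
    import requests
    from airflow.models.baseoperator import BaseOperator

    class ApiToFileOperator(BaseOperator):
        """Extract records from a REST endpoint and land them as a JSON file."""

        def __init__(self, endpoint: str, output_path: str, **kwargs):
            super().__init__(**kwargs)
            self.endpoint = endpoint        # hypothetical third-party API URL
            self.output_path = output_path  # hypothetical landing path

        def execute(self, context):
            # Pull the data and fail the task on any HTTP error.
            response = requests.get(self.endpoint, timeout=30)
            response.raise_for_status()
            records = response.json()
            with open(self.output_path, "w") as f:
                json.dump(records, f)
            self.log.info("Wrote %d records to %s", len(records), self.output_path)

In a DAG this would be instantiated like any other task, e.g. ApiToFileOperator(task_id="extract_rates", endpoint="https://api.example.com/rates", output_path="/data/raw/rates.json").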
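
A minimal PySpark Structured Streaming sketch of the Kafka pattern referenced above; the broker address, topic, schema, and sink paths are hypothetical placeholders.

    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.types import (
        StructType, StructField, StringType, DoubleType, TimestampType,
    )

    spark = SparkSession.builder.appName("txn-stream").getOrCreate()

    # Hypothetical shape of a transaction message.
    schema = StructType([
        StructField("txn_id", StringType()),
        StructField("account_id", StringType()),
        StructField("amount", DoubleType()),
        StructField("txn_ts", TimestampType()),
    ])

    # Read from Kafka and parse each message value as JSON.
    stream = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "transactions")
        .load()
        .select(F.from_json(F.col("value").cast("string"), schema).alias("txn"))
        .select("txn.*")
    )

    # Append parsed records to a (hypothetical) silver-layer sink.
    query = (
        stream.writeStream
        .format("parquet")
        .option("path", "/lake/silver/transactions/")
        .option("checkpointLocation", "/lake/_checkpoints/transactions/")
        .outputMode("append")
        .start()
    )
    query.awaitTermination()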
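
A minimal sketch of the watermark-driven delta extraction pattern described above, using pyodbc against SQL Server. The connection string, control table, and column names are hypothetical placeholders.

    import pyodbc

    SRC = ("DRIVER={ODBC Driver 17 for SQL Server};"
           "SERVER=src;DATABASE=bank;Trusted_Connection=yes")  # hypothetical

    conn = pyodbc.connect(SRC)
    cur = conn.cursor()

    # 1. Read the last successful watermark for this source from the control table.
    cur.execute("SELECT last_watermark FROM etl_control WHERE source_name = ?",
                "transactions")
    last_watermark = cur.fetchone()[0]

    # 2. Extract only the rows that changed since that watermark.
    cur.execute(
        "SELECT txn_id, account_id, amount, modified_at "
        "FROM dbo.transactions WHERE modified_at > ?",
        last_watermark,
    )
    rows = cur.fetchall()

    # 3. Load rows downstream (omitted), then advance the watermark.
    new_watermark = max(r.modified_at for r in rows) if rows else last_watermark
    cur.execute("UPDATE etl_control SET last_watermark = ? WHERE source_name = ?",
                new_watermark, "transactions")
    conn.commit()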

Environment: Azure Data Factory, Azure Databricks, Azure Data Lake, Blob Storage, Google Ads API, Spark, SQL, Hive, Tableau, Kafka, Jenkins, Scala, SSIS.