| Shr.V - Senior AI & Azure Data Engineer |
| [email protected] |
| Location: Chicago, Illinois, USA |
| Relocation: NO |
| Visa: GC |
| Resume file: Shravan_Azure_AI_Data_Engineer_1781534834012.docx |
|
Name: Shravan P
GC - Only C2C roles please
Azure / AI Data Engineer

Professional Summary
Senior Azure AI & Data Engineering professional with 14+ years of experience designing and delivering enterprise-scale cloud data platforms across banking, insurance, healthcare, and energy domains.
Proven expertise in architecting modern lakehouse solutions leveraging Azure Databricks, ADLS Gen2, Delta Lake, and Snowflake to support scalable analytics and AI workloads.
Extensive experience in building end-to-end data pipelines including ingestion, transformation, orchestration, and consumption layers for real-time and batch processing systems.
Strong hands-on experience with Azure OpenAI and Generative AI solutions, including RAG architectures, prompt engineering, and enterprise AI integrations.
Expertise in designing and implementing vector search and semantic retrieval systems using Azure AI Search for intelligent knowledge discovery.
Proficient in developing distributed data processing frameworks using PySpark with performance optimization techniques such as partitioning, caching, and adaptive query execution.
Deep experience in Snowflake data warehousing, including dimensional modeling, secure data sharing, RBAC, and performance tuning for enterprise reporting workloads.
Skilled in implementing DBT-based transformation frameworks, enabling modular, testable, and version-controlled data modeling aligned with modern data engineering practices.
Strong experience in building and managing workflow orchestration pipelines using Apache Airflow for complex, multi-stage data processing ecosystems.
Hands-on experience with streaming and event-driven architectures using Kafka, Azure Event Hubs, Event Grid, and Spark Structured Streaming.
Expertise in implementing Change Data Capture (CDC) patterns for incremental data processing and near real-time data synchronization.
Proven ability to design and implement enterprise-grade data governance frameworks using Azure Purview, RBAC, and data security best practices.
Experience in developing secure and compliant data platforms aligned with industry standards such as HIPAA, financial regulations, and enterprise security policies.
Strong knowledge of DevOps and Infrastructure as Code (IaC) using Terraform, Azure DevOps, and GitHub Actions for automated deployments and CI/CD pipelines.
Proficient in building API-driven data services and integrating data platforms with enterprise applications using RESTful architectures.
Extensive experience in data migration and modernization, transforming legacy ETL systems into scalable cloud-native architectures.
Expertise in designing high-performance data models, including star schemas and data marts for analytical and reporting use cases.
Strong experience in monitoring and observability frameworks, leveraging Azure Monitor, Log Analytics, and custom alerting solutions.
Proven track record of optimizing data processing performance and cost efficiency, reducing execution time and cloud resource utilization.
Experience in implementing AI governance and responsible AI frameworks, including prompt validation, moderation, and model evaluation pipelines.
Skilled in working with cross-functional teams, collaborating with business stakeholders, data scientists, and engineering teams to deliver data-driven solutions.
Strong experience in client-facing roles, including requirement gathering, solution design, and architecture presentations to stakeholders.
Demonstrated ability to lead technical design discussions and mentor junior engineers, ensuring best practices and high-quality deliverables.
Familiarity with multi-cloud concepts, with foundational exposure to AWS services and comparative cloud architecture patterns.
Results-driven professional with a strong focus on delivering scalable, reliable, and business-aligned data and AI solutions in fast-paced enterprise environments.

Technical Skills
Category: Skills
Cloud Platform: Microsoft Azure (Primary), Azure AI Services, Azure OpenAI, Azure AI Studio, Azure AI Search, Azure ML
Data Engineering: Azure Data Factory (ADF), Azure Databricks, ADLS Gen2, Azure Synapse Analytics, Microsoft Fabric
Data Warehousing: Snowflake, Azure Synapse (Dedicated & Serverless), Delta Lake
Big Data & Processing: Apache Spark, PySpark, Spark SQL, Distributed Data Processing
Programming Languages: Python, PySpark, SQL, T-SQL, Scala
AI / GenAI: Retrieval-Augmented Generation (RAG), Prompt Engineering, Semantic Kernel, Prompt Flow, Vector Databases
Streaming & Messaging: Apache Kafka, Azure Event Hubs, Event Grid, Spark Structured Streaming
Orchestration Tools: Apache Airflow, Azure Data Factory Pipelines
Data Transformation: DBT (Data Build Tool), ETL/ELT Frameworks
DevOps & CI/CD: Azure DevOps, GitHub Actions, CI/CD Pipelines
Infrastructure as Code: Terraform
Data Governance & Security: Azure Purview, RBAC, Managed Identity, Key Vault, Defender for Cloud
Monitoring & Logging: Azure Monitor, Log Analytics, Custom Alerting
Databases: Snowflake, SQL Server, Oracle
API & Integration: REST APIs, API Integration, Microservices Integration
Architecture: Data Lakehouse, Medallion Architecture, Data Modeling (Star Schema), Distributed Systems
Compliance & Security: HIPAA, Financial Data Compliance, Secure Data Architecture
Version Control: Git, GitHub
Methodologies: Agile, Scrum
Multi-Cloud Exposure: AWS (Basic - S3, Glue, Redshift - Conceptual/Comparative Knowledge)

Velera
Senior AI & Azure Data Engineer
Duration: Aug 2021 - Present
Responsibilities
Architected an enterprise-scale Azure Lakehouse platform integrated with Snowflake, enabling unified data access across multiple business domains and improving analytics scalability.
Designed and implemented end-to-end data pipelines supporting both batch and real-time ingestion using Azure Data Factory, Event Hubs, and Databricks.
Built modular DBT transformation frameworks aligned with medallion architecture (Bronze, Silver, Gold), enabling governed and reusable data models.
Developed high-performance PySpark pipelines processing large-scale datasets (TB-level) using partitioning, caching, and adaptive query optimization techniques.
Implemented Apache Airflow orchestration workflows coordinating complex dependencies across Spark jobs, Snowflake transformations, and ingestion pipelines.
Designed and implemented Retrieval-Augmented Generation (RAG) architecture using Azure OpenAI and Azure AI Search for enterprise knowledge retrieval systems.
Built vector indexing and semantic search capabilities to enhance intelligent document and knowledge discovery across enterprise datasets.
Developed AI experimentation workflows using Azure AI Studio, enabling prompt lifecycle management, evaluation, and optimization.
Integrated Azure OpenAI function calling to orchestrate enterprise APIs and enable structured AI-driven workflows.
Designed document intelligence pipelines using Azure AI Document Intelligence to extract structured insights from unstructured documents.
Established Responsible AI frameworks, including prompt validation, moderation layers, and output evaluation to ensure compliance and ethical AI usage.
Implemented CI/CD pipelines using Azure DevOps and GitHub Actions, automating data pipeline deployments and infrastructure provisioning.
Designed data governance and security frameworks using Azure Purview, RBAC, Key Vault, and Managed Identity.
Collaborated with business stakeholders and cross-functional teams to gather requirements, define architecture, and deliver scalable solutions.
Led technical design discussions and mentoring sessions, ensuring adherence to best practices and improving overall engineering quality.
Led the design of a scalable enterprise data platform architecture, aligning data engineering and AI initiatives with long-term business strategy and enabling seamless integration across multiple data domains.
Established data standardization and modeling guidelines, ensuring consistency, reusability, and governance across all layers of the lakehouse architecture.
Built reusable data engineering frameworks and accelerators, reducing development effort and enabling faster onboarding of new data sources and use cases.
Collaborated closely with product owners, data scientists, and business stakeholders to translate complex requirements into scalable technical solutions and AI-driven use cases.
Defined and implemented enterprise-level monitoring, alerting, and observability standards, ensuring proactive issue detection and reliability across critical data and AI pipelines.
Environment: Microsoft Azure (ADF, ADLS Gen2, Azure Databricks, Synapse Analytics, Azure OpenAI, Azure AI Search, Azure AI Studio, Azure ML), Snowflake, Delta Lake, DBT, Apache Airflow, Event Hubs, Event Grid, Kafka, PySpark, Python, Terraform, Azure DevOps, GitHub Actions, Azure Purview, Key Vault, RBAC, Azure Monitor.

GEICO
Senior Data Engineer
Duration: Oct 2018 - Jul 2021
Roles & Responsibilities
Designed and implemented enterprise data ingestion frameworks using Azure Data Factory supporting batch, incremental, and CDC-based data loads.
Developed scalable PySpark transformation pipelines in Azure Databricks for processing large insurance datasets across claims and underwriting systems.
Built Snowflake data warehouse models using star schema design to support actuarial analysis and regulatory reporting.
Implemented Apache Airflow orchestration pipelines managing complex ETL workflows with dependency handling and retry mechanisms.
Designed secure Snowflake RBAC models and secure views, ensuring compliance with enterprise data access policies.
Integrated Azure Private Link and network security configurations to ensure secure data movement across services.
Leveraged Azure Synapse Serverless SQL for ad-hoc analytics directly on data lake storage.
Developed Python-based monitoring and alerting utilities, improving operational visibility and reducing pipeline failures.
Implemented data quality validation frameworks, ensuring accuracy and consistency across data pipelines.
Designed incremental and CDC processing mechanisms, reducing processing time and improving efficiency.
Collaborated with data analysts and business users to design optimized data models for reporting and analytics.
Optimized query performance in Snowflake and Databricks, reducing execution time and improving system efficiency.
Implemented DevOps best practices, including CI/CD pipelines and version control using Git.
Participated in architecture discussions and solution design, contributing to enterprise data strategy.
Provided production support and troubleshooting, ensuring high availability and reliability of data pipelines.
Designed robust data ingestion strategies supporting multiple data sources including transactional systems, APIs, and third-party feeds with consistent and reliable data flow.
Established data lineage and traceability frameworks, enabling better auditability and transparency across the data lifecycle.
Implemented data validation and reconciliation processes to ensure high data quality and consistency across downstream analytical systems.
Partnered with business and analytics teams to design optimized data models supporting reporting, dashboards, and advanced analytics use cases.
Contributed to enterprise data platform modernization initiatives, helping transition legacy systems into scalable cloud-native architectures.
Environment: Microsoft Azure (Azure Data Factory, Azure Databricks, ADLS Gen2, Azure Synapse Analytics), Snowflake, Apache Airflow, PySpark, Python, SQL, Azure Private Link, Azure DevOps, Git, Azure Monitor, Log Analytics, Data Governance (RBAC), Serverless SQL Pools.

Anthem
Data Engineer
Duration: Oct 2017 - Sep 2018
Responsibilities
Migrated legacy ETL systems to Azure-native architecture, modernizing healthcare data platforms.
Developed Azure Data Factory pipelines for ingestion and transformation of healthcare datasets.
Built Snowflake-based reporting data warehouse, supporting claims, eligibility, and provider analytics.
Implemented DBT transformation frameworks to standardize data modeling and testing processes.
Designed Apache Airflow DAGs for orchestrating complex ETL workflows.
Developed PySpark-based transformation logic for processing large-scale healthcare data.
Implemented PHI-compliant data pipelines, ensuring adherence to healthcare regulations.
Configured Azure security services, including Managed Identity, Key Vault, and secure networking.
Designed Azure Synapse hybrid analytics solutions integrating structured and semi-structured data.
Developed data validation and reconciliation frameworks to ensure data integrity.
Implemented data governance policies using Azure Policy and Defender for Cloud.
Optimized data pipelines for performance and cost efficiency, reducing processing time.
Collaborated with business stakeholders and healthcare analysts to deliver reporting solutions.
Provided support for production pipelines, ensuring reliability and minimal downtime.
Documented technical architecture and data flows, ensuring knowledge sharing and compliance.
Designed and implemented secure data handling practices for sensitive healthcare data, ensuring compliance with enterprise and regulatory requirements.
Built standardized ETL frameworks improving consistency, maintainability, and scalability of healthcare data processing pipelines.
Collaborated with compliance and governance teams to align data solutions with healthcare data privacy standards.
Developed data integration solutions combining structured and unstructured healthcare data for comprehensive analytics.
Supported data platform enhancements and optimizations, improving reliability and maintainability of critical data workflows.
Environment: Microsoft Azure (ADF, ADLS Gen2, Azure Databricks, Azure Synapse), Snowflake, DBT, Apache Airflow, PySpark, Python, Azure Policy, Defender for Cloud, Key Vault, Managed Identity, Azure Monitor, Secure Networking, Healthcare Compliance (PHI/HIPAA).

Regions Bank
Data Engineer
Duration: Jan 2015 - Sep 2017
Responsibilities
Led migration of on-prem ETL systems to Azure cloud, enabling scalable and cost-effective data processing.
Developed Azure Data Factory pipelines for ingestion of financial transaction data.
Built Snowflake data marts supporting AML, Basel, and regulatory reporting requirements.
Designed Apache Airflow workflows for orchestrating multi-stage ETL processes.
Developed PySpark pipelines for processing high-volume financial datasets.
Implemented RBAC-based security models, ensuring compliance with banking standards.
Designed event-driven architectures using Azure Event Grid for near real-time data ingestion.
Developed Python scripts for automation, improving operational efficiency.
Implemented data quality frameworks, ensuring accuracy of financial data.
Optimized SQL queries and data models, improving reporting performance.
Built monitoring and logging solutions using Azure Log Analytics.
Collaborated with risk and compliance teams to support regulatory reporting.
Provided technical support and troubleshooting for ETL workflows.
Participated in design and architecture discussions, contributing to modernization initiatives.
Ensured data security and governance compliance, aligning with enterprise policies.
Designed scalable data ingestion frameworks for financial systems ensuring reliable and consistent data movement across multiple sources.
Worked closely with risk, compliance, and audit teams to ensure data solutions met regulatory reporting requirements.
Implemented data governance controls and access policies, ensuring secure and controlled access to sensitive financial data.
Built data transformation frameworks supporting complex business logic and financial calculations.
Participated in data architecture planning and cloud migration strategy discussions, contributing to long-term platform evolution.
Environment: Microsoft Azure (Azure Data Factory, ADLS Gen2, Azure Databricks), Snowflake, Apache Airflow, PySpark, Python, Azure Event Grid, RBAC Security, Azure Log Analytics, SQL Server, Financial Data Compliance, Git.

Baker Hughes
ETL Developer
Duration: Aug 2010 - Dec 2014
Responsibilities
Developed ETL workflows using Informatica PowerCenter, supporting large-scale data warehouse systems.
Designed dimensional data models, including fact and dimension tables for reporting.
Implemented Slowly Changing Dimensions (SCD Type 1 & 2) for historical data tracking.
Optimized SQL queries in Oracle and SQL Server, improving batch performance.
Developed data validation and reconciliation frameworks, ensuring data accuracy.
Built Python scripts for automation, reducing manual effort in data processing.
Designed batch scheduling workflows using Control-M, ensuring job reliability.
Developed data integration pipelines, handling multiple data sources.
Implemented error handling and logging mechanisms, improving pipeline stability.
Collaborated with business analysts to gather requirements and deliver solutions.
Supported production deployments and issue resolution, ensuring system stability.
Performed performance tuning and optimization, reducing processing time.
Maintained documentation for ETL processes and workflows, ensuring knowledge transfer.
Ensured data quality and consistency across systems, supporting reporting accuracy.
Participated in team meetings and technical discussions, contributing to continuous improvement.
Designed and developed enterprise ETL solutions supporting integration of data from multiple upstream systems into centralized data warehouses.
Collaborated with cross-functional teams to gather requirements and translate them into scalable ETL workflows.
Implemented data validation and auditing mechanisms, ensuring accuracy and consistency across reporting systems.
Supported data warehouse enhancements and optimization initiatives, improving overall system performance and reliability.
Contributed to process improvements and automation initiatives, reducing manual effort and increasing operational efficiency.
Environment: Informatica PowerCenter, Oracle, SQL Server, Python, Control-M, UNIX Shell Scripting, Data Warehousing (Star Schema), ETL Frameworks, Data Validation & Reconciliation, Batch Processing Systems.

Education
Bachelor of Technology (B.Tech) in Computer Science Engineering |