| Sri Jana - Data Engineer |
| [email protected] |
| Location: Columbus, Ohio, USA |
| Relocation: Yes |
| Visa: OPT EAD |
| Resume file: Sri_Jana_updated_resume_1769549942084.docx |
|
Sri Jana
+1 (469) 737-0562 | [email protected] | LinkedIn

PROFESSIONAL SUMMARY
Data Engineer with 6 years of experience delivering end-to-end data solutions across cloud and analytics environments. Skilled in designing scalable ELT pipelines, modernizing analytics platforms, and enabling reliable data delivery in enterprise-scale settings. Hands-on experience building and migrating data pipelines using Snowflake, DBT, Fivetran, ADF, Informatica, and cloud platforms across Azure, AWS, and Databricks. Strong background in data mapping, regression testing, and model validation, including record-level comparisons and schema checks. Adept at partnering with product owners, BI teams, and engineering leads to translate business requirements into technical deliverables. Experienced in Agile delivery, participating in sprint planning, backlog refinement, daily standups, retrospectives, and cross-team coordination using Jira and Confluence. Excel SME with experience delivering executive-ready dashboards, reports, and presentations in Excel and PowerPoint to communicate analytical findings, model performance, and platform readiness. Known for improving data quality, automating manual validation processes, documenting complex pipelines, and driving continuous improvement across data teams. Proficient in SQL across SQL Server, Snowflake, Azure SQL, and AWS Redshift, with a strong understanding of data governance, cloud migration best practices, and modern analytics workflows.

TECHNICAL SKILLS
Languages: Python (Pandas, NumPy), SQL (ANSI, T-SQL), PL/SQL, Java, Shell, C, C#, PySpark
Databases: Snowflake, SQL Server, Oracle, PostgreSQL, MySQL, MongoDB, Redshift, Databricks
ETL/ELT: Fivetran, DBT, ADF, Snowpark, PySpark, Informatica, Databricks, AWS S3, Lambda, Spark
Visualization: Power BI (DAX, RLS), Tableau, QuickSight, Excel (Pivot, Power Query, VBA)
DevOps & Orchestration: Git, GitHub, Docker, Kubernetes, Airflow, REST APIs, Jira
Security & Governance: Microsoft Entra (SSO), RBAC, GDPR
Tools: SSMS, Azure Data Studio, Jupyter, Confluence, Visual Studio

PROFESSIONAL EXPERIENCE

Circle K | Data Engineer
Columbus, OH | Apr 2025 - Present
- Led the end-to-end migration of enterprise data pipelines from SQL Server, Azure Blob Storage, Azure Databricks, and ADF into Snowflake, redesigning ingestion and transformation workflows using Fivetran and DBT to improve reliability, flexibility, and long-term maintainability.
- Oversaw the full migration of historical datasets (including those previously downloaded via Power Automate scripts) into Snowflake using optimized Fivetran connectors, and set up ongoing ingestion for live ATP inventory data to ensure real-time data availability and accuracy.
- Migrated large volumes of marketing data from both direct source systems and legacy Blob files into Snowflake using tuned Fivetran pipelines to support analytics, segmentation, and reporting.
- Ingested data from multiple SQL Server sources using Fivetran HVA connectors, building and modeling new datasets to support BI and analytics reporting needs across the business.
- Supported the BI team's transition from legacy reporting tools to Microsoft Fabric by optimizing dataset design, enhancing refresh performance, reducing compute consumption, and streamlining query patterns.
- Designed and modeled datasets consumed by business and analytics teams, ensuring alignment with stakeholder requirements and downstream reporting expectations.
- Built a clear, layered Snowflake (Medallion) architecture with raw, staging, and consumption layers to improve data separation, lineage visibility, and warehouse performance.
- Created modular DBT pipelines using reusable macros, source freshness checks, and data quality tests to improve the reliability and maintainability of transformation logic.
- Designed dimensional models for campaign performance and ML scoring dashboards using star schema principles and incremental logic to support scalable reporting (illustrated in the sketch following this role).
- Migrated Visual FoxPro cloud backup datasets into Snowflake and made them available to users through interactive Excel-based analysis tools.
- Developed custom connectors using the Fivetran SDK to support incremental loading, schema evolution, and automated retry logic on failure.
- Replaced legacy notebook-based ETL flows with Snowflake-native DBT ELT pipelines, significantly improving version control, deployment consistency, and CI/CD integration.
- Implemented automation workflows integrating Snowflake with Workato, reducing manual effort and streamlining operational processes.
- Created Azure AD groups and configured secure, table-level access for technical users, enabling controlled self-service querying while maintaining governance standards.
- Identified unused downstream tables, stale views, and redundant DBT models, and performed targeted cleanup to reduce clutter, improve performance, and strengthen Snowflake data governance.
- Refactored and optimized DBT models to improve traceability, readability, and warehouse performance.
- Conducted a deep analysis of Fivetran MAR spending and identified a marketing connector loading ~1.34M rows/day and contributing ~95% of total Fivetran costs; reduced the connector's daily load to ~110k rows by pruning unnecessary schema objects, achieving ~50% cost reduction for major connectors and an overall 24-25% reduction in total Fivetran costs.
- Tuned Snowflake workloads using clustering keys, result caching, and right-sized virtual warehouses, lowering query latency by 41% and reducing compute spend.
- Troubleshot and resolved ingestion pipeline issues across Blob, SQL Server, ADF, and Fivetran sources, ensuring stable data replication and consistent pipeline reliability.
- Integrated Snowflake with Microsoft Entra SSO and configured RBAC for secure, scalable access across Power BI and technical teams.
- Documented migration strategy, data flows, lineage, and DBT transformation logic in Confluence; created runbooks, validation documents, and flow diagrams for engineering, QA, and business users.
- Deployed DBT models using Git-based CI/CD pipelines, ensuring consistent, repeatable releases across DEV, QA, and PROD environments.
- Mentored an intern on DBT development, merge strategies, RBAC implementation, data modeling, and CI/CD deployment processes.
- Actively participated in Agile ceremonies including standups, retrospectives, and sprint planning, collaborating closely with QA, BI, and data engineering teams.
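To make the layered Snowflake/DBT approach above concrete, the following is a minimal sketch of an incremental DBT model of the kind the bullets describe. It is illustrative only: the model, source, and column names (fct_campaign_events, stg_marketing__campaign_events, loaded_at) are hypothetical, not taken from the actual Circle K project.

```sql
-- models/marts/fct_campaign_events.sql (hypothetical consumption-layer model)
-- Incremental merge keyed on event_key, clustered on a common filter column in Snowflake.
{{ config(
    materialized='incremental',
    unique_key='event_key',
    cluster_by=['campaign_id']
) }}

select
    md5(concat(campaign_id, '|', event_id)) as event_key,
    campaign_id,
    event_id,
    event_date,
    spend_usd,
    loaded_at
from {{ ref('stg_marketing__campaign_events') }}   -- staging layer built on the Fivetran raw schema
{% if is_incremental() %}
  -- On incremental runs, pick up only rows newer than what already exists in the target table.
  where loaded_at > (select max(loaded_at) from {{ this }})
{% endif %}
```

In a project of this shape, uniqueness and not-null tests plus source freshness checks would typically live in the accompanying schema.yml rather than in the model file itself.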
Circle K | Business Data Analyst
Tempe, Arizona | Sep 2023 - Mar 2025
- Developed and coordinated PySpark-based pipelines in Azure Databricks to transform large-scale pricing and marketing datasets, integrated with Azure Data Lake and Power BI.
- Implemented data validation and QA logic directly in Databricks notebooks, streamlining regression testing and reducing dependency on manual SQL scripts.
- Defined and maintained over 70 Jira tickets, including user stories, epics, and enhancements, improving sprint visibility and delivery consistency by 98%.
- Collaborated with business stakeholders and lead developers to gather and translate complex pricing requirements into detailed functional specifications for an Excel-based retail pricing solution.
- Facilitated cross-functional requirement elicitation sessions, producing comprehensive use case documents, process flow diagrams, and change requests to support iterative Agile development.
- Automated repetitive tasks (raw data file handling, validation checks, job scheduling, and refreshes), improving pipeline stability and reducing manual intervention.
- Authored and executed advanced SQL validation scripts in Azure Data Studio to ensure data integrity, support QA test cases, and troubleshoot production issues during deployment and post-release (a representative check is sketched after this role).
- Documented full data pipeline logic and integration workflows spanning Azure Databricks, Data Lake, Power BI, and VBA, maintaining traceability for future updates and change management.
- Designed and published 13+ interactive Power BI dashboards focused on pricing KPIs, delivering actionable insights for leadership decision-making and pricing optimization.
- Automated pricing model refresh workflows using PySpark and Azure Data Lake, significantly improving consistency and reducing manual intervention.
- Delivered recurring analytical reports and ad hoc financial insights using advanced DAX calculations to track performance by category, region, and time.
- Acted as the primary liaison between data engineering, category management, and IT teams, ensuring end-to-end alignment between technical execution and strategic pricing goals.
- Partnered with data science teams to deliver clean, structured datasets that supported model training and improved predictive accuracy in pricing recommendations.
- Enforced robust data governance and access controls across Snowflake and Azure platforms, adhering to best practices and compliance standards for secure analytics operations.
- Produced ongoing pricing analyses supporting decisions for over $137 million in inventory, contributing to a 5.7% lift in promotional margin impact.
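As a concrete illustration of the kind of SQL validation scripts mentioned above, here is a minimal, hypothetical regression check. The schema and column names (src.pricing_raw, tgt.pricing_curated, price_date, item_id, store_id, retail_price) are placeholders, not the actual Circle K tables.

```sql
-- 1. Row-count reconciliation between a raw pricing source and its curated target, by business date.
select
    coalesce(s.price_date, t.price_date)               as price_date,
    coalesce(s.src_rows, 0)                            as src_rows,
    coalesce(t.tgt_rows, 0)                            as tgt_rows,
    coalesce(s.src_rows, 0) - coalesce(t.tgt_rows, 0)  as row_diff
from (select price_date, count(*) as src_rows from src.pricing_raw     group by price_date) s
full outer join
     (select price_date, count(*) as tgt_rows from tgt.pricing_curated group by price_date) t
  on s.price_date = t.price_date
where coalesce(s.src_rows, 0) <> coalesce(t.tgt_rows, 0);

-- 2. Key-integrity and sanity checks on the curated target; any non-zero count fails the QA gate.
select count(*) as bad_rows
from tgt.pricing_curated
where item_id is null
   or store_id is null
   or retail_price < 0;
```

Checks like these can also run as Databricks notebook cells, which matches the notebook-based regression testing the bullets describe.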
Southern Arkansas University | Graduate Assistant
Magnolia, Arkansas | August 2022 - August 2023
- Built an AI-based mobile image generator using SwiftUI and OpenAI's DALL-E API for text-to-image conversion.
- Performed EDA on 10GB+ datasets using pandas, NumPy, and NLP techniques (sentiment analysis, stopword removal) to support cybersecurity research.
- Gained hands-on knowledge of containerized network security systems using Docker, Linux, and Python, enabling real-time packet inspection with tools such as Wireshark and Scapy.
- Assisted over 70 students in learning and applying web development fundamentals (HTML, CSS, JavaScript, PHP) through lab support, project troubleshooting, and skill-building sessions.
- Reviewed and graded 200+ student assignments, providing constructive feedback and helping identify learning gaps in technical coursework.

Cognizant | Programmer Analyst
Bangalore, India | October 2020 - August 2022
- Designed and deployed Informatica PowerCenter ETL workflows to support Apple's Order Management System (OMS), sourcing operational data from Oracle, PostgreSQL, and flat files into a centralized warehouse for order processing and inventory visibility.
- Tuned PL/SQL queries and materialized views used in OMS-related reporting, improving billing and order-processing dashboard performance by 60%.
- Led developer testing for staging and final-layer OMS tables, creating SQL test scripts to validate row counts, detect null anomalies, and verify SCD behavior across data refresh cycles.
- Designed star-schema-aligned dimension and fact tables, including SCD Type II logic, to support long-term OMS reporting, change tracking, and downstream operational analytics (a generic SCD Type II pattern is sketched after this role).
- Worked with existing AWS services (S3, Redshift) and packaged ingestion scripts as AWS Lambda functions to improve scalability and reduce infrastructure overhead.
- Automated daily OMS job flows using shell scripting and Informatica session logs, increasing reliability and significantly reducing manual run effort.
- Built Power BI dashboards to visualize order processing metrics, shipment timelines, and inventory KPIs across four major warehouse hubs supporting Apple's distribution network.
- Collaborated in Agile/Jira sprints, providing estimates, supporting burn-down tracking, participating in daily standups, and contributing to sprint reviews and retrospectives.
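The SCD Type II logic referenced in the Cognizant bullets follows a well-known pattern. The sketch below shows the generic two-step load (expire changed rows, then insert new versions) using hypothetical dim_customer and stg_customer tables rather than the real OMS schema.

```sql
-- Step 1: expire the current dimension row for any customer whose tracked attributes changed.
update dim_customer d
   set effective_end_date = current_date - 1,
       is_current         = 'N'
 where d.is_current = 'Y'
   and exists (
         select 1
           from stg_customer s
          where s.customer_id = d.customer_id
            and (s.customer_name    <> d.customer_name
              or s.customer_segment <> d.customer_segment)
       );

-- Step 2: insert a fresh "current" version for changed customers and for brand-new customers.
insert into dim_customer
       (customer_id, customer_name, customer_segment,
        effective_start_date, effective_end_date, is_current)
select  s.customer_id, s.customer_name, s.customer_segment,
        current_date, date '9999-12-31', 'Y'
  from  stg_customer s
  left  join dim_customer d
    on  d.customer_id = s.customer_id
   and  d.is_current  = 'Y'
 where  d.customer_id is null;   -- no open row remains: either a new customer or one expired in step 1
```

Verifying SCD behavior, as described in the testing bullet above, usually means confirming that each business key has exactly one is_current = 'Y' row and no overlapping effective-date ranges.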
EDUCATION

Southern Arkansas University | Magnolia, Arkansas
Master of Science, Computer and Information Science | 4.0 CGPA | May 2022 - August 2023
Coursework: Data Analytics, Advanced Programming, Software Engineering, Computer Networking, Machine Learning

Mahatma Gandhi Institute of Technology | Hyderabad, India
Bachelor of Technology, Electrical and Electronics Engineering | 3.7 CGPA | June 2017 - May 2021
Relevant Coursework: Data Structures, C, OOP in Java, Linear Algebra, MATLAB/Simulink, Probability and Statistics

CERTIFICATIONS
- DP-300: Administering Microsoft Azure SQL Solutions (DBA), Udemy, Aug 2024
- AWS Cloud Practitioner and Foundation, AWS, Feb 2024
- SQL Programming, LinkedIn Learning, Aug 2023
- Microsoft SQL Server 2022 Essentials, LinkedIn Learning, Aug 2023
- Oracle SQL Certification, Udemy, Mar 2022
- Power BI Essentials, LinkedIn Learning, Dec 2021
- Complete PL/SQL, Udemy, Aug 2021