
ALISHA KADIRI - Sr. Data Engineer
Email: [email protected]
Phone: +1 (609) 566-4637
Location: Tallahassee, Florida, USA
Relocation: OPEN
Visa: H1B
LinkedIn:


SUMMARY:
Sr. Data Engineer with 9 years of experience in building scalable ETL/ELT pipelines across Azure, AWS, and hybrid environments, with a proven ability to gather business requirements, define KPIs, and document workflows using BRDs, FRDs, and RTMs in both Waterfall and Agile SDLCs.
Proficient in developing ADF pipelines with ARM templates for batch automation, job orchestration, and error handling.
Experienced in Databricks, PySpark, and Delta Lake for large-scale data processing and real-time enrichment.
Integrated on-prem data from SQL Server, DB2, and Cassandra into Azure SQL and Data Lake using secure patterns.
Led SSIS-to-cloud migrations using Azure Synapse, Glue, and event-based triggers for modernized data architecture.
Skilled in data modeling, SCD logic, and metadata-driven pipelines using Unity Catalog and enterprise documentation tools.
Built CI/CD pipelines using Azure DevOps, GitHub Actions, and REST APIs for version control and multi-stage deployment.
Troubleshot issues in Databricks clusters, integration runtimes, and VNet/storage configs across complex cloud environments.
Converted business needs into data flows using ERDs, UMLs, KPIs, and Agile documentation to support delivery planning and tracking.
Communicate complex insights clearly via dashboards, Confluence, stakeholder updates, and sprint demos; deliver executive-ready presentations with visual impact. Recognized for strong oral communication skills.
Certified Azure and AWS Data Engineer with an M.S. in Data Science and proven record of enterprise data project success.
Delivered GCP-based data integration solutions using BigQuery, Cloud Composer, Dataproc, Cloud Functions, and Pub/Sub to support dbt workflows, real-time ingestion, and automated reporting.
EDUCATION:
Bachelor's in Computer Science from JNTU-A, Tirupati, A.P., 2016

CERTIFICATIONS:
AWS Certified Data Engineer - Associate (2025)
Microsoft Certified: Azure Data Engineer Associate (2021, renewed 2025)

TECHNICAL SKILLS:
Programming & Query Languages: Python (Pandas, NumPy), SQL, PL/SQL, Java, Shell, R, JavaScript, C
Data Engineering Tools: Databricks, Apache Spark (PySpark), Delta Lake, Delta Live Tables, dbt, Airflow, ADF, Azure Synapse, Git, Jenkins, CI/CD
Cloud Platforms: Azure (Data Factory, Microsoft Fabric, Synapse, Data Lake, Key Vault, DevOps), AWS (S3, RDS, Glue, Lambda, Step Functions, Secrets Manager), GCP
Databases: Oracle, Azure SQL, SQL Server, Amazon RDS, PostgreSQL, MySQL, Snowflake, DynamoDB, DB2, Cassandra
Visualization & Reporting: Power BI, Tableau, Qlik Sense, Excel (Pivot Tables, Charts), Salesforce Dashboards, Confluence
Data Governance & Modeling: Unity Catalog, Data Lineage, Metadata Documentation, Data Mapping, ERD (MS Visio), UML, SCD Types
APIs & Integration Tools: REST APIs, Swagger, Postman, JSON, XML, Salesforce Data Loader, Optymyze
Machine Learning Concepts: Supervised & Unsupervised Learning, Regression, Classification, Clustering (K-Means, KNN), SVM, PCA, Time Series Forecasting, Transformers, LLMs, Model Evaluation, Hyperparameter Tuning

PROFESSIONAL EXPERIENCE:
Department of Environmental Protection of Florida, Tallahassee, FL Oct 2024 - Present
Role: Sr. Data Engineer
Responsibilities:
Develop and maintain end-to-end ETL pipelines using SQL and PL/SQL to load and export environmental data into Oracle for EPA reporting.
Engineer scalable workflows to ingest and merge field and lab datasets, streamlining compliance reporting across 6,950+ monitoring stations.
Detect anomalies and environmental trends using advanced SQL analytics, enabling proactive corrections and audit-ready datasets (a representative sketch follows this list).
Build and optimize regulatory SQL workflows, collaborating with data scientists, QA teams, and compliance managers to ensure reporting accuracy.
Perform geospatial processing in ArcGIS and R, extracting spatial patterns for visual impact assessments and EPA audits.
Tune Oracle queries and indexes to reduce pipeline latency by 40%.
Modernize legacy pipelines using Microsoft Fabric components including Data Factory, Synapse, and Lakehouse to streamline compliance reporting architecture.
Build unified reports in Power BI (Fabric) with direct lake access to visualize KPIs, station metrics, and real-time compliance indicators.
Document data flows, data lineage, and business rules across Fabric workloads to support governance, reusability, and audit transparency.
Drive ongoing infrastructure upgrades and team adoption of Fabric-native workflows, simplifying oper-ations and enhancing delivery speed.
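To illustrate the kind of anomaly screening described above, here is a minimal sketch in Python/pandas (the production logic runs in SQL; the input file, column names, window, and threshold are hypothetical placeholders):

```python
import pandas as pd

# Hypothetical extract of station readings: one row per (station_id, parameter, sample_date).
readings = pd.read_csv("station_readings.csv", parse_dates=["sample_date"])

def flag_anomalies(df: pd.DataFrame, window: int = 30, threshold: float = 3.0) -> pd.DataFrame:
    """Flag readings that deviate more than `threshold` rolling standard deviations
    from each station/parameter's trailing mean."""
    df = df.sort_values(["station_id", "parameter", "sample_date"]).copy()
    grouped = df.groupby(["station_id", "parameter"])["value"]
    rolling_mean = grouped.transform(lambda s: s.rolling(window, min_periods=5).mean())
    rolling_std = grouped.transform(lambda s: s.rolling(window, min_periods=5).std())
    df["z_score"] = (df["value"] - rolling_mean) / rolling_std
    df["is_anomaly"] = df["z_score"].abs() > threshold
    return df

flagged = flag_anomalies(readings)
print(flagged.loc[flagged["is_anomaly"],
                  ["station_id", "parameter", "sample_date", "value", "z_score"]])
```

The same per-station rolling statistics translate directly into SQL window functions for the audit-ready datasets mentioned above.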

UPS (Washington DC, Maryland) Mar 2021 - Dec 2022
Role: Data Engineer
Responsibilities:
Built scalable ETL pipelines using Python and advanced SQL to consolidate CRM, web, and sales data into a unified Snowflake warehouse.
Designed a dimensional model in Snowflake with clustering and materialized views to optimize performance for customer segmentation analytics.
Developed modular ELT workflows using dbt and Git, enabling versioned transformations with auto-documentation and reusable macros.
Ingested raw clickstream and transaction data from AWS S3, processed using PySpark on EMR to support daily customer insight reporting.
Orchestrated multi-stage data pipelines using Apache Airflow, incorporating SLA monitoring, failure alerts, and dependency management (a representative sketch follows this list).
Integrated AWS Glue, Lambda, and Athena with Snowflake to build a hybrid pipeline for batch and on-demand processing needs.
Modeled Slowly Changing Dimensions (SCD Type 2) and surrogate keys in Snowflake to enable full historical tracking of customer profiles.
Automated CI/CD for dbt transformations with GitHub Actions, incorporating lint checks, model testing, and staging deployments.
Deployed data quality microservices using Docker, and configured isolated validation environments using Kubernetes (EKS) clusters.
Collaborated with internal teams to design and implement a customer 360 solution using AWS and Snowflake for unified customer analytics.
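A minimal sketch of how an Airflow DAG of this shape could wire up scheduling, retries, SLA monitoring, and failure alerts; the DAG name, scripts, and alert address are hypothetical placeholders, not the actual pipeline:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

# Default task behavior: retries plus an SLA so late runs surface as alerts.
default_args = {
    "owner": "data-engineering",
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
    "sla": timedelta(hours=2),
    "email_on_failure": True,
    "email": ["[email protected]"],  # hypothetical alert address
}

with DAG(
    dag_id="customer_insights_daily",   # hypothetical DAG name
    start_date=datetime(2021, 3, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    extract = BashOperator(task_id="extract_s3_clickstream",
                           bash_command="python extract_clickstream.py")  # hypothetical script
    transform = BashOperator(task_id="run_dbt_models",
                             bash_command="dbt run --select customer_360")
    load_marts = BashOperator(task_id="refresh_snowflake_marts",
                              bash_command="python refresh_marts.py")     # hypothetical script

    extract >> transform >> load_marts  # dependency chain for the daily run
```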
Project 2: Coupa
Built modular ELT workflows in dbt and BigQuery, transforming procurement and spend data using reusable macros and CI-integrated tests.
Ingested raw JSON/CSV files from Google Cloud Storage, processed using PySpark on Dataproc to support daily reporting pipelines.
Orchestrated workflows with Cloud Composer (Airflow), integrating Cloud Functions and Pub/Sub for event-driven ingestion (a representative sketch follows this list).
Modeled SCD Type 2 dimensions in BigQuery for supplier hierarchies and invoice history with role-based access controls.
Automated deployments using Cloud Build and GitHub Actions, supporting multi-environment releases and dbt model validations.
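A minimal sketch of the event-driven ingestion pattern above: a Pub/Sub-triggered Cloud Function that loads a newly landed GCS file into BigQuery. The message format, table name, and bucket path are hypothetical assumptions, not the actual Coupa configuration:

```python
import base64
import json

from google.cloud import bigquery

bq_client = bigquery.Client()

# Hypothetical target table for landed procurement files.
TARGET_TABLE = "analytics_raw.coupa_invoices"

def load_gcs_file(event, context):
    """Background Cloud Function triggered by a Pub/Sub message carrying the
    GCS URI of a newly landed file; appends it into BigQuery."""
    message = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    uri = message["gcs_uri"]  # e.g. "gs://hypothetical-bucket/invoices/2022-01-01.json"

    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
        autodetect=True,
    )
    load_job = bq_client.load_table_from_uri(uri, TARGET_TABLE, job_config=job_config)
    load_job.result()  # block until the load finishes; raises on failure
    print(f"Loaded {uri} into {TARGET_TABLE}")
```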

British American Tobacco, Bengaluru, KA Apr 2020 - Mar 2021
Role: Software Engineer
Responsibilities:
Built ETL pipelines using AWS Glue and Python to extract and transform data from RDS, S3, and SQL Server into partitioned S3 layers.
Designed daily full and incremental load workflows using Step Functions and job bookmarks to manage state across Glue jobs.
Orchestrated workflows with Step Functions and Lambda, incorporating Wait, Choice, and failover handling for robust data pipelines.
Automated metadata updates and logging by integrating Athena queries and Lambda functions into ingestion pipelines.
Migrated legacy SSIS packages to AWS Glue and S3-based workflows, reducing manual intervention and batch delays.
Transferred structured and unstructured data from SQL Server, DB2, and Cassandra to S3, registering schemas in Glue Catalog.
Implemented CloudWatch and S3 event triggers to launch pipelines and monitor batch loads across raw and processed zones.
Processed JSON, CSV, and Parquet files using PySpark in Databricks, performing joins, filters, and window operations at scale.
Scheduled and managed Databricks jobs using Workflows with REST API hooks for cross-platform orchestration and error capture.
Resolved compute and memory issues in Databricks clusters, optimizing configurations for better stability and parallelism.
Queried raw S3 data using Athena and Redshift Spectrum to validate ingested files and perform schema-on-read transformations.
Secured access credentials using AWS Secrets Manager, integrated with Glue and Databricks for token-based auth flows (a representative sketch follows this list).
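A minimal sketch of a Glue PySpark job of this kind: it pulls a database credential from Secrets Manager, reads a source table over JDBC, and lands it as partitioned Parquet in S3. The secret name, connection details, table, bucket, and partition column are hypothetical placeholders:

```python
import json
import sys

import boto3
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job bootstrap.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Pull the SQL Server credential from Secrets Manager (secret name is hypothetical).
secret = boto3.client("secretsmanager").get_secret_value(SecretId="etl/sqlserver/reader")
creds = json.loads(secret["SecretString"])

# Read the source table over JDBC and write it to a partitioned S3 layer.
orders = (spark.read.format("jdbc")
          .option("url", creds["jdbc_url"])
          .option("user", creds["username"])
          .option("password", creds["password"])
          .option("dbtable", "dbo.sales_orders")   # hypothetical source table
          .load())

(orders.write.mode("overwrite")
       .partitionBy("order_date")                  # hypothetical partition column
       .parquet("s3://hypothetical-processed-bucket/sales_orders/"))

job.commit()
```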

City Bank, Chennai, TN Jun 2017 - Apr 2020
Role: Associate Software Engineer
Responsibilities:
Designed reusable ADF pipelines using Copy, Web, ForEach, and Stored Procedure to automate on-prem to cloud data migrations.
Migrated enterprise data from SQL Server and DB2 to Azure SQL, Data Lake, and SQL DW using parameterized ADF pipelines.
Scheduled and triggered ADF workflows via SQL Server Agent for daily, weekly, and monthly batch loads across multiple data domains (programmatic run triggering is sketched after this list).
Investigated failures using Azure Monitor, pipeline logs, and diagnostics to troubleshoot job issues and minimize runtime delays.
Executed full SQL migrations to Azure SQL and Managed Instance, validating schema, indexes, stored procedures, and performance.
Built scalable data models and transformations in Azure SQL DW for reporting, analytics, and cross-system data access use cases.
Automated deployment of ADF resources using ARM templates for linked services, datasets, and environment-specific configurations.
Processed semi-structured batch files via HDInsight clusters integrated with blob storage and orchestrated using ADF pipelines.
Resolved infrastructure issues involving storage permissions, VNet security, and integration runtime configs during cloud migration phases.
Collaborated in an onsite-offshore delivery model, managing development tasks, handovers, and pipeline troubleshooting with distributed teams.
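A minimal sketch of triggering and polling a parameterized ADF pipeline run with the azure-mgmt-datafactory SDK, mirroring the scheduling and monitoring described above; the subscription, resource group, factory, pipeline, and parameters are hypothetical placeholders:

```python
import time

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

# All identifiers below are hypothetical placeholders.
SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP = "rg-data-platform"
FACTORY_NAME = "adf-migration"
PIPELINE_NAME = "pl_copy_sqlserver_to_adls"

client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Kick off a parameterized pipeline run (parameter names are hypothetical).
run = client.pipelines.create_run(
    RESOURCE_GROUP, FACTORY_NAME, PIPELINE_NAME,
    parameters={"loadDate": "2020-01-31", "sourceSchema": "dbo"},
)

# Poll the run until it finishes, similar to the checks done via Azure Monitor.
while True:
    status = client.pipeline_runs.get(RESOURCE_GROUP, FACTORY_NAME, run.run_id).status
    if status not in ("Queued", "InProgress"):
        break
    time.sleep(30)

print(f"Pipeline {PIPELINE_NAME} finished with status: {status}")
```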

Xoriant (City Bank), Chennai, TN Dec 2016 - May 2017
Role: Data Engineer Intern
Responsibilities:
Streamlined data ingestion by uploading and organizing large datasets into Amazon S3 with structured folder hierarchies and metadata tagging, enabling faster access and improved traceability for analytics teams.
Automated routine data workflows by developing Python scripts using Pandas for data cleaning and reporting; integrated with Amazon S3 and RDS to support scalable data processing and reduce manual effort by 60% (a representative sketch follows this list).
Conducted exploratory data analysis (EDA) using advanced Excel functions and SQL queries on Amazon RDS to uncover data quality issues, identify trends, and guide subsequent data transformation and visualization tasks.
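A minimal sketch of the Pandas cleaning and S3 upload pattern mentioned above; the input file, bucket, key prefix, and column names are hypothetical placeholders:

```python
import boto3
import pandas as pd

# Hypothetical input file and bucket.
RAW_FILE = "sales_export.csv"
BUCKET = "hypothetical-analytics-bucket"

# Basic cleaning: normalize column names, drop duplicates, fill obvious gaps.
df = pd.read_csv(RAW_FILE)
df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
df = df.drop_duplicates()
df["region"] = df["region"].fillna("unknown")  # hypothetical column

# Write the cleaned file and upload it to a dated S3 prefix for traceability.
cleaned_path = "sales_export_clean.csv"
df.to_csv(cleaned_path, index=False)
boto3.client("s3").upload_file(
    cleaned_path, BUCKET, "cleaned/sales/2017-01-15/sales_export_clean.csv"
)
```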