
Ram Thalla - Sr Data Engineer
[email protected]
Relocation: Any
Visa: GC
Ram Thalla - Sr Data Engineer / Python Developer - AI/ML
(561)-359-4473
LinkedIn: RamThalla

Professional Summary:
A senior Python developer and data engineer with more than 9 years of experience building scalable data
pipelines, cloud-based data solutions, and AI/ML integrations across healthcare, energy, and enterprise domains.
Demonstrated ownership of the full lifecycle for AI-driven clinical decision support modules, from conception to
production deployment.
Engineered features from complex clinical datasets to build predictive classification models for conditions such as
diabetes and heart disease. Automated the training and deployment of these models using Python scripts to establish
scalable and repeatable ML workflows.
Validated model performance using key metrics including precision, recall, and ROC-AUC. Integrated model
predictions into production APIs to support real-time, data-driven decision making for healthcare providers.
Conducted A/B testing to empirically evaluate the impact of ML outputs on clinical decisions and developed custom
logging tools to monitor model performance and data drift in production environments.
Architected and implemented modern data solutions using Microsoft Azure PaaS services to facilitate real-time data
visualization and reporting.
Led the successful migration of legacy data systems to Azure Synapse Analytics and Azure Data Lake Storage,
achieving a 25 percent reduction in infrastructure costs while improving scalability.
Managed large-scale data ingestion from servers into HDFS, followed by bulk loading into HBase for scalable storage
and retrieval. Administered and performance-tuned Spark Databricks clusters to ensure optimal resource utilization.
Automated the deployment and scaling of cloud data services using Azure Resource Manager templates.
Built and maintained robust, scalable ETL pipelines using Python and Apache Airflow to process high volumes of
customer and transactional data from diverse sources.
Automated manual data ingestion workflows, resulting in reduced processing times, fewer errors, and enhanced
reliability.
Identified and resolved performance bottlenecks within data pipelines to significantly improve data throughput and
system reliability.
Utilized SQL and Tableau to develop analytics for a marketing campaign, tracking KPIs and contributing to a 15
percent increase in client retention.
Authored and optimized complex SQL queries for dashboards and ad hoc reporting, achieving performance
improvements of up to 40 percent.
Designed and assisted in creating a centralized data mart using a star schema to unify disparate customer touchpoints
into a single source of truth for analytics.
Engineered and deployed RESTful APIs with Python, Flask, and SQLAlchemy to serve real-time clinical data and
ML model outputs to healthcare dashboards.
Managed the end-to-end build and release processes for multiple production modules within a CI/CD framework using
Visual Studio Team Services (VSTS).
Successfully led cross-functional teams and assumed full ownership of project delivery to ensure timely and
high-quality outcomes.
Actively mentored junior engineers in Python development, data engineering best practices, and ML
deployment strategies, fostering team growth and technical excellence.
Collaborated effectively with data scientists, clinical analysts, and business stakeholders to align technical
solutions with strategic objectives and drive innovation.

Skills
Programming and APIs: Python, Flask, SQLAlchemy
Databases: PostgreSQL, PL/SQL
ETL and Data Engineering: Apache Airflow, Custom ETL scripts, Data cleaning, Transformation
AI/ML: Feature engineering, Classification models, Model evaluation, ML integration, A/B testing
Cloud Platforms: Microsoft Azure (PaaS), Azure Synapse Analytics, Azure Data Lake Storage
Big Data: HDFS, HBase, Spark, Databricks
DevOps and CI/CD: Visual Studio Team Services (VSTS), Azure Resource Manager
BI and Reporting: SQL, Tableau
Collaboration: Cross-functional teams, Technical documentation, Data profiling

Professional Experience
Senior Python Developer with AI/ML June 2022 - Present
JPMorgan Chase USA
Developed robust ETL pipelines to extract, transform, and load healthcare data from diverse sources into structured
formats for analytics and machine learning workflows.
Designed and maintained relational databases using PostgreSQL and PL/SQL, ensuring optimized performance and
scalability for AI-driven analytics.
Built and deployed RESTful APIs using Python, Flask, and SQLAlchemy to serve real-time clinical data and ML model
outputs to healthcare dashboards.
Engineered features from clinical datasets to support ML models for disease prediction, including diabetes and heart
disease risk scoring.
Collaborated with data scientists to train and validate classification models using structured patient data, contributing to
improved risk stratification.
Assisted in model evaluation by preparing validation datasets and analyzing metrics such as precision, recall, and
ROC-AUC.
Integrated ML model predictions into production APIs to support real-time decision-making for healthcare providers.
Identified performance bottlenecks in data pipelines and implemented optimizations to improve throughput and reliability.
Managed build and release processes for multiple modules in production using Visual Studio Team Services (VSTS).
Designed and implemented data validation and preprocessing routines to ensure high-quality inputs for machine learning
models used in clinical risk prediction.
Automated the training and deployment of classification models using Python scripts, enabling scalable and repeatable
ML workflows.
Developed custom logging and monitoring tools for ML pipelines to track model performance and data drift in
production environments.
Led cross-functional collaboration efforts, mentoring junior developers and guiding data scientists on ML integration best
practices to enhance hospital decision support systems.
Conducted A/B testing on ML model outputs to evaluate impact on clinical decision making and refine model
parameters based on feedback.
Drove the business validation of AI solutions by integrating model predictions into clinical workflows and conducting
rigorous A/B testing to measure their direct impact on provider decision-making.
Mentored junior engineers on Python development and ML deployment strategies, fostering team growth and technical
excellence.
Took ownership of end-to-end delivery for AI-driven clinical decision support modules, ensuring timely and high-quality
implementation.
Technologies Used:
Programming and APIs: Python, Flask, SQLAlchemy.
Databases: PostgreSQL, PL/SQL.
ETL and Data Engineering: Custom ETL scripts, data cleaning, transformation.
AI/ML: Feature engineering, classification models, model evaluation (precision, recall, ROC-AUC), ML integration,
A/B testing.
DevOps and CI/CD: Visual Studio Team Services (VSTS).
Collaboration: Worked closely with data scientists, clinical analysts, and cross-functional teams.

Cloud Data Engineer March 2019 - May 2022
Apple USA
Analyzed, designed, and built modern data solutions using Azure PaaS services to support real-time data visualization
and reporting.
Extracted large volumes of structured and unstructured data from servers into HDFS, followed by bulk loading into
HBase for scalable storage and retrieval.
Estimated cluster sizing and managed Spark Databricks clusters, including performance monitoring and troubleshooting
to ensure optimal resource utilization.
Developed and maintained CI/CD pipelines for multiple production modules using Visual Studio Team Services (VSTS).
Implemented data ingestion workflows and transformation logic to support downstream analytics and business
intelligence tools.
Collaborated with cross-functional teams to align data architecture with business requirements and operational goals.
Optimized data flow and storage strategies to reduce latency and improve throughput across distributed systems.
Ensured data integrity and security across cloud environments by applying best practices in access control and encryption.
Automated deployment and scaling of data services using Azure Resource Manager templates and scripting.
Provided technical documentation and knowledge transfer sessions to support ongoing maintenance and onboarding.
Technologies Used:
Cloud Platform: Microsoft Azure (PaaS).
Big Data: HDFS, HBase, Spark, Databricks.
DevOps and CI/CD: Visual Studio Team Services (VSTS), Azure Resource Manager.
Data Engineering: Data ingestion workflows, transformation logic, performance optimization.
Security and Governance: Access control, encryption.
Collaboration and Documentation: Technical documentation, cross-functional team alignment.

Data Engineer March 2016 - Feb 2019
DXC Technology India
Built and maintained scalable ETL pipelines using Python and Apache Airflow to process customer and transactional
data from multiple sources.
Migrated legacy data systems to Azure Synapse Analytics and Azure Data Lake Storage, improving scalability and
reducing infrastructure costs by 25 percent.
Developed optimized SQL queries for dashboards and ad hoc reporting, improving query performance by up to 40
percent.
Automated manual data ingestion workflows, reducing processing time and errors while improving reliability.
Collaborated with BI and product teams to understand data requirements and deliver clean, well-structured datasets.
Wrote technical documentation for pipeline architecture, data lineage, and recovery procedures.
Collaborated on a marketing analytics project to track campaign KPIs using SQL and Tableau.
Assisted in the creation of a centralized data mart using star schema to unify multiple customer touchpoints.
Performed data cleaning and transformation using Python (Pandas) on CSV and Excel data from sales teams.
Generated insights for client presentations, contributing to a 15 percent increase in client retention for Q1 2022.
Technologies Used:
Programming and Scripting: Python (Pandas)
Data Storage and Processing: CSV, Excel
Data Warehousing: Azure Synapse Analytics, Azure Data Lake Storage
ETL and Workflow Automation: Apache Airflow
Databases and Querying: SQL
Visualization and Reporting: Tableau
Data Modeling: Star Schema
Documentation: Technical documentation for pipeline architecture and data lineage.

Education
B.Tech in Computer Science, JNTUH Aug 2012 - May 2016
