Sri Bhavana - Data Scientist
[email protected]
Location: Sunnyvale, California, USA
Relocation: Yes
Visa: GC
Resume file: Sri Bhavana Data Scientist ml Resume --1_1778176560385.docx
Name: Sri Bhavana
Senior Data Scientist
Phone: 571-653-5545
Email: [email protected]
Professional Summary:
Seasoned Senior Data Scientist with 12+ years of experience specializing in building advanced predictive models, large-scale analytics pipelines and enterprise data platforms across banking, telecom, healthcare and retail domains, leveraging Python, SQL, Data Analysis, Applied Data Science, Machine Learning, pandas, NumPy, scikit-learn, Apache Spark, Apache Kafka, Matplotlib, Tableau and Power BI with distributed workloads deployed across AWS S3, Amazon EC2, Amazon RDS, Azure Blob Storage and Azure Virtual Machines to deliver scalable analytics solutions and operational insights.
Experienced Data Scientist skilled in developing scalable machine learning and statistical analytics solutions using algorithms such as Random Forest, XGBoost, Logistic Regression, Gradient Boosting, K-Means, DBSCAN and Hierarchical Clustering, implemented using Python, PySpark, pandas, NumPy, SciPy and scikit-learn, with model lifecycle management through MLflow and large-scale processing through Apache Spark, Databricks, AWS EMR, Azure Synapse Analytics and Azure Machine Learning for enterprise predictive analytics and precision/recall optimization of models.
Proficient in data engineering and large-scale data processing frameworks including Apache Spark, Spark Streaming, Apache Hive, Apache Kafka, Apache Airflow, Apache Storm, Redis and SQLAlchemy, building distributed analytics pipelines integrated with enterprise storage platforms such as Snowflake, Amazon Redshift and Azure SQL Database, supporting real-time data ingestion, data analysis workflows and experimentation frameworks.
Adept at designing scalable real-time analytics architectures and data pipelines using Apache Kafka, Amazon Kinesis, Spark Streaming, Apache Airflow, FastAPI, Docker, Kubernetes and Terraform, orchestrating machine learning pipelines and streaming analytics across distributed infrastructures including AWS Lambda, Amazon VPC, Azure Virtual Network and Azure Event Hubs for high-volume event processing and operational intelligence using applied data science and machine learning techniques.
Skilled in enterprise data integration and ETL development using SQL, SSIS, SSMS, DB2, Oracle, MySQL and SQL Server, implementing scalable relational and analytical data models with Pivot Tables, Power Query, MS Excel and SSRS to enable operational reporting and data analysis-driven decision support systems.
Experienced in designing advanced analytics dashboards and business intelligence platforms using Tableau, Power BI, QlikView and Matplotlib, integrating enterprise datasets from Snowflake, Databricks, Azure Data Factory and Amazon S3, enabling executive-level reporting and KPI monitoring driven by data analysis and machine learning insights.
Proficient in machine learning engineering and feature engineering pipelines using Python, pandas, NumPy, PySpark, scikit-learn and spaCy, enabling predictive analytics, anomaly detection, recommendation engines and behavioral analytics using machine learning, natural language processing (NLP), document processing and A/B testing methodologies.
Experienced in cloud-native analytics development and infrastructure automation using Terraform, AWS CloudFormation and ARM Templates, deploying containerized analytics services using Docker and Kubernetes with CI/CD pipelines integrated with GitHub Actions, Jenkins, Git and JIRA, incorporating AI-assisted development (GitHub Copilot) to enhance productivity and code quality.
Skilled in developing distributed microservices and data APIs using REST APIs, FastAPI, Java and Python, implementing secure authentication frameworks using OAuth2, JWT and AWS IAM, enabling secure integration between machine learning pipelines, applied data science systems and enterprise applications.
Experienced in designing enterprise monitoring and observability frameworks using Prometheus, Grafana and Amazon CloudWatch, ensuring real-time monitoring of ETL pipelines, streaming analytics workloads and machine learning model performance with precision/recall tracking and latency analysis.
Strong expertise in enterprise database technologies and governance frameworks including SQL, Oracle, DB2, MySQL, SQL Server, Azure SQL Database and Snowflake, implementing secure analytics architectures using AWS Macie, AWS KMS and Azure Key Vault to protect sensitive data across data analysis and machine learning platforms.
Adept at Agile data science delivery practices using Agile, Git, GitHub, SVN, JIRA and Confluence, enabling collaborative model development, experimentation frameworks and iterative A/B testing approaches across cross-functional teams.
Experienced in healthcare and regulated data ecosystem integrations using HL7 and FHIR, enabling interoperability between clinical datasets and analytics platforms, applying NLP and document processing techniques for structured and unstructured healthcare data.
Deep understanding of enterprise analytics architecture and regulatory-compliant platforms across banking and telecom domains, implementing scalable predictive modeling, streaming analytics and enterprise pipelines leveraging AWS Glue, Apache Spark, Apache Kafka, Databricks, Snowflake, Power BI and Tableau using applied data science and machine learning methodologies to deliver strategic insights.

Education
JNTUK, Material Science and Metallurgical Engineering, April 2009
Certifications:
AWS Certified Machine Learning - Specialty.
Microsoft Certified: Azure Data Scientist Associate.
Professional Track Record:

Client: Goldman Sachs, New York, NY Duration: Mar 2023 - Present
Senior Data Scientist

Functional Role Details:
Directed end-to-end financial data science and analytics workflows using Python, SQL and Data Analysis, building reproducible pipelines and real-time analytics processes on AWS cloud platforms, supporting credit card transactions, payment processing logs, customer behavioral datasets and regulatory reporting systems across enterprise banking platforms.
Designed scalable data ingestion and transformation pipelines using AWS Glue, Amazon Redshift, Snowflake and Amazon Kinesis, enabling real-time analytics for fraud detection, credit risk monitoring, transaction authorization analysis and customer behavior modeling using Applied Data Science and Machine Learning techniques.
Developed large-scale financial data preparation workflows using SQL, implementing joins, cleansing, normalization and transformation across credit card transactions, merchant payments, customer accounts, lending portfolios and digital banking datasets, strengthening data analysis and feature engineering pipelines.
Built distributed data processing pipelines using Apache Spark and Databricks, processing high-volume financial transaction logs and customer interaction events to support machine learning models and experimentation frameworks for fraud detection and risk analytics.
Implemented streaming data pipelines using Apache Kafka and Amazon Kinesis, ingesting real-time transaction streams, fraud alerts and payment gateway events into centralized analytics platforms for anomaly detection, incorporating precision/recall optimization strategies for fraud classification models.
Designed automated data orchestration workflows using Apache Airflow, enabling reliable scheduling and monitoring of ETL pipelines, while supporting A/B testing and experimentation frameworks for continuous model improvement.
Built scalable analytics APIs using FastAPI, exposing fraud detection scores, credit risk indicators and transaction anomaly signals powered by machine learning and applied data science models to internal banking applications.
Developed machine learning pipelines for fraud detection and anomaly identification using Python, Pandas, NumPy and PySpark, applying precision/recall optimization and model evaluation techniques to improve fraud detection accuracy and reduce false positives.
Engineered containerized analytics services using Docker and Kubernetes, enabling scalable model inference services and supporting AI-assisted development (GitHub Copilot) for faster pipeline development and code optimization.
Implemented secure identity and access controls using OAuth2, JWT and AWS IAM, ensuring protection of sensitive financial datasets and compliance with PCI-DSS and SOX standards across ML-driven platforms.
Automated CI/CD pipelines using GitHub Actions, enabling automated testing, validation and deployment of machine learning models, ETL workflows and analytics APIs, leveraging AI-assisted development (Claude, GitHub Copilot) for productivity improvements.
Provisioned scalable cloud infrastructure using Terraform and AWS CloudFormation, enabling automated deployment of data pipelines and analytics clusters supporting large-scale data science and ML workloads.
Built hybrid batch and streaming analytics pipelines using Spark Streaming and Databricks, integrating structured and semi-structured datasets to support machine learning pipelines and experimentation use cases.
Developed centralized financial data lake architectures using Amazon S3, organizing datasets to support data analysis, ML model training and NLP-based document processing workflows for financial reporting and compliance.
Implemented enterprise monitoring solutions using Prometheus, Grafana and CloudWatch, tracking ETL pipeline health, streaming latency and model performance metrics aligned with precision/recall KPIs.
Built executive dashboards using Power BI, visualizing fraud patterns, transaction trends and customer behavior insights derived from machine learning and advanced data analysis models.
Developed data governance frameworks using AWS Macie and KMS, ensuring secure handling of financial datasets, including document processing and NLP-based classification of sensitive financial records.
Integrated financial datasets across systems using Kafka, AWS Glue and Airflow pipelines, enabling unified analytics platforms supporting applied data science and ML-driven insights.
Conducted advanced analytics using SQL, Python and Snowflake, performing root-cause analysis of fraud incidents and applying machine learning, NLP techniques and statistical data analysis for deeper insights.
Mentored junior engineers and data scientists on applied data science, machine learning, experimentation frameworks, A/B testing and reinforcement learning concepts (exploratory use cases) across enterprise banking systems.

Client: Centene Corporation, St. Louis, MO Duration: Sep 2020 - Feb 2023
Senior Data Scientist

Functional Role Details:
Architected enterprise healthcare analytics platforms supporting population health insights, member risk analysis and care management optimization using Python, SQL, Data Analysis and Applied Data Science, integrating large-scale healthcare datasets into Snowflake, Azure Synapse Analytics and Azure Data Lake Storage Gen2.
Designed and implemented scalable data preparation and feature engineering pipelines using Python, SQL, PySpark, Databricks and Delta Lake, enabling large-scale machine learning and applied data science workflows on healthcare datasets including claims, enrollment and clinical records.
Developed predictive healthcare analytics models using Python, scikit-learn, pandas, NumPy and PySpark to support member risk scoring, readmission prediction and utilization forecasting using machine learning techniques with precision/recall optimization for healthcare risk models.
Implemented advanced data quality validation and anomaly detection frameworks using Python, SQL and Databricks, applying data analysis and statistical modeling techniques to identify irregularities in claims and provider billing patterns.
Built unsupervised machine learning models using clustering techniques (K-Means, DBSCAN, Hierarchical Clustering) to identify patient segments and high-risk cohorts, leveraging applied data science and machine learning methodologies.
Developed data processing utilities to parse healthcare claims files and electronic health records using Python, HL7 and FHIR standards, incorporating document processing and NLP techniques to extract structured insights from clinical narratives and unstructured healthcare data.
Evaluated predictive healthcare models using ROC-AUC, Precision-Recall, F1-Score, RMSE and MAE, emphasizing precision/recall optimization and model performance tuning for care management and risk prediction systems.
Integrated machine learning insights and healthcare KPIs into Power BI, Tableau and SQL dashboards, enabling stakeholders to monitor outputs derived from data analysis and applied data science models.
Built scalable ELT pipelines using Azure Data Factory, Databricks and Snowflake, transforming datasets to support machine learning pipelines, experimentation frameworks and analytics-driven healthcare insights.
Developed real-time healthcare analytics pipelines using Apache Kafka and Azure Event Hubs, enabling streaming analytics for care coordination and fraud detection using machine learning and real-time data analysis techniques.
Containerized analytical workloads using Docker and Kubernetes, enabling scalable deployment of machine learning models and applied data science services across healthcare environments.
Managed machine learning lifecycle workflows using Databricks, Azure ML and MLflow, supporting reproducible pipelines and enabling A/B testing and experimentation frameworks for model validation and continuous improvement.
Provisioned healthcare analytics infrastructure using Terraform and ARM Templates, supporting scalable cloud environments for data science, machine learning and analytics workloads.
Implemented CI/CD pipelines using GitHub Actions and Jenkins, incorporating AI-assisted development (GitHub Copilot) to accelerate development and deployment of data pipelines and ML models.
Built executive dashboards using Tableau and Power BI, delivering insights derived from data analysis, machine learning outputs and applied data science models.
Automated batch analytics pipelines using Python, SQL and Airflow, orchestrating workflows supporting experimentation frameworks and model retraining cycles.
Collaborated with clinical teams to translate requirements into analytics solutions built on Python, Databricks and Snowflake, applying data science and machine learning techniques to improve care outcomes.
Conducted advanced exploratory analysis using Python, pandas, NumPy and SQL, identifying care gaps and utilization trends through deep data analysis and statistical modeling.

Client: Fidelity Investments, MA Duration: Jun 2017 - Aug 2020
Senior Data Scientist

Functional Role Details:
Led strategic analytics initiatives for investment management and digital wealth platforms using Python, SQL, Data Analysis and Applied Data Science, building scalable pipelines and analytical services to analyze trading activity, portfolio performance, customer behavior and financial risk indicators.
Designed and automated enterprise financial data pipelines integrating datasets such as market data feeds, trading transactions and portfolio positions using AWS S3, AWS EMR, Apache Spark, Databricks and Airflow, enabling machine learning and applied data science workflows.
Owned the full analytics lifecycle from ingestion and feature engineering to deployment using Python, PySpark, scikit-learn and MLflow, supporting machine learning model development and experimentation frameworks.
Built customer churn, portfolio recommendation and customer lifetime value (CLV) models using Python, pandas, NumPy and scikit-learn, applying machine learning techniques with precision/recall optimization to improve customer targeting and financial advisory outcomes.
Developed large-scale financial analytics platforms using Apache Spark, Databricks and AWS EMR, enabling data analysis and machine learning-driven insights for portfolio risk and trading performance.
Engineered data validation, reconciliation and quality frameworks using Python, SQL and Spark, applying data analysis techniques to ensure integrity of financial datasets including transactions and portfolio valuations.
Built reusable machine learning frameworks supporting classification, regression and clustering, accelerating applied data science and machine learning model development across financial use cases.
Developed predictive models using XGBoost, Random Forest and Logistic Regression, evaluating using ROC-AUC, Precision-Recall, F1-Score, RMSE and MAE, ensuring precision/recall optimization and robust model performance.
Performed advanced feature engineering using pandas, NumPy and PySpark, improving model performance for machine learning and applied data science pipelines.
Delivered dashboards using Tableau, Power BI and SQL queries, visualizing insights derived from data analysis and machine learning outputs for business stakeholders.
Conducted exploratory data analysis on market data, transactions and financial instruments using Python and SQL, identifying trends using deep data analysis and statistical techniques.
Architected event-driven analytics using Kafka, AWS Lambda and Spark Streaming, enabling real-time monitoring of trading activity and fraud detection using machine learning and streaming data analysis.
Collaborated with portfolio managers and analysts to translate analytical findings into strategies, applying data science and machine learning insights to optimize investment decisions.
Implemented governance and security frameworks aligned with SEC, SOX and GDPR, ensuring secure handling of datasets used in data analysis and machine learning pipelines.
Leveraged customer analytics techniques including segmentation and behavioral modeling using machine learning and data analysis approaches to improve retention and personalization.
Applied natural language processing (NLP) and document processing techniques using Python (spaCy) to analyze financial news, analyst reports and client feedback, extracting sentiment signals impacting investment strategies.
Managed AWS infrastructure using Terraform and CloudFormation, provisioning environments supporting machine learning and applied data science workloads.
Enabled Agile delivery using Git, GitHub, JIRA and Confluence, supporting collaborative development of data science and machine learning solutions.
Performed scenario modeling and stress testing using Python simulations, supporting applied data science and statistical modeling for portfolio risk evaluation.
Monitored model performance using CloudWatch and Grafana, identifying drift and degradation while applying model evaluation and precision/recall monitoring techniques.
Delivered executive presentations and storytelling using insights derived from data analysis, machine learning and applied data science models.

Client: Verizon Communications, New York, NY Duration: Dec 2015 - May 2017
Senior Data Scientist

Functional Role Details:
Architected end-to-end telecom data ingestion and transformation pipelines using SQL, Python (pandas), Azure Virtual Machines and Azure Blob Storage, enabling scalable processing of subscriber activity logs, call detail records (CDRs), billing data and device usage metrics supporting customer analytics and operational intelligence.
Engineered and maintained structured data repositories using SQLAlchemy, MySQL and optimized indexing strategies to manage high-volume telecom datasets including subscriber profiles, network activity logs and service usage records.
Performed comprehensive exploratory data analysis and statistical evaluations on large-scale telecom datasets using NumPy, SciPy and pandas, identifying patterns in customer usage behavior, network traffic fluctuations and service adoption trends.
Developed predictive analytics frameworks for subscriber churn prediction, service upgrade recommendations and customer segmentation using Python (scikit-learn), evaluating models with ROC-AUC, F1 Score, RMSE and MAE metrics.
Leveraged distributed data platforms including Apache Hive and Apache Spark to process massive telecom datasets such as call detail records, network performance logs and subscriber interaction events, enabling scalable analytics and reporting.
Built streaming analytics pipelines using Apache Storm, Kafka and Redis deployed on Azure Virtual Machines, enabling near real-time detection of network anomalies, fraudulent activity and service disruptions across telecom infrastructure.
Applied association rule mining and behavioral analytics using SQL and Python (pandas) to identify relationships between telecom services such as data plans, roaming services, messaging bundles and premium subscriptions, enabling targeted cross-sell strategies.
Conducted subscriber lifecycle and retention analysis using cohort-based analytics with SQL, Python (pandas) and Matplotlib, analyzing customer tenure, usage evolution and churn behavior.
Performed controlled experimentation (A/B testing) using Python, SciPy and advanced SQL, evaluating the impact of pricing plans, service promotions and digital customer engagement initiatives.
Designed interactive operational dashboards using Tableau, QlikView and Matplotlib, visualizing subscriber growth metrics, churn trends, network usage patterns, service adoption rates and operational KPIs for business and engineering teams.
Automated telecom data preparation and ingestion workflows using Python automation scripts, improving processing efficiency for network logs, subscriber datasets and billing files used in downstream analytics.
Built scalable telecom data pipelines integrating Azure Blob Storage, Hive, Spark and Python, enabling structured storage and analytics of large volumes of CDR data, network telemetry and customer interaction events.
Collaborated with network engineering, marketing analytics and customer experience teams to translate data insights into initiatives improving service reliability, subscriber engagement and network performance optimization.
Implemented telecom fraud detection analytics by analyzing abnormal usage patterns, SIM activity anomalies and unusual network behavior, improving operational monitoring and risk mitigation.
Supported telecom marketing teams by building customer segmentation models and service-adoption predictions, enabling targeted campaigns for data plans, roaming packages and premium services.
Implemented data validation and quality monitoring frameworks using Python, SQL and Spark, ensuring integrity and reliability of telecom analytics datasets.
Worked within Agile development environments, collaborating with cross-functional teams to deliver iterative improvements to telecom analytics platforms and machine learning models.
Provided executive-level analytical reports highlighting subscriber growth, churn drivers, service utilization patterns and network performance insights, supporting strategic telecom business decisions.

Client: Palantir Technologies, CA Duration: Feb 2013 - Nov 2015
Data Analyst

Functional Role Details:
Extracted, analyzed and validated enterprise datasets using SQL, DB2 and Oracle, supporting data quality initiatives and improving accuracy of operational analytics across enterprise data platforms.
Designed and maintained KPI-driven analytical dashboards using MS Excel 2010, SQL Server Reporting Services (SSRS) and Tableau, enabling visualization of trends, performance metrics and operational indicators for business stakeholders.
Optimized complex analytical queries in SQL Server, DB2 and Oracle, improving query execution performance and enabling faster analysis of large operational datasets.
Performed large-scale data analysis using SQL, MS Excel 2010, Pivot Tables, Power Query and statistical functions, generating actionable insights supporting business decision-making and operational improvements.
Built structured analytical datasets using SQL Server Integration Services (SSIS) and ETL workflows, enabling efficient ingestion and transformation of enterprise data from multiple source systems.
Developed reusable data extraction and transformation scripts using Java and Eclipse IDE, supporting automation of data processing, cleansing and migration tasks across analytical systems.
Implemented relational data models in SQL Server, Oracle and DB2, designing scalable schema structures that improved reporting performance and analytical query efficiency.
Conducted data validation and reconciliation processes using SQL, ensuring integrity and consistency of datasets used in reporting, analytics and operational dashboards.
Automated repetitive data preparation tasks using Python scripts and Excel automation, reducing manual data processing effort and improving overall workflow efficiency.
Performed statistical trend analysis using MS Excel 2010, Matplotlib and SQL queries to identify operational patterns, anomalies and performance improvements across enterprise datasets.
Designed lightweight REST-based data services supporting secure data exchange between internal analytics systems and enterprise applications.
Monitored database activity using SQL Server Management Studio (SSMS) and Oracle monitoring tools, diagnosing query bottlenecks and improving system performance.
Managed source code version control and project documentation using SVN (Subversion), enabling collaboration and controlled development workflows across analytics teams.
Collaborated with cross-functional stakeholders including business analysts, product teams and engineering groups to translate analytical requirements into data models, queries and reporting solutions.
Implemented indexing strategies, query optimization and database maintenance activities in SQL Server, DB2 and Oracle, improving data retrieval speed and system throughput.
Built operational dashboards and reports using Tableau, Excel charts and SQL queries, delivering insights into operational KPIs, data trends and business performance indicators.
Documented data definitions, transformation logic and reporting workflows using technical documentation repositories, ensuring clarity and maintainability of analytics processes.
Assisted engineering teams in integrating analytical datasets with enterprise applications through data services and SQL interfaces, enabling seamless data consumption across systems.
Supported Agile development processes by participating in requirement gathering, sprint planning and iterative analytics delivery across cross-functional teams.
