Pratyusha Munjuluri
Senior AI/ML Engineer
SUMMARY:

Senior AI/ML Engineer with 11+ years building production AI systems across healthcare, finance, retail, and defense sectors. Expert in generative AI, machine learning, and enterprise-scale data engineering.
Built end-to-end RAG pipelines reducing support costs 60% and content generation costs 65%. Fine-tuned GPT-3.5 Turbo, LLaMA2, and open-source models using LoRA/QLoRA parameter-efficient techniques on GPU clusters (A100s). Implemented CoT prompting, zero-shot/one-shot learning achieving 95% answer relevance across 500K+ monthly requests.
Pioneer in agentic frameworks with intelligent routing algorithms and autonomous decision-making agents. Deployed Quality Assessment Agents using self-attention mechanisms, Fallback Strategy Agents for automated recovery, and Routing Agents with adaptive learning capabilities. Achieved 99.9% system uptime handling 50K+ requests/hour peak loads.
Delivered fraud detection systems (90% AUC, 35% false positive reduction), credit scoring models (AUC improved from 0.78 to 0.90), and recommendation engines serving 850K+ users. Built anomaly detection using autoencoders, ensemble models with XGBoost/LightGBM, and real-time streaming analytics with sub-2-second inference.
Implemented comprehensive data consolidation methodologies processing 50M+ heterogeneous records monthly. Built robust ETL pipelines using Apache Spark (PySpark, Spark SQL, Structured Streaming), Azure Data Factory, and Apache Airflow. Optimized data processing from 8 hours to 45 minutes through parallel computing and cluster optimization.
Expert in Azure (Databricks, ML, Functions, Synapse Analytics) and AWS (SageMaker, Lambda, EMR, Glue) platforms. Architected multi-cloud solutions with Delta Lake, Feature Store, and Databricks Marketplace for secure data sharing. Implemented horizontal scaling, spot-instance optimization, and auto-scaling strategies.
Built automated CI/CD pipelines with Terraform, GitHub Actions, Jenkins, and Azure DevOps. Implemented blue/green deployments, model versioning with MLflow, and comprehensive monitoring using Application Insights, CloudWatch, and Prometheus. Achieved 99.9% uptime with automated retraining workflows.
Deep expertise in Unity Catalog, RBAC, SCIM, and schema contracts for DoD, HIPAA, and financial compliance. Implemented fine-grained access controls, entitlement policies, and audit-ready governance frameworks. Managed secure data sharing across public/private Databricks Marketplaces.
Expert in transformer architectures, self-attention mechanisms, BERT, and domain adaptation techniques. Built complaint classification (88% F1 score), sentiment analysis, entity extraction, and abstractive summarization systems. Processed 1M+ call transcripts with transfer learning and multi-task learning approaches.
Designed Flask-based APIs, Docker containerization, and Kubernetes (EKS) orchestration. Deployed scalable microservices using Azure Functions, AWS Lambda, and Azure Container Apps. Implemented API Gateway patterns, load balancing, and service mesh architectures.
Engineered scalable ML pipelines leveraging Google Cloud Platform (GCP) services, including BigQuery, Vertex AI, Dataflow, and Cloud Functions, resulting in efficient end-to-end model deployment workflows.
Built streaming analytics with Databricks Structured Streaming, Kinesis, and Event Grid systems. Implemented real-time fraud detection, anomaly scoring, and content generation with sub-150ms latency. Designed event-driven architectures with automated alerting and escalation workflows.
Created comprehensive dashboards using Power BI, Tableau, and SSRS for executive reporting. Built real-time analytics, KPI tracking, and interactive visualizations supporting data-driven decision making. Integrated ML insights into business applications and authoring interfaces.
Expert in SQL (Oracle, MySQL, SQL Server) and NoSQL (MongoDB) systems. Proficient in Spark SQL, PySpark DataFrames, and advanced query optimization. Implemented data lakes, warehouses, and hybrid storage architectures with HDFS, Azure Blob Storage, and ADLS Gen2. Worked with Pinecone vector DB to store vector embeddings.
Achieved 25-65% cloud cost reductions through intelligent resource management, model quantization, and right-sizing strategies. Implemented dynamic scaling, compute optimization, and storage tiering. Reduced infrastructure costs while maintaining strict SLA requirements (sub-150ms inference, 99.9% availability).
Engineered end-to-end MLOps architecture on the Databricks Lakehouse, covering scalable model training, deployment, lineage, and observability.
Led cross-functional teams, POCs on emerging technologies, and strategic AI initiatives. Mentored junior engineers, conducted technical reviews, and drove organizational AI maturity. Evaluated new frameworks, benchmarked solutions, and made architectural decisions for enterprise deployments.
Full-stack proficiency in Python, Scala, R, and SQL. Expert in distributed computing, parallel processing, and high-performance computing techniques. Experience with version control (Git), code review processes, and agile development methodologies.
Consistently delivered measurable results: 20% marketing CTR lift, 45% first-call resolution improvement, 18% inventory optimization, 19% care gap closure improvement, and 10% fraud detection precision gains across multiple enterprise deployments serving millions of users.
EDUCATION:

Master of Science: Computer Science
University: Central Michigan University


TECHNICAL SKILLS:

Cloud Platforms: Azure (Databricks, Azure ML, Azure Functions, Azure Monitor, Azure Synapse Analytics, Azure Data Factory), AWS (S3, SageMaker, Lambda, API Gateway, EMR, Glue, CloudWatch), GCP (BigQuery, Vertex AI, Dataflow, Cloud Functions)
Data Engineering & Storage: Delta Lake, Databricks Feature Store, Databricks Marketplace, Apache Hudi, HDFS, Apache Sqoop, Azure Blob Storage, ADLS Gen2, Data Consolidation Methodologies
Big Data & Processing: Apache Spark (PySpark, Spark SQL, Structured Streaming), Scala, EMR, Apache Airflow, Apache Oozie, Distributed Computing Frameworks, Parallel Computing Techniques
Databases & Query Languages: SQL (Oracle, MySQL, SQL Server), NoSQL (MongoDB), Spark SQL, PySpark DataFrames
Machine Learning Frameworks: TensorFlow, PyTorch, XGBoost, LightGBM, Spark MLlib, Hugging Face Transformers, Prophet, ARIMA
Generative AI & LLMs: Azure OpenAI (GPT-3.5 Turbo), OpenAI APIs, Vertex AI, Databricks Mosaic AI, Anthropic Claude, LLaMA/LLaMA2, vector databases (Pinecone) and embeddings, LoRA, QLoRA, Parameter-Efficient Fine-Tuning, Chain-of-Thought (CoT) Prompting
NLP Tools & Techniques: spaCy, Hugging Face Transformers, Sentiment Analysis, Entity Extraction, Abstractive Summarization, Self-Attention Mechanisms, Zero-shot/One-shot Learning, Transfer Learning, Layer Freezing, Domain Adaptation, Multi-task Learning
MLOps & Automation: MLflow, Terraform, GitHub Actions, Jenkins, AWS CodePipeline, Model Versioning and Experiment Tracking, Agentic Workflows, Automated Fallback Strategies
Containerization & Microservices: Docker, Azure Functions, AWS Lambda, Flask API Development, Microservices Orchestration
Monitoring & Logging: Application Insights, Azure Monitor, Prometheus, Grafana, AWS CloudWatch, AWS X-Ray, Model Drift Detection, Quality Assessment Systems
Data Visualization & BI: Power BI, SSRS, Tableau, Real-time Dashboards
Programming Languages: Python, Scala, R, SQL, PySpark, Spark SQL
Security & Governance: Unity Catalog, Role-Based Access Control (RBAC), SCIM, Schema Contracts, Entitlement Policies, Compliance Automation
Model Deployment & Serving: Kubernetes (EKS), API Gateway, Microservices Architecture, GPU-Optimized Model Serving, Multi-model Orchestration, Intelligent Routing Algorithms
AI Agents & Automation: Autonomous Decision-making Agents, Quality Assessment Agents, Fallback Strategy Agents, Routing Agents with Adaptive Learning, LangChain, CrewAI, Cursor AI, LangGraph, LangSmith



WORK EXPERIENCE:

Optum, Dallas, TX July 2024 - Present

Senior GenAI Engineer
Led development of an enterprise GenAI automation system that extracts medical billing codes (CPT, ICD-10, HCPCS) from radiology reports using Azure OpenAI (GPT-4/4o), LangChain, and multi-agent orchestration, reducing manual coding time through automated, payer-compliant analysis of clinical documentation with high accuracy.
Built RAG-driven radiology coding agents using Azure AI Search, LangChain Retrieval, and metadata-driven hybrid search, achieving high accuracy in contextual code extraction from modality guidelines and payer policies.
Designed a multi-agent orchestration framework leveraging LangChain Agents and Azure Semantic Kernel (planner-executor pattern) to coordinate specialized reasoning agents for code validation and clinical context analysis.
Integrated LangSmith for experiment tracking, prompt evaluation, and hallucination reduction, improving model reliability and explainability across production workflows.
Developed scalable GenAI pipelines in Azure AI Prompt Flow with CI/CD-style evaluation using precision, recall, and retrieval metrics before production deployment.
Applied ReAct, Chain-of-Thought (CoT), and few-shot prompting to enhance LLM reasoning for radiology-specific code classification and payer compliance rules.
Fine-tuned GPT models (LoRA & QLoRA variants) for retrieval-aware behavior using RAG-aligned datasets, reducing context drift and enhancing factual accuracy in medical coding workflows.
Implemented vector-based retrieval pipelines with Azure AI Search Vector Index and Pinecone, optimizing similarity metrics (cosine/dot-product), reranking, and threshold tuning for radiology contexts.
Developed hybrid search mechanisms in Azure AI Search combining vector embeddings, keyword matching, and metadata filters to retrieve modality details, contrast usage protocols, and policy guidelines with high precision.
Optimized large-scale vector database performance via indexing strategies, storage compression, and intelligent sharding for faster context retrieval across thousands of radiology documents.
Automated prompt, pipeline, and evaluation code generation using GitHub Actions and GitHub Copilot, streamlining continuous experimentation and data-driven evaluations.
Conducted model performance evaluations (accuracy, F1 score, recall, hallucination rate) to benchmark and iteratively improve GenAI model performance on radiology datasets.
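
A minimal sketch of the embedding-and-retrieval step described in the bullets above, assuming a Pinecone index of radiology guideline chunks and an Azure OpenAI embedding deployment; index name, filter fields, and the score threshold are illustrative assumptions:

from openai import AzureOpenAI
from pinecone import Pinecone

aoai = AzureOpenAI(api_key="...", api_version="2024-02-01",
                   azure_endpoint="https://example.openai.azure.com")
index = Pinecone(api_key="...").Index("radiology-guidelines")  # hypothetical index

def retrieve_context(report_text: str, top_k: int = 5, min_score: float = 0.75):
    # Embed the radiology report text (deployment name is an assumption).
    emb = aoai.embeddings.create(model="text-embedding-ada-002", input=report_text)
    # Cosine-similarity query with a metadata filter (the hybrid-search analog).
    res = index.query(vector=emb.data[0].embedding, top_k=top_k,
                      include_metadata=True,
                      filter={"modality": {"$eq": "CT"}})  # illustrative filter
    # Threshold tuning: drop weak matches to keep retrieved context on-topic.
    return [m for m in res.matches if m.score >= min_score]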

AAFES (Army and Air Force Exchange), Dallas, TX Apr 2021 - July 2024
Senior AI/ML Engineer
Roles & Responsibilities:


1. GenAI-Driven Marketing Content Generator
Orchestrated end-to-end GenAI pipelines implementing sophisticated data consolidation methodologies for processing 5M+ unstructured marketing records across multiple data sources, leveraging Delta Lake and Feature Store in Azure Databricks, publishing curated campaign datasets as governed, shareable assets on both public and private Databricks Marketplaces for internal teams and external partners.

Built a modular, agentic RAG (Retrieval-Augmented Generation) architecture using Python and LangGraph, defining multi-step workflows with agents responsible for document retrieval, reranking, relevance checking, and final answer synthesis using LangChain's Runnable and MultiPromptChain components.

Defined and enforced schema contracts, entitlement policies, and fine-grained access controls via Unity Catalog and RBAC (using SCIM groups and workspace identities), ensuring secure and compliant data sharing within the marketing ecosystem.
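
A hedged sketch of the LangGraph retrieval/rerank/synthesis workflow described above; the state schema and node bodies are stubbed placeholders, not the production logic:

from typing import TypedDict
from langgraph.graph import StateGraph, END

class RAGState(TypedDict):
    query: str
    docs: list
    answer: str

def retrieve(state: RAGState) -> dict:
    return {"docs": ["policy.md", "campaign.md"]}   # placeholder vector-store call

def rerank(state: RAGState) -> dict:
    return {"docs": state["docs"][:3]}              # placeholder reranker

def check_relevance(state: RAGState) -> str:
    return "relevant" if state["docs"] else "irrelevant"

def synthesize(state: RAGState) -> dict:
    return {"answer": f"draft grounded in {len(state['docs'])} docs"}

graph = StateGraph(RAGState)
graph.add_node("retrieve", retrieve)
graph.add_node("rerank", rerank)
graph.add_node("synthesize", synthesize)
graph.set_entry_point("retrieve")
graph.add_edge("retrieve", "rerank")
# Loop back to retrieval when the relevance check fails.
graph.add_conditional_edges("rerank", check_relevance,
                            {"relevant": "synthesize", "irrelevant": "retrieve"})
graph.add_edge("synthesize", END)
app = graph.compile()
print(app.invoke({"query": "fall campaign copy", "docs": [], "answer": ""}))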

Created advanced LangGraph stateful workflows, leveraging node-based agent execution graphs for task routing based on a model cost-performance matrix and context-window optimization based on token budgets. Designed and optimized vector indexing strategies (HNSW, IVF, cosine similarity) in Pinecone to improve retrieval precision and reduce query latency.
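
For context, a one-step sketch of provisioning a cosine-similarity Pinecone index for the campaign embeddings; the index name, dimension (1536 matches text-embedding-ada-002), and serverless spec are assumptions:

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="...")
pc.create_index(
    name="marketing-campaigns",   # hypothetical index name
    dimension=1536,               # ada-002 embedding width
    metric="cosine",              # similarity metric cited above
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)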

Built Copilot-driven conversational workflows for FAQs, ticket classification, and escalation, integrating custom skills for translation, metadata extraction, and compliance checks.

Configured Copilot connectors with SharePoint, Teams, and SQL Server to provide real-time insights, empowering business users with self-service analytics and reducing dependency on IT.


Engineered dynamic prompt templates utilizing CoT (Chain-of-Thought) prompting and zero-shot and one-shot learning techniques, embedding customer-segment attributes and compliance rules, boosting content relevance by 18% and driving a 20% lift in click-through rates. Achieved these gains through parameter-efficient fine-tuning of LLaMA2 models using LoRA and QLoRA on GPU clusters (A100s), supplemented by GPT-3.5 Turbo fine-tuning on top-tier campaign copy styles.
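
An illustrative QLoRA setup for the LLaMA2 fine-tuning described above; the model ID, LoRA rank, and target modules are common community defaults, not confirmed project values:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization so the base model fits comfortably on an A100.
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf",
                                             quantization_config=bnb,
                                             device_map="auto")
# Low-rank adapters on the attention projections; base weights stay frozen.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # adapters are a tiny fraction of the model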


Worked with Cursor AI agents, including Quality Assessment Agents utilizing self-attention mechanisms for content evaluation, Fallback Strategy Agents handling model failures and automated recovery, and Routing Agents with adaptive learning capabilities for dynamic model selection, ensuring 99.9% system uptime and seamless failover between models during peak traffic loads exceeding 50K requests/hour.

Utilized LangGraph ToolNode and ConditionalNode APIs to manage external tool calling (e.g., product catalog lookups, tone scoring functions), integrating external APIs into autonomous GenAI loops.

Automated quarterly model retraining triggered by Application Insights metrics, continuously updating GenAI models with new campaigns while integrating feedback loops for prompt-engineering refinements.

Developed Dockerized Flask-based microservices and Azure Functions serving embeddings from Pinecone and prompt completions from Azure OpenAI endpoints, implementing LangChain-based agentic workflows for autonomous content generation and validation, achieving sub-150ms latency through GPU-optimized model hosting and efficient API orchestration.
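
A minimal sketch of the Flask completion microservice described above; the route, deployment name, and endpoint URL are assumptions:

from flask import Flask, jsonify, request
from openai import AzureOpenAI

app = Flask(__name__)
client = AzureOpenAI(api_key="...", api_version="2024-02-01",
                     azure_endpoint="https://example.openai.azure.com")

@app.route("/generate", methods=["POST"])
def generate():
    prompt = request.get_json()["prompt"]
    resp = client.chat.completions.create(
        model="gpt-35-turbo",  # Azure deployment name (assumed)
        messages=[{"role": "user", "content": prompt}],
        max_tokens=256)
    return jsonify({"completion": resp.choices[0].message.content})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)  # containerized behind a gateway in production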

Enabled multi-layer monitoring and feedback loops, combining Application Insights metrics with LangSmith error spans and crew agent scoring history, triggering retraining workflows every quarter.


Embedded AI-powered suggestions directly in marketing authoring interfaces, reducing context switching and increasing creator productivity.

Enabled data science teams through operationalized feature pipelines using Databricks Online Feature Store, Delta Lake, Unity Catalog, and Databricks Asset Bundles.

Instrumented ML performance tracking and usage monitoring via Application Insights, supporting Marketplace asset usage analytics and governance.

2. Retrieval-Augmented GenAI Support Service
Built a comprehensive RAG pipeline with significant business impact (reducing support ticket resolution time by 60% and increasing first-call resolution rates by 45%) in Azure Databricks, implementing advanced data consolidation methodologies to process and index 2,000+ heterogeneous policy and FAQ documents, using OpenAI's text-embedding-ada-002 and custom fine-tuned LLaMA2 models with self-attention mechanisms stored in Pinecone, published as secure, governable data assets on Databricks Marketplaces.
Used Pinecone Vector DB to enable low-latency semantic search over thousands of knowledge documents with OpenAI / Azure OpenAI embeddings.
Integrated Vertex AI APIs with backend systems (Python, Node.js, C#) to deliver GenAI features into enterprise applications.
Developed reusable Python classes and decorators to manage agent lifecycles, fallback logic, telemetry hooks, retry policies, and structured logging. Integrated LangSmith tracing to visualize prompt chains, track execution timelines, and debug performance bottlenecks in agentic flows.
Applied Unity Catalog governance, enforcing table- and column-level access control and role-based entitlements to maintain data security and compliance within the support knowledge base.
Implemented parameter-efficient fine-tuning using LoRA and QLoRA techniques on both GPT-3.5 Turbo and open-source LLaMA2 models, training on 30,000 curated support Q&A pairs utilizing GPU clusters for accelerated training, achieving 95% answer relevance and halving fallback rates during UAT through advanced zero-shot and one-shot prompting strategies.
Automated embedding job orchestration, fine-tuning runs, model registration, and blue/green deployments via Terraform and GitHub Actions, ensuring seamless MLOps integration for Marketplace assets.
Integrated Copilot with RAG pipelines (Azure AI Search + GPT-4/LLaMA2) to provide context-aware responses, achieving 95%+ accuracy in knowledge retrieval.

Built monitoring dashboards with Application Insights to track Copilot usage, latency, and resolution rates, feeding metrics into continuous improvement loops.

Deployed monitoring and alerting with Azure Monitor and Application Insights to detect embedding latency and model drift, triggering automated retraining to sustain 99.5% answer accuracy.
Collaborated with support SMEs to curate domain-specific dialogue datasets, aligning model outputs with Exchange policies and terminology while ensuring compliance within data assets shared on the Marketplace.
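
A sketch of the indexing side of the support RAG pipeline above, embedding policy/FAQ chunks with text-embedding-ada-002 and upserting into Pinecone; the index name and metadata fields are assumptions:

from openai import AzureOpenAI
from pinecone import Pinecone

aoai = AzureOpenAI(api_key="...", api_version="2024-02-01",
                   azure_endpoint="https://example.openai.azure.com")
index = Pinecone(api_key="...").Index("support-kb")  # hypothetical index

def index_documents(docs: list[dict]) -> None:
    vectors = []
    for doc in docs:
        emb = aoai.embeddings.create(model="text-embedding-ada-002",
                                     input=doc["text"])
        vectors.append({"id": doc["id"],
                        "values": emb.data[0].embedding,
                        "metadata": {"source": doc["source"]}})
    index.upsert(vectors=vectors)  # batch write; the query side mirrors this lookup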
3. Enterprise Credit-Risk & Fraud-Detection Platform
Implemented advanced data consolidation methodologies for processing 5M+ heterogeneous transaction and account records from multiple legacy systems, utilizing comprehensive ETL pipelines with Apache Spark and custom Python scripts to consolidate data into Delta Lake on Azure Databricks, publishing curated credit-risk datasets to both public and private Databricks Marketplaces.
Implemented Unity Catalog with fine-grained table- and column-level access controls and RBAC roles (Admin, Owner, User), ensuring compliance with DoD security standards across all Marketplace data assets.
Engineered 50+ predictive features using Spark SQL and Databricks Feature Store, reducing feature development time by 40% and enhancing model performance for 200,000+ cardholders.
Developed and productionized an XGBoost credit-scoring model in Azure ML, improving AUC from 0.78 to 0.90 and reducing false positives by 15%.
Implemented real-time fraud detection with TensorFlow autoencoders on Databricks Structured Streaming and Azure Functions, detecting anomalies under 2 seconds and cutting false positives by 35%.
Automated MLOps workflows via Terraform and GitHub Actions, covering infrastructure provisioning, CI/CD pipelines, training, retraining, and MLflow experiment tracking, ensuring 99.9% uptime under strict compliance.
Optimized compute costs by right-sizing Databricks clusters and applying model quantization for inference, achieving ~25% cloud spend reduction while meeting latency SLAs (<150 ms for inference, sub-2 seconds for streaming).
Migrated Hive and SQL workflows to performant Spark transformations using Scala and PySpark, improving scalability and maintainability of Marketplace data assets.
Built and maintained sophisticated ETL pipelines implementing data consolidation methodologies across disparate data sources, utilizing Azure Data Factory, Spark SQL, and Airflow with custom transformation logic and data quality frameworks to ingest data from Azure SQL, Blob Storage, and ADLS Gen2 into Delta Lake and Azure Synapse Analytics for analytics and reporting consumption.
Delivered Power BI and SSRS dashboards providing actionable insights on credit risk, fraud trends, and campaign performance to stakeholders, leveraging Marketplace datasets for real-time decision-making.
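
An illustrative training loop for the XGBoost credit-scoring model above; the synthetic data stands in for the 50+ engineered features, and hyperparameters are placeholders:

import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the engineered cardholder features.
X, y = make_classification(n_samples=200_000, n_features=50, weights=[0.95])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, test_size=0.2)

model = xgb.XGBClassifier(n_estimators=500, max_depth=6,
                          learning_rate=0.05, eval_metric="auc")
model.fit(X_tr, y_tr, eval_set=[(X_te, y_te)], verbose=False)
print("AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))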

BCBS Sept 2019 - Mar 2021
Senior AI/ML Engineer
Roles & Responsibilities:
Generative AI PoC: GPT-3 Fine-Tuning for Automated Pharmacy Support
Led an internal PoC implementing agentic approaches for autonomous pharmacy support, assessing the feasibility of parameter-efficient GPT-3 fine-tuning for auto-generating pharmacy support responses, and developing comprehensive data consolidation methodologies to process multi-source conversational data using Azure OpenAI.
Collected and anonymized ~65,000 pharmacy email/chat logs using custom NER models and regex-based de-identification scripts in Python (spaCy, re).
Engineered prompt-response datasets and fine-tuned GPT-3 with Azure OpenAI CLI, adjusting n_epochs, learning_rate_multiplier, and prompt_loss_weight for stability and contextual alignment.
Benchmarked model performance with BLEU, METEOR, BERTScore, and human validation loops (using pharmacist-annotated truth sets), yielding a 35% improvement in contextual relevance.
Packaged the model into a Dockerized Flask API with comprehensive API endpoint management and GPU-optimized model serving, deployed via Azure Container Apps with advanced hosting strategies for LLM inference.
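
A sketch of the NER-plus-regex de-identification pass described above; the stock spaCy model and regex patterns are illustrative, not the project's custom NER models:

import re
import spacy

nlp = spacy.load("en_core_web_sm")  # stand-in for the custom NER models
PHONE = re.compile(r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")

def deidentify(text: str) -> str:
    doc = nlp(text)
    # Mask entities that could identify a member; iterate right-to-left so
    # character offsets stay valid while the string is rewritten.
    for ent in sorted(doc.ents, key=lambda e: e.start_char, reverse=True):
        if ent.label_ in {"PERSON", "GPE", "ORG"}:
            text = text[:ent.start_char] + f"[{ent.label_}]" + text[ent.end_char:]
    # Regex pass for structured identifiers NER tends to miss.
    return EMAIL.sub("[EMAIL]", PHONE.sub("[PHONE]", text))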
1. ML-Based Claims Fraud Detection via Anomaly Scoring
Built an unsupervised anomaly detection framework to identify fraud in pharmacy claims using Isolation Forest, LOF, and Autoencoder-based reconstruction loss.
Implemented comprehensive data consolidation methodologies for processing 150M+ heterogeneous claim records monthly, utilizing robust ETL pipelines with PySpark on Azure Databricks and GPU-accelerated processing, applying dimensionality reduction (PCA), time-windowed aggregations, and feature hashing on categorical fields (NDC, NPI, RX code).
Trained ensemble models for different fraud types (overbilling, excessive refills) and logged detection metrics in MLflow, tracking precision, recall, and flagged cases.
Built interactive visualizations and dashboards using tools such as Tableau, Power BI, Plotly, and GenAI-based visualization frameworks to effectively communicate complex analytics results.
Triggered real-time scoring via Azure Data Factory pipelines and Event Grid-based alert systems, feeding high-risk claims to fraud analysts with metadata-rich payloads.
Achieved a 27% increase in true positive rate (TPR) and reduced false alerts by 22% compared to rules-based legacy detection systems.
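
A minimal anomaly-scoring sketch using Isolation Forest, one of the detectors named above; the contamination rate and feature matrix are assumptions:

import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
claims = rng.normal(size=(100_000, 20))  # stand-in for engineered claim features

iso = IsolationForest(n_estimators=200, contamination=0.01, random_state=42)
iso.fit(claims)
scores = -iso.score_samples(claims)   # higher score = more anomalous
flagged = np.argsort(scores)[-100:]   # top claims routed to fraud analysts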
2. Medication Demand Forecasting for Pharmacy Inventory Planning
Designed time-series and regression-based forecasting models (using Facebook Prophet, XGBoost, and ARIMA) to predict 30-day demand per NDC and store location.
Processed historical sales, prescription fulfillment data, seasonality trends, and promotional events using PySpark pipelines in Azure Databricks.
Integrated external signals such as flu seasonality, weather trends, and regional healthcare events, creating aggregated feature sets at store, product, and region levels.
Built automated retraining workflows using Azure ML and Data Factory, with scheduled model refresh and performance tracking in MLflow.
Visualized forecast outputs and accuracy metrics in Power BI, enabling supply chain teams to adjust distribution plans.
Resulted in 18% reduction in out-of-stock events and 12% improvement in stock-to-demand alignment during flu and allergy seasons.
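
A sketch of a per-NDC 30-day Prophet forecast with an external regressor, as described above; the input frame and the flu-index signal are illustrative:

import pandas as pd
from prophet import Prophet

history = pd.DataFrame({
    "ds": pd.date_range("2019-01-01", periods=365),
    "y": list(range(365)),   # stand-in for daily fills of one NDC at one store
    "flu_index": 0.0,        # external seasonality signal (assumed name)
})

m = Prophet(yearly_seasonality=True, weekly_seasonality=True)
m.add_regressor("flu_index")
m.fit(history)

future = m.make_future_dataframe(periods=30)
future["flu_index"] = 0.0  # regressor must be supplied for future dates too
forecast = m.predict(future)[["ds", "yhat", "yhat_lower", "yhat_upper"]]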
3. Preventive Care Gap Risk Model for Medicare Advantage
Developed a predictive classification model using Random Forests, XGBoost, and Logistic Regression to identify ~1.8M Medicare members at high risk of missing preventive screenings (A1C, flu shot, mammogram).
Extracted and transformed longitudinal patient histories using Azure Synapse SQL, PySpark, and custom Python ETL scripts; joined claims with CMS HEDIS measures.
Performed feature selection using Recursive Feature Elimination (RFE) and correlation analysis; optimized pipeline with scikit-learn Pipelines API and stratified 5-fold CV.
Deployed the model via Azure ML endpoints, integrated with Salesforce Health Cloud to drive outreach prioritization by population health teams.
Led to a 19% increase in care gap closures in pilot regions (TX, FL, OH) and reduced outreach waste by 24% via precision targeting.
Designed and automated end-to-end ML pipelines using Azure ML, Azure Data Factory, and Databricks Jobs for data ingestion, training, validation, and model versioning with MLflow, ensuring reproducibility and governance compliance.
Deployed scalable models via Azure Container Apps, AKS, and Azure ML Endpoints, incorporating CI/CD workflows using Git, YAML pipelines, and integration with Azure DevOps for continuous delivery and rollback.
Implemented model monitoring and drift detection using custom metrics (precision, recall, data drift indicators) logged in MLflow, triggering retraining workflows and alerts through Azure Monitor and Event Grid.
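
An illustrative scikit-learn pipeline mirroring the RFE and stratified 5-fold CV approach from the care gap model above; estimators, feature counts, and synthetic data are assumptions:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=20_000, n_features=80, weights=[0.9])
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("rfe", RFE(LogisticRegression(max_iter=1000), n_features_to_select=25)),
    ("clf", RandomForestClassifier(n_estimators=300)),
])
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
print("CV AUC:", cross_val_score(pipe, X, y, cv=cv, scoring="roc_auc").mean())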
Bank of America, Dallas, TX Jan 2018 - Aug 2019
Machine Learning Engineer
Roles & Responsibilities:
1. NLP-Based Complaint Classification
Built and deployed advanced NLP models utilizing self-attention mechanisms and transformer architectures, including BERT and fine-tuned open-source models with transfer learning and layer freezing techniques on GPU clusters, using Amazon SageMaker with comprehensive data consolidation methodologies to flag regulatory complaints from ~1 million anonymized call transcripts, achieving an 88% F1 score across priority categories through domain adaptation and multi-task learning approaches.
Extracted complaint signals using TF-IDF, LDA (via Gensim), and spaCy NER, processing financial and legal language patterns.
Exposed the model via a SageMaker Endpoint wrapped in a Lambda function with Flask-based API orchestration and GPU-optimized model hosting, enabling near real-time call transcript scoring through agentic workflows for autonomous complaint routing and escalation.
Implemented monthly retraining pipelines with SageMaker Pipelines, integrating labeled feedback from compliance analysts stored in Amazon S3 and versioned with Amazon CodeCommit.
Developed regex-based extractors and generated topic summaries using AWS Glue + Athena, facilitating fast filtering of flagged calls.
Enabled model explainability with SHAP visualizations served via an internal API Gateway + CloudWatch dashboard, supporting SR 11-7 audit requirements.
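
A sketch of BERT complaint classification with layer freezing per the bullets above; the label count and the frozen-layer cutoff are assumptions:

from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=6)  # e.g., six priority categories (assumed)

# Freeze embeddings and the lower encoder layers; fine-tune only the top layers.
for param in model.bert.embeddings.parameters():
    param.requires_grad = False
for layer in model.bert.encoder.layer[:8]:
    for param in layer.parameters():
        param.requires_grad = False

inputs = tokenizer("Customer disputes an unauthorized overdraft fee",
                   return_tensors="pt", truncation=True)
logits = model(**inputs).logits  # one score per complaint category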
2. Real-Time Transaction Fraud Detection
Designed and trained a fraud detection engine using XGBoost on SageMaker and deployed it on historical batches of 2 million credit card transactions.
Engineered 150+ features such as transaction time gaps, merchant category density, location drift, and device fingerprints using AWS Glue and PySpark on EMR.
Streamed live transactions via Amazon Kinesis and performed real-time fraud scoring with SageMaker Hosting + Lambda, maintaining <1.5s prediction latency.
Achieved a 10% precision gain and 8% recall lift over existing business rules through hyperparameter tuning and SMOTE-based data balancing.
Created fraud analyst dashboards with Amazon QuickSight, embedding SHAP risk insights and transaction context.
Enabled model performance tracking, data drift alerts, and retraining triggers using Amazon CloudWatch, SageMaker Model Monitor, and Step Functions.
Built scalable ML pipelines using Amazon SageMaker Pipelines, integrating data from S3, transformation logic in AWS Glue, and orchestration with Step Functions.
Containerized ML models with Docker and deployed them on Amazon ECS with Fargate and SageMaker Endpoints for API-based consumption.
Implemented centralized monitoring using Amazon CloudWatch, Model Monitor, and drift detection logic to trigger automated retraining workflows via SageMaker Pipelines and Lambda functions.
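
An illustrative SMOTE-rebalanced XGBoost training run matching the fraud bullets above; the synthetic matrix stands in for the 150+ engineered features:

import xgboost as xgb
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200_000, n_features=150, weights=[0.995])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, test_size=0.2)

# Oversample the minority (fraud) class in the training split only.
X_bal, y_bal = SMOTE(random_state=42).fit_resample(X_tr, y_tr)
model = xgb.XGBClassifier(n_estimators=400, max_depth=8, learning_rate=0.1)
model.fit(X_bal, y_bal)

pred = model.predict(X_te)
print("precision:", precision_score(y_te, pred),
      "recall:", recall_score(y_te, pred))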

Gap Inc., San Francisco, CA Feb 2017 - Dec 2017
Machine Learning Engineer
Roles & Responsibilities:
Developed and deployed a scalable recommendation engine for Gap Inc.'s e-commerce platform using Apache Spark (ALS algorithm), delivering personalized product suggestions to over 850,000 active users.
Implemented advanced data consolidation methodologies and comprehensive ETL pipelines to aggregate and process 6M+ heterogeneous customer transactions and browsing records from web and in-store systems utilizing distributed computing frameworks (Apache Spark) to build unified customer profiles across Gap, Banana Republic, and Old Navy brands.
Optimized data processing performance through parallel computing techniques and cluster optimization, reducing customer profile generation time from 8 hours to 45 minutes and enabling near real-time customer segmentation for personalized recommendations.
Conducted exploratory data analysis (EDA) on 4 TB of user behavior data, identifying key product affinities, segment patterns, and buying triggers.
Engineered features including RFM scores, product co-purchase frequency, and content-based similarity metrics, improving model input quality and boosting recommendation accuracy by 15%.
Performed grid search and cross-validation to fine-tune hyperparameters, increasing recommendation precision by 11% and recall by 9% compared to the legacy system.
Deployed the model as a distributed Spark job with Flask-based API endpoints and GPU-optimized model serving, enabling near real-time updates with an average prediction latency of <2.5 seconds through agentic recommendation workflows and zero-shot personalization techniques.
Collaborated with digital marketing teams to integrate the engine into on-site product pages and email campaigns, reaching over 1.2M customers with dynamically generated offers.
Led end-to-end A/B testing of recommendation variants, demonstrating a 10% uplift in click-through rate (CTR) and a 7% increase in conversion rate for recommended products.
Contributed to a 9% lift in average order value (AOV) and a 6% improvement in customer retention within six months of system launch.
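
A minimal Spark ALS sketch matching the recommendation engine above; column names, rank, and the toy ratings are assumptions:

from pyspark.sql import SparkSession
from pyspark.ml.recommendation import ALS

spark = SparkSession.builder.appName("gap-recs").getOrCreate()
ratings = spark.createDataFrame(
    [(1, 101, 5.0), (1, 102, 3.0), (2, 101, 4.0)],
    ["user_id", "item_id", "rating"])  # stand-in for transaction-derived scores

als = ALS(userCol="user_id", itemCol="item_id", ratingCol="rating",
          rank=50, regParam=0.1, coldStartStrategy="drop")
model = als.fit(ratings)
top10 = model.recommendForAllUsers(10)  # per-user product suggestions
top10.show(truncate=False)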

Open Space Innovates, Hyderabad, India Jun 2012 - Dec 2014
Data Scientist
Roles & Responsibilities:
1. Recommendation Systems:
Designed and implemented collaborative and content-based filtering algorithms to generate personalized product recommendations, leveraging user behavior and product metadata.
Conducted exploratory data analysis to identify key user-item interaction patterns, improving recommendation relevance and diversity.
Validated model performance using offline metrics (precision, recall, MAP), achieving measurable gains in user engagement and conversion rates.
2. Sales Forecasting:
Developed time-series forecasting models (ARIMA, Prophet) on historical sales data to predict demand, accounting for seasonality and promotional effects.
Engineered features from calendar events, pricing, and external factors to enhance model accuracy by over 15%.
Evaluated and compared multiple forecasting approaches using RMSE and MAE metrics to select optimal models for deployment.
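
A sketch of the ARIMA holdout evaluation described above; the model order and simulated sales series are illustrative:

import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error, mean_squared_error
from statsmodels.tsa.arima.model import ARIMA

sales = pd.Series(np.random.default_rng(0).poisson(100, 365).astype(float),
                  index=pd.date_range("2013-01-01", periods=365))
train, test = sales[:-30], sales[-30:]  # hold out the final 30 days

model = ARIMA(train, order=(2, 1, 2)).fit()
pred = model.forecast(steps=30)
print("MAE:", mean_absolute_error(test, pred),
      "RMSE:", np.sqrt(mean_squared_error(test, pred)))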