Sujitha Cheruku - Sr AI Machine Learning Engineer | Gen AI Engineer | Agentic AI Engineer | Python Developer
[email protected]
Location: Mclean, Virginia, USA
Relocation: Yes
Visa: GC
Sujitha Cherukuthota
[email protected]
+1 (757) 936-9318



Summary:
Senior AI / ML Engineer with 10 years of experience building Generative AI, Machine Learning, NLP, and cloud-native analytics solutions across Healthcare, Banking, Retail, and Government enterprise environments.
Developed enterprise Conversational AI and AI Assistant applications using AWS Bedrock, Azure OpenAI, LangChain, LangGraph, Claude, Crew AI, MCP, and LlamaIndex supporting intelligent automation workflows and contextual enterprise search.
Built production-grade RAG Pipelines using Pinecone, FAISS, semantic retrieval, embedding models, contextual grounding, retrieval-aware prompting, and hybrid search improving enterprise chatbot relevance and response quality.
Strong expertise developing Machine Learning and Deep Learning solutions using Scikit-learn, TensorFlow, PyTorch, Keras, Spark MLlib, Pandas, NumPy, and SciPy supporting forecasting, fraud detection, anomaly detection, and predictive analytics initiatives.
Experienced in developing transformer-based NLP Solutions using Hugging Face, BERT, GPT, T5, spaCy, and NLTK supporting summarization, classification, entity extraction, conversational AI, and intelligent document processing workflows.
Designed scalable AI Architectures using FastAPI, REST APIs, Microservices Architecture, Docker, Kubernetes, Amazon EKS, and Azure AKS supporting secure model deployment and distributed inference service integration capabilities.
Implemented enterprise MLOps and model lifecycle workflows using MLflow, Jenkins, GitHub Actions, Terraform, CI/CD Pipelines, Kubernetes, Prometheus, and CloudWatch supporting deployment automation, monitoring, and scalable AI operations.
Developed distributed Data Engineering and ETL pipelines using PySpark, Databricks, SQL, Hive, BigQuery, Azure Data Factory, AWS Glue, and Pandas supporting enterprise analytics and machine learning model training workflows.
Hands-on experience with enterprise Cloud Platforms including AWS, Azure, and Google Cloud Platform implementing scalable AI workloads using Bedrock, AWS Lambda, Azure Synapse, BigQuery, and distributed cloud-native services.
Experienced in developing enterprise Data Visualization and reporting solutions using Tableau, Matplotlib, Seaborn, Grafana, Power BI, and operational dashboards supporting executive analytics and business decision-making initiatives.

Technical Skills:
Programming languages: Python, SQL, Shell Scripting, PL/SQL
Generative AI & LLMs: AWS Bedrock, Azure OpenAI, Claude, Titan Embeddings, LangChain, LangGraph, LlamaIndex, Crew AI, MCP, Hugging Face, Prompt Engineering, ReAct Reasoning, HITL, RAG Pipelines
Machine Learning & Deep Learning: Scikit-learn, TensorFlow, PyTorch, Keras, Spark MLlib, MLflow, Predictive Modeling, Forecasting, Classification, Regression, Clustering, Anomaly Detection, A/B Testing, Feature Engineering
NLP & Vector Search: Pinecone, FAISS, Azure AI Search, NLTK, Vector Retrieval, Semantic Search, Embedding Models, Conversational AI
Data Engineering & Big Data: PySpark, Databricks, AWS Glue, Azure Data Factory, Hadoop, Hive, ETL Pipelines, Delta Lake, Data Validation, Data Transformation
Cloud Platforms & Storage: AWS, Azure, Google Cloud Platform, Amazon S3, Redshift, DynamoDB, Azure Blob Storage, Azure Synapse Analytics, BigQuery
Frameworks & Backend Development: FastAPI, REST APIs, Node.js, React.js, Microservices Architecture, Distributed Systems, Azure Functions
DevOps, CI/CD & MLOps: Docker, Kubernetes, Amazon EKS, AKS, GitHub Actions, Jenkins, Terraform, CloudFormation, CI/CD Pipelines
Visualization & Analytics: Tableau, Matplotlib, Seaborn, Pandas, NumPy, SciPy, Statistical Analysis, Data Visualization, Operational Reporting
Monitoring & Testing: Prometheus, Grafana, CloudWatch, Evidently AI, RAGAS, PyTest, Swagger API Validation, Model Monitoring

Experience:
HCA Healthcare
Senior AI / Machine Learning Engineer Richmond, VA | Feb 2024 - Present
Developed healthcare AI Assistant solutions using AWS Bedrock, LangChain, FastAPI, and Pinecone, helping care teams efficiently retrieve discharge summaries, referral notes, and clinical documents through conversational workflows.
Designed scalable Generative AI workflows using LangGraph, REST APIs, MCP servers, and vector retrieval pipelines, supporting operational healthcare requests, provider communications, and internal documentation search activities across support teams.
Engineered distributed ingestion pipelines using Python and AWS Lambda, processing discharge summaries, PDF records, provider communications, and operational datasets supporting conversational AI analytics and semantic retrieval capabilities.
Built preprocessing workflows using PySpark, Databricks, and AWS Glue, transforming structured healthcare datasets while improving conversational response quality and enterprise AI data preparation across production support environments.
Managed healthcare storage architectures using Amazon S3, Redshift, and DynamoDB, supporting governed analytics, secure healthcare accessibility, and scalable querying across reporting platforms and conversational AI support services.
Implemented semantic retrieval workflows using Pinecone and FAISS, improving contextual healthcare search relevance through embedding-based retrieval and intelligent conversational response generation supporting clinical support operations.
Utilized AWS Bedrock foundation models including Claude and Titan embeddings, improving contextual response quality while enabling secure healthcare document understanding for operational healthcare support workflows.
Built production-ready RAG Pipelines using LlamaIndex, LangChain, and embedding models, improving chatbot relevance through semantic retrieval, contextual grounding, retrieval-aware prompting, and conversational healthcare assistance capabilities.
Enhanced enterprise Prompt Engineering workflows using ReAct Reasoning, HITL validation, and contextual prompting techniques, reducing hallucinations while improving conversational consistency across healthcare production AI environments.
Developed orchestration services using LangGraph and Crew AI, supporting multi-agent conversational workflows, intelligent automation pipelines, and secure LLM integrations across operational healthcare assistant platforms.
Integrated GitHub Copilot and modular backend development practices using Python and Microservices Architecture, improving developer productivity, reusable integrations, deployment consistency, and scalable AI service development capabilities.
Created lightweight healthcare support interfaces using Node.js and REST APIs, enabling operational teams to securely access conversational AI services through internal healthcare document retrieval support portals.
Analyzed operational healthcare datasets using PySpark and Databricks, identifying workflow bottlenecks, improving retrieval relevance, and supporting AI-driven recommendations across conversational healthcare support environments.
Utilized MLflow and PyTorch workflows for experiment tracking, prompt versioning, model evaluation, and deployment lifecycle management across production conversational AI and machine learning environments.
Optimized low-latency inference workflows using FastAPI and asynchronous processing, improving conversational response performance and concurrent healthcare request handling across enterprise AI support applications.
Built scalable inference services using Docker and AWS ECR, enabling portable deployments, runtime consistency, secure container management, and standardized conversational AI execution across distributed healthcare environments.
Deployed conversational AI workloads on Amazon EKS and Kubernetes, enabling autoscaling, workload isolation, fault-tolerant orchestration, and highly available healthcare AI platform deployments across production environments.
Automated deployment workflows through CI/CD Pipelines using GitHub Actions and Jenkins, improving release reliability, deployment governance, rollback readiness, and continuous delivery across conversational AI application pipelines.
Monitored conversational AI services using CloudWatch, Prometheus, and RAGAS evaluation workflows, tracking latency, retrieval quality, operational alerts, and conversational performance across distributed healthcare production environments.
Performed Unit Testing and API validation using PyTest and Swagger specifications, ensuring stable integrations, secure API functionality, deployment reliability, and long-term maintainability across healthcare AI application environments.
Environment
Python, AWS Bedrock, Claude, Titan Embeddings, LangChain, LangGraph, LlamaIndex, Crew AI, MCP, Pinecone, FAISS, Hugging Face, FastAPI, Node.js, REST APIs, PySpark, Databricks, AWS Glue, Amazon S3, Redshift, DynamoDB, Docker, Kubernetes, Amazon EKS, Jenkins, GitHub Actions, Terraform, CloudFormation, MLflow, RAGAS, Prometheus, CloudWatch, PyTorch, TensorFlow, Scikit-learn, Microservices Architecture, Distributed Systems, CI/CD Pipelines, Agile Scrum.
Sallie Mae Bank
AI / Machine Learning Engineer Newark, DE | May 2022 - Jan 2024
Developed intelligent Fraud Detection solutions using Azure OpenAI, TensorFlow, FastAPI, and Scikit-learn, helping financial analysts identify suspicious loan activities and transactional anomalies across banking support workflows.
Designed scalable AI-driven workflows using Azure Functions, REST APIs, and event-driven pipelines, supporting fraud monitoring, document validation, conversational banking assistance, and operational financial investigation processes.
Engineered ingestion pipelines using Python and Azure Data Factory, processing loan applications, repayment histories, customer transactions, and operational banking datasets supporting fraud analytics and machine learning workflows.
Built preprocessing workflows using PySpark, SQL, and Azure Databricks, transforming structured financial datasets while improving fraud scoring accuracy, feature quality, and downstream machine learning model preparation pipelines.
Managed banking storage architectures using Azure Blob Storage, Delta Lake, and Azure Synapse, supporting governed analytics, secure financial accessibility, and scalable querying across fraud monitoring platforms and reporting services.
Implemented semantic validation workflows using Azure AI Search and vector indexing, improving fraud investigation processes through intelligent retrieval, anomaly tracing, and contextual financial support response generation capabilities.
Utilized Azure OpenAI foundation models with TensorFlow and Scikit-learn to support fraud prediction workflows, intelligent document validation, conversational assistance, and operational banking investigation activities for fraud analysts.
Built production-ready ML Pipelines using MLflow, Spark MLlib, and Scikit-learn, improving fraud detection accuracy through feature engineering, model evaluation, and retrieval-assisted financial intelligence processing capabilities.
Conducted A/B Testing and offline validation workflows, comparing fraud scoring responses, classification consistency, and model behavior across production financial datasets supporting operational fraud investigation activities.
Enhanced enterprise Prompt Engineering workflows using contextual reasoning and HITL validation, reducing false positives while improving fraud classification quality and conversational response consistency across financial AI environments.
Developed orchestration services using FastAPI and LangChain, supporting conversational fraud investigation workflows, intelligent automation pipelines, and secure LLM integrations across operational banking assistant platforms.
Integrated GitHub Copilot and modular backend development practices using Python and Microservices Architecture, improving reusable integrations, deployment consistency, and scalable fraud analytics service development capabilities.
Created lightweight banking support interfaces using React.js and REST APIs, enabling fraud analysts to review conversational AI responses, validation insights, and transaction risk assessment workflows through secured dashboards.
Analyzed operational banking datasets using PySpark and Azure Databricks, identifying fraud patterns, improving anomaly detection relevance, and supporting AI-driven financial investigation recommendations across operational banking environments.
Utilized MLflow and TensorFlow workflows for experiment tracking, model evaluation, prompt versioning, and deployment lifecycle management across production fraud analytics and machine learning environments.
Optimized low-latency inference workflows using FastAPI and asynchronous processing, improving fraud scoring performance and concurrent banking request handling across enterprise financial AI support applications.
Built scalable inference services using Docker and Azure Container Registry, enabling portable deployments, runtime consistency, secure container management, and standardized AI execution across distributed banking environments.
Deployed fraud analytics workloads on AKS and Kubernetes, enabling autoscaling, workload isolation, fault-tolerant orchestration, and highly available financial AI platform deployments supporting banking transaction environments.
Automated deployment workflows through CI/CD Pipelines using GitHub Actions and Jenkins, improving rollback readiness, deployment governance, and release management across fraud analytics application deployment pipelines.
Monitored fraud analytics services using Prometheus, Grafana, and Evidently AI workflows, tracking latency, model drift, operational alerts, and fraud detection quality across production banking environments.
Environment
Python, Azure OpenAI, TensorFlow, PyTorch, Scikit-learn, Spark MLlib, FastAPI, LangChain, Azure Functions, Azure Databricks, Azure Data Factory, Azure Blob Storage, Delta Lake, Azure Synapse Analytics, Azure AI Search, MLflow, Docker, Kubernetes, AKS, GitHub Actions, Jenkins, Prometheus, Grafana, Evidently AI, React.js, REST APIs, SQL, PySpark, Microservices Architecture, Distributed Systems, CI/CD Pipelines.
State of California
Data Scientist / Machine Learning Engineer San Francisco, CA | Feb 2020 - Apr 2022
Developed large-scale exploratory analysis using Python and R, identifying operational trends, data inconsistencies, and forecasting gaps across statewide healthcare and public program reporting datasets.
Developed interactive analytical dashboards using Tableau and Matplotlib, transforming complex statistical findings into executive-level reporting insights supporting policy planning and operational decision-making initiatives.
Built demand forecasting and resource planning models using Scikit-learn and Pandas, improving next-cycle prediction accuracy through historical trend analysis, feature engineering, and validation benchmarking techniques.
Designed predictive analytics solutions using Random Forest and Logistic Regression, supporting operational forecasting, citizen service analysis, program utilization tracking, and data-driven planning initiatives across departments.
Evaluated machine learning model performance using NumPy and SciPy, applying statistical validation, error analysis, and comparative testing to ensure reliable forecasting outcomes across enterprise operational datasets.
Conducted detailed univariate and bivariate analysis using Seaborn and Pandas, identifying feature relationships, variable distributions, and behavioral trends supporting model optimization and analytical decision strategies.
Applied supervised learning techniques using SVM and KNN, solving prediction challenges related to healthcare utilization, operational planning, citizen engagement, and statewide public service analytics initiatives.
Implemented clustering workflows using K-means and DBSCAN, segmenting behavioral patterns and operational records to support anomaly identification, population analysis, and improved statewide reporting capabilities.
Built reusable preprocessing pipelines using Scikit-learn and feature engineering techniques, standardizing data cleansing, transformation, scaling, and validation workflows across multiple machine learning model development initiatives.
Performed enterprise statistical analysis using R and SQL, generating trend reports, operational summaries, and evidence-based recommendations supporting statewide program management and strategic planning efforts.
Developed deep learning prototypes using TensorFlow and Keras, exploring nonlinear relationships and predictive modeling improvements across healthcare analytics and operational classification use cases.
Applied advanced text processing workflows using NLTK and Python, supporting text normalization, keyword extraction, and semantic preprocessing across statewide public communication and reporting datasets.
Conducted forecasting validation and Model Testing using historical operational datasets, improving analytical reliability, prediction consistency, and reporting accuracy before enterprise-level deployment and stakeholder adoption.
Developed scalable analytical workflows using Python and SQL, automating reporting logic, validation routines, and statistical calculations supporting operational analytics across multiple statewide business units.
Integrated enterprise reporting datasets using Pandas and Data Visualization, enabling centralized analytics, operational transparency, and improved decision-making support for statewide healthcare administration teams.
Collaborated within Agile Scrum environments, working with analysts, reporting teams, and business stakeholders to deliver predictive analytics solutions aligned with statewide operational and compliance requirements.
Built automated data validation routines with statistical quality checks, identifying inconsistencies, missing values, and reporting anomalies across healthcare and operational enterprise datasets.
Supported operational reporting initiatives using Tableau and Statistical Analysis, delivering executive dashboards, trend summaries, and actionable insights across healthcare and statewide public service programs.
Performed production support and Performance Monitoring for forecasting models, ensuring analytical consistency, stable reporting behavior, and operational reliability across enterprise analytical workflows.
Prepared technical documentation and Analytical Reports supporting onboarding, reporting governance, model transparency, and long-term maintainability across statewide analytics and machine learning initiatives.
Environment
Python, R, Tableau, SQL, NumPy, Pandas, Matplotlib, Seaborn, SciPy, Scikit-learn, TensorFlow, Keras, NLTK, Logistic Regression, Random Forest, SVM, KNN, Classification, Regression, Clustering, K-means, DBSCAN, Feature Engineering, Statistical Analysis, Forecasting, Data Validation, Data Visualization, Agile Scrum.
Walmart Global Tech
Data Engineer Bentonville, AR | Oct 2016 - Dec 2019
Developed large-scale Exploratory Analysis workflows using Python and R, identifying operational trends, reporting inconsistencies, and forecasting gaps across statewide healthcare and public program datasets.
Built interactive Tableau Dashboards and Matplotlib visualizations, transforming statistical findings into executive-level reporting insights supporting policy planning, operational reviews, and statewide decision-making initiatives.
Developed demand forecasting models using Scikit-learn and Pandas, improving next-cycle prediction accuracy through historical trend analysis, feature engineering, and forecasting validation across operational healthcare reporting workflows.
Designed predictive analytics solutions using Random Forest and Logistic Regression, supporting operational forecasting, citizen service analysis, program utilization tracking, and statewide planning initiatives across multiple departments.
Evaluated machine learning model performance using NumPy and SciPy, applying statistical validation, error analysis, and comparative testing supporting reliable forecasting outcomes across statewide operational reporting datasets.
Conducted univariate and bivariate analysis using Seaborn and Pandas, identifying feature relationships, variable distributions, and behavioral trends supporting model optimization and analytical decision-making strategies across reporting environments.
Applied supervised learning techniques using SVM and KNN, solving prediction challenges related to healthcare utilization, operational planning, citizen engagement, and statewide public service analytics initiatives.
Implemented clustering workflows using K-means and DBSCAN, segmenting operational records and behavioral patterns supporting anomaly identification, population analysis, and statewide reporting improvement initiatives across healthcare programs.
Built reusable preprocessing pipelines using Scikit-learn and Feature Engineering techniques, standardizing data cleansing, transformation, scaling, and validation workflows across multiple machine learning development initiatives.
Performed enterprise statistical analysis using R and SQL, generating trend reports, operational summaries, and evidence-based recommendations supporting statewide healthcare program management and strategic planning initiatives.
Developed deep learning prototypes using TensorFlow and Keras, exploring nonlinear relationships and predictive modeling improvements across healthcare analytics, operational classification, and statewide reporting use cases.
Created automated reporting workflows using Python and Tableau, supporting operational analytics, executive dashboards, validation routines, and long-term analytical reporting across statewide healthcare administration teams.
Environment
Python, R, SQL, Pandas, NumPy, SciPy, Seaborn, Matplotlib, Tableau, Scikit-learn, TensorFlow, Keras, NLTK, Random Forest, Logistic Regression, SVM, KNN, K-means, DBSCAN, Feature Engineering, Statistical Analysis, Forecasting Models, Data Visualization, Predictive Analytics, Data Validation, Machine Learning, Deep Learning.
Citibank
Python Developer Hyderabad, India | Aug 2015 - Sep 2016
Developed Financial Reporting workflows using Python and SQL, helping operations teams analyze transaction records, customer activity, and compliance datasets supporting enterprise banking reconciliation and reporting requirements.
Engineered ETL Processes using Python, Pandas, and SQL, extracting banking datasets from relational systems while transforming operational records supporting downstream reporting and reconciliation workflows across banking environments.
Built reusable data processing scripts using Pandas and NumPy, automating cleansing, validation, normalization, and reporting preparation across transactional banking systems, customer account datasets, and operational reporting workflows.
Designed optimized queries in MySQL and Oracle, efficiently combining customer transactions, account histories, and operational datasets supporting enterprise banking analysis, compliance reporting, and regulatory reporting requirements.
Developed operational reporting utilities using MS Excel and Python, reducing manual reporting efforts while improving financial data validation, reconciliation accuracy, and banking operations reporting across financial environments.
Performed Data Validation and reconciliation using SQL and Pandas, identifying missing values, duplicate transactions, and reporting inconsistencies across enterprise banking operational workflows, reporting systems, and transactional datasets.
Automated scheduled reporting workflows using Shell Scripting and Python, supporting timely financial report generation, reporting automation, and consistent banking dataset availability across operational support and reporting environments.
Supported modular backend development using reusable Python scripts, improving maintainability, reporting consistency, operational efficiency, and workflow reuse across enterprise financial reporting applications.
Assisted with testing and production validation using PyTest and SQL, ensuring processed banking datasets accurately aligned with operational business rules, enterprise reporting standards, and downstream financial reporting requirements.
Monitored Operational Workflows using Python and SQL, troubleshooting processing failures, validating scheduled jobs, and supporting stable execution across enterprise banking reporting environments and financial operational systems.
Environment
Python, SQL, Pandas, NumPy, MySQL, Oracle, MS Excel, Shell Scripting, ETL Workflows, Data Validation, Financial Reporting, Operational Reporting, Data Cleansing, PyTest, Relational Databases, Reporting Automation, Banking Analytics.
Education:
Bachelor of Technology (B.Tech) in Computer Science
Sreyas Institute of Engineering and Technology Hyderabad, India
