
Sujitha Cheruku - Sr AI Machine Learning Engineer | MLOps | Gen AI Engineer | Agentic AI Engineer
[email protected]
Location: McLean, Virginia, USA
Relocation: Yes
Visa: Green Card
Sujitha Cherukuthota
+1 (757) 936-9318
[email protected]

Summary:
Senior AI / ML Engineer with 10 years of experience designing scalable Artificial Intelligence, Machine Learning, Generative AI, NLP, and cloud-native analytics solutions across Healthcare, Banking, Government, and Retail enterprise environments.
Developed enterprise Generative AI and Conversational AI applications using AWS Bedrock, Azure OpenAI, Claude, LangChain, LangGraph, and LlamaIndex, enabling intelligent automation, contextual search, and enterprise knowledge assistant capabilities.
Built production-grade RAG Pipelines using Pinecone, FAISS, Embeddings, Hybrid Retrieval, Semantic Search, Contextual Grounding, and Prompt Engineering techniques, improving enterprise AI response quality and retrieval relevance.
Strong expertise in Machine Learning model development using Scikit-learn, XGBoost, TensorFlow, PyTorch, Keras, Pandas, NumPy, and SciPy for forecasting, fraud detection, anomaly detection, clustering, and predictive analytics initiatives.
Experienced in developing transformer-based NLP Solutions using Hugging Face Transformers, BERT, GPT, T5, spaCy, and NLTK for summarization, classification, entity extraction, sentiment analysis, and conversational AI workflows.
Designed scalable AI Architectures using FastAPI, REST APIs, Microservices Architecture, Docker, Kubernetes, Amazon EKS, and Azure AKS, enabling secure model deployment and distributed inference service integration capabilities.
Implemented enterprise MLOps and Model Lifecycle Management workflows using MLflow, Kubeflow, Jenkins, GitHub Actions, Terraform, CI/CD Pipelines, and Kubernetes supporting automated deployment, monitoring, and scalable AI operations.
Developed distributed Data Engineering and ETL pipelines using PySpark, SQL, Hive, BigQuery, Azure Data Factory, Pandas, and Distributed Processing frameworks supporting enterprise analytics and machine learning model training initiatives.
Built enterprise Vector Search and Semantic Retrieval systems using Pinecone, Embedding Models, Contextual Search, Hybrid Retrieval, and Retrieval-aware Prompting improving chatbot intelligence and AI-driven response generation accuracy.
Applied advanced Deep Learning techniques using TensorFlow, Keras, CNNs, OpenCV, and PyTorch for intelligent automation, image classification, fraud analytics, operational forecasting, and predictive modeling use cases.
Hands-on expertise with enterprise Cloud Platforms including AWS, Azure, and Google Cloud Platform implementing scalable AI workloads using Bedrock, Azure Synapse, AWS Lambda, BigQuery, and Google AI Platform services.
Experienced in developing enterprise Data Visualization and analytical reporting solutions using Tableau, Power BI, Grafana, Matplotlib, and Seaborn while collaborating with cross-functional teams within Agile Scrum delivery environments.
Technical Skills:
Programming languages: Python, SQL, R, Java, PL/SQL, PHP
Generative AI & LLMs: AWS Bedrock, Azure OpenAI, Claude, LangChain, LangGraph, LlamaIndex, Hugging Face Transformers, BERT, T5, RAG Pipelines, Prompt Engineering, ReAct Reasoning
Machine Learning & NLP: Scikit-learn, XGBoost, TensorFlow, Keras, PyTorch, spaCy, NLTK, Predictive Modeling, Classification, Regression, Clustering, Forecasting, Anomaly Detection
Data Engineering & Processing: PySpark, Pandas, NumPy, ETL Workflows, Data Transformation, Feature Engineering, Statistical Analysis, Data Validation, Distributed Processing
Cloud Platforms: AWS, Azure, Google Cloud Platform (GCP), Amazon S3, AWS Lambda, Redshift, Azure Synapse Analytics, Azure Blob Storage, BigQuery, Google Cloud Storage
Vector Databases & Search: Pinecone, FAISS, Embeddings, Semantic Retrieval, Vector Search, Contextual Search, Hybrid Retrieval
Frameworks & APIs: FastAPI, REST APIs, Microservices Architecture, SOAP, WSDL, Distributed Systems
DevOps / MLOps: Docker, Kubernetes, Amazon EKS, Azure AKS, Terraform, CloudFormation, Azure DevOps, GitHub Actions, Jenkins, MLflow, CI/CD Pipelines
Databases & Warehousing: MySQL, Oracle 10g, DB2, DynamoDB, Hive, Delta Lake, Relational Databases, OLTP, OLAP
Visualization & Monitoring: Tableau, Matplotlib, Seaborn, Data Studio, Grafana, Prometheus, CloudWatch, Swagger API, MS Excel
Experience:
HCA Healthcare
Senior AI / Machine Learning Engineer Richmond, VA | Feb 2024 - Present
Developed healthcare AI Assistant solutions using AWS Bedrock, LangChain, FastAPI, and Pinecone, helping hospital staff retrieve referral notes and clinical documents while reducing manual search effort.
Designed scalable Generative AI workflows using REST APIs, LangGraph, and vector retrieval pipelines, enabling conversational healthcare support services that handle daily operational requests and internal documentation queries.
Worked within Agile Scrum teams alongside Product Owners and healthcare stakeholders, delivering conversational AI enhancements across biweekly sprints aligned with clinical workflows and enterprise operational modernization initiatives successfully.
Built distributed ingestion pipelines using Python and AWS Lambda, processing healthcare documents, provider communications, discharge summaries, and operational datasets supporting downstream conversational AI and analytics workflows reliably.
Developed preprocessing pipelines using PySpark and AWS Glue, transforming structured and unstructured healthcare datasets while reducing enterprise AI data preparation effort by 40% across production environments.
Managed healthcare storage architectures using Amazon S3 and Redshift, supporting governed analytics, secure healthcare data accessibility, and scalable querying across operational reporting and AI-driven support environments organization-wide.
Implemented semantic retrieval workflows using Pinecone and FAISS, improving contextual healthcare search relevance through embedding-based retrieval and intelligent conversational response generation across enterprise healthcare applications successfully.
Utilized AWS Bedrock foundation models including Claude, improving contextual response quality while enabling secure healthcare document understanding and conversational assistance across operational support workflows effectively.
Built production-ready RAG Pipelines using LlamaIndex and embedding models, improving chatbot relevance through semantic retrieval, contextual grounding, and retrieval-aware prompting strategies across healthcare support applications successfully.
Enhanced enterprise Prompt Engineering workflows using ReAct Reasoning and contextual prompting techniques, reducing hallucinations while improving conversational consistency and healthcare response accuracy across production AI environments.
Developed orchestration services using LangGraph and FastAPI, supporting distributed conversational workflows, secure LLM integrations, intelligent automation pipelines, and scalable healthcare assistant capabilities across operational support teams.
Applied modular backend development practices using Python and Microservices Architecture, improving maintainability, reusable integrations, deployment consistency, and scalable AI service development across enterprise healthcare applications effectively.
Conducted offline evaluations and Validation Testing using healthcare query datasets, improving conversational reliability and contextual accuracy before large-scale production deployments across enterprise AI support environments successfully.
Built scalable inference services using Docker and AWS ECR, enabling portable deployments, runtime consistency, secure container management, and standardized conversational AI execution across distributed healthcare environments reliably.
Deployed conversational AI workloads on Amazon EKS and Kubernetes, enabling autoscaling, workload isolation, fault-tolerant orchestration, and highly available healthcare AI platform deployments across production environments successfully.
Automated deployment workflows through CI/CD Pipelines using GitHub Actions and Jenkins, improving release reliability, deployment governance, rollback readiness, and continuous delivery across conversational AI application environments.
Managed cloud infrastructure using Terraform and CloudFormation, supporting scalable provisioning, infrastructure standardization, secure environment management, and governed cloud resource deployment across enterprise AI systems.
Monitored conversational AI services using CloudWatch and Prometheus, tracking latency, operational alerts, infrastructure utilization, and conversational performance metrics across distributed healthcare production environments continuously.
Performed Unit Testing and API Validation using PyTest, ensuring stable integrations, reliable deployments, secure API functionality, and scalable conversational AI reliability across enterprise healthcare application environments.
Prepared technical documentation and Swagger API specifications supporting onboarding, governance standards, operational maintainability, enterprise knowledge transfer, and long-term sustainability across healthcare AI implementations.
Environment: Python, AWS Bedrock, Claude, LangChain, LangGraph, LlamaIndex, Pinecone, FAISS, FastAPI, REST APIs, PySpark, AWS Glue, Amazon S3, Redshift, DynamoDB, Docker, Kubernetes, Amazon EKS, Jenkins, GitHub Actions, Terraform, CloudFormation, MLflow, Prometheus, CloudWatch, PyTorch, Scikit-learn, Microservices Architecture, Distributed Systems, CI/CD Pipelines, Agile Scrum.
Sallie Mae Bank
AI / Machine Learning Engineer Newark, DE | May 2022 - Jan 2024
Engineered intelligent loan servicing solutions using Azure OpenAI and LangChain, streamlining customer support workflows, repayment assistance operations, and financial document interactions across enterprise banking environments.
Designed event-driven AI architectures using Conversational AI and Microservices, enabling secure financial query handling, workflow orchestration, and scalable customer engagement capabilities across banking support platforms.
Worked within Agile Scrum delivery models, collaborating with product managers, fraud analysts, and compliance teams to deliver AI-powered banking solutions aligned with operational and regulatory requirements.
Built enterprise ingestion frameworks using Azure Data Factory and Python, integrating loan records, transaction histories, payment logs, and customer communication datasets from distributed banking applications.
Developed high-volume transformation pipelines using Azure Synapse and PySpark, processing financial datasets for fraud analytics, customer intelligence, risk scoring, and machine learning feature engineering workflows.
Structured governed financial storage solutions using Azure Blob Storage and Delta Lake, supporting scalable reporting, enterprise analytics, and secure access management across AI-driven banking applications.
Created semantic knowledge retrieval systems using Pinecone and Vector Search, enabling intelligent document discovery, contextual recommendations, and enterprise financial knowledge accessibility across support operations.
Fine-tuned enterprise NLP workflows using Hugging Face and BERT, supporting financial summarization, intent recognition, intelligent routing, and automated classification across customer servicing environments.
Implemented scalable RAG Pipelines using LangChain and embeddings, improving contextual banking response generation through retrieval-aware prompting and optimized financial document relevance techniques.
Enhanced enterprise Prompt Engineering strategies using contextual prompting and response grounding, improving conversational consistency while minimizing hallucinations across financial AI support environments.
Developed predictive fraud analytics using Scikit-learn and XGBoost, improving anomaly detection accuracy through feature selection, statistical validation, model optimization, and classification performance tuning methodologies.
Applied enterprise NLP processing using spaCy and NLTK, performing entity extraction, sentiment analysis, text normalization, and semantic preprocessing across large-scale financial communication datasets.
Performed enterprise Model Validation and performance benchmarking using historical transaction datasets, improving operational reliability, fraud prediction accuracy, and production readiness across AI-driven banking systems.
Built scalable inference services using Docker and Azure Container Registry, enabling portable deployments, workload standardization, runtime consistency, and reliable AI application delivery across banking environments.
Orchestrated distributed AI workloads using Azure AKS and Kubernetes, enabling autoscaling, workload resiliency, secure deployment isolation, and highly available machine learning operations across enterprise platforms.
Automated enterprise CI/CD Pipelines using Azure DevOps and GitHub Actions, improving deployment governance, release reliability, rollback management, and continuous integration across banking AI applications.
Provisioned enterprise AI infrastructure using Terraform and Azure Resource Manager, supporting governed cloud provisioning, secure networking, and scalable banking platform deployment management standards.
Tracked operational AI performance using Grafana and Prometheus, monitoring infrastructure health, inference latency, operational alerts, and production reliability across enterprise financial environments continuously.
Executed Unit Testing and integration validation using PyTest, ensuring deployment stability, secure API functionality, and enterprise compliance across production AI and machine learning systems.
Produced technical knowledge artifacts and Swagger API documentation supporting onboarding, governance standards, operational maintainability, and long-term sustainability across enterprise banking AI implementations.
Environment: Python, Azure OpenAI, LangChain, Hugging Face Transformers, BERT, Pinecone, Vector Search, Azure Synapse Analytics, Azure Data Factory, Azure Blob Storage, Delta Lake, PySpark, Docker, Kubernetes, Azure AKS, Azure DevOps, Terraform, Azure Resource Manager, Scikit-learn, XGBoost, spaCy, NLTK, REST APIs, CI/CD Pipelines, Grafana, Prometheus, PyTest, Swagger API, Microservices Architecture, Distributed Systems, Agile Scrum.
State of California
Data Scientist / Machine Learning Engineer San Francisco, CA | Feb 2020 - Apr 2022
Developed large-scale exploratory analysis using Python and R, identifying operational trends, data inconsistencies, and forecasting gaps across statewide healthcare and public program reporting datasets.
Developed interactive analytical dashboards using Tableau and Matplotlib, transforming complex statistical findings into executive-level reporting insights supporting policy planning and operational decision-making initiatives.
Built demand forecasting and resource planning models using Scikit-learn and Pandas, improving next-cycle prediction accuracy through historical trend analysis, feature engineering, and validation benchmarking techniques.
Designed predictive analytics solutions using Random Forest and Logistic Regression, supporting operational forecasting, citizen service analysis, program utilization tracking, and data-driven planning initiatives across departments.
Evaluated machine learning model performance using NumPy and SciPy, applying statistical validation, error analysis, and comparative testing to ensure reliable forecasting outcomes across enterprise operational datasets.
Conducted detailed univariate and bivariate analysis using Seaborn and Pandas, identifying feature relationships, variable distributions, and behavioral trends supporting model optimization and analytical decision strategies.
Applied supervised learning techniques using SVM and KNN, solving prediction challenges related to healthcare utilization, operational planning, citizen engagement, and statewide public service analytics initiatives.
Implemented clustering workflows using K-means and DBSCAN, segmenting behavioral patterns and operational records to support anomaly identification, population analysis, and improved statewide reporting capabilities.
Built reusable preprocessing pipelines using Scikit-learn and Feature Engineering, standardizing data cleansing, transformation, scaling, and validation workflows across multiple machine learning model development initiatives.
Performed enterprise statistical analysis using R and SQL, generating trend reports, operational summaries, and evidence-based recommendations supporting statewide program management and strategic planning efforts.
Developed deep learning prototypes using TensorFlow and Keras, exploring nonlinear relationships and predictive modeling improvements across healthcare analytics and operational classification use cases.
Applied advanced text processing workflows using NLTK and Python, supporting text normalization, keyword extraction, and semantic preprocessing across statewide public communication and reporting datasets.
Conducted forecasting validation and Model Testing using historical operational datasets, improving analytical reliability, prediction consistency, and reporting accuracy before enterprise-level deployment and stakeholder adoption.
Developed scalable analytical workflows using Python and SQL, automating reporting logic, validation routines, and statistical calculations supporting operational analytics across multiple statewide business units.
Integrated enterprise reporting datasets using Pandas and Data Visualization, enabling centralized analytics, operational transparency, and improved decision-making support for statewide healthcare administration teams.
Collaborated within Agile Scrum environments, working with analysts, reporting teams, and business stakeholders to deliver predictive analytics solutions aligned with statewide operational and compliance requirements.
Built automated validation routines using Data Validation and statistical quality checks, identifying inconsistencies, missing values, and reporting anomalies across healthcare and operational enterprise datasets.
Supported operational reporting initiatives using Tableau and Statistical Analysis, delivering executive dashboards, trend summaries, and actionable insights across healthcare and statewide public service programs.
Performed production support and Performance Monitoring for forecasting models, ensuring analytical consistency, stable reporting behavior, and operational reliability across enterprise analytical workflows.
Prepared technical documentation and Analytical Reports supporting onboarding, reporting governance, model transparency, and long-term maintainability across statewide analytics and machine learning initiatives.
Environment: Python, R, Tableau, SQL, NumPy, Pandas, Matplotlib, Seaborn, SciPy, Scikit-learn, TensorFlow, Keras, NLTK, Logistic Regression, Random Forest, SVM, KNN, Classification, Regression, Clustering, K-means, DBSCAN, Feature Engineering, Statistical Analysis, Forecasting, Data Validation, Data Visualization, Agile Scrum.
Walmart Global Tech
Data Engineer Bentonville, AR | Oct 2016 - Dec 2019
Developed scalable retail data solutions using Google Cloud Storage and BigQuery, supporting high-volume data ingestion, enterprise reporting, operational analytics, and business intelligence initiatives across distributed retail environments.
Built end-to-end ETL workflows using GCP, Python, and BigQuery, transforming structured and unstructured retail datasets into analytics-ready formats supporting reporting, forecasting, and operational decision-making processes.
Engineered cloud-native data processing workflows using Google AI Platform, Hadoop, and Hive, enabling scalable analytics, distributed processing, predictive modeling, and enterprise modernization initiatives across reporting systems.
Designed enterprise data lake architectures using BigQuery and Google Cloud Storage, improving data accessibility, reporting performance, inventory analytics, and centralized operational intelligence across retail business applications.
Developed interactive reporting dashboards using Data Studio and BigQuery, enabling self-service analytics, operational visibility, and business insights for reporting analysts and cross-functional retail stakeholders.
Integrated enterprise systems using SOAP and WSDL services, supporting reliable communication between inventory applications, operational platforms, and partner-facing retail systems across distributed business environments.
Automated enterprise data ingestion and SQL transformation workflows using Python, PL/SQL, Hive, Oracle 10g, and DB2, improving reporting consistency, query performance, and operational data reliability significantly.
Collaborated within Agile Scrum environments supporting enterprise reporting modernization, metadata management, OLTP/OLAP processing, technical documentation, and scalable analytics delivery across retail operational systems.
Environment: Python, Google Cloud Platform, Google Cloud Storage, BigQuery, Google AI Platform, Data Studio, SOAP, WSDL, Hadoop, Hive, Oracle 10g, DB2, OLTP, OLAP, Metadata Management, MS Excel, MS Visio, Rational Rose, PL/SQL, PHP, SQL, Agile Scrum.
Citibank
Python Developer Hyderabad, India | Aug 2015 - Sep 2016
Developed financial data processing and Operational Reporting solutions using Python and SQL, supporting transaction analysis, reconciliation activities, compliance tracking, and enterprise banking workflows across finance and reporting teams.
Built scalable ETL Workflows using Python, Pandas, MySQL, and Oracle, transforming transactional banking datasets into structured formats supporting downstream analytics, operational reporting, and financial data processing requirements.
Designed optimized SQL Queries and multi-table joins consolidating customer records, transaction histories, and account-level datasets supporting enterprise financial reporting, reconciliation processes, and banking analytics initiatives.
Implemented robust Data Validation and preprocessing routines using Pandas and Python, resolving duplicate records, missing values, formatting inconsistencies, and operational reporting issues across banking datasets.
Automated recurring reporting and batch-processing workflows using Python Scripting, improving report generation efficiency, reducing manual processing efforts, and supporting timely financial reporting across compliance-driven banking environments.
Maintained enterprise Relational Databases including MySQL and Oracle, supporting structured data storage, query optimization, reporting accessibility, and reliable processing across banking operational reporting systems.
Supported production testing and workflow monitoring activities by validating processed outputs, troubleshooting reporting pipelines, and developing reusable scripts aligned with Agile Scrum delivery and banking operational standards.
Environment: Python, SQL, Pandas, MySQL, Oracle, ETL Workflows, Data Processing, Data Transformation, Data Validation, Report Automation, Relational Databases, SQL Queries, Financial Reporting, Compliance Reporting, Batch Processing, Agile Scrum, Git, Linux, Operational Reporting.

Education:
Bachelor of Technology in Computer Science
Sreyas Institute of Engineering and Technology Hyderabad, India
