| Sujitha Cheruku - AI / ML Engineer | Machine Learning | Gen AI Engineer | Agentic AI | |
| [email protected] |
| Location: McLean, Virginia, USA |
| Relocation: Yes |
| Visa: Green Card |
| Resume file: Resume_Sujitha_Ch_1779370323887.docx |
|
Sujitha Cherukuthota
Senior AI / Machine Learning Engineer
+1 (757) 936-9318 | [email protected]

Summary:
Senior AI / Machine Learning Engineer with 10 years of experience delivering scalable AI, Machine Learning, NLP, and cloud-native analytics solutions across Healthcare, Banking, Government, and Retail enterprise environments.
Built enterprise Agentic AI and Generative AI applications using AWS Bedrock, Azure OpenAI, LangChain, LangGraph, Crew AI, MCP, and LlamaIndex supporting intelligent automation and contextual enterprise search capabilities.
Developed production-ready RAG Pipelines using Pinecone, FAISS, embeddings, semantic retrieval, contextual grounding, and retrieval-aware prompting improving enterprise AI response relevance and document discovery quality significantly.
Experienced building intelligent AI agents supporting planning, reasoning, execution, validation, memory handling, and human-in-the-loop workflows across healthcare and financial operational environments using modern orchestration frameworks.
Strong expertise developing Machine Learning and predictive analytics solutions using Scikit-learn, XGBoost, TensorFlow, PyTorch, Keras, and Spark MLlib supporting fraud detection, forecasting, anomaly detection, and classification initiatives.
Developed transformer-based NLP Solutions using Hugging Face, BERT, GPT, spaCy, and NLTK supporting summarization, entity extraction, conversational AI, intelligent routing, and document processing workflows.
Designed scalable AI platforms using FastAPI, REST APIs, Docker, Kubernetes, Amazon EKS, and Azure AKS supporting distributed inference services, microservices architectures, and enterprise AI deployment requirements.
Built distributed Data Engineering and ETL workflows using PySpark, Databricks, SQL, Hive, Azure Data Factory, and AWS Glue supporting large-scale analytics and machine learning data preparation activities.
Implemented enterprise MLOps workflows using MLflow, GitHub Actions, Jenkins, Terraform, CI/CD Pipelines, Prometheus, and CloudWatch supporting deployment automation, monitoring, governance, and operational reliability requirements.
Experienced working with healthcare and banking business domains including PBM, Prior Authorization, Claims Adjudication, Member Eligibility, Fraud Analytics, and financial operational reporting supporting enterprise AI modernization initiatives.

Technical Skills:
Programming Languages: Python, SQL, Java, PL/SQL, Shell Scripting
Agentic AI & Generative AI: AWS Bedrock, Azure OpenAI, Claude, LangChain, LangGraph, Crew AI, MCP, AI Agents, Agent Orchestration, Tool Invocation, LlamaIndex, Prompt Engineering, ReAct Reasoning, HITL, RAG Pipelines, RAGAS
Machine Learning & Deep Learning: Scikit-learn, XGBoost, TensorFlow, PyTorch, Keras, Spark MLlib, MLflow, Predictive Modeling, Forecasting, Classification, Regression, Clustering, Anomaly Detection, A/B Testing, Feature Engineering
NLP & Semantic Retrieval: Hugging Face, BERT, spaCy, NLTK, Pinecone, FAISS, Embeddings, Semantic Search, Hybrid Retrieval, Contextual Grounding, Vector Databases, Intelligent Document Processing
Data Engineering & Big Data: PySpark, Databricks, AWS Glue, Azure Data Factory, Hive, ETL Pipelines, Data Transformation, Data Validation, Statistical Analysis, Distributed Processing
Cloud Platforms & Storage: AWS, Azure, Amazon S3, AWS Lambda, Redshift, DynamoDB, Azure Blob Storage, Azure Synapse Analytics, Delta Lake
Frameworks & APIs: FastAPI, REST APIs, Microservices Architecture, Distributed Systems, Workflow Automation, Enterprise Integrations
DevOps / MLOps: Docker, Kubernetes, Amazon EKS, Azure AKS, GitHub Actions, Jenkins, Terraform, CloudFormation, Azure DevOps, CI/CD Pipelines, Model Lifecycle Management
Monitoring & Validation: Prometheus, CloudWatch, Grafana, RAGAS, PyTest, Swagger API Validation, Model Monitoring, Performance Evaluation
Visualization & Reporting: Tableau, Power BI, Matplotlib, Seaborn, Grafana, Operational Reporting, Data Visualization, Executive Dashboards

Experience:

HCA Healthcare
Senior AI / Machine Learning Engineer
Richmond, VA | Feb 2024 - Present
Built healthcare AI Assistant platform using AWS Bedrock, LangChain, and FastAPI, helping support teams retrieve Prior Authorization records, PBM documents, and clinical notes during provider case review activities.
Integrated referral notes, FHIR records, HL7 messages, and pharmacy claims using Python and AWS Lambda, establishing validated ingestion workflows for downstream retrieval, analytics, and operational healthcare processing activities.
Processed structured claims datasets and unstructured clinical documents using PySpark, Databricks, and AWS Glue, preparing normalized healthcare data for semantic retrieval, forecasting models, and reporting pipeline requirements.
Stored curated healthcare datasets within Amazon S3, Redshift, and DynamoDB, enabling governed access patterns for provider searches, operational reporting, AI retrieval services, and enterprise healthcare compliance requirements.
Implemented semantic indexing workflows using Pinecone, FAISS, and Titan Embeddings, improving contextual healthcare search accuracy across referral summaries, clinical records, provider communications, and pharmacy support documentation requests.
Configured AWS Bedrock with Claude foundation models supporting healthcare summarization, contextual question answering, and grounded responses over retrieved provider records, utilization reviews, and healthcare operational knowledge repositories.
Built production-ready RAG Pipelines using LlamaIndex and contextual chunking techniques, improving healthcare document retrieval relevance while maintaining grounded responses across provider support and claims review activities.
Developed healthcare forecasting and anomaly detection workflows using SageMaker, TensorFlow, Scikit-learn, and MLflow, supporting utilization prediction, provider trend analysis, and operational healthcare classification initiatives.
Applied Prompt Engineering, ReAct Reasoning, and contextual retrieval validation techniques, improving healthcare response consistency, grounded search quality, and execution reliability during enterprise healthcare workflow testing phases.
Designed LangGraph orchestration flows separating planning, retrieval, validation, and response generation stages, allowing healthcare agents to complete multi-step provider support requests safely through governed execution checkpoints.
Integrated Crew AI and MCP-based tool connectors supporting provider lookup, eligibility validation, healthcare summarization, and contextual retrieval workflows across distributed operational healthcare systems and governed support environments.
Developed reusable FastAPI services exposing retrieval, summarization, validation, and healthcare classification endpoints, supporting scalable integrations between provider systems, operational dashboards, and enterprise AI orchestration components.
Created lightweight Node.js interfaces allowing healthcare support teams to review retrieved evidence, validate AI-generated responses, submit analyst feedback, and escalate uncertain cases through governed operational review processes.
Conducted enterprise A/B Testing and offline retrieval evaluations comparing chunking strategies, reranking approaches, prompt variations, and grounded healthcare response quality before production rollout across operational support environments.
Implemented human-in-the-loop review workflows and policy guardrails ensuring healthcare AI responses remained explainable, auditable, compliant, and aligned with enterprise governance requirements across provider support operations.
Leveraged GitHub Copilot during backend API development and healthcare integration activities, improving reusable service creation, engineering productivity, deployment consistency, and enterprise AI delivery timelines.
Managed MLflow and SageMaker experiment tracking workflows covering model evaluation, prompt versioning, endpoint testing, inference monitoring, and healthcare machine learning deployment lifecycle activities across production environments.
Containerized inference workloads using Docker and AWS ECR, enabling portable healthcare AI deployments, runtime consistency, secure image management, and standardized execution across distributed operational platform environments.
Deployed healthcare AI services on Amazon EKS and Kubernetes, supporting autoscaling, workload resiliency, deployment isolation, and highly available orchestration across enterprise provider support and healthcare operations platforms.
Automated deployment pipelines using GitHub Actions, Jenkins, Terraform, and CloudFormation, improving infrastructure provisioning, rollback readiness, deployment governance, and release management activities.
Environment: Python, AWS Bedrock, Claude, Titan Embeddings, LangChain, LangGraph, Crew AI, MCP, LlamaIndex, Pinecone, FAISS, Hugging Face, FastAPI, Node.js, REST APIs, AWS SageMaker, MLflow, PySpark, Databricks, AWS Glue, AWS Lambda, Amazon S3, Redshift, DynamoDB, Docker, Kubernetes, Amazon EKS, AWS ECR, GitHub Actions, Jenkins, Terraform, CloudFormation, TensorFlow, Scikit-learn, RAGAS, HITL, ReAct Reasoning, AI Agents, Tool Invocation, RAG Pipelines, Healthcare Interoperability, HL7, FHIR, PBM, Prior Authorization, Claims Adjudication, Member Eligibility, Pharmacy Claims, Utilization Management, Clinical Data, CI/CD Pipelines.

Sallie Mae Bank
AI / Machine Learning Engineer
Newark, DE | May 2022 - Jan 2024
Built intelligent Fraud Analytics solutions using Azure OpenAI, LangChain, and FastAPI, helping fraud teams review suspicious transactions, customer interactions, repayment activities, and loan servicing operations across banking platforms.
Integrated loan applications, repayment histories, transaction logs, and customer communication datasets using Python and Azure Data Factory, establishing validated ingestion workflows across distributed banking operational systems.
Processed structured financial records and transactional datasets using PySpark, Azure Databricks, and Azure Synapse, preparing analytics-ready banking data supporting fraud scoring, risk modeling, and operational reporting requirements.
Managed governed financial storage architectures using Azure Blob Storage, Delta Lake, and Azure Synapse, enabling secure banking data access patterns across fraud analytics and reporting workflows.
Developed semantic retrieval workflows using Pinecone and Azure AI Search, improving contextual financial document discovery, repayment assistance retrieval, and enterprise banking knowledge accessibility across operational servicing teams.
Fine-tuned enterprise NLP Models using Hugging Face, BERT, spaCy, and NLTK, supporting customer intent classification, entity extraction, fraud communication analysis, and intelligent financial document processing activities.
Built contextual RAG Pipelines using LangChain and embedding-based retrieval, improving grounded banking response generation while reducing irrelevant recommendations across repayment support and fraud investigation workflows.
Developed predictive fraud detection workflows using Scikit-learn, XGBoost, TensorFlow, and MLflow supporting anomaly detection, fraud classification, transaction risk scoring, and operational banking analytics across production systems.
Applied statistical validation and A/B Testing methodologies comparing fraud classification accuracy, retrieval quality, model consistency, and operational banking response performance across enterprise financial AI workflow environments.
Optimized Prompt Engineering strategies and contextual grounding techniques improving conversational banking consistency, fraud investigation relevance, and financial document response quality across customer servicing support environments.
Developed reusable FastAPI services exposing fraud scoring, retrieval, summarization, and validation endpoints supporting integrations between operational banking systems, internal dashboards, and enterprise AI workflow components.
Integrated enterprise banking APIs and operational platforms enabling secure financial data retrieval, contextual workflow automation, intelligent routing, and customer servicing support across distributed banking application environments.
Leveraged GitHub Copilot during backend API development and fraud analytics integration activities improving reusable service creation, engineering productivity, deployment consistency, and enterprise banking AI delivery timelines.
Managed MLflow workflows covering experiment tracking, model evaluation, inference optimization, prompt versioning, and banking machine learning deployment lifecycle activities across distributed operational production environments.
Containerized banking inference workloads using Docker and Azure Container Registry, enabling runtime consistency, portable deployments, secure image management, and standardized execution across enterprise banking AI platforms.
Deployed fraud analytics workloads on Azure AKS and Kubernetes supporting autoscaling, deployment resiliency, workload isolation, and highly available orchestration across enterprise banking operations and servicing platforms.
Automated release workflows using Azure DevOps, GitHub Actions, Terraform, and Jenkins improving infrastructure provisioning, deployment governance, rollback readiness, and banking AI release management activities across environments.
Continuously monitored enterprise banking AI platforms using Grafana and Prometheus, tracking inference latency, infrastructure health, operational alerts, fraud model behavior, and production reliability across distributed banking environments.
Executed PyTest validation and Swagger API testing ensuring stable integrations, secure banking API functionality, deployment consistency, and enterprise compliance across production fraud analytics and AI systems.
Produced governed technical documentation and Swagger API specifications supporting onboarding activities, operational maintainability, integration standards, and long-term sustainability across enterprise banking AI implementation environments.
Environment: Python, Azure OpenAI, LangChain, Hugging Face, BERT, spaCy, NLTK, Pinecone, Azure AI Search, FastAPI, REST APIs, PySpark, Azure Databricks, Azure Data Factory, Azure Synapse Analytics, Azure Blob Storage, Delta Lake, TensorFlow, Scikit-learn, XGBoost, MLflow, Docker, Kubernetes, Azure AKS, Azure DevOps, GitHub Actions, Jenkins, Terraform, Grafana, Prometheus, PyTest, Swagger API, RAG Pipelines, Fraud Detection, Anomaly Detection, Predictive Analytics, CI/CD Pipelines, Microservices Architecture, Distributed Systems, Loan Servicing, Risk Scoring, Financial Analytics.

State of California
Data Scientist / Machine Learning Engineer
San Francisco, CA | Feb 2020 - Apr 2022
Developed intelligent financial AI solutions using Azure OpenAI, LangChain, FastAPI, and Scikit-learn supporting fraud investigation workflows, loan servicing operations, repayment assistance, and banking document validation activities.
Designed scalable AI orchestration workflows using LangChain, REST APIs, and Microservices Architecture enabling contextual reasoning, workflow automation, intelligent routing, and semi-autonomous financial support operations.
Built enterprise ingestion frameworks using Python and Azure Data Factory integrating loan applications, repayment histories, customer communications, transactional datasets, and financial operational reporting records from distributed banking systems.
Developed high-volume transformation pipelines using PySpark, Azure Databricks, and Azure Synapse processing financial datasets supporting fraud analytics, anomaly detection, risk scoring, and machine learning feature engineering activities.
Managed governed financial storage architectures using Azure Blob Storage, Delta Lake, and Azure Synapse supporting secure banking data accessibility, enterprise analytics, and scalable financial reporting capabilities.
Implemented semantic retrieval workflows using Pinecone and Azure AI Search enabling intelligent financial document discovery, contextual recommendations, fraud investigation assistance, and enterprise banking knowledge retrieval activities.
Fine-tuned enterprise NLP workflows using Hugging Face, BERT, spaCy, and NLTK supporting intent recognition, text classification, entity extraction, sentiment analysis, and automated banking document processing activities.
Built scalable RAG Pipelines using LangChain, embeddings, and semantic retrieval improving contextual response generation, financial document grounding, and intelligent banking assistance across operational support environments.
Developed predictive fraud analytics workflows using Scikit-learn, XGBoost, TensorFlow, and MLflow supporting anomaly detection, fraud classification, model evaluation, and enterprise financial risk analysis initiatives.
Conducted enterprise A/B Testing and offline validation workflows comparing fraud scoring accuracy, classification consistency, contextual retrieval quality, and operational model performance across banking AI environments.
Enhanced AI response behavior using Prompt Engineering and contextual grounding techniques improving fraud investigation quality, conversational consistency, and financial document response relevance across operational banking workflows.
Developed intelligent orchestration services using FastAPI and LangChain supporting fraud investigation assistance, workflow execution, contextual validation, and secure LLM integrations across enterprise banking support systems.
Integrated enterprise financial APIs and operational banking systems enabling secure data retrieval, workflow automation, contextual reasoning, and intelligent response generation across customer servicing environments.
Leveraged GitHub Copilot during backend API development and fraud analytics integrations improving reusable service creation, engineering productivity, and scalable financial AI application delivery workflows.
Utilized MLflow and TensorFlow workflows for experiment tracking, model evaluation, prompt versioning, inference optimization, and AI deployment lifecycle management across banking machine learning environments.
Built scalable inference services using Docker and Azure Container Registry enabling workload standardization, runtime consistency, secure deployments, and portable AI execution across enterprise financial platforms.
Deployed fraud analytics workloads on Azure AKS and Kubernetes enabling autoscaling, workload resiliency, deployment isolation, and highly available AI platform orchestration across banking operational environments.
Automated enterprise deployment workflows through CI/CD Pipelines using Azure DevOps, GitHub Actions, Terraform, and Jenkins improving release governance, rollback readiness, and deployment reliability across banking AI systems.
Tracked operational AI performance using Grafana, Prometheus, and model monitoring workflows identifying latency issues, infrastructure alerts, model drift, and fraud detection inconsistencies across production environments.
Performed Unit Testing and API validation using PyTest and Swagger specifications ensuring stable integrations, deployment reliability, secure API functionality, and enterprise compliance across financial AI applications.
Environment: Python, Azure OpenAI, LangChain, Hugging Face, BERT, Pinecone, Azure AI Search, FastAPI, REST APIs, PySpark, Azure Databricks, Azure Data Factory, Azure Synapse Analytics, Azure Blob Storage, Delta Lake, TensorFlow, Scikit-learn, XGBoost, spaCy, NLTK, MLflow, Docker, Kubernetes, Azure AKS, GitHub Actions, Azure DevOps, Jenkins, Terraform, Grafana, Prometheus, PyTest, Swagger API, RAG Pipelines, Semantic Retrieval, Fraud Detection, Anomaly Detection, Predictive Analytics, CI/CD Pipelines, Microservices Architecture, Distributed Systems.

Walmart Global Tech
Data Engineer
Bentonville, AR | Oct 2016 - Dec 2019
Developed scalable retail data solutions using Google Cloud Storage and BigQuery, supporting high-volume data ingestion, enterprise reporting, operational analytics, and business intelligence initiatives across distributed retail environments.
Built end-to-end ETL workflows using GCP, Python, and BigQuery, transforming structured and unstructured retail datasets into analytics-ready formats supporting reporting, forecasting, and operational decision-making processes.
Engineered cloud-native data processing workflows using Google AI Platform, Hadoop, and Hive, enabling scalable analytics, distributed processing, predictive modeling, and enterprise modernization initiatives across reporting systems.
Designed enterprise data lake architectures using BigQuery and Google Cloud Storage, improving data accessibility, reporting performance, inventory analytics, and centralized operational intelligence across retail business applications.
Developed interactive reporting dashboards using Data Studio and BigQuery, enabling self-service analytics, operational visibility, and business insights for reporting analysts and cross-functional retail stakeholders.
Integrated enterprise systems using SOAP and WSDL services, supporting reliable communication between inventory applications, operational platforms, and partner-facing retail systems across distributed business environments.
Automated enterprise data ingestion and SQL transformation workflows using Python, PL/SQL, Hive, Oracle 10g, and DB2, improving reporting consistency, query performance, and operational data reliability.
Collaborated within Agile Scrum environments supporting enterprise reporting modernization, metadata management, OLTP/OLAP processing, technical documentation, and scalable analytics delivery across retail operational systems.
Environment: Python, Google Cloud Platform, Google Cloud Storage, BigQuery, Google AI Platform, Data Studio, SOAP, WSDL, Hadoop, Hive, Oracle 10g, DB2, OLTP, OLAP, Metadata Management, MS Excel, MS Visio, Rational Rose, PL/SQL, PHP, SQL, Agile Scrum.

Citibank
Python Developer
Hyderabad, India | Aug 2015 - Sep 2016
Developed Financial Reporting solutions using Python and SQL supporting transaction reconciliation, compliance tracking, operational reporting, and customer account analysis for daily banking activities and finance operations.
Built scalable ETL Workflows using Python, Pandas, MySQL, and Oracle, transforming transactional banking records into structured reporting datasets supporting reconciliation and financial processing requirements.
Designed optimized SQL Queries and multi-table joins consolidating customer transactions, account histories, and operational datasets supporting reconciliation checks, reporting accuracy, and banking data analysis activities.
Implemented robust Data Validation routines using Pandas and Python, identifying duplicate records, missing values, transactional inconsistencies, and formatting issues within high-volume financial reporting datasets.
Automated recurring reporting and Batch Processing activities using Python and Shell Scripting, improving operational efficiency, reducing manual reporting efforts, and supporting scheduled financial reporting execution requirements.
Maintained enterprise Relational Databases including MySQL and Oracle supporting query optimization, structured data storage, transaction processing, and reliable reporting accessibility for operational banking systems.
Developed reusable backend scripts using Python and SQL supporting operational reporting automation, transaction monitoring, reconciliation workflows, and standardized financial data processing activities within reporting applications.
Performed transaction-level analysis using SQL and Pandas supporting operational investigations, reconciliation validation, compliance reporting checks, and financial reporting quality assurance activities during production processing cycles.
Executed Unit Testing and workflow validation activities ensuring accurate transaction calculations, stable reporting logic, reliable processing behavior, and operational consistency across enterprise financial reporting systems.
Collaborated within Agile Scrum teams supporting reporting enhancements, operational workflow improvements, backend processing requirements, and technical documentation activities aligned with enterprise banking reporting standards.
Environment: Python, SQL, Pandas, MySQL, Oracle, ETL Workflows, Data Processing, Data Transformation, Data Validation, Financial Reporting, Compliance Reporting, Batch Processing, Shell Scripting, SQL Queries, Relational Databases, Operational Reporting, Report Automation, Agile Scrum, Git, Linux.

Education:
Bachelor of Technology in Computer Science
Sreyas Institute of Engineering and Technology, Hyderabad, India
|