| Sujitha C - Sr. AI / ML Engineer |
| [email protected] |
| Location: McLean, Virginia, USA |
| Relocation: Yes |
| Visa: GC |
| Resume file: Resume_Sujitha_Ch_1778680860032.docx |
|
Sujitha Cherukuthota
Senior AI / Machine Learning Engineer
+1 (757) 936-9318 | [email protected]

Summary:
Senior AI / ML Engineer with 10 years of experience designing scalable Artificial Intelligence, Machine Learning, Generative AI, NLP, and cloud-native analytics solutions across Healthcare, Banking, Government, and Retail enterprise environments.
Developed enterprise Generative AI and Conversational AI applications using AWS Bedrock, Azure OpenAI, Claude, LangChain, LangGraph, and LlamaIndex, enabling intelligent automation, contextual search, and enterprise knowledge assistant capabilities.
Built production-grade RAG Pipelines using Pinecone, FAISS, Embeddings, Hybrid Retrieval, Semantic Search, Contextual Grounding, and Prompt Engineering techniques, improving enterprise AI response quality and retrieval relevance significantly.
Strong expertise in Machine Learning model development using Scikit-learn, XGBoost, TensorFlow, PyTorch, Keras, Pandas, NumPy, and SciPy for forecasting, fraud detection, anomaly detection, clustering, and predictive analytics initiatives.
Experienced in developing transformer-based NLP Solutions using Hugging Face Transformers, BERT, GPT, T5, spaCy, and NLTK for summarization, classification, entity extraction, sentiment analysis, and conversational AI workflows.
Designed scalable AI Architectures using FastAPI, REST APIs, Microservices Architecture, Docker, Kubernetes, Amazon EKS, and Azure AKS, enabling secure model deployment and distributed inference service integration capabilities.
Implemented enterprise MLOps and Model Lifecycle Management workflows using MLflow, Kubeflow, Jenkins, GitHub Actions, Terraform, CI/CD Pipelines, and Kubernetes supporting automated deployment, monitoring, and scalable AI operations.
Developed distributed Data Engineering and ETL pipelines using PySpark, SQL, Hive, BigQuery, Azure Data Factory, Pandas, and Distributed Processing frameworks supporting enterprise analytics and machine learning model training initiatives.
Built enterprise Vector Search and Semantic Retrieval systems using Pinecone, Embedding Models, Contextual Search, Hybrid Retrieval, and Retrieval-aware Prompting improving chatbot intelligence and AI-driven response generation accuracy.
Applied advanced Deep Learning techniques using TensorFlow, Keras, CNNs, OpenCV, and PyTorch for intelligent automation, image classification, fraud analytics, operational forecasting, and predictive modeling use cases.
Hands-on expertise with enterprise Cloud Platforms including AWS, Azure, and Google Cloud Platform implementing scalable AI workloads using Bedrock, Azure Synapse, AWS Lambda, BigQuery, and Google AI Platform services.
Experienced in developing enterprise Data Visualization and analytical reporting solutions using Tableau, Matplotlib, Seaborn, Grafana, Power BI, and Data Studio supporting operational insights and executive decision-making initiatives.
Worked extensively within Agile Scrum environments collaborating with product owners, architects, analysts, and cross-functional stakeholders to deliver scalable AI, Machine Learning, and enterprise analytics solutions successfully.

Technical Skills:
Programming Languages: Python, SQL, R, Java, PL/SQL, PHP
Generative AI & LLMs: AWS Bedrock, Azure OpenAI, Claude, LangChain, LangGraph, LlamaIndex, Hugging Face Transformers, BERT, T5, RAG Pipelines, Prompt Engineering, ReAct Reasoning
Machine Learning & NLP: Scikit-learn, XGBoost, TensorFlow, Keras, PyTorch, spaCy, NLTK, Predictive Modeling, Classification, Regression, Clustering, Forecasting, Anomaly Detection
Data Engineering & Processing: PySpark, Pandas, NumPy, ETL Workflows, Data Transformation, Feature Engineering, Statistical Analysis, Data Validation, Distributed Processing
Cloud Platforms: AWS, Azure, Google Cloud Platform (GCP), Amazon S3, AWS Lambda, Redshift, Azure Synapse Analytics, Azure Blob Storage, BigQuery, Google Cloud Storage
Vector Databases & Search: Pinecone, FAISS, Embeddings, Semantic Retrieval, Vector Search, Contextual Search, Hybrid Retrieval
Frameworks & APIs: FastAPI, REST APIs, Microservices Architecture, SOAP, WSDL, Distributed Systems
DevOps / MLOps: Docker, Kubernetes, Amazon EKS, Azure AKS, Terraform, CloudFormation, Azure DevOps, GitHub Actions, Jenkins, MLflow, CI/CD Pipelines
Databases & Warehousing: MySQL, Oracle 10g, DB2, DynamoDB, Hive, Delta Lake, Relational Databases, OLTP, OLAP
Visualization & Monitoring: Tableau, Matplotlib, Seaborn, Data Studio, Grafana, Prometheus, CloudWatch, Swagger API, MS Excel

Experience:

HCA Healthcare
Senior AI / Machine Learning Engineer
Richmond, VA | Feb 2024 - Present
Developed healthcare AI Assistant solutions using AWS Bedrock, LangChain, FastAPI, and Pinecone, helping hospital staff retrieve referral notes and clinical documents while reducing manual search efforts significantly.
Designed scalable Generative AI workflows using REST APIs, LangGraph, and vector retrieval pipelines, enabling conversational healthcare support services handling operational requests and internal documentation queries efficiently daily.
Worked within Agile Scrum teams alongside Product Owners and healthcare stakeholders, delivering conversational AI enhancements across biweekly sprints aligned with clinical workflows and enterprise operational modernization initiatives successfully.
Built distributed ingestion pipelines using Python and AWS Lambda, processing healthcare documents, provider communications, discharge summaries, and operational datasets supporting downstream conversational AI and analytics workflows reliably.
Developed preprocessing pipelines using PySpark and AWS Glue, transforming structured and unstructured healthcare datasets while reducing enterprise AI data preparation efforts by 40% across production environments efficiently.
Managed healthcare storage architectures using Amazon S3 and Redshift, supporting governed analytics, secure healthcare data accessibility, and scalable querying across operational reporting and AI-driven support environments organization-wide.
Implemented semantic retrieval workflows using Pinecone and FAISS, improving contextual healthcare search relevance through embedding-based retrieval and intelligent conversational response generation across enterprise healthcare applications successfully.
Utilized AWS Bedrock foundation models including Claude, improving contextual response quality while enabling secure healthcare document understanding and conversational assistance across operational support workflows effectively.
Built production-ready RAG Pipelines using LlamaIndex and embedding models, improving chatbot relevance through semantic retrieval, contextual grounding, and retrieval-aware prompting strategies across healthcare support applications successfully.
Enhanced enterprise Prompt Engineering workflows using ReAct Reasoning and contextual prompting techniques, reducing hallucinations while improving conversational consistency and healthcare response accuracy across production AI environments.
Developed orchestration services using LangGraph and FastAPI, supporting distributed conversational workflows, secure LLM integrations, intelligent automation pipelines, and scalable healthcare assistant capabilities across operational support teams.
Applied modular backend development practices using Python and Microservices Architecture, improving maintainability, reusable integrations, deployment consistency, and scalable AI service development across enterprise healthcare applications effectively.
Conducted offline evaluations and Validation Testing using healthcare query datasets, improving conversational reliability and contextual accuracy before large-scale production deployments across enterprise AI support environments successfully.
Built scalable inference services using Docker and AWS ECR, enabling portable deployments, runtime consistency, secure container management, and standardized conversational AI execution across distributed healthcare environments reliably.
Deployed conversational AI workloads on Amazon EKS and Kubernetes, enabling autoscaling, workload isolation, fault-tolerant orchestration, and highly available healthcare AI platform deployments across production environments successfully.
Automated deployment workflows through CI/CD Pipelines using GitHub Actions and Jenkins, improving release reliability, deployment governance, rollback readiness, and continuous delivery across conversational AI application environments.
Managed cloud infrastructure using Terraform and CloudFormation, supporting scalable provisioning, infrastructure standardization, secure environment management, and governed cloud resource deployment across enterprise AI systems.
Monitored conversational AI services using CloudWatch and Prometheus, tracking latency, operational alerts, infrastructure utilization, and conversational performance metrics across distributed healthcare production environments continuously.
Performed Unit Testing and API Validation using PyTest, ensuring stable integrations, reliable deployments, secure API functionality, and scalable conversational AI reliability across enterprise healthcare application environments.
Prepared technical documentation and Swagger API specifications supporting onboarding, governance standards, operational maintainability, enterprise knowledge transfer, and long-term sustainability across healthcare AI implementations.

Environment: Python, AWS Bedrock, Claude, LangChain, LangGraph, LlamaIndex, Pinecone, FAISS, FastAPI, REST APIs, PySpark, AWS Glue, Amazon S3, Redshift, DynamoDB, Docker, Kubernetes, Amazon EKS, Jenkins, GitHub Actions, Terraform, CloudFormation, MLflow, Prometheus, CloudWatch, PyTorch, Scikit-learn, Microservices Architecture, Distributed Systems, CI/CD Pipelines, Agile Scrum.

Sallie Mae Bank
AI / Machine Learning Engineer
Newark, DE | May 2022 - Jan 2024
Engineered intelligent loan servicing solutions using Azure OpenAI and LangChain, streamlining customer support workflows, repayment assistance operations, and financial document interactions across enterprise banking environments.
Designed event-driven AI architectures using Conversational AI and Microservices, enabling secure financial query handling, workflow orchestration, and scalable customer engagement capabilities across banking support platforms.
Worked within Agile Scrum delivery models, collaborating with product managers, fraud analysts, and compliance teams to deliver AI-powered banking solutions aligned with operational and regulatory requirements.
Built enterprise ingestion frameworks using Azure Data Factory and Python, integrating loan records, transaction histories, payment logs, and customer communication datasets from distributed banking applications.
Developed high-volume transformation pipelines using Azure Synapse and PySpark, processing financial datasets for fraud analytics, customer intelligence, risk scoring, and machine learning feature engineering workflows.
Structured governed financial storage solutions using Azure Blob Storage and Delta Lake, supporting scalable reporting, enterprise analytics, and secure access management across AI-driven banking applications.
Created semantic knowledge retrieval systems using Pinecone and Vector Search, enabling intelligent document discovery, contextual recommendations, and enterprise financial knowledge accessibility across support operations.
Fine-tuned enterprise NLP workflows using Hugging Face and BERT, supporting financial summarization, intent recognition, intelligent routing, and automated classification across customer servicing environments.
Implemented scalable RAG Pipelines using LangChain and embeddings, improving contextual banking response generation through retrieval-aware prompting and optimized financial document relevance techniques.
Enhanced enterprise Prompt Engineering strategies using contextual prompting and response grounding, improving conversational consistency while minimizing hallucinations across financial AI support environments.
Developed predictive fraud analytics using Scikit-learn and XGBoost, improving anomaly detection accuracy through feature selection, statistical validation, model optimization, and classification performance tuning methodologies.
Applied enterprise NLP processing using spaCy and NLTK, performing entity extraction, sentiment analysis, text normalization, and semantic preprocessing across large-scale financial communication datasets.
Performed enterprise Model Validation and performance benchmarking using historical transaction datasets, improving operational reliability, fraud prediction accuracy, and production readiness across AI-driven banking systems.
Built scalable inference services using Docker and Azure Container Registry, enabling portable deployments, workload standardization, runtime consistency, and reliable AI application delivery across banking environments.
Orchestrated distributed AI workloads using Azure AKS and Kubernetes, enabling autoscaling, workload resiliency, secure deployment isolation, and highly available machine learning operations across enterprise platforms.
Automated enterprise CI/CD Pipelines using Azure DevOps and GitHub Actions, improving deployment governance, release reliability, rollback management, and continuous integration across banking AI applications.
Provisioned enterprise AI infrastructure using Terraform and Azure Resource Manager, supporting governed cloud provisioning, secure networking, and scalable banking platform deployment management standards.
Tracked operational AI performance using Grafana and Prometheus, monitoring infrastructure health, inference latency, operational alerts, and production reliability across enterprise financial environments continuously.
Executed Unit Testing and integration validation using PyTest, ensuring deployment stability, secure API functionality, and enterprise compliance across production AI and machine learning systems.
Produced technical knowledge artifacts and Swagger API documentation supporting onboarding, governance standards, operational maintainability, and long-term sustainability across enterprise banking AI implementations.

Environment: Python, Azure OpenAI, LangChain, Hugging Face Transformers, BERT, Pinecone, Vector Search, Azure Synapse Analytics, Azure Data Factory, Azure Blob Storage, Delta Lake, PySpark, Docker, Kubernetes, Azure AKS, Azure DevOps, Terraform, Azure Resource Manager, Scikit-learn, XGBoost, spaCy, NLTK, REST APIs, CI/CD Pipelines, Grafana, Prometheus, PyTest, Swagger API, Microservices Architecture, Distributed Systems, Agile Scrum.

State of California
Data Scientist / Machine Learning Engineer
San Francisco, CA | Feb 2020 - Apr 2022
Developed large-scale exploratory analysis using Python and R, identifying operational trends, data inconsistencies, and forecasting gaps across statewide healthcare and public program reporting datasets.
Developed interactive analytical dashboards using Tableau and Matplotlib, transforming complex statistical findings into executive-level reporting insights supporting policy planning and operational decision-making initiatives.
Built demand forecasting and resource planning models using Scikit-learn and Pandas, improving next-cycle prediction accuracy through historical trend analysis, feature engineering, and validation benchmarking techniques.
Designed predictive analytics solutions using Random Forest and Logistic Regression, supporting operational forecasting, citizen service analysis, program utilization tracking, and data-driven planning initiatives across departments.
Evaluated machine learning model performance using NumPy and SciPy, applying statistical validation, error analysis, and comparative testing to ensure reliable forecasting outcomes across enterprise operational datasets.
Conducted detailed univariate and bivariate analysis using Seaborn and Pandas, identifying feature relationships, variable distributions, and behavioral trends supporting model optimization and analytical decision strategies.
Applied supervised learning techniques using SVM and KNN, solving prediction challenges related to healthcare utilization, operational planning, citizen engagement, and statewide public service analytics initiatives.
Implemented clustering workflows using K-means and DBSCAN, segmenting behavioral patterns and operational records to support anomaly identification, population analysis, and improved statewide reporting capabilities.
Built reusable preprocessing pipelines using Scikit-learn and Feature Engineering, standardizing data cleansing, transformation, scaling, and validation workflows across multiple machine learning model development initiatives.
Performed enterprise statistical analysis using R and SQL, generating trend reports, operational summaries, and evidence-based recommendations supporting statewide program management and strategic planning efforts.
Developed deep learning prototypes using TensorFlow and Keras, exploring nonlinear relationships and predictive modeling improvements across healthcare analytics and operational classification use cases.
Applied advanced text processing workflows using NLTK and Python, supporting text normalization, keyword extraction, and semantic preprocessing across statewide public communication and reporting datasets.
Conducted forecasting validation and Model Testing using historical operational datasets, improving analytical reliability, prediction consistency, and reporting accuracy before enterprise-level deployment and stakeholder adoption.
Developed scalable analytical workflows using Python and SQL, automating reporting logic, validation routines, and statistical calculations supporting operational analytics across multiple statewide business units.
Integrated enterprise reporting datasets using Pandas and Data Visualization, enabling centralized analytics, operational transparency, and improved decision-making support for statewide healthcare administration teams.
Collaborated within Agile Scrum environments, working with analysts, reporting teams, and business stakeholders to deliver predictive analytics solutions aligned with statewide operational and compliance requirements.
Built automated validation routines using Data Validation and statistical quality checks, identifying inconsistencies, missing values, and reporting anomalies across healthcare and operational enterprise datasets.
Supported operational reporting initiatives using Tableau and Statistical Analysis, delivering executive dashboards, trend summaries, and actionable insights across healthcare and statewide public service programs.
Performed production support and Performance Monitoring for forecasting models, ensuring analytical consistency, stable reporting behavior, and operational reliability across enterprise analytical workflows.
Prepared technical documentation and Analytical Reports supporting onboarding, reporting governance, model transparency, and long-term maintainability across statewide analytics and machine learning initiatives.

Environment: Python, R, Tableau, SQL, NumPy, Pandas, Matplotlib, Seaborn, SciPy, Scikit-learn, TensorFlow, Keras, NLTK, Logistic Regression, Random Forest, SVM, KNN, Classification, Regression, Clustering, K-means, DBSCAN, Feature Engineering, Statistical Analysis, Forecasting, Data Validation, Data Visualization, Agile Scrum.

Walmart Global Tech
Data Engineer
Bentonville, AR | Oct 2016 - Dec 2019
Developed scalable retail data solutions using Google Cloud Storage and BigQuery, supporting high-volume data ingestion, enterprise reporting, operational analytics, and business intelligence initiatives across distributed retail environments.
Built end-to-end ETL workflows using GCP, Python, and BigQuery, transforming structured and unstructured retail datasets into analytics-ready formats supporting reporting, forecasting, and operational decision-making processes.
Engineered cloud-native data processing workflows using Google AI Platform, Hadoop, and Hive, enabling scalable analytics, distributed processing, predictive modeling, and enterprise modernization initiatives across reporting systems.
Designed enterprise data lake architectures using BigQuery and Google Cloud Storage, improving data accessibility, reporting performance, inventory analytics, and centralized operational intelligence across retail business applications.
Developed interactive reporting dashboards using Data Studio and BigQuery, enabling self-service analytics, operational visibility, and business insights for reporting analysts and cross-functional retail stakeholders.
Integrated enterprise systems using SOAP and WSDL services, supporting reliable communication between inventory applications, operational platforms, and partner-facing retail systems across distributed business environments.
Automated enterprise data ingestion and SQL transformation workflows using Python, PL/SQL, Hive, Oracle 10g, and DB2, improving reporting consistency, query performance, and operational data reliability significantly.
Collaborated within Agile Scrum environments supporting enterprise reporting modernization, metadata management, OLTP/OLAP processing, technical documentation, and scalable analytics delivery across retail operational systems.

Environment: Python, Google Cloud Platform, Google Cloud Storage, BigQuery, Google AI Platform, Data Studio, SOAP, WSDL, Hadoop, Hive, Oracle 10g, DB2, OLTP, OLAP, Metadata Management, MS Excel, MS Visio, Rational Rose, PL/SQL, PHP, SQL, Agile Scrum.

Citibank
Python Developer
Hyderabad, India | Aug 2015 - Sep 2016
Developed financial data processing and reporting solutions using Python and SQL, supporting transaction analysis, operational reporting, compliance tracking, and daily banking activities across enterprise finance teams.
Built scalable ETL workflows using Python, Pandas, MySQL, and Oracle, transforming transactional banking datasets into structured formats supporting downstream reporting and operational analytics requirements efficiently.
Designed optimized SQL queries and multi-table joins, consolidating customer records, transaction histories, and account-level information from relational databases for enterprise financial reporting and reconciliation activities.
Implemented robust data validation and preprocessing routines using Pandas and Python, resolving missing values, duplicate entries, and formatting inconsistencies across high-volume banking and reporting datasets.
Automated recurring report generation and batch-processing tasks using Python scripting, reducing manual reporting effort while improving turnaround time and operational efficiency across finance and compliance teams.
Maintained enterprise relational databases including MySQL and Oracle, supporting structured data storage, efficient retrieval, reporting accessibility, and reliable processing across financial operations environments.
Collaborated within Agile Scrum teams alongside analysts and senior developers, developing reusable scripts and modular workflows aligned with enterprise banking requirements and operational reporting standards.
Supported production testing and monitoring activities by troubleshooting data workflows, validating processed outputs, and ensuring stable execution of scheduled reporting pipelines across compliance-driven banking environments.

Environment: Python, SQL, Pandas, MySQL, Oracle, ETL Workflows, Data Processing, Data Transformation, Data Validation, Report Automation, Relational Databases, SQL Queries, Financial Reporting, Compliance Reporting, Batch Processing, Agile Scrum, Git, Linux, Operational Reporting.

Education:
Bachelor of Technology in Computer Science
Sreyas Institute of Engineering and Technology
Hyderabad, India |