
Ashok - DevOps Engineer
[email protected]
Location: Irving, Texas, USA
Relocation: Yes
Visa: H1B
Ashok Raju
Site Reliability Engineer
Phone: 346-375-0776
Email: [email protected]

Professional Experience:

Senior SRE with 12 years of experience as an admin/Build/Release/DevOps/DevSecOps specialist, responsible for cloud automation through open-source DevOps tools such as Ansible, Chef, Jenkins, and Docker.
Worked with AWS services such as EC2, S3, ELB, Auto Scaling groups, Glacier, EKS, ECS, storage lifecycle rules, Elastic Beanstalk, CloudFormation, RDS, VPC, Route 53, CloudWatch, IAM users and roles, AWS Systems Manager (SSM), and the SNS subscription service, provisioned using Terraform.
Experience working with MLOps pipelines; supported automating the deployment, monitoring, and maintenance of machine learning models in production environments using MLflow.
Involved in Microsoft Azure cloud services (PaaS, IaC, and CaC), Application Insights, Document DB, Internet of Things (IoT), Azure Monitoring, Key Vault, Visual Studio Online (VSO), and SQL Azure.
Good knowledge of writing ARM templates and creating Azure AKS clusters.
Worked on Ansible, Puppet, and Salt configuration and automation tools.
Involved in developing and maintaining build and deployment scripts for test, staging, and production environments using Ant and Maven.
Created build scripts and automated solutions using scripting languages such as Bash and Python, along with YAML.
Involved in automating deployments using Bash and Python scripting, with a focus on DevOps and CI/CD tools such as Jenkins, Bamboo, CruiseControl, and GoCD.
Maintained highly available clustered and standalone server environments and refined automation components with scripting and configuration management.
Established SLOs with aligned alerting thresholds to reduce alert fatigue by 70%, ensuring on-call engineers were only paged for issues that truly impacted customer experience.
Automated dashboards and runbooks based on defined SLIs, empowering development teams to self-serve and troubleshoot their services, which reduced the operational load on the SRE team.
Worked on DevOps/Agile operations processes and tools (code review, unit-test automation, build and release automation, environment, service, incident, and change management). Configured AWS IAM and security groups in public and private subnets in the VPC.
Knowledge of querying MySQL and SQL Server using SQL for data integrity. Good knowledge of the industry-standard Software Development Life Cycle (SDLC) and Software Testing Life Cycle (STLC). Worked on NoSQL databases such as Cassandra.
Designed and developed RESTful web services using Java, ensuring adherence to REST principles and best practices.
Troubleshot and solved problems on Linux/UNIX/Windows servers, including debugging OS failures.
Developed processes for continuous monitoring (Dynatrace, New Relic) and performance testing of models, identifying opportunities for improvement and innovation within the MLOps ecosystem.
Implemented and configured New Relic APM and Dynatrace to monitor application performance, identify bottlenecks, and optimize system performance. Migrated from legacy APM tools (Datadog APM, Jaeger, etc.) to OpenTelemetry, standardizing telemetry collection.
Performed day-to-day support for different projects, including check-ins, checkouts, imports, exports, branching, tagging, and conflict resolution, using SCM tools such as Git and Subversion.
Created Ansible playbooks to provision Apache Web servers, Tomcat servers, Nginx, Apache Spark and other applications.
Good knowledge of cloud-native applications and Kubernetes administration to design, develop, implement, test, and maintain business and application software.
Responsible for ensuring systems and network security, maintaining performance, and setting up monitoring using CloudWatch, Datadog, Prometheus, New Relic, and AppDynamics.
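The SLO and alert-fatigue work above typically rests on error-budget and burn-rate math; the following is a minimal sketch (function names, the 14.4 threshold, and all numbers are illustrative, not from any specific production setup):

```python
def error_budget_remaining(slo_target, good_events, total_events):
    """Fraction of the error budget still unspent for an SLO period."""
    allowed_failures = (1 - slo_target) * total_events
    actual_failures = total_events - good_events
    if allowed_failures == 0:
        return 0.0
    return 1 - (actual_failures / allowed_failures)

def burn_rate(slo_target, window_error_ratio):
    """How many times faster than 'budget-neutral' errors are occurring."""
    return window_error_ratio / (1 - slo_target)

# A common multiwindow paging rule: page only when both a long and a short
# window burn fast, which filters out brief blips and reduces alert fatigue.
def should_page(slo_target, long_ratio, short_ratio, threshold=14.4):
    return (burn_rate(slo_target, long_ratio) >= threshold
            and burn_rate(slo_target, short_ratio) >= threshold)
```

Requiring both windows to exceed the threshold is what keeps on-call engineers from being paged for transient error spikes that never threaten the budget.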
______________________________________________________________________________________________________
Technical skills:

DevOps: Ansible, Terraform, Jenkins, Kubernetes, Vagrant, Docker, Nagios, ELK, Git, SonarQube
Cloud: Amazon Web Services (AWS), Azure, GCP
NoSQL Databases: MongoDB, DynamoDB, Cassandra
Monitoring Tools: CloudWatch, Nagios, Splunk, AppDynamics, New Relic, Datadog, Grafana, Prometheus, ELK, Dynatrace
Automation Tools: Chef, Puppet, Ansible, Kubernetes (EKS), Helm
Build Tools: Jenkins, Ant, Maven, Autotools
Web and App Servers: Apache Tomcat, NGINX
Version Control: SVN, Git, GitHub, CVS
Scripting Languages: Python, Go, Shell, Ruby, JavaScript, TypeScript, PowerShell
Relational Databases: MS Access, MySQL, Oracle, SQL Server, DB2, PostgreSQL
Virtualization Tools: VMware, Oracle VirtualBox
CI/CD Tools: Jenkins, Hudson
Operating Systems: Linux, UNIX, Windows Server 2008/2012, Windows 7/10

_______________________________________________________________________________________________________

Work Experience:
Site Reliability Engineer
JPMC, Dallas, TX
June 2025 to Present

Responsibilities:
Implemented AWS solutions using EC2, S3, RDS, ELB/ALB/NLB, Auto Scaling groups, IAM, AWS SSM, CloudFront, API Gateway, and AWS Glue; configured Elastic Load Balancers with EC2 Auto Scaling groups, provisioned the infrastructure using Terraform and CloudFormation, and built resilient infrastructure that withstands failovers and application downtime.
Automated the deployment and management of Java applications across Linux servers using shell and Python scripting.
Created cloud-native Kubernetes clusters on on-premises bare-metal servers using Kubespray; installed prerequisites on the hosts and deployed dependencies and software on various K8s clusters.
Created Grafana dashboards and monitored Kubernetes clusters using Prometheus, showing cluster CPU/memory/filesystem usage as well as individual pod, container, event, and other metrics.
Developed multi-cloud deployment pipelines using Groovy DSL with automated environment provisioning across AWS and Azure.
Monitored application logs in Splunk and Dynatrace and created dashboards for server usage; implemented rapid provisioning and lifecycle management for Ubuntu Linux using Amazon EC2, Ansible, and custom Python/Bash scripts.
Worked on Kusto queries for processing data and aggregating reports for higher management.
Designed and developed Groovy DSL pipelines with integrated SAST/DAST tools (SonarQube, Checkmarx) in pre-commit hooks and build stages, catching vulnerabilities before code review.
Integrated Jenkins (CI) with Spinnaker (CD) for continuous integration and continuous deployment on K8s clusters via GitHub and SonarQube.
Implemented telemetry data enrichment at the Collector level using the OpenTelemetry Transformation Language to add context such as tenant ID, feature flags, and user tiers to all metrics.
Orchestrated various jobs using shell, Ansible, and Python scripts, and eliminated manual intervention by scheduling jobs.
Developed SLO dashboards for key services using Datadog, providing engineering and product leadership with a clear, shared understanding of service health and customer impact.
Deployed a multi-tenant, cloud-native platform on Kubernetes to host JupyterHub for data science and Airflow for workflow orchestration, enabling self-service provisioning for development teams.
Experience migrating applications from legacy environments to hybrid OV data centers, as well as migrating platforms to the AWS cloud.
Experience managing and tracking defect status using JIRA, plus experience with ServiceNow for incident and change management tickets. Good knowledge of the ATDD framework.
Involved in writing Lambda functions for orchestrating AWS Glue using PySpark.
Utilized J2EE frameworks and microservices architecture to build scalable web applications.
Hands-on experience in Kubernetes administration to design, develop, implement, and maintain business and application software, and good knowledge of developing Helm charts.
Supported Rancher Kubernetes Engine (RKE); installed and onboarded existing Kubernetes clusters into Rancher. Worked with the k9s terminal UI to interact with Kubernetes clusters.
Developed processes for continuous monitoring and performance testing of models, identifying opportunities for improvement and innovation within the MLOps ecosystem.
Established model governance and model data management (data collection, data quality, data privacy), along with monitoring (Prometheus), documentation, reporting, risk management, and continuous improvement.
Deployed, configured, and maintained ELK stack components (Elasticsearch, Logstash, Kibana, and Beats) to ensure seamless data flow and efficient log management.
Set up real-time monitoring dashboards using New Relic to track key performance indicators (KPIs) and ensure system reliability.
Built a unified observability platform integrating OpenTelemetry for application metrics with data observability tools, providing correlated views of business transactions and their underlying data dependencies.
Proactively identified and resolved performance issues using New Relic's alerting and notification features.
Troubleshot and resolved issues related to ELK stack components, ensuring minimal downtime and optimal performance.
Worked on API security protocols such as OAuth2, JWT, and mTLS, implementing robust security measures for API interactions using Okta and Auth0.
Implemented MLflow Tracking to log parameters, metrics, and artifacts during ML experiments, ensuring reproducibility and traceability.
Utilized MLflow's API and UI for tracking and comparing hyperparameters, model performance, and code versions across experiments.
Set up monitoring and logging for RESTful web services to track performance, errors, and usage patterns (Grafana and Dynatrace).
Worked on migrating k8s clusters from bare metal to managed AWS EKS environment.
Good knowledge of ROBIN Cloud Native Storage and managing namespaces.
Supported applications running on Amazon-managed EKS clusters, where applications fetch data from S3 for the services running on them.
Involved in AWS Step Functions for orchestrating Lambda and multiple AWS components such as Glue, SNS, SQS, and S3.
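The Collector-level telemetry enrichment described in this role can be illustrated with a plain-Python sketch (the real OpenTelemetry Collector does this declaratively in OTTL; the record shape and attribute names below are hypothetical):

```python
def enrich(telemetry, context):
    """Attach shared context (tenant ID, user tier, feature flags) to every
    metric/span record before it is exported to the backend."""
    enriched = []
    for record in telemetry:
        # copy the record and its attribute map so callers' data stays untouched
        record = {**record, "attributes": {**record.get("attributes", {})}}
        record["attributes"].update({
            "tenant.id": context["tenant_id"],
            "user.tier": context["user_tier"],
            "feature.flags": ",".join(sorted(context["feature_flags"])),
        })
        enriched.append(record)
    return enriched
```

Enriching once at the Collector, rather than in each service, keeps the context attributes consistent across every metric and trace that flows through the pipeline.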

Site Reliability Engineer
PWC, Tampa
May 2020 to June 2025

Responsibilities:
Implemented AWS solutions using EC2, S3, RDS, EKS, IAM, and ECS, and configured Elastic Load Balancers with EC2 Auto Scaling groups.
Involved in designing Azure Resource Manager Template and in designing custom build steps using PowerShell.
Configured and deployed Microsoft Azure for a multitude of applications utilizing the Azure stack (including Compute, Web & Mobile, Blobs, Resource Groups, Azure SQL, AKS, Cloud Services, and ARM), focusing on high availability, fault tolerance, and auto-scaling.
Implemented Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for critical user journeys, shifting the team's focus from pure infrastructure and application monitoring to user-centric reliability.
Installed Jenkins and plugins for the Git repository, set up SCM polling for immediate builds with Maven and the Maven repository, and deployed apps in AWS using Terraform. Implemented continuous integration through webhooks and workflows around Jenkins to automate the dev-test-deploy workflow.
Built Elasticsearch and Logstash for centralized logging, then stored logs and metrics into an S3 bucket using a Lambda function.
Implemented observability strategy using Datadog and Dynatrace, integrating metrics, logs, traces, and real-user monitoring (RUM) to provide a unified view of application health and performance.
Used AWS ECS to leverage container technology and CodeDeploy for deploying application code within EC2 instances.
Converted support scripts to Chef recipes and provisioned AWS servers using Chef recipes. Developed Chef recipes to configure, deploy, and maintain software components of the existing infrastructure.
Maintained multi-zone data backups on Amazon EC2/S3, locally maintaining data archives using a daily/weekly/monthly logrotate scheme.
Created custom dashboards and reports in New Relic to visualize application performance metrics and trends.
Implemented Continuous Integration and Continuous Delivery (CI & CD) Process stack using AWS, GITHUB/GIT, Jenkins groovy DSL, Artifactory, Chef and Kubernetes(AKS/EKS).
Worked on creating Terraform modules to build AWS components such as VPCs, security groups, API Gateway, EC2, S3, RDS, and EKS.
Installed Jenkins and plugins for the Git repository, set up SCM polling for immediate builds with Maven and a Maven repository (Nexus Artifactory), and used webhooks to ensure Jenkins listens to a particular branch.
Developed automation scripting in Python (core) to deploy and manage Java applications across Linux servers.
Migrated an Oracle database from on-premises to AWS RDS using the DMS service.
Implemented data pipeline observability using OpenTelemetry to instrument Airflow/Prefect workflows, capturing data quality metrics, processing latency, and business-level metadata as span attributes.
Correlated APM metrics with infrastructure and log data for root cause analysis, moving from symptom detection to identifying the underlying cause of issues.
Implemented OpenTelemetry for distributed tracing across microservices and integrated with data observability platform to monitor data quality in real-time.
Performed monitoring and performance testing of models, identifying opportunities for improvement and innovation within the MLOps ecosystem.
Developed automated root cause analysis by correlating service degradation (via OpenTelemetry) with concurrent data pipeline failures or data quality anomalies.
Deployed New Relic Infrastructure to monitor server and cloud infrastructure health and performance.
Involved in installation/administration of TCP/IP, NIS/NIS+, NFS, DNS, NTP, automounts, Sendmail, and print servers per client requirements on Red Hat Linux/Debian servers.
Created a CloudFormation template from which the whole infrastructure is launched; launched instances within a VPC and subnet for each user, with each instance attached to an Elastic IP.
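The centralized-logging pattern in this role (Logstash output landing in S3 via a Lambda function) can be sketched as a minimal Lambda-style handler; the event shape and key layout are hypothetical, and the storage client is injected so the logic runs without AWS:

```python
import json
from datetime import datetime, timezone

def ship_logs(event, put_object):
    """Batch incoming log records into one newline-delimited JSON object
    and hand it to a storage backend (e.g. an S3 client's put call)."""
    records = event.get("records", [])
    if not records:
        return {"stored": 0}
    # one NDJSON body per batch keeps downstream queries (Athena etc.) simple
    body = "\n".join(json.dumps(r, sort_keys=True) for r in records)
    key = datetime.now(timezone.utc).strftime("logs/%Y/%m/%d/batch-%H%M%S.json")
    put_object(key, body.encode("utf-8"))
    return {"stored": len(records), "key": key}
```

Injecting `put_object` instead of creating a boto3 client inside the handler is what makes the batching logic unit-testable in isolation.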
_______________________________________________________________________________________________________


Senior Cloud and DevSecOps Engineer
Comcast Cable Inc, PA
Feb 2017 to May 2020

Responsibilities:
Built Elasticsearch and Logstash for centralized logging, then stored logs and metrics into an S3 bucket using a Lambda function.
Used the AWS GUI on a Linux instance for running applications on Linux.
Used Amazon RDS to manage databases, create snapshots, and automate backups. Integrated and configured Elastic Load Balancers with Auto Scaling groups.
Created snapshots and Amazon Machine Images (AMIs) of instances for backup; created security groups, configured inbound/outbound rules, and created and imported key pairs.
Wrote Puppet modules to manage configurations and automate the installation process. Deployed Puppet and PuppetDB for configuration management of the existing infrastructure.
Worked on Docker Hub, creating Docker images and handling multiple images, primarily for middleware installations and domain configurations.
Used puppet to configure and automate server instances in AWS.
Supported ASP.NET MVC, C#, and JavaScript, and unit testing of software components using NUnit 2.2.9.
Used GZIP with AWS CloudFront to forward compressed files to destination nodes/instances.
Developed and configured CI/CD pipelines with Jenkins, Concourse, Git, Docker, and Maven build plugins. Built and deployed artifacts to application servers such as Tomcat and httpd. Developed parameterized Jenkins jobs with various plugins for backup jobs and system upgrades.
Used GIT for version control and Jenkins for integration and deployment of applications.
Experience in writing shell scripts to automate the administrative tasks and management using cron jobs.
Created jobs in Jenkins, set up global permissions, and scheduled jobs using Poll SCM.
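The snapshot/AMI backup work in this role commonly follows a daily/weekly/monthly rotation; a minimal sketch of the retention decision (the retention counts and the choice of Mondays/month-starts as anchors are illustrative):

```python
from datetime import date, timedelta

def snapshots_to_keep(snapshot_dates, daily=7, weekly=4, monthly=12):
    """Pick which backup dates to retain under a daily/weekly/monthly
    rotation: the N most recent days, recent Mondays, and month-starts."""
    dates = sorted(set(snapshot_dates), reverse=True)
    keep = set(dates[:daily])                        # most recent dailies
    mondays = [d for d in dates if d.weekday() == 0]
    keep.update(mondays[:weekly])                    # weekly anchor points
    firsts = [d for d in dates if d.day == 1]
    keep.update(firsts[:monthly])                    # monthly anchor points
    return keep
```

A cron-driven cleanup job would then delete every snapshot whose date is not in the returned set, bounding storage cost while preserving long-range restore points.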
_______________________________________________________________________________________________________


Build DevOps Engineer
PayPal, San Jose, CA
July 2016 to Feb 2017

Transformed the way its employees work, with flexible, secure access to applications from anywhere, on any device. This system also allows product inventory to be checked more regularly, improving services and ensuring products can be placed more accurately and efficiently.

Responsibilities:
Created a fully automated build and deployment platform, coordinating code builds and promotions and orchestrating deployments using Hudson and Subversion.
Deployed applications to application servers in an Agile continuous-integration environment and automated the whole process. Wrote build scripts for Ant and Maven, and used build tools such as Jenkins and SonarQube to move builds from one environment to another.
Involved in editing existing Ant files in case of errors or changes in project requirements.
Managed Maven project dependencies by creating parent-child relationships between Projects.
Performed typical administrative activities such as Backup, Restore, Site Creation, and User Issue Resolution.
Used the Jython and Jacl scripting languages to modify dynamic operations. Wrote Python and shell scripts for daily ad-hoc requests.
Provided end-user support, performed baseline build, merges, software release management, and other SCM activities.
Created a script to generate tar files for the change-set related to a particular JIRA ticket, which was then uploaded automatically to the FTP server.
_______________________________________________________________________________________________________

Senior DevOps/Cloud Consultant
Macy's, Bangalore, India
June 2013 to May 2015


Responsibilities:
Built virtualized Linux servers on ESXi/vSphere hosts to serve multiple applications from the same chassis across different server hosts. Installed and administered Red Hat and CentOS using RPM and YUM package installations, patching, and other server management.
Improved Linux OS deployment and management by creating customized Kickstart scripts, and developed Korn, Bash, and Perl scripts to automate cron jobs.
Maintained records of daily data communication transactions, problems and remedial actions taken. Trained users in the proper use of hardware or software.
Managed logical volumes by implementing LVM storage: extended logical volumes and filesystems, extended and reduced volume groups, and created snapshots of logical volumes for data backups. Configured Linux networking, performed basic troubleshooting, diagnosed and corrected network problems, and carried out proactive maintenance by applying patches to systems and applications in a timely manner.
Automated code builds, server deployments and fully automated testing for WebSphere Portal.
Collaborated with the network admin in installing, configuring, securing, and implementing slave replication on DNS/BIND servers. Knowledge of core system and security applications and protocols such as BGP, TCP/IP, DNS, and SSH.
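The proactive filesystem maintenance in this role usually hinges on a usage-threshold check run from cron; a minimal sketch (the 85% threshold and mount names are illustrative):

```python
def filesystems_over_threshold(usage, threshold=0.85):
    """Given {mount: (used_bytes, total_bytes)}, return the mounts whose
    utilization exceeds the alert threshold, worst first."""
    over = {mount: used / total
            for mount, (used, total) in usage.items()
            if total and used / total > threshold}
    # sort so the most urgent filesystem is handled (or paged on) first
    return sorted(over, key=over.get, reverse=True)
```

In practice the `usage` map would be filled from `df` output or `os.statvfs`, and any returned mounts would trigger cleanup or an alert to the admin.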
_______________________________________________________________________________________________________
Education: Bachelor of Technology in Computer Science from VIT, JNTUK, India, 2013.