Job Opportunity - SRE - Site Reliability Engineer at Atlanta, Georgia, USA

http://bit.ly/4ey8w48
https://jobs.nvoids.com/job_details.jsp?id=2316505&uid=

From:

iswarya,

Smart Tech Link

iswarya.k@smarttechlink.com

Reply to: iswarya.k@smarttechlink.com

Hi,

We have a priority requirement with one of our clients. Please review it and let me know if you have any questions.

SRE - Kubernetes
Location - Atlanta, GA (onsite 3 days)

Mandatory Areas
Must Have Skills
- Skill 1 (7 yrs of exp): Kubernetes, GitLab, Splunk o11y
- Skill 2 (7 yrs of exp): Prometheus, Python or Go scripting
- Skill 3 (5 yrs of exp): Java troubleshooting, Cisco observability

Responsibilities
Infrastructure Management: Design, implement, and manage Kubernetes clusters in production environments to ensure high availability and reliability.
Automation: Build and manage automation tools and scripts for continuous deployment, scaling, and self-healing of applications using Kubernetes and associated tooling (Helm, kubectl, Kustomize).
Monitoring and Metrics: Implement robust monitoring solutions using Prometheus, Grafana, and other observability tools to track the health of Kubernetes clusters, applications, and services.
Incident Management: Work with cross-functional teams to respond to incidents, identify root causes, and implement solutions to prevent recurrence.
CI/CD Pipeline Optimization: Design and maintain continuous integration and deployment pipelines to improve the release cycle and reduce downtime.
Capacity Planning: Forecast resource needs, scale systems efficiently, and optimize cloud infrastructure to meet growing demand.
Disaster Recovery: Define and implement strategies for backup, recovery, and failover to ensure data integrity and uptime.
Collaboration: Partner closely with development teams to help design scalable, resilient, and performant architectures on Kubernetes.
Security: Ensure that the Kubernetes infrastructure follows best practices for security, including network policies, RBAC, and Pod security policies.

Required Skills & Qualifications:
Experience with Kubernetes: Hands-on experience in deploying and managing Kubernetes clusters (preferably in production environments).
Cloud Platforms: Strong experience with cloud platforms like AWS, GCP, or Azure, with a focus on Kubernetes as a service (e.g., EKS, GKE, AKS).
Containerization: Expertise in container technologies like Docker, container orchestration with Kubernetes, and Helm charts.
Automation Tools: Familiarity with Infrastructure-as-Code tools such as Terraform, Ansible, or CloudFormation.
Monitoring & Observability: Knowledge of monitoring tools such as Prometheus, Grafana, ELK stack, or similar.
Networking: Understanding of networking concepts (DNS, Load Balancers, etc.) and how they apply to Kubernetes.
CI/CD Pipelines: Strong knowledge of CI/CD tools like Jenkins, GitLab CI, or CircleCI.
Scripting: Proficiency in scripting languages such as Bash, Python, or Go.
Incident Response & Root Cause Analysis: Experience in managing and resolving production incidents with a focus on improving systems after the event.
Collaboration & Communication: Excellent communication skills to work in cross-functional teams and interact with stakeholders across the company.

Thanks & Regards

Iswarya | Technical Recruiter

iswarya.k@smarttechlink.com | www.smarttechlink.com
Keywords: continuous integration continuous deployment golang Georgia

08:52 PM 04-Apr-25



Location: Atlanta, Georgia