Need Lead Site Reliability Engineer (SRE), ex-T-Mobile candidates only, Remote, USA
Email: [email protected]
Hi,

This is Lucy from American Unit. I hope you are doing great.

Please see the job description below and let me know if you are interested in and comfortable with this position. Please forward your updated resume to [email protected].

JOB DESCRIPTION:

Job Title: Lead Site Reliability Engineer (SRE)

Location: Remote

Note: Submit only ex-T-Mobile candidates.

Job Description:

Overview:

We are seeking a highly skilled and experienced Senior/Lead Site Reliability Engineer (SRE) to join our team.

The ideal candidate will have a strong background in Kubernetes, disaster recovery setup, and CI/CD pipelines, plus exposure to a variety of technologies, including Cassandra, RabbitMQ, GitLab, and Redis/in-memory databases.

This role involves overseeing and mentoring other SREs on the project and ensuring the reliability, scalability, and performance of our applications and infrastructure.

Key Responsibilities:

Leadership and Oversight:
Provide leadership and oversight to the SRE team, guiding best practices in systems reliability,
monitoring, and incident response.

Kubernetes Management:
Utilize Kubernetes expertise (experience with TKE preferred) to manage container orchestration for scalable applications.

Disaster Recovery (DR) Setup:
Design and implement high availability (HA) strategies, load balancing, and disaster recovery
setups to ensure seamless operations and minimize downtime.

CI/CD Implementation:
Set up, manage, and optimize continuous integration and continuous delivery (CI/CD) processes and pipelines.

Technology Exposure:
Leverage knowledge of Cassandra, RabbitMQ, and Redis/in-memory databases to support application performance
and reliability.

Cloud Infrastructure:
Utilize cloud experience, particularly with Kubernetes on AWS, to design and maintain scalable cloud
solutions.

Collaboration:
Work closely with software engineers, product teams, and stakeholders to ensure that operational considerations
are integrated into application design and development.

Incident Management:
Develop and maintain incident response protocols, conduct post-mortem analyses, and implement improvements
based on findings.

Documentation:
Create and maintain thorough documentation for systems, processes, and incident reports to ensure knowledge sharing within the team.

Qualifications:

Bachelor's degree in Computer Science, Information Technology, or a related field (Master's degree preferred).

Experience:

5+ years of experience in systems engineering, site reliability engineering, or DevOps.

Proven experience managing Kubernetes clusters and containerized applications.

Demonstrated expertise in setting up disaster recovery solutions and high availability setups.

Proficiency in CI/CD tools and practices.

Technical Skills:

Strong knowledge of Cassandra, RabbitMQ, GitLab, and Redis/in-memory databases.

Experience with cloud environments, especially Kubernetes on AWS.

Proficiency in scripting and automation tools (e.g., Python, Bash, Terraform).

Soft Skills:

Strong leadership and mentoring capabilities.

Excellent problem-solving and troubleshooting skills.

Ability to work in a fast-paced and dynamic environment.

Thanks & Regards,

Lucy (Priyanka Moyila)

American Unit Inc.

2901 N Dallas Pkwy, Suite 333

Plano, TX  75093

Email: [email protected]

--

12:18 AM 28-Jan-25

