Site Reliability Engineer (SRE): Remote, USA
Email: shivi94gir@gmail.com
https://short-link.me/15H5b
https://jobs.nvoids.com/job_details.jsp?id=2220657&uid=

Detailed Skills Needed:

Primary Goals:

Ensure system reliability: 99.999% uptime and minimal downtime.

Improve system scalability: Support business growth and increased traffic.

Enhance system performance: Optimize resource utilization and latency.

Develop and implement monitoring: Proactive issue detection and resolution.

Collaborate with DevOps: Align SRE practices with DevOps goals.

Logging/Monitoring: Configure logging/archival solution, and support dashboards and monitors.

Documentation: Document all SRE functions, processes, and flows.

Training: Provide training and support during migration transition.

Responsibilities

System Reliability

Design and implement highly available systems. Develop and execute reliability
testing.

Scalability and Performance

Optimize system resources for scalability. Monitor and analyze performance
metrics.

Monitoring and Alerting

Develop and implement monitoring tools (EventBridge, CloudWatch, Datadog).
Create alerting and notification systems.

Incident Management

Respond to and resolve system incidents. Conduct post-incident reviews and
implement improvements.

Collaboration

Work with DevOps to integrate SRE practices. Collaborate with development teams
on system design.

Knowledge Management

Document SRE processes and lessons learned. Develop a knowledge base.

Requirements

Essential Qualifications

5+ years of experience in SRE, DevOps or systems engineering.

Proven experience with cloud-based systems (AWS, GCP, Azure).

Strong understanding of Linux, networking and system architecture.

Strong understanding of event-driven architectures.

Strong understanding of monitoring tools (EventBridge, CloudWatch, Oracle
Enterprise Monitoring, Datadog).

Desirable Qualifications

Certification in SRE or cloud engineering.

Experience with agile methodologies (Scrum, Kanban).

Knowledge of programming languages (Python, Java, Bash).

--

Keywords: information technology
09:12 PM 03-Mar-25

