Site Reliability Engineer (SRE): Remote, USA |
Email: shivi94gir@gmail.com |
https://short-link.me/15H5b
https://jobs.nvoids.com/job_details.jsp?id=2220657&uid=

Detailed Skills Needed

Primary Goals:
- Ensure system reliability: 99.999% uptime and minimal downtime.
- Improve system scalability: support business growth and increased traffic.
- Enhance system performance: optimize resource utilization and latency.
- Develop and implement monitoring: proactive issue detection and resolution.
- Collaborate with DevOps: align SRE practices with DevOps goals.
- Logging/Monitoring: configure a logging/archival solution, and support dashboards and monitors.
- Documentation: document all SRE functions, processes, and flows.
- Training: provide training and support during the migration transition.

Responsibilities:

System Reliability
- Design and implement highly available systems.
- Develop and execute reliability testing.

Scalability and Performance
- Optimize system resources for scalability.
- Monitor and analyze performance metrics.

Monitoring and Alerting
- Develop and implement monitoring tools (EventBridge, CloudWatch, Datadog).
- Create alerting and notification systems.

Incident Management
- Respond to and resolve system incidents.
- Conduct post-incident reviews and implement improvements.

Collaboration
- Work with DevOps to integrate SRE practices.
- Collaborate with development teams on system design.

Knowledge Management
- Document SRE processes and lessons learned.
- Develop a knowledge base.

Requirements:

Essential Qualifications
- 5+ years of experience in SRE, DevOps, or systems engineering.
- Proven experience with cloud-based systems (AWS, GCP, Azure).
- Strong understanding of Linux, networking, and system architecture.
- Strong understanding of event-driven architectures.
- Strong understanding of monitoring tools (EventBridge, CloudWatch, Oracle Enterprise Monitoring, Datadog).

Desirable Qualifications
- Certification in SRE or cloud engineering.
- Experience with agile methodologies (Scrum, Kanban).
- Knowledge of programming languages (Python, Java, Bash).
09:12 PM 03-Mar-25 |