Job Opportunity - Site Reliability Engineer - Washington state - Only Locals at Bothell, Washington, USA
Email: [email protected]
http://bit.ly/4ey8w48
https://jobs.nvoids.com/job_details.jsp?id=2257899&uid=

From: iswarya, Smart Tech Link [email protected]
Reply to: [email protected]

Hi,

We have a priority requirement with one of our clients. Kindly review and let me know if you have any questions.

USA-Bothell-T-Mobile
Role: Site Reliability Engineer
Location: Bothell WA (Onsite) - Only Locals

Looking for an SRE with 6-10 years of experience and the skills below.

Responsibilities:
Collaborate with development and operations teams to design, implement, and maintain reliable and scalable systems.
Monitor the health of systems, applications, and infrastructure using modern monitoring tools and proactively identify areas for improvement.
Develop and implement automated solutions for operational tasks such as deployment, scaling, and monitoring.
Troubleshoot and resolve complex production issues in a timely manner.
Define and enforce Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs) to measure and ensure system reliability.
Continuously improve system reliability by automating repetitive tasks and reducing manual intervention.
Perform root cause analysis for incidents and implement corrective actions to prevent recurrence.
Participate in on-call rotations and provide support during high-severity incidents.
Work closely with cross-functional teams to ensure successful software delivery, infrastructure capacity planning, and system performance optimization.
Document procedures, runbooks, and best practices for system administration and incident management.

Skill Needs
Kubernetes, Python, Envoy, REST/gRPC/HTTP, OTEL, Networking, Observability, RAG, LLM
Proven experience as a Site Reliability Engineer, DevOps Engineer, or similar role.
Proficient in scripting languages such as Python, Go, Bash, or similar.
Strong understanding of system performance, scalability, and reliability principles.
Familiarity with monitoring and observability tools like Prometheus, Grafana, Datadog, or similar.
Experience with CI/CD pipelines, automation, and infrastructure as code (Terraform, Ansible, etc.).
Strong problem-solving skills with the ability to troubleshoot complex systems issues.
Excellent communication and collaboration skills, with the ability to work in a fast-paced environment.
Experience with distributed systems and microservices architectures.
Familiarity with networking, security, and data storage technologies.
Experience with incident management and post-mortem analysis.
Knowledge of Agile/Scrum methodologies and DevOps best practices.

Thanks & Regards
Iswarya | Technical Recruiter
[email protected] | www.smarttechlink.com

Keywords: continuous integration continuous deployment golang Washington
11:12 PM 14-Mar-25