Site Reliability Engineer Seattle, WA at Seattle, Washington, USA |
Email: [email protected] |
Job Description: Platform / Site Reliability Engineer

Location: Seattle, WA (hybrid role; 3 days per week in the Seattle office required)

Description: We are looking for a skilled Platform Engineer / SRE to design, implement, and maintain our cloud infrastructure and platforms. The ideal candidate will have a strong background in Kubernetes administration, Azure cloud services, infrastructure as code, and automation. You will play a crucial role in ensuring the scalability, reliability, and security of our systems while supporting our AI/ML initiatives.

Responsibilities:
* Design, deploy, and manage infrastructure solutions using Terraform, ensuring scalability, security, and reliability.
* Develop and maintain infrastructure as code (IaC) scripts to automate the provisioning and configuration of resources.
* Ensure version-controlled, repeatable deployments using IaC best practices.
* Implement and manage Kubernetes clusters for containerized applications.
* Collaborate with development teams to deploy, scale, and optimize applications in Kubernetes environments.
* Use scripting languages (e.g., Python) to automate routine tasks and streamline workflows.
* Implement continuous integration and continuous deployment (CI/CD) pipelines for efficient software delivery.
* Ensure seamless integration of infrastructure components with CI/CD pipelines.
* Design, deploy, and maintain scalable and reliable infrastructure for AI/ML platforms.
* Implement containerization (Docker) and orchestration (Kubernetes) solutions for deploying and managing AI/ML applications.
* Ensure containerized applications are secure, scalable, and easily deployable.
* Enable seamless integration of AI/ML models into the platform, ensuring data pipelines are efficient and reliable.
* Establish monitoring and alerting systems to ensure the health and performance of AI/ML platforms.
* Implement security best practices for AI/ML platforms, ensuring data privacy and compliance with industry standards.

Required skills:
* Bachelor's degree in Computer Science, Engineering, or a related field
* Proven experience in Kubernetes administration, specifically with Azure Kubernetes Service (AKS)
* Strong proficiency in Azure cloud services and Azure Resource Manager (ARM) templates
* Expert-level scripting skills in PowerShell and Python
* Hands-on experience with Terraform for infrastructure as code
* Solid understanding of CI/CD principles and experience with Azure DevOps
* Experience with containerization technologies, particularly Docker
* Strong problem-solving skills and ability to work in a fast-paced environment
* Excellent communication and collaboration skills
02:05 AM 06-Feb-25 |