Site Reliability Engineer - Hybrid at Remote, Remote, USA |
Email: [email protected] |
From: Nihitha, GAC Solutions [email protected]
Reply to: [email protected]

Hi, hope you're doing well!

This is Nihitha from GAC Solutions, reaching out regarding a fantastic job opportunity with our preferred client. Please go through the job description below and let us know your interest.

Title: Site Reliability Engineer
Location: Seattle, WA

JOB DESCRIPTION:

We are looking for a skilled Platform Engineer / SRE to design, implement, and maintain our cloud infrastructure and platforms. The ideal candidate will have a strong background in Kubernetes administration, Azure cloud services, infrastructure as code, and automation. You will play a crucial role in ensuring the scalability, reliability, and security of our systems while supporting our AI/ML initiatives.

* Design, deploy, and manage infrastructure solutions using Terraform, ensuring scalability, security, and reliability.
* Develop and maintain infrastructure as code scripts to automate the provisioning and configuration of resources.
* Ensure version-controlled, repeatable deployments using IaC best practices.
* Implement and manage Kubernetes clusters for containerized applications.
* Collaborate with development teams to deploy, scale, and optimize applications in Kubernetes environments.
* Leverage scripting languages (e.g., Python) to automate routine tasks and streamline workflows.
* Implement continuous integration and continuous deployment (CI/CD) pipelines for efficient software delivery.
* Ensure seamless integration of infrastructure components with CI/CD pipelines.
* Design, deploy, and maintain scalable and reliable infrastructure for AI/ML platforms.
* Implement containerization (Docker) and orchestration (Kubernetes) solutions for deploying and managing AI/ML applications.
* Ensure containerized applications are secure, scalable, and easily deployable.
* Enable seamless integration of AI/ML models into the platform, ensuring data pipelines are efficient and reliable.
* Establish monitoring and alerting systems to ensure the health and performance of AI/ML platforms.
* Implement security best practices for AI/ML platforms, ensuring data privacy and compliance with industry standards.

* Bachelor's degree in Computer Science, Engineering, or a related field
* Proven experience in Kubernetes administration, specifically with Azure Kubernetes Service (AKS)
* Strong proficiency in Azure cloud services and Azure ARM templates
* Expert-level scripting skills in PowerShell and Python
* Hands-on experience with Terraform for infrastructure as code
* Solid understanding of CI/CD principles and experience with Azure DevOps
* Experience with containerization technologies, particularly Docker
* Strong problem-solving skills and ability to work in a fast-paced environment
* Excellent communication and collaboration skills

Keywords: continuous integration, continuous deployment, artificial intelligence, machine learning, golang, Washington |
04:23 AM 06-Feb-25 |