Senior ML Platform Engineer || 12+ || Remote, USA

From:

Srivalli,

Fluxteksolutions

srivalli@fluxteksol.com

Reply to: srivalli@fluxteksol.com

Job Title: Senior ML Platform Engineer (Serving Infrastructure)
Location: [San Jose/Remote Option]

Department: ML Platform Engineering
Exp: 12+ years

Visa: Any visa is fine

Role Overview: We're looking for an experienced engineer to build our ML serving infrastructure. You'll create the platforms and systems that enable reliable, scalable model deployment and inference. This role focuses on the runtime infrastructure that powers our production ML capabilities.

Key Responsibilities:

Design and implement scalable model serving platforms for both batch and real-time inference
Build model deployment pipelines with automated testing and validation
Develop monitoring, logging, and alerting systems for ML services
Create infrastructure for A/B testing and model experimentation
Implement model versioning and rollback capabilities
Design efficient scaling and load balancing strategies for ML workloads
Collaborate with data scientists to optimize model serving performance

Technical Requirements:

12+ years of software engineering experience, with 4+ years in ML serving/infrastructure
Strong expertise in container orchestration (Kubernetes) and cloud platforms
Experience with model serving technologies (TensorFlow Serving, Triton, KServe)
Deep knowledge of distributed systems and microservices architecture
Proficiency in Python and experience with high-performance serving
Strong background in monitoring and observability tools
Experience with CI/CD pipelines and GitOps workflows

Nice to Have:
Experience with model serving frameworks:
o TorchServe for PyTorch models
o TensorFlow Serving for TF models
o Triton Inference Server for multi-framework support
o BentoML for unified model serving
Expertise in model runtime optimizations:
o Model quantization (INT8, FP16)
o Model pruning and compression
o Kernel optimizations
o Batching strategies
o Hardware-specific optimizations (CPU/GPU)
Experience with model inference workflows:
o Pre/post-processing pipeline optimization
o Feature transformation at serving time
o Caching strategies for inference
o Multi-model inference orchestration
o Dynamic batching and request routing
Experience with GPU infrastructure management
Knowledge of low-latency serving architectures
Familiarity with ML-specific security requirements
Background in performance profiling and optimization
Experience with model serving metrics collection and analysis

Keywords: continuous integration, continuous deployment, machine learning

12:39 AM 08-Mar-25

