Cognizant | Senior ML Platform Engineer (Serving Infrastructure) | San Jose, CA
Email: [email protected]
Hello,

Hope you are doing great!

This is Nitin. I work as a Technical Recruiter at NYTP. Reach me at [email protected] if you want to apply for the role below:

Position: Senior ML Platform Engineer (Serving Infrastructure)

Location: San Jose, CA

Type: Long Term

Job Description:

Location: [Location/Remote Option]

Department: ML Platform Engineering

Role Overview: We're looking for an experienced engineer to build our ML serving infrastructure. You'll create the platforms and systems that enable reliable, scalable model deployment and inference. This role focuses on the runtime infrastructure that powers our production ML capabilities.

Key Responsibilities:

Design and implement scalable model serving platforms for both batch and real-time inference

Build model deployment pipelines with automated testing and validation

Develop monitoring, logging, and alerting systems for ML services

Create infrastructure for A/B testing and model experimentation

Implement model versioning and rollback capabilities

Design efficient scaling and load balancing strategies for ML workloads

Collaborate with data scientists to optimize model serving performance

Technical Requirements:

7+ years of software engineering experience, with 3+ years in ML serving/infrastructure

Strong expertise in container orchestration (Kubernetes) and cloud platforms

Experience with model serving technologies (TensorFlow Serving, Triton, KServe)

Deep knowledge of distributed systems and microservices architecture

Proficiency in Python and experience with high-performance serving

Strong background in monitoring and observability tools

Experience with CI/CD pipelines and GitOps workflows

Nice to Have:

Experience with model serving frameworks:

o TorchServe for PyTorch models

o TensorFlow Serving for TF models

o Triton Inference Server for multi-framework support

o BentoML for unified model serving

Expertise in model runtime optimizations:

o Model quantization (INT8, FP16)

o Model pruning and compression

o Kernel optimizations

o Batching strategies

o Hardware-specific optimizations (CPU/GPU)

Experience with model inference workflows:

o Pre/post-processing pipeline optimization

o Feature transformation at serving time

o Caching strategies for inference

o Multi-model inference orchestration

o Dynamic batching and request routing

Experience with GPU infrastructure management

Knowledge of low-latency serving architectures

Familiarity with ML-specific security requirements

Background in performance profiling and optimization

Experience with model serving metrics collection and analysis

Thanks,

_______________________________________

Nitin Pillai | New York Technology Partners

120 Wood Avenue S | Suite 504 | Iselin NJ 08830

www.nytp.com

We respect your online privacy. If you would like to be removed from our mailing list please reply with "Remove" in the subject and we will comply immediately. We apologize for any inconvenience caused. Please let us know if you have more than one domain.

The material in this e-mail is intended only for the use of the individual to whom it is addressed and may contain information that is confidential, privileged, and exempt from disclosure under applicable law. If you are not the intended recipient, be advised that the unauthorized use, disclosure, copying, distribution, or the taking of any action in reliance on this information is strictly prohibited.

11:33 PM 07-Mar-25

