Product Manager, CloudlyMELT

Job Description

As Product Manager for CloudlyMELT, you will own the roadmap for a platform that sits at the intersection of two of the fastest-moving areas in enterprise technology: AI infrastructure and observability. Your customers are the engineers and leaders responsible for keeping expensive GPU clusters running efficiently for the AI workloads that power their business. They are technically sophisticated, cost-conscious, and operating under real pressure. Building a product they trust and rely on requires a deep understanding of both their infrastructure world and the AI capabilities that set CloudlyMELT apart from the legacy monitoring tools they have already tried.

ABOUT CLOUDLYMELT

CloudlyMELT is an AI-native GPU observability platform that correlates network, GPU, and application layers in a single view, reducing MTTR from hours to seconds. It addresses GPU underutilization averaging 15 to 25% in Kubernetes clusters, straggler bottlenecks in distributed training, silent thermal throttling, and the cross-layer blind spot that makes GPU infrastructure incidents so expensive and slow to resolve. Built on OpenTelemetry, Prometheus, and DCGM, it delivers ML-powered predictive failure detection, LLM-driven root cause analysis, cost attribution, and multi-tenant fairness controls for organizations running serious AI infrastructure.

Job Requirement

Own and maintain the CloudlyMELT product roadmap, including cross-layer correlation capabilities, GPU failure prediction, straggler detection, cost attribution features, LLM root cause analysis, and platform observability infrastructure
Conduct ongoing discovery with AI engineering teams, MLOps leads, infrastructure architects, and FinOps stakeholders to understand their GPU operational challenges, cost pain points, and current observability gaps
Write clear, detailed product requirements for technically complex observability features, with precision about data sources, model behavior, output format, and integration requirements
Define success metrics for CloudlyMELT features in terms customers care about: MTTR reduction, GPU utilization improvement, cost attribution accuracy, and time to insight
Lead go-to-market planning for new platform capabilities in collaboration with marketing and sales, including benchmark data, competitive positioning, and demo environment development
Track competitive developments in the GPU observability and AIOps market, including Datadog, Prometheus, Run:ai, and NVIDIA Base Command, and maintain CloudlyMELT's differentiated positioning
Manage the relationship between CloudlyMELT's internal AIOps capabilities and its customer-facing product, ensuring internal learnings feed product improvements

YOU MAY BE A GOOD FIT IF YOU HAVE

2 to 4 years of product management experience at a B2B technology, infrastructure, observability, or AI/ML company
Strong technical literacy in GPU infrastructure, Kubernetes, distributed training, or cloud observability tooling: you can have a meaningful conversation with a senior ML infrastructure engineer about why GPU utilization is hard to measure accurately
Experience defining products that combine ML capabilities and real-time data infrastructure
Strong analytical instincts: you define the right metrics before building and you use them honestly to evaluate outcomes
Ability to translate deeply technical infrastructure capabilities into clear, compelling product narratives for both engineering buyers and FinOps or executive stakeholders
Comfort working with ML, platform, and data engineering teams on features with significant technical complexity and dependency
Competitive awareness and the ability to articulate specifically why CloudlyMELT wins against alternatives

PREFERRED QUALIFICATIONS

Experience with observability platforms, infrastructure monitoring tools, or AIOps products
Familiarity with GPU compute, Kubernetes cluster management, or distributed ML training workflows
Knowledge of FinOps practices and cloud cost optimization in AI infrastructure contexts
Experience with open standards such as OpenTelemetry, Prometheus, or DCGM
Experience shipping ML-powered product features in a production observability or infrastructure context
Bachelor's degree in Computer Science, Engineering, or a related field

COMPENSATION & BENEFITS

Salary: Competitive base, negotiable based on experience
Performance-based commission structure: your earnings scale directly with your results
Two annual festive bonuses, each equivalent to half a month's salary
Two-day weekends, 10 days casual leave, 10 days sick leave, and 14 public holidays per CloudlyIO's global holiday calendar for Bangladesh
Fully subsidized lunch and evening snacks, plus tea and coffee throughout the day
Direct collaboration with US clients and teams, with real exposure to global enterprise AI deals from day one

Posted By

CloudlyIO, Inc.

Job Locations

Remote, Bangladesh

Job Category

Product
