DEV Community

# mlops

Posts

Never lose a training run again: a checkpoint-and-resume playbook for ephemeral GPUs

Comments 1
6 min read
Stop building custom wrappers for your ML models.

Comments
4 min read
Using the channels-last memory format reduced the latency of our conversation backbone by 22%

Comments
4 min read
Machine learning in production: the model is the easy part

Comments 1
3 min read
ML Observability on EKS: Logs, Metrics and Tracing Head-to-Head

Comments
11 min read
Benchmarking 5 LLM providers on one eval set, no SDK per vendor

Comments
4 min read
Building a Self-Hosted MLOps Platform from Scratch with FastAPI, PostgreSQL, GCS, and Docker

Comments
4 min read
temperature=0 didn't make our LLM evals reproducible

Comments
4 min read
The SDXL VAE overflow that decoded black images in fp16

Comments
4 min read
Harvesting a regression test set from gateway logs with a plugin

Comments
4 min read
Semantic caching our flaky-test summariser: 58% fewer LLM calls

Comments
4 min read
If a 270M Model Already Worked, Why Did I Fine-Tune a 7B One?

Comments
3 min read
Data Contracts in Production: Stop Trusting Your Upstream Sources

Comments
5 min read
Perplexity held flat after INT4. Task accuracy dropped 7 points.

Comments
4 min read
The seam our tiled upscaler left on every 4K product render

Comments
4 min read