Vasyl’s Dev Notes

Coding, Coffee & Chapter Notes

An AI Feature Has No “Tests Pass” Moment. So I Write the Eval First.

A reader on chapter 3 must never get spoilers from chapter 30. That one rule taught me why, for AI features, the eval is the spec — and why I write it before the feature. — read more

Jun 17, 2026
The AI Engineer Interview Is a Backend Interview: 18 Real Questions, Answered for .NET Developers

Open any “AI engineer interview prep” guide and you’ll find the same thing: prompting tricks, model trivia, and Python code. Here’s what the guides miss. When you look at what companies actually ask in 2026 — design a RAG system, debug retrieval failures, keep latency under 800ms, build LLM-as-judge evals — these are not prompt… — read more

Jun 10, 2026
From a Number to a Gate: Evals in CI and Production

Part 5, the finale, of a series on building production AI on .NET. We’ve built the pieces — what evals are, error analysis, golden datasets, and a trustworthy judge. Now we make them earn their keep. By now you can produce a defensible quality score for an AI feature. But a score you only look… — read more

Jun 10, 2026
LLM-as-Judge, Done Right

Part 4 of a series on building production AI on .NET. We’ve covered what evals are, error analysis, and golden datasets. Now: how do you turn a paragraph into a number you can trust? You have a golden dataset and your feature’s real output for each case. Now you need a score. But you can’t… — read more

Jun 10, 2026
Golden Datasets That Don’t Lie

Part 3 of a series on building production AI on .NET. Part 1 was the overview; Part 2 was error analysis. Now we turn the failure taxonomy you built into something you can measure against — without quietly fooling yourself. A golden dataset is a set of representative inputs, each paired with a reference answer… — read more

Jun 10, 2026
Error Analysis: The Unglamorous Superpower Behind Good Evals

Part 2 of a series on building production AI on .NET. Part 1 covered what evals are and the Analyze → Measure → Improve lifecycle. This post is about the step everyone wants to skip: Analyze. When a team decides to “take evals seriously,” the first thing they usually do is wrong. They open a… — read more

Jun 10, 2026
AI Evals, Explained: How We Actually Know Our AI Is Any Good

Everyone says evals are the most important skill in AI engineering. Few show the unglamorous parts: a golden set that doesn’t lie, a judge you can trust, and a regression gate that won’t fire on noise. The whole thing — in C#, on a live product. — read more

Jun 10, 2026
Four Hidden Gates Between Your Expo Build and Google Play in 2026

Real time from eas build to my first tester on Google Play: four hours and seven builds. Google rolled out Android Developer Verification ahead of its September 2026 mandate, and the path from a fresh EAS-built AAB to an Internal Testing release no longer looks like the tutorials. Below is the map I wish I’d… — read more

May 15, 2026
I put Ollama on a 4 GB mobile GPU and got 2.5× — here’s the VRAM math

TL;DR. Same prompt, same model, same box. The only thing that changed was whether Ollama was allowed to touch the GPU. On CPU alone the model ran at 17 tokens per second and took about five and a half seconds per call. With the GPU enabled, Ollama put almost the whole transformer stack on the… — read more

May 12, 2026
Open-source licenses 101: which one to actually pick

Sooner or later, every developer runs into The License Question. You shipped something to GitHub, GitHub asked you to pick a license, and you scrolled the dropdown — MIT, Apache, GPL, AGPL, BUSL, MPL, ISC, Unlicense, “Other” — and picked whatever sounded least scary. That’s how I did it. That’s also how I ended up… — read more

May 6, 2026

GitHub | LinkedIn
