<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Davidslv</title>
    <description>Senior Software Engineer · 15+ years of Ruby · Author of two books on software architecture and game development.
</description>
    <link>https://davidslv.uk/</link>
    <atom:link href="https://davidslv.uk/feed.xml" rel="self" type="application/rss+xml"/>
    <pubDate>Fri, 26 Jun 2026 08:05:17 +0000</pubDate>
    <lastBuildDate>Fri, 26 Jun 2026 08:05:17 +0000</lastBuildDate>
    <generator>Jekyll v3.10.0</generator>
    
      <item>
        <title>The View Layer Rails Couldn&apos;t See</title>
        <description>&lt;p&gt;For most of Rails’ history, almost every layer of the stack has had a tool that &lt;em&gt;parses&lt;/em&gt; it. RuboCop reads your Ruby as a syntax tree; Brakeman traces tainted data through it; ESLint reads your JavaScript. The view layer was the exception. The ERB linters we had worked on the &lt;em&gt;text&lt;/em&gt; of the template, not on the HTML that text produces.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://herb-tools.dev&quot;&gt;Marco Roth’s Herb&lt;/a&gt; closes that gap: written in C, it parses HTML-with-embedded-ERB into a real syntax tree. We have run it in production in a Rails engine for a few months. Here is what we wired in, what it caught, and what is still rough.&lt;/p&gt;

&lt;h2 id=&quot;the-blind-spot&quot;&gt;The blind spot&lt;/h2&gt;

&lt;p&gt;For about twenty years, ERB tooling was string-based. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erb_lint&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;better_html&lt;/code&gt; did useful work, but they pattern-matched over text — they had no parse tree of &lt;em&gt;the HTML you were producing&lt;/em&gt;, so they could not reliably catch a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;div&amp;gt;&lt;/code&gt; opened inside a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;p&amp;gt;&lt;/code&gt;, or a value interpolated into an attribute where the escaping rules differ from body text.&lt;/p&gt;

&lt;p&gt;The harder case in our codebase was not ERB at all. Parts of our admin surface were written in &lt;strong&gt;Arbre&lt;/strong&gt; — the object-oriented “DOM in Ruby” that ActiveAdmin uses to build markup with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;div do … end&lt;/code&gt; blocks instead of templates. Arbre first appeared in 2011 and solved a real problem of its era. I never warmed to the abstraction myself, but taste is beside the point here. The real issue is that Arbre is invisible to static analysis: there is no HTML in the file, only Ruby that emits HTML at runtime. No HTML linter sees it, no language server checks the Stimulus targets inside it, no formatter touches it. Every &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.html.arb&lt;/code&gt; file is a file outside the ecosystem.&lt;/p&gt;

&lt;h2 id=&quot;what-a-parse-tree-makes-possible&quot;&gt;What a parse tree makes possible&lt;/h2&gt;

&lt;p&gt;Once you have a tree, checks become expressible that a string linter cannot do reliably. In our engine, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;herb analyze&lt;/code&gt; gates on two:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Nesting&lt;/strong&gt; — invalid element nesting, the kind browsers silently repair and then render differently than you meant.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Accessibility&lt;/strong&gt; — structural checks (missing labels and the like) that only make sense with the element tree in front of you.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Herb can also check security — for instance, output in an attribute position, where a naive &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;%= %&amp;gt;&lt;/code&gt; can break out of the attribute. We leave those rules off, though: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;better_html&lt;/code&gt; (under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erb_lint&lt;/code&gt;) already owns ERB safety, and running both would just double-report. Either way the point holds: the tree is what makes the check possible at all.&lt;/p&gt;

&lt;h2 id=&quot;the-view-layer-has-been-consolidating-on-erb&quot;&gt;The view layer has been consolidating on ERB&lt;/h2&gt;

&lt;p&gt;Herb did not appear in a vacuum. For a decade the view layer has been moving in one direction:&lt;/p&gt;

&lt;style&gt;
.vl-timeline{margin:2rem 0;padding-left:1.6rem;border-left:2px solid var(--border-color)}
.vl-tl{position:relative;padding-bottom:1.6rem}
.vl-tl:last-child{padding-bottom:0}
.vl-tl::before{content:&quot;&quot;;position:absolute;left:calc(-1.6rem - 7px);top:5px;width:11px;height:11px;border-radius:50%;background:#c2a06a;border:2px solid var(--background-color)}
.vl-tl .vl-yr{display:block;font-family:var(--code-font, &quot;SFMono-Regular&quot;, Menlo, Consolas, monospace);font-size:.78rem;font-weight:600;letter-spacing:.02em;color:#c2a06a;text-transform:uppercase}
.vl-tl h4{margin:.15rem 0 .2rem;font-size:1.02rem;line-height:1.3;color:var(--text-color)}
.vl-tl p{margin:0;font-size:.92rem;line-height:1.5;color:var(--meta-color)}
&lt;/style&gt;

&lt;div class=&quot;vl-timeline&quot; role=&quot;list&quot; aria-label=&quot;Timeline of ERB&apos;s reinvigoration&quot;&gt;
  &lt;div class=&quot;vl-tl&quot; role=&quot;listitem&quot;&gt;
    &lt;span class=&quot;vl-yr&quot;&gt;2019&lt;/span&gt;
    &lt;h4&gt;ViewComponent born at GitHub&lt;/h4&gt;
    &lt;p&gt;An ERB template plus a Ruby class. Built and run by the most prolific ERB renderer on the planet — and still rendering ERB.&lt;/p&gt;
  &lt;/div&gt;
  &lt;div class=&quot;vl-tl&quot; role=&quot;listitem&quot;&gt;
    &lt;span class=&quot;vl-yr&quot;&gt;2023 · Rails 7.1&lt;/span&gt;
    &lt;h4&gt;Strict locals land&lt;/h4&gt;
    &lt;p&gt;A magic comment lets a plain partial declare the locals it accepts, required or with defaults, which closes much of the safety gap that sent people looking for alternatives.&lt;/p&gt;
  &lt;/div&gt;
  &lt;div class=&quot;vl-tl&quot; role=&quot;listitem&quot;&gt;
    &lt;span class=&quot;vl-yr&quot;&gt;2024–25&lt;/span&gt;
    &lt;h4&gt;Phlex 2 ships&lt;/h4&gt;
    &lt;p&gt;Pure-Ruby views — fast and composable, but deliberately outside ERB. Good, and against the grain of everything around it.&lt;/p&gt;
  &lt;/div&gt;
  &lt;div class=&quot;vl-tl&quot; role=&quot;listitem&quot;&gt;
    &lt;span class=&quot;vl-yr&quot;&gt;2025&lt;/span&gt;
    &lt;h4&gt;Herb arrives&lt;/h4&gt;
    &lt;p&gt;An HTML-aware ERB parser, linter, formatter and language server, written in C. The first time the view layer gets a real parse tree.&lt;/p&gt;
  &lt;/div&gt;
  &lt;div class=&quot;vl-tl&quot; role=&quot;listitem&quot;&gt;
    &lt;span class=&quot;vl-yr&quot;&gt;2025–26&lt;/span&gt;
    &lt;h4&gt;GitHub adopts Herb · Rails core leans in&lt;/h4&gt;
    &lt;p&gt;Herb runs across GitHub&apos;s enormous ERB surface, and a proposed “ReActionView” direction at Rails World 2025 imagines Herb as the foundation Rails parses views with.&lt;/p&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Two things get conflated here — a commenter rightly called me on it. The &lt;em&gt;substrate&lt;/em&gt; has settled: ERB is where the ecosystem’s weight sits, and that is a safe bet. The &lt;em&gt;tooling&lt;/em&gt; on top is young — Herb is at 0.10. That is not a contradiction; it is the reason this is worth writing down. (The one popular bet the other way is Phlex — pure-Ruby views, genuinely good for composition. It is a different trade: write views as Ruby, and you are back to having no HTML for an HTML-aware tool to read.)&lt;/p&gt;

&lt;h2 id=&quot;the-lived-test-porting-arbre-out-of-an-engine&quot;&gt;The lived test: porting Arbre out of an engine&lt;/h2&gt;

&lt;p&gt;Here is where it became a diff. We wrote a hard rule into the engine’s style guide: &lt;strong&gt;new view partials are &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.html.erb&lt;/code&gt;, and the moment you touch an existing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.html.arb&lt;/code&gt; file for any reason, you port it to ERB in the same pull request.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Existing Arbre files keep working untouched until something forces an edit — a bug fix, a copy change, a new field. Then you port the whole file first and make your change in ERB. Pre-commit and CI both fail on any added or modified &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.html.arb&lt;/code&gt; in the diff (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git diff --diff-filter=AM | grep &apos;\.html\.arb$&apos;&lt;/code&gt;). No hotfix bypass — the gate is only credible if it survives pressure.&lt;/p&gt;
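
&lt;p&gt;As a rough illustration of what that gate checks, here is a minimal Ruby sketch. It is not our actual hook: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;arbre_violations&lt;/code&gt; helper and the sample paths are hypothetical, and it assumes the list of changed paths comes from something like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git diff --name-only --diff-filter=AM&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Hypothetical helper, for illustration only: given the paths reported as
# added or modified, return the ones that are Arbre templates.
def arbre_violations(changed_paths)
  changed_paths.select { |path| path.end_with?(".html.arb") }
end

# Sample input; these paths are made up.
changed = [
  "app/views/records/_sync.html.erb",
  "app/views/records/_summary.html.arb"
]

offenders = arbre_violations(changed)
if offenders.empty?
  puts "OK: no Arbre templates added or modified"
else
  puts "Port these files to ERB in this pull request:"
  offenders.each { |path| puts "  #{path}" }
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A real hook would exit non-zero when the list is not empty, which is what makes the commit or the CI job fail.&lt;/p&gt;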

&lt;p&gt;The rule has brought the engine to &lt;strong&gt;193 ERB partials and three remaining Arbre files&lt;/strong&gt;. Those three are simply files nobody has needed to touch yet. There was no big-bang migration; the legacy set shrinks as you work.&lt;/p&gt;

&lt;p&gt;What makes the rule fair is that porting Arbre is mechanical. You convert the whole file once, and it is done. That is cheap enough to demand on every touch — which it would not be for, say, a cop that rewrote existing method calls.&lt;/p&gt;

&lt;p&gt;Here is one of those three files, trimmed. In Arbre the markup is Ruby — there is no &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;div&amp;gt;&lt;/code&gt;, no class, no attribute for a tool to read:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;div&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;class: &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;bg-white rounded-xl border px-6 py-5&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;para&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i18n&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;.heading&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;class: &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;text-xs uppercase&apos;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;text_node&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;button_to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i18n&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;.button&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;sync_record_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;ss&quot;&gt;method: :post&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;ss&quot;&gt;class: &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;px-4 py-2 rounded-lg bg-blue-600 text-white&apos;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Ported, the same output is real HTML that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;herb analyze&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;herb-lsp&lt;/code&gt; can parse:&lt;/p&gt;

&lt;div class=&quot;language-erb highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;div&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;class=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;bg-white rounded-xl border px-6 py-5&quot;&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;&amp;gt;&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;&amp;lt;p&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;class=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;text-xs uppercase&quot;&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;cp&quot;&gt;&amp;lt;%=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i18n&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;.heading&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;cp&quot;&gt;%&amp;gt;&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
  &lt;span class=&quot;cp&quot;&gt;&amp;lt;%=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;button_to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i18n&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;.button&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;sync_record_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;ss&quot;&gt;method: :post&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;ss&quot;&gt;class: &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;px-4 py-2 rounded-lg bg-blue-600 text-white&quot;&lt;/span&gt; &lt;span class=&quot;cp&quot;&gt;%&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;div do … end&lt;/code&gt; becomes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;div&amp;gt;…&amp;lt;/div&amp;gt;&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;para&lt;/code&gt; becomes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;p&amp;gt;&lt;/code&gt;, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;text_node&lt;/code&gt; wrappers fall away. One file, one pass.&lt;/p&gt;

&lt;h2 id=&quot;honest-about-the-rough-edges&quot;&gt;Honest about the rough edges&lt;/h2&gt;

&lt;p&gt;In the editor, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;herb-lsp&lt;/code&gt; (we list &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;marcoroth.herb-lsp&lt;/code&gt; as a recommended extension) gives live HTML diagnostics as you type. But this is a young toolchain — 0.10.1 — and two things are still rough:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stimulus checks are off, on purpose.&lt;/strong&gt; We bootstrap Stimulus inline in an ERB partial rather than in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;application.js&lt;/code&gt;, and the Stimulus parser cannot follow controller registrations embedded in ERB — so it flags every real, registered controller as unknown. The four &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stimulus-data-*&lt;/code&gt; rules are disabled in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.herb.yml&lt;/code&gt; until our bootstrap moves to a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.js&lt;/code&gt; file. So the diagnostic I most wanted — catching a typo’d &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;data-controller&lt;/code&gt; as I type it — is not live for us yet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Two ERB linters, side by side.&lt;/strong&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;herb analyze&lt;/code&gt; for the parser-level validators, Shopify’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erb_lint&lt;/code&gt; for whitespace and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;better_html&lt;/code&gt; safety. Where they overlap, we keep the rule in one so they do not double-report. The new tool has not replaced the old one; it sits next to it and will subsume more over time.&lt;/p&gt;

&lt;h2 id=&quot;the-lesson&quot;&gt;The lesson&lt;/h2&gt;

&lt;p&gt;One idea is worth keeping: &lt;strong&gt;the reach of your tooling is an architectural property, and you choose it.&lt;/strong&gt; Picking Arbre years ago did not just pick a way to write markup — it put that markup beyond every analyser we would later wish we had. Picking ERB put the view layer back inside reach of the linters, the language servers, the safety checks, and whatever Rails builds next on Herb. The template-engine debate usually turns on ergonomics; the more durable axis is legibility to tools, and on that axis HTML-aware ERB is in a different class from the DSLs it replaces.&lt;/p&gt;

&lt;p&gt;So: write ERB, lint it with something that actually parses the HTML, and when you find a corner the tools cannot see, treat it as a bug in your architecture — not a quirk you live with.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;The setup here is a production Rails engine running &lt;a href=&quot;https://herb-tools.dev&quot;&gt;Herb&lt;/a&gt; 0.10.1 (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;herb analyze&lt;/code&gt; in pre-push and CI), &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;erb_lint&lt;/code&gt; 0.9.0 with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;better_html&lt;/code&gt; for safety, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;marcoroth.herb-lsp&lt;/code&gt; as a recommended extension, and an enforced Arbre-to-ERB porting rule that has the engine at 193 ERB partials to 3 remaining Arbre files.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you want the longer story on building Rails applications that stay maintainable as they grow — boundaries, engines, testing, and honest trade-offs — that is what &lt;a href=&quot;/modular-rails/&quot;&gt;Modular Rails: Architecture for the Long Game&lt;/a&gt; covers in depth. &lt;a href=&quot;/books/modular-rails/&quot;&gt;Read it free on the web&lt;/a&gt;, or get the &lt;a href=&quot;https://www.amazon.com/dp/1066649405&quot;&gt;paperback&lt;/a&gt; (&lt;a href=&quot;https://www.amazon.co.uk/dp/1066649405&quot;&gt;UK&lt;/a&gt;).&lt;/em&gt;&lt;/p&gt;
</description>
        <pubDate>Wed, 24 Jun 2026 00:00:00 +0000</pubDate>
        <link>https://davidslv.uk/2026/06/24/the-view-layer-rails-couldnt-see.html</link>
        <guid isPermaLink="true">https://davidslv.uk/2026/06/24/the-view-layer-rails-couldnt-see.html</guid>
        
        
      </item>
    
      <item>
        <title>The Propshaft Version Lever You Were Told Was Gone</title>
        <description>&lt;p&gt;A piece of feedback to the Rails community crossed my feed this week. A team had migrated an application to Rails 8.1.3, adopted Propshaft — the asset pipeline that replaced Sprockets as the Rails 8 default — and concluded that it had removed the ability to set a version string to force new fingerprints on precompile. Their words were that this introduced “a weakness to the platform.” The reasoning was sound: they used that lever to be certain a client was running the latest deployed assets, and now it appeared to be gone.&lt;/p&gt;

&lt;p&gt;The instinct is correct. That lever matters. But the conclusion is wrong, and the way I know it is wrong is the point of this post: I cloned Propshaft, read the source, and then generated a fresh Rails 8.1.3 app and tested it, rather than trusting the blog posts. The version setting is still there. It is wired into the digest. The Rails generator writes it into your initializer with a comment explaining what it does. And when I precompiled twice to prove it, it behaved exactly as advertised. It is missing from Propshaft’s README — which is a different problem from being removed, and a much smaller one.&lt;/p&gt;

&lt;h2 id=&quot;what-the-blog-posts-will-tell-you&quot;&gt;What the blog posts will tell you&lt;/h2&gt;

&lt;p&gt;Search for “Propshaft cache busting” and every result says the same thing, more or less correctly:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Propshaft appends a content-based fingerprint to each filename. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;application.css&lt;/code&gt; becomes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;application-a1b2c3d4.css&lt;/code&gt;. When the content changes, the digest changes, the filename changes, and the browser fetches the new file. Unlike Sprockets, there is no &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config.assets.version&lt;/code&gt; to manage — the content hash handles everything.&lt;/p&gt;
&lt;/blockquote&gt;

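&lt;p&gt;That last sentence is the one that does the damage. It is repeated across tutorials, and it is the source of the belief that the lever was deleted. Everything before it in the quote is true. The claim that there is no &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config.assets.version&lt;/code&gt; is folklore.&lt;/p&gt;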

&lt;h2 id=&quot;what-the-source-actually-says&quot;&gt;What the source actually says&lt;/h2&gt;

&lt;p&gt;Propshaft is small enough to read in a sitting, which is exactly why I reach for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git clone&lt;/code&gt; before I reach for an opinion. The whole digest mechanism is one method in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lib/propshaft/asset.rb&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;digest&lt;/span&gt;
  &lt;span class=&quot;vi&quot;&gt;@digest&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Digest&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;SHA1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;hexdigest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;content_with_compile_references&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;version&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;first&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Read it slowly. The string being hashed is not just the file’s content. It is the content &lt;strong&gt;concatenated with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;load_path.version&lt;/code&gt;&lt;/strong&gt;. The fingerprint is a SHA1 of &lt;em&gt;content plus a version string&lt;/em&gt;, truncated to eight characters.&lt;/p&gt;

&lt;p&gt;So where does &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;load_path.version&lt;/code&gt; come from? Two short hops up. The load path is built in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lib/propshaft/assembly.rb&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;load_path&lt;/span&gt;
  &lt;span class=&quot;vi&quot;&gt;@load_path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Propshaft&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;LoadPath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;paths&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;compilers: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;compilers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;version: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;version&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;          &lt;span class=&quot;c1&quot;&gt;# &amp;lt;- right here&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;file_watcher: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;file_watcher&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;integrity_hash_algorithm: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;integrity_hash_algorithm&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config.version&lt;/code&gt; is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config.assets.version&lt;/code&gt;, which the Propshaft railtie sets a default for in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lib/propshaft/railtie.rb&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;assets&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;version&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;1&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The chain is unbroken:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;flowchart LR
    A[&quot;config.assets.version&amp;lt;br/&amp;gt;(generated app: &amp;amp;quot;1.0&amp;amp;quot;)&quot;] --&amp;gt; B[Assembly]
    B --&amp;gt;|&quot;version: config.version&quot;| C[LoadPath.version]
    C --&amp;gt; D[&quot;Asset#digest&amp;lt;br/&amp;gt;SHA1(content + version)&quot;]
    D --&amp;gt; E[&quot;application-a1b2c3d4.css&quot;]

    style A fill:#e8a838,stroke:#b07828,color:#fff
    style D fill:#4a90d9,stroke:#2c5f8a,color:#fff
    style E fill:#27ae60,stroke:#1e8449,color:#fff
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config.assets.version&lt;/code&gt; exists in Propshaft. The railtie defaults it to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;1&quot;&lt;/code&gt;, it is folded into every single asset digest, and this is Propshaft 1.3.2 — the version shipping with current Rails 8.&lt;/p&gt;

&lt;p&gt;And here is the part that turns “undocumented” into “actually documented, in your own repository.” Generate a fresh Rails 8.1.3 app and open &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config/initializers/assets.rb&lt;/code&gt;, and the generator has already written this for you:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# Version of your assets, change this if you want to expire all your assets.&lt;/span&gt;
&lt;span class=&quot;no&quot;&gt;Rails&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;application&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;assets&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;version&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;1.0&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The generated app overrides the railtie’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;1&quot;&lt;/code&gt; with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;1.0&quot;&lt;/code&gt;, but the point is the comment. The exact “enter version information to force new fingerprints” feature the feedback believed was disabled is scaffolded into every new Rails app, on a line whose comment names the precise use case: &lt;em&gt;change this if you want to expire all your assets&lt;/em&gt;. Nobody removed the lever. It is sitting in an initializer the generator wrote, one &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git grep version config/&lt;/code&gt; away.&lt;/p&gt;

&lt;h2 id=&quot;using-it&quot;&gt;Using it&lt;/h2&gt;

&lt;p&gt;Because the version string is part of the hashed input, changing it changes the hash for &lt;strong&gt;every asset&lt;/strong&gt;, regardless of whether any file content changed. Edit the line the generator already gave you:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# config/initializers/assets.rb&lt;/span&gt;
&lt;span class=&quot;no&quot;&gt;Rails&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;application&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;assets&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;version&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;2.0&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Precompile, and every fingerprint differs from the previous build, for every file in the pipeline, whether or not its content changed. The asset URLs embedded in your views all change, so a client loading a page is pointed at URLs it has never cached and pulls the fresh files. That is precisely the “force new fingerprint generation on precompile” behaviour the feedback assumed had been taken away.&lt;/p&gt;
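
&lt;p&gt;To see the mechanism in isolation, here is a minimal sketch in plain Ruby. It only mirrors the shape of the digest method quoted earlier: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fingerprint&lt;/code&gt; helper and the sample CSS are made up for illustration, and it hashes a literal string rather than anything Propshaft compiled:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;require "digest/sha1"

# Hypothetical helper mirroring the shape of the digest method shown above:
# SHA1 over the content followed by the version string, first eight hex
# characters. Plain Ruby has no String#first (that comes from ActiveSupport),
# so this sketch slices with [0, 8] instead.
def fingerprint(content, version)
  Digest::SHA1.hexdigest("#{content}#{version}")[0, 8]
end

# Sample content; identical for both calls below.
css = "body { color: black; }\n"

v1 = fingerprint(css, "1.0")
v2 = fingerprint(css, "2.0")

puts "version 1.0: application-#{v1}.css"
puts "version 2.0: application-#{v2}.css"
puts "fingerprints differ: #{v1 != v2}"
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The content is the same in both calls, so any difference in the output comes from the version string alone.&lt;/p&gt;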

&lt;p&gt;It is worth noting this is &lt;em&gt;more&lt;/em&gt; reliable than the feature people remember. Sprockets had a long-standing bug (&lt;a href=&quot;https://github.com/rails/sprockets-rails/issues/240&quot;&gt;sprockets-rails#240&lt;/a&gt;) where bumping &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config.assets.version&lt;/code&gt; produced identical digests anyway — the lever was connected to nothing. Propshaft’s version genuinely participates in the hash. The thing that was supposedly removed actually works better than the original.&lt;/p&gt;

&lt;h2 id=&quot;i-didnt-take-the-sources-word-for-it&quot;&gt;I didn’t take the source’s word for it&lt;/h2&gt;

&lt;p&gt;Reading the source tells you what &lt;em&gt;should&lt;/em&gt; happen. Before publishing this I generated a clean Rails 8.1.3 app (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;propshaft (1.3.2)&lt;/code&gt;, no Sprockets in the lockfile), added one declaration to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;app/assets/stylesheets/application.css&lt;/code&gt;, and precompiled it twice. Between the two builds I changed nothing except the version string, and I checked that the CSS file was byte-for-byte identical (same MD5) across both runs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build one&lt;/strong&gt;, with the default &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config.assets.version = &quot;1.0&quot;&lt;/code&gt;, produced this manifest:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;application.css&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;digested_path&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;application-a863ad16.css&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Build two&lt;/strong&gt;, after changing only the version to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;2.0&quot;&lt;/code&gt; — no content change, identical MD5 — produced:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;application.css&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;digested_path&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;application-09b5bd28.css&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Every digest moved, not just the stylesheet’s:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;asset&lt;/th&gt;
      &lt;th&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;version = &quot;1.0&quot;&lt;/code&gt;&lt;/th&gt;
      &lt;th&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;version = &quot;2.0&quot;&lt;/code&gt;&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;application.css&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a863ad16&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;09b5bd28&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rails-ujs.esm.js&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;e925103b&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a4ead74f&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rails-ujs.js&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;20eaf715&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0b7c6ef1&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Then, to be sure I understood &lt;em&gt;why&lt;/em&gt;, I reproduced the digest by hand from the formula in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;asset.rb&lt;/code&gt; — &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SHA1(content + version)&lt;/code&gt; truncated to eight characters:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;digest&quot;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;content&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;File&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;app/assets/stylesheets/application.css&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;no&quot;&gt;Digest&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;SHA1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;hexdigest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;content&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;1.0&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# =&amp;gt; &quot;a863ad16&quot;  ✓ matches build one&lt;/span&gt;
&lt;span class=&quot;no&quot;&gt;Digest&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;SHA1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;hexdigest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;content&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;2.0&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# =&amp;gt; &quot;09b5bd28&quot;  ✓ matches build two&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Both hand-computed digests matched the precompiled filenames exactly. The lever works, it works for the documented reason, and the mechanism is no deeper than concatenating a string before hashing.&lt;/p&gt;

&lt;h2 id=&quot;why-you-almost-never-need-it&quot;&gt;Why you almost never need it&lt;/h2&gt;

&lt;p&gt;Here is the more interesting half, and the reason the lever is quietly scaffolded rather than loudly advertised.&lt;/p&gt;

&lt;p&gt;With Sprockets, the version string earned its keep because Sprockets’ digests were not purely content-addressed and were occasionally inconsistent between environments and gem versions. The version knob was the manual override you reached for when you did not trust the automatic digest. It was a workaround for unpredictability.&lt;/p&gt;

&lt;p&gt;Propshaft’s digest is a plain SHA1 of the content plus the version string. It is deterministic: for a given version, identical bytes always produce the identical fingerprint, and any byte change produces a new one. The automatic case is now trustworthy, so the manual override has almost nothing left to do. If you changed an asset, its fingerprint already changed, so you do not bump a version to “make sure”: the content &lt;em&gt;is&lt;/em&gt; the version.&lt;/p&gt;

&lt;p&gt;The only scenario where the global lever is the right tool is when you want every asset to get a new URL &lt;em&gt;without changing any content&lt;/em&gt;: forcing a CDN that keyed on something unexpected to re-pull, recovering from a poisoned edge cache, or invalidating after a build-toolchain change that altered how files are produced but not what they contain. Real situations, but rare ones. That rarity is why the generated app sets the field to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;1.0&quot;&lt;/code&gt; and most teams never touch it again — not because it was deleted.&lt;/p&gt;

&lt;h2 id=&quot;the-cache-busting-problem-the-lever-does-not-solve&quot;&gt;The cache-busting problem the lever does not solve&lt;/h2&gt;

&lt;p&gt;There is a deeper point hiding in the original feedback, and it is the part worth carrying away even if you never touch &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config.assets.version&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The stated goal was “make sure the client indeed has the latest asset being deployed.” Bumping the asset version does not, on its own, guarantee that — because the fingerprinted URL lives &lt;strong&gt;inside your HTML&lt;/strong&gt;:&lt;/p&gt;

&lt;div class=&quot;language-html highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;link&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;rel=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;stylesheet&quot;&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;href=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;/assets/application-9f8e7d6c.css&quot;&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A fingerprinted asset is safe to cache forever, which is the whole win: the URL only points at one immutable version of the file. But the &lt;em&gt;document&lt;/em&gt; that references it is a different cache layer entirely. If a CDN, a reverse proxy, or the browser is holding an old copy of the HTML, the client keeps reading the &lt;em&gt;old&lt;/em&gt; asset URL — and your shiny new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;version = &quot;2.0&quot;&lt;/code&gt; digests sit on the server unrequested.&lt;/p&gt;

&lt;p&gt;So “did the client get the latest assets?” is really two questions stacked on top of each other:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Assets:&lt;/strong&gt; solved by fingerprinting, automatically, the moment content changes. Cache them with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;max-age&lt;/code&gt; set to a year and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;immutable&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;The HTML that names them:&lt;/strong&gt; &lt;em&gt;not&lt;/em&gt; an asset-pipeline concern at all. This is the layer to get right — a short or zero &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;max-age&lt;/code&gt; on the document, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ETag&lt;/code&gt;/&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Last-Modified&lt;/code&gt; revalidation, and correct CDN cache rules for HTML responses. If this layer serves stale HTML, no amount of fingerprinting downstream will help.&lt;/li&gt;
&lt;/ul&gt;
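
&lt;p&gt;As a rough sketch of that split, the helper below picks a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Cache-Control&lt;/code&gt; value per request path. The method name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cache_control_for&lt;/code&gt;, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/assets/&lt;/code&gt; prefix, and the eight-hex-character fingerprint pattern are assumptions made for illustration, not Propshaft or Rails API. A real application would more likely set these headers in its web server, CDN, or Rails configuration.&lt;/p&gt;

```ruby
# Hypothetical helper: choose a Cache-Control header by cache layer.
# Assumes fingerprinted files look like "name-a863ad16.css" (eight hex characters).
FINGERPRINTED = %r{\A/assets/.+-[0-9a-f]{8}\.[a-z0-9.]+\z}

ONE_YEAR = 365 * 24 * 60 * 60 # seconds

def cache_control_for(path)
  if FINGERPRINTED.match?(path)
    # The URL names exactly one version of the file, so it can be cached for a year.
    "public, max-age=#{ONE_YEAR}, immutable"
  else
    # Documents may be stored but must be revalidated before reuse,
    # so clients pick up new asset URLs after a deploy.
    "no-cache"
  end
end

puts cache_control_for("/assets/application-a863ad16.css")
puts cache_control_for("/dashboard")
```

&lt;p&gt;The design choice is that the decision keys on the URL shape alone: anything carrying a fingerprint is treated as immutable, and everything else is revalidated.&lt;/p&gt;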

&lt;p&gt;Reaching for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config.assets.version&lt;/code&gt; to fix a stale-client problem is, most of the time, fixing the layer that already works and ignoring the one that does not. The fingerprint was never the weak link. The document cache is.&lt;/p&gt;

&lt;h2 id=&quot;the-generalisable-lesson&quot;&gt;The generalisable lesson&lt;/h2&gt;

&lt;p&gt;There are two lessons, and both are cheap habits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Read the source, then run the experiment, before you declare a feature gone.&lt;/strong&gt; Propshaft is a few hundred lines. The entire digest behaviour — the thing three dozen blog posts summarise, sometimes wrongly — is one method you can read in under a minute. Cloning the repository and grepping for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;version&lt;/code&gt; took less time than writing the post that announced its removal, and it would have produced the opposite, correct conclusion. Generating a throwaway app and precompiling it twice took five more minutes and turned “the source says it should work” into “I watched it work.” When a tool’s behaviour surprises you, the library’s own code is the primary source and a two-build experiment is the proof. Tutorials are secondary, and they inherit each other’s mistakes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Match the fix to the layer.&lt;/strong&gt; “The client has stale assets” feels like one problem but spans two caches with two different owners. Fingerprinting owns the asset layer and has owned it well since Propshaft shipped. HTTP cache headers own the document layer. Bumping an asset version to solve a document-caching symptom is the kind of fix that appears to work — because you redeployed and cleared something — while leaving the actual cause in place to resurface on the next deploy.&lt;/p&gt;

&lt;p&gt;The lever you were told was gone is still bolted to the dashboard. It is just that the car mostly steers itself now, and the warning light you are chasing is wired to a different system entirely.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;The code in this post is from &lt;a href=&quot;https://github.com/rails/propshaft&quot;&gt;Propshaft&lt;/a&gt; 1.3.2, the asset pipeline that ships by default with Rails 8. The digest method is in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lib/propshaft/asset.rb&lt;/code&gt;; the railtie default is in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lib/propshaft/railtie.rb&lt;/code&gt;; the generated &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config.assets.version&lt;/code&gt; line is in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config/initializers/assets.rb&lt;/code&gt;. Every digest in this post was reproduced from a clean &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rails new&lt;/code&gt; on Rails 8.1.3 (two precompiles, one version change, identical content), not from memory.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you want the longer story on building Rails applications that stay maintainable as they grow — boundaries, engines, testing, and honest trade-offs — that is what &lt;a href=&quot;/modular-rails/&quot;&gt;Modular Rails: Architecture for the Long Game&lt;/a&gt; covers in depth. &lt;a href=&quot;/books/modular-rails/&quot;&gt;Read it free on the web&lt;/a&gt;, or get the &lt;a href=&quot;https://www.amazon.com/dp/1066649405&quot;&gt;paperback&lt;/a&gt; (&lt;a href=&quot;https://www.amazon.co.uk/dp/1066649405&quot;&gt;UK&lt;/a&gt;).&lt;/em&gt;&lt;/p&gt;
</description>
        <pubDate>Tue, 23 Jun 2026 00:00:00 +0000</pubDate>
        <link>https://davidslv.uk/2026/06/23/propshaft-version-lever-cache-busting.html</link>
        <guid isPermaLink="true">https://davidslv.uk/2026/06/23/propshaft-version-lever-cache-busting.html</guid>
        
        
      </item>
    
      <item>
        <title>From One Controller to Thirteen Handlers: A Webhook Refactor</title>
        <description>&lt;p&gt;A webhook controller is the natural place to put webhook code. You name it &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WebhooksController&lt;/code&gt;, you put a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;def stripe&lt;/code&gt; action in it, and you start writing. Six months later it is 200 lines long and you cannot remember what half of it does. This post is about the moment I noticed mine had become a god object, and the small architectural shift that fixed it.&lt;/p&gt;

&lt;p&gt;The code is real. It is from &lt;a href=&quot;https://github.com/Davidslv/seams&quot;&gt;Seams&lt;/a&gt;, the gem I am building that scaffolds modular Rails engines. The Billing engine handles Stripe webhooks and the controller had grown to seven distinct responsibilities. I will walk you through what was wrong, what I changed, and – more usefully – the generalisable pattern underneath, so you can recognise the same smell in code that has nothing to do with webhooks.&lt;/p&gt;

&lt;h2 id=&quot;what-a-webhook-controller-actually-does&quot;&gt;What a webhook controller actually does&lt;/h2&gt;

&lt;p&gt;When a Stripe event arrives, the controller does roughly this:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Read the request body.&lt;/li&gt;
  &lt;li&gt;Verify the Stripe signature so you know the payload is real.&lt;/li&gt;
  &lt;li&gt;Record the event id in your database so retries do not re-fire your subscribers.&lt;/li&gt;
  &lt;li&gt;Decide which Stripe event type this is.&lt;/li&gt;
  &lt;li&gt;Pull the relevant fields out of the payload (customer id, subscription id, amount).&lt;/li&gt;
  &lt;li&gt;Upsert your local mirror of the resource (a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Subscription&lt;/code&gt; row, an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Invoice&lt;/code&gt; row).&lt;/li&gt;
  &lt;li&gt;Publish a domain event for the rest of your application to react to.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That is seven verbs. Each one is a separate concern with its own reasons to change, its own failure modes, and its own testing requirements. Putting them in one place means you cannot exercise any of them in isolation, you cannot see at a glance what the controller is responsible for, and – the part that broke me – adding a new event type means editing the same file every time.&lt;/p&gt;

&lt;p&gt;Mine had five event types mapped. The roadmap said twelve. Going from five to twelve in the same controller would have produced a 300-line action method with a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;case&lt;/code&gt; statement that nobody would want to review.&lt;/p&gt;

&lt;h2 id=&quot;the-smell-named&quot;&gt;The smell, named&lt;/h2&gt;

&lt;p&gt;The cleanest way I know to spot a god object is to write down its responsibilities as verbs. If the list is longer than three, the class is doing too much. The Single Responsibility Principle is usually taught as “a class should have one reason to change,” which sounds vague until you try to change something. If two unrelated bits of code in the same file change for two unrelated reasons, you keep tripping over the other one.&lt;/p&gt;

&lt;p&gt;The webhook controller had a separate reason to change under each of these headings:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;em&gt;Trust&lt;/em&gt;: signature verification changes when Stripe rotates webhook signing schemes.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Idempotency&lt;/em&gt;: dedupe logic changes when you move from a unique-index dedupe to a Redis-based one.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Mapping&lt;/em&gt;: the event type table changes every time you support a new event.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Extraction&lt;/em&gt;: the payload-parsing rules change when Stripe ships an API version bump.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Persistence&lt;/em&gt;: the local upsert changes when your data model changes.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Publishing&lt;/em&gt;: the canonical event names change when your event-bus contract changes.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Forking&lt;/em&gt;: the LTD (Lifetime Deal) special case changes when product decides to add another mode.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Seven independent change vectors in one class. Every commit risked touching unrelated code; every test had to boot the full request stack just to exercise a single field’s extraction.&lt;/p&gt;

&lt;h2 id=&quot;the-refactor&quot;&gt;The refactor&lt;/h2&gt;

&lt;p&gt;The shape I landed on splits the controller into three layers: a thin entry point, a router, and a flat directory of single-purpose handlers.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;flowchart LR
    Stripe([Stripe]) --&amp;gt;|POST /billing/webhooks/stripe| WC[WebhooksController]
    WC --&amp;gt;|verify + record| DB[(WebhookEvent)]
    WC --&amp;gt;|lookup| ER{EventRouter}
    ER --&amp;gt;|&quot;customer.subscription.created&quot;| H1[SubscriptionCreatedHandler]
    ER --&amp;gt;|&quot;invoice.paid&quot;| H2[InvoicePaidHandler]
    ER --&amp;gt;|&quot;checkout.session.completed&quot;| H3[CheckoutSessionCompletedHandler]
    ER --&amp;gt;|&quot;... 10 more&quot;| H4[...]
    H1 --&amp;gt;|upsert + publish| Bus([Event bus])
    H2 --&amp;gt;|upsert + publish| Bus
    H3 --&amp;gt;|fork on mode| Bus

    style WC fill:#4a90d9,stroke:#2c5f8a,color:#fff
    style ER fill:#e8a838,stroke:#b07828,color:#fff
    style Bus fill:#27ae60,stroke:#1e8449,color:#fff
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The controller shrank from ~210 lines to about 95. Most of those 95 are documentation comments. The action itself is now about a dozen lines:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;stripe&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;payload&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;request&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;read&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;signature&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;request&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;headers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Stripe-Signature&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

  &lt;span class=&quot;n&quot;&gt;event&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Billing&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;gateway&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;verify_webhook&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;payload: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;signature: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;signature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;secret: &lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Billing&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;configuration&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;webhook_secret&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

  &lt;span class=&quot;n&quot;&gt;record_and_dispatch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;stripe&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;head&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;rescue&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Billing&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;WebhookError&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;
  &lt;span class=&quot;no&quot;&gt;Seams&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Observability&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;adapter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;warn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;billing.webhook.invalid&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;error: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;head&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:bad_request&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;record_and_dispatch&lt;/code&gt; inserts the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WebhookEvent&lt;/code&gt; row inside a transaction and then calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Webhooks::EventRouter.handler_for(event[:type])&lt;/code&gt; to look up a handler class. If there is one, it instantiates and calls it; if there is not, the controller no-ops. Stripe sends event types nobody subscribed to all the time, so a missing handler is normal, not an error.&lt;/p&gt;
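
&lt;p&gt;A minimal sketch of that record-then-dispatch flow, in plain Ruby, might look like the following. It is not the Seams implementation: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SeenEvents&lt;/code&gt; is a hypothetical in-memory stand-in for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WebhookEvent&lt;/code&gt; table and a unique index on the event id, the handlers are lambdas in a hash, and the returned symbols exist only to make the three outcomes visible.&lt;/p&gt;

```ruby
require "set"

# Hypothetical in-memory stand-in for a table with a unique index on the event id.
class SeenEvents
  def initialize
    @ids = Set.new
  end

  # Returns true the first time an id is recorded, false on a retry.
  def record(event_id)
    !@ids.add?(event_id).nil?
  end
end

def record_and_dispatch(event, seen:, handlers:)
  # A retry of an already-recorded event must not re-fire subscribers.
  return :duplicate unless seen.record(event[:id])

  handler = handlers[event[:type]]
  return :ignored if handler.nil? # no handler registered: a normal no-op

  handler.call(event)
  :handled
end

seen = SeenEvents.new
handlers = { "invoice.paid" => ->(event) { puts "paid #{event[:id]}" } }

first_try  = record_and_dispatch({ id: "evt_1", type: "invoice.paid" }, seen: seen, handlers: handlers)
second_try = record_and_dispatch({ id: "evt_1", type: "invoice.paid" }, seen: seen, handlers: handlers)
unknown    = record_and_dispatch({ id: "evt_2", type: "customer.created" }, seen: seen, handlers: handlers)

p [first_try, second_try, unknown]
```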

&lt;p&gt;The handlers themselves form a small inheritance tree:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Billing::Webhooks::Handler                          ← abstract base
  ├── SubscriptionHandlerBase                       ← shared upsert
  │   ├── SubscriptionCreatedHandler                ← SEAMS_EVENT = &quot;subscription.created.billing&quot;
  │   ├── SubscriptionUpdatedHandler                ← &quot;subscription.updated.billing&quot;
  │   ├── SubscriptionDeletedHandler                ← &quot;subscription.canceled.billing&quot;
  │   └── SubscriptionTrialWillEndHandler           ← &quot;subscription.trial_will_end.billing&quot;
  ├── InvoiceHandlerBase                            ← shared upsert
  │   ├── InvoiceCreatedHandler                     (status: draft)
  │   ├── InvoicePaidHandler                        (status: paid)
  │   ├── InvoicePaymentFailedHandler               (status: open)
  │   ├── InvoiceFinalizedHandler                   (status: open)
  │   └── InvoiceVoidedHandler                      (status: void)
  ├── PaymentSucceededHandler
  ├── PaymentFailedHandler
  ├── ChargeRefundedHandler
  └── CheckoutSessionCompletedHandler               ← LTD fork lives here
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Most leaves are three lines. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SubscriptionCreatedHandler&lt;/code&gt; is literally:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;SubscriptionCreatedHandler&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;SubscriptionHandlerBase&lt;/span&gt;
  &lt;span class=&quot;no&quot;&gt;SEAMS_EVENT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;subscription.created.billing&quot;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The shared upsert lives in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SubscriptionHandlerBase&lt;/code&gt;. The leaf only declares which canonical event name to publish. That is the entire difference between “subscription created” and “subscription updated” – one constant.&lt;/p&gt;

&lt;h2 id=&quot;three-patterns-one-shape&quot;&gt;Three patterns, one shape&lt;/h2&gt;

&lt;p&gt;What I have just described is three classical patterns layered on top of each other.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Template Method&lt;/strong&gt; pattern is doing the heavy lifting in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SubscriptionHandlerBase&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;InvoiceHandlerBase&lt;/code&gt;: a base class defines the algorithm (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;upsert + publish&lt;/code&gt;), and subclasses fill in the variable parts (the canonical event name, the invoice status). When five out of six subclasses share the same logic and only the constants differ, Template Method is the right shape. It keeps the shared code in one place and makes each variant trivial to read.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Strategy&lt;/strong&gt; pattern is the relationship between the controller and the handlers: the controller does not know which handler will run; it asks the router for one and invokes it through a uniform interface (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;handler.new(event:, gateway:).call&lt;/code&gt;). The controller and the router are decoupled from the concrete strategy. Adding a new strategy does not require changing either of them.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Registry&lt;/strong&gt; pattern is what &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EventRouter&lt;/code&gt; is. It is a hash that maps event-type strings to class names, with a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;register&lt;/code&gt; method that lets hosts add their own mappings without monkey-patching. This is the seam that turns a closed system into an open one. A consuming application can write:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;no&quot;&gt;Billing&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Webhooks&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EventRouter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;register&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  &lt;span class=&quot;s2&quot;&gt;&quot;customer.tax_id.created&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;s2&quot;&gt;&quot;MyApp::TaxIdCreatedHandler&quot;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;…and now a Stripe event type that the gem never heard of routes to host code. No subclassing, no config block, no fork. The extension point is published. This is what people mean when they say “open for extension, closed for modification” – the framework’s behaviour does not need to change for the host to add behaviour.&lt;/p&gt;
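
&lt;p&gt;To make the registry idea concrete, here is a small sketch under stated assumptions. Only &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;register&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;handler_for&lt;/code&gt; are names taken from the text above. The internal hash, the lazy &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Object.const_get&lt;/code&gt; lookup, and the top-level &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TaxIdCreatedHandler&lt;/code&gt; class are illustrative guesses rather than the gem’s code.&lt;/p&gt;

```ruby
# Minimal sketch of a registry-style router. Not the Seams source.
module EventRouter
  @handlers = {}

  # Map a Stripe event type to a handler class name (stored as a String).
  def self.register(event_type, handler_class_name)
    @handlers[event_type] = handler_class_name
  end

  # Resolve the class lazily; return nil when nothing is registered.
  def self.handler_for(event_type)
    name = @handlers[event_type]
    name ? Object.const_get(name) : nil
  end
end

# Hypothetical host handler exposing the uniform new(event:, gateway:).call interface.
class TaxIdCreatedHandler
  def initialize(event:, gateway:)
    @event = event
    @gateway = gateway
  end

  def call
    "handled #{@event[:type]}"
  end
end

EventRouter.register("customer.tax_id.created", "TaxIdCreatedHandler")

event = { type: "customer.tax_id.created" }
handler = EventRouter.handler_for(event[:type])
puts handler.new(event: event, gateway: nil).call if handler

# An unregistered event type has no handler, which the caller treats as a no-op.
p EventRouter.handler_for("customer.created")
```

&lt;p&gt;Storing class names as strings and resolving them on lookup means registration can happen before the handler class is loaded, which is one plausible reason to register names rather than classes.&lt;/p&gt;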

&lt;h2 id=&quot;five-concrete-wins&quot;&gt;Five concrete wins&lt;/h2&gt;

&lt;p&gt;The reason I find this shape worth talking about is that it pays off in five different ways, and the wins compound.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Adding event types becomes a one-class job.&lt;/strong&gt; A new file, four lines long, registered in one place. There is no integration risk because the controller does not change. There is no “what else does this method do?” anxiety because each handler does one thing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Each handler is testable in isolation.&lt;/strong&gt; Before the refactor, the controller spec needed a full request stack. After it, a handler spec exercises a plain old Ruby object instantiated with a hash. I have a directory of saved Stripe event fixtures from Phase 3 (1/4) of the same project; a handler spec reads one, calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.new(event:, gateway:).call&lt;/code&gt;, and asserts on the resulting database state. That is a unit test, not an integration test. It runs in milliseconds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Lifetime Deal fork stops being a special case.&lt;/strong&gt; Before: there was a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;checkout_lifetime?&lt;/code&gt; predicate baked into the controller, with its own branching. After: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CheckoutSessionCompletedHandler&lt;/code&gt; examines &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mode&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;metadata.access_type&lt;/code&gt; and forks internally. The controller does not know LTDs exist. The router does not know LTDs exist. Only the handler that needs to know, knows. That is the kind of thing that lets you delete the LTD feature later – if product changes its mind – by deleting one file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Async dispatch becomes a config flip, not a code rewrite.&lt;/strong&gt; I shipped a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ProcessEventJob&lt;/code&gt; that takes the same &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(gateway:, event_data:)&lt;/code&gt; arguments and calls the same router. The controller checks &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Billing.configuration.process_webhooks_async&lt;/code&gt; and either runs the handler in the request thread or enqueues the job. Stripe recommends returning a 2xx response quickly, before any complex logic runs; hosts who need that flip a flag. Hosts who prefer the existing transactional semantics (handler failure rolls back the WebhookEvent insert, Stripe retries) keep them. &lt;em&gt;The handler did not change.&lt;/em&gt;&lt;/p&gt;
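
&lt;p&gt;The flip described above could look roughly like the following sketch. Everything here is a stand-in written for illustration: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SketchConfig&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FakeQueue&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dispatch&lt;/code&gt;, and the handler body are assumptions, and the queue stands in for a real background job. Only the flag name and the keyword arguments are taken from the post.&lt;/p&gt;

```ruby
# Hypothetical stand-ins; only the flag name process_webhooks_async and the
# (gateway:, event_data:) arguments come from the post.
SketchConfig = Struct.new(:process_webhooks_async)

class InvoicePaidHandler
  def initialize(event:, gateway:)
    @event = event
    @gateway = gateway
  end

  def call
    "processed #{@event.fetch("type")}"
  end
end

# Stand-in for a job queue: it records what would be enqueued.
class FakeQueue
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  def enqueue(gateway:, event_data:)
    @jobs.push({ gateway: gateway, event_data: event_data })
    :enqueued
  end
end

# The only branch on the flag lives here; the handler is identical on both paths.
def dispatch(config:, queue:, handler_class:, gateway:, event_data:)
  if config.process_webhooks_async
    queue.enqueue(gateway: gateway, event_data: event_data)
  else
    handler_class.new(event: event_data, gateway: gateway).call
  end
end

event = { "type" => "invoice.paid" }
queue = FakeQueue.new

sync_result = dispatch(
  config: SketchConfig.new(false), queue: queue,
  handler_class: InvoicePaidHandler, gateway: :stripe, event_data: event
)
async_result = dispatch(
  config: SketchConfig.new(true), queue: queue,
  handler_class: InvoicePaidHandler, gateway: :stripe, event_data: event
)
```

&lt;p&gt;In this sketch the synchronous call returns the handler’s result, while the asynchronous call only records the arguments that a job would later pass to the same handler.&lt;/p&gt;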

&lt;p&gt;&lt;strong&gt;The extension point is published.&lt;/strong&gt; Hosts adding Stripe events the gem does not ship with do not have to fork the gem. They write a handler in their own codebase and call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EventRouter.register&lt;/code&gt;. This is the difference between a tool you use and a tool you have to maintain a fork of.&lt;/p&gt;

&lt;h2 id=&quot;the-trade-off-honestly&quot;&gt;The trade-off, honestly&lt;/h2&gt;

&lt;p&gt;Thirteen small files where there was one large file. That is a real cost. You now have to navigate a directory tree to read all the webhook code, and someone seeing the codebase for the first time will spend a minute orienting themselves.&lt;/p&gt;

&lt;p&gt;I think it is the right trade. Here is why.&lt;/p&gt;

&lt;p&gt;When code is in one big file, you read it linearly. When it is in thirteen small files, you read whichever file matches the case you care about. The “navigate a directory” cost is only paid by readers who need to understand the &lt;em&gt;whole system&lt;/em&gt;. Readers who only need to understand “what happens when an invoice is paid” go to one file with seven lines in it.&lt;/p&gt;

&lt;p&gt;The opposite trade – one big file – is paid every time you change anything. Every commit shows up in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git blame&lt;/code&gt; next to unrelated code. Every test has to set up state for the whole controller. Every bug fix risks breaking a sibling case. The cost is paid continuously, by everyone, forever.&lt;/p&gt;

&lt;h2 id=&quot;when-this-pattern-does-not-apply&quot;&gt;When this pattern does not apply&lt;/h2&gt;

&lt;p&gt;If your controller has two event types and they are stable forever, leave it alone. The refactor’s value is in &lt;em&gt;case count&lt;/em&gt; and &lt;em&gt;change frequency&lt;/em&gt;. With two cases, the inheritance tree adds more cognitive load than it removes.&lt;/p&gt;

&lt;p&gt;The breakeven I have seen empirically is somewhere around five cases or “I expect this to grow.” Below that, a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;case&lt;/code&gt; statement is fine. Above that, the per-case classes start paying for themselves.&lt;/p&gt;

&lt;p&gt;The other condition is that the cases must share a &lt;em&gt;similar shape&lt;/em&gt;. Webhook handlers do: each takes an event hash, optionally upserts something, and publishes a canonical event. That uniformity is what makes a registry possible. If your “cases” each take different inputs and do different things, you do not have a Strategy problem – you have a routing problem at the wrong layer.&lt;/p&gt;

&lt;h2 id=&quot;the-generalisable-lesson&quot;&gt;The generalisable lesson&lt;/h2&gt;

&lt;p&gt;The Single Responsibility Principle scales by case count. A controller with one verb has one responsibility. A controller with seven verbs has seven, and that does not feel like a problem until your case count grows enough that the verbs start interfering with each other.&lt;/p&gt;

&lt;p&gt;When you find yourself adding the &lt;em&gt;Nth&lt;/em&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;when&lt;/code&gt; to a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;case&lt;/code&gt; statement, or the &lt;em&gt;Nth&lt;/em&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;if&lt;/code&gt; to a method that is already long, ask whether the cases should be classes. Not always – it costs more files. But often enough that the question is worth asking every time.&lt;/p&gt;

&lt;p&gt;Three follow-on principles fall out of this:&lt;/p&gt;

&lt;p&gt;When the cases share most of the work and differ in constants, &lt;strong&gt;Template Method&lt;/strong&gt; keeps the shared code in one place.&lt;/p&gt;

&lt;p&gt;When the cases need to be looked up by a runtime value (an event type, a strategy name, a content type), &lt;strong&gt;Registry&lt;/strong&gt; is the seam that lets you add cases without editing the dispatcher.&lt;/p&gt;

&lt;p&gt;When the host application needs to add cases that the framework never anticipated, &lt;strong&gt;publish the extension point&lt;/strong&gt;. A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;register&lt;/code&gt; method on a public module is worth more than any documentation telling people how to monkey-patch.&lt;/p&gt;

&lt;p&gt;The webhook controller is just the example. The shape is everywhere.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;The code in this post is from &lt;a href=&quot;https://github.com/Davidslv/seams&quot;&gt;Seams&lt;/a&gt;, an open-source gem that scaffolds modular Rails engines. The Billing engine ships the full handler hierarchy, the registry, and the opt-in async job, ready to use in your host application. If you find yourself building this pattern by hand, you might save a few hours.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you want the longer story on building modular Rails applications — boundaries, engines, testing, and the trade-offs — that is what &lt;a href=&quot;/modular-rails/&quot;&gt;Modular Rails: Architecture for the Long Game&lt;/a&gt; covers in depth. &lt;a href=&quot;/books/modular-rails/&quot;&gt;Read it free on the web&lt;/a&gt;, or get the &lt;a href=&quot;https://www.amazon.com/dp/1066649405&quot;&gt;paperback&lt;/a&gt; (&lt;a href=&quot;https://www.amazon.co.uk/dp/1066649405&quot;&gt;UK&lt;/a&gt;).&lt;/em&gt;&lt;/p&gt;
</description>
        <pubDate>Tue, 16 Jun 2026 00:00:00 +0000</pubDate>
        <link>https://davidslv.uk/2026/06/16/from-one-controller-to-thirteen-handlers.html</link>
        <guid isPermaLink="true">https://davidslv.uk/2026/06/16/from-one-controller-to-thirteen-handlers.html</guid>
        
        
      </item>
    
      <item>
        <title>When Rails Engines Are the Wrong Tool</title>
        <description>&lt;p&gt;I have spent an entire book making the case for Rails engines. Now I am going to tell you when &lt;em&gt;not&lt;/em&gt; to use them.&lt;/p&gt;

&lt;p&gt;This is not a hedge. It is honesty. Every architectural tool has a cost, and engines are no exception. Using them when they are not warranted creates overhead that slows your team down rather than speeding them up. Knowing when &lt;em&gt;not&lt;/em&gt; to reach for an engine is just as important as knowing how to build one.&lt;/p&gt;

&lt;h2 id=&quot;the-decision-flowchart&quot;&gt;The Decision Flowchart&lt;/h2&gt;

&lt;p&gt;Before introducing an engine, run through this:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;flowchart TD
    A[&quot;Do you have more than&amp;lt;br/&amp;gt;10-15 models?&quot;] --&amp;gt;|No| B[&quot;Engines are overkill.&amp;lt;br/&amp;gt;Use namespaces.&quot;]
    A --&amp;gt;|Yes| C[&quot;Do you have more than&amp;lt;br/&amp;gt;one team or domain area?&quot;]

    C --&amp;gt;|No| D[&quot;Consider namespaces&amp;lt;br/&amp;gt;and conventions first.&quot;]
    C --&amp;gt;|Yes| E[&quot;Do you have clear&amp;lt;br/&amp;gt;domain boundaries?&quot;]

    E --&amp;gt;|No| F[&quot;Identify boundaries first.&amp;lt;br/&amp;gt;Engines can wait.&quot;]
    E --&amp;gt;|Yes| G[&quot;Will the engine have&amp;lt;br/&amp;gt;its own models and&amp;lt;br/&amp;gt;business logic?&quot;]

    G --&amp;gt;|No| H[&quot;A plain Ruby gem&amp;lt;br/&amp;gt;or concern may suffice.&quot;]
    G --&amp;gt;|Yes| I[&quot;An engine is&amp;lt;br/&amp;gt;likely the right choice.&quot;]

    style A fill:#e8a838,stroke:#b07828,color:#fff
    style B fill:#d9654a,stroke:#8a3a2c,color:#fff
    style C fill:#e8a838,stroke:#b07828,color:#fff
    style D fill:#4a90d9,stroke:#2c5f8a,color:#fff
    style E fill:#e8a838,stroke:#b07828,color:#fff
    style F fill:#4a90d9,stroke:#2c5f8a,color:#fff
    style G fill:#e8a838,stroke:#b07828,color:#fff
    style H fill:#4a90d9,stroke:#2c5f8a,color:#fff
    style I fill:#27ae60,stroke:#1e8449,color:#fff
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Notice how many paths lead away from “use an engine.” That is intentional. Engines should be the answer to a specific problem, not the default structure for every Rails application.&lt;/p&gt;

&lt;h2 id=&quot;applications-that-are-too-small&quot;&gt;Applications That Are Too Small&lt;/h2&gt;

&lt;p&gt;If your application has fewer than 10-15 models, engines almost certainly add more overhead than value. The ceremony of gemspecs, dummy apps, mountable routes, and inter-engine dependency management is not justified when the entire codebase fits comfortably in one developer’s head.&lt;/p&gt;

&lt;p&gt;For small applications, namespaces give you most of the organisational benefit at zero cost:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# app/models/billing/invoice.rb&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;module&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;Billing&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Invoice&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;ApplicationRecord&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# All the billing logic, clearly namespaced&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# app/mailers/notifications/mailer.rb&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;module&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;Notifications&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Mailer&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;ApplicationMailer&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# All the notification logic, clearly namespaced&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This communicates domain boundaries to developers without introducing any infrastructure. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Billing::&lt;/code&gt; prefix tells you where this class belongs. The directory structure mirrors the namespace. It is not enforced, but it is clear.&lt;/p&gt;

&lt;h2 id=&quot;teams-that-are-too-small&quot;&gt;Teams That Are Too Small&lt;/h2&gt;

&lt;p&gt;A team of two or three Software Engineers working on a single application does not need engine boundaries. The communication overhead of a small team is low enough that conventions and code review are sufficient to maintain boundaries.&lt;/p&gt;

&lt;p&gt;Engines shine when teams are large enough that not everyone can hold the full codebase in their head. If every developer on your team already knows every model, every controller, and every service object, the boundary enforcement that engines provide is solving a problem you do not have.&lt;/p&gt;

&lt;p&gt;The threshold is not precise, but in my experience, engines start paying for themselves when you have 5+ developers working on a codebase with 30+ models across at least 2-3 distinct domain areas.&lt;/p&gt;

&lt;h2 id=&quot;the-honest-calculation&quot;&gt;The Honest Calculation&lt;/h2&gt;

&lt;p&gt;Before introducing engines, ask three questions:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;What is the actual cost of the problem we are solving?&lt;/strong&gt; Not the theoretical cost. The actual cost. How many hours per month does your team lose to cross-domain coupling? How many production incidents were caused by unexpected dependencies? If you cannot point to specific, recent pain, the problem may not justify the solution.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;What is the ongoing cost of the engine infrastructure?&lt;/strong&gt; Each engine needs its own gemspec, its own test setup, its own factories, its own migration strategy. Someone has to maintain that infrastructure. That someone is usually the most senior developer on the team, which means your most expensive resource is spending time on plumbing.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Is there a cheaper solution that gets us 80% of the benefit?&lt;/strong&gt; Namespaces, conventions, Packwerk, or even just better code review might address the boundary problem without the full weight of engines.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;the-premature-boundary-trap&quot;&gt;The Premature Boundary Trap&lt;/h2&gt;

&lt;p&gt;The most common mistake is drawing boundaries before you understand the domain. You create an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;engines/billing&lt;/code&gt; and an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;engines/shipping&lt;/code&gt; on day one, then discover three months later that billing and shipping share a concept – “order line items” – that does not fit neatly into either engine.&lt;/p&gt;

&lt;p&gt;Now you have three bad options: duplicate the concept, create a third engine for shared logic, or collapse the boundary you just built. All of them are expensive. The premature boundary cost you more than having no boundary at all.&lt;/p&gt;

&lt;p&gt;The antidote is simple: wait. Let the domain reveal its boundaries through co-change patterns (as discussed in Chapter 9 of the book) rather than guessing them up front. Six months of git history is a better domain expert than any whiteboard session.&lt;/p&gt;
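
&lt;p&gt;One way to look for co-change patterns is to count how often two namespaces are touched in the same commit. The sketch below is an assumption-laden illustration, not the book’s method: it assumes a layout of the form &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;app/LAYER/DOMAIN/file.rb&lt;/code&gt;, the helper names are hypothetical, and the sample log is made up so the example runs without a repository.&lt;/p&gt;

```ruby
# Hypothetical helpers; assumes paths shaped like app/models/billing/invoice.rb,
# where the third segment names the domain.
def domain_of(path)
  parts = path.split("/")
  return nil unless parts.first == "app" && parts.size >= 4

  parts[2]
end

# Split the output of a command such as
#   git log --since="6 months ago" --name-only --pretty=format:--commit--
# into one array of changed paths per commit.
def parse_git_log(output)
  output.split("--commit--")
        .map { |chunk| chunk.lines.map { |line| line.strip }.reject { |line| line.empty? } }
        .reject { |paths| paths.empty? }
end

# Count how many commits touched each pair of domains together.
def co_change_counts(commits)
  counts = Hash.new(0)
  commits.each do |paths|
    domains = paths.map { |path| domain_of(path) }.compact.uniq.sort
    domains.combination(2).each { |pair| counts[pair] += 1 }
  end
  counts
end

# Made-up sample standing in for real git output.
sample_log = [
  "--commit--",
  "app/models/billing/invoice.rb",
  "app/models/shipping/parcel.rb",
  "",
  "--commit--",
  "app/models/billing/invoice.rb",
  "app/services/shipping/rate.rb",
  "",
  "--commit--",
  "app/models/billing/plan.rb"
].join("\n")

commits = parse_git_log(sample_log)
counts = co_change_counts(commits)
counts.each { |pair, n| puts "#{pair.join(" + ")}: #{n} shared commits" }
```

&lt;p&gt;Pairs that keep changing together are candidates for living inside the same boundary; pairs that rarely do are candidates for separation. The counts are a prompt for discussion, not a verdict.&lt;/p&gt;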

&lt;h2 id=&quot;signs-you-have-over-modularised&quot;&gt;Signs You Have Over-Modularised&lt;/h2&gt;

&lt;p&gt;If you have already introduced engines, watch for these signals that you have gone too far:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;graph TB
    subgraph &quot;Signs of Over-Modularisation&quot;
        S1[&quot;Every PR touches&amp;lt;br/&amp;gt;3+ engines&quot;]
        S2[&quot;Engine interfaces are&amp;lt;br/&amp;gt;thicker than the logic&amp;lt;br/&amp;gt;behind them&quot;]
        S3[&quot;Developers spend more time&amp;lt;br/&amp;gt;on engine plumbing than&amp;lt;br/&amp;gt;on features&quot;]
        S4[&quot;Cross-engine integration&amp;lt;br/&amp;gt;tests outnumber&amp;lt;br/&amp;gt;unit tests&quot;]
        S5[&quot;New developers take&amp;lt;br/&amp;gt;longer to onboard&amp;lt;br/&amp;gt;than before engines&quot;]
    end

    S1 --&amp;gt; FIX[&quot;Consider merging&amp;lt;br/&amp;gt;those engines&quot;]
    S2 --&amp;gt; FIX
    S3 --&amp;gt; FIX
    S4 --&amp;gt; FIX
    S5 --&amp;gt; FIX

    style S1 fill:#d9654a,stroke:#8a3a2c,color:#fff
    style S2 fill:#d9654a,stroke:#8a3a2c,color:#fff
    style S3 fill:#d9654a,stroke:#8a3a2c,color:#fff
    style S4 fill:#d9654a,stroke:#8a3a2c,color:#fff
    style S5 fill:#d9654a,stroke:#8a3a2c,color:#fff
    style FIX fill:#27ae60,stroke:#1e8449,color:#fff
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If every pull request touches three or more engines, your boundaries are in the wrong place. If the interface code between engines is more complex than the domain logic inside them, you have created accidental complexity. If new developers are slower to become productive than they were before you introduced engines, the architecture is working against you.&lt;/p&gt;

&lt;p&gt;The fix is not to abandon engines entirely. It is to merge the ones that should not have been separate in the first place. Collapsing a bad boundary is not failure – it is learning.&lt;/p&gt;

&lt;h2 id=&quot;alternatives-that-might-be-enough&quot;&gt;Alternatives That Might Be Enough&lt;/h2&gt;

&lt;p&gt;Before reaching for an engine, consider whether one of these lighter-weight alternatives solves your problem:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Namespaces and directory structure.&lt;/strong&gt; Zero cost, immediate clarity. If your problem is “developers put code in the wrong place,” namespaces may be all you need.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Concerns and modules.&lt;/strong&gt; Shared behaviour extracted into mixins. Not a boundary mechanism, but effective for reducing duplication within a bounded context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Service objects.&lt;/strong&gt; Encapsulate a business operation in a single class. Good for complex workflows, but they do not create boundaries – they live inside them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Packwerk.&lt;/strong&gt; Static boundary analysis without runtime isolation. If your problem is “we want to detect boundary violations” rather than “we need hard enforcement,” Packwerk gives you most of the benefit at a fraction of the cost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Plain Ruby gems.&lt;/strong&gt; If the module has no Rails dependencies, a gem gives you complete isolation with minimal ceremony. A pricing calculator, a tax engine, a PDF generator – these are gems, not engines.&lt;/p&gt;

&lt;p&gt;Each of these tools has a place. The mature Software Engineer asks “what is the cheapest tool that solves my actual problem?” rather than “what is the most architecturally pure solution?”&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Knowing when not to use a tool is a sign of mastery, not timidity. The best architectures are not the ones with the most boundaries. They are the ones where every boundary earns its keep.&lt;/p&gt;

&lt;p&gt;Honesty about what your application actually needs – not what it might need someday – is the beginning of architectural maturity.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;This was adapted from Chapter 15 of &lt;a href=&quot;/modular-rails/&quot;&gt;Modular Rails: Architecture for the Long Game&lt;/a&gt;. The book covers the full analysis including performance overhead, boot time impact, memory considerations, and route compilation costs.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read the &lt;a href=&quot;/books/modular-rails/&quot;&gt;&lt;strong&gt;entire book free on the web&lt;/strong&gt;&lt;/a&gt; — every chapter, no paywall. Prefer print or Kindle? &lt;a href=&quot;https://www.amazon.com/dp/1066649405&quot;&gt;Amazon US&lt;/a&gt; · &lt;a href=&quot;https://www.amazon.co.uk/dp/1066649405&quot;&gt;Amazon UK&lt;/a&gt; · &lt;a href=&quot;/modular-rails/&quot;&gt;all editions &amp;amp; prices&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</description>
        <pubDate>Tue, 09 Jun 2026 00:00:00 +0000</pubDate>
        <link>https://davidslv.uk/2026/06/09/when-rails-engines-are-wrong-tool.html</link>
        <guid isPermaLink="true">https://davidslv.uk/2026/06/09/when-rails-engines-are-wrong-tool.html</guid>
        
        
      </item>
    
      <item>
        <title>Testing Strategy for a Modular Rails Application</title>
        <description>&lt;p&gt;&lt;em&gt;This is an adapted excerpt from Chapter 13 of &lt;a href=&quot;/modular-rails/&quot;&gt;Modular Rails: Architecture for the Long Game&lt;/a&gt;, my book on building maintainable Ruby on Rails applications using Rails Engines.&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Your test suite takes 40 minutes. Mine takes 4.&lt;/p&gt;

&lt;p&gt;That is not because I write fewer tests. It is because when I change billing code, I only run billing tests. When I change notification code, I only run notification tests. The full suite runs in CI, but a Software Engineer working on a single engine gets feedback in seconds, not minutes.&lt;/p&gt;

&lt;p&gt;This is the testing payoff of a modular architecture. But it does not happen automatically. Engines need a deliberate testing strategy – one that preserves isolation while catching integration failures.&lt;/p&gt;

&lt;h2 id=&quot;the-testing-pyramid-for-engines&quot;&gt;The Testing Pyramid for Engines&lt;/h2&gt;

&lt;p&gt;The testing pyramid for a modular Rails application has an extra dimension: scope.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;graph TB
    subgraph &quot;Testing Pyramid&quot;
        E2E[&quot;End-to-End Tests&amp;lt;br/&amp;gt;(host app, few)&quot;]
        INT[&quot;Integration Tests&amp;lt;br/&amp;gt;(cross-engine, some)&quot;]
        CONTRACT[&quot;Contract Tests&amp;lt;br/&amp;gt;(engine boundaries, moderate)&quot;]
        UNIT[&quot;Unit Tests&amp;lt;br/&amp;gt;(inside engine, many)&quot;]
    end

    E2E --- INT
    INT --- CONTRACT
    CONTRACT --- UNIT

    style E2E fill:#d9654a,stroke:#8a3a2c,color:#fff
    style INT fill:#e8a838,stroke:#b07828,color:#fff
    style CONTRACT fill:#4a90d9,stroke:#2c5f8a,color:#fff
    style UNIT fill:#27ae60,stroke:#1e8449,color:#fff
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The bottom of the pyramid – unit tests inside an engine – should be the vast majority of your tests. These run fast because they only load the engine, not the entire application. Contract tests verify that the interfaces between engines work correctly. Integration tests confirm that engines compose properly. End-to-end tests are few and focused on critical user journeys.&lt;/p&gt;

&lt;h2 id=&quot;the-dummy-app-and-rspec-setup&quot;&gt;The Dummy App and RSpec Setup&lt;/h2&gt;

&lt;p&gt;Every engine generated by Rails includes a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test/dummy&lt;/code&gt; (or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;spec/dummy&lt;/code&gt;) application. This is a minimal Rails app that mounts your engine, providing just enough context to run tests without loading your full application.&lt;/p&gt;

&lt;p&gt;Here is a typical &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;spec/rails_helper.rb&lt;/code&gt; for an engine:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# engines/billing/spec/rails_helper.rb&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;spec_helper&quot;&lt;/span&gt;

&lt;span class=&quot;no&quot;&gt;ENV&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;RAILS_ENV&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;test&quot;&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;File&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;expand_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;dummy/config/environment&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__dir__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nb&quot;&gt;abort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;The Rails environment is running in production mode!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Rails&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;production?&lt;/span&gt;

&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;rspec/rails&quot;&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# Load engine factories&lt;/span&gt;
&lt;span class=&quot;no&quot;&gt;Dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Billing&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Engine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;root&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;spec/factories/**/*.rb&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)].&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;no&quot;&gt;ActiveRecord&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Migration&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;maintain_test_schema!&lt;/span&gt;

&lt;span class=&quot;no&quot;&gt;RSpec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;configure&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;fixture_paths&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Billing&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Engine&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;root&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;spec/fixtures&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;use_transactional_fixtures&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kp&quot;&gt;true&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;infer_spec_type_from_file_location!&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;filter_rails_from_backtrace!&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Notice that this helper loads the dummy app, not your real application. The engine’s tests are completely self-contained. They boot in a fraction of the time because they only load billing code, not the 200 models from the rest of your application.&lt;/p&gt;

&lt;h2 id=&quot;engine-factory-setup&quot;&gt;Engine Factory Setup&lt;/h2&gt;

&lt;p&gt;Factories need special attention in a modular application. Each engine should define its own factories, and those factories should only reference models within the engine:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# engines/billing/spec/factories/invoices.rb&lt;/span&gt;
&lt;span class=&quot;no&quot;&gt;FactoryBot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;define&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;factory&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:invoice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;class: &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Billing::Invoice&quot;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;sequence&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;INV-&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;to_s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;rjust&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;0&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;amount&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;99.99&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;currency&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;GBP&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;status&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:draft&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;user_id&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Simple foreign key, no User factory dependency&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The key decision here is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;user_id { 1 }&lt;/code&gt; instead of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;association :user&lt;/code&gt;. The billing engine should not depend on a User factory from the core application. It only needs a plausible foreign key value, which works as long as the engine’s test database does not enforce a foreign key constraint on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;user_id&lt;/code&gt;. This keeps the engine’s tests truly independent.&lt;/p&gt;

&lt;h2 id=&quot;contract-tests-the-boundary-guarantee&quot;&gt;Contract Tests: The Boundary Guarantee&lt;/h2&gt;

&lt;p&gt;Contract tests verify that engines honour their interfaces. They are the most important and most underused testing pattern in modular applications.&lt;/p&gt;

&lt;p&gt;Here is a concrete example. Your billing engine expects that any “billable” object responds to certain methods:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# engines/billing/spec/contracts/billable_contract.rb&lt;/span&gt;
&lt;span class=&quot;no&quot;&gt;RSpec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;shared_examples&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;a billable&quot;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_expected&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:email&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_expected&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:billing_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_expected&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:stripe_customer_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_expected&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:billing_address&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The billing engine defines this contract. Any model that wants to be billable must pass it:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# In the host app or core engine&lt;/span&gt;
&lt;span class=&quot;no&quot;&gt;RSpec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;describe&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;User&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;it_behaves_like&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;a billable&quot;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now here is where contract tests prove their worth. Imagine you upgrade the billing engine and add a new requirement to the billable interface:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# After upgrade: billing engine v2.0&lt;/span&gt;
&lt;span class=&quot;no&quot;&gt;RSpec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;shared_examples&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;a billable&quot;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_expected&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:email&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_expected&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:billing_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_expected&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:stripe_customer_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_expected&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:billing_address&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_expected&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;respond_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:tax_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# New in v2.0&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;When the host app runs its tests, the contract test fails immediately:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Failures:

  1) User behaves like a billable is expected to respond to #tax_id
     Failure/Error: it { is_expected.to respond_to(:tax_id) }
       expected #&amp;lt;User&amp;gt; to respond to :tax_id
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The failure is clear, specific, and caught before deployment. Without contract tests, this would surface as a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NoMethodError&lt;/code&gt; in production the first time the billing engine calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tax_id&lt;/code&gt; on a user while generating an invoice.&lt;/p&gt;
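&lt;p&gt;To make the mechanism concrete, here is a minimal plain-Ruby sketch of the same check without RSpec. The constant names, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;missing_contract_methods&lt;/code&gt; helper, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DemoUser&lt;/code&gt; struct are all hypothetical stand-ins, not part of the billing engine:&lt;/p&gt;

```ruby
# Hypothetical names throughout: a plain-Ruby illustration of the contract idea.
BILLABLE_V1 = %i[email billing_name stripe_customer_id billing_address].freeze
BILLABLE_V2 = (BILLABLE_V1 + %i[tax_id]).freeze # v2.0 adds one requirement

# Returns the contract methods the object does not respond to.
def missing_contract_methods(object, contract)
  contract.reject { |name| object.respond_to?(name) }
end

# Stand-in for the host app's User model, written against the v1 contract.
DemoUser = Struct.new(:email, :billing_name, :stripe_customer_id, :billing_address)

user = DemoUser.new('ada@example.com', 'Ada Lovelace', 'cus_123', '1 High Street')

puts missing_contract_methods(user, BILLABLE_V1).inspect # prints []
puts missing_contract_methods(user, BILLABLE_V2).inspect # prints [:tax_id]
```

&lt;p&gt;The shared examples do the same job, with the advantage that the failure shows up in the host app’s normal test run rather than in a separate script.&lt;/p&gt;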

&lt;h2 id=&quot;selective-test-execution&quot;&gt;Selective Test Execution&lt;/h2&gt;

&lt;p&gt;The real speed gain comes from only running the tests that matter. Here is a script that determines which engines were affected by a change and runs only their tests:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;#!/bin/bash&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# scripts/run_affected_tests.sh&lt;/span&gt;
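&lt;span class=&quot;c&quot;&gt;# Runs tests only for engines that changed since the base branch&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# Exit on the first failing command so a failed engine suite fails the whole run&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;set&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt;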

&lt;span class=&quot;nv&quot;&gt;BASE_BRANCH&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;:-&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;CHANGED_FILES&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;git diff &lt;span class=&quot;nt&quot;&gt;--name-only&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$BASE_BRANCH&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;...HEAD&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nv&quot;&gt;AFFECTED_ENGINES&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=()&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;file &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$CHANGED_FILES&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do
  if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[[&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$file&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; engines/&lt;span class=&quot;k&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then
    &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;engine&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$file&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; | &lt;span class=&quot;nb&quot;&gt;cut&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;/&apos;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-f2&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[[&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;AFFECTED_ENGINES&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[@]&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;~ &lt;span class=&quot;s2&quot;&gt;&quot; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;engine&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then
      &lt;/span&gt;AFFECTED_ENGINES+&lt;span class=&quot;o&quot;&gt;=(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$engine&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;fi
  fi
done

if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;${#&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;AFFECTED_ENGINES&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[@]&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-eq&lt;/span&gt; 0 &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then
  &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;No engine changes detected. Running host app tests only.&quot;&lt;/span&gt;
  bundle &lt;span class=&quot;nb&quot;&gt;exec &lt;/span&gt;rspec spec/
&lt;span class=&quot;k&quot;&gt;else
  &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Affected engines: &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;AFFECTED_ENGINES&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[*]&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;engine &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;AFFECTED_ENGINES&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[@]&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do
    &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;--- Testing &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$engine&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; ---&quot;&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cd&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;engines/&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$engine&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; bundle &lt;span class=&quot;nb&quot;&gt;exec &lt;/span&gt;rspec&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;done
  &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;--- Testing host app integration ---&quot;&lt;/span&gt;
  bundle &lt;span class=&quot;nb&quot;&gt;exec &lt;/span&gt;rspec spec/
&lt;span class=&quot;k&quot;&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This script is the bridge between local development speed and CI thoroughness. Locally, a Software Engineer runs only the affected engines’ tests, followed by the host app’s specs. In CI, you can run this script for PR builds while running the full suite on merge to main.&lt;/p&gt;
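&lt;p&gt;The detection step is the part most worth unit testing. As a rough sketch, the same path-to-engine mapping can be written as a pure Ruby function; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;affected_engines&lt;/code&gt; is a hypothetical helper name, and it assumes the same &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;engines/NAME/...&lt;/code&gt; layout as the script:&lt;/p&gt;

```ruby
# Hypothetical helper: maps changed file paths to the engines that own them.
# Assumes every engine lives in engines/NAME/, as in the shell script.
def affected_engines(changed_files)
  changed_files
    .select { |path| path.start_with?('engines/') }
    .map { |path| path.split('/')[1] } # second path segment is the engine name
    .compact
    .uniq
end

changed = [
  'engines/billing/app/models/billing/invoice.rb',
  'engines/billing/spec/models/invoice_spec.rb',
  'app/models/user.rb',
  'engines/catalog/lib/catalog/engine.rb'
]

puts affected_engines(changed).inspect # prints ["billing", "catalog"]
```

&lt;p&gt;Like the shell version, this treats any path directly under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;engines/&lt;/code&gt; as an engine name, so a stray top-level file there would need filtering out.&lt;/p&gt;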

&lt;h2 id=&quot;ci-flow&quot;&gt;CI Flow&lt;/h2&gt;

&lt;p&gt;Your CI pipeline should reflect the modular structure:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;graph LR
    PR[&quot;Pull Request&quot;] --&amp;gt; DETECT[&quot;Detect Changed&amp;lt;br/&amp;gt;Engines&quot;]
    DETECT --&amp;gt; E1[&quot;Test Engine A&quot;]
    DETECT --&amp;gt; E2[&quot;Test Engine B&quot;]
    DETECT --&amp;gt; E3[&quot;Test Engine C&quot;]
    E1 --&amp;gt; INT[&quot;Integration Tests&quot;]
    E2 --&amp;gt; INT
    E3 --&amp;gt; INT
    INT --&amp;gt; MERGE[&quot;Merge&quot;]

    style PR fill:#4a90d9,stroke:#2c5f8a,color:#fff
    style DETECT fill:#e8a838,stroke:#b07828,color:#fff
    style E1 fill:#27ae60,stroke:#1e8449,color:#fff
    style E2 fill:#27ae60,stroke:#1e8449,color:#fff
    style E3 fill:#27ae60,stroke:#1e8449,color:#fff
    style INT fill:#8e44ad,stroke:#6c3483,color:#fff
    style MERGE fill:#27ae60,stroke:#1e8449,color:#fff
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The engine suites run in parallel with one another. On PR builds, only the affected engines are tested. Integration tests run once the engine tests pass. The full suite runs on merge to main.&lt;/p&gt;

&lt;h2 id=&quot;why-your-test-suite-gets-faster&quot;&gt;Why Your Test Suite Gets Faster&lt;/h2&gt;

&lt;p&gt;Let’s do the arithmetic. Suppose your monolithic test suite has 3,000 tests that take 40 minutes. You extract the application into 5 engines with roughly equal test distribution:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Each engine&lt;/strong&gt;: ~600 tests, ~8 minutes&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Integration tests&lt;/strong&gt;: ~200 tests, ~5 minutes&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Parallel engine execution&lt;/strong&gt;: 8 minutes (all 5 run simultaneously)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Total CI time&lt;/strong&gt;: 8 + 5 = &lt;strong&gt;13 minutes&lt;/strong&gt; (down from 40)&lt;/li&gt;
&lt;/ul&gt;
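&lt;p&gt;The figures above can be checked with a few lines of Ruby. This is only a sketch of the arithmetic using the illustrative numbers from the list; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ci_minutes&lt;/code&gt; is a hypothetical helper, and it ignores job start-up and queueing overhead:&lt;/p&gt;

```ruby
# Hypothetical helper: wall-clock CI time for a set of engine suites
# followed by one integration suite. Durations are in minutes.
def ci_minutes(engine_minutes, integration_minutes, parallel:)
  # In parallel, the engine phase lasts as long as the slowest engine.
  engine_phase = parallel ? engine_minutes.max : engine_minutes.sum
  engine_phase + integration_minutes
end

engines = Array.new(5, 8) # five engines, about 8 minutes each

puts ci_minutes(engines, 5, parallel: true)  # prints 13
puts ci_minutes(engines, 5, parallel: false) # prints 45
```

&lt;p&gt;The serial figure shows where the CI saving comes from: splitting the suite does not reduce the total work, it makes the work parallelisable.&lt;/p&gt;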

&lt;p&gt;But the real win is local development. A Software Engineer working on billing runs 600 tests in 8 minutes instead of 3,000 tests in 40 minutes. And because the engine boots faster (no loading 200 unrelated models), those 600 tests often run in under 4 minutes.&lt;/p&gt;

&lt;p&gt;The arithmetic only gets better as the application grows. Adding a new engine does not slow down existing engine tests. Each engine’s test time stays constant while the monolithic suite would keep growing.&lt;/p&gt;

&lt;p&gt;That is why your test suite takes 40 minutes and mine takes 4. Not cleverness. Structure.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;This was adapted from Chapter 13 of &lt;a href=&quot;/modular-rails/&quot;&gt;Modular Rails: Architecture for the Long Game&lt;/a&gt;. The book covers the full testing strategy including SimpleCov configuration, Capybara setup, database cleaning, CI YAML examples, and automated quality tools.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read the &lt;a href=&quot;/books/modular-rails/&quot;&gt;&lt;strong&gt;entire book free on the web&lt;/strong&gt;&lt;/a&gt; — every chapter, no paywall. Prefer print or Kindle? &lt;a href=&quot;https://www.amazon.com/dp/1066649405&quot;&gt;Amazon US&lt;/a&gt; · &lt;a href=&quot;https://www.amazon.co.uk/dp/1066649405&quot;&gt;Amazon UK&lt;/a&gt; · &lt;a href=&quot;/modular-rails/&quot;&gt;all editions &amp;amp; prices&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</description>
        <pubDate>Tue, 02 Jun 2026 00:00:00 +0000</pubDate>
        <link>https://davidslv.uk/2026/06/02/testing-strategy-modular-rails.html</link>
        <guid isPermaLink="true">https://davidslv.uk/2026/06/02/testing-strategy-modular-rails.html</guid>
        
        
      </item>
    
      <item>
        <title>The Modular Monolith as the Default Starting Point</title>
        <description>&lt;p&gt;&lt;em&gt;This is an adapted excerpt from Chapter 17 of &lt;a href=&quot;/modular-rails/&quot;&gt;Modular Rails: Architecture for the Long Game&lt;/a&gt;, my book on building maintainable Ruby on Rails applications using Rails Engines.&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;em&gt;“Majestic monolith. The vast majority of web applications should start here and never leave.”&lt;/em&gt;
– David Heinemeier Hansson&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The microservices conversation has been going on for over a decade now, and the industry is starting to reach a consensus that most teams arrived at too late: distributed systems are expensive, and the default starting point should be a well-structured monolith.&lt;/p&gt;

&lt;p&gt;This chapter makes the case that a modular monolith – specifically, a Rails application structured with engines – is the right default for most teams. Not because microservices are bad, but because the operational cost of distribution is almost always underestimated.&lt;/p&gt;

&lt;h2 id=&quot;the-operational-cost-nobody-talks-about&quot;&gt;The Operational Cost Nobody Talks About&lt;/h2&gt;

&lt;p&gt;Consider a simple operation: recording a payment. In a monolith, this is a method call:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# Monolith: one process, one database, one transaction&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;PaymentsController&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;ApplicationController&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;create&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kp&quot;&gt;nil&lt;/span&gt;
    &lt;span class=&quot;no&quot;&gt;ActiveRecord&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Base&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;transaction&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;create!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payment_params&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;no&quot;&gt;Invoice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;find&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;invoice_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;mark_paid!&lt;/span&gt;
      &lt;span class=&quot;no&quot;&gt;AuditLog&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;record&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:payment_created&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Enqueue the email only after the transaction has committed&lt;/span&gt;
    &lt;span class=&quot;no&quot;&gt;NotificationMailer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;payment_received&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;deliver_later&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;render&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;json: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;status: :created&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Four operations, one request, one database transaction. If any of the database writes fails, the transaction rolls back and the email is never queued. The code is straightforward to write, straightforward to test, and straightforward to debug.&lt;/p&gt;

&lt;p&gt;Now consider the same operation in a microservices architecture:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# Microservices: four services, four databases, eventual consistency&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;PaymentsController&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;ApplicationController&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;create&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;PaymentService&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;create&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payment_params&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# Synchronous call to billing service&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;BillingClient&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;mark_invoice_paid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;invoice_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;ss&quot;&gt;idempotency_key: &lt;/span&gt;&lt;span class=&quot;no&quot;&gt;SecureRandom&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;uuid&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;BillingServiceError&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;unless&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;success?&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# Asynchronous event for notification service&lt;/span&gt;
    &lt;span class=&quot;no&quot;&gt;EventBus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;publish&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;payment.created&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
      &lt;span class=&quot;ss&quot;&gt;payment_id: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;ss&quot;&gt;user_id: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;user_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;ss&quot;&gt;amount: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;amount&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# Asynchronous event for audit service&lt;/span&gt;
    &lt;span class=&quot;no&quot;&gt;EventBus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;publish&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;payment.created.audit&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
      &lt;span class=&quot;ss&quot;&gt;payment_id: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;ss&quot;&gt;action: :created&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;ss&quot;&gt;timestamp: &lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;current&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;iso8601&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;render&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;json: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;status: :created&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;rescue&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;BillingClient&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;TimeoutError&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# What do we do? Payment is recorded but invoice is not marked paid.&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Retry? Compensate? Queue for later?&lt;/span&gt;
    &lt;span class=&quot;no&quot;&gt;CompensationJob&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;perform_later&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:payment_billing_sync&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;render&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;json: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;status: :accepted&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# 202, not 201&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;rescue&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;EventBus&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;PublishError&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Payment and invoice are updated but notifications may not fire.&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Is this acceptable? Depends on the business rules.&lt;/span&gt;
    &lt;span class=&quot;no&quot;&gt;FailedEventJob&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;perform_later&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;payment.created&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;render&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;json: &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;status: :created&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The same four operations now involve network calls, serialisation, idempotency keys, timeout handling, compensation logic, and eventual consistency. The code is three times longer, but more importantly, the failure modes have multiplied. What happens when the billing service is down? What happens when the event bus loses a message? What happens when the compensation job fails?&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;graph LR
    subgraph &quot;Monolith&quot;
        M1[&quot;Controller&quot;] --&amp;gt; M2[&quot;Payment&quot;]
        M1 --&amp;gt; M3[&quot;Invoice&quot;]
        M1 --&amp;gt; M4[&quot;Mailer&quot;]
        M1 --&amp;gt; M5[&quot;AuditLog&quot;]
    end

    subgraph &quot;Microservices&quot;
        S1[&quot;Payment Service&quot;] --&amp;gt;|&quot;HTTP&quot;| S2[&quot;Billing Service&quot;]
        S1 --&amp;gt;|&quot;Event Bus&quot;| S3[&quot;Notification Service&quot;]
        S1 --&amp;gt;|&quot;Event Bus&quot;| S4[&quot;Audit Service&quot;]
        S2 --&amp;gt;|&quot;timeout?&quot;| S5[&quot;Compensation Job&quot;]
        S3 --&amp;gt;|&quot;lost message?&quot;| S6[&quot;Retry Queue&quot;]
    end

    style M1 fill:#27ae60,stroke:#1e8449,color:#fff
    style M2 fill:#27ae60,stroke:#1e8449,color:#fff
    style M3 fill:#27ae60,stroke:#1e8449,color:#fff
    style M4 fill:#27ae60,stroke:#1e8449,color:#fff
    style M5 fill:#27ae60,stroke:#1e8449,color:#fff
    style S1 fill:#e8a838,stroke:#b07828,color:#fff
    style S2 fill:#e8a838,stroke:#b07828,color:#fff
    style S3 fill:#e8a838,stroke:#b07828,color:#fff
    style S4 fill:#e8a838,stroke:#b07828,color:#fff
    style S5 fill:#d9654a,stroke:#8a3a2c,color:#fff
    style S6 fill:#d9654a,stroke:#8a3a2c,color:#fff
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Every arrow in the microservices diagram is a potential failure point. Every potential failure point needs handling code, monitoring, alerting, and runbooks.&lt;/p&gt;

&lt;h2 id=&quot;companies-that-came-back&quot;&gt;Companies That Came Back&lt;/h2&gt;

&lt;p&gt;The most compelling argument for starting with a monolith comes from companies that tried microservices and came back:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon Prime Video&lt;/strong&gt; published a case study in 2023 describing how they moved from a distributed microservices architecture to a monolith for their video quality monitoring tool – and reduced costs by 90% while improving throughput. The distributed architecture created bottlenecks at service boundaries that vanished when the code ran in a single process.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Segment&lt;/strong&gt; famously migrated from a microservices architecture back to a monolith after discovering that the operational overhead of managing 120+ microservices was consuming more engineering time than feature development. Their engineering team wrote candidly about how the microservices architecture that was supposed to enable faster development had become a tax on every team.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Istio&lt;/strong&gt;, the service mesh project, consolidated from multiple microservices into a single binary called Istiod. Their blog post explained that the microservices architecture added operational complexity without meaningful benefits at their scale.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shopify&lt;/strong&gt; – one of the largest Rails applications in the world – chose a modular monolith over microservices. They invested heavily in Packwerk and component-based architecture rather than splitting into services. Their reasoning: the cost of network boundaries was not justified by the organisational benefits.&lt;/p&gt;

&lt;h2 id=&quot;engines-as-a-stepping-stone&quot;&gt;Engines as a Stepping Stone&lt;/h2&gt;

&lt;p&gt;The modular monolith gives you the best of both worlds. You get the organisational benefits of clear boundaries – team ownership, independent development, focused testing – without the operational cost of distribution.&lt;/p&gt;

&lt;p&gt;And critically, engines preserve the option to extract services later. An engine with a clean interface can become a microservice when (and only when) the operational cost is justified by a genuine need.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;graph LR
    A[&quot;Monolith&amp;lt;br/&amp;gt;(everything in app/)&quot;] --&amp;gt;|&quot;Step 1:&amp;lt;br/&amp;gt;Add structure&quot;| B[&quot;Modular Monolith&amp;lt;br/&amp;gt;(engines)&quot;]
    B --&amp;gt;|&quot;Step 2:&amp;lt;br/&amp;gt;Only if needed&quot;| C[&quot;Selective Extraction&amp;lt;br/&amp;gt;(one engine becomes&amp;lt;br/&amp;gt;a service)&quot;]
    C --&amp;gt;|&quot;Step 3:&amp;lt;br/&amp;gt;Rarely needed&quot;| D[&quot;Distributed System&amp;lt;br/&amp;gt;(multiple services)&quot;]

    style A fill:#d9654a,stroke:#8a3a2c,color:#fff
    style B fill:#27ae60,stroke:#1e8449,color:#fff
    style C fill:#e8a838,stroke:#b07828,color:#fff
    style D fill:#4a90d9,stroke:#2c5f8a,color:#fff
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Most applications never get past step 2. And that is perfectly fine. The goal is not to arrive at microservices. The goal is to have a codebase that is maintainable, testable, and adaptable to whatever comes next.&lt;/p&gt;

&lt;h2 id=&quot;the-decision-framework&quot;&gt;The Decision Framework&lt;/h2&gt;

&lt;p&gt;When the microservices conversation comes up – and it will – use this framework:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;flowchart TD
    A[&quot;Do you have a scaling problem&amp;lt;br/&amp;gt;that cannot be solved with&amp;lt;br/&amp;gt;vertical scaling?&quot;] --&amp;gt;|Yes| B[&quot;Is the problem isolated&amp;lt;br/&amp;gt;to a specific domain?&quot;]
    A --&amp;gt;|No| C[&quot;Stay with&amp;lt;br/&amp;gt;modular monolith&quot;]

    B --&amp;gt;|Yes| D[&quot;Do you have the team&amp;lt;br/&amp;gt;and infrastructure to&amp;lt;br/&amp;gt;operate a distributed system?&quot;]
    B --&amp;gt;|No| C

    D --&amp;gt;|Yes| E[&quot;Extract that one&amp;lt;br/&amp;gt;domain as a service&quot;]
    D --&amp;gt;|No| F[&quot;Invest in infrastructure&amp;lt;br/&amp;gt;first, extract later&quot;]

    style A fill:#e8a838,stroke:#b07828,color:#fff
    style B fill:#e8a838,stroke:#b07828,color:#fff
    style C fill:#27ae60,stroke:#1e8449,color:#fff
    style D fill:#e8a838,stroke:#b07828,color:#fff
    style E fill:#4a90d9,stroke:#2c5f8a,color:#fff
    style F fill:#8e44ad,stroke:#6c3483,color:#fff
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The framework is deliberately conservative. The first two “No” answers keep you on the modular monolith, because the default should be the simpler architecture; the third tells you to build the operational capability before you extract anything. You only move to a distributed system when you have a specific, measurable problem that cannot be solved any other way, and the team and infrastructure to support it.&lt;/p&gt;
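
&lt;p&gt;As a minimal sketch, the flowchart can be read as three guard clauses in plain Ruby. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ExtractionCheck&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recommend&lt;/code&gt; are hypothetical, invented for illustration, and the three booleans stand in for judgements a team would have to make from its own measurements.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical names: ExtractionCheck and recommend are illustrative only,
# not part of Rails, Packwerk, or any gem.
ExtractionCheck = Struct.new(
  :needs_horizontal_scaling, # scaling problem that vertical scaling cannot solve?
  :isolated_to_one_domain,   # is the problem confined to a specific domain?
  :can_operate_distributed,  # team and infrastructure ready for a distributed system?
  keyword_init: true
)

# Walks the three questions in the same order as the flowchart.
def recommend(check)
  return :stay_with_modular_monolith unless check.needs_horizontal_scaling
  return :stay_with_modular_monolith unless check.isolated_to_one_domain
  return :invest_in_infrastructure_first unless check.can_operate_distributed

  :extract_that_one_domain
end

check = ExtractionCheck.new(
  needs_horizontal_scaling: true,
  isolated_to_one_domain: true,
  can_operate_distributed: false
)
puts recommend(check) # prints invest_in_infrastructure_first
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The order of the guards mirrors the diagram: extraction is only reachable when all three answers are yes, and every earlier exit leaves the codebase in one deployable unit.&lt;/p&gt;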

&lt;p&gt;Start with a modular monolith. Structure it well. Extract when – and only when – the evidence demands it.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;This was adapted from Chapter 17 of &lt;a href=&quot;/modular-rails/&quot;&gt;Modular Rails: Architecture for the Long Game&lt;/a&gt;. The book covers the full microservices question including network latency, debugging, data consistency, and the complete decision framework.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;For the bigger picture — engines, Packwerk, data ownership and the full set of trade-offs in one place — see &lt;a href=&quot;/modular-monolith-rails/&quot;&gt;The Modular Monolith in Rails&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read the &lt;a href=&quot;/books/modular-rails/&quot;&gt;&lt;strong&gt;entire book free on the web&lt;/strong&gt;&lt;/a&gt; — every chapter, no paywall. Prefer print or Kindle? &lt;a href=&quot;https://www.amazon.com/dp/1066649405&quot;&gt;Amazon US&lt;/a&gt; · &lt;a href=&quot;https://www.amazon.co.uk/dp/1066649405&quot;&gt;Amazon UK&lt;/a&gt; · &lt;a href=&quot;/modular-rails/&quot;&gt;all editions &amp;amp; prices&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</description>
        <pubDate>Tue, 26 May 2026 00:00:00 +0000</pubDate>
        <link>https://davidslv.uk/2026/05/26/modular-monolith-default-starting-point.html</link>
        <guid isPermaLink="true">https://davidslv.uk/2026/05/26/modular-monolith-default-starting-point.html</guid>
        
        
      </item>
    
      <item>
        <title>Spec is the Artefact</title>
        <description>&lt;p&gt;A passing test tells you the implementation is correct. The second-order question — was the work behind this code the work we meant to do — is the one &lt;a href=&quot;https://davidslv.uk/2026/05/17/comprehension-debt.html&quot;&gt;comprehension debt&lt;/a&gt; and &lt;a href=&quot;https://davidslv.uk/2026/05/21/the-perception-gap.html&quot;&gt;the perception gap&lt;/a&gt; have both been circling. This post is the third leg.&lt;/p&gt;

&lt;p&gt;The argument of the first two posts was diagnostic. Teams using AI-assisted code accrue a gap between what exists in the codebase and what anyone on the team understands; the mechanism that hides the gap from inside is a perception failure on both sides of the review. Neither post offered a remedy. The closing of the second one promised this one would.&lt;/p&gt;

&lt;p&gt;There are several remedies that would be defensible at this point — more local architecture, mentor-model review, a financial reframing of the conversation with leadership. They are all real moves. The argument here is for the one I have found most leverage in: change what the primary artefact is.&lt;/p&gt;

&lt;h2 id=&quot;the-wrong-primary-artefact&quot;&gt;The wrong primary artefact&lt;/h2&gt;

&lt;p&gt;The implicit model for most code review, AI-assisted or not, is that code is the artefact. The author produces it. The reviewer evaluates it. Tests guard it. CI signs off on it. Everything is oriented to the diff.&lt;/p&gt;

&lt;p&gt;This worked, more or less, when the human writing the code carried the intent in their head. Code was a lossy projection of intent, and the reviewer could partially reconstruct the projection because both author and reviewer had been trained to read code as if it spoke for the work behind it. The PR description filled in what the code didn’t say.&lt;/p&gt;

&lt;p&gt;When the AI produces the code, the human author no longer carries the intent in the same way. The intent was in the prompt, in the chat session, in the back-and-forth — most of it gone by the time the diff lands in the reviewer’s queue. What remains is what the &lt;a href=&quot;https://davidslv.uk/2026/05/21/the-perception-gap.html&quot;&gt;previous post called an &lt;em&gt;appearance signal&lt;/em&gt;&lt;/a&gt; — visible enough to be trusted, opaque about whether the work behind it cohered. It looks like the work has been done. It cannot be inspected to confirm the work was done.&lt;/p&gt;

&lt;p&gt;The PR description is in the same category. A thoughtful PR description is the human side of a structured reasoning trace. The reviewer is now holding two appearance signals and no ground truth.&lt;/p&gt;

&lt;h2 id=&quot;spec-is-the-artefact&quot;&gt;Spec is the artefact&lt;/h2&gt;

&lt;p&gt;The structural move that follows is small and unfashionable. Stop treating the code as the primary artefact. Treat the &lt;em&gt;specification&lt;/em&gt; as the primary artefact — the contract the implementation is meant to honour, written by a human before the AI touches anything. Code becomes implementation detail. Review becomes verification against the contract.&lt;/p&gt;

&lt;p&gt;The framing has a forty-year lineage. Bertrand Meyer’s &lt;a href=&quot;https://se.inf.ethz.ch/~meyer/publications/computer/contract.pdf&quot;&gt;&lt;em&gt;Applying Design by Contract&lt;/em&gt;&lt;/a&gt; (IEEE Computer, 1992) made the same structural argument for Eiffel: a routine is the contract it promises to honour, the implementation is detail. What is new is what the cost of skipping the contract has become. Under human authoring, the cost was future debugging. Under AI authoring, the cost is that the reviewer cannot tell whether the work was done at all.&lt;/p&gt;

&lt;p&gt;The current vocabulary is &lt;strong&gt;Spec-Driven Development&lt;/strong&gt;, the framing The Serious CTO uses in &lt;a href=&quot;https://www.youtube.com/watch?v=3o2SlgX9BhE&quot;&gt;the talk&lt;/a&gt; this trilogy has drawn vocabulary from. GitHub’s &lt;a href=&quot;https://github.com/github/spec-kit&quot;&gt;Spec Kit&lt;/a&gt; calls the same thing a &lt;em&gt;project constitution&lt;/em&gt;: non-negotiable principles around code quality, testing, user experience, and performance, baked in before generation begins. Caporusso &amp;amp; Perdue (&lt;a href=&quot;https://iscap.us/proceedings/2025/pdf/6416.pdf&quot;&gt;ISCAP 2025&lt;/a&gt;) compared direct prompting against requirements-first prompting across seven LLMs and found that structured requirements improved code quality — early empirical support for a move whose case is still mostly programmatic.&lt;/p&gt;

&lt;p&gt;The point is not new tooling. The point is that the artefact a reviewer is asked to evaluate now sits above the code, in a layer the human author still authored. The author still carries intent — into the spec, where it is durable, rather than into the chat session, where it is not.&lt;/p&gt;

&lt;h2 id=&quot;why-this-compresses-the-perception-gap&quot;&gt;Why this compresses the perception gap&lt;/h2&gt;

&lt;p&gt;The perception gap was structural. Author and reviewer held different artefacts in working memory. The author held the description, the boundaries, the tests they wrote. The reviewer held the diff. Neither could feel what they were costing the other.&lt;/p&gt;

&lt;p&gt;When the spec is the artefact, this asymmetry compresses. The author and the reviewer are both oriented to the same object — the contract. The author wrote it; the reviewer is evaluating whether the implementation honours it. The cognitive load on the reviewer is bounded by spec size, not diff size. An eighteen-hundred-line implementation of a one-page spec is reviewable in a way an eighteen-hundred-line diff with a thoughtful description is not.&lt;/p&gt;

&lt;p&gt;This also removes the silent disagreement about what &lt;em&gt;the work&lt;/em&gt; is. The author and the reviewer can disagree about whether the implementation honours the contract — that is a productive disagreement, scoped to a shared artefact. The perception gap depended on them disagreeing about what the work had been at all.&lt;/p&gt;

&lt;h2 id=&quot;what-this-looks-like-in-practice&quot;&gt;What this looks like in practice&lt;/h2&gt;

&lt;p&gt;“Spec” is doing several jobs at once here, and it is worth being honest about it. Some specs are prose contracts a human writes out in English. Some are executable. Some are inferred from a type system. Each lives at a different level of formality. What they share, under AI authoring, is that they are independently checkable artefacts of intent — the only parts of the loop that are not appearance-of-thought signals.&lt;/p&gt;

&lt;p&gt;In a Ruby codebase, the move is mostly elevating instruments the team already has.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;RSpec and the discipline behind it.&lt;/strong&gt; A well-written test is a spec of behaviour at the granularity the team has chosen. The shift is in workflow order — write the contract as a test before the AI drafts the implementation, then accept the implementation only when it honours the test. This is TDD without nostalgia; it works at LLM speed because the spec lives somewhere the AI cannot edit during generation.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Property-based testing.&lt;/strong&gt; &lt;a href=&quot;https://dl.acm.org/doi/10.1145/351240.351266&quot;&gt;Claessen &amp;amp; Hughes (2000)&lt;/a&gt; framed properties as specifications of permitted output shapes — the implementation is graded by random sampling against the property, not by the cases the author happened to enumerate. Where example-based tests check the cases you remembered, property tests check the cases the AI might have missed.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Consumer-driven contracts at service boundaries.&lt;/strong&gt; &lt;a href=&quot;https://martinfowler.com/articles/consumerDrivenContracts.html&quot;&gt;Ian Robinson’s 2006 framing&lt;/a&gt;, and more than a decade of Pact in production, capture exactly the properties contracts need under AI authoring: durable, executable, and owned by both sides. A boundary contract is a spec the implementation cannot edit on its way through.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Architectural decision records.&lt;/strong&gt; &lt;a href=&quot;https://www.cognitect.com/blog/2011/11/15/documenting-architecture-decisions&quot;&gt;Michael Nygard’s 2011 piece&lt;/a&gt; already named the problem the trilogy is circling: &lt;em&gt;“one of the hardest things to track during the life of a project is the motivation behind certain decisions.”&lt;/em&gt; An ADR is the spec for the next change, written by the team that owns the consequences.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Type systems where you have them.&lt;/strong&gt; Sorbet and RBS in Ruby, or static typing in any other language, are specs the compiler verifies for free. Mündler et al. (&lt;a href=&quot;https://arxiv.org/abs/2504.09246&quot;&gt;PLDI 2025&lt;/a&gt;) found that in TypeScript, 94% of compilation errors in LLM-generated code are type-check failures — useful evidence that the instrument pays off where it exists, and a useful caution that dynamic languages do not get this protection for free and have to recover the discipline elsewhere.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of these are new. What is new is the framing. Under code-as-artefact, these instruments are quality-of-life. Under spec-as-artefact, they are structurally load-bearing: they are the parts of the loop in which intent stays durable.&lt;/p&gt;
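
&lt;p&gt;To make the property-based item in the list concrete, here is a rough sketch in plain Ruby with no gems. Everything in it is an assumption made for illustration: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CONTRACT&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;split_pence&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;violations&lt;/code&gt; are hypothetical names, and the hand-rolled sampling loop stands in for what a real property-testing library would do with shrinking and better generators. The contract is written first; the implementation is whatever gets drafted afterwards.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# The contract: written by a human before any implementation exists.
# Each property receives the inputs and the output and returns true or false.
CONTRACT = {
  shares_sum_to_total: lambda { |total, _count, shares| shares.sum == total },
  one_share_per_person: lambda { |_total, count, shares| shares.length == count },
  shares_differ_by_at_most_one: lambda { |_total, _count, shares|
    (shares.max - shares.min).between?(0, 1)
  }
}.freeze

# A candidate implementation (hypothetical): split an amount in pence
# into count shares that are as even as possible.
def split_pence(total, count)
  base, remainder = total.divmod(count)
  Array.new(remainder, base + 1) + Array.new(count - remainder, base)
end

# Grades the implementation by random sampling against every property.
# Returns a list of [property_name, total, count] for each failure.
def violations(samples: 200, seed: 42)
  rng = Random.new(seed)
  failures = []
  samples.times do
    total = rng.rand(0..100_000)
    count = rng.rand(1..12)
    shares = split_pence(total, count)
    CONTRACT.each do |name, property|
      failures.push([name, total, count]) unless property.call(total, count, shares)
    end
  end
  failures
end

puts violations.empty? ? 'contract honoured' : 'contract violated'
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The reviewer’s job in this shape is to read the three properties, not the body of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;split_pence&lt;/code&gt;. If the implementation is regenerated, the contract stays where it was.&lt;/p&gt;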

&lt;h2 id=&quot;what-this-doesnt-fix&quot;&gt;What this doesn’t fix&lt;/h2&gt;

&lt;p&gt;The obvious objection is that the AI will draft the spec too, and the perception gap re-enters one layer up. The objection is real. Zietsman (&lt;a href=&quot;https://arxiv.org/abs/2603.25773&quot;&gt;2026&lt;/a&gt;) calls this the correlated-failure problem: without an external reference, the generating agent and the reviewing agent share the same training distribution, and “the review checks code against itself, not against intent.”&lt;/p&gt;

&lt;p&gt;The defence is that a specification a human authored — even one the AI drafted and the human pruned — sits in a different cognitive position than code the human glanced at. The specs worth keeping are the ones the author can answer questions about. That discipline is exactly what the previous post called comprehension, surfaced one layer up.&lt;/p&gt;

&lt;p&gt;The honest counter is that the same deadline pressure that produced comprehension debt will erode spec-first discipline too. Kuutila et al.’s &lt;a href=&quot;https://arxiv.org/abs/1901.05771&quot;&gt;systematic review of time pressure in software engineering&lt;/a&gt; found that quality assurance is the practice that bends first under load — and spec-first work is QA upstream of itself. The argument is not that spec-first survives the pressure without institutional support. It is that under spec-first, the practice that bends first is visible, named, and budgetable.&lt;/p&gt;

&lt;h2 id=&quot;the-trilogy&quot;&gt;The trilogy&lt;/h2&gt;

&lt;p&gt;I have started writing specs first on the work I review. I have not finished. The discipline is harder than the writing it replaces, because it forces decisions I was making by inference earlier — what is in scope, what is out of scope, what counts as success. The work surfaces.&lt;/p&gt;

&lt;p&gt;Three sentences for the trilogy. Comprehension debt is what AI-assisted teams accrue when generation outruns understanding. The perception gap is what hides the debt from inside the team. The structural response is to move the artefact one layer up — to write the contract first, and to commit to keeping it there.&lt;/p&gt;

&lt;p&gt;This is one shape of answer. There are others; the architectural ones and the financial ones are both real. I have argued for this one because it makes the smallest change to the workflow and the largest change to what is being reviewed. The discipline is fragile. It is also the only one I have found that compresses the gap.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;Series: &lt;a href=&quot;https://davidslv.uk/2026/05/17/comprehension-debt.html&quot;&gt;Comprehension Debt&lt;/a&gt; · &lt;a href=&quot;https://davidslv.uk/2026/05/21/the-perception-gap.html&quot;&gt;The Perception Gap&lt;/a&gt; · Spec is the Artefact (this post).&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Sources: &lt;a href=&quot;https://se.inf.ethz.ch/~meyer/publications/computer/contract.pdf&quot;&gt;Meyer — Applying Design by Contract (IEEE Computer, 1992)&lt;/a&gt; · &lt;a href=&quot;https://arxiv.org/abs/2504.09246&quot;&gt;Mündler et al. — Type-Constrained Code Generation with Language Models (PLDI 2025)&lt;/a&gt; · &lt;a href=&quot;https://iscap.us/proceedings/2025/pdf/6416.pdf&quot;&gt;Caporusso &amp;amp; Perdue — ISCAP 2025&lt;/a&gt; · &lt;a href=&quot;https://dl.acm.org/doi/10.1145/351240.351266&quot;&gt;Claessen &amp;amp; Hughes — QuickCheck (ICFP 2000)&lt;/a&gt; · &lt;a href=&quot;https://martinfowler.com/articles/consumerDrivenContracts.html&quot;&gt;Robinson — Consumer-Driven Contracts (2006)&lt;/a&gt; · &lt;a href=&quot;https://www.cognitect.com/blog/2011/11/15/documenting-architecture-decisions&quot;&gt;Nygard — Documenting Architecture Decisions (2011)&lt;/a&gt; · &lt;a href=&quot;https://arxiv.org/abs/2603.25773&quot;&gt;Zietsman — The Specification as Quality Gate (2026)&lt;/a&gt; · &lt;a href=&quot;https://arxiv.org/abs/1901.05771&quot;&gt;Kuutila et al. — Time Pressure in Software Engineering (2020)&lt;/a&gt; · GitHub’s &lt;a href=&quot;https://github.com/github/spec-kit&quot;&gt;Spec Kit&lt;/a&gt;. Vocabulary from The Serious CTO’s video on &lt;a href=&quot;https://www.youtube.com/watch?v=3o2SlgX9BhE&quot;&gt;the hidden cost of AI coding&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</description>
        <pubDate>Fri, 22 May 2026 00:00:00 +0000</pubDate>
        <link>https://davidslv.uk/2026/05/22/spec-is-the-artefact.html</link>
        <guid isPermaLink="true">https://davidslv.uk/2026/05/22/spec-is-the-artefact.html</guid>
        
        
      </item>
    
      <item>
        <title>The Perception Gap</title>
        <description>&lt;p&gt;An engineer opens a pull request. It is eighteen hundred lines across roughly fifteen files. The description has the kind of structure you write when you mean it — what changed, why, what you would push back on if you were the reviewer. They have thought about it. They feel organised. They are organised, from where they’re sitting.&lt;/p&gt;

&lt;p&gt;The reviewer opens it and is being asked to evaluate two or three architectural decisions and several new features in one sitting. The diff is too large to hold in working memory. The PR description helps, but only at the level it summarises — the line-by-line judgement is still on the reviewer. They are also organised, in the way the work has arrived at their desk.&lt;/p&gt;

&lt;p&gt;Both of them are right about how organised they are. Neither of them is in a position to feel what they are costing the other.&lt;/p&gt;

&lt;p&gt;This is the perception gap. It runs through teams building seriously with AI-assisted code, and from inside it is hard to see.&lt;/p&gt;

&lt;h2 id=&quot;what-the-research-says&quot;&gt;What the research says&lt;/h2&gt;

&lt;p&gt;In a 2025 study by &lt;a href=&quot;https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/&quot;&gt;METR&lt;/a&gt;, sixteen experienced open-source developers were timed on 246 real tasks. They predicted AI would speed them up by 24%. It slowed them down by 19%. After the slowdown was measured, they still reported feeling 20% faster.&lt;/p&gt;

&lt;p&gt;The sample is small and the headline slowdown finding is contested — METR themselves &lt;a href=&quot;https://metr.org/blog/2026-02-24-uplift-update/&quot;&gt;posted an update in early 2026&lt;/a&gt; acknowledging selection-bias problems that may mean their numbers underestimate AI speedup. What the update doesn’t undermine is the &lt;em&gt;perception&lt;/em&gt; finding: a 39-point gap between what developers reported feeling after the experiment and what was measured during it. Developers who had just been timed being slower still believed they had been faster.&lt;/p&gt;

&lt;p&gt;The interesting finding is not the slowdown itself. The interesting finding is that the developers could not perceive it. This is a calibration problem of a familiar shape — the literature on self-assessment is decades old — with a new accelerant. The feedback loop that would ordinarily tell an experienced engineer “you should adjust how you’re working” was gone.&lt;/p&gt;

&lt;h2 id=&quot;it-doesnt-stop-at-the-developer&quot;&gt;It doesn’t stop at the developer&lt;/h2&gt;

&lt;p&gt;The METR study measured solo developers on tasks. The same shape extends past the solo case in a way the study didn’t measure: the perception gap operates not just within a single developer using AI, but between the AI-assisted &lt;em&gt;author&lt;/em&gt; and the human &lt;em&gt;reviewer&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;I have spent recent sprints on the receiving end of this. A single AI-assisted author on one of the codebases I review put up tens of thousands of net lines in a fortnight, across PRs whose individual diffs comfortably exceeded the eighteen hundred lines I opened with. The PRs were thoughtful: tight commit messages, sensible scoping, descriptions that named what changed. The volume was not reviewable at human pace. What I felt as a reviewer — that I was perpetually catching up, that the diff was always larger than the attention I could supply — is the same shape the METR developers reported on themselves, only viewed from the other side of the keyboard.&lt;/p&gt;

&lt;p&gt;The author writes a thoughtful PR. They feel organised because they are organised — by the description they wrote, by the boundaries they named, by the tests they added. The artefacts of their thinking are visible to them. What is not visible to them is the cognitive cost of the whole change held in the reviewer’s head at the same time.&lt;/p&gt;

&lt;p&gt;The reviewer experiences something else. The description helps but does not substitute. Two or three architectural decisions plus several new features in one diff is a working-memory load that no amount of structure in the PR body resolves. The reviewer is not lazy, not slow, not failing; the diff is simply asking for a kind of attention the human reviewer cannot supply at the rate the PR was assembled.&lt;/p&gt;

&lt;p&gt;Both authors and reviewers are honest about their experience. Neither has the feedback that would let them correct the other. The author keeps shipping at one rate, the reviewer keeps absorbing at theirs, and the mismatch compounds across every PR until something visible breaks.&lt;/p&gt;

&lt;h2 id=&quot;where-the-cost-shows-up&quot;&gt;Where the cost shows up&lt;/h2&gt;

&lt;p&gt;The cost is real. It appears in places leaders aren’t watching closely.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://dora.dev/research/2024/dora-report/&quot;&gt;2024 DORA report&lt;/a&gt; found that a 25% increase in AI adoption corresponded with a 7.2% decrease in delivery stability and a 1.5% decrease in throughput. Independent industry telemetry from Faros AI, sampling ten thousand-plus developers across more than a thousand teams, puts the throughput side of the same picture in starker terms: high-AI-adoption teams merging substantially more pull requests while code review time goes up. The dashboards leaders watch for “AI is working” — PR volume, individual velocity — were green. The dashboard for whether the system itself was holding together was quietly drifting in the other direction.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.veracode.com/blog/genai-code-security-report/&quot;&gt;Veracode’s 2025 GenAI Code Security Report&lt;/a&gt; tested over 100 large language models against 80 code-completion tasks designed to surface OWASP Top 10 vulnerabilities. 45% of the generated code contained security flaws. AI failed to defend against cross-site scripting in 86% of relevant samples. Java fared worst, at a 72% security failure rate.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.gitclear.com/ai_assistant_code_quality_2025_research&quot;&gt;GitClear’s analysis&lt;/a&gt; of 211 million lines of code found that 2024 was the first year on record where copy-pasted code exceeded refactored code. Duplicated code blocks of five or more lines rose roughly eightfold over the year, while moved code fell from around a quarter of all changes in 2021 to under 10% in 2024. The code was growing faster than it was being shaped.&lt;/p&gt;

&lt;p&gt;The pattern isn’t only on the human side. Apple’s &lt;a href=&quot;https://machinelearning.apple.com/research/illusion-of-thinking&quot;&gt;Illusion of Thinking&lt;/a&gt; (2025) found that large reasoning models reduce their reasoning effort as problem complexity grows past a threshold, despite having token budget to spare — and raised the question of whether the visible reasoning trace reflects reasoning or its appearance. (The headline accuracy-collapse finding has been &lt;a href=&quot;https://arxiv.org/abs/2506.09250&quot;&gt;contested&lt;/a&gt; on methodology grounds; the effort-decline result is the part the critique didn’t touch.) The author’s structured PR description and the model’s structured reasoning trace are artefacts of the same kind — visible enough to be trusted, opaque about whether the work behind them held. The reviewer is downstream of both.&lt;/p&gt;

&lt;p&gt;Three independent lines of evidence, three different facets (throughput, security, code shape), all measuring outcomes consistent with a perception gap.&lt;/p&gt;

&lt;p&gt;The pattern that ties them together is what I called &lt;a href=&quot;https://davidslv.uk/2026/05/17/comprehension-debt.html&quot;&gt;comprehension debt&lt;/a&gt; in the previous post: the gap between how much code exists in your system and how much of it any human actually understands. If the perception gap is the &lt;em&gt;mechanism&lt;/em&gt;, comprehension debt is the &lt;em&gt;form&lt;/em&gt; it takes. The author shipped code, the reviewer approved it, the dashboards stayed green — and at the end of all that, fewer people on the team can explain what the system is doing than before.&lt;/p&gt;

&lt;h2 id=&quot;what-to-do-about-it&quot;&gt;What to do about it&lt;/h2&gt;

&lt;p&gt;I am in the vignette above. The reviewer is me as often as the author is, and both seats are familiar. When AI has sped up my own work, the speed-up has felt real. So has the cost on the other side of my own multi-thousand-line PRs. Both kinds of experience happened to the same engineer. Neither corrected the other in time to change the next one.&lt;/p&gt;

&lt;p&gt;The dashboards leaders look at are not pointed at this. PR volume, individual velocity, lead time — none of them measure the cognitive split between the two sides of a review, or the fraction of last quarter’s shipped code anyone on the team could explain at 2am without opening the file. There is no automated way to measure that second number; it requires asking, which is part of why it doesn’t get measured.&lt;/p&gt;

&lt;p&gt;A question worth taking to your team this week, then. Look at the last three large pull requests you approved. Without re-reading the diff, can you re-derive why the architectural decisions in each had to go that way? If the answer is no for two of them, the perception gap is already operating in your codebase, and your dashboards haven’t told you. The structural response is the subject of the next post.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;Data drawn from: &lt;a href=&quot;https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/&quot;&gt;METR — Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity (2025)&lt;/a&gt; and &lt;a href=&quot;https://metr.org/blog/2026-02-24-uplift-update/&quot;&gt;METR’s Feb 2026 design update&lt;/a&gt; · &lt;a href=&quot;https://dora.dev/research/2024/dora-report/&quot;&gt;DORA — Accelerate State of DevOps Report 2024&lt;/a&gt; · &lt;a href=&quot;https://www.veracode.com/blog/genai-code-security-report/&quot;&gt;Veracode — 2025 GenAI Code Security Report&lt;/a&gt; · &lt;a href=&quot;https://www.gitclear.com/ai_assistant_code_quality_2025_research&quot;&gt;GitClear — AI Copilot Code Quality 2025 Research&lt;/a&gt; · &lt;a href=&quot;https://machinelearning.apple.com/research/illusion-of-thinking&quot;&gt;Apple — The Illusion of Thinking (2025)&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Vocabulary from The Serious CTO’s videos on &lt;a href=&quot;https://www.youtube.com/watch?v=fFIjrtH6qjc&quot;&gt;AI killing code review&lt;/a&gt;, &lt;a href=&quot;https://www.youtube.com/watch?v=fvqD83ffMnw&quot;&gt;where developer time actually goes&lt;/a&gt;, and &lt;a href=&quot;https://www.youtube.com/watch?v=3o2SlgX9BhE&quot;&gt;the hidden cost of AI coding&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</description>
        <pubDate>Thu, 21 May 2026 00:00:00 +0000</pubDate>
        <link>https://davidslv.uk/2026/05/21/the-perception-gap.html</link>
        <guid isPermaLink="true">https://davidslv.uk/2026/05/21/the-perception-gap.html</guid>
        
        
      </item>
    
      <item>
        <title>Rails Engines vs Packwerk: When to Use What</title>
        <description>&lt;p&gt;&lt;em&gt;This is an adapted excerpt from Chapter 16 of &lt;a href=&quot;/modular-rails/&quot;&gt;Modular Rails: Architecture for the Long Game&lt;/a&gt;, my book on building maintainable Ruby on Rails applications using Rails Engines.&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Rails engines are not the only way to introduce structure into a monolith. Packwerk, plain Ruby gems, namespaces, and even Hanami slices offer different trade-offs. The question is not which tool is “best” – it is which tool fits the problem you actually have.&lt;/p&gt;

&lt;p&gt;This post focuses on the comparison that comes up most often: Rails engines versus Packwerk.&lt;/p&gt;

&lt;h2 id=&quot;packwerk-static-boundary-enforcement&quot;&gt;Packwerk: Static Boundary Enforcement&lt;/h2&gt;

&lt;p&gt;Packwerk, created by Shopify, takes a fundamentally different approach to modularity. Instead of runtime isolation (separate load paths, independent gemspecs, mountable routes), Packwerk enforces boundaries at analysis time through static checks.&lt;/p&gt;

&lt;p&gt;A package in Packwerk is a directory with a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;package.yml&lt;/code&gt; file:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# components/billing/package.yml&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;enforce_dependencies&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;enforce_privacy&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;dependencies&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;components/core&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

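&lt;p&gt;That is the entire configuration for a package. One caveat: since Packwerk 3.0, the privacy checker behind &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;enforce_privacy&lt;/code&gt; ships in the separate &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;packwerk-extensions&lt;/code&gt; gem, so that key needs both gems installed. The directory structure stays inside your existing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;app/&lt;/code&gt; folder. There are no gemspecs, no dummy apps, no mountable routes. You add Packwerk to an existing application and draw boundaries around code that already exists.&lt;/p&gt;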

&lt;p&gt;Packwerk then analyses your code statically – without running it – and reports violations:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;components/notifications/app/models/notifications/mailer.rb:12
  Billing::Invoice is private to components/billing
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The violation tells you that the notification mailer is reaching into billing’s internals. You fix it by either making &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Invoice&lt;/code&gt; part of billing’s public API or by introducing an interface.&lt;/p&gt;
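
&lt;p&gt;To make “static” concrete, here is a deliberately naive sketch of the idea. It is not how Packwerk is implemented (Packwerk parses Ruby and resolves constants properly); it only scans source text for namespaced constants and compares them with an allow-list. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PUBLIC_CONSTANTS&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MAILER_SOURCE&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;find_violations&lt;/code&gt; are assumptions made up for this illustration.&lt;/p&gt;

```ruby
# Naive illustration of static boundary checking: the source is read as
# text and never executed. Every name here is hypothetical.

# Constants that billing chooses to expose (assumed for this example).
PUBLIC_CONSTANTS = %w[Billing::ChargeCustomer Billing::InvoiceSummary].freeze

# Matches namespaced constants such as Billing::Invoice.
CONSTANT_REFERENCE = /\b[A-Z]\w*(?:::[A-Z]\w*)+/

# The file under inspection, held as a string so the example is self-contained.
MAILER_SOURCE = [
  "module Notifications",
  "  class Mailer",
  "    def invoice_email(id)",
  "      invoice = Billing::Invoice.find(id)",
  "      summary = Billing::InvoiceSummary.for(id)",
  "    end",
  "  end",
  "end"
].join("\n")

# Returns one message per reference that is neither inside the caller's
# own namespace nor on the allow-list of public constants.
def find_violations(source, own_namespace:, public_constants:)
  violations = []
  source.each_line.with_index(1) do |line, number|
    line.scan(CONSTANT_REFERENCE).each do |reference|
      next if reference.start_with?("#{own_namespace}::")
      next if public_constants.include?(reference)

      violations.push("line #{number}: #{reference} is not public")
    end
  end
  violations
end

puts find_violations(
  MAILER_SOURCE,
  own_namespace: "Notifications",
  public_constants: PUBLIC_CONSTANTS
)
# prints: line 4: Billing::Invoice is not public
```

&lt;p&gt;Only the reference to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Billing::Invoice&lt;/code&gt; is reported; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Billing::InvoiceSummary&lt;/code&gt; passes because it is on the allow-list. A real tool also has to cope with relative constant references, comments, and strings, which is why pattern matching like this is only a teaching aid.&lt;/p&gt;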

&lt;h2 id=&quot;the-public-api-pattern&quot;&gt;The Public API Pattern&lt;/h2&gt;

&lt;p&gt;Both engines and Packwerk benefit from explicit public APIs, but Packwerk makes this a first-class concept. You mark classes as public by placing them in a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;public/&lt;/code&gt; directory within your package:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;components/billing/
  app/
    models/
      billing/
        invoice.rb           # private
        line_item.rb         # private
        payment_gateway.rb   # private
    public/
      billing/
        charge_customer.rb   # public API
        invoice_summary.rb   # public API
  package.yml
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Other packages can reference only &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Billing::ChargeCustomer&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Billing::InvoiceSummary&lt;/code&gt;. Any direct reference to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Billing::Invoice&lt;/code&gt; triggers a violation. This is a powerful pattern – it forces you to think about what your module exposes rather than what it contains.&lt;/p&gt;
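
&lt;p&gt;As a sketch of what one of those public entry points might look like, here is a hypothetical &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Billing::InvoiceSummary&lt;/code&gt; that hands callers a small read-only value instead of the record itself. The attribute names, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.for&lt;/code&gt; method, and the in-memory &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Invoice&lt;/code&gt; stand-in are all assumptions for illustration; in a real application &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Invoice&lt;/code&gt; would be an ActiveRecord model.&lt;/p&gt;

```ruby
module Billing
  # Stand-in for the private model so the example runs without Rails.
  Invoice = Struct.new(:id, :customer_email, :total_cents, :gateway_token, keyword_init: true) do
    def self.find(id)
      new(id: id, customer_email: "pat@example.com", total_cents: 12_500, gateway_token: "secret")
    end
  end

  # Public API (hypothetical): exposes only what other packages need,
  # as plain frozen data rather than a live record.
  class InvoiceSummary
    attr_reader :invoice_id, :customer_email, :total_cents

    def self.for(invoice_id)
      invoice = Invoice.find(invoice_id)
      new(
        invoice_id: invoice.id,
        customer_email: invoice.customer_email,
        total_cents: invoice.total_cents
      )
    end

    def initialize(invoice_id:, customer_email:, total_cents:)
      @invoice_id = invoice_id
      @customer_email = customer_email
      @total_cents = total_cents
      freeze
    end

    # Total as a two-decimal string, for example "125.00".
    def formatted_total
      format("%.2f", total_cents / 100.0)
    end
  end
end

summary = Billing::InvoiceSummary.for(42)
puts "#{summary.customer_email} owes #{summary.formatted_total}"
# prints: pat@example.com owes 125.00
```

&lt;p&gt;The design choice worth noticing is what is absent: the summary carries no gateway token and no way back to the underlying record, so callers cannot grow a dependency on billing’s internals by accident.&lt;/p&gt;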

&lt;p&gt;Engines can achieve the same thing through convention and code review, but Packwerk enforces it automatically.&lt;/p&gt;

&lt;h2 id=&quot;when-to-use-which&quot;&gt;When to Use Which&lt;/h2&gt;

&lt;p&gt;Here is where it gets practical. Each tool excels in different situations:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Rails Engines&lt;/th&gt;
      &lt;th&gt;Packwerk&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Isolation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Runtime (separate load paths, gemspecs)&lt;/td&gt;
      &lt;td&gt;Static analysis only&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Setup cost&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Medium-high (gemspec, dummy app, routes)&lt;/td&gt;
      &lt;td&gt;Low (add gem, create package.yml)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Enforcement&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Hard – code literally cannot see other engines without dependencies&lt;/td&gt;
      &lt;td&gt;Soft – violations are reported by a check you run (typically in CI), not prevented at runtime&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Migration path&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Must move files, update requires&lt;/td&gt;
      &lt;td&gt;Draw boundaries around existing code&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Independent testing&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Yes – each engine has its own test suite&lt;/td&gt;
      &lt;td&gt;Partial – tests still run in one suite&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Route isolation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Full mountable routes&lt;/td&gt;
      &lt;td&gt;No route concept&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Database migrations&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Can be engine-specific&lt;/td&gt;
      &lt;td&gt;Application-level only&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Team ownership&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Natural – each engine is a unit&lt;/td&gt;
      &lt;td&gt;Possible but requires tooling&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Extraction to service&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Straightforward – engine is already isolated&lt;/td&gt;
      &lt;td&gt;Requires significant refactoring&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The key difference is enforcement philosophy. Engines say “you physically cannot cross this boundary.” Packwerk says “we will tell you when you cross this boundary.” Both are valid. The right choice depends on your team’s discipline and your application’s trajectory.&lt;/p&gt;

&lt;h2 id=&quot;brief-mentions-other-approaches&quot;&gt;Brief Mentions: Other Approaches&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Plain Ruby gems&lt;/strong&gt; are the lightest-weight option. If your module has no Rails dependencies – a pricing calculator, a tax rules engine, a PDF generator – a gem gives you complete isolation with minimal overhead. No Rails, no ActiveRecord, just Ruby.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Namespaces and modules&lt;/strong&gt; cost nothing to set up. They communicate intent – &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Billing::Invoice&lt;/code&gt; tells developers that this class belongs to the billing domain. But namespaces have zero enforcement. Nothing prevents &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Notifications::Mailer&lt;/code&gt; from calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Billing::Invoice.find(42)&lt;/code&gt;.&lt;/p&gt;
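
&lt;p&gt;A minimal sketch of that gap, using in-memory stand-ins rather than real models (the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;find&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deliver&lt;/code&gt; methods here are assumptions for illustration): the call crosses the domain boundary and Ruby raises no objection.&lt;/p&gt;

```ruby
module Billing
  # Stand-in for a model that billing considers internal.
  class Invoice
    attr_reader :id

    def self.find(id)
      new(id)
    end

    def initialize(id)
      @id = id
    end
  end
end

module Notifications
  class Mailer
    # The namespace signals ownership, but this cross-domain call
    # resolves like any other constant lookup.
    def deliver
      invoice = Billing::Invoice.find(42)
      "Reminder sent for invoice #{invoice.id}"
    end
  end
end

puts Notifications::Mailer.new.deliver
# prints: Reminder sent for invoice 42
```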

&lt;p&gt;&lt;strong&gt;Hanami slices&lt;/strong&gt; offer a middle ground for teams building new applications. Each slice gets its own container, dependencies, and persistence layer. The trade-off is that you are no longer writing Rails.&lt;/p&gt;

&lt;h2 id=&quot;each-tools-sweet-spot&quot;&gt;Each Tool’s Sweet Spot&lt;/h2&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Tool&lt;/th&gt;
      &lt;th&gt;Best for&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Rails Engines&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Teams that need hard boundaries, independent deployability potential, or are on the path to eventual service extraction&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Packwerk&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Large teams adopting modularity incrementally in an existing monolith, where moving files is too disruptive&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Plain Ruby gems&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Framework-agnostic domain logic with no Rails dependencies&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Namespaces&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Small teams with strong conventions, or as a stepping stone to stronger boundaries&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Hanami slices&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;New applications where the team is willing to move beyond Rails conventions&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h2 id=&quot;layering-your-tools&quot;&gt;Layering Your Tools&lt;/h2&gt;

&lt;p&gt;These tools are not mutually exclusive. In practice, many mature applications use several of them together:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;graph TB
    subgraph &quot;Application&quot;
        direction TB
        N[&quot;Namespaces &amp;amp; Conventions&amp;lt;br/&amp;gt;(every team, day one)&quot;]
        P[&quot;Packwerk Packages&amp;lt;br/&amp;gt;(boundary detection)&quot;]
        E[&quot;Rails Engines&amp;lt;br/&amp;gt;(hard isolation)&quot;]
        G[&quot;Plain Ruby Gems&amp;lt;br/&amp;gt;(framework-free logic)&quot;]
    end

    N --&amp;gt;|&quot;When conventions&amp;lt;br/&amp;gt;aren&apos;t enough&quot;| P
    P --&amp;gt;|&quot;When static analysis&amp;lt;br/&amp;gt;isn&apos;t enough&quot;| E
    E --&amp;gt;|&quot;For domain logic&amp;lt;br/&amp;gt;without Rails&quot;| G

    style N fill:#4a90d9,stroke:#2c5f8a,color:#fff
    style P fill:#e8a838,stroke:#b07828,color:#fff
    style E fill:#27ae60,stroke:#1e8449,color:#fff
    style G fill:#8e44ad,stroke:#6c3483,color:#fff
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You start with namespaces because they are free. When namespaces are not enough, you add Packwerk to detect boundary violations. When detection is not enough and you need enforcement, you extract an engine. When the engine contains logic that does not need Rails at all, you pull it into a plain gem.&lt;/p&gt;

&lt;p&gt;Each layer builds on the one below. You do not have to pick one tool and commit to it forever. You escalate as the pain justifies the cost.&lt;/p&gt;

&lt;p&gt;The best architecture teams I have worked with use this layered approach. They start cheap, escalate deliberately, and always ask: “Is the boundary problem we have worth the cost of the tool we are reaching for?”&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;This was adapted from Chapter 16 of &lt;a href=&quot;/modular-rails/&quot;&gt;Modular Rails: Architecture for the Long Game&lt;/a&gt;. The book covers all five approaches in depth – with working code, migration guides, and the honest trade-offs for each.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read the &lt;a href=&quot;/books/modular-rails/&quot;&gt;&lt;strong&gt;entire book free on the web&lt;/strong&gt;&lt;/a&gt; — every chapter, no paywall. Prefer print or Kindle? &lt;a href=&quot;https://www.amazon.com/dp/1066649405&quot;&gt;Amazon US&lt;/a&gt; · &lt;a href=&quot;https://www.amazon.co.uk/dp/1066649405&quot;&gt;Amazon UK&lt;/a&gt; · &lt;a href=&quot;/modular-rails/&quot;&gt;all editions &amp;amp; prices&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</description>
        <pubDate>Tue, 19 May 2026 00:00:00 +0000</pubDate>
        <link>https://davidslv.uk/2026/05/19/rails-engines-vs-packwerk.html</link>
        <guid isPermaLink="true">https://davidslv.uk/2026/05/19/rails-engines-vs-packwerk.html</guid>
        
        
      </item>
    
      <item>
        <title>Comprehension Debt</title>
        <description>&lt;blockquote&gt;
  &lt;p&gt;“You’re not a developer anymore. You’re a reviewer of code you don’t understand.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That line is from &lt;a href=&quot;https://www.youtube.com/@TheSeriousCTO&quot;&gt;The Serious CTO&lt;/a&gt;, and it named something I’d been feeling but didn’t have words for. The shape of the work has changed. The volume of code that gets generated, reviewed, and shipped has decoupled from the volume of code any human actually holds in their head. There’s a debt accruing in that gap, and we don’t track it.&lt;/p&gt;

&lt;p&gt;He calls it &lt;strong&gt;comprehension debt&lt;/strong&gt; — the difference between how much code exists in your system and how much of it anyone on the team could explain at 2am.&lt;/p&gt;

&lt;p&gt;I want to make the case that this is the most important kind of debt our industry has accumulated in the past two years, and that nothing in the standard toolkit measures it.&lt;/p&gt;

&lt;h2 id=&quot;its-not-technical-debt&quot;&gt;It’s not technical debt&lt;/h2&gt;

&lt;p&gt;Ward Cunningham coined “technical debt” in 1992 to describe a deliberate trade. You take a shortcut, you know you took it, you plan to pay it back. The transaction was between an engineer and the future engineer who would inherit the code. Both were assumed to be humans, and both were assumed to remember why.&lt;/p&gt;

&lt;p&gt;Comprehension debt is different in three ways.&lt;/p&gt;

&lt;p&gt;It isn’t deliberate. Nobody chooses to ship code they don’t understand. It happens because the AI generated 600 lines, the spec was implicit in the conversation, the conversation is gone, and the tests pass. The trade was never on the table to refuse.&lt;/p&gt;

&lt;p&gt;It isn’t local. Technical debt usually sits in a specific module you can point at. Comprehension debt is distributed across thousands of small decisions, each one defensible, none of them remembered. The total is much larger than any of its parts.&lt;/p&gt;

&lt;p&gt;And it doesn’t show up in any dashboard, which is the part I want to spend a moment on.&lt;/p&gt;

&lt;h2 id=&quot;what-the-dashboards-measure&quot;&gt;What the dashboards measure&lt;/h2&gt;

&lt;p&gt;Look at any modern engineering analytics platform and you’ll see roughly the same vocabulary. &lt;strong&gt;DORA metrics&lt;/strong&gt; — lead time for changes, deployment frequency, change failure rate, mean time to recovery. &lt;strong&gt;Flow metrics&lt;/strong&gt; — cycle time, work-in-progress, throughput. And now, increasingly, &lt;strong&gt;AI Impact metrics&lt;/strong&gt; — suggestion acceptance rate, percentage of PRs assisted by AI, generated lines per engineer per week. The pitch is the one I keep seeing in the marketing copy: actionable engineering insight across DORA, Flow, AI Impact, and more. Turn engineering insights into predictable outcomes.&lt;/p&gt;

&lt;p&gt;These are useful. I look at DORA numbers regularly and I’d argue every team should. They are also, all of them, measurements of motion.&lt;/p&gt;

&lt;p&gt;Lead time tells you how fast work moves through the pipeline. Deployment frequency tells you how often it ships. Change failure rate and MTTR tell you what fraction breaks and how fast you recover. Cycle time tells you how long an item sat in flight. AI acceptance rate tells you how often a generated suggestion was kept.&lt;/p&gt;

&lt;p&gt;None of them ask the question that matters here: &lt;em&gt;of the code we shipped last quarter, what percentage could any member of the team explain right now, without opening the file?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That number doesn’t have a name yet. The “AI Impact” category got close — it noticed AI was changing something about engineering and tried to put a measurement on it — but the things it measures are still adoption and volume, not comprehension. Acceptance rate doesn’t care whether the engineer who accepted the suggestion understood it. Lines-per-engineer doesn’t care whether anyone could narrate those lines six weeks later.&lt;/p&gt;

&lt;p&gt;So the lagging-indicator failure mode is precisely what you’d expect. DORA numbers stay green right up until they don’t, at which point the debt has already compounded across every change that touched anything near the broken thing. The dashboards eventually notice, but only via the breakage. By then you’re not measuring comprehension — you’re measuring its absence.&lt;/p&gt;

&lt;h2 id=&quot;the-standard-answers-dont-measure-understanding&quot;&gt;The standard answers don’t measure understanding&lt;/h2&gt;

&lt;p&gt;This is the part I want to be honest about, because every senior engineer reading this has the same instinct I had: surely more review, more linting, more tests, more automation catches this. They don’t, and it’s worth being precise about why.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code review measures “looks reasonable”&lt;/strong&gt;, which is approximately the same problem as the code itself — a human skimming a diff, deciding whether it pattern-matches against something they’d write. It doesn’t ask whether anyone could explain &lt;em&gt;why&lt;/em&gt; this code exists, or what would have to change in the world to make it wrong. The failure mode has a name now: LGTM syndrome. The data backs it up. High-AI-adoption teams in the recent DORA report are merging 98% more PRs while review time goes up 91%. We’re rubber-stamping more, faster.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Linters and type-checkers measure syntax, not intent.&lt;/strong&gt; They will tell you the function returns a string. They will not tell you whether the string represents the thing the caller assumed it represented. A large share of LLM errors are &lt;em&gt;type-check failures&lt;/em&gt;, and TypeScript catches those, which is real value, but the errors that matter are the ones that compile.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tests measure observed behaviour at the point of writing.&lt;/strong&gt; They are a memory of the assumptions that were live when the test was written. When the assumptions change — and they do, constantly, in any product still being shaped — the tests pass and the meaning quietly diverges. Tests are necessary infrastructure. They are not a measurement of comprehension.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;“Future AI will refactor it”&lt;/strong&gt; is the most seductive answer and the one most worth refuting. AI can refactor syntax. It cannot refactor meaning it never had. If no human ever understood why a piece of code is the way it is, the AI cleaning it up is doing the same thing the AI that wrote it did — pattern-matching against training data, producing something plausible, hoping the tests pass. You’re not paying down the debt. You’re laundering it.&lt;/p&gt;

&lt;h2 id=&quot;what-it-looks-like-when-its-compounding&quot;&gt;What it looks like when it’s compounding&lt;/h2&gt;

&lt;p&gt;The shape is familiar once you start looking for it.&lt;/p&gt;

&lt;p&gt;A field gets added to a model. Six months later nobody can quite explain what it’s for, but removing it breaks four jobs. The PR that introduced it was approved, the tests passed, the issue is closed. The “why” exists in nobody’s head.&lt;/p&gt;

&lt;p&gt;A controller has three early-return branches that each handle a “subtle case”. The cases were real when the code was written. Whether they’re still real is unclear, and checking would require reconstructing a conversation that happened in a chat session that’s now gone.&lt;/p&gt;

&lt;p&gt;A 2am incident lands and the person on call can read the trace but can’t narrate the code that produced it. The original author, if there even was a single one, is the model that generated it. The graceful degradation everyone hoped for at the architecture stage requires understanding that no longer exists in the team.&lt;/p&gt;

&lt;p&gt;None of these are individually catastrophic. Collectively they’re the new shape of legacy systems, and we’re building them at a rate previous generations of engineers couldn’t have imagined.&lt;/p&gt;

&lt;h2 id=&quot;a-field-level-note&quot;&gt;A field-level note&lt;/h2&gt;

&lt;p&gt;I’ve been building static-analysis and review tooling on a real Rails codebase for a while now. Custom linters, date-gated style rules, multi-agent PR review with worktree isolation, the works. Each layer was a response to something concrete. Each one helps. None of them measure comprehension. That’s not a criticism of the tools — they were never trying to. It’s a recognition that the thing I’m trying to defend against doesn’t yet have a measurement, which means it doesn’t yet have a budget, which means it grows.&lt;/p&gt;

&lt;h2 id=&quot;what-i-want-to-ask&quot;&gt;What I want to ask&lt;/h2&gt;

&lt;p&gt;This is the open part, because I genuinely don’t know.&lt;/p&gt;

&lt;p&gt;What would a codebase look like that &lt;em&gt;actively&lt;/em&gt; maintained comprehension? Not after the fact, via documentation written under deadline, but as a continuous property the team owned.&lt;/p&gt;

&lt;p&gt;What would the measurement be? A coverage metric for “any team member can describe this module within five minutes”? A required-reviewer rule that says the reviewer has to be able to &lt;em&gt;teach&lt;/em&gt; the change, not just approve it? A spec-first workflow where the human writes the contract and the AI generates the implementation, so review is “does this honour the contract?” rather than “does this code look correct?”&lt;/p&gt;

&lt;p&gt;I have partial answers and I’m not confident in any of them. What I am confident in is that the response to comprehension debt cannot be more of the same review, more of the same linters, more of the same hope. Whatever the answer is, it has to measure something we are not currently measuring.&lt;/p&gt;

&lt;p&gt;If you’re seeing this too, I’d genuinely like to hear what you’re trying. The vocabulary exists now. The next part is figuring out what to do with it.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;Further viewing: The Serious CTO’s videos on &lt;a href=&quot;https://www.youtube.com/watch?v=fFIjrtH6qjc&quot;&gt;AI killing code review&lt;/a&gt;, &lt;a href=&quot;https://www.youtube.com/watch?v=fvqD83ffMnw&quot;&gt;where developer time actually goes&lt;/a&gt;, and &lt;a href=&quot;https://www.youtube.com/watch?v=3o2SlgX9BhE&quot;&gt;the hidden cost of AI coding&lt;/a&gt; are where the comprehension-debt vocabulary comes from, and they’re worth your time.&lt;/em&gt;&lt;/p&gt;
</description>
        <pubDate>Sun, 17 May 2026 00:00:00 +0000</pubDate>
        <link>https://davidslv.uk/2026/05/17/comprehension-debt.html</link>
        <guid isPermaLink="true">https://davidslv.uk/2026/05/17/comprehension-debt.html</guid>
        
        
      </item>
    
  </channel>
</rss>
