<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: AWS Community Builders</title>
    <description>The latest articles on DEV Community by AWS Community Builders (aws-builders).</description>
    <link>https://dev.clauneck.workers.dev/aws-builders</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F2794%2F88da75b6-aadd-4ea1-8083-ae2dfca8be94.png</url>
      <title>DEV Community: AWS Community Builders</title>
      <link>https://dev.clauneck.workers.dev/aws-builders</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.clauneck.workers.dev/feed/aws-builders"/>
    <language>en</language>
    <item>
      <title>Treat: A Global Gifting Platform Empowering Local Businesses</title>
      <dc:creator>Seth David Gyimah</dc:creator>
      <pubDate>Fri, 26 Jun 2026 04:53:04 +0000</pubDate>
      <link>https://dev.clauneck.workers.dev/aws-builders/-treat-a-global-gifting-platform-empowering-local-businesses-i20</link>
      <guid>https://dev.clauneck.workers.dev/aws-builders/-treat-a-global-gifting-platform-empowering-local-businesses-i20</guid>
      <description>&lt;p&gt;&lt;em&gt;This project was built as an entry for the H0 Hackathon.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;🚀 Live at: &lt;a href="https://www.sendatreat.app/" rel="noopener noreferrer"&gt;https://www.sendatreat.app/&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Gifting is one of the most natural human expressions of connection. People love surprising friends, family, and colleagues with food, gifts, services, and experiences.&lt;/p&gt;

&lt;p&gt;However, the current experience is fragmented:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;People rely on multiple apps and coordination tools
&lt;/li&gt;
&lt;li&gt;Most gifting defaults to cash transfers, losing emotional intent
&lt;/li&gt;
&lt;li&gt;Real-world fulfillment is disconnected from digital intent
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At the same time, millions of local businesses globally — from restaurants and food vendors to salons, hotels, and service providers — struggle to reach customers consistently without expensive marketing platforms or technical infrastructure.&lt;/p&gt;

&lt;p&gt;This creates a gap between &lt;strong&gt;intentional giving&lt;/strong&gt; and &lt;strong&gt;local business access to demand&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;Treat is a global gifting platform that allows anyone to send meaningful real-world experiences to people they care about.&lt;/p&gt;

&lt;p&gt;Instead of sending money or generic gift cards, users send:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Food from local restaurants
&lt;/li&gt;
&lt;li&gt;Services (spa, salon, experiences)
&lt;/li&gt;
&lt;li&gt;Physical goods from nearby merchants
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Recipients receive a secure link via SMS, claim their treat, and redeem it at participating merchants.&lt;/p&gt;

&lt;p&gt;Treat is designed so that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Merchants get visibility at zero marketing cost
&lt;/li&gt;
&lt;li&gt;Senders create meaningful moments instead of transactions
&lt;/li&gt;
&lt;li&gt;Recipients enjoy flexible, low-friction redemption
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Live project: &lt;a href="https://www.sendatreat.app/" rel="noopener noreferrer"&gt;https://www.sendatreat.app/&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  How I Built It (AWS + Vercel Architecture)
&lt;/h2&gt;

&lt;p&gt;Treat is a cloud-native distributed system designed for scale and real-world transaction workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Frontend (Vercel)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Built with &lt;strong&gt;React + Vite&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Deployed on &lt;strong&gt;Vercel via GitHub integration&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Handles user experience for sending, claiming, and managing treats&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Backend (AWS)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Built with &lt;strong&gt;Ruby on Rails API&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Deployed on &lt;strong&gt;AWS ECS Fargate (containerized services)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Designed as a decoupled API layer for web + future mobile clients&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Database Layer
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Amazon Aurora PostgreSQL&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Stores users, merchants, payments, claims, and transaction state&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Payments
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Paystack (primary in supported regions)&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Handles secure card and mobile money payments&lt;/li&gt;
&lt;li&gt;Supports instant merchant settlement after verification&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Communications
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;SMS + Email notifications power:

&lt;ul&gt;
&lt;li&gt;Treat delivery&lt;/li&gt;
&lt;li&gt;Claim flows&lt;/li&gt;
&lt;li&gt;OTP verification&lt;/li&gt;
&lt;li&gt;Merchant alerts&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Key Workflow
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Sender purchases a treat from a merchant
&lt;/li&gt;
&lt;li&gt;Recipient receives SMS link instantly
&lt;/li&gt;
&lt;li&gt;Recipient claims and verifies identity (OTP-based security layer)
&lt;/li&gt;
&lt;li&gt;Merchant is notified and prepares fulfillment
&lt;/li&gt;
&lt;li&gt;After verification at merchant location:

&lt;ul&gt;
&lt;li&gt;Payment is released instantly
&lt;/li&gt;
&lt;li&gt;Treat is served
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
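
&lt;p&gt;The workflow above can be read as a small state machine. The sketch below is only an illustration in Python: the state names and the &lt;code&gt;advance&lt;/code&gt; helper are hypothetical and are not taken from the Treat codebase, which runs on Rails.&lt;/p&gt;

```python
# Hypothetical states and transitions; the real Treat backend may differ.
ALLOWED_TRANSITIONS = {
    "purchased": {"claimed"},
    "claimed": {"redeemed"},
    "redeemed": {"paid_out"},
    "paid_out": set(),
}


def advance(current_state, next_state):
    """Return the new state, or raise if the step is not allowed."""
    if next_state not in ALLOWED_TRANSITIONS[current_state]:
        raise ValueError(f"cannot move from {current_state} to {next_state}")
    return next_state


state = "purchased"
for step in ["claimed", "redeemed", "paid_out"]:
    state = advance(state, step)
print(state)
```

&lt;p&gt;Rejecting out-of-order steps, such as a payout before redemption, is one way to keep payment release tied to verification at the merchant.&lt;/p&gt;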




&lt;h2&gt;
  
  
  Challenges I Ran Into
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Designing a 3-sided marketplace flow
&lt;/h3&gt;

&lt;p&gt;Unlike traditional e-commerce, Treat requires coordination between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sender
&lt;/li&gt;
&lt;li&gt;Recipient
&lt;/li&gt;
&lt;li&gt;Merchant
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This meant designing a system where emotional gifting and transactional accuracy coexist.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. Payments and settlement complexity
&lt;/h3&gt;

&lt;p&gt;A major challenge was ensuring:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Low friction checkout for senders
&lt;/li&gt;
&lt;li&gt;Instant payout to merchants after verification
&lt;/li&gt;
&lt;li&gt;Cross-provider payment consistency across countries
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This required integrating regional payment providers and limiting rollout to supported markets.&lt;/p&gt;




&lt;h3&gt;
  
  
  3. Real-world redemption vs digital intent
&lt;/h3&gt;

&lt;p&gt;Bridging digital gifting with physical merchant redemption required careful design of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OTP verification flows
&lt;/li&gt;
&lt;li&gt;QR-based validation
&lt;/li&gt;
&lt;li&gt;Merchant-assisted and self-checkout flows
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What I’m Proud Of
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Treat is now live and operational at: &lt;a href="https://www.sendatreat.app/" rel="noopener noreferrer"&gt;https://www.sendatreat.app/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Active deployment across multiple supported regions&lt;/li&gt;
&lt;li&gt;Fully working end-to-end gifting flow (send → claim → redeem → payout)&lt;/li&gt;
&lt;li&gt;Early adoption by local merchants and testers&lt;/li&gt;
&lt;li&gt;Built a system that supports real-world gifting, not just digital transactions&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;Building Treat reinforced that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gifting is emotional, not transactional
&lt;/li&gt;
&lt;li&gt;UX matters as much as infrastructure in real-world systems
&lt;/li&gt;
&lt;li&gt;Local businesses thrive when embedded into personal moments, not just marketplaces
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What’s Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Expand to more countries and payment providers
&lt;/li&gt;
&lt;li&gt;Mobile app (iOS + Android) rollout
&lt;/li&gt;
&lt;li&gt;WhatsApp + richer notification channels
&lt;/li&gt;
&lt;li&gt;Delivery partnerships for non-pickup experiences
&lt;/li&gt;
&lt;li&gt;Corporate gifting and rewards integration
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ultimately, Treat aims to become the global infrastructure for meaningful gifting and local business discovery.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;AWS (ECS Fargate, Aurora PostgreSQL)&lt;/li&gt;
&lt;li&gt;Ruby on Rails API&lt;/li&gt;
&lt;li&gt;React + Vite&lt;/li&gt;
&lt;li&gt;Vercel&lt;/li&gt;
&lt;li&gt;Paystack&lt;/li&gt;
&lt;li&gt;SMS/Email infrastructure&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Try It Out
&lt;/h2&gt;

&lt;p&gt;🌍 &lt;a href="https://www.sendatreat.app/" rel="noopener noreferrer"&gt;https://www.sendatreat.app/&lt;/a&gt;&lt;/p&gt;




</description>
      <category>h0hackathon</category>
    </item>
    <item>
      <title>Only 1 Post to Help You Understand the Big Picture of System Design</title>
      <dc:creator>Hoang Guruu</dc:creator>
      <pubDate>Fri, 26 Jun 2026 03:17:17 +0000</pubDate>
      <link>https://dev.clauneck.workers.dev/aws-builders/only-1-post-to-help-you-understand-the-big-picture-of-system-design-3ihf</link>
      <guid>https://dev.clauneck.workers.dev/aws-builders/only-1-post-to-help-you-understand-the-big-picture-of-system-design-3ihf</guid>
      <description>&lt;h4&gt;
  
  
  &lt;em&gt;Check out more resources: &lt;a href="https://lnkd.in/gsXedZBf" rel="noopener noreferrer"&gt;https://lnkd.in/gsXedZBf&lt;/a&gt;&lt;/em&gt;
&lt;/h4&gt;




&lt;h2&gt;
  
  
  98 SYSTEM DESIGN CONCEPTS FOR BEGINNERS
&lt;/h2&gt;




&lt;h2&gt;
  
  
  1. Scalability
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The system can handle more users without slowing down.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To prevent the system from becoming slow or crashing as traffic grows.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Upgrade the server (vertical scaling) or add more servers (horizontal scaling).&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Availability
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Users can access the system whenever they need it.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To reduce service downtime.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Run multiple servers and prepare backup servers.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Reliability
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The system works correctly and rarely fails.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To provide stable and accurate results.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Test the system, back up data, and handle errors carefully.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Latency
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The time between sending a request and receiving a response.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; Lower latency makes the application feel faster.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Use caching, a CDN, and faster database queries.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Throughput
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The number of requests a system can process in a period of time.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To measure how much work the system can handle.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Measure requests, transactions, or data processed per second.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Capacity
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The maximum workload a system can handle.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To know when the system needs more resources.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Run load tests to find CPU, memory, and request limits.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. Client–Server
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The client sends a request, and the server processes it and returns a result.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To separate the user interface from data processing.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; A browser calls a server API to get data.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. Database
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A place where system data is stored.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To save data permanently and find it easily.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Store users, products, orders, and transactions.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. SQL vs NoSQL
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; SQL stores data in tables, while NoSQL supports more flexible formats.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; Different systems need different ways to store data.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Use SQL for related data and NoSQL for flexible or large-scale data.&lt;/p&gt;




&lt;h2&gt;
  
  
  10. Load Balancing
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; It shares requests across multiple servers.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To stop one server from becoming overloaded.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Place a load balancer in front of application servers.&lt;/p&gt;
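
&lt;p&gt;As a rough illustration, round-robin is one simple way a load balancer can share requests. The sketch below is a minimal Python example; the class name and the backend names are made up for the demonstration, and real load balancers also track server health.&lt;/p&gt;

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backend addresses in a fixed rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = cycle(list(backends))

    def next_backend(self):
        # Each call returns the next server in the rotation.
        return next(self._rotation)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [balancer.next_backend() for _ in range(5)]
print(picks)
```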




&lt;h2&gt;
  
  
  11. Caching
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Frequently used data is saved temporarily for faster access.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To improve response time and reduce database load.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Store popular data in Redis or memory.&lt;/p&gt;




&lt;h2&gt;
  
  
  12. Cache Invalidation
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Old cached data is removed or updated.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To prevent users from seeing outdated information.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Clear the cache when the original data changes or expires.&lt;/p&gt;
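
&lt;p&gt;Caching and invalidation can be sketched together. The Python example below assumes a plain dictionary standing in for both the database and the cache (in practice the cache might be Redis); the class and attribute names are hypothetical.&lt;/p&gt;

```python
class CachedUserStore:
    """Cache-aside reads with invalidation on write."""

    def __init__(self, database):
        self._database = database  # stand-in for a real database
        self._cache = {}           # stand-in for Redis or process memory
        self.database_reads = 0

    def get(self, user_id):
        if user_id in self._cache:
            return self._cache[user_id]
        # Cache miss: read from the database and remember the value.
        self.database_reads += 1
        value = self._database[user_id]
        self._cache[user_id] = value
        return value

    def update(self, user_id, value):
        self._database[user_id] = value
        # Invalidate so the next read reloads the fresh value.
        self._cache.pop(user_id, None)


store = CachedUserStore({"u1": "Ada"})
first = store.get("u1")
second = store.get("u1")
store.update("u1", "Grace")
third = store.get("u1")
print(first, second, third, store.database_reads)
```

&lt;p&gt;The second read is served from the cache, and the read after the update goes back to the database because the stale entry was removed.&lt;/p&gt;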




&lt;h2&gt;
  
  
  13. CDN
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A network of servers that stores content in different locations.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To help users load images, videos, and files faster.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Store static files on servers close to users.&lt;/p&gt;




&lt;h2&gt;
  
  
  14. DNS
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; It translates a domain name into an IP address.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; Users can remember a name instead of a numeric address.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Point a domain name to a server or load balancer.&lt;/p&gt;




&lt;h2&gt;
  
  
  15. API Design
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The way applications communicate with each other.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To make APIs clear, easy to use, and easy to maintain.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Clearly define URLs, inputs, outputs, and error codes.&lt;/p&gt;




&lt;h2&gt;
  
  
  16. REST
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; An API style that uses URLs and HTTP methods.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; It is simple, popular, and easy to connect with other systems.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Use GET to read, POST to create, PUT to update, and DELETE to remove resources.&lt;/p&gt;




&lt;h2&gt;
  
  
  17. GraphQL
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The client asks for exactly the data it needs.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To avoid receiving too much or too little data.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Send a query containing the required fields.&lt;/p&gt;




&lt;h2&gt;
  
  
  18. gRPC
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A fast remote procedure call (RPC) framework for service-to-service communication.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; Its binary messages are often smaller and faster to parse than text-based formats such as JSON.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Define services and messages with Protocol Buffers and call them from another service.&lt;/p&gt;




&lt;h2&gt;
  
  
  19. Authentication
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; It checks who the user is.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To stop strangers from accessing an account.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Use passwords, OTPs, tokens, or biometrics.&lt;/p&gt;




&lt;h2&gt;
  
  
  20. Authorization
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; It checks what a user is allowed to do.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To stop users from accessing functions they do not have permission to use.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Check roles and permissions before allowing an action.&lt;/p&gt;




&lt;h2&gt;
  
  
  21. Rate Limiting
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; It limits how many API requests can be sent in a period of time.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To prevent spam, attacks, and excessive usage.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; For example, allow each user 100 requests per minute.&lt;/p&gt;
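
&lt;p&gt;The 100-requests-per-minute example can be sketched with a fixed-window counter, which is only one of several rate limiting algorithms. The class name and the explicit &lt;code&gt;now&lt;/code&gt; argument are choices made for this illustration so the behaviour is easy to follow.&lt;/p&gt;

```python
from collections import defaultdict


class FixedWindowRateLimiter:
    """Allows at most `limit` requests per user in each time window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self._counts = defaultdict(int)

    def allow(self, user_id, now):
        # All requests in the same window share one counter.
        window = int(now // self.window_seconds)
        key = (user_id, window)
        # The counter never passes the limit, so an equality check is enough.
        if self._counts[key] == self.limit:
            return False
        self._counts[key] += 1
        return True


limiter = FixedWindowRateLimiter(limit=100, window_seconds=60)
results = [limiter.allow("user-1", now=5.0) for _ in range(101)]
allowed_again = limiter.allow("user-1", now=65.0)
print(results.count(True), results[-1], allowed_again)
```

&lt;p&gt;This sketch never deletes old counters; a production limiter would expire them, for example by keeping the counters in a store with a time-to-live.&lt;/p&gt;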




&lt;h2&gt;
  
  
  22. Fault Tolerance
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The system still works when one part fails.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To stop one small failure from crashing the whole system.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Use backup servers, retries, and automatic failover.&lt;/p&gt;




&lt;h2&gt;
  
  
  23. High Availability
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The system is almost always available.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To reduce service interruptions.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Run several copies of the system on different servers or locations.&lt;/p&gt;




&lt;h2&gt;
  
  
  24. CAP Theorem
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; During a network failure, a distributed system must choose between consistency and availability.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To help choose the right design for a distributed system.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Choose the priority based on the system: banking usually favors consistency, while social media often favors availability.&lt;/p&gt;




&lt;h2&gt;
  
  
  25. Consistency Models
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; They define when users can see the latest data.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To balance correct data with system speed and availability.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Choose strong consistency or eventual consistency.&lt;/p&gt;




&lt;h2&gt;
  
  
  26. Replication
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Data is copied to several servers.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To improve reading speed and reduce the risk of data loss.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; One primary server accepts writes, and replica servers keep copies that can serve reads.&lt;/p&gt;




&lt;h2&gt;
  
  
  27. Partitioning
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A large amount of data is divided into smaller parts.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To make data easier to store and process.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Divide data by date, region, customer, or type.&lt;/p&gt;




&lt;h2&gt;
  
  
  28. Sharding
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Data is divided across multiple database servers.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; So one database does not need to store everything.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Divide users by ID, country, or region.&lt;/p&gt;
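
&lt;p&gt;Dividing users by ID is often done by hashing the ID and taking the remainder. The Python sketch below assumes four shards with made-up names; it uses a stable hash so the same user always maps to the same shard.&lt;/p&gt;

```python
import hashlib

# Hypothetical shard names for illustration only.
SHARDS = ["users-db-0", "users-db-1", "users-db-2", "users-db-3"]


def shard_for(user_id):
    """Map a user ID to one shard using a stable hash."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[index]


print(shard_for(42), shard_for("alice"))
```

&lt;p&gt;With this simple modulo scheme, changing the number of shards moves most users to a different shard, which is why some systems prefer consistent hashing.&lt;/p&gt;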




&lt;h2&gt;
  
  
  29. Indexing
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; An index works like the index at the back of a book: it points straight to where the data is stored.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To find data without scanning the entire table.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Add indexes to columns that are often searched or sorted.&lt;/p&gt;




&lt;h2&gt;
  
  
  30. Denormalization
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Some data is copied to make reading faster.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To reduce the number of table joins.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Store the customer name directly inside an order record.&lt;/p&gt;




&lt;h2&gt;
  
  
  31. ACID
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A set of rules that keeps database transactions safe.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To prevent incorrect or incomplete updates.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Use transactions for payments, transfers, and orders.&lt;/p&gt;




&lt;h2&gt;
  
  
  32. BASE
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The system stays available even when data is not immediately the same everywhere.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; It works well for large distributed systems.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Allow servers to synchronize data after a short delay.&lt;/p&gt;




&lt;h2&gt;
  
  
  33. Microservices
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; An application is divided into small independent services.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; Each service can be developed and deployed separately.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Separate user, payment, and order services.&lt;/p&gt;




&lt;h2&gt;
  
  
  34. Monolith
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; All application functions are inside one large application.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; It is easier to build and deploy at the beginning.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Keep the interface, business logic, and data access in one project.&lt;/p&gt;




&lt;h2&gt;
  
  
  35. Event-Driven Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The system reacts when an event happens.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To reduce direct dependency between system components.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; When an order is created, send events to email and inventory services.&lt;/p&gt;




&lt;h2&gt;
  
  
  36. Message Queue
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A queue stores tasks that need to be processed.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; The sender does not need to wait for the receiver.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Put email sending or image processing into a queue.&lt;/p&gt;




&lt;h2&gt;
  
  
  37. Pub/Sub
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; One service publishes an event, and many services receive it.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; One event can trigger several actions.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; A successful payment event can update inventory, email, and reports.&lt;/p&gt;




&lt;h2&gt;
  
  
  38. Synchronous vs Asynchronous
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Synchronous work waits for a result; asynchronous work continues without waiting.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To choose the right processing style for each task.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Use synchronous processing for payments and asynchronous processing for emails.&lt;/p&gt;




&lt;h2&gt;
  
  
  39. Idempotency
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Sending the same request many times still creates only one final result.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To avoid duplicate payments or orders.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Give each payment request a unique ID, and have the server return the saved result when it sees an ID it has already processed.&lt;/p&gt;
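
&lt;p&gt;A minimal Python sketch of the idea, assuming an in-memory dictionary in place of a durable store; the &lt;code&gt;PaymentService&lt;/code&gt; class and its fields are hypothetical.&lt;/p&gt;

```python
class PaymentService:
    """Processes each request ID at most once."""

    def __init__(self):
        self._results = {}  # request ID mapped to the stored result
        self.charges_made = 0

    def charge(self, request_id, amount):
        # A repeated request ID returns the saved result without charging again.
        if request_id in self._results:
            return self._results[request_id]
        self.charges_made += 1
        result = {"status": "charged", "amount": amount}
        self._results[request_id] = result
        return result


service = PaymentService()
first = service.charge("req-123", 50)
retry = service.charge("req-123", 50)
print(first == retry, service.charges_made)
```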




&lt;h2&gt;
  
  
  40. Backpressure
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The receiver asks the sender to slow down.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To prevent the receiver from becoming overloaded.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Limit the queue size or process data in smaller groups.&lt;/p&gt;




&lt;h2&gt;
  
  
  41. Circuit Breaker
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The system temporarily stops calling a failing service.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To avoid sending more requests to a service that is already broken.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Stop calls after several failures and try again later.&lt;/p&gt;
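
&lt;p&gt;The following Python sketch shows one possible shape of a circuit breaker. The threshold, the cooldown, and all names are assumptions for the example, and the current time is passed in explicitly to keep the behaviour easy to trace.&lt;/p&gt;

```python
class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold, cooldown_seconds):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._failures = 0
        self._opened_at = None

    def call(self, func, now):
        if self._opened_at is not None:
            # Whole cooldown periods elapsed since the circuit opened.
            periods = (now - self._opened_at) // self.cooldown_seconds
            if periods == 0:
                raise CircuitOpenError("circuit open, call skipped")
            # Cooldown is over: allow one trial call; one more failure reopens.
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures == self.failure_threshold:
                self._opened_at = now
            raise
        self._failures = 0
        return result


def failing_service():
    raise RuntimeError("service down")


breaker = CircuitBreaker(failure_threshold=3, cooldown_seconds=30)
outcomes = []
for second in range(5):
    try:
        breaker.call(failing_service, now=second)
    except CircuitOpenError:
        outcomes.append("skipped")
    except RuntimeError:
        outcomes.append("failed")
recovered = breaker.call(lambda: "ok", now=40)
print(outcomes, recovered)
```

&lt;p&gt;After three failures the breaker skips calls without touching the service, and once the cooldown has passed it lets a trial call through.&lt;/p&gt;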




&lt;h2&gt;
  
  
  42. Bulkhead
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; System parts are separated from each other.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; A failure in one part does not affect the whole system.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Give each service its own resources and connection pool.&lt;/p&gt;




&lt;h2&gt;
  
  
  43. Retry Logic
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The system automatically tries again after a failure.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To recover from temporary network or service problems.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Retry after 1 second, then 2 seconds, then 4 seconds (exponential backoff), and give up after a fixed number of attempts.&lt;/p&gt;
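
&lt;p&gt;The 1, 2, 4 second schedule can be sketched as a small helper. The function name and parameters below are illustrative, and the &lt;code&gt;sleep&lt;/code&gt; argument is injectable so the example runs without actually waiting.&lt;/p&gt;

```python
import time


def retry_with_backoff(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


delays = []
calls = {"count": 0}


def flaky():
    # Fails twice, then succeeds on the third call.
    calls["count"] += 1
    if calls["count"] != 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = retry_with_backoff(flaky, sleep=delays.append)
print(result, delays)
```

&lt;p&gt;Many implementations also add a small random jitter to each delay so that many clients do not retry at the same moment.&lt;/p&gt;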




&lt;h2&gt;
  
  
  44. Timeout
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The system stops waiting after a set amount of time.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To stop requests from waiting forever.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Set a maximum waiting time for APIs and databases.&lt;/p&gt;




&lt;h2&gt;
  
  
  45. Service Discovery
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Services can find the addresses of other services.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; Service addresses may change when the system scales.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Register service addresses in a central system.&lt;/p&gt;




&lt;h2&gt;
  
  
  46. API Gateway
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A central entry point that sends requests to the correct service.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To manage routing, authentication, and rate limits in one place.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; The client calls the gateway, and the gateway calls the correct service.&lt;/p&gt;




&lt;h2&gt;
  
  
  47. Load Shedding
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The system rejects some requests when it is overloaded.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To keep important services working.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Drop low-priority requests or return a “system busy” message.&lt;/p&gt;




&lt;h2&gt;
  
  
  48. Autoscaling
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The system automatically adds or removes servers.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To handle high traffic and save money during low traffic.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Add servers when CPU usage or traffic passes a limit, and remove them when it drops back.&lt;/p&gt;




&lt;h2&gt;
  
  
  49. Blue-Green Deployment
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The old and new versions run in two separate environments.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To release a new version with little downtime.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Test the new environment and then move traffic to it.&lt;/p&gt;




&lt;h2&gt;
  
  
  50. Canary Release
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A small group of users receives the new version first.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To find problems before releasing to everyone.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Send 5% of traffic to the new version and increase it slowly.&lt;/p&gt;




&lt;h2&gt;
  
  
  51. Feature Flags
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A switch that turns a feature on or off.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To control releases without deploying new code.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Enable a feature for employees or a small user group first.&lt;/p&gt;




&lt;h2&gt;
  
  
  52. Observability
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The ability to understand what is happening inside the system.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To find the cause of problems faster.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Combine logs, metrics, and tracing.&lt;/p&gt;




&lt;h2&gt;
  
  
  53. Logging
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Recording events that happen inside the system.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To investigate errors and understand system activity.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Record time, requests, errors, and processing details.&lt;/p&gt;




&lt;h2&gt;
  
  
  54. Metrics
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Numbers that show the health and performance of the system.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To know whether the system is fast, slow, or overloaded.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Track CPU, memory, requests, error rate, and latency.&lt;/p&gt;




&lt;h2&gt;
  
  
  55. Tracing
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Following one request through multiple services.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To find which service caused a delay or error.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Add a trace ID and record every processing step.&lt;/p&gt;




&lt;h2&gt;
  
  
  56. Correlation ID
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A shared ID used by all logs for the same request.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To find the complete history of one request.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Create the ID when the request starts and pass it through all services.&lt;/p&gt;




&lt;h2&gt;
  
  
  57. Monitoring
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Continuously watching the system.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To find problems before they affect many users.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Use dashboards to watch servers, databases, and APIs.&lt;/p&gt;




&lt;h2&gt;
  
  
  58. Alerting
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Sending a warning when the system has a problem.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To help the operations team respond quickly.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Send an email or message when errors or CPU usage become too high.&lt;/p&gt;




&lt;h2&gt;
  
  
  59. Full-Text Search
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Searching for words or sentences inside text.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To quickly search articles, products, or documents.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Index the content with Elasticsearch or a similar tool.&lt;/p&gt;




&lt;h2&gt;
  
  
  60. Time Series
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Data recorded at different points in time.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To track how something changes over time.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Store CPU usage, stock prices, or temperature readings.&lt;/p&gt;




&lt;h2&gt;
  
  
  61. Vector Database
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A database that finds data with similar meaning.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; It is useful for AI search, images, and similar text.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Convert data into vectors and search for the nearest vectors.&lt;/p&gt;




&lt;h2&gt;
  
  
  62. Materialized View
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A query result that is calculated and saved in advance.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To make complex reports load faster.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Save daily or monthly sales summaries.&lt;/p&gt;




&lt;h2&gt;
  
  
  63. Query Optimization
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Improving a database query so it runs faster.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To reduce latency and resource usage.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Add indexes, read less data, and check the query plan.&lt;/p&gt;




&lt;h2&gt;
  
  
  64. Connection Pooling
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Reusing database connections that are already open.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; Opening a new connection for every request is slow and expensive.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Create a shared pool of database connections.&lt;/p&gt;




&lt;h2&gt;
  
  
  65. Cache Stampede
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; When cached data expires, many requests miss the cache at the same time and all hit the database.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; Preventing a stampede matters because the sudden burst can overload the database.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Lock cache refreshes or use different expiration times.&lt;/p&gt;
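
&lt;p&gt;Both ideas can be sketched together under a few assumptions: an in-process dictionary stands in for the cache, &lt;code&gt;load_from_db&lt;/code&gt; is a hypothetical slow query, and a single global lock is used for brevity (per-key locks would scale better).&lt;/p&gt;

```python
import random
import threading
import time

cache = {}                        # key -> (value, expires_at)
refresh_lock = threading.Lock()   # one global lock, kept simple for the sketch
db_calls = 0


def load_from_db(key):
    """Stand-in for a slow database query (hypothetical)."""
    global db_calls
    db_calls += 1
    time.sleep(0.05)
    return f"value-for-{key}"


def get(key, ttl=60.0, jitter=10.0):
    entry = cache.get(key)
    if entry and entry[1] > time.monotonic():
        return entry[0]
    with refresh_lock:
        # Re-check: another thread may have refreshed while we waited for the lock.
        entry = cache.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]
        value = load_from_db(key)
        # Random jitter spreads expirations so keys do not all expire together.
        cache[key] = (value, time.monotonic() + ttl + random.uniform(0, jitter))
        return value


# Twenty concurrent requests for the same missing key.
threads = [threading.Thread(target=get, args=("product:1",)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

&lt;p&gt;Only the first thread queries the database; the others find the refreshed entry when they re-check inside the lock.&lt;/p&gt;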




&lt;h2&gt;
  
  
  66. Cache Warming
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Loading data into the cache before users request it.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; The first user does not need to wait for the database.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Load popular products into the cache before peak hours.&lt;/p&gt;




&lt;h2&gt;
  
  
  67. CDN Caching
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Saving copies of content on CDN servers.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To help users in different regions load content faster.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Cache images, videos, CSS, and JavaScript files.&lt;/p&gt;




&lt;h2&gt;
  
  
  68. Data Compression
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Making data smaller before storing or sending it.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To save storage space and network bandwidth.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Compress files, images, or API responses with gzip.&lt;/p&gt;




&lt;h2&gt;
  
  
  69. Serialization
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Changing an object into a format that can be stored or sent.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; Systems cannot directly send objects from memory.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Convert objects into JSON, XML, or Protocol Buffers.&lt;/p&gt;




&lt;h2&gt;
  
  
  70. Deserialization
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Changing received data back into an object.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; The application needs objects to continue processing.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Convert a JSON response into an application object.&lt;/p&gt;




&lt;h2&gt;
  
  
  71. WebSockets
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The client and server keep a two-way connection open.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To send real-time updates without repeated requests.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Use it for chat, notifications, and live prices.&lt;/p&gt;




&lt;h2&gt;
  
  
  72. WebRTC
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A technology for real-time audio, video, and data communication.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To support low-latency video calls.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Use it for online meetings, video calls, and screen sharing.&lt;/p&gt;




&lt;h2&gt;
  
  
  73. CQRS
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Reading data and writing data use separate models.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; Each side can be optimized for its own purpose.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Use one model for updates and another for displaying data.&lt;/p&gt;




&lt;h2&gt;
  
  
  74. Event Sourcing
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Every change to the data is stored as an event.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To view history and rebuild an earlier state.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Store events such as order created, paid, and cancelled.&lt;/p&gt;
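
&lt;p&gt;A small sketch of rebuilding state from an event log. The event names follow the order example above; the &lt;code&gt;Event&lt;/code&gt; class and the state fields are illustrative assumptions.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str        # "created", "paid", or "cancelled" (illustrative names)
    amount: int = 0


def apply(state: dict, event: Event) -> dict:
    """Return the next order state; the event log itself is never modified."""
    if event.kind == "created":
        return {"status": "created", "total": event.amount, "paid": 0}
    if event.kind == "paid":
        return {**state, "status": "paid", "paid": state["paid"] + event.amount}
    if event.kind == "cancelled":
        return {**state, "status": "cancelled"}
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events) -> dict:
    """Fold the events, oldest first, into the resulting state."""
    state = {}
    for event in events:
        state = apply(state, event)
    return state


log = [Event("created", 100), Event("paid", 100), Event("cancelled")]
current = replay(log)        # state after all events
earlier = replay(log[:2])    # state as it was before the cancellation
```

&lt;p&gt;Replaying a prefix of the log is what makes it possible to inspect an earlier state.&lt;/p&gt;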




&lt;h2&gt;
  
  
  75. Service Mesh
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A layer that manages communication between microservices.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To manage security, routing, and monitoring in one place.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Place a proxy beside each service to manage traffic.&lt;/p&gt;




&lt;h2&gt;
  
  
  76. Sidecar
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A supporting component that runs beside the main application.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To add functions without changing much application code.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Use a sidecar for proxying, logging, or security.&lt;/p&gt;




&lt;h2&gt;
  
  
  77. BFF – Backend for Frontend
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Each type of frontend has its own backend.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; Web and mobile applications often need different data.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Create one backend for the web and another for mobile.&lt;/p&gt;




&lt;h2&gt;
  
  
  78. Strangler Pattern
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Replacing an old system one part at a time.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To reduce the risk of rewriting the entire system.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Move each function from the old system to the new system gradually.&lt;/p&gt;




&lt;h2&gt;
  
  
  79. LSM Trees
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A data structure designed for fast writing.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; It works well for systems that write a lot of data.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Write data to memory first and organize it on disk later.&lt;/p&gt;




&lt;h2&gt;
  
  
  80. B-Trees
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A tree structure that helps find data quickly on disk.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; It is useful for exact searches and range searches.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Databases often use B-Trees for indexes.&lt;/p&gt;




&lt;h2&gt;
  
  
  81. Merkle Trees
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A tree of hashes that represents data.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To compare and check data without reading everything.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Compare hashes between nodes to find different data.&lt;/p&gt;
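
&lt;p&gt;A minimal sketch of computing a Merkle root with SHA-256. Duplicating the last hash on odd-sized levels is one common convention, assumed here; real systems differ in how they handle it.&lt;/p&gt;

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(blocks: list) -> bytes:
    """Hash each block, then hash pairs upward until one root hash remains."""
    if not blocks:
        return _h(b"")
    level = [_h(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


replica_a = [b"row1", b"row2", b"row3", b"row4"]
replica_b = [b"row1", b"row2", b"rowX", b"row4"]
same = merkle_root(replica_a) == merkle_root(list(replica_a))
differs = merkle_root(replica_a) != merkle_root(replica_b)
```

&lt;p&gt;Two nodes only need to exchange the root to learn whether their data matches; when it does not, comparing child hashes narrows down which block differs.&lt;/p&gt;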




&lt;h2&gt;
  
  
  82. Bloom Filter
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A fast structure that checks whether data may exist.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To avoid database queries when data definitely does not exist.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Check the Bloom Filter before reading the real database.&lt;/p&gt;
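
&lt;p&gt;A minimal sketch of a Bloom filter, assuming a fixed bit array and several positions derived from salted SHA-256 hashes. The sizes are arbitrary; production filters choose them from the expected item count and the acceptable false-positive rate.&lt;/p&gt;

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, item: str):
        # Derive several bit positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


seen = BloomFilter()
seen.add("user:42")
```

&lt;p&gt;A &lt;code&gt;False&lt;/code&gt; result lets the application skip the database query; a &lt;code&gt;True&lt;/code&gt; result still needs the real lookup, because false positives are possible.&lt;/p&gt;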




&lt;h2&gt;
  
  
  83. HyperLogLog
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; An algorithm that estimates the number of unique items.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To count approximately while using very little memory.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Estimate unique users or IP addresses.&lt;/p&gt;




&lt;h2&gt;
  
  
  84. MapReduce
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Data is divided, processed, and then combined.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To process very large datasets across many machines.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Map processes the data, and Reduce combines the results.&lt;/p&gt;
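
&lt;p&gt;The classic word-count example, sketched on a single machine. In a real framework the splits would be processed on different machines; the function names here are illustrative.&lt;/p&gt;

```python
from collections import defaultdict


def map_phase(document: str):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


splits = ["the cat sat", "the dog sat", "the end"]
mapped = [pair for split in splits for pair in map_phase(split)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```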




&lt;h2&gt;
  
  
  85. Batch Processing
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Data is collected and processed in groups.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; It works well when results are not needed immediately.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Generate daily reports or calculate salaries at the end of the month.&lt;/p&gt;




&lt;h2&gt;
  
  
  86. Stream Processing
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Data is processed as soon as it arrives.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To produce near real-time results.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Process transactions, sensor data, or user events continuously.&lt;/p&gt;




&lt;h2&gt;
  
  
  87. ETL
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Extract data, transform it, and load it into another system.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To clean and standardize data from different sources.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Extract sales data, clean it, and load it into a reporting system.&lt;/p&gt;




&lt;h2&gt;
  
  
  88. Data Pipeline
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; An automatic flow that moves and processes data.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To reduce manual work and keep data updated.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Connect data sources, processing steps, and storage systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  89. Data Lake
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A storage system for many types of raw data.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To store data now and decide how to use it later.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Store logs, files, images, videos, and sensor data.&lt;/p&gt;




&lt;h2&gt;
  
  
  90. Data Warehouse
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A storage system for cleaned and organized reporting data.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To make business analysis faster and more consistent.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Combine sales, customer, and financial data.&lt;/p&gt;




&lt;h2&gt;
  
  
  91. Secrets Management
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Secure management of passwords, tokens, and API keys.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To stop secret information from appearing in source code.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Store secrets in a secure tool and control access.&lt;/p&gt;




&lt;h2&gt;
  
  
  92. RBAC
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Permissions are given based on user roles.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To manage access more easily when there are many users.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Create roles such as Admin, Manager, and User.&lt;/p&gt;




&lt;h2&gt;
  
  
  93. SSO
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Users sign in once and access multiple systems.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; Users do not need to remember many passwords.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Connect applications to one central login system.&lt;/p&gt;




&lt;h2&gt;
  
  
  94. Encryption
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Data is changed into a form that cannot be read without the correct key.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To protect data when it is stored or sent over a network.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Use HTTPS for network traffic, and encrypt databases, files, and other sensitive information at rest.&lt;/p&gt;




&lt;h2&gt;
  
  
  95. Checksum
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; A value used to check whether data has changed.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To detect damaged or modified files.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Calculate the checksum before and after transfer, then compare the values.&lt;/p&gt;
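
&lt;p&gt;A short sketch of the before-and-after comparison using SHA-256 from the standard library. The payloads are made up, and the second one simulates corruption in transit.&lt;/p&gt;

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


original = b"quarterly-report-v1"
checksum_before = sha256_hex(original)

received_ok = b"quarterly-report-v1"
received_bad = b"quarterly-report-v2"   # simulated corruption in transit

intact = sha256_hex(received_ok) == checksum_before
corrupted = sha256_hex(received_bad) != checksum_before
```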




&lt;h2&gt;
  
  
  96. Erasure Coding
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Data is split into fragments, and extra parity fragments are added so lost fragments can be rebuilt.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; Data can still be restored even when some pieces are lost.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Store the pieces on different disks or servers.&lt;/p&gt;




&lt;h2&gt;
  
  
  97. Consensus
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; Multiple servers agree on the same state or result.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To prevent servers from storing different results.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Servers vote and accept the result supported by the majority.&lt;/p&gt;




&lt;h2&gt;
  
  
  98. Leader Election
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt; The servers choose one server to coordinate the others.&lt;br&gt;
&lt;strong&gt;Why use it:&lt;/strong&gt; To prevent several servers from doing the same important task.&lt;br&gt;
&lt;strong&gt;How to use it:&lt;/strong&gt; Servers vote, and when the leader fails, they choose a new leader.&lt;/p&gt;

</description>
      <category>designsystem</category>
      <category>devops</category>
    </item>
    <item>
      <title>Unit Prices Are Falling, So Why Are the Bills Going Up? Tokenomics for AI Platform Owners</title>
      <dc:creator>Kento IKEDA</dc:creator>
      <pubDate>Thu, 25 Jun 2026 21:28:23 +0000</pubDate>
      <link>https://dev.clauneck.workers.dev/aws-builders/unit-prices-are-falling-so-why-are-the-bills-going-up-tokenomics-for-ai-platform-owners-2cfl</link>
      <guid>https://dev.clauneck.workers.dev/aws-builders/unit-prices-are-falling-so-why-are-the-bills-going-up-tokenomics-for-ai-platform-owners-2cfl</guid>
      <description>&lt;p&gt;"Model unit prices keep falling, yet our monthly AI bill keeps climbing." If you use AI personally, you can feel the creep of your subscription and metered charges. If you own AI usage inside a company, the gap is even more pronounced.&lt;/p&gt;

&lt;p&gt;Overseas, this feeling has started getting a name: &lt;strong&gt;Tokenomics&lt;/strong&gt;. On June 3, 2026, the Linux Foundation announced its intent to launch the &lt;strong&gt;Tokenomics Foundation&lt;/strong&gt;, dedicated to open standards for AI cost management. Google, Microsoft, Oracle, JPMorganChase, and others — both providers and large buyers — are on board.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.linuxfoundation.org/press/linux-foundation-announces-the-intent-to-launch-the-tokenomics-foundation-to-establish-open-standards-for-ai-cost-management" rel="noopener noreferrer"&gt;https://www.linuxfoundation.org/press/linux-foundation-announces-the-intent-to-launch-the-tokenomics-foundation-to-establish-open-standards-for-ai-cost-management&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This post isn't an explainer of the word itself. It's an account of what changes for the people who own internal generative AI usage — the platform owners, the FinOps practitioners, the engineering leaders watching the bills — once you have this word in your vocabulary.&lt;/p&gt;

&lt;p&gt;What Tokenomics gives you isn't another saving technique. It changes &lt;strong&gt;the unit of measurement and the lens&lt;/strong&gt; through which you read AI cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Tokenomics, why now
&lt;/h2&gt;

&lt;p&gt;Tokenomics sits in the lineage of cloud FinOps. The FinOps Foundation now classifies Tokenomics as the &lt;strong&gt;"AI Value"&lt;/strong&gt; dimension within &lt;strong&gt;FinOps for AI&lt;/strong&gt;. Where cloud FinOps tracked the variable infrastructure costs (compute, storage, networking) against value, Tokenomics tracks the variable cost of intelligence itself. It's not a replacement; it adds a probabilistic, non-deterministic layer of variable cost on top.&lt;/p&gt;

&lt;p&gt;Tokens here means what you see on every API price sheet and usage dashboard — the smallest unit a language model reads and writes, the unit of compute. The word "tokenomics" also exists in the crypto world, but that one is about issuance, distribution, and incentives on a blockchain — tokens as units of ownership. Same word, different economies.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.finops.org/insights/token-economics-the-atomic-unit-of-ai-value/" rel="noopener noreferrer"&gt;https://www.finops.org/insights/token-economics-the-atomic-unit-of-ai-value/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The term gained traction from spring 2026 onward. Generative AI and agents moved from pilots to production, and tokens became the largest and fastest-growing line item in many technical budgets. Per-token prices fell, but usage volume rose even faster, and bills became harder to read. The Foundation launch is the industry's response: a venue to align on a common yardstick for tokens, the way cloud costs were once aligned.&lt;/p&gt;

&lt;p&gt;As a follow-on, the annual FinOps X conference will be renamed &lt;strong&gt;Tokenomicon&lt;/strong&gt; starting in 2027. The word is settling into its own institutional shape.&lt;/p&gt;

&lt;p&gt;From here, four shifts in how a platform owner sees AI cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  Shift 1: Budget on the trajectory of consumption, not on the unit price
&lt;/h2&gt;

&lt;p&gt;The first thing to change is where you anchor your budget. Stop drawing comfort from "unit prices keep dropping" and start watching &lt;strong&gt;the trajectory of total consumption&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Per-million-token prices for general-purpose models fell sharply from 2023 to 2025. Recently they've plateaued, while the top-tier and reasoning models have actually gone up. Yet enterprise spending keeps growing. The reason is demand elasticity: when prices drop, organizations widen modalities (text → images → video), increase agent autonomy, and lengthen reasoning chains. The volume grows faster than the price falls.&lt;/p&gt;

&lt;p&gt;The scale shows in numbers companies publish openly. At Google I/O 2026, Google announced monthly processing of &lt;strong&gt;32 quadrillion tokens&lt;/strong&gt; across its AI products, roughly &lt;strong&gt;7x the 4.8 quadrillion&lt;/strong&gt; of the previous year. AT&amp;amp;T reported scaling its internal "Ask AT&amp;amp;T" GenAI platform from about &lt;strong&gt;8 billion tokens/day&lt;/strong&gt; to about &lt;strong&gt;27 billion tokens/day&lt;/strong&gt; after restructuring orchestration into a multi-agent setup — &lt;strong&gt;3x the volume at about 90% lower cost&lt;/strong&gt;. The IEA noted that AI-related data center electricity demand grew about &lt;strong&gt;50% in 2025 alone&lt;/strong&gt; (against overall electricity demand growth of about 3%), and attributed the gap to a surge in AI usage (roughly 3x monthly active users and 5x revenue at major model providers).&lt;/p&gt;

&lt;p&gt;What matters: &lt;strong&gt;consumption is not linear in user-visible activity&lt;/strong&gt;. A single query that triggers a RAG pipeline, hits a reasoning model, and makes several tool calls can consume tens to hundreds of times the tokens of a direct prompt to a small model. Agent-to-agent communication is itself a cost. The research community has started calling this overhead &lt;strong&gt;"communication tax"&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://openreview.net/forum?id=0iLbiYYIpC" rel="noopener noreferrer"&gt;https://openreview.net/forum?id=0iLbiYYIpC&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Breaking down where consumption accumulates, one request typically stacks up across five elements:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fbkzy9q8dpdqvm2o0x3ld.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fbkzy9q8dpdqvm2o0x3ld.png" alt=" " width="800" height="589"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These multiply rather than add, which is why the total is unreadable from surface-level activity.&lt;/p&gt;

&lt;p&gt;For a platform owner, the action is clear: stop projecting budgets from last quarter's actuals and price trendlines. Assume that any expansion of use case will spike consumption, and put &lt;strong&gt;the trajectory itself&lt;/strong&gt; on the dashboard. Unit price is no longer the subject of the budget conversation. Total consumption is.&lt;/p&gt;

&lt;h2&gt;
  
  
  Shift 2: Treat tokens as an invisible cost category
&lt;/h2&gt;

&lt;p&gt;The next shift is to &lt;strong&gt;see tokens as a hidden cost category&lt;/strong&gt; and start watching it deliberately.&lt;/p&gt;

&lt;p&gt;Cloud instances can be resized. Storage can be audited. Tokens lack that tactile feedback. They flow quietly through every agent loop, every retrieval call, every reasoning step, and pile up as a cost no one budgeted. This is the property the Tokenomics discussion keeps pointing at.&lt;/p&gt;

&lt;p&gt;What amplifies the invisibility is &lt;strong&gt;metered billing hidden inside SaaS subscriptions&lt;/strong&gt;. What looks like a flat monthly subscription to a developer tool or business app is, in reality, a token meter waiting to spin up. Roll out AI tools, and you can get bills the seat count can't explain. The examples are not hypothetical:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cursor&lt;/strong&gt; moved to usage-based pricing in June 2025. With long-context agent usage, effective spend ballooned by orders of magnitude for some users. On July 4, the CEO had to issue a public apology and offer refunds.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://cursor.com/blog/june-2025-pricing" rel="noopener noreferrer"&gt;https://cursor.com/blog/june-2025-pricing&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kiro&lt;/strong&gt; launched with a pricing model that charged spec and vibe requests at a 5:1 ratio and immediately drew criticism. The company also officially acknowledged a bug that caused requests to be over-consumed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://kiro.dev/blog/important-pricing-updates/" rel="noopener noreferrer"&gt;https://kiro.dev/blog/important-pricing-updates/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The common pattern: &lt;strong&gt;subscription prices no longer signal your budget&lt;/strong&gt;. The seat fee is a floor. What you actually pay is determined by usage, not seat count.&lt;/p&gt;

&lt;p&gt;What a platform owner should do first is finish visibility &lt;strong&gt;before&lt;/strong&gt; reaching for optimization techniques. Build a state where you can break down — by model, by product, by team, by environment — who is consuming how much. Surface the tokens hiding inside SaaS, too. Without that foundation, the optimization conversation has nothing to stand on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Shift 3: Solve reduction by design, not by discipline
&lt;/h2&gt;

&lt;p&gt;The third shift is in how you think about cost reduction. &lt;strong&gt;Reducing tokens isn't a matter of restraint; it's a design problem.&lt;/strong&gt; And the levers from the supply side have arrived.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Model routing.&lt;/strong&gt; Instead of sending every query to the top-tier model, route each query to the cheapest model that can still answer it. FrugalGPT, an academic approach, tries smaller models first and escalates only when needed, reporting up to &lt;strong&gt;98% cost reduction vs GPT-4&lt;/strong&gt;. RouteLLM (UC Berkeley) reports up to &lt;strong&gt;85% cost reduction while preserving conversational quality&lt;/strong&gt;. Amazon Bedrock offers this as a managed service (intelligent prompt routing) with &lt;strong&gt;up to 30% reduction&lt;/strong&gt; officially advertised. Routing is no longer research-only; it is now available as a managed service as well.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://arxiv.org/abs/2305.05176" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2305.05176&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://arxiv.org/abs/2406.18665" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2406.18665&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/bedrock/intelligent-prompt-routing/" rel="noopener noreferrer"&gt;https://aws.amazon.com/bedrock/intelligent-prompt-routing/&lt;/a&gt;&lt;/p&gt;
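
&lt;p&gt;To make the "try the cheap model first, escalate only when needed" idea concrete, here is a toy cascade. It is a sketch in the spirit of the approach, not the FrugalGPT, RouteLLM, or Bedrock implementation: both models are hypothetical stand-in functions, the confidence scores are invented, and the costs are made-up relative units.&lt;/p&gt;

```python
from typing import Callable, List, Tuple

# (answer, confidence in [0, 1]); both models below are hypothetical stand-ins.
Model = Callable[[str], Tuple[str, float]]


def small_model(prompt: str) -> Tuple[str, float]:
    # Pretend the cheap model is only confident on short prompts.
    confidence = 0.4 if len(prompt.split()) > 5 else 0.9
    return "small-model answer", confidence


def large_model(prompt: str) -> Tuple[str, float]:
    return "large-model answer", 0.95


def cascade(prompt: str, tiers: List[Tuple[str, Model, float]], threshold: float = 0.8):
    """Try tiers from cheapest to most expensive; stop at the first confident answer."""
    if not tiers:
        raise ValueError("at least one model tier is required")
    spent = 0.0
    for name, model, cost in tiers:
        answer, confidence = model(prompt)
        spent += cost
        if confidence >= threshold:
            break
    # If no tier was confident, the last (strongest) tier's answer is returned.
    return name, answer, spent


# Costs are made-up relative units, not real prices.
TIERS = [("small", small_model, 1.0), ("large", large_model, 20.0)]
easy = cascade("What is 2 + 2?", TIERS)
hard = cascade("Compare these three contracts and list every conflicting clause", TIERS)
```

&lt;p&gt;The easy prompt is answered by the small tier alone. The hard prompt pays for both tiers, which is why the quality of the confidence signal decides how much a cascade actually saves.&lt;/p&gt;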

&lt;p&gt;&lt;strong&gt;2. Tool calls as code.&lt;/strong&gt; Hand an agent a list of tool definitions and the definitions ride in the context every turn. Cloudflare's "Code Mode" has the agent write code that calls the tools instead. They report compressing the tool definitions of an MCP server exposing 2,500 APIs from about &lt;strong&gt;1.17M tokens to about 1,000 tokens — 99.9% compression&lt;/strong&gt;. Anthropic independently presented the same pattern as "Code Execution with MCP." This isn't a vendor-specific quirk anymore.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.cloudflare.com/code-mode-mcp/" rel="noopener noreferrer"&gt;https://blog.cloudflare.com/code-mode-mcp/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.anthropic.com/engineering/code-execution-with-mcp" rel="noopener noreferrer"&gt;https://www.anthropic.com/engineering/code-execution-with-mcp&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Context compression.&lt;/strong&gt; In a RAG pipeline, only a small fraction of the retrieved text contributes to the answer; the rest is noise that wastes tokens. If you prune it, you cut the tokens the LLM sees. Zilliz, a vector database vendor, reports &lt;strong&gt;70–80% token reduction&lt;/strong&gt; by sentence-level relevance filtering that drops weakly related sentences.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://milvus.io/blog/semantic-highlighting-model-for-rag-context-pruning-and-token-saving.md" rel="noopener noreferrer"&gt;https://milvus.io/blog/semantic-highlighting-model-for-rag-context-pruning-and-token-saving.md&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Data format choice.&lt;/strong&gt; The serialization format you hand the LLM directly affects token volume. Microsoft's Data Science engineering blog shows that &lt;strong&gt;function-calling-based structured output is more token-efficient than free-form JSON&lt;/strong&gt; for the same result. For tabular data, CSV/TSV or newer LLM-oriented formats like TOON can use &lt;strong&gt;30–60% fewer tokens&lt;/strong&gt; than JSON. Data format is a functional decision and a cost decision at the same time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://medium.com/data-science-at-microsoft/token-efficiency-with-structured-output-from-language-models-be2e51d3d9d5" rel="noopener noreferrer"&gt;https://medium.com/data-science-at-microsoft/token-efficiency-with-structured-output-from-language-models-be2e51d3d9d5&lt;/a&gt;&lt;/p&gt;
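
&lt;p&gt;A rough way to see why format matters for tabular data is to serialize the same rows as JSON and as CSV and compare their sizes. The rows are invented, and character count is only a crude proxy for tokens; measuring real savings requires the tokenizer of the model you actually use.&lt;/p&gt;

```python
import csv
import io
import json

rows = [
    {"id": 1, "name": "Alice", "plan": "pro"},
    {"id": 2, "name": "Bob", "plan": "free"},
    {"id": 3, "name": "Carol", "plan": "pro"},
]

as_json = json.dumps(rows)

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "name", "plan"], lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
as_csv = buffer.getvalue()

# Character counts are only a rough proxy for tokens.
saving = 1 - len(as_csv) / len(as_json)
```

&lt;p&gt;JSON repeats every key on every row, while CSV states the column names once in the header, so the gap widens as the table grows.&lt;/p&gt;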

&lt;p&gt;Lining these up by reported savings and ease of adoption (difficulty is a rough indicator):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Lever&lt;/th&gt;
&lt;th&gt;Reported reduction&lt;/th&gt;
&lt;th&gt;Adoption difficulty&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Data format choice&lt;/td&gt;
&lt;td&gt;30–60% vs JSON&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Model routing&lt;/td&gt;
&lt;td&gt;up to 98% (FrugalGPT), 85% (RouteLLM)&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Context compression&lt;/td&gt;
&lt;td&gt;70–80%&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tool calls as code (Code Mode)&lt;/td&gt;
&lt;td&gt;~99.9% on MCP definitions&lt;/td&gt;
&lt;td&gt;Medium–High&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For a platform owner, the takeaway is the recognition that &lt;strong&gt;savings opportunities live in design, not in operations&lt;/strong&gt;. Most of these can be set as organizational policy — pick a default output format, install routing, decide how tools are exposed. Not "try harder" at the team level, but "decide the standard" at the platform level. Of the four, choosing a default output format is probably the lowest-friction starting point.&lt;/p&gt;

&lt;h2&gt;
  
  
  Shift 4: Measure by outcome, not by volume
&lt;/h2&gt;

&lt;p&gt;The last shift is in what you measure. Move from raw consumption to &lt;strong&gt;cost per outcome&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Counting tokens as if they were uniform misses something real. Tokens spent on a retry after a low-quality answer cost the same as tokens in a usable first-shot response, but they deliver different value. Tokens an agent burns going in circles show up on the bill but don't translate into outcomes. LLM inference research has a name for this: &lt;strong&gt;goodput&lt;/strong&gt;, the throughput that meets your SLOs (latency, quality targets). Benchmarks like SemiAnalysis's InferenceX have adopted this view. What an enterprise actually buys isn't raw token volume but the usable-output portion of it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://bentoml.com/llm/inference-optimization/llm-inference-metrics" rel="noopener noreferrer"&gt;https://bentoml.com/llm/inference-optimization/llm-inference-metrics&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://inferencex.semianalysis.com/" rel="noopener noreferrer"&gt;https://inferencex.semianalysis.com/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When you only chase volume, cost judgment goes off. What you should be watching is &lt;strong&gt;the fraction of tokens that yielded usable results&lt;/strong&gt; (the yield after retries and quality misses) and &lt;strong&gt;cost per inference / per workflow / per outcome&lt;/strong&gt;.&lt;/p&gt;
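
&lt;p&gt;Both indicators can be computed from a simple log of attempts. This is a sketch with made-up numbers: the &lt;code&gt;Attempt&lt;/code&gt; record, the &lt;code&gt;usable&lt;/code&gt; flag (which would come from your own quality evaluation), and the price are all assumptions.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    tokens: int
    usable: bool   # did this attempt meet the quality bar?


def summarize(attempts, price_per_million_tokens: float) -> dict:
    total_tokens = sum(a.tokens for a in attempts)
    useful_tokens = sum(a.tokens for a in attempts if a.usable)
    outcomes = sum(1 for a in attempts if a.usable)
    cost = total_tokens / 1_000_000 * price_per_million_tokens
    return {
        # Share of all tokens that ended up in usable results.
        "token_yield": useful_tokens / total_tokens if total_tokens else 0.0,
        # Total spend, including wasted retries, divided by usable results.
        "cost_per_outcome": cost / outcomes if outcomes else float("inf"),
    }


# Two usable answers and one discarded retry. All numbers are made up.
attempts = [Attempt(4_000, True), Attempt(6_000, False), Attempt(10_000, True)]
report = summarize(attempts, price_per_million_tokens=5.0)
```

&lt;p&gt;Here 70% of the tokens produced usable output, and the discarded retry is still charged to the two outcomes that succeeded.&lt;/p&gt;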

&lt;p&gt;What matters most for a platform owner is keeping the balance between volume and value. Using 10x the tokens for 100x the value is economically right. Cutting tokens to a tenth and getting unusable output is not a saving. Conversely, &lt;strong&gt;token spend that doesn't translate into value is plain waste&lt;/strong&gt;: verbose system prompts, oversized contexts, overuse of expensive models, tool design that ships full documents when MCP could extract only what's needed. There's also an organizational failure mode — &lt;strong&gt;using token usage itself as a performance metric encourages meaningless AI use just to game the number&lt;/strong&gt;, as several reports have documented. Cost-per-outcome as the indicator prevents both directions of failure: the cost-cutting order that kills quality, and the value-disconnected consumption that gets ignored.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the four shifts share
&lt;/h2&gt;

&lt;p&gt;The four shifts look distinct, but they collapse into two underlying moves.&lt;/p&gt;

&lt;p&gt;The first is &lt;strong&gt;changing what unit you look at&lt;/strong&gt;. From unit price to consumption trajectory (Shift 1). From token volume to cost per outcome (Shift 4). Both reset the meter.&lt;/p&gt;

&lt;p&gt;The second is &lt;strong&gt;making it visible, then putting your hands on it&lt;/strong&gt;. Token spend hides inside SaaS and variable cost, so visibility is the prerequisite (Shift 2). Once visible, design levers — not team effort — drive the reduction (Shift 3).&lt;/p&gt;

&lt;p&gt;Changing how you measure without acting changes nothing. Acting without changing how you measure tends to overshoot, killing quality in the name of savings. Each half alone falls short. When both arrive, AI cost shifts from something you watch by intuition to something you &lt;strong&gt;operate on evidence&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pushback worth pre-empting
&lt;/h2&gt;

&lt;p&gt;Four objections are worth addressing up front.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Isn't this just FinOps for AI?&lt;/strong&gt; Largely yes. The FinOps Foundation itself positions Tokenomics within FinOps for AI, specifically in the "AI Value" topic. Tokenomics is not a new methodology; it's a slice of FinOps for AI that has been given its own name. That said, having a proper name and an institutional home does something on its own. It doesn't mean cross-team discussion and cross-vendor comparison suddenly work: internal vocabulary takes time to spread, and shared data formats need adoption. But laying the foundation for a shared language is itself worth tracking. Think of it less as a new technique and more as &lt;strong&gt;infrastructure for agreement starting to form&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.finops.org/topic/ai-value/" rel="noopener noreferrer"&gt;https://www.finops.org/topic/ai-value/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Doesn't Tokenomics narrow vision down to just tokens?&lt;/strong&gt; A real concern. Tokens are the most measurable layer of AI cost. Beneath that sit SaaS-embedded variable costs and operational/governance costs. If you self-host models, you also carry GPU/compute/storage, data transfer, and training costs underneath.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fdiddy2skr9bdynqcotmh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fdiddy2skr9bdynqcotmh.png" alt=" " width="799" height="147"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Tokens get the spotlight because they're growing fastest, are hardest to see, and have the most developed vocabulary. They are a reasonable starting point, not the whole story. That distinction is worth holding on to.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We don't use that many tokens.&lt;/strong&gt; Possibly true. Possibly just invisible. The SaaS-embedded portion shows up as a flat monthly fee or a rolled-up invoice, not as itemized token usage. You can only tell "don't use" from "don't see" once usage is visible. Building visibility while scale is small beats chasing it after the bill explodes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unit prices keep falling — why not just wait?&lt;/strong&gt; Falling prices apply mostly to general-purpose models. Top-tier and reasoning models are a different story. Industry estimates consistently put agent-style workloads at &lt;strong&gt;5–30x the token consumption of the same task in chat form&lt;/strong&gt;. The lower-tier price drops get swallowed by the upper-tier consumption growth. Waiting works less well as your usage shifts toward the upper tiers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.bigeye.com/blog/how-to-track-ai-agent-costs-and-token-usage" rel="noopener noreferrer"&gt;https://www.bigeye.com/blog/how-to-track-ai-agent-costs-and-token-usage&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://arxiv.org/abs/2604.22750" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2604.22750&lt;/a&gt;&lt;/p&gt;
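
&lt;p&gt;The arithmetic behind that last point can be sketched in a few lines. The 5x and 30x multipliers are the range quoted above; the 50% price drop and the function names are assumptions made purely for illustration.&lt;/p&gt;

```python
def relative_spend(price_factor, consumption_factor):
    """Spend relative to today: unit price change times token volume change."""
    return price_factor * consumption_factor


def break_even_price_factor(consumption_factor):
    """Unit price factor at which spend stays flat for a given volume increase."""
    return 1 / consumption_factor


# Assumed scenario: the unit price halves while a chat task becomes an agent workflow.
for multiplier in (5, 30):
    print(multiplier, relative_spend(0.5, multiplier), break_even_price_factor(multiplier))
```

&lt;p&gt;Under these assumptions, halving the unit price still leaves spend at 2.5x to 15x today's level. At a 5x volume increase, the unit price would have to fall to a fifth just to break even.&lt;/p&gt;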

&lt;h2&gt;
  
  
  Where to start
&lt;/h2&gt;

&lt;p&gt;No universal recipe. The first step varies with maturity and with which layer (direct API use, SaaS-embedded, self-hosting) your AI usage sits on. Still, a common order exists.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start with visibility.&lt;/strong&gt; Before optimization techniques, build the state where you can break down — by user, model, product, environment — who's consuming how much. Without this, every later judgment is a guess. The tagging exercise itself raises questions worth surfacing: prod vs. staging splits, product and team boundaries, cost allocation logic that everyone can stomach. The setup work doubles as an on-ramp for FinOps awareness inside the organization.&lt;/p&gt;
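
&lt;p&gt;What that breakdown looks like in practice depends on your tooling, but the core operation is a grouped sum over tagged usage records. The sketch below assumes a hypothetical record shape (&lt;code&gt;user&lt;/code&gt;, &lt;code&gt;model&lt;/code&gt;, &lt;code&gt;product&lt;/code&gt;, &lt;code&gt;env&lt;/code&gt;, &lt;code&gt;tokens&lt;/code&gt;) and made-up values; real records would come from gateway logs or billing exports.&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical usage records; field names and values are illustrative only.
records = [
    {"user": "alice", "model": "large", "product": "search", "env": "prod", "tokens": 4200},
    {"user": "bob", "model": "small", "product": "search", "env": "staging", "tokens": 900},
    {"user": "alice", "model": "large", "product": "support", "env": "prod", "tokens": 3100},
    {"user": "carol", "model": "small", "product": "support", "env": "prod", "tokens": 1800},
]


def tokens_by(records, *dimensions):
    """Sum token usage grouped by one or more tag dimensions."""
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key] += record["tokens"]
    return dict(totals)


print(tokens_by(records, "model"))
print(tokens_by(records, "product", "env"))
```

&lt;p&gt;A record that is missing a tag raises a &lt;code&gt;KeyError&lt;/code&gt; here, which is deliberate: untagged usage is exactly what the visibility step is meant to surface.&lt;/p&gt;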

&lt;p&gt;&lt;strong&gt;Next, audit billing models.&lt;/strong&gt; For each AI-bearing SaaS and API in use, lay out the floor (the recurring portion) and the variable behavior. Once you suspend the "subscription = fixed cost" assumption, the location of budget risk looks different. Provider-side moves matter too — for example, Anthropic's April 2026 pricing structure change. Decisions about extending the recurring footprint and managing variable-cost blow-up become separate agenda items.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Then set design levers as policy.&lt;/strong&gt; The default output format, routing, how tools are exposed. Don't leave it to the field; pick the standard from the platform. As Shift 3 noted, the default output format is the lightest place to start exercising platform authority.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Finally, push the metric from volume toward outcome.&lt;/strong&gt; Watching cost per outcome and token yield keeps the cost-cutting order from killing quality. It also blocks the gaming pattern where token usage as a KPI breeds meaningless AI use, as Shift 4 noted. The metric step comes last, but how you align it determines how well the previous three actually deliver.&lt;/p&gt;

&lt;p&gt;Tokenomics isn't a new saving trick. It's a guiding line for reading AI cost &lt;strong&gt;as an economy&lt;/strong&gt;, that is, as the relationship between volume and value. With the term settling into shared use overseas, adopting the lens early, while you own AI inside your organization, is itself the first step.&lt;/p&gt;

&lt;p&gt;Not getting hooked on per-token price moves, but reading the relationship between volume and value — that's the kind of attention platform owners will be asked for going forward.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>finops</category>
      <category>agents</category>
      <category>mcp</category>
    </item>
    <item>
      <title>You Don't Need an AWS Account to Learn AWS</title>
      <dc:creator>Sarvar Nadaf</dc:creator>
      <pubDate>Thu, 25 Jun 2026 20:26:13 +0000</pubDate>
      <link>https://dev.clauneck.workers.dev/aws-builders/you-dont-need-an-aws-account-to-learn-aws-4e6k</link>
      <guid>https://dev.clauneck.workers.dev/aws-builders/you-dont-need-an-aws-account-to-learn-aws-4e6k</guid>
      <description>&lt;p&gt;👋 Hey there, Tech Enthusiasts!&lt;/p&gt;

&lt;p&gt;I'm Sarvar, a Cloud Architect who loves turning complex tech problems into simple solutions. I've worked with AWS, Azure, DevOps, Data, Analytics, Generative-AI and Agentic-AI building real systems for real companies. In this article series, I'll share what I've learned in a way that's easy to follow, whether you're experienced or just getting started.&lt;/p&gt;

&lt;p&gt;Let's get into it! 🚀&lt;/p&gt;




&lt;p&gt;I once woke up to a $14 AWS bill because I forgot to stop an EC2 instance overnight. For a student, that felt like $1400. That one mistake made me afraid to touch anything in AWS for the next three weeks.&lt;/p&gt;

&lt;p&gt;I didn't come from a coding background. I didn't have anyone to guide me. I watched videos, read documentation, tried things on the AWS console, and every single time I was terrified of one thing: the bill. After that incident, I stopped experimenting freely. I played it safe. And playing it safe is the worst thing you can do when you're trying to learn cloud.&lt;/p&gt;

&lt;p&gt;That fear slowed me down by months.&lt;/p&gt;

&lt;p&gt;In my first real project, I deployed a Lambda function that wrote to DynamoDB. During development, I never tested what happens when DynamoDB throttles writes because I was too afraid to generate real traffic on AWS. In production, messages started disappearing silently. I spent two days debugging something I could have caught in five minutes if I had a safe local environment to test against.&lt;/p&gt;

&lt;p&gt;Today, after years of working as a Cloud Architect across multiple companies and countries, I can tell you this with full confidence: the single most important thing that separates a good cloud engineer from a mediocre one is hands-on practice. Not certifications. Not watching 40 hours of video. Hands-on. Breaking things. Fixing things. Understanding why something failed and how to bring it back.&lt;/p&gt;

&lt;p&gt;A few months ago I found a tool that solves all of this. It's called Floci, an open-source local AWS emulator. You run it in Docker, it gives you &lt;a href="https://github.com/floci-io/floci#supported-services" rel="noopener noreferrer"&gt;58 AWS services&lt;/a&gt; at localhost, and it costs nothing. No AWS account. No credit card. No sign-up. No auth token.&lt;/p&gt;

&lt;p&gt;Let me show you how it works.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problems You're Probably Facing Right Now
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Fear of billing.&lt;/strong&gt; Every time you try something new (creating a VPC, launching an instance, testing Lambda), there's this voice saying "what if I forget to delete this?" That fear kills curiosity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No safe playground.&lt;/strong&gt; The AWS Free Tier helps, but it has limits. Cross those limits once, and you get charged. For a student or a fresher, even a small unexpected bill feels massive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory without practice.&lt;/strong&gt; You watched hours of videos about S3 and DynamoDB. You can explain what they are. But can you actually use them? Can you create a bucket from the command line? Can you put data into a table and get it back? If not, you haven't really learned it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Starting over after every mistake.&lt;/strong&gt; When something breaks, most beginners delete everything and recreate it. They never learn to troubleshoot. They never learn why it broke. They just run away from the problem and start fresh. I did this for months.&lt;/p&gt;

&lt;p&gt;If any of this sounds familiar, keep reading.&lt;/p&gt;




&lt;h2&gt;
  
  
  Before We Start: What You Need
&lt;/h2&gt;

&lt;p&gt;Two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Docker&lt;/strong&gt;: a tool that runs applications in isolated containers. Think of it as a lightweight virtual machine. Floci runs inside Docker.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS CLI&lt;/strong&gt;: the command-line tool to interact with AWS services (optional, but I strongly recommend it).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's it. Here's how to install both:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Everything in this tutorial runs 100% offline on your own machine (Mac, Windows, or Linux). I'm using an EC2 instance to demonstrate the steps, but that's just my setup. You don't need a cloud server. The exact same commands work on your laptop, completely offline, with no internet required once the tools are installed and the Floci image has been pulled. That's the whole point.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Install Docker
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Windows / Mac:&lt;/strong&gt; Download and install &lt;a href="https://www.docker.com/products/docker-desktop/" rel="noopener noreferrer"&gt;Docker Desktop&lt;/a&gt;. Open it once it's installed; that's all.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ubuntu / Debian:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; docker.io docker-compose-v2
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start docker
&lt;span class="nb"&gt;sudo &lt;/span&gt;usermod &lt;span class="nt"&gt;-aG&lt;/span&gt; docker &lt;span class="nv"&gt;$USER&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Amazon Linux / RHEL / Fedora:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dnf &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; docker
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start docker
&lt;span class="nb"&gt;sudo &lt;/span&gt;usermod &lt;span class="nt"&gt;-aG&lt;/span&gt; docker &lt;span class="nv"&gt;$USER&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F9g2e7gl61me3ug3os3w5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F9g2e7gl61me3ug3os3w5.png" alt=" " width="800" height="223"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Install Docker Compose plugin&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; /usr/local/lib/docker/cli-plugins
&lt;span class="nb"&gt;sudo &lt;/span&gt;curl &lt;span class="nt"&gt;-SL&lt;/span&gt; &lt;span class="s2"&gt;"https://github.com/docker/compose/releases/latest/download/docker-compose-linux-&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;uname&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; /usr/local/lib/docker/cli-plugins/docker-compose
&lt;span class="nb"&gt;sudo chmod&lt;/span&gt; +x /usr/local/lib/docker/cli-plugins/docker-compose
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F4ign1wj61910bzhmz182.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F4ign1wj61910bzhmz182.png" alt=" " width="800" height="146"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After running the Linux commands above, &lt;strong&gt;log out and log back in&lt;/strong&gt; (or run &lt;code&gt;newgrp docker&lt;/code&gt;) so the group change takes effect. Otherwise you'll get "permission denied" errors.&lt;/p&gt;
&lt;h3&gt;
  
  
  Install AWS CLI
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Ubuntu / Debian / Amazon Linux:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"https://awscli.amazonaws.com/awscli-exe-linux-&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;uname&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;.zip"&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="s2"&gt;"awscliv2.zip"&lt;/span&gt;
unzip awscliv2.zip &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sudo&lt;/span&gt; ./aws/install
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fio4s8yfb80c2co9wrk1n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fio4s8yfb80c2co9wrk1n.png" alt=" " width="797" height="91"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(If &lt;code&gt;unzip&lt;/code&gt; is not found, install it first: &lt;code&gt;sudo apt-get install -y unzip&lt;/code&gt; or &lt;code&gt;sudo dnf install -y unzip&lt;/code&gt;)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mac:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"https://awscli.amazonaws.com/AWSCLIV2.pkg"&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="s2"&gt;"AWSCLIV2.pkg"&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;installer &lt;span class="nt"&gt;-pkg&lt;/span&gt; AWSCLIV2.pkg &lt;span class="nt"&gt;-target&lt;/span&gt; /
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Windows:&lt;/strong&gt; Download and run the installer from:&lt;br&gt;
&lt;a href="https://awscli.amazonaws.com/AWSCLIV2.msi" rel="noopener noreferrer"&gt;https://awscli.amazonaws.com/AWSCLIV2.msi&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Verify the installation on any platform:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fuvztqq3gad7lv1zmbnvb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fuvztqq3gad7lv1zmbnvb.png" alt=" " width="800" height="88"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Getting Floci Running
&lt;/h2&gt;

&lt;p&gt;Open your terminal, create a new folder, and create the compose file:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;floci-playground &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;floci-playground
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fltzshfwgrdardhlviimn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fltzshfwgrdardhlviimn.png" alt=" " width="800" height="107"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now create a file called &lt;code&gt;docker-compose.yml&lt;/code&gt; inside this folder. You can use any text editor: VS Code, nano, or even Notepad. Paste this content:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;floci&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;floci/floci:latest&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;4566:4566"&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./data:/app/data&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fvf12bvkiflsat3uieuik.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fvf12bvkiflsat3uieuik.png" alt=" " width="800" height="323"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Start it:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ff0hyoahcxnp8tj8f0th1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ff0hyoahcxnp8tj8f0th1.png" alt=" " width="799" height="432"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you get &lt;code&gt;docker compose: command not found&lt;/code&gt;, try the older syntax: &lt;code&gt;docker-compose up -d&lt;/code&gt;. Both work the same way.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Wait a few seconds, then verify it's running:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Linux / Mac&lt;/span&gt;
curl http://localhost:4566/_localstack/health
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F2ffv19h5l4fzn4tzwsex.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F2ffv19h5l4fzn4tzwsex.png" alt=" " width="800" height="127"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Windows (PowerShell)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;Invoke-RestMethod&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;http://localhost:4566/_localstack/health&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;You should see something like this:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"services"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"s3"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"running"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"sqs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"running"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"dynamodb"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"running"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"lambda"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"running"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"1.5.25"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;If all services show &lt;code&gt;"running"&lt;/code&gt;, you're good to go.&lt;/p&gt;
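
&lt;p&gt;If you would rather check that response from a script than by eye, a small helper along these lines could work. It is only a sketch: it assumes the payload has the shape shown above, and the &lt;code&gt;kms&lt;/code&gt; entry with an &lt;code&gt;"available"&lt;/code&gt; status is invented here to stand in for a service that is not running yet. The states Floci actually reports may differ.&lt;/p&gt;

```python
import json

# Sample payload in the shape shown above. The "kms" entry and its "available"
# status are made up to illustrate a service that is not running yet.
sample = '{"services": {"s3": "running", "sqs": "running", "kms": "available"}, "version": "1.5.25"}'


def services_not_running(payload):
    """Return the names of services whose status is anything other than "running"."""
    services = json.loads(payload).get("services", {})
    return sorted(name for name, status in services.items() if status != "running")


print(services_not_running(sample))
```

&lt;p&gt;An empty list means every service in the response reported &lt;code&gt;"running"&lt;/code&gt;. To use it for real, pass in the body returned by the health endpoint instead of the sample string.&lt;/p&gt;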

&lt;p&gt;Now tell your AWS CLI to talk to Floci instead of real AWS:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Linux / Mac&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_ENDPOINT_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://localhost:4566
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_DEFAULT_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;us-east-1
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;test
export &lt;/span&gt;&lt;span class="nv"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;test&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fqvy4d7j4f6s0rs68e5i3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fqvy4d7j4f6s0rs68e5i3.png" alt=" " width="800" height="114"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Windows (PowerShell)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nv"&gt;$&lt;/span&gt;&lt;span class="nn"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;AWS_ENDPOINT_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"http://localhost:4566"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nv"&gt;$&lt;/span&gt;&lt;span class="nn"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;AWS_DEFAULT_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"us-east-1"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nv"&gt;$&lt;/span&gt;&lt;span class="nn"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"test"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nv"&gt;$&lt;/span&gt;&lt;span class="nn"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"test"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The credentials can be anything. I use &lt;code&gt;test&lt;/code&gt; because it's simple. Floci doesn't validate them; it just needs non-empty values.&lt;/p&gt;


&lt;h2&gt;
  
  
  Your First Win: Create an S3 Bucket
&lt;/h2&gt;

&lt;p&gt;S3 is the storage backbone of AWS. Logs go to S3. Backups go to S3. Static websites live on S3. If you understand S3, you already understand 30% of how AWS works.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 mb s3://my-first-bucket
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;make_bucket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-first-bucket&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ffu09zo6sx2mhgm5wv2fn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ffu09zo6sx2mhgm5wv2fn.png" alt=" " width="800" height="126"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That's it. You just created a bucket. One command. No console. No waiting.&lt;/p&gt;

&lt;p&gt;Now upload a file:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"hello from my laptop"&lt;/span&gt; | aws s3 &lt;span class="nb"&gt;cp&lt;/span&gt; - s3://my-first-bucket/greeting.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Check if it's there:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;ls &lt;/span&gt;s3://my-first-bucket
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;2026-06-16 11:43:58         21 greeting.txt
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1x6lfln0zftl44glvgbb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1x6lfln0zftl44glvgbb.png" alt=" " width="798" height="86"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Read it back:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;s3://my-first-bucket/greeting.txt -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;hello from my laptop
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F0ko54638rghvyzu0po2g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F0ko54638rghvyzu0po2g.png" alt=" " width="797" height="83"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That felt good, right? You just did exactly what production systems do: store and retrieve data from S3. The same commands. The same behavior. When you move to real AWS someday, nothing changes except where the data lives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Challenge:&lt;/strong&gt; Upload 3 different files to your bucket, then try to delete the bucket without emptying it first. What error do you get? Now figure out how to fix it using only the CLI. This exact scenario comes up in every cloud job.&lt;/p&gt;


&lt;h2&gt;
  
  
  Create a DynamoDB Table and Store Data
&lt;/h2&gt;

&lt;p&gt;DynamoDB confused me for weeks when I started. But once I actually created a table and put data into it, everything clicked. Let's make that happen for you right now.&lt;/p&gt;

&lt;p&gt;Create a table:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws dynamodb create-table &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--table-name&lt;/span&gt; Users &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--attribute-definitions&lt;/span&gt; &lt;span class="nv"&gt;AttributeName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;,AttributeType&lt;span class="o"&gt;=&lt;/span&gt;S &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--key-schema&lt;/span&gt; &lt;span class="nv"&gt;AttributeName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;,KeyType&lt;span class="o"&gt;=&lt;/span&gt;HASH &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--billing-mode&lt;/span&gt; PAY_PER_REQUEST
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fpblozgya4pudkw5lw0tm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fpblozgya4pudkw5lw0tm.png" alt=" " width="800" height="421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Put an item in it:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Linux / Mac&lt;/span&gt;
aws dynamodb put-item &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--table-name&lt;/span&gt; Users &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--item&lt;/span&gt; &lt;span class="s1"&gt;'{"id":{"S":"user-001"},"name":{"S":"Sarvar"},"role":{"S":"Cloud Architect"}}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Windows (PowerShell) - use double quotes and escape inner quotes&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;dynamodb&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;put-item&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;--table-name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Users&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;--item&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;'{\"id\":{\"S\":\"user-001\"},\"name\":{\"S\":\"Sarvar\"},\"role\":{\"S\":\"Cloud Architect\"}}'&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;A quick note on the JSON format: DynamoDB requires you to specify the data type for each value. &lt;code&gt;"S"&lt;/code&gt; means String, &lt;code&gt;"N"&lt;/code&gt; means Number. It looks verbose at first, but you get used to it quickly.&lt;/p&gt;
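
&lt;p&gt;To make the typed format described above concrete, here is a minimal Python sketch that wraps plain values in their type tags. The helper names &lt;code&gt;to_attribute_value&lt;/code&gt; and &lt;code&gt;to_item&lt;/code&gt; are made up for this illustration, and only strings, numbers, and booleans are handled; it is not how the CLI or any SDK does it internally.&lt;/p&gt;

```python
import json


def to_attribute_value(value):
    """Wrap a plain Python value in DynamoDB's typed JSON (hypothetical helper)."""
    # bool must be checked before int/float because bool is a subclass of int.
    if isinstance(value, bool):
        return {"BOOL": value}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        # DynamoDB numbers are sent as strings inside the "N" tag.
        return {"N": str(value)}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def to_item(record):
    """Convert a flat dict into the shape expected by --item."""
    return {key: to_attribute_value(val) for key, val in record.items()}


user = {"id": "user-001", "name": "Sarvar", "role": "Cloud Architect"}
item_json = json.dumps(to_item(user), separators=(",", ":"))
print(item_json)
```

&lt;p&gt;The printed string has the same shape as the &lt;code&gt;--item&lt;/code&gt; argument used in the &lt;code&gt;put-item&lt;/code&gt; command above, so you can see where each &lt;code&gt;"S"&lt;/code&gt; comes from.&lt;/p&gt;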

&lt;p&gt;Get it back:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws dynamodb get-item &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--table-name&lt;/span&gt; Users &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--key&lt;/span&gt; &lt;span class="s1"&gt;'{"id":{"S":"user-001"}}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Item"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"S"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user-001"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"S"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Sarvar"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"S"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Cloud Architect"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fumb8ezub2ioye58p88se.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fumb8ezub2ioye58p88se.png" alt=" " width="799" height="192"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You just stored and retrieved structured data from a NoSQL database. That's real cloud development.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Challenge:&lt;/strong&gt; Put another item with the same &lt;code&gt;id&lt;/code&gt; but a different name. Does it overwrite? Does it error? Now put 5 different users and try &lt;code&gt;aws dynamodb scan --table-name Users&lt;/code&gt; to get all of them. This is how you learn database behavior: by seeing it happen, not reading about it.&lt;/p&gt;


&lt;h2&gt;
  
  
  Send and Receive Messages with SQS
&lt;/h2&gt;

&lt;p&gt;Almost every production system uses message queues. Order processing, notifications, async workflows: queues are everywhere.&lt;/p&gt;

&lt;p&gt;Create a queue:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws sqs create-queue &lt;span class="nt"&gt;--queue-name&lt;/span&gt; orders
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Send a message:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws sqs send-message &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--queue-url&lt;/span&gt; http://localhost:4566/000000000000/orders &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--message-body&lt;/span&gt; &lt;span class="s1"&gt;'{"event":"order.placed","item":"cloud-book"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fzudp3o07r4cn6arphw5h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fzudp3o07r4cn6arphw5h.png" alt=" " width="799" height="121"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(That &lt;code&gt;000000000000&lt;/code&gt; is the default AWS account ID that Floci uses. In real AWS, it would be your actual 12-digit account number.)&lt;/p&gt;

&lt;p&gt;Receive it:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws sqs receive-message &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--queue-url&lt;/span&gt; http://localhost:4566/000000000000/orders
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"MessageId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ba4ea32c-cfdf-4c28-b705-8bcbb4d8a0d0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Body"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"{&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;event&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;order.placed&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;item&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;cloud-book&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;}"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Your message comes back exactly as you sent it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Challenge:&lt;/strong&gt; Receive the same message again. What happens? It disappears for about 30 seconds. This is called the "visibility timeout": the time SQS hides a message after a consumer receives it, giving that consumer time to process it. Wait 30 seconds and try again. It comes back. Now try deleting it after receiving. This is exactly how real applications process queues.&lt;/p&gt;
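
&lt;p&gt;If the receive, hide, reappear, delete cycle is hard to picture, this small in-memory model may help. It is only a toy: &lt;code&gt;ToyQueue&lt;/code&gt; is a made-up class, not SQS or Floci, and time is passed in as plain numbers so the behavior is easy to follow.&lt;/p&gt;

```python
import uuid


class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (not SQS itself)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self.messages = {}  # message id -> [body, time at which it is visible again]

    def send(self, body, now):
        message_id = str(uuid.uuid4())
        self.messages[message_id] = [body, now]
        return message_id

    def receive(self, now):
        for message_id, entry in self.messages.items():
            body, visible_at = entry
            if now >= visible_at:
                # Hide the message from other consumers until the timeout passes.
                entry[1] = now + self.visibility_timeout
                return message_id, body
        return None

    def delete(self, message_id):
        # A consumer deletes the message once processing has succeeded.
        self.messages.pop(message_id, None)


queue = ToyQueue(visibility_timeout=30)
queue.send('{"event":"order.placed","item":"cloud-book"}', now=0)

first = queue.receive(now=0)      # delivered
second = queue.receive(now=10)    # hidden: still inside the timeout
third = queue.receive(now=31)     # visible again because it was never deleted
queue.delete(third[0])
fourth = queue.receive(now=100)   # gone for good after the delete

print(first is not None, second, third is not None, fourth)
```

&lt;p&gt;The key design point is that receiving never removes a message. Only an explicit delete does, which is why a consumer that crashes mid-processing does not lose the work.&lt;/p&gt;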


&lt;h2&gt;
  
  
  Create a Secret in Secrets Manager
&lt;/h2&gt;

&lt;p&gt;Every application has passwords, API keys, and database credentials. Secrets Manager is where you store them securely.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws secretsmanager create-secret &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; my-app/db-password &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--secret-string&lt;/span&gt; &lt;span class="s2"&gt;"super-secret-password-123"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Retrieve it:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws secretsmanager get-secret-value &lt;span class="nt"&gt;--secret-id&lt;/span&gt; my-app/db-password
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"my-app/db-password"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"SecretString"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"super-secret-password-123"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fx3flb8g9tdrtggheeaku.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fx3flb8g9tdrtggheeaku.png" alt=" " width="798" height="154"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That's how real applications fetch database credentials at runtime instead of hardcoding them in code. Simple, but incredibly important in production.&lt;/p&gt;
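
&lt;p&gt;As a rough sketch of what "fetch at runtime" can look like in application code, the function below reads a secret through whatever client it is given. &lt;code&gt;load_secret&lt;/code&gt; and &lt;code&gt;FakeSecretsClient&lt;/code&gt; are hypothetical names for this example; the fake client exists only so the snippet runs without Floci or boto3 installed.&lt;/p&gt;

```python
def load_secret(client, secret_id):
    """Return the secret string for secret_id using a Secrets Manager style client."""
    response = client.get_secret_value(SecretId=secret_id)
    return response["SecretString"]


class FakeSecretsClient:
    """Hypothetical stand-in so the sketch runs without Floci or boto3."""

    def __init__(self, secrets):
        self.secrets = secrets

    def get_secret_value(self, SecretId):
        # Mirrors only the two fields shown in the CLI output above.
        return {"Name": SecretId, "SecretString": self.secrets[SecretId]}


client = FakeSecretsClient({"my-app/db-password": "super-secret-password-123"})
db_password = load_secret(client, "my-app/db-password")

# Avoid printing the secret itself; its length is enough to confirm it loaded.
print(len(db_password))
```

&lt;p&gt;With Floci running, you could pass &lt;code&gt;boto3.client("secretsmanager", endpoint_url="http://localhost:4566", ...)&lt;/code&gt; in place of the fake client, and &lt;code&gt;load_secret&lt;/code&gt; itself would not need to change.&lt;/p&gt;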


&lt;h2&gt;
  
  
  Store Configuration in SSM Parameter Store
&lt;/h2&gt;

&lt;p&gt;Parameter Store is where you keep application configuration: feature flags, environment URLs, and settings that change between dev and production.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ssm put-parameter &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"/myapp/environment"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--value&lt;/span&gt; &lt;span class="s2"&gt;"development"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--type&lt;/span&gt; String

aws ssm put-parameter &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"/myapp/max-retries"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--value&lt;/span&gt; &lt;span class="s2"&gt;"3"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--type&lt;/span&gt; String
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fduwar9qfq3l500m83s1f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fduwar9qfq3l500m83s1f.png" alt=" " width="799" height="354"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Get them back:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ssm get-parameter &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"/myapp/environment"&lt;/span&gt;
aws ssm get-parameters &lt;span class="nt"&gt;--names&lt;/span&gt; &lt;span class="s2"&gt;"/myapp/environment"&lt;/span&gt; &lt;span class="s2"&gt;"/myapp/max-retries"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fqdclmpj9xhlxmgdvgkzb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fqdclmpj9xhlxmgdvgkzb.png" alt=" " width="800" height="317"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is how well-architected applications typically manage configuration. No more hardcoding values in your code.&lt;/p&gt;
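
&lt;p&gt;Here is one possible way to turn those parameters into a config dictionary in Python. The sample response is trimmed, hand-written data in the general shape of &lt;code&gt;get-parameters&lt;/code&gt; output (a real response carries more fields), and &lt;code&gt;build_config&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
def build_config(response):
    """Turn a get-parameters style response into a flat config dict."""
    missing = response.get("InvalidParameters", [])
    if missing:
        # Fail loudly instead of silently running with partial configuration.
        raise KeyError(f"parameters not found: {missing}")
    return {param["Name"]: param["Value"] for param in response["Parameters"]}


# Hand-written sample for the two values stored above (fields trimmed).
sample_response = {
    "Parameters": [
        {"Name": "/myapp/environment", "Type": "String", "Value": "development"},
        {"Name": "/myapp/max-retries", "Type": "String", "Value": "3"},
    ],
    "InvalidParameters": [],
}

config = build_config(sample_response)
# Values come back as strings, so convert numbers explicitly.
max_retries = int(config["/myapp/max-retries"])
print(config["/myapp/environment"], max_retries)
```

&lt;p&gt;Note that &lt;code&gt;max-retries&lt;/code&gt; was stored with type &lt;code&gt;String&lt;/code&gt;, so the application is responsible for converting it to a number.&lt;/p&gt;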


&lt;h2&gt;
  
  
  Use It With Python
&lt;/h2&gt;

&lt;p&gt;If you want to go beyond CLI, here's how your application code talks to Floci. Make sure Python 3 is installed, then:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;boto3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;

&lt;span class="n"&gt;s3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;endpoint_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:4566&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;us-east-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;aws_access_key_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;aws_secret_access_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_bucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-python-bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-python-bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hello.txt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;it works!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-python-bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hello.txt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Output: &lt;code&gt;b'it works!'&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The main difference between this and production code is one line: &lt;code&gt;endpoint_url&lt;/code&gt;. When you deploy to real AWS, remove that line and the dummy &lt;code&gt;test&lt;/code&gt; credentials so boto3 picks up your real ones. Everything else stays the same. You're writing production-ready code from day one.&lt;/p&gt;
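
&lt;p&gt;One way to avoid editing code when you switch targets is to build the client arguments from the environment. This is a sketch under assumptions: &lt;code&gt;client_kwargs&lt;/code&gt; is a hypothetical helper, and it treats a set &lt;code&gt;AWS_ENDPOINT_URL&lt;/code&gt; as the signal that you are talking to a local emulator.&lt;/p&gt;

```python
import os


def client_kwargs(env=None):
    """Build keyword arguments for boto3.client() (hypothetical helper)."""
    env = os.environ if env is None else env
    kwargs = {"region_name": env.get("AWS_DEFAULT_REGION", "us-east-1")}
    endpoint = env.get("AWS_ENDPOINT_URL")
    if endpoint:
        # Local emulator: point at it and use dummy credentials.
        kwargs["endpoint_url"] = endpoint
        kwargs["aws_access_key_id"] = "test"
        kwargs["aws_secret_access_key"] = "test"
    # With no endpoint override, boto3 falls back to its normal credential chain.
    return kwargs


local = client_kwargs({"AWS_ENDPOINT_URL": "http://localhost:4566"})
real = client_kwargs({})
print(sorted(local))
print(sorted(real))
```

&lt;p&gt;You would then create clients with &lt;code&gt;boto3.client("s3", **client_kwargs())&lt;/code&gt;, and the same file works against Floci or real AWS depending on the environment.&lt;/p&gt;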


&lt;h2&gt;
  
  
  The Way I Wish I Had Learned
&lt;/h2&gt;

&lt;p&gt;Looking back at my journey, here's what I would do differently starting today:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Week 1-2: S3 and IAM.&lt;/strong&gt; Create buckets, upload files, set permissions, try to access things you shouldn't. Break the permissions. Fix them. Understand what "Access Denied" actually means.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Week 3-4: DynamoDB.&lt;/strong&gt; Create tables with different key structures. Put data. Query it. Understand the difference between &lt;code&gt;get-item&lt;/code&gt;, &lt;code&gt;query&lt;/code&gt;, and &lt;code&gt;scan&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Week 5-6: SQS, SNS, and Secrets Manager.&lt;/strong&gt; Build a simple message flow. Store secrets. Retrieve configuration. These are the building blocks of every real application.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Week 7-8: Lambda.&lt;/strong&gt; Write a simple function. Trigger it. See the logs. Floci runs real Lambda containers, not mocks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Throughout: Break everything.&lt;/strong&gt; Delete tables while data is in them. Send malformed messages. Call APIs with wrong parameters. Read the error messages carefully. This is the real education.&lt;/p&gt;


&lt;h2&gt;
  
  
  Tips From Years of Cloud Work
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Don't just create. Troubleshoot.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The skill that got me promoted from L1 Cloud Support to Cloud Architect wasn't creating infrastructure. It was fixing it when it broke. Break things deliberately on Floci and fix them without starting over.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Learn the CLI before the console.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In real jobs, you'll use CLI and infrastructure-as-code. Floci forces this habit because there's no console to click around in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Read error messages. Actually read them.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most beginners see an error and panic. AWS error messages are surprisingly helpful if you actually read them. Practice generating errors on Floci so you learn to read them calmly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Document what you learn.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I filled four notebooks with cloud concepts. Later, I converted those notes into articles. That habit changed my career. Start writing about what you break and fix, even if nobody reads it at first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. One concept per day is enough.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You don't need to learn all 200 AWS services. Focus on the core: storage, databases, messaging. One solid hour of hands-on practice beats five hours of video watching.&lt;/p&gt;


&lt;h2&gt;
  
  
  Common Mistakes Beginners Hit
&lt;/h2&gt;

&lt;p&gt;Before you get stuck, here are the issues I see most often:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;"Command not found: aws"&lt;/strong&gt; - You haven't installed AWS CLI yet, or you need to restart your terminal after installation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Commands hitting real AWS instead of Floci&lt;/strong&gt; - You forgot to set &lt;code&gt;AWS_ENDPOINT_URL&lt;/code&gt;. Always check with &lt;code&gt;echo $AWS_ENDPOINT_URL&lt;/code&gt; before running commands.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"Cannot connect to the Docker daemon"&lt;/strong&gt; - Docker Desktop isn't running. Open it first.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data disappeared after restart&lt;/strong&gt; - By default, Floci stores data in memory. Add &lt;code&gt;FLOCI_STORAGE_MODE=hybrid&lt;/code&gt; to keep data between restarts:
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;floci&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;floci/floci:latest&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;4566:4566"&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;FLOCI_STORAGE_MODE=hybrid&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./data:/app/data&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
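
&lt;p&gt;For the "commands hitting real AWS" mistake in the list above, a small pre-flight check can catch the problem before you run anything. This is only a sketch: &lt;code&gt;endpoint_problem&lt;/code&gt; and &lt;code&gt;LOCAL_HOSTS&lt;/code&gt; are hypothetical names, and the list of local addresses is an assumption you may need to extend.&lt;/p&gt;

```python
from urllib.parse import urlparse

# Assumed set of addresses that count as "local" for this check.
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def endpoint_problem(env):
    """Return a warning if commands would not go to a local emulator, else None."""
    endpoint = env.get("AWS_ENDPOINT_URL", "")
    if not endpoint:
        return "AWS_ENDPOINT_URL is not set: commands would go to real AWS"
    host = urlparse(endpoint).hostname
    if host not in LOCAL_HOSTS:
        return f"AWS_ENDPOINT_URL points at {host}, not a local address"
    return None


print(endpoint_problem({}))
print(endpoint_problem({"AWS_ENDPOINT_URL": "http://localhost:4566"}))
```

&lt;p&gt;In a real script you would call &lt;code&gt;endpoint_problem(os.environ)&lt;/code&gt; and stop if it returns a message.&lt;/p&gt;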



&lt;h2&gt;
  
  
  When You're Ready for Real AWS
&lt;/h2&gt;

&lt;p&gt;Here's the transition path:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Learn and practice on Floci (no cost, no risk)&lt;/li&gt;
&lt;li&gt;When you feel confident, create an AWS Free Tier account&lt;/li&gt;
&lt;li&gt;Run the same commands against real AWS; they work the same way&lt;/li&gt;
&lt;li&gt;For your application code, just remove the &lt;code&gt;endpoint_url&lt;/code&gt; line (and any dummy credentials)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There is no rewrite. No migration. No new learning curve. The skills you build on Floci transfer directly to real AWS because it's the same API, the same CLI, the same SDK.&lt;/p&gt;


&lt;h2&gt;
  
  
  Cleaning Up
&lt;/h2&gt;

&lt;p&gt;When you're done for the day:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose down
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This stops everything. If you used &lt;code&gt;hybrid&lt;/code&gt; storage mode, your data stays in the &lt;code&gt;./data&lt;/code&gt; folder for next time.&lt;/p&gt;


&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;When I started learning cloud, I was scared of bills, confused by services, and paralyzed by the fear of breaking something expensive. I shipped broken code to production because I couldn't test properly locally. I wasted months playing it safe when I should have been experimenting aggressively.&lt;/p&gt;

&lt;p&gt;That was then. Today, you have Floci.&lt;/p&gt;

&lt;p&gt;If you're a fresher starting your cloud journey, a student preparing for certifications, or a working professional switching to cloud, start here. Build things. Break things. Fix things. Do it a hundred times until the commands feel natural and the errors feel familiar.&lt;/p&gt;

&lt;p&gt;The cloud rewards people who practice. Not people who watch.&lt;/p&gt;

&lt;p&gt;In my next article, I'll show you how to deploy a complete serverless API (Lambda + API Gateway + DynamoDB) entirely on Floci, and then move it to real AWS with zero code changes. Follow me so you don't miss it.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Resources:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Floci GitHub: 
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/floci-io" rel="noopener noreferrer"&gt;
        floci-io
      &lt;/a&gt; / &lt;a href="https://github.com/floci-io/floci" rel="noopener noreferrer"&gt;
        floci
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Light, fluffy, and always free - The AWS Local Emulator alternative
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;p&gt;
  &lt;a rel="noopener noreferrer" href="https://github.com/floci-io/floci/docs/assets/floci-black.svg#gh-light-mode-only"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2Ffloci-io%2Ffloci%2FHEAD%2Fdocs%2Fassets%2Ffloci-black.svg%23gh-light-mode-only" alt="Floci" width="500"&gt;&lt;/a&gt;
  &lt;a rel="noopener noreferrer" href="https://github.com/floci-io/floci/docs/assets/floci-white.svg#gh-dark-mode-only"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2Ffloci-io%2Ffloci%2FHEAD%2Fdocs%2Fassets%2Ffloci-white.svg%23gh-dark-mode-only" alt="Floci" width="500"&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;
  &lt;strong&gt;Light, fluffy, and always free&lt;/strong&gt;&lt;br&gt;
  No account. No auth token. No feature gates. Just &lt;code&gt;docker compose up&lt;/code&gt;
&lt;/p&gt;
&lt;p&gt;
  &lt;a href="https://github.com/floci-io/floci/releases/latest" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/c892d7fecb7988b7b3deada1dce66fb04a5ce604f44c24d0bfef71588414b734/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f762f72656c656173652f666c6f63692d696f2f666c6f63693f6c6162656c3d6c617465737425323072656c6561736526636f6c6f723d626c7565" alt="Latest Release"&gt;&lt;/a&gt;
  &lt;a href="https://github.com/floci-io/floci/actions/workflows/release.yml" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/fe80e691bbc394ea1a41482161b9db7e89905b9f1af7d3eadf66e7bdaa756152/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f616374696f6e732f776f726b666c6f772f7374617475732f666c6f63692d696f2f666c6f63692f72656c656173652e796d6c3f6c6162656c3d6275696c64" alt="Build Status"&gt;&lt;/a&gt;
  &lt;a href="https://hub.docker.com/r/floci/floci" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/a0baf22d81171625a19cba372e88169c313366f0397beb44693d4f1c3fc333de/68747470733a2f2f696d672e736869656c64732e696f2f646f636b65722f70756c6c732f666c6f63692f666c6f63693f6c6162656c3d646f636b657225323070756c6c73" alt="Docker Pulls"&gt;&lt;/a&gt;
  &lt;a href="https://hub.docker.com/r/floci/floci" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/ed3a09325423d2d937e42a307261285f40d1d7d4d446977fb683fb039c157ca1/68747470733a2f2f696d672e736869656c64732e696f2f646f636b65722f696d6167652d73697a652f666c6f63692f666c6f63692f6c61746573743f6c6162656c3d696d61676525323073697a65" alt="Docker Image Size"&gt;&lt;/a&gt;
  &lt;a href="https://opensource.org/licenses/MIT" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/f8df3091bbe1149f398a5369b2c39e896766f9f6efba3477c63e9b4aa940ef14/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c6963656e73652d4d49542d677265656e" alt="License: MIT"&gt;&lt;/a&gt;
  &lt;a href="https://github.com/floci-io/floci/stargazers" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/02cf08877e8fc8049ece2824ec2b956cec1fe0b330cfc3b0b8354f8f1e0c29b4/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f73746172732f666c6f63692d696f2f666c6f63693f7374796c653d666c6174" alt="GitHub Stars"&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;
  &lt;a href="https://github.com/floci-io/floci#quick-start" rel="noopener noreferrer"&gt;Quick Start&lt;/a&gt; ·
  &lt;a href="https://github.com/floci-io/floci#features" rel="noopener noreferrer"&gt;Features&lt;/a&gt; ·
  &lt;a href="https://github.com/floci-io/floci#supported-services" rel="noopener noreferrer"&gt;Services&lt;/a&gt; ·
  &lt;a href="https://github.com/floci-io/floci#sdk-integration" rel="noopener noreferrer"&gt;SDKs&lt;/a&gt; ·
  &lt;a href="https://github.com/floci-io/floci#testcontainers" rel="noopener noreferrer"&gt;Testcontainers&lt;/a&gt; ·
  &lt;a href="https://github.com/floci-io/floci#migrating-from-localstack" rel="noopener noreferrer"&gt;Migration&lt;/a&gt; ·
  &lt;a href="https://floci.io/floci/" rel="nofollow noopener noreferrer"&gt;Docs&lt;/a&gt;
&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;What is Floci?&lt;/h2&gt;
&lt;/div&gt;

&lt;p&gt;Floci is a free, open-source local AWS emulator for development, testing, and CI.&lt;/p&gt;

&lt;p&gt;It gives you AWS-shaped services on your machine without requiring a cloud account, an auth token, or paid feature gates. Point your AWS SDK, CLI, Terraform, CDK, OpenTofu, or test suite at &lt;code&gt;http://localhost:4566&lt;/code&gt; and keep your existing workflows.&lt;/p&gt;

&lt;p&gt;Floci is named after &lt;a href="https://en.wikipedia.org/wiki/Cirrocumulus_floccus" rel="nofollow noopener noreferrer"&gt;floccus&lt;/a&gt;, the cloud formation that looks like popcorn.&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Quick Start&lt;/h2&gt;
&lt;/div&gt;

&lt;p&gt;The fastest way to run Floci is with the official &lt;a href="https://github.com/floci-io/floci-cli" rel="noopener noreferrer"&gt;CLI&lt;/a&gt;:&lt;/p&gt;

&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;floci start&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Export the AWS environment variables:&lt;/p&gt;

&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;&lt;span class="pl-c1"&gt;eval&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;floci env&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Use your existing AWS tools normally:&lt;/p&gt;

&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;aws s3 mb s3://my-bucket
aws dynamodb create-table \
  --table-name demo-table \
  --attribute-definitions AttributeName=pk,AttributeType=S \
  --key-schema AttributeName=pk,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST

aws&lt;/pre&gt;…
&lt;/div&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/floci-io/floci" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;/li&gt;

&lt;li&gt;Floci Documentation: &lt;a href="https://floci.io/floci/" rel="noopener noreferrer"&gt;floci.io/floci/&lt;/a&gt;
&lt;/li&gt;

&lt;li&gt;Docker Hub: &lt;a href="https://hub.docker.com/r/floci/floci" rel="noopener noreferrer"&gt;hub.docker.com/r/floci/floci&lt;/a&gt;
&lt;/li&gt;

&lt;/ul&gt;




&lt;p&gt;If this helped you, share it with someone who's starting their cloud journey. Drop a comment if you have questions; I'll respond to every one.&lt;/p&gt;

&lt;p&gt;Connect with me on &lt;a href="https://www.linkedin.com/in/sarvar04/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for more cloud architecture, DevOps, and career guidance.&lt;/p&gt;




&lt;h2&gt;
  
  
  📌 Wrapping Up
&lt;/h2&gt;

&lt;p&gt;Thanks for reading! If this was helpful:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;❤️ Like if it added value&lt;/li&gt;
&lt;li&gt;💾 Save for later&lt;/li&gt;
&lt;li&gt;🔄 Share with your team&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Follow me for more on:&lt;/strong&gt; AWS architecture, FinOps, DevOps, and AI Infrastructure.&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;&lt;a href="https://sarvarnadaf.com" rel="noopener noreferrer"&gt;Visit my website&lt;/a&gt;&lt;/strong&gt; | &lt;strong&gt;&lt;a href="https://www.linkedin.com/in/sarvar04/" rel="noopener noreferrer"&gt;Connect on LinkedIn&lt;/a&gt;&lt;/strong&gt; | &lt;strong&gt;Email:&lt;/strong&gt; &lt;a href="mailto:simplynadaf@gmail.com"&gt;simplynadaf@gmail.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Happy Learning&lt;/strong&gt; 🚀&lt;/p&gt;

</description>
      <category>aws</category>
      <category>beginners</category>
      <category>career</category>
      <category>learning</category>
    </item>
    <item>
      <title>Zipping 15GB of S3 files in 6s. How the power of community made it possible.</title>
      <dc:creator>Paul SANTUS</dc:creator>
      <pubDate>Thu, 25 Jun 2026 18:23:01 +0000</pubDate>
      <link>https://dev.clauneck.workers.dev/aws-builders/zipping-15gb-of-s3-files-in-11s-how-the-power-of-community-made-it-possible-5fgg</link>
      <guid>https://dev.clauneck.workers.dev/aws-builders/zipping-15gb-of-s3-files-in-11s-how-the-power-of-community-made-it-possible-5fgg</guid>
      <description>&lt;p&gt;In my &lt;a href="https://dev.clauneck.workers.dev/aws-builders/s3-zipper-challenge-a-parallel-zip-assembly-that-beats-the-single-lambda-approach-37gf"&gt;first article&lt;/a&gt;, I showed how parallelizing zip assembly across multiple Lambdas can beat the single-Lambda bandwidth ceiling. I zipped 6.9GB in 35 seconds with just 5 workers.&lt;/p&gt;

&lt;p&gt;Since then, Jérémie published a &lt;a href="https://rustysl.com/fr/blog/beyond-s3-archive-streaming" rel="noopener noreferrer"&gt;follow-up article&lt;/a&gt; where a contributor (&lt;a href="https://github.com/FigmentEngine/demo-s3-archiving/tree/main/contenders/rust/figment-engine" rel="noopener noreferrer"&gt;Fitz&lt;/a&gt;) introduced a brilliant optimization: &lt;code&gt;UploadPartCopy&lt;/code&gt;. Instead of downloading (or even streaming) big files through Lambda just to upload them back into the zip, you can tell S3 to copy them server-side. This halves the bandwidth requirement and brought his single-Lambda solution down to &lt;strong&gt;106 seconds&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I took Fitz's &lt;code&gt;UploadPartCopy&lt;/code&gt; idea and combined it with my parallel approach. Here's what happened.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I took from Jérémie and Fitz
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;UploadPartCopy&lt;/code&gt; insight is elegant: since ZIP STORE mode has deterministic offsets, we know exactly where each file's data lands in the final archive. For big files (≥5MB), we can:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Write just the &lt;strong&gt;local file header&lt;/strong&gt; (50 bytes) in an &lt;code&gt;UploadPart&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Have S3 copy the &lt;strong&gt;file data&lt;/strong&gt; directly via &lt;code&gt;UploadPartCopy&lt;/code&gt; — no download, no upload, instant&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This means workers barely use any memory or bandwidth for big files. &lt;/p&gt;
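
&lt;p&gt;To make the "deterministic offsets" point concrete, here is a minimal Python sketch (not the code from my repo, which is written in Go). It assumes plain local file headers of 30 fixed bytes plus the file name, with no extra fields; ZIP64 extras would add bytes to each header. The function name &lt;code&gt;plan_offsets&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
# Sketch: compute where each file's bytes land in a STORE-mode (uncompressed) zip.
# Assumption: each local file header is 30 fixed bytes plus the UTF-8 file name,
# with no extra fields. ZIP64 extras or data descriptors would change the sizes.

LOCAL_HEADER_FIXED_SIZE = 30  # fixed part of a zip local file header


def plan_offsets(files):
    """files: list of (name, size_in_bytes). Returns one dict per file."""
    layout = []
    cursor = 0  # current write position in the final archive
    for name, size in files:
        header_size = LOCAL_HEADER_FIXED_SIZE + len(name.encode("utf-8"))
        data_start = cursor + header_size
        layout.append({
            "name": name,
            "header_offset": cursor,
            "data_offset": data_start,
            "data_end": data_start + size,  # exclusive end of the file data
        })
        cursor = data_start + size
    return layout


if __name__ == "__main__":
    for entry in plan_offsets([("a.bin", 5_242_880), ("b.bin", 7_000_000)]):
        print(entry)
```

&lt;p&gt;Because nothing is compressed, every offset depends only on names and sizes, which is what lets a planner hand out byte ranges before any worker has touched the data.&lt;/p&gt;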

&lt;p&gt;The only issue is that the S3 multipart upload API requires every part except the last one to be at least 5MB, so the local file header needs to be appended to another file (or group of files).&lt;/p&gt;

&lt;p&gt;My planner Lambda groups small files together until they reach 5MB, appends the LOC header of the next big file, then the worker fires an &lt;code&gt;UploadPartCopy&lt;/code&gt; for that big file's data. &lt;/p&gt;

&lt;p&gt;When we run out of small files, we stream the smallest remaining big file, append the LOC header of the largest remaining one to it, and pair that part with a copy of the largest file's data.&lt;/p&gt;
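
&lt;p&gt;The pairing logic can be sketched roughly as follows. This is an illustrative Python version, not the planner from the repo: &lt;code&gt;plan_duos&lt;/code&gt; and the &lt;code&gt;stream&lt;/code&gt;/&lt;code&gt;copy&lt;/code&gt; keys are hypothetical names, and header sizes are ignored for simplicity.&lt;/p&gt;

```python
from collections import deque

MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum size for every part except the last


def plan_duos(files):
    """files: list of (key, size). Returns a list of assignments.

    Each assignment streams some files through the worker ("stream") and,
    when possible, ends with one big file copied server-side ("copy").
    """
    small = deque(f for f in files if MIN_PART_SIZE > f[1])
    big = deque(sorted((f for f in files if f[1] >= MIN_PART_SIZE), key=lambda f: f[1]))
    duos = []
    while small or big:
        streamed, streamed_bytes = [], 0
        # Fill the streamed part with small files until it is large enough.
        while small and MIN_PART_SIZE > streamed_bytes:
            item = small.popleft()
            streamed.append(item)
            streamed_bytes += item[1]
        # Not enough small files: stream the smallest remaining big file too.
        if MIN_PART_SIZE > streamed_bytes and big:
            item = big.popleft()
            streamed.append(item)
            streamed_bytes += item[1]
        # Pair the streamed part with a copy of the largest remaining big file.
        copied = big.pop() if big and streamed_bytes >= MIN_PART_SIZE else None
        duos.append({"stream": streamed, "copy": copied})
    return duos


if __name__ == "__main__":
    mb = 1024 * 1024
    demo = [("s1", 3 * mb), ("s2", 3 * mb), ("b1", 6 * mb), ("b2", 10 * mb)]
    for duo in plan_duos(demo):
        print(duo)
```

&lt;p&gt;Copying the largest files and streaming the smallest ones keeps the bytes that actually flow through a worker as low as possible.&lt;/p&gt;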

&lt;p&gt;For CRC32 (required in zip headers): files uploaded with modern AWS SDKs already have CRC32 stored as object metadata. A simple &lt;code&gt;HeadObject&lt;/code&gt; call retrieves it — no need to read the file.&lt;/p&gt;
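
&lt;p&gt;As a rough illustration in Python, assuming a boto3-style S3 client is passed in: &lt;code&gt;head_object&lt;/code&gt; called with &lt;code&gt;ChecksumMode="ENABLED"&lt;/code&gt; returns the checksum as a base64 string in &lt;code&gt;ChecksumCRC32&lt;/code&gt;. The helper name &lt;code&gt;crc32_from_head&lt;/code&gt; is hypothetical, and returning &lt;code&gt;None&lt;/code&gt; for missing or composite checksums is my own choice for this sketch.&lt;/p&gt;

```python
import base64


def crc32_from_head(s3_client, bucket, key):
    """Return the object's stored CRC32 as an int, or None if unavailable.

    s3_client is expected to behave like a boto3 S3 client.
    """
    response = s3_client.head_object(Bucket=bucket, Key=key, ChecksumMode="ENABLED")
    encoded = response.get("ChecksumCRC32")
    if not encoded or "-" in encoded:
        # Either no checksum was stored, or it is a composite value from a
        # multipart upload (suffix "-N"), which is not the CRC32 of the whole file.
        return None
    # S3 stores the 4-byte big-endian CRC32 as base64.
    return int.from_bytes(base64.b64decode(encoded), "big")
```

&lt;p&gt;When this returns &lt;code&gt;None&lt;/code&gt;, the only fallback is to read the object and compute the CRC32 yourself.&lt;/p&gt;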

&lt;h2&gt;
  
  
  Step Functions: three limitations
&lt;/h2&gt;

&lt;p&gt;My original architecture used Step Functions to orchestrate workers. Here's what I hit.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Inline Map caps at ~40 concurrent iterations
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://docs.aws.amazon.com/step-functions/latest/dg/state-map.html#concepts-map-process-modes" rel="noopener noreferrer"&gt;AWS documentation&lt;/a&gt; says the Inline Map state supports "up to 40 concurrent iterations." In practice I saw up to 55, but never more. With 1500 duos to process, Step Functions queued them in batches of 55. &lt;/p&gt;

&lt;p&gt;I switched to &lt;strong&gt;Distributed Map&lt;/strong&gt; which launches Express child workflow executions. All 1120 iterations started within 2 seconds. Problem solved? Not quite.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Distributed Map: fast to dispatch, slow to collect
&lt;/h3&gt;

&lt;p&gt;With Distributed Map, all workers started within 2 seconds. Every single one finished in under 1 second (mostly &lt;code&gt;UploadPartCopy&lt;/code&gt; calls). Total Lambda compute: ~500ms average.&lt;/p&gt;

&lt;p&gt;Yet the Map state took &lt;strong&gt;38 seconds&lt;/strong&gt; to complete.&lt;/p&gt;

&lt;p&gt;The bottleneck? Step Functions' internal machinery for collecting and aggregating results from 1120 Express child executions. I confirmed: all workers started at 10:06:52-53, all finished by 10:06:54, but the Map state didn't report success until 10:07:28. &lt;strong&gt;35 seconds of pure orchestration overhead&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. The 256KB payload limit
&lt;/h3&gt;

&lt;p&gt;Step Functions states can pass at most 256KB between them. With 3000 files:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The planner's assignment list exceeds 256KB → had to write to S3&lt;/li&gt;
&lt;li&gt;The aggregated worker results exceed 256KB → had to write CRC32s to S3, read them back in the finalizer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This added complexity and latency (the finalizer reading 1500 small S3 files — 29 seconds sequentially, until I parallelized it down to 1.5s).&lt;/p&gt;
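
&lt;p&gt;The workaround boils down to a "pass inline or pass a pointer" decision. Here is a minimal Python sketch of that pattern; &lt;code&gt;pass_or_offload&lt;/code&gt; and the &lt;code&gt;store&lt;/code&gt; callable are hypothetical, and in a real workflow &lt;code&gt;store&lt;/code&gt; would wrap an S3 &lt;code&gt;PutObject&lt;/code&gt; call.&lt;/p&gt;

```python
import json

MAX_STATE_PAYLOAD_BYTES = 256 * 1024  # Step Functions limit on data passed between states


def pass_or_offload(payload, store, key):
    """Return the payload inline if it fits, else store it and return a pointer.

    store: any callable taking (key, body_bytes).
    """
    body = json.dumps(payload).encode("utf-8")
    if MAX_STATE_PAYLOAD_BYTES >= len(body):
        return {"inline": payload}
    store(key, body)
    return {"pointer": key}


if __name__ == "__main__":
    saved = {}
    assignments = [{"key": "prefix/file-%05d.bin" % n, "part": n} for n in range(10)]
    print(pass_or_offload(assignments, saved.__setitem__, "plans/job-1.json"))
```

&lt;p&gt;Every pointer costs an extra S3 round-trip in the next state, which is where the added latency came from.&lt;/p&gt;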

&lt;p&gt;After all these fixes, the Step Functions version ran in &lt;strong&gt;41 seconds&lt;/strong&gt; for 3000 × 5MB files. Respectable — 2.5× faster than Jérémie's 106s — but I knew most of that time was Step Functions overhead, not actual work.&lt;/p&gt;

&lt;h2&gt;
  
  
  The final version: direct Lambda invocation
&lt;/h2&gt;

&lt;p&gt;I stripped out Step Functions entirely and wrote a single &lt;strong&gt;orchestrator Lambda&lt;/strong&gt; that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Lists files, computes zip layout (the job of the "planner" Lambda in my Step Functions architecture), and initiates multipart upload (~0.5s)&lt;/li&gt;
&lt;li&gt;Invokes all worker Lambdas &lt;strong&gt;synchronously in parallel&lt;/strong&gt; using goroutines + the Lambda SDK (~0.5s to dispatch)&lt;/li&gt;
&lt;li&gt;Collects results (workers return inline, no S3 round-trip for parts)&lt;/li&gt;
&lt;li&gt;Reads CRC32 files from S3 in parallel, builds central directory, completes multipart upload (~1s)
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Orchestrator Lambda (15min timeout, 1024MB)
    │
    ├─── goroutine → Invoke Worker 1 (sync) → return {parts}
    ├─── goroutine → Invoke Worker 2 (sync) → return {parts}
    ├─── ...
    └─── goroutine → Invoke Worker N (sync) → return {parts}
    │
    └─── All done → Build CD → CompleteMultipartUpload
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Lambda SDK's synchronous &lt;code&gt;Invoke&lt;/code&gt; blocks until the worker returns. With 200 concurrent goroutines, all workers are dispatched instantly. No orchestration overhead, no state size limits for the parts (only CRC32s go to S3), no 35-second result aggregation.&lt;/p&gt;
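
&lt;p&gt;My orchestrator is written in Go with goroutines; the same fan-out/fan-in shape can be sketched in Python with a thread pool. This is an illustration only: &lt;code&gt;fan_out&lt;/code&gt; and &lt;code&gt;fake_worker&lt;/code&gt; are hypothetical names, and &lt;code&gt;invoke_worker&lt;/code&gt; stands in for a wrapper around a synchronous Lambda &lt;code&gt;Invoke&lt;/code&gt; call.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(invoke_worker, assignments, max_concurrency=200):
    """Invoke one worker per assignment in parallel and return results in order.

    invoke_worker: callable taking one assignment and blocking until the
    worker returns.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # map() preserves input order and re-raises the first worker exception.
        return list(pool.map(invoke_worker, assignments))


def fake_worker(assignment):
    # Stand-in for a real worker: pretend each assignment produced one part.
    number = assignment["part_number"]
    return {"part_number": number, "etag": "etag-%d" % number}


if __name__ == "__main__":
    parts = fan_out(fake_worker, [{"part_number": n} for n in range(1, 6)])
    print(parts)
```

&lt;p&gt;Results come back in memory, in order, so the part list can go straight into the multipart completion call.&lt;/p&gt;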

&lt;p&gt;Now the theoretical time is: &lt;code&gt;orchestration time&lt;/code&gt; + &lt;code&gt;time to upload the smallest large file that stays orphan after we pair all large files with groups of small files or single large files&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Results: 3000 × 5MB benchmark
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Time&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Jérémie Gen1 (Rust, streaming)&lt;/td&gt;
&lt;td&gt;212s&lt;/td&gt;
&lt;td&gt;Single Lambda, 512MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jérémie Gen2 (Rust, UploadPartCopy)&lt;/td&gt;
&lt;td&gt;106s&lt;/td&gt;
&lt;td&gt;Single Lambda, 640MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;My Step Functions version&lt;/td&gt;
&lt;td&gt;41s&lt;/td&gt;
&lt;td&gt;Distributed Map, 1120 workers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;My orchestrator Lambda&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Direct invoke, ~1500 workers&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;6 seconds&lt;/strong&gt; to zip 15GB into a single valid ZIP64 archive. That's an 18× improvement over the optimized single-Lambda approach, and 35× over the original.&lt;/p&gt;

&lt;p&gt;Worker stats:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Max memory: &lt;strong&gt;85 MB&lt;/strong&gt; (I initially allocated 3008MB — massively over-provisioned thanks to &lt;code&gt;UploadPartCopy&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Average duration: &lt;strong&gt;516ms&lt;/strong&gt; per worker&lt;/li&gt;
&lt;li&gt;Max duration: &lt;strong&gt;1035ms&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What I learned (round 2)
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Step Functions Parallel Map adds seconds, not milliseconds.&lt;/strong&gt; For latency-sensitive fan-out/fan-in, direct Lambda invocation is faster. Step Functions shines when you need retries, error handling, visual debugging, long-running workflows, or lightning-fast step transitions. That performance only holds until you need more than about 40 parallel iterations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;UploadPartCopy is the killer optimization.&lt;/strong&gt; When most files are ≥5MB, workers barely do any work — they just tell S3 to copy data server-side. Memory stays under 100MB regardless of file sizes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The orchestrator pattern is underrated.&lt;/strong&gt; A single Lambda with goroutines can invoke hundreds of child Lambdas synchronously, collect results, and finalize — all within one execution context. No state machine, no payload limits between states, no aggregation overhead.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Over-parallelization can hurt.&lt;/strong&gt; 1500 separate assignments created more Step Functions overhead than the actual compute. Grouping into fewer, larger batches would have been better for the SFN approach.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;Code: &lt;a href="https://github.com/psantus/on-demand-archive-on-s3" rel="noopener noreferrer"&gt;github.com/psantus/on-demand-archive-on-s3&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The repo has both approaches: Step Functions (&lt;code&gt;cmd/planner&lt;/code&gt; + &lt;code&gt;cmd/worker&lt;/code&gt; + &lt;code&gt;cmd/finalizer&lt;/code&gt;) and the orchestrator Lambda (&lt;code&gt;cmd/orchestrator&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Jérémie's challenge repo: &lt;a href="https://github.com/RustyServerless/demo-s3-archiving" rel="noopener noreferrer"&gt;github.com/RustyServerless/demo-s3-archiving&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next?
&lt;/h2&gt;

&lt;p&gt;The theoretical minimum is bounded by Lambda cold start time (~200ms) plus the slowest &lt;code&gt;UploadPart&lt;/code&gt; call (if we lack small files, we may need to upload a large file manually to append another file's LOC to it) plus orchestrator overhead (~500ms). &lt;/p&gt;

&lt;p&gt;Your move, Jérémie 😏&lt;/p&gt;

&lt;p&gt;Edit: with 73.2GB (15,000 files), my solution still performs quite acceptably: just 20s (probably capped by my account's default Lambda concurrency of 1000; it would likely be faster on an account with a higher limit :D)&lt;/p&gt;

&lt;p&gt;Paul out. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fzqe9cg5mt4eqtp2djrwr.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fzqe9cg5mt4eqtp2djrwr.gif" alt="Mic drop" width="250" height="132"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>s3</category>
      <category>zip</category>
    </item>
    <item>
      <title>Deploy your own OpenVPN Server on AWS with one prompt</title>
      <dc:creator>Noureldin ehab</dc:creator>
      <pubDate>Thu, 25 Jun 2026 17:00:00 +0000</pubDate>
      <link>https://dev.clauneck.workers.dev/aws-builders/deploy-your-own-openvpn-server-on-aws-with-one-prompt-cjf</link>
      <guid>https://dev.clauneck.workers.dev/aws-builders/deploy-your-own-openvpn-server-on-aws-with-one-prompt-cjf</guid>
      <description>&lt;h1&gt;
  
  
  Overview
&lt;/h1&gt;

&lt;p&gt;Most of your AWS resources should be in private subnets for security reasons, but that also means they’re not directly accessible from the internet. To reach them securely, you need a VPN.&lt;/p&gt;

&lt;p&gt;In this tutorial, we’ll use OpenVPN on AWS to create a secure, encrypted connection to your private resources so your team can access them safely.&lt;/p&gt;

&lt;p&gt;Note: Stakpak is open source, vendor neutral, and works with any model you choose.&lt;/p&gt;

&lt;h1&gt;
  
  
  Problem
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;AWS resources in private subnets aren’t accessible from the internet by default.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Teams often try to solve this by opening ports or using bastion hosts, which increases security risks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;These workarounds also add complexity to network management and access control.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A VPN is needed to provide secure and simple access without exposing services publicly.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Business Impact
&lt;/h1&gt;

&lt;p&gt;Without a VPN, secure remote access is harder, slower, and riskier. A VPN simplifies access and keeps development and operations running securely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But what is a VPN?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;VPN (Virtual Private Network)&lt;/strong&gt; is a secure, encrypted connection that allows you to access a private network over the internet as if you were physically inside it. It’s commonly used to safely reach internal servers, databases, or applications without exposing them to the public.&lt;/p&gt;

&lt;h1&gt;
  
  
  Step-by-Step Guide
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://stakpak.gitbook.io/docs/get-started/install-stakpak" rel="noopener noreferrer"&gt;Install Stakpak&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cloud provider credentials configured&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Then just ask it: "I want to install OpenVPN on AWS so I can access my private resources"&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Here you choose your preferences&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fq00smwnqd1908pw1ywbm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fq00smwnqd1908pw1ywbm.png" alt=" " width="800" height="188"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;I want to know more about the different architectures, so let's ask about it&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F0k8ozvszzw5m1ottxx2x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F0k8ozvszzw5m1ottxx2x.png" alt=" " width="800" height="741"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Here I chose:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Which AWS Region? EU West 1&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Do you have a VPC set up? Yes, I have a VPC&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How many people need VPN access? Just one person needs access&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AWS Client VPN, self-hosted OpenVPN, or OpenVPN from the Marketplace? Self-hosted OpenVPN&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fc16x0pmt6qvyecbd8rda.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fc16x0pmt6qvyecbd8rda.png" alt=" " width="799" height="477"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;I will just tell it to continue with the defaults&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ft2zfiye28kfdixgdmsy5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ft2zfiye28kfdixgdmsy5.png" alt=" " width="800" height="490"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Now we can review the commands and press Enter to continue. It will:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Get the VPC details&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Get the subnet details&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Check the internet gateway&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;Now it will create a security group for OpenVPN and get the latest Ubuntu version&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fjfsn6ri7jpwut5scq6hv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fjfsn6ri7jpwut5scq6hv.png" alt=" " width="799" height="218"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Now it will create the security group rules and the SSH key, then launch the EC2 instance&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fs8jq9bz85jdqu73on32t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fs8jq9bz85jdqu73on32t.png" alt=" " width="799" height="260"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Now that we have the EC2 instance ready, Stakpak will start setting up OpenVPN&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Faonsfxb9yusy5kf9exrv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Faonsfxb9yusy5kf9exrv.png" alt=" " width="800" height="221"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;That's it, now we can use OpenVPN&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F6oe5kyt2gdtf7um5qkiy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F6oe5kyt2gdtf7um5qkiy.png" alt=" " width="792" height="146"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Extra Resources:
&lt;/h1&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://stakpak.gitbook.io/docs/get-started/install-stakpak" rel="noopener noreferrer"&gt;Install Stakpak&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/ec2/" rel="noopener noreferrer"&gt;EC2 Documentation&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://openvpn.net/as-docs/general.html" rel="noopener noreferrer"&gt;Open VPN Documentation&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>devops</category>
      <category>vpn</category>
      <category>openvpn</category>
    </item>
    <item>
      <title>Beyond the System Prompt: Building Modular AI Agents with Strands Skills</title>
      <dc:creator>Milad Rezaeighale</dc:creator>
      <pubDate>Thu, 25 Jun 2026 09:08:31 +0000</pubDate>
      <link>https://dev.clauneck.workers.dev/aws-builders/beyond-the-system-prompt-building-modular-ai-agents-with-strands-skills-27om</link>
      <guid>https://dev.clauneck.workers.dev/aws-builders/beyond-the-system-prompt-building-modular-ai-agents-with-strands-skills-27om</guid>
      <description>&lt;p&gt;Anyone who has shipped a multi-capability agent knows the pattern. You start clean. Then the product needs more. You append instructions. Then edge cases. Then domain-specific rules for each capability. Six months later your system prompt is 3,000 tokens of competing guidance that the model has to reconcile on every single call — whether it needs that context or not.&lt;/p&gt;

&lt;p&gt;The problem isn't prompt engineering skill. It's architecture. You're treating instruction delivery like a static config file when it should be dynamic.&lt;/p&gt;

&lt;p&gt;This is the same problem software engineering solved decades ago with modular design. You don't load every library into memory at startup. You import what you need, when you need it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Skills bring that principle to agent instruction design.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What Skills Are
&lt;/h2&gt;

&lt;p&gt;Skills are self-contained instruction packages that an agent loads on demand. The agent's context stays lean — only skill names and descriptions are present at startup. When the agent determines it needs a specific capability, it fetches the full instructions at that moment and executes within them.&lt;/p&gt;

&lt;p&gt;Three properties make this meaningful at scale:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Isolation&lt;/strong&gt; — each skill's instructions are scoped. They can't conflict with each other because they're never in context at the same time unless explicitly needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Token efficiency&lt;/strong&gt; — you pay only for what's active. An agent with ten skills doesn't carry ten sets of instructions into every call.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Maintainability&lt;/strong&gt; — skills are versioned and updated independently. Changing how your agent handles one domain doesn't touch anything else.&lt;/p&gt;

&lt;p&gt;This is progressive disclosure applied to LLM context management.&lt;/p&gt;
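
&lt;p&gt;Stripped of any framework, the idea looks roughly like the sketch below. It is a plain-Python illustration of lazy instruction loading, not how Strands implements it; &lt;code&gt;SkillRegistry&lt;/code&gt; and its methods are hypothetical names.&lt;/p&gt;

```python
class SkillRegistry:
    """Keeps only skill metadata in memory; loads instructions on first use."""

    def __init__(self):
        self._descriptions = {}
        self._loaders = {}
        self._loaded = {}

    def register(self, name, description, loader):
        # loader is a zero-argument callable returning the full instructions.
        self._descriptions[name] = description
        self._loaders[name] = loader

    def catalog(self):
        """Lightweight listing that would be present in context at startup."""
        return [{"name": n, "description": d} for n, d in self._descriptions.items()]

    def activate(self, name):
        """Fetch the full instructions only when the skill is needed."""
        if name not in self._loaded:
            self._loaded[name] = self._loaders[name]()
        return self._loaded[name]


if __name__ == "__main__":
    registry = SkillRegistry()
    registry.register(
        "email-drafter",
        "Drafts professional emails from a plain-English brief.",
        lambda: "Full, detailed instructions for drafting emails...",
    )
    print(registry.catalog())
    print(registry.activate("email-drafter"))
```

&lt;p&gt;The catalog is the only thing every call pays for; the instruction text is loaded once, and only for the skills that actually get used.&lt;/p&gt;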

&lt;h2&gt;
  
  
  Strands and the AgentSkills Plugin
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://strandsagents.com/" rel="noopener noreferrer"&gt;Strands&lt;/a&gt; is AWS's open-source agent SDK for Python and TypeScript. It takes a model-driven approach — instead of hardcoding orchestration logic, the LLM itself decides when to call tools, which order to execute steps, and when it has enough information to respond. This makes agents significantly more flexible without requiring complex orchestration code.&lt;/p&gt;

&lt;p&gt;Strands ships with built-in tool support, multi-agent orchestration, and a plugin system for extending agent behavior. One of those plugins is &lt;strong&gt;&lt;em&gt;AgentSkills&lt;/em&gt;&lt;/strong&gt; — a production implementation of the progressive disclosure pattern.&lt;/p&gt;

&lt;p&gt;Setting up an agent with Strands takes less than ten lines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;strands&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is the capital of France?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Adding skills is one extra step:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;strands&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AgentSkills&lt;/span&gt;

&lt;span class="n"&gt;plugin&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentSkills&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skills&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./skills/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;plugins&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;plugin&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From that point, the agent manages skill discovery and activation automatically — you don't wire any routing logic.&lt;/p&gt;

&lt;h2&gt;
  
  
  How AgentSkills Works in Detail
&lt;/h2&gt;

&lt;p&gt;The plugin operates in three phases:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Discovery&lt;/strong&gt;&lt;br&gt;
At initialization, AgentSkills scans your skills directory and injects only the skill names and descriptions into the system prompt:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;available_skills&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;skill&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;name&amp;gt;&lt;/span&gt;email-drafter&lt;span class="nt"&gt;&amp;lt;/name&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;description&amp;gt;&lt;/span&gt;Drafts professional emails from a plain-English brief.&lt;span class="nt"&gt;&amp;lt;/description&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/skill&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;skill&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;name&amp;gt;&lt;/span&gt;bug-investigator&lt;span class="nt"&gt;&amp;lt;/name&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;description&amp;gt;&lt;/span&gt;Analyzes errors and returns a structured diagnosis.&lt;span class="nt"&gt;&amp;lt;/description&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/skill&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;skill&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;name&amp;gt;&lt;/span&gt;git-commit-writer&lt;span class="nt"&gt;&amp;lt;/name&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;description&amp;gt;&lt;/span&gt;Writes conventional commit messages from a change description.&lt;span class="nt"&gt;&amp;lt;/description&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/skill&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/available_skills&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's all the agent sees upfront — names and descriptions. No instructions, no domain logic, no token cost beyond the metadata.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Activation&lt;/strong&gt;&lt;br&gt;
When the agent decides that an incoming message calls for a specific skill, it calls the built-in skills tool with the skill name as the argument. This is a standard tool call, the same mechanism the agent uses for any other tool. No special routing, no conditional logic on your side.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Execution&lt;/strong&gt;&lt;br&gt;
The tool returns the full contents of the &lt;strong&gt;SKILL.md&lt;/strong&gt; — instructions, rules, output format, everything. The agent now operates within those instructions for that response. Activated skills persist in agent state for the remainder of the session, so they don't need to be re-fetched on follow-up messages in the same domain.&lt;/p&gt;
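
&lt;p&gt;The three phases can be sketched in plain Python. This is only an illustration of the progressive-disclosure idea, not the plugin's actual implementation: the &lt;code&gt;SKILLS&lt;/code&gt; registry, &lt;code&gt;build_discovery_block&lt;/code&gt;, and &lt;code&gt;activate_skill&lt;/code&gt; are hypothetical names, and the instruction strings are made up for the example.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory registry; the real plugin reads SKILL.md files from disk.
SKILLS = {
    "email-drafter": {
        "description": "Drafts professional emails from a plain-English brief.",
        "instructions": "Write a subject line, a greeting, and a short body.",
    },
    "bug-investigator": {
        "description": "Analyzes errors and returns a structured diagnosis.",
        "instructions": "Respond with Root Cause, Fix, and Example sections.",
    },
}


def build_discovery_block(skills):
    """Phase 1: expose only names and descriptions, never the instructions."""
    root = ET.Element("available_skills")
    for name, meta in skills.items():
        skill = ET.SubElement(root, "skill")
        ET.SubElement(skill, "name").text = name
        ET.SubElement(skill, "description").text = meta["description"]
    return ET.tostring(root, encoding="unicode")


def activate_skill(skills, name):
    """Phases 2 and 3: a tool call by name returns the full instructions."""
    if name not in skills:
        return f"Unknown skill: {name}"
    return skills[name]["instructions"]


system_prompt_addition = build_discovery_block(SKILLS)
loaded_instructions = activate_skill(SKILLS, "bug-investigator")
print(system_prompt_addition)
print(loaded_instructions)
```

&lt;p&gt;The point of the split is that the discovery block stays small no matter how long each skill's instructions are; the full text only enters the context when the tool is called.&lt;/p&gt;
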
&lt;h2&gt;
  
  
  Let's See It in Action
&lt;/h2&gt;

&lt;p&gt;To make skill activation visible, I built a simple Streamlit UI — three skills loaded into one agent, each triggered by a different type of message.&lt;/p&gt;

&lt;p&gt;I sent this prompt:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I'm getting this error in my React app, can you help me debug it? TypeError: Cannot read properties of undefined (reading 'map') at App.js:42&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The agent identified it as a bug report, activated the bug-investigator skill on demand, and returned a structured diagnosis — no routing logic, no conditionals, no hardcoded rules.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1s5nb3fd6t9xidq3ill2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1s5nb3fd6t9xidq3ill2.png" alt="Skill activated: bug-investigator — triggered automatically from a single prompt" width="800" height="554"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Flgbk5pyr9kzvk5jwwz3r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Flgbk5pyr9kzvk5jwwz3r.png" alt="Full structured diagnosis: Root Cause, Fix, and Example — all from the SKILL.md instructions." width="800" height="563"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Same agent, one prompt, the right skill loaded automatically.&lt;/p&gt;
&lt;h2&gt;
  
  
  Defining a Skill
&lt;/h2&gt;

&lt;p&gt;A skill is a directory with a single &lt;strong&gt;SKILL.md&lt;/strong&gt; file. The file has two parts: a YAML frontmatter header that the plugin reads, and a markdown body that becomes the agent's instructions.&lt;/p&gt;

&lt;p&gt;skills/&lt;br&gt;
├── bug-investigator/&lt;br&gt;
│   └── SKILL.md&lt;br&gt;
├── email-drafter/&lt;br&gt;
│   └── SKILL.md&lt;br&gt;
└── git-commit-writer/&lt;br&gt;
    └── SKILL.md&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bug-investigator&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyzes&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;an&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;or&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;stack&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;trace&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;returns&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;structured&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;diagnosis&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;with&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;root&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;cause&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;fix."&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Bug Investigator Skill&lt;/span&gt;

You are a senior software debugger. When given an error message or stack trace, respond in this exact format:

🔍 Root Cause:
&lt;span class="nt"&gt;&amp;lt;one&lt;/span&gt; &lt;span class="na"&gt;clear&lt;/span&gt; &lt;span class="na"&gt;sentence&lt;/span&gt; &lt;span class="na"&gt;explaining&lt;/span&gt; &lt;span class="na"&gt;why&lt;/span&gt; &lt;span class="na"&gt;this&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt; &lt;span class="na"&gt;occurs&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

🛠 Fix:
&lt;span class="nt"&gt;&amp;lt;step-by-step&lt;/span&gt; &lt;span class="na"&gt;instructions&lt;/span&gt; &lt;span class="na"&gt;to&lt;/span&gt; &lt;span class="na"&gt;resolve&lt;/span&gt; &lt;span class="na"&gt;it&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

✅ Example:
&lt;span class="nt"&gt;&amp;lt;a&lt;/span&gt; &lt;span class="na"&gt;minimal&lt;/span&gt; &lt;span class="na"&gt;corrected&lt;/span&gt; &lt;span class="na"&gt;code&lt;/span&gt; &lt;span class="na"&gt;snippet&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

Rules:
&lt;span class="p"&gt;-&lt;/span&gt; Be precise — if the error is ambiguous, ask one clarifying question.
&lt;span class="p"&gt;-&lt;/span&gt; Always explain the why, not just the what.
&lt;span class="p"&gt;-&lt;/span&gt; Keep the example under 10 lines.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The name field must be lowercase alphanumeric with hyphens, 1–64 characters. The description is what the agent reads to decide whether to activate the skill — write it as a clear, specific one-liner. Vague descriptions lead to wrong activations.&lt;/p&gt;
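
&lt;p&gt;If you generate skill names in code, a quick pre-check can catch invalid ones early. The sketch below encodes only the constraint stated above; the pattern is my assumption, and the plugin may enforce stricter rules (for example around leading or repeated hyphens). &lt;code&gt;is_valid_skill_name&lt;/code&gt; is a hypothetical helper, not part of Strands.&lt;/p&gt;

```python
import re

# Assumed pattern: lowercase letters, digits, and hyphens, 1 to 64 characters.
# The plugin may apply stricter rules (for example, around leading hyphens).
SKILL_NAME_PATTERN = re.compile(r"[a-z0-9-]{1,64}")


def is_valid_skill_name(name):
    """Return True when the name fits the stated constraint."""
    return SKILL_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["bug-investigator", "Bug_Investigator", "a" * 65]:
    print(candidate[:20], is_valid_skill_name(candidate))
```
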

&lt;p&gt;An optional allowed-tools field restricts which tools the skill can use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pdf-processor&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Extracts text and tables from PDF files using shell scripts.&lt;/span&gt;
&lt;span class="na"&gt;allowed-tools&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;file_read shell&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
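
&lt;p&gt;To make the two-part file layout concrete, here is a minimal sketch that splits a &lt;strong&gt;SKILL.md&lt;/strong&gt; into its frontmatter and body using only the standard library. It is a deliberately naive parser for flat &lt;code&gt;key: value&lt;/code&gt; lines, not how the plugin reads files, and it assumes &lt;code&gt;allowed-tools&lt;/code&gt; is a space-separated list as in the example above. &lt;code&gt;parse_skill_file&lt;/code&gt; and the sample body are hypothetical.&lt;/p&gt;

```python
def parse_skill_file(text):
    """Split SKILL.md text into a frontmatter dict and a markdown body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening frontmatter delimiter")
    try:
        end = lines.index("---", 1)
    except ValueError:
        raise ValueError("missing closing frontmatter delimiter") from None
    meta = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip().strip('"')
    # Assumption: allowed-tools is a space-separated list, as in the example above.
    if "allowed-tools" in meta:
        meta["allowed-tools"] = meta["allowed-tools"].split()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


# Hypothetical file contents; only the frontmatter comes from the example above.
sample = """---
name: pdf-processor
description: Extracts text and tables from PDF files using shell scripts.
allowed-tools: file_read shell
---

# PDF Processor

Extract the text first, then the tables.
"""

meta, body = parse_skill_file(sample)
print(meta["name"], meta["allowed-tools"])
print(body)
```

&lt;p&gt;For anything beyond flat keys, a real YAML parser is the safer choice.&lt;/p&gt;
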



&lt;h2&gt;
  
  
  Two Ways to Define Skills
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Filesystem-based&lt;/strong&gt; is the standard approach — each skill in its own directory, versioned alongside your code, easy to review and update independently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Programmatic&lt;/strong&gt; is useful when instructions need to be generated at runtime — pulled from a database, built from environment config, or constructed dynamically per tenant:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;strands&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Skill&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AgentSkills&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;

&lt;span class="n"&gt;skill&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Skill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summarizer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Condenses any text into a bullet-point summary preserving all key facts.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Extract the 3-5 most important points as bullet points. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Add a one-sentence TL;DR at the top. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Do not add information not present in the source text.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;plugin&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentSkills&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skills&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;skill&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;plugins&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;plugin&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both approaches compose cleanly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;plugin&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentSkills&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skills&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./skills/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dynamic_skill&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the practical setup for most production agents — static skills for stable capabilities, programmatic skills for anything that varies by environment or user context.&lt;/p&gt;
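
&lt;p&gt;As an illustration of the per-tenant case, the sketch below builds the three fields a programmatic skill needs from tenant settings. It stops short of importing Strands so it runs anywhere; &lt;code&gt;build_tenant_skill_kwargs&lt;/code&gt;, the tenant name, and the settings are all hypothetical, and the final hand-off to &lt;code&gt;Skill&lt;/code&gt; assumes the constructor shown earlier.&lt;/p&gt;

```python
def build_tenant_skill_kwargs(tenant_name, tone, max_bullets):
    """Build keyword arguments for a per-tenant summarizer skill."""
    slug = tenant_name.lower().replace(" ", "-")
    return {
        "name": f"summarizer-{slug}",
        "description": f"Condenses text into a bullet-point summary for {tenant_name}.",
        "instructions": (
            f"Extract up to {max_bullets} key points as bullet points. "
            f"Use a {tone} tone. "
            "Do not add information not present in the source text."
        ),
    }


# Hypothetical tenant settings; in practice these might come from a database.
kwargs = build_tenant_skill_kwargs("Acme Corp", tone="formal", max_bullets=5)
print(kwargs["name"])
# With Strands installed, the constructor shown earlier would take these as:
#   skill = Skill(**kwargs)
#   plugin = AgentSkills(skills=["./skills/", skill])
```
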

&lt;h2&gt;
  
  
  When to Reach for Skills
&lt;/h2&gt;

&lt;p&gt;Skills aren't the right tool for every agent. If your agent has one job, a well-crafted system prompt is simpler and sufficient.&lt;/p&gt;

&lt;p&gt;Skills pay off when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your agent handles genuinely different domains where instruction sets would conflict&lt;/li&gt;
&lt;li&gt;You're optimizing for token cost at scale across high-volume calls&lt;/li&gt;
&lt;li&gt;You need independent versioning of capabilities across a team&lt;/li&gt;
&lt;li&gt;You're building toward a multi-skill agent that will grow over time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They're a step below full multi-agent orchestration: more structure than a monolithic prompt, less overhead than spawning separate agents per capability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;Full project with Streamlit UI on GitHub:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/miladrezaei-ai/strands-agent-skills" rel="noopener noreferrer"&gt;https://github.com/miladrezaei-ai/strands-agent-skills&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/miladrezaei-ai/strands-agent-skills
&lt;span class="nb"&gt;cd &lt;/span&gt;strands-agent-skills
uv &lt;span class="nb"&gt;sync
&lt;/span&gt;aws configure   &lt;span class="c"&gt;# or AWS SSO&lt;/span&gt;
uv run streamlit run app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Where does your current agent prompt need this kind of separation? Would love to hear what you're building.&lt;/p&gt;

</description>
      <category>agenticai</category>
      <category>strandsagents</category>
      <category>agentskills</category>
      <category>ai</category>
    </item>
    <item>
      <title>AWS Lambda MicroVMs: I Tested the New Stateful Serverless Primitive</title>
      <dc:creator>Alexey Vidanov</dc:creator>
      <pubDate>Thu, 25 Jun 2026 03:49:35 +0000</pubDate>
      <link>https://dev.clauneck.workers.dev/aws-builders/aws-lambda-microvms-i-tested-the-new-stateful-serverless-primitive-40jf</link>
      <guid>https://dev.clauneck.workers.dev/aws-builders/aws-lambda-microvms-i-tested-the-new-stateful-serverless-primitive-40jf</guid>
      <description>&lt;h2&gt;
  
  
  What just happened
&lt;/h2&gt;

&lt;p&gt;On June 22, 2026, AWS quietly launched AWS Lambda MicroVMs. Not a Lambda feature update. A new compute primitive sitting between AWS Lambda Functions (stateless, 15-min max) and EC2 (full VM, you manage everything).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F4dgxkyb4ysr2sbpy3emc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F4dgxkyb4ysr2sbpy3emc.png" alt=" " width="800" height="596"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each MicroVM is an isolated Firecracker VM with its own HTTPS endpoint, running your code from a pre-built snapshot. Stateful. Up to 8 hours. Suspend when idle, resume on demand.&lt;/p&gt;

&lt;p&gt;I tested it the same week. Here's what I found.&lt;/p&gt;

&lt;h2&gt;
  
  
  The test setup
&lt;/h2&gt;

&lt;p&gt;A minimal Python HTTP server, packaged with a Dockerfile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;http.server&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HTTPServer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BaseHTTPRequestHandler&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseHTTPRequestHandler&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;request_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;do_GET&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request_count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello from Lambda MicroVM!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;uptime_seconds&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;requests_served&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getpid&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_header&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end_headers&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="nc"&gt;HTTPServer&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0.0.0.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8080&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;serve_forever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Dockerfile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; public.ecr.aws/lambda/microvms:al2023-minimal&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;dnf &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; python3 &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; dnf clean all
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; app.py .&lt;/span&gt;
&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 8080&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["python3", "app.py"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;Three steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Zip code + Dockerfile → upload to Amazon Simple Storage Service (Amazon S3)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;create-microvm-image&lt;/code&gt; builds the container, starts the app, takes a Firecracker snapshot of memory and disk&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;run-microvm&lt;/code&gt; launches from that snapshot&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Every launch resumes from the pre-initialized state. No cold boot. Your app is already running the moment the MicroVM starts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws lambda-microvms create-microvm-image &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; hello-microvm-test &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--code-artifact&lt;/span&gt; &lt;span class="s2"&gt;"uri=s3://my-bucket/artifact.zip"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--base-image-arn&lt;/span&gt; arn:aws:lambda:us-east-1:aws:microvm-image:al2023-1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--build-role-arn&lt;/span&gt; arn:aws:iam::123456789:role/MicroVMBuildRole
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Image build took about 3 minutes. Once done:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws lambda-microvms run-microvm &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--image-identifier&lt;/span&gt; arn:aws:lambda:us-east-1:123456789:microvm-image:hello-microvm-test &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--execution-role-arn&lt;/span&gt; arn:aws:iam::123456789:role/MicroVMExecutionRole &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--idle-policy&lt;/span&gt; &lt;span class="s1"&gt;'{"maxIdleDurationSeconds":300,"suspendedDurationSeconds":60,"autoResumeEnabled":true}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"microvmId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"microvm-489fbc1b-1c73-3b37-a9f2-266d0173cb94"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"RUNNING"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"endpoint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"34cf7dac-bb5c.lambda-microvm.us-east-1.on.aws"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Measured&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Image build&lt;/td&gt;
&lt;td&gt;~3 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Launch API call&lt;/td&gt;
&lt;td&gt;1.17s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time to RUNNING&lt;/td&gt;
&lt;td&gt;~12s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;First request (from snapshot)&lt;/td&gt;
&lt;td&gt;911ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Warm request latency&lt;/td&gt;
&lt;td&gt;~340ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Suspend → Resume&lt;/td&gt;
&lt;td&gt;1.86s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The 340ms warm latency includes my network round-trip from Hamburg to us-east-1. The actual compute latency is lower.&lt;/p&gt;

&lt;h2&gt;
  
  
  Statefulness proof
&lt;/h2&gt;

&lt;p&gt;This is the part that matters. After three requests:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"requests_served"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"uptime_seconds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;434.76&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"pid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Suspend the MicroVM. Resume it. Send another request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"requests_served"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"uptime_seconds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;454.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"pid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same PID. Counter continued from where it left off. Uptime kept ticking (includes suspended time). Full memory and disk state preserved across suspend/resume.&lt;/p&gt;
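
&lt;p&gt;If you want to automate this check, the comparison is small enough to script. The sketch below takes two parsed responses and confirms that the process identity is unchanged and that the counters moved forward; &lt;code&gt;state_preserved&lt;/code&gt; is a hypothetical helper, and the two dictionaries are the responses shown above.&lt;/p&gt;

```python
def state_preserved(before, after):
    """Check that a later response comes from the same, still-running process."""
    return (
        after["pid"] == before["pid"]
        and after["requests_served"] > before["requests_served"]
        and after["uptime_seconds"] > before["uptime_seconds"]
    )


# The two responses shown above, before and after suspend/resume.
before = {"requests_served": 3, "uptime_seconds": 434.76, "pid": 1}
after = {"requests_served": 5, "uptime_seconds": 454.1, "pid": 1}
print(state_preserved(before, after))
```

&lt;p&gt;A process that had been restarted from the original snapshot instead of resumed would report a lower request count and uptime, so the same check would fail.&lt;/p&gt;
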

&lt;h2&gt;
  
  
  Authentication
&lt;/h2&gt;

&lt;p&gt;Each request needs a JWE token generated via the API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws lambda-microvms create-microvm-auth-token &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--microvm-id&lt;/span&gt; microvm-489fbc1b &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--expiration-in-minutes&lt;/span&gt; 15 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--allowed-ports&lt;/span&gt; &lt;span class="s1"&gt;'[{"port":8080}]'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The token goes in the &lt;code&gt;X-aws-proxy-auth&lt;/code&gt; header. Short-lived, scoped to specific ports. No way to hit someone else's MicroVM.&lt;/p&gt;
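
&lt;p&gt;On the client side, attaching the token is a one-header change. Here is a minimal sketch with the standard library that builds the request without sending it. The endpoint is the one from the response above; the token value, the root path, and the &lt;code&gt;build_microvm_request&lt;/code&gt; helper are assumptions for illustration.&lt;/p&gt;

```python
import urllib.request


def build_microvm_request(endpoint, token, path="/"):
    """Build (but do not send) a GET request carrying the proxy auth token."""
    url = f"https://{endpoint}{path}"
    return urllib.request.Request(
        url,
        headers={"X-aws-proxy-auth": token},
        method="GET",
    )


# Endpoint copied from the response above; the token value is a placeholder.
request = build_microvm_request(
    "34cf7dac-bb5c.lambda-microvm.us-east-1.on.aws",
    token="PLACEHOLDER_JWE_TOKEN",
)
print(request.full_url)
print(request.get_header("X-aws-proxy-auth"))
# To send it: urllib.request.urlopen(request, timeout=10)
```
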

&lt;h2&gt;
  
  
  What this replaces
&lt;/h2&gt;

&lt;p&gt;Before Lambda MicroVMs, running untrusted code (AI-generated, user-submitted) meant:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Containers with custom hardening&lt;/strong&gt; — shared kernel, escape risk, significant engineering to harden&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EC2 per user&lt;/strong&gt; — minutes to start, expensive, you manage everything&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lambda Functions&lt;/strong&gt; — 15-min max, stateless, no interactive sessions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Lambda MicroVMs fills the gap: VM-level isolation with a serverless operational model. No capacity planning. No kernel to patch. Suspend when idle, pay only for snapshot storage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Specs and limits
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Compute:&lt;/strong&gt; 0.5–8 GB RAM baseline, burst to 32 GB. 0.25–4 vCPU baseline, burst to 16.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Disk:&lt;/strong&gt; up to 32 GB&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runtime:&lt;/strong&gt; max 8 hours&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture:&lt;/strong&gt; ARM64 only (for now)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Protocols:&lt;/strong&gt; HTTP/1.1, HTTP/2, gRPC, WebSocket, SSE&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Regions:&lt;/strong&gt; us-east-1, us-east-2, us-west-2, eu-west-1, ap-northeast-1&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Pricing model
&lt;/h2&gt;

&lt;p&gt;Three dimensions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Compute:&lt;/strong&gt; per-second, based on your chosen baseline + peak usage above it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Snapshot operations:&lt;/strong&gt; read/write when launching or suspending&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Snapshot storage + data transfer&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Suspended MicroVMs cost only storage. No compute charges while idle.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who should care
&lt;/h2&gt;

&lt;p&gt;If you're building any of these, Lambda MicroVMs changes your architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI agent sandboxes (execute generated code safely)&lt;/li&gt;
&lt;li&gt;Browser-based IDEs (each user gets their own env)&lt;/li&gt;
&lt;li&gt;CI/CD runners (isolated per job, no shared state)&lt;/li&gt;
&lt;li&gt;Jupyter/analytics (state persists across sessions)&lt;/li&gt;
&lt;li&gt;Vulnerability scanning (disposable, isolated)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What I'd watch
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;ARM64 only is a constraint for workloads compiled for x86&lt;/li&gt;
&lt;li&gt;5 regions at launch means some customers wait&lt;/li&gt;
&lt;li&gt;The snapshot-based model means your app's initialization needs to be snapshot-friendly (no stale connections, no clock-sensitive state at init)
&lt;del&gt;- Pricing details not fully public yet at time of writing&lt;/del&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;p&gt;You need AWS CLI v2.35.10+. The &lt;code&gt;lambda-microvms&lt;/code&gt; service is a separate command namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws lambda-microvms list-managed-microvm-images &lt;span class="nt"&gt;--region&lt;/span&gt; us-east-1
aws lambda-microvms create-microvm-image &lt;span class="nt"&gt;--help&lt;/span&gt;
aws lambda-microvms run-microvm &lt;span class="nt"&gt;--help&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The base image (&lt;code&gt;al2023-1&lt;/code&gt;) is Amazon Linux 2023 minimal. Your Dockerfile adds what you need on top.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;Lambda MicroVMs bills per second across three dimensions. You configure a baseline and pay for&lt;br&gt;
  burst capacity only when used.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Compute (eu-west-1):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;vCPU: $0.0000291572 per second&lt;/li&gt;
&lt;li&gt;Memory: $0.0000038603 per second per GB&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You pay baseline while running. Burst above baseline is charged only for the seconds consumed&lt;br&gt;
  at peak, not for the full duration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Snapshot operations and storage&lt;/strong&gt; are charged separately (pricing not fully detailed at&lt;br&gt;
  launch).&lt;/p&gt;

&lt;h3&gt;
  
  
  Real-world example: Playwright browser automation
&lt;/h3&gt;

&lt;p&gt;Baseline: 1 vCPU / 2 GB RAM. Chromium bursts to 2 vCPU + 4 GB for 3 seconds during page render.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Simple scrape (stays at baseline)&lt;/strong&gt; — 5s duration → $0.000185 per invocation → $1.85 at 10K/month&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Heavy page (burst 3s of 8s)&lt;/strong&gt; — 8s duration → $0.000405 per invocation → $4.05 at 10K/month&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full PDF render (burst 5s of 12s)&lt;/strong&gt; — 12s duration → $0.000996 per invocation → $9.96 at 10K/month&lt;/p&gt;

&lt;p&gt;A Playwright job that needs 4 GB for 3 seconds of an 8-second run costs roughly 30% less than a fixed 4 GB allocation would for the full duration (at the rates above, assuming burst is billed at the baseline rate). Configure for your typical workload, let Lambda handle the spikes.&lt;/p&gt;
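
&lt;p&gt;The arithmetic behind the first two scenarios can be sketched in a few lines of Python. This is a rough sketch, assuming burst capacity is billed at the same per-second rates as the baseline and only for the burst seconds. The helper name and its parameters are illustrative and not part of any AWS API.&lt;/p&gt;

```python
# Per-second rates quoted above for eu-west-1.
VCPU_RATE = 0.0000291572   # USD per vCPU-second
MEM_RATE = 0.0000038603    # USD per GB-second


def invocation_cost(duration_s, base_vcpu, base_gb,
                    burst_s=0, burst_vcpu=0, burst_gb=0):
    """Estimate the compute cost of one run (hypothetical helper).

    Assumes burst usage is billed at the baseline rates, and only for
    the seconds spent above the baseline.
    """
    baseline = duration_s * (base_vcpu * VCPU_RATE + base_gb * MEM_RATE)
    # Only the capacity above the baseline counts as burst.
    extra_vcpu = max(burst_vcpu - base_vcpu, 0)
    extra_gb = max(burst_gb - base_gb, 0)
    burst = burst_s * (extra_vcpu * VCPU_RATE + extra_gb * MEM_RATE)
    return baseline + burst


simple = invocation_cost(5, 1, 2)
heavy = invocation_cost(8, 1, 2, burst_s=3, burst_vcpu=2, burst_gb=4)
fixed_peak = invocation_cost(8, 2, 4)   # sized for the peak the whole time

print(f"simple scrape: ${simple:.6f}")
print(f"heavy page:    ${heavy:.6f}")
print(f"fixed at peak: ${fixed_peak:.6f}")
```

&lt;p&gt;The first two results land within rounding of the per-invocation figures above. Comparing the last two lines shows what bursting saves over sizing for the peak, under these assumptions.&lt;/p&gt;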

&lt;p&gt;Suspended MicroVMs incur only snapshot storage costs. No compute charges while idle.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Tested June 24, 2026. Lambda MicroVMs launched June 22 in preview.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Blog: &lt;a href="https://aws.amazon.com/blogs/aws/run-isolated-sandboxes-with-full-lifecycle-control-aws-lambda-introduces-microvms/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/aws/run-isolated-sandboxes-with-full-lifecycle-control-aws-lambda-introduces-microvms/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Product page: &lt;a href="https://aws.amazon.com/lambda/lambda-microvms/" rel="noopener noreferrer"&gt;https://aws.amazon.com/lambda/lambda-microvms/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;CLI: aws-cli v2.35.10+ (&lt;code&gt;aws lambda-microvms&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>lambda</category>
      <category>firecracker</category>
    </item>
    <item>
      <title>I Built a Full-Stack Fantasy Football App Using Kiro and Vercel v0</title>
      <dc:creator>Ogbeide Godstime Osemenkhian</dc:creator>
      <pubDate>Wed, 24 Jun 2026 18:47:44 +0000</pubDate>
      <link>https://dev.clauneck.workers.dev/aws-builders/i-built-a-full-stack-fantasy-football-app-using-kiro-and-vercel-v0-4bab</link>
      <guid>https://dev.clauneck.workers.dev/aws-builders/i-built-a-full-stack-fantasy-football-app-using-kiro-and-vercel-v0-4bab</guid>
      <description>&lt;h2&gt;
  
  
  The Idea
&lt;/h2&gt;

&lt;p&gt;Fantasy Premier League has 11 million players. The official app gives you a wall of numbers and leaves you to interpret them. I wanted something that felt easier to use: squad management, player transfers, a credit economy, and leaderboards, with a clean mobile-first UI on top of AWS infrastructure.&lt;/p&gt;

&lt;p&gt;The H0: Hack the zero Stack with Vercel v0 and AWS Database Hackathon gave me a deadline to actually ship it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Two-Tool Workflow
&lt;/h2&gt;

&lt;p&gt;I split the work between two AI tools and it worked better than I expected:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;v0 by Vercel&lt;/strong&gt; handled the frontend. I described pages and components in natural language and v0 generated production-ready Next.js 16 code with Tailwind CSS and shadcn/ui. The squad builder pitch visualization, dashboard layout, transfer history cards, and responsive mobile nav all came from v0. It nailed the App Router file conventions and Tailwind utility classes without hand-holding.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kiro&lt;/strong&gt; handled the backend and orchestration. I used Kiro's spec-driven workflow: Requirements → Design → Tasks. Once the spec was locked, Kiro executed all 32 implementation tasks autonomously: API routes, auth middleware, DynamoDB operations, credit logic, and transfer validation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture: Deliberately Simple
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Browser → Vercel (Next.js API Routes) → DynamoDB (via AWS SDK)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No API Gateway, no Lambda in the request path. Next.js API routes call DynamoDB directly using &lt;code&gt;@aws-sdk/lib-dynamodb&lt;/code&gt; with IAM credentials injected as Vercel environment variables.&lt;/p&gt;

&lt;p&gt;Auth is AWS Cognito. The JWT lives in an &lt;code&gt;httpOnly&lt;/code&gt; secure cookie called &lt;code&gt;squadiq-token&lt;/code&gt;. Middleware validates presence; API routes decode the &lt;code&gt;sub&lt;/code&gt; claim for the userId.&lt;/p&gt;

&lt;p&gt;Lambda only exists for background work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Player Sync&lt;/strong&gt; - daily cron that pulls data from the FPL API into DynamoDB&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scoring Engine&lt;/strong&gt; - EventBridge trigger after match completion&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The DynamoDB Single-Table Design
&lt;/h2&gt;

&lt;p&gt;Everything lives in one table (&lt;code&gt;SquadIQ-dev&lt;/code&gt;) with composite keys:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;PK&lt;/th&gt;
&lt;th&gt;SK&lt;/th&gt;
&lt;th&gt;Entity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;USER#&amp;lt;id&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PROFILE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;User profile&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;USER#&amp;lt;id&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SQUAD#&amp;lt;competitionId&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Squad&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;USER#&amp;lt;id&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;TRANSFER#&amp;lt;ts&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Transfer record&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;USER#&amp;lt;id&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CREDIT#&amp;lt;ts&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Credit transaction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;LEAGUE#&amp;lt;id&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;META&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;League metadata&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PLAYER#&amp;lt;id&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;META&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Player data&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This means every query is a single &lt;code&gt;GetItem&lt;/code&gt; or &lt;code&gt;Query&lt;/code&gt; on the partition key. No scans, no joins.&lt;/p&gt;
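
&lt;p&gt;To make the access pattern concrete, here is a minimal Python sketch of key builders for the table above. The helper names are hypothetical, the ISO-8601 format for &lt;code&gt;&amp;lt;ts&amp;gt;&lt;/code&gt; is an assumption, and nothing here calls AWS: it only builds the key and query parameters you would hand to a DynamoDB client.&lt;/p&gt;

```python
# Hypothetical helpers that build the composite keys from the table above.
def user_pk(user_id):
    return f"USER#{user_id}"


def squad_key(user_id, competition_id):
    # One squad per user per competition: a single GetItem fetches it.
    return {"PK": user_pk(user_id), "SK": f"SQUAD#{competition_id}"}


def transfer_key(user_id, timestamp):
    # Assumes ISO-8601 UTC timestamps, which sort lexicographically, so a
    # Query on the partition key returns transfers in chronological order.
    return {"PK": user_pk(user_id), "SK": f"TRANSFER#{timestamp}"}


def transfers_query(user_id):
    # Parameters in the shape a DynamoDB Query expects: an exact match on
    # the partition key plus a prefix match on the sort key.
    return {
        "KeyConditionExpression": "PK = :pk AND begins_with(SK, :prefix)",
        "ExpressionAttributeValues": {
            ":pk": user_pk(user_id),
            ":prefix": "TRANSFER#",
        },
    }


print(squad_key("42", "epl-2026"))
print(transfers_query("42")["ExpressionAttributeValues"])
```

&lt;p&gt;Because every entity for a user shares the same partition key, the sort-key prefix is the only thing that changes between "get my squad" and "list my transfers".&lt;/p&gt;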

&lt;h2&gt;
  
  
  What Kiro's Spec Workflow Actually Looks Like
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Write requirements (user stories + acceptance criteria)&lt;/li&gt;
&lt;li&gt;Generate a technical design (data model, API contracts, auth flow)&lt;/li&gt;
&lt;li&gt;Generate implementation tasks from the design&lt;/li&gt;
&lt;li&gt;Run all tasks; Kiro writes the code, runs type checks, iterates on errors&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The spec files live in &lt;code&gt;.kiro/specs/squadiq-live-features/&lt;/code&gt; and serve as living documentation. When I needed to add a new endpoint, I updated the spec and Kiro knew exactly what to implement.&lt;/p&gt;

&lt;h2&gt;
  
  
  What v0 Did Well
&lt;/h2&gt;

&lt;p&gt;v0 is remarkable at generating &lt;strong&gt;complete, styled, accessible React components&lt;/strong&gt; from descriptions. Things that normally take an hour of fiddling with CSS grid took one prompt:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Dashboard with a small pitch on the left and stat cards stacked vertically on the right"&lt;/li&gt;
&lt;li&gt;"Squad builder with a football pitch background showing players in formation positions"&lt;/li&gt;
&lt;li&gt;"Transfer history page with a hero banner and card list"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It understood Tailwind responsive modifiers (&lt;code&gt;md:&lt;/code&gt;, &lt;code&gt;lg:&lt;/code&gt;), shadcn/ui component patterns, and Next.js Image optimization out of the box.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Spec-first saves time.&lt;/strong&gt; 30 minutes on requirements prevented hours of rework. When you define the API response shape before writing code, frontend and backend stay aligned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You don't need Lambda for everything.&lt;/strong&gt; Next.js API routes with IAM credentials are fast, cheap, and way simpler to debug than a Lambda + API Gateway stack.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DynamoDB single-table design is worth the upfront cost.&lt;/strong&gt; Once your access patterns are clear, queries are trivial and fast. The hard part is resisting the urge to add a second table "just in case."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI tools work best with clear boundaries.&lt;/strong&gt; v0 for UI, Kiro for backend + orchestration. Letting each tool do what it's best at produced better results than asking one tool to do everything.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Next.js 16, React 19, TypeScript&lt;/li&gt;
&lt;li&gt;Tailwind CSS + shadcn/ui&lt;/li&gt;
&lt;li&gt;AWS Cognito, DynamoDB, IAM, Lambda (background)&lt;/li&gt;
&lt;li&gt;Vercel (hosting + serverless)&lt;/li&gt;
&lt;li&gt;Kiro (spec-driven development)&lt;/li&gt;
&lt;li&gt;v0 (frontend generation)&lt;/li&gt;
&lt;li&gt;GitHub Actions (CI/CD)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fx2l9xfq1m3onpt5f5prl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fx2l9xfq1m3onpt5f5prl.png" alt=" " width="800" height="519"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/gtogbes/SquadIQ" rel="noopener noreferrer"&gt;github.com/gtogbes/SquadIQ&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Docs&lt;/strong&gt;: &lt;a href="https://github.com/gtogbes/SquadIQ/blob/feat/live-features-api-routes/docs/swagger.yaml" rel="noopener noreferrer"&gt;swagger.yaml&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture&lt;/strong&gt;: &lt;a href="https://github.com/gtogbes/SquadIQ/blob/feat/live-features-api-routes/ARCHITECTURE.md" rel="noopener noreferrer"&gt;ARCHITECTURE.md&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;I created this project for the purposes of entering the H0 Hackathon.&lt;/strong&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Migrate from NGINX to Caddy on AWS</title>
      <dc:creator>Noureldin ehab</dc:creator>
      <pubDate>Wed, 24 Jun 2026 17:00:00 +0000</pubDate>
      <link>https://dev.clauneck.workers.dev/aws-builders/migrate-from-nginx-to-caddy-on-aws-k03</link>
      <guid>https://dev.clauneck.workers.dev/aws-builders/migrate-from-nginx-to-caddy-on-aws-k03</guid>
      <description>&lt;h1&gt;
  
  
  Why Migrate to Caddy?
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://caddyserver.com/" rel="noopener noreferrer"&gt;Caddy&lt;/a&gt; is open source, and it provides automatic HTTPS and certificate renewal out of the box, removing the need for Certbot or cron jobs. It offers secure defaults, simpler configuration, which makes it a lightweight and low maintenance replacement for nginx&lt;/p&gt;

&lt;p&gt;It acts as a reverse proxy, load balancer, and static file server out of the box, with secure defaults and minimal setup.&lt;/p&gt;

&lt;p&gt;Note: Stakpak is open source, vendor neutral, and works with any model you choose.&lt;/p&gt;

&lt;h1&gt;
  
  
  Step by Step Guide
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;Our current setup uses a single tier architecture on AWS to host a static HTML website. It runs on a t3.micro EC2 instance using nginx 1.28.0, serving files from &lt;code&gt;/usr/share/nginx/html/&lt;/code&gt;. The instance is part of the default VPC and resides in a public subnet, allowing direct internet access.&lt;/p&gt;

&lt;p&gt;Traffic is managed by a security group with inbound rules open to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;SSH (port 22)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;HTTP (port 80)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;HTTPS (port 443)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;DNS is handled through Amazon Route 53, where an A record points the domain &lt;code&gt;migratingtocaddy.guku.io&lt;/code&gt; to the instance’s public IP. TLS certificates are issued by Let’s Encrypt and configured via Certbot with the nginx plugin, enabling automatic HTTPS redirection. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The problem with this architecture:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Depends on manual Certbot setup (The renewal cron job can easily be forgotten)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;nginx configuration is unnecessarily complex&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;No built in automation for TLS or reloads&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Higher maintenance for updates and security hardening&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's see how we can fix these problems with Caddy.&lt;/p&gt;

&lt;h1&gt;
  
  
  Prerequisites
&lt;/h1&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Install Stakpak&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Open your terminal and type "stakpak"&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Configure your cloud credentials before opening Stakpak, since it works with your machine's existing setup&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h1&gt;
  
  
  Guide
&lt;/h1&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Ask Stakpak to &lt;code&gt;Migrate from NGINX to Caddy with 0 downtime on AWS&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;First, Stakpak checks our current setup on AWS&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fdu700rmbwed3a8z9xdss.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fdu700rmbwed3a8z9xdss.png" alt=" " width="800" height="247"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Stakpak then recommends three zero-downtime strategies for the migration&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Faem02rwadlnpa4y95hx7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Faem02rwadlnpa4y95hx7.png" alt=" " width="800" height="583"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Since we don't want downtime caused by DNS changes or TLS, let's choose the second option&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fzoo851r8aqzpip6iizze.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fzoo851r8aqzpip6iizze.png" alt=" " width="800" height="589"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Now that we have the ALB and target groups, Stakpak will install Caddy&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;After installing Caddy, Stakpak will copy the website content&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Now wait for the health checks to confirm that Caddy is working fine&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F97y1kl1w0xk1eqg7tvfg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F97y1kl1w0xk1eqg7tvfg.png" alt=" " width="799" height="273"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Now Stakpak updates the DNS to point to the ALB&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;That's it. We are ready to redirect the traffic to Caddy, and since we are using an ALB we will be able to roll back if needed&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Now it's working🥳&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fj51kfw4s5aqpjv2a0vd9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fj51kfw4s5aqpjv2a0vd9.png" alt=" " width="800" height="503"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;ps: don't forget to check our new &lt;a href="https://stakpak.gitbook.io/docs/how-it-works/slack-integration" rel="noopener noreferrer"&gt;Slack Integration&lt;/a&gt;👀&lt;/p&gt;

&lt;h1&gt;
  
  
  Extra Resources:
&lt;/h1&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://stakpak.gitbook.io/docs/get-started/install-stakpak" rel="noopener noreferrer"&gt;Install Stakpak&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/ec2/" rel="noopener noreferrer"&gt;EC2 Documentation&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/vpc/" rel="noopener noreferrer"&gt;VPC Documentation&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/elasticloadbalancing/latest/application/application-load-balancers.html" rel="noopener noreferrer"&gt;Application Load Balancer Documentation&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/route53/" rel="noopener noreferrer"&gt;Route 53 Documentation&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://nginx.org/en/docs/" rel="noopener noreferrer"&gt;NGINX Documentation&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://caddyserver.com/docs/" rel="noopener noreferrer"&gt;Caddy Documentation&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>nginx</category>
      <category>caddy</category>
      <category>devops</category>
    </item>
    <item>
      <title>How I Built a Full High Availability AWS Infrastructure with Terraform Modules</title>
      <dc:creator>Emmanuel Ulu</dc:creator>
      <pubDate>Wed, 24 Jun 2026 13:49:58 +0000</pubDate>
      <link>https://dev.clauneck.workers.dev/aws-builders/how-i-built-a-full-high-availability-aws-infrastructure-with-terraform-modules-2985</link>
      <guid>https://dev.clauneck.workers.dev/aws-builders/how-i-built-a-full-high-availability-aws-infrastructure-with-terraform-modules-2985</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Most AWS tutorials teach you how to launch a single EC2 instance in a public subnet and call it a day. That's fine for learning the basics, but it's nowhere near what production infrastructure looks like.&lt;br&gt;
In this article I'll walk you through how I designed and deployed a full multi-tier, multi-AZ High Availability infrastructure on AWS written entirely in Terraform, structured as reusable modules. By the end you'll understand not just what I built, but why each decision was made.&lt;br&gt;
This is part of my AWS SAA-C03 certification preparation as an AWS Community Builder 2026 (Serverless track).&lt;/p&gt;
&lt;h2&gt;
  
  
  What Does "High Availability" Actually Mean?
&lt;/h2&gt;

&lt;p&gt;High Availability means your system keeps running even when something fails. In AWS, the primary failure unit is an Availability Zone, a physically separate data centre within a region.&lt;/p&gt;

&lt;p&gt;True HA means no single AZ failure can bring your application down. That requires every tier (network, compute, and database) to span multiple AZs.&lt;/p&gt;

&lt;p&gt;Here's what most people get wrong: they put their EC2 instances in two AZs but share a single NAT Gateway in one AZ. When that AZ goes down, all outbound traffic from private subnets dies, even for the instances in the healthy AZ. True HA requires one NAT Gateway per AZ.&lt;/p&gt;
&lt;h2&gt;
  
  
  Architecture Drawing
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fj0bddaj7ardeyyd6eths.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fj0bddaj7ardeyyd6eths.gif" alt="Draw.io" width="800" height="1233"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Architecture Overview
&lt;/h2&gt;

&lt;p&gt;The infrastructure spans 3 Availability Zones in eu-west-1 (Ireland) across 4 tiers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Internet
    |
Route 53 (app.skylumanex.click)
    |
Application Load Balancer (3 public subnets)
    |
Auto Scaling Group (3 private app subnets)
    |
RDS MySQL Multi-AZ (3 private DB subnets)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every tier lives in its own subnet type, in its own security group, with tightly scoped ingress rules.&lt;/p&gt;

&lt;h2&gt;
  
  
  Terraform Module Structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;modules/
├── network/      # VPC, subnets, IGW, NAT, route tables
├── compute/      # Security groups, ALB, ASG, CloudWatch
├── database/     # RDS Multi-AZ, DB subnet group
└── dns/          # Route 53 alias + health check
environments/
└── dev/          # Root module wiring everything together
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each module has exactly 3 files: &lt;code&gt;main.tf&lt;/code&gt;, &lt;code&gt;variables.tf&lt;/code&gt;, and &lt;code&gt;outputs.tf&lt;/code&gt;. Modules communicate through outputs and inputs — the networking module outputs VPC ID and subnet IDs, the compute module takes those as inputs, the database module takes the app security group ID from compute to scope its DB ingress rules.&lt;/p&gt;

&lt;h2&gt;
  
  
  Module 1 — Networking
&lt;/h2&gt;

&lt;p&gt;The networking module creates the entire network foundation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1 VPC (10.0.0.0/16)&lt;/li&gt;
&lt;li&gt;3 public subnets (one per AZ) for the ALB and NAT Gateways&lt;/li&gt;
&lt;li&gt;3 private app subnets (one per AZ) for EC2 instances&lt;/li&gt;
&lt;li&gt;3 private DB subnets (one per AZ) for RDS&lt;/li&gt;
&lt;li&gt;1 Internet Gateway&lt;/li&gt;
&lt;li&gt;NAT Gateway(s), configurable via the &lt;code&gt;single_nat_gateway&lt;/code&gt; toggle&lt;/li&gt;
&lt;li&gt;Route tables: 1 public, 1 private per AZ&lt;/li&gt;
&lt;/ul&gt;
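
&lt;p&gt;One way to picture the nine subnets is to carve /24 blocks out of the VPC range. The sketch below uses Python's standard &lt;code&gt;ipaddress&lt;/code&gt; module. The tier offsets (1, 11, 21) are an assumption made for illustration, not necessarily the values the module uses.&lt;/p&gt;

```python
import ipaddress

# Hypothetical layout: carve /24 subnets out of the VPC CIDR, one per AZ
# per tier. The offsets below are illustrative, not taken from the module.
VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")
AZ_COUNT = 3
TIER_OFFSETS = {"public": 1, "private_app": 11, "private_db": 21}


def subnet_plan(vpc=VPC_CIDR, az_count=AZ_COUNT, offsets=TIER_OFFSETS):
    """Return {tier: [cidr per AZ]} with no overlapping ranges."""
    slices = list(vpc.subnets(new_prefix=24))   # 256 /24 blocks in a /16
    return {
        tier: [str(slices[start + i]) for i in range(az_count)]
        for tier, start in offsets.items()
    }


for tier, cidrs in subnet_plan().items():
    print(f"{tier:12} {cidrs}")
```

&lt;p&gt;Leaving gaps between the tier offsets keeps room to add AZs later without renumbering existing subnets.&lt;/p&gt;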

&lt;h2&gt;
  
  
  The NAT Gateway decision:
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;variable "single_nat_gateway" {
  type    = bool
  default = true  # cost-optimized for dev
}

locals {
  nat_count = var.single_nat_gateway ? 1 : length(local.azs)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Flip &lt;code&gt;single_nat_gateway = false&lt;/code&gt; in production and you get one NAT Gateway per AZ: true HA outbound routing at ~$97/month. Keep it &lt;code&gt;true&lt;/code&gt; for dev at ~$33/month.&lt;/p&gt;
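
&lt;p&gt;The effect of the toggle on failure behaviour can be modelled in a few lines of Python. This is a toy model, not the module's code: it mirrors the &lt;code&gt;nat_count&lt;/code&gt; expression above and assumes each AZ's private route table points at the gateway in its own zone when one exists, and at the first gateway otherwise.&lt;/p&gt;

```python
AZS = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]


def nat_routes(azs, single_nat_gateway):
    """Map each AZ's private route table to the AZ hosting its NAT Gateway.

    Mirrors the nat_count toggle above; the routing rule itself is an
    assumption about how the module wires route tables to gateways.
    """
    nat_count = 1 if single_nat_gateway else len(azs)
    # With one gateway every AZ routes through azs[0]; otherwise each AZ
    # uses the gateway in its own zone.
    return {az: azs[i % nat_count] for i, az in enumerate(azs)}


def azs_losing_egress(azs, single_nat_gateway, failed_az):
    """Healthy AZs whose outbound traffic breaks when failed_az goes down."""
    routes = nat_routes(azs, single_nat_gateway)
    return [az for az, nat_az in routes.items()
            if nat_az == failed_az and az != failed_az]


print(azs_losing_egress(AZS, True, "eu-west-1a"))
print(azs_losing_egress(AZS, False, "eu-west-1a"))
```

&lt;p&gt;With a single gateway, losing its AZ takes outbound traffic down for the two healthy AZs as well. With one gateway per AZ, the second list is empty.&lt;/p&gt;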

&lt;p&gt;&lt;strong&gt;Dynamic AZ lookup, no hardcoded AZ names:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data "aws_availability_zones" "available" {
  state = "available"
}

locals {
  azs = slice(data.aws_availability_zones.available.names, 0, 3)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This makes the module region-agnostic. Deploy to us-east-1 and it automatically picks the right AZs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Module 2 — Compute
&lt;/h2&gt;

&lt;p&gt;The compute module creates the application layer:&lt;br&gt;
Two security groups with layered access:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# ALB SG — internet can reach the ALB on port 80
resource "aws_security_group" "alb" {
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# App SG — only the ALB can reach the EC2 instances
resource "aws_security_group" "app" {
  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;EC2 instances are in private subnets and only accept traffic that came through the ALB. They are never directly reachable from the internet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Launch template with security best practices:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;metadata_options {
  http_endpoint               = "enabled"
  http_tokens                 = "required"  # enforces IMDSv2
  http_put_response_hop_limit = 1
}

monitoring {
  enabled = true  # detailed CloudWatch metrics
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Auto Scaling with CloudWatch alarms:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_cloudwatch_metric_alarm" "cpu_high" {
  comparison_operator = "GreaterThanThreshold"
  threshold           = var.cpu_high_threshold  # default 60%
  alarm_actions       = [aws_autoscaling_policy.scale_out.arn]
}

resource "aws_cloudwatch_metric_alarm" "cpu_low" {
  comparison_operator = "LessThanThreshold"
  threshold           = var.cpu_low_threshold   # default 20%
  alarm_actions       = [aws_autoscaling_policy.scale_in.arn]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Module 3 — Database
&lt;/h2&gt;

&lt;p&gt;RDS MySQL with Multi-AZ, the core HA database setting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_db_instance" "this" {
  engine         = var.db_engine
  instance_class = var.db_instance_class
  multi_az       = true              # synchronous standby + auto failover
  storage_encrypted = true           # encrypted at rest
  storage_type      = "gp3"          # faster and cheaper than gp2
  deletion_protection = true         # safety guardrail
  backup_retention_period = 7        # 7 days of automated backups
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The DB security group only accepts traffic from the app security group — not from any IP address, not from the internet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ingress {
  from_port       = 3306
  to_port         = 3306
  protocol        = "tcp"
  security_groups = [var.app_security_group_id]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
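
&lt;p&gt;Taken together, the three ingress rules form a chain: internet to ALB, ALB to app, app to database. The Python sketch below is a toy model of that chain for reasoning about reachability. The group names and the rule table are simplifications of the Terraform shown above, and the functions are hypothetical, not an AWS API.&lt;/p&gt;

```python
# Toy model of the ingress rules shown in the compute and database modules.
# Each entry: destination group -> list of (allowed source, port).
INGRESS = {
    "alb": [("0.0.0.0/0", 80)],
    "app": [("alb", 80)],
    "db": [("app", 3306)],
}


def allowed(source, dest, port, rules=INGRESS):
    """True if `source` may open a connection to `dest` on `port`."""
    return (source, port) in rules.get(dest, [])


def reachable_from_internet(dest, rules=INGRESS):
    """True if any rule lets arbitrary internet addresses reach `dest`."""
    return any(src == "0.0.0.0/0" for src, _ in rules.get(dest, []))


print(allowed("alb", "app", 80))
print(allowed("0.0.0.0/0", "db", 3306))
print([group for group in INGRESS if reachable_from_internet(group)])
```

&lt;p&gt;Referencing security groups instead of CIDR ranges means the chain stays valid when instances are replaced and their private IPs change.&lt;/p&gt;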



&lt;h2&gt;
  
  
  Module 4 — DNS
&lt;/h2&gt;

&lt;p&gt;Route 53 alias record pointing &lt;code&gt;app.skylumanex.click&lt;/code&gt; to the ALB, with a health check:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_route53_record" "app" {
  zone_id = var.hosted_zone_id
  name    = var.domain_name
  type    = "A"

  alias {
    name                   = var.alb_dns_name
    zone_id                = var.alb_zone_id
    evaluate_target_health = true
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;evaluate_target_health = true&lt;/code&gt; means Route 53 won't route traffic to the ALB if the ALB health checks are failing. Another layer of resilience.&lt;/p&gt;

&lt;h2&gt;
  
  
  Root Module — Wiring Everything Together
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;environments/dev/main.tf&lt;/code&gt; calls all 4 modules and passes outputs between them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;module "networking" {
  source             = "../../modules/network"
  project_name       = var.project_name
  vpc_cidr_block     = var.vpc_cidr_block
  single_nat_gateway = var.single_nat_gateway
  # ...
}

module "compute" {
  source                 = "../../modules/compute"
  vpc_id                 = module.networking.vpc_id
  public_subnet_ids      = module.networking.public_subnet_ids
  private_app_subnet_ids = module.networking.private_app_subnet_ids
  # ...
}

module "database" {
  source                = "../../modules/database"
  vpc_id                = module.networking.vpc_id
  private_db_subnet_ids = module.networking.private_db_subnet_ids
  app_security_group_id = module.compute.app_security_group_id
  # ...
}

module "dns" {
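  source         = "../../modules/dns"
  domain_name    = var.domain_name # the dns module reads var.domain_name; the root variable name is assumed
  alb_dns_name   = module.compute.alb_dns_name
  alb_zone_id    = module.compute.alb_zone_id
  hosted_zone_id = var.hosted_zone_id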
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice how &lt;code&gt;module.networking.vpc_id&lt;/code&gt; flows into both compute and database. &lt;code&gt;module.compute.app_security_group_id&lt;/code&gt; flows into database. Each module is independent but they communicate cleanly through their interfaces.&lt;/p&gt;
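&lt;p&gt;As a minimal sketch of the producing side, the compute module's &lt;code&gt;outputs.tf&lt;/code&gt; might look like the following. The output names match the references in the root module; the resource names &lt;code&gt;aws_security_group.app&lt;/code&gt; and &lt;code&gt;aws_lb.this&lt;/code&gt; are assumptions for illustration, not taken from the repository:&lt;/p&gt;

```hcl
# modules/compute/outputs.tf (resource names are hypothetical)
output "app_security_group_id" {
  description = "Security group attached to the app instances; consumed by the database module"
  value       = aws_security_group.app.id
}

output "alb_dns_name" {
  description = "DNS name of the ALB; consumed by the dns module"
  value       = aws_lb.this.dns_name
}

output "alb_zone_id" {
  description = "Hosted zone ID of the ALB, needed for the Route 53 alias"
  value       = aws_lb.this.zone_id
}
```

&lt;p&gt;Keeping the outputs this narrow means the database and dns modules depend on three values, not on how the compute module is built internally.&lt;/p&gt;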

&lt;h2&gt;
  
  
  Deploying It
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd environments/dev
export TF_VAR_db_password="YourStrongPassword"
terraform init
terraform plan
terraform apply
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Terraform provisions all 40 resources in the correct dependency order automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Proof
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ curl http://app.skylumanex.click
Hello from ip-10-0-11-230.eu-west-1.compute.internal

$ curl http://app.skylumanex.click
Hello from ip-10-0-12-181.eu-west-1.compute.internal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The ALB distributes traffic across instances in private subnets in eu-west-1a and eu-west-1b, and Route 53 resolves the hostname to the ALB. The instances are never directly reachable from the internet.&lt;/p&gt;

&lt;h2&gt;
  
  
  Screenshots
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fkzwyfhweblzojyevwlfk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fkzwyfhweblzojyevwlfk.png" alt="Image2" width="800" height="282"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fh20dp0u9qfq7utg0s78c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fh20dp0u9qfq7utg0s78c.png" alt="Image4" width="800" height="703"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ftqbu2cq8cez4wui41ysm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ftqbu2cq8cez4wui41ysm.png" alt="Image5" width="800" height="173"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fju9i26iv5wxhrvhdz38n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fju9i26iv5wxhrvhdz38n.png" alt="Image6" width="799" height="280"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Lessons
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Modules enforce separation of concerns: the networking module doesn't know about EC2, and the compute module doesn't know about RDS. Each module has one job.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Outputs are the module API: what a module exposes in &lt;code&gt;outputs.tf&lt;/code&gt; is its contract with the outside world. Design them carefully.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The NAT Gateway is the hidden single point of failure, and most HA tutorials miss it. With one shared NAT Gateway, a failure in its AZ cuts off all outbound traffic from the private subnets.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;deletion_protection = true&lt;/code&gt; on RDS is a guardrail, not an obstacle — it saved me from accidentally destroying a database during testing. Disable it explicitly before destroy, never by default.&lt;/p&gt;&lt;/li&gt;
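&lt;li&gt;&lt;p&gt;Never put &lt;code&gt;db_password&lt;/code&gt; in &lt;code&gt;terraform.tfvars&lt;/code&gt;; pass it through the &lt;code&gt;TF_VAR_db_password&lt;/code&gt; environment variable so it stays out of files you might commit. Terraform still writes the value to its state, so keep the state backend encrypted and access-controlled.&lt;/p&gt;&lt;/li&gt;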
&lt;/ol&gt;
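&lt;p&gt;Lesson 3 can be sketched in Terraform roughly as follows. The &lt;code&gt;single_nat_gateway&lt;/code&gt; flag appears in the root module above; everything else here (the &lt;code&gt;public_subnet_ids&lt;/code&gt; input, the local, and the resource names) is an assumption for illustration and may differ from the actual networking module:&lt;/p&gt;

```hcl
# Hypothetical sketch: one NAT Gateway per public subnet (one per AZ)
# unless single_nat_gateway is set to true to save cost in dev.
variable "single_nat_gateway" {
  type    = bool
  default = false
}

variable "public_subnet_ids" {
  type = list(string) # assumed input; one public subnet per AZ
}

locals {
  nat_count = var.single_nat_gateway ? 1 : length(var.public_subnet_ids)
}

resource "aws_eip" "nat" {
  count  = local.nat_count
  domain = "vpc"
}

resource "aws_nat_gateway" "this" {
  count         = local.nat_count
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = var.public_subnet_ids[count.index]
}
```

&lt;p&gt;For the per-AZ gateways to help, each private route table also has to point at the NAT Gateway in its own AZ; that wiring is left out of the sketch.&lt;/p&gt;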

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Add a bastion host or SSM Session Manager for secure instance access&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Enable VPC Flow Logs for network traffic visibility&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Add WAF in front of the ALB&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Build a &lt;code&gt;staging/&lt;/code&gt; environment by copying &lt;code&gt;environments/dev/&lt;/code&gt;; the modules don't change&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Terraform AWS Provider docs&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AWS Well-Architected Framework — Reliability Pillar&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;GitHub: &lt;a href="https://github.com/Emmanuel-DevOps-Portfolio/aws-full-ha-infra" rel="noopener noreferrer"&gt;aws-full-ha-infra&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Part of my AWS SAA-C03 prep as an AWS Community Builder 2026 (Serverless track). Follow along as I build toward certification.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>terraform</category>
      <category>devops</category>
      <category>cloudcomputing</category>
    </item>
    <item>
      <title>AWS Lambda MicroVMs: run untrusted code with VM-level isolation (no infra to manage)</title>
      <dc:creator>will peixoto</dc:creator>
      <pubDate>Wed, 24 Jun 2026 12:44:14 +0000</pubDate>
      <link>https://dev.clauneck.workers.dev/aws-builders/aws-lambda-microvms-run-untrusted-code-with-vm-level-isolation-no-infra-to-manage-3i3</link>
      <guid>https://dev.clauneck.workers.dev/aws-builders/aws-lambda-microvms-run-untrusted-code-with-vm-level-isolation-no-infra-to-manage-3i3</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;AWS just shipped &lt;strong&gt;Lambda MicroVMs&lt;/strong&gt;, a new serverless primitive that gives each user or session a &lt;strong&gt;VM-level isolated sandbox&lt;/strong&gt;, with near-instant launch and &lt;strong&gt;state preserved for up to 8 hours&lt;/strong&gt;, all on Firecracker. Here is what it is, when to reach for it instead of a plain Lambda Function, and how to architect on top of it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;em&gt;🇧🇷 &lt;a href="https://willpeixoto.dev/aws-lambda-microvms-codigo-isolado-serverless" rel="noopener noreferrer"&gt;Leia em português&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Let me put you in a situation. You need to run a piece of code you did not write. Maybe it is the script your user pasted into your platform, maybe it is the snippet an AI agent just generated and wants to execute. And then comes the question that keeps anyone working on multi-tenant platforms up at night: how do I run this without handing a stranger the keys to the house?&lt;/p&gt;

&lt;p&gt;Until last week you had three paths, each with a catch. A VM gives you strong isolation but takes minutes to boot. A container starts in seconds but shares a kernel, so running untrusted code there takes a pile of hardening. And the Lambda Function was built for short request-response work, not for a session that has to keep live state between one interaction and the next (externalizing it to DynamoDB stores the data, not the live runtime: the running process, the loaded packages, the memory). In the end you chose between performance and isolation, with no way around it. Or so it seemed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Container, VM, or Lambda: the trade-off none of them solved alone
&lt;/h2&gt;

&lt;p&gt;This pattern got common: AI coding assistants, interactive code environments, analytics, vulnerability scanners, game servers running player scripts. They all need the same thing: give each user their own environment to run code the team did not write, safely and without lag.&lt;/p&gt;

&lt;p&gt;The knot is that real isolation and low latency pull in opposite directions. From a security angle you want a hard boundary between tenants (the Security pillar of the Well-Architected Framework: isolate what is not trusted). From an experience angle you want that environment up the instant the user shows up. Reconciling the two was the expensive work.&lt;/p&gt;

&lt;p&gt;And there is a nice irony in this story. We spent years learning to build stateless apps, and now state is a requirement again.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The solution to the future was hiding in the past.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is a line a friend dropped in a conversation, and it has not left my head since. Ever felt that way? Because I have. And it is roughly what Lambda MicroVMs does: it brings state back, without handing you the weight of a full VM.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Lambda MicroVMs is
&lt;/h2&gt;

&lt;p&gt;Lambda MicroVMs is a new primitive &lt;strong&gt;inside&lt;/strong&gt; Lambda, built exactly for that gap. Each MicroVM gives a single user or session its own isolated environment that boots fast, keeps memory and disk for the whole session, and suspends at low cost when the user steps away.&lt;/p&gt;

&lt;p&gt;The magic comes from &lt;strong&gt;Firecracker&lt;/strong&gt;, the same lightweight virtualization that already runs over 15 trillion Lambda invocations a month. This is not raw new tech, it is the mature foundation of Lambda itself, exposed in a new way.&lt;/p&gt;

&lt;p&gt;The model is &lt;strong&gt;image-then-launch&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fzr3z1cwd5n28mnspugkj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fzr3z1cwd5n28mnspugkj.png" alt="lambda microvm lifecycle " width="799" height="155"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You build the &lt;strong&gt;image&lt;/strong&gt; once (AWS runs your Dockerfile, initializes the app, and takes a snapshot of memory and disk). After that, every MicroVM you launch resumes from that snapshot instead of cold-booting. That is why launch and resume are near-instant, even for a multi-gigabyte session.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it is actually for (with examples you will recognize)
&lt;/h2&gt;

&lt;p&gt;The main cue: this only enters the picture if you are &lt;strong&gt;building a platform that runs third-party code&lt;/strong&gt;. If your app does not execute outside code, you do not need it. It is a building block for people who build that kind of product:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Replit, CodeSandbox, "VS Code in the browser":&lt;/strong&gt; the user types code in the browser and it runs isolated, per user, holding state while the tab is open. That "runs isolated" is the MicroVM.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code interpreter (like ChatGPT's or Claude's):&lt;/strong&gt; you ask "plot this CSV", the AI writes Python and &lt;strong&gt;runs it&lt;/strong&gt; to answer you. The runtime that executes that generated code, isolated per conversation, is the use case.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CI/CD runner (and relatives):&lt;/strong&gt; a job runs the code of a Pull Request that may come from any stranger's fork, untrusted by definition, so you want an isolated, disposable runner per job. Same family: a scanner that runs a suspicious binary, a coding-interview platform (the candidate's code runs isolated), an AI agent that runs shell commands.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The thread tying it all together: &lt;strong&gt;each user, session, or job needs its own isolated environment, and the code running there is not code you wrote.&lt;/strong&gt; That is the cue to use a MicroVM instead of a Lambda Function.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lambda Function or Lambda MicroVM?
&lt;/h2&gt;

&lt;p&gt;They do not compete, they complement each other. The official comparison:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Lambda Functions&lt;/th&gt;
&lt;th&gt;Lambda MicroVMs&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;request-response or event-driven (APIs, data processing, automation)&lt;/td&gt;
&lt;td&gt;persistent environments running user or AI-produced untrusted code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Programming model&lt;/td&gt;
&lt;td&gt;function handler invoked in a supported runtime&lt;/td&gt;
&lt;td&gt;any application: run your own binaries, listen on ports, use Linux OS capabilities&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Duration&lt;/td&gt;
&lt;td&gt;up to 15 min per invocation; multi-step workflows up to a year with Lambda Durable Functions&lt;/td&gt;
&lt;td&gt;up to 8 hours per session; suspend and resume across sessions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Runtime&lt;/td&gt;
&lt;td&gt;service-provided runtimes (or customer-provided)&lt;/td&gt;
&lt;td&gt;customer-provided MicroVM images&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Inbound networking&lt;/td&gt;
&lt;td&gt;direct invocations or event-source integrations; response streaming&lt;/td&gt;
&lt;td&gt;inbound access to any port using OSI Layer 7 protocols&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Concurrency&lt;/td&gt;
&lt;td&gt;one request per execution environment at a time&lt;/td&gt;
&lt;td&gt;multiple concurrent connections per MicroVM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Environment state&lt;/td&gt;
&lt;td&gt;warm starts may reuse the environment, but state may not persist across invocations&lt;/td&gt;
&lt;td&gt;memory and disk state preserved on suspend, restored on resume&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scaling&lt;/td&gt;
&lt;td&gt;automatic: Lambda creates and destroys environments in response to traffic&lt;/td&gt;
&lt;td&gt;developer-controlled: you create, suspend, resume, and terminate via API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lifecycle&lt;/td&gt;
&lt;td&gt;fully managed by Lambda&lt;/td&gt;
&lt;td&gt;developer-controlled, with optional idle policies&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pricing&lt;/td&gt;
&lt;td&gt;per-request + GB-seconds&lt;/td&gt;
&lt;td&gt;per-second of compute while running + snapshot storage while suspended&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The most common confusion: people assume the duration is the same as Lambda's. The startup is similar (both resume from a snapshot), but a Function dies at 15 minutes while a MicroVM holds a session for up to 8 hours with state intact. The real design: your app keeps Lambda Functions for the event-driven backbone, and &lt;strong&gt;calls&lt;/strong&gt; MicroVMs only for the steps that need to run untrusted code in isolation.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works in practice: from endpoint to orchestration
&lt;/h2&gt;

&lt;p&gt;Three things tend to trip people up at first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The endpoint has a status.&lt;/strong&gt; When you call &lt;code&gt;run-microvm&lt;/code&gt;, you get an ID and a dedicated HTTPS endpoint for that MicroVM. But it is not ready instantly: it goes through states, from launch to &lt;code&gt;RUNNING&lt;/code&gt; (about 2 seconds), and when idle it moves to suspended, coming back on resume. The endpoint is per MicroVM, per session.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One image, many MicroVMs.&lt;/strong&gt; You build the image once (&lt;code&gt;create-microvm-image&lt;/code&gt;) and each MicroVM is a &lt;code&gt;run-microvm&lt;/code&gt;. Want two? Call it twice, and you get two independent instances. Idle behavior is governed by the &lt;code&gt;idle-policy&lt;/code&gt;: &lt;code&gt;maxIdleDurationSeconds&lt;/code&gt; (suspend after X idle) and &lt;code&gt;autoResumeEnabled&lt;/code&gt; (the next request wakes the MicroVM on its own, in about 1s, no manual restart). When you are done, &lt;code&gt;terminate-microvm&lt;/code&gt; releases everything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You become the orchestrator.&lt;/strong&gt; Since the endpoint is per session, something has to decide when to launch and where to route. Typically a Lambda Function in the backbone does it: it keeps a &lt;code&gt;session -&amp;gt; MicroVM&lt;/code&gt; map (a store like DynamoDB in production), calls &lt;code&gt;RunMicrovm&lt;/code&gt; on a user's first access, stores the ID and endpoint, mints a short-lived token with &lt;code&gt;CreateMicrovmAuthToken&lt;/code&gt;, and proxies the request to the MicroVM's endpoint with the &lt;code&gt;X-aws-proxy-auth&lt;/code&gt; header. If the instance is suspended and &lt;code&gt;autoResume&lt;/code&gt; is on, the request itself wakes it. Add a routine to terminate orphan MicroVMs and you have the skeleton. The backbone code is in the next post in the series. And do not confuse this with Step Functions: MicroVM is the execution environment, Step Functions is an orchestrator, different layers.&lt;/p&gt;
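&lt;p&gt;The routing logic described above can be sketched in a few lines of Python. This is not the real SDK: the three callables passed to the constructor are hypothetical stand-ins for &lt;code&gt;RunMicrovm&lt;/code&gt;, &lt;code&gt;CreateMicrovmAuthToken&lt;/code&gt;, and &lt;code&gt;terminate-microvm&lt;/code&gt;, the class and method names are made up, and the session map lives in memory where production would use a store like DynamoDB:&lt;/p&gt;

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class MicrovmRecord:
    microvm_id: str
    endpoint: str
    last_seen: float


class SessionRouter:
    """Keeps the session -> MicroVM map and launches a MicroVM on first access."""

    def __init__(
        self,
        run_microvm: Callable[[], Tuple[str, str]],  # stand-in: returns (id, endpoint)
        create_auth_token: Callable[[str], str],     # stand-in: returns a short-lived token
        terminate_microvm: Callable[[str], None],    # stand-in: releases the MicroVM
    ) -> None:
        self._run = run_microvm
        self._token = create_auth_token
        self._terminate = terminate_microvm
        self._sessions: Dict[str, MicrovmRecord] = {}  # in memory; DynamoDB in production

    def route(self, session_id: str, now: Optional[float] = None) -> Tuple[str, Dict[str, str]]:
        """Return the endpoint and headers to use when proxying this session's request."""
        now = time.time() if now is None else now
        record = self._sessions.get(session_id)
        if record is None:
            # First access for this session: launch and remember its MicroVM.
            microvm_id, endpoint = self._run()
            record = MicrovmRecord(microvm_id, endpoint, now)
            self._sessions[session_id] = record
        record.last_seen = now
        # This sketch mints a fresh token on every request.
        headers = {"X-aws-proxy-auth": self._token(record.microvm_id)}
        return record.endpoint, headers

    def reap_orphans(self, max_age_seconds: float, now: Optional[float] = None) -> List[str]:
        """Terminate MicroVMs whose sessions have been silent longer than max_age_seconds."""
        now = time.time() if now is None else now
        stale = [sid for sid, rec in self._sessions.items()
                 if now - rec.last_seen > max_age_seconds]
        for sid in stale:
            self._terminate(self._sessions.pop(sid).microvm_id)
        return stale
```

&lt;p&gt;Injecting the service calls keeps the sketch independent of whatever the real client looks like, and it makes the orphan cleanup easy to exercise with fakes before wiring in the actual API.&lt;/p&gt;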

&lt;h2&gt;
  
  
  Cost, limits, and what is still missing
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Cost is a decision, not a detail.&lt;/strong&gt; Werner Vogels keeps hammering in the &lt;strong&gt;Frugal Architect&lt;/strong&gt; that cost is an architecture requirement, not a number you discover on the bill. The suspend is exactly that in practice: you pay a lot for VM-level isolation, but only while the user is active. When they leave, the MicroVM suspends and the cost drops, with no loss of state. Designing your &lt;code&gt;idle-policy&lt;/code&gt; on purpose is a cost decision. The model, from the official table: you pay &lt;strong&gt;per second of compute while it runs&lt;/strong&gt;, and only &lt;strong&gt;snapshot storage while it is suspended&lt;/strong&gt;. Unit prices are on the &lt;a href="https://aws.amazon.com/lambda/pricing/" rel="noopener noreferrer"&gt;Lambda pricing page&lt;/a&gt;.&lt;/p&gt;
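&lt;p&gt;To make the idle-policy trade-off concrete, here is a rough cost model in Python. It only encodes the shape stated in the table (compute billed while running, snapshot storage billed while suspended). The function name, the per-second storage rate, and the idea of passing rates in as arguments are assumptions; take real unit prices and billing units from the pricing page:&lt;/p&gt;

```python
def session_cost(
    active_seconds: float,
    idle_seconds: float,
    compute_rate_per_second: float,   # assumed unit; see the pricing page
    storage_rate_per_second: float,   # assumed unit; see the pricing page
    suspend_when_idle: bool,
) -> float:
    """Estimate the cost of one session under the running/suspended pricing shape."""
    if suspend_when_idle:
        # Compute is billed only while active; idle time costs snapshot storage.
        return (active_seconds * compute_rate_per_second
                + idle_seconds * storage_rate_per_second)
    # Without suspension the MicroVM keeps running, so idle time bills as compute.
    return (active_seconds + idle_seconds) * compute_rate_per_second
```

&lt;p&gt;The model ignores the window between the last request and the suspend itself (the &lt;code&gt;maxIdleDurationSeconds&lt;/code&gt; setting), which still bills as compute; a shorter window moves more of the idle time into the cheaper bucket.&lt;/p&gt;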

&lt;p&gt;&lt;strong&gt;Limits:&lt;/strong&gt; ARM64, up to 16 vCPUs, 32 GB of memory, and 32 GB of disk per MicroVM, and up to 8 hours of total runtime. Provisioning is flexible: you set a baseline and burst up to 4x at peak, paying the baseline while it runs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IaC:&lt;/strong&gt; you can use the console, CloudFormation, and CDK.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Dockerfile + zip, and not a prebuilt ECR image?&lt;/strong&gt; Aidan Steele dug into it: Lambda builds two copies of the image, one for Graviton 3 and one for Graviton 4, so it needs the source to recompile. The base comes from ECR Public, but pushing your own prebuilt image from a private ECR as the artifact is not the path. One thing that confuses people coming from containers: ECR does not leave your life. You do not &lt;strong&gt;deliver&lt;/strong&gt; the MicroVM image via ECR, but &lt;strong&gt;inside&lt;/strong&gt; the running MicroVM you can run Docker and &lt;code&gt;docker pull&lt;/code&gt; your private ECR images at runtime. ECR is for consumption inside, not for delivering the image itself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Networking and region:&lt;/strong&gt; inbound traffic on configurable ports (HTTP/2, gRPC, WebSockets), service-provided JWE auth, outbound to the internet or your VPC. And it is available so far only in US East (N. Virginia, Ohio), US West (Oregon), Europe (Ireland), and Asia Pacific (Tokyo).&lt;/p&gt;

&lt;h2&gt;
  
  
  When NOT to use it
&lt;/h2&gt;

&lt;p&gt;If the workload is short request-response with no state, it stays a Lambda Function. A MicroVM there is a cannon for a mosquito. And if you just need &lt;strong&gt;more than 15 minutes with your own (trusted) code&lt;/strong&gt;, a MicroVM is also overkill: for a long job, look at Fargate; for a multi-step workflow, Lambda Durable Functions (up to a year, as the table shows). MicroVMs are for when the differentiator is &lt;strong&gt;isolating untrusted code&lt;/strong&gt;, not just going past 15 minutes.&lt;/p&gt;

&lt;p&gt;There is also a gotcha AWS itself flags, and it rhymes with the determinism conversation: since the MicroVM boots from a pre-initialized snapshot (the equivalent of Lambda SnapStart, as Aidan Steele confirmed by testing), apps that generate unique content, open connections, or load ephemeral data at init may diverge. The snapshot froze a moment; whatever needs to be fresh per session cannot be frozen along with it. The fix has a name: &lt;strong&gt;lifecycle hooks&lt;/strong&gt; to re-initialize randomness when each MicroVM is created. Map that out before assuming it just works.&lt;/p&gt;
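&lt;p&gt;A small Python simulation shows why this matters. It does not use the MicroVM API: a snapshot is imitated with &lt;code&gt;copy.deepcopy&lt;/code&gt;, and the &lt;code&gt;on_restore&lt;/code&gt; method is a hypothetical stand-in for whatever a lifecycle hook would run when a MicroVM is created:&lt;/p&gt;

```python
import copy
import secrets


class App:
    def __init__(self) -> None:
        # Runs once, when the image is built; the value is frozen into the snapshot.
        self.session_secret = secrets.token_hex(16)

    def on_restore(self) -> None:
        # Hypothetical hook body: regenerate whatever must be unique per MicroVM.
        self.session_secret = secrets.token_hex(16)


def launch_from_snapshot(snapshot: App, run_hook: bool) -> App:
    """Imitate a launch: copy the frozen state, then optionally run the hook."""
    vm = copy.deepcopy(snapshot)
    if run_hook:
        vm.on_restore()
    return vm


snapshot = App()

# Without a hook, both MicroVMs inherit the same frozen secret.
a = launch_from_snapshot(snapshot, run_hook=False)
b = launch_from_snapshot(snapshot, run_hook=False)
print("shared without hook:", a.session_secret == b.session_secret)

# With the hook, each MicroVM regenerates its own secret after launch.
c = launch_from_snapshot(snapshot, run_hook=True)
d = launch_from_snapshot(snapshot, run_hook=True)
print("shared with hook:", c.session_secret == d.session_secret)
```

&lt;p&gt;The same reasoning applies to open connections and cached ephemeral data: anything created before the snapshot is shared by every MicroVM launched from it unless it is rebuilt after launch.&lt;/p&gt;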

&lt;h2&gt;
  
  
  Does it kill the container?
&lt;/h2&gt;

&lt;p&gt;No, and the reason is even better.&lt;/p&gt;

&lt;p&gt;The hype of the week is "containers are obsolete." They are not. Quite the opposite: Aidan Steele tested it and &lt;strong&gt;you can run Docker inside a MicroVM&lt;/strong&gt;, with OS capabilities enabled. So the MicroVM does not kill the container, it is more isolated and still runs containers inside. The honest cut is different: there is one specific spot, running untrusted code in isolation, where you will no longer want to harden a container by hand. There the MicroVM wins. Everywhere else, the container is still king.&lt;/p&gt;

&lt;h2&gt;
  
  
  The details the docs leave out
&lt;/h2&gt;

&lt;p&gt;Aidan Steele spent launch day poking at the service and found some really interesting things that are not in the official docs.&lt;br&gt;
I read it and figured it was worth bringing here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You can get a shell into the MicroVM&lt;/strong&gt;, via the &lt;code&gt;CreateMicrovmShellAuthToken&lt;/code&gt; API, with pty as a first-class citizen (Lambda Functions do not have it). Gold for IDE and coding-agent use cases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Outbound UDP is blocked by default&lt;/strong&gt; and DNS is a local stub, so DNS inside a container falls back to 8.8.8.8 and fails. The fix is to run with Lambda's DNS: &lt;code&gt;docker run --dns 169.254.169.253&lt;/code&gt;, or go via VPC.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lambda network connectors:&lt;/strong&gt; a reified VPC config (subnets, security groups, an IAM role for the ENI) with its own lifecycle. The network team creates it, the developer just consumes it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt; (his tests): image build 2-3 min; &lt;code&gt;RunMicrovm&lt;/code&gt; to RUNNING about 2s, plus 2s to serve; suspend and resume about 1s each.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What you take away
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Lambda MicroVMs fills a real gap: VM-level isolation &lt;strong&gt;with&lt;/strong&gt; near-instant launch &lt;strong&gt;and&lt;/strong&gt; per-session state, which no single service delivered together.&lt;/li&gt;
&lt;li&gt;It does not replace the Lambda Function, it complements it. Function in the backbone, MicroVM for the untrusted code.&lt;/li&gt;
&lt;li&gt;The idle suspend is a deliberate cost lever: design your &lt;code&gt;idle-policy&lt;/code&gt; on purpose.&lt;/li&gt;
&lt;li&gt;Before locking in architecture: check the region (no São Paulo yet), the limits (ARM64, 16 vCPU, 32 GB, 8h), and the snapshot caveat.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This post was the map. In the next one in the series I actually spin up a MicroVM and we prove the isolation in practice, launching two MicroVMs and testing whether one can reach the other, with the repo on GitHub for you to run along.&lt;/p&gt;

&lt;p&gt;Got a case where you run user or AI code that today is duct-taped onto a container or a hand-rolled VM? Does this primitive fit? Drop a like, share it with whoever is building a multi-tenant platform, and let's talk. Cheers! =D&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://willpeixoto.dev" rel="noopener noreferrer"&gt;willpeixoto.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>serverless</category>
      <category>lambda</category>
      <category>firecracker</category>
    </item>
  </channel>
</rss>
