DEV Community

Tokens Forge

What I learned building a low-cost multi-model AI gateway

I have been building Tokens Forge, a small AI gateway for people who want one practical API key for GPT, Claude, Gemini, and longer AI research workflows.

The product started from a simple frustration: model access is easy, but operating multiple model providers cleanly is not.

Once you move past a prototype, you usually need to answer a few boring but important questions:

  • Which model should this request actually hit?
  • Is this key allowed to call that model?
  • How do I show usage in a way a normal user understands?
  • What happens when one provider fails or gets slow?
  • How do I keep official model spend and discounted routed model spend separate?

Those are not exciting demo features, but they are the difference between a toy wrapper and something people can actually use.
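
Two of those questions, key permissions and provider failure, can be sketched in a few lines. This is a minimal illustration, not how Tokens Forge is implemented: `ApiKey`, `call_with_failover`, and the exception names are hypothetical, and each channel is assumed to be a plain callable that raises (for example on a timeout) when the provider fails or is too slow.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


class ModelNotAllowed(Exception):
    """Raised when an API key is not permitted to call a model."""


class AllChannelsFailed(Exception):
    """Raised when every channel in the ordered list has failed."""


@dataclass
class ApiKey:
    # Hypothetical shape: a key id plus the set of models it may call.
    key_id: str
    allowed_models: set[str] = field(default_factory=set)


# Assumption: a channel is a callable that takes the request payload and
# either returns a response or raises an exception (including timeouts).
Sender = Callable[[dict], Any]


def call_with_failover(
    key: ApiKey,
    model: str,
    channels: list[tuple[str, Sender]],
    payload: dict,
) -> tuple[str, Any]:
    """Check the key's permission, then try channels in order, primary first."""
    if model not in key.allowed_models:
        raise ModelNotAllowed(f"{key.key_id} may not call {model}")

    errors = []
    for name, send in channels:
        try:
            return name, send(payload)
        except Exception as exc:  # any failure moves on to the next channel
            errors.append(f"{name}: {exc}")
    raise AllChannelsFailed("; ".join(errors))
```

Returning the channel name alongside the response makes it possible to record which route actually served the request, which helps later when showing usage.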

The interface I wanted

I wanted the user-facing side to stay boring on purpose:

curl https://tokens-forge.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [
      { "role": "user", "content": "Give me a concise market summary." }
    ]
  }'
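
The same request can be built from Python with only the standard library. This is a sketch that mirrors the curl call above: the URL, headers, and body come from that example, while the helper names `build_chat_request` and `send` are made up for illustration, and the response is returned as a plain dict without assuming its fields.

```python
import json
import urllib.request

BASE_URL = "https://tokens-forge.com/v1"  # taken from the curl example


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_chat_request(
        "sk-your-key", "gpt-5.5", "Give me a concise market summary."
    )
    print(send(req))
```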

The user should not need to care whether the request goes through an official direct route, a compatible provider route, or a subscription-backed pool. They should mostly care about:

  • Can I call the model?
  • What will it cost?
  • Did it work?
  • Where can I inspect usage afterwards?

What became harder than expected

The hardest part was not building a proxy. The hard part was making the product understandable.

For example, Tokens Forge separates two balances:

  • Credit for official/direct model usage
  • RMB Wallet for ordinary routed usage

That sounds like a UI detail, but it affects everything: model cards, billing copy, admin pricing, route health, usage logs, and how users understand discounts.
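
To make the split concrete, here is a minimal sketch of charging the right balance per route. Everything in it is an assumption for illustration: the `Account` shape, the `"official"` and `"routed"` labels, and the rule that a charge never silently falls back to the other balance are not taken from the actual product.

```python
from dataclasses import dataclass
from decimal import Decimal


class InsufficientBalance(Exception):
    """Raised when the balance for the chosen route cannot cover the cost."""


@dataclass
class Account:
    credit: Decimal      # pays for official/direct model usage
    rmb_wallet: Decimal  # pays for ordinary routed usage


def charge(account: Account, route_kind: str, cost: Decimal) -> None:
    """Deduct cost from the balance that matches the route kind.

    route_kind is a hypothetical label: "official" or "routed".
    """
    if cost < 0:
        raise ValueError("cost must not be negative")

    if route_kind == "official":
        if account.credit < cost:
            raise InsufficientBalance("credit")
        account.credit -= cost
    elif route_kind == "routed":
        if account.rmb_wallet < cost:
            raise InsufficientBalance("rmb_wallet")
        account.rmb_wallet -= cost
    else:
        raise ValueError(f"unknown route kind: {route_kind}")
```

Using `Decimal` rather than floats keeps the arithmetic exact, which matters once the same numbers appear in usage logs and billing copy.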

Another lesson: if a route has backup channels, the admin UI needs to explain the route tree. A flat table quickly becomes hard to reason about. I ended up moving toward a tree like:

Channel type -> brand -> channel -> routed models -> primary/backup order

It is much easier to debug when the UI matches the mental model.
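
One way to model that tree is as nested records, with the primary/backup order stored on each routed model. This is a sketch under assumptions: the class names, the `order` field (0 for primary, higher numbers for backups), and the example channel names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RoutedModel:
    name: str
    order: int  # assumption: 0 = primary, 1 = first backup, and so on


@dataclass
class Channel:
    name: str
    models: list[RoutedModel] = field(default_factory=list)


@dataclass
class Brand:
    name: str
    channels: list[Channel] = field(default_factory=list)


@dataclass
class ChannelType:
    name: str
    brands: list[Brand] = field(default_factory=list)


def render(tree: list[ChannelType]) -> list[str]:
    """Flatten the tree into indented lines, one level per two spaces."""
    lines = []
    for channel_type in tree:
        lines.append(channel_type.name)
        for brand in channel_type.brands:
            lines.append(f"  {brand.name}")
            for channel in brand.channels:
                lines.append(f"    {channel.name}")
                for model in sorted(channel.models, key=lambda m: m.order):
                    role = "primary" if model.order == 0 else f"backup {model.order}"
                    lines.append(f"      {model.name} [{role}]")
    return lines


def channels_for(tree: list[ChannelType], model_name: str) -> list[str]:
    """Return the channels serving a model, primary first, then backups."""
    hits = [
        (model.order, channel.name)
        for channel_type in tree
        for brand in channel_type.brands
        for channel in brand.channels
        for model in channel.models
        if model.name == model_name
    ]
    return [name for _, name in sorted(hits)]


# Example tree with made-up names.
example_tree = [
    ChannelType("compatible", [
        Brand("brand-a", [
            Channel("channel-2", [RoutedModel("gpt-5.5", order=1)]),
            Channel("channel-1", [RoutedModel("gpt-5.5", order=0)]),
        ]),
    ]),
]

if __name__ == "__main__":
    print("\n".join(render(example_tree)))
    print(channels_for(example_tree, "gpt-5.5"))
```

Because the same structure drives both the rendered view and the failover order, the admin UI and the router cannot drift apart.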

The research workflow

One feature I did not expect to matter as much is the AI Research Agent.

A lot of users do not only want raw API access. They want to run a longer task, such as market analysis, company research, or trading-support research, then export the result as a PDF and keep a saved history of past runs.

So Tokens Forge includes an AI Research Agent alongside the API gateway. The idea is that users can start with the API and still have a useful workflow ready when they do not want to wire up their own agent stack.

What I am looking for feedback on

I am still refining the positioning.

Should this be explained first as:

  1. A low-cost multi-model AI gateway, with research workflows included; or
  2. An AI research workspace that also gives users OpenAI-compatible model access?

Right now I am leaning toward the first one, because the core business is still model/API access.

If you build with multiple AI providers, I would be interested in what you expect from a gateway product before you trust it:

  • Better pricing visibility?
  • Better usage logs?
  • Failover and backup channels?
  • Model permission controls per API key?
  • Built-in workflows like research reports?

The project is here: https://tokens-forge.com/

I would appreciate feedback from other builders, especially around onboarding clarity and whether the API gateway + research agent combination makes sense.
