Tokens Forge

Posted on Jun 24

What I learned building a low-cost multi-model AI gateway

#ai #api #machinelearning #showdev

I have been building Tokens Forge, a small AI gateway for people who want one practical API key for GPT, Claude, and Gemini, plus longer AI research workflows.

The product started from a simple frustration: model access is easy, but operating multiple model providers cleanly is not.

Once you move past a prototype, you usually need to answer a few boring but important questions:

Which model should this request actually hit?
Is this key allowed to call that model?
How do I show usage in a way a normal user understands?
What happens when one provider fails or gets slow?
How do I keep official model spend and discounted routed model spend separate?

Those are not exciting demo features, but they are the difference between a toy wrapper and something people can actually use.
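
To make two of those questions concrete (is this key allowed to call the model, and what happens when a provider fails), here is a minimal Python sketch. It is not the Tokens Forge implementation: the ApiKey record, the exception names, and the stand-in channel functions are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ApiKey:
    """Hypothetical key record; the field names are assumptions."""
    key_id: str
    allowed_models: set[str] = field(default_factory=set)


class ModelNotAllowed(Exception):
    pass


class AllChannelsFailed(Exception):
    pass


def dispatch(key: ApiKey, model: str, channels: list[Callable[[str], str]]) -> str:
    """Check the key's permission, then try channels in primary/backup order."""
    if model not in key.allowed_models:
        raise ModelNotAllowed(f"{key.key_id} may not call {model}")

    errors = []
    for call_channel in channels:
        try:
            return call_channel(model)
        except Exception as exc:  # a real gateway would catch narrower errors
            errors.append(exc)
    raise AllChannelsFailed(f"{len(errors)} channel(s) failed for {model}")


# Stand-in channels; real ones would make HTTP calls to a provider.
def flaky_primary(model: str) -> str:
    raise TimeoutError("primary timed out")


def healthy_backup(model: str) -> str:
    return f"backup served {model}"


key = ApiKey("sk-demo", {"gpt-5.5"})
result = dispatch(key, "gpt-5.5", [flaky_primary, healthy_backup])
```

Here the primary channel times out, so the request is served by the backup. A production gateway would likely also want timeouts, retry limits, and per-channel health tracking, which this sketch leaves out.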

The interface I wanted

I wanted the user-facing side to stay boring on purpose:

curl https://tokens-forge.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [
      { "role": "user", "content": "Give me a concise market summary." }
    ]
  }'
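
For readers who prefer a script over curl, the same request can be built with Python's standard library. This is only a sketch: the URL, headers, and payload are copied from the curl command, the key is a placeholder, and the helper name build_chat_request is made up for illustration. The actual send is left commented out because it needs a valid key and network access.

```python
import json
import urllib.request

# Endpoint and payload copied from the curl example; the key is a placeholder.
GATEWAY_URL = "https://tokens-forge.com/v1/chat/completions"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl command."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        GATEWAY_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request("sk-your-key", "gpt-5.5", "Give me a concise market summary.")

# To actually send it (needs a valid key and network access):
# with urllib.request.urlopen(request, timeout=30) as response:
#     reply = json.load(response)
```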

The user should not need to care whether the request goes through an official direct route, a compatible provider route, or a subscription-backed pool. They should mostly care about:

Can I call the model?
What will it cost?
Did it work?
Where can I inspect usage afterwards?

What became harder than expected

The hardest part was not building the proxy. It was making the product understandable.

For example, Tokens Forge separates two balances:

Credit for official/direct model usage
RMB Wallet for ordinary routed usage

That sounds like a UI detail, but it affects everything: model cards, billing copy, admin pricing, route health, usage logs, and how users understand discounts.
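
A rough sketch of what keeping the two balances apart could look like in code. The Account fields, the route kinds "official" and "routed", and the amounts are assumptions for illustration, not the real Tokens Forge billing model.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Account:
    """Two separate balances, as described above. Field names are assumptions."""
    credit: Decimal       # official/direct model usage
    rmb_wallet: Decimal   # ordinary routed usage


class InsufficientBalance(Exception):
    pass


def charge(account: Account, route_kind: str, cost: Decimal) -> None:
    """Deduct cost from the balance matching the route kind ("official" or "routed")."""
    if route_kind == "official":
        field_name = "credit"
    elif route_kind == "routed":
        field_name = "rmb_wallet"
    else:
        raise ValueError(f"unknown route kind: {route_kind}")

    balance = getattr(account, field_name)
    if balance < cost:
        # Never fall back to the other balance; keeping them apart is the point.
        raise InsufficientBalance(f"{field_name} balance {balance} < cost {cost}")
    setattr(account, field_name, balance - cost)


# Illustrative amounts only.
account = Account(credit=Decimal("5.00"), rmb_wallet=Decimal("20.00"))
charge(account, "official", Decimal("1.25"))
charge(account, "routed", Decimal("0.40"))
```

Using Decimal rather than float avoids rounding drift in money arithmetic, and refusing to fall back from one balance to the other keeps the usage logs for each balance easy to explain.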

Another lesson: if a route has backup channels, the admin UI needs to explain the route tree. A flat table quickly becomes hard to reason about, so I ended up moving toward a tree like:

Channel type -> brand -> channel -> routed models -> primary/backup order

It is much easier to debug when the UI matches the mental model.
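
One way to picture that tree is as nested dictionaries. The sketch below is an assumption about the shape, not the actual Tokens Forge schema: the brand and channel names and the "claude-example" model are placeholders, and priority 0 is taken to mean the primary route.

```python
# Hypothetical route tree: channel type -> brand -> channel -> model -> priority.
# Priority 0 is the primary route; higher numbers are backups.
ROUTE_TREE = {
    "official": {
        "openai": {
            "openai-direct": {"gpt-5.5": 0},
        },
    },
    "routed": {
        "openai": {
            "channel-a": {"gpt-5.5": 1},
            "channel-b": {"gpt-5.5": 0},
        },
        "anthropic": {
            "channel-a": {"claude-example": 0},
        },
    },
}


def routes_for(tree: dict, channel_type: str, model: str) -> list[tuple[str, str]]:
    """Return (brand, channel) pairs that serve a model, primary first."""
    found = []
    for brand, channels in tree.get(channel_type, {}).items():
        for channel, models in channels.items():
            if model in models:
                found.append((models[model], brand, channel))
    found.sort(key=lambda item: item[0])
    return [(brand, channel) for _, brand, channel in found]


def render(tree: dict, indent: int = 0) -> list[str]:
    """Render the tree as indented lines, the way an admin UI might list it."""
    lines = []
    for name, child in tree.items():
        if isinstance(child, dict):
            lines.append("  " * indent + name)
            lines.extend(render(child, indent + 1))
        else:
            label = "primary" if child == 0 else f"backup {child}"
            lines.append("  " * indent + f"{name} ({label})")
    return lines


order = routes_for(ROUTE_TREE, "routed", "gpt-5.5")
print("\n".join(render(ROUTE_TREE)))
```

With this shape, the failover order for a model is a lookup and a sort, and the indented rendering follows the same hierarchy an admin would click through.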

The research workflow

One feature I did not expect to matter as much is the AI Research Agent.

A lot of users want more than raw API access. They want to run longer tasks such as market analysis, company research, or trading-support research, then export the result as a PDF and keep a saved history of past runs.

So Tokens Forge includes an AI Research Agent alongside the API gateway. The idea is that users can start with the API, but still have a useful workflow ready when they do not want to wire up their own agent stack.

What I am looking for feedback on

I am still refining the positioning.

Should this be explained first as:

A low-cost multi-model AI gateway, with research workflows included; or
An AI research workspace that also gives users OpenAI-compatible model access?

Right now I am leaning toward the first one, because the core business is still model/API access.

If you build with multiple AI providers, I would be interested in what you expect from a gateway product before you trust it:

Better pricing visibility?
Better usage logs?
Failover and backup channels?
Model permission controls per API key?
Built-in workflows like research reports?

The project is here: https://tokens-forge.com/

I would appreciate feedback from other builders, especially around onboarding clarity and whether the API gateway + research agent combination makes sense.
