I have been building Tokens Forge, a small AI gateway for people who want one practical API key for GPT, Claude, Gemini, and longer AI research workflows.
The product started from a simple frustration: model access is easy, but operating multiple model providers cleanly is not.
Once you move past a prototype, you usually need to answer a few boring but important questions:
- Which model should this request actually hit?
- Is this key allowed to call that model?
- How do I show usage in a way a normal user understands?
- What happens when one provider fails or gets slow?
- How do I keep official model spend and discounted routed model spend separate?
Those are not exciting demo features, but they are the difference between a toy wrapper and something people can actually use.
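To make the second question concrete, here is a minimal sketch of a per-key model permission check. It is not the Tokens Forge implementation: the `ApiKey` record, its fields, and `can_call` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ApiKey:
    """Hypothetical key record; field names are illustrative only."""
    key_id: str
    allowed_models: set[str] = field(default_factory=set)
    allow_all: bool = False  # assumed escape hatch for unrestricted keys


def can_call(key: ApiKey, model: str) -> bool:
    """Return True if this key may call the requested model."""
    return key.allow_all or model in key.allowed_models


key = ApiKey(key_id="demo", allowed_models={"gpt-5.5"})
print(can_call(key, "gpt-5.5"))      # allowed model
print(can_call(key, "other-model"))  # model not on the key's list
```

Running a check like this before any routing decision means a disallowed request never reaches a provider or touches a balance.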
The interface I wanted
I wanted the user-facing side to stay boring on purpose:
```shell
curl https://tokens-forge.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [
      { "role": "user", "content": "Give me a concise market summary." }
    ]
  }'
```
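The same call can be sketched in Python using only the standard library. The URL, headers, and payload are copied from the curl example; the helper name `build_chat_request` is hypothetical, and the sketch only builds the request, leaving the send as a commented step.

```python
import json
import urllib.request

BASE_URL = "https://tokens-forge.com/v1"  # taken from the curl example


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completions request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("sk-your-key", "gpt-5.5", "Give me a concise market summary.")
# To actually send it:
#   with urllib.request.urlopen(req) as resp:
#       body = json.load(resp)
```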
The user should not need to care whether the request goes through an official direct route, a compatible provider route, or a subscription-backed pool. They should mostly care about:
- Can I call the model?
- What will it cost?
- Did it work?
- Where can I inspect usage afterwards?
What became harder than expected
The hardest part was not building the proxy. It was making the product understandable.
For example, Tokens Forge separates two balances:
- Credit for official/direct model usage
- RMB Wallet for ordinary routed usage
That sounds like a UI detail, but it affects everything: model cards, billing copy, admin pricing, route health, usage logs, and how users understand discounts.
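A minimal sketch of why the split reaches so far: every charge has to know which balance it belongs to. The `Account` class, the `"official"` and `"routed"` labels, and the amounts below are assumptions made for illustration, not the real billing code.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Account:
    credit: Decimal      # Credit: official/direct model usage
    rmb_wallet: Decimal  # RMB Wallet: ordinary routed usage


def charge(account: Account, route_kind: str, cost: Decimal) -> None:
    """Debit the balance that matches the route kind; never mix the two."""
    if route_kind == "official":
        if account.credit < cost:
            raise ValueError("insufficient Credit")
        account.credit -= cost
    elif route_kind == "routed":
        if account.rmb_wallet < cost:
            raise ValueError("insufficient RMB Wallet balance")
        account.rmb_wallet -= cost
    else:
        raise ValueError(f"unknown route kind: {route_kind}")


acct = Account(credit=Decimal("10.00"), rmb_wallet=Decimal("50.00"))
charge(acct, "official", Decimal("1.25"))  # debits Credit only
charge(acct, "routed", Decimal("0.40"))    # debits RMB Wallet only
```

Using `Decimal` rather than floats avoids rounding drift in balances, and raising on an unknown route kind keeps a misconfigured route from silently charging the wrong balance.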
Another lesson: if a route has backup channels, the admin UI needs to explain the route tree, because a flat table quickly becomes hard to reason about. I ended up moving toward a tree like:
Channel type -> brand -> channel -> routed models -> primary/backup order
It is much easier to debug when the UI matches the mental model.
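The same tree can be sketched as nested data, with failover reduced to walking the channels in priority order. Everything here is hypothetical: the brand, channel, and model names, the "0 means primary" convention, and the `send` callback stand in for whatever the real gateway uses.

```python
from typing import Callable

# channel type -> brand -> channel -> routed model -> priority (0 = primary)
ROUTE_TREE = {
    "official": {
        "brand-a": {
            "channel-1": {"model-x": 0},
        },
    },
    "compatible": {
        "brand-b": {
            "channel-2": {"model-x": 1, "model-y": 0},
            "channel-3": {"model-y": 1},
        },
    },
}


def channels_for(model: str, tree: dict = ROUTE_TREE) -> list[str]:
    """Flatten the tree into channel paths for one model, primary first."""
    found = []
    for ctype, brands in tree.items():
        for brand, channels in brands.items():
            for channel, models in channels.items():
                if model in models:
                    found.append((models[model], f"{ctype}/{brand}/{channel}"))
    return [path for _, path in sorted(found)]


def call_with_failover(model: str, send: Callable[[str], str]) -> str:
    """Try each channel in order; return the first successful response."""
    errors = []
    for path in channels_for(model):
        try:
            return send(path)
        except Exception as exc:  # a real gateway would catch narrower errors
            errors.append(f"{path}: {exc}")
    raise RuntimeError("all channels failed: " + "; ".join(errors))
```

Because the lookup returns full paths such as `official/brand-a/channel-1`, a usage log or error message can point at the exact node in the tree that handled or failed a request.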
The research workflow
One feature I did not expect to matter as much as it does is the AI Research Agent.
A lot of users do not only want raw API access. They want to run a longer task, such as market analysis, company research, or trading-support research, then export the result as a PDF and keep a saved history of past runs.
So Tokens Forge includes an AI Research Agent alongside the API gateway. The idea is that users can start with the API, but still have a useful workflow ready when they do not want to wire up their own agent stack.
What I am looking for feedback on
I am still refining the positioning.
Should this be explained first as:
- A low-cost multi-model AI gateway, with research workflows included; or
- An AI research workspace that also gives users OpenAI-compatible model access?
Right now I am leaning toward the first one, because the core business is still model/API access.
If you build with multiple AI providers, I would be interested in what you expect from a gateway product before you trust it:
- Better pricing visibility?
- Better usage logs?
- Failover and backup channels?
- Model permission controls per API key?
- Built-in workflows like research reports?
The project is here: https://tokens-forge.com/
I would appreciate feedback from other builders, especially around onboarding clarity and whether the API gateway + research agent combination makes sense.