Cli-Modelarium 0.1.4: 10 LLM providers now, with Qwen and GLM

#python #ai #llm #opensource

Quick release note. Cli-Modelarium 0.1.4 just shipped, and the headline is two new providers.

Two new providers, ten in total

You can now compare Alibaba's Qwen models (via DashScope) and Z.AI's GLM models side by side with the rest of the lineup: OpenAI, Anthropic, Google, xAI, DeepSeek, Mistral, Groq, OpenRouter, plus your local models. That brings the total to 10 cloud providers.

If you have wanted to benchmark open-weight models against frontier ones on your own prompts, it now takes a single command once you have upgraded:

pip install --upgrade cli-modelarium

cli-modelarium "Write a haiku about garbage collection in programming" \
  --models qwen3.7-max,glm-5.2,gpt-5.4,claude-opus-4-8 \
  --runs 10 --max-cost 0.50

You get a side-by-side table with cost and latency per model. With --runs greater than 1, it repeats the trials and runs the statistical tests automatically, so you can tell a real difference from noise instead of eyeballing a single output. The --max-cost flag is a hard cap, so a multi-model run does not surprise you on your API bill.
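To give a feel for what "a real difference from noise" means, here is a minimal sketch of a paired percentile bootstrap. It is not Cli-Modelarium's implementation; the function name `paired_bootstrap_ci`, the scores, and the idea that trials are paired by index are all assumptions made for illustration.

```python
import random
import statistics


def paired_bootstrap_ci(a, b, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of the paired differences a[i] - b[i].

    Hypothetical helper for illustration; not part of Cli-Modelarium.
    """
    if len(a) != len(b) or not a:
        raise ValueError("a and b must be non-empty and the same length")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Resample the differences with replacement and record each resample's mean.
    means = sorted(
        statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return statistics.fmean(diffs), means[lo_idx], means[hi_idx]


# Made-up per-trial scores for two models over 10 trials (not real results).
model_a = [1.2, 1.4, 1.1, 1.3, 1.5, 1.2, 1.4, 1.3, 1.1, 1.6]
model_b = [0.9, 1.0, 0.8, 1.1, 1.0, 0.9, 1.2, 1.0, 0.9, 1.1]

mean_diff, lo, hi = paired_bootstrap_ci(model_a, model_b)
print(f"mean difference {mean_diff:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

If the interval excludes zero, the gap is unlikely to be run-to-run noise. With only one run per model there is nothing to resample, which is why repeated trials matter.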

Also in this release

Refreshed all pricing to current provider rates
Added Qwen and GLM to the model groups (all-flagship, all-budget, all-fast, all-cheap), plus GLM to all-reasoning, so you can pull them in by group
Added Python 3.14 support
A few model ID updates to track provider renames

New here?

Cli-Modelarium is a command-line tool for comparing LLM outputs side by side, with real statistics (bootstrap confidence intervals, paired significance tests, McNemar's test), CI-ready assertions, hallucination detection, LLM-as-judge scoring, and cost tracking. One pip install, no infrastructure, Apache 2.0.
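If McNemar's test is new to you: it compares two models on the same pass/fail cases and looks only at the cases where they disagree. The sketch below is a standalone illustration of the exact (binomial) form, not the tool's own code; `mcnemar_exact` and the counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b: cases model A passed and model B failed
    c: cases model A failed and model B passed
    Hypothetical helper for illustration; not part of Cli-Modelarium.
    """
    n = b + c
    if n == 0:
        return 1.0  # the models never disagree, so there is no evidence either way
    k = min(b, c)
    # Under the null hypothesis each disagreement is a fair coin flip,
    # so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up counts: A wins 8 of the disagreements, B wins 1.
p_value = mcnemar_exact(8, 1)
print(p_value)  # 0.0390625
```

Cases where both models pass or both fail do not enter the calculation, so a large test set with few disagreements can still yield a weak result.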