model catalog

Leading open-source models

wylon Token Factory serves the leading open-source model families — MiniMax, Kimi, GLM, Qwen, DeepSeek — all running on wylon's GPU super-node architecture behind a single unified API.

Frontiers

MiniMax

Long-context, productivity-grade workloads — document processing, summarization, multi-turn business conversations.

ContextLong-form / structured

Use casesEnterprise knowledge / long documents

Moonshot

Kimi

Solid on multimodal, ultra-long context, and code — a popular foundation for agents and developer tooling.

ContextLong dialogue / repos

Use casesAgent / IDE / research

Zhipu · Z.ai

GLM

Strong on Chinese corpora — covers general chat, tool use, and agentic workflows.

ContextGeneral chat / tool calls

Use casesCustomer support / Chinese-language workloads

Qwen

Full lineup from on-device tiny models to flagship MoEs, balancing performance and cost.

ContextGeneral / multilingual / multimodal

Use casesContent / RAG / vision

DeepSeek

Reputation for reasoning, code, and math — its MoE line offers excellent price-performance.

ContextReasoning / code / chain-of-thought

Use casesHigh-throughput batch / agent planning

And more

See the full list, context lengths, and pricing in the model matrix below and on the pricing page.

View pricing →

wylon continuously adds new models — the authoritative, up-to-date list is the Model Plaza in your console.

Other

Refer to the live service for the authoritative model list. See the pricing page or your console for current pricing.

ModelsFamilyContextBest for

Kimi-K2.5

General chat · long context

Kimi

128k

Long documents, knowledge Q&A, multi-turn chat

Kimi-K2.6

General chat · enhanced

Kimi

128k

Complex reasoning, creative writing, expert Q&A

MiniMax-M2.5

General chat · cost-effective

MiniMax

256k

Bulk summarization, customer support, content

MiniMax-M2.7

General chat · enhanced

MiniMax

256k

High-quality generation, multi-turn chat, tool use

GLM-5.1

Chinese general · tool use

GLM

128k

Code generation, structured output, customer support

Qwen3.6-35B-A3B

Flagship MoE · reasoning

Qwen

256k

Data analysis, code, workflow orchestration

Qwen3.6-27B

Dense · reasoning

Qwen

256k

Complex reasoning, RAG, IDE assistants

DeepSeek-V4-Flash

Fast · low cost

DeepSeek

64k

High-concurrency inference, summarization, classification

Access method

Single wylon API

All models share the same endpoint — just switch the model field to compare families.

Get started → Switch to wylon → Pricing →

FAQ

Do I need to change my code when switching models?

Just swap the model field — the request shape stays the same. If you rely on tool calls or structured output, we recommend running a regression pass after switching.

Will I be notified when new models go live?

Yes. wylon posts release announcements in the dashboard, and major changes are sent ahead of time.

Can enterprises request private access?

Please contact us — our solutions team can help scope a plan with you.