Billing & consumption

wylon uses pay-as-you-go billing metered per token. This page explains how usage is measured, how charges accrue, and how to review your consumption.

Pricing model

Every inference request is billed on two dimensions:

Input tokens — everything you send: the prompt, system message, tool definitions, and prior turns.
Output tokens — everything the model generates, including tool-call arguments and reasoning tokens.

Rates are quoted per million tokens and vary by model. Batch jobs are billed at a discount off the standard rate. See the pricing page for current rates.
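
As a rough illustration of the arithmetic, the sketch below estimates the charge for one request from its token counts. The function name, the rates, and the 50% batch discount are hypothetical placeholders chosen for the example, not wylon's actual prices; check the pricing page for real values.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(input_tokens, output_tokens, input_rate, output_rate,
                 batch_discount=Decimal("0")):
    """Estimate the cost of one request.

    Rates are prices per million tokens. batch_discount is a fraction
    between 0 and 1 (hypothetical; the real discount is on the pricing page).
    """
    base = (Decimal(input_tokens) * input_rate
            + Decimal(output_tokens) * output_rate) / TOKENS_PER_MILLION
    return base * (Decimal(1) - batch_discount)


# Hypothetical rates: 2.00 per million input tokens, 8.00 per million output tokens.
realtime = request_cost(12_000, 3_000, Decimal("2.00"), Decimal("8.00"))
batch = request_cost(12_000, 3_000, Decimal("2.00"), Decimal("8.00"),
                     batch_discount=Decimal("0.5"))
print(realtime, batch)
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift when many requests are summed.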

Token counts returned in the usage field of each response are authoritative — they are what you will be charged for. Cached prefix tokens (when eligible) are billed at a reduced rate and appear as cached_input_tokens.
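
A minimal sketch of estimating a charge from a response's usage field follows. Only usage and cached_input_tokens are named on this page; the input_tokens and output_tokens keys, the rates, and the assumption that cached tokens are counted inside input_tokens are all hypothetical, so verify them against an actual response and the pricing page.

```python
from decimal import Decimal

# Hypothetical per-million-token rates; real values are on the pricing page.
RATES = {
    "input": Decimal("2.00"),
    "cached_input": Decimal("0.50"),
    "output": Decimal("8.00"),
}


def cost_from_usage(usage, rates=RATES):
    """Estimate the charge for one response from its usage counts."""
    cached = usage.get("cached_input_tokens", 0)
    # Assumption: cached tokens are a subset of input_tokens, so subtract
    # them to get the part billed at the full input rate.
    uncached = usage["input_tokens"] - cached
    if uncached < 0:
        raise ValueError("cached_input_tokens exceeds input_tokens")
    total = (uncached * rates["input"]
             + cached * rates["cached_input"]
             + usage["output_tokens"] * rates["output"])
    return total / Decimal(1_000_000)


# Example usage payload with assumed key names.
usage = {"input_tokens": 10_000, "cached_input_tokens": 4_000, "output_tokens": 1_000}
estimate = cost_from_usage(usage)
print(estimate)
```

Because the reported counts are authoritative, a local estimate like this is only useful for budgeting; the invoice is computed from the same counts at the published rates.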

Usage dashboard

To review consumption, go to Dashboard → Account management → User Center → Usage statistics. The page breaks down usage by model, API key, and day, and the data can be exported.

Invoices & receipts

To view receipts or issue invoices, go to Dashboard → Billing → Invoice management.

FAQ

Are failed requests billed? No. Requests that return a 4xx or 5xx response are not charged. A stream that times out is billed only for the tokens actually produced before the timeout.
Are batch jobs billed differently? Yes — batch jobs are still billed per token, but at a discount versus the real-time rate. See Batch.
Can I raise my usage limits? Contact our sales team to request a higher usage quota.