Token inference API · elevated latency
April 14, 2026
14:52
Resolved
Latency returned to normal. Root cause: a misconfigured traffic-shaping rule pushed during a batch update. The rule was rolled back and the rollout process hardened.
14:31
Investigating
cn-east-1 region P99 latency rose to ~3.2s; P50 unaffected. Traffic was shifted to a backup region.