Overview

VIZOCHOK enforces rate limits at three tiers to protect the platform, ensure fair usage, and prevent abuse:

Tier 1: Per-Connection

10 messages / 60 seconds (sliding window per WebSocket)

Tier 2: Per-User

200 messages / day, 20 conversations / day (per user)

Tier 3: Per-Tenant

500,000 tokens / day, 10,000,000 tokens / month (per tenant)
Each tier is checked in order. If any tier rejects the request, an error is returned immediately and subsequent tiers are not checked.
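
The short-circuit behaviour described above can be pictured with a small sketch. This is not the server's implementation: `checkTiers` and the three stand-in checks are hypothetical names used only for illustration. Only the error code and the default limit of 200 come from this page.

```javascript
// Hypothetical sketch: run tier checks in order and stop at the first rejection.
// Each check returns null when the request is allowed, or an error object otherwise.
function checkTiers(checks, request) {
  for (const check of checks) {
    const error = check(request);
    if (error !== null) {
      return error; // later tiers are never evaluated
    }
  }
  return null;
}

// Illustrative stand-ins for the three tiers; the real checks run on the server.
const calls = [];
const perConnection = () => { calls.push('connection'); return null; };
const perUser = (request) => {
  calls.push('user');
  return request.userMessagesToday >= 200
    ? { type: 'error', code: 'user_message_limit', limit: 200 }
    : null;
};
const perTenant = () => { calls.push('tenant'); return null; };

const result = checkTiers(
  [perConnection, perUser, perTenant],
  { userMessagesToday: 200 }
);
```

Here the per-user check rejects the request, so `result` holds the `user_message_limit` error and the per-tenant check is never called.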

Tier 1: Per-Connection Rate Limit

Protects against rapid message flooding on a single WebSocket connection.
| Parameter | Value |
| --- | --- |
| Max messages | 10 |
| Time window | 60 seconds (sliding) |
| Scope | Single WebSocket connection |

How It Works

The server tracks message timestamps for each connection over a sliding 60-second window. Once 10 messages fall inside that window, further messages are rejected until the oldest one expires.

Error Response

{
  "type": "error",
  "code": "rate_limit_exceeded",
  "retry_after": 45
}
The retry_after field indicates how many seconds until the oldest message in the window expires and a new message can be sent.
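
To make the sliding window and the `retry_after` calculation concrete, here is a minimal sketch of a limiter with the documented defaults (10 messages per 60 seconds). The class name `SlidingWindowLimiter` and its method are hypothetical. The server's actual data structures are not documented here, and rounding `retry_after` up to whole seconds is an assumption.

```javascript
// Hypothetical sliding-window limiter; not the server's actual implementation.
class SlidingWindowLimiter {
  constructor(maxMessages = 10, windowSeconds = 60) {
    this.maxMessages = maxMessages;
    this.windowMs = windowSeconds * 1000;
    this.timestamps = []; // accepted message times in ms, oldest first
  }

  tryAccept(nowMs = Date.now()) {
    // Drop timestamps that have aged out of the window.
    while (this.timestamps.length > 0 &&
           nowMs - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.maxMessages) {
      this.timestamps.push(nowMs);
      return { allowed: true };
    }
    // Time until the oldest message leaves the window (rounding up is assumed).
    const retryAfterMs = this.timestamps[0] + this.windowMs - nowMs;
    return {
      allowed: false,
      error: {
        type: 'error',
        code: 'rate_limit_exceeded',
        retry_after: Math.ceil(retryAfterMs / 1000),
      },
    };
  }
}
```

For example, if ten messages are accepted one second apart starting at t = 0, an eleventh message at t = 15 s is rejected with `retry_after: 45`, because the oldest message leaves the window at t = 60 s.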

Connection Limit

Each API key is also limited to a maximum of 3 concurrent WebSocket connections. Exceeding this limit closes the new connection with close code 4029.
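
If you manage the WebSocket yourself rather than through the widget, you may want to treat close code 4029 differently from ordinary disconnects, since an immediate reconnect would likely be rejected again while the other connections remain open. The sketch below assumes a standard browser-style `WebSocket` object. `shouldReconnect`, `attachCloseHandler`, and the two callbacks are hypothetical names, not part of the SDK.

```javascript
// 4029 is the close code documented above; everything else is illustrative.
const TOO_MANY_CONNECTIONS = 4029;

function shouldReconnect(closeCode) {
  // Retrying right away after 4029 would only hit the same limit.
  return closeCode !== TOO_MANY_CONNECTIONS;
}

// socket: any object exposing the standard addEventListener('close', ...) API.
function attachCloseHandler(socket, { onLimit, onReconnect }) {
  socket.addEventListener('close', (event) => {
    if (shouldReconnect(event.code)) {
      onReconnect();
    } else {
      onLimit(); // e.g. ask the user to close another tab
    }
  });
}
```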

Tier 2: Per-User Rate Limit

Prevents individual users from consuming excessive resources.
| Parameter | Default | Configurable |
| --- | --- | --- |
| Messages per day | 200 | Yes |
| Conversations per day | 20 | Yes |

How It Works

Counters reset daily at midnight UTC. Rate limits are tracked per authenticated user. Always pass userId in the widget config for accurate per-user tracking.
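
Because the counters reset at midnight UTC rather than local midnight, a client that wants to show "try again in N hours" needs to compute the remaining time in UTC. A minimal sketch follows; `secondsUntilUtcMidnight` is a hypothetical helper, not an SDK function.

```javascript
// Hypothetical helper: seconds remaining until the next midnight UTC.
function secondsUntilUtcMidnight(now = new Date()) {
  // Date.UTC normalises day overflow, so "day + 1" also works at month end.
  const nextMidnightMs = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1
  );
  return Math.ceil((nextMidnightMs - now.getTime()) / 1000);
}
```

For instance, calling it with `new Date('2024-01-31T23:00:00Z')` returns 3600.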

Error Responses

{
  "type": "error",
  "code": "user_message_limit",
  "limit": 200
}
{
  "type": "error",
  "code": "user_conversation_limit",
  "limit": 20
}

Configuring Limits

Per-user limits are configurable per tenant via the Admin Panel.

Tier 3: Per-Tenant Rate Limit

Prevents a single tenant from consuming disproportionate LLM resources.
| Parameter | Default | Configurable |
| --- | --- | --- |
| Tokens per day | 500,000 | Yes |
| Tokens per month | 10,000,000 | Yes |

How It Works

Token usage is recorded after each AI response and checked before processing new messages. Both daily and monthly limits are enforced.
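
The check-before, record-after flow can be sketched as follows. `TenantTokenBudget` is a hypothetical class, not the platform's code. The order in which the daily and monthly limits are tested and the use of `>=` for the comparison are both assumptions. The defaults and error codes are the ones documented on this page.

```javascript
// Hypothetical model of the tenant token budget; not the server's implementation.
class TenantTokenBudget {
  constructor(dailyLimit = 500000, monthlyLimit = 10000000) {
    this.dailyLimit = dailyLimit;
    this.monthlyLimit = monthlyLimit;
    this.usedToday = 0;
    this.usedThisMonth = 0;
  }

  // Called before processing a new message. Returns null when allowed.
  check() {
    if (this.usedToday >= this.dailyLimit) {
      return { type: 'error', code: 'tenant_daily_token_limit' };
    }
    if (this.usedThisMonth >= this.monthlyLimit) {
      return { type: 'error', code: 'tenant_monthly_token_limit' };
    }
    return null;
  }

  // Called after each AI response with the tokens it consumed.
  record(tokens) {
    this.usedToday += tokens;
    this.usedThisMonth += tokens;
  }
}
```

In this model, usage is only known after a response completes, so a single response can push usage past the limit. The next message is the one that gets rejected.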

Error Responses

{
  "type": "error",
  "code": "tenant_daily_token_limit"
}
{
  "type": "error",
  "code": "tenant_monthly_token_limit"
}
Tenant token limits affect all users of a tenant. When a tenant hits the daily or monthly limit, no user under that tenant can send new messages until the limit resets.

Usage Alerts

VIZOCHOK sends a one-time alert notification when a tenant reaches 80% of their monthly token budget.
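
If you want to mirror this alert in your own monitoring, a one-time threshold check can be sketched like this. `createUsageAlert` is a hypothetical helper. The 80% threshold is the documented value, but how VIZOCHOK tracks the "already alerted" state and when it clears it are not documented and are only modelled here.

```javascript
// Hypothetical one-time threshold alert, using integer math to avoid rounding issues.
function createUsageAlert(monthlyLimit, thresholdPercent = 80) {
  let alerted = false;
  return function shouldAlert(usedThisMonth) {
    const reached = usedThisMonth * 100 >= monthlyLimit * thresholdPercent;
    if (alerted || !reached) {
      return false;
    }
    alerted = true; // fire only once per created alert
    return true;
  };
}
```

With the default monthly budget of 10,000,000 tokens, the function returns `true` the first time usage reaches 8,000,000 and `false` on every later call.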

Configuring Limits

Per-tenant token limits are configurable via the Admin Panel.

Additional Limits

Beyond the three tiers, the system enforces several other limits:
| Limit | Value | Description |
| --- | --- | --- |
| Message size | 64 KB | Maximum size of a single WebSocket message (checked client-side and server-side). |
| Agent processing timeout | 120s | Maximum time for the agent to process a single message. |
| Auth timeout | 10s | Maximum time to receive the auth message after connection. |
| Heartbeat timeout | 60s | Connection closed if no pong within this period. |
| Per-session token limit | Configurable | Maximum tokens per conversation (prevents runaway sessions). |
| Max tool chain rounds | Configurable | Maximum LLM iterations per response. |
| Client message queue | 50 | SDK-side limit on queued messages during disconnection. |
| Concurrent WS connections | 3 | Per API key. |
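
As an illustration of the message size limit, a client-side pre-check might look like the sketch below. It assumes that 64 KB means 65,536 bytes and that size is measured on the UTF-8 encoded payload. Neither detail is stated here, and `isWithinSizeLimit` is a hypothetical helper rather than an SDK function.

```javascript
// Assumption: 64 KB = 65,536 bytes, measured on the UTF-8 encoded payload.
const MAX_MESSAGE_BYTES = 64 * 1024;

function isWithinSizeLimit(message) {
  // Objects are serialised the way they would be sent over the socket.
  const payload = typeof message === 'string' ? message : JSON.stringify(message);
  const byteLength = new TextEncoder().encode(payload).length;
  return byteLength <= MAX_MESSAGE_BYTES;
}
```

Measuring bytes rather than characters matters for non-ASCII text: a message of 40,000 "é" characters is 80,000 bytes in UTF-8 and would fail this check.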

What Happens When Limits Are Hit

| Limit | User Experience |
| --- | --- |
| Connection rate limit | Error shown in chat with countdown timer. User can send another message after retry_after seconds. |
| User message limit | Error shown in chat: “Daily message limit reached.” No more messages until midnight UTC. |
| User conversation limit | Error shown in chat: “Daily conversation limit reached.” Starting new conversations is blocked. |
| Tenant daily tokens | Error shown in chat: “Service temporarily unavailable.” All users of the tenant are blocked. |
| Tenant monthly tokens | Same as daily tokens. Persists until the calendar month changes. |
| Agent busy | Error shown in chat: “Assistant is processing a previous request.” Resolves when the previous request completes. |
| Message too large | Error shown immediately (client-side check). User must shorten the message. |
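
If you render chat errors in a custom UI, the table above can be mirrored as a lookup from error code to message. This is only a sketch: `describeLimitError` is a hypothetical helper, the rate-limit wording is invented for illustration, and only error codes documented on this page are handled (codes for "agent busy" and "message too large" are not listed here, so they fall through to `null`).

```javascript
// Hypothetical mapping from documented error codes to user-facing text.
function describeLimitError(error) {
  switch (error.code) {
    case 'rate_limit_exceeded':
      // Wording is illustrative; retry_after comes from the error payload.
      return `Please wait ${error.retry_after} seconds before sending another message.`;
    case 'user_message_limit':
      return 'Daily message limit reached.';
    case 'user_conversation_limit':
      return 'Daily conversation limit reached.';
    case 'tenant_daily_token_limit':
    case 'tenant_monthly_token_limit':
      return 'Service temporarily unavailable.';
    default:
      return null; // not a limit error covered by this sketch
  }
}
```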

Best Practices for High-Traffic Stores

1. Set Appropriate Per-User Limits

For high-traffic stores, consider lowering per-user limits to prevent individual users from consuming a disproportionate share of the tenant’s token budget:
  • Messages per day: 50-100 for most retail use cases
  • Conversations per day: 5-10

2. Monitor Token Usage

Use the admin panel dashboard to monitor daily and monthly token consumption. VIZOCHOK automatically sends an alert when you reach 80% of your monthly budget.

3. Optimize with Smart Prompts

Configure your tenant’s system prompt to encourage concise interactions:
  • Provide clear store rules to reduce unnecessary tool calls
  • Disable tools that are not relevant to your use case via disabled_tools

4. Use User IDs for Accurate Tracking

Always pass userId in the widget config when the user is authenticated. This enables accurate per-user rate limiting rather than falling back to per-API-key limits.
const widget = new VizochokWidget({
  apiKey: 'pk_your_key',
  storeId: 'your-store',
  userId: currentUser.id, // Important for rate limiting
});

5. Handle Limit Errors Gracefully

Listen for limit errors in the onError callback and provide appropriate feedback:
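const widget = new VizochokWidget({
  apiKey: 'pk_your_key',
  storeId: 'your-store',
  onError: (error) => {
    // Tenant-wide limits block every user, so alert your operations team.
    if (error.code === 'tenant_daily_token_limit' ||
        error.code === 'tenant_monthly_token_limit') {
      notifyOps('VIZOCHOK token limit reached'); // notifyOps: your own alerting function
    }
  },
});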

Monitoring Usage

Monitor your current usage in the Admin Panel dashboard, which shows:
  • Tenant’s token count for today and this month
  • Per-user message and conversation counts
  • Usage trends and alerts
Usage data is also available via the REST API — see the API Reference tab.