
Overview

VIZOCHOK enforces rate limits at three tiers to protect the platform, ensure fair usage, and prevent abuse:

Tier 1: Per-Connection

10 messages / 60 seconds (sliding window per WebSocket)

Tier 2: Per-User

200 messages / day, 20 conversations / day (Redis counters per user)

Tier 3: Per-Tenant

500,000 tokens / day, 10,000,000 tokens / month (Redis counters per tenant)
Each tier is checked in order. If any tier rejects the request, an error is returned immediately and subsequent tiers are not checked.
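
The short-circuit behavior described above can be sketched as follows. This is a minimal illustration, not the VIZOCHOK implementation: `checkTiers` and the three stand-in check functions are hypothetical names, and each check is assumed to return `null` when the request is allowed.

```javascript
// Run tier checks in order and stop at the first rejection.
// Each check returns null when the request is allowed, or an error object.
function checkTiers(checks, request) {
  for (const check of checks) {
    const error = check(request);
    if (error !== null) {
      return error; // later tiers are never evaluated
    }
  }
  return null;
}

// Hypothetical stand-ins for the three tiers; they record the call order.
const calls = [];
const perConnection = () => { calls.push('connection'); return null; };
const perUser = () => {
  calls.push('user');
  return { type: 'error', code: 'user_message_limit', limit: 200 };
};
const perTenant = () => { calls.push('tenant'); return null; };

const result = checkTiers([perConnection, perUser, perTenant], {});
console.log(result.code, calls); // the tenant check is never reached
```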

Tier 1: Per-Connection Rate Limit

Protects against rapid message flooding on a single WebSocket connection.

| Parameter | Value |
| --- | --- |
| Max messages | 10 |
| Time window | 60 seconds (sliding) |
| Scope | Single WebSocket connection |
| Storage | In-memory (server-side deque) |

How It Works

The server maintains a deque of timestamps for each WebSocket connection. On every incoming message:
  1. Timestamps older than 60 seconds are evicted
  2. If the deque has 10 or more entries, the message is rejected
  3. Otherwise, the current timestamp is added to the deque
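
The three steps above can be sketched as a small class. This is an illustrative sketch only: `SlidingWindowLimiter` and `tryAccept` are hypothetical names, a plain array stands in for the deque, and treating a timestamp that is exactly 60 seconds old as expired is an assumption about the boundary.

```javascript
class SlidingWindowLimiter {
  constructor(maxMessages = 10, windowMs = 60_000) {
    this.maxMessages = maxMessages;
    this.windowMs = windowMs;
    this.timestamps = []; // oldest first; plays the role of the deque
  }

  // Returns true if the message is accepted, false if it is rejected.
  tryAccept(nowMs = Date.now()) {
    // Step 1: evict timestamps that have left the window
    // (assumption: exactly windowMs old counts as expired).
    while (this.timestamps.length > 0 &&
           nowMs - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
    // Step 2: reject when the window is already full.
    if (this.timestamps.length >= this.maxMessages) {
      return false;
    }
    // Step 3: record the accepted message.
    this.timestamps.push(nowMs);
    return true;
  }
}
```

One limiter instance would be created per WebSocket connection, so the state disappears when the connection closes.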

Error Response

{
  "type": "error",
  "code": "rate_limit_exceeded",
  "retry_after": 45
}

The retry_after field indicates how many seconds until the oldest message in the window expires and a new message can be sent.
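
As a worked example of that calculation, the sketch below derives a `retry_after` value from the oldest timestamp in the window. The function name `retryAfterSeconds` is hypothetical, and rounding up to whole seconds is an assumption; the source only states what the field means.

```javascript
// Seconds until the oldest timestamp leaves the 60-second window.
function retryAfterSeconds(oldestMs, nowMs, windowMs = 60_000) {
  const remainingMs = oldestMs + windowMs - nowMs;
  // Assumption: round up so the client never retries too early.
  return Math.max(0, Math.ceil(remainingMs / 1000));
}

// The oldest message was sent 15 seconds ago, so 45 seconds remain.
console.log(retryAfterSeconds(0, 15_000)); // 45
```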

Connection Limit

Each API key is also limited to a maximum of 3 concurrent WebSocket connections. Exceeding this limit closes the new connection with close code 4029.
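
A client might treat close code 4029 differently from other disconnects, since reconnecting immediately would only hit the same limit. The sketch below is an assumption about reasonable client behavior, not documented SDK behavior; `reactionToClose` and its return values are hypothetical.

```javascript
const CONNECTION_LIMIT_CLOSE_CODE = 4029;

// Decide how a client could react to a WebSocket close code.
function reactionToClose(closeCode) {
  if (closeCode === CONNECTION_LIMIT_CLOSE_CODE) {
    // Too many concurrent connections for this API key:
    // retrying in a loop would not help until another connection closes.
    return 'too_many_connections';
  }
  // Assumption: any other close code is handled by a normal reconnect.
  return 'reconnect';
}

// Usage with a browser WebSocket (the close event exposes `code`):
// socket.addEventListener('close', (event) => {
//   const reaction = reactionToClose(event.code);
// });
```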

Tier 2: Per-User Rate Limit

Prevents individual users from consuming excessive resources.

| Parameter | Default | Configurable |
| --- | --- | --- |
| Messages per day | 200 | Yes |
| Conversations per day | 20 | Yes |

How It Works

  • Counters are stored in Redis with keys like usage:{tenant_id}:{user_id}:{date}:msgs
  • TTL is set to 25 hours (86,400 + 3,600 seconds) to handle timezone edge cases
  • Counters reset at midnight UTC
  • Both counters are fetched in a single Redis pipeline (one round-trip)
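
To make the key layout and TTL arithmetic concrete, here is a minimal sketch. The helper names are hypothetical, and the `YYYY-MM-DD` UTC date format is an assumption (the source shows only `{date}`). Under that assumption, the midnight UTC reset falls out of the key itself: a new day produces a new key, and the old key simply expires later.

```javascript
// 86,400 + 3,600 seconds = 90,000 seconds = 25 hours
const COUNTER_TTL_SECONDS = 86_400 + 3_600;

// Format a Date as YYYY-MM-DD in UTC (assumed date format for the key).
function utcDate(date) {
  return date.toISOString().slice(0, 10);
}

// Build the per-user message counter key.
function userMessageKey(tenantId, userId, date) {
  return `usage:${tenantId}:${userId}:${utcDate(date)}:msgs`;
}

const beforeMidnight = new Date(Date.UTC(2024, 0, 15, 23, 59, 59));
const afterMidnight = new Date(Date.UTC(2024, 0, 16, 0, 0, 1));
console.log(userMessageKey('t1', 'u42', beforeMidnight)); // usage:t1:u42:2024-01-15:msgs
console.log(userMessageKey('t1', 'u42', afterMidnight));  // usage:t1:u42:2024-01-16:msgs
```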

User Identity for Rate Limiting

| API Key Type | Rate Limit Identity | Why |
| --- | --- | --- |
| Public (pk_) | key:{api_key_id} | Public keys cannot trust client-provided user_id |
| Secret (sk_) | Provided user_id | Server-to-server calls can be trusted |
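
The table above could be expressed as a small resolver. This is a sketch under stated assumptions: `rateLimitIdentity` is a hypothetical name, and falling back to the key identity when a secret-key call omits `user_id` is an assumption the source does not confirm.

```javascript
// Resolve which identity the per-user counters are keyed on.
function rateLimitIdentity({ apiKey, apiKeyId, userId }) {
  if (apiKey.startsWith('pk_')) {
    // Public keys: a client-provided user_id cannot be trusted.
    return `key:${apiKeyId}`;
  }
  if (apiKey.startsWith('sk_')) {
    // Secret keys: trust the provided user_id.
    // Assumption: fall back to the key identity when none is given.
    return userId ? userId : `key:${apiKeyId}`;
  }
  throw new Error('Unknown API key type');
}

console.log(rateLimitIdentity({ apiKey: 'pk_abc', apiKeyId: '17', userId: 'u42' })); // key:17
console.log(rateLimitIdentity({ apiKey: 'sk_abc', apiKeyId: '17', userId: 'u42' })); // u42
```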

Error Responses

{
  "type": "error",
  "code": "user_message_limit",
  "limit": 200
}

{
  "type": "error",
  "code": "user_conversation_limit",
  "limit": 20
}

Configuring Limits

Per-user limits are configured per-tenant via the admin panel:

| Tenant Column | Description | Default |
| --- | --- | --- |
| limit_max_user_messages_per_day | Max messages per user/day | 200 |
| limit_max_user_conversations_per_day | Max conversations per user/day | 20 |

Tier 3: Per-Tenant Rate Limit

Prevents a single tenant from consuming disproportionate LLM resources.

| Parameter | Default | Configurable |
| --- | --- | --- |
| Tokens per day | 500,000 | Yes |
| Tokens per month | 10,000,000 | Yes |

How It Works

  • Token usage is recorded after each agent call based on actual consumption (not estimated)
  • Daily counters use keys like usage:{tenant_id}:{date}:tokens with 25-hour TTL
  • Monthly counters use keys like usage:{tenant_id}:{YYYY-MM}:tokens with ~32-day TTL
  • Both daily and monthly limits are checked before processing each message
  • Both counters are fetched in a single Redis pipeline (one round-trip)
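
The check-before, record-after flow described above can be sketched with an in-memory `Map` standing in for Redis (so there is no TTL or pipeline here). All function names are hypothetical, the `YYYY-MM-DD` date format is assumed, and both the `>=` comparison and checking the daily limit before the monthly one are assumptions.

```javascript
const DEFAULT_LIMITS = { perDay: 500_000, perMonth: 10_000_000 };

// Build the daily and monthly counter keys for a tenant (UTC dates assumed).
function tenantKeys(tenantId, now) {
  const iso = now.toISOString();
  return {
    daily: `usage:${tenantId}:${iso.slice(0, 10)}:tokens`,
    monthly: `usage:${tenantId}:${iso.slice(0, 7)}:tokens`,
  };
}

// Called before processing a message: returns null or an error object.
function checkTenantBudget(store, tenantId, now, limits = DEFAULT_LIMITS) {
  const keys = tenantKeys(tenantId, now);
  const daily = store.get(keys.daily) ?? 0;
  const monthly = store.get(keys.monthly) ?? 0;
  if (daily >= limits.perDay) {
    return { type: 'error', code: 'tenant_daily_token_limit' };
  }
  if (monthly >= limits.perMonth) {
    return { type: 'error', code: 'tenant_monthly_token_limit' };
  }
  return null;
}

// Called after the agent call with the actual token consumption.
function recordTokens(store, tenantId, now, tokensUsed) {
  const keys = tenantKeys(tenantId, now);
  store.set(keys.daily, (store.get(keys.daily) ?? 0) + tokensUsed);
  store.set(keys.monthly, (store.get(keys.monthly) ?? 0) + tokensUsed);
}
```

Because usage is recorded only after a call completes, a tenant can overshoot a limit by the size of the final request; the next message is the one that gets rejected.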

Error Responses

{
  "type": "error",
  "code": "tenant_daily_token_limit"
}

{
  "type": "error",
  "code": "tenant_monthly_token_limit"
}

Tenant token limits affect all users of a tenant. When a tenant hits the daily or monthly limit, no user under that tenant can send new messages until the limit resets.

Usage Alerts

When a tenant reaches 80% of their monthly token budget, VIZOCHOK sends a one-time alert notification. This uses an atomic Redis SET NX operation to ensure the alert fires exactly once per month, even under concurrent request load.
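
The fire-once behavior can be illustrated with a set-if-absent helper that mimics the semantics of Redis `SET` with `NX` on an in-memory `Map`. This is a sketch only: the flag key name, the function names, and the `sendAlert` callback are hypothetical, and a real deployment would rely on Redis for atomicity across processes.

```javascript
const ALERT_THRESHOLD_PERCENT = 80;

// Mimics SET ... NX: returns true only for the caller that creates the key.
function setIfAbsent(store, key, value) {
  if (store.has(key)) return false;
  store.set(key, value);
  return true;
}

// Sends the alert at most once per tenant per month.
function maybeSendUsageAlert(store, tenantId, month, tokensThisMonth, monthlyLimit, sendAlert) {
  // Integer comparison avoids floating-point rounding at the threshold.
  if (tokensThisMonth * 100 < monthlyLimit * ALERT_THRESHOLD_PERCENT) {
    return false;
  }
  const flagKey = `alert:${tenantId}:${month}:80pct`; // hypothetical key name
  if (!setIfAbsent(store, flagKey, '1')) {
    return false; // another request already claimed the alert
  }
  sendAlert(tenantId);
  return true;
}
```

With the default monthly budget of 10,000,000 tokens, the threshold is reached at 8,000,000 tokens.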

Configuring Limits

| Tenant Column | Description | Default |
| --- | --- | --- |
| limit_max_tenant_tokens_per_day | Max LLM tokens per day | 500,000 |
| limit_max_tenant_tokens_per_month | Max LLM tokens per month | 10,000,000 |

Additional Limits

Beyond the three tiers, the system enforces several other limits:

| Limit | Value | Description |
| --- | --- | --- |
| Message size | 64 KB | Maximum size of a single WebSocket message (checked client-side and server-side). |
| Agent processing timeout | 120s | Maximum time for the agent to process a single message. |
| Auth timeout | 10s | Maximum time to receive the auth message after connection. |
| Heartbeat timeout | 60s | Connection closed if no pong within this period. |
| Per-session token limit | Configurable | Maximum tokens per conversation (prevents runaway sessions). |
| Max tool chain rounds | Configurable | Maximum LLM iterations per response. |
| Client message queue | 50 | SDK-side limit on queued messages during disconnection. |
| Concurrent WS connections | 3 | Per API key. |

What Happens When Limits Are Hit

| Limit | User Experience |
| --- | --- |
| Connection rate limit | Error shown in chat with countdown timer. User can send another message after retry_after seconds. |
| User message limit | Error shown in chat: “Daily message limit reached.” No more messages until midnight UTC. |
| User conversation limit | Error shown in chat: “Daily conversation limit reached.” Starting new conversations is blocked. |
| Tenant daily tokens | Error shown in chat: “Service temporarily unavailable.” All users of the tenant are blocked. |
| Tenant monthly tokens | Same as daily tokens. Persists until the calendar month changes. |
| Agent busy | Error shown in chat: “Assistant is processing a previous request.” Resolves when the previous request completes. |
| Message too large | Error shown immediately (client-side check). User must shorten the message. |

Best Practices for High-Traffic Stores

1. Set Appropriate Per-User Limits

For high-traffic stores, consider lowering per-user limits to prevent individual users from consuming a disproportionate share of the tenant’s token budget:
  • Messages per day: 50-100 for most retail use cases
  • Conversations per day: 5-10

2. Monitor Token Usage

Use the admin panel dashboard to monitor daily and monthly token consumption. VIZOCHOK automatically sends an alert when you reach 80% of your monthly budget.

3. Optimize with Smart Prompts

Configure your tenant’s system prompt to encourage concise interactions:
  • Provide clear store rules to reduce unnecessary tool calls
  • Disable tools that are not relevant to your use case via disabled_tools

4. Use User IDs for Accurate Tracking

Always pass userId in the widget config when the user is authenticated. This enables accurate per-user rate limiting rather than falling back to per-API-key limits.

const widget = new VIZOCHOKWidget({
  apiKey: 'pk_your_key',
  storeId: 'your-store',
  userId: currentUser.id, // Important for rate limiting
});

5. Handle Limit Errors Gracefully

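Listen for limit errors in the onError callback and provide appropriate feedback:

const widget = new VIZOCHOKWidget({
  apiKey: 'pk_your_key',
  storeId: 'your-store',
  onError: (error) => {
    // Tenant-wide limits block every user, so alert your operations team.
    if (error.code === 'tenant_daily_token_limit' ||
        error.code === 'tenant_monthly_token_limit') {
      notifyOps('VIZOCHOK token limit reached'); // notifyOps: your own alerting function
    }
  },
});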
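
For user-facing feedback, a handler could map each documented error code to a message. The sketch below is one possible approach, not part of the SDK: `describeLimitError` is a hypothetical helper, the wording for `rate_limit_exceeded` is an assumption, and the other strings are taken from the user-experience table in this document.

```javascript
// Map documented limit error codes to user-facing messages.
const LIMIT_MESSAGES = new Map([
  // Assumed wording; the source only says a countdown is shown.
  ['rate_limit_exceeded', (e) => `Too many messages. Try again in ${e.retry_after} seconds.`],
  ['user_message_limit', () => 'Daily message limit reached.'],
  ['user_conversation_limit', () => 'Daily conversation limit reached.'],
  ['tenant_daily_token_limit', () => 'Service temporarily unavailable.'],
  ['tenant_monthly_token_limit', () => 'Service temporarily unavailable.'],
]);

// Returns a message for known limit errors, or null for anything else.
function describeLimitError(error) {
  const render = LIMIT_MESSAGES.get(error.code);
  return render ? render(error) : null;
}

console.log(describeLimitError({ type: 'error', code: 'rate_limit_exceeded', retry_after: 45 }));
```

Returning `null` for unknown codes lets the caller fall through to its generic error handling.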

Monitoring Usage

Retrieve current usage statistics via the admin API:

GET /api/v1/admin/usage?user_id=optional_user_id

The response includes:

| Field | Description |
| --- | --- |
| tokens_today | Tenant’s token count for today |
| tokens_month | Tenant’s token count this month |
| user_messages_today | User’s message count for today |
| user_conversations_today | User’s conversation count for today |
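
A request to this endpoint might be built as shown below. This is a sketch under assumptions: the base URL `https://api.example.com` is a placeholder, the authentication headers are not specified in this document and must be supplied by the caller, and `buildUsageUrl` and `fetchUsage` are hypothetical helper names.

```javascript
// Build the usage endpoint URL; user_id is optional.
function buildUsageUrl(baseUrl, userId) {
  const url = new URL('/api/v1/admin/usage', baseUrl);
  if (userId !== undefined) {
    url.searchParams.set('user_id', userId);
  }
  return url.toString();
}

// Fetch usage statistics. `headers` must carry whatever admin
// authentication your deployment requires (not documented here).
async function fetchUsage(baseUrl, userId, headers = {}, fetchImpl = fetch) {
  const response = await fetchImpl(buildUsageUrl(baseUrl, userId), { headers });
  if (!response.ok) {
    throw new Error(`Usage request failed: ${response.status}`);
  }
  return response.json();
}

console.log(buildUsageUrl('https://api.example.com', 'u42'));
// https://api.example.com/api/v1/admin/usage?user_id=u42
```

Omitting `userId` returns the tenant-level URL without a query string; whether the user fields are then absent from the response is not stated in this document.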