Overview
VIZOCHOK enforces rate limits at three tiers to protect the platform, ensure fair usage, and prevent abuse:
Tier 1, Per-Connection: 10 messages / 60 seconds (sliding window per WebSocket)
Tier 2, Per-User: 200 messages / day, 20 conversations / day (Redis counters per user)
Tier 3, Per-Tenant: 500,000 tokens / day, 10,000,000 tokens / month (Redis counters per tenant)
Each tier is checked in order. If any tier rejects the request, an error is returned immediately and subsequent tiers are not checked.
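The ordered, short-circuiting check can be sketched as follows. This is only an illustration: `checkTiers` and the three stand-in tier functions are hypothetical names, and the real checks would consult the per-connection deque and the Redis counters.

```javascript
// Run tier checks in order and stop at the first rejection.
// Each check is a hypothetical function that returns null (allowed)
// or an error object such as { type: 'error', code: '...' }.
function checkTiers(checks, request) {
  for (const check of checks) {
    const error = check(request);
    if (error) {
      return error; // later tiers are not evaluated
    }
  }
  return null; // every tier allowed the request
}

// Stand-in checks that record which tiers were evaluated.
const evaluated = [];
const connectionTier = () => { evaluated.push('connection'); return null; };
const userTier = () => {
  evaluated.push('user');
  return { type: 'error', code: 'user_message_limit', limit: 200 };
};
const tenantTier = () => { evaluated.push('tenant'); return null; };

const outcome = checkTiers([connectionTier, userTier, tenantTier], {});
console.log(outcome.code); // user_message_limit
console.log(evaluated);    // [ 'connection', 'user' ]
```

Because the user tier rejects the request, the tenant tier is never consulted.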
Tier 1: Per-Connection Rate Limit
Protects against rapid message flooding on a single WebSocket connection.
Max messages: 10
Time window: 60 seconds (sliding)
Scope: Single WebSocket connection
Storage: In-memory (server-side deque)
How It Works
The server maintains a deque of timestamps for each WebSocket connection. On every incoming message:
Timestamps older than 60 seconds are evicted
If the deque has 10 or more entries, the message is rejected
Otherwise, the current timestamp is added to the deque
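The three steps above can be sketched as a small class. This is a minimal illustration, not the server's actual code: `SlidingWindowLimiter` and `tryAccept` are hypothetical names, a plain array stands in for the deque, and treating a timestamp that is exactly 60 seconds old as expired is an assumption.

```javascript
class SlidingWindowLimiter {
  constructor(maxMessages = 10, windowMs = 60_000) {
    this.maxMessages = maxMessages;
    this.windowMs = windowMs;
    this.timestamps = []; // oldest first; stands in for the server-side deque
  }

  // Returns true if the message is accepted, false if it is rejected.
  tryAccept(nowMs = Date.now()) {
    // Step 1: evict timestamps that have left the window.
    // Assumption: a timestamp exactly windowMs old counts as expired.
    while (this.timestamps.length > 0 &&
           nowMs - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
    // Step 2: reject when the window is already full.
    if (this.timestamps.length >= this.maxMessages) {
      return false;
    }
    // Step 3: record the accepted message.
    this.timestamps.push(nowMs);
    return true;
  }
}

// Twelve messages sent one millisecond apart: only the first ten are accepted.
const limiter = new SlidingWindowLimiter();
let accepted = 0;
for (let i = 0; i < 12; i++) {
  if (limiter.tryAccept(1_000 + i)) accepted++;
}
console.log(accepted); // 10

// Sixty seconds after the first message, one slot has opened up again.
console.log(limiter.tryAccept(1_000 + 60_000)); // true
```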
Error Response
{
  "type": "error",
  "code": "rate_limit_exceeded",
  "retry_after": 45
}
The retry_after field indicates how many seconds until the oldest message in the window expires and a new message can be sent.
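One way to derive `retry_after` from the oldest timestamp in the window is sketched below. The function names are hypothetical, and rounding up to whole seconds is an assumption; the source only states what the field means.

```javascript
// Seconds until the oldest timestamp in the window expires.
// Assumption: the value is rounded up to a whole number of seconds.
function retryAfterSeconds(timestamps, nowMs, windowMs = 60_000) {
  if (timestamps.length === 0) return 0;
  const expiresAtMs = timestamps[0] + windowMs; // timestamps are oldest first
  return Math.max(0, Math.ceil((expiresAtMs - nowMs) / 1000));
}

// Build the error payload shown above.
function rateLimitError(timestamps, nowMs) {
  return {
    type: 'error',
    code: 'rate_limit_exceeded',
    retry_after: retryAfterSeconds(timestamps, nowMs),
  };
}

// The oldest message was sent 15 seconds ago, so it expires in 45 seconds.
const nowMs = 100_000;
const payload = rateLimitError([nowMs - 15_000, nowMs - 5_000], nowMs);
console.log(payload.retry_after); // 45
```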
Connection Limit
Each API key is also limited to a maximum of 3 concurrent WebSocket connections. When a new connection would exceed this limit, the server closes that new connection with close code 4029.
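On the client side, it may make sense to treat close code 4029 differently from other close codes, since reconnecting immediately would likely hit the same limit. The sketch below is an assumption about how a custom client could react; `reconnectDelayMs` and the backoff values are hypothetical and not part of the SDK.

```javascript
const CONNECTION_LIMIT_CLOSE_CODE = 4029;

// Returns the delay before the next reconnect attempt in milliseconds,
// or null when the client should not reconnect automatically.
function reconnectDelayMs(closeCode, attempt) {
  if (closeCode === CONNECTION_LIMIT_CLOSE_CODE) {
    return null; // too many connections for this API key; retrying would fail again
  }
  // Hypothetical exponential backoff, capped at 30 seconds.
  return Math.min(30_000, 1_000 * 2 ** attempt);
}

console.log(reconnectDelayMs(4029, 0)); // null
console.log(reconnectDelayMs(1006, 3)); // 8000
```

In a browser, the close code is available as `event.code` in the WebSocket `close` event handler.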
Tier 2: Per-User Rate Limit
Prevents individual users from consuming excessive resources.
Messages per day: default 200 (configurable)
Conversations per day: default 20 (configurable)
How It Works
Counters are stored in Redis with keys like usage:{tenant_id}:{user_id}:{date}:msgs
TTL is set to 25 hours (86,400 + 3,600 seconds) to handle timezone edge cases
Counters reset at midnight UTC
Both counters are fetched in a single Redis pipeline (one round-trip)
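The key layout and the single-pipeline fetch might look like the sketch below. Several details are assumptions: the `:convs` suffix for the conversation counter, the `YYYY-MM-DD` date format, and a Redis client with an ioredis-style `pipeline()` whose `exec()` resolves to `[error, value]` pairs. The function names are hypothetical.

```javascript
// Build the per-user counter keys for the current UTC day.
function userCounterKeys(tenantId, userId, now = new Date()) {
  const date = now.toISOString().slice(0, 10); // UTC date, assumed YYYY-MM-DD
  return {
    messages: `usage:${tenantId}:${userId}:${date}:msgs`,
    conversations: `usage:${tenantId}:${userId}:${date}:convs`, // suffix is an assumption
  };
}

// Fetch both counters in one round-trip.
// `redis` is assumed to expose an ioredis-style pipeline().
async function fetchUserCounters(redis, tenantId, userId, now = new Date()) {
  const keys = userCounterKeys(tenantId, userId, now);
  const replies = await redis
    .pipeline()
    .get(keys.messages)
    .get(keys.conversations)
    .exec();
  const [messages, conversations] = replies.map(([err, value]) => {
    if (err) throw err;
    return Number(value ?? 0); // a missing key means nothing was counted yet
  });
  return { messages, conversations };
}

console.log(userCounterKeys('t1', 'u1', new Date('2024-05-01T23:30:00Z')).messages);
// usage:t1:u1:2024-05-01:msgs
```

Because the date is part of the key, a new UTC day starts from fresh keys, and the 25-hour TTL only needs to clean up the previous day's counters.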
User Identity for Rate Limiting
Public (pk_) keys: the rate limit identity is key:{api_key_id}, because a client-provided user_id cannot be trusted with a public key.
Secret (sk_) keys: the rate limit identity is the provided user_id, because server-to-server calls can be trusted.
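The identity rule can be expressed as a small function. This is a sketch under stated assumptions: the `{ id, value }` shape of the API key object is hypothetical, and falling back to the key identity when a secret-key call omits `user_id` is an assumption the source does not confirm.

```javascript
// Choose the identity that per-user counters are tracked under.
// `apiKey` is a hypothetical object: { id: 'abc123', value: 'sk_...' }.
function rateLimitIdentity(apiKey, providedUserId) {
  const isSecretKey = apiKey.value.startsWith('sk_');
  if (isSecretKey && providedUserId) {
    return providedUserId; // server-to-server calls are trusted
  }
  // Public keys, or (by assumption) secret-key calls without a user_id.
  return `key:${apiKey.id}`;
}

console.log(rateLimitIdentity({ id: 'abc123', value: 'pk_live' }, 'user_42')); // key:abc123
console.log(rateLimitIdentity({ id: 'def456', value: 'sk_live' }, 'user_42')); // user_42
```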
Error Responses
{
  "type": "error",
  "code": "user_message_limit",
  "limit": 200
}

{
  "type": "error",
  "code": "user_conversation_limit",
  "limit": 20
}
Configuring Limits
Per-user limits are configured per-tenant via the admin panel:
Tenant column: description (default)
limit_max_user_messages_per_day: Max messages per user per day (200)
limit_max_user_conversations_per_day: Max conversations per user per day (20)
Tier 3: Per-Tenant Rate Limit
Prevents a single tenant from consuming disproportionate LLM resources.
Tokens per day: default 500,000 (configurable)
Tokens per month: default 10,000,000 (configurable)
How It Works
Token usage is recorded after each agent call based on actual consumption (not estimated)
Daily counters use keys like usage:{tenant_id}:{date}:tokens with 25-hour TTL
Monthly counters use keys like usage:{tenant_id}:{YYYY-MM}:tokens with ~32-day TTL
Both daily and monthly limits are checked before processing each message
Both counters are fetched in a single Redis pipeline (one round-trip)
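A sketch of the key layout and the pre-message check follows. The function names are hypothetical, the `YYYY-MM-DD` date format is assumed, and checking the daily limit before the monthly one is an assumption; the source only says both are checked.

```javascript
// Build the daily and monthly token counter keys for a tenant (UTC).
function tenantTokenKeys(tenantId, now = new Date()) {
  const iso = now.toISOString(); // e.g. 2024-05-01T12:00:00.000Z
  return {
    daily: `usage:${tenantId}:${iso.slice(0, 10)}:tokens`,  // date, assumed YYYY-MM-DD
    monthly: `usage:${tenantId}:${iso.slice(0, 7)}:tokens`, // YYYY-MM
  };
}

// Compare current usage against the tenant's limits.
// Returns null when the message may proceed, otherwise an error payload.
function checkTenantTokens(usage, limits = { perDay: 500_000, perMonth: 10_000_000 }) {
  if (usage.daily >= limits.perDay) {
    return { type: 'error', code: 'tenant_daily_token_limit' };
  }
  if (usage.monthly >= limits.perMonth) {
    return { type: 'error', code: 'tenant_monthly_token_limit' };
  }
  return null;
}

const tokenKeys = tenantTokenKeys('t1', new Date('2024-05-01T12:00:00Z'));
console.log(tokenKeys.daily);   // usage:t1:2024-05-01:tokens
console.log(tokenKeys.monthly); // usage:t1:2024-05:tokens
console.log(checkTenantTokens({ daily: 500_000, monthly: 1_200_000 }).code);
// tenant_daily_token_limit
```

Since usage is recorded after each agent call, a request that starts just under the limit can finish above it; the next message is then rejected.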
Error Responses
{
  "type": "error",
  "code": "tenant_daily_token_limit"
}

{
  "type": "error",
  "code": "tenant_monthly_token_limit"
}
Tenant token limits affect all users of a tenant. When a tenant hits the daily or monthly limit, no user under that tenant can send new messages until the limit resets.
Usage Alerts
When a tenant reaches 80% of their monthly token budget, VIZOCHOK sends a one-time alert notification. This uses an atomic Redis SET NX operation to ensure the alert fires exactly once per month, even under concurrent request load.
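The once-per-month guard can be sketched with a Redis `SET` that uses the `NX` option. This is an illustration under assumptions: the `usage_alert:` key name, the 32-day expiry, and the `sendAlert` callback are hypothetical, and `redis.set(key, value, 'EX', seconds, 'NX')` follows the ioredis calling convention, which resolves to `'OK'` when the key was set and `null` when it already existed.

```javascript
// Send the 80% alert at most once per tenant per calendar month.
// Returns true only for the caller that actually sent the alert.
async function maybeSendUsageAlert(redis, tenantId, tokensThisMonth, monthlyLimit, sendAlert, now = new Date()) {
  // Integer comparison avoids floating-point rounding at exactly 80%.
  if (tokensThisMonth * 10 < monthlyLimit * 8) return false;

  const month = now.toISOString().slice(0, 7); // YYYY-MM
  const flagKey = `usage_alert:${tenantId}:${month}`; // hypothetical key name

  // SET ... NX succeeds for exactly one caller, even under concurrency.
  const reply = await redis.set(flagKey, '1', 'EX', 32 * 86_400, 'NX');
  if (reply !== 'OK') return false; // another request already claimed the alert

  await sendAlert(tenantId);
  return true;
}
```

In this sketch the flag is claimed before the alert is sent, so a failure inside `sendAlert` would not be retried. Whether VIZOCHOK handles that case differently is not stated in the source.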
Configuring Limits
Tenant column: description (default)
limit_max_tenant_tokens_per_day: Max LLM tokens per day (500,000)
limit_max_tenant_tokens_per_month: Max LLM tokens per month (10,000,000)
Additional Limits
Beyond the three tiers, the system enforces several other limits:
Message size: 64 KB. Maximum size of a single WebSocket message (checked client-side and server-side).
Agent processing timeout: 120 s. Maximum time for the agent to process a single message.
Auth timeout: 10 s. Maximum time to receive the auth message after connection.
Heartbeat timeout: 60 s. Connection closed if no pong is received within this period.
Per-session token limit: Configurable. Maximum tokens per conversation (prevents runaway sessions).
Max tool chain rounds: Configurable. Maximum LLM iterations per response.
Client message queue: 50. SDK-side limit on queued messages during disconnection.
Concurrent WS connections: 3. Per API key.
What Happens When Limits Are Hit
Connection rate limit: Error shown in chat with countdown timer. User can send another message after retry_after seconds.
User message limit: Error shown in chat: “Daily message limit reached.” No more messages until midnight UTC.
User conversation limit: Error shown in chat: “Daily conversation limit reached.” Starting new conversations is blocked.
Tenant daily tokens: Error shown in chat: “Service temporarily unavailable.” All users of the tenant are blocked.
Tenant monthly tokens: Same as daily tokens. Persists until the calendar month changes.
Agent busy: Error shown in chat: “Assistant is processing a previous request.” Resolves when the previous request completes.
Message too large: Error shown immediately (client-side check). User must shorten the message.
Best Practices for High-Traffic Stores
1. Set Appropriate Per-User Limits
For high-traffic stores, consider lowering per-user limits to prevent individual users from consuming a disproportionate share of the tenant’s token budget:
Messages per day: 50-100 for most retail use cases
Conversations per day: 5-10
2. Monitor Token Usage
Use the admin panel dashboard to monitor daily and monthly token consumption. VIZOCHOK automatically sends an alert when you reach 80% of your monthly budget.
3. Optimize with Smart Prompts
Configure your tenant’s system prompt to encourage concise interactions:
Provide clear store rules to reduce unnecessary tool calls
Disable tools that are not relevant to your use case via disabled_tools
4. Use User IDs for Accurate Tracking
Always pass userId in the widget config when the user is authenticated. This enables accurate per-user rate limiting rather than falling back to per-API-key limits.
const widget = new VIZOCHOKWidget({
  apiKey: 'pk_your_key',
  storeId: 'your-store',
  userId: currentUser.id, // Important for rate limiting
});
5. Handle Limit Errors Gracefully
Listen for limit errors in the onError callback and provide appropriate feedback:
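onError: (error) => {
  // Tenant-wide limits block every user, so alert your operations team.
  // notifyOps is your own function, not part of the widget.
  if (error.code === 'tenant_daily_token_limit' ||
      error.code === 'tenant_monthly_token_limit') {
    notifyOps('VIZOCHOK token limit reached');
  }
},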
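If you render limit errors in your own UI, a lookup from error code to message keeps the handling in one place. The sketch below reuses the error codes and chat messages listed on this page; `describeLimitError` is a hypothetical helper, and the wording of the rate-limit message is an assumption.

```javascript
// Messages taken from the user-facing texts described on this page.
const LIMIT_MESSAGES = new Map([
  ['user_message_limit', 'Daily message limit reached.'],
  ['user_conversation_limit', 'Daily conversation limit reached.'],
  ['tenant_daily_token_limit', 'Service temporarily unavailable.'],
  ['tenant_monthly_token_limit', 'Service temporarily unavailable.'],
]);

// Returns a user-facing message for a limit error, or null for other errors.
function describeLimitError(error) {
  if (error.code === 'rate_limit_exceeded') {
    // Wording is an assumption; retry_after comes from the error payload.
    return `Please wait ${error.retry_after} seconds before sending another message.`;
  }
  return LIMIT_MESSAGES.get(error.code) ?? null;
}

console.log(describeLimitError({ type: 'error', code: 'user_message_limit', limit: 200 }));
// Daily message limit reached.
console.log(describeLimitError({ type: 'error', code: 'rate_limit_exceeded', retry_after: 45 }));
// Please wait 45 seconds before sending another message.
```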
Monitoring Usage
Retrieve current usage statistics via the admin API:
GET /api/v1/admin/usage?user_id=optional_user_id
Response includes:
tokens_today: Tenant’s token count for today
tokens_month: Tenant’s token count this month
user_messages_today: User’s message count for today
user_conversations_today: User’s conversation count for today
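A sketch of building the request URL and interpreting the response is shown below. The host name, the `usageUrl` and `monthlyBudgetUsedPercent` helpers, and the sample response values are hypothetical. Authentication for the admin API is not described here, so it is left out.

```javascript
// Build the usage endpoint URL; user_id is optional.
function usageUrl(baseUrl, userId) {
  const url = new URL('/api/v1/admin/usage', baseUrl);
  if (userId) url.searchParams.set('user_id', userId);
  return url.toString();
}

// Share of the monthly token budget already consumed, in percent.
function monthlyBudgetUsedPercent(usage, monthlyLimit = 10_000_000) {
  return (usage.tokens_month / monthlyLimit) * 100;
}

console.log(usageUrl('https://vizochok.example.com', 'user_42'));
// https://vizochok.example.com/api/v1/admin/usage?user_id=user_42

// Illustrative response values, not real data.
const sampleUsage = { tokens_today: 120_000, tokens_month: 2_500_000 };
console.log(monthlyBudgetUsedPercent(sampleUsage)); // 25
```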