Multi-Tenant Architecture
VIZOCHOK is a multi-tenant SaaS where each tenant (retailer) gets an isolated environment with their own:- Product catalog and vector embeddings
- AI configuration (LLM model, system prompt, disabled tools)
- API keys (public and secret)
- Webhook endpoints (products, cart)
- Usage limits and billing
- Admin users
tenant_id scoping at every layer.
Tenant A
- Catalog + Embeddings
- API Keys (pk_, sk_)
- LLM Config (configurable per tenant)
- Webhooks (products, cart)
- Usage Limits + Billing
- Admin Users
Tenant B
- Catalog + Embeddings
- API Keys (pk_, sk_)
- LLM Config (configurable per tenant)
- Webhooks (products, cart)
- Usage Limits + Billing
- Admin Users
Data Flow
The complete data flow from a user message to a response:VIZOCHOK Backend
WS Handler authenticates and checks rate limits. Shopping Agent (LLM + Tools) processes the request using Hybrid Search and Webhook Calls.
Client Backend
VIZOCHOK calls your Products API and Cart API via HTTP webhooks to get live prices and confirm cart operations.
Step-by-Step Flow
- User sends a message via the Widget SDK over WebSocket
- API server authenticates the connection using the API key (SHA-256 hashed)
- Rate limits are checked at three tiers: per-connection, per-user, per-tenant
- Agent is initialized with tenant config, session state (from Redis), and user profile
- Smart model routing selects the LLM model:
- First message in a conversation uses the complex model
- Subsequent tool-chain iterations use the fast model
- Agent processes the message using a tool-based architecture:
- Search tools: Hybrid search (vector + FTS + trigram) with RRF fusion and optional Cohere reranker
- Cart tools: Add, remove, update, clear — each calls the client’s webhook
- UI tools: Show products, ask user, show recipe checklist, show meal plan
- Prices are fetched from the client’s backend via the
products_urlwebhook (server-to-server) - Cart operations call the client’s
cart_urlwebhook, which confirms or rejects - Results stream back to the widget via WebSocket as
text_delta,product_cards,cart_changed, etc. - Session is saved to Redis for the next message (1h TTL)
What VIZOCHOK Stores vs. What the Client Stores
Understanding the data boundary is critical for integration:VIZOCHOK Stores
| Data | Storage | Purpose |
|---|---|---|
| Product catalog | PostgreSQL | Names, descriptions, categories, images, SKUs — for search and embeddings |
| Product embeddings | pgvector | 1536-dimension vectors for semantic search |
| Conversation history | PostgreSQL | Full chat log for analytics and session restore |
| Session state | Redis | Cart, pending tools, context — 1h TTL |
| API keys (hashed) | PostgreSQL | Authentication |
| Tenant configuration | PostgreSQL | LLM, webhooks, limits, prompts |
| Usage counters | Redis | Rate limiting and billing |
| User profiles | PostgreSQL | Language, dietary preferences, favorite brands |
Client Stores
| Data | Purpose |
|---|---|
| Product prices | Returned via products_url webhook on demand |
| Product availability | Returned via products_url webhook on demand |
| Shopping cart | Managed via cart_url / cart_get_url webhooks |
| User identity | Passed as userId to the widget |
| Order history | Never shared with VIZOCHOK |
| Payment information | Never shared with VIZOCHOK |
VIZOCHOK intentionally does not store prices or availability. These are always fetched in real-time from the client’s backend via webhooks, ensuring the AI always has current data.
Session Lifecycle
New Connection
- Client sends
authmessage - Server responds with
auth_ok - Server sends
conversation_startedwith newconversation_id - Message loop begins
- On disconnect, session saved to Redis (1h TTL)
Reconnection
- Client sends
authmessage - Server responds with
auth_ok - Server sends
session_restoredwith cart + pending tools - Message loop resumes
- On disconnect, session saved to Redis (1h TTL)
Session Details
- Server side: Redis with 1-hour TTL, scoped by
tenant_id:conversation_id - Client side:
sessionStorage(per-tab) stores conversation ID and message history under keyvz-session-{storeId} - On reconnect: Server sends
session_restoredwith cart state and any pending interactive tool (product selection, quick replies, etc.) - On expiry: After 1 hour of inactivity, the Redis session expires and a new conversation starts
Smart Model Routing
VIZOCHOK uses a dual-model strategy to balance quality and cost:| Model | When Used | Strengths |
|---|---|---|
| Complex model (configurable) | First message in a conversation | Complex reasoning, intent classification |
| Fast model (configurable) | Subsequent tool-chain iterations | Fast response, cost-effective for tool calls |
- Conversation position: First message always uses the complex model
- Tool chain depth: After the first LLM call, subsequent iterations in the same response use the fast model
- Tenant configuration: Each tenant can configure which models to use (via
llm_modelandllm_model_fastcolumns)
Token Budget
Each conversation has a configurable token budget (max_tokens_per_session). When the accumulated token usage approaches the limit, the agent sends a session_token_limit error and suggests starting a new conversation.
Agent Tool Architecture
The AI agent uses a tool-based architecture where the LLM decides which tools to call:Non-Interactive Tools
These tools execute and return results to the LLM for further processing. The agent loops: LLM call -> tool execution -> LLM call -> … until it produces a final text response or calls an interactive tool.| Tool | Description |
|---|---|
search_products | Hybrid search across the product catalog |
add_to_cart | Add a product to cart (via webhook) |
remove_from_cart | Remove a product from cart (via webhook) |
update_quantity | Change quantity of a cart item (via webhook) |
clear_cart | Clear all cart items (via webhook) |
get_cart | Retrieve current cart state |
Interactive Tools
These tools render UI in the widget and pause for user input. The conversation resumes when the user interacts with the UI element.| Tool | Description | Widget UI |
|---|---|---|
show_products_to_user | Display product cards for selection | Product list with quantity steppers |
ask_user | Present quick-reply options | Pill-shaped buttons |
show_recipe_checklist | Show ingredient checklist | Checkable ingredient list with submit |
show_meal_plan | Display a meal plan for approval | Multi-day plan with approve/modify |
Hybrid Search
Product search combines three strategies using Reciprocal Rank Fusion (RRF):- Vector search — Cohere embed-v4.0 embeddings (1536 dimensions) for semantic similarity via pgvector
- Full-text search — PostgreSQL
tsvectorwith Ukrainian language configuration - Trigram search —
pg_trgmfor fuzzy matching of brand names and misspellings
products_url webhook to fetch real-time prices and availability. Products not returned by the webhook are filtered out (treated as unavailable).
Webhook Architecture
VIZOCHOK uses server-to-server webhooks for all commercial data:- VIZOCHOK Backend calls your HTTP endpoints server-to-server
GET products_urlreturns prices and stock for requested SKUsPOST cart_urlconfirms add/remove/update/clear cart operationsGET cart_get_urlreturns current cart contents at session start
X-VIZOCHOK-Signature header for verification. The shared secret is configured per tenant.
Error Handling
Errors flow through the system as machine-readable codes:- Backend detects error (rate limit, validation, LLM failure)
- Backend sends
{"type": "error", "code": "rate_limit_exceeded"}via WebSocket - SDK maps the code to a localized message using its i18n dictionary
- SDK renders the error in the chat and fires the
onErrorcallback - The host page can handle the error (e.g., show a toast, log to analytics)