Technical concepts

The developer-facing companion to the owner concepts. Read this once before wiring up the API.

For the plain-English version aimed at non-technical stakeholders, see How personalization works.

Identity model

`client_id`

Your workspace identifier. Embedded inside the API key (sk_live_…) — you don’t pass it explicitly in requests. All data is partitioned by client_id: catalog, events, models, widgets, analytics. No cross-workspace queries.

`(item_id, languageCode)`

The composite key for every product. The same SKU sold in English and French is two records — distinct titles, descriptions, attributes, but the same item_id. This lets your storefront pass a single SKU through the funnel while CartAmplify keeps language-local representations.

`(client_id, languageCode)`

The partition key for the personalization model. Each language trains an independent model. Cold-start state and the 10,000-event activation threshold are evaluated per language, not per workspace.

`visitorId`

Required on every event and every search/browse/recommendation request. An anonymous identifier, 1–100 characters, limited to alphanumerics plus `_` and `-`. Your storefront generates and persists it, usually in a cookie or localStorage. Identity continuity across sessions depends on that persistence: a visitorId regenerated on every page load disables cross-session personalization.
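A minimal sketch of generating and validating a visitor ID against the constraints above. The helper names are hypothetical, and "alphanumeric" is assumed here to mean ASCII letters and digits. Persisting the value (cookie, localStorage) is left to the storefront.

```python
import re
import uuid

# Constraint from the text: 1-100 characters, alphanumeric plus "_" and "-".
# Assumption: "alphanumeric" means ASCII letters and digits only.
VISITOR_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,100}")


def is_valid_visitor_id(visitor_id: str) -> bool:
    """Return True if the ID satisfies the documented length and charset."""
    return VISITOR_ID_RE.fullmatch(visitor_id) is not None


def new_visitor_id() -> str:
    """Create a fresh anonymous ID (32 hex characters from a random UUID)."""
    return uuid.uuid4().hex
```

Generate the ID once per visitor and reuse the stored value on later page loads; calling `new_visitor_id()` on every request would recreate the per-page-load problem described above.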

`userInfo.userId`

Optional. When sent, ties the current visitorId to a known user. Sending the same userId from multiple visitor sessions (different devices, anonymous-then-logged-in) merges those sessions in the personalization signal.

Key types

Prefix	Environment	Counted in analytics?	Counted in personalization?
`sk_test_…`	Sandbox	No	No
`sk_live_…`	Production	Yes	Yes

Generated under Dashboard → API Keys. Both grant full read/write to the workspace — treat as secrets, never ship in client-side bundles. Use sk_test_ during integration so test traffic doesn’t pollute live analytics or training data.
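One way to keep test traffic out of production is a small guard that inspects the key prefix before an integration test runs. This is a sketch only; the function names are hypothetical and only the two prefixes from the table are assumed.

```python
def key_environment(api_key: str) -> str:
    """Map an API key to its environment using the documented prefixes."""
    if api_key.startswith("sk_test_"):
        return "sandbox"
    if api_key.startswith("sk_live_"):
        return "production"
    raise ValueError("unrecognized API key prefix")


def require_sandbox_key(api_key: str) -> str:
    """Refuse to proceed with anything other than a sandbox key."""
    if key_environment(api_key) != "sandbox":
        raise RuntimeError("integration tests must use an sk_test_ key")
    return api_key
```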

Catalog data flow

Bulk sync state machine

POST /v1/products/bulk is stateful across pages:

Page 1 acquires a sync lock and records the start timestamp.
Pages 2..N-1 upsert products.
The last page (page == pages) deletes any products not touched during this sync, rebuilds the search index for the language, and releases the lock.

Skipping the last page leaves the sync lock held and stale products in place. The lock auto-expires after 24 hours; clients should always close the sync explicitly.

Idempotency: re-sending the same page is safe because pages are applied as upserts. Re-running the entire sync is also safe: each sync that reaches its last page rebuilds the index.
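The paging contract can be sketched as a generator that splits a catalog into page payloads, so the final payload always satisfies `page == pages` and closes the sync. The `page` and `pages` names come from the description above; the `products` field name and the overall payload shape are assumptions for illustration, not the documented request body.

```python
import math
from typing import Iterator, List


def bulk_sync_pages(products: List[dict], page_size: int) -> Iterator[dict]:
    """Yield one payload per page; the last one has page == pages."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    pages = math.ceil(len(products) / page_size)
    for page in range(1, pages + 1):
        start = (page - 1) * page_size
        yield {
            "page": page,    # 1-based page index
            "pages": pages,  # total page count, identical on every payload
            "products": products[start:start + page_size],  # assumed field name
        }
```

Sending every yielded payload in order, including the last, is what releases the lock. An empty catalog yields no payloads in this sketch, so it never opens a sync.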

Single-product CRUD

Real-time. Single-product POST/PUT/DELETE operations update the index incrementally — no lock involved. Use these for live inventory changes after the initial bulk sync.

Index rebuild triggers

Last page of a /v1/products/bulk sync
Catalog import via dashboard re-upload
Manual rebuild via dashboard (rare)

Single-product mutations update the index in place without a full rebuild.

Event ingestion

Synchronous vs async

Event ingestion is async. /v1/user-events returns 202 Accepted with an eventId immediately; validation, deduplication, and downstream processing happen out of band. Typical end-to-end visibility in analytics: seconds. During high traffic: occasionally minutes.

Required fields, all events

eventType, eventTime, visitorId, languageCode, userInfo.ipAddress, userInfo.userAgent.
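Because ingestion is asynchronous, it can help to check the required fields on the client before sending. The sketch below walks the dotted field paths listed above. Treating `None` and empty strings as missing is an assumption, and the function name is hypothetical.

```python
from typing import List

# Field paths taken from the list above; dots denote nesting.
REQUIRED_FIELDS = (
    "eventType",
    "eventTime",
    "visitorId",
    "languageCode",
    "userInfo.ipAddress",
    "userInfo.userAgent",
)


def missing_required_fields(event: dict) -> List[str]:
    """Return the required field paths that are absent or empty."""
    missing = []
    for path in REQUIRED_FIELDS:
        node = event
        for part in path.split("."):
            # Assumption: None and "" count as missing.
            if not isinstance(node, dict) or node.get(part) in (None, ""):
                missing.append(path)
                break
            node = node[part]
    return missing
```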

`eventTime` window

Live events (/v1/user-events) must be within ±24h of server time. Outside this window, the event is rejected.
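A client-side pre-check for the window might look like the sketch below. It compares against the local clock, which only approximates server time, and it assumes the boundary at exactly 24 hours is inclusive. Both points are assumptions rather than documented behavior.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_SKEW = timedelta(hours=24)  # the documented +/-24h window


def event_time_in_window(event_time: datetime,
                         now: Optional[datetime] = None) -> bool:
    """Check a timezone-aware eventTime against the +/-24h window."""
    if now is None:
        # Local clock stands in for server time here.
        now = datetime.now(timezone.utc)
    return abs(now - event_time) <= MAX_SKEW
```

Pass timezone-aware datetimes; mixing naive and aware values raises a `TypeError` in the subtraction.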

Deduplication

Events are keyed by (client_id, visitorId, eventType, eventTime, languageCode, transactionId/cartId). Duplicates are accepted at ingestion and removed downstream, so retries and accidental double-fires don’t skew analytics or training.
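To see why a retry collapses into the original event, the key can be modeled as a tuple. This illustrates the idea and is not the service's implementation. It assumes `transactionId` and `cartId` sit at the top level of the event and that `transactionId` takes precedence when both are present.

```python
from typing import Optional, Tuple


def dedup_key(client_id: str, event: dict) -> Tuple[str, str, str, str, str, Optional[str]]:
    """Build an illustrative dedup key from the fields named above."""
    # Assumption: transactionId wins over cartId; both live at the top level.
    ref = event.get("transactionId") or event.get("cartId")
    return (
        client_id,
        event["visitorId"],
        event["eventType"],
        event["eventTime"],
        event["languageCode"],
        ref,
    )
```

Two sends of the same event produce equal keys only if `eventTime` is identical, so a retry should resend the original timestamp rather than stamp a new one.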

Languages

Two-letter ISO 639-1 codes (en, fr, es, it, de…). Configured per workspace in the dashboard. Sending an unsupported language returns 400 unsupported_language.

Per-language scoping applies to:

Catalog (product records)
Events (each event tagged)
Models (one model per language)
Widgets (widgets configured per language for search/browse; recommendation widgets currently span languages)
Analytics (filter by language)

Model behavior

For every search, browse, and recommendation request, the personalization model scores products for the requesting shopper and returns them ordered by predicted likelihood of click or conversion. Scoring factors in the shopper’s event history and current session context. Optimized to handle catalogs of 50,000+ products with end-to-end response times in the ~50ms range.

Models train per (client_id, languageCode). The underlying architecture is proprietary and not documented publicly; this page describes the behavior integrators need to reason about, not the implementation.

Training cadence

First training: triggered automatically when a (client_id, languageCode) pair crosses 10,000 events.
Subsequent retraining: scheduled, frequency proportional to event volume (high-traffic languages retrain more often; low-traffic ones less).
No client-triggered retraining. No “retrain now” endpoint. Models are managed automatically.

Real-time session signals

Independently of the trained model, the system tracks real-time session signals — recent views, recent cart adds, current product context — and uses them to influence ranking before any retrain. A shopper who just viewed a blue dress sees related blue products climb in the next category page immediately, without waiting for the model to train again.

Attribution

Every /v1/search, /v1/browse, and /v1/recommendations response includes an attributionToken (opaque string). Pass that token through to subsequent detail-page-view, add-to-cart, and purchase-complete events on productDetails[].attributionToken.
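The pass-through described above can be sketched as a helper that copies the token from a response onto each product entry of a follow-up event. Only `attributionToken` and `productDetails` are taken from the text; the `item_id` key inside each entry and the function name are assumptions for illustration.

```python
from typing import Iterable, List


def attach_attribution(response: dict, item_ids: Iterable[str]) -> List[dict]:
    """Build productDetails entries carrying the response's token."""
    token = response["attributionToken"]  # opaque string, passed through unchanged
    return [
        {"item_id": item_id, "attributionToken": token}  # "item_id" key is assumed
        for item_id in item_ids
    ]
```

A storefront would typically store the token alongside the rendered results and call this when building the detail-page-view, add-to-cart, or purchase-complete event, for example `event["productDetails"] = attach_attribution(search_response, ["sku-123"])`.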

The dashboard’s per-surface revenue metrics (revenue from search, revenue from {widget}) are computed exclusively from attributed events. Unattributed conversions show as direct.

Token lifetime: implementation-dependent, but generally valid for the shopper’s session and a short window beyond.

Cold start

Per-language, evaluated independently:

Under 10k events: model status = cold start. Search and browse use content-based ranking (titles, categories, attributes) plus popularity feedback. Behavioral recommendation widgets fall back to content-based or popularity-based results.
≥ 10k events: first model trains automatically. Status flips to personalized. Per-shopper ranking activates.
Continued growth: automatic retraining keeps the model fresh.

Cold start ends once the language reaches the threshold through live event ingestion — there is no fast-forward path.
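Since the threshold is evaluated per language, a workspace can be personalized in one language and still cold in another. The sketch below estimates status from per-language event counts. It is an approximation: the real status flips when the first model finishes training, not at the instant the count is reached. The function name and the input mapping are hypothetical.

```python
from typing import Dict

ACTIVATION_THRESHOLD = 10_000  # events per (client_id, languageCode)


def expected_status(events_by_language: Dict[str, int]) -> Dict[str, str]:
    """Estimate model status per language from live event counts."""
    return {
        lang: "personalized" if count >= ACTIVATION_THRESHOLD else "cold start"
        for lang, count in events_by_language.items()
    }
```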

Rate limits

Monthly request quotas per tier (Starter 20k, Growth 100k, Amplify 500k, Enterprise unlimited). A ~10% grace buffer applies before hard-blocking; over the buffer, requests return 429 rate_limit_exceeded. Overage is charged per 1k requests, by tier (see Pricing).
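As a rough illustration of how the quota and grace buffer interact, the sketch below classifies a month's usage for a tier. The buffer is documented only as approximately 10%, so the exact cutoff, the inclusive boundaries, and the state names are assumptions.

```python
from typing import Dict, Optional

# Monthly quotas from the text; None means unlimited.
TIER_QUOTAS: Dict[str, Optional[int]] = {
    "Starter": 20_000,
    "Growth": 100_000,
    "Amplify": 500_000,
    "Enterprise": None,
}


def quota_state(tier: str, requests_used: int) -> str:
    """Classify usage as within_quota, grace, or blocked (429)."""
    quota = TIER_QUOTAS[tier]
    if quota is None or requests_used <= quota:
        return "within_quota"
    # Assumption: the buffer is exactly 10% of the quota, boundary inclusive.
    if requests_used <= quota + quota // 10:
        return "grace"
    return "blocked"
```

Under these assumptions a Starter workspace moves into the grace band after 20,000 requests and is blocked after 22,000.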

Rate-limited endpoints: /v1/search, /v1/browse, /v1/recommendations. Event ingestion and catalog operations are not metered the same way (event-volume billing is separate, usually included; bulk catalog operations are governed by tier product limits).

Multi-tenancy guarantees

All requests scoped to the workspace identified by the bearer key.
No cross-workspace data access (no admin endpoints expose other tenants’ data through the public API).
Catalog, events, models, and analytics are isolated per client_id.
A client_id cannot enumerate another client_id’s widgets, products, or events.

What you don’t have to manage

Things CartAmplify handles automatically that integrators don’t touch:

Model retraining schedules
Search index refresh after catalog changes
Per-language model tuning
Real-time session signal handling
Cold-start → trained-model handoff

If you find yourself needing one of the above, talk to us — there’s likely an existing dashboard control or a feature in development that solves the underlying need.