[ RECOMMENDATION SYSTEMS ]
Recommendation Systems for Real-Time Personalisation
A recommendation engine that takes 200ms to respond is a recommendation engine that gets ignored. Sub-50ms latency is the threshold where personalisation starts to drive measurable basket-size growth. Hitting that threshold at scale is an engineering problem more than a modelling problem.
[ THE LEVENT POINT OF VIEW ]
Sub-50ms is the difference between recommendations that drive revenue and recommendations that get ignored.
The model can be elegant. The latency can still kill it. We design recommendation systems back from the latency budget, not forward from the model. Hybrid approaches (collaborative plus content-based) win when cold-start matters; serving infrastructure choices decide the rest.
[ WHAT THIS MEANS IN PRACTICE ]
Our recommendation accelerator combines collaborative filtering on aggregated anonymised signals with content-based scoring for cold-start coverage. Serving runs on Elasticsearch for sub-50ms response across millions of items. The hybrid weighting is tuned per segment so that recommendations match how each cohort actually shops.
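To make per-segment hybrid weighting concrete, here is a minimal sketch. The segment names, the weight values, and the function names are hypothetical and chosen for illustration only; it assumes both scores are already normalised to the same range.

```python
# Hypothetical per-segment weights: the share of the final score given to
# collaborative filtering. The remainder goes to content-based scoring.
SEGMENT_CF_WEIGHT = {"loyal": 0.8, "occasional": 0.5, "new": 0.2}
DEFAULT_CF_WEIGHT = 0.5  # assumed fallback for unknown segments


def hybrid_score(cf_score: float, content_score: float, segment: str) -> float:
    """Blend the two scores linearly using the segment's weight."""
    w = SEGMENT_CF_WEIGHT.get(segment, DEFAULT_CF_WEIGHT)
    return w * cf_score + (1.0 - w) * content_score


def rank(candidates: dict, segment: str, k: int = 10) -> list:
    """candidates maps item_id -> (cf_score, content_score)."""
    scored = {
        item: hybrid_score(cf, content, segment)
        for item, (cf, content) in candidates.items()
    }
    # Highest blended score first, truncated to the top k items.
    return sorted(scored, key=scored.get, reverse=True)[:k]
```

With the same candidate set, a cohort weighted towards collaborative filtering and a cohort weighted towards content can receive different orderings, which is the point of tuning the blend per segment.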
Cold-start handling is where most rec engines quietly fail. We handle new items with a content-based fallback and hand them over to collaborative filtering as soon as enough behavioural signal accumulates. New users get a session-context bootstrap (the items they just looked at) until their behavioural fingerprint emerges.
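One way to picture the cold-start handover is a weight that ramps from content-based to collaborative scoring as events accumulate. This is a sketch under stated assumptions: the linear ramp, both thresholds, and all names are hypothetical, not a description of the production logic.

```python
MIN_EVENTS_FOR_CF = 50  # hypothetical: events before an item is fully CF-scored
MIN_USER_HISTORY = 5    # hypothetical: events before a user's history is trusted


def cf_weight(n_events: int, threshold: int = MIN_EVENTS_FOR_CF) -> float:
    """Share of the score taken from collaborative filtering (0.0 to 1.0)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return min(1.0, max(0, n_events) / threshold)


def item_score(cf_score: float, content_score: float, n_events: int) -> float:
    """New items lean on content; established items lean on behaviour."""
    w = cf_weight(n_events)
    return w * cf_score + (1.0 - w) * content_score


def seed_items(history: list, session_views: list) -> list:
    """Bootstrap new users from the current session until history builds up."""
    return history if len(history) >= MIN_USER_HISTORY else session_views
```

A brand-new item with zero events is scored purely on content, and a visitor with no history is seeded from the items viewed in the current session.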
Signal selection drives most of the model's performance. View signals are cheap and noisy; add-to-cart signals are sparser and stronger; conversion signals are the strongest and rarest. We weight the signal stack per cohort and per surface (homepage versus PDP versus cart upsell), because the same recommendation strategy fails differently on each.
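As an illustration of a per-surface signal stack, the sketch below turns raw events into a single interaction strength per item. The surface keys, signal names, and weight values are hypothetical placeholders; real weights would be tuned per cohort and surface.

```python
from collections import defaultdict

# Hypothetical weights: views are cheap and noisy, add-to-cart is stronger,
# purchases are strongest. Each surface gets its own stack.
SIGNAL_WEIGHTS = {
    "homepage": {"view": 1.0, "add_to_cart": 3.0, "purchase": 5.0},
    "pdp": {"view": 0.5, "add_to_cart": 4.0, "purchase": 6.0},
    "cart": {"view": 0.25, "add_to_cart": 2.0, "purchase": 8.0},
}


def interaction_strength(events: list, surface: str) -> dict:
    """events is a list of (item_id, signal) pairs; returns item_id -> strength."""
    weights = SIGNAL_WEIGHTS[surface]
    strength = defaultdict(float)
    for item_id, signal in events:
        # Unknown signal types contribute nothing rather than raising.
        strength[item_id] += weights.get(signal, 0.0)
    return dict(strength)
```

The same event log then produces different training targets for the homepage and for the cart upsell, without changing the model itself.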
Integration with merchandising matters as much as the model. A recommendation engine that the merchandising team cannot inspect or override does not survive its first peak season. We deliver merchandiser controls (pinning, exclusion, business-rule overlays) and a transparent ranking explanation so the team trusts the engine on the days it matters most.
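The merchandiser controls can be pictured as a thin overlay applied after the model has ranked. The sketch below is an assumption about shape, not the delivered implementation: the function name is hypothetical, and the choice that an exclusion overrides a pin is a design decision made here for illustration.

```python
def apply_overlays(ranked, pinned=(), excluded=(), k=10):
    """Return the top k (item_id, source) pairs after business rules.

    Pinned items come first in the order given, then the model ranking.
    Excluded items never appear, even if they are also pinned.
    """
    blocked = set(excluded)
    seen = set()
    result = []
    for source, items in (("pinned", pinned), ("model", ranked)):
        for item in items:
            if item in blocked or item in seen:
                continue
            seen.add(item)
            # The source label is a very simple ranking explanation.
            result.append((item, source))
    return result[:k]
```

Returning the source alongside each item gives the merchandising team a first-level answer to "why is this here?": it was either pinned by a person or ranked by the model.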
Evaluation matters more in recommendations than in almost any other ML problem, because online and offline metrics regularly diverge. We instrument both, with A/B harnesses that the merchandising team can run themselves.
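A self-serve A/B harness needs at least two pieces: stable assignment of users to variants and a significance check on the outcome. The sketch below uses hash-based bucketing and a pooled two-proportion z-test; the function names and the experiment key format are hypothetical, and a production harness would add guardrails such as sample-size planning.

```python
import hashlib
import math


def assign_variant(user_id: str, experiment: str,
                   variants=("control", "treatment")) -> str:
    """Deterministic bucketing: the same user always sees the same variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """z-statistic for the difference in conversion rate (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0  # no variance at all, so no detectable difference
    return (p_b - p_a) / se
```

An absolute z-value above roughly 1.96 corresponds to significance at the conventional 5% level for a two-sided test.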
[ HOW WE DELIVER THIS ]
Build delivers the model, the serving stack, and the A/B harness. Operate runs the latency-sensitive infrastructure. Enable upskills the merchandising team on running their own experiments.