Last quarter, my team ran an A/B test on a re-engagement campaign. Version A versus Version B. We let it run for two weeks, declared a winner, and rolled it out. Standard process. But here's the thing that nagged me afterward: for those two weeks, half our users were getting the worse experience, and we had a strong hunch which half by day three. We just couldn't act on it, because the test hadn't reached statistical significance yet.
That's the fundamental tension of traditional experimentation. You need rigorous results, but you're paying for rigor with wasted opportunity.
The multi-armed bandit is a different approach. Instead of splitting traffic 50/50 and waiting, a bandit algorithm dynamically shifts traffic toward the better-performing variation while the test is still running. It explores early (trying all options) and exploits later (leaning into what works), continuously balancing learning with optimization.
The name comes from the old metaphor of a gambler facing a row of slot machines ("one-armed bandits"), trying to figure out which one pays best without wasting too many coins on the losers. In product and growth, the "machines" are your message variants, send times, subject lines, or channel choices.
For growth teams, the practical difference is significant. A/B tests optimize for certainty at the end. Bandits optimize for cumulative performance throughout. You still learn which variant wins, but you waste fewer impressions on the loser along the way. Bain research has shown that bandits can substantially reduce the cost of experimentation while delivering faster, more adaptive insights.
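To make that difference concrete, here is a minimal simulation sketch in Python. It is not how any particular platform implements bandits. It uses Thompson sampling, one common bandit strategy, and compares it with a fixed 50/50 split. The conversion rates (5% and 10%), the impression count, and the function names are all made up for illustration.

```python
import random


def simulate_fixed_split(rates, n_impressions, rng):
    """Classic A/B test: each impression goes to a uniformly random arm."""
    pulls = [0] * len(rates)
    wins = [0] * len(rates)
    for _ in range(n_impressions):
        arm = rng.randrange(len(rates))
        pulls[arm] += 1
        wins[arm] += int(rng.random() < rates[arm])  # simulated conversion
    return pulls, wins


def simulate_thompson(rates, n_impressions, rng):
    """Thompson sampling with a Beta(1, 1) prior on each arm's rate."""
    pulls = [0] * len(rates)
    wins = [0] * len(rates)
    arms = range(len(rates))
    for _ in range(n_impressions):
        # Draw one plausible conversion rate per arm from its posterior,
        # then serve the arm whose draw is highest. Uncertain arms still
        # get picked sometimes (explore); strong arms get picked most
        # of the time (exploit).
        draws = [rng.betavariate(1 + wins[i], 1 + pulls[i] - wins[i]) for i in arms]
        arm = max(arms, key=lambda i: draws[i])
        pulls[arm] += 1
        wins[arm] += int(rng.random() < rates[arm])
    return pulls, wins


if __name__ == "__main__":
    true_rates = [0.05, 0.10]  # hypothetical: variant B really is better
    for name, simulate in [("50/50 split", simulate_fixed_split),
                           ("Thompson", simulate_thompson)]:
        pulls, wins = simulate(true_rates, 10_000, random.Random(0))
        print(f"{name}: impressions per variant={pulls}, conversions={sum(wins)}")
```

In a run like this, the fixed split keeps sending roughly half the traffic to the weaker variant for the whole test, while the sampler should drift toward the stronger one as evidence accumulates. The trade-off is that the losing variant ends up with less data, so your estimate of how much worse it is will be rougher.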
Multi-armed bandits are powerful for individual tests. But the bigger shift I'm watching is how AI is changing the entire CRM stack: not just which message wins, but who gets it, when, through which channel, and what it says.
The old model was batch-and-blast: segment your users, write a campaign, send it to everyone in the segment at the same time. The new model uses AI to optimize every dimension simultaneously:
Right person. Predictive models identify which users are most likely to respond, so you're not wasting messages on people who will ignore them or annoying people who were already going to convert.
Right time. Send-time optimization analyzes individual behavior patterns. Some users engage at 7 AM with coffee. Others at 10 PM on the couch. AI learns these rhythms and adapts.
Right channel. Email, push notification, SMS, in-app message. The optimal channel varies by user, by context, and by message type. An AI-driven CRM doesn't pick one channel for the campaign. It picks the right channel for each person.
Right message. Generative AI can now produce variant copy at scale, and bandit algorithms can test those variants continuously, converging on what resonates without anyone manually writing twelve subject line options.
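As a rough sketch of what "the right channel for each person" could look like in code, here is a small epsilon-greedy picker that keeps separate statistics per context. Everything in it is an assumption for illustration: the `ChannelPicker` class, the channel names, and the idea of using a coarse context key such as a user segment or time-of-day bucket. Real CRM platforms expose their own APIs and likely use richer models than this.

```python
import random
from collections import defaultdict

# Hypothetical channel list, taken from the examples in the text.
CHANNELS = ["email", "push", "sms", "in_app"]


class ChannelPicker:
    """Epsilon-greedy bandit that learns a separate winner per context."""

    def __init__(self, channels, epsilon=0.1, seed=None):
        self.channels = list(channels)
        self.epsilon = epsilon  # share of sends reserved for exploration
        self.rng = random.Random(seed)
        # sends[context][channel] and responses[context][channel] counters
        self.sends = defaultdict(lambda: defaultdict(int))
        self.responses = defaultdict(lambda: defaultdict(int))

    def _rate(self, context, channel):
        sent = self.sends[context][channel]
        return self.responses[context][channel] / sent if sent else 0.0

    def choose(self, context):
        # Try every channel once in a new context before trusting any rate.
        untried = [c for c in self.channels if self.sends[context][c] == 0]
        if untried:
            return untried[0]
        # Explore: occasionally pick a random channel to keep learning.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.channels)
        # Exploit: otherwise pick the best observed response rate.
        return max(self.channels, key=lambda c: self._rate(context, c))

    def record(self, context, channel, responded):
        self.sends[context][channel] += 1
        self.responses[context][channel] += int(responded)
```

A caller would ask `picker.choose("evening_mobile")` before each send and report the outcome with `picker.record("evening_mobile", channel, responded)`. The same pattern could cover send times or subject lines by swapping the list of options.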
When all four dimensions are optimized together, the results compound. Early adopters of AI-powered CRM personalization are reporting meaningful lifts in conversion and retention, not because any single dimension improved dramatically, but because optimizing them simultaneously creates a multiplicative effect.
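Here is the arithmetic behind "multiplicative", as a quick sketch. The 5% lifts are invented purely to show the calculation, not reported results, and the code assumes the four lifts are independent and stack cleanly, which real campaigns may not do.

```python
import math

# Hypothetical per-dimension lifts, chosen only to show the arithmetic.
lifts = {"person": 0.05, "time": 0.05, "channel": 0.05, "message": 0.05}

# If the lifts simply added up:
additive = sum(lifts.values())

# If each lift applies on top of the previous ones:
compounded = math.prod(1 + lift for lift in lifts.values()) - 1

print(f"additive:   {additive:.1%}")
print(f"compounded: {compounded:.1%}")
```

With these made-up inputs, four modest 5% gains compound to about 21.6% rather than 20%. The gap widens as the individual lifts grow.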
Two things changed in the last year. First, the models got good enough. Language models can generate genuinely useful message variants, not just Mad Libs-style token swaps. Second, the infrastructure caught up. Platforms like Braze, Amplitude, and Hightouch now offer bandit-based experimentation and AI-driven personalization as native features, not custom builds.
For product and growth teams, this means the experimentation bottleneck is shifting. It's no longer about running more tests. It's about building systems that learn continuously, across every touchpoint, without waiting for a human to declare a winner and flip a switch.
The era of "send this email to everyone in Segment A on Tuesday at 10 AM" is ending. Not because it doesn't work at all, but because the gap between that approach and what's now possible is widening every quarter.
The teams that win in 2024 will be the ones who treat experimentation not as a process you run periodically, but as an engine that runs all the time, learning and adapting faster than any human operator could.