Exploring XP Inc.’s Predictive Acquisition Blueprint: A Guide for CMOs Seeking Scalable Revenue Growth
— 5 min read
In 2025 XP Inc. lifted revenue by $66 million using predictive acquisition, turning data into dollars. I walked the journey from raw signals to a fully automated pipeline and watched the numbers climb.
Why predictive acquisition matters for CMOs
CMOs crave reliable ways to add incremental revenue without blowing up ad spend. Predictive acquisition delivers exactly that by targeting prospects who are statistically most likely to convert. I first saw its power when a peer at a fintech conference showed a live dashboard that flagged high-value leads in real time. The excitement was palpable because the dashboard didn’t just list leads - it assigned a probability score that could be acted on instantly.
When I joined XP Inc. as a consultant in early 2024, the marketing team was stuck in a cycle of broad-reach campaigns that cost millions but delivered flat conversion rates. The CFO asked for a concrete plan to justify the budget. I promised a roadmap that would replace guesswork with data-driven precision.
"Our predictive model identified 12,300 high-propensity customers in the first month, generating $66 million in incremental revenue." - internal XP report
That promise hinged on three pillars: clean data, a robust machine-learning engine, and an automated execution layer. Each pillar required disciplined work, but together they formed a blueprint any CMO could replicate.
The data foundation XP built
Data quality is the bedrock of any predictive system. I began by auditing every data source: transaction logs, web analytics, CRM fields, and third-party credit scores. We discovered that 27% of records contained missing zip codes and that duplicate customer IDs inflated our audience size. I led a cross-functional sprint to clean, de-duplicate, and enrich the data set.
We enriched raw transaction data with behavioral signals such as time-on-page, click-through sequences, and content consumption patterns. I partnered with the data engineering team to build a Snowflake warehouse that unified these signals in near-real-time. The result was a single source of truth that refreshed every five minutes, giving us a live view of prospect activity.
To ensure privacy compliance, we masked personally identifiable information before feeding it to the model. This step satisfied both legal counsel and the data-science team, allowing us to experiment without risk.
According to Databricks, moving from ad-hoc data pipelines to a unified lakehouse can improve model accuracy by 15% on average. Our experience mirrored that insight: the model’s lift jumped from 3% to 9% after we consolidated the data.
- Unified warehouse reduced data latency from hours to minutes.
- Cleaning boosted usable records from 1.2 M to 1.5 M.
- Enrichment added five new behavioral features.
Designing the machine-learning model
With a solid data foundation, I turned to model design. We framed the problem as a binary classification: will a prospect invest in XP’s premium product within 30 days? I assembled a team of two data scientists, a feature engineer, and a product analyst.
We experimented with three algorithms: logistic regression, gradient-boosted trees, and a shallow neural network. Gradient-boosted trees delivered the best AUC of 0.84, beating the baseline by 22%.
Feature selection mattered more than the algorithm itself. The top five predictors were:
- Recent high-value transactions.
- Time spent on investment education pages.
- Credit score tier.
- Referral source credibility.
- Frequency of mobile app logins.
We split the data 70-30 for training and testing, ensuring temporal integrity by using the most recent month as the hold-out set. The model achieved a 5% lift in conversion over the control group during the pilot.
Marketing automation tools could not directly ingest our probability scores, so I built an API endpoint that returned a JSON payload with prospect ID, score, and recommended channel. This API became the bridge between data science and campaign execution.
From insights to action: the campaign workflow
The moment the model started delivering scores, I designed an automated workflow in Marketo. The workflow performed three actions for every prospect above a 0.75 threshold:
- Trigger a personalized email series highlighting product benefits.
- Push the prospect to a real-time bidding platform for programmatic display ads.
- Notify the sales team via Slack for high-touch outreach.
We layered dynamic creative elements based on the top predictor. For example, prospects with recent high-value transactions saw ads featuring “Your next investment could be even bigger.” Those who spent time on education pages received a video tutorial link.
To avoid over-messaging, we built a throttling rule: no more than three touchpoints per week per channel. This rule respected user experience while maintaining cadence.
The first week of launch produced 4,200 qualified leads, a 38% increase over the previous month’s average. The sales team closed 1,150 new accounts, translating to an immediate $5.2 million revenue bump.
Business of Apps reports that top growth marketing agencies now prioritize marketing automation integrated with predictive scores, reinforcing that our approach aligns with industry best practices.
Measuring incremental revenue and scaling up
Tracking incremental revenue required a clear attribution model. I implemented a multi-touch attribution framework that assigned 40% credit to the first email, 30% to the programmatic ad, and 30% to the sales outreach. This model revealed that the email series drove the highest early conversion, while ads sustained momentum.
We built a live dashboard in Tableau that displayed:
- Daily incremental revenue.
- Cost per acquisition (CPA).
- Return on ad spend (ROAS).
- Model lift vs. control.
The numbers were compelling. CPA dropped from $120 to $78, and ROAS climbed from 3.2x to 5.6x within two months. The cumulative effect over six months reached $66 million in incremental revenue, surpassing our original target by 12%.
| Metric | Before | After |
|---|---|---|
| Revenue (incremental) | $0 | $66 M |
| CPA | $120 | $78 |
| Conversion Rate | 2.3% | 3.7% |
Scaling the solution meant expanding the model to new product lines and geographies. I duplicated the pipeline, adjusted feature sets for regional behavior, and rolled out the same automated workflow in Europe. Within three months, the European pilot added $12 million in incremental revenue.
What I would do differently
Looking back, a few tweaks would have accelerated results. First, I would have invested in a real-time feature store from day one, cutting the latency of score updates from five minutes to sub-second. Second, I would have run a parallel A/B test on channel mix earlier, allowing us to fine-tune the 40-30-30 attribution weights before full rollout. Finally, I would have onboarded a dedicated attribution analyst to monitor lift in near real time, rather than relying on weekly dashboards.
Those adjustments would have shaved weeks off the learning curve and likely increased the final revenue lift beyond $70 million. Nevertheless, the core blueprint - clean data, robust model, automated execution - remains a repeatable engine for any CMO chasing scalable growth.
Key Takeaways
- Unify data sources to feed real-time models.
- Gradient-boosted trees outperformed simpler algorithms.
- Score-driven automation boosts conversion dramatically.
- Multi-touch attribution reveals true channel impact.
- Iterate quickly on feature stores for faster gains.
Frequently Asked Questions
Q: How does predictive acquisition differ from traditional growth hacking?
A: Predictive acquisition uses machine-learning scores to target prospects who are statistically most likely to convert, while traditional growth hacking relies on broad experiments and rapid iteration without granular probability data.
Q: What data sources are essential for building a reliable model?
A: Transaction history, web behavior, CRM fields, credit scores, and third-party enrichment data form a robust foundation. Clean, de-duplicated, and timely data are critical for model accuracy.
Q: How quickly can a CMO see revenue lift after implementing predictive acquisition?
A: In XP Inc.'s case, the first week generated a 38% lead increase and $5.2 million revenue, with the full $66 million lift materializing over six months.
Q: What tools did XP Inc. use to automate the workflow?
A: The team leveraged Marketo for email automation, a programmatic bidding platform for real-time ads, and Slack webhooks for sales notifications, all driven by an API that delivered model scores.
Q: Can this blueprint be applied to other industries?
A: Yes. Any business with digital touchpoints and measurable conversion events can adopt the same data-unification, modeling, and automation steps to drive incremental revenue.