Good financial advice has long been scarce and expensive. A large literature shows that human advisers are costly and often dispense low-quality or conflicted advice that distorts what households do (Mullainathan et al. 2012, Egan et al. 2019, Linnainmaa et al. 2021, Guiso and Foà 2015). The lower-cost alternatives are not obviously better: popular personal finance gurus frequently depart from what economic models prescribe (Choi 2022). Against this backdrop, the rise of generative AI has raised hopes that high-quality, personalised advice might finally become cheap and universally accessible.
That prospect is no longer hypothetical. Within a few years, over half of adults in the US and the UK report having used AI tools for financial guidance, rivalling the share who consult a human adviser (J.D. Power 2025, Lloyds Banking Group 2025, Gallup 2025). This rapid growth has prompted a lively policy debate. Regulators and consumer advocates worry that large language models (LLMs) not designed to maximise users’ financial wellbeing could reinforce biases, validate poor decisions, or quietly embed discrimination, concerns that extend to the broader use of LLMs in finance (Anand et al. 2026). Optimists counter that, like robo-advisers before them, these tools could broaden stock market participation and improve decision-making (D’Acunto et al. 2019). Either way, settling this debate requires knowing what advice these models actually give.
In new research (Choukhmane et al. 2025), we develop a method to quantitatively characterise the personal financial advice that LLMs provide to households. We combine prompts written by a representative sample of individuals with a life-cycle model to quantify the impact of following AI financial advice over a lifetime. Applying this methodology to ChatGPT 5.2 and Gemini Flash 3 delivers three main results. First, relative to their current behaviour, following LLM advice would move most people closer to the saving, spending, and investing patterns recommended by standard life-cycle theory. Second, the advice nonetheless leans on simple rules of thumb, diverging from theory on subtler margins such as smoothing consumption after a job loss. Third, the advice varies systematically across people – by gender, financial literacy, and prior AI experience – reflecting both what different people ask (demand) and how the model answers identical questions (supply).
Supply, demand, and the life cycle
Studying LLM advice is difficult for three reasons. It depends on the model (supply), the questions people ask (demand), and a household’s circumstances, which evolve over a lifetime as a function of past decisions. Our framework addresses these three challenges.
We survey a nationally representative sample of around 1,000 US adults, asking each to write the prompts they would actually use to seek spending and investing advice from an LLM. We then build a standard life-cycle simulation model with realistic earnings, job transitions, taxes, and asset returns (in the tradition of Cocco et al. 2005, and closest to Choukhmane and de Silva 2024). Finally, we simulate the lifetime paths of people who follow the model’s advice, drawing a real survey prompt at each age, updating it with the simulated person’s current circumstances, and translating the model’s response into consumption–savings and portfolio choices. We apply the method to ChatGPT and Gemini and obtain similar results, so we focus on ChatGPT, the model households use most.
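To make the three steps concrete, the loop below is a minimal sketch of the simulation logic, not the implementation used in the paper. Everything in it is an illustrative assumption: `stub_advice` is a hypothetical stand-in for the LLM call and response parser, income is held constant, returns are deterministic, and taxes and job transitions are left out.

```python
import random

# Hypothetical stand-in for an LLM call plus response parsing.
# It returns (saving_rate, equity_share) from a simple invented rule.
def stub_advice(prompt, age, income, wealth):
    saving_rate = 0.10 if wealth < 10_000 else 0.15
    equity_share = max(0.0, min(1.0, (110 - age) / 100))
    return saving_rate, equity_share

def simulate_life(prompts, start_age=25, retire_age=65, income=50_000.0,
                  equity_return=0.05, safe_return=0.01, seed=0):
    """Follow the (stubbed) advice every year and record the resulting path."""
    rng = random.Random(seed)
    wealth, path = 0.0, []
    for age in range(start_age, retire_age):
        # Step 1: draw a survey prompt at each age.
        template = rng.choice(prompts)
        # Step 2: update it with the simulated person's current circumstances.
        prompt = template.format(age=age, income=income, wealth=wealth)
        # Step 3: translate the advice into saving and portfolio choices.
        saving_rate, equity_share = stub_advice(prompt, age, income, wealth)
        consumption = (1 - saving_rate) * income
        portfolio_return = (equity_share * equity_return
                            + (1 - equity_share) * safe_return)
        wealth = (wealth + income - consumption) * (1 + portfolio_return)
        path.append({"age": age, "consumption": consumption,
                     "wealth": wealth, "equity_share": equity_share})
    return path

# Invented prompt templates, for illustration only.
toy_prompts = [
    "I am {age} and earn ${income:,.0f} a year. How much should I save?",
    "I am {age} with ${wealth:,.0f} saved. How should I invest it?",
]
path = simulate_life(toy_prompts)
print(path[0], path[-1])
```

In a real pipeline the stub would be replaced by a model query, and the state would evolve under stochastic earnings and returns, so that each year's prompt reflects the consequences of earlier advice.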
Fact 1: LLM advice moves households toward life-cycle theory
Following LLM advice would move most people closer to the broad prescriptions of life-cycle theory than their current behaviour. Figure 1 compares what survey respondents report doing (red) with the advice they would receive if they asked once today (green) and if they followed the model’s advice every year over their lifetime (blue).
Figure 1 Observed behaviour versus LLM-recommended behaviour over the life cycle
Note: Red: survey respondents’ reported behaviour; green: one-shot LLM advice at current circumstances; blue: full life-cycle simulation following LLM advice each year.
The differences are large. One-third of respondents report holding no equities at all, with an average equity share of around 30%. Following the model’s advice each period instead leads over 99% of people to participate in the stock market, raises equity shares by as much as 40 percentage points, and produces allocations that decline with age after the mid-40s – the glide path textbook theory recommends (Merton 1969). Almost all of this exposure is in diversified equity funds. The advice also builds buffers: more than 20% of respondents in every age group hold less than $10,000, yet following LLM advice lifts almost everyone above that threshold by age 30. As Figure 2 shows, the result is a consumption profile that is smoother than income, substantial wealth at retirement, and an equity share that rises early and then declines. The LLM’s advice even responds to context in the prompt as theory predicts, recommending more savings and less risk-taking when prompts mention macroeconomic uncertainty.
Figure 2 Life-cycle profiles of LLM-recommended consumption, wealth, and equity shares
Note: Dots are medians; shaded bands show the 25th–75th percentile range. Values in 2025 dollars.
Fact 2: LLM advice leans on heuristics
On subtler margins, the advice diverges from theory. Much like the popular advice studied by Choi (2022) and the rules of thumb analysed by Love (2014), it leans heavily on round-number heuristics: about a third of recommended saving rates are multiples of 10%, and over 98% of recommended retirement withdrawals obey the famous ‘4% rule’. It also fails to smooth consumption during unemployment: when a simulated worker loses their job and income falls by half, the LLM tells them to cut consumption, even with ample savings to draw on. Additionally, the LLM’s recommended portfolios drift passively with market returns rather than being rebalanced, showing roughly twice the inertia Calvet et al. (2009) estimate for actual investors.
Much of this gap reflects differences in the prompts people write rather than fundamental limitations of the models. When we replace ordinary survey prompts with a structured ‘academic’ prompt that provides full context and casts the model as a professional adviser, consumption smoothing improves markedly and reliance on round-number rules declines. The implication is double-edged: better advice is within reach, but realising it depends on how people interact with the technology, and demand (i.e. prompt quality) may remain a binding constraint even as the models improve.
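The heuristics discussed in this section are easy to state as simple checks. The sketch below is illustrative only: the function names and the toy inputs are assumptions, not the paper's diagnostics or data. It shows how one might measure the share of round-number saving rates, compute a first-year withdrawal under the 4% rule, and trace how an equity share drifts when a portfolio is left unrebalanced.

```python
def round_number_share(rates_pct, step=10):
    """Fraction of saving rates (in whole percent) that are multiples of `step`."""
    return sum(1 for r in rates_pct if r % step == 0) / len(rates_pct)

def four_percent_withdrawal(wealth_at_retirement):
    """First-year withdrawal under the usual statement of the 4% rule."""
    return 0.04 * wealth_at_retirement

def passive_equity_share(share, equity_return, safe_return):
    """Equity share after one period of returns with no rebalancing."""
    equity = share * (1 + equity_return)
    safe = (1 - share) * (1 + safe_return)
    return equity / (equity + safe)

# Invented inputs, for illustration only.
toy_rates = [10, 12, 15, 20, 7, 30]
print(round_number_share(toy_rates))          # three of the six rates are multiples of 10
print(four_percent_withdrawal(1_000_000))     # 4% of the retirement balance
print(passive_equity_share(0.60, 0.20, 0.0))  # share rises after a strong equity year
```

A fully rebalanced portfolio would return to its target share each period; the further the recommended share tracks `passive_equity_share` instead, the more inertia the advice displays.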
Fact 3: LLM advice varies across households
Splitting respondents by gender, financial literacy, and prior AI experience, we find meaningful differences in lifetime outcomes: wealth at age 60 is roughly 4–6% lower when advice is generated from prompts written by women, by people with low financial literacy, or by those who have never used AI for financial advice. The channels differ: for gender and financial literacy, the wealth differences arise from lower recommended equity shares; for AI experience, lower savings. That the model recommends lower equity shares to women echoes long-standing patterns in observational data, in human financial advice (Bucher-Koenen et al. 2025), and in documented gender differences in risk-taking (D’Acunto 2015).
Focusing on gender, differences in investment advice could arise for two reasons: women and men may write different prompts (demand), or the model may provide different advice when the same prompt is labelled as coming from a man rather than a woman (supply). The top panel of Figure 3 shows that the prompts written by men and women cover different topics: women used words like “family,” “grocery,” and “pay” more often, while men leaned toward “strategy,” “crypto,” and “growth”. In turn, the models also gave different advice by gender. The bottom panel shows that ChatGPT’s responses to women’s prompts featured more references to credit unions (“ncua”) and safer investments (“fdic,” “debit,” “insure”), while its responses to men’s prompts over-represented words like “bet,” “international,” “portfolio,” and “equity”.
Figure 3 Words overrepresented in prompts and advice by gender
a) Words overrepresented in prompts
b) Words overrepresented in advice
Note: This figure shows word clouds of terms that are over-represented in prompts authored by men vs women (Panel A) and in ChatGPT 5.2 responses to prompts written by men vs women (Panel B). See Choukhmane et al. (2025) for details.
To separate demand from supply, we randomly added gender labels (“I am a man” or “I am a woman”) to the large share of prompts in our sample with no explicit gender labels. We estimate that a prompt written by a woman and labelled as from a woman yields a recommended equity share that is 1.7 percentage points lower than that of a prompt written by a man and labelled as from a man. About two-thirds of this gap is demand: men and women write different prompts, and the difference persists even when prompts carry the same label. The remaining third is supply: the model recommends more equity when an identical prompt is randomly labelled as coming from a man. This supply-side difference could reflect reasonable inferences about how preferences or circumstances vary by gender, which the model could ideally make explicit to users, or biases learned from training data.
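The logic of this decomposition can be sketched with a two-by-two design: who wrote the prompt, crossed with the randomly assigned label. The code below is one simple way to split the total gap, averaging each effect over the other dimension; it is an assumption for illustration and not necessarily the estimator used in the paper. The records are invented toy values, not survey data.

```python
from collections import defaultdict
from statistics import mean

def cell_means(records):
    """Mean recommended equity share for each (author, label) cell."""
    cells = defaultdict(list)
    for author, label, equity_share in records:
        cells[(author, label)].append(equity_share)
    return {cell: mean(values) for cell, values in cells.items()}

def decompose_gap(records):
    """Split the woman-woman vs man-man gap into demand and supply parts."""
    m = cell_means(records)
    total = m[("woman", "woman")] - m[("man", "man")]
    # Demand: change the author, hold the label fixed, average over labels.
    demand = ((m[("woman", "man")] - m[("man", "man")])
              + (m[("woman", "woman")] - m[("man", "woman")])) / 2
    # Supply: change the label, hold the author fixed, average over authors.
    supply = ((m[("man", "woman")] - m[("man", "man")])
              + (m[("woman", "woman")] - m[("woman", "man")])) / 2
    return {"total": total, "demand": demand, "supply": supply}

# Invented records: (prompt author, randomly assigned label, equity share in %).
toy_records = [
    ("man", "man", 61.0), ("man", "man", 59.0),
    ("man", "woman", 59.5), ("man", "woman", 58.5),
    ("woman", "man", 58.0), ("woman", "man", 58.0),
    ("woman", "woman", 57.5), ("woman", "woman", 56.5),
]
gap = decompose_gap(toy_records)
print(gap)
```

Because each effect is averaged over the other dimension, the demand and supply components add up to the total gap by construction. Random assignment of the label is what allows the supply component to be read as the model's response to the label alone.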
Conclusion
Our results suggest guarded optimism about whether LLMs can give good financial advice. On the decisions that matter most for long-run wealth, such as investing in diversified equity or building a savings buffer, today’s leading models give advice broadly consistent with what economists would prescribe. This is not something to take for granted: these models are not optimised to improve household finances, and one might have feared they would tell people what they want to hear. It is a meaningful step toward affordable, accessible advice, in contrast to the conflicts of interest and uneven quality that plague much human and popular guidance.
But the technology is no panacea. What people get from AI depends as much on how they write their prompts as on the model’s ability. The advice relies on simple heuristics, struggles with dynamic responses, and varies across households in ways that partly reflect the model itself. As policymakers and firms weigh how to regulate and deploy these tools, three features seem likely to persist even as models improve. Much of the heterogeneity we document comes from the questions people ask, so the demand side (i.e. financial literacy and prompt quality) may remain a binding constraint. The supply side raises governance questions about when advice should depend on who is asking. The diagnostic tests we propose for assessing consumption smoothing, diversification, and rebalancing offer a transparent yardstick for evaluating each new generation of LLM advice as it arrives.
References
Anand, K, S Kazinnik, A Leonello and E Panetti (2026), “Financial stability in the age of artificial intelligence: The role of algorithmic architecture”, VoxEU.org, 25 May.
Bucher-Koenen, T, A Hackethal, J Koenen and C Laudenbach (2025), “Gender differences in financial advice”, American Economic Review 115: 4218–4252.
Calvet, L E, J Y Campbell and P Sodini (2009), “Fight or flight? Portfolio rebalancing by individual investors”, Quarterly Journal of Economics 124: 301–348.
Choi, J J (2022), “Popular personal financial advice versus the professors”, Journal of Economic Perspectives 36: 167–192.
Choi, J J (2022), “How popular personal finance advice compares to economic theory”, VoxEU.org, 9 October.
Choukhmane, T and T de Silva (2024), “Drivers of investors’ portfolio choices: Separating risk preferences from frictions”, VoxEU.org, 5 July.
Choukhmane, T, T de Silva, W Lin and M Akuzawa (2025), “AI financial advice: Supply, demand, and life cycle implications”, Working Paper.
Cocco, J F, F J Gomes and P J Maenhout (2005), “Consumption and portfolio choice over the life cycle”, Review of Financial Studies 18: 491–533.
D’Acunto, F (2015), “Risk tolerance of men and women”, VoxEU.org, 20 September.
D’Acunto, F, N Prabhala and A Rossi (2019), “The promises and pitfalls of robo-advising”, Review of Financial Studies 32(5): 1983–2020.
Egan, M, G Matvos and A Seru (2019), “The market for financial adviser misconduct”, Journal of Political Economy 127: 233–295.
Gallup (2025), “Americans still turn to people for financial advice”, Gallup News, 13 May.
Guiso, L and G Foà (2015), “Distorted financial advice”, VoxEU.org, 12 October.
J.D. Power (2025), “As more U.S. consumers struggle with rising prices, many turn to artificial intelligence for financial advice”, Press Release, 28 August.
Linnainmaa, J T, B T Melzer and A Previtero (2021), “The misguided beliefs of financial advisors”, Journal of Finance 76: 587–621.
Lloyds Banking Group (2025), “Over 28 million adults now using AI tools to help manage their money”, Press Release, 3 November.
Love, D (2014), “Optimal rules of thumb for personal finance”, VoxEU.org, 27 January.
Merton, R C (1969), “Lifetime portfolio selection under uncertainty: The continuous-time case”, Review of Economics and Statistics 51: 247–257.
Mullainathan, S, M Noëth and A Schoar (2012), “The market for financial advice: An audit study”, American Economic Review 102: 2970–3000.







