
AI in Finance Part 3: Clues in Earnings Calls

How I instruct LLMs to find what matters in earnings calls


This is part 3 in a series on how we used LLMs to do reliable financial research. Dan wrote about earnings releases and 10-Ks. I'm writing this piece on earnings calls.

Earnings calls generate 30-40 pages of transcripts quarterly. We study the entire market, so 30-40 pages per call for 1,000+ companies requires some automation. We have invested in how to do this, but if you're earlier on and starting with just prompting ChatGPT or Claude to summarize these documents, you may benefit from what we've figured out.

(The context in this series is that we are reading these documents as part of forecasting long-term company returns. There are of course other uses for these documents, where the best practices for using LLMs would be different.)

A reminder of our methods: we analyzed earnings calls from every S&P 500 company, and correlated the findings with what ended up in our final forecasts of long-term outcomes. We then took the 50 most interesting companies (as determined by LLMs) and looked for patterns: what did we use the most and the least in our forecasts?

Our findings

  • In earnings calls, actual numbers changed my forecasts a lot (10% or more), while CEO vision statements and forward guidance didn't matter
  • I found companies in distress provide more useful information
  • Divergences in segments matter a lot more than averages across segments. (As we found in other document types.)

Now I'll share more specific findings that you can prompt LLMs to look for, followed by the things I want them to avoid.

Category 1: Segment Divergences (I found this in 42% of companies)

The examples here explain it better than I can in a summary:

Example: Palantir

"U.S. commercial segment grew 93% year-over-year while international commercial declined 3%"

The 10-K showed overall growth. The call revealed dangerous geographic concentration.

Example: Trane Technologies

"Americas Commercial HVAC orders surged 60%+ while residential forecast cut from mid-to-high single digits to flat"

Strength in commercial masked residential weakness, which is critical for modeling mix shift.

Forecast Impact: Changes revenue CAGR by 5-10 percentage points, margin assumptions by 100-300bps

What to Search For:

  • Geographic splits differing >10 percentage points
  • Product lines with opposing trends
  • Customer segments with divergent growth
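Once an LLM has pulled segment growth rates out of a call, the ">10 percentage points" check is easy to automate. Here is a minimal sketch, assuming the growth figures have already been extracted into a dictionary; the function name `find_divergences` and the data layout are hypothetical, not part of our actual pipeline.

```python
from itertools import combinations

def find_divergences(growth_by_segment, threshold_pp=10.0):
    """Return segment pairs whose growth rates differ by more than threshold_pp."""
    flagged = []
    for (seg_a, g_a), (seg_b, g_b) in combinations(growth_by_segment.items(), 2):
        gap = abs(g_a - g_b)  # gap in percentage points
        if gap > threshold_pp:
            flagged.append((seg_a, seg_b, gap))
    # Largest gaps first, since those are the ones most likely to matter
    return sorted(flagged, key=lambda item: item[2], reverse=True)

# Year-over-year growth (%) from the Palantir example above
palantir = {"US commercial": 93.0, "International commercial": -3.0}
divergences = find_divergences(palantir)
print(divergences)  # [('US commercial', 'International commercial', 96.0)]
```

The 10-point threshold comes from the list above; everything else is a design choice you may want to change, such as comparing each segment to the company average instead of pairwise.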

Category 2: Unit Economics and Cohort Profitability (I found this in 38% of companies)

This is similar to segment analysis: which cohorts of users are making how much money? These numbers held surprises that we used a lot in our final forecasts.

Example: UnitedHealth

"Newer value-based care cohorts operating at negative margins while mature cohorts (2021 and prior) achieve 8%+ margins"

This revealed the company was paying for growth, not achieving scale economies. The 10-K only showed aggregate Optum margins.

Example: Trane Technologies

"Applied Solutions generate 8 to 10x the equipment sale in service revenue over the asset's life"

Equipment sales at low margins suddenly make sense when lifetime value is 10x.

Forecast Impact:

  • UnitedHealth: Reduced 2030 margin from 6.5% to 4.95%
  • Trane: Raised long-term margins by 200bps

Category 3: Customer Concentration (I found this in 35% of companies)

In our forecasts, who is buying and why determines revenue quality more than the growth rate does.

Example: Broadcom

"AI compute business concentrated among just 4 customers, with only 7 total qualified prospects globally"

The 10-K mentioned concentration risk. The call quantified it: TAM of 7 companies, not "massive AI opportunity."

Example: UPS

"Amazon volume will decline 30% year-over-year in Q3/Q4"

Specific customer guidance worth $5-6B revenue wasn't in any filing.

Forecast Impact:

  • Broadcom: Reduced 2030 ceiling by 20%, widened confidence intervals
  • UPS: Enabled precise modeling of business transition

Category 4: Timing and Phasing Precision (I found this in 31% of companies)

When revenue arrives can matter to us as much as whether it arrives. When a call gives specific dates, LLMs should pull them out.

Example: American Electric Power

"Fourth customer's entire $10 billion in orders shipping in Q3 FY2026"

The 10-K showed the backlog. The call revealed an entire year of revenue in one quarter.

Example: Alliant Energy

"Load ramp: ~200 MW in 2026, 300-1,000 MW in 2027, full 1,600 MW in 2028"

This phasing schedule explained conservative near-term guidance despite massive backlog.

Forecast Impact: Creates 20-40% quarterly volatility bands, changes cash flow timing by years
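To make phasing usable in a model, it helps to store the schedule as structured data rather than prose. The sketch below encodes the Alliant load ramp quoted above and expresses each year as a share of the full 1,600 MW. The `Phase` class and `share_of_full_load` function are hypothetical names for illustration, and the "~200 MW" figure is treated as exactly 200.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Phase:
    year: int
    low_mw: float   # low end of the stated range
    high_mw: float  # high end of the stated range

FULL_LOAD_MW = 1600

# Load ramp as stated on the call; single figures become a zero-width range
ramp = [
    Phase(2026, 200, 200),
    Phase(2027, 300, 1000),
    Phase(2028, 1600, 1600),
]

def share_of_full_load(phases, full_mw):
    """Map each year to (low, high) fractions of the full load."""
    return {p.year: (p.low_mw / full_mw, p.high_mw / full_mw) for p in phases}

shares = share_of_full_load(ramp, FULL_LOAD_MW)
for year, (low, high) in shares.items():
    print(f"{year}: {low:.1%} to {high:.1%} of full load")
```

The wide 2027 range is the point: most of the uncertainty in this schedule sits in a single year, which is what drives the volatility bands mentioned above.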

Category 5: Management's Problem Admissions (I found this in 27% of companies)

This one is kind of obvious. Any LLM will notice if management acknowledges a failure. Still, it's important not to miss any of these.

Example: UnitedHealth

"Optum Health missing earnings expectations by $6.6 billion... permanently lowering margin target to 6-8%"

Not vague "challenges": management specifically called out a $6.6B miss and a structural reset.

Example: Southwest Airlines

"Open seating has been the primary reason customers leave Southwest"

After years of defending their model, this admission explained the complete reversal.

Forecast Impact: Major strategy pivots require 20-40% reduction in near-term growth, 2-4 year recovery timelines

Category 6: Regulatory Mechanisms (I found this in 29% of companies)

How exactly are regulators regulating? The details of the mechanism affected our forecasts, whereas the general fact that, say, a lawsuit exists doesn't really move the needle. (We found this with earnings releases too.)

Example: AT&T

"The 'One Big Beautiful Bill Act' providing $6.5-8 billion in cash tax savings through 2027, with $3.5 billion to fiber deployment"

Specific legislation, amounts, allocation, and sunset dates—not generic "tax benefits."

Example: Alliant Energy

"Iowa's advance ratemaking eliminates regulatory lag through 2029"

This mechanism transformed a risky story into pre-approved growth.

Forecast Impact:

  • AT&T: Added $2B annual FCF through 2027
  • Alliant: Raised success probability from 40% to 70%

What I Want LLMs to Ignore

I don't even look at the stuff below. LLMs tend to include it anyway, which muddies the actually good information.

  1. CEO quotes about vision - "Positioned for long-term success"
  2. Macro commentary - "Challenging environment ahead"
  3. Strategic priorities - "Three pillars of our strategy"
  4. Known seasonality - "Q4 typically strongest"
  5. Historical review - "As we've discussed before"

These sound important but have zero value for us.
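One way to act on this list is to turn it into explicit exclusion instructions and append them to whatever prompt you use. This is a minimal sketch; the instruction wording and the `exclusion_instructions` helper are my own assumptions, not a tested prompt.

```python
# The five categories above, each with its sample phrase
IGNORE = [
    ("CEO quotes about vision", "Positioned for long-term success"),
    ("Macro commentary", "Challenging environment ahead"),
    ("Strategic priorities", "Three pillars of our strategy"),
    ("Known seasonality", "Q4 typically strongest"),
    ("Historical review", "As we've discussed before"),
]

def exclusion_instructions(items=IGNORE):
    """Build a block of 'do not report' instructions for an LLM prompt."""
    lines = ["Leave the following out of your answer, even if the call spends time on them:"]
    lines += [f'- {label} (for example: "{sample}")' for label, sample in items]
    return "\n".join(lines)

exclusions = exclusion_instructions()
print(exclusions)
```

Including the sample phrase next to each category gives the model a concrete pattern to match, which tends to be clearer than the category name alone.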

My prompt techniques

When I ask an LLM to review an earnings call, I look for answers to these five questions. These can go in a single prompt; that likely works as well as asking the LLM a series of independent questions.

1. "What specific numbers weren't in the press release?"

  • Search: Growth by segment, unit economics, customer metrics
  • Example: UPS revealing 23.3% Ground Saver collapse vs. flat total volume

2. "Where is performance diverging internally?"

  • Search: Segment margins, cohort performance, geography
  • Example: Palantir's 93% U.S. growth vs. -3% international

3. "What is management admitting isn't working?"

  • Search: "Below expectations", "challenging", "exiting"
  • Example: Southwest admitting open seating drives customer losses

4. "What customer dynamics are changing?"

  • Search: Concentration, churn, pricing power
  • Example: Broadcom's AI business limited to 7 global customers

5. "When exactly do things happen?"

  • Search: Quarterly phasing, contract timing, buildout schedules
  • Example: AEP's $10B order all shipping Q3 2026
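The five questions can be assembled into a single prompt along these lines. This is a sketch rather than the exact prompt we run: the questions and search hints come from the list above, while the framing sentences and the `build_prompt` function are assumptions you should adapt.

```python
# (question, what to look for) pairs, taken from the five questions above
QUESTIONS = [
    ("What specific numbers weren't in the press release?",
     "growth by segment, unit economics, customer metrics"),
    ("Where is performance diverging internally?",
     "segment margins, cohort performance, geography"),
    ("What is management admitting isn't working?",
     '"below expectations", "challenging", "exiting"'),
    ("What customer dynamics are changing?",
     "concentration, churn, pricing power"),
    ("When exactly do things happen?",
     "quarterly phasing, contract timing, buildout schedules"),
]

def build_prompt(transcript: str) -> str:
    """Combine all five questions and the transcript into one prompt string."""
    parts = [
        "Review the earnings call transcript below and answer each question.",
        "Quote exact figures and dates; write 'not stated' if the call does not say.",
        "",
    ]
    for number, (question, hints) in enumerate(QUESTIONS, start=1):
        parts.append(f"{number}. {question}")
        parts.append(f"   Look for: {hints}")
    parts += ["", "Transcript:", transcript]
    return "\n".join(parts)

prompt = build_prompt("<paste transcript text here>")
print(prompt)
```

The resulting string can be sent to whichever model you use. Keeping the questions in a list makes it easy to switch to one call per question if the single-prompt version misses things.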

Different prompts for different industries

For convenience, we didn't actually use different prompts for different industries. In hindsight, though, looking at the most interesting findings, asking LLMs to find these types of things may help you:

Technology

  • Useful: Cohort retention, R&D productivity, customer acquisition cost
  • Noise: TAM expansion, generic AI positioning

Healthcare

  • Useful: Payer mix shifts, utilization rates, prior authorization changes
  • Noise: Pipeline dreams without probability adjustment

Industrials

  • Useful: Book-to-bill ratios, pricing realization, capacity utilization
  • Noise: Macro outlook repetition

Utilities

  • Useful: Rate case specifics, recovery mechanisms, allowed ROE
  • Noise: "Constructive regulatory environment"
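If you do want industry-specific prompts, one simple approach is a lookup table that adds a short addendum to a shared base prompt. We did not run this ourselves, so treat it as an untested sketch; `INDUSTRY_HINTS` and `industry_addendum` are hypothetical names, and the hint text is copied from the lists above.

```python
INDUSTRY_HINTS = {
    "Technology": {
        "useful": ["cohort retention", "R&D productivity", "customer acquisition cost"],
        "noise": ["TAM expansion", "generic AI positioning"],
    },
    "Healthcare": {
        "useful": ["payer mix shifts", "utilization rates", "prior authorization changes"],
        "noise": ["pipeline dreams without probability adjustment"],
    },
    "Industrials": {
        "useful": ["book-to-bill ratios", "pricing realization", "capacity utilization"],
        "noise": ["macro outlook repetition"],
    },
    "Utilities": {
        "useful": ["rate case specifics", "recovery mechanisms", "allowed ROE"],
        "noise": ['"constructive regulatory environment"'],
    },
}

def industry_addendum(industry: str) -> str:
    """Return extra prompt text for a known industry, or '' if it is not listed."""
    hints = INDUSTRY_HINTS.get(industry)
    if hints is None:
        return ""  # unknown industry: fall back to the base prompt unchanged
    return (
        f"For this {industry} company, prioritize: {', '.join(hints['useful'])}.\n"
        f"Treat as noise: {', '.join(hints['noise'])}."
    )

print(industry_addendum("Utilities"))
```

Returning an empty string for unlisted industries means the base prompt still works for every company, and the table can grow one industry at a time.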
