
Profound’s Black Friday Index: What the Leaderboard Can’t Tell You

Kevin McCabe

CRO

May 30, 2026


Profound’s Black Friday Index ranks brands by AI visibility score across shopping categories. Every brand gets a numbered position and a score to one decimal place. The format communicates a settled rank order with clean separation between each position.

None of the scores carry a confidence interval. That means the reader can see that Amazon scored 93.1% and Best Buy scored 91.4% in electronics, but can’t tell whether that 1.7-point gap represents a real difference or falls within the normal range of measurement uncertainty. Without confidence bands, a leaderboard presents every rank difference as meaningful, whether the gap is 40 points or 0.3 points.

How the Index works

Profound generated 50 unbranded Black Friday prompts per category — half broad shopping intent, half with price or feature constraints. Brand visibility was scored based on how frequently each brand appeared in ChatGPT’s answers. The Index covers beauty, electronics, appliances, wearables, shoes, and other categories.

Profound discloses the prompt count, which is more than most vendors provide and makes the math auditable. The methodology doesn’t disclose how many times each prompt was run, doesn’t report confidence intervals on any score, and doesn’t indicate whether the rankings would reproduce if the same prompts were fired again the following week.

What the missing confidence intervals hide

Every score on the Index sits inside a confidence band the leaderboard doesn’t show. Without those bands, the reader can’t tell which rank differences are real and which ones would shuffle on a different measurement pass. The disclosed sample size of 50 prompts lets us estimate how wide those bands are, and in several categories they’re wide enough that adjacent brands are statistically indistinguishable.
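
As a rough sketch of how such a band can be estimated, the snippet below computes a 95% Wilson score interval for a single score. It assumes each score is a simple proportion of 50 independent prompts in which the brand appeared. The Index does not state how scores are aggregated or how many runs sit behind each prompt, so the real bands could be wider or narrower. The function name wilson_interval is illustrative.

```python
import math

def wilson_interval(score_pct, n_prompts=50, z=1.96):
    """Return the (low, high) 95% Wilson score interval, in percent.

    Assumption: score_pct is a binomial proportion over n_prompts
    independent prompts. The Index does not confirm this.
    """
    p = score_pct / 100.0
    denom = 1.0 + z**2 / n_prompts
    center = (p + z**2 / (2 * n_prompts)) / denom
    half = z * math.sqrt(
        p * (1 - p) / n_prompts + z**2 / (4 * n_prompts**2)
    ) / denom
    return 100 * (center - half), 100 * (center + half)

# Scores quoted in this article.
for score in (93.1, 91.4, 45.0):
    low, high = wilson_interval(score)
    print(f"{score:5.1f}% -> roughly {low:.1f}% to {high:.1f}%")
```

Under these assumptions, a score in the low 90s carries a band about 15 points wide, and a score near 45% carries one more than 25 points wide.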

Electronics — Amazon (93.1%) vs Best Buy (91.4%): A 1.7-point gap. At 50 prompts, the confidence bands around both scores overlap heavily. The leaderboard ranks them as #1 and #2; the data can’t separate them.

Appliances — Walmart (74.1%) vs Best Buy (69.3%): A 4.8-point gap. The bands overlap here too. The leaderboard shows a clear first and second. The data shows two scores sitting inside each other’s uncertainty range.

Shoes — Walmart (35.6%) vs Nike (35.3%): A 0.3-point gap. These scores are functionally identical. The leaderboard presents them as second and third. They are statistically the same number.

Beauty — Ulta (85.9%) vs Sephora (45%): A 40.9-point gap. Here the bands don’t overlap. This is a real difference the data supports. But the reader has no way to know that from the leaderboard. Nothing on the Index distinguishes the Ulta-Sephora gap (real) from the Amazon-Best Buy gap (noise). Both are presented the same way: a rank position and a decimal-point score.
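
The four comparisons above can be checked with a short script. This is a minimal sketch, assuming each score is a binomial proportion over 50 independent prompts and using 95% Wilson intervals; the helper names are illustrative and are not part of Profound's methodology.

```python
import math

def wilson_interval(score_pct, n=50, z=1.96):
    # Assumption: the score is a binomial proportion over n independent prompts.
    p = score_pct / 100.0
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return 100 * (center - half), 100 * (center + half)

def bands_overlap(score_a, score_b, n=50):
    """True when the two 95% bands share at least one point."""
    low_a, high_a = wilson_interval(score_a, n)
    low_b, high_b = wilson_interval(score_b, n)
    return low_a <= high_b and low_b <= high_a

# Scores as quoted in this article.
pairs = {
    "Electronics: Amazon vs Best Buy": (93.1, 91.4),
    "Appliances: Walmart vs Best Buy": (74.1, 69.3),
    "Shoes: Walmart vs Nike": (35.6, 35.3),
    "Beauty: Ulta vs Sephora": (85.9, 45.0),
}

results = {name: bands_overlap(a, b) for name, (a, b) in pairs.items()}
for name, overlap in results.items():
    print(f"{name}: {'bands overlap' if overlap else 'bands separate'}")
```

Band overlap is a conservative screen rather than a formal significance test, but it is enough to show which of these gaps the sample size can and cannot resolve.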

A larger sample size would narrow the bands and make more rank differences distinguishable. But the core issue isn’t the sample size; it’s the absence of confidence intervals. At any sample size, a leaderboard without confidence bands asks the reader to trust that every rank position is meaningful without providing the information to verify it.
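
To illustrate how slowly the bands narrow, here is a small sketch using the simpler normal-approximation (Wald) half-width instead of a Wilson interval. It again assumes the score is a proportion over independent prompts; the prompt counts other than 50 are hypothetical.

```python
import math

def approx_half_width(score_pct, n_prompts, z=1.96):
    """Normal-approximation half-width of a 95% band, in points.

    Assumption: score_pct is a proportion over n_prompts independent prompts.
    """
    p = score_pct / 100.0
    return 100 * z * math.sqrt(p * (1 - p) / n_prompts)

# 50 is the disclosed prompt count; the larger counts are hypothetical.
for n in (50, 200, 800, 3200):
    print(f"n={n:5d}: about +/-{approx_half_width(91.4, n):.1f} points")
```

The half-width shrinks with the square root of the sample size, so quadrupling the prompt count only halves the band. Under this approximation, a gap as small as the 1.7 points in electronics would need several thousand prompts before the two bands stop overlapping.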

What the leaderboard format communicates

A leaderboard assigns every entry a numbered position and displays them in descending order. That format tells the reader the ranking is resolved. First is ahead of second. Second is ahead of third. The order is the finding.

Without confidence bands alongside each score, that communication is unsupported. Some gaps on the Index are large enough to be real. Some are fractions of a point. The leaderboard treats them all identically. The format makes the judgment for the reader by presenting every position as settled, without showing whether the data actually settles it.

Why this matters

A brand team benchmarking against the Index might set a goal of moving from fifth to second in their category by next Black Friday. Whether that’s a meaningful target or an already-achieved position depends on the confidence bands around both scores. The brand could already be statistically tied with second, or further away than the leaderboard suggests.

The same applies to vendor conversations. A pitch built around “we’ll move you up three positions on the Profound Index” is selling a rank improvement that the data may not be able to distinguish from noise. Whether two positions are genuinely different or statistically tied depends on the gap between their scores and the width of the bands.

The ask

Four questions to put against any AI visibility leaderboard:

How many times was each prompt run per measurement window?
What confidence range surrounds each reported score?
Would the ranking reproduce if the same prompts were run again next week?
Which adjacent rank positions are statistically separable and which are statistically tied?

The Profound Index discloses the prompt count (50 per category), but it doesn’t answer any of these four questions. Adding confidence bands to each score and flagging which rank differences are statistically meaningful would turn the Index from a leaderboard into a measurement.

Frequently asked questions

Is Ulta really number one in AI visibility for beauty during Black Friday?

Ulta scored 85.9% and Sephora scored 45%. That gap is large enough that the confidence bands don't overlap, so the data supports Ulta being well ahead of Sephora. But the reader only knows that because we computed the bands independently. The Index itself doesn't surface that information, and it doesn't distinguish this real gap from the gaps in other categories that are too small to be meaningful.

Are Amazon and Best Buy really #1 and #2 in electronics?

At 93.1% and 91.4%, the 1.7-point gap falls inside the confidence bands. The data can't separate them. They could be tied, or their order could reverse on a different measurement pass. The leaderboard presents them as first and second; the math doesn't support that distinction.
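
One way to get a feel for how often the order could reverse is a small simulation. This sketch makes two strong assumptions: that the published scores are the true appearance rates, and that each of the 50 prompts is an independent yes/no draw. The function name and trial count are illustrative.

```python
import random

def reversal_rate(rate_a, rate_b, n_prompts=50, trials=10000, seed=0):
    """Estimate how often brand B ties or beats brand A on a rerun.

    Assumptions: rate_a and rate_b are the true appearance rates, and
    every prompt is an independent yes/no draw.
    """
    rng = random.Random(seed)
    flips = 0
    for _ in range(trials):
        hits_a = sum(rng.random() < rate_a for _ in range(n_prompts))
        hits_b = sum(rng.random() < rate_b for _ in range(n_prompts))
        if hits_b >= hits_a:
            flips += 1
    return flips / trials

electronics = reversal_rate(0.931, 0.914)  # Amazon vs Best Buy
beauty = reversal_rate(0.859, 0.45)        # Ulta vs Sephora
print(f"Electronics tie-or-flip rate: {electronics:.0%}")
print(f"Beauty tie-or-flip rate: {beauty:.0%}")
```

Under these assumptions, the electronics pair ties or trades places on a large share of simulated reruns, while the beauty pair almost never does.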

What would make the Index more reliable?

Confidence intervals on every score, so a reader can see where the uncertainty lies. A disclosure of how many times each prompt was run. And a visual indicator showing which adjacent positions are statistically separable — even something as simple as grouping brands into statistical bands rather than assigning each one a unique rank position.
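
Grouping into statistical bands could look something like the sketch below. The grouping rule is hypothetical: a brand shares a tier with the top brand of that tier whenever their 95% Wilson bands overlap, assuming each score is a proportion over 50 independent prompts. Only the brands quoted in this article are used, so each category shows two entries.

```python
import math

def wilson_interval(score_pct, n=50, z=1.96):
    # Assumption: the score is a binomial proportion over n independent prompts.
    p = score_pct / 100.0
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return 100 * (center - half), 100 * (center + half)

def assign_tiers(scores, n=50):
    """Map each brand to a tier number (hypothetical grouping rule).

    Brands are walked in descending score order. A new tier starts when
    a brand's upper bound falls below the lower bound of the current
    tier's top brand, meaning the two bands no longer overlap.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    tiers = {}
    tier = 0
    anchor_low = None
    for brand, score in ranked:
        low, high = wilson_interval(score, n)
        if anchor_low is None or high < anchor_low:
            tier += 1
            anchor_low = low
        tiers[brand] = tier
    return tiers

electronics = assign_tiers({"Amazon": 93.1, "Best Buy": 91.4})
beauty = assign_tiers({"Ulta": 85.9, "Sephora": 45.0})
print(electronics)
print(beauty)
```

With this rule, Amazon and Best Buy land in the same tier, while Ulta and Sephora land in different ones, which matches the distinction the rank numbers alone cannot show.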

Is this a problem specific to Profound's Index?

No. Any leaderboard in AI visibility that reports rank positions without confidence intervals has the same issue. The Profound Index is more transparent than most because it discloses the prompt count, which lets a reader estimate the bands independently. The gap is that the Index doesn't do that estimation for them.

Kevin McCabe is CRO at IQRush. If you want to see how your brand’s AI visibility holds up under the same measurement framework described here, book a 30-minute walkthrough.