Smart money doesn't stay smart

A centralized exchange keeps its order flow to itself. Hyperliquid is a DEX, so every fill is public and signed by the wallet that made it. That's a strange and powerful thing: you can watch every trader on the venue, name by name, month after month. So the obvious question almost asks itself — why not just find the wallets that make money and copy them?

We ran the experiment. Each month we score roughly 109,000 wallets on BTC alone (and tens of thousands more across 260-odd other markets): how much they trade, how often they win, how much they make, and how toxic their flow is, meaning how far the price runs against you in the minutes after they hit your quote. Then we asked the only question a copy-trader cares about: does any of it carry over to next month?
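To make "toxic" concrete, here is a minimal sketch of a single-fill markout. It is an illustration, not our production scoring: the function name, the sign convention, the use of the mid price, and the 300-second default horizon are all assumptions made for the example.

```python
from bisect import bisect_right


def markout_bps(fill_time, fill_price, taker_side, mid_times, mid_prices, horizon_s=300):
    """Signed markout of one fill, in basis points (hypothetical helper).

    taker_side is +1 when the wallet bought and -1 when it sold.
    A positive result means the price moved in the wallet's favour after
    the fill, i.e. against the maker whose quote was hit.
    mid_times must be sorted ascending and aligned with mid_prices.
    """
    # Index of the last mid observed at or before fill_time + horizon_s.
    idx = bisect_right(mid_times, fill_time + horizon_s) - 1
    if idx < 0:
        raise ValueError("no mid price available at or before the horizon")
    later_mid = mid_prices[idx]
    return taker_side * (later_mid - fill_price) / fill_price * 1e4


# Toy data: timestamps in seconds, made-up mid prices.
mid_times = [0, 60, 300, 600]
mid_prices = [100.0, 100.1, 100.5, 101.0]

buy_markout = markout_bps(0, 100.0, +1, mid_times, mid_prices)
sell_markout = markout_bps(0, 100.0, -1, mid_times, mid_prices)
```

A wallet-level toxicity score would then aggregate these per-fill values over the month; how to weight them (by count, by notional) is a design choice this sketch leaves open.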

Last month's hero, next month's goat

Here's the same picture for three different wallet traits. Each grid sorts wallets into deciles by a trait this month (bottom-to-top) and asks where they land next month (left-to-right). A bright diagonal means "you stay where you were": the trait persists. Brightness spread evenly across the grid means it's a coin toss.

Figure: decile transition grids for three wallet traits.

Left panel, win rate. A clean diagonal: a wallet's win rate lands where it started. Style persists.

Middle panel, realized PnL. An X, not a line: last month's biggest winners are about as likely to crash to the bottom as to repeat. That's variance, not skill.

Right panel, toxicity (5-min markout). Almost flat: informed, toxic flow barely repeats at the wallet level month to month.

Rows = this month's decile (low at bottom); columns = next month's decile. Brighter = higher probability. Pooled over 10 major markets, 2025-08..2026-06, wallets active in both months.

Look at the middle panel. If trading profit were skill, big winners would stay winners and the grid would light up along the diagonal like the win-rate panel on the left. Instead it forms an X: the wallets with the largest realized PnL this month are almost as likely to plunge to the bottom decile next month as to repeat at the top. That's the fingerprint of variance, not edge — the same accounts take the biggest swings in both directions. The bright core in the centre is the quiet majority whose PnL hovers near zero and stays there. Toxicity (right) barely persists at all.
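For readers who want to rebuild a grid like these from their own data, a decile transition matrix can be sketched in a few lines. This is an assumed reconstruction, not the exact code behind the figures: the function names are hypothetical, ties are broken by input order, and each row is normalised so it reads as "given this month's decile, the probability of each decile next month".

```python
def decile_ranks(values):
    """Map each value to a decile 0..9 by rank (0 = lowest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    deciles = [0] * len(values)
    for rank, i in enumerate(order):
        deciles[i] = min(9, rank * 10 // len(values))
    return deciles


def transition_matrix(this_month, next_month):
    """Row-normalised 10x10 matrix of P(next decile | this decile).

    Both arguments are dicts of wallet -> trait value. Only wallets
    present in both months are used.
    """
    wallets = sorted(set(this_month) & set(next_month))
    d_now = decile_ranks([this_month[w] for w in wallets])
    d_next = decile_ranks([next_month[w] for w in wallets])
    counts = [[0] * 10 for _ in range(10)]
    for a, b in zip(d_now, d_next):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Toy check: a perfectly persistent trait lights up only the diagonal.
trait = {f"w{i}": float(i) for i in range(100)}
persistent = transition_matrix(trait, trait)
reversed_trait = {w: -v for w, v in trait.items()}
flipped = transition_matrix(trait, reversed_trait)
```

A memoryless trait would put roughly 0.1 in every cell of a row; the X pattern shows up as extra mass in both the first and last columns of the top and bottom rows.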

What does persist is style, not skill

Rank every trait by how strongly it carries month-to-month (Spearman correlation of a wallet with itself, one month later). The pattern is stark. The things that persist describe who a wallet is and how it trades — how much size it pushes, the fees it pays, its habitual win rate. The one thing that doesn't persist is the thing you'd actually want to copy: whether it makes money.

Month-to-month persistence by trait (Spearman ρ):

Trading volume           0.81
Fees paid                0.80
Win rate                 0.22
Markout coverage         0.19
Realized PnL             0.08
PnL per volume           0.04
Toxicity (5m markout)    0.01

Colour key: who they are · how they trade · whether they profit

Trading volume is highly sticky (ρ ≈ 0.81): whales stay whales. Win rate is moderately sticky (ρ ≈ 0.22), but a high win rate is a style: scalpers bank many tiny gains and take rare large losses, so a wallet's win rate tells you how it trades, not whether it profits. Realized PnL (ρ ≈ 0.08) and 5-minute toxicity (ρ ≈ 0.01) are essentially memoryless. Copying last month's leaderboard is copying a coin flip.
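The persistence numbers above are rank correlations of a wallet's trait with the same trait one month later. Here is a dependency-free sketch of that calculation, assuming one value per wallet per month; the helper names are ours for illustration, and tied values share their average rank, which is the usual Spearman convention.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def persistence(this_month, next_month):
    """Spearman correlation of a trait across two months.

    Arguments are dicts of wallet -> trait value; wallets missing from
    either month are dropped.
    """
    wallets = sorted(set(this_month) & set(next_month))
    return pearson(
        average_ranks([this_month[w] for w in wallets]),
        average_ranks([next_month[w] for w in wallets]),
    )


# Toy data: ranks are preserved even though magnitudes are not.
june = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0, "only_june": 9.0}
july = {"a": 10.0, "b": 20.0, "c": 30.0, "d": 400.0}
rho = persistence(june, july)
```

Using ranks rather than raw values matters here: PnL is fat-tailed, and a plain Pearson correlation would be dominated by a handful of extreme wallets.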

The shape of the crowd

Two more facts explain why the leaderboard is so noisy. First, slightly more wallets lose than win: across every full month, 54% of the BTC wallets that trade close it with negative realized PnL. The distribution is fat-tailed and roughly balanced around zero, with a few big winners, a few big losers, and a huge pile clustered near break-even before fees.

Figure: Monthly realized PnL per wallet · BTC (signed-log $ axis, ticks at -$562K, -$561, -$1, +$2K, +$562K). Bars are split into wallets that lost money and wallets that made money.

Volume concentration · BTC, 2026-06

The dashed line is perfect equality. The gap below it is the reality: the top 1% of wallets drive 89% of volume; the bottom 98% together trade under 7%. Gini 0.987.

Second, volume is extraordinarily concentrated. A handful of accounts do almost all the trading, and the long tail barely moves size. So a naive "top PnL" ranking is dominated by whoever happened to take the biggest position into the biggest move — survivorship and variance wearing the costume of skill.
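Both concentration statistics quoted for volume, the Gini coefficient and the share traded by the top slice of wallets, are easy to compute from a list of per-wallet volumes. The sketch below uses the standard sorted-rank formula for Gini; the function names and the toy numbers are illustrative assumptions, not our data.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0 means perfectly equal; values near 1 mean a few entries hold
    almost everything.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Sorted-rank formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


def top_share(values, fraction):
    """Share of the total held by the largest `fraction` of entries."""
    xs = sorted(values, reverse=True)
    k = max(1, round(len(xs) * fraction))
    return sum(xs[:k]) / sum(xs)


# Toy venue: 99 small wallets and one whale (made-up volumes).
volumes = [1.0] * 99 + [901.0]
whale_share = top_share(volumes, 0.01)
concentration = gini(volumes)
```

With equal volumes the Gini is 0; if a single wallet out of n did all the trading it would be (n - 1) / n, which is why values such as 0.987 only appear when almost all size sits with a tiny group.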

Where the edge actually is

So is wallet data useless? The opposite — but the edge isn't in any single column, and it isn't in copying winners. It's in the interaction of many weak, orthogonal behaviours (they share a mean absolute correlation of only ≈0.11). No one of them forecasts next month; combined in a non-linear model, they do. That's the productized layer: a tree that scores every wallet's next-month PnL and toxicity, validated out-of-sample the hard way.

+0.11 · next-month PnL, out-of-sample IC · replicated on unseen wallets & coins

+0.04 · next-month toxicity, out-of-sample IC · the only forward-toxicity signal

≈0 · coin price return, negative control · forecasts behaviour, not price

Those numbers survive the tests that kill most "smart money" claims. We measure walk-forward: each month is scored only by a model fit on earlier months. We hold out whole coins (leave-one-coin-out), so the reported skill is on markets the model never trained on, and it replicates. A no-shared-wallet variant, where train and test wallets don't overlap, actually strengthens it: the model learns behaviour, not a memorized list of addresses. And the telltale negative control, whether the same signal predicts the coin's price return, comes back at ≈0. This forecasts wallet behaviour, not the market.
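The three splitting schemes described above can be sketched as plain generators. This is one plausible way to set them up, not a description of our pipeline: the row layout (dicts with "month", "coin" and "wallet" keys), the function names and the min_train parameter are all assumptions for the example.

```python
def walk_forward_splits(months, min_train=3):
    """Yield (train_months, test_month) pairs.

    Each test month is paired only with months strictly before it.
    Months are ISO strings such as "2025-08", so sorting them as text
    also sorts them in time.
    """
    ordered = sorted(set(months))
    for i in range(min_train, len(ordered)):
        yield ordered[:i], ordered[i]


def leave_one_coin_out(rows):
    """Yield (held_out_coin, train_rows, test_rows), one fold per coin."""
    for coin in sorted({r["coin"] for r in rows}):
        train = [r for r in rows if r["coin"] != coin]
        test = [r for r in rows if r["coin"] == coin]
        yield coin, train, test


def drop_shared_wallets(train_rows, test_rows):
    """Remove from the training set any wallet that also appears in the test set."""
    test_wallets = {r["wallet"] for r in test_rows}
    return [r for r in train_rows if r["wallet"] not in test_wallets]


# Toy rows with made-up wallets.
rows = [
    {"month": "2025-08", "coin": "BTC", "wallet": "0xa"},
    {"month": "2025-08", "coin": "ETH", "wallet": "0xa"},
    {"month": "2025-09", "coin": "ETH", "wallet": "0xb"},
    {"month": "2025-09", "coin": "SOL", "wallet": "0xc"},
]
splits = list(walk_forward_splits(["2025-10", "2025-08", "2025-09", "2025-11"], min_train=2))
folds = list(leave_one_coin_out(rows))
```

In practice the schemes compose: within each walk-forward step you can hold out a coin and then strip shared wallets from the training rows, which is the strictest version of the test.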

What we're not claiming

Straight talk on the limits. The edge is small and it's a feature, not a buy button — you stack it into your own model, you don't trade the score directly. The win is non-linear: an interpretable equal-weight blend of these columns doesn't beat the best single feature, so there's no tidy formula to hand you, only the model. And it's validated inside a single bear-and-chop regime (late 2025 to mid 2026); we'll keep re-testing as the tape turns. Anyone quoting a huge information ratio off IC × √wallets is selling you fiction — wallets aren't independent bets.
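To see why IC × √wallets overstates things, it helps to plug in correlated bets. The sketch below uses the textbook equicorrelation shortcut, where N bets with average pairwise correlation ρ behave like N / (1 + (N − 1)ρ) independent ones. That shortcut is a stylized assumption, and the ρ = 0.05 used here is a made-up illustration, not a number we measured.

```python
import math


def effective_breadth(n_bets, avg_corr):
    """Number of independent bets implied by n equally correlated ones."""
    return n_bets / (1 + (n_bets - 1) * avg_corr)


def naive_ir(ic, n_bets):
    """IC x sqrt(N), which is only valid if every bet is independent."""
    return ic * math.sqrt(n_bets)


def adjusted_ir(ic, n_bets, avg_corr):
    """Same rule of thumb, with breadth shrunk for correlation."""
    return ic * math.sqrt(effective_breadth(n_bets, avg_corr))


# Illustrative inputs: the PnL IC from this post, a round wallet count,
# and a hypothetical average correlation between wallet outcomes.
ic, wallets, rho = 0.11, 100_000, 0.05
headline = naive_ir(ic, wallets)          # about 34.8
realistic = adjusted_ir(ic, wallets, rho)  # about 0.49
```

Even a small average correlation collapses 100,000 wallets to about 20 effective bets in this toy setup, which is the gap between a headline number and one you could defend.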

See it for yourself

The raw behaviour underneath this post — signed per-wallet flow — is what makes it possible, and it's the one signal family you cannot rebuild from public price data. The scored forecasts ship as gold_wallet_factors_1mo: a monthly per-wallet rank and in-cohort percentile for next-month PnL, PnL-per-volume and toxicity. Browse the full column dictionary, then check pricing for access.