
On-chain analytics is the field of inferring real-world identity, intent, and risk from public blockchain data. Every transaction you have ever signed sits in a ledger that anyone can read, forever. There is no admin who can hide it, no expiry date, no privacy setting. Specialised firms — and increasingly, hobbyists with a laptop — make a living out of turning that raw history into named entities, behavioural profiles, and risk scores.
If you have ever sent crypto to or from a major exchange, paid a freelancer, minted an NFT, or tipped a creator, you already have an on-chain footprint. Understanding what analysts can and cannot see is the first step to deciding how much you care, and what to do about it.
What on-chain analytics actually does
The work breaks down into three core operations that build on each other.
1. Address clustering. Blockchains expose individual addresses, but a single person or business typically controls many. Analysts group these addresses into clusters using heuristics: patterns that are not proof but hold often enough to be useful. The classic example on Bitcoin is the common-input-ownership heuristic: if a single transaction spends from multiple inputs, those inputs almost certainly belong to the same wallet, because whoever built the transaction needed the keys to sign for every one of them. (Collaborative transactions such as CoinJoins are the deliberate exception.) Change-address detection, which means spotting which output is the leftover sent back to the sender, is another. Together they collapse thousands of raw addresses into a much smaller graph of entities.
2. Entity labelling. A cluster is just a blob of addresses until someone attaches a name. Labels come from external evidence: deposit addresses scraped from exchange sign-ups, addresses published in court filings, sanctions lists like OFAC, hacker wallets disclosed by victims, and addresses people post themselves on Twitter or Etherscan. Once one address in a cluster is labelled, the label propagates to the whole cluster.
3. Flow analysis. With clusters labelled, analysts can follow funds across hops: who sent to whom, how much, when, and through which intermediaries — including mixers, bridges, and exchanges.
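To make the first two operations concrete, here is a minimal Python sketch of common-input clustering followed by label propagation. It is an illustration under stated assumptions, not how any of the firms below actually work: the transactions, the address names, and the helper functions `cluster_addresses` and `propagate_labels` are all hypothetical, and real tools layer on change detection, CoinJoin filtering, and many other heuristics.

```python
from collections import defaultdict

# Hypothetical toy data: each transaction lists the addresses it spends
# from (inputs) and the addresses it pays to (outputs).
TXS = [
    {"inputs": ["A1", "A2"], "outputs": ["X1"]},
    {"inputs": ["A2", "A3"], "outputs": ["X2"]},
    {"inputs": ["B1"], "outputs": ["A1"]},
]


class UnionFind:
    """Disjoint-set structure used to merge addresses into clusters."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


def cluster_addresses(txs):
    """Group addresses using only the common-input-ownership heuristic."""
    uf = UnionFind()
    for tx in txs:
        # Register every address so lone addresses form their own cluster.
        for addr in tx["inputs"] + tx["outputs"]:
            uf.find(addr)
        # All inputs of one transaction are assumed to share an owner.
        for addr in tx["inputs"][1:]:
            uf.union(tx["inputs"][0], addr)
    clusters = defaultdict(set)
    for addr in list(uf.parent):
        clusters[uf.find(addr)].add(addr)
    return list(clusters.values())


def propagate_labels(clusters, known_labels):
    """Spread a known label from one address to its whole cluster."""
    labelled = {}
    for cluster in clusters:
        names = {known_labels[a] for a in cluster if a in known_labels}
        # Conflicting labels inside one cluster are left unresolved here.
        name = names.pop() if len(names) == 1 else None
        for addr in cluster:
            labelled[addr] = name
    return labelled


clusters = cluster_addresses(TXS)
labels = propagate_labels(clusters, {"A3": "Example Exchange (hypothetical)"})
```

In this toy data, A1, A2 and A3 end up in one cluster because A2 appears as an input alongside each of the other two, so a label known only for A3 is inherited by A1 and A2 as well. B1 merely paid A1, which is a flow rather than evidence of shared ownership, so it stays unlabelled.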
Who does on-chain analytics
A handful of firms dominate the commercial side, and a long tail of independent researchers fill in the gaps.
- Chainalysis (chainalysis.com) is the oldest and best-known. It sells tooling to exchanges for compliance and to US law enforcement for investigations, and publishes a widely cited annual Crypto Crime Report.
- Elliptic (elliptic.co) is UK-headquartered and focused on financial-crime compliance for banks, exchanges, and government bodies.
- TRM Labs (trmlabs.com) plays in a similar lane, with an emphasis on cross-chain risk scoring.
- Arkham (arkm.com) is more consumer-facing: it provides a free explorer with entity labels and runs a public Intel Exchange where users can post bounties for de-anonymising specific addresses.
- Nansen (nansen.ai) focuses on labelling DeFi and NFT wallets ("smart money", funds, market makers) for the benefit of traders rather than investigators.
Mentioning these firms is not an endorsement or a criticism. They simply define what is technically and commercially possible today. Beyond them, individual researchers — ZachXBT being the most visible — routinely publish investigations using free tools alone. The capability is no longer rare.
Pseudonymous is not anonymous
This is the point most newcomers miss. Bitcoin, Ethereum, and the great majority of other chains are pseudonymous: your real name is not written on-chain, but a permanent, public, machine-readable record of your activity is. Anonymity would mean an analyst cannot connect activity to a person. Pseudonymity only means they have not connected it yet.
The link is rarely cryptographic. It is usually mundane:
- A know-your-customer check at a centralised exchange ties a cluster of deposit and withdrawal addresses to your passport.
- A one-time off-ramp through a peer-to-peer trader who later cooperates with investigators.
- A wallet address posted in a tweet, a forum bio, a fundraising page, or an ENS name.
- An NFT mint from a wallet associated with a doxxed pseudonym.
- A dust attack — sending you a tiny amount that, if you ever spend it alongside your other coins, fingerprints them together.
Once any single address in your cluster is identified, every past transaction it touched and every future one it touches inherits that identity. There is no "delete history" button, and there never will be. The window for plausible deniability shrinks with every transaction you sign.
What ordinary users can do
You cannot opt out of on-chain analytics once your coins are on-chain, but you can shrink your attack surface with a few habits.
- Use a fresh receiving address for unrelated relationships. Most modern wallets, SSP included, generate a new receive address each time on purpose — accept the default, do not hand out the same one repeatedly. This is most useful for incoming payments; spending still draws from your wider UTXO set and can re-link clusters, but every little bit reduces the signal.
- Be wary of dust. If a wallet you do not recognise sends you a few hundred satoshis or a worthless token, do not consolidate it with your real balance. Many wallets flag suspected dust automatically; leave it parked.
- Do not publicise addresses you actually use. Tweeting "send tips to bc1q..." links a public identity to a cluster forever. If you must publish an address, make it a dedicated one that you never use for anything sensitive.
- Treat KYC venues as identity anchors. Anything that enters or leaves an exchange in your name is potentially attributable. Plan accordingly.
- Approach privacy tools with eyes open. Coinjoin services, privacy-focused chains, and mixers all carry regulatory, counterparty, or technical risks of their own. This article is not a recommendation either way — just a reminder that "privacy" is not a free feature you can bolt on after the fact.
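The "leave dust parked" habit from the list above can be sketched as a coin-selection filter. This is a minimal illustration, not how SSP or any particular wallet implements it: the `Utxo` type, the `from_known_sender` flag, the 1,000-satoshi threshold, and the greedy largest-first strategy are all assumptions chosen for clarity, and fees are ignored.

```python
from dataclasses import dataclass

# Assumed cut-off for illustration only; real wallets pick their own.
DUST_THRESHOLD_SATS = 1_000


@dataclass(frozen=True)
class Utxo:
    txid: str
    value_sats: int
    from_known_sender: bool  # hypothetical flag set by the wallet or user


def spendable(utxos, threshold=DUST_THRESHOLD_SATS):
    """Keep UTXOs that are safe to co-spend; suspected dust stays parked."""
    return [
        u for u in utxos
        if u.from_known_sender or u.value_sats >= threshold
    ]


def select_coins(utxos, target_sats):
    """Greedy largest-first selection that never touches suspected dust."""
    chosen, total = [], 0
    ordered = sorted(spendable(utxos), key=lambda u: u.value_sats, reverse=True)
    for u in ordered:
        if total >= target_sats:
            break
        chosen.append(u)
        total += u.value_sats
    if total < target_sats:
        raise ValueError("insufficient non-dust funds")
    return chosen


wallet = [
    Utxo("a", 50_000, True),
    Utxo("b", 30_000, True),
    Utxo("c", 546, False),  # tiny amount from an unknown sender
]
picked = select_coins(wallet, 70_000)
```

Here `picked` contains only UTXOs "a" and "b". The 546-satoshi output from the unknown sender is never offered to the selector, so it cannot appear as an input next to your real coins and hand an analyst a common-input link.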
Where to go from here
On-chain analytics is not a story about catching criminals or about Big Brother — it is just the natural consequence of running a permanent, public ledger. The sooner you treat your wallet history as public-by-default, the better the decisions you will make about which addresses to reuse, which venues to trust, and how much to share.
If you want to keep tightening up the basics, two related reads worth your time:
- Seed phrase best practices — because the cleanest on-chain footprint in the world will not save you if your seed leaks.
- What is 2-of-2 multisig? — SSP's threat model and why two independent keys raise the bar for any attacker who compromises a single device.