Identity Is the Dark Matter & Energy of Our World

Defining identity is a notoriously difficult endeavor. Identity, in the technology industry, is the name of a business vertical that developers and software vendors use to describe a class of products and services they employ for managing users, authentication, and access control. But the average person doesn’t define identity in this way. They’re more likely to describe identity as how a person or thing defines itself and is known to the world around it. I have always thought it ironic that an industry vertical named ‘Identity’ rarely produces solutions that extend beyond a narrow sliver of what identity truly is. In this post, I will present a conceptual model for identity that defines it as an expansive, organic system, and my stretch goal is that it changes how you perceive your interactions in the world (at least a little).

Identity as a Cosmological Analog

According to Dark Matter & Energy theory, the atoms, particles, and energy that people commonly believe to be the sum total of the universe are known as Baryonic matter. Interestingly, and in contrast with human perception, the theory postulates that this form of matter makes up only ~5% of the total matter and energy in the universe. The vast majority of the universe is said to be composed of two other forms, Dark Matter and Dark Energy. I find it fascinating that there can be so much beyond our perception that goes into producing the relatively tiny sliver of what we recognize as our world.

A visualization of Dark Matter – NASA, ESA, and E. Hallman (University of Colorado, Boulder)

I posit to you that your perception of the countless interactions and exchanges you participate in on a daily basis is the Baryonic equivalent of their true scope and depth – just 5% of what’s really happening. The 95% you don’t see – the organic, interconnected web that permeates everything you experience – is identity. In this way, identity can be described as the Dark Matter & Energy of our physical and digital lives.

Identity Transactions All Around Us

Every object on the planet (alive or inanimate) is a distinct entity, each with its own identity – this includes humans, organizations, devices, bots, VR objects, and countless other nouns. But the rabbit hole goes much deeper: the identities in this swirl are constantly interacting with one another in ways you probably don’t recognize as identity transactions. What these identities generate, exchange, and experience during their interactions are transfers of identity-encoded data that augment their state, definition, and accrued provenance.

All identity interactions are generally based on the transfer of self-signed or multi-party proofs; attestations, as many in the industry refer to them. These identity-linked proofs are what form the basis of trust and auditability when interacting with other identities. Here are just a few examples of identity transactions in our world:

  • Text, vocal, or visual communications – identity-encoded message transmissions
  • Licenses, permits, certificates, etc. – identity-signed declarations of privilege
  • Blogs, reviews, comments, etc. – self-signed attestations about other identities
  • Music, paintings, novels, photos, etc. – identity-encoded asset authorship proofs
  • Sales, ads, bids, asks, etc. – identity-signed signals of offer or intent
  • Supply chains, ownership histories, etc. – a trail of identity-signed proofs

As you can see, almost everything you do in life is an identity-encoded transaction, but much like Dark Matter & Energy, these transactions are currently difficult to capture and record. In the identity technology sphere, we lack a good technical system for processing and storing these exchanges in a way that is precise, self-sovereign, discoverable, and universally interoperable. Most of these interactions are either never recorded or are captured by a sea of apps and services that lock them in silos and walled gardens, often sharing the resulting data with unintended third parties.

Enter The Blockchain

We talk about identity, proofs, and signatures as the fundamental building blocks of every interaction between entities, but what technical options are best suited for this? In my early work at Mozilla, and more serious development now at Microsoft and the Decentralized Identity Foundation, we found that decentralized blockchains provide a solid foundation for anchoring Decentralized Identifiers and DPKI operations. By using decentralized blockchains as a root of trust for anchoring identifiers and their association with cryptographic keys and off-chain personal datastores, you can create a system with just the right mix of attributes and features to support these kinds of identity interactions. Here’s a graphic that shows what the system looks like from a high level:

With this layered approach, most interactions are able to be done off-chain by simply signing data and Verifiable Credentials with keys that are linked to blockchain-anchored identifiers. In many cases, the time-state of these identity interactions can also be important, so the identities involved may want to create a chain-anchored proof that captures the state of what they exchanged, when they exchanged it.

The Breathtaking Scale of Identity

Given the scope of identity, it is important to think about its scale. As alluded to above, each person, organization, device, bot, etc., individually generates hundreds, if not thousands – and in some cases, millions – of transactions per day. Hopefully you are starting to realize the staggering scale required from any system or technology that seeks to support a world of identity transactions.

The canonical scaling goal you often hear about in blockchain land is “Look Ma, I can do 3000 transactions per second, just like Visa!”. While this would be a laudable goal if Visa-type payments accounted for a majority of transactional demand, it’s a drop in the ocean of demand-reality – and a tsunami is coming. All other demand sources aside, identity transactions + streaming micropayments, deployed at world-scale, would reach into the trillions of transactions per day.
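To put the two figures side by side, here is a back-of-the-envelope sketch. It only converts between per-second and per-day rates; the 3,000 tps target and the one-trillion-per-day volume come from the paragraph above, while the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_day(tps: float) -> float:
    """Transactions per day sustained at a constant per-second rate."""
    return tps * SECONDS_PER_DAY


def required_tps(transactions_per_day: float) -> float:
    """Average per-second rate needed to clear a given daily volume."""
    return transactions_per_day / SECONDS_PER_DAY


visa_like_daily = per_day(3_000)
trillion_a_day_tps = required_tps(1e12)

print(f"3,000 tps sustains {visa_like_daily:,.0f} transactions per day")
print(f"1 trillion per day needs about {trillion_a_day_tps:,.0f} tps on average")
print(f"That is roughly {trillion_a_day_tps / 3_000:,.0f}x the 3,000 tps target")
```

Even a single trillion transactions per day, the low end of "trillions", works out to thousands of times the Visa-style goal.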

Preview of Part 2

Blockchains, as decentralized, distributed systems that rely on broadcast transmission and consensus synchronization, are inherently difficult to scale compared to traditional systems. Some blockchain communities have turned to what is known as on-chain scaling, rather than intelligently off-loading demand to secure, chain-anchored, Layer 2 systems. In the next post, I will break down the quantitative scale of demand blockchains will face if they intend to support decentralized identity, and provide an assessment of the various approaches to meeting that demand.

Scaling Decentralized Apps & Services via Blockchain-based Identity Indirection

Blockchain Scalability

Developers of apps and services, blockchain-based or not, must always consider efficiency and scalability in determining how to best serve the needs of their users. This is especially true when you add new or emerging technologies to the equation. In the realm of blockchain-based apps and services, scalability considerations are magnified by the distributed nature of the underlying system.

In order to maintain the unique guarantees of a blockchain, transactions must be processed with a mechanism that ensures consensus, then propagated across the network. These constraints introduce three major scalability challenges:

  1. Propagation over a distributed system
  2. Consensus processing of transactions
  3. Data duplication size and cost

Transaction Rate

Currently there are two main blockchains: Bitcoin and Ethereum. Neither of these large-scale blockchain implementations publicly exceeds double digits in per-second transaction rates. While various members of the blockchain community are experimenting with Proof of Stake and Sharding, which could theoretically push transaction rates “into the millions per second”, these additions introduce new constraints and network characteristics that could impact how developers write apps and services.

But let’s assume blockchains could process millions of transactions per second – is that enough? The real question is: “How many transactions per second would we need for an on-chain world of people, apps, services, and devices?”

To answer that, imagine a world where organic and inorganic entities are generating frequent, on-chain transactions – here are a few examples of transaction sources:

  • Billions of people each generating transactions throughout the day
  • Hundreds of millions of IoT devices triggering transactions
  • Millions of apps, services, and bots performing background transactions

When you consider the breathtaking enormity of the scale, it’s hard to precisely quantify a lower bound, but BILLIONS of transactions per second may be conservative.

Transaction-related Computation

Another scalability consideration is the fact that transactions on some blockchains trigger consensus-based computation of programmatic ‘contracts’. While this is a neat feature, it also incurs a significant cost that must be borne by nodes on the network.

Ethereum devs themselves are open about the fundamental performance limitations of these contract computations:

“Clearly Ethereum is not about optimizing efficiency of computation. Its parallel processing is redundantly parallel. This is to offer an efficient way to reach consensus on the system state without needing trusted third parties, oracles or violence monopolies. But importantly they are not there for optimal computation. The fact that contract executions are redundantly replicated across nodes, naturally makes them expensive, which generally creates an incentive not to use the blockchain for computation that can be done off-chain.”

(I want you to remember the word off-chain, it’s going to come up a lot)

Data Storage

Blockchains do not allow much data to be embedded within their transactions, and rightly so, because duplicating data across a significant number of nodes is a non-starter when you’re talking about billions of transactions per second. In fact, at those transactional rates – even with strict transactional data limits, pruning, and aggressive transaction aggregation – the network would still generate hundreds of petabytes of data annually. This would likely reduce the number of full nodes in the network, and the widespread replication of blockchain transactional data that goes with it.
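A rough way to sanity-check the storage claim is to multiply it out. In the sketch below, the transaction rate is the "billions per second" figure from this post, but the effective bytes retained per transaction after limits, pruning, and aggregation is an assumed parameter, not a number taken from any real chain.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000
PETABYTE = 10 ** 15  # decimal petabyte, in bytes


def annual_petabytes(tps: float, effective_bytes_per_tx: float) -> float:
    """Data added to the ledger per year, in petabytes.

    effective_bytes_per_tx is the assumed average number of bytes that
    remain on-chain per transaction after aggregation and pruning.
    """
    return tps * SECONDS_PER_YEAR * effective_bytes_per_tx / PETABYTE


for size in (1, 10, 100):
    growth = annual_petabytes(1e9, size)
    print(f"1 billion tps at {size:>3} bytes/tx -> {growth:,.1f} PB per year")
```

Under these assumptions, retaining only about 10 bytes per transaction already lands in the hundreds of petabytes per year, and every full node has to carry that growth.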

Consider the Following

The most common way of developing decentralized apps and services with blockchain tech is to create a digital record of source data, link it to an on-chain blockchain transaction via a hash of the data + timestamp, then store the source data off-chain somewhere. Let’s call this method the Blockchain Transactional Model.

But what does this model really provide?

  1. The on-chain transaction provides a rough timestamp based on when it is added to the chain.
  2. The hash embedded in the transaction records the state of off-chain data, but does not provide storage of that data.
  3. Both 1 and 2 are synced to the global ledger, and off-chain data can be validated against the embedded hash.
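The three points above can be sketched in a few lines. This is a minimal illustration rather than any particular chain's API: the anchor dict stands in for whatever small record gets embedded in a transaction, and `make_anchor` and `matches_anchor` are hypothetical helper names.

```python
import hashlib
import time


def make_anchor(document: bytes) -> dict:
    """Build the small record that would be embedded in an on-chain transaction.

    Only the hash and a rough timestamp are recorded; the document itself
    still has to be stored somewhere off-chain.
    """
    return {
        "sha256": hashlib.sha256(document).hexdigest(),
        "created": int(time.time()),
    }


def matches_anchor(document: bytes, anchor: dict) -> bool:
    """Validate off-chain data against the hash recorded in the anchor."""
    return hashlib.sha256(document).hexdigest() == anchor["sha256"]


original = b"source data kept off-chain"
anchor = make_anchor(original)

print(matches_anchor(original, anchor))
print(matches_anchor(b"source data, quietly edited", anchor))
```

Note what the anchor does not do: it proves the state of the data at a point in time, but it neither stores the data nor says anything about who created it.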

What if I told you blockchain-anchored identity could provide a more efficient means to achieve the majority of off-chain use-cases? I submit the following for your consideration:

Most non-monetary blockchain use-cases can be accomplished off-chain using a combination of blockchain-anchored identity, cryptographic signatures, and traditional storage systems – all while retaining the features developers desire and avoiding the scalability issues of on-chain transactions.

Identity – The Red Pill

There are known choke points in the transmission, computation, and storage of on-chain transactions that make the Blockchain Transactional Model prohibitive at high levels of transactional load. But what does that mean for the promise of decentralized apps and services – is blockchain still the solution?

The answer is yes, but you’ll need to free your mind.

We are currently working on an open source system that will enable cross-blockchain registrations of identifiers for self-sovereign identities. Think of it like a transparent, open source, blockchain-based naming and identity layer for the world. Users will own their identities and can prove ownership based on control of a private key linked to a known identifier.

A blockchain-based identity can be used to represent any type of entity, including: people, apps, services, companies, government agencies, etc.

Regardless of the non-monetary decentralized app or service use-case – rental agreement, supply chain system, car title transfer, or any other attestation – it all boils down to three key features:

  1. Capturing the state of data
  2. Logging time of occurrence
  3. Verifying the participation of all parties involved

It turns out we can accomplish all three of these by having various parties to an action sign a payload comprised of source data, the state of that data, the time of occurrence (if relevant), and proof of participation using their globally verifiable, blockchain-based identity.

Use-Case: Renting an Apartment

Let’s compare how non-monetary, blockchain-based use-cases are handled under the Blockchain Transactional Model, vs handling them entirely off-chain via the Blockchain Identity Model. Imagine Jane wants to rent an apartment from Bill – here’s how that plays out under each model:

Blockchain Transactional Model (On-Chain)

[Figure: rental agreement flow under the Blockchain Transactional Model]

Result:

  • A blockchain is used to record a hash of a rental agreement document and metadata – a scale bottleneck at higher transaction rates.
  • You must be online to commit a transaction to a blockchain – a blocker for many use-cases that demand quick resolution.
  • Bill and Jane have forced a huge, distributed consensus system to store a hash of the rental agreement only they are interested in – introduces scale problems without providing a clear benefit.
  • Bill and Jane still must save and maintain the original rental agreement document if it is ever needed for verification or inspection, because a hash alone is just a proof of state – it doesn’t escape off-chain data maintenance.
  • Bill and Jane are still required to prove that they each signed the rental document, which either requires traditional means of verification or a system of verifiable, digital identity – identity is an ever-present issue.

Blockchain Identity Model (Off-Chain):

[Figure: rental agreement flow under the Blockchain Identity Model]

Result:

  • Use of a blockchain is only required once for each participant to register their identity, which can happen prior to this entire use-case taking place – yay, it scales!
  • Bill and Jane don’t have to worry about the cost, hassle, and delay in broadcasting a transaction to a huge, distributed consensus system, while retaining all the features they wanted from the on-chain model – less blockchain is more in many cases.
  • Bill and Jane still save their agreement payload off-chain, just as with the Blockchain Transactional Model, but now the payload contains everything they need: the source document, data state hash, and identity signatures for verification – fewer moving parts and no added burden.
  • Bill and Jane can now complete this entire use-case with standard, scalable systems, even while offline, then sync the payload to the secure storage services of their choice – more flexibility with less complexity.
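Here is a rough sketch of what such an off-chain payload could look like. Several things in it are assumptions made for illustration. The `did:example:…` identifiers are hypothetical. The `registry` dict stands in for public keys that would be resolved from blockchain-anchored identifiers. The signature scheme is a Lamport one-time signature, swapped in only because it can be written with Python's standard library; a real system would use a standard scheme such as Ed25519.

```python
import hashlib
import json
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    """Lamport one-time key pair: 256 secret pairs and their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(zero), _h(one)) for zero, one in private]
    return private, public


def _bits(message: bytes):
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(private, message: bytes):
    """Reveal one secret per digest bit. A key must sign only one message."""
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    bits = _bits(message)
    return len(signature) == 256 and all(
        _h(secret) == public[i][bit]
        for i, (bit, secret) in enumerate(zip(bits, signature))
    )


def canonical(body: dict) -> bytes:
    """Deterministic JSON encoding so every party signs identical bytes."""
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def verify_payload(payload: dict, registry: dict, required: set) -> bool:
    """Check the state hash and that every required party signed the body."""
    body = payload["body"]
    source_hash = hashlib.sha256(body["source"].encode("utf-8")).hexdigest()
    if source_hash != body["state_hash"]:
        return False
    if set(payload["signatures"]) != required:
        return False
    message = canonical(body)
    return all(
        verify(registry[party], message, sig)
        for party, sig in payload["signatures"].items()
    )


# Hypothetical identifiers; in practice the public keys would be looked up
# via each party's blockchain-anchored identifier.
parties = {"did:example:jane", "did:example:bill"}
private_keys, registry = {}, {}
for party in parties:
    private_keys[party], registry[party] = keygen()

document = "Rental agreement: Jane rents an apartment from Bill."
body = {
    "source": document,
    "state_hash": hashlib.sha256(document.encode("utf-8")).hexdigest(),
    "time": "2017-01-01T00:00:00Z",  # illustrative timestamp
}
message = canonical(body)
payload = {
    "body": body,
    "signatures": {p: sign(private_keys[p], message) for p in parties},
}

print(verify_payload(payload, registry, parties))
```

Nothing here touches a chain at signing time. The only on-chain dependency is the earlier registration that lets a verifier trust the keys in `registry`.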

Implications

Blockchain-based identity should cause you to question core assumptions around modeling and implementation of decentralized app and service use-cases. As a result, you should ask whether or not your use-case truly requires on-chain transactions. Here are a few important points to take away from this post:

  1. The scalability issues with blockchains are an irrelevant mirage for the majority of decentralized app and service use-cases.
  2. Hashes or Merkle Roots stored on the blockchain may be valuable in certain unique situations where no combination of identity signatures can be trusted and there is no viable third-party witness, like a Notary Public or a government agency.
  3. We can build scalable, decentralized apps and services today by enabling entities to sign off-chain payloads with blockchain-based identities.
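For the cases in point 2 where an on-chain anchor really is warranted, a Merkle root lets many off-chain records share a single anchored hash. The sketch below is a minimal version under stated assumptions: SHA-256 as the hash, and odd levels padded by duplicating the last node. That padding is one common convention, not something this post prescribes.

```python
import hashlib
from typing import Sequence


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(items: Sequence[bytes]) -> bytes:
    """Fold a batch of records into one 32-byte root suitable for anchoring."""
    if not items:
        raise ValueError("need at least one item")
    level = [sha256(item) for item in items]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # pad odd levels by repeating the last hash
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


records = [b"agreement-1", b"agreement-2", b"agreement-3"]
root = merkle_root(records)
print(root.hex())
```

Changing any single record changes the root, so one small on-chain value can vouch for the state of an arbitrarily large off-chain batch.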

Determining On-Chain vs Off-Chain

Here is a little pseudo-code test to help you determine whether to use the off-chain Blockchain Identity Model or the on-chain Blockchain Transactional Model:


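// Off-chain signatures suffice when the claim is self-asserted, when the
// combined signatures of the parties can be trusted, or when a neutral
// party (a notary, an agency) can attest to it.
if (selfAsserted || groupSigsAreTrusted || canBeAttestedByNeutralParty) {
  useOffChainIdentitySignatures();
} else {
  // No trusted signer or witness exists, so anchor a proof on-chain.
  anchorItOnChain();
}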

Feature Comparison

Last but not least, here’s a feature breakdown of the two models:

Features | Traditional (On-Chain) | Identity (Off-Chain)
~Unlimited scalability
Eliminates the need for off-chain storage 
Transactions are verifiable without middlemen 
Solves identity verification 
Reuses existing systems and tools 
Doesn't require learning new languages
Not tied to any one blockchain 

The Nickelback Persistence Conjecture

“The Hammer’s Coming Down”

In my work with open source blockchains, distributed hash tables, and peer-to-peer systems in the area of identity, I have run across people making this statement from time to time, often in an effort to minimize open source, decentralized, distributed systems vs their centralized or proprietary counterparts:

“The data on these decentralized, distributed systems will only be persistently available if people continue to run them.”

These folks often add:

“This is a serious concern because the people running these systems must have an economic incentive to keep them running.”

I’ve never spent more than an eye-roll of my time on people who make these statements because they’re pure fallacy, but a random assault committed on my eardrums (Britney Spears – Oops I Did it Again) in the admission line at Google I/O 2016 inspired me to finally drop the hammer on this FUD. I’m sure you’re wondering how Britney Spears, Nickelback, and fallacious statements about decentralized, distributed systems are all related: slow your roll, I’ll get there.

“Believe It, or Not”

Let me channel my inner-@pmarca for a moment:

“I am hearing disturbing rumors that a system’s data won’t be available if no one is running it.”

Wow, what a profound thought! While that is technically true, it’s also the Least Plausible Hypothesis for why data on a decentralized, distributed system with any traction, scale, or utility would suddenly become irrecoverable.

“Remind Me”

The craziest part about these statements is how absolutely backwards they are when used to target decentralized, distributed systems vs others. Perhaps the folks who emit this brand of truthiness need a reminder about the reality of their claim: A centralized or proprietary system is just as likely to go dark for any number of reasons.

Disturbingly (not really), you can no longer access these systems (though some have been archived by various sites):

  • GeoCities
  • AltaVista
  • Yahoo Auctions
  • Friendster
  • Google Wave
  • etc. etc. ad infinitum

For whatever reason, a person might be inclined to gravitate toward centralized or proprietary systems because they believe them to be more trustworthy, secure, or permanent, but that is far from a reliable truth. In fact, centralized or proprietary systems can be worse for many reasons, including:

  • Shutdown can happen more abruptly, sometimes without notice
  • Proprietary system shutdown is often permanent
  • Centralized and proprietary systems usually have a single arbiter that decides their fate
  • Almost without exception they are driven by revenue, sometimes at the expense of users

In contrast, decentralized, distributed systems with significant traction, scale, or utility:

  • Are very difficult to shut down abruptly – most if not all nodes need to go dark
  • Can be restarted by anyone in the community with copies of the data
  • Are not subject to the shutdown whims of a single arbiter
  • May persist because parties have a principled or personal desire to ensure they do

The Nickelback Persistence Conjecture

Now we get to the fun part. How does one articulate to someone deeply skeptical of open source, decentralized, distributed systems, such as a blockchain, that their data will not vanish suddenly? How does one rebut the statement that these systems must rely on unsustainable economic interests for anyone to run them?

After hearing that Britney Spears song at Google I/O, I wondered: “Is this noise pollution still available on decentralized systems (torrents, etc.) after this long?” When I got home I decided to up the ante in my search for long-lasting, distributed trash, so I chose to base my conjecture on the poster child of musical eye-rolling: Nickelback. Not many people like Nickelback, and that’s great, because those who are circulating its songs on a peer network are probably doing so out of some unnatural affection for them or specific self-interest. As it turns out, I took a look and found that Nickelback’s songs are still available on peer networks 20 years after they first appeared. Herein lies the conjecture I present to you, without further delay:

The mean duration that data will remain persistently available on an open source, decentralized, distributed system that reaches at least the size, popularity, or acolyte following of Nickelback, is roughly equal to the time period from when Nickelback’s songs first became available to the day no peer nodes remain that circulate their ‘music’.

“It’s Over”

“Never Again” let one of these purveyors of persistence FUD keep you up “Into the Night” thinking about if “Today Was Your Last Day” to access the data you have stored on an open source, decentralized, distributed system.

The Web Beyond: How blockchain identity will transform our world

Imagine

Jane wakes up to the sound of her alarm clock; it’s 6:13 AM. “Oh great, what am I in for today,” she thinks. Jane’s alarm clock is normally set for 6:30 AM, but her identity agent detected a traffic accident that is projected to add 17 minutes to her commute. Jane’s identity agent, acting on her behalf, changed her alarm while she was sleeping. All three, Jane’s identity, the identity of her alarm clock, and the identity of her agent, are connected via a self-sovereign, decentralized, blockchain-anchored identity system.

Jane gets ready and grabs a yogurt from the fridge as she heads out the door. The yogurt was delivered yesterday, after her fridge detected she was out. Her fridge’s identity has been granted limited access to initiate purchases for her. In this case, Jane has opted to be notified for confirmation of any purchases her fridge initiates; yesterday Jane swiped “Confirm” when the identity management app on her phone asked if the fridge could execute a purchase of some groceries. The fridge executed a payment over the blockchain using Jane’s identity-linked blockchain wallet and the wallet linked to the grocery store’s identity. That’s right, the grocery store has a blockchain-anchored identity as well. Starting to get the picture?

Jane needs to get to a downtown office building where she is scheduled to meet a contact on the 12th floor. Jane doesn’t have a car, so she asks her identity agent to fetch her one by leveraging the many identity crawlers dedicated to indexing sharing economy identity data. These crawlers are always hard at work, real-time indexing the (user allowed) blockchain identity data changes of every person, place, device, and intangible entity on Earth. In this case, there are hundreds of drivers in Jane’s general vicinity who have granted popular ride sharing identity agents access to read and update their identity’s ride sharing fields. Jane uses her preferred crawler’s app to send signed, encrypted requests directly to providers of sharing economy services. The crawler identifies a driver whose identity shows a ride sharing status of “Available,” with a geolocation value that indicates he is close to Jane. Jane taps “Request a Ride” on the app and it immediately sends a message to the communication endpoint listed on the driver’s blockchain identity. The driver’s blockchain sharing economy app alerts him that a new ride request was received and asks whether he wants to accept. The driver accepts and is sent Jane’s current geolocation.

Upon arriving at her destination, Jane authorizes a payment to her driver’s identity-linked blockchain wallet. She enters the office building and heads directly for the elevators, bypassing a lengthy check-in procedure in the ground floor lobby. Jane taps her phone against an NFC pad, which instantly identifies her via a challenge/response verification of her identity assertion. The elevator system’s blockchain-anchored identity has been given access to the appointment schedules of the various software systems used by the companies that reside in the building. It uses Jane’s identity datastore to locate the appointment entry, which was created by her contact. Within this entry is a signed directive to allow Jane’s identity to access the elevator and take it to the 12th floor. Jane enters the elevator and the button for the 12th floor is already lit up. Just for fun, Jane tries hitting other buttons. But alas, she was not granted access to other floors, so the buttons don’t light up and she isn’t able to access them.

Jane walks up to the front desk and alerts the attendant that she has arrived for her meeting. The attendant directs her to verify her identity once more, via the guest terminal. Jane is greeted by her contact and smiles at the thought of how efficient and interoperable the world has become, thanks to the universal blockchain-based identity system.

Understand

A blockchain is a decentralized, distributed ledger that accounts for and stores cryptographically verifiable token ownership proofs, synced to computers around the globe. Blockchains represent an unprecedented opportunity to create standard, decentralized systems that handle complex activities in a more efficient, automated, programmable way than ever before. One of the most interesting applications of blockchain tech is in the area of identity. Identity has never, ever, had a good solution. Humanity has built countless centralized systems, federation schemes, and every hybrid of the two you can imagine. With a worldwide, decentralized blockchain of identity, that all ends.

Each transaction on a blockchain allows for a small amount of data to be stored with it. For the purpose of identity, this data can be encoded with two things:

  1. A registration for an ID (a friendly or unfriendly name), that is verifiable and indexable
  2. A pointer to off-blockchain data that describes the identity attached to the ID

Whoever possesses the private key for one of these blockchain ID transactions controls the identity data attached to it. This allows us to do interesting things, like:

  • Look up IDs on a cacheable index of the global ledger
  • CRUD identity data connected to an ID at real-time speed
  • Prove ownership of an ID, or verify data has been signed/sent by an ID’s owner, using standard cryptographic methods
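To make the two encoded fields and the cacheable index concrete, here is a toy sketch. The ledger is modeled as an ordered list of transactions, and the index is rebuilt by replaying it. All names are hypothetical, the pointers use a placeholder domain, and comparing `owner_key` fingerprints is a stand-in for what a real system would do, which is verify a signature made with the owner's private key.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class IdTransaction:
    name: str       # the ID being registered or updated
    owner_key: str  # fingerprint of the key that authored the transaction
    pointer: str    # location of the off-blockchain identity data


def build_index(ledger: Iterable[IdTransaction]) -> Dict[str, IdTransaction]:
    """Replay the ledger in order to derive a cacheable ID index.

    The first registration of a name wins; later transactions for that
    name only take effect if they come from the same owner key.
    """
    index: Dict[str, IdTransaction] = {}
    for tx in ledger:
        current = index.get(tx.name)
        if current is None or current.owner_key == tx.owner_key:
            index[tx.name] = tx
    return index


def lookup(index: Dict[str, IdTransaction], name: str) -> Optional[str]:
    """Resolve an ID to the pointer for its off-chain identity data."""
    tx = index.get(name)
    return tx.pointer if tx else None


ledger = [
    IdTransaction("jane", "key-jane", "https://store.example/jane/v1"),
    IdTransaction("jane", "key-mallory", "https://evil.example/jane"),  # ignored
    IdTransaction("jane", "key-jane", "https://store.example/jane/v2"),
]
index = build_index(ledger)
print(lookup(index, "jane"))
```

Because the index is derived purely from the ledger, anyone can rebuild or cache it, while the identity data itself lives off-chain, where it can be read and updated at ordinary datastore speed.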

Build

With a global blockchain of identity, we can dramatically transform almost every product or service that relies on interactions between living, non-living, and intangible things. Here are a few examples of what it will do:

  • Allows users to directly expose products or services to real-time crawlers and indexes, which can disintermediate centralized products/services in every vertical.
  • Provides a means to look up and contact anyone on the planet via the exposure of public or private (access limited) communication endpoints
  • Simplifies service access and accounting schemes, like registering for API keys, leaky URL params, etc.
  • Provides better mechanisms for verifying access/ownership of digital goods
  • Solves the fundamental issues with provisioning, security, and access control for the IoT ecosystem

Here are a few developer-enabling features, APIs, and tools we can build into existing platforms to more rapidly realize this blockchain-based future:

  • Create a new protocol (chain:, bid: ?) that allows for CRUD and search of blockchain transactions/identities
  • Build cloud services that make blockchain identity agents, and their bots, as easy to develop as all the social/messaging bot frameworks of today
  • Develop new Web standards and browser features that integrate a more secure, more powerful blockchain-anchored system of authentication and identity into common flows, like login and request signing
  • We may want to reuse/augment some existing mechanism, like the FIDO flow, etc.

^ This is the future we deserve, a standard, generative, user-sovereign world of identity that will fundamentally change the way we interface with every person and object around us.

© 2025 Back Alley Coder