Scaling Decentralized Apps & Services via Blockchain-based Identity Indirection

Blockchain Scalability

Developers of apps and services, blockchain-based or not, must always consider efficiency and scalability in determining how to best serve the needs of their users. This is especially true when you add new or emerging technologies to the equation. In the realm of blockchain-based apps and services, scalability considerations are magnified by the distributed nature of the underlying system.

In order to maintain the unique guarantees of a blockchain, transactions must be processed with a mechanism that ensures consensus, then propagated across the network. These constraints introduce three major scalability challenges:

  1. Propagation over a distributed system
  2. Consensus processing of transactions
  3. Data duplication size and cost

Transaction Rate

Currently there are two main blockchains: Bitcoin and Ethereum. Neither of these large-scale blockchain implementations publicly exceeds double digits in per-second transaction rates. While various members of the blockchain community are experimenting with Proof of Stake and Sharding, which could theoretically push transaction rates “into the millions per second”, these additions introduce new constraints and network characteristics that could impact how developers write apps and services.

But let’s assume blockchains could process millions of transactions per second – is that enough? The real question is: “How many transactions per second would we need for an on-chain world of people, apps, services, and devices?”

To answer that, imagine a world where organic and inorganic entities are generating frequent, on-chain transactions – here are a few examples of transaction sources:

  • Billions of people each generating transactions throughout the day
  • Hundreds of millions of IoT devices triggering transactions
  • Millions of apps, services, and bots performing background transactions

When you consider the breathtaking enormity of the scale, it’s hard to precisely quantify a lower bound, but BILLIONS of transactions per second may be conservative.


Transaction-related Computation

Another scalability consideration is the fact that transactions on some blockchains trigger consensus-based computation of programmatic ‘contracts’. While this is a neat feature, it also incurs a significant cost that must be borne by nodes on the network.

Ethereum devs themselves are open about the fundamental performance limitations of these contract computations:

“Clearly Ethereum is not about optimizing efficiency of computation. Its parallel processing is redundantly parallel. This is to offer an efficient way to reach consensus on the system state without needing trusted third parties, oracles or violence monopolies. But importantly they are not there for optimal computation. The fact that contract executions are redundantly replicated across nodes, naturally makes them expensive, which generally creates an incentive not to use the blockchain for computation that can be done off-chain.”

(I want you to remember the word off-chain, it’s going to come up a lot)

Data Storage

Blockchains do not allow much data to be embedded within their transactions, and rightly so, because duplicating data across a significant number of nodes is a non-starter when you’re talking about billions of transactions per second. In fact, at those transactional rates – even with strict transactional data limits, pruning, and aggressive transaction aggregation – the network would still generate hundreds of petabytes of data annually. This would likely reduce the number of full nodes in the network, and with them the widespread replication of blockchain transactional data.
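To get a feel for the storage math, here is a back-of-the-envelope sketch. Both inputs are assumptions picked purely for illustration: a sustained rate of one billion transactions per second, and only 10 bytes retained per transaction after limits, pruning, and aggregation.

```javascript
// Rough storage estimate; both inputs are illustrative assumptions, not measurements.
const TX_PER_SECOND = 1e9;                   // assumed sustained rate: one billion tx/s
const BYTES_PER_TX = 10;                     // assumed bytes retained per tx after aggregation
const SECONDS_PER_YEAR = 365 * 24 * 60 * 60; // 31,536,000

// Convert a transaction rate and per-transaction size into decimal petabytes per year.
function annualPetabytes(txPerSecond, bytesPerTx) {
  const bytesPerYear = txPerSecond * bytesPerTx * SECONDS_PER_YEAR;
  return bytesPerYear / 1e15;
}

console.log(annualPetabytes(TX_PER_SECOND, BYTES_PER_TX).toFixed(1) + ' PB per year'); // 315.4 PB per year
```

Even with a footprint that small per transaction, the estimate lands in the hundreds of petabytes per year.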

Consider the Following

The most common way of developing decentralized apps and services with blockchain tech is to create a digital record of source data, link it to an on-chain blockchain transaction via a hash of the data + timestamp, then store the source data off-chain somewhere. Let’s call this method the Blockchain Transactional Model.

But what does this model really provide?

  1. The on-chain transaction provides a rough timestamp based on when it is added to the chain.
  2. The hash embedded in the transaction records the state of off-chain data, but does not provide storage of that data.
  3. Both 1 and 2 are synced to the global ledger, and off-chain data can be validated against the embedded hash.
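A minimal sketch of the hashing half of this model, using Node's built-in crypto module. The sample agreement text, the timestamp, and the choice to hash the document followed by an ISO timestamp are all assumptions for illustration; no particular blockchain defines this layout.

```javascript
const { createHash } = require('node:crypto');

// Build the digest that would be embedded in an on-chain transaction.
// Hashing "document bytes, then ISO timestamp" is an illustrative layout only.
function anchorHash(documentText, isoTimestamp) {
  return createHash('sha256')
    .update(documentText, 'utf8')
    .update(isoTimestamp, 'utf8')
    .digest('hex');
}

// Anyone holding the off-chain data can recompute the digest and compare.
function matchesAnchor(documentText, isoTimestamp, anchoredHex) {
  return anchorHash(documentText, isoTimestamp) === anchoredHex;
}

const agreement = 'Jane rents unit 4B from Bill for 12 months.'; // hypothetical source data
const signedAt = '2016-07-01T12:00:00Z';                         // hypothetical timestamp
const onChainHash = anchorHash(agreement, signedAt);

console.log(matchesAnchor(agreement, signedAt, onChainHash));               // true
console.log(matchesAnchor(agreement + ' (edited)', signedAt, onChainHash)); // false
```

Note what the chain actually holds here: 32 bytes of digest. The agreement itself still has to live somewhere else.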

What if I told you blockchain-anchored identity could provide a more efficient means to achieve the majority of off-chain use-cases? I submit the following for your consideration:

Most non-monetary blockchain use-cases can be accomplished off-chain using a combination of blockchain-anchored identity, cryptographic signatures, and traditional storage systems – all while retaining the features developers desire and avoiding the scalability issues of on-chain transactions.

Identity – The Red Pill

There are known choke points in the transmission, computation, and storage of on-chain transactions that make the Blockchain Transactional Model prohibitive at high levels of transactional load. But what does that mean for the promise of decentralized apps and services – is blockchain still the solution?

The answer is yes, but you’ll need to free your mind.


I am currently leading an open source initiative at Microsoft to develop a system with Blockstack and ConsenSys that will enable cross-blockchain registrations of identifiers for self-sovereign identities. Think of it like a transparent, open source, blockchain-based naming and identity layer for the world. Users will own their identities and can prove ownership based on control of a private key linked to a known identifier. Please note: this is not vaporware, and will be available at enterprise-scale in the coming months.

A blockchain-based identity can be used to represent any type of entity, including: people, apps, services, companies, government agencies, etc.

Regardless of the non-monetary decentralized app or service use-case – rental agreement, supply chain system, car title transfer, or any other attestation – it all boils down to three key features:

  1. Capturing the state of data
  2. Logging time of occurrence
  3. Verifying the participation of all parties involved

It turns out we can accomplish all three of these by having the various parties to an action sign a payload composed of the source data, the state of that data, the time of occurrence (if relevant), and proof of participation using their globally verifiable, blockchain-based identities.
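Here is a rough sketch of what that signing step could look like with Node's built-in crypto module. Everything specific in it is an assumption: the Ed25519 key pairs generated on the spot stand in for keys tied to registered blockchain identities, and the payload field names are made up for the example.

```javascript
const { generateKeyPairSync, createHash, sign, verify } = require('node:crypto');

// Stand-ins for key pairs that would be linked to registered blockchain identities.
const jane = generateKeyPairSync('ed25519');
const bill = generateKeyPairSync('ed25519');

// Hypothetical payload: source data, a hash of its state, and the time of occurrence.
const sourceData = 'Jane rents unit 4B from Bill for 12 months.';
const payload = {
  sourceData,
  stateHash: createHash('sha256').update(sourceData, 'utf8').digest('hex'),
  occurredAt: '2016-07-01T12:00:00Z',
};

// Both parties sign the same serialized bytes (Ed25519 takes a null digest algorithm).
const bytes = Buffer.from(JSON.stringify(payload), 'utf8');
const signatures = {
  jane: sign(null, bytes, jane.privateKey),
  bill: sign(null, bytes, bill.privateKey),
};

// True only if every listed party has a valid signature over these exact bytes.
function allPartiesSigned(payloadBytes, sigs, publicKeys) {
  return Object.keys(publicKeys).every((name) =>
    Buffer.isBuffer(sigs[name]) && verify(null, payloadBytes, publicKeys[name], sigs[name]));
}

const publicKeys = { jane: jane.publicKey, bill: bill.publicKey };
console.log(allPartiesSigned(bytes, signatures, publicKeys)); // true

const tampered = Buffer.from(
  JSON.stringify({ ...payload, occurredAt: '2017-01-01T00:00:00Z' }), 'utf8');
console.log(allPartiesSigned(tampered, signatures, publicKeys)); // false
```

In a real system the public keys would be looked up from each party's registered identifier rather than passed around directly.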

Use-Case: Renting an Apartment

Let’s compare how non-monetary, blockchain-based use-cases are handled under the Blockchain Transactional Model, vs handling them entirely off-chain via the Blockchain Identity Model. Imagine Jane wants to rent an apartment from Bill – here’s how that plays out under each model:

Blockchain Transactional Model (On-Chain)

[Figure: rental agreement flow under the Blockchain Transactional Model]

Result:

  • A blockchain is used to record a hash of a rental agreement document and metadata – a scale bottleneck at higher transaction rates.
  • You must be online to commit a transaction to a blockchain – a blocker for many use-cases that demand quick resolution.
  • Bill and Jane have forced a huge, distributed consensus system to store a hash of the rental agreement only they are interested in – introduces scale problems without providing a clear benefit.
  • Bill and Jane still must save and maintain the original rental agreement document if it is ever needed for verification or inspection, because a hash alone is just a proof of state – it doesn’t escape off-chain data maintenance.
  • Bill and Jane are still required to prove that they each signed the rental document, which either requires traditional means of verification or a system of verifiable, digital identity – identity is an ever-present issue.

Blockchain Identity Model (Off-Chain):

[Figure: rental agreement flow under the Blockchain Identity Model]

Result:

  • Use of a blockchain is only required once for each participant to register their identity, which can happen prior to this entire use-case taking place – yay, it scales!
  • Bill and Jane don’t have to worry about the cost, hassle, and delay in broadcasting a transaction to a huge, distributed consensus system, while retaining all the features they wanted from the on-chain model – less blockchain is more in many cases.
  • Bill and Jane still save their agreement payload off-chain, just as with the Blockchain Transactional Model, but now the payload contains everything they need: the source document, data state hash, and identity signatures for verification – fewer moving parts and no added burden.
  • Bill and Jane can now complete this entire use-case with standard, scalable systems, even while offline, then sync the payload to the secure storage services of their choice – more flexibility with less complexity.

Implications

Blockchain-based identity should cause you to question core assumptions around modeling and implementation of decentralized app and service use-cases. As a result, you should ask whether or not your use-case truly requires on-chain transactions. Here are a few important points to take away from this post:

  1. The scalability issues with blockchains are an irrelevant mirage for the majority of decentralized app and service use-cases.
  2. Hashes or Merkle Roots stored on the blockchain may be valuable in certain unique situations where no combination of identity signatures can be trusted and there is no viable third-party witness, like a Notary Public or a government agency.
  3. We can build scalable, decentralized apps and services today by enabling entities to sign off-chain payloads with blockchain-based identities.

Determining On-Chain vs Off-Chain

Here is a little pseudo-code test to help you determine whether to use the off-chain Blockchain Identity Model or the on-chain Blockchain Transactional Model:

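// Each flag is a boolean describing your use-case.
if (selfAsserted || groupSigsAreTrusted || canBeAttestedByNeutralParty) {
  useOffChainIdentitySignatures(); // the parties sign an off-chain payload
}
else {
  anchorItOnChain(); // no trusted signers or witness, so commit a hash on-chain
}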

Feature Comparison

Last but not least, here’s a feature breakdown of the two models:

Features | Traditional (On-Chain) | Identity (Off-Chain)
~Unlimited scalability
Eliminates the need for off-chain storage
Transactions are verifiable without middlemen
Solves identity verification
Reuses existing systems and tools
Doesn't require learning new languages
Not tied to any one blockchain

Recapping the W3C Blockchain Standardization Workshop @ MIT

I had the opportunity to both co-chair and present at the W3C’s blockchain workshop at MIT in Boston this week. More than 100 of the best and brightest from the standards and blockchain worlds met to discuss potential Web standards opportunities for blockchain tech.

I have been working toward the establishment of a large-scale, user-sovereign, blockchain identity system for many years, and with an initiative now underway at Microsoft, this event was instrumental in better understanding the views of other organizations we intend to work with in bringing this system to users, companies, and governments across the globe.


There were many areas of standardization discussed that touched on blockchain-based identity systems, but perhaps the most specific, actionable area of standardization the group identified was extension of the existing Web Auth spec to include the APIs, features, and flows necessary to enable blockchain-based identity authentication in browsers. This is something I will begin to explore with other implementers and interested organizations from the W3C workshop who committed to do so with us.

The workshop was a great opportunity to start an important discussion about systems that could move the Web far beyond what it is today. I look forward to reporting back on the progress of our explorations as we move forward!

The Nickelback Persistence Conjecture

“The Hammer’s Coming Down”

In my work with open source blockchains, distributed hash tables, and peer-to-peer systems in the area of identity, I have run across people making this statement from time to time, often in an effort to minimize open source, decentralized, distributed systems vs their centralized or proprietary counterparts:

“The data on these decentralized, distributed systems will only be persistently available if people continue to run them.”

These folks often add:

“This is a serious concern because the people running these systems must have an economic incentive to keep them running.”

I’ve never spent more than an eye-roll of my time on people who make these statements because they’re pure fallacy, but a random assault committed on my eardrums (Britney Spears – Oops I Did it Again) in the admission line at Google I/O 2016 inspired me to finally drop the hammer on this FUD. I’m sure you’re wondering how Britney Spears, Nickelback, and fallacious statements about decentralized, distributed systems are all related: slow your roll, I’ll get there.


S(GH)PA: The Single-Page App Hack for GitHub Pages

SPA woes

For some time now I have wanted the ability to route paths for a gh-pages site to its index.html for handling as a single-page app. This ability is table stakes for single-page apps because you need all requests to be routed to one HTML file, unless you want to copy the same file across all your routes every time you make a change to your project. Currently GitHub Pages doesn’t offer a route handling solution; the Pages system is intended to be a flat, simple mechanism for serving basic project content.

If you weren’t aware, GitHub does provide one morsel of customization for your project site: the ability to add a 404.html file and have it served as your custom error page. I took a first stab at doing an SPA hack by simply copying my index.html file and renaming the copy to 404.html. Turns out many folks have experienced the same issue with GitHub Pages and liked the general idea: https://twitter.com/csuwildcat/status/730558238458937344. The issue that some folks on Twitter correctly raised was that the 404.html page is still served with a status code of 404, which is no bueno for crawlers. The gauntlet had been thrown down, so I decided to answer, and answer with vigor!

One more time, with feeling

After sleeping on it, I thought to myself: “Self, we’re deep in fuck-it territory, so why don’t I make this hack even dirtier?!” To that end, I developed an even better hack that provides the same functionality and simplicity, while also preserving your site’s crawler juice – and you don’t even need to waste time copying your index.html file to a 404.html file anymore! The following solution should work in all modern desktop and mobile browsers (Edge, Chrome, Firefox, Safari), and Internet Explorer 10+.

Template & Demo: If you want to skip the explanation and get the goods, here’s a template repo (https://github.com/csuwildcat/sghpa), and a test URL to see it in action: https://csuwildcat.github.io/sghpa/foo/bar

That’s so META

The first thing I did was investigate other options for getting the browser to redirect to the index.html page. That part was pretty straightforward. You basically have three options: server config, JavaScript location manipulation, or a meta refresh tag. The first one is obviously a no-go for GitHub Pages, and JavaScript is basically the same as a refresh, but arguably worse for crawler indexing, so that leaves us with the meta tag. Setting a meta tag with a refresh of 0 appears to be treated as a 301 redirect by search engines, which works out well for this use-case.

You’ll need to start by adding a 404.html file to your gh-pages repo that contains an empty HTML document inside it – but your document must total more than 512 bytes (explained below). Next put the following markup in your 404.html page’s head element:

<script>
  // Remember the URL the visitor actually requested.
  sessionStorage.redirect = location.href;
</script>
<meta http-equiv="refresh" content="0;URL='/REPO_NAME_HERE'">

This code sets the attempted entrance URL to a variable on the standard sessionStorage object and immediately redirects to your project’s index.html page using a meta refresh tag. If you’re doing a GitHub Organization site, don’t put a repo name in the content attribute replacer text, just do this: content="0;URL='/'"

Customizing your route handling

If you want more elaborate route handling, just include some additional JavaScript logic in the script tag shown above to tweak things like: the composition of the href you pass to the index.html page, which pages should remain on the 404 page (via dynamic removal of the meta tag), and any other logic you want to put in place to dictate what content is shown based on the inbound route.

512 magical bytes:

This is hands down one of the strangest quirks I have ever encountered in web development: You must ensure the total size of your 404.html page is greater than 512 bytes, because if it isn’t, IE will disregard it and show a generic browser 404 page instead. When I finally figured this out, I had to crack a beer to help cope with the amount of time it took.

Let’s make history

In order to capture and restore the URL the user initially navigated to, you’ll need to add the following script tag to the head of your index.html page before any other JavaScript acts on the page’s current state:

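<script>
  (function(){
    // Pick up the URL stashed by 404.html, then clear it so it is only used once.
    var redirect = sessionStorage.redirect;
    delete sessionStorage.redirect;
    // Put the requested URL back in the address bar without reloading the page.
    if (redirect && redirect !== location.href) {
      history.replaceState(null, null, redirect);
    }
  })();
</script>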

This bit of JavaScript retrieves the URL we cached in sessionStorage over on the 404.html page and replaces the current history entry with it. However you choose to handle things from there is up to you, but I’d use popstate and hashchange if you can.
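If you want a starting point for that last step, here is a bare-bones sketch of a route matcher wired to popstate and hashchange. The route table, the ":id" capture syntax, and the render function are all made up for illustration; swap in whatever your app actually needs.

```javascript
// Hypothetical route table; segments starting with ":" capture a value.
var routes = ['/sghpa/', '/sghpa/foo/:id', '/sghpa/about'];

// Split a path into its non-empty segments.
function segments(path) {
  return path.split('/').filter(function (s) { return s.length > 0; });
}

// Return the first pattern whose segments line up with the path, plus captured params.
function matchRoute(pathname, patterns) {
  var parts = segments(pathname);
  for (var i = 0; i < patterns.length; i++) {
    var expected = segments(patterns[i]);
    if (expected.length !== parts.length) continue;
    var params = {};
    var ok = true;
    for (var j = 0; j < expected.length; j++) {
      if (expected[j].charAt(0) === ':') {
        params[expected[j].slice(1)] = parts[j];
      } else if (expected[j] !== parts[j]) {
        ok = false;
        break;
      }
    }
    if (ok) return { pattern: patterns[i], params: params };
  }
  return null;
}

// Placeholder render step: a real app would swap views here.
function render() {
  var match = matchRoute(location.pathname, routes);
  document.title = match ? match.pattern : 'Not found';
}

// Only wire up listeners when running in a browser.
if (typeof window !== 'undefined') {
  window.addEventListener('popstate', render);
  window.addEventListener('hashchange', render);
  render();
}
```

Since the snippet in index.html runs first, location.pathname already reflects the restored URL by the time render() is called.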


Well folks, that’s it – now go hug it out and celebrate by writing some single-page apps on GitHub Pages!


The Web Beyond: How blockchain identity will transform our world

Imagine

Jane wakes up to the sound of her alarm clock – it’s 6:13 AM. “Oh great, what am I in for today,” she thinks. Jane’s alarm clock is normally set for 6:30 AM, but her identity agent detected a traffic accident that is projected to add 17 minutes to her commute. Acting in agency on Jane’s behalf, her identity agent changed her alarm while she was sleeping. All three – Jane’s identity, the identity of her alarm clock, and the identity of her agent – are connected via a standard, worldwide, decentralized, blockchain-based identity system.

Jane gets ready and grabs a yogurt from the fridge as she heads out the door. The yogurt was delivered yesterday after her fridge detected she was out. Her fridge’s identity has been granted limited access to initiate purchases for her. In this case, Jane has opted to be notified for confirmation of any purchases her fridge initiates; yesterday Jane swiped “Confirm” when her identity agent asked her if the fridge could execute a purchase of some groceries. The fridge executed a payment over the blockchain between Jane’s identity-linked blockchain wallet and the wallet specified on the grocery store’s blockchain identity. That’s right, the grocery store has a blockchain identity as well. Starting to get the picture?

Jane needs to get to a downtown office building where she is scheduled to meet a contact on the 12th floor. Jane doesn’t have a car so she asks her identity agent to fetch her one by leveraging the many blockchain crawlers dedicated to indexing sharing economy identity data. These crawlers are always hard at work, real-time indexing the blockchain identity object changes of every person, place, device, and intangible entity on Earth. In this case, there are hundreds of drivers in Jane’s general vicinity who have granted popular ride sharing agents access to read and update their identity’s ride sharing fields. Jane uses her preferred crawler’s app to send signed requests directly to providers of sharing economy services. The crawler identifies a driver whose blockchain identity shows a ride sharing status of “Available,” with a geolocation value that indicates he is close to Jane. Jane taps “Request a Ride” on the app and it immediately sends a signed message to the communication endpoint listed on the driver’s blockchain identity. The driver’s blockchain sharing economy app alerts him that a new ride request was received and asks whether he wants to accept. The driver accepts and is sent Jane’s current geolocation.

Upon arriving at her destination, Jane authorizes a payment to her driver’s identity-linked blockchain wallet. She enters the office building and heads directly for the elevators, bypassing a lengthy check-in procedure in the ground floor lobby. Jane taps her phone against an NFC pad, which instantly identifies her via a challenge/response verification of her blockchain identity. The elevator system’s blockchain identity has been given access to the appointment schedules of the various software systems used by the companies that reside in the building. It uses Jane’s blockchain identity to locate the appointment entry, which was created by her contact. Within this entry is a signed directive to allow Jane’s identity to access the elevator and take it to the 12th floor. Jane enters the elevator and the button for the 12th floor is already lit up. Just for fun Jane tries hitting other buttons. But alas, she was not granted access to other floors, so the buttons don’t light up and she isn’t able to access those floors.

Jane walks up to the front desk and alerts the attendant that she has arrived for her meeting. The attendant directs her to verify her identity once more, via the guest terminal. She verifies her identity using her blockchain identity, of course. Jane is greeted by her contact and smiles at the thought of how efficient and interoperable the world has become, thanks to the blockchain.

Understand

A blockchain is a decentralized system of accounting for and storing cryptographically verifiable bits of data to a distributed ledger, synced to computers worldwide. Blockchains represent an unprecedented opportunity to create standard, decentralized systems that handle complex activities in a more efficient, automated, programmable way than ever before. One of the most interesting applications of blockchains is in the area of identity. Identity has never – never – had a good solution. Humanity has built countless centralized systems, federation schemes, and every hybrid of the two you can imagine. With a worldwide, decentralized blockchain of identity, that all ends.

Each transaction on the blockchain allows for a small amount of data to be stored with it. For the purpose of identity, this data can be encoded with two things:

  1. A registration for an ID (a friendly or unfriendly name), that is verifiable and indexable
  2. A pointer to off-blockchain data that describes the identity attached to the ID

Whoever possesses the private key for one of these blockchain ID transactions controls the identity data attached to it. This allows us to do interesting things, like:

  • Look up IDs on a cacheable index of the global ledger
  • CRUD identity data connected to an ID at real-time speed
  • Prove ownership of an ID, or verify data has been signed/sent by an ID’s owner, using standard cryptographic methods
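To make the ownership proof concrete, here is a small challenge/response sketch using Node's built-in crypto module. The record shape, the ID, and the storage URL are hypothetical, and the freshly generated Ed25519 key pair stands in for the key behind a real blockchain ID transaction.

```javascript
const { generateKeyPairSync, randomBytes, sign, verify } = require('node:crypto');

// Stand-in for the key pair behind a registered ID.
const owner = generateKeyPairSync('ed25519');

// Hypothetical shape of what a ledger index might return for an ID lookup.
const idRecord = {
  id: 'jane.example',
  dataPointer: 'https://storage.example/jane/identity.json', // off-blockchain identity data
  publicKey: owner.publicKey,
};

// Verifier: issue a fresh random challenge so an old response cannot be replayed.
function createChallenge() {
  return randomBytes(32);
}

// Owner: answer the challenge by signing it with the private key.
function respond(challenge, privateKey) {
  return sign(null, challenge, privateKey);
}

// Verifier: check the response against the public key listed on the ID record.
function provesOwnership(record, challenge, response) {
  return verify(null, challenge, record.publicKey, response);
}

const challenge = createChallenge();
console.log(provesOwnership(idRecord, challenge, respond(challenge, owner.privateKey))); // true

const impostor = generateKeyPairSync('ed25519');
console.log(provesOwnership(idRecord, challenge, respond(challenge, impostor.privateKey))); // false
```

The same verify call covers the "signed/sent by an ID's owner" case: swap the random challenge for the bytes of the message being checked.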

Build

With a global blockchain of identity, we can dramatically transform almost every product or service that relies on interactions between living, non-living, and intangible things. Here are a few examples of what it will do:

  • Allows users to directly expose products or services to real-time crawlers and indexes, which can disintermediate centralized products/services in every vertical.
  • Provides a means to look up and contact anyone on the planet via the exposure of public or private (access limited) communication endpoints
  • Simplifies service access and accounting schemes, like registering for API keys, leaky URL params, etc.
  • Provides better mechanisms for verifying access/ownership of digital goods
  • Solves the fundamental issues with provisioning, security, and access control for the IoT ecosystem

Here are a few developer-enabling features, APIs, and tools we can build into existing platforms to more rapidly realize this blockchain-based future:

  • Create a new protocol (chain:, bid: ?) that allows for CRUD and search of blockchain transactions/identities
  • Build cloud services that make blockchain identity agents, and their bots, as easy to develop as all the social/messaging bot frameworks of today
  • Develop new Web standards and browser features that integrate a more secure, more powerful blockchain-based system of authentication and identity into common flows, like login and request signing
  • We may want to reuse/augment some existing mechanism, like the FIDO flow, etc.

^ This is the future we deserve – a standard, generative, user-sovereign world of identity that will fundamentally change the way we interface with every person and object around us.

A Renewed Call for App-to-App Interaction APIs

Cow, come in cow

The battleground of app-to-app interaction history is littered with abandoned ideas, half-solutions, and unimplemented APIs. The current, consumer/provider interaction paradigm for apps and services is a mess of one-off, provider-defined systems that each use their own transaction mechanisms and custom data structures. This makes it hard to do simple things across N providers, like save something to a user’s preferred storage service without jumping through provider-specific code and UX hoops.

I’d like to restart the conversation about bringing legit, app-to-app interaction APIs to the Web. There have been past spec attempts, namely Web Activities and Web Intents, but I’ll argue that while they get a lot right, they all fail to deliver an A+ solution.


Element Queries, From the Feet Up

Everybody’s looking for Element Queries

What are Element Queries? At a high level, I’d describe them as pure, unfiltered, rocket fuel for hyper-responsive layouts and components. More technically, they are Media Queries scoped to individual elements. Element Queries would allow you to attach Media Query break-points based on the dimensions and characteristics of an element itself, instead of the page’s viewport.

Developers have wanted Element Queries for a long time. Here’s a list of articles that convey the need, along with a few attempts to make them real:


Cross-Browser, Event-based, Element Resize Detection

UPDATE: This post has seen a significant change from the first version of the code. It now relies on a much simpler method: a hidden object element that relays its resize event to your listeners.

DOM Elements! Y U No Resize Event?

During your coding adventures, you may have run into occasions where you wanted to know when an element in your document changed dimensions – basically the window resize event, but on regular elements. Element size changes can occur for many reasons: modifications to CSS width, height, padding, as a response to changes to a parent element’s size, and many more. Before today, you probably thought this was mere unicorn lore, an impossible feat – well buckle up folks, we’re about to throw down the gauntlet.


The Oft-Overlooked Overflow and Underflow Events

A Primer on Overflow and Underflow

To level-set, I’ll define and describe what overflow and underflow are in the context of the web. Overflow is a rather simple concept you’re probably familiar with: when an element’s content takes up more space than it allows, given style or box model constraints, it causes a scrollbar to appear or the content to be cut off from view (if you set overflow: hidden;). Underflow is the less common case you probably don’t think about: an element currently in an overflown state leaves that state as a result of the element growing in size or a reduction of the amount of content within it – visually, the scrollbars disappear and all content is visible within the element. As it turns out, Firefox and WebKit browsers offer events that alert you of changes between these two flow states.
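As a rough illustration of the two flow states, here is a sketch that classifies an element by comparing its standard scroll and client dimensions, and reports only when the state flips. To be clear, this is a manual check you would have to call yourself, not the native browser events mentioned above, and the function names are made up for the example.

```javascript
// Works on any object exposing the standard DOM size properties.
function flowState(el) {
  var overflowing = el.scrollWidth > el.clientWidth ||
                    el.scrollHeight > el.clientHeight;
  return overflowing ? 'overflow' : 'underflow';
}

// Returns a check function that calls onChange only when the state flips.
function createFlowWatcher(el, onChange) {
  var last = flowState(el);
  return function check() {
    var next = flowState(el);
    if (next !== last) {
      last = next;
      onChange(next);
    }
    return next;
  };
}

// Plain object standing in for a DOM element: content is taller than the box.
var box = { scrollWidth: 200, clientWidth: 200, scrollHeight: 500, clientHeight: 300 };
console.log(flowState(box)); // overflow
```

In a page you would pass a real element and call the check function after anything that might change its size or content.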

What if I told you


FlightDeck and MetaLab: Bad Messaging Leads to Bad Times

NOTE: I spoke with Andrew Wilkinson (CEO of MetaLab) prior to releasing this post.

The Back Story

I arrived at Mozilla 4 years ago at the age of 26 with a passion for the web. Like many Mozillians, my previous job was with a private company. Mozilla was radically different from any work environment I had ever been in. Not only is Mozilla open source, it’s also open meeting, open planning, open specs, open mockups, open bug lists – yeah, lots of open. I wasn’t used to this. Not that I shied away from openness or wanted to be secretive; it simply took a while to acclimate.

One of the first projects I was tasked with when I arrived was Add-on Builder. It was to be a lightweight code environment for Firefox add-ons – mostly for beginners and people who wanted to test their add-ons in a collaborative way (think jsFiddle for Firefox Add-ons). Unfortunately, it was also the source of the most frustrating, painful event of my professional career. Given Add-on Builder was end-of-life’d a few months ago to free up resources for other developer-facing products, I thought I’d finally write about the event and what actually happened. As it turns out, it was far less interesting than the woefully inaccurate fable it mutated into. Here goes:


© 2016 Back Alley Coder
