Consortiums and Shared Ledgers: Supply Chains as a Use Case

This post is based on a presentation I gave in Hong Kong for Chain of Things and another for the Scotland Blockchain meetup. It was inspired by a recent whitepaper I co-authored with Gilbert & Tobin. Because a blockchain/shared ledger is a network of nodes coming together to share information and work together on different use cases, a consortium model is a natural way to deploy one.

This presentation focuses on: 

1) Different Trust Models that are used by the different blockchain and shared ledger technologies.

2) Privacy and confidentiality are the key design features when architecting blockchain solutions for financial services.

3) What is a Private Shared Ledger? How it differs from other solutions.

4) Why use a consortium? 

5) The different types of consortiums

6) How to Form/Structure a Consortium

7) Different examples of consortiums (public & private models) looking at supply chains

8) How Internet of Things (IOT) fits in.

Different Trust Models

Different types of networks require different trust models.  Below is a chart I put together to highlight the differences in each model.

If privacy and confidentiality are the main design features for shared ledgers, tradeoffs need to be made in order to make this work. All systems have tradeoffs, and a triangle usually helps in understanding them. In this triangle, one can only achieve two of the three corners at any time. The three most important features of blockchains/shared ledgers are: 1) Consistency, 2) Full Decentralization and 3) Enterprise Scale. So, for example, with Bitcoin and Ethereum you get full decentralization and consistency (all nodes get data replicated to them), but you can't get enterprise scale. You can shift along this triangle like a slide rule, so you can have something between full decentralization and centralization, but of course you will lose consistency or scale as you do. Many of the shared ledger/private blockchain solutions have sacrificed decentralization because of the trust model they employ. The nodes on these ledgers need to be known but not trusted, so why bother with decentralization at all? Some of these models are no different than centralized databases with the addition of blockchain "features". Once you sacrifice decentralization and synchronization by not replicating to all nodes, you can get enterprise scale, privacy and confidentiality while keeping the data consistent. Some use a node-to-node model; others use a master node which validates and achieves consensus.

The slide below shows the difference between open, permissionless ledgers and shared/distributed/permissioned ledgers.  

Shared ledgers and private blockchains have some or all of the features found in the next slide.  These are the common design features associated with blockchain solutions.

The next slide comes from the KPMG report on consensus that I co-authored and shows how many use cases there are for financial services. There are just as many consensus mechanisms, and this can be really confusing, particularly when trying to implement different use cases and figure out the design features necessary for what you want to accomplish. None of these use cases can rely on anonymity, and all of them require privacy and confidentiality. This makes most ways of doing consensus unsuitable.

In fact, many different solutions are being deployed to tackle the privacy problem. In my blog post "The Trend Towards Blockchain Privacy: Zero Knowledge Proofs", I go into deep detail about these systems and why they are being implemented. They include zk-SNARKs, Zero Knowledge Proofs, Zcash, Hawk, Confidential Transactions and State Channels.

Why Use Consortiums?

Consortiums make sense because the success of shared ledger and blockchain technologies requires significant levels of market participation, collaboration and investment.  The consortium is less about a technology solution or a particular business model.  It's more about how companies who haven't been able to trust each other in the past can come together and collaborate and share information.  The trust is shifted to the shared ledger/technology solution.  The participants just need to have similar requirements in terms of: 

  • the mix of confidentiality and transparency (as captured in the design choices and operating rules for the shared ledger platform)
  • functionality and processes
  • the approach to governance and
  • a shared view of regulation and compliance

All participants need to commit to complying with the operating rules of the consortium.  Since these participants are known to each other they can deal with each other directly, without the need for a third party and innovate in a cost effective manner.   This leads to collaboration with other consortium members where it makes sense.

These new technology instances are most powerful when you use a consortium model. Below is a diagram showing platform consortia at the bottom. Moving left to right you go from public (things like Ethereum and distributed autonomous communities (DACs)), to hybrids like Hyperledger, which is modular, to private like R3, AMIS, and Clearmatics Utility Settlement Coin. Many of these private models rely heavily on industry input and engagement. In the public model you don't pay to be a member. In Hyperledger you pay a small fee, and companies of all shapes and sizes can join, plus you pay a fee to join the Linux Foundation. The private models are much more expensive and exclusive. Incidentally, companies like R3 are also part of, and main contributors to, Hyperledger through one of its main projects, Fabric. Other major projects include Sawtooth Lake (Intel) and Iroha. All of these models are trending toward open sourcing their code to allow developers to build on top of their platforms and form communities around the technology. The DACs are truly a new business model, as they are building decentralized companies in which the network participants are part owners of the company. Centralized companies will go away and the projects are creating their own economic ecosystems. Many of these projects are raising money through Initial Coin Offerings (ICOs), and those who invest get "appcoins" and become owners and users of the network. These are essentially decentralized software protocols and are "disrupting the disruptors".

At the top right are business consortia. These can be public or private depending on the use cases and business needs of the companies and business units involved. Shared ledgers allow companies and business units to work together in a way they couldn't in the past. They provide end-to-end business processing and record keeping for corporations (particularly intra-company). This, in turn, allows companies to derive new efficiencies by sharing information, reducing costs and producing revenue-generating opportunities. Many blockchain technology providers are building their own shared ledger platforms, which companies can use to work together on use cases and to reduce inefficiencies.

Next are the critical choices that need to be made. These come directly from the report I co-authored with Gilbert & Tobin.


1) Setting the Consortium Strategy

 Setting a consortium strategy at the outset is critical to the long-term success of the business consortium:

++  Identifying new revenue opportunities, which may not have been viable outside of the shared ledger environment;

++  Identifying possibilities for working with new organisations which may not have been practical in the past – and which could deliver new opportunities for collaboration, innovation and efficiencies;

++  Identifying the target consortium members – as well as those who may not be permitted to join;

++  Considering strategically who should control the consortium; and

++  Assessing the competitive threats and the long-term strategies for the sustainability of the consortium.

2) Choosing the Ledger Platform

Selection of a shared ledger platform is not like any other technology project – status quo processes won’t work. Before evaluating potential options and selecting a “fit for purpose” platform, critical preparatory steps need to be taken to establish the basis for evaluation:

++  identifying all of the business processes and operational requirements (including regulation) that will need to be performed on an end-to-end basis; and

++  identifying how those business processes will be reconstructed on the shared ledger platform.

Private shared ledger platforms are by no means standard or uniform. Each platform provides different options for:

++  how confidentiality is achieved and how the “permissions” are designed: who can see, who can write, who reads, who validates;

++  how scalability is achieved;

++  how consensus is done (although some platform developers are questioning if consensus is even necessary);

++  how encryption and security are implemented and managed;

++  how smart contracts are linked with “real world” contracts (including hashing options), and how validation is carried out to ensure consistency between them;

++  capabilities for interfacing with information feeds (or “oracles”); and

++  capabilities for interoperability with other shared ledger platforms and legacy systems.

It is critical to choose a “fit for purpose” shared ledger platform from the outset. If the platform design doesn’t readily lend itself to delivering the consortium’s strategy, there is no assurance that workarounds will achieve the required result. Even if workarounds are possible, they may prove to be just too complex in practice. Failure to choose a “fit for purpose” platform from the outset may leave the consortium with little choice but to start all over again.

3) Choosing the Solution Design

Once the shared ledger platform is chosen, that is not the end of the matter. A technical solution design is still required, determining how the consortium’s specific strategy and operational requirements will be delivered on the shared ledger platform. This requires decision-making around:

++  the overall architecture of the solution – including how security and safeguards will be built into the technology and the processes, and how real-world governance and decision-making will be integrated with the technical solution;

++  the design of the “rules engine” and the “smart contracts system” for automated processing on the consortium (including automation of the consortium’s operating rules where practicable);

++  how to create trust and security on the shared ledger? How will the public / private encryption keys be managed? Who will hold those keys? Where will the keys be held?

++  the specific protocols to be adopted in relation to “permissions” on the shared ledger: who can see, who can write, who reads, who validates – and how consensus will be performed, if required;

++  how the technical design will prevent (as far as possible) any breach of the consortium’s operating rules – whether by the technical operator or by participants on the shared ledger;

++  the required information feeds or “oracles” to be incorporated into the design; and

++  how to achieve compliance with applicable regulatory requirements – and in financial services, this may include considerations around redundancy and other technology matters which could impact on the stability of the relevant financial markets.

4) Planning to Mitigate Potential Pitfalls

 While there are clear efficiency benefits and cost savings that shared ledgers can provide through collaboration and more effective information sharing, this doesn’t mean that competition / antitrust risks will disappear. There is no reason why consortium activity on a private shared ledger would be exempt from these laws. Under current Australian laws, consortium activity can constitute cartel conduct if the structure fails to incorporate appropriate compliance and enforcement measures. Upcoming changes to Australia’s competition laws will see the introduction of a “concerted practice” prohibition which sets a lower threshold for illegal coordinated activity than the existing cartel laws.

It is arguable that current competition laws do not take appropriate account of developments in the digital economy – at least in Australia. While this may change in the future, efficiencies and cost savings do not currently constitute any defence to cartel conduct - unless the consortium members pre-emptively seek “authorisation” from the Australian Competition and Consumer Commission. This is a public process that could take six months to complete. By comparison, if the consortium can establish well-designed operating rules and governance for the consortium, then it may be possible to rely on that framework to clearly establish the efficiency benefits – without the need to go through the public authorisation process.

5) Establishing the Consortium Framework

This is detailed in the slide below.


This slide shows how a consortium forms. Someone (the consortium promoter) decides that there is a reason for the consortium to form. Once this happens, founding members and participating members join with different benefits, or tiers of benefits, based on the operating rules of the consortium agreement, which are decided by a governance board. These rules cover the legal and structural decisions of the consortium as well as decisions to be implemented on the shared ledger, which is based on a specific technology implemented by the consortium promoter with input from all of its members. The smart contract system and the rules engine have to ensure consistency between real world contracts and smart contracts. As shown below, this is done in both private and public models, because "code is king" does not hold up.

R3's Corda matches real world contracts with smart contracts and hashes them to a blockchain. This is a decision based on best practices and governance, because smart contracts fail when you need them most: when you need dispute resolution. There isn't a way to code this in and know how a dispute will resolve. By linking the two, you can go off-chain into the real world of courts to settle disputes. Corda comes to consensus on these contracts in two ways: 1) Transaction validity: this is based on state outputs and contract code which match the actual contracts, plus the necessary signatures. It allows the counterparties to agree by independently running the same smart contract code and validation logic and confirming that the contract is the same. 2) Transaction uniqueness: this makes sure there is no double spend by checking that the inputs are unique and that no other transaction consumes the same inputs. A third party is generally involved in this piece of the consensus mechanism.
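The two ideas in this paragraph, fingerprinting the legal prose and checking that transaction inputs have not already been consumed, can be illustrated with a minimal Python sketch. This is not Corda's API: the names `contract_fingerprint` and `UniquenessService` are hypothetical, and the in-memory set stands in for whatever third party performs the uniqueness check.

```python
import hashlib
from typing import Iterable


def contract_fingerprint(legal_prose: str) -> str:
    """Return the SHA-256 hex digest of a real world contract's text.

    Storing this digest alongside the smart contract lets anyone later
    check that the legal document they hold is the one that was agreed.
    """
    return hashlib.sha256(legal_prose.encode("utf-8")).hexdigest()


class UniquenessService:
    """Hypothetical stand-in for the third party that prevents double spends."""

    def __init__(self) -> None:
        self._consumed = set()  # ids of input states already spent

    def commit(self, inputs: Iterable[str]) -> bool:
        """Accept a transaction only if none of its inputs was used before."""
        inputs = list(inputs)
        if any(state_id in self._consumed for state_id in inputs):
            return False  # some input was already consumed: reject
        self._consumed.update(inputs)
        return True
```

A second transaction that tries to reuse an input state is rejected, while the fingerprint stays the same for identical contract text and changes if a single character of the prose changes.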

CODE (Centrally Organized Distributed Entity) uses real world governance to counterbalance smart contract "code" for the Ethereum (public) blockchain in order to ensure another DAO does not happen.  An article by Zach Lebeau describes what exactly this entails: 

"CO + DE = Decentralization Generator

CODE stands for Centrally Organized Distributed Entity.

The CO — Centrally Organized component — can be represented by a number of different company structures in various jurisdictions around the world. Because MME — the Swiss legal architects of the Ethereum Foundation — were integral in the formation of the CODE structure, SingularDTV’s CO resides in Zug, Switzerland. The DE — Distributed Entity — is the component existing on Ethereum’s blockchain. Together, the CO and the DE build a bridge between the centralized legacy paradigm and the decentralized Ethereum paradigm.

The engine of SingularDTV’s CODE is its Smart Contract System (SCS). The SCS decentralizes legacy assets and places them on the blockchain. The SCS also directs the flow of decentralized assets for the purpose of funding, building, maintaining and growing real world projects. Value resulting from these real world projects are routed through the SCS and placed back onto the blockchain via SingularDTV’s tokens, SNGLS.

When functioning at its potential, the CODE acts as a decentralization generator that perpetually places centralized assets onto the blockchain, growing the Ethereum ecosystem. In the coming years, when trillions of dollars of assets are placed on the blockchain, it’s structures like the CODE that will have helped make it possible."

SUPPLY CHAINS and IOT

At Devcon 2, Tyler Smith from BHP Billiton, one of the world's largest natural resource producers, presented Project Rai Stones, which is being done in collaboration with Consensys and Block Apps using the Ethereum public blockchain. This project is a blockchain-based tracking application for wellbore samples. Smith explained that these samples are extremely expensive and cannot be replaced. The samples themselves pass through many custodians, almost all of them BHP vendors, and they are currently tracked manually using spreadsheets and email. This leads to human error, which can cause large regulatory fines when things aren't right. Smith also explained that there is a lack of transparency to business unit stakeholders and a lack of efficiency in finding metadata about these samples.

There are 3 nodes in this project: BHP, Weatherford (who is providing data analysis) and the regulators. The goal is to properly map business processes at a granular level and then understand how to map them to a blockchain architecture. This requires a simplified version of the process flow of the samples through the vendors and custodians, so that the analysis of each sample can be captured and linked to the digital object of the sample on a blockchain.

Below is a slide of the business process flow.

This allows for real-time updates on the samples while getting rid of siloed functions and making it easier to track information across functions and processes that are not as efficient as they could be. The same applies to supply chains of all sorts. Billions of dollars of value travel on supply chains each day.

Other companies, financial institutions and startups are focusing on supply chains.  IBM will use Hyperledger to build its solutions on and focus on supply chains.  Below is a slide showing other companies focusing on supply chains.

 

The other use case touched on was industry cooperation, which goes beyond the sourcing and origins of supply chains. The example Tyler Smith gave was around the public regulatory data that resource companies must give to partner countries, as set out in a contract before extraction can begin. It's not just BHP: all companies in this industry need to record and report things like production and location of assets to the partner countries, who are required to make this data available to the public. There are firms that aggregate this public data into databases for the governments and the countries, and this costs multiple billions of dollars annually. The idea is to cut out this middle man by forming a consortium in the natural resource industry to report all the public regulatory data that is required. Countries, regulators and natural resource companies can all be nodes in this consortium, publish this data in real time for all to see, and share information amongst peers. The countries themselves can save money by not needing to house this data, since the blockchain has built-in redundancy. This also improves transparency dramatically.

Since this information all needs to be open, transparent and publicly recorded, BHP has decided to use a public blockchain. There are cases, however, where private shared ledgers are necessary, particularly because of data leakage and sensitivity around the pricing of invoices and business processes, as the slide below shows. There are many different financial institutions and companies involved who would require confidentiality and encryption around the transaction and data flows. Using a public model in this case would not be warranted. In many instances, just the counterparties involved in each part of the chain might need to be involved, while the entire chain itself just knows that the business logic/process flows happened with integrity.

 

Lots of corporations, banks and startups are beginning to focus on supply chains. IBM has announced this as its focus using Hyperledger. The slide below shows other companies and startups building supply chain solutions using a ledger.

 

Elements of IOT would be needed for both models in order to ensure this integrity as value gets transferred over large distances between many parties, in sensitive climates, by land, sea and air, and in areas where there may or may not be wi-fi. The image below shows what is needed in order to make a robust supply chain enhanced by IOT.

Filament is an example of an IOT technology that works where there is no wi-fi by using radio waves.

The consortium model using shared ledgers works really well for use cases such as supply chains, where business processes and information sharing between many different counterparties are necessary from endpoint to endpoint, across both geographies and companies. However, any business process, no matter how automated, always has to leave space for human judgement, nuance and contextual understanding. This is why the consortium model needs human governance and real world legal solutions to go with blockchain/shared ledger technology to make a robust solution.

Conclusion

  • Institutional trust wasn't designed for the digital age.
  • The emergence of shared ledger technologies – empowered by consortia – is a game changer for a major trust shift, which will empower new business models and relationships between corporations and consumers. 
  • If shared ledger technologies realise their full potential, then the consortium model should thrive and be sustainable in a way that hasn’t been possible in the past.

Kadena: The First Real Private Blockchain

Note: This blog will be broken into 3 sections in order to explain the evolution of the Raft consensus algorithm: how Tangaroa and Juno attempted to fix its limitations and how its distant relative Kadena finally solved them. The 3 sections include:
  1. Introduction and the Raft Consensus Algorithm
  2. Kadena’s predecessors – Tangaroa and Juno
  3. Kadena’s blockchain – ScalableBFT, Pact, and Pervasive Determinism

 Part 1: Introduction and the Raft Consensus Algorithm

Foreword

This series of posts will cover Kadena’s Blockchain. It uses ScalableBFT to offer high performance (8k-12k transactions per second) with full replication and distribution at previously impossible scales (the capacity for more than 500 participating nodes). This, along with the multi-layered security model and incremental hashing, allows for a truly robust blockchain. Based on Raft and Juno, Kadena embeds a full smart contract language (Pact) into its blockchain that can be run as either public (plain text) or private (Double-Ratchet encrypted) transactions. It is a huge step forward in the blockchain space, possibly representing a new generation of blockchain technology entirely by its introduction of the idea of “Pervasive Determinism.”

Similar to Bitcoin, Kadena’s blockchain is tightly integrated; understanding what it is capable of and what these capabilities imply requires covering a considerable amount of ground. As such, I’ve broken the post into 3 parts: Introduction & Raft, Kadena’s predecessors – Tangaroa & Juno, and Kadena’s Blockchain – ScalableBFT, Pact and Pervasive Determinism.

 

Introduction

The history behind Kadena is an interesting case study in the new field of blockchain consensus algorithms and distributed computing. Kadena is a “distant relative” of the Raft consensus algorithm. The Raft consensus mechanism was followed by Tangaroa (a Byzantine Fault Tolerant (BFT) Raft) and the JP Morgan project Juno (a fork of Tangaroa), neither of which is under active development any longer. JP Morgan’s new blockchain Quorum is much different from Juno and uses a fusion of ideas from sidechains and Ethereum; public smart contracts are allowed on the blockchain in addition to private contracts, which are represented as encrypted hashes and replicated via side-channels. Kadena is the “next generation Juno.” It uses a new, but related, protocol called ScalableBFT that was spawned from the open source code of the Juno project and was built by the two key developers who built Juno. Before diving deep into Kadena, a brief history and description of Raft and the predecessors to Kadena are needed.

 

Raft Consensus

The Raft consensus algorithm is a single-leader-based system for managing a replicated log. It uses a replicated state machine architecture and produces a result equivalent to Paxos but is structurally different. Keeping the replicated log consistent is the job of the consensus algorithm. In this model, the leader does most of the work because it is issuing all log updates, validating transactions, and generally managing the cluster. Raft consensus guarantees a strict ordering and replication of messages. It does not care what the messages contain.

A new leader is elected using randomized timeouts. The leader periodically sends messages, called heartbeats, to its followers to show that it is still there. If a follower receives no communication from the leader before its timeout fires, it becomes a candidate and initiates an election. A candidate that receives votes from a majority of the full cluster (nodes in the network) becomes the new leader. Leaders typically operate until they fail.
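The timeout mechanics described above can be sketched in a few lines of Python. This is an illustrative sketch, not a complete Raft node: the class name `Follower`, the `tick` method and the 150-300 ms timeout range are assumptions chosen for the example, and vote requests over the network are left out.

```python
import random

# Assumed range for the randomized election timeout, in milliseconds.
ELECTION_TIMEOUT_RANGE_MS = (150, 300)


class Follower:
    """A node that becomes a candidate if no heartbeat arrives in time."""

    def __init__(self, node_id, rng=None):
        self.node_id = node_id
        self.rng = rng or random.Random()
        self.state = "follower"
        self.term = 0
        self.reset_timer()

    def reset_timer(self):
        # Randomization makes it unlikely that two nodes time out together.
        self.remaining_ms = self.rng.uniform(*ELECTION_TIMEOUT_RANGE_MS)

    def on_heartbeat(self):
        # Any communication from the leader restarts the countdown.
        self.reset_timer()

    def tick(self, elapsed_ms):
        self.remaining_ms -= elapsed_ms
        if self.state == "follower" and self.remaining_ms <= 0:
            self.state = "candidate"  # start an election for a new term
            self.term += 1
            self.reset_timer()


def wins_election(votes, cluster_size):
    """A candidate needs votes from a strict majority of the full cluster."""
    return votes > cluster_size // 2
```

In a 5-node cluster a candidate needs 3 votes; in a 4-node cluster 2 votes are not enough, which is why majorities are counted against the full cluster rather than against the nodes that happen to respond.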

The following stages are how Raft comes to consensus:

 

1.     A cluster of Raft node servers is started with every node launching as a Follower. Eventually, one node will timeout, become a candidate, gain a majority of the votes and become the leader.

2.     Each node stores a log containing commands. It is the Leader’s job to accept new commands, strictly order the commands in its log, replicate its log to its followers, and finally inform followers when to commit logs that they have replicated. The consensus algorithm thus ensures that each server’s log is in the same order.

3.     Logs are “committed” when they have been replicated to a majority of nodes. The leader gathers the replication count and, upon a majority being seen, commits its own new log entries and informs its followers to do the same.

4.     Upon “commit” the command in each log entry is evaluated by a State machine. Because Raft is indifferent to the body of the command, any state machine can process committed entries. Moreover, consensus assures that command execution always takes place in the same order, as the commands come from the Log, which is strictly ordered.

5.     State machines will remain consistent as long as command executions are deterministic.

6.     When a client sends a command to one of the servers, that server either is the leader or forwards the command to the leader. The leader collects the new command, assigns it a Log Index, encapsulates it in a Log Entry, and adds the command to the uncommitted portion of its log.

7.     Whenever the leader has uncommitted entries, it replicates this portion of the log to its followers. When the leader is informed of successful replication by a majority of the cluster, it commits the new entries and orders its followers to do the same.

8.     Whenever a new log entry is committed consensus about this entry has been reached. It is then evaluated by the (each server’s) state machine.

9.     From this point on, Raft is finished and implementers can decide how to handle responses; replying to the client or waiting for the client to query for the result.

Responses to the client are generally asynchronous.
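The replicate-then-commit stages above can be sketched as follows. This is a simplified, single-process illustration under stated assumptions: the `Leader` class and `apply_committed` function are hypothetical names, followers acknowledge entries one index at a time, and terms, networking and log repair are omitted.

```python
class Leader:
    """Tracks which log entries have reached a majority of the cluster."""

    def __init__(self, cluster_size):
        self.cluster_size = cluster_size
        self.log = []           # strictly ordered commands
        self.commit_index = -1  # index of the highest committed entry
        self.acks = {}          # log index -> ids of nodes holding the entry

    def accept(self, command):
        """Append a client command to the uncommitted portion of the log."""
        self.log.append(command)
        index = len(self.log) - 1
        self.acks[index] = {"leader"}  # the leader holds its own entry
        return index

    def on_replicated(self, follower_id, index):
        """Record a follower's acknowledgement and commit entries in order."""
        self.acks[index].add(follower_id)
        while (self.commit_index + 1 < len(self.log)
               and len(self.acks[self.commit_index + 1]) > self.cluster_size // 2):
            self.commit_index += 1
        return self.commit_index


def apply_committed(log, commit_index, state):
    """Deterministic state machine: each command is a (key, value) write."""
    for key, value in log[:commit_index + 1]:
        state[key] = value
    return state
```

Because entries are only committed in log order and the state machine is deterministic, every server that applies the same committed prefix ends up with the same state, which is the property stages 4 and 5 rely on.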

The Raft consensus protocol is just that – a consensus algorithm. It does not have a notion of, and is by default fully open to, any client issuing commands. The only participation restriction it makes is on what nodes exist at a given time. Moreover, the leader has absolute authority over the cluster and orders the followers to replicate and commit. It does not assume Byzantine attacks; it only needs to handle crash faults, because nodes are assumed to be altruistic.

 

 Part 2: Kadena’s predecessors – Tangaroa and Juno


Tangaroa: The first step towards a BFT Raft

Tangaroa is a Byzantine Fault Tolerant (BFT) variant of the Raft consensus algorithm, inspired by the original Raft algorithm and the Practical Byzantine Fault Tolerance (PBFT) algorithm. Byzantine fault tolerance refers to tolerating a class of failures caused by malicious nodes attacking the network. If some of the nodes go down, it is imperative for the network to continue running without stopping. In standard Raft, you need to replicate a log entry to a majority of nodes in the cluster before committing it. For BFT consensus algorithms, including Tangaroa, the required cluster size is at least 2f + 1, where f is the number of failures you want to tolerate (including both crashed nodes and compromised nodes). Consensus is achieved by a majority vote of the cluster; for example, to tolerate f = 3 failures the cluster needs 2f + 1 = 7 nodes, of which at least 4 are non-Byzantine. Some BFT protocols can even require 3f + 1.
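The arithmetic in this paragraph is easy to check with a short Python sketch. The function names and the `factor` parameter are hypothetical; the sketch only restates the 2f + 1 and 3f + 1 formulas mentioned above.

```python
def min_cluster_size(f, factor=2):
    """Smallest cluster that tolerates f failures under a factor*f + 1 rule."""
    return factor * f + 1


def majority(n):
    """Smallest number of nodes that forms a strict majority of n."""
    return n // 2 + 1


# Tolerating 3 failures under the 2f + 1 rule described above.
n = min_cluster_size(3)
print(n, majority(n))  # 7 nodes, majority of 4

# The same tolerance under a 3f + 1 rule.
print(min_cluster_size(3, factor=3))  # 10 nodes
```

With f = 3 the 2f + 1 rule gives 7 nodes and a majority of 4, matching the example in the text, while a protocol that requires 3f + 1 would need 10 nodes for the same tolerance.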

A Byzantine Leader can decide to arbitrarily increase the commit index of other nodes before log entries have been sufficiently replicated, thus causing safety violations when nodes fail later on. Tangaroa shifts the commit responsibility away from the leader, and every node can verify for itself that a log entry has been safely replicated to a quorum of nodes and that this quorum agrees on an ordering.

Tangaroa allows clients to interrupt the current leadership if it fails to make progress, in the same way that other BFT consensus algorithms allow the client to behave as a trusted oracle to depose certain nodes. This allows Tangaroa to prevent Byzantine leaders from starving the system, but it is very trusting of the client.

 

Leader Election and Stages

Tangaroa uses Raft as the foundation for consensus; thus there is a single leader.  In Tangaroa, as in Raft, each node is in one of the three states: leader, follower, or candidate. Similar to Raft, every node starts as a follower, one of which will eventually timeout and call an election. The winner of the election serves as the leader for the rest of the term; terms end when a new leader is elected. Sometimes, an election will result in a split vote, and the term will end with no leader; in this case, a follower will again time out (timeouts are reset when a vote is cast or an election is called) and start the voting process again.

To begin an election, a follower increments its current term and sends RequestVote (RV) Remote Procedure Calls (RPCs) in parallel to each of the other nodes in the cluster asking for their vote. The RPCs Tangaroa uses are similar to Raft’s RPCs, with the exception that every RPC is signed and validated via public-private key (PPK) signatures. RPCs allow for data exchange between different computers residing on a network, and the signatures allow receiving nodes to verify which node sent the RPC, in addition to allowing any node to forward any other node’s RPC at any time.

When a Tangaroa node receives a RV RPC with a valid signature, it grants a vote immediately only if it does not currently have a leader (only occurs at startup). Otherwise, it begins the process that Tangaroa calls a “LazyVote.” Lazy voting’s purpose is to protect non-Byzantine Followers from electing a new leader when the leader is not faulty; without lazy voting, a byzantine node could trigger repeated elections at any time and starve the system. When a new RV is received by a follower, it saves the RV and waits for all of the following conditions to be met:

a)    The follower’s election timeout fires before it handles a heartbeat from its current leader. If a heartbeat is received, the Lazy Vote is cleared.

b)    The RV’s new term is greater than its current term.

c)     The request sender is an eligible candidate (valid PPK signature and the client hasn’t banned the node).

d)    The node receiving the RV has not voted for another leader for the proposed term.

e)    The candidate shares a log prefix with the node that contains all committed entries. A node always rejects the request if it is still receiving heartbeat messages from the current leader, and it ignores the RequestVote RPC if the proposed term has already begun.

If a RequestVote is valid and for a new term, and the candidate has a sufficiently up to date log, but the recipient is still receiving heartbeats from the current leader, it will record its vote locally, and then send a vote response if the node itself undergoes an election timeout or hears from a client that the current leader is unresponsive. Under lazy voting, a node does not grant a vote to a candidate unless it believes the current leader is faulty.  This prevents nodes that start unnecessary elections from gaining the requisite votes to become leader and starve the system.
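The lazy voting conditions (a) through (e) can be sketched as a small state machine. This is only an illustration under stated assumptions, not Tangaroa's actual code: the class and field names (Follower, RequestVote, signature_ok, has_committed_prefix) are hypothetical, and signature checking and log comparison are reduced to boolean flags that a real node would compute itself.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RequestVote:
    candidate: str
    term: int
    signature_ok: bool          # assumed: PPK signature already verified
    has_committed_prefix: bool  # assumed: candidate's log holds all committed entries


@dataclass
class Follower:
    current_term: int
    banned: set = field(default_factory=set)         # nodes banned by the client
    votes_cast: dict = field(default_factory=dict)   # term -> candidate voted for
    timed_out: bool = False                          # election timeout fired, no heartbeat
    pending: Optional[RequestVote] = None            # the saved "lazy vote"

    def receive_request_vote(self, rv):
        self.pending = rv            # save the request; do not vote yet

    def on_heartbeat(self):
        self.pending = None          # (a) a heartbeat clears the lazy vote
        self.timed_out = False

    def on_election_timeout(self):
        self.timed_out = True

    def try_vote(self):
        rv = self.pending
        if rv is None:
            return None
        ok = (
            self.timed_out                           # (a) leader looks faulty
            and rv.term > self.current_term          # (b) newer term
            and rv.signature_ok                      # (c) valid signature
            and rv.candidate not in self.banned      # (c) not banned
            and rv.term not in self.votes_cast       # (d) no other vote this term
            and rv.has_committed_prefix              # (e) log is up to date
        )
        if not ok:
            return None
        self.votes_cast[rv.term] = rv.candidate
        self.current_term = rv.term                  # term is updated once the vote is sent
        self.pending = None
        return rv.candidate
```

The point of the design is visible in try_vote: a request that arrives while heartbeats are still flowing is remembered but never answered, so a node that calls needless elections cannot collect votes.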

Nodes wait until they believe an election needs to occur before ever casting a vote. Once a vote is sent, the node will update its term number. It does not, however, assume that the node it voted for won the election, and it will still reject AppendEntries (AE) RPCs from the candidate if none of them contain a set of votes proving the candidate won the election. AEs serve the dual purpose of heartbeats and carriers of new log entries that need replication. The candidate continues in the candidate state until one of three things happens:

(a)  It wins the election by receiving a majority vote from the cluster. A candidate must save these votes—RequestVoteResponse (RVR) RPCs—for future distribution.

(b)  Another node establishes itself as a leader

(c)   A period of time goes by with no winner (i.e., it experiences another election timeout)

A candidate that wins the election then promotes itself to the leader state and sends AE heartbeat messages that contain the votes that elected it and the updated term number, to establish its authority and prevent new elections. The signed votes effectively prevent a Byzantine node from arbitrarily promoting itself as the leader of a higher term. Moreover, each follower performs a recount on the aforementioned majority vote, validating and counting each vote the new leader transmitted to independently verify the validity of the election.
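The follower-side recount can be sketched in a few lines. This is a hedged illustration: the vote layout (a dict with voter, term and sig keys) and the verify callback are assumptions of mine, standing in for real PPK signature verification.

```python
def recount(votes, term, cluster, verify):
    """Return True if the votes prove a majority of the cluster elected a leader for `term`.

    votes   -- iterable of dicts with hypothetical keys "voter", "term", "sig"
    cluster -- set of known node ids
    verify  -- callable(vote) -> bool, a stand-in for signature verification
    """
    voters = {
        v["voter"]
        for v in votes
        if v["term"] == term and v["voter"] in cluster and verify(v)
    }
    # A set is used so that a replayed vote from the same node counts once.
    return len(voters) > len(cluster) // 2


if __name__ == "__main__":
    cluster = {"a", "b", "c", "d", "e"}
    check = lambda v: v["sig"] == "ok"
    votes = [
        {"voter": "a", "term": 4, "sig": "ok"},
        {"voter": "b", "term": 4, "sig": "ok"},
        {"voter": "b", "term": 4, "sig": "ok"},      # duplicate, ignored
        {"voter": "c", "term": 4, "sig": "forged"},  # fails verification
    ]
    print(recount(votes, 4, cluster, check))  # False: only 2 valid voters out of 5
```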

 

Governance

Like Raft, Tangaroa uses randomized timeouts to trigger leader elections. The leader of each term periodically sends heartbeat messages (empty AE RPCs) to maintain its authority. If a follower receives no communication from a leader over a randomly chosen period of time, the election timeout, then it becomes a candidate and initiates a new election.  In addition to the spontaneous follower-triggered elections, Tangaroa also allows client intervention: when a client observes no progress with a leader for a period of time called the progress timeout, it broadcasts UpdateLeader RPCs to all nodes, telling them to ignore future heartbeats from what the client believes to be the current leader in the current term. These followers will ignore heartbeat messages in the current term and time out as though the current leader had failed, starting a new election.

 

Data Received

The data (new commands) come from clients of the Raft cluster, who send requests to the leader. The leader replicates these requests to the cluster, and responds to the client when a quorum is reached in the cluster on that request. What constitutes a "request" is system-dependent.  How data is stored is system-dependent. It's important for state to persist to disk so that nodes can recover and remember information that they have committed to (which nodes they voted for, what log entries they have committed, etc.). The protocol can't work without this.

Tangaroa adds BFT to Raft evolution

Juno

The JP Morgan project Juno is a fork of Tangaroa and was a proof of concept that was able to scale Tangaroa to include up to 50 nodes and get transaction speed up to 5000 transactions per second. The JPM team behind Juno saw the potential that a Tangaroa-like approach represented: a high performance private blockchain. They iterated on the idea for a year and open sourced the project in February ‘16; they added a smart contract language, fixed some design mistakes and succeeded in achieving a 10x performance increase.  Juno allowed the number of voting nodes to change while the system was running, so nodes could be added and removed.  It was also a permissioned distributed system in which all of the nodes in the network were known.

The stages of the mechanism and the leader election process are the same as Tangaroa (see above).  Similarly, a transaction is considered live once it is fully replicated and committed to the log.  The leader decides the order of the commands and every node validates.  Each node independently decides when to commit a log entry based on evidence it receives from other nodes.  Every log entry is individually committed and incrementally hashed against the previous entry.  It takes approximately 5ms, plus network latency, for a single log entry to go from the leader receiving the entry to full consensus being reached.

 

Kadena: The First Real Private Blockchain

 Part 3: Kadena’s blockchain – ScalableBFT, Pact, and Pervasive Determinism

Foreword

This series of posts will cover Kadena’s Blockchain. It uses ScalableBFT to offer high performance (8k-12k transactions per second) with full replication and distribution at previously impossible scales (the capacity for more than 500 participating nodes). This, along with the multi-layered security model and incremental hashing, allows for a truly robust blockchain. Based on Raft and Juno, Kadena embeds a full smart contract language (Pact) into its blockchain that can be run as either public (plain text) or private (Double-Ratchet encrypted) transactions. It is a huge step forward in the blockchain space, possibly representing a new generation of blockchain technology entirely by its introduction of the idea of “Pervasive Determinism.”

Similar to Bitcoin, Kadena’s blockchain is tightly integrated; understanding what it is capable of and what these capabilities imply requires covering a considerable amount of ground. As such, I’ve broken the post into 3 parts: Introduction & Raft, Kadena’s predecessors – Tangaroa & Juno, and Kadena’s Blockchain – ScalableBFT, Pact and Pervasive Determinism.

 

Cryptography

Unlike in Raft, each replica in a BFT Raft system (a family of algorithms that includes Tangaroa, Juno, and Kadena’s ScalableBFT) computes a cryptographic hash every time it appends a new entry to its log. The hash is computed over the previous hash and the newly appended log entry. A node can sign its last hash to prove that it has replicated the entirety of a log, and other servers can verify this quickly using the signature and the hash.  BFT Raft nodes and clients always sign before sending messages and reject messages that do not include a valid signature.

BFT Rafts use incremental hashing, enabling nodes to be certain that both the contents and ordering of other nodes’ logs match their own. Using this knowledge, nodes can independently commit log entries safely because both the contents and ordering of other nodes’ logs are attested to via matching incremental hashes. BFT Rafts use digital signatures extensively to authenticate messages and verify their integrity.  This prevents a Byzantine leader from modifying the message contents or forging messages and protects the cluster generally from a large number of Byzantine attacks.
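A minimal sketch of incremental hashing, using Python's standard hashlib. The choice of BLAKE2b, the all-zero seed and the function names are assumptions for illustration; the only property that matters here is that each hash covers the previous hash plus the new entry.

```python
import hashlib

SEED = bytes(64)  # assumed starting value: 64 zero bytes, the size of a BLAKE2b digest


def incremental_hash(prev_hash, entry):
    """Hash of the previous hash concatenated with the newly appended entry."""
    return hashlib.blake2b(prev_hash + entry).digest()


def hash_log(entries):
    """Return the incremental hash after each entry of the log."""
    hashes, h = [], SEED
    for entry in entries:
        h = incremental_hash(h, entry)
        hashes.append(h)
    return hashes


if __name__ == "__main__":
    a = hash_log([b"tx1", b"tx2", b"tx3"])
    b = hash_log([b"tx1", b"tx2", b"tx3"])
    c = hash_log([b"tx2", b"tx1", b"tx3"])   # same contents, different order
    print(a[-1] == b[-1])  # True: identical logs give identical final hashes
    print(a[-1] == c[-1])  # False: reordering changes every later hash
```

Because the final hash depends on every earlier one, two nodes that agree on the hash at index N agree on the contents and order of the whole log up to N.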

 

Consensus

In Raft, a Leader is elected via randomized timeouts that trigger a Follower to propose itself as a Candidate and request votes. ScalableBFT also does this, but in a cryptographically secured way. For instance, if a Leader becomes unreachable, a timeout would trigger a new election but the election process is robust against Byzantine nodes declaring elections. ScalableBFT fixes the issues that Juno and Tangaroa encountered regarding lazy voting.

The Leader’s only unique capabilities are (1) ordering of new transactions prior to replication and (2) replicating new transactions to Follower nodes. From that point on, all nodes independently prove both consensus validity and individual transaction integrity.

The removal of anonymous participation is a design requirement for private blockchains, and this allowed for a high performance BFT Consensus mechanism to replace mining. ScalableBFT’s primary addition to the family of BFT Rafts is the ability to scale into the 1000’s of nodes without decreasing the system’s throughput.

Every transaction is replicated to every node. When a majority of nodes have replicated the transaction, the transaction is committed. Nodes collect and distribute information (incremental hashes) about what they have replicated and use this information to independently decide when to commit (>50% of other nodes send them incremental hashes for uncommitted transactions that they agree with). It basically works by doing a majority vote on what to commit. Committing a transaction doesn’t mean it will be executed, just that it has been permanently replicated by a majority of the cluster. Bad transactions, ones that error or have bad signatures, are replicated as well, as consensus’ job is to provide perfect ordered replication. Committing a transaction allows each node to then independently evaluate (parse/decrypt/validate crypto/execute/etc…) each transaction in an identical way. Every transaction gets paired with an output; this can range from “bad crypto” to the output of the smart contract layer (which can also be an error).

Finally, besides the leader replicating new transactions to every node, the nodes are more or less independent. Instead of “syncing” they broadcast “I have replicated up to log index N and it has an incremental hash of H” to the cluster and collect this information from other nodes; based on the results from other nodes each node can independently decide if the cluster has replicated past the bar needed to commit (a majority replication for some as of yet uncommitted log index N). Here’s the subtle part: the incremental hash implies replication of all that came before it. If the leader replicates 8k new transactions (which is what it currently does) each node need only distribute and gather evidence for the last transaction of that batch, as it implies correct replication of the ones that came before it. Instead of sending 8k messages (one for each transaction) that attest to proper replication, nodes only discuss the latest transaction. This is why Kadena needed so much pipelining: the team figured out how to commit 8k transactions at the same speed as committing a single transaction.
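The commit decision described above can be sketched as follows. This is an illustration under my own assumptions rather than Kadena's implementation: hashes are plain strings, each peer's latest claim is an (index, hash) pair, and the deciding node counts itself toward the majority.

```python
def commit_index(own_hashes, evidence, cluster_size):
    """Highest log index that a majority of the cluster has provably replicated.

    own_hashes   -- own_hashes[i] is this node's incremental hash at log index i
    evidence     -- dict: peer id -> (index, hash), the peer's latest claim
    cluster_size -- total number of nodes, including this one
    Returns -1 when there is not enough evidence (in which case: do nothing).
    """
    # A claim only counts if it matches our own hash at that index. A match at
    # index N implies the peer also holds everything before N.
    agreeing = sorted(
        (idx for idx, h in evidence.values()
         if idx < len(own_hashes) and own_hashes[idx] == h),
        reverse=True,
    )
    needed_peers = cluster_size // 2       # plus this node makes a majority
    if needed_peers == 0:
        return len(own_hashes) - 1
    if len(agreeing) < needed_peers:
        return -1
    return agreeing[needed_peers - 1]


if __name__ == "__main__":
    own = ["h0", "h1", "h2", "h3"]
    claims = {"b": (3, "h3"), "c": (1, "h1"), "d": (2, "bad")}
    print(commit_index(own, claims, cluster_size=5))  # 1
```

In the example only two peers agree with this node, one up to index 3 and one up to index 1, so a majority of three nodes is only established up to index 1. One message per peer is enough to settle the whole batch.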

ScalableBFT represents a breakthrough in the field of BFT consensus, as it is the first and only deterministic BFT consensus mechanism that can scale past 100’s of nodes with full replication and encryption.  ScalableBFT also provides a unique security model known as pervasive determinism (discussed below), which provides security not just at the transaction level but at the consensus level as well, while encrypting each and every transaction using the Noise Protocol.

 

Kadena Uses Deterministic Consensus

The consensus mechanism is deterministic if the consensus process is fully specified in the protocol and this process does not employ randomness. As was stated above, Raft, for example, uses randomized timeouts to trigger elections when a leader goes down (because the leader can't communicate "I'm about to crash" so instead there's a timeout that trips to prompt a node to check if the leader is down) but the election isn't part of consensus at the transaction level, it is instead a means to finding a node to orchestrate consensus.

ScalableBFT is deterministic and hardened such that:

 

1.     Each node will commit only when they have a majority of the cluster agreeing with them.

2.     The evidence of agreement must be fully auditable at any time.

3.     When lacking evidence of agreement do nothing.

Kadena is specifically designed for permissioned networks, and as such it assumes that certain attacks (like a DoS) are unlikely and are out of its control. If one were to occur, the system would either lock (all nodes eventually time out, but an election would never succeed) or sit idle. Once such an event ends, the nodes will come back into consensus and things will get back to normal. However, in a permissioned network administrators would have full control and could kill the connection causing the issue.

 

 

Leader Election

Leader election is very similar to Raft in that any node can be elected leader, every node gets one vote per term, and elections are called when the randomized timeout of one of the nodes fires (the timer is reset every time a node hears from the leader). The biggest difference is that in Raft a node that gets enough votes assumes leadership, whereas in ScalableBFT a node that gets a majority of votes distributes those votes to every other node to demonstrate (in a BFT way) that it has been elected the leader by the cluster. ScalableBFT’s mechanism fixes issues seen in Juno and Tangaroa, like a “Runaway Candidate,” where a non-Byzantine node has timed out due to a network partition but, because its term has been incremented, it can’t come back into consensus and instead continues to time out and then increment its term (“Runaway”).

Raft consensus guarantees a strict ordering and replication of messages; it doesn’t matter what’s in each message, which can range from random numbers to ciphertext to plain-text smart contracts. Kadena leverages the log layer as a messaging service when running in an encrypted context; much like Signal can run Noise protocol encryption over SMS, ScalableBFT runs Noise over a blockchain. ScalableBFT adds consensus robustness, which the layer that deals with interpreting the messages assumes as a guarantee, but also incremental hashes that assure perfect replication of messages. Noise protocol slots between consensus and smart contract execution, encrypting/decrypting messages as needed; because the messages are ciphertext, only some of the normal tricks for avoiding a Cartesian blowup of live tests are needed to run per message without leaking information.

Security Model/Pervasive Determinism

Kadena uses the term “pervasive determinism” to describe “the idea of a blockchain that uses PPK-Sig based cryptography for authorship guarantees (like bitcoin) and is composed of a fully deterministic consensus layer in addition to a Turing-incomplete, single-assignment smart contract layer. The implications of a ‘pervasively deterministic’ blockchain are rather profound, as it allows for a bitcoin-ledger class of auditability to be extended deep into the consensus layer by chaining together multiple layers of cryptographic trust. Take as an example a transaction that loads a new smart contract module called “loans”. Say “loans” imports another module called “payments” that is already present in the chain. The successful import of “payments” alone implies the following (with each being fully auditable by cryptographic means):

      who signed the transaction that loaded “payments”

      what consensus nodes were in the cluster at the time of loading

      what consensus nodes agreed that the transaction was valid

      what nodes voted for the current leader at the time of loading

      who the leader was

      who the previous leader was

      etc.

 A pervasively deterministic system allows new transactions to leverage not only the cryptographic trust that naturally occurs as transactions are chained together in a blockchain, but also the trust of how those transactions entered the ledger in the first place. In so doing, you can create a system more secure than Bitcoin because the consensus process becomes as cryptographically trusted, auditable, and entangled as well, with transaction level executions implying that specific consensus level events occurred and with each implication being cryptographically verifiable.”

This provides BFT not just for the consensus layer but for the transaction layer (Bitcoin already does this) as well.  This is different from, say, PBFT which assumes that transactions sent from the client’s server are valid which leaves them with an ability to be compromised. Moreover, non-Raft BFTs generally entrust the client with the ability to depose/ban nodes. Pervasive Determinism takes an alternative viewpoint: trust nothing, audit everything.

Allowing ScalableBFT to incorporate pervasive determinism creates a completely paranoid system that is robust at each and every layer via permanent security (i.e. a form of cryptographic security that can be saved to disk). It has Bitcoin’s security model for transactions, extends this model to the consensus level, and adds smart contracts without the need for mining or the tradeoffs that most in the industry have become accustomed to. It’s a real blockchain that’s fast and scalable.

I asked Will Martino (co-founder of Kadena) for the specifics of how this worked for each layer:

What is your consensus-level security model?

 For replication, Kadena uses an incrementally hashed log of transactions that is identically replicated by each node. The nodes agree on the contents of the log via distributed signed messages containing the incremental hash of a given log index, which are then collected by other nodes and used to individually reason about when a commit is warranted. No duplicates are allowed in the log and replication messages from the leader containing any duplicates are rejected immediately. We use blake2 hashes and Term number to define uniqueness, allowing clients of the system to not worry about sending duplicates by accident nor about a malicious node/Man in the middle (MITM) resubmitting commands. We employ permanent security, a PPK sig based approach to authorship verification (or any type of approach that can be saved to disk) that is very similar to how bitcoin verifies transactions but at the consensus level (in addition to the transaction level). This is opposed to ephemeral security which uses secured channels (TLS) for authorship validation, a vastly inferior approach where the question “who sent the transaction X?” is answered not via PPK cryptography but via a consensus-level query because any individual node is incapable of providing a BFT answer.

 What is your transaction-level security model?

 The ideas of ephemeral and permanent security span both the consensus and transaction level, as it is consensus that hands the smart contract execution layer individual transactions. At the smart contract/transaction level we also use permanent security as well, supporting row level public key authorization natively in Pact. This is important because ephemeral implies that an attacker is one server away from impersonating an entity; secured channels work by point to point distribution of new transactions by the client/submitter to the cluster nodes over TLS and consensus secures that a given transaction should be committed and replicated. However, if an attacker hacks the client server holding the other end of the TLS connection, they can transact as if they were the client without the cluster being the wiser. Permanent security, on the other hand, has many keys for individual roles in a given entity thus requiring an attacker to gain access to the individual keys; further, with permanent security the CEO’s transactions are signed with a different key than the Mail Clerk’s transactions vs ephemeral where the “who is sending this transaction” is determined by a “from: X” field. If the same TLS connection is used to submit both the CEO’s and the Clerk’s transactions, then the authorship and authorization logic is a “because I said so/trust me” model vs a PPK-sig approach where you verify against the appropriate key before execution. Kadena’s Blockchain is designed to trust as little as possible; if we knew of a more paranoid or fine-grained approach than row-level PPK signatures we’d use that instead.

 What is your confidential transaction model?

 We use Double-Ratchet protocol (what Signal, WhatsApp, etc… use for encrypted communications) embedded into the blockchain (encrypted transaction bodies) for multi-party privacy preserving use cases. We work with the notion of disjoint databases via the `pact` primitive in Pact – they describe a multiphase commit workflow over disjoint databases via encrypted messages.

 

Smart Contracts

Pact is a full smart contract language whose interpreter is built in Haskell.  In Kadena every transaction is a smart contract and the Pact smart contract language is open sourced. Pact is database-focused, transactional, Turing-incomplete, single-assignment (variables cannot be changed in their lifetime), and thus highly amenable to static verification. Pact is also interpreted – the code you write is what executes on chain – whereas Solidity is compiled, making it difficult to verify code, and also impossible to rectify security issues in old language versions, once compiled. Pact ships with its own interpreter, but can run in any deterministic-input blockchain, and can support different backends, including commercial RDBMS. In the ScalableBFT blockchain, it runs with a fast SQLite storage layer.

 

 

Characteristics of the Kadena Blockchain

The Kadena Blockchain contains all these features:

 

In conclusion, Kadena has developed a fully replicating, scalable and deterministic consensus algorithm for private blockchains with high performance.  This blockchain solution can be a giant leap forward for financial services companies looking to employ a real private solution that remains true to many of the key bitcoin blockchain features without mining (Proof of Work), anonymity and censorship resistance while catering to the key design features that financial services are craving particularly scalability and confidentiality. 


The Trend Towards Blockchain Privacy: Zero Knowledge Proofs

One of the bigger trends in the blockchain world, particularly when it comes to financial services and specifically capital markets operations, has been a need for privacy and confidentiality in the course of daily business.  This has meant that blockchain solutions are being designed with this primary need in mind.  This has led to all the private blockchain solutions being developed today.

When you build for privacy and confidentiality there are tradeoffs that come with that. Mainly you lose transparency, which was the major feature of the first blockchain: Bitcoin.  As originally designed, a blockchain is a transparency machine.  In this system, the computers are distributed and no one entity controls the network.  Not only this, but anyone can be a validator and anyone can write to or read from the network.  Clients and validators can be anonymous and all the data gets stored locally in every node (replication).  This makes all transaction data public. The security of Bitcoin is made possible by a verification process in which all participants can individually and autonomously validate transactions.  While Bitcoin addresses the privacy problem by issuing pseudonymous addresses, it is still possible to find out whose addresses they are through various techniques.

This is the polar opposite of what is happening in the private blockchain world, where decentralization and transparency are not deemed as necessary for many capital markets use cases.  What is important is privacy and confidentiality, latency (speed) and scalability (the ability to maintain high performance as more nodes are added to the blockchain). Many of these systems use encrypted node to node (n2n) transactions, where only the two parties involved in the transaction receive data.  In many of these systems there are opt-ins for third party nodes (regulators) to be a part of the transaction.  Other systems being developed for similar purposes, which have been written about on this blog, have one designated block generator which collects and validates proposed transactions, periodically batching them together into a new-block proposal.  Consensus is provided by a Generator that applies rules (validates) agreed to by the nodes (chain cores) to the block and designated block signers.

In these systems, decentralization is simply not necessary because all the nodes are known parties.  In private blockchains the nodes must be known in order to satisfy certain regulatory and compliance requirements.  Because the parties are known, there are ways for legal recourse even between parties who don't necessarily trust each other.  The focus has therefore been on how to preserve privacy and confidentiality while achieving speed, scalability, and network stability.

Strong, Durable Cryptographic Identification

What is Cryptography and Encryption?

As noted above with privacy and confidentiality being pivotal, encryption has become a major focus for all blockchains.  Many of these solutions are using advanced cryptographic techniques that provide strong mathematically provable guarantees for the privacy of data & transactions. 

In a recent blog post titled "A Gentle Reminder About Encryption", Kathleen Breitman of R3CEV succinctly provides a great working definition:

"Encryption refers to the operation of disguising plaintext, information to be concealed. The set of rules to encrypt the text is called the encryption algorithm. The operation of an algorithm depends on the encryption key, or an input to the algorithm with the message. For a user to obtain a message from the output of an algorithm, there must be a decryption algorithm which, when used with a decryption key, reproduces the plaintext."

If an encryption scheme also allows computation directly on the ciphertext, you get homomorphic encryption, and this (combined with digital signature techniques) is the basis for the cryptographic techniques which will be discussed in this post.  Homomorphic encryption allows for computations to be done on encrypted data without first having to decrypt it.  In other words, this technique allows the privacy of the data/transaction to be preserved while computations are performed on it, without revealing that data/transaction.  Only those with decryption keys can access what exactly that data/transaction was.

Homomorphic encryption means that decrypt(encrypt(A) + encrypt(B)) == A+B. This is known as homomorphic under addition.

So a computation performed on the encrypted data, when decrypted, is equal to the same computation performed on the unencrypted data.
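To see additive homomorphism in action, here is a toy sketch of the Paillier cryptosystem, which is one well-known additively homomorphic scheme (the post does not say which scheme any particular blockchain uses, so treat the choice as mine). The primes are tiny and the function names are illustrative; this is not secure and needs Python 3.8+ for the modular inverse via pow. Note that in Paillier the "+" on ciphertexts is carried out as a multiplication modulo n squared.

```python
import math
import random


def keygen(p, q):
    """Build a toy Paillier key pair from two small primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    g = n + 1
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)      # inverse of L(g^lam mod n^2)
    return (n, g), (lam, mu)


def encrypt(pub, m):
    n, g = pub
    while True:                                          # random r coprime to n
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Combine two ciphertexts so that the result decrypts to the sum of the plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    pub, priv = keygen(61, 53)
    total = add_encrypted(pub, encrypt(pub, 17), encrypt(pub, 25))
    print(decrypt(pub, priv, total))  # 42, computed without decrypting either input
```

The sum only comes out right while A + B stays below n, which is one reason real deployments use very large keys.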

The key question being asked is: How can you convince a system of a change of state without revealing too much information?

After all,  blockchains want to share a (change of) state; not information.  On a blockchain, some business process is at state X and now moves to state Y, this needs to be recorded and proved while preserving privacy and not sharing a lot of information.  Furthermore, this change of state needs to happen legally, otherwise there is a privacy breach.

Cryptographic techniques like zero knowledge proofs (ZKPs), which use different types of homomorphic encryption, separate:

1) reaching a conclusion on a state of affairs

2) the information needed to reach that state of affairs

3) showing that that state is valid.

The rest of this post will discuss how the trend towards privacy has led to cryptographic techniques, some old and some new, being used to encrypt transactions and the data associated with them from everyone except the parties involved.  The focus will be on Zero Knowledge Proofs, zk SNARKs, Hawk, confidential signatures, state channels and homomorphic encryption.

The privacy problem on a blockchain is the main gap to deployment that all of the cryptographic solutions discussed below are trying to close.

Outside of a blockchain, there are examples of homomorphic encryption in practice. CryptDB is an example of a system that uses homomorphic encryption and other attribute preserving encryption techniques to query databases securely. It is used in production at Google and Microsoft amongst other places. It does have limitations though: you have to define the kinds of queries you want ahead of time and it is easy to leak data.  CryptDB provides confidentiality for data content and for names of columns and tables; however CryptDB does not hide the overall table structure, the number of rows, the types of columns, or the approximate size of data in bytes.  One method CryptDB uses to encrypt each data item is onioning, which places each data item in layers of increasingly stronger encryption.

Confidential Transactions

Gregory Maxwell designed a cryptographic tool, Confidential Transactions (CT), to improve the privacy and security of Bitcoin-style blockchains. It keeps the amounts transferred visible only to participants in the transaction. CTs make the transaction amounts and balances private on a blockchain through encryption, specifically additively homomorphic encryption.  What users can see are the balances of their own accounts and the transactions that they are receiving.  Zero knowledge proofs are needed to demonstrate to the blockchain that none of the encrypted outputs contain a negative value.
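The additive property that Confidential Transactions rely on can be sketched with Pedersen-style commitments. This is a toy under loud assumptions: the modulus and the generators G and H are picked only so the arithmetic runs, they offer no security (nobody should know how H relates to G in a real scheme), and the range proofs that rule out negative values are left out entirely.

```python
P = 2**127 - 1   # a Mersenne prime, used here as a toy modulus
G, H = 3, 5      # toy generators (assumption; not safely chosen)


def commit(value, blinding):
    """Hide `value` behind a random `blinding` factor: G^value * H^blinding mod P."""
    return (pow(G, value, P) * pow(H, blinding, P)) % P


def combine(commitments):
    """Multiplying commitments adds the hidden values and blinding factors."""
    acc = 1
    for c in commitments:
        acc = (acc * c) % P
    return acc


def balances(inputs, outputs):
    """True if the hidden inputs and outputs sum to the same amount."""
    return combine(inputs) == combine(outputs)


if __name__ == "__main__":
    ins = [commit(5, 111), commit(7, 222)]    # 5 + 7 coming in
    outs = [commit(4, 300), commit(8, 33)]    # 4 + 8 going out, blinding sums match
    print(balances(ins, outs))                # True, yet no amount was revealed
```

A verifier who sees only the commitments can confirm that nothing was created out of thin air, which is the check a blockchain needs.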

The problem with Confidential Transactions is that they only allow for very limited proofs, as mentioned above.  zkSNARKs and Zero Knowledge Proofs (ZKPs), which will be described in detail below, allow you to prove virtually any kind of transaction validation while keeping all inputs private.

Zero Knowledge Proofs (ZKPs)

Zero Knowledge Proofs (ZKPs) are not new.  They were first conceptualized in 1985 in the paper "The Knowledge Complexity of Interactive Proof Systems."  A ZKP is a cryptographic technique which allows one party (the prover) to prove to another (the verifier) that a proposition is true, without revealing any information about it apart from it being true. In the case of cryptocurrencies and blockchains this will generally be data about transactional information.

"A zero-knowledge proof must satisfy three properties:

  1. Completeness: if the statement is true, the honest verifier (that is, one following the protocol properly) will be convinced of this fact by an honest prover.
  2. Soundness: if the statement is false, no cheating prover can convince the honest verifier that it is true, except with some small probability.
  3. Zero-knowledge: if the statement is true, no cheating verifier learns anything other than this fact. This is formalized by showing that every cheating verifier has some simulator that, given only the statement to be proved (and no access to the prover), can produce a transcript that "looks like" an interaction between the honest prover and the cheating verifier.

The first two of these are properties of more general interactive proof systems. The third is what makes the proof zero-knowledge."
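
As an illustration of these properties, here is a minimal Python sketch of a classic interactive proof, the Schnorr identification protocol, in which a prover shows knowledge of a secret exponent x behind a public value y without revealing x. It is not the construction used by any system discussed in this post; the toy group parameters and function names are assumptions and are far too small to be secure, and the zero-knowledge guarantee of this protocol holds against an honest verifier.

```python
import secrets

# Toy parameters (assumption, NOT secure): G generates the subgroup of
# prime order Q inside the integers modulo the safe prime P = 2*Q + 1.
P = 1019
Q = 509
G = 4

def keygen():
    """Return (secret x, public y = G^x mod P)."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)

def prover_commit():
    """Step 1: the prover picks a random nonce k and sends t = G^k."""
    k = secrets.randbelow(Q)
    return k, pow(G, k, P)

def verifier_challenge():
    """Step 2: the verifier replies with a random challenge c."""
    return secrets.randbelow(Q)

def prover_respond(x: int, k: int, c: int) -> int:
    """Step 3: the prover answers with s = k + c*x mod Q."""
    return (k + c * x) % Q

def verify(y: int, t: int, c: int, s: int) -> bool:
    """The verifier accepts if G^s == t * y^c (mod P)."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P
```

Completeness shows up as an honest transcript always verifying; soundness shows up as a response computed without the right secret failing the check except with small probability.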

zk-SNARKs

A zk-SNARK (zero-knowledge Succinct Non-interactive ARgument of Knowledge) is a zero knowledge proof that provides a way to prove some computational fact about data without revealing the data. zk-SNARKs are the underlying cryptographic tool used in Zcash and Hawk, both of which are building blockchains with ZKPs and both of which will be explained later. In the case of Zcash these SNARKs are used for verifying transactions, and in the case of Hawk they are used for verifying smart contracts. This is done while still protecting users' privacy.

A zk-SNARK is a non-interactive zero-knowledge proof of knowledge that is succinct, meaning that its proofs are very short and easy to verify. They can be thought of as little logic circuits that need to generate a proof of a statement to verify each and every transaction. They do this by taking a snapshot of each transaction, generating a proof, and then convincing the receiving side that the calculation was done correctly without revealing any data except the proof itself. The basic operation of a SNARK execution is a coded input into this circuit which can be decrypted.

Since zk-SNARKs can be verified quickly, and the proofs are small, they can protect the integrity of the computation without burdening non-participants. It should be noted that this technology is just now starting to mature and still has limitations. Proofs are very CPU intensive to generate, and it takes up to 1 minute to generate a new proof, so scaling is still an issue that needs to be resolved.

The very first data points for zk-SNARKs will come from Zcash, which is a combination of distributed state and proof that you own the assets.

Zcash

Zcash can be described as an encrypted, open, permissionless, replicated ledger: a cryptographic protocol for putting private data on a public blockchain. Zcash can be thought of as an extension of the bitcoin protocol. Basically, Zcash added some fields to the bitcoin transaction format to support encrypted transactions. Zcash encrypts all of the transaction data, uses SNARKs (ZKPs) to prove that the encrypted data is valid, and only gives decryption keys to authorized parties to see that data. This could not be done on a public blockchain until now, because encrypting everything would have prevented miners from checking whether transactions are valid. ZKPs have made this possible by allowing the creator of a transaction to make a proof that the transaction is valid without revealing the sender's address, the receiver's address and the transaction amount. Zooko describes this by saying bitcoin has 3 columns, which are the three mentioned above (sender address, receiver address, transaction amount), and Zcash has 4. The 4th column, the proof, doesn’t reveal the sender address, the receiver address and the amount transferred, but it does show that nobody could have created the proof that comes with the encrypted values unless they have a secret key which controls sufficient value to cover the amount being transacted. This is a proof that the data inside the encryption correctly satisfies the validity constructs. This allows the prevention of double spends and of transactions of less than zero.

Zcash is mostly the same as bitcoin. The miners and full nodes are transaction validators. Zcash uses proof of work (POW), with miners checking the ZKPs attached to each transaction and getting a reward for validating those transactions. Full nodes are the same, except that if you have the private keys you can detect whether some transactions contain money that is there for you. SNARKs make it so that miners can reject a transaction from someone if their private key doesn’t have enough money for that transaction. By keeping all data private except for the 4th column, Zcash prevents information from leaking onto a public blockchain, where everyone can view information about transactions. Zcash has selective transparency while bitcoin has mandatory transparency. This means that Zcash can reveal specific things to specific people by permissioning. It reveals specific transactions that anyone looking at them can verify in the blockchain.

Some differences described in the Zcash whitepaper include:

"Value in Zcash is carried by notes, which specify an amount and a paying key. The paying key is part of a payment address, which is a destination to which notes can be sent. As in Bitcoin, this is associated with a private key that can be used to spend notes sent to the address; in Zcash this is called a spending key.

A payment address includes two public keys: a paying key matching that of notes sent to the address, and a transmission key for a key-private asymmetric encryption scheme. “Key-private” means that ciphertexts do not reveal information about which key they were encrypted to, except to a holder of the corresponding private key, which in this context is called the viewing key. This facility is used to communicate encrypted output notes on the block chain to their intended recipient, who can use the viewing key to scan the block chain for notes addressed to them and then decrypt those notes.

The basis of the privacy properties of Zcash is that when a note is spent, the spender only proves that some commitment for it had been revealed, without revealing which one. This implies that a spent note cannot be linked to the transaction in which it was created."
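
The bookkeeping behind this can be sketched in Python. This is a heavily simplified illustration and not the Zcash protocol: the hash constructions, field names and the `stand_in_proof` function are assumptions, and the zero knowledge proof (which in Zcash shows that a spent note matches some earlier commitment without saying which one) is replaced by a plain membership check done on the spender's side.

```python
import hashlib
from dataclasses import dataclass, field

def h(*parts: bytes) -> bytes:
    """SHA-256 over length-prefixed parts, so inputs cannot run together."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.digest()

@dataclass(frozen=True)
class Note:
    amount: int
    paying_key: bytes
    rho: bytes  # per-note secret randomness (hypothetical field name)

    def commitment(self) -> bytes:
        """Published when the note is created; hides the note's contents."""
        return h(b"commit", self.amount.to_bytes(8, "big"), self.paying_key, self.rho)

    def nullifier(self, spending_key: bytes) -> bytes:
        """Published when the note is spent; unlinkable to the commitment."""
        return h(b"nullify", spending_key, self.rho)

@dataclass
class Ledger:
    commitments: set = field(default_factory=set)
    nullifiers: set = field(default_factory=set)

    def add_note(self, commitment: bytes) -> None:
        self.commitments.add(commitment)

    def spend(self, nullifier: bytes, proof_ok: bool) -> bool:
        """Accept a spend only if the proof holds and the nullifier is new."""
        if not proof_ok or nullifier in self.nullifiers:
            return False
        self.nullifiers.add(nullifier)
        return True

def stand_in_proof(ledger: Ledger, note: Note) -> bool:
    """Placeholder for the zk-SNARK: a plain membership check by the spender."""
    return note.commitment() in ledger.commitments
```

The ledger only ever stores opaque commitments and nullifiers, and a second attempt to spend the same note is rejected because its nullifier has already been seen.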

Zcash is what's known as a decentralized anonymous payment scheme (DAP scheme). A DAP scheme enables users to directly pay each other privately: the corresponding transaction hides the payment’s origin, destination, and transferred amount. In Zcash, transactions are less than 1 kB and take under 6 ms to verify, which is orders of magnitude more efficient than the less-anonymous Zerocoin and competitive with Bitcoin. However, the privacy achieved is significantly greater than with Bitcoin. De-anonymizing bitcoin has become much easier through services that track and monitor bitcoin movements and the data associated with them. Mixer services allow for coins to be changed as they move through the system via a central party, but this is still not sufficient. The Zcash whitepaper states:

"mixes suffer from three limitations: (i) the delay to reclaim coins must be large to allow enough coins to be mixed in; (ii) the mix can trace coins; and (iii) the mix may steal coins. For users with “something to hide,” these risks may be acceptable.  But typical legitimate users (1) wish to keep their spending habits private from their peers, (2) are risk-averse and do not wish to expend continual effort in protecting their privacy, and (3) are often not sufficiently aware of their compromised privacy."

The major motivations for ZKPs and the Zcash protocol are 1) privacy and 2) fungibility. Fungibility is being able to substitute individual units of something like a commodity or money for an equal amount. It becomes a real problem when some units of value are deemed to be worth less because they are considered "dirty". Hiding the metadata history means that a coin with a bad history cannot be singled out and rejected by a merchant or exchange. Gregory Maxwell said "Insufficient privacy can also result in a loss of fungibility--where some coins are treated as more acceptable than others--which would further undermine Bitcoin's utility as money."

Zcash is expected to launch soon, and with that will come the genesis block of the Zcash blockchain. As with the bitcoin blockchain, this will allow anyone in the world to mine for Zcash. It will be an open, permissionless system (fully decentralized). Users will be able to send it to anyone using zero-knowledge privacy.

Zcash’s use of cutting edge cryptographic techniques comes with substantial risks. A cryptographic attack that permits the forging of zero knowledge proofs would allow an attacker to invisibly create unlimited currency and debase the value of Zcash. Attacks of this kind have been found and fixed in the recent past. Fortunately, the metadata hiding techniques used in Zcash are more production-hardened and can be considered less risky.

 

Hawk

Andrew Miller, in his whitepaper "Hawk: The Blockchain Model of Cryptography and Privacy-Preserving Smart Contracts", has developed a programmable smart contract system which works for smart contracts in much the same way as Zcash does for transactions. Hawk does not store financial transactions in the clear on the blockchain, and it keeps the code of the contract, the data sent to the contract, and the money sent and received by the contract private from the public. It is only the proof that can be seen; all other useful information is hidden. Like Zcash, transparency is selective in Hawk, and it wouldn't need to be used by all smart contracts but rather based on use cases and the preferences of the parties involved. It also aims to tackle the issues of privacy and fungibility in much the same way as the Zcash protocol.

The Hawk whitepaper does a great job of describing the motivation for contractual security it seeks to provide for financial transactions:

"While on-chain privacy protects contractual parties’ privacy against the public (i.e., parties not involved in the financial contract), contractual security protects parties in the same contractual agreement from each other. Hawk assumes that contractual parties act selfishly to maximize their own financial interest. In particular, they can arbitrarily deviate from the prescribed protocol or even abort prematurely. Therefore, contractual security is a multi-faceted notion that encompasses not only cryptographic notions of confidentiality and authenticity, but also financial fairness in the presence of cheating and aborting behavior."

According to Andrew Miller, Hawk is based on several cryptographic primitives. It uses the same zero knowledge proof library as Zcash, which is called libsnark. Hawk also uses custom implementations of a lattice-based hash function and of public key encryption. Hawk uses a tool called jSnark, which is open sourced.

In Hawk, each party generates their own secret keys. Miller stated that "For each contract, there is also a trusted public parameter, similar to Zcash. The only way to generate these parameters is a process that involves generating a secret value in an intermediate step, which needs to be erased at the end of the protocol. To borrow Zcash's term for this, it's like a "toxic waste byproduct" of the setup procedure, and like all industrial waste, it must be securely disposed of. There are many options... we could do what Zcash does and use a multi-party computation to generate these parameters, simply let a trusted party do it (the trusted party only needs to be used once and can go offline afterwards), or use trusted hardware like SGX."

Miller has said there are some differences between Ethereum contracts and Hawk contracts.  Unlike Ethereum, the input language for private contracts in Hawk is C code.  A private Hawk contract is not a long running stateful process like an Ethereum contract, but rather a 1-shot contract that proceeds in phases, where it first receives the inputs from each party, and then it computes the outputs for each party. After the outputs are computed, the contract is finished and no longer holds any balance. So, it is a slightly different computing model. Hawk supports both private contracts as described above, as well as  public contracts which are exactly like those in Ethereum. (No privacy guarantees are provided for the public contracts, though).
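
The one-shot computing model can be sketched in Python (Hawk's own input language is C, and this sketch leaves out all of the cryptography). The class, the phase names and the example payout rule are illustrative assumptions, not Hawk's API.

```python
from enum import Enum

class Phase(Enum):
    COLLECTING = 1
    FINISHED = 2

class OneShotContract:
    """Collect one input per party, compute payouts once, then hold no balance."""

    def __init__(self, parties, compute):
        self.parties = list(parties)
        self.compute = compute      # function: inputs dict -> payouts dict
        self.inputs = {}            # party -> (data, deposit)
        self.balance = 0
        self.phase = Phase.COLLECTING

    def submit(self, party, data, deposit):
        """Phase 1: each party sends its input and its money exactly once."""
        if self.phase is not Phase.COLLECTING:
            raise RuntimeError("contract already finished")
        if party not in self.parties or party in self.inputs:
            raise ValueError("unknown party or duplicate input")
        self.inputs[party] = (data, deposit)
        self.balance += deposit

    def finalize(self):
        """Phase 2: compute every party's output, pay out, and finish."""
        if self.phase is not Phase.COLLECTING or len(self.inputs) != len(self.parties):
            raise RuntimeError("inputs missing or contract already finished")
        payouts = self.compute(self.inputs)
        if sum(payouts.values()) != self.balance:
            raise ValueError("payouts must distribute the whole balance")
        self.balance = 0
        self.phase = Phase.FINISHED
        return payouts

def larger_number_wins(inputs):
    """Example rule (hypothetical): the party with the larger number takes the pot."""
    winner = max(inputs, key=lambda party: inputs[party][0])
    pot = sum(deposit for _, deposit in inputs.values())
    return {party: (pot if party == winner else 0) for party in inputs}
```

Unlike a long running stateful contract, an instance of this class accepts nothing further once `finalize` has run and its balance is back to zero.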

As in Zcash, there are some challenges around blockchain scaling and around optimizing cryptographic schemes so that they are efficient when using ZKPs. Hawk tries to do as much computation off chain as possible, because in public blockchains on chain computation gets replicated to every node and slows things down dramatically. Producing the proof can take up to several minutes (which is long) and can be costly, whereas nodes checking the proof only take milliseconds. Data from the whitepaper: off chain, it takes about a minute of CPU time for each participant in a Hawk contract; on chain computation takes about 9 to 20 milliseconds.

Hawk has not announced a release date yet as they are still working on optimizing their snark compiling tools to enhance performance.  

State Channels

State channels generalize off chain payment channels, allowing updates to any type of application that has a change of state. As with the Lightning Network, two or more users can exchange payments that would normally require a blockchain transaction without needing to publish them on the blockchain or wait for confirmations, except when setting up or closing out the channel.

Vitalik Buterin explains this in his paper for R3CEV, "Ethereum Platform Review":

"State channels are a strategy that aims to solve the scalability challenge by keeping the underlying blockchain protocol the same, instead changing how the protocol is used: rather than using the blockchain as the primary processing layer for every kind of transaction, the blockchain is instead used purely as a settlement layer, processing only the final transaction of a series of interactions, and executing complex computations only in the event of a dispute.

State channels are not a perfect solution; particularly, it is less clear how they extend to massively-multi-user applications, and they offer no scalability improvements over the original blockchain in terms of its ability to store a large state size - they only increase de-facto transaction throughput. However, they have a number of benefits, perhaps the most important of which is that on top of being a scalability solution they are also a privacy solution, as the blockchain does not see any of the intermediate payments or contracts except for the final settlement and any disputes, and a latency solution, as state channel updates between two parties are instant - much faster than any direct on-blockchain solution, private or public, possibly could be, and potentially even faster than centralized approaches as channel updates from A to B can be secure without going through a centralized server."

State channels aim to address the scalability issues, privacy issues and confirmation delays associated with public blockchains while allowing actors who don't necessarily trust each other to transact.
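
A minimal Python sketch of the idea follows: the parties sign successive states off chain, and only the latest fully signed state is presented for settlement. HMAC is used here purely as a stand-in for digital signatures (a real channel would use public key signatures so that the chain never holds the signing keys), and the class and function names are assumptions.

```python
import hashlib
import hmac
import json

def sign(key: bytes, state: dict) -> str:
    """Stand-in signature: HMAC over a canonical JSON encoding of the state."""
    payload = json.dumps(state, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

class Channel:
    def __init__(self, keys: dict, deposits: dict):
        self.keys = keys  # party name -> signing key
        self.latest = {"nonce": 0, "balances": dict(deposits)}
        self.signatures = {p: sign(k, self.latest) for p, k in keys.items()}

    def pay(self, sender: str, receiver: str, amount: int) -> None:
        """Off chain update: all parties sign a new state with a higher nonce."""
        balances = dict(self.latest["balances"])
        if amount <= 0 or balances[sender] < amount:
            raise ValueError("invalid payment")
        balances[sender] -= amount
        balances[receiver] += amount
        self.latest = {"nonce": self.latest["nonce"] + 1, "balances": balances}
        self.signatures = {p: sign(k, self.latest) for p, k in self.keys.items()}

def settle(on_chain_nonce: int, state: dict, signatures: dict, keys: dict) -> dict:
    """What the chain would do at close: accept only a fully signed state
    that is not older than anything it has already seen."""
    for party, key in keys.items():
        if not hmac.compare_digest(signatures.get(party, ""), sign(key, state)):
            raise ValueError("missing or bad signature from " + party)
    if state["nonce"] < on_chain_nonce:
        raise ValueError("stale state")
    return state["balances"]
```

However many payments flow through `pay`, the blockchain only sees the opening deposits and the one state handed to `settle`.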

 

Do You Need A Blockchain At All? Is Consensus Needed?

For many people, all of these cryptographic methods which mask all of the transactional data will come as a surprise. The blockchain is supposed to be a transparency machine in which anyone can join the network and as a result view all information on that network. Even in private blockchains, there is a more open view into the data than in the protocols that have been mentioned in this post. Another question which might come to mind is whether consensus is even needed, since everything is private but the proof. If the proof is only between the two parties involved in the transaction, why is consensus needed, and why use a public blockchain? It may seem counterintuitive, but the answer is that yes, a public blockchain is needed and so is consensus, and it's due to the privacy of the proofs. Essentially, complete transparency is needed to maintain the privacy of the proofs.

ZKPs and blockchains complement each other. You can't just use one to replace the other. A blockchain is used to make sure the entire network can agree on some state, which may or may not be encrypted. ZKPs allow you to be confident about some properties of that state. In this scenario, you still need a canonical source of truth. One example is a view key that reveals all incoming transactions but not outgoing ones. In order for this to happen, you need a fully decentralized ledger with consensus where everyone agrees with the data written there.

For example, Zcash has data which contains information which is useless and unreadable to most actors. It’s a database of commitments and opaque pieces of data. It's just a way to synchronize data between actors. (Zooko Wilcox has publicly stated that if Chainalysis graphed this out it would just be a series of timestamps of when a transaction occurred.) In cases where the number of transactions is low, timing attacks could reveal the originator of transactions; imagine this to be the equivalent of just one node connected to a Tor network.

The real emphasis is on the wallet side for actors, because this allows them to spend money and move assets around. In bitcoin you can take a private key and move bitcoin. Now it's more: it’s a private key and a set of secrets you keep in order to prove a previous proof and generate a new proof that you use to convince others. For this, a fully decentralized ledger is needed, with consensus where everyone agrees with the data written there.

A blockchain is necessary because you need a consensus layer from everyone. It is necessary to have agreement on the proofs in the ledger in order to move assets around later on; if a proof isn’t available in every node, then you can’t convince anyone of the proof when you need to move assets later on. These proofs need to be stored in an open way so that they can be seen as being verified and accepted by receiving parties.

There are two different layers: 1) there needs to be agreement on what proofs everyone accepts, and 2) there needs to be agreement on what you can prove, what happens on a proof of zero knowledge, and what happens once you know the information. How do you generate a proof and pass that information to the next person? The key is to get authority over the transaction by adding a proof or metadata to the transaction with some type of conditional script (if-then statements for transaction acceptance). This code contains transaction validity rules. A person sees the proof from outside but they don’t know if the rule itself has been triggered or not. Now that you have privacy from ZKPs, in order to comply with the transaction you need to prove that the transaction abides by the rules. So you can take 2 proofs and create new proofs that the person receiving them can point at and verify that the proof is accepted by the entire network. Once the proofs have a meaning to you based on the rules, you can agree they were proved in the past and can be used in the future to transact and transfer money.

Limitations

ZKPs are moving out of the realm of theory and becoming production strength. Now is the time to see how practical they are. They are only now going to start having real world tests, and they still suffer from big scalability issues. The work of developing a proof is enormous and has massive computation costs. As mentioned before, in Zcash it takes between 45 seconds and 1 minute on a really strong computer to create a proof and move money to someone else. Presently, people are working on making SNARKs and ZKPs more efficient by allowing for more proofs per second or for more elaborate proofs in the same amount of time.

Deep architectural changes need to be made in blockchain technology in order to accommodate ZKP architecture. You need to understand the constraints of what you can prove and at what scale.

Very Special Thanks to Zaki Manian (@zmanian), Andrew Miller (@socrates1024), Jonathan Rouach (@jonrouach), and Anish Mohammed (@anishmohammed)

Hawk section provided by Andrew Miller from a series of questions I asked.

 

https://github.com/zcash/zips/blob/zips27.reorganisation.1/protocol/protocol.pdf

Chain: Simplified Byzantine Fault Tolerance (SBFT)

This post aims to look at some of the key features of the Chain Open Standard, a permissioned blockchain, specifically its consensus mechanism. 

Blockchain startup Chain recently released an open source permissioned blockchain built in collaboration with 10 global financial firms and telcos. This platform is made for financial applications that require high scalability (thousands of transactions per second or more), robust security and near absolute privacy. Blockchains must be built for the regulatory requirements of these institutions as well. These are attributes the financial services sector requires. If speed is the key characteristic of this platform, network stability becomes very important in any solution designed. Chain was built with this design assumption in mind.

Partners in the project include Capital One, Citi, Fidelity, First Data, Fiserv, Mitsubishi UFJ, Nasdaq, Orange, State Street and Visa, all of which have contributed to the technology. This platform is being called the Chain Open Standard. Chain Core is the software implementation of the Chain Open Standard and is designed to run in enterprise IT environments.

Note: Chain Core is the name Chain has given to nodes on its platform. 

Consensus Mechanism: Simplified Byzantine Fault Tolerance (SBFT)

In SBFT, one designated block generator collects and validates proposed transactions, periodically batching them together into a new-block proposal. Consensus is provided by this generator, which applies (validates) the rules agreed to by the nodes (chain cores) to the block, and by the designated block signers. Multiple other designated block signers ratify the proposed block with their signatures. All network members know the identities of the block signers (it is a permissioned blockchain) and accept blocks only if signed by a sufficient number of signers. A block signer validates all transactions in each block and adds a signature. Only blocks with sufficient signatures are accepted into the chain. This attempts to prevent the double spending problem by ensuring that competing transactions get resolved.
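
The acceptance rule can be sketched in Python as a threshold check over known signers. This is an illustration of the rule as described here, not Chain's implementation: HMAC stands in for real digital signatures, and the function names and the way a block is hashed are assumptions.

```python
import hashlib
import hmac

def block_hash(transactions: list) -> bytes:
    """Hash the batch of transactions the generator proposes as a block."""
    digest = hashlib.sha256()
    for tx in transactions:
        digest.update(hashlib.sha256(tx.encode()).digest())
    return digest.digest()

def sign_block(signer_key: bytes, transactions: list) -> bytes:
    """A block signer ratifies the proposal (HMAC as a signature stand-in)."""
    return hmac.new(signer_key, block_hash(transactions), hashlib.sha256).digest()

def accept_block(transactions, signatures, signer_keys, threshold) -> bool:
    """Accept the block only if enough known signers have validly signed it."""
    valid = 0
    for signer_id, key in signer_keys.items():
        signature = signatures.get(signer_id)
        if signature is not None and hmac.compare_digest(
            signature, sign_block(key, transactions)
        ):
            valid += 1
    return valid >= threshold
```

With five known signers and a threshold of four, a block carrying only three valid signatures, or signatures over different transactions, is rejected.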

By using 1 generator (master replicator) in a trusted, private environment, this effectively allows for the kind of scale and speed needed for transactions and for the signers to validate transactions. These signers are configurable, meaning they can be added to or removed from the system at any time. The same goes for the nodes (chain cores) in the network: they can be added or deleted since it is a private network, and this adds an extra layer of security, particularly when dealing with what could be a malicious actor.

As a result of using 1 generator instead of multiple, synchronization does not occur. Synchronization is a process that establishes consistency of data between 2 or more entities. Leaving it out means that scalability and speed are not affected for the enterprise grade solution. Since the blockchain is private and the entities are known, multiple generators could be seen as a redundancy. Not all nodes need to be online for this platform to function; at a minimum, 1 generator and 1 signer are needed. Typically, however, it allows 100 participants to interact and only needs 5 signers, 1 generator and 1 issuer (some regulatory body). The fault tolerance in this setup allows for 3 out of 4 or 4 out of 5 signers.

The Privacy section will go into the details of how the Chain Open Standard tackles the problem of Confidentiality of information for the platform participants.  Open, permissionless blockchains like Bitcoin are transparency machines in that all participants can view information on the network.  Chain has built a solution for those who need privacy as a main feature.  Without the need for complete transparency and all nodes (chain cores) receiving transactional information, scalability does not get sacrificed, but transparency does.  All systems have trade-offs.  In this system,  the nodes (chain cores) would only get block proofs by node platform.   

The node (core) itself could store all the blockchain data or only a snapshot (balance sheet) and a limited history of activity from the Account Manager (described below).

 

Stages

  1. The Asset Issuer (presumably a node on the platform) creates what can be an unlimited number of cryptographically unique Asset IDs.  (Creation Phase)
  2. Units of these assets are issued on a blockchain.  (Submission Phase)
  3. An Asset ID is associated with an Asset Definition. (Asset Definitions can be self-enforcing rules to meet specific conditions depending on the use case.  These can have an unlimited amount of reference data.) (Validation Phase)
  4. Once issued, units of an Asset ID can be stored in and transferred between accounts, programmatically transacted via smart contracts, or retired from circulation. (Signing Phase and Pulling into Nodes Phase)
  5. After the Signing Phase the transaction goes live.
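
The lifecycle of an asset (create, issue, transfer, retire) can be sketched in Python. This is a simplified illustration rather than Chain's data model: deriving the Asset ID by hashing the issuer and the definition, and every class and method name, are assumptions, and signing and consensus are left out.

```python
import hashlib
import json
from collections import defaultdict

def asset_id(issuer: str, definition: dict) -> str:
    """Derive a unique ID by hashing the issuer and the asset definition."""
    payload = json.dumps({"issuer": issuer, "definition": definition}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

class AssetLedger:
    def __init__(self):
        self.definitions = {}             # asset ID -> asset definition
        self.balances = defaultdict(int)  # (account, asset ID) -> units held

    def create(self, issuer: str, definition: dict) -> str:
        """Creation: register a definition under its unique Asset ID."""
        aid = asset_id(issuer, definition)
        self.definitions[aid] = definition
        return aid

    def issue(self, aid: str, account: str, units: int) -> None:
        """Issuance: put new units of a known asset into an account."""
        if aid not in self.definitions:
            raise KeyError("unknown asset ID")
        self.balances[(account, aid)] += units

    def transfer(self, aid: str, sender: str, receiver: str, units: int) -> None:
        """Transfer: move units between accounts."""
        if units <= 0 or self.balances[(sender, aid)] < units:
            raise ValueError("insufficient units")
        self.balances[(sender, aid)] -= units
        self.balances[(receiver, aid)] += units

    def retire(self, aid: str, account: str, units: int) -> None:
        """Retirement: remove units from circulation."""
        if units <= 0 or self.balances[(account, aid)] < units:
            raise ValueError("insufficient units")
        self.balances[(account, aid)] -= units
```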

One of the interesting features of this system is the Account Manager, which serves many key roles. It stores assets in secure accounts. This is where transaction data gets stored. These accounts can contain any combination of assets and can be created for many different types of users. These accounts can be thought of as digitally secure wallets. In addition to storing assets, the Account Manager enables the transfer of assets into and out of accounts via blockchain transactions (within the same Core or between different Cores in the network). The Account Manager builds the smart contracts for all different types of transactions (see the Smart Contracts section). Each transaction is a smart contract.

Ownership of the assets flows through the system by using a push model. The addresses are provided by other parties, and the unique Asset IDs and accounts that get created are used to designate ownership of the assets. The smart contract (transaction) defines what actions a designated party can take.

Privacy & Security

The Chain Open Standard is a private network in which confidentiality of information is one of the top priorities. This platform has been designed to support selective disclosure of sensitive information. This is done using three techniques: one-time-use addresses, zero knowledge proofs, and encrypted metadata.

A one-time address is created each time an account holder wishes to receive assets. These differing addresses prevent other observers of the blockchain from associating transactions with each other or with a particular party.
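
One common way to get fresh, unlinkable receiving addresses is to derive each one from an account secret and a counter. The Python sketch below illustrates that idea only; the HMAC-based derivation, the address length and all names are assumptions, not the scheme Chain actually uses.

```python
import hashlib
import hmac

def one_time_address(account_secret: bytes, index: int) -> str:
    """Derive the index-th receiving address from the account secret."""
    message = b"receive/" + index.to_bytes(8, "big")
    return hmac.new(account_secret, message, hashlib.sha256).hexdigest()[:40]

class Account:
    def __init__(self, secret: bytes):
        self.secret = secret
        self.next_index = 0
        self.issued = {}  # address -> derivation index

    def new_receive_address(self) -> str:
        """Hand out a fresh address for every incoming payment."""
        address = one_time_address(self.secret, self.next_index)
        self.issued[address] = self.next_index
        self.next_index += 1
        return address

    def owns(self, address: str) -> bool:
        """Only the account holder can tell which addresses belong together."""
        return address in self.issued
```

An observer sees unrelated-looking addresses, while the account holder can recognize every one of them.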

To cryptographically conceal the contents (assets and amounts) of a transaction, “zero knowledge proofs” are used, while still allowing the entire network to validate the integrity of the contents. Zero Knowledge Proofs (ZKPs) do this by one party proving to another party that a given statement is true, without conveying any information (in this case, about the transaction) apart from the fact that the statement is indeed true. Only the counter-parties (and those granted access) can view the details of the transaction.

Also, transaction metadata can be encrypted with traditional PKI to conceal details from all but the relevant parties. The platform uses keys to prove verifiable authenticity (signatures) of the messages delivered between the nodes (chain cores).

The keys are generated by creating an unlimited number of cryptographically unique Asset IDs. These keys get rotated every 2-3 weeks. Rotating keys is a process of decrypting data with an old key and re-encrypting it with a new key (re-keying). These keys should probably be kept in different places or data centers. If one of the keys gets compromised, the other key is used to generate backup keys and all assets are transferred over to the new key. Key management and rotation is essential to managing secure digital assets. These keys also allow and restrict access to certain activities.

Chain Core also integrates with industry-standard hardware security module (HSM) technology. All block and transaction signing takes place within hardened HSM firmware. Multi-signature accounts using independent HSMs can further increase blockchain security by eliminating single points of failure.

 

Smart Contracts

The Chain Open Standard platform has designed a framework in which all transactions are smart contracts that allow for outside events or data to trigger the execution of certain clauses in the contract. It also allows each transaction to contain metadata, such as information required for Know Your Customer (KYC) and anti-money laundering (AML) regulations. The smart contracts have a built-in identity feature.

Some of the use cases Chain is looking at for financial transactions, where a smart contract is generated on a transaction by transaction basis, include (see the captions below for use cases):

[Figure: Asset Issuance]

[Figure: Payments]

Use cases being explored that have smart contract features for transactions are:

  1. Asset Issuance - Digitize existing assets for transacting on a blockchain network
  2. Simple Payment - Transfer assets from one account to another
  3. Bilateral Trade - Swap one asset for another with no counterparty risk
  4. Order book - Name your sale price and let a buyer find you
  5. Collateralized Loan - Lend assets, guaranteed by locked collateral
  6. Auction - Set a minimum price and sell your assets to the highest bidder 

 

Architecture Concept

Chain Core is the software implementation of the Chain Open Standard. It is designed to run in enterprise IT environments. 

http://chain.com/core/


 

Conclusion: Trade-offs for Scalability and Speed

Key characteristics of the Chain Open Standard include scalability, speed and privacy. With this in mind, as with any blockchain, trade-offs were made to achieve high transaction speeds. Chain created a private blockchain open only to members of the platform. Data privacy is a major problem private blockchains aim to solve without losing other key features of a blockchain network. Decentralization and transparency are lost as a result of this. For the types of clients they have, this is a non-issue and is necessary to ensure privacy and confidentiality of transactions at scale. Having 1 generator effectively act as a master replicator for the private network of known signers and participants also allows transactions to scale into the tens of thousands per second. This being the case, synchronization becomes a waste of effort and hurts scalability, so it has been discarded as well. If limitless scalability is a design principle, network stability (consistency of the data) and speed cannot be sacrificed. Transparency and decentralization can.

Scale also gets achieved in the Chain Open Standard through sharding and replication of the storage layer. Sharding allows for the partitioning of large databases into smaller ones, which makes them faster and much easier to manage (Ethereum aspires to this as well). The one thing in the near term that may hurt this enormous scalability could be the use of zero knowledge proofs, which are not known to scale at this point in time.
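
As a rough illustration of sharding a storage layer, the Python sketch below assigns each record to one of several smaller stores by hashing its key. Hash partitioning is only one possible scheme and is an assumption here, as are the names; how Chain actually shards and replicates its storage is not described in this post.

```python
import hashlib

def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard number in a stable, evenly spread way."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

class ShardedStore:
    def __init__(self, shard_count: int):
        # Each shard is a small independent store instead of one large one.
        self.shards = [dict() for _ in range(shard_count)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

Every lookup touches exactly one shard, which is what keeps each partition small and fast to manage.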

Networks can be fitted and repurposed for any size market. However use cases centered around decentralization (privacy), scalability (speed of transactions), and consistency (Stability of network) will dictate what consensus model gets used.  Illiquid markets will not need the same type of solution as highly liquid ones.  The same can be said of use cases where absolute privacy is necessary.  Within each network, different levels of participation by different institutions are also important for deciding what type of blockchain you will build. 

Sources:

http://chain.com/core/

https://chain.com/os/

 

 

 

 

A New Approach to Consensus: Swirlds HashGraph

(Special thanks to Leemon Baird, creator of the Swirlds Hashgraph Consensus Algorithm)

As many people here know, my interest in consensus mechanisms runs far and wide. In the KPMG research report I co-authored, "Consensus: Immutable Agreement for the Internet of Value", many consensus mechanisms were discussed. In Appendix 3 of the paper, many of the major players in the space discussed their consensus methodologies. One consensus mechanism which wasn't in the paper was the Swirlds Hashgraph Consensus Algorithm. That whitepaper is a great read and this consensus mechanism holds quite a lot of promise. I have had many discussions with its creator, Leemon Baird, and this blog post comes from conversations, questions and emails about the topic. I also asked Leemon to fill out the consensus questionnaire from the KPMG report and he graciously did. His answers appear at the end of this post.

What exactly is a hashgraph? 

A "hashgraph" is a data structure, storing a certain type of information, and updated according to a certain algorithm.   The data structure is a directed acyclic graph, where each vertex contains the hash of its two parent vertices. This could be called a Merkle DAG, and is used in git, and IPFS, and in other software.

The stored information is a history of how everyone has gossiped.  When Alice tells Bob everything she knows, during a gossip sync, Bob commemorates that occurrence by creating a new "event", which is a vertex in the graph, containing the hash of his most recent event, and the hash of Alice's most recent event.  It also contains a timestamp, and any new transactions that Bob wants to create at that moment.  Bob digitally signs this event.  The "hashgraph" is simply the set of all known events.
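As a rough illustration of the event described above, here is a minimal Python sketch. The class name `Event`, its field names, and the JSON-plus-SHA-256 encoding are assumptions for the example, not the Swirlds wire format, and the digital signature step is left out for brevity.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Event:
    creator: str
    self_parent: Optional[str]   # hash of the creator's own most recent event
    other_parent: Optional[str]  # hash of the gossip partner's most recent event
    timestamp: float
    transactions: Tuple[str, ...] = ()

    def digest(self) -> str:
        """Hash of the event contents (encoding is an assumption of this sketch)."""
        payload = json.dumps(
            [self.creator, self.self_parent, self.other_parent,
             self.timestamp, list(self.transactions)]
        ).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

# Alice and Bob each start with one event that has no parents.
alice_first = Event("alice", None, None, 1.0)
bob_first = Event("bob", None, None, 1.5)

# Alice syncs with Bob, so Bob records the sync as a new event with two parents.
bob_sync = Event("bob", bob_first.digest(), alice_first.digest(), 2.0,
                 ("pay carol 5",))
```

Because each event embeds the hashes of its two parents, changing any earlier event would change every hash that descends from it.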

The hashgraph is updated by gossip: each member repeatedly chooses another member at random, and gives them all the events that they don't yet know.  As the local copy of the hashgraph grows, the member runs the algorithm in the paper to determine the consensus order for the events (and the consensus timestamps).  That determines the order of the transactions, so they can be applied to the state, as specified by the app.
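The update step can be sketched in a few lines of Python. This is a simplified model under stated assumptions: each local hashgraph is a plain dictionary from event hash to payload, the names `sync` and `members` are hypothetical, and the creation of a new event after each sync and the consensus calculation are both omitted.

```python
import random

def sync(sender: dict, receiver: dict) -> int:
    """Give the receiver every event it lacks; return how many were sent."""
    missing = {h: e for h, e in sender.items() if h not in receiver}
    receiver.update(missing)
    return len(missing)

# Each member holds a local hashgraph: event hash -> event payload.
members = {
    "alice": {"a1": "alice event 1"},
    "bob": {"b1": "bob event 1"},
    "carol": {"c1": "carol event 1"},
}

rng = random.Random(0)  # seeded so the run is repeatable
syncs = 0
# Keep picking a random pair until every member knows all three events.
while any(len(graph) < 3 for graph in members.values()):
    src, dst = rng.sample(sorted(members), 2)
    sync(members[src], members[dst])
    syncs += 1
print("all members hold all events after", syncs, "syncs")
```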

 

What are gossip protocols?

A "gossip protocol" means that information is spread by each computer calling up another computer at random, and sharing everything it knows that the other one doesn't.  It's been used for all sorts of things through the decades. I think the first use of the term "gossip protocol" was for sharing identity information, though the idea probably predates the term. There's a Wikipedia article with more of the history. In Bitcoin, the transactions are gossiped, and the mined blocks are gossiped.  

It's widely used because it's so fast (information spreads exponentially fast) and reliable (a single computer going down can't stop the gossip).
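A small simulation shows why the spread is exponential. This is a toy model, not any particular protocol: it assumes synchronized rounds in which every informed computer calls one uniformly random computer, and the function name `rounds_to_spread` is made up for the example.

```python
import random

def rounds_to_spread(num_nodes: int, seed: int = 0) -> int:
    """Count rounds until one piece of news reaches every node (toy model)."""
    rng = random.Random(seed)
    informed = {0}  # node 0 starts with the news
    rounds = 0
    while len(informed) < num_nodes:
        # Every informed node calls one random node and shares the news.
        calls = [rng.randrange(num_nodes) for _ in informed]
        informed.update(calls)
        rounds += 1
    return rounds

for n in (10, 100, 1000, 10000):
    print(n, "nodes:", rounds_to_spread(n), "rounds")
```

The informed set can at most double each round, so the round count can never be below log2 of the network size. In this model it stays within a small multiple of that, which is why a thousandfold larger network needs only a handful of extra rounds.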

The "gossip about gossip" idea is new with hashgraph, as far as I know.  There are many types of information that can be spread by gossip.  But having the information being gossiped be the history of the gossip itself is a novel idea.

In hashgraph, it's called "gossip about gossip" rather than "gossip of gossip".  Similar to how your friends might "gossip about what Bob did" rather than "gossip of what Bob did".

Key Characteristics of Swirlds Hashgraph Consensus

  1.  Ordering and fairness of transactions are the centerpiece of Swirlds. Simply put, Swirlds seeks to fix the ordering problem found in the blockchain world today (due to different consensus methodologies that have trouble addressing this problem) by using Hashgraph Consensus and "gossip about gossip".
  2. Hashgraph can achieve consensus with no Proof of Work. So it can be used as an open system (non-permissioned) using Proof of Stake, or it can be used as a permissioned system without POW or POS. 
  3.  There's no mining. Any member can create a block (called an "event") at any time.
  4. It supports smart contract creation.
  5. Blocksize can be whatever size you want. When you create a block ("event"), you put in it any new transactions you want to create at that time, plus a few bytes of overhead. So the block ranges from a few bytes (for no transactions), to as big as you want it (for many transactions).  But since you're creating many blocks per second, there's no reason to make any particular block terribly big.
  6. The core hashgraph system is for distributed consensus of a set of transactions. So all nodes receive all data.  One can build a sharded, hierarchical system on top of that. But the core system is a replicated state machine. Data is  stored on each machine. But for the core system, the data is replicated.

Other Questions I asked Leemon Baird about the Whitepaper

Below are some questions I asked Leemon after reading the whitepaper. His answers are elaborate and very useful for those seeking to not only understand Hashgraph Consensus but also the inner workings of blockchains and the consensus algorithms that power them. 

1)  Why is fairness important?

Fairness allows new kinds of applications that weren't possible before.  This creates the fourth generation of distributed trust.

For some applications, fairness doesn't matter. If two coins are spent at about the same time, we don't care which one counts as "first", as long as we all agree.  If two people record their driver's license in the ledger at about the same time, we don't care which counts as being recorded "first". 

On the other hand, there are applications where fairness is of critical importance.  If you and I both bid on a stock on the New York Stock Exchange at the same time, we absolutely care which bid counts as being first!  The same is true if we both try to patent the same thing at the same time. Or if we both try to buy the same domain name at the same time. Or if we are involved in an auction. Or if we are playing an online game: if you shoot me and I dodge, it matters whether I dodged BEFORE you shot, or AFTER you shot.

So hashgraph can do all the things block chain does (with better speed, cost, proofs, etc).  But hashgraph can also do entirely new kinds of things that you wouldn't even consider doing with a block chain.

It's useful to think about the history of distributed trust as being in 4 generations:

1. Cryptocurrency

2. Ledgers

3. Smart Contracts

4. Markets

I think it's inevitable. Once you have a cryptocurrency, people will start thinking about storing other information in it, which turns it into a public ledger with distributed trust. 

Once you have the ledger storing both money and property, people will start thinking about smart contracts to allow you to sell property for money with distributed trust.

Once you have the ability to do smart contracts, people will start thinking about fair markets to match buyers and sellers.  And to do all the other things that fairness allows (like games, auctions, patent offices, etc).

Swirlds is the first system of the fourth generation.  It can do all the things of the first 3 generations (with speed, etc). But it can also do the things of the 4th generation.

 

2) You mention internet speed and how faster bandwidth matters, so it acts like the current state of electronic trading in the stock market. Are you not worried about malicious actors with high-speed connections taking over the network? This is similar to how High Frequency Trading works in the stock market, where low-latency trading mechanisms, co-location and huge bandwidth are extremely advantageous for "winning", as Michael Lewis describes in "Flash Boys".

In hashgraph, a fast connection doesn't allow you to "take over the network". It simply allows you to get your message out to the world faster. If Alice creates a transaction, it will spread through gossip to everyone else exponentially fast, through the gossip protocol. This will take some number of milliseconds, depending on the speed of her Internet connection, and the size of the community. If Bob has a much faster connection, then he might create a transaction a few milliseconds later than her, but get it spread to the community before hers.  However, once her transaction has spread to most people, it is then too late for Bob to count as being earlier than her, even if Bob has infinite bandwidth.

This is analogous to the current stock market, except for one nice feature. If Bob wants an advantage of a few milliseconds, he can't just build a single, fast pipe to the single, central server. He instead needs a fast connection to everyone in the network. And the network might be spread across every continent.  So he'll just need to have a fast connection to the Internet backbone. That's the best he can do, and anyone can do that, so it isn't "unfair". 

In other words, the advantage of a fast connection is smaller than the advantage he could get in the current stock market. And it's fair. If the "server" is the entire community, then it is fair to say that whichever transaction reached the entire community first, will count as being "first". Bob's fast connection benefits him a little, but it also benefits the community by making the entire system work faster, so it's good.

"Flash Boys" was a great book, and I found it inspiring. Our system mitigates the worst parts of the existing system, where people pay to have their computers co-located in the same building as the central server, or pay huge amounts to use a single fast pipe tunneled through mountains. In a hashgraph system, there is no central server, so that kind of unfairness can't happen.

3) You mention in the whitepaper that increasing block size "can make the system of fairness worse". Why is that?

That's true for a POW system like Bitcoin.  If Alice submits a transaction, then miner Bob will want to include it in his block, because he's paid a few cents to do so.  But if Carol wants to get her transaction recorded in history before Alice's, she can bribe Bob to ignore Alice's transaction, and include only Carol's in the block. If Bob succeeds in mining the block, then Alice's transaction is unfairly moved to a later point in history, because she has to wait for the next miner to include her transaction.

If each block contains 1 transaction, then Alice has suffered a 1-slot delay in where her transaction appears in history. If each block contains a million transactions, then Alice has suffered a million-slot delay. In that sense, big blocks are worse than small blocks. Big blocks allow dishonest people to delay your transactions into a later position in the consensus order.

The comment about block size doesn't apply to leader-based systems like Paxos. In them, there isn't really a "block". The unfairness simply comes from the current leader accepting a transaction from Alice, but then delaying a long time before sending it out to be recorded by the community.  The comment also doesn't apply to hashgraph.

4) Can you explain how not remembering old blocks works? And why one just needs to know the most frequent blocks and how this doesn't fly in the face of longest chain rule?  

Hashgraph doesn't have a "longest chain rule".  In blockchain, you absolutely must have a single "chain", so if it ever forks to give you two chains, the community must choose to accept one and reject the other. They do so using the longest chain rule. But in hashgraph, forking is fine. Every block is accepted.  The hashgraph is an enormous number of chains, all woven together to form a single graph. We don't care about the "longest chain". We simply accept all blocks.  (In hashgraph, a block is called an "event").

What we have to remember is not the "most frequent block". Instead, we remember the state that results from the consensus ordering of the transactions. Imagine a cryptocurrency, where each transaction is a statement "transfer X coins from wallet Y to wallet Z". At some point, the community will reach a consensus on the exact ordering of the first 100 transactions. At that time, each member of the community can calculate exactly how many coins are in each wallet after processing those 100 transactions (in the consensus order), before processing transaction number 101.  They will therefore agree on the "state", which is the list of amounts of coins in all the non-empty wallets. Each of them digitally signs that state. They gossip their signatures. So then each member will end up having a copy of the state along with the signatures from most of the community.  This combination of the state and list of signatures is something that mathematically proves exactly how much money everyone had after transaction 100.  It proves it in a way that is transferrable: a member could show this to a court of law, to prove that Alice had 10 coins after transaction 100 and before transaction 101.  

At that point, each member can discard those first 100 transactions. And they can discard all the blocks ("events") that contained those 100 transactions. There's no need to keep the old blocks and transactions. Because you still have the state itself, signed by most of the community, proving that there was consensus on it.

Of course, you're also free to keep that old information. Maybe you want to have a record of it, or want to do audits, or whatever. But the point is that there's no harm in throwing it away. 
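The signed-state idea can be sketched as follows. This is only an illustration under loud assumptions: HMAC with per-member secrets stands in for real public-key signatures, "most of the community" is modelled as more than 2/3 of members, and the names `state_hash`, `sign` and `is_proven` are hypothetical.

```python
import hashlib
import hmac
import json

def state_hash(state: dict) -> bytes:
    """Canonical hash of the wallet balances."""
    encoded = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).digest()

def sign(secret: bytes, digest: bytes) -> bytes:
    # Assumption: HMAC is a stand-in for a real digital signature.
    return hmac.new(secret, digest, hashlib.sha256).digest()

def is_proven(state: dict, signatures: dict, secrets: dict) -> bool:
    """True if more than 2/3 of the members validly signed this exact state."""
    digest = state_hash(state)
    valid = sum(
        1 for member, sig in signatures.items()
        if member in secrets
        and hmac.compare_digest(sig, sign(secrets[member], digest))
    )
    return 3 * valid > 2 * len(secrets)

secrets = {name: name.encode("utf-8") * 4
           for name in ("alice", "bob", "carol", "dave")}
state_after_100 = {"alice": 10, "bob": 3}
signatures = {m: sign(secrets[m], state_hash(state_after_100))
              for m in ("alice", "bob", "carol")}

history = [f"tx-{i}" for i in range(1, 101)]
if is_proven(state_after_100, signatures, secrets):
    history.clear()  # the signed state now stands in for the old transactions
```

Three valid signatures out of four members clears the threshold here, so the first 100 transactions can be dropped while the state and its signatures are kept.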

5) You mention that blockchains don't have a guarantee of Byzantine agreement, because a member never reaches certainty that agreement has been achieved. Can you elaborate on this and explain why Hashgraph can achieve this?

Bitcoin doesn't have Byzantine fault tolerance, because of how that's defined.  Hashgraph has it, because of the math proof in the paper.

In computer science, there is a famous problem called "The Byzantine Generals Problem".  Here's a simplified version. You and I are both generals in the Byzantine army. We need to decide whether to attack at dawn. If we both attack or both don't attack, we will be fine. But if only one of us attacks alone, he will be defeated, because he doesn't have enough forces to win by himself.

So, how can we coordinate? This is in an age before radio, so you can send me a messenger telling me to attack. But what if the messenger is captured, so I never get the message?  Clearly, I'm going to need to send a reply by messenger to let you know I got the message.  But what if the reply is lost?  Clearly, you need to send a reply to my reply to let me know it got through. But what if that is lost?  We could spend eternity replying to each other, and never really know for sure we are in agreement. There was actually a theater play that dramatized this problem.

The full problem is more complicated, with more generals, and with two types of generals. But that's the core of the problem.  The standard definition is that a computer system is "Byzantine fault tolerant", if it solves the problem in the following sense:

- assume there are N computers, communicating over the Internet

- each computer starts with a vote of YES or NO

- all computers need to eventually reach consensus, where we all agree on YES, or all agree on NO

- all computers need to know when the consensus has been reached

- more than 2/3 of the computers are "honest", which means they follow the algorithm correctly, and although an honest computer may go down for a while (and stop communicating), it will eventually come back up and start communicating again

- the internet is controlled by an attacker, who can delay and delete messages at will (except, if Alice keeps sending messages to Bob, the attacker eventually must allow one to get through; then if she keeps sending, he must eventually allow another one to get through, and so on)

- each computer starts with a vote (YES or NO), and can change that vote many times, but eventually a time must come when the computer "decides" YES or NO.  After that point, it must never again change its mind.

- all honest computers must eventually decide (with probability one), and all must decide the same way, and it must match the initial vote of at least one honest member.

That's just for a single YES/NO question.  But Byzantine fault tolerance can also be applied to more general problems.  For example, the problem of deciding the exact ordering of the first 100 transactions in history.

So if a system is Byzantine fault tolerant, that means all the honest members will eventually know the exact ordering of the first 100 transactions. And, furthermore, each member will reach a point in time where they know that they know it. In other words, their opinion doesn't just stop changing. They actually know a time when it is guaranteed that consensus has been achieved.

Bitcoin doesn't do that. Your probability of reaching consensus grows after each confirmation. You might decide that after 6 confirmations, you're "sure enough".  But you're never mathematically certain. So Bitcoin doesn't have Byzantine fault tolerance. 

There are a number of discussions online about whether this matters. But, at least for some people, this is important.  

If you're interested in more details on Bitcoin's lack of Byzantine fault tolerance, we can talk about what happens if the internet is partitioned for some period of time. When you start thinking about the details, you actually start to see why Byzantine fault tolerance matters.
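The "sure enough, but never certain" point about Bitcoin confirmations can be made concrete with the simple gambler's-ruin term from the Bitcoin whitepaper: an attacker with a share q of the hash power who is z blocks behind ever catches up with probability (q/p)^z, where p = 1 - q. The Python sketch below computes only that term (the whitepaper's full calculation also accounts for the attacker's progress while the confirmations accumulate), and the function name is made up for the example.

```python
def catch_up_probability(attacker_share: float, confirmations: int) -> float:
    """Chance an attacker ever overtakes the honest chain from z blocks behind."""
    q = attacker_share
    p = 1.0 - q
    if q >= p:
        return 1.0  # an attacker with half or more of the power always catches up
    return (q / p) ** confirmations

for z in range(0, 7):
    print(z, "confirmations:", catch_up_probability(0.10, z))
```

The value shrinks geometrically with each confirmation but never reaches zero, which is the sense in which there is no point where a member knows that consensus is final.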

6) You mention in the whitepaper, "In hashgraph, every container is used, and none are discarded"? Why is this important and why is this not a waste?

In Bitcoin, you may spend lots of time and electricity mining a block, only to discover later that someone else mined a block at almost the same time, and the community ends up extending their chain instead of yours. So your block is discarded. You don't get paid. That's a waste. Furthermore, Alice may have given you a transaction that ended up in your block but not in that other one. So she thought her transaction had become part of the blockchain, and then later learned that it hadn't.  That's unfortunate.

In hashgraph, the "block" (event) definitely becomes part of the permanent record as soon as you gossip it. Every transaction in it definitely becomes part of the permanent record.  It may take some number of seconds before you know exactly what position it will have in history. But you **immediately** know that it will be part of history. Guaranteed.

In the terminology of Bitcoin, the "efficiency" of hashgraph is 100%.  Because no block is wasted.

Of course, after the transactions have become part of the consensus order and the consensus state is signed, then you're free to throw away the old blocks.  But that isn't because they failed to be used.  That's because they **were** used, and can now be safely discarded, having served their purpose.  That's different from the discarded blocks in Bitcoin, which are not used, and whose transactions aren't guaranteed to ever become part of the history / ledger.

7) On page 8 of the whitepaper you wrote, "Suppose Alice has hashgraph A and Bob has hashgraph B. These hashgraphs may be slightly different at any given moment, but they will always be consistent. Consistent means that if A and B both contain event X, then they will both contain exactly the same set of ancestors for X, and will both contain exactly the same set of edges between those ancestors. If Alice knows of X and Bob does not, and both of them are honest and actively participating, then we would expect Bob to learn of X fairly quickly, through the gossip protocol. But the consensus algorithm does not make any assumptions about how fast that will happen. The protocol is completely asynchronous, and does not make assumptions about timeout periods, or the speed of gossip, or the rate at which progress is made." What if they are not honest?

If Alice is honest, then she will learn what the group's consensus is.

If Bob is NOT honest, then he might fool himself into thinking the consensus was something other than what it was. That only hurts himself.

If more than 2/3 of the members are honest, then they are guaranteed to achieve consensus, and each of them will end up with a signed state that they can use to prove to outsiders what the consensus was.  

In that case, the dishonest members can't stop the consensus from happening.  The dishonest members can't get enough signatures to forge a bad "signed state".  The dishonest members can't stop the consensus from being fair.

By the way, that "2/3" number up above is optimal.  There is a theorem that says no algorithm can achieve Byzantine fault tolerance with a number better than 2/3. So that number is as good as it can be.
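The thresholds in this answer reduce to simple integer arithmetic. A minimal Python sketch, with hypothetical helper names:

```python
def max_faulty(n: int) -> int:
    """Largest number of dishonest members f with f < n/3 (so n >= 3f + 1)."""
    return (n - 1) // 3

def supermajority(n: int) -> int:
    """Smallest member count that is strictly more than 2/3 of n."""
    return (2 * n) // 3 + 1

for n in (4, 7, 10, 100):
    print(n, "members: tolerate", max_faulty(n),
          "dishonest, need", supermajority(n), "honest")
```

For any community size, the members left over after removing `max_faulty(n)` dishonest ones still form a supermajority, which is why that many attackers cannot block consensus.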

8) Are the elections mentioned in the whitepaper to decide the order of transactions or information?

Yes.  Specifically, the elections decide which witness events are famous witnesses.  Then those famous witness events determine the order of events. Which determines the order of transactions (and consensus timestamps).

9) What makes yellow "strongly see" from the chart on page 8 of the whitepaper?

If Y is an ancestor of X, then X can "see" Y, because there is a path from X to Y that goes purely downward in the diagram.  If there are **many** such paths from X to Y, which pass through more than 2/3 of the members, then X can "strongly see" Y.  That turns out to be the foundation of the entire math proof.

(To be complete: for X to see Y, it must also be the case that no forks by the creator of Y are ancestors of X. But normally, that doesn't happen.)
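Here is a small Python sketch of "see" and "strongly see" as described in this answer. It makes several simplifying assumptions: forks are ignored entirely, an event counts as its own ancestor, the graph is a dictionary from event name to (creator, parents), and the function names are hypothetical rather than taken from the whitepaper.

```python
def ancestors(events: dict, x: str) -> set:
    """All events reachable from x by following parent links, including x."""
    seen, stack = set(), [x]
    while stack:
        e = stack.pop()
        if e not in seen:
            seen.add(e)
            stack.extend(events[e][1])
    return seen

def sees(events: dict, x: str, y: str) -> bool:
    return y in ancestors(events, x)

def strongly_sees(events: dict, x: str, y: str, num_members: int) -> bool:
    # Creators of the events that lie on some downward path from x to y.
    on_path = {events[e][0] for e in ancestors(events, x) if sees(events, e, y)}
    return 3 * len(on_path) > 2 * num_members

# Four members; each event is (creator, [self-parent, other-parent]).
events = {
    "a1": ("alice", []), "b1": ("bob", []),
    "c1": ("carol", []), "d1": ("dave", []),
    "b2": ("bob", ["b1", "a1"]),
    "c2": ("carol", ["c1", "b2"]),
    "d2": ("dave", ["d1", "c2"]),
}

print(sees(events, "b2", "a1"))              # b2 has a1 as an ancestor
print(strongly_sees(events, "b2", "a1", 4))  # paths touch only alice and bob
print(strongly_sees(events, "d2", "a1", 4))  # paths touch all four members
```

In this example `b2` sees `a1` but does not strongly see it, because the paths pass through only two of four members, while `d2` strongly sees `a1`.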

10) What's the difference between weak BFT (Byzantine Fault Tolerance) and strong BFT? Which are you using?

Hashgraph is BFT.  It is strong BFT.

"Weak BFT" means "not really BFT, but we want to use the term anyway".  

Those aren't really technical terms.  A Google search for "weak byzantine fault tolerance" (in quotes) says that phrase doesn't occur even once on the entire web.  And "weak BFT" (in quotes) occurs 6 times, none of which refer to Byzantine stuff.

People like to use terms like "Byzantine" in a weaker sense than their technical definition.  The famous paper "Practical Byzantine Fault Tolerance" describes a system that, technically, isn't Byzantine Fault Tolerant at all.  My paper references two other papers that talk about that fact.  So speaking theoretically, those systems aren't actually BFT.  Hashgraph truly is BFT.

We can also talk about it practically, rather than theoretically.  The paper I referenced in my tech report talks about how simple attacks on the network can almost completely paralyze leader-based systems like PBFT or Paxos.  That's not too surprising. If everything is coordinated by a leader, then you can just flood that leader's single computer with packets, and shut down the entire network.  If there is a mechanism for them choosing a new leader (as Paxos has), you can switch to attacking the new leader.  

Systems without leaders, like Bitcoin and hashgraph, don't have that problem.

Some people have also used "Byzantine" in a weaker sense that is called being "synchronous".  This means that you assume an honest computer will **always** respond to messages within X seconds, for some fixed constant X.  Of course, that's not a realistic assumption if we are worried about attacks like I just described.  That's why it's important that systems like both Bitcoin and hashgraph are "asynchronous".  Some people even like to abuse that term by saying a system is "partially asynchronous". So to be clear, I would say that hashgraph is "fully asynchronous" or "completely asynchronous".  That just means we don't have to make any assumptions about how fast a computer might respond.  Computers can go down for arbitrarily-long periods of time. And when they come back up, progress continues where it left off, without missing a beat.

11) Do "Famous witnesses" decide which transactions come first?

Yes. They decide the consensus order of all the events. And they decide the consensus time stamp for all the events.  And that, in turn, determines the order and timestamp for the transactions contained within the events.

It's worth pointing out that a "witness" or a "famous witness" is an event, not a computer. There isn't a computer acting as a leader to make these decisions.  These "decisions" are virtually being made by the events in the hashgraph. Every computer looks at the hashgraph and calculates what the famous witness is saying. So they all get the same answer. There's no way to cheat.

12) On page 8 of the whitepaper you write, "This virtual voting has several benefits. In addition to saving bandwidth, it ensures that members always calculate their votes according to the rules." Who makes the rules?

The "rules" are simply the consensus algorithm given in the paper.  Historically, Byzantine systems that aren't leader based have been based on rounds of voting.  In those votes, the "rules" are, for example, that Alice must vote in round 10 in accordance with the majority of the votes she received from other people in round 9.  But since Alice is a person (or a computer), she might cheat, and vote differently. She might cheat by voting NO in round 10, even though she received mostly YES votes from others in round 9. 

But in the hashgraph, every member looks at the hashgraph and decides how Alice is supposed to vote in round 10, given the virtual votes she is supposed to have received in round 9.  Therefore, the real Alice can't cheat. Because the "voting" is done by the "virtual Alice" that lives on everyone else's computers.

There are also higher-level rules that are enforced by the particular app built on top of the Swirlds platform. For example, the rule that you can't spend the same coin twice.  But that's not what that sentence was talking about.
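The reason virtual voting prevents cheating can be shown with a toy Python example. This is not the algorithm from the paper: the majority rule, the tie-breaking toward YES, and the name `required_vote` are assumptions of the sketch. The point is only that the vote is a deterministic function of data everyone already holds.

```python
from collections import Counter

def required_vote(votes_received: dict) -> bool:
    """The vote a rule-following member must cast, given last round's votes."""
    tally = Counter(votes_received.values())
    return tally[True] >= tally[False]  # assumption: ties count as YES

# Round 9 votes that the shared hashgraph shows Alice received.
round_9_seen_by_alice = {"bob": True, "carol": True, "dave": False}

# Every member computes "virtual Alice's" round 10 vote locally.
alice_round_10 = {
    observer: required_vote(round_9_seen_by_alice)
    for observer in ("bob", "carol", "dave")
}
print(alice_round_10)
```

Every observer derives the same vote from the same hashgraph, so the real Alice never gets a chance to send a different one.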

13) How are transactions validated and who validates them?

The Swirlds platform runs a given app on the computers of every member who is part of that shared world (a "swirld").  In Bitcoin terminology, the community of members is a "network" of "full nodes" (or of "miners"). The hashgraph consensus algorithm ensures that every app sees the same transactions in the same order. The app is then responsible for updating the state according to the rules of the application.  For example, in a cryptocurrency app, a "transaction" is a statement that X coins should be transferred from wallet Y to wallet Z. The app checks whether wallet Y has that many coins. If it does, the app performs the transfer, by updating its local record of how much is in Y and how much is in Z.  If Y doesn't have that many coins, then the app does nothing, because it knew the transaction was invalid.

Since everyone is running the same app (which is Java code, running in a sandbox), and since everyone ends up with the same transactions in the same order, then everyone will end up with the same state.  They will all agree exactly how many coins are in Y after the first 100 transactions. They will all agree on which transfers were valid and which were invalid.  And so, they will all sign that state. And that signed state is the replicated, immutable ledger.
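The cryptocurrency example above can be sketched in a few lines. The real app would be Java running on the Swirlds platform; this Python version is only an illustration, and the function name and the (amount, source, destination) tuple layout are assumptions.

```python
def apply_in_consensus_order(transactions: list, balances: dict):
    """Apply transfers in order; skip any the source wallet cannot afford."""
    state = dict(balances)
    accepted = []
    for amount, src, dst in transactions:
        if amount > 0 and state.get(src, 0) >= amount:
            state[src] -= amount
            state[dst] = state.get(dst, 0) + amount
            accepted.append(True)
        else:
            accepted.append(False)  # invalid: the app does nothing
    return state, accepted

genesis = {"Y": 10}
ordered = [(7, "Y", "Z"), (7, "Y", "W"), (3, "Y", "W")]

# Two members run the same app on the same transactions in the same order.
node_a = apply_in_consensus_order(ordered, genesis)
node_b = apply_in_consensus_order(ordered, genesis)
print(node_a == node_b, node_a)
```

Swapping the first two transfers would leave Z with nothing and W with everything, which is why agreeing on the order has to come before validation.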

14) What was the original motivation for creating Swirlds?

We can use the cloud to collaborate on a business document, or play a game, or run an auction. But it bothered me that "cloud" meant a central server, with all the costs and security issues that implies.  It bothered me a lot. 

It should be possible for anyone to create a shared world on the internet, and invite as many participants as they want, to collaborate, or buy and sell, or play, or create, or whatever.  There shouldn't be any expensive server. It should be fast and fair and Byzantine.  And the rules of the community should be enforced, even if no single individual is trusted by everyone. This should be what the internet looks like.  This is my vision for how cyberspace should run.  This is what we need.

But no such system existed.  Whenever I tried to design such a system, I kept running into roadblocks. It clearly needed to be built on a consensus system that didn't use much computation, didn't use much bandwidth, and didn't use much storage, yet would be completely fair, fast, and cheap.

I would work hard on it for days until I finally convinced myself it was impossible. Then, a few weeks later, it would start nagging at me again, and I'd have to go back to working intensely on it, until I was again convinced it was impossible.

This went on for a long time, until I finally found the answer. If there's a hashgraph, with gossip about gossip, and virtual voting, then you get fairness and speed and a math proof of Byzantine fault tolerance. When I finally had the complete algorithm and math proof, I then built the software and a company. The entire process was a pretty intense 3 years.  But in the end, it turned out to be a system that is very simple.  And which seems obvious in retrospect.

 SUMMARY:

The DAG with hashes is not new, and has been widely used. Using it to store the history of gossip ("gossip about gossip") is new.  

The consensus algorithm looks similar to voting-based Byzantine algorithms that have been around for decades. But the idea of using "virtual voting" (where no votes ever have to cross the internet) is new. 

A distributed database with consensus (a "replicated state machine") is not new. But a platform for apps that can respond to both the non-consensus and consensus order is new.

It appears that hashgraph and the Swirlds platform can do all the things that are currently being done with blockchain, and that hashgraph has greater efficiency. But hashgraph also offers new kinds of properties, which will allow new kinds of applications to be built.

Overall Consensus Methodology

What is the underlying methodology of the used consensus?

The Swirlds hashgraph consensus system is used to achieve consensus on the fair order of transactions. It also gives the consensus timestamps on when each transaction was received by the community. It also gives consensus on enforcement of rules, such as in smart contracts.

How many nodes are needed to validate a transaction? (% vs number)  How would this impact a limited participation network?

Consensus is achieved when more than 2/3 of the community is online and participating. Almost a third of the community could be attackers, and they would be unable to stop consensus, or to unfairly bias what order becomes the consensus for the transactions.

Do all nodes need to be online for the system to function?   What is the current number of nodes?

Over 2/3 of the nodes need to be online for consensus. If fewer are online, the transactions are still communicated to everyone online very quickly, and everyone will immediately know for certain that those transactions are guaranteed to be part of the immutable ledger. They just won't know the consensus order until more than 2/3 come online.

Does the algorithm have the underlying assumption that the participants in the network are known ahead of time? 

No, that's not necessary.  Though it can be run that way, if desired.

Ownership of nodes - Consensus Provider or Participants of Network?

The platform can be used to create a network that is permissioned or not.

What are current stages of mechanism?

Transactions are put into "events", which are like blocks, where each miner can mine many blocks per second. There is never a need to slow down mining to avoid forking the chain. The events are spread by a gossip protocol. When Alice gossips with Bob, she tells Bob all of the events that she knows that he doesn't, and vice versa. After Bob receives those, he creates a new event commemorating that gossip sync, which contains the hash of the last event he created and the hash of the last event Alice created before syncing with him. He can also include in the event any new transactions he wants to create at that moment. And he signs the event. That's it. There is no need for any other communication, such as voting. There is no need for proof of work to slow down mining, because anyone can create events at any time. 

When is a transaction considered "safe" or "live"?

As soon as Alice hears of a transaction, she immediately verifies it and knows for certain that it will be part of the official history. And so does anyone she gossips with after that. After a short delay (seconds to a minute or two), she will know its EXACT location in history, and have a mathematical guarantee that this is the consensus order. That knowledge is not probabilistic (as in, after 6 confirmations, you're pretty sure). It's a mathematical guarantee.

What is the Fault Tolerance?  (How many nodes need to be compromised before everything is shut down?)

This is Byzantine fault tolerant as long as less than 1/3 of the nodes are faulty / compromised / attacking.  The math proof assumes the standard assumptions: attacking nodes can collude, and are allowed to mostly control the internet. Their only limit on control of the internet is that if Alice repeatedly sends Bob messages, they must eventually allow Bob to receive one.

Is there a forking vulnerability?

The consensus can't fork as long as less than 1/3 are faulty / attacking.

How are the incentives defined within a permissioned system for the participating nodes?

Different incentive schemes can be built on top of this platform.

How does a party take ownership of an asset?

This is a system for allowing nodes to create transactions, and the community to reach consensus on what transactions occurred, and in what order. Concepts like "assets" can be built on top of this platform, as defined by an app written on it.

Cryptography/Strength of Algorithm:

How are the keys generated?

Each member (node) generates its own public-private key pair when it joins.

Does the algorithm have a leader or no?

No leader.

How is a node behavior currently measured for errors?

If a node creates an invalid event (bad hashes or bad signature) then that invalid event is ignored by honest nodes during syncs. Errors in a node can't hurt the system as long as less than 1/3 of the nodes have errors.

Governance:

How are controls/governance enforced?

If an organization uses the platform to build a network, then that organization can structure governance in the way they desire.

Tokenization (if used):

Is there any transaction signing mechanism?

Every event is signed, which acts as a signature on the transactions within it. An app can be built on top of this platform that would define tokens or cryptocurrencies.

Performance:

What is the current time measurement?  For a transaction to be validated? For consensus to be achieved?

The software is in an early alpha stage. The answers to this questionnaire refer to what the platform software will have when it is complete. For a replicated database (every node gets every transaction), it should be able to run at the bandwidth limit, where it handles as many transactions per second as the bandwidth of each node allows, where each node receives and sends each transaction once (on average) plus a small amount of overhead bytes (a few percent size increase). For a hierarchical, sharded system (where a transaction is only seen by a subset of the nodes, and most nodes never see it), it should be possible to scale beyond that limit. But for now, the platform is assuming a replicated system where every node receives every transaction. 
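The bandwidth-limit claim can be turned into back-of-the-envelope arithmetic. The sketch below assumes each node receives every transaction once plus a small overhead, as described above. The function name, the 3% default overhead and the example figures are illustrative assumptions, not measurements of the platform.

```python
def max_transactions_per_second(
    bandwidth_bytes_per_s: float,
    tx_size_bytes: float,
    overhead_fraction: float = 0.03,  # assumption: "a few percent" of extra bytes
) -> float:
    """Upper bound on throughput when a node receives each transaction once."""
    if bandwidth_bytes_per_s <= 0 or tx_size_bytes <= 0 or overhead_fraction < 0:
        raise ValueError("bandwidth and size must be positive, overhead non-negative")
    bytes_on_wire_per_tx = tx_size_bytes * (1.0 + overhead_fraction)
    return bandwidth_bytes_per_s / bytes_on_wire_per_tx


if __name__ == "__main__":
    # Hypothetical node: 100 Mbit/s link (12.5 MB/s) and 250-byte transactions.
    print(int(max_transactions_per_second(12_500_000, 250)))
```

The bound scales linearly with bandwidth and inversely with transaction size, which is why a sharded design (where most nodes never see most transactions) is the route to going beyond it.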

Security:

Does your mechanism have Digital Signature?

Yes, it uses standards for signatures, hashes, and encryption (ECDSA, SHA-256, AES, SSL/TLS)

How does system ensure the synchrony of the network (what is time needed for the nodes to sync up with network?)

No synchrony is assumed. There is no assumption that an honest node will always respond within a certain number of seconds. The Byzantine fault tolerance proofs are for a fully asynchronous system. The community simply makes progress on consensus whenever the communication happens. If every computer goes to sleep, then progress continues as soon as they wake up.  It should even work well over sneaker-net, where devices only sync when they are in physical proximity, and it might take days or months for gossip to reach everyone. Even in that situation, the consensus mechanism should be fine, working slowly as the communication slowly happens. In normal internet connections with a small group, consensus can happen in less than a second.

Do the nodes have access to an internal clock/time mechanism to stay sufficiently accurate?

There is a consensus timestamp on an event, which is the median of the clocks of those nodes that received it. This median will be as accurate as the typical honest computer's clock. This consensus timestamp does NOT need to be accurate for reaching consensus on the ordering of the events, or for anything important in the algorithm. But it can be useful to the applications built on top of this platform.
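A small sketch shows why a median is a sensible choice here: one badly wrong clock does not drag the result away from the honest ones. The function name and the sample timestamps are hypothetical.

```python
from statistics import median
from typing import Sequence


def consensus_timestamp(received_at: Sequence[float]) -> float:
    """Median of the times at which each node reports receiving the event."""
    if not received_at:
        raise ValueError("at least one timestamp is required")
    return median(received_at)


if __name__ == "__main__":
    # Four honest clocks and one wildly wrong clock (seconds, illustrative values).
    clocks = [100.0, 100.2, 100.1, 100.3, 9999.0]
    print(consensus_timestamp(clocks))
```

With an average, the faulty clock in the example would push the timestamp out by thousands of seconds. With the median, the result stays among the honest values.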

Privacy:

How does system ensure privacy?

The platform allows each member to define their own key pair, and use that as their identity. If an app is built on top of this platform to establish a network, the app designer can decide how members will be allowed to join, such as by setting up a CA for their keys, or by having votes for each member, or by using proof-of-stake based on a cryptocurrency, etc.  The app can also create privacy, such as by allowing multiple wallets for one user. But the platform simply manages consensus based on a key pair per node.

Does the system require verifiable authenticity of the messages delivered between the nodes (Is signature verification in place?)

Yes, everything is signed, and all comm channels are SSL encrypted. 

How does data encryption work?

All comm during a gossip sync is SSL/TLS encrypted, using a session key negotiated using the keys of the two participants.  If an app wants further encryption, such as encrypting data inside a transaction so that only a subset of the members can read it, then the app is free to do so, and some of the API functions in the platform help to make such an app easier to write.

Implementation Approach

What are the current use cases for the Consensus Mechanism?

In addition to traditional use cases (cryptocurrency, public ledger, smart contracts), the consensus mechanism also gives fairness in the transaction ordering.  This can enable use cases where the order must be fair, such as a stock market, or an auction, or a contest, or a patent office, or a massively multiplayer online (MMO) game.

Who are you currently working with (venture capitalists, banks, credit card companies, etc.)?

Ping Identity has announced a proof of concept product for Distributed Session Management built on the Swirlds platform. Swirlds, Inc. is currently funded by a mixture of investors including venture capital, strategic partner, and angel funding.

Consensus: A Deeper Dive into the State of Blockchain

Q&A with George Samman, co-author of KPMG’s report: “Consensus: Immutable agreement for the Internet of value”

This interview is posted on both www.sammantics.com and www.bitsonblocks.net

Interviewer is Antony Lewis (AL) and interviewee is George Samman (GS).

AL

George, it’s a pleasure to chat with you.  The KPMG report “Consensus: Immutable agreement for the Internet of value” you co-authored was an interesting read and shone a light on some of the challenges facing private blockchains and distributed ledgers.  How would you summarise the findings?


CONSENSUS

GS

One of the key findings is that getting consensus right is really hard and some of the brightest minds in the space are coming to terms with this and re-examining a lot of the work they have done or researched.  Trying to re-model existing blockchain technology turns out not to be the answer.   

AL

When you say “getting consensus right”, what do you mean?  Do you mean multiple databases all reaching the same state quickly, or do you mean something else?

GS

Consensus has been around for as long as human beings have formed societies and needed to come to a state of agreement without necessarily trusting each other.  For purposes of this interview, we can refer to consensus computing for distributed systems.   In this context, it’s a way for nodes to agree on the validity of a transaction and update the ledger with a coherent set of confirmed facts.  

AL

How would you describe the problems around achieving consensus?

GS

Keeping data in sync and ordering transactions are what consensus mechanisms are supposed to do.  The main problem that is being researched is around network stability and latency.

AL

Why is getting consensus right hard?  Why is the consensus methodology important?

GS

Most of the material on the subject of consensus comes from academic papers and applications in other industries like air traffic control or stabilizing airplanes.  The challenges are very different to the consensus challenges in capital markets - this hasn’t been done before and the issues are different.

For example, ordering becomes really important when you are dealing with stock market order books.  If multiple people are bidding for a stock at the same price who is the first one to get that price?  An issue of fairness also comes into play which some blockchain systems suffer from because of how they are attempting to achieve consensus. Leader based consensus systems have this problem because the leader selects the ordering of data, so you end up with a centralisation of control, which is what we are trying to avoid. So depending on the use case, the consensus mechanisms themselves become extremely important.

Further, with certain consensus systems, it turns out that there is a maximum number of nodes you can have before the system breaks. This is certainly an additional complexity if you need a lot of nodes in a network where parties do not trust each other but want to have equivalent write-access to the same ledger.

Getting consensus right is critical particularly when nodes can be located all over the world, and network latency adds another layer of complexity to system stabilization.

AL

Point taken on pre-trade orderbooks - I suspect that’s why this isn’t an area of focus any more for private blockchain vendors to financial service companies.

In terms of node distribution or decentralisation, I don’t see any reason why nodes in a high throughput distributed ledger will end up being scattered across the world.  Although with Bitcoin, we currently see geographical distribution for nodes, I think that any successful distributed ledger for the traditional financial industry will have nodes clustered in the same data centres, in the same building, where a number of banks rent hardware sitting physically next to each other, connected with short cables.  This should help to reduce some of the latency issues.  Of course this will be replicated to other data centres as redundant backups in case the main one fails.

To summarise, the ‘distributed’ in ‘distributed ledger technology’  will be ownership distribution rather than geographic distribution.

GS

That makes sense.  Although, if you want true distribution of the information, geographically distributing the nodes and using different cloud providers for the nodes add an extra layer of distribution and security.


SCALABILITY

AL

Moving on from consensus, to the concept of scalability and transaction throughput.  In financial markets a lot of tickets are printed, from the start of the process with orders being submitted and cancelled within milliseconds, through to matched trades and eventually settlement.  Clearly you need throughput.

GS

The problem of consensus becomes harder by orders of magnitude when dealing with the number of transactions financial institutions make. It’s essential for the network to be up and running all the time. Scaling to tens of thousands of transactions per second and beyond while keeping the network up and running is extremely difficult. This is why there aren’t many projects that are in production and able to do this as of today.  It’s a big challenge.   A general principle that could be considered is to run two sets of consensus mechanisms, one which runs locally and one which runs globally, and make them intersect; this could be done at intervals of time in a Baby Step Giant Step (BSGS) manner.

Regarding scalability, the notion that you start a blockchain with an endless lifetime is still preliminary. The reasoning for that is threefold:

  1. Public blockchains are supposed to be everlastingly immutable, but are immature and have not yet figured out how to deal with unknown issues, as we have seen with the recent issues with The DAO.

  2. Technological innovation has yet to come up with a suitable solution for the transaction volume common in the financial sector, and this also then becomes a consensus problem as well.

  3. Configurations - you can’t deploy a single service solution until you’ve tested and retested the correct configurations for such a vital network railroad.

AL

I have seen internal “Proof of Concepts” where a single or double node blockchain is spun up, with a user-friendly front end.  They seem to work, at a rudimentary level.  Surely it’s now a case of institutionalising the technology?

GS

Yes, you are right: the Proof of Concepts are validating that the technology “might be able to live up to its promise.”  They also have great marketing value.  However, this is a long way off from institutionalized technology and the inherent stability necessary for this.  Institutional technology needs to be battle tested and hardened, and has to be as antifragile as possible.  Hence, I believe the cycle to get the technology up to an acceptable level for satisfying a switchover will be longer than people think.  There can be no room for mistakes even if there are inherent benefits in the technology.

AL

Ok, aside from consensus and scalability, what are the other challenges facing the private “DLT” space?

GS

I think one of the challenges continues to be a lack of industry standards. The longer that common standards and protocols aren’t agreed and written up by industry participants, the more harmful it can be when trying to integrate different solutions, and the further away from interoperability we become.  Is distributed ledger technology creating the next legacy system problem?

Another problem is a technical problem around interoperability with existing systems and potentially between different blockchain networks.  I think this directly correlates to the above point about standards and protocols.  How will these ledgers interact if they are built separately for different types of assets and then work with existing market infrastructure technology?

What we are seeing is the exact opposite of a common framework being adopted: people are trying all sorts of different things.

AL

Sure, but that’s what you would expect with a new set of technologies - and then some sort of Darwinian selection process will occur, and the best will win.  At the heart of it, this seems to be an interoperability and standards discussion. APIs and inter-system communication come to mind here.  It seems that a lot of the interoperability issues could be fixed by creating standards for common APIs.  You then remove the privacy concerns of shared ledgers.  But APIs don’t solve for immutability and decentralised control - if that’s really what’s wanted.


EMERGING SOLUTIONS


AL

An interesting takeaway is that R3 is not building a blockchain.  That’s surprising to some people - one of the world’s most well known “blockchain companies” is not building a blockchain!

GS

I think it’s surprising because some people thought that private “distributed ledger technology” would be the panacea to cure us of all the “ills” of public blockchains (total transparency of network, mining and potential centralization, anonymous actors and the slow speed of transaction times) - however we have seen that is not the case.  In my opinion, R3 realized that the financial problems they aim to solve are not blockchain compatible at the present time.

We are seeing the number of nodes in these distributed ledger networks shrink all the way down to two - i.e. back to bilateral communication, with “consensus” being the two nodes agreeing. This is centralization and the exact opposite of what blockchains try to solve for.  A blockchain is supposed to offer immutable consensus. The benefits of transparency, no middleman, p2p transacting without needing trust and speed are what appealed to me about blockchains to begin with.

This also applies to replication of the data: while this can certainly be permissioned to allow certain nodes to have certain actions, those in the network benefit by knowing that whatever business logic was supposed to happen did happen the way it was supposed to.

Well, in every system and with every tool there are tradeoffs. When you are performing certain capital market operations, and privacy and confidentiality are most important, a distributed ledger may not be your best tool. Particularly when we are still trying to get consensus right for scaling to hundreds of thousands of transactions per second.

Corda solves the consensus and ordering problems by not being a blockchain requiring network consensus: instead, computers communicate and agree bilaterally.  This gets rid of a lot of the complexity involved with the privacy issues of forming a network with your competitors.  This also brings in a great debate about whether or not a blockchain will be the end solution and if that solution will need consensus. Let the debate begin!  In my opinion Corda can be considered more of an oracle that can connect to blockchains if necessary.

AL

What do you mean, an oracle that can connect to blockchains?

GS

What I mean by oracle is a bridge between a blockchain and the outside world, so in the case of Corda it’s a platform that can speak to blockchains but is not a blockchain itself.

AL

On 2 June this year, Morgan Stanley Research published a report stating “For ASX, blockchain deployment seeks to reduce market costs, defend the clearing monopoly and grow new revenue streams”.  It’s amazing that we have moved from blockchains being “disruptive” to blockchains being used to “defend the clearing monopoly” so quickly!  No wonder there is confusion!  I tried to clarify this here: https://bitsonblocks.net/2016/05/09/confused-by-blockchains-revolution-vs-evolution/

GS

You are getting from the banks a lot of Orwellian doublethink. This is the ability to hold two contradictory thoughts in your head and believe that they both are true.  In this case that blockchains will change the world but we can’t use them properly for certain things we need to do, in a way we are comfortable doing them.

There have also been cautious, or even negative sentiments in recent days about the utility of blockchain technology. The CEO of ICE is not convinced about blockchains.

AL

Sure, some will hate, some will love.  What are you getting at?

GS

I would just say be cautious of false narratives and that there is a deep need for understanding what this technology is really good at, and what it might not be good at.

For me, consensus is a feature not a bug.  A blockchain is a transparency machine like nothing that has come before it. Therefore, if you want a blockchain, look for use cases where total transparency is suitable. There are three questions that need to be answered in order to help you guide your decision making:

1) Who are you?
2) What do you want to achieve?

3) Who will be the nodes?

If you can answer these questions then you are on your way to figuring out consensus and the types of permissions you want to configure in your blockchain experiment.

Knowing the types of entities that are going to be involved in the transaction as well as the type of assets being created are also big steps.  Once you have a handle on this figuring out the consensus layer is much easier and you can start to build applications on top of the stack.

AL

What about the Proof of Concepts that we are seeing in the media?

GS

A lot of the use cases that companies are going after right now don’t need a blockchain - or at the very least, the end clients, often banks, don’t yet need a blockchain solution for them.  A lot of the blockchain companies are also Proof of Concepts themselves and have still not been taken “out of the box.”  This is where separating hype from reality starts. I also think a lot of use cases people are looking at for a blockchain to solve are things that aren’t meant to be solved by a blockchain.

From the company side it is important to define your identity: Are you a blockchain company or a company using a blockchain? There is a big difference. For example, if you are working on solving issues in trade finance and you are using a blockchain as a solution, unless you are designing your own platform from scratch, you are just improving efficiencies using technology, but you are still a trade finance company.


INDUSTRY


AL

Clearly we’re at the start of the innovation cycle and the problem is just that the hype and promise has accelerated and deviated from how quickly we can deliver.  This is an unfortunate reality, but sometimes necessary to attract the investment needed to light up a technology.  Can we reach the promised land of $20bn reduced annual cost by using distributed ledgers?

GS

I think eventually we do reach the $20 billion mark, and that’s nice, but it’s not revolutionary.  It’s also a drop in the bucket compared to what banks spend today.  In order to get there and switch systems, the costs saved will need to outweigh the money invested to do that. That hurdle may be too large to jump.  Maybe the way to think about it is: are there other accrued costs which will also be saved, aside from just reducing settlement costs and back-office savings?  The answer to this is yes.

While the cost savings are appealing to banks for many reasons talked about, I think the more relevant story will be how can we generate revenue from the apps built on DLT technology.  While ideas are floating around now, the reality will probably look very different from their original conception.

AL

You’re talking about how blockchain / DLT technology providers will monetise?

GS

Yes, the VC funding cycle has become long in the tooth and the IPO market is no longer attractive. Some of the private company tech darlings, including Airbnb are getting valuation writedowns.  The FinTech narrative is starting to question monetisation paths, and where the revenues will come from when VC money dries up.

AL

Scary picture - when and how will VC money dry up?

GS

This can come from rate hikes in the future or recession or some shock to the system.  It’s hard to predict, however the funding cycle has become long in the tooth.  Global growth has slowed and even Mary Meeker pointed this out in exquisite detail in her latest state of the union.

Particularly in the blockchain space, The DAO should be looked at as the top. This is really madness in a lot of ways, and the sheer amount of money that was raised is astounding. I think we are post peak-hype and reality will start to set in sooner rather than later.

AL

The DAO raised the USD equivalent of $150 million from pseudonymous investors, to fund unknown projects, to be voted on by unknown participants.  That really does seem pretty nuts.  It was also hacked recently - or at least it behaved exactly as it was coded to, to the detriment of many of the investors who had not scrutinised the code.

So the billion dollar question - as celebrated startups move towards becoming technology providers, unable to monetise on a “per ticket” basis, how are the company valuations justified?  Who should we invest in?

GS

Valuations seem to take as a starting point the valuation at which someone else raised their last round, particularly for the bigger startups who are raising later stage rounds.

The financial service companies investing in these large rounds will not be taken for fools.  They understand valuation like no other.  What is interesting is the lack of high profile VC’s investing in these bigger rounds. The funding seems to be coming from the titans of finance and those that are at risk of being “disintermediated” by cryptocurrencies.  It’s a good narrative-control play.

The funding source from finance titans can also come back and bite DLT startups. If they are beholden to only the established incumbents, they might not be able to design the disruptive ecosystems promised by blockchain technology.

I think it’s way too early to predict any clear cut winners. I would be investing in the cloud companies that will be hosting all these companies, their data and their applications, and also the companies that are using blockchain technology properly.  This is not an easy thing to do when people are trying to fit square pegs into round holes. Simplicity always wins.

AL

What’s next for the DLT / Blockchain industry?

GS

Companies need to deliver, and companies need to make money to stay in business, therefore if you are under certain time constraints to make something people want and there are still inherent problems in the technology you want to use, you pivot to making things that can improve existing workflows.

This is what you have called “industry workflow tools” in your blog and although some costs may be saved, this doesn’t transform finance any more than the next efficiency technology.  In fact in many ways it exposes us to the same risks as have been seen in the past because privacy and confidentiality are more important than anything else for banks performing capital market operations.

The problem with this thinking is that this does nothing to benefit the consumer except maybe faster transaction times. The customer experience should be a major focus for banks as they already are one of the most hated brands for young couple consumers.

AL

Perhaps some of the cost savings will be passed to consumers, settlement will speed up, and collateral released so that businesses can make better use of working capital.

GS

We all hope so!

AL

Thanks for your time George!

Interviewee: George Samman is a blockchain advisor and consultant to global companies and startups as well as Entrepreneur in Residence at Startupbootcamp for bitcoin and blockchain in New York City. George also writes a blog on blockchain technology and use cases at sammantics.com and he can be found on twitter @sammantic

Interviewer: Antony Lewis is a cryptocurrency and blockchain consultant to financial institutions, policymakers, and professional services firms.  Antony lives in Singapore and writes a blog about bitcoins and blockchains, found at www.bitsonblocks.net

‘Immutable Me’ A Discussion Paper Exploring Data Provenance to Enable New Value Chains

The ID2020 Annual Summit will bring together industry leaders, NGOs, governments, emerging technology pioneers and cross-industry experts from both the public and private sector. The aim is that, together, participants will foster a global conversation and build a working coalition to identify and build the enabling conditions for the creation of a legal digital identity for all individuals at risk.

In advance of the ID2020 all participants were requested to submit a paper on decentralised identity, or specific problems that could be solved via decentralisation or web-of-trust solutions.

The following paper was authored and submitted by George Samman and Katryna Dow for the Web-of-Trust Workshop following ID2020. This Meeco paper explores the idea of an ‘Immutable Me’ – a step towards individuals having the means to decentralise attributes, claims, verification and provenance of their personal data. George Samman will represent Meeco at ID2020 and the Web-of-Trust Conference.

Problem Statement

With the advent of blockchain, is there an opportunity to add a distributed layer between the data value and the consumer of personal data?

Furthermore, does the verification and provenance of the data enable an attestation, provided by a relying party, to eliminate the need to give up Personally Identifiable Information (PII) at all?

How can we enable people to access all the data they generate with full control of their master record and permission data on their terms using verified attributes without sacrificing their privacy?

“Up until now the power to capture, analyse and profit from personal data has resided with business, government and social networks. What if you and I had the same power?”
– Meeco Manifesto 2012

According to the QUT and PwC Identity 3.0 white paper:

“Developed economies are moving from an economy of corporations to an economy of people. More than ever, people produce and share value amongst themselves, and create value for corporations through co-creation and by sharing their data. This data remains in the hands of corporations and governments, but people want to regain control. Digital identity 3.0 gives people that control, and much more.”

Identity is moving beyond issued instruments like passports, social security cards and ID cards. It is moving towards contextual identity in which “I can prove who I am” (persona) in the context of what I am doing.

Government issued identity instruments are relatively easy to forge. Every day, stolen identities are exploited through organised crime and on-line hacking activities. Conversely, personal attributes, behaviour, social and reputational data is more difficult to forge, in part because it makes up an immutable timeline of our life.

Increasingly the sum of the parts of our digital exhaust and social presence creates a strong identity. However the opportunity for individuals to use this for their own benefit is limited.

Proposal

The movement from User Centric Identity to Self Sovereign Identity is underway and becoming a key trend for the future of how individuals will control their own attributes.

Using blockchain technology to record data provenance, Meeco is working at the forefront of this movement and currently working on a proof of concept that allows individuals to add verification and provenance attestations to their data attributes. This is in addition to the existing permission management of their attributes.

Meeco aims to be blockchain agnostic (since what type of ledger is used will be use case dependent), thus enabling individuals to link provenance to data/attributes to support a range of personas and enable progressive disclosure. This approach also supports the option for individuals to use private, public, permissioned and permissionless chains.

The identity pieces (data and attributes) can be use-case sensitive, thus creating context-based personas through unifying only the relevant attributes across different chains.

Personal control is central to increasing the power individuals hold over the range of attributes and claims that make up their identity. Enabling identity markers to be thin sliced, refined and contextual provides added privacy protection. The combination of attribute, verification and provenance provides the capability for data governance to move from data collection of personally identifiable information (PII), to binary pull requests, i.e. over 18 years (yes/no) versus date of birth.
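As a rough illustration of a binary pull request, the sketch below answers "over 18?" with yes/no while the date of birth never leaves the individual's side. The function names and the claim name "over_18" are hypothetical and not part of Meeco's product.

```python
from datetime import date
from typing import Dict


def age_in_years(date_of_birth: date, today: date) -> int:
    """Whole years elapsed between date_of_birth and today."""
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)


def answer_claim(attributes: Dict[str, date], claim: str, today: date) -> bool:
    """Return only a yes/no answer; the underlying attribute is never returned."""
    if claim == "over_18":
        return age_in_years(attributes["date_of_birth"], today) >= 18
    raise KeyError(f"unsupported claim: {claim}")


if __name__ == "__main__":
    my_attributes = {"date_of_birth": date(2000, 6, 15)}  # stays with the individual
    print(answer_claim(my_attributes, "over_18", date.today()))
```

The relying party receives a single boolean, so there is no date of birth for it to collect, store or leak.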

This approach provides protection as the individual solely has the power to bring these attributes together with the added value of verification. For the relying party, the option exists to store the provenance rather than the attribute on public and private blockchains and distributed ledgers, thus providing an immutable audit trail for assurance without the compliance risk of collecting, managing and holding the data.
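One common way to store provenance rather than the attribute itself is a salted hash commitment. This is an assumption on my part about how it could be done, not a description of Meeco's design. The individual keeps the value and the salt, and only the commitment would be written to a ledger.

```python
import hashlib
import hmac
import secrets


def new_salt() -> bytes:
    """Random salt so that identical attribute values give different commitments."""
    return secrets.token_bytes(16)


def commit(attribute_value: str, salt: bytes) -> str:
    """Hex SHA-256 commitment; this is the only thing the relying party stores."""
    return hashlib.sha256(salt + attribute_value.encode("utf-8")).hexdigest()


def verify(attribute_value: str, salt: bytes, commitment: str) -> bool:
    """Check a later disclosure against the stored commitment."""
    return hmac.compare_digest(commit(attribute_value, salt), commitment)


if __name__ == "__main__":
    salt = new_salt()
    stored_on_ledger = commit("1990-01-01", salt)          # illustrative attribute value
    print(verify("1990-01-01", salt, stored_on_ledger))     # matching disclosure
    print(verify("1990-01-02", salt, stored_on_ledger))     # altered disclosure
```

The relying party can later prove what was attested without ever holding the attribute. The salt matters because low-entropy values such as dates of birth could otherwise be guessed from an unsalted hash.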

Why add Provenance?

Provenance refers to the tracking of supply chains and provides a record of ownership via an audit trail, which can be ordinated (the specific order matters). In the case of attributes and claims it is important that the data can point back to a reliable record (the golden record) and that this is shown to be immutable.
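An ordered, tamper-evident audit trail of this kind can be sketched as a simple hash chain, where each entry commits to the one before it. The field names and helper functions below are illustrative assumptions. A real blockchain adds replication and consensus on top of this basic structure.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(body: Dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(trail: List[Dict], actor: str, action: str) -> Dict:
    """Append an entry that commits to the hash of the previous entry."""
    body = {
        "index": len(trail),
        "actor": actor,
        "action": action,
        "prev": trail[-1]["hash"] if trail else GENESIS,
    }
    entry = dict(body, hash=_entry_hash(body))
    trail.append(entry)
    return entry


def verify_trail(trail: List[Dict]) -> bool:
    """True only if every entry is unmodified and in its original position."""
    prev = GENESIS
    for position, entry in enumerate(trail):
        body = {key: entry[key] for key in ("index", "actor", "action", "prev")}
        if entry["index"] != position or entry["prev"] != prev:
            return False
        if _entry_hash(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers the previous one, editing or reordering any step breaks verification for everything after it. This is what lets the trail point back to a reliable golden record.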

It’s important for purposes of integrity, transparency and counterfeiting that this asset and its path be known or provable every step of the way.  A supply chain refers to the creation of a network in which an asset is moved, touching different actors, before arriving at a destination.  It helps bring time and distance together in a manageable way. The tracking of this asset has real value to the actors involved. This equally applies to identity and all the components that make up that identity.

This approach is a pathway to turning data (single attributes) into networked data flows that compound in value and become assets similar to financial synthetics such as asset backed securities (ABS) as much as anything else in the world considered to be an asset such as Norwegian salmon, diamonds, prescription drugs, or Letters of Credit (LOCs) and Bills of Lading (BOLs).

It is important to note that this is not one master identity (record) token that is tokenized and travels through a network, but rather the attributes associated with that identity that are called upon based on context.

In order for data provenance to be effective (from a technology standpoint), it must fulfill certain requirements and characteristics that would upgrade supply chain management and monitoring. According to Sabine Bauer's paper “Data Provenance in the Internet of Things”, they are:

  • Completeness: every single action which has ever been performed is gathered.
  • Integrity: data has not been manipulated or modified by an adversary.
  • Availability: the possibility to verify the collected information. In this context, availability is comparable to auditing.
  • Confidentiality: access to the information is reserved for authorized individuals only.
  • Efficiency: provenance mechanisms to collect relevant data should have a reasonable expenditure.

Additionally, Meeco believes these requirements call for the following characteristics:

  • Privacy: the ability to control the access, permission and visibility of personal attributes.
  • Unlinkability: a state where this personal data is secure and does not get into the wrong hands.
  • Transparency: openness of the chain and total traceability of the transactions, particularly when it comes to actions and modifications.

A Blockchain fulfils all of these requirements.

How Will Meeco Link Data Provenance To Attributes on a Blockchain?

Blockchain allows for Digital Identity 3.0. Quoted from the PwC paper:

“Digital identity 3.0 is a private and integrated master record that exists independently of any immediate commercial or legal context. It empowers people to create new attributes, share these attributes selectively as they connect with others, and create experiences and value beyond what can be predicted”.

Master Record

For preservation of privacy there must be some way to protect the master record and the range of attributes which can be associated with a master record where explicit permission has not been granted.

 

The primary purpose of a master record is to create value specifically for the individual. However, the existence of this verified master record can in return, if permissioned, create significant value for receiving parties; i.e. organisations, enterprises, institutions and other individuals.

The master record is not intended to be stored or visible on the blockchain. The intention is not to permission the master record, but to reference back to its immutable existence. This way the master record can support infinite links to data attributes of association, without linking all the attributes in one chain. This master record will have an anonymous unique identifier that is only known to the owner via private keys.

Once these attributes are created they can be put on a supply chain without the need to share the entire identity; only the value needed to validate its integrity is shared.

The Value of Provenance For Privacy Preservation

Tracking the origin and movement of data across a supply chain in a verifiable way is a difficult thing to do. In supply chains stretching across time and distance, all of these items could suffer from counterfeiting and theft. Establishing a chain of custody that is authenticated, time-stamped and replicated to all interested parties is paramount to creating a robust solution.

The problem can be addressed using blockchains in the following way:

  1. When the data is created, a corresponding digital token is issued by a trusted entity, which acts to authenticate its point of origin (the attribute).
  2. Then, every time the data (that is, the attribute associated with that identity or persona) changes hands, the digital token is moved in parallel, so that the real-world chain of custody is precisely mirrored by a chain of transactions on the blockchain.
  3. A tokenized attribute is an on-chain representation of an item of value transferred between participants, in this case proof of the golden record.
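
The three steps above can be sketched as a small hash-linked chain of transfers. This is a toy model under stated assumptions: the `Transfer` record, the `GENESIS` marker and the helper names are all hypothetical, and a real ledger would also require each transfer to carry a digital signature from the sender, which is left out here for brevity.

```python
import hashlib
import json
from dataclasses import dataclass

GENESIS = "0" * 64  # placeholder "previous hash" for the issuing transaction


@dataclass(frozen=True)
class Transfer:
    token_id: str
    sender: str
    recipient: str
    prev_hash: str

    def digest(self) -> str:
        """SHA-256 over the transfer's fields, used to link the next transfer."""
        payload = json.dumps(
            [self.token_id, self.sender, self.recipient, self.prev_hash]
        ).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def issue(token_id: str, issuer: str, first_holder: str) -> list:
    """Step 1: a trusted entity issues the token at the point of origin."""
    return [Transfer(token_id, issuer, first_holder, GENESIS)]


def transfer(chain: list, recipient: str) -> list:
    """Step 2: the token moves in parallel with the attribute."""
    last = chain[-1]
    return chain + [Transfer(last.token_id, last.recipient, recipient, last.digest())]


def verify_chain(chain: list, trusted_issuer: str) -> bool:
    """The final recipient walks the chain back to the point of origin."""
    if not chain:
        return False
    if chain[0].sender != trusted_issuer or chain[0].prev_hash != GENESIS:
        return False
    for prev, cur in zip(chain, chain[1:]):
        if cur.prev_hash != prev.digest():
            return False  # link to the previous transfer is broken
        if cur.sender != prev.recipient or cur.token_id != prev.token_id:
            return False  # token was moved by someone who did not hold it
    return True
```

Any transfer that was not made by the current holder, or that does not reference the previous transfer's hash, causes verification to fail.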

This token acts as a virtual ‘assertion of identity’, which is far harder to steal or forge than a piece of paper. Upon receiving the digital token, the final recipient of the physical item, whether a bank, distributor, retailer or customer, can verify the chain of custody all the way back to the point of origin.

This digital identity token can also act as a mechanism for collectively recording and notarizing and linking any type of data back to the master record.  A blockchain can provide a distributed database by which all the records and assertions about an attribute are written and linked, accompanied with a timestamp and proof of origin that ties back to the golden record token in a most verifiable way.  An example could be a hash of a record that a certain element of my attribute or claim was verified when engaged in a certain type of action. This distributed database also stops corruption and theft by storing the multiple pieces of our attributes in a highly distributed manner that require the proper keys to open and put back together.

This approach is designed to counter the current problem of how companies collect personally identifying data and then use it to track individuals without explicit or informed consent. The data is used to target individuals with the aim of molding and influencing behavior. This current approach of tracing our identity does not allow individuals to control their attributes or the elements that link them; as a result, we don’t get to realise the value of, or monetize, the data companies collect on us.

How Will The Data Get Stored?

The Multichain blog eloquently describes how data can be optimally recorded on the blockchain and this approach is informing how Meeco is approaching proof-of-concepts:

“In terms of the actual data stored on the blockchain, there are three popular options:

  1. Unencrypted data. This can be read by every participant in the participating blockchain, providing full collective transparency and immediate resolution in the case of a dispute.
  2. Encrypted data. This can only be read by participants with the appropriate decryption key. In the event of a dispute, anyone can reveal this key to a trusted authority such as a court, and use the blockchain to prove that the original data was added by a certain party at a certain point in time.
  3. Hashed data. A “hash” acts as a compact digital fingerprint, representing a commitment to a particular piece of data while keeping that data hidden. Given some data, any party can easily confirm if it matches a given hash, but inferring data from its hash is computationally impossible. Only the hash is placed on the blockchain, with the original data stored off-chain by interested parties, who can reveal it in case of a dispute.”

For Meeco's purposes, option (3), hashed data, is our area of focus. Meeco advocates that the data be at rest in a distributed network, e.g. financial data might remain with the issuing bank, medical records with a physician and student records with the school admin system. In turn, all these records can be augmented by the individual's own records, stored in personal clouds with read and write access based on permissions.
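
A minimal sketch of option (3) using Python's standard library: only the hash would be written to the ledger, while the record stays off-chain with its holder. The random salt is an assumption added here, not something the quoted passage specifies; without it, a low-entropy attribute such as a date of birth could be recovered from its hash by simply trying every possible value. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def new_salt() -> bytes:
    """Random salt kept off-chain alongside the record (an assumption of this sketch)."""
    return secrets.token_bytes(16)


def commit(record: bytes, salt: bytes) -> str:
    """Compute the fingerprint that would be placed on the blockchain."""
    return hashlib.sha256(salt + record).hexdigest()


def matches(record: bytes, salt: bytes, on_chain_hash: str) -> bool:
    """In a dispute, check revealed off-chain data against the on-chain hash."""
    return hmac.compare_digest(commit(record, salt), on_chain_hash)
```

In a dispute, the holder reveals the record and salt, and any party can confirm they match the hash that was recorded at the time.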

Where the Meeco API-of-Me is used for a direct implementation between peers (individuals and organisations) connected parties can benefit from real-time, read, update, and audit records. Only the individual has the ability to federate attributes into a single context specific view.

Data at rest can reside in a personal cloud and/or enterprise data store. The value created is how these data sources interoperate through an audit trail and order of tasks, aligned to a specific outcome, in a specific context using the minimal number of attributes required to disclose at each step of the process.

It is the sum of the parts that creates a strong, sovereign and immutable series of records. When backed by assertions, verifications and provenance, this master record of records becomes a valuable personal asset. In this scenario the individual will enjoy the same rights and value that governments and institutions currently have.

The ultimate aim is to provide individuals with the means to collect and connect attributes that strengthen their personal and context based assertions. This may include ‘I am’ statements such as:

  • I am a citizen of
  • I am a qualified professional in the field of
  • My income can support my application for
  • I am old enough to access this service
  • My delivery address for today is
  • I can be discovered by.

Conclusion

We are entering what some are calling the 4th phase of the Internet, defined by the right of individuals to assert their sovereignty. In this context we are moving beyond send, search and social to the vantage point of self-sovereignty.

A blockchain or distributed ledger will act as an enabler and supporter of realizing this value.

This approach supports the concept of ‘minimum viable identity information’. This is dependent on the situation, use-case and context. This is why personas become so critical as a means to protect identity, minimize the cost of collection, increase compliance and enable new networked value flows. This network of value flows enables individuals to engage in transactions for a particular need at a particular point in time for a particular desired outcome.

This is the vantage point from which the individual generates identity through everyday activities, assertions, attributes, claims and context. By placing the individual above the attributes collected by and about them, they are able to orchestrate a value flow across the silos of their life, enabling new value chains.

Given the increasing security, fraud and counterfeiting issues associated with the current model of collecting and storing personal data, compliance and risk mitigation will drive the opportunity for individuals to control and permission their personal data.

Digital Identity 3.0 together with blockchain and distributed ledger technology takes us towards the vision that we (humans), may become the custodians, issuers and provenance providers of our identity.

Glossary

The following terms are relevant to this article. These are just a subset of the terms generally used to discuss digital identity, and have been minimized to avoid unnecessary complexity.

The definitions below come from the blog of Christopher Allen, unless otherwise referenced:

Attributes
Every Digital Identity has zero or more identity attributes. Attributes are acquired and contain information about a subject, such as medical history, purchasing behaviour, bank balance, age and so on. Preferences retain a subject’s choices such as favourite brand of shoes, preferred currency. Traits are features of the subject that are inherent, such as eye colour, nationality, place of birth. While attributes of a subject can change easily, traits change slowly, if at all.

Authority
A trusted entity that is able to verify and authenticate identities. Classically, this was a centralized (or later, federated) entity. Now, this can also be an open and transparent algorithm run in a decentralized manner.

Claim
A statement about an identity. This could be: a fact, such as a person’s age; an opinion, such as a rating of their trustworthiness; or something in between, such as an assessment of a skill.

Credential
In the identity community this term overlaps with claims. Here it is used instead for the dictionary definition: “entitlement to privileges, or the like, usually in written form”. In other words, credentials refer to the state-issued plastic and paper IDs that grant people access in the modern world. A credential generally incorporates one or more identifiers and numerous claims about a single entity, all authenticated with some sort of digital signature.

Identifier
A name or other label that uniquely identifies an identity. For simplicity’s sake, this term has been avoided in this article (except in this glossary), but it’s generally important to an understanding of digital identity.

Identity
A representation of an entity. It can include claims and identifiers. In this article, the focus is on digital identity.

Permission / Permissionless
A permissioned system is one in which identity for users is whitelisted (or blacklisted) through some type of KYB or KYC procedure; it is the common method of managing identity in traditional finance. In contrast, a permissionless system is one in which identity of participants is either pseudonymous or even anonymous. Bitcoin was originally designed with permissionless parameters although as of this writing many of the on-ramps and off-ramps for Bitcoin are increasingly permission-based.
http://www.ofnumbers.com/wp-content/uploads/2015/04/Permissioned-distributed-ledgers.pdf

Personal Data
The definition of personal data is evolving. Traditionally, that definition was pre-determined and governed through the use of a binary approach: In most jurisdictions, the use of personally identifiable information (PII) was subject to strict restrictions whereas the use of non-PII was often uncontrolled. However, what is considered personal data is increasingly contextual; it changes with personal preferences, new applications, contexts of use, and changes in cultural and social norms.

Traditionally, organizations have used a variety of techniques to de-identify data and create value for society while protecting an individual’s privacy. Such data was not subject to the same rules as PII, as an individual could not be identified from it. But technological advances and the ability to associate data across multiple sources is shifting boundaries of what is or is not PII, including potential re-identification of previously anonymized data.

This issue is the subject of significant debate with some arguing that this means that all data is effectively personally identifiable and should be treated as such. Others urge caution, arguing that this would curtail many of the beneficial uses of anonymous data with minimal gains in privacy. A shift in approach to thinking less about the data and more about the usage could offer a way forward. If the usage impacts an individual directly it would require different levels of governance than data which is used in an aggregated and anonymized manner.
http://www3.weforum.org/docs/WEF_IT_UnlockingValuePersonalData_CollectionUsage_Report_2013.pdf

Subprime Auto Loans: Another Reason For Using Distributed Ledgers for Asset Backed Securities

 

“Lending bubbles usually look the same. Credit criteria is loosened. Verification standards are relaxed. The people selling the loans make money when the loans are booked, but do not suffer from losses when the loans go bad. Consumers focus on the monthly payment, rather than the economics of the deal. And we convince ourselves that it will be different this time. Almost all of those elements are present in the current subprime auto lending market. Some players are clearly worse than others. And as delinquency and losses increase, which they inevitably will do, we will discover which companies have remained prudent, and which companies have been irresponsible.” Nick Clements, MagnifyMoney

There have been reports recently about how the subprime auto lending market is very similar to the subprime mortgage market before the crisis and could very well be the next bubble. While there are certainly similarities between the two, the purpose of this post is to talk about how distributed ledgers should be used for asset backed securities. In one of my recent posts, "Could The 2009 Subprime Mortgage Crisis Have Been Avoided With Blockchain?", I discussed how, if distributed ledgers had existed at the time, the crisis could at the very least have been mitigated. The reason I am bringing up another example of this is that history seems to be repeating itself. It seems not much has been learned from the last crisis, and asset backed securities continue to exist without any real transparency for investors. For consumers and regulators, this cannot go on, as the dangers have become too great.

This post will explore the problems in subprime auto lending today, how using a distributed ledger could be beneficial for all parties involved, and why it is imperative going forward that asset backed securities use this technology to dramatically improve transparency in this esoteric asset class.  The two important use cases here are: 1) provenance of data and 2) inter-company audit trails. From a banking perspective, increasing transparency is necessary to retain and attract new customers and investors as well as provide faith in counterparties that are involved in transactions. Regulators will also be satisfied by getting a better handle on risk and the ability to intervene, if needed, to minimize the risk.

What Does An ABS Auto Loan Look Like?

Auto loans are what is known as an amortizing security. This means that each regular payment to the holder(s) of the security includes principal repayment on the debt as well as interest. The regular payment the holder of the security receives is derived from the payments that the borrower who received the loan makes as part of an agreed payment schedule. This is very granular, and the borrower in this case is a person like you or me, who is taking out a loan and has a certain degree of creditworthiness. Auto loan ABS have cash flows from principal payments, monthly interest payments, and prepayments. Prepayment risk is very low for auto loans, and this makes investors very happy as they collect a nice interest rate on holding these instruments. Prepayment is low partly because of how fast a car depreciates, so prepaying is hardly worth it for the borrower. So as long as the borrowers are able to pay this loan back, investors make a tidy income stream.
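
To make the amortizing cash flows concrete, here is a minimal sketch using the standard fixed-payment annuity formula, payment = P * r / (1 - (1 + r)^-n), where P is the principal, r the monthly rate and n the number of monthly payments. It assumes level monthly payments and no prepayment or default; the function names and any loan figures you pass in are illustrative only.

```python
def monthly_payment(principal: float, annual_rate: float, months: int) -> float:
    """Level payment that fully repays the loan over `months` payments."""
    r = annual_rate / 12  # simple monthly rate, an assumption of this sketch
    if r == 0:
        return principal / months
    return principal * r / (1 - (1 + r) ** -months)


def amortization_schedule(principal: float, annual_rate: float, months: int) -> list:
    """Return (month, interest, principal_paid, remaining_balance) rows."""
    r = annual_rate / 12
    payment = monthly_payment(principal, annual_rate, months)
    balance = principal
    rows = []
    for month in range(1, months + 1):
        interest = balance * r                # interest accrues on what is still owed
        principal_paid = payment - interest   # the rest of the payment reduces the debt
        balance -= principal_paid
        rows.append((month, interest, principal_paid, max(balance, 0.0)))
    return rows
```

Early payments are mostly interest and later ones mostly principal, which is the income stream the ABS holder is ultimately buying a share of.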

The Subprime Auto Lending Market

As the chart below shows, the total volume of U.S. auto loans has reached an all time high (~1 trillion USD) and one fifth of those loans are subprime loans.   

 

Auto loans make up the second largest sub-sector of the asset-backed security (ABS) market. Auto loans are mainly packaged in the same way as mortgage backed securities. There is an underlying pool of assets (of all credit ratings) that is turned into a synthetic security from which a stream of income payments can be derived. These assets are generally illiquid and small on their own, so bundling them into a new security (ABS) provides the liquidity needed to sell them to investors (this is called securitization). In theory this is supposed to diversify the risk associated with investing in high-risk assets, especially in the subprime market. Each security represents just a fraction of the total underlying payments from a pool of auto loans (of all credit ratings).

Special Purpose Vehicles are created for the purpose of bundling these assets together into pools based around risk preferences, with the highest rated having the least risk. These bundles of assets based on risk preference are called tranches. The chart below shows how this is structured.

 

Banks are incentivized to move the credit risk off their books by securitizing these loans and selling them for cash as quickly as possible to someone else who is willing to assume such risk.
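
The tranche structure described above can be sketched as a simple loss waterfall: losses in the pool are absorbed by the most junior tranche first, and only reach the senior tranche once everything below it is wiped out. The tranche names and sizes in the usage note are hypothetical, and real deals add features (reserve accounts, excess spread, overcollateralization) that this sketch ignores.

```python
def allocate_losses(tranches: list, pool_loss: float) -> dict:
    """Allocate a pool loss across tranches.

    `tranches` is a list of (name, size) pairs ordered from most senior
    to most junior. Losses are applied in reverse order, junior first.
    Returns a dict mapping tranche name to the loss it absorbs.
    """
    remaining = pool_loss
    losses = {}
    for name, size in reversed(tranches):
        hit = min(size, remaining)  # a tranche cannot lose more than its size
        losses[name] = hit
        remaining -= hit
    return losses
```

For example, with a 70/20/10 senior/mezzanine/equity split, a loss of 15 wipes out the equity tranche and takes 5 from the mezzanine, while the senior tranche is untouched. This is why the highest rated tranche carries the least risk.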

The (Subprime) Auto Loan Market Now And Why It's Similar to 2009

As mentioned above, and as reported in a recent Wall Street Journal article, the auto loan market is at an all-time high (~$1 trillion), and 20% of that amount is subprime. The article drills deeper into the numbers and shows that the subprime market has grown 3x since 2009 ($9 billion to $27 billion). As if that isn't bad enough, these loans have become even more subprime:

"In 2011, 12% of securitized loans went to people with credit scores below 550. In 2015, 30% had scores below 550."

Fitch, the rating agency, has stated that US Subprime Auto Loan delinquencies have reached their highest level since 1996. This chart from MagnifyMoney shows the increase:

 

As mentioned above, recent reports have shown that delinquencies are reaching record highs and climbing. The MagnifyMoney survey indicates that many elements of the subprime mortgage crisis are evident in today’s auto lending market:

  • Dealers have potentially replaced the role of broker. Auto dealerships make money when they sell cars, and they make commissions (called “dealer discounts”) when they sell auto financing. Dealers, and in particular the people selling the finance products, have limited “skin in the game” if borrowers default. In many ways, dealers have the same financial incentives as the subprime mortgage brokers.

  • Auto finance companies are competing to get the dealer’s business. As a result, they are often compelled to reduce the credit criteria and relax verification. The dealer networks, which control the customer and the volume, have a lot of the same power as banks and finance companies that are hungry for volume. If a bank asks too many questions, the dealer can easily move to the next easiest lender.

  • Down payment requirements have been declining. And, in the MagnifyMoney survey, we see that income verification requirements have also been reducing significantly.

  • To help reduce monthly payments and increase loan amounts, banks have been offering longer terms. It is now relatively easy to take out a used auto loan with a 7-year term.

Sound familiar? Well it certainly should.

The Wall Street Journal has also reported on this in detail. It mentioned a subprime lender by the name of Skopos, which pooled together a group of subprime loans and sold it for $154 million. This was backed by a pool of 10,000 subprime loans with terms of 5.6 years and an interest rate of (wait for it....) 20%. 87% of the borrowers had credit scores below 600, and it gets better: 1/3 of all loans were to people with credit scores below 530 or no credit score at all. Here is the front page of the Skopos website. Sounds great, right?

 

There was even a rating agency, the Kroll Bond Rating Agency, which gave this particular pool an investment grade rating. According to the Wall Street Journal, "(Kroll) said the securities are structured so that half the loans in the pool could default and all the bonds still would be repaid, while the lending company absorbs the losses. The high interest rates on the loans are expected to cover any losses from people defaulting." Through February, approximately 12% of these loans (as well as those of another company called Exeter) were 30 days past due and 1/3 were more than 60 days past due.

Transparency, Provenance and What A Distributed Ledger Would Look Like

In a previous blog post on subprime mortgages I explained why using a distributed ledger would have helped deter the crisis. I also described data provenance and why I thought it made sense for asset backed securities. In this post I am going to go a step further and describe who I think needs to be a part of the supply chain in order to make a distributed ledger a robust solution for the problems of lack of transparency and tracking audit trails. I will also go through the process of what a typical asset backed transaction looks like and why being able to track and monitor it is necessary for establishing proper audit trails of the supply chain and increasing transparency. Transparency is a very good thing.

In order for provenance to work really well, getting the entire ecosystem to agree to protocols and standards (Identity and who can read/write/view) is very important.  Also getting all the players on the ledger is necessary so let's map out who needs to be the nodes (in no particular order):

  • Auto finance companies: giving loans to people who may or may not be creditworthy.
  • Banks
  • Underwriters: selected by the seller to purchase the ABS and resell them to investors. They are supposed to perform due diligence on the assets and the structure of the transaction and confirm that the prospectus and other documentation are accurate. Once they sell, they don't have any more interaction with, or interest in, the transaction.
  • Lawyers: part of the above process.
  • Rating Agencies
  • Private Investors and other Transaction Participants: those who buy the ABS from the sellers and can re-sell them.
  • Servicers: perform duties that are stipulated in the servicing agreement for the transaction investors. Servicers take a fee for performing these obligations. They typically collect all income from the assets, enforce the assets as needed and may perform any evaluations needed to substitute assets. The servicer reports information on collections and how the assets are performing to the security holders. This helps determine the payment streams or losses to investors. Servicers evaluate and approve loan modifications, short sales and other default strategies to mitigate losses. They also take control in the case of foreclosures to undertake actions in the trustee's name. (The seller and the servicer have more knowledge of the transaction than any other participants.)
  • Trustees: apply funds delivered and instructed by the servicer and provided in the transaction documents to pay interest and principal on securities, to fund reserve accounts and purchases of additional assets. Trustees act as asset custodians, analytics providers and paying agents.
  • Seller: the party that sells the ABS.
  • Independent Accountants
  • Regulators

A Typical Asset Backed Security Transaction

  1. A seller transfers assets in one of two ways: a trustee receives pass-through certificates that show ownership interests in the assets, or a Special Purpose Vehicle (SPV) is created.
  2. The seller of the assets can sell them through an SPV directly or through underwriters to investors.
  3. A pooling and servicing agreement or something similar forms the basic document which sets forth the relationships between the parties and the assets (a very long, hard to understand document).
  4. The seller, along with underwriters and/or private investors, determines the structure, drafts the documents and prices the transaction.
  5. The seller selects other participants for the deal including the underwriter, servicer and trustee. (Note: at the beginning of the transaction the seller knows more than any other participant about the assets and the structure of the transaction.) Major transparency issues here.
  6. Based on agreement type, the seller generally has continuing obligations for the pools of assets, such as adding or replacing assets if they drop below certain value thresholds. At other times, they have no obligation and the next party to own them can be the one to perform these functions. (This is where chain of custody (provenance) can become very confusing as assets move in and out of the pool and the tranches in the pool. Not to mention that in some of the tranches it is only pieces of an asset that are bundled together, while other pieces of that same asset may be pooled elsewhere.)
  7. Over time the securities will change hands, as will the contents of the pool of assets. So will the credit ratings of these assets.

Following the financial crisis,  Regulation AB was strengthened and enhanced to protect investors. This rule sets out to make disclosure requirements, reporting and registration more transparent. While it has been a good step forward, the rule still falls short.  Using a distributed ledger for asset backed securities will increase transparency for all parties in the ecosystem.

Distributed Ledgers for Asset Backed Securities

Some of the problems with ABS do not need a distributed ledger to solve them. One example is the long, tedious, hard to read documentation around the transaction. That can be handled using artificial intelligence or programmatic algorithms that make sure key terms and conditions are in the documentation. However, if this data gets stored on the ledger, it could make for a robust system, especially when certain thresholds are breached, things fall out of line, or assets are added/removed. If anything changes in the initial documentation or in the original pool of assets (as it changes hands), it will be known by everyone on the ledger at the same time, as a message is sent to all the nodes. Think "smart contracts". Another thing which can help is a commitment by all parties to digitization of all records, titles and deeds.
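
As a rough illustration of the "message is sent to all the nodes" idea, the toy model below notifies every participant whenever an asset enters or leaves the pool, and again if the pool's total value drops below an agreed minimum. It is plain in-memory Python, not a smart contract on any real ledger platform; the `Node` and `AssetPool` classes, the event tuples and the threshold rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A participant on the ledger (investor, regulator, trustee, ...)."""
    name: str
    inbox: list = field(default_factory=list)


@dataclass
class AssetPool:
    nodes: list
    min_value: float  # agreed minimum total value of the pool
    assets: dict = field(default_factory=dict)  # asset_id -> value

    def total_value(self) -> float:
        return sum(self.assets.values())

    def _broadcast(self, event: tuple) -> None:
        # Every node receives the same event at the same time.
        for node in self.nodes:
            node.inbox.append(event)

    def _check_threshold(self) -> None:
        if self.total_value() < self.min_value:
            self._broadcast(("threshold_breached", self.total_value(), self.min_value))

    def add_asset(self, asset_id: str, value: float) -> None:
        self.assets[asset_id] = value
        self._broadcast(("added", asset_id, value))
        self._check_threshold()

    def remove_asset(self, asset_id: str) -> None:
        value = self.assets.pop(asset_id)
        self._broadcast(("removed", asset_id, value))
        self._check_threshold()
```

Because every node's inbox receives identical events in identical order, no single party (seller, servicer or trustee) holds a private view of what is in the pool.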

What a distributed ledger could be used for are the things that continue to be a problem for all those involved in a transaction including:

  • asymmetric information: the sellers, servicers & trustees hold more information about the transaction than any of the other participants in the transaction.
  • trust- based on the above statement, a distributed ledger allows for parties who don't trust each other to make transactions (this includes the rating agencies who were one of the main culprits of the last crisis) and this allows for creating;
  • real-time audit trails- knowing and tracking exactly what is added to the pools and removed as well as a chain of custody of the ABS changing hands in addition to changing what is in the tranches and pools as it goes from investor to investor. This eventually will lead to
  • disintermediation- of many of the duties of the middlemen involved in the transaction. Particularly servicers, trustees and the rating agencies.  A lot of fees change hands for their services. Allowing for much of this work to be put into smart contracts with digital signatures and setting up escrow accounts disintermediates a lot of the obligations of these trusted 3rd parties.

After some time a large amount of real-time data will be generated.  This will lead to new ways of modeling credit and risk. This creates views of real time actions and transactions of all the actors involved. In fact, real time modeling can lead to extrapolating out into the future many risks with far more accuracy. Right now most of this data is backward looking, particularly when it comes to credit ratings, defaults and delinquencies. This also applies to data around the borrowers (the poor fool taking out these loans) and their rates of delinquency and default.

Risk changes very fast and real time views of that risk are needed to make proper adjustments particularly when thresholds of defaults and delinquencies fall outside of what was modeled. Finding out weeks and months later only exacerbates a bad situation and allows contagion to spread by keeping credit standards very loose. As a result of using a distributed ledger, ratings agencies can be displaced since a real-time virtual credit bureau can be established between all parties. 
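
A minimal sketch of what a real-time check against the modeled assumptions might look like, assuming each loan record on the ledger carries an outstanding balance and a days-past-due count. The field names, the 30-day cutoff used as the delinquency definition and the function names are assumptions made for illustration.

```python
def delinquency_rate(loans: list, min_days_past_due: int = 30) -> float:
    """Share of outstanding balance that is at least `min_days_past_due` late."""
    total = sum(loan["balance"] for loan in loans)
    if total == 0:
        return 0.0
    late = sum(
        loan["balance"] for loan in loans
        if loan["days_past_due"] >= min_days_past_due
    )
    return late / total


def check_against_model(loans: list, modeled_rate: float) -> dict:
    """Compare the observed delinquency rate with the rate the deal was modeled on."""
    observed = delinquency_rate(loans)
    return {
        "observed": observed,
        "modeled": modeled_rate,
        "breach": observed > modeled_rate,
    }
```

Run against live ledger data rather than a monthly servicer report, a check like this would flag a breach as soon as it happens instead of weeks or months later.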

The Role of the Trustee Can Be Replaced

In a report entitled "The Trustee's Role In Asset Backed Securities", the role of the trustee is describe as follows:

"In many asset-backed securities transactions, the document may not contemplate any direct check on the performance of the seller or the servicer. Transaction documents virtually never give the trustee any substantive oversight of the seller or the servicer and their activities other than to confirm the timely receipt by the trustee of certain remittances and reports from the servicer, including reports of independent accountants, and certifications in the forms required by the transaction documents. Additional oversight is not explicitly required of trustees and would necessarily be limited to matters that are readily ascertainable and verifiable on a cost and time sensitive basis. Importantly, the trustee typically has no duty under the transaction documents to make investigations on its own for the purpose of detecting defaults, fraud or other breaches. If the servicer becomes insolvent or unable to perform, the trustee may be responsible under the transaction documents for the appointment of a successor servicer. Some transaction documents contain specific provisions relating to ―back-up‖ servicing, i.e., appointing a specified successor servicer from the outset to take over servicing when succession becomes necessary. These provisions range from ―cold‖ back-up servicing, where the trustee agrees in the transaction documents to become the successor servicer (a servicer of last resort) unless another entity (appointed by the trustee when needed) accepts such appointment, to ―hot‖ back-up servicing, where a successor servicer named in the transaction documents agrees to maintain back-up files throughout the transaction. In transactions where the trustee accepts the role of back-up servicer, the trustee typically relies upon arrangements with servicing units within its own institution or with third party providers to pre-arrange succession. 

The trustee is not, however, expected to determine that a security interest of such quality has been established or that a 'true sale' transfer has occurred. Instead, the trustee is authorized by the transaction documents to rely upon legal opinions or other evidence to establish at closing that what it has been granted conforms to the documents. The trustee is also authorized to rely on future opinions and instructions from others (usually the servicer or the seller) to establish that the assets are being maintained so as to preserve the trustee’s interests in the assets in accordance with the transaction documents.

The trustee may be obligated under the transaction documents to determine from time to time that the aggregate value of the assets bears a prescribed ratio to the amount of the debt outstanding or to perform other analytics on the assets. The documents set forth precise valuation procedures and indices and other methods of valuation and authorize the trustee to rely upon specific sources of information. The trustee will instruct the seller or other responsible party to add assets if needed or may permit assets to be withdrawn if the amount held exceeds requirements. In each case the trustee has authority to rely on specified information as to the value and ownership of assets being added or withdrawn. The trustee may also require the substitution of new conforming assets for existing assets that have ceased to conform to the asset requirements for the transaction upon receiving notice of such failure to conform. Usually the seller or servicer will effect the substitution. The trustee will be authorized to rely upon their representations that the substitute assets meet transaction requirements. In certain transactions there is a constant or periodic flow of assets through the trustee’s custody or security interest."

Does this sound necessary to you if a distributed ledger is being used along with smart contracts between the counterparties of transactions? Breaking it down by paragraph:

Paragraph 1: No additional oversight would be needed, as each node will have the same information and data and can make decisions based on it without relying on other parties to provide it. Additionally, this information will be available in real time and synchronized, so everyone has a "golden record of truth". Direct checks are made implicitly by being on the ledger, and malicious actors can be prosecuted for fraud and for breaching the rules and regulations that have been set up. This will also change how all parties act, since their identities will be known along with their actions; it will cause malicious actors to think twice before committing such acts.  Permissions management can be set up to appoint the entities responsible for replacing certain parties on the ledger.  There will also be no need for backup files, as they will live in the cloud-based ledger and can be stored in each party's node if necessary.

Paragraph 2: No need for trustee to tell anyone if a true sale has happened. Once it occurs everyone will have the ability to view the records for themselves and determine if this is true.

"Instead, the trustee is authorized by the transaction documents to rely upon legal opinions or other evidence to establish at closing that what it has been granted conforms to the documents."  Not if smart contracts are put into place.  The same goes for relying on future instructions that the assets are being maintained in the transaction documents. This can be done in real time, and changed, as risk changes.

Paragraph 3:  "The trustee may be obligated under the transaction documents to determine from time to time that the aggregate value of the assets bears a prescribed ratio to the amount of the debt outstanding or to perform other analytics on the assets. The documents set forth precise valuation procedures and indices and other methods of valuation and authorize the trustee to rely upon specific sources of information."  Nope. This information will not be valuable on a ledger as it is backward looking. This can all be modeled in real time as these changes happen.  For the regulators this will be invaluable information from a credit risk, systemic risk and economic risk standpoint. 
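As a rough illustration of how such a ratio test could be evaluated continuously rather than "from time to time", here is a minimal Python sketch. The `Asset` class, the function names and the 1.25 ratio are my own assumptions for illustration; they do not come from the report or from any particular ledger platform.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    asset_id: str
    value: float  # current valuation, in the same currency as the debt


def coverage_ratio(assets, debt_outstanding):
    """Aggregate asset value divided by the debt outstanding."""
    if debt_outstanding <= 0:
        raise ValueError("debt_outstanding must be positive")
    return sum(a.value for a in assets) / debt_outstanding


def check_coverage(assets, debt_outstanding, required_ratio):
    """Return (ok, shortfall), where shortfall is the asset value still to be added."""
    total = sum(a.value for a in assets)
    required_value = required_ratio * debt_outstanding
    shortfall = max(0.0, required_value - total)
    return shortfall == 0.0, shortfall


# Hypothetical pool: every node holding the same data could run this check itself.
pool = [Asset("loan-1", 400_000.0), Asset("loan-2", 350_000.0)]
ok, shortfall = check_coverage(pool, debt_outstanding=640_000.0, required_ratio=1.25)
print(ok, shortfall)
```

If every party on the ledger holds the same asset data, each of them could rerun a check like this whenever a valuation changes instead of waiting for a trustee's periodic report.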

"The trustee may also require the substitution of new conforming assets for existing assets that have ceased to conform to the asset requirements for the transaction upon receiving notice of such failure to conform. Usually the seller or servicer will effect the substitution. The trustee will be authorized to rely upon their representations that the substitute assets meet transaction requirements. In certain transactions there is a constant or periodic flow of assets through the trustee’s custody or security interest." Absolutely no transparency here and this is where a huge amount of risk comes in as the system exists today.  This can not be allowed to continue as this imposes huge risks to the entire asset backed securities market.  

In order for trust and transparency to be restored in the financial system, use cases like this need to be created and implemented to avoid fraud and financial disaster.  For all parties involved in transactions where assets change hands and can be altered without the purview of all interested parties, distributed ledgers are an answer.

How Transactions Are Validated On A Distributed Ledger

(A special thanks to Simon Taylor who was instrumental in much of my thinking through conversations we had around this topic. His input was invaluable)

This post will explore how transactions will be validated using distributed ledger technology. It will also provide some use cases around who those transaction validators may be and what that would look like.  First off, transaction validation will be different from the bitcoin blockchain because Proof of Work will not be used.  The features of a  private blockchain network are:

  1. Peer to peer: transfer assets directly between parties who control the assets.
  2. No bitcoin currency:  networks are built for specific markets and can issue & transfer any asset.
  3. No mining: transactions are ordered by trusted parties that form a "federation" or the nodes on a distributed ledger.
  4. Fast: confirmation in seconds.
  5. Scalable: 1000s of transactions per second.

Financial Institutions Don't Want Public Blockchains

This is not a debate anymore. Tim Swanson, Director of Market Research from R3CEV, wrote a paper titled: "Watermarked tokens and pseudonymity on public blockchains"  citing the reasons why this is so.  Tim puts it succinctly here:

"There are at least three identifiable reasons that financial organizations looking to use some kind of public blockchain should be wary of a watermarked approach:

1) the built-in security system inherited from Bitcoin and other proof-of-work-based blockchains is not exportable in a regulated financial settlement setting (through a distortion of incentives); 

2) the lack of legal settlement finality; and

3) the regulatory risks that a watermarked approach introduces

This paper prefers to use the term watermarked token to encompass two types of systems: 1) Colored coins and 2) Embedded consensus systems which use their own proprietary metacoin"

Note: A metacoin is a coin that is launched on top of another blockchain, as a "meta" layer.

I highly recommend reading this paper for deep insights into this issue.  The world of public blockchains was invented so parties who don't know each other and don't trust each other could transact.  The financial world operates in a very different manner.  The parties must know and trust each other and be identified. When parties can trust each other there is no need for the inefficiencies of public blockchains, namely mining as the way of solving the "double spend" problem.  (A double spend occurs when two transactions attempt to spend the same output; only one of those transactions will be accepted.) Without mining, one can still validate the transactions, hash them, and form blocks that are added to the chain.  A private blockchain for the most part behaves in the same manner as a public blockchain.
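To make the double spend rule concrete, here is a minimal Python sketch of a validator that tracks unspent outputs. The class and method names are hypothetical and the model is deliberately simplified (one input and one output per transaction, no signatures); it only shows that the second attempt to spend the same output is rejected.

```python
class DoubleSpendError(Exception):
    """Raised when a transaction tries to spend an output that is not unspent."""


class OutputLedger:
    def __init__(self):
        # output_id -> (owner, amount) for every output that has not been spent yet
        self.unspent = {}

    def add_output(self, output_id, owner, amount):
        self.unspent[output_id] = (owner, amount)

    def spend(self, output_id, spender, new_output_id, recipient):
        """Consume one unspent output and create a new one for the recipient."""
        if output_id not in self.unspent:
            raise DoubleSpendError(f"{output_id} is unknown or already spent")
        owner, amount = self.unspent[output_id]
        if owner != spender:
            raise PermissionError(f"{spender} does not control {output_id}")
        del self.unspent[output_id]  # the output can never be spent again
        self.unspent[new_output_id] = (recipient, amount)


ledger = OutputLedger()
ledger.add_output("out-1", "Bank A", 100)
ledger.spend("out-1", "Bank A", "out-2", "Bank B")      # accepted
try:
    ledger.spend("out-1", "Bank A", "out-3", "Bank C")  # same output again
except DoubleSpendError as err:
    print("rejected:", err)
```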

One of the main differences comes from the transaction validators, who need to be onboarded and accredited/trusted to join the ledger. Their identity is known to everyone. This actually adds an extra layer of security because if a node performs a malicious act, it can be prosecuted and ejected from the network.  As opposed to a public blockchain network, the transaction validators in a private blockchain are not incentivized in the form of tokens (money) but by the benefit of being a part of the ledger and being able to read data they consider valuable.  This post will explore this issue further, as perhaps there is a role for disinterested/neutral parties to be involved as the transaction validators so there is no conflict of interest.  For their services, perhaps some form of payment will be necessary. That payment would not be in the form of a cryptocurrency token, since with a private blockchain the assets that are being exchanged don't live on the chain (as bitcoin does). It is more of a promise of exchange.

Getting rid of mining allows for significant performance enhancements for distributed ledgers as they still have unique properties over a replicated database pattern:

  1. Any node is able to write to the chain at anytime without a centralized node coordinating "write" operations.
  2. The network could be a coalition of business entities with no one entity owning the network. This creates greater incentives to want to share and use the infrastructure.
  3. A synchronization technology that allows some nodes on the ledger to have non-identical copies of the database. (reason for this explained below)

Transaction Validators Provide A Service

Transaction validators provide a service for the entire ledger.  They determine if transactions meet protocol requirements for the ledger and make a determination that it is "valid".   In distributed ledgers, the transaction validators group these transactions into ordered units (blocks) by agreeing on the validity of the transaction and ordering them specifically so as to prevent a "double spend." 

I spoke to Simon Taylor, VP of Entrepreneurial Partnerships at Barclays, about this issue and he gave some good first principles to think about when trying to understand why and how transaction validators should be selected and what purpose they are serving:

"Think about why you need validation. What are you trying to prove with validation? Am I proving that records match? That business logic executed? Who needs to see this happen? Everyone in the network? Some people in the network? What's my threat model for why I might want "consensus" or validation? 

So the first goal is to step back and say: Who needs to see the data? To do what in a financial transaction? This might be the counterparties. A Central Counterparty (CCP). A smart oracle. A regulator and perhaps two law firms and a custodian. Why then would I want other network participants to validate the transactions?"

These transaction validators play a critical role in the success of the blockchain as they have the ability to "write" to the ledger and send out confirmations of the transactions. They provide a unique record of truth from which all the parties act.

Security Concerns Without Proof of Work

Since malicious actors on a distributed ledger are known and can be prosecuted, the main security concern is the theft of private keys.  There are a few ways an actor can store them:

  1. The actor creating the transaction can store their keys in a secure offline place. This is known as cold storage. This is not very practical though.
  2. The actor can store their private key on the local hard drive of their PC. This is a problem because it could get hacked.
  3. The actor could let a third party provider manage their private keys in a wallet. This is probably the most convenient for non-technical people in financial institutions and corporations who have limited knowledge about blockchains.

This is no different than public blockchains except there is one major security upgrade: trust. Knowing the counterparties and transaction validators keeps incentives aligned for forming a distributed ledger to begin with.

Examples Of How Private Blockchains Are Validating Transactions

Different companies are using different methods for validating transactions. Most of this information is not public knowledge yet. However, a few companies have shared how they are doing it and I will list a few examples of such.  

Antony Lewis, in his fantastic blog, describes how Multichain (a private blockchain company headed by Gideon Greenspan)  validates via a round-robin process:

"Bitcoin’s computation-intensive Proof-of-Work solves for a Sybil attack in an anonymous network i.e. a small group of entities pretending to be a large group of entities who agree on something in order to spoof the system. With a permissioned blockchain where block creators are known and have to sign blocks that they create, you don’t have this problem so you don’t need a ‘difficult’ or slow mining puzzle.

MultiChain uses a randomised round-robin system for block-adders and a concept of mining diversity which is a configurable strictness on how long a block-adder has to sit out for after he has added a block, before the other nodes will accept another block from him.

·       At one extreme of the scale (strictness of zero), any block-adder can add any block meaning it’s very tolerant but also increases the risk that a single block-adder or small group of block-adders can spoof the system.

·       At the other extreme of the scale, (strictness of 1) once you have added a block, you have to let every other block-adder add a block before you can add again. This stops single or groups of block-adders from creating forks, but if a node goes offline then at some point no further blocks will be able to be added while the network waits for the offline node to add the next block.

·       Strictness lets you adjust the balance between security and technical malfunction risk."

The block adders are transaction validators in this model and the asset owners are the parties which are performing the transaction.
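The strictness idea in the quote above can be sketched in a few lines of Python. This is my own simplified reading of the description, not MultiChain's actual code: the function name, the way strictness is turned into a sit-out window, and the rounding are all assumptions.

```python
import math


def may_add_block(candidate, recent_adders, num_adders, strictness):
    """Decide whether `candidate` may add the next block.

    recent_adders: ids of the block-adders of previous blocks, most recent last.
    num_adders:    number of permitted block-adders on the network.
    strictness:    0.0 (anyone may add any block) to 1.0 (strict rotation).
    """
    if not 0.0 <= strictness <= 1.0:
        raise ValueError("strictness must be between 0 and 1")
    # Number of most recent blocks the candidate must be absent from (assumed rule).
    sit_out = max(0, math.ceil(num_adders * strictness) - 1)
    if sit_out == 0:
        return True
    return candidate not in recent_adders[-sit_out:]


history = ["A", "B", "C"]
print(may_add_block("A", history, num_adders=4, strictness=1.0))  # A must wait for D
print(may_add_block("D", history, num_adders=4, strictness=1.0))
```

With a strictness of 1 and four block-adders, a node has to sit out three blocks, which is also why one offline node can stall the rotation, as the quote points out.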

Tim Swanson, in his latest blog post, talks about Hyperledger, which was acquired by DAH, and how it validated transactions:

"The simplest way to describe Hyperledger, the technology platform from Hyper, during its formative year in 2014 was: Ripple without the XRP. Consensus was achieved via PBFT. There were no blocks, transactions were individually validated one by one.

Hyperledger, the technology platform from Hyper, was one of the first platforms that was pitched as, what is now termed a permissioned distributed ledger: validators could be white listed and black listed. It was designed to be first and foremost a scalable ledger and looked to integrate projects like Codius, as a means of enabling contract execution."

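Note: PBFT stands for Practical Byzantine Fault Tolerance, a consensus algorithm designed to keep working when some of the nodes are faulty or malicious.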

Ripple achieves consensus via the nodes on the network sharing information about transactions.  Once a supermajority of the nodes agree, consensus is achieved. This can be an iterative process before the transaction becomes validated.
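A supermajority check of this kind is simple to express. The Python sketch below is only an illustration of the idea: the function name is hypothetical and the 80% threshold is an assumption I picked for the example, not a statement of how Ripple is configured.

```python
def consensus_reached(votes, threshold_pct=80):
    """Return True if at least threshold_pct percent of nodes voted 'valid'.

    votes: dict mapping node id -> True (valid) or False (invalid).
    """
    total = len(votes)
    if total == 0:
        return False
    yes = sum(1 for vote in votes.values() if vote)
    # Integer arithmetic avoids floating point edge cases at the threshold.
    return yes * 100 >= threshold_pct * total


round_1 = {"n1": True, "n2": True, "n3": False, "n4": False, "n5": True}
round_2 = {"n1": True, "n2": True, "n3": True, "n4": False, "n5": True}
print(consensus_reached(round_1), consensus_reached(round_2))
```

In an iterative process the nodes would exchange their positions between rounds and vote again until the threshold is met or the transaction is dropped.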

Other private blockchain companies are doing exactly what public blockchains are doing, just without proof of work (as mentioned above). They are grouping transactions into blocks, creating one-way cryptographic hashes and using multi-party consensus algorithms, to name a few things.
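Here is a minimal Python sketch of what "blocks linked by one-way hashes, without proof of work" can look like. The block layout and function names are assumptions made for illustration; a real platform would also sign blocks and run a consensus protocol before appending them.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64


def block_hash(index, prev_hash, transactions):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transactions):
    """Append a block that commits to the previous block's hash. No mining puzzle."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    }
    chain.append(block)
    return block


def verify_chain(chain):
    """Recompute every hash and check each block points at its predecessor."""
    prev_hash = GENESIS_PREV_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["transactions"]):
            return False
        prev_hash = block["hash"]
    return True


chain = []
append_block(chain, [{"from": "Bank A", "to": "Bank B", "asset": "bond-1"}])
append_block(chain, [{"from": "Bank B", "to": "Bank C", "asset": "bond-1"}])
print(verify_chain(chain))
```

Changing any transaction in an earlier block changes that block's hash, so `verify_chain` fails for every copy that was tampered with.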

The methods companies are employing to validate transactions are something I want to learn more about going forward, as this is a critical piece of the success of the distributed ledger/private blockchain space.  A lot of companies are probably still trying to sort through the best way to do this.  The health of the whole ledger depends on the ability of the parts to adapt and withstand stress. In this case the stress points would be failures in how transactions get validated and in who those validators are.

Another major assumption which is now being challenged is that of a replicated shared database in which all of the data is synchronized so that all of the nodes on the ledger have identical copies.  This is where the third property listed above comes into play:  a synchronization technology that allows some nodes on the ledger to have non-identical copies of the database.  This opens the door to an entirely new option for how transactions will be validated on a ledger.

Let's suppose we have a distributed ledger which has 20 banks as nodes. Other nodes would include regulators, lawyers, a custodian, a smart oracle (smart contracts) and perhaps disinterested parties to be transaction validators. How is a decision made as to who the transaction validators are within that ecosystem?  Are all of the banks going to be alright with only a few of them being the transaction validators? This could present a massive conflict of interest based on what data those validators are privy to. Not to mention the fact that most financial institutions enjoy anonymity in how they transact and trade in today's world and consider it a distinct advantage. Are they going to have to give this up in order to be a part of the ledger, or will attempts be made to keep this confidential?  How can the transactions be distributed in such a way as to ensure anonymity?

Some radical thinking has been done to avoid such issues. Only the nodes involved in a trade will validate the transaction amongst themselves.  Hence, every node on the ledger will have write functionality.  This works by not sharing the data and the transactions with any nodes that aren't directly involved in the trade. What is shared is the business logic (the instructions around the structure of the transaction) and the workflows (what goes where) via smart contracts. In other words, the business logic is known to all parties but the transaction itself remains anonymous except between the counterparties. This allows everyone on the ledger to see that something happened the way it was supposed to happen without sacrificing confidentiality.  The data stays in each bank's node, while the ledger itself contains an itemized list of where everything is held.
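One way to picture this split is sketched below in Python. It is purely my own illustration: the class names, the `template_id` field and the use of a SHA-256 fingerprint on the shared ledger are assumptions, not a description of any vendor's design. The trade details stay in the counterparties' own nodes, while the shared ledger records only which contract template ran, where the data is held, and a fingerprint of it.

```python
import hashlib
import json


def fingerprint(record):
    """One-way hash of a trade record; reveals nothing about amounts or assets."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode("utf-8")).hexdigest()


class BankNode:
    def __init__(self, name):
        self.name = name
        self.private_store = {}  # trade_id -> full trade details, never shared

    def store(self, trade_id, record):
        self.private_store[trade_id] = record


class SharedLedger:
    def __init__(self):
        self.entries = []  # visible to every node on the ledger

    def record(self, trade_id, template_id, held_by, digest):
        self.entries.append(
            {"trade_id": trade_id, "template_id": template_id,
             "held_by": held_by, "digest": digest}
        )


def execute_trade(ledger, counterparties, trade_id, template_id, details):
    """Store details privately with each counterparty; publish only the index entry."""
    digest = fingerprint(details)
    for node in counterparties:
        node.store(trade_id, details)
    ledger.record(trade_id, template_id, [n.name for n in counterparties], digest)
    return digest


bank_a, bank_b = BankNode("Bank A"), BankNode("Bank B")
ledger = SharedLedger()
details = {"asset": "bond-123", "amount": 1_000_000}
execute_trade(ledger, [bank_a, bank_b], "T1", "bond-sale-v1", details)
print(ledger.entries[0])
```

A regulator node given access to one bank's copy could recompute the fingerprint and confirm it matches the shared entry, without the other banks ever seeing the amounts.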

What instantly comes to mind is: if the two parties are transacting amongst themselves, what stops them from colluding on trades? Certainly a transaction validator would need to be someone other than the two parties to make sure bad acting doesn't take place.  Regulators or other market infrastructure players could be nodes that oversee the transactions and trades.  What is really compelling about this idea, aside from ensuring anonymity of amounts and types of transactions, is banks having their own data stored in their nodes.  This allows each counterparty, rather than the ledger as a whole, to control its own data; the alternative would not be acceptable to anyone involved. If the whole network were compromised, that would be perilous to that data.  Storing the data in this way affords extra protections for the individual financial institutions.

Transaction Validation in Clearing & Settlement

Perhaps this opens the door for the incumbents to be a part of the ledger.  Parties like DTCC and the exchanges, who are more disinterested observers than the other participants and are experts in this, could be nodes on the network. The role of both changes in this new world. No longer do they host the data, as that will remain in the banks' nodes, but they could be involved in setting contract templates (smart contracts) and managing upgrade cycles.  They could also keep a linked record of every transaction that has occurred between participants (each and every financial institution) on the ledger.  This would be extremely valuable for establishing a chain of custody of each and every asset as it changes hands (provenance).

There is also a need for a private key administrator.  This role will need to be filled. Maybe this falls to some of the rising stars in the cybersecurity world. Regardless of whether this is a real possibility or not, it will be interesting to see how these incumbents position themselves going forward.  It will also be interesting to see how other types of companies begin to think about transaction validation for their own use cases.  In coming blog posts, this topic will be explored further.

Could The 2009 Subprime Mortgage Crisis Have Been Avoided With Blockchain?

This post will explore the 2009 subprime mortgage crisis and the hypothetical impact a blockchain may have had with regards to the proliferation of toxic synthetic Mortgage Backed Securities (MBS) and Collateralized Debt Obligations (CDOs).  Hindsight is always 20/20 so it is not the intent of this post to say blockchain would have been the cure and stopped it from happening. Instead, it may have lessened the impact greatly by flattening out boom/bust cycles. When thinking of mortgage backed securities and asset backed securities this will be written from the lens of supply chains and provenance.   This post will describe the assets involved, what provenance is, the subprime mortgage crisis and why distributed ledgers would have been instrumental in lessening the impact of the crisis.  The global economy stood on the precipice of a global depression as the credit markets froze.

What Is Provenance?

Provenance refers to the tracking of supply chains. For purposes of integrity, transparency and preventing counterfeiting, it is important that an asset and its path be known every step of the way.  A supply chain refers to a network in which an asset is moved, touching different actors before arriving at a destination.  Tracking the asset has great value to the actors involved for the reasons mentioned above. This applies to financial synthetics such as asset backed securities (ABS) as much as to anything else in the world considered to be an asset, including Norwegian salmon, diamonds, prescription drugs, Letters of Credit (LOCs) and Bills of Lading (BOLs). In fact, due to the liquidity created out of these financial synthetics and the leverage built into them as they move through the financial supply chain, it becomes necessary to ensure two things: 1) note how individual assets combine to form a newly made synthetic asset and 2) track possession of the asset as it moves through the supply chain. There are two kinds of medicine: preventative (measures taken to prevent the onset of a disease) and curative (treating a disease once you already have it). In this case, a blockchain could be a preventative type of medicine.
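The two things to ensure, recording how assets combine and tracking possession, can be sketched as a small data structure. The Python below is a hypothetical illustration (the class and method names are mine) and leaves out everything a real ledger would add, such as signatures and replication across nodes.

```python
class ProvenanceLedger:
    def __init__(self):
        self.components = {}  # asset_id -> ids of the assets it was built from
        self.custody = {}     # asset_id -> list of holders, oldest first

    def register(self, asset_id, holder, components=()):
        """Record a new asset and, for a synthetic, what it was assembled from."""
        if asset_id in self.custody:
            raise ValueError(f"{asset_id} is already registered")
        for part in components:
            if part not in self.custody:
                raise KeyError(f"unknown component {part}")
        self.components[asset_id] = list(components)
        self.custody[asset_id] = [holder]

    def transfer(self, asset_id, sender, receiver):
        """Only the current holder can pass the asset on."""
        if self.custody[asset_id][-1] != sender:
            raise PermissionError(f"{sender} does not hold {asset_id}")
        self.custody[asset_id].append(receiver)

    def underlying(self, asset_id):
        """Walk back through every layer of packaging to the original assets."""
        parts = self.components[asset_id]
        if not parts:
            return {asset_id}
        result = set()
        for part in parts:
            result |= self.underlying(part)
        return result


led = ProvenanceLedger()
led.register("mortgage-1", "Bank A")
led.register("mortgage-2", "Bank B")
led.register("mbs-1", "Issuer", components=["mortgage-1", "mortgage-2"])
led.register("cdo-1", "CDO Manager", components=["mbs-1"])
led.transfer("cdo-1", "CDO Manager", "Pension Fund")
print(sorted(led.underlying("cdo-1")), led.custody["cdo-1"])
```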

In order for data provenance to be effective from a technology standpoint it must fulfill certain requirements & characteristics that would upgrade supply chain management and monitoring.  According to Sabine Bauer in his paper titled "Data Provenance in the Internet of Things" they are: 

  1. Completeness: Every single action which has ever been performed is gathered.
  2. Integrity: Data has not been manipulated or modified by an adversary.
  3. Availability: The possibility to verify the collected information. In this context, availability is comparable to auditing.
  4. Confidentiality: The access to the information is only reserved for authorized individuals.
  5. Efficiency: Provenance mechanisms to collect relevant data should have a reasonable expenditure.

In addition to these requirements, there are some additional characteristics such as Privacy of individuals' personal data. This flows into another characteristic, Unlinkability: a state where this personal data must not be able to leave the system and get into the wrong hands.  It needs a secure wall built around it.  Finally, it must also have Linkability, which is simply total transparency of the chain and total traceability of the data, particularly when it comes to actions on it and modifications to it.  A blockchain meets all of these requirements and, as will be shown when the example of the 2009 subprime crisis is explained, it has additional characteristics which, if applied to MBS and CDOs, would have buffered the crisis in a way that no database could.

For provenance to work really well on a blockchain, the whole ecosystem (all parties involved) needs to be a part of the ledger so that network effects make for a much more robust supply chain and the associated integrity of the data that comes from it.  It will also allow for easily identifying errors in the supply chain and a real time look at the data which is extremely important.  Trade finance provides an example of the network effects and particularly when using Letters of Credit (LOCs).  A Letter of Credit is defined as:

"A letter of credit is a letter from a bank guaranteeing that a buyer's payment to a seller will be received on time and for the correct amount. In the event that the buyer is unable to make payment on the purchase, the bank will be required to cover the full or remaining amount of the purchase."

While a blockchain would be valuable if you just had the buyer and seller using it to save costs and track payments and goods, it would remain incomplete and inaccurate at certain data points (the same is true for subprime mortgages and the making of them). What makes this more powerful is having the customs authorities, the banks, the big corporates, the shippers, and the manufacturers on the ledger: in other words, anyone involved in the chain of custody as the asset is moved either physically or digitally. This gives all parties the ability to watch, update and confirm that the asset is genuine and moving along the chain to its intended destination.  Another problem with LOCs, which also reared its ugly head in the MBS crisis, is the document-heavy (mostly paper) credit processes that still remain intact.  The corporate databases involved suffer from incompatibility and often high transaction costs to get them to communicate with each other.  This makes the current system of supply chain management hard to maintain on a real-time basis.  Physical items become access points to the ledger which can be identified, located and addressed by unique characteristics.

The 2009 Subprime Mortgage Crisis

A few definitions will be needed to explain the types of assets that caused the crisis.  This will also be useful since provenance and distributed ledgers (as described above) would have been important in stopping the severity of the crisis.

Note: There are different categories of Asset Backed Securities (e.g. auto loans, student loans, leases, etc.). The focus of this piece will be on Mortgage Backed Securities and Collateralized Debt Obligations.  As the Financial Crisis Inquiry Report noted (as quoted on Wikipedia), "The CDO became the engine that powered the mortgage supply chain."

  1. Asset Backed Securities: a security whose income payments are derived from and collateralized (backed) by a specific pool of underlying assets.  This pool usually consists (but doesn't have to) of a group of illiquid assets which can't be separately sold.  These assets are then pooled together and a synthetic asset is made that can be sold to investors. This process is known as securitization and is supposed to "de-risk" the assets by bringing together tiny pieces of assets from what is supposed to be a diverse pool of underlying assets.
  2. Synthetic Asset: A mixture of assets that, when combined, have the same effect and value as another asset.  This also gives it the same capital gain potential as the underlying security.
  3. Mortgage Backed Security: a type of asset backed security that is secured by a mortgage or a pool of mortgages. The mortgages are sold to a group of individuals (a government agency or investment bank) that securitizes, or packages, the loans together into a security that investors can buy. The mortgages of an MBS may be residential or commercial. The structure of the MBS may be known as "pass-through", where the interest and principal payments from the borrower or homebuyer pass through it to the MBS holder, or it may be more complex, made up of a pool of other MBSs. Other types of MBS include collateralized mortgage obligations (CMOs, often structured as real estate mortgage investment conduits) and collateralized debt obligations (CDOs).
  4. Collateralized Debt Obligations (CDOs): securities backed by debt obligations that encompassed mortgages and MBS.   Like other private label securities backed by assets, a CDO can be thought of as a promise to pay investors in a prescribed sequence, based on the cash flow the CDO collects from the pool of bonds or other assets it owns. The CDO is "sliced" into "tranches", which "catch" the cash flow of interest and principal payments in sequence based on seniority. If some loans default and the cash collected by the CDO is insufficient to pay all of its investors, those in the lowest, most "junior" tranches suffer losses first. The last to lose payment from default are the safest, most senior tranches. Consequently, coupon payments (and interest rates) vary by tranche with the safest/most senior tranches receiving the lowest rates and the lowest tranches receiving the highest rates to compensate for higher default risk. As an example, a CDO might issue the following tranches in order of safeness: Senior AAA (sometimes known as "super senior"); Junior AAA; AA; A; BBB; Residual.
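The way tranches "catch" cash in order of seniority is often called a waterfall, and a few lines of Python show the mechanics. The tranche names follow the example above; the amounts due and the cash collected are made-up numbers, and the function ignores real-world details such as fees, triggers and interest versus principal.

```python
def waterfall(cash_collected, tranches):
    """Pay tranches in order of seniority until the cash runs out.

    tranches: list of (name, amount_due), most senior first.
    Returns a dict of name -> amount actually paid.
    """
    paid = {}
    remaining = cash_collected
    for name, amount_due in tranches:
        payment = min(remaining, amount_due)
        paid[name] = payment
        remaining -= payment
    return paid


# Hypothetical amounts due this period, most senior tranche first.
tranches = [
    ("Senior AAA", 50),
    ("Junior AAA", 20),
    ("AA", 10),
    ("A", 10),
    ("BBB", 5),
    ("Residual", 5),
]

# Defaults reduce the cash collected from 100 to 85: the junior tranches absorb it.
print(waterfall(85, tranches))
```

With 85 collected against 100 due, the senior tranches are paid in full, the A tranche receives half of what it is owed, and BBB and Residual receive nothing.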

Banks try to make these tranches even safer by insuring them for a fee. This is called a Credit Default Swap (CDS).

These securities became really popular during the US housing bubble as housing prices soared and, with them, the number of new mortgages (whether primary, secondary or tertiary).  Any individual, whether creditworthy or not, was given a mortgage and this created a whole new pool of mortgages to package and repackage. The banks loved them because they could get these mortgages off their books for cash by passing them through.  They were a way for investors to get a good rate of return in an extremely low interest rate world at a time when there was incredible global demand for fixed income, and these were seen as safe assets because the rating agencies were putting their seal of approval on these securities.  However, even though the banks were getting these mortgages off their books, the CDOs and synthetic CDOs that were made expanded the impact of what would come later: mortgage defaults. Before the creation of Credit Default Swaps (CDSs) and CDOs you could only have as much exposure to non-prime mortgage bonds as there were such mortgage bonds in existence.  The extreme leverage used by the banks allowed a seemingly "infinite" amount of synthetic CDOs to be created.

The banks, investors and everyone else involved in the financial supply chain had no transparency into their own risk management. There was (and still is) an inordinate amount of complexity in the securitization of these assets.  As these products were major money makers for banks, there was enormous pressure to beat your competitors by getting your products out faster, which led to sharp declines in underwriting standards.  The securitization piece was happening off balance sheet and being guaranteed by the issuer.  This caused enormous amounts of leverage to be used without a thought for how unstable the capital structures would become.  Credit risk was extraordinarily underpriced.   A recipe for disaster.

Another key domino was trust in the rating agencies.  They were putting investment grade ratings on subprime tranches. Not only this, some of the AAA tranches contained nothing but subprime loans. The rating is very important for investment purposes, as certain types of investors (pension funds, global municipalities, mutual funds, etc.) have mandates on what they can purchase and these are generally limited to AAA rated securities. The CDO managers weren't forced to disclose what the securities contained, because the contents of the CDOs were subject to change.  The underwriters who structured them were less concerned about the ingredients; the rating is what mattered most.

In other words, the default rate became 100% as all the MBS and CDOs began blowing up. Real people with real homes who were actually able to pay their mortgages watched the value of their houses drop dramatically versus what they had paid, and people just started walking away.  To add to the complexity, in many cases not even whole mortgages were put into these synthetics; tiny slices of mortgages were put into different synthetics, so chain of title was completely lost in the process.  On top of that, these mortgages and pieces of mortgages were geographically dispersed.  In order to de-risk, they were packaged together and sold.

Aside from non-credit worthy individuals being able to own homes as less and less stringent standards were used,  robosignings were another byproduct and apparently are still being used to this day.  Robosignings refer to a practice where a signature from a bank or mortgage official on legal documents guarantees that the information is accurate. The