1. INTRODUCTION
Centralized systems are vulnerable to single points of failure. If a central database goes down, entire systems can
be rendered nonoperational. Distributed databases solve this problem by using redundant cooperating nodes so
that the system can continue to function despite individual failures. However, in these systems trust remains
centralized, as database nodes work together and operate under a single authority. Users need to trust the
database owners and assume they have not tampered with the data.
Distributed ledgers take decentralization further by creating databases that are trustless and ownerless. Each
node performs its own independent verification, individually contributing to system security and redundancy.
Since everyone has complete access to the data, this removes the need for any central storage or authority. No
single entity can control or manipulate the contents. In fact, even many cooperating malicious nodes cannot
interfere with valid data. Most importantly, inconsistent information is easily detected, a property referred
to as Byzantine fault tolerance. These properties enable a wide variety of new types of distributed applications.
Figure 1 provides a comparison of these different database designs.
Despite the recent success of distributed ledger applications like cryptocurrency, the underlying technology is
immature. The first practical implementation, Bitcoin,1 is barely a decade old. Many problems need to be solved
before the technology can be applied beyond niche applications. The most prominent issue is the approach used to
append new information, commonly referred to as mining. Bitcoin, for example, uses a proof-of-work blockchain
mining scheme to record its cryptocurrency transactions.
Proof-of-work has high latency, is computationally intensive, and assumes well-connected network infrastructure.
While arguably suitable for a globally scaled cryptocurrency, for most applications it is unnecessary to impose
this burden on CPU and network resources. For smaller applications, proof-of-work is a liability. Since the
security of proof-of-work comes from the computational power of the majority, a large malicious entity can easily
subvert a small distributed ledger. Proof-of-work was specifically designed for cryptocurrency; directly applying
it to other applications is likely the wrong approach. New designs are necessary, and the research community is
actively investigating this area. So far, permissioned ledgers and DAGs are promising alternatives.

Distribution A. Approved for public release: distribution unlimited. Case Number 88ABW-2018-1199 12 March 2018
Permissioned ledgers (sometimes referred to as private blockchains) limit participation through an integrated
identity management system. An obvious approach is to define credentials of individuals (by public key) who are
allowed to publish information. New blocks are confirmed and propagated only if signed with valid credentials.
Multichain2 provides a great example, using a round-robin system to mine blocks. While this approach to mining
significantly reduces latency, it still assumes reliable network infrastructure and arguably centralizes authority.
DAGs use a more sophisticated structure to store relationships between pieces of information. Where
blockchains use a single hash referencing the previous block, DAGs use a hash set instead. Each hash serves
as a cryptographic witness to another validated block, providing mathematical proof of the order of events. This
structure maintains the essential distributed ledger properties (non-repudiation, immutability, and Byzantine fault
tolerance) while enabling far more flexibility. Two popular efforts using this approach are IOTA3 and Byteball.4
While permissioned ledgers and DAGs are steps in the right direction, they still have limitations that make
them insufficient to replace traditional databases. First, they both require a confirmation period. This is the
time required for new information to be reliably committed to the ledger (sometimes up to a minute). Second,
they rely on well-connected network infrastructure; disconnected nodes cannot independently publish new
information until they rejoin the larger network. Our restricted DAG design aims to overcome these
limitations. We present an approach that blends the simplicity of traditional blockchains with the flexibility
of DAGs. We completely eliminate confirmation periods and reduce latency enough to be suitable for real-time
communication. Additionally, our design is partition-tolerant and can be applied in situations with intermittent
or unreliable network connections.
The rest of the paper is organized as follows. In Section 2, we present our ledger design and cover the
relevant details. We discuss the partition tolerance and Byzantine-fault tolerance of the design in Section 3.
Section 4 provides a brief mathematical analysis of the random mesh overlay’s connectivity and propagation
latency. Section 5 presents our proof-of-concept prototype called DagChat. Finally, Section 6 provides some
concluding remarks and Section 7 describes work planned for the future.
2. DESIGN
2.1 Block Header
In distributed ledgers, block headers store the relationships between pieces of information using cryptographic
hashes. These hashes are used for message integrity and serve as mathematical proof of the order of events.
Additionally, digital signatures are used for non-repudiation. Security in distributed ledger systems comes from
multiple independent validations of header data. Since everyone needs to store and propagate this information,
headers are designed to be as small as possible. Figure 2 illustrates our header structure.
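The header layout described above can be sketched as a small data structure. This is a minimal illustration, not the prototype's actual wire format: the field names, the JSON serialization, and hashing the serialized header with SHA-256 are all assumptions made for clarity, and the signature is left as a placeholder field.

```python
import hashlib
import json

class BlockHeader:
    """Illustrative sketch of the header fields described above.
    Field names and encoding are assumptions, not the DagChat format."""

    def __init__(self, parent_hash, witness_hashes, payload_size,
                 payload_format, merkle_root, signature=b""):
        self.parent_hash = parent_hash        # primary graph attachment
        self.witness_hashes = witness_hashes  # hashes of other validated blocks
        self.payload_size = payload_size      # metadata: payload size in bytes
        self.payload_format = payload_format  # metadata: payload format tag
        self.merkle_root = merkle_root        # Merkle root of the data payload
        self.signature = signature            # publisher's signature (placeholder)

    def serialize(self):
        # Canonical byte encoding of the signed fields (witness order is
        # normalized so equivalent headers hash identically).
        return json.dumps({
            "parent": self.parent_hash,
            "witnesses": sorted(self.witness_hashes),
            "size": self.payload_size,
            "format": self.payload_format,
            "merkle_root": self.merkle_root,
        }, sort_keys=True).encode()

    def block_hash(self):
        # SHA-256 of the serialized header identifies the block.
        return hashlib.sha256(self.serialize()).hexdigest()
```

Keeping every field inside the hashed serialization is what lets other nodes independently verify both the header's integrity and its position in the graph.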
In blockchains like Bitcoin, the header begins with a single hash pointing to the previously validated block. DAGs
use a hash set instead. Having multiple hashes enables new blocks to be independently and simultaneously mined,
increasing system performance but also complexity. Additionally, traditional DAGs are unstructured, meaning
each of these hashes can conceivably point to any other already validated header. This level of freedom makes
achieving Byzantine fault tolerance difficult. IOTA, for example, uses a Markov Chain Monte Carlo (MCMC)
simulation to validate locations for new blocks. While this significantly improves upon traditional mining, it still
requires a confirmation period for information to be reliably committed to the ledger. Current DAG designs are
not fast enough for real-time communication.
In our restricted DAG approach, the first hash provides primary graph attachment. This is the position
in the DAG where there are sufficient credentials to publish a new block. Once the new block is published,
another attachment location opens below it. By pairing miners with specific deterministic locations, there is no
chance of forking. Each miner can be viewed as having their own blockchain. Additional hashes then link these
independent chains together to form a DAG, where each block has traceability back to a common root. Newly
published blocks can reliably and unilaterally be committed to a common ledger, reducing latency enough for
real-time communication.
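The attachment rule above can be illustrated with a toy example, assuming a deliberately simplified header encoding (a joined string rather than a real wire format): each miner's primary hash always extends that miner's own previous block, while witness hashes link the other chains into the DAG.

```python
import hashlib

def make_block(miner_id, prev_own_hash, witness_hashes, payload):
    """Create a block whose primary hash extends the miner's own chain and
    whose witness hashes tie other miners' chains into the DAG.
    The string encoding here is a toy stand-in for a real header format."""
    header = "|".join([miner_id, prev_own_hash] + sorted(witness_hashes) +
                      [hashlib.sha256(payload).hexdigest()])
    return hashlib.sha256(header.encode()).hexdigest()

# Each miner attaches deterministically below its own previous block, so two
# honest blocks can never compete for the same position (no forking).
genesis = hashlib.sha256(b"genesis").hexdigest()
a1 = make_block("A", genesis, [], b"hello from A")
b1 = make_block("B", genesis, [a1], b"hello from B")  # B witnesses A1
a2 = make_block("A", a1, [b1], b"second message")     # A extends its own chain
```

Because A2's primary hash can only point at A1, and B1 has witnessed A1, the order of events is fixed by cryptography rather than by a confirmation period.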
The final section of the block header includes metadata describing the size, format, and Merkle root5 of the
referenced data payload. The Merkle root provides a compact tree representation of the encoded data, allowing for
efficient parallelization and logarithmic segment verification. Merkle trees are commonly used in blockchains and
P2P file sharing systems.
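The Merkle root computation can be sketched as follows. This is a generic binary Merkle tree over SHA-256 in which odd nodes are promoted unchanged; that is one common convention, not necessarily the exact padding rule used in the prototype.

```python
import hashlib

def merkle_root(chunks):
    """Compute a binary Merkle root over payload chunks with SHA-256.
    Odd leftover nodes are promoted to the next level unchanged (one common
    convention; real systems differ in how they pad odd levels)."""
    level = [hashlib.sha256(c).digest() for c in chunks]
    if not level:
        return hashlib.sha256(b"").digest()
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                # Hash each adjacent pair into a parent node.
                nxt.append(hashlib.sha256(level[i] + level[i + 1]).digest())
            else:
                nxt.append(level[i])  # promote the odd node
        level = nxt
    return level[0]
```

The tree height is logarithmic in the number of chunks, which is what makes verifying a single payload segment against the root efficient.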
DagChat:addGenesis=/swKMUDlPo2VGDWxqDi1TXDNkPJ5qSnZC6bhzCtlIvw=
Once a user imports the genesis block hash into their client (by clicking on it), their client needs to bootstrap
into the peer-to-peer network of interested participants. This is commonly referred to as the peer-discovery or
P2P rendezvous problem. Currently, we need to manually configure the peer-to-peer links, but we propose (in
Section 7) future work to leverage already existing BitTorrent6 infrastructure.
In addition to providing cryptographic traceability, the genesis block also contains the consensus rules. These
rules dictate how to create valid blocks and determine if received blocks are valid additions to the ledger. These
block validation rules can be represented as a simple Boolean function P : X → {true, false}.
Here, X is the set of candidate blocks; P takes a new block as input, evaluates it, and outputs true if the
validation criteria are met. The logic should be carefully written for specific application needs. In our current prototype,
the majority of this logic exists in the software client and useful ledger configuration options (like block size) are
encoded into the genesis block. This enables multiple ledgers to operate with the same parameters. While we
try to present useful customizations, we aim to keep the design as open and flexible as possible to allow for a
multitude of use cases. A fully developed software implementation of this design will have the entire consensus
logic contained in the genesis block, possibly by using a lightweight scripting language. This will allow
complete ledger customization while maintaining a consistent generic client.
Figure 3 provides a general consensus logic overview. First, when a new block is received we must ensure the
parents have been validated. Next, block credentials and metadata must meet the requirements defined in the
genesis block. Finally, the block must be attached at the correct location in the graph and the parent should
not already have other children with matching credentials.
Figure 3. Consensus logic overview: a new block is accepted only if its parent is valid, its permissions are valid, and its attachment location is valid; otherwise the block is rejected.
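The three checks of Figure 3 can be sketched as a Boolean predicate over incoming blocks. The block schema and rule names here (`allowed_keys`, `max_block_size`) are hypothetical, chosen only to illustrate the control flow; a real genesis block would encode its own rule set.

```python
def validate_block(block, ledger, genesis_rules):
    """Boolean predicate P over incoming blocks, mirroring the three checks
    of Figure 3. The dict-based schema is illustrative, not the prototype's."""
    # 1. The parent must already be validated and present in the ledger.
    parent = ledger.get(block["parent"])
    if parent is None:
        return False
    # 2. Credentials and metadata must meet the genesis-block requirements.
    if block["publisher"] not in genesis_rules["allowed_keys"]:
        return False
    if block["payload_size"] > genesis_rules["max_block_size"]:
        return False
    # 3. The attachment location must be free: the parent may not already
    #    have another child with the same credentials.
    for other in ledger.values():
        if (other["parent"] == block["parent"]
                and other["publisher"] == block["publisher"]):
            return False
    return True
```

Note that check 3 is what restricts the DAG: a given publisher gets exactly one child slot per parent, so honest publishers never contend for the same position.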
Consider the following application: a distributed system to exchange arbitrary data between a pre-defined
set of users A, B, and C, identified by their public key. Users are allowed to invite others. Figure 4 shows a
valid DAG topology. Bold arrows represent primary graph attachment hashes and dashed lines represent witness hashes.
Note that user D was invited by user C at block C2.
Figure 4. Example of a valid DAG topology for four users.
3. LEDGER ANALYSIS
3.1 Partition Tolerance
Using our restricted DAG design, each user can immediately and unilaterally publish new data to the ledger.
They can accomplish this by using their private key to make new blocks with valid credentials. Since each user
has their own blockchain within the DAG, there is no contest over who gets to publish where in the ledger.
Additionally, there is no chance of orphaned data which does not meet the group consensus state. Each block
contains cryptographic proof of its valid existence and attachment at the correct location.
Because each block contains its own complete proof of validity and attachment, our design is inherently
partition tolerant. Groups of users cut off from the main ledger can independently validate each other’s data
and seamlessly rejoin the main ledger once connectivity to the rest of the network is re-established. At
reconnection, there is no potential for overlapping data, so there is no need to re-validate or reconcile
differences. The forks can simply be melded back together, and each user can fill the gaps in their copy of the
ledger. Figure 5 shows this process. At time T1, the network connections from A and B to C and D are broken,
so during this time they can only validate blocks posted by users in their own fork of the network. Once the
connections are re-established at T2, they can then seamlessly rejoin the two forks to create a complete ledger.
During the downtime, users A and B have their own fully functional fork of the DAG, as do users C and D. Upon
reconnection, users A and B can learn what C and D posted during the downtime (and vice versa), and they know
that the two forks happened concurrently. The only drawback is that there is no way to determine the exact
order in which messages were posted during the disconnect.
Figure 5. Partition tolerance: between T1 and T2, users A and B are disjoint from C and D.
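The melding step can be sketched as a simple union, assuming blocks are keyed by their hash so that shared history deduplicates automatically. As noted above, the relative order of blocks from different partitions remains undetermined; only the per-chain order inside each fork is fixed.

```python
def merge_forks(fork_a, fork_b):
    """Meld two forks of the ledger after a partition heals.
    Each fork maps block hash -> block. Because every user's blocks occupy
    deterministic positions in their own chain, the two forks can never
    conflict, so a plain union is sufficient; no reconciliation is needed."""
    merged = dict(fork_a)
    merged.update(merged_b := fork_b)  # shared history (e.g. genesis) deduplicates by key
    return merged
```

In practice each peer would also fetch any blocks it is missing from the other side, but no block ever needs to be re-validated or discarded.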
Figure 6. Double-spend attempt: user A attempts to present an alternate history of events (A2† and A3†).
In Figure 6, user A attempts to publish alternate blocks A2† and A3†. Since user B has already witnessed
block A2 (at block B3), A2† would be rejected by all participants. Furthermore, A can be flagged as a malicious or
compromised user for presenting two different successors to A1.
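Detecting such equivocation amounts to scanning for two distinct blocks by the same publisher attached to the same parent. The block schema below is illustrative, not the prototype's.

```python
from collections import defaultdict

def find_equivocations(blocks):
    """Flag publishers who produced two different successors to the same
    parent (e.g. both A2 and A2-dagger extending A1).
    `blocks` maps block hash -> {"publisher": ..., "parent": ...}."""
    children = defaultdict(set)
    flagged = set()
    for block_hash, block in blocks.items():
        key = (block["publisher"], block["parent"])
        children[key].add(block_hash)
        if len(children[key]) > 1:
            # Two distinct blocks in the same slot: provable equivocation.
            flagged.add(block["publisher"])
    return flagged
```

Because both conflicting blocks carry the publisher's valid signature, the pair itself is cryptographic proof of misbehavior that any peer can verify independently.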
4. OVERLAY ANALYSIS
4.1 Graph Connectivity
In this section, we assess the probability that the random mesh overlay will be a connected graph. To do this we
apply Erdös-Rényi random graph models,8, 9 where a graph G(n, p) contains n vertices and each possible edge is
present with probability p. The set of graphs in G(n, p) can be generated using the following algorithm:
Algorithm RandG(n, p)
for each pair of nodes in a graph containing n nodes
add an edge with probability p
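The RandG procedure above can be written directly, together with a breadth-first connectivity check. This is a straightforward sketch, not optimized for large n.

```python
import random

def rand_g(n, p, seed=None):
    """Sample G(n, p): for each pair of the n nodes, add an edge with
    probability p (the RandG algorithm above)."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def is_connected(adj):
    """Check connectivity by breadth-first search from node 0."""
    if not adj:
        return True
    seen, frontier = {0}, [0]
    while frontier:
        nxt = []
        for u in frontier:
            for v in adj[u] - seen:
                seen.add(v)
                nxt.append(v)
        frontier = nxt
    return len(seen) == len(adj)
```

Sampling many graphs at a fixed n while sweeping p makes the sharp connectivity threshold discussed next easy to observe empirically.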
Using G(n, p), we can expect a graph to have on average (n choose 2)·p ≈ n²p/2 total edges. We next apply results from10 concerning connectivity:
• If p < (1 − ε) ln n / n, then G(n, p) is almost always not connected.
• If p > (1 + ε) ln n / n, then G(n, p) is almost always connected.
Therefore ln n / n is a threshold for the connectedness of G(n, p). We can then apply these results to select a degree
that is sufficient to ensure our overlay network is connected. In Table 1, we compute this value for various
network sizes n. We show that for large graphs, an average of ten peers is sufficient to ensure a connected network.
Typical TCP connection limits for an average computer and network are well above this threshold.
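Since Table 1 is not reproduced here, the threshold values can be recomputed directly: at the connectivity threshold p = ln n / n, the expected degree is n·p = ln n, so roughly ln n peers per node suffice. The network sizes below are illustrative choices, not the exact rows of Table 1.

```python
import math

def connectivity_degree(n):
    """Average degree n*p at the connectivity threshold p = ln(n)/n,
    i.e. roughly ln(n) peers per node for an almost-surely connected G(n, p)."""
    return math.log(n)

# Recomputing the kind of values Table 1 reports (sizes are illustrative):
thresholds = {n: connectivity_degree(n) for n in (10**2, 10**3, 10**4)}
# ln(10^4) ≈ 9.2, so an average of ten peers covers networks up to ~10,000
# nodes; the required degree grows only logarithmically beyond that.
```

The logarithmic growth is the key point: even very large overlays need only a modest, nearly constant number of peer connections per node.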
5. PROOF-OF-CONCEPT
Figure 8 shows our initial proof of concept, a chatroom that uses our distributed ledger design. Since our
distributed ledger uses a directed acyclic graph (DAG), the software is named DagChat. Each room is identified
by its genesis block, which is created using elliptic curve digital signatures and SHA-256 hashes provided by Java
built-in libraries. Approved users can publish blocks containing messages or files to the room which get stored
in the distributed ledger. The data is propagated to everyone who’s “watching” the room using the random
mesh overlay. The design can be altered based on the needs of a given application to only allow certain users
to watch or publish (read or write) to the room. Further, the design can be customized to only allow certain
data types and sizes to be published as valid blocks. Even with all of this functionality and flexibility, the code
is lightweight enough to run on a Raspberry Pi 3.
This proof of concept demonstrates how our distributed ledger works as a means of communication that is
resistant to attack, manipulation, and censorship. A chatroom is only a rudimentary example; the technology could
serve as a trustless, ownerless storage mechanism for any type of database.
6. CONCLUSION
In this work, we have presented a new type of distributed ledger using a novel restricted DAG design. The
block header stores a hash set that dictates where the block belongs in the DAG, credentials of users with
permission to publish to the ledger, and metadata describing the attached payload. The primary advantage of
our restricted DAG is low latency. Since our design does not require a confirmation period, validated information
can immediately be shared with peers. This enables block propagation to be directly modeled using the degree-diameter
problem from graph theory. In this setup, the diameter represents the longest distance between any two
peers. This value provides an upper bound for latency. By pairing miners with specific deterministic locations,
our approach is fast enough for real-time communication. Our approach is also made flexible by embedding
consensus logic into the genesis block, which enables a generic client to handle many different types of ledgers.
Additionally, our random mesh overlay network provides consistent, reliable communication between peers.
We have shown that our restricted DAG design is inherently tolerant of partitions and Byzantine failure. We
have also shown that our overlay network has an acceptably high probability of connectivity and the expected
propagation performance is quite good. Finally, we presented our early work in creating a prototype of this
design that is extremely flexible and lightweight.
7. FUTURE WORK
First, we need to finish developing the network overlay. This involves solving the peer-to-peer bootstrapping
problem and using the genesis block hash as the key for P2P rendezvous. We will utilize already existing
BitTorrent technology like Trackers and the Mainline Distributed Hash Table.7
The next major addition would be The Onion Router13 (Tor) integration. This will hide the IP addresses of
participating peers. Current websites on the dark web are hidden but centralized. If we are successful in this effort,
such sites could be decentralized to the point where they are nearly impossible to shut down.
Finally, in order to validate our software proof of concept at scale, we need to run a few thousand nodes
in a laboratory simulation. We aim to show system stability in the presence of a significant number of adversarial
nodes. If we are successful in this work, we will realize a system that allows one-click creation of fully
decentralized services.
REFERENCES
[1] Nakamoto, S., “Bitcoin: A peer-to-peer electronic cash system,” (2008). https://bitcoin.org/bitcoin.pdf.
[2] Greenspan, G., “Multichain private blockchain,” (2015). https://www.multichain.com/download/MultiChain-
White-Paper.pdf.
[3] Popov, S., “The tangle,” (2017). https://iota.org/IOTA_Whitepaper.pdf.
[4] Churyumov, A., “Byteball, a decentralized system for storage and transfer of value,” (2015).
https://byteball.org/Byteball.pdf.
[5] Merkle, R. C., “Method of providing digital signatures,” (1982). US Patent 4309569.
[6] Cohen, B., “The bittorrent protocol specification,” (2008). http://www.bittorrent.org/beps/bep_0003.html.
[7] Loewenstern, A. and Norberg, A., “DHT Protocol,” (2008). http://www.bittorrent.org/beps/bep_0005.html.
[8] Erdös, P. and Rényi, A., “On random graphs,” Publicationes Mathematicae (Debrecen) 6, 290–297 (1959).
[9] Gilbert, E. N., “Random graphs,” Ann. Math. Statist. 30, 1141–1144 (12 1959).
[10] Erdös, P. and Rényi, A., “On the evolution of random graphs,” in [Publications of the Mathematical Institute
of the Hungarian Academy of Sciences ], 17–61 (1960).
[11] Caccetta, L. and Smyth, W., “Graphs of maximum diameter,” Discrete Mathematics 102, 121–141 (1992).
[12] Chung, F. and Lu, L., “The average distance in a random graph with given expected degrees,” Proc Natl
Acad Sci USA 99(25), 15879–15882 (2002).
[13] Dingledine, R., Mathewson, N., and Syverson, P., “Tor: The second-generation onion router,” in [Pro-
ceedings of the 13th Conference on USENIX Security Symposium - Volume 13], SSYM’04, 21–21, USENIX
Association, Berkeley, CA, USA (2004).