Immutability & Immortality

Opening The Immutable Library

Why Dara’s Project Gutenberg Archive is a Big Deal

The Immutable
9 min read · Dec 18, 2023

“A book is a fragile creature, it suffers the wear of time, it fears rodents, the elements and clumsy hands. So the librarian protects the books not only against mankind but also against nature and devotes his life to this war with the forces of oblivion.”

― Umberto Eco, The Name of the Rose

Our Gutenberg collection turns library science on its head by replacing a central repository of data protected by curators with a decentralized archive of data protected with cryptography.

Dara recently published a concise and immutable archive of the complete Project Gutenberg (PG) eBook collection.

There’s a lot of technology packed into our unique library archive, and this is a layman’s guide to understanding what has been done. The result is not merely a mirror: it’s a repackaged, republished, redistributed and thoroughly rethought approach to digital archival.

Depending on when you read this, our archive at gutenberg.dara.global may have changed to include new features, which will be discussed in future updates.

Project Gutenberg

We’ve spoken at length about the importance of PG and their work, and how their founder Michael S. Hart invented the eBook (then called an etext) way back in 1971.

To start our layman’s guide and describe what’s been done, we’ll first examine the source.

So let’s look at book number 2600, that being the 2600th title released by Project Gutenberg — Tolstoy’s War and Peace.

War and Peace on gutenberg.org

As you can see, the book is available in multiple file formats and from multiple locations, and is accompanied by a good deal of metadata presented beneath the download links.

Dara has used many technologies to republish these books in a unique manner. These include the InterPlanetary File System (IPFS), the InterPlanetary Name System (IPNS), Dara’s own OuterPlanetary File System (OPFS), blockchain, the Mutable File System (MFS), and the ZWI file format.

That’s a rather daunting list! But don’t worry because rather than explain what all these cryptic terms mean right now, we’ll just get going and find out as we cross those bridges.

Project Dara

I have made me a monument more lasting than bronze

-Horace

Now let’s take a look at the same book, War and Peace, on Dara. We’ll go to gutenberg.dara.global and search for the book at the top of the screen.

The experience is seamless on both desktop and mobile.

gutenberg.dara.global

You’ll notice the search is fast and live, filtering through the roughly 70,000 books instantaneously. To search for an exact phrase, just wrap it in quotation marks, as in the example below.

Clicking on the book link sends you to https://opfs.dara.global/ipfs/QmZEZcQkHEfAP9G9qw63ZqxKzvaLdEvJ8d4mboF9BVHNkf/2600.zwi where you can comfortably read the book online thanks to Dara’s OPFS display technology.

If you append &download to the URL you’ll download a zipped ZWI folder, for example:
https://opfs.dara.global/ipfs/QmZEZcQkHEfAP9G9qw63ZqxKzvaLdEvJ8d4mboF9BVHNkf/2600.zwi&download

Or if you change opfs to ipfs in the URL you’ll download a ZWI file, for example:
https://ipfs.dara.global/ipfs/QmZEZcQkHEfAP9G9qw63ZqxKzvaLdEvJ8d4mboF9BVHNkf/2600.zwi

Download icons will be added soon, so users won’t need to edit these URLs manually. For the purpose of this document it is enough to know that these URL patterns exist.
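For readers who like to see things spelled out, the three URL patterns above can be sketched in a few lines of Python. This is only an illustration: the function name book_urls and its dictionary keys are our own invention for this example, and the hostnames and paths are simply copied from the URLs shown above.

```python
def book_urls(cid: str, book_id: int) -> dict[str, str]:
    """Build the three URL variants described above for one book.

    `cid` is the library folder's identifier and `book_id` is the
    Project Gutenberg book number. The function and key names are
    hypothetical; only the URL shapes come from the article.
    """
    path = f"/ipfs/{cid}/{book_id}.zwi"
    return {
        # Read the book online through the OPFS viewer.
        "read": f"https://opfs.dara.global{path}",
        # Same URL with "&download" appended: a zipped ZWI folder.
        "download_zip": f"https://opfs.dara.global{path}&download",
        # Host changed from opfs to ipfs: the raw ZWI file.
        "download_zwi": f"https://ipfs.dara.global{path}",
    }


if __name__ == "__main__":
    library = "QmZEZcQkHEfAP9G9qw63ZqxKzvaLdEvJ8d4mboF9BVHNkf"
    for kind, url in book_urls(library, 2600).items():
        print(f"{kind}: {url}")
```

Because the book number is the only part that changes between titles, swapping 2600 for another number gives the matching URLs for that book.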

Immutability

unchanging over time or unable to be changed.
“an immutable fact”

ZWI File Format

A ZWI file is a ZIP archive, or compressed container format, most often used to store wiki articles and associated data. So similar are ZWI and ZIP that on macOS you can change a ZWI file’s extension to .zip to easily peer inside!

Double-clicking to decompress the ZIP opens a folder containing all the assets associated with a book release. Let’s take a look!

In a nutshell, all the files you see in this folder combine to make the Dara version of a Project Gutenberg eBook.

You’ll notice we have an HTML version of the book (.htm), a plaintext version of the book (.txt), a media folder containing all the images in the book including its cover, a signature file (.json), and a metadata file (.json).
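If you’d rather not rename anything, the same peek inside can be done with Python’s standard zipfile module, which reads a ZWI file directly because it is a ZIP archive. The sketch below is a minimal illustration; the helper names list_zwi and read_member are ours, and the file name in the usage note is just a placeholder for a ZWI you have downloaded.

```python
import zipfile


def list_zwi(source) -> list[str]:
    """Return the sorted names of every file inside a ZWI archive.

    `source` may be a path on disk or a binary file-like object.
    (Hypothetical helper name, for illustration only.)
    """
    with zipfile.ZipFile(source) as archive:
        return sorted(archive.namelist())


def read_member(source, name: str) -> bytes:
    """Return the raw bytes of one file stored inside the archive."""
    with zipfile.ZipFile(source) as archive:
        return archive.read(name)
```

For example, list_zwi("2600.zwi") on a downloaded copy of War and Peace should list the .htm and .txt files, the JSON files and the media assets described above.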

Let’s check out that metadata.json!

Metadata

{
  "ZWIversion": 1.3,
  "Title": "War and Peace",
  "ShortTitle": "The Project Gutenberg eBook of War and Peace, by Leo Tolstoy",
  "Topics": [
    "Historical fiction",
    "War stories",
    "Napoleonic Wars, 1800-1815 -- Campaigns -- Russia -- Fiction",
    "Russia -- History -- Alexander I, 1801-1825 -- Fiction",
    "Aristocracy (Social class) -- Russia -- Fiction"
  ],
  "Lang": [
    "en"
  ],
  "Content": {
    "2600.htm": "346c81affad86ec4d4f21463fd2c3b44306118b5",
    "2600.txt": "cd67a80c8707c035da61e0afc2f98a5f23447636"
  },
  "Primary": "2600.htm",
  "Publisher": "ProjectGutenberg",
  "CreatorNames": [
    "Tolstoy, Leo, graf, 1828-1910",
    "Maude, Aylmer, 1858-1938 [Translator]",
    "Maude, Louise, 1855-1939 [Translator]"
  ],
  "ContributorNames": "",
  "LastModified": 1682177307,
  "TimeCreated": 1682177307,
  "PublicationDate": "2001-04-01",
  "Categories": [
    "Napoleonic(Bookshelf)",
    "Opera",
    "Historical Fiction",
    "Best Books Ever Listings",
    "Movie Books"
  ],
  "LoCC": [
    "PG"
  ],
  "Rating": "",
  "Description": "DARA Archive - Project Gutenberg eBook #2600 titled: The Project Gutenberg eBook of War and Peace, by Leo Tolstoy",
  "Comment": "",
  "License": "Public Domain in the USA",
  "GeneratorName": "Project DARA - Gutenberg Archiver",
  "SourceURL": "https://gutenberg.org/files/2600"
}

Before we continue, we’ll zoom in on the PG web page shown at the top.

As you can see, a lot of the information about the book is now contained in this metadata file associated with the book.

Our ZWI file (itself a compressed folder of other files) contains a lot of metadata from Project Gutenberg’s website together with the book. So if gutenberg.org goes offline, our archive contains a good deal of the data needed to restore it.
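The "Content" entries in the metadata are worth a second look: each value is 40 hexadecimal characters, which is the length of a SHA-1 digest. Assuming that is what they are (our reading of the listing, not something stated above), a reader could check that the book files inside a ZWI still match their recorded hashes. The helper name verify_content is hypothetical.

```python
import hashlib
import json
import zipfile


def verify_content(source) -> dict[str, bool]:
    """Compare each file listed under "Content" with its recorded hash.

    Assumption: the recorded values are SHA-1 digests of the file bytes
    (they are 40 hex characters long). Returns {file name: matches?}.
    """
    with zipfile.ZipFile(source) as archive:
        metadata = json.loads(archive.read("metadata.json"))
        results = {}
        for name, expected in metadata["Content"].items():
            actual = hashlib.sha1(archive.read(name)).hexdigest()
            results[name] = actual == expected.lower()
        return results
```

A result of True for every file would mean the HTML and text in the archive are byte-for-byte what the metadata says they should be.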

QmQpT5uZfng7yh6QLUcDB. . .

Whether in the URL or in the name of the ZWI folder, some of you may have noticed a very long string of characters starting with a Q.

What sorcery is this?

It’s the unique identifier, or hash, of the entire library folder (the folder published via IPNS), known in IPFS lingo as a CID (Content Identifier). Each title within the library also has its own unique CID.
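There is less sorcery here than it seems. Identifiers that start with "Qm" are the older "version 0" style of CID: a base58 encoding of two marker bytes (0x12 for the SHA-256 hash function, 0x20 for a 32-byte length) followed by the hash itself. The sketch below unpacks one using only the standard library; the function names are ours, and it handles only this "Qm" style, not newer CID formats.

```python
BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_decode(text: str) -> bytes:
    """Decode a base58 (Bitcoin alphabet) string into bytes."""
    number = 0
    for char in text:
        # str.index raises ValueError for characters outside the alphabet.
        number = number * 58 + BASE58_ALPHABET.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    # Each leading "1" stands for one leading zero byte.
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body


def parse_cid_v0(cid: str) -> dict:
    """Split a "Qm..." CID into its hash-function marker and digest."""
    raw = base58_decode(cid)
    code, length, digest = raw[0], raw[1], raw[2:]
    if code != 0x12 or length != 0x20 or len(digest) != 32:
        raise ValueError("not a version 0 CID (expected a sha2-256 multihash)")
    return {"hash_function": "sha2-256", "digest_hex": digest.hex()}


if __name__ == "__main__":
    print(parse_cid_v0("QmZEZcQkHEfAP9G9qw63ZqxKzvaLdEvJ8d4mboF9BVHNkf"))
```

Note that the digest is not a plain SHA-256 of the file’s bytes: IPFS hashes its own internal representation of the data, so reproducing a CID from scratch takes an IPFS implementation.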

InterPlanetary File System

The CID is derived from the content itself, so it cannot be changed without changing the file, and the ZWI file is now archived in the InterPlanetary File System. Think of IPFS as a decentralized file-sharing network with thousands of participants spread across the entire globe.

Each of the 70,000 books has its own unique and immutable identifier. And each of the books can be downloaded through any of the many IPFS gateways worldwide.

So things are looking pretty good. But Project Gutenberg is always adding books and if Dara’s collection is immutable, doesn’t that mean we can’t update our library?

Actually, it doesn’t! . . .

Mutable File System?

Why would Dara ever use something called a mutable file system?

Well, since we need to keep our PG library updated we are making use of a technology within IPFS which works in concert with IPNS, the InterPlanetary Name System.

Quoting from official documentation:

Because files in IPFS are content-addressed and immutable, they can be complicated to edit. Mutable File System (MFS) is a tool built into IPFS that lets you treat files like you would a regular name-based filesystem — you can add, remove, move, and edit MFS files and have all the work of updating links and hashes taken care of for you.

So all the books are immutable, but the folder in which they are contained is not.

Every time a book is added the CID of the folder changes, and by recording these new CIDs to a blockchain we can create a permanent and digitally timestamped record, a cryptographic snapshot of the library’s new state, every time books are added.
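The idea that books stay fixed while the folder’s identifier moves can be shown with a toy model. The sketch below is not how IPFS or MFS actually compute CIDs; it uses plain SHA-256 hashes and invented names (content_id, folder_id, Library) purely to illustrate the principle that a folder’s identifier is derived from the identifiers of what it contains.

```python
import hashlib
import json
import time


def content_id(data: bytes) -> str:
    """Toy stand-in for a book's CID: the SHA-256 of its bytes."""
    return hashlib.sha256(data).hexdigest()


def folder_id(entries: dict[str, str]) -> str:
    """Toy stand-in for a folder's CID: a hash of its sorted listing."""
    listing = json.dumps(sorted(entries.items())).encode()
    return hashlib.sha256(listing).hexdigest()


class Library:
    """A folder of immutable books plus a log of folder snapshots."""

    def __init__(self):
        self.entries = {}    # file name -> content identifier
        self.snapshots = []  # one record per library state

    def add_book(self, name: str, data: bytes, timestamp=None) -> dict:
        self.entries[name] = content_id(data)
        snapshot = {
            "folder_id": folder_id(self.entries),
            "time": time.time() if timestamp is None else timestamp,
        }
        self.snapshots.append(snapshot)
        return snapshot
```

Adding 2.zwi to a library that already holds 1.zwi produces a new folder identifier, while the identifier of 1.zwi stays exactly as it was. The list of snapshots plays the role of the record that, in Dara’s case, is written to a blockchain.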

Immortality

the ability to live forever; eternal life.
“eating the fruit gave the gods immortality”

Blockchains are Timechains

You can think of a blockchain as a decentralized database with a decentralized clock. It is arguably among the most robust and enduring types of database ever invented.

Blockchains do not rely on other people’s clocks to work out in which order transactions were sent or received, and all the participants in a blockchain network reach a consensus on which order, and at which time, transactions were seen.

Such was Satoshi’s preoccupation with time that before calling the Bitcoin ledger a blockchain he called it a timechain.

But Dara PG blockchain transactions do not contain banal details about monetary movements. Instead they contain signatures and other metadata pointing to a book’s location in the InterPlanetary File System.

So if both gutenberg.org and dara.global go offline (OMG!) there still remains sufficient information stored on a decentralized blockchain ledger to find all the books in the PG library together with associated metadata and other assets.
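To get a feel for why a chain of blocks makes a good permanent record, here is a toy hash chain. It is a deliberately simplified sketch, not Dara’s ledger or any real blockchain: there is no network and no consensus, and the field names ("previous", "time", "record") are invented for the example. Each block stores the hash of the block before it, so quietly editing an old record breaks every link that follows.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's full contents in a stable (sorted-key) form."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, record: dict, timestamp: int) -> dict:
    """Add a record, linking it to the hash of the previous block."""
    previous = block_hash(chain[-1]) if chain else "0" * 64
    block = {"previous": previous, "time": timestamp, "record": record}
    chain.append(block)
    return block


def chain_is_valid(chain: list) -> bool:
    """Check that every block still points at its unmodified predecessor."""
    for earlier, later in zip(chain, chain[1:]):
        if later["previous"] != block_hash(earlier):
            return False
    return True
```

In this toy version only the links between blocks are checked, so the newest block is protected by nothing. On a real blockchain, that gap is closed by many independent participants agreeing on each new block.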

Signatures

Remember there was a signature file in the ZWI folder? No? Well go back and check!

“Let us instead exercise our brains and try to solve this tantalizing conundrum.”

Here is what it looks like:

{
"identityName": "Project DARA",
"identityAddress": "info@dara.global - https://gutenberg.dara.global",
"psqrKid": "did:psqr:did.dara.global/gutenberg#publish",
"webKid": "did:web:did.dara.global:gutenberg#publish",
"token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJFUzM4NCIsImp3ayI6eyJrdHkiOiJFQyIsImNydiI6IlAtMzg0IiwieCI6IjFiZWV4Y0ZCak42RlhZTjZRNW1xZU9JM040RDJCSDNTYlAzXzgxa0FMUjlrVzJaYkU1UUo0eUNjakUwT2NYV3kiLCJ5IjoiLWpNZGZUbnBuQ0NOQnJ0Z0Z3cmx1YUg1UHozdjBZWU56dWNDMGFlelRkeGFVckxyaWlzWVVBSFpyeF9iNUNENCIsImFsZyI6IkVTMzg0Iiwia2lkIjoiZGlkOndlYjpkaWQuZGFyYS5nbG9iYWw6Z3V0ZW5iZXJnI3B1Ymxpc2gifX0.eyJpYXQiOjE2ODIxNzczMDcsImF1dGhvcml0eSI6eyJpZGVudGl0eU5hbWUiOiJQcm9qZWN0IERBUkEiLCJpZGVudGl0eUFkZHJlc3MiOiJpbmZvQGRhcmEuZ2xvYmFsIC0gaHR0cHM6Ly9ndXRlbmJlcmcuZGFyYS5nbG9iYWwiLCJwc3FyS2lkIjoiZGlkOnBzcXI6ZGlkLmRhcmEuZ2xvYmFsL2d1dGVuYmVyZyNwdWJsaXNoIiwid2ViS2lkIjoiZGlkOndlYjpkaWQuZGFyYS5nbG9iYWw6Z3V0ZW5iZXJnI3B1Ymxpc2giLCJ1cGRhdGVkIjoiMjAyMy0wNC0yMlQxNToyODoyNy4yMTg2MjUifSwibWV0YWRhdGEiOiI3ZjJjY2YxMzcwNzhhOGNjNzMzNDc1YTU3NTQ3ZTE5OWNlNjk0MmIxIiwibWVkaWEiOiI2MTJjZjM0YzQwZDBiMzJiNjE5M2ZhMDVmN2JkYmNkMTlkNDQ4MWExIn0.A8LYPS0UAiVmJ16u9HzaySt0k6YAMqO3pTWQqlWSCsE6o81sXM_NuuY6LLrwB_TbNhSbYe6JbERln7JDrqtr5_FsnRHKS0KuvNS8wsoUzJE5Q5Fmm5naFdDx47fRgKsc",
"alg": "ES384",
"updated": "2023-04-22T15:28:27.218625"
}

Keeping this layman’s guide layman-friendly gets trickier at this point.

For now it’s enough to say that Decentralized Identity (DID) PSQR is a technology for establishing, securing and authenticating digital identities using the latest in cryptography and security.

Signatures verify and authenticate documents and their publishers (in this case Project Dara), and we enshrine these signatures, together with IPFS CIDs (which are also a kind of signature), on a blockchain ledger.
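For the curious, the long "token" value in the signature file has the shape of a JSON Web Token: three base64url-encoded parts separated by dots (a header, a payload and the signature bytes). The sketch below decodes the first two parts so you can read them; the function names are ours. It does not verify anything. Actually checking an ES384 signature needs the publisher’s public key and a cryptography library, which is beyond this guide.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode base64url text, restoring the padding that tokens omit."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def decode_token(token: str) -> dict:
    """Split a three-part token and decode its header and payload.

    This only makes the contents readable; it performs NO signature check.
    """
    header_b64, payload_b64, signature_b64 = token.split(".")
    return {
        "header": json.loads(b64url_decode(header_b64)),
        "payload": json.loads(b64url_decode(payload_b64)),
        "signature_bytes": len(b64url_decode(signature_b64)),
    }
```

Passing the "token" string from a signature file to decode_token shows what the publisher actually signed, in plain JSON.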

Epilogue

In very simple terms, Dara uses IPFS to provide immutability and blockchain to reach for immortality.

Our PG library is lightweight and fits on a 128 gigabyte USB flash drive!

We’re in the process of preparing an image file so that you can all copy and distribute this library even more easily.

We archived the web and text versions of all PG eBooks, wrapping them up in ZWI files with relevant metadata, signatures and media assets.

The Immutable PG Library is complete, tamper-proof, self-contained, updatable, and totally unique in how it archives this precious body of knowledge.

The Dara Project Gutenberg Library is dedicated to Michael S. Hart, founder of Project Gutenberg, 1947-2011.

“Although you may have moved along, we’ll always keep your message strong:
To share words and books together; reading, singing, friends forever.”

Thank you, Michael.

Appendix: Notes from the architect

The original PG collection is around 2TB. The processed version strips the “attic” contents and keeps only the main asset file for each book. If an HTML file is found, it becomes the key file; if not, the text file is used; if neither is found (an mp3, for instance), the title is not indexed. Out of all the PG titles, 69,055 books were processed.

Every book was cleaned up and restructured in line with the ZWI standard, then signed with DARA’s private key, verifiable via DID:WEB and DID:PSQR. Title numbers are retained across the library, so book ID 1 in PG is 1.zwi in DARA. The cleaning described above results in a total library size of ~112GB!

All the books are stored inside an MFS directory on IPFS, which is published to IPNS at k51qzi5uqu5dh0negpidss3lsz59ngevump5kysbcl0g07hd4ig1e102sbhtzq. For instance:
https://cloudflare-ipfs.com/ipns/k51qzi5uqu5dh0negpidss3lsz59ngevump5kysbcl0g07hd4ig1e102sbhtzq/
will load the PG library, just as
https://cloudflare-ipfs.com/ipfs/QmW6UZQ946t1sJSChPuA6fuMNL3qwwDtxhFKmw6ydiamZu/
would (the first URL uses /ipns/ and the second uses /ipfs/).

You can download the ZWI files by changing “opfs” to “ipfs” in the URL of a book that you’re viewing, or simply by adding “&download” to the end of the URL. For instance:
https://opfs.dara.global/ipfs/QmZEZcQkHEfAP9G9qw63ZqxKzvaLdEvJ8d4mboF9BVHNkf/55.zwi
will render book ID 55 on the fly, while
https://ipfs.dara.global/ipfs/QmZEZcQkHEfAP9G9qw63ZqxKzvaLdEvJ8d4mboF9BVHNkf/55.zwi
will download book ID 55 as a ZWI.
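The rule for picking each book’s key file (HTML first, then plain text, otherwise skip the title) is simple enough to sketch. What follows is our own illustration of that rule, not the archiver’s actual code: the function name choose_primary and the exact file extensions checked are assumptions.

```python
from pathlib import PurePosixPath
from typing import Optional

# Assumed extensions; the notes above only say "HTML" and "text".
HTML_SUFFIXES = (".htm", ".html")
TEXT_SUFFIXES = (".txt",)


def choose_primary(filenames: list[str]) -> Optional[str]:
    """Pick a book's key file: HTML if present, else text, else None.

    Returning None corresponds to a title that is not indexed
    (for example, one that only has audio files).
    """
    for suffixes in (HTML_SUFFIXES, TEXT_SUFFIXES):
        for name in sorted(filenames):
            if PurePosixPath(name).suffix.lower() in suffixes:
                return name
    return None
```

For War and Peace, which has both 2600.htm and 2600.txt, this picks 2600.htm, matching the "Primary" field in the metadata shown earlier.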
