IPFS (the InterPlanetary File System) is a peer-to-peer hypermedia protocol that promises completely distributed applications and the ability to make the web faster, safer, and smarter. We had a chance to talk with two of the people behind this exciting technology, Juan Benet and Kyle Drake. What they envision for IPFS is not only a blueprint for the permanent web, but an evolution in the way information is distributed and stored.
Travis: IPFS builds upon previous implementations of a peer-to-peer web structure. Can you explain what makes this implementation better than those that came before it?
Juan: Many times good ideas don’t take off because the implementations or the deployment strategy weren’t good enough. Sometimes it’s just a timing thing. Think of the Apple Newton well before high bandwidth cell networks and multitouch screens. In the case of IPFS, the ideas we build upon were usually very specific to solve a particular problem. This is a great way to build a solid, usable system. But sometimes, the ideas can be applied in a much broader way.
Much of our contribution is a process of integration, making many good ideas and many pieces work together cleanly and robustly. Developers and engineers are the users of protocols, and the user experience has to be really, really good for something to be maximally useful. We devote a lot of effort to making the UX really good. Many people are blown away by the things our tools make possible, and getting there took careful design and refinement.
Kyle: As I understand it, there were some brilliant attempts to make similar things in the past. Often they didn’t have all the pieces together needed to do this, or they were more academic-oriented. And in many cases, the technology just hadn’t quite caught up with the idea yet. But they were great projects, and everybody learned a lot of lessons from them.
The main enhancement I see IPFS providing is the combination of a P2P network with the MerkleDAG, which is a way to split up data into smaller chunks that can be chained together like a tree. This makes IPFS incredibly efficient, general purpose, and powerful for a large number of data-related problems, including the web. Git, as an example, uses basically the same technology to do efficient revision control for software development. IPFS takes the ideas behind Git, makes them general purpose, adds a P2P network for fast data retrieval, and the result is a very powerful system for the distribution of all data.
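To make the chunk-and-link idea concrete, here is a minimal sketch in Python. It is not the actual IPFS data format: the tiny chunk size, the use of SHA-256 hex digests, the newline-separated root node, and the names build_dag and read_dag are all assumptions made for illustration. It shows data split into chunks, each chunk stored under its own hash, and a root node that links to the chunks by hash.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow (assumption)


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_dag(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into chunks and link them from a single root node.

    Returns (root_hash, store), where store maps hash -> raw node bytes.
    Hypothetical two-level layout: one root listing the chunk hashes.
    """
    store = {}
    child_hashes = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        chunk_hash = sha256_hex(chunk)
        store[chunk_hash] = chunk  # identical chunks are stored only once
        child_hashes.append(chunk_hash)
    # The root node is just the ordered list of child hashes.
    root_node = "\n".join(child_hashes).encode()
    root_hash = sha256_hex(root_node)
    store[root_hash] = root_node
    return root_hash, store


def read_dag(root_hash: str, store: dict) -> bytes:
    """Reassemble the original data by following the links in the root node."""
    root_node = store[root_hash]
    links = root_node.decode().split("\n") if root_node else []
    return b"".join(store[link] for link in links)


root, blocks = build_dag(b"abcdabcdxyz")
assert read_dag(root, blocks) == b"abcdabcdxyz"
```

Because the root hash is computed over the child hashes, changing any chunk changes the root, and repeated chunks (here "abcd") are stored a single time.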
Aside from being very efficient and easy to distribute, data is verified using the hashes, which are basically just cryptographic signatures allowing for data verification. This enables content addressing – the idea of looking up the content rather than asking a location (IP address) for arbitrary data.
With the content-addressing approach, you get a system where data can be distributed from untrusted computers (nodes), and it becomes possible to create decentralized collections of tons of data, everything from blockchains to video playback to libraries to web sites.
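As a rough illustration of content addressing, the sketch below stores data under the hash of its content and re-checks that hash on retrieval. The UntrustedNode class, the fetch_verified helper, and the choice of SHA-256 are hypothetical and only stand in for the idea described here; they are not IPFS APIs.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Derive an address from the content itself (SHA-256, an assumption)."""
    return hashlib.sha256(data).hexdigest()


class UntrustedNode:
    """A hypothetical storage node that we do not control."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        address = content_address(data)
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        return self._blocks[address]

    def tamper(self, address: str, data: bytes) -> None:
        """Simulate a malicious node swapping the stored content."""
        self._blocks[address] = data


def fetch_verified(node: UntrustedNode, address: str) -> bytes:
    """Fetch by address and reject anything whose hash does not match."""
    data = node.get(address)
    if content_address(data) != address:
        raise ValueError("content does not match its address")
    return data


node = UntrustedNode()
address = node.put(b"hello, permanent web")
assert fetch_verified(node, address) == b"hello, permanent web"
```

The requester never has to trust the node: if the node calls tamper on that address, the next fetch_verified call raises instead of returning the altered data.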
And because it can stream chunks of the data you need from multiple machines, it’s not only more reliable, but it’s also really fast. I’ve seen 4k video streaming from IPFS that I didn’t even know was possible to do without fiber internet. It’s already ridiculously fast, and we haven’t even finished optimizing it yet! It’s a solid signal to me that this idea is going to work really well at scale.
Travis: That actually sounds incredible and I’m already excited about the IPFS.
IPFS has been described as the ‘Permanent Web’. Can you tell us why you feel it is so important to preserve older websites and their data? What is the danger in letting this older content erode and disappear?
Kyle: The recording and persistence of information is the single greatest invention of humanity. It allows our ideas to live forever, even when we don’t.
When I read Socrates, I’m listening to the mind of someone that’s been dead for almost 2500 years. And yet he’s speaking clearly to me, as if he’s still alive. That’s magical. There’s a lot to learn from those who came before us.
What IPFS does is use technology to provide this same capability to the web, so that we don’t lose what’s being written. Right now, the web is in the dark ages. We put up data and it erodes and disappears, whether we want it to or not. That’s not where we should be if we want to be a civilization that thrives on ideas. If Socrates put his thoughts on a web site 15 years ago, chances are they would be gone today.
I think it’s appropriate that we call the current environment a “tech culture”. The vikings had a culture too. But civilization is something different. It’s a sense of permanence. The idea that you create something that’s not just for a couple years, but that’s going to be around for thousands of years. Kenneth Clark (the British art historian) was a big believer in this idea, and when I was reading his works, he described the difference between a culture and a civilization, and it just clicked for me. That’s what IPFS is going to do.
We need to build a tech civilization, or else the web will not have a lasting legacy. IPFS finally makes it possible to use technology to accomplish this.
I like to share this clip from Digital Amnesia, a great documentary that discusses this issue in depth, and I think Jason Scott really hit the nail on the head when it comes to the reasons why we need to preserve our culture, even if a lot of it… isn’t quite Socrates.
Juan: Kyle and I — and Jason Scott, Brewster Kahle, Vint Cerf, and many others — are deeply worried about our collective history being erased, both in the small, daily book burnings that happen when a website moves or is taken offline, and in the potential for large catastrophes that may wipe out entire data centers, and large swaths of information.
Today, we’re at risk of losing all the data that represents our human knowledge, that builds up our understanding of science and technology, that makes up our history and our art, that contains all our personal communications. What if some catastrophe happens, takes out a few datacenters, and then all your email is suddenly vaporized? We could even lose how to make things, how to cure certain diseases, how history actually happened. There have already been several cases of scientific data archives being lost; academic papers whose data can no longer be checked.
The amazing super powers of humanity (things like how certain laws of nature work, how to make drugs that cure horrible diseases, how to build telecommunication networks) all rest on our knowledge, and our knowledge depends on being kept safe. Humanity has already suffered catastrophes like this in the past, such as the burnings of the Library of Alexandria. We’re really not out of the woods on this: it could happen again, and it could happen very suddenly. We don’t like to think about it because we’re tuned for a relatively unchanging environment, and because we like feeling secure in our day-to-day life.
While people like Elon Musk are working really hard to make civilization multiplanetary before some catastrophe wipes us out, we’re working hard to make sure we can back up our knowledge. The first step to this is to establish a really good information distribution protocol that can facilitate the movement of data and ensure its integrity (through cryptographic protocols). We’re at this step, and we will be for a few years. Other steps are identifying and replicating the corpus of all human knowledge, identifying reliable long-term storage media, and working out how to build facilities to store this information.
Travis: It sounds like the IPFS can fill this void in knowledge loss risk by providing a permanent, cryptographic network for all people to use.
Can you explain how the IPFS would make the web much more secure in the way it would replace SSL with a cryptographic, distributed CDN?
Kyle: SSL only encrypts the connection between the user and the server (using a centralized system that is routinely compromised by hackers and governments), but you still have to trust the content on that server. That’s because HTTP has no way to check that the information you’re receiving is what you intended to get. So if you start mirroring your data on another machine you don’t control, that machine could send malicious data without the ability for anyone to verify it. Because of this, you can only safely serve content from your own servers unless you completely trust the other party.
The workaround people use today is to generate a one-way hash for data (usually distributed as a compressed tarball), and then distribute that tarball and publish the hash in a text file so people can verify it. Of course, this doesn’t work very well because an attacker could also change that hash file. And because there’s no standard for cryptographic verification with HTTP, it’s not possible to automate that process in web browsers, so almost nobody does it.
IPFS bakes cryptographic verification into the protocol itself, which makes it so there is a process for determining if data received is valid by default. As a result, it’s possible to have other servers (or in IPFS’ case, nodes) serve that content as well, even if you don’t trust them, because there is a process to determine if the data is what you expected to receive.
This makes possible the idea of community-powered web sites, where users that like a site (or any data) can volunteer to help distribute it. And if the “seed node” goes down, as long as at least one computer on the network has the data, you can still get it. Federating data in IPFS only requires a single command (ipfs pin add HASH). That’s how easy it is to make verifiable copies of data like web sites, and how easy it is to create redundancies in data distribution.
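The redundancy Kyle describes can be sketched as follows. This is an illustrative model only, not how IPFS locates or transfers data: each node is represented as a plain dictionary (or None when offline), and address_of and fetch_from_swarm are hypothetical names. The point is that a requester can try volunteers in turn and accept the first copy whose hash matches, even if the original seed node is gone and some volunteers misbehave.

```python
import hashlib
from typing import Dict, List, Optional


def address_of(data: bytes) -> str:
    """Address content by its SHA-256 digest (an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def fetch_from_swarm(
    address: str, nodes: List[Optional[Dict[str, bytes]]]
) -> Optional[bytes]:
    """Return the first copy that verifies against the address, else None.

    Each node is modelled as a dict of address -> bytes; None means offline.
    """
    for node in nodes:
        if node is None:
            continue  # node is down, try the next volunteer
        data = node.get(address)
        if data is not None and address_of(data) == address:
            return data  # verified, so it does not matter who served it
    return None


site = b"<html>my community-hosted page</html>"
addr = address_of(site)

seed_node = None                    # the original host has gone offline
bad_mirror = {addr: b"<html>spam</html>"}  # serves content that fails the check
good_mirror = {addr: site}          # a volunteer that pinned the data

assert fetch_from_swarm(addr, [seed_node, bad_mirror, good_mirror]) == site
```

As long as one honest copy exists somewhere in the list, the lookup succeeds; if none does, the function returns None rather than unverified data.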
For this reason, I strongly believe IPFS is the future of data distribution, especially when it comes to the web.
Juan: We’re making it very easy to digitally sign the data on the web itself. Right now, there’s no clean way to do this: usually signatures are added on top, and only to pieces of content. We’re embedding cryptographic signatures directly into the protocol itself.
This, coupled with the hash integrity checks that Kyle describes, gives you a distribution system that not only verifies the data hasn’t changed, but gives you a clean way to trace the provenance of the data, to make sure that the developers of applications, creators of media, and authors of pieces of information truly made that information and it is reaching you untampered. As Kyle says, HTTPS (SSL/TLS) is only encrypting and authenticating the communication wire, and not at all authenticating the data itself.
Authentication of data at rest is a big deal. It’s something that we’re pioneering at the protocol level, and which isn’t really part of the web today. Imagine if news organizations, blogs, social networks and other publishing media natively allowed you to digitally sign messages to PROVE that it was truly you who said something, and that the data hasn’t been changed.
Travis: You’ve stated that Neocities is planning on integrating with the IPFS. What are your motivations for doing this, and are there many other sites integrating with it as well?
Kyle: Neocities has already integrated with IPFS! Right now we’re using IPFS for archiving sites (see the latest IPFS hash for Neocities), but in the future we’re going to move towards community-hosted sites on Neocities. We’re just waiting for a few more features to solidify before moving forward on the idea (think months, not years). But long-term, my ambition is to build Neocities completely on top of IPFS, making it the first web host completely built on top of IPFS.
There have already been several sites that have built on top of IPFS, including:
- IPFS.pics – decentralized picture hosting with IPFS
- IPFSbin – Distributed code sharing with IPFS
- Ethereum – integrating with IPFS, along with many Ethereum dApps (such as Persona, by ConsenSys)
- Alexandria – Use IPFS to store data permanently
- Archiving Projects – several projects by community members to archive critical data sets with IPFS
In the future, there are plans to bake this replication functionality directly into the protocol, so that users can agree to seed data from specific nodes. This will be accomplished by using keypair cryptography on IPFS nodes to sign references to IPFS hashes representing all the data on the node. This is currently being called IPNS, which is a way to do mutable addressing on top of immutable data structures. Bitcoin users will be familiar with this – an IPNS hash is basically a pubkeyhash, the same thing Bitcoin uses for spending addresses. IPFS lets you use these “addresses” to retrieve data.
Juan: There are also some interesting integrations in the works with organizations like research labs, ISPs, and even banks. I cannot share details about them, as they’re conservative organizations that are still exploring at this point.
Also, at this point, hundreds of (command-line savvy) individuals use IPFS as a distributed filesystem to distribute and back up various personal files or host their own webpages.
Travis: With the name ‘InterPlanetary File System’, it is clear that the project is not lacking in ambition. Where do you see the IPFS in 10 years and how do you think we will be using it?
Kyle: As I understand it, Juan named it the InterPlanetary File System as an ode to J.C.R. Licklider, who when designing the principles of what would eventually become the Internet had some ideas for how to make the system work with other planets. The hard speed limit of data transfer is the speed of light. This means there is, on average, a 12.5 minute delay in data transfers between the Earth and Mars. IPFS makes it possible to retrieve a lot of data at a time, so that things like Wikipedia will be accessible and up-to-date for colonists on Mars without requiring a 12.5 minute delay for each lookup. It’s not the primary goal of IPFS, but it’s interesting that the technology has usefulness for human ambitions towards planetary colonization.
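The 12.5 minute figure can be checked with a quick calculation. The sketch below assumes an average Earth to Mars distance of roughly 225 million km (the real distance varies widely with orbital position) and treats each lookup as a full round trip, which is also an assumption; the function names are made up for this example.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum, km/s
AVG_EARTH_MARS_KM = 225_000_000     # assumed average distance, km


def one_way_delay_minutes(distance_km: float) -> float:
    """Minimum one-way signal delay over distance_km, in minutes."""
    return distance_km / SPEED_OF_LIGHT_KM_S / 60


def total_wait_minutes(lookups: int, delay_min: float, batched: bool) -> float:
    """Total waiting time for a number of lookups.

    Unbatched: every lookup pays its own round trip.
    Batched: all the data is fetched in a single round trip.
    """
    round_trip = 2 * delay_min
    return round_trip if batched else lookups * round_trip


delay = one_way_delay_minutes(AVG_EARTH_MARS_KM)
print(f"one-way delay: {delay:.1f} minutes")
print(f"100 separate lookups: {total_wait_minutes(100, delay, batched=False):.0f} minutes")
print(f"one bulk transfer:    {total_wait_minutes(100, delay, batched=True):.0f} minutes")
```

Under these assumptions, fetching a large batch of content once and serving it locally replaces hours of cumulative waiting with a single round trip, which is the benefit Kyle describes for something like Wikipedia.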
My personal hope is that in 10 years, the majority of the world’s data is stored and distributed with IPFS. We’ll have browsers that can switch seamlessly between HTTP and IPFS, so that the end user doesn’t even know that IPFS is working under the hood. Using IPFS’ ability to use the current Domain Name System, IPFS lookups can be done the usual way we access sites (example.com). Long term, I would personally like to see something more decentralized used for human-readable lookups, such as Namecoin, but we may need some time to figure out how to do this properly.
I also would like to see things like cryptocurrencies implemented on top of IPFS. It would make the process of building and distributing blockchains much easier, and help to negate some of the problems associated with storing a blockchain. A lot of the debate with Bitcoin right now is on how to make it possible to store the entire blockchain without requiring a datacenter, which has led to a lot of contentious issues like how big blocks should be. IPFS has the potential to solve a lot of these issues by distributing full blockchains across multiple IPFS nodes, enabling easy, verifiable lookups of parts of the blockchain. This, combined with faster internet access and cheaper storage, allows for massive blockchain sizes in the future, powered by millions of IPFS nodes that will have more storage capacity than even the biggest datacenters in the world could ever hope to achieve.
Learn more about the IPFS in this recent talk given by Juan Benet.