We are building essential infrastructure meant to ensure three main properties for the source code we collect:
We base our infrastructure on three main pillars that provide a solid foundation.
Long-term preservation efforts cannot be based on black boxes that hide the process behind closed source. We are long-time Free/Open Source Software developers and advocates; our code and specifications will be open.
We are designing a complex software architecture. Its design and specifications will be made public.
All the code developed for Software Heritage will be released under a Free and Open Source Software (FOSS) license.
We will adopt an open development process, and strive to create a development community around all components of the Software Heritage infrastructure.
Each software component is assigned a unique identifier that is intrinsically bound to it. It does not rely on third parties, so it is truly persistent, and everybody can build on it.
Every software artifact receives a unique identifier. This unique reference can be used in textbooks, documentation, build instructions and many other places to build a consistent web of knowledge.
Software Heritage uses intrinsic identifiers, which can be computed directly from a software artifact. There is no need to rely on a third party to know whether a given identifier corresponds to a given artifact.
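As a minimal sketch of what "computed directly from a software artifact" can mean, the Python snippet below derives an identifier from a file's bytes using a Git-style blob hash (SHA-1 over a short header followed by the content). The function names `intrinsic_id` and `matches` are hypothetical, and the choice of hash is an assumption made for illustration, not a statement of the exact scheme used in the archive.

```python
import hashlib


def intrinsic_id(content: bytes) -> str:
    """Return a Git-style blob hash of `content` (assumed scheme, for illustration)."""
    # Git hashes the header "blob <length>\0" followed by the raw bytes.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def matches(identifier: str, content: bytes) -> bool:
    """Check locally, with no third party involved, that `identifier` names `content`."""
    return intrinsic_id(content) == identifier


if __name__ == "__main__":
    artifact = b"hello world\n"
    ident = intrinsic_id(artifact)
    print(ident)
    print(matches(ident, artifact))            # same bytes, same identifier
    print(matches(ident, b"hello world!\n"))   # any change yields a different identifier
```

Because the identifier depends only on the bytes of the artifact, anyone holding a copy can recompute it and confirm that a reference found in documentation or build instructions points to exactly that artifact.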
“Let us save what remains: not by vaults and locks which fence them from the public eye and use in consigning them to the waste of time, but by such a multiplication of copies, as shall place them beyond the reach of accident.” — Thomas Jefferson
We are planning a distributed infrastructure that will make it possible to replicate all contents across a large set of peer nodes.
This is essential to prevent information loss, and will greatly simplify sharing.
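The idea of protecting content by multiplying copies can be illustrated with a small Python sketch. It is a toy model under stated assumptions: `PeerNode`, `replicate` and `fetch` are hypothetical names, peers are in-memory dictionaries, and SHA-256 is chosen here only as an example of a content hash. It does not describe the actual design of the planned infrastructure.

```python
import hashlib


class PeerNode:
    """Hypothetical peer: an in-memory store keyed by content hash."""

    def __init__(self, name: str):
        self.name = name
        self.objects = {}

    def put(self, content: bytes) -> str:
        key = hashlib.sha256(content).hexdigest()
        self.objects[key] = content
        return key

    def get(self, key: str):
        return self.objects.get(key)


def replicate(content: bytes, peers) -> str:
    """Store a copy of `content` on every peer and return its key."""
    if not peers:
        raise ValueError("at least one peer is required")
    keys = {peer.put(content) for peer in peers}
    # Every peer derives the same key from the same bytes.
    return keys.pop()


def fetch(key: str, peers):
    """Return the first copy whose hash still matches `key`, or None."""
    for peer in peers:
        data = peer.get(key)
        if data is not None and hashlib.sha256(data).hexdigest() == key:
            return data
    return None


if __name__ == "__main__":
    peers = [PeerNode("a"), PeerNode("b"), PeerNode("c")]
    key = replicate(b"int main(void) { return 0; }\n", peers)
    peers[0].objects.clear()               # one peer loses its data
    peers[1].objects[key] = b"corrupted"   # another holds a damaged copy
    print(fetch(key, peers) is not None)   # the third copy is still intact
```

In this model a single intact copy on any peer is enough to recover the content, and the hash check lets a reader reject damaged copies without trusting the peer that served them.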
We will actively seek to grow a multistakeholder network of peers.
New partners will be able to join our efforts easily along the way, thanks to our open source code and our open specifications.