Internet Archive is in danger

Moorshou@lemmy.zip · edit-2 3 months ago

Internet Archive is in danger

parpol@programming.dev · 3 months ago

deleted by creator

wizardbeard@lemmy.dbzer0.com · 3 months ago

This is the worst kind of misrepresentation of tech. Nothing you said is explicitly false, and it sounds true in passing, but it sure is effectively false.

The amount of data you can actually store in any single node/transaction on a given blockchain is traditionally very small. Even most NFTs are not truly “on the chain”, as in the image data being fully stored in a node/element; instead, it’s a “smart contract” which just says X identity owns Y (with Y itself being stored elsewhere). There have been many, many attempts at actually storing data on various chains, and there haven’t been any successes significant enough to come even close to being able to store the classic 90’s Space Jam website, let alone the fucking Internet Archive.

Beyond that, you absolutely can take down nodes in a chain, so to speak. Numerous major “heists” have been “rolled back” or had their nodes/transactions flagged to be ignored by marketplace admins.

makeasnek@lemmy.ml · edit-2 3 months ago

You’re right about NFTs. There’s no reason to store the data on chain. Chain stores the metadata and pointers to the files (IPFS, torrent/magnet link whatever). Chain administers how many copies etc should exist and enforces those rules. Filecoin etc have already successfully done this.
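To make the "metadata and pointers on chain, files elsewhere" idea concrete, here is a minimal sketch. It is not how Filecoin or any real chain does it; `PointerRecord`, `make_record`, `verify`, the `example://` location and the in-memory `off_chain_store` are all hypothetical names made up for illustration. The only point is that the small record carries a digest and a location, while the bytes live somewhere else and can be checked against the digest.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PointerRecord:
    """Small record that would live on chain: metadata only, no file bytes."""
    name: str
    sha256_hex: str  # digest of the off-chain content
    location: str    # hypothetical pointer (stand-in for an IPFS path or magnet link)
    size_bytes: int


def make_record(name: str, data: bytes, location: str) -> PointerRecord:
    """Build the metadata record for content stored at `location`."""
    return PointerRecord(name, hashlib.sha256(data).hexdigest(), location, len(data))


def verify(record: PointerRecord, data: bytes) -> bool:
    """Check that fetched bytes match the size and digest in the record."""
    return (
        len(data) == record.size_bytes
        and hashlib.sha256(data).hexdigest() == record.sha256_hex
    )


# Stand-in for off-chain storage (a dict instead of a real network).
off_chain_store = {}
page = b"<html>Space Jam</html>"
location = "example://off-chain/space-jam"  # placeholder, not a real URI scheme
off_chain_store[location] = page

record = make_record("space-jam.html", page, location)
print(verify(record, off_chain_store[record.location]))  # genuine copy
print(verify(record, b"tampered copy"))                  # altered copy
```

The record is a few hundred bytes no matter how big the file is, which is the whole appeal of keeping only pointers on chain.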

parpol@programming.dev · edit-2 3 months ago

deleted by creator

smileyhead@discuss.tchncs.de · edit-2 3 months ago

And guess what? You don’t need blockchain for that.

Torrents exist, and IPFS dropped its dependency on blockchain. What blockchain does is this: even if your literal neighbor has the file you want, you must first connect to the global, super inefficient network to sync your chain. And if you have censored Internet? Well…

Layer torrents on top of Yggdrasil on top of I2P and you’ll get a faster, more decentralized and more resilient network than any blockchain has ever managed. Such a network would continue to work from friend to friend even if your whole town got cut off from the global net.

parpol@programming.dev · 3 months ago

deleted by creator

makeasnek@lemmy.ml · edit-2 3 months ago

Downvote this guy all you want, but this is an incredibly true point. For 15 years, Bitcoin has maintained a distributed, uncensorable ledger. The question is, can we use similar ledger tech to store archive.org? Wikipedia is a single point of failure, and so is Archive.org. So is the Library of Congress. We could easily store all the text content of Wikipedia on chain (that’s under 100GB), along with IPFS pointers to media content. Long-term, humanity needs a resilient censorship-resistant system to store our collective knowledge and history. These systems, when sufficiently large, are uncensorable and incredibly difficult to exercise undue influence against or shut down. Ask anybody who’s tried to get a judge to enforce a judgement against the Bitcoin blockchain lol. And they can survive major disruptive events like wars, natural disasters, and even widespread network disruptions quite well. Blockchain can also solve the spam problem that plagued early P2P systems like Gnutella/Ed2k/etc. Everybody moved to BitTorrent because we could trust custodians (trackers and indexers) to curate lists of valid torrents. But that can be decentralized now.

There are over a dozen different blockchain projects working on the “file storage problem”, and some of them have very interesting proposals. At least one of those is going to emerge from the smoke with something that will replicate archive.org’s current role, but it might be a few years before that happens. Already, we have blockchains offering a “decentralized file storage marketplace” that competes pretty well with current file storage providers (AWS etc), and some of them have been running for years.

Programmer Belch@lemmy.dbzer0.com · 3 months ago

You don’t need blockchain to accomplish what the Internet Archive does, just a network of computers that share a part of their disk space with the other computers. This is just a torrent network at the end of the day.

parpol@programming.dev · 3 months ago

deleted by creator

A1kmm@lemmy.amxl.com · 3 months ago

Blockchain is great for when you need global consensus on the ordering of events (e.g. Alice gave all her 5 ETH to Bob first, so a later transaction to give 5 ETH to Charlie is invalid). It is an unnecessarily expensive solution just for archival, since it necessitates storing the data on every node forever.
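A toy sketch of why the ordering matters in the Alice/Bob/Charlie example. This is not real Ethereum logic; `apply_in_order` and the balance dict are hypothetical names for illustration. The same two transfers produce different outcomes depending on which one is ordered first, which is exactly what the network has to agree on.

```python
from typing import Dict, List, Tuple

Transfer = Tuple[str, str, int]  # (sender, receiver, amount)


def apply_in_order(
    balances: Dict[str, int], transfers: List[Transfer]
) -> Tuple[Dict[str, int], List[Transfer]]:
    """Apply transfers in the given order; reject any the sender cannot cover."""
    state = dict(balances)  # copy so the caller's starting balances are untouched
    rejected: List[Transfer] = []
    for sender, receiver, amount in transfers:
        if amount <= 0 or state.get(sender, 0) < amount:
            rejected.append((sender, receiver, amount))
            continue
        state[sender] -= amount
        state[receiver] = state.get(receiver, 0) + amount
    return state, rejected


start = {"Alice": 5}
to_bob = ("Alice", "Bob", 5)
to_charlie = ("Alice", "Charlie", 5)

# Same two transfers, two different orderings.
final_a, rejected_a = apply_in_order(start, [to_bob, to_charlie])
final_b, rejected_b = apply_in_order(start, [to_charlie, to_bob])

print(final_a, rejected_a)  # Bob is paid, the transfer to Charlie is rejected
print(final_b, rejected_b)  # Charlie is paid, the transfer to Bob is rejected
```

An archive has no equivalent conflict: two people adding two different files do not invalidate each other, so nothing forces every node to agree on a single global order.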

Ethereum charges ‘gas’ fees per transaction which helps ensure it doesn’t collapse under the weight of excess usage. Blocks have transaction limits, and transactions have size limits. It is currently working out at about US$7,500 per MB of block data (which is stored forever, and replicated to every node in the network). The Internet Archive have apparently ~50 PB of data, which would cost US$371 trillion to put onto Ethereum (in practice, attempting this would push up the price of ETH further, and if they succeeded, most nodes would not be able to keep up with the network). Really, this is just telling us that blockchain is not appropriate for that use case, and the designers of real world blockchains have created mechanisms to make it financially unviable to attempt at that scale, because it would effectively destroy the ability to operate nodes.
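For anyone who wants to check the order of magnitude, here is the back-of-the-envelope arithmetic using the rounded figures above. The decimal conversion (1 PB = 10^9 MB) is an assumption, and the constant names are just for illustration. With these rounded inputs the result is about US$375 trillion, the same ballpark as the US$371 trillion quoted, which presumably came from slightly less rounded inputs.

```python
# Rounded inputs taken from the comment above.
COST_PER_MB_USD = 7_500  # approximate cost of 1 MB of block data
ARCHIVE_SIZE_PB = 50     # approximate size of the Internet Archive

# Assumption: decimal units, so 1 PB = 10**9 MB.
MB_PER_PB = 10**9

total_mb = ARCHIVE_SIZE_PB * MB_PER_PB
total_cost_usd = total_mb * COST_PER_MB_USD
total_cost_trillions = total_cost_usd / 10**12

print(f"~US${total_cost_trillions:,.0f} trillion")  # ~US$375 trillion
```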

The only real reason to use an existing blockchain anyway would be on the theory that you could argue it is too big to fail due to legitimate business use cases, and that it is too hard to remove censorship-resistant data. However, if it came to be used mostly for censorship-resistant data sharing, with transactions in the minority, I doubt that this would stop authorities going after node operators and so on.

The real problems that an archival project faces are:

1. The cost of storing and retrieving large amounts of data. That could be decentralised using a solution where not all data is stored on a chain - for example, IPFS.
2. The problem of curating data and deciding what is worth archiving, and what is a true-to-source archive vs fake copy. This probably requires either a centralised trusted party, or maybe a voting system.
3. The problem of censorship. Anonymity and opaqueness about what is on a particular node can help - but they might in some cases undermine the other goals of archival.

parpol@programming.dev · 3 months ago

deleted by creator

LiveLM@lemmy.zip · 3 months ago

???
Taking down PirateBay didn’t kill the torrents it hosted

parpol@programming.dev · 3 months ago

deleted by creator

parpol@programming.dev · 3 months ago

deleted by creator

Programmer Belch@lemmy.dbzer0.com · 3 months ago

Well then, just use an anonymous service to distribute magnet links (i2p, tor, blockchain)

parpol@programming.dev · 3 months ago

deleted by creator

Programmer Belch@lemmy.dbzer0.com · 3 months ago

I agree, but I find blockchain technology too costly hardware-wise; a simple anonymizing network may be enough