The tagline for /r/datahoarder reads: “It’s A Digital Disease!”. I agree.

Why can’t I hold all these hard drives?

At some point I realized that the pursuit of hoarding all the things would just keep consuming more and more of my time and money. Storage is cheap up to a point. Once you find yourself tracking hard drive prices via camelcamelcamel and getting excited that a 12 TB external hard drive is at its lowest price, you may start suspecting that you have a hoarding issue.

I don’t have anything against what people at /r/datahoarder and other archival efforts do. Data archival done by volunteers is what allows us to preserve history and cultural artifacts of our time (YouTube videos, memes etc.). I just had to find a solution to my situation before it got too bad.

Something is better than nothing

Once you find yourself with terabytes of data and decide to scale down, you will need to make some changes. Most of the time this results in deleting a lot of data that you don’t care about that much. This works, but there are some files that will be difficult to part with.

In my experience, most of the storage is taken up by media files, such as my archive of select YouTube channels. I would still like to have a copy of some videos, but if they are downloaded in 1080p or 4K resolutions, they will take up a lot of space.

That’s when I decided to change my approach to data like that. I would still like to have a copy, but it doesn’t necessarily have to be full quality all the time. When you rewatch YouTube videos that you archived from 2006 to the 2010s, you’ll notice that the video quality is bad. That doesn’t mean that you enjoy the videos any less. Watching these old videos is just an act of going through your memories and feeling nostalgic. The medium itself is just the spark needed to revive the memories.

With that, I decided to start following this simple rule: something is better than nothing.

I don’t need 4K versions of all the channels I follow. Setting the format selector to bestvideo[height<=1080] in the yt-dlp configuration limits the resolution of every video you download to 1080p, which is still good enough.
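
If you drive yt-dlp from Python instead of a config file, the same cap could look roughly like the sketch below. The names capped_format, YDL_OPTS and download are made up for illustration, and the output template is just one possible layout. The "+bestaudio" part and the "best[...]" fallback are my own additions to the selector above, on the assumption that you want sound merged in and a single-file fallback when separate streams are not available.

```python
def capped_format(max_height: int) -> str:
    """Build a yt-dlp format selector that caps video height (hypothetical helper)."""
    # Prefer separate video and audio streams under the cap,
    # otherwise fall back to a single combined file under the same cap.
    return f"bestvideo[height<={max_height}]+bestaudio/best[height<={max_height}]"


# Options in the shape accepted by yt_dlp.YoutubeDL; the output template is an assumption.
YDL_OPTS = {
    "format": capped_format(1080),
    "outtmpl": "%(uploader)s/%(title)s [%(id)s].%(ext)s",
}


def download(urls):
    """Download the given video or channel URLs with the capped format."""
    # Imported here so the helper above can be used without yt-dlp installed.
    from yt_dlp import YoutubeDL

    with YoutubeDL(YDL_OPTS) as ydl:
        ydl.download(urls)
```

Lowering the cap for a high-volume channel is then a one-line change, for example capped_format(480).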

Data, value, and you

If you’re someone like me who watches YouTube frequently, then you probably have some channels that you care about a lot.

With channels that I truly care about, I still keep the highest quality copies of all their videos on my server. This has a real cost in terms of storage, but the value I have got out of the videos is much greater than that.

There are some exceptions, though. LinusTechTips is a great channel and has had a huge influence on my life. However, they upload almost daily and do long streams from time to time. Archiving the whole channel is just not feasible at my scale. Even when limiting the resolution to 480p, the full channel archive takes up over 300 GB of space in 2022. Instead, I’ve opted to pick out the videos that are the most memorable, such as the petabyte project, and archive those individually.

There’s also a certain type of data that isn’t that common: data created by you and your friends/family. Extra care and focus should be put on this type of data, because it’s very unlikely that someone else has a copy of the gigabytes’ worth of cat pictures you’ve taken over the years. Make plenty of backups and don’t worry too much about hoarding it for now, as long as it’s not a big burden for you.

Constraints

Constraints force you to get creative. They also help with toning down your data hoarding habit.

Set physical limits on the data you can store. For example: limit yourself to running only two hard drives at a time. The only way to get more storage is then to upgrade vertically to bigger hard drives, which is naturally throttled by the relatively slow increases in hard drive sizes.

For reference, my current setup has some pretty harsh limits when it comes to storage. The only way to get more is to spend a lot of money on bigger SSDs.

The Internet Archive

If you don’t want to go through all the hassle, but still want to contribute to archival efforts, then you can donate to the Internet Archive instead. Buying and running all those hard drives isn’t free, you know.

And if you’re planning on uploading data to the Internet Archive, then please do not abuse it. Only upload data that you feel is worth preserving.
