How to start your self-hosting adventure: a high-level overview
Reddit is a great starting point for getting new ideas for your homelab: racks full of machines in /r/homelab, storage measured in terabytes (or even petabytes) over at /r/datahoarder, all the different services that people host over at /r/selfhosted. This can be a bit overwhelming for someone just starting out in this area, which is why I decided to write a small guide on how to start off on your self-hosting adventure.
Why self-host in the first place?
Hobbyists and professionals have a lot of different reasons for self-hosting services. These include (but are not limited to):
- it’s an excuse to play around with computer hardware and rationalize the purchase of thousands of dollars’ worth of equipment
- reduce reliance on “big tech”
- the services solve an actual problem for the user (home automation, secure file storage, backups etc.)
- learning new tech and solutions that otherwise have not come up during their day-to-day work
Regardless of the specifics, the main point of the self-hosting experience is to have fun and work on challenges that you find interesting.
Step 0: find a problem to solve
One of the best motivators for setting up your homelab/self-hosting setup is to solve a real problem that you have.
Do you want to automate your home with HomeAssistant?
Or do you want your family to be able to back up their valuable photos and videos to your server using Nextcloud?
Or do you want to share your media collection with your family and friends using Jellyfin?
It doesn’t really matter what solution you end up using or how you do it, as long as the solution is secure and solves your original problem without creating ten additional ones.
Step 1: find a place for running your software
Once you have a problem that you are going to solve, you will need to also find a home for your project to live in. For this, you have a lot of different options:
- using a virtual machine at a cloud service provider (DigitalOcean, Linode etc.). Performance is not the best and beefier configurations can be quite expensive, but this is a reliable option with a fast network.
- renting a server from a service provider (example: Hetzner). More performant than a virtual machine at most cloud service providers, but probably more expensive as well. Usually comes with a fast network.
- an old laptop that you have collecting dust on your shelf. Not the fastest or the quietest option, but it does not use a lot of power.
- an old desktop PC. Probably plenty fast for any task that you can throw at it, but its power usage is higher compared to laptops.
- a tiny desktop PC. Low power usage, but can pack quite a punch. For inspiration, check out Project TinyMiniMicro by ServeTheHome.
- a single-board computer, such as a Raspberry Pi or one of its alternatives. Enough performance for most lighter workloads, uses very little power (5-15 W generally) and is silent as well.
- a second-hand rack-mounted server. Probably quite performant, but uses a lot of power and is very noisy.
- a NAS box from one of the more popular providers (Synology, QNAP, TerraMaster etc.). Useful for when you need a lot of storage on your machine or something that just works.
Your choice will likely be affected by other factors as well:
- your home internet connection. If your ISP does not allow opening ports or if your upload/download speeds are not that great, then using a cloud service provider or renting a server might be a better option for you.
- living in a small apartment will likely rule out noisy solutions, such as full-blown rack-mounted servers or powerful desktop PCs.
- if all you want to do is local testing, then a virtual machine on your already existing desktop/laptop PC might be a better fit, as it does not require you to buy any new/used hardware.
- reliability and price of electricity. In some countries, running a server in your home might result in a huge power bill or might not be possible at all due to the power grid being unreliable.
The workloads that you plan on running will also affect your chosen approach. You will need to take into account the resources that your workload requires. It does not make much sense to get a rack-mount server for hosting your WordPress blog if a Raspberry Pi can do the same job just as well. On the other hand, running a CPU-heavy service like Jellyfin on a Raspberry Pi will be a painful experience compared to running it on a desktop PC or an enterprise-grade server. If your service will be used by a large number of users, then you might want to opt for a more powerful machine.
Some solutions may also benefit from faster, SSD-based storage. Loading a lot of small files or running a big database off of hard drives will be noticeably slower than doing so off of fast SSDs.
There is no one size fits all solution. It’s better to do a little bit of initial research and use the right tool for the job. If in doubt, ask for help from someone more experienced.
Step 2: find the software solution that fits your use case the best
Just like with hardware, there is also a lot of choice when it comes to software.
Some solutions are pretty low-level and require setting up the OS and the services manually. This is not very user-friendly, but it offers the most potential for learning about your system and how to manage it. This approach will likely require you to learn new things and pay more attention to automating common tasks (updates, monitoring, backups) and making sure that everything is secure (no weak passwords, services properly isolated, regular and frequent updates etc.). If something goes wrong, then you will need to fix it yourself, but at least you have all the tools and knowledge to do so.
Example: Debian with ZFS for your main storage and services running in Docker or Podman containers (my personal preference).
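To make the container route a bit more concrete: a single service is often described with a small docker-compose.yml. The sketch below is illustrative only, as the image, the host port and the volume path (pointing at a hypothetical ZFS dataset) are assumptions; check the image’s own documentation for the volumes and variables it actually expects.

```yaml
# Illustrative only: image, port and volume path are placeholders.
services:
  nextcloud:
    image: nextcloud          # official image from Docker Hub
    restart: unless-stopped   # come back up after reboots
    ports:
      - "8080:80"             # host port 8080 -> container port 80
    volumes:
      - /tank/nextcloud:/var/www/html   # e.g. a dataset on your ZFS pool
```

Bringing it up is then a matter of `docker compose up -d`; Podman offers a compatible compose workflow as well.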
There are also solutions that cater to people who just want to get work done. These solutions usually provide the user with a web-based user interface that can manage the system and any services running on it. Do you want to set up your very own WordPress blog? Just click this button and you will soon have one, without having to even know what the hell a “Docker” is. The downside of this approach is that when things do go wrong, you will still need to familiarize yourself with the inner workings of the system. Troubleshooting issues with no preparation and learning new things under stressful circumstances is not very fun.
Cloud service providers have also started offering one-click solutions for setting up the more popular services: your very own Minecraft server is just a click away!
Examples of this user-friendly approach:
- a NAS box from Synology that is aimed at consumers
- a VM provided and set up by Linode using a one-click option in their web UI
And then there are solutions that fit in somewhere in-between. Solutions like TrueNAS (formerly FreeNAS) and OpenMediaVault are made for users that can probably set things up themselves as well, but are just too lazy to do it. Using these solutions is not in any way worse than the alternatives. If it solves a problem and fits your use case, then go ahead and use it! Just keep in mind that you might need to peek under the hood when there are issues where the GUI cannot help you out.
As always, if in doubt, ask for assistance. It is perfectly fine for you to pick one of the user-friendlier options first and dig deeper once you feel more comfortable. Self-hosting is not a race, you can do everything at your own pace.
Step 3: managing your setup
Once you have everything up and running, there are some topics that you really should pay attention to in order to avoid big problems down the line.
If the only copy of your files lives on your NAS and something happens to them (accidental deletion, hard drives suddenly dying, getting hit by ransomware etc.), then you are in a world of hurt. Having backups is a mandatory part of your self-hosting adventure. A mix of automated and manual backups is better than having no backups at all.
Some ideas for backups:
- an external hard drive attached to your server that receives automated backups on a schedule. Just make sure that any failures during the backup are communicated to you (over e-mail, for example). It’s also a good idea to occasionally check whether the last backup was successful. The downside of this approach is that since the drive is still connected to the server, it is subject to the same events that can shred your data (ransomware, power surges etc.).
- offline backups that you run manually on a regular schedule. Example: copying all files from the server to an external hard drive on the 1st day of each month. If you accidentally delete all the files on your server, a piece of malware does it for you, or your server is toast due to a power surge, you will still have a copy of the data that cannot be affected by those kinds of issues.
- snapshots. While technically not backups, snapshots do help against accidental deletions. Depending on your setup, you might be able to retrieve files deleted a week ago if you keep snapshots of your data for the last 30 days.
- a hard drive at your friend’s house. Helps protect against the worst-case scenario where your house burns down.
- (encrypted) backups to cloud storage providers, such as Backblaze, OneDrive or Google Drive. This requires a monthly subscription, but at least you will likely have a copy of your files somewhere. Just keep in mind that if you get locked out of the service or fail to pay for it, then you might not be able to access the data hosted there.
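The first idea can be sketched as a small script that a scheduler (cron, a systemd timer) runs for you. The function below is a minimal example using only Python’s standard library, with placeholder paths and naming; a real version should also verify the result and notify you on failure.

```python
import datetime
import pathlib
import tarfile

def back_up(source_dir: str, dest_dir: str) -> pathlib.Path:
    """Create a dated .tar.gz archive of source_dir inside dest_dir."""
    source = pathlib.Path(source_dir)
    dest = pathlib.Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = datetime.date.today().isoformat()
    archive = dest / f"backup-{source.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store files under the source directory's name inside the archive.
        tar.add(source, arcname=source.name)
    return archive
```

Pointing `dest_dir` at the external drive’s mount point and wiring the script’s exit status into an e-mail notification covers the “tell me when it fails” part.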
If you are exposing a service to the wild wild west that is the internet, then you will be a target. Automated scanners will poke your services and if they find a vulnerability, they will use it. To prevent the worst, consider the following:
- keep everything up to date. Don’t forget that for some updates to apply (example: kernel updates), you need to restart the machine; other services (example: ssh) can be restarted without rebooting the whole machine. Regular updates will help prevent most security issues.
- only expose services that you absolutely need to expose. With something like a web server, you really don’t have a lot of choice here, but with services aimed at a known set of users, you can limit the exposed surface area by using a VPN or enforcing other restrictions (whitelisted IP addresses, fail2ban to block most intrusion attempts, etc.).
- make sure that your configuration is correct. A lot of intrusions are caused by configuration issues that end with a service being exposed to the web without any passwords or other protections set up. It’s a good idea to check whether you are accidentally exposing too much by looking at what others see when they try to connect to your service.
- set up your services so that you are prepared for an intrusion. If an attacker is able to take control of service X, but it is completely isolated from other services and devices on the network, then the damage they can do is likely limited.
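For the configuration point, it helps to look at your host the way an outside scanner would. Below is a naive sketch of a TCP connect probe, where the host name and port list are placeholders; a dedicated tool like nmap will give you a far more complete picture.

```python
import socket

def open_ports(host: str, ports: list[int], timeout: float = 1.0) -> list[int]:
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    reachable = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                reachable.append(port)
        except OSError:
            pass  # closed, filtered, or timed out
    return reachable

# Probe a few common service ports; run this from OUTSIDE your network
# (e.g. a cheap VPS) against your public address to see what is exposed.
print(open_ports("127.0.0.1", [22, 80, 443, 8080]))
```

Anything in the output that you did not consciously decide to expose deserves a closer look.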
Your setup will likely have issues. Hard drives might die, or the CPU might be overheating due to years of dust accumulation, or one service might be using up all the available resources, causing others to slow down as well. A better understanding of what your system is doing will help out a lot here.
Cloud hosting service providers usually provide at least a basic level overview of your resource usage: CPU, RAM, disk, network etc. If you are running your own machine, then you will need to figure this out on your own.
To understand what your system is doing, a tool like Netdata is a good starting point. If your server slows down, then you can most likely pinpoint the reason by looking at the resource indicators.
It is not realistic for you to keep an eye on all the indicators all the time, so having some kind of alerting set up is also a great idea. Events such as a hard drive starting to throw errors, CPU usage sitting at 100% for more than an hour, or a service not being reachable are all things that you might be interested in hearing about.
You don’t have to monitor everything and have alerts set up to catch everything, but it is a great starting point for responding to issues proactively. Ending up with all your data lost because you did not hear about hard drive failures is not a great situation to be in.
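As a toy example of what a homegrown alert could look like, the check below flags file systems that are nearly full. The threshold and the list of paths are assumptions, and a tool like Netdata ships with far more complete alerts out of the box.

```python
import shutil

def nearly_full(paths: list[str], threshold: float = 0.9) -> list[str]:
    """Return the paths whose file system is more than `threshold` full."""
    alerts = []
    for path in paths:
        usage = shutil.disk_usage(path)
        if usage.used / usage.total > threshold:
            alerts.append(path)
    return alerts

# A real setup would run this on a schedule and e-mail or ping you.
for path in nearly_full(["/"]):
    print(f"warning: {path} is almost out of space")
```

The same pattern (measure, compare against a threshold, notify) applies to CPU load, SMART errors and service reachability checks.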
If you have reached this point, then you are probably aware of most of the topics that will come up during your self-hosting adventure. You will encounter issues that will be specific to the approach that you have chosen at the hardware and software steps, but that’s perfectly normal. Most of these issues have solutions that are one search away, and if you have trouble figuring something out, then feel free to ask for help!
If you need new ideas, then check out the subreddits mentioned at the start of this post; you might find some gems in there! Do keep in mind that you will likely end up in an endless loop of building something, being happy with it for a week and finding something else to try out. It’s really fun, though.
If you prefer to share your thoughts on this post privately, just send me an e-mail!