For over two years now, I've been working to wrest control of my online life back from tech giants. Getting a NAS kicked off a never-ending self-hosting project that has taught me a great deal about IT and infrastructure. And I've grown increasingly confident in taking responsibility for my own online self.
While I still rely on third parties like Proton, a considerable part of my digital life is now fully self-managed. Photos, documents, bookmarks, identity: there are amazing open-source solutions for pretty much everything. Aside from the obvious privacy benefits, it's also quite liberating. If you don't like the way your email or photo archive works, just get a new one!
To some (including my ex-girlfriend¹) this endless tinkering sounds exhausting. And I agree that it can be a bit stressful to worry about your own upgrades and backups. Nonetheless, it pays off to have a full picture of where your data lives and which services you use. And I've learned a great deal about DevOps principles that come in handy in my work.
With the constant changes to my stack, it's become increasingly important to manage everything in a structured manner. That's why my goal is to implement everything as Infrastructure-as-Code (IaC). I heavily rely on two tools here: Terraform to provision resources and PyInfra to configure them. Beyond that, I use GitHub Actions for my CI/CD.
Previously, I wrote about how I use Tailscale on a daily basis. Since then, I've only become a bigger fan. In addition to my own hardware, I now use a VPS as a reverse proxy to access my photos and drive over the public internet. The term "reverse proxy" confused me a bit when I started out. In short, it's a server that sits inside my VPN but is also reachable by visitors who aren't connected to it.
This setup allows me to share documents and albums with friends, family, and colleagues. Tailscale lets the reverse proxy connect directly to the necessary services, and only to those services.
Whenever I add a new service to my stack, I can generate an Access Control List (ACL) that controls the traffic between my NAS and reverse proxy. With Terraform, I can track changes to this file and automatically roll them out when needed.
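For illustration, here's a trimmed-down sketch of what that could look like with the tailscale/tailscale Terraform provider; the tags and port are placeholders, not my actual policy:

```hcl
# Sketch: a Tailscale ACL managed as a Terraform resource.
# "tag:proxy" and "tag:nas" are hypothetical tags for the VPS and the NAS.
resource "tailscale_acl" "policy" {
  acl = jsonencode({
    acls = [
      {
        # Let the reverse proxy reach the photo service on the NAS,
        # and nothing else.
        action = "accept"
        src    = ["tag:proxy"]
        dst    = ["tag:nas:2283"]
      }
    ]
  })
}
```

A `terraform apply` then pushes the updated policy to Tailscale, so the file in the repository stays the source of truth.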
Of course, the reverse proxy itself is also managed with Terraform. I previously hosted it on DigitalOcean, but recently switched to Hetzner (cheaper and EU-based). Swapping out a few Terraform resources was a breeze, and within an hour I had a new VPS up and running. That hour included signing up for a Hetzner account.
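Swapping providers mostly meant replacing a DigitalOcean droplet resource with its Hetzner equivalent. A minimal sketch using the hetznercloud/hcloud provider (name, plan, and location are placeholders):

```hcl
# Sketch: the reverse proxy VPS on Hetzner Cloud.
resource "hcloud_server" "reverse_proxy" {
  name        = "reverse-proxy"
  server_type = "cx22"      # a small shared-vCPU plan is plenty for a proxy
  image       = "debian-12"
  location    = "nbg1"      # Nuremberg, keeping things EU-based
}
```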
Of course, a VPS by itself doesn't do much: I still needed to configure it to serve as my reverse proxy. On DigitalOcean, I had configured Caddy to fill this role and route all requests to the right service on my NAS. Porting this over took only a few minutes, thanks to PyInfra.
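Conceptually, the Caddy configuration just maps each public hostname to a service on my NAS over the tailnet. A minimal sketch (the domains, hostname, and ports are made up):

```
photos.example.com {
    reverse_proxy nas:2283
}

drive.example.com {
    reverse_proxy nas:5000
}
```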
PyInfra basically lets you define a series of commands to execute on a machine. While Terraform is declarative ("I want these components to exist"), PyInfra is imperative ("Take these steps to achieve this configuration"). Because I'd captured the commands to set up my reverse proxy in a few PyInfra scripts, I could simply run those scripts on my new VPS and get the exact same configuration.
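To give an idea of what such a script looks like, here's a minimal sketch; the file names are hypothetical, and my actual scripts do a bit more:

```python
# deploy_proxy.py: a sketch of a PyInfra deploy for the reverse proxy.
# Run with: pyinfra inventory.py deploy_proxy.py
from pyinfra.operations import apt, files, systemd

# Install Caddy from the distribution's repositories.
apt.packages(
    name="Install Caddy",
    packages=["caddy"],
    update=True,
)

# Upload the Caddyfile that routes requests to services on the NAS.
files.put(
    name="Upload Caddyfile",
    src="files/Caddyfile",
    dest="/etc/caddy/Caddyfile",
)

# Restart Caddy so it picks up the new configuration.
systemd.service(
    name="Restart Caddy",
    service="caddy",
    running=True,
    enabled=True,
    restarted=True,
)
```

PyInfra only applies the changes it detects are needed, so running the same script against a freshly provisioned VPS converges it to the same configuration as the old one.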
Between Terraform and PyInfra, I can fully configure access to my self-hosted services. However, I'm prone to making errors when I run a bunch of commands. Recently I took the time to set up a true pipeline (or workflow) with GitHub Actions. Now, I'm finally at a point where I'm happy with the entire setup.
Making changes broadly looks like this: I commit a change to my infra repository, the pipeline picks it up, and the updated configuration is rolled out automatically.
Of course, the pipeline itself required a bit of setup. You can find the configuration itself here, but it broadly does the following: check out the infra repository on a runner², apply the Terraform resources, and run the PyInfra scripts against the VPS.
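As a rough sketch, such a workflow could look like the following; the secret names and file layout are assumptions, not my actual configuration:

```yaml
# .github/workflows/deploy.yml: sketch of an IaC deployment pipeline.
name: deploy-infra

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3

      # Provision (or update) the cloud resources.
      - name: Terraform apply
        run: terraform init && terraform apply -auto-approve
        env:
          HCLOUD_TOKEN: ${{ secrets.HCLOUD_TOKEN }}
          TAILSCALE_API_KEY: ${{ secrets.TAILSCALE_API_KEY }}

      # Configure the VPS with the PyInfra scripts
      # (assumes the runner can reach the VPS over SSH).
      - name: PyInfra deploy
        run: |
          pip install pyinfra
          pyinfra inventory.py deploy_proxy.py
```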
With all of this running automatically when I change a file or two, I don't have to worry about forgetting a command or making a typo somewhere. And I can always look up how my stack is configured, because the entire configuration is captured in a bunch of text files.
What started out as hosting my own photo archive has (d)evolved into cosplaying as a sysadmin. Between Terraform, PyInfra, and GitHub Actions, I have all the tools I need for a reasonably robust deployment pipeline. This gives me the confidence to self-host ever more aspects of my digital life: photos, documents, bookmarks, identity management, and who knows what's next?
As always, there's still room for improvement. While I manage my services with distinct docker-compose files, I currently only manage those files within Portainer. My next project will be to capture them as part of my infra repository. And the subsequent project will be to set up automated deployments for my containers as well.
Lastly, there's the configuration of my NAS itself. Synology provides a great way to get started, but most of my config has been click-ops'd together. If my NAS is stolen or spontaneously bursts into flames, I don't have a way to easily replicate my setup (aside from restoring a backup, of course). At some point, it would be great to replace my NAS with a Proxmox or TrueNAS setup.
But at that point, I'm unsure whether I'm still cosplaying as a sysadmin…
. . .

¹ Now wife ❤
² I.e., the computer that executes the pipeline on your behalf
I like working with data and tech to help people solve problems. Although I am comfortable with the "harder" aspects of data engineering and data science, I firmly believe that tech shouldn't be self-serving. What I like doing best is connecting with people, sharing knowledge, and discovering how data can help improve life and work.
linkedin.com/in/rcdewit