<p dir="auto"><strong>Deploy to MicroVM.nix</strong> (Astro, 2022-06-30, <a href="https://blogs.c3d2.de/~/Astroblog/Deploy%20to%20MicroVM.nix/" rel="noopener noreferrer">permalink</a>)</p>
<p dir="auto">For me, isolating the breakage of one service from another is a strong motivation to run containers and virtual machines. Upgrade some, let all others remain unaffected.</p>
<p dir="auto"><a href="https://github.com/astro/microvm.nix" rel="noopener noreferrer">MicroVM.nix</a> ships its own tooling for updating your independent NixOS systems: the <code>microvm</code> command. All that it is doing consists of maintaining the source flakeref in <code>/var/lib/microvms/*/flake</code> so that updates can be built from <em>somewhere</em>, and then creating the <code>/var/lib/microvms/*/current</code> symlink to the <code>config.microvm.declaredRunner</code>. Are these <em>internals</em> already, or should I start documenting them?</p>
<p dir="auto">At c3d2.de we just switched from Proxmox to a setup that consists of MicroVMs on NixOS. It is very pleasing to see my cheap but NixOS-native “virtualization solution” arriving at more deployments. I really wished for this setup to become much more convenient than what we had before.</p>
<p dir="auto">Both Initial deployment and updates from your local working state of this infrastructure’s flake are conducted with a <code>nix run .#microvm-update-$hostname</code> that will run something that looks like this:</p>
<pre><code dir="auto">nix copy --to ssh://root@${server} ${self}
ssh root@${server} -- bash -e <<EOF
mkdir -p /var/lib/microvms/${name}
cd /var/lib/microvms/${name}
nix build -L -o current \
  ${self}#nixosConfigurations.${name}.config.microvm.declaredRunner
echo 'git+https://gitea.c3d2.de/c3d2/nix-config?ref=flake-update' > flake
systemctl restart microvm@${name}.service
EOF
</code></pre>
<p dir="auto">First, we copy the current flake to the server so that it can build from it, which is then done via ssh. Afterwards, the <code>flake</code> file is written so that <code>microvm -u</code> later works, and finally the new or updated MicroVM is restarted. I simplified a bit. The MicroVM can also be built locally or with a remote builder. The service shouldn’t be restarted if there was no change to the <code>current</code> symlink. There is not only plenty of room but really an open space to design these workflows to your own preferences and work style.</p>
<p dir="auto">To prepare all updates from stable NixOS for our infrastructure, we’ve got a systemd.timer that runs <code>nix flake update --commit-lock-file</code> and pushes to that <code>flake-update</code> branch which is then prebuilt by our local Hydra, saving us a lot of time once we run <code>microvm -u</code>. We can even fetch JSON from the Hydra API to obtain the resulting store path which we can then just <code>nix copy</code> onto target server so that <code>/var/lib/microvms/*/current</code> can get updated, relieving you of that extra Nix evaluation. If you made ssh a requirement for your MicroVMs and also ran with a <code>/nix/store</code> mounted from the host, you could even switch to the new system without rebooting the MicroVM.</p>
<p dir="auto">The included <code>microvm</code> tool still works:
<img src="https://blogs.c3d2.de/static/media/B2D1B778-EAE7-660D-EE66-76BF0B7BBA55.png" alt="microvm -l output">
(stderr omitted for two lines of embarrassing warnings.)</p>
<p dir="auto"><em>Composing infrastructure with NixOS is fun!</em></p>
<p dir="auto"><strong>NixOS Clusters</strong> (Astro, 2022-03-11, <a href="https://blogs.c3d2.de/~/Astroblog/NixOS%20Clusters/" rel="noopener noreferrer">permalink</a>)</p>
<p dir="auto">I fully buy into the Nix way of having your infrastructure configuration as versioned code, ready to test on the local development machine.</p>
<br>
<p dir="auto">My servers have their services compartmentalized into autonomous Linux systems using <a href="https://gitea.c3d2.de/zentralwerk/network/src/branch/master/nix/nixos-module/server/lxc-containers.nix" rel="noopener noreferrer">LXC containers</a> or <a href="https://github.com/astro/microvm.nix" rel="noopener noreferrer">MicroVMs</a>. Not so much for security reasons with containers, but as a unit that I want to backup, update, or even move to another host. With virtual machines, removing much of the attack surface of a shared host kernel is a welcome side effect.</p>
<br>
<p dir="auto">Simply running containers and virtual machines is fine by me and probably for many other small use-cases. Yet for use at a larger scale, the need for redundancy arises so that services remain operational in the event of hardware failure. It’s a call to duplicate servers. And because just two servers cannot properly decide if they’re the one that is last standing, having at least an odd number of cluster members is the first recommendation in any documentation. With Nix Flakes duplicating servers is a no-brainer because all builds are reproducible and transferrable with <code>nix copy</code>.</p>
<p dir="auto">Moving containers and virtual machines across hosts, that is, stopping it on the source host, and subsequently starting it on the target server, doesn’t have many strings attached because these virtualized systems are fairly self-contained. To achieve automation of that process in the event of hardware failure, I have looked around for the standard solution on Linux servers. The popular answer seems to be <a href="https://clusterlabs.org/pacemaker/" rel="noopener noreferrer">Pacemaker</a> for which I discovered a dead pull request to nixpkgs. <a href="https://github.com/NixOS/nixpkgs/pull/162535" rel="noopener noreferrer">I revived it</a> along with modules and a test for NixOS.</p>
<p dir="auto">I got Pacemaker to take care of my systemd services, starting a container on one server of the cluster, starting it on another if the first goes down. There’s a plethora of Pacemaker’s own tools to operate the cluster and its resources. I wonder how much can be masked away in a declarative setup.</p>
<br>
<p dir="auto">Not all services are stateless, so storage must be synchronized. I was happy to discover that nixpkgs already ship the three major cluster filesystems.</p>
<br>
<p dir="auto"><strong>drbd</strong> shares block devices between hosts. NixOS includes a test. After a proof of concept I’ve had some afterthoughts regarding the identification of them through single numerical identifiers. I haven’t yet had a good idea how to map my declarative configuration to this scheme in a way that is stable enough that configuration changes won’t cause storage chaos.</p>
<br>
<p dir="auto"><strong>glusterfs</strong> looks so easily usable, it seems almost too good to be true. NixOS includes a test. Downside: it only shares directory trees, no block devices.</p>
<br>
<p dir="auto"><strong>ceph</strong> is a handful of magnitudes up in complexity. NixOS includes three tests. It keeps <em>Rados</em> block devices synchronized and directory trees (CephFS), too. Ceph is powerful enough to deal with all sorts of environments but requires a much more well-thought setup.</p>
<br>
<p dir="auto">The cluster filesystems are crucial to keep your stateful <code>/var</code> instances in sync. They will <em>hopefully</em> uphold <strong>c</strong>onsistency and <strong>a</strong>vailability in the event of <strong>p</strong>artition. The <code>/nix/store</code> on the hand could be synchronized in deployment scripts with a simple <code>nix copy</code> to the other servers of the cluster.</p>
<br>
<p dir="auto">I am considering how this might look like packed up in a reusable Nix flake. It should provide tooling to setup the various stateful parts. Then again, every setup is different, especially when it comes to the host’s network configuration. Example: availability is improved by distributing servers spatially. In that case I would like all intra-cluster communication to flow exclusively through Wireguard tunnels. It seems unnecessary to burn CPU like that in other cases where cluster machines have their own Ethernet segment because they just sit atop each other.</p>
<br>
<p dir="auto">As with anything reusable, there are a lot of questions surrounding the balance between ease of use by just dictating opinionated defaults and a configuration schema that allows for maximum customizability. I am seeking opinions and general interest on that topic.</p>
<p dir="auto"><strong>Plume on NixOS</strong> (Astro, 2021-12-25, <a href="https://blogs.c3d2.de/~/Astroblog/Plume%20on%20NixOS/" rel="noopener noreferrer">permalink</a>)</p>
<p dir="auto">Having written my diploma thesis on federated social networking services ten years ago, I am happy that there is a contemporary approach named <strong>ActivityPub</strong>. I keep following new implementations. <a href="https://joinplu.me/" rel="noopener noreferrer">Plume</a> is one of them, with the extra advantage of being written in my favorite programming language, Rust.</p>
<br>
<p dir="auto">I was recently asked if we ran a blogging service at <a href="https://www.c3d2.de/" rel="noopener noreferrer">C3D2</a>. We didn’t, but Plume came to mind immediately.</p>
<br>
<p dir="auto">As our infrastructure runs mostly on NixOS by now, the operating system is always my favorite choice when deploying a new service. Unfortunately, no Plume package is available for Nix so far. From browsing the Web, I’ve got the impression that others have tried and failed before.</p>
<br>
<p dir="auto">In the end, packaging it took a few days and many attempted builds. Having <a href="https://github.com/nix-community/fenix/" rel="noopener noreferrer">fenix</a> provide a rust for <code>wasm32-unknown-unknown</code> got me compiling the frontend <code>plume-front</code> pretty quickly. Luckily there is already wasm-pack in nixpkgs because that produces even more of the needed files compared to just building with cargo. There were also a few more things to learn for me about <a href="https://github.com/nix-community/naersk/" rel="noopener noreferrer">naersk</a>, the new Rust builder for Nix. I think that the resulting hacks are not all too ugly.</p>
<br>
<p dir="auto">By requiring a Rust compiler for a non-native target, which we get from fenix, I think that chances are low to get this into nixpkgs right now. If you think it can be done, let’s try!</p>
<br>
<ul dir="auto">
<li>
<p dir="auto"><a href="https://gitea.c3d2.de/c3d2/nix-config/src/branch/master/overlay/plume" rel="noopener noreferrer">Package</a></p>
</li>
<li>
<p dir="auto"><a href="https://gitea.c3d2.de/c3d2/nix-config/src/branch/master/lib/plume.nix" rel="noopener noreferrer">NixOS module</a></p>
</li>
</ul>
<br>
<p dir="auto">I am considering to break the code out into a separate reusable Plume.nix Flake. If <em>you</em> are interested, don’t hesitate to tell me. I am keen on sharing the maintenance efforts. I will publish that earlier then, because right now tinkering in our main infrastructure repository is kind of convenient while there may still be some issues to come.</p>