An older version of this is also available on my blog.
Sometimes you might want to run Docker containers on more than one host. Maybe you want to run some at one hosting facility, some at another, and so forth.
Maybe you’d like to run VMs at various places, and let them talk to Docker containers and bare metal servers wherever they are.
And maybe you’d like to be able to easily migrate any of these from one provider to another.
There are all sorts of very complicated ways to set all this stuff up. But there’s also a simple one: Yggdrasil.
My blog post Make the Internet Yours Again With an Instant Mesh Network explains some of the possibilities of Yggdrasil in general terms. Here I want to show you how to use Yggdrasil to solve some of these issues more specifically. Because Yggdrasil is always encrypted, some of the security heavy lifting is done for us.
Often in Docker, we connect multiple containers to a single network that runs on a given host. That much is easy. Once you start talking about containers on multiple hosts, then you start adding layers and layers of complexity. Once you start talking multiple providers, maybe multiple continents, then the complexity can increase. And, if you want to integrate everything from bare metal servers to VMs into this – well, there are ways, but they’re not easy.
I’m a believer in the KISS principle. Let’s not make things complex when we don’t have to.
As I’ve explained before, Yggdrasil can automatically form a global mesh network. This is pretty cool! As most people use it, they join it to the main Yggdrasil network. But Yggdrasil can be run entirely privately as well. You can run your own private mesh, and that’s what we’ll talk about here.
All we have to do is run Yggdrasil inside each container, VM, server, or whatever. We handle some basics of connectivity, and bam! Everything is host- and location-agnostic.
Simple Setup in Docker
The installation of Yggdrasil on a regular system is pretty straightforward. Docker is a bit more complicated for several reasons:
- It blocks IPv6 inside containers by default
- The default set of permissions doesn’t permit you to set up tunnels inside a container
- It doesn’t typically pass multicast (broadcast) packets
Normally, Yggdrasil could auto-discover peers on a LAN interface. This can work with Docker, too, but it takes some extra setup. So, this section sets up one or more Yggdrasil “router” containers on a given Docker host. All the other containers talk directly to the “router” container and it’s all easy. The downside, of course, is that if the router container goes down, all connectivity is lost. But it’s trivially easy to set up multiple router containers. Another alternative is the more advanced broadcast peer discovery discussed later on this page, which lets each container automatically find its local peers, so no central router node is needed at all.
In my Dockerfile, I have something like this:
# Install Yggdrasil from bullseye-backports.
RUN echo "deb http://deb.debian.org/debian bullseye-backports main" >> /etc/apt/sources.list && \
    apt-get --allow-releaseinfo-change update && \
    apt-get -y --no-install-recommends -t bullseye-backports install yggdrasil
COPY yggdrasil.conf /etc/yggdrasil/
# Restrict access to the config (it holds the node's private key) and start Yggdrasil at boot.
RUN set -x; \
    chown root:yggdrasil /etc/yggdrasil/yggdrasil.conf && \
    chmod 0750 /etc/yggdrasil/yggdrasil.conf && \
    systemctl enable yggdrasil
The magic parameters to docker run to make Yggdrasil work are:
--cap-add=NET_ADMIN --sysctl net.ipv6.conf.all.disable_ipv6=0 --device=/dev/net/tun:/dev/net/tun
This example uses my docker-debian-base images, so if you use them as well, you’ll also need to add their parameters.
Note that it is NOT necessary to use --privileged. In fact, due to the network namespaces in use in Docker, this command does not let the container modify the host’s networking (unless you use --net=host, which I do not recommend).
The --sysctl parameter was the result of a lot of banging my head against the wall. Apparently Docker tries to disable IPv6 in the container by default. Annoying.
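To see how those parameters fit into a full command line, here is a minimal sketch. The container name and image name are made up for illustration, and any parameters your base image needs (such as the docker-debian-base ones) are not included. The function only prints the command so you can review it before running it yourself.

```shell
# Flags Yggdrasil needs inside a container, exactly as listed above.
YGG_FLAGS="--cap-add=NET_ADMIN --sysctl net.ipv6.conf.all.disable_ipv6=0 --device=/dev/net/tun:/dev/net/tun"

# Print (not run) a docker run command.
# $1 = container name, $2 = image name; both are hypothetical examples.
ygg_run_cmd() {
    echo "docker run -d --name $1 --hostname $1 $YGG_FLAGS $2"
}

ygg_run_cmd yggrouter my-yggdrasil-image
```

Copy the printed line into your shell (or your compose tooling) once it looks right.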
Configuration of the router container(s)
The idea is that the router nodes (one, or more if you want redundancy) will be the only ones to have an open incoming port. Although the normal Yggdrasil case of directly detecting peers in a broadcast domain is more convenient and more robust, this can work pretty well too.
You can, of course, generate a template with yggdrasil -genconf like usual. Some things to note for this one:
- You’ll want to change Listen to something like Listen: ["tls://[::]:12345"] where 12345 is the port number you’ll be listening on.
- You’ll want to disable the MulticastInterfaces entirely by just setting it to an empty list ([]), since it doesn’t work in this setup anyway.
- If you expose the port to the Internet, you’ll certainly want to firewall it to only authorized peers. Setting AllowedPublicKeys is another useful step.
- If you have more than one router container on a host, each of them will both Listen and act as a client to the others. See below.
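Put together, the router-specific settings might look roughly like the following. This is a sketch, not a complete configuration: it prints only the keys discussed here, meant to be merged by hand into the file that yggdrasil -genconf produced (which also holds the node’s keys). The port 12345 comes from the text; the AllowedPublicKeys entry is a placeholder, not a real key.

```shell
# Print the router-specific settings for review or merging by hand.
router_settings() {
    cat <<'EOF'
{
  # Accept incoming peerings over TLS on port 12345, on every address.
  Listen: ["tls://[::]:12345"]

  # Multicast peer discovery is switched off in this setup.
  MulticastInterfaces: []

  # Placeholder: list the public keys of the peers allowed to connect.
  AllowedPublicKeys: ["PUBLIC_KEY_OF_AN_AUTHORIZED_PEER"]
}
EOF
}

router_settings
```

With a second router container on the same host, each one would also carry a Peers entry pointing at the other.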
Configuration of the non-router nodes
Again, you can start with a simple configuration. Some notes here:
- You’ll want to set Peers to something like Peers: ["tls://routernode:12345"], where routernode is the Docker hostname of the router container, and 12345 is its port number as defined above. If you have more than one local router container, you can simply list them all here. Yggdrasil will then fail over nicely if any one of them goes down.
- Listen should be empty.
- As above, MulticastInterfaces should be empty.
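For the non-router side, a sketch along the same lines. Again this prints only the keys discussed above, to be merged into a generated config. The hostname routernode is the example name used in the text, and the tls:// scheme is assumed to match the router’s Listen setting.

```shell
# Print the settings for a non-router container.
# $1 = Docker hostname of the router container (defaults to "routernode").
client_settings() {
    cat <<EOF
{
  # Dial out to the local router container.
  Peers: ["tls://${1:-routernode}:12345"]

  # Accept no incoming peerings.
  Listen: []

  # No multicast peer discovery.
  MulticastInterfaces: []
}
EOF
}

client_settings routernode
```

If you run several router containers, add one Peers entry per router.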
Using the interfaces
At this point, you should be able to ping6 between your containers. If you have multiple hosts running Docker, you can simply set up the router nodes on each to connect to each other. Now you have direct, secure, container-to-container communication that is host-agnostic! You can also set up Yggdrasil on a bare metal server or VM using standard procedures and everything will just talk nicely!
Docker Setup with Broadcast Peer Discovery
As hinted above, you can set up Yggdrasil to automatically discover local peers using broadcast. This is a little more difficult, but not all that bad. The payoff is that it is entirely decentralized within a host; no single point of failure. We set up a Docker bridge network to make this happen.
Configuring a bridge interface
First, we need to make a Linux bridge interface. Note that you can create a bridge interface that doesn’t actually attach to a physical interface. On a Debian-type system, you could put this in /etc/network/interfaces:
iface bryggnet inet static
Run ifup bryggnet and it’s up.
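A fuller stanza for such a bridge might look like the sketch below. The address and netmask are made-up examples (pick a range that doesn’t clash with your other networks), and the bridge_ports keyword assumes Debian’s bridge-utils package is installed. The function just prints the stanza so you can paste it into place.

```shell
# Print a bridge stanza for /etc/network/interfaces.
# ASSUMPTION: the address and netmask are illustrative values only.
bryggnet_stanza() {
    cat <<'EOF'
# Bridge with no physical interface attached (bridge_ports none).
auto bryggnet
iface bryggnet inet static
    bridge_ports none
    address 10.200.0.1
    netmask 255.255.255.0
EOF
}

bryggnet_stanza
```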
The next step is to create a Docker network:
docker network create --driver=bridge -o "com.docker.network.bridge.name=bryggnet" yggnet
Using the bridge in containers
Now, when you set up a container, in addition to the Yggdrasil parameters given above, you also will want to add --net=yggnet. Note that this setup may have implications on non-Yggdrasil container-to-container communication; consult the Docker docs for details.
In your yggdrasil.conf file, if you removed the MulticastInterfaces section, put it back. For simplicity’s sake, I make them all look something like this on my containers:
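A minimal sketch of such a section, assuming the list-of-objects format that yggdrasil -genconf emits in version 0.4 and later. The values shown are illustrative, not necessarily the ones I use. The function prints the fragment for merging into your config.

```shell
# Print a MulticastInterfaces section that enables discovery everywhere.
multicast_settings() {
    cat <<'EOF'
  MulticastInterfaces: [
    {
      # Interfaces to use for discovery; ".*" matches all of them.
      Regex: .*
      # Announce this node to others on the link.
      Beacon: true
      # Listen for announcements from others.
      Listen: true
      # Port for incoming link-local connections; 0 leaves the choice to Yggdrasil.
      Port: 0
    }
  ]
EOF
}

multicast_settings
```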
I even use m4 to automate generating a yggdrasil.conf based on a template, substituting in appropriate container-specific keys.
Ephemeral nodes with autoconf
Yggdrasil has an autoconf mode, which you enable with yggdrasil --autoconf. According to the docs, “in this mode, Yggdrasil will automatically attempt to peer with other nodes on the same subnet, but it also generates a random set of keys each time it is started, and therefore a random IP address.”
This can be perfect for a number of Docker use cases – for instance, worker containers. It is suitable for any situation in which a container wouldn’t need a stable Docker IP or hostname.
To put this all together, there are several ways you can span container hosts with this setup.
1. Using the “simple” option, you can have a “router” Yggdrasil container on each host, and they can peer with each other (over the public Internet or whatever).
2. Even if you mostly use broadcast peer discovery, you can still have router containers (which, on a given host, will be discovered by broadcast) which know how to peer with each other. Each Yggdrasil instance will auto-discover the best routes to each other one.
3. If you have a unified broadcast domain between container hosts, you can simply put every container on it.
Option 1 was already discussed in the simple section. Option 2 is a hybrid; your router nodes can know about the router nodes on different hosts, and each host’s containers will auto-discover their local peers (including the router nodes) and therefore build routes to every container.
Option 3 means you need a unified broadcast domain. In a physical network, that means all your container hosts are on the same LAN (and can reach each other by broadcast). Many cloud providers offer you a virtual network that provides the same sort of capability. All you would need to do is change the none in the bridge’s bridge_ports setting to the name of the internal network interface, and that’s it. Your Yggdrasil instances can now auto-discover each other on any container host connected to that virtual LAN. On some cloud providers, you may need to disable IP address filtering. Yggdrasil doesn’t use the assigned IPv4 or IPv6 address when using broadcast-discovered peers, instead using the derived IPv6 link-local address. I ran into a situation on at least one cloud provider where it tried to clamp the IP used by each VM as a security measure, but that was easily enough disabled.
At some point, if you have a vast number of containers, you may find that option 3 doesn’t scale too well, as it results in every container maintaining a connection to every other. The hybrid option 2 would be an easy solution there.
Yggdrasil’s mesh is aggressively greedy. It will peer with any node it can find (unless told otherwise) and will find a route to anywhere it can. There are two main ways to make sure you keep untrusted traffic out: by restricting who can talk to your mesh, and by firewalling the Yggdrasil interface. The two can be combined.
By disabling multicast discovery, you eliminate the chance for random machines on the LAN to join the mesh. That implies that if you’re using broadcast peer discovery across hosts as in option 3 above, you need to secure your LAN. If you use a non-connected bridge, you can simply say Regex: ^eth0$ or whatever in your MulticastInterfaces section, which will limit peer discovery to only other containers on the local host.
By making sure that you firewall off (outside of Yggdrasil) who can connect to a Yggdrasil node with a listening port, you can authorize only your own machines. And, by setting AllowedPublicKeys on the nodes with listening ports, you can authenticate the Yggdrasil peers. Note that part of the benefit of the Yggdrasil mesh is normally that you don’t have to propagate a configuration change to every participating node - that’s a nice thing in general!
You can also run a firewall inside your container (I like firehol for this purpose) and aggressively firewall the IPs that are allowed to connect via the Yggdrasil interface. I like to set a stable interface name in yggdrasil.conf (the IfName setting), and then it becomes pretty easy to firewall the services. The Docker parameters that allow Yggdrasil to run are also sufficient to run firehol.
Naming Yggdrasil peers
You probably don’t want to hard-code Yggdrasil IPs all over the place. There are a few solutions:
- You could run an internal DNS service
- You can do a bit of scripting around Docker’s --add-host option to add things to /etc/hosts
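As a rough illustration of the scripting idea, the function below turns a list of “name address” pairs into --add-host flags. The names and Yggdrasil addresses in the usage line are invented for the example, and where you keep the list (a file, a script, your m4 templates) is up to you.

```shell
# Read "name address" pairs on stdin and print one --add-host flag per line.
add_host_flags() {
    while read -r name addr; do
        # Skip blank lines and comment lines.
        case "$name" in ''|'#'*) continue ;; esac
        printf '%s\n' "--add-host=${name}:${addr}"
    done
}

# Hypothetical names and addresses, for illustration only.
printf '%s\n' 'db 200:1111::1' 'web 200:2222::2' | add_host_flags
```

You could then pass the result along with something like docker run $(add_host_flags < hosts.txt) plus your usual parameters, where hosts.txt is a file you maintain.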
Other hints & conclusion
Here are some other helpful use cases:
- If you are migrating between hosts, you could leave your reverse proxy up at both hosts, both pointing to the target containers over Yggdrasil. The targets will be automatically found from both sides of the migration while you wait for DNS caches to update and such.
- This can make services integrate with local networks a lot more painlessly than they might otherwise.
This is just an idea. The point of Yggdrasil is expanding our ideas of what we can do with a network, so here’s one such expansion. Have fun!