Cluster VMs with Firecracker

Updated on 2022-07-11

Recently I was asked to join the recruitment and onboarding process for new members. As the team and our projects grow in both size and complexity, our hand-over documentation has become complicated and only works on certain operating systems. To make onboarding fast and reproducible, I had to come up with a new plan: create an isolated coding environment with as small a learning curve as possible.

An isolated environment where every member gets their own resources and custom preinstalled packages

Our projects at Jobhopin combine multiple languages (Rust, Python, …), and creating a virtualenv is tedious, often taking multiple attempts to get working. Newcomers usually took 2-3 weeks to learn our current projects and start working effectively.

At first I encouraged the team to use Docker and docker-compose to code and debug projects, but our team has both engineers and scientists. The science team found it hard to debug in Docker, and new members spent a lot of time learning to make use of Docker images. Furthermore, I wanted to mimic a real production machine where the member has root access: I wanted folks to be able to set sysctls, install new packages, make iptables rules, configure networking with ip, run perf, basically anything, with strong isolation.

I tried some VM tools (QEMU and VMware) to create one VM per member, but ran into too many problems:

  • VM boot time is slow and the snapshot size is too large
  • There is no API, and VM snapshots have to be created manually, without any code to reproduce them

I want our members to only need to provide their credentials and a custom VM size, and then instantly launch a fresh virtual machine.

When I first read about Firecracker being released, I thought it was just a tool for cloud providers that offered better security than bare Docker; I didn’t think it was something I could use directly to create a dev VM.

After a bit of reading, I was impressed by how fast and convenient Firecracker is at booting VMs.

The VMM process starts up in around 12ms on AWS EC2 I3.metal instances. Though this time varies, it stays under 60ms. Once the guest VM is configured, it takes a further 125ms to launch the init process in the guest. Firecracker spawns a thread for each VM vCPU to use via the KVM API along with a separate management thread. The memory overhead of each thread (excluding guest memory) is less than 5MB. Ref

Some comparisons between Firecracker and QEMU

By comparison: Firecracker is purpose-built in Rust for this one task, provides no BIOS, and offers only network, block, keyboard, and serial device support — with tiny drivers (the serial support is less than 300 lines of code). Ref

Firecracker integrates with existing container tooling, making adoption rather painless. I chose Ignite, whose CLI is very similar to Docker’s.

With Ignite, you pick an OCI-compliant image (Docker image) that you want to run as a VM, and then just execute ignite run instead of docker run.

Installing Ignite and starting a fresh VM is very simple; there are basically 3 steps:

Step 1: Check that KVM virtualization is enabled on your system and install Ignite by following the Installing-guide

$ ignite version
Ignite version: version.Info{Major:"0", Minor:"8", GitVersion:"v0.10.0", GitCommit:"...", GitTreeState:"clean", BuildDate:"...", GoVersion:"...", Compiler:"gc", Platform:"linux/amd64"}
Firecracker version: v0.22.4
Runtime: containerd
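
The KVM check in Step 1 can be sketched as a small shell function. This is only a rough sketch for x86 hosts, not part of Ignite: the function name check_kvm and its two optional arguments are my own, added so the check can be pointed at other files. It looks for the vmx (Intel) or svm (AMD) CPU flag and for the /dev/kvm device.

```shell
#!/bin/sh
# Report whether a host looks ready for Firecracker/Ignite (x86 only).
# Both arguments are optional and exist only so the check can be tried
# against other files:
#   $1 = cpuinfo file (default /proc/cpuinfo)
#   $2 = KVM device   (default /dev/kvm)
check_kvm() {
    cpuinfo="${1:-/proc/cpuinfo}"
    kvm_dev="${2:-/dev/kvm}"

    # vmx = Intel VT-x, svm = AMD-V; either flag means hardware virtualization.
    if ! grep -Eqw 'vmx|svm' "$cpuinfo" 2>/dev/null; then
        echo "no-cpu-virtualization"
        return 1
    fi

    # The kvm kernel module exposes this device once it is loaded.
    if [ ! -e "$kvm_dev" ]; then
        echo "kvm-device-missing"
        return 2
    fi

    echo "ok"
}

# Check the current machine; never abort the calling script on failure.
check_kvm || true
```

If it prints anything other than ok, the installation guide is the place to look before going further.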

Step 2: Create a sample VM config, config.yaml

apiVersion: ignite.weave.works/v1alpha4
kind: VM
metadata:
  name: haiche-vm
spec:
  image:
    oci: weaveworks/ignite-ubuntu
  cpus: 2
  memory: 1GB
  diskSize: 6GB
  ssh: true

Step 3: Start your VM in under 125 ms


It takes <= 125 ms to go from receiving the Firecracker InstanceStart API call to the start of the Linux guest user-space /sbin/init process

$ sudo ignite run --config config.yaml

INFO[0001] Created VM with ID "3c5fa9a18682741f" and name "haiche-vm" 

Voilà 🎉 🎉 🎉 you’ve successfully created a new VM. To list the running VMs, enter:

$ ignite ps
VM ID                   IMAGE                           KERNEL                                  CREATED SIZE    CPUS    MEMORY          STATE   IPS             PORTS   NAME
3c5fa9a18682741f        weaveworks/ignite-ubuntu:latest weaveworks/ignite-kernel:5.10.51        63m ago 4.0 GB  2       1.0 GB          Running              haiche-vm

Once the VM is booted, it will have its network configured and will be accessible from the host via password-less SSH and with sudo permissions

$ ignite ssh haiche-vm
Welcome to Ubuntu 18.04.2 LTS (GNU/Linux 5.10.51 x86_64)

To exit SSH, just quit the shell process with exit.

To use your own key, add your public key to ~/.ssh/authorized_keys in the newly booted VM, or update the config with the path to your public key and create a new VM:

  ssh: path/your/

and then ssh to your VM

$ ssh -i path/your/id_rsa root@
Welcome to Ubuntu 18.04.2 LTS (GNU/Linux 5.10.51 x86_64)

Finally, wrap everything up with the bastion host technique to secure SSH access to the VM:


Host workstation
    User haiche
    IdentityFile ~/.ssh/id_rsa

Host haiche-vm
    ProxyJump workstation
    User root
    IdentityFile ~/.ssh/id_rsa

Add the config above to your SSH config (~/.ssh/config) and run ssh haiche-vm to access the VM from your client.

After successfully creating a VM, I usually install conda and some packages to run my project. But I don’t want to install conda and create a new environment every time I create a VM. Here are my steps to extend the base Ubuntu image and use it for a better VM experience:
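
When there are many members, the per-VM stanza can be generated instead of typed by hand. A minimal sketch, assuming each VM is reached through the same bastion alias: the function name ssh_config_for_vm and the address 10.61.0.2 are placeholders of mine, so use whatever IP ignite ps reports for the VM.

```shell
#!/bin/sh
# Print an ssh config stanza that reaches a VM through a bastion host.
# Usage: ssh_config_for_vm VM_NAME VM_IP BASTION_ALIAS
ssh_config_for_vm() {
    vm_name="$1"
    vm_ip="$2"
    bastion="$3"
    # "~" is not expanded inside a here-document, so it is written
    # literally into the stanza and resolved later by ssh itself.
    cat <<EOF
Host ${vm_name}
    HostName ${vm_ip}
    ProxyJump ${bastion}
    User root
    IdentityFile ~/.ssh/id_rsa
EOF
}

# Placeholder values; append the output to ~/.ssh/config once it looks right.
ssh_config_for_vm haiche-vm 10.61.0.2 workstation
```

The HostName line tells ssh which address to connect to after jumping through the bastion.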


FROM weaveworks/ignite-ubuntu

ENV PATH="/root/miniconda3/bin:${PATH}"
ARG PATH="/root/miniconda3/bin:${PATH}"

# wget is needed by the Miniconda download step below
RUN apt-get update -qq && \
    apt-get install -y git vim rsync \
        build-essential curl wget

RUN wget \ && \
    mkdir /root/.conda && \
    bash -b && \
    rm -f

RUN conda create -n haiche python=3.7

SHELL ["conda", "run", "-n", "haiche", "/bin/bash", "-c"]

RUN conda install fastapi

RUN which python && python -c "import fastapi"

RUN conda init bash && echo "source activate haiche" >> ~/.bashrc


apiVersion: ignite.weave.works/v1alpha4
kind: VM
metadata:
  name: haiche-minconda-vm
spec:
  image:
    oci: haiche/ubuntu-minconda
  cpus: 2
  memory: 1GB
  diskSize: 6GB
  ssh: path/your/

Here is a tricky part: the current Ignite doesn’t support locally built Docker images. I had to push the image to the public registry Docker Hub for Ignite to import it successfully. To start a new VM:

$ sudo ignite run --config miniconda-vm.yaml
INFO[0002] Created image with ID "cae0ac317cca74ba" and name "haiche/ubuntu-minconda" 
INFO[0004] Created VM with ID "c1ab652804e664ed" and name "haiche-minconda-vm" 

If you use a private registry such as ECR, run the command above with --runtime=docker to pull from the private registry.

Test our new VM with the conda environment
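
The build-and-push step that has to happen before ignite run can be sketched like this. It is a rough sketch rather than a finished script: the run and publish_image helpers and the DRY_RUN switch are names I made up, and it assumes the Dockerfile sits in the current directory and that you are already logged in to the registry. By default it only prints the commands.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build the Dockerfile in the current directory and push it to the registry
# so that Ignite can pull the image by name.
publish_image() {
    image="$1"
    run docker build -t "$image" . &&
        run docker push "$image"
}

publish_image haiche/ubuntu-minconda
```

Set DRY_RUN=0 to actually build and push, then start the VM with the ignite run command shown above.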

$ ssh -i path/your/id_rsa root@
(haiche) root@c1ab652804e664ed:~# python
Python 3.7.13 (default, Mar 29 2022, 02:18:16)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import fastapi
>>> fastapi.__version__

I like the configuration file approach so far; it is easier to see everything in one place. Now a member simply provides the config file and a public key to create a fresh VM with all the needed environments in an instant.

Another question I had in mind: “OK, where am I going to run these Firecracker VMs in production?” The funny thing about running a VM in the cloud is that cloud instances are already VMs. Running a VM inside a VM is called “nested virtualization”, and not all cloud providers support it. For example, AWS only supports it on bare-metal instances, which are ridiculously expensive.

GCP supports nested virtualization, but not by default; you have to enable it when creating the VM. DigitalOcean supports nested virtualization by default, even on their smallest droplets.
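
The per-member config file can itself be generated from a few parameters. A minimal sketch, assuming Ignite’s declarative VM format with metadata and spec sections: the render_vm_config name, the member name alice, the key path and the apiVersion value are my assumptions, so check them against the Ignite version you run.

```shell
#!/bin/sh
# Print an Ignite VM manifest for one team member.
# Usage: render_vm_config MEMBER CPUS MEMORY PUBKEY_PATH
render_vm_config() {
    member="$1"
    cpus="$2"
    memory="$3"
    pubkey="$4"
    # apiVersion is an assumption; adjust it to your Ignite release.
    cat <<EOF
apiVersion: ignite.weave.works/v1alpha4
kind: VM
metadata:
  name: ${member}-vm
spec:
  image:
    oci: haiche/ubuntu-minconda
  cpus: ${cpus}
  memory: ${memory}
  diskSize: 6GB
  ssh: ${pubkey}
EOF
}

# Example with hypothetical values; redirect the output to a file and
# pass that file to "sudo ignite run --config".
render_vm_config alice 2 1GB /home/alice/.ssh/id_rsa.pub
```

Keeping the image and disk size fixed in the template means members only choose the parts that actually differ between them.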

A few things are still on my mind with this approach:

  • Currently Firecracker doesn’t support snapshot restore, but it should in the near future

  • I can’t easily upgrade the base image the way docker pull does. I was dealing with this by making a copy every time, but that’s kind of slow and felt really inefficient. There are some solutions online that I will try later: Device mapper to manage firecracker images

  • I don’t know yet if it’s possible to run graphical applications in Firecracker

  • Firecracker with Kubernetes is a new thing, but I don’t find it appealing because using Pods to group containers is already fast and secure. Some people pointed me to this useful thread discussing why they aren’t compatible yet

Here are some links I found useful when researching Firecracker: