UNIX/Linux – Zak Abdel-Illah (https://zai.dev)

Building AMD64 QEMU Images remotely using Libvirt and Packer
https://zai.dev/2024/05/24/building-amd64-qemu-images-remotely-using-libvirt-and-packer/ (Fri, 24 May 2024)

I need to build AMD64 images while working from an ARM64 machine. While this is possible directly by running the qemu-system-x86_64 binary locally, it tends to be extremely slow due to the overhead of emulating the x86_64 instruction set on an ARM host.

Workbench

  • Ubuntu 22.04 LTS with libvirt installed (the remote build host)
  • MacBook Pro M2 with the Packer build files (the local workstation)

Configuring the Libvirt Plugin

Connecting to the libvirt host

When using the libvirt plugin, I need to provide a Libvirt URI.

source "libvirt" "image" {
    libvirt_uri = "qemu+ssh://${var.user}@${var.host}/session?keyfile=${var.keyfile}&no_verify=1"
}
  • qemu+ssh:// denotes that I’ll be using the QEMU / KVM backend and connecting via SSH. The chosen transport determines which additional parameters are valid in the rest of the URI
  • ${var.user}@${var.host} is in the SSH syntax, this is the username and hostname of the machine that is running libvirt
  • /session is to isolate the running builds from those on the system level. /system would work just as well.
  • keyfile=${var.keyfile} is used to automatically authenticate to the remote machine without the need of a password. This is useful in the future when I automatically trigger the packer build from a Git repository
  • no_verify=1 is added so that I can throw the build at any machine and have it “just work”. Disabling host key verification is usually advised against because it opens the door to host-spoofing (man-in-the-middle) attacks.
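
For completeness, the variables referenced in the URI also need to be declared in the build. A minimal sketch (the names match the snippet above; the descriptions are mine):

variable "user" {
  type        = string
  description = "SSH user on the libvirt host"
}

variable "host" {
  type        = string
  description = "Hostname or IP address of the libvirt host"
}

variable "keyfile" {
  type        = string
  description = "Path to the SSH private key used to reach the libvirt host"
}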

Communicating with the libvirt guest

communicator {
    communicator                 = "ssh"
    ssh_username                 = var.username
    ssh_bastion_host             = var.host
    ssh_bastion_username         = var.user
    ssh_bastion_private_key_file = var.private_key
  }
  • The difference between ssh_* and ssh_bastion_* is that the former refers to the target virtual machine being built, and the latter refers to the “middle-man” (bastion) machine.
    • I require this as I don’t plan to expose the VM to a network outside of the machine hosting it.
    • Since I won’t have access from my local workstation, I need to communicate with the virtual machine via the machine that is hosting it.
    • By adding the ssh_bastion_* arguments, I’m telling Packer that in order to communicate with the VM, it needs to connect to the bastion machine first and tunnel all SSH traffic through it.
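
This is effectively the jump-host behaviour I’d get from OpenSSH by hand; a quick way to sanity-check the path from the workstation before running Packer (hostname and address are placeholders):

# Hop through the libvirt host to reach the VM on its private network
ssh -J builder@libvirt-host ubuntu@<vm-private-ip>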

Configuring the libvirt daemon

My Observations

I came across a “Permission Denied” error when attempting to upload an existing image (in my case, the KVM Ubuntu Server image). This was due to AppArmor not being given a rule trusting the uploaded file when the domain was created. The error first surfaces in the following form, directly from Packer:

==> libvirt.example: DomainCreate.RPC: internal error: process exited while connecting to monitor: 2024-05-24T16:41:42.574660Z qemu-system-x86_64: -blockdev {"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null}: Could not open '/var/lib/libvirt/images/packer-cp8c6ap1ijp2kss08iv0-ua-artifact': Permission denied

At first, I assumed that there was an obvious file-permission problem, and at first glance there did in fact appear to be one: upon creation, the file was owned by root with read/write access for the root user only.

# ls -lah /var/lib/libvirt/images
-rw------- 1 root root  925M May 24 16:41 packer-cp8c6ap1ijp2kss08iv0-ua-artifact

This makes sense, since libvirtd runs as the root user in the default configuration from the Ubuntu repository. I didn’t see any configuration option to control what the permissions should be after an upload through libvirt either. This looked like the problem, since all QEMU instances run under a non-root user, libvirt-qemu:

# ps -aux | grep libvirtd
# ps -aux | grep qemu

root      145945  0.4  0.1 1778340 28760 ?       Ssl  16:43   0:10 /usr/sbin/libvirtd
libvirt+    3312  2.2 11.1 4473856 1817572 ?     Sl   May12 405:19 /usr/bin/qemu-system-x86_64

My second observation was that all images created directly within libvirt (e.g. with virt-manager) had what looked like “correct” permissions, i.e. those matching the user that QEMU would eventually run under;

# ls -lah /var/lib/libvirt/images
-rw-r--r-- 1 libvirt-qemu kvm   11G May 24 17:11 haos_ova-11.1.qcow2

Since no-one else had reported this particular issue when using the libvirt plugin, I had gone down the route of PEBKAC.

Allowing packer-uploaded images as backing store

Thanks to this discussion on Stack Overflow, I found that AppArmor had been blocking the request to the specific file in question.

# dmesg -w
[1081541.249157] audit: type=1400 audit(1716568577.970:119): apparmor="DENIED" operation="open" profile="libvirt-25106acc-cfd8-40f7-a7c6-f5c1c63bc16c" name="/var/lib/libvirt/images/packer-cp8c6ap1ijp2kss08iv0-ua-artifact" pid=43927 comm="qemu-system-x86" requested_mask="w" denied_mask="w" fsuid=64055 ouid=64055

Here, I can see that AppArmor is doing three things;

  • Denying an open request to the QEMU Image
    • apparmor="DENIED"
    • operation="open"
  • Denying writing to the QEMU Image
    • denied_mask="w"
  • Using a profile that is specific to the domain being launched
    • profile="libvirt-25106acc-cfd8-40f7-a7c6-f5c1c63bc16c"
    • This happens because libvirt automatically generates and loads a per-domain AppArmor profile when a domain is created, which also implies that libvirt uses some form of template file or specification to produce the rules.

This means that I need to find the template file that libvirt is using to design the rules, and allow for writing to packer-uploaded QEMU Images.

# /etc/apparmor.d/libvirt/TEMPLATE.qemu
# This profile is for the domain whose UUID matches this file.
# 

#include <tunables/global>

profile LIBVIRT_TEMPLATE flags=(attach_disconnected) {
  #include <abstractions/libvirt-qemu>
  /var/lib/libvirt/images/packer-** rwk,
}

As mentioned in the Stack Overflow post, simply adding /var/lib/libvirt/images/packer-** rwk, to the template file is enough to get past this issue.
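
To double-check that the rule is picked up on the next build, I can look at the per-domain profiles libvirt generates from the template and keep watching for further denials:

# List the profiles libvirt generated from the template
ls /etc/apparmor.d/libvirt/

# Confirm the per-domain profile is loaded
aa-status | grep libvirt

# Watch for any new AppArmor denials during the next build
dmesg -w | grep -i apparmor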

End Result

By bringing everything together, I get a successful QCOW2 image visible in my default storage pool. I’m using the Ansible provisioner within the build block so that I can keep the execution steps separate from the Packer build script, and re-usable across different cloud providers.
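
The build block itself isn’t shown above; a minimal sketch of how the Ansible provisioner ties into it (the playbook path is a placeholder):

build {
  sources = ["source.libvirt.image"]

  provisioner "ansible" {
    playbook_file = "./playbook.yml"
  }
}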

Using AWS CodeBuild to execute Ansible playbooks
https://zai.dev/2024/04/06/using-aws-codebuild-to-execute-ansible-playbooks/ (Sat, 06 Apr 2024)

I wanted a clean and automatable way to package third-party software into the *.deb format (and other formats, if needed, in the future), and I had three ways to achieve that;

  • The simple way: Write a Bash script
  • The easy way: Write a Python script
  • My chosen method: Write an Ansible role

While any of these options could get me where I wanted, the Ansible route felt a lot cleaner: I can clearly state (and see) what packages I am building, either at the command-line level or at the playbook level, rather than maintaining a separate configuration file in some other format to drive what to build and where, as the Bash or Python approaches would require.

The playbook approach also allows me to monitor and execute a build on a remote machine, should I wish to build cross-platform or need larger resources for testing.

In this scenario, I’ll be executing the Ansible role locally on the CodeBuild instance.

Configuring the CodeBuild Environment

Using GitHub as a source

I have one git repository per Ansible playbook, so by linking CodeBuild to the repository in question I’m able to (eventually) automatically trigger the execution of CodeBuild upon a pushed commit on the main branch.

The only additional setting under sources that I define is the Source version, as I don’t want build executions happening for all branches (as that can get costly).

CodeBuild Environment

For the first iteration of this setup, I am installing the (same) required packages at every launch. This is not the best way to handle pre-installation in terms of cost and build speed. In this instance, I’ve chosen to ignore this and “brute-force” my way through to get a proof-of-concept.

  • Provisioning Model: On-demand
    • I’m not pushing enough packages to require a dedicated fleet, so spinning up VMs in response to a pushed commit (~5 times a week) is good enough.
  • Environment Image: Managed Image
    • As stated above, I had my focus towards a proof-of-concept that running Ansible under CodeBuild was possible. A custom image with pre-installed packages is the way to go in the long run.
  • Compute: EC2
    • Since I’m targeting *.deb format, I choose Ubuntu as the operating system. The playbook I’m expecting to execute doesn’t require GPU resources either.
    • AWS Lambda compute doesn’t offer an Ubuntu environment, nor can it execute Ansible directly; I’d have to write a Python wrapper to invoke the Ansible playbook, which is more overhead.
    • Depending on the build time and the size of the resulting package, I had to adjust the instance memory accordingly. This may simply be because I’m staging everything in the /tmp directory by default.

buildspec.yml

I store the following file at the root level of the same Git repository that contains the Ansible playbook.

version: 0.2

phases:
  pre_build:
    commands:
      - apt install -y ansible python3-botocore python3-boto3
      - ansible-galaxy install -r requirements.yaml
      - ansible-galaxy collection install amazon.aws
  build:
    commands:
      - ansible-playbook build.yaml
artifacts:
  files:
    - /tmp/*.deb

As stated above, I’m currently installing the required system packages prior to every interaction with Ansible. This line (apt install) should eventually be moved into a pre-built image that this CodeBuild environment sources from.
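
Such a pre-built image would be little more than a small Dockerfile pushed to ECR and selected as a custom image in the CodeBuild environment; a sketch (the base image and package list are assumptions):

FROM ubuntu:22.04

# Bake the build-time dependencies in once, instead of on every build
RUN apt-get update && \
    apt-get install -y ansible python3-botocore python3-boto3 && \
    rm -rf /var/lib/apt/lists/*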

I keep the role (and therefore, tasks) separate from the playbook itself, which is why I use ansible-galaxy to install the requirements. Each time the session is started, it pulls down a fresh copy of any requirements. This can differ from playbook to playbook.

I use the role for the execution steps, and the playbook (or inventory) to hold the settings that influence the execution, such as (in this scenario) what the package name is and how to package it.

I explicitly include the amazon.aws Ansible collection in this scenario as I’m using the S3 module to pull down sources (or builds of third-party software) and to push built packages up to S3. I go via Ansible rather than storing the sources in Git because of their size, and rather than CodeDeploy because I don’t plan on deploying the packages to infrastructure; they go to a repository instead.

I also had some issues using the Artifacts option within CodeBuild, which led to pushing from Ansible instead.

Finally, ansible-playbook can be executed once all the prerequisites are in place. The only adaptation needed at the playbook level is that localhost is listed as the target, which ensures that the playbook executes on the local (CodeBuild) machine.

---
- hosts: localhost
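
In full, build.yaml ends up being little more than a header pointing at the role; a sketch (the role name and variables file are hypothetical):

---
- hosts: localhost
  connection: local
  vars_files:
    - packages.yaml    # hypothetical file defining what to build
  roles:
    - package_builder  # hypothetical role containing the build tasks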

Once all the configuration and repository setup was done, the build executed successfully and I received my first Debian package built via CodeBuild using Ansible.

Packaging proprietary software for Debian using Ansible
https://zai.dev/2024/02/25/packaging-proprietary-software-for-debian-using-ansible/ (Sun, 25 Feb 2024)

Managing software installations for an animation studio can be a time-consuming process, especially if relying on the provided installation methods, and even more so when using a distribution that isn’t officially supported by the software.

I needed a streamlined workflow that allowed me to download DCCs (digital content creation applications) from their vendors, package them into *.deb files and deploy them to a local server for eventual installation on workstations. This lets me properly version-control the software at a system level, and rapidly build DCC-backed Docker containers without running the lengthy installers each time. I achieved this in the form of an Ansible role.

The Role

I will use SideFX® Houdini as an example for how to interpret this role.

{{ package }} refers to one item within packages

{{ source }} refers to an individual item within sources

packages:
  - name: houdini
    version: 20.0.625
    release: 1
    category: graphics
    architecture: amd64
    author: "WhoAmI? <someone@goes.here>"
    description: Houdini 20
    sources:
      - name: base
        s3:
          bucket: MyLocalPrivateS3Bucket
          object: "houdini/houdini-20.0.625-linux_x86_64_gcc11.2.tar.gz"
        path: /houdini.tar.gz
        archive: yes
    scripts:
      pre_build:
        - "{{ lookup('first_found', 'package_houdini.yaml') }}"
    repo:
      method: s3
      bucket: MyLocalPrivateS3Bucket
      object_prefix: "houdini"

---
- include_tasks: build.yml
  loop: "{{ packages }}"
  loop_control:
    loop_var: package

I want to execute an entire block in iteration, once per package to build. I want to use a block so that I can make use of the rescue: clause for cleanup in the event of a packaging failure, and to make use of the always: clause to release the package to a local repository.

To achieve this, I need to put the block in a separate task file and loop over the include_tasks task, as blocks don’t accept a loop: argument. I’ve also renamed the iteration variable to package, since I’ll be running further loops inside it.

- block:
  - name: create build directory
    file:
      path: /tmp/build
      state: directory
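
The rescue: and always: clauses aren’t repeated in the snippets that follow, so for orientation, the overall shape of build.yml is roughly this (a sketch; the cleanup paths and the release task file are assumptions):

- block:
  - name: create build directory
    file:
      path: /tmp/build
      state: directory
  # ... the remaining build and packaging tasks described below ...
  rescue:
  - name: clean up the build directories after a failure
    file:
      path: "{{ item }}"
      state: absent
    loop:
    - /tmp/build
    - /tmp/build.src
  always:
  - include_tasks: release.yml  # hypothetical task file that publishes the package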

Getting the installer archive

From an HTTP source

  - name: download archive
    ansible.builtin.get_url:
      url: "{{ source.url }}"
      dest: "{{ source.path }}"
      mode: '0440'
    loop: "{{ package.sources }}"
    loop_control:
      loop_var: source
    when: "source.url is defined"

If I download the package to an on-prem server first (to perform a virus scan), I may choose to deliver the files from an HTTP server.

From S3

  - name: download from s3 bucket
    amazon.aws.s3_object:
      bucket: "{{ source.s3.bucket }}"
      object: "{{ source.s3.object }}"
      dest: "{{ source.path }}"
      mode: get
    loop: "{{ package.sources }}"
    loop_control:
      loop_var: source
    when: "source.s3 is defined"

If I’m working with a cloud-based studio, I may opt to store the archive on S3 to keep costs down, especially if I hit multiple errors and need to re-run the playbook while debugging (as repeatedly bringing data into AWS from outside has a cost).

From the Ansible controller

  - name: copy archives
    copy:
      src: "{{ source.path }}"
      dest: "{{ source.path }}"
    loop: "{{ package.sources }}"
    loop_control:
      loop_var: source
    when:
      - "source.archive is not defined"
      - "source.url is not defined"

If it’s my first time and I’m on-premises, I may choose to deliver the package straight from my machine to a build machine

From network storage attached to the remote machine

Or if I decide to store the archive on the network storage that’s attached to a build machine, I can choose to pull the file from the network directly without the need to copy.

Getting to the Application

  - name: extract archives
    unarchive:
      src: "{{ source.path }}"
      dest: /tmp/build.src
      remote_src: yes
    loop: "{{ package.sources }}"
    loop_control:
      loop_var: source
    when: "source.archive is defined"

Should I need to extract an archive, I provide an archive: yes attribute on the source in iteration. The value yes itself isn’t checked; the mere presence of the attribute is enough for the when: clause to trigger this task.

  - name: copy archives
    copy:
      src: "{{ source.path }}"
      dest: /tmp/build.src
    loop: "{{ package.sources }}"
    loop_control:
      loop_var: source
    when: "source.archive is not defined"

In the other case, where nothing needs to be extracted, we can just copy the source over to the build directory.

  - name: prepare package
    include_tasks: "{{ item }}"
    loop: "{{ package.scripts.pre_build }}"
    vars:
      build_prefix: /tmp/build

Finally, we need to lay out the contents so that dpkg can create the package correctly. The packaging process requires a directory that will act as the “target root directory”.

If I have the file /tmp/build/test and build the directory /tmp/build, I get a build.deb file that will create a /test file once installed.

Following this logic, I need to install the application I want into /tmp/build as if it were the root filesystem.

Since every application is different, I implemented the concept of pre_build scripts: the role itself handles pulling sources and releasing packages, preparing and destroying the build directories, and rendering the templates and driving the packaging system. What it doesn’t handle is how to get the contents of the application itself into place.

package.scripts.pre_build points to a list of task files to run to prepare the application for packaging. These should not be specific to any distribution. As an example, for SideFX® Houdini, my pre_build is the following:

- name: find houdini installers
  ansible.builtin.find:
    paths: /tmp/build.src
    patterns: 'houdini.install'
    recurse: yes
  register: found_houdini_installers

- name: install houdini to directory
  shell: "{{ item.path }} \
    --make-dir \
    --auto-install \
    --no-root-check \
    /tmp/build/opt/hfsXX.X"
  loop: "{{ found_houdini_installers.files }}"

Here I use the find module to search for any Houdini installers (since the installer may be nested inside the extracted archive, I use recurse: yes), and then run the Houdini installer as if I were at the machine.

There are some additional flags, removed from the snippet above, that make the installation automatic (--acceptEULA) and selective (--no-install-license-server). The components skipped this way I plan to package individually (e.g. houdini-sesinetd, maya2024-houdini-engine, etc.), but I haven’t got round to it yet.

  - include_tasks: debian.yml
    when: ansible_os_family == 'Debian'

Finally, since I’m packaging for Debian, I place the procedure for building a Debian package into a debian.yml file. I plan to extend the packaging role to cover Arch Linux and Fedora-based distributions in the future.

- name: create debian directory
  file:
    path: /tmp/build/DEBIAN
    state: directory

- name: render debian control template
  template:
    src: debian.control.j2
    dest: /tmp/build/DEBIAN/control

At the “root” level, we require a DEBIAN directory, with a control file inside. This is the minimum to get a deb package. I’m making use of a Jinja2 template for the control file as it’s cleaner to manage down the road.

DEBIAN/control

Package: {{ package.name }}
Version: {{ package.version }}-{{ package.release }}
Section: {{ package.category }}
Priority: optional
Architecture: {{ package.architecture }}
Maintainer: {{ package.author }}
Description: {{ package.description }}

Building the package

- name: execute package build
  shell: "dpkg --build /tmp/build /tmp/{{ package.name }}-{{ package.version }}-{{ package.release }}-{{ package.architecture }}.deb"

- set_fact:
    local_package_location: "/tmp/{{ package.name }}-{{ package.version }}-{{ package.release }}-{{ package.architecture }}.deb"

I set the name of the resulting *.deb to follow a strict naming convention to avoid file conflicts. Lastly, once the package is built, I use set_fact to return the path to the *.deb file from the sub-task so that build.yml can deploy it to wherever it needs to go, be it S3 or a local repository. I do this because I may be building more than just a Debian package in the future.
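
The deployment step back in build.yml then only needs this fact; for the S3 case it might look something like the following (a sketch based on the repo settings in the package definition, not the exact task I run):

  - name: upload the built package to the repository bucket
    amazon.aws.s3_object:
      bucket: "{{ package.repo.bucket }}"
      object: "{{ package.repo.object_prefix }}/{{ local_package_location | basename }}"
      src: "{{ local_package_location }}"
      mode: put
    when: package.repo.method == 's3'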

Authenticating DigitalOcean for Terraform OSS
https://zai.dev/2023/12/05/authenticating-digitalocean-for-terraform-oss/ (Tue, 05 Dec 2023)
Terraform DigitalOcean Provider with API tokens from DigitalOcean

Scenario

Why?

I’m diving into Terraform as part of my adventure into the DevOps world, which I’ve developed an interest in over the past few months.

  • I use 2 workstations with DigitalOcean
    • MacBook; for when I’m out and about
    • ArchLinux; for when I’m at home

Generating the API Tokens

Under API, located within the dashboard’s menu (on the left-hand side), I’m presented with the option to Generate New Token.

Followed by an interface to define;

  • Name
    • I typically name this token zai.dev or personal, as it will be shared across my devices. While this approach isn’t the most secure (ideally, I should have one token per machine), I’m going for the convenience of having one token for my user profile.
  • Expiry date
    • Since I’m sharing the token across workstations (including my laptop, which may be prone to theft), I set the expiration to the lowest possible value of 30 days.
  • Write permissions
    • Since I’ll be using Terraform, and its main purpose is to ‘sculpt’ infrastructure, the token it uses to connect to DigitalOcean needs write permissions.

Authenticating DigitalOcean Spaces

As the Terraform provider allows the creation of Spaces, DigitalOcean’s equivalent of AWS’ S3 buckets, I should also create keys for it. By navigating to the “Spaces Keys” tab under the API option, I can repeat the same steps as above.

Installing the Tokens

Continuing from the setup of environment variables in my Synchronizing environment variables across Workstations post, I need to add 3 environment variables for connecting to DigitalOcean.

  • DIGITALOCEAN_TOKEN
    • This is the value that is given to you after hitting “Generate Token” on the Tokens tab
  • SPACES_ACCESS_KEY_ID
    • This is the value that is given to you after hitting “Generate Token” on the Spaces Tokens tab
  • SPACES_SECRET_ACCESS_KEY
    • This is the one-time value that is given to you alongside the SPACES_ACCESS_KEY_ID value

Whilst I’m at it, I’m going to add the following environment variables so that I can use any S3-compliant tools to communicate with my object storage, such as the s3 copy command to push build artifacts

  • AWS_ACCESS_KEY_ID=${SPACES_ACCESS_KEY_ID}
  • AWS_SECRET_ACCESS_KEY=${SPACES_SECRET_ACCESS_KEY}

To keep things tidy, I created a separate environment file for DigitalOcean, under ~/.config/zai/env/digitalocean.sh

export DIGITALOCEAN_TOKEN="<DO_TOKEN>"
export SPACES_ACCESS_KEY_ID="<SPACES_KEY>"
export SPACES_SECRET_ACCESS_KEY="<SPACES_SECRET>"
export AWS_ACCESS_KEY_ID=${SPACES_ACCESS_KEY_ID}
export AWS_SECRET_ACCESS_KEY=${SPACES_SECRET_ACCESS_KEY}
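
With these variables exported, the Terraform provider configuration itself can stay free of secrets; a minimal sketch (the environment-variable fallbacks are an assumption based on the provider’s documented behaviour):

terraform {
  required_providers {
    digitalocean = {
      source = "digitalocean/digitalocean"
    }
  }
}

# token, spaces_access_id and spaces_secret_key fall back to
# DIGITALOCEAN_TOKEN, SPACES_ACCESS_KEY_ID and SPACES_SECRET_ACCESS_KEY
# when not set explicitly.
provider "digitalocean" {}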

Synchronizing environment variables across Workstations
https://zai.dev/2023/11/30/synchronizing-env-vars-across-workstations/ (Thu, 30 Nov 2023)

I need to have the configuration for my applications and APIs synchronized across multiple machines.

What’s my situation?

  • I use at least two workstations
    • MacBook Pro; for use when out and about
    • ArchLinux Desktop; for use when at home
    • Ubuntu Server; for hosting permanent services

What does this mean?

As I’m working across multiple devices, I need to make sure that the equivalent configuration is available on all of them, immediately. I use SyncThing to keep my personal configuration, such as environment variables, synchronized across all devices. I don’t use Git because there is the extra step of manually pulling down the configuration each time, in addition to not always having access to my local Git repository.

macOS and Linux are both UNIX-like platforms, so I can keep my configuration files uniform. I use Bash scripts to define the environment variables needed for any APIs that I use.

How did I achieve it?

Directory structure & files needed

I use ~/.config/zai as my configuration directory and set SyncThing to watch it, then configure the other workstations to sync it to the same path. A file named rc.sh lives in here to centralize anything I want loaded when a terminal starts.
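
The layout ends up looking roughly like this (digitalocean.sh is the example from the DigitalOcean post; the other files follow the same pattern):

~/.config/zai/
├── rc.sh             # sourced by the shell on start-up
└── env/
    ├── digitalocean.sh
    └── ...           # one file per service or API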

Installing SyncThing

Installing on Linux

Luckily, for most Linux distributions, SyncThing is already available in the default repositories.

pacman -S syncthing # Arch Linux
apt install syncthing # Debian / Ubuntu

# Enable & Start Syncthing
systemctl enable --now syncthing@<username>

Installing on macOS

On macOS it’s slightly less straightforward, but the instructions are provided within the downloadable ZIP file for macOS.

Sourcing the rc.sh from the shell

The following snippet needs to be placed in a shell initialization script, which differs depending on the platform. The source command tells Bash to read (and execute) the file that follows it.

source ~/.config/zai/rc.sh

macOS

macOS will execute the ~/.bash_profile script upon opening a new Bash shell. I switch between zsh and bash from time to time, so either I manually execute /usr/bin/bash to take me to the Bash environment, or I’d just change the default shell under the Terminal properties.

Linux

Most Linux distributions will execute ~/.bashrc upon opening a new interactive shell, assuming that Bash is the default shell.
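
On either platform, the line can be wrapped in a guard so that a freshly set-up machine (where SyncThing hasn’t synced the directory yet) doesn’t print an error; a small sketch:

# Only source the shared configuration once SyncThing has delivered it
if [ -f ~/.config/zai/rc.sh ]; then
    source ~/.config/zai/rc.sh
fi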

rc.sh

I keep this file simple: it loops through all the Bash files inside the env/ subdirectory and sources them. This saves me from maintaining a single file with numerous lines.

for file in ~/.config/zai/env/*.sh; do
    source "$file"
done

What’s next?

I’m diving into the world of DevOps, and will need to configure my local systems to;

  • Hold the API Credentials for the cloud service(s) of my choice
  • Hold the API Credentials for an S3 bucket location of my choice