Building a cloud-init template on ZFS
Greetings!

Let’s figure out how to correctly build Cloud-Init templates in Proxmox VE when using ZFS. The material is based on practical experience and typical issues encountered when migrating from classical storage schemes (mdadm + LVM) to ZFS.

Introduction

My acquaintance with Proxmox VE began with version 7, when the system was often deployed on top of Debian manually. Initially, I used mdadm + LVM because it was simple and predictable. After migrating to server hardware and moving to ZFS (with VM restoration from backups), a problem emerged: when deploying virtual machines (VMs) via Terraform, dynamic inventory in Ansible stopped working correctly.

The cause turned out to be neither Terraform nor Ansible but something deeper: an incompatibility between the ZFS storage model (zvol) and the qcow2 format. That analysis became the foundation of this article.

In this article we will cover:

Software used in the article:

| Software | Version |
|---|---|
| Proxmox VE | 9.1 |
| Debian | 13 |

To follow the rest of the material, it helps to go through the theory first.

The Problem: ZFS and QCOW2

The key issue is the difference in storage models.

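In Proxmox VE with ZFS, VM disks are stored as zvols, which are block devices, while qcow2 is a file format that needs a filesystem underneath.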

This creates an architectural conflict:

| Characteristic | qcow2 | zvol |
|---|---|---|
| Type | File | Block device |
| Requires filesystem | Yes | No |
| Copy-on-Write | Inside the file | At the ZFS level |
| Usage in Proxmox VE | Directory storage | ZFS storage |

Differences between qcow2, raw, zvol, and ZFS dataset

QCOW2

qcow2 is a file format with support for:

Essentially, it’s a layer on top of the filesystem.
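Because qcow2 is a container format rather than a plain byte stream, an image file can be recognized by its header. A minimal sketch, assuming a shell with `head -c` and `od` available; the function name `is_qcow2` is hypothetical, and the check relies only on the 4-byte magic `QFI\xfb` that qcow2 files start with:

```shell
# is_qcow2 FILE: succeed if FILE starts with the qcow2 magic bytes "QFI\xfb".
# Hypothetical helper for illustration; `qemu-img info FILE` reports far more detail.
is_qcow2() (
  magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "514649fb" ]
)

# Usage: check the image before deciding how to import it.
if is_qcow2 ./debian-13-generic-amd64.qcow2 2>/dev/null; then
  echo "qcow2 image: convert to raw when importing to ZFS storage"
fi
```

If the file is missing or is a raw image, the function simply returns a non-zero status and nothing is printed.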

The problem on ZFS is double CoW:

Consequences:

Therefore, the Proxmox VE documentation does not recommend using qcow2 on ZFS.

RAW

RAW is a linear image without additional logic.

Advantages: minimal overhead, predictable performance, and a natural fit for ZFS.

ZVOL

ZVOL is a block device inside ZFS.

Key parameter: volblocksize (block size). It directly affects I/O performance, so it’s important to choose a suitable value.

Typical values:

By default, Proxmox VE creates a zvol with a block size of 16K, which is a sensible choice for most VM disks on ZFS.
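To see why the block size matters, consider how many bytes have to be rewritten for a single guest write. This is a rough model, not a measurement: it assumes every volblocksize block touched by a write is rewritten in full and ignores compression and caching. The function name `zvol_bytes_rewritten` is hypothetical.

```shell
# zvol_bytes_rewritten OFFSET SIZE VOLBLOCKSIZE  (all values in bytes)
# Prints the number of bytes rewritten if each touched block is written in full.
zvol_bytes_rewritten() (
  offset=$1 size=$2 bs=$3
  first=$(( offset / bs ))                 # index of the first touched block
  last=$(( (offset + size - 1) / bs ))     # index of the last touched block
  echo $(( (last - first + 1) * bs ))
)

zvol_bytes_rewritten 0 4096 16384       # aligned 4K write on a 16K zvol
zvol_bytes_rewritten 12288 8192 16384   # 8K write that straddles two blocks
```

Under this model the first call reports 16384 bytes for a 4096-byte write, and the second reports 32768 bytes because the write crosses a block boundary.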

ZFS Dataset

A ZFS dataset is a filesystem inside ZFS.

Used for:

Features:

Why the conflict arises

The problem appears when trying to use a file image (qcow2) as a source for a block device (zvol).

In practice it looks like this:

A typical error in instructions:

```bash
qm importdisk 7777 ./debian-13-generic-amd64.qcow2 storage --format qcow2
```

Here qcow2 is explicitly set, which is incorrect for zvol.

Result:

Solution options

Approach 1: ZFS Dataset (directory storage)

Create a dataset with a filesystem, for example:

```text
rpool/data/images
```

Then add it to Proxmox VE as directory storage.

Advantages: simple setup.

Disadvantages: double CoW.

Conclusion: acceptable, but not optimal.
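Setting up this approach might look like the sketch below. It assumes the dataset from the example is mounted at `/rpool/data/images` (the default when `rpool` itself is mounted at `/rpool`) and uses a hypothetical storage ID `zfs-images`. The `run` helper only prints the commands by default; set `DRY_RUN=0` on the Proxmox VE host to actually execute them.

```shell
# run CMD...: print the command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

setup_dir_storage() {
  # Create the dataset (a ZFS filesystem) that will hold file-based images.
  run zfs create rpool/data/images
  # Register its mountpoint as directory storage for VM disk images.
  # "zfs-images" is a hypothetical storage ID; the path is an assumed mountpoint.
  run pvesm add dir zfs-images --path /rpool/data/images --content images
}

setup_dir_storage
```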

Approach 2: ZVOL (RAW import)

When importing the disk into an empty virtual machine, specify the RAW format instead of the qcow2 used in typical instructions:

```bash
qm importdisk <VMID> <image> <storage> --format raw
```

For example, for the machine with ID 7777 from the typical instructions, the command would be:

```bash
qm importdisk 7777 ./debian-13-generic-amd64.qcow2 storage --format raw
```

During import, the qcow2 image is converted to raw, and the raw data is written to the zvol.

Advantages: no double CoW, maximum performance, and Cloud-Init works correctly.
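The format can also be derived from the storage type instead of being hard-coded. A minimal sketch, assuming the usual `pvesm status` layout with the storage name in the first column and its type in the second. The function names are hypothetical, and only the two storage types discussed in this article are mapped explicitly.

```shell
# storage_type STORAGE_ID: read `pvesm status` output on stdin, print the Type column.
storage_type() {
  awk -v id="$1" 'NR > 1 && $1 == id { print $2 }'
}

# import_format TYPE: pick the --format value for qm importdisk.
import_format() {
  case "$1" in
    zfspool) echo raw ;;     # zvol-backed storage: block devices
    dir)     echo qcow2 ;;   # directory storage: file images
    *)       echo raw ;;     # assumption: raw as the conservative default
  esac
}

# Usage on a Proxmox VE host (placeholders as in the command above):
#   fmt=$(import_format "$(pvesm status | storage_type <storage>)")
#   qm importdisk <VMID> <image> <storage> --format "$fmt"
```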

Building a Cloud-Init template on ZFS

That is not much theory, but readers unfamiliar with ZFS may still find it confusing. So let’s walk step by step through the typical process of creating a virtual machine template with Cloud-Init, taking the specifics of ZFS into account.

1. Image import

  1. Increase disk size:
```bash
qemu-img resize ./debian-13-generic-amd64.qcow2 32G
```

Here debian-13-generic-amd64.qcow2 is the Debian cloud image, and 32G is the final size (32 GiB).

  2. Create a new virtual machine without a disk:
```bash
qm create 9998 --name "debian-13-ci" --memory 2048 --cores 2 --net0 virtio,bridge=vnet01
```

Explanation:

  3. Import the qcow2 image in raw format (key difference):
```bash
qm importdisk 9998 debian-13-generic-amd64.qcow2 vm-hdd --format raw
```

Here vm-hdd is the name (storage ID) of your ZFS storage in Proxmox VE; the imported disk is created on it as a zvol.

  4. Set up the SCSI controller and add the previously imported disk:
```bash
qm set 9998 --scsihw virtio-scsi-single --scsi0 vm-hdd:vm-9998-disk-0,discard=on
```
  5. Update boot order:
```bash
qm set 9998 --boot order=scsi0
```
  6. Add Cloud-Init disk:
```bash
qm set 9998 --ide1 vm-hdd:cloudinit
```
  7. Configure user, password and SSH for cloud-init:
```bash
# --- User ---
# Create user:
qm set 9998 --ciuser ansible
qm set 9998 --cipassword '<USER_PASSWORD>'
# Add SSH key:
qm set 9998 --sshkeys ~/.ssh/id_ed25519.pub

# ------- NETWORK -------
# Use ONE of the two ipconfig0 variants; if both are run, the last one wins.
# Configure IP via DHCP:
qm set 9998 --ipconfig0 ip=dhcp
# Static IP:
qm set 9998 --ipconfig0 ip=10.10.10.254/24,gw=10.10.10.1
# Set DNS server address 10.10.10.15:
qm set 9998 --nameserver 10.10.10.15
# Set search domain infra.lan:
qm set 9998 --searchdomain infra.lan

# ----- Updates -----
# Use ONE of the two ciupgrade variants; if both are run, the last one wins.
# Install updates on startup:
qm set 9998 --ciupgrade 1
# Do not install updates:
qm set 9998 --ciupgrade 0
```
  8. Add a serial port:
```bash
qm set 9998 --serial0 socket --vga serial0
```
  9. Enable QEMU Guest Agent for interaction between hypervisor and guest system:
```bash
qm set 9998 --agent enabled=1
```
  10. Save the machine as a template:
```bash
qm template 9998
```
  11. Create a clone of the previously assembled template to verify Cloud-Init functionality:

  12. Run the test machine and verify:

Done! The Cloud-Init template now works correctly, and its disk is stored as a zvol on your ZFS storage.
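The final clone-and-verify steps can be sketched as follows. This is only an illustration: the test VM ID `9999` and the name `ci-test` are hypothetical, and the checks shown (dumping the generated user data and pinging the guest agent) are just one possible way to confirm that Cloud-Init and the agent respond. The `run` helper prints the commands by default; set `DRY_RUN=0` on the Proxmox VE host to execute them.

```shell
# run CMD...: print the command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# verify_template TEMPLATE_ID TEST_ID
verify_template() {
  run qm clone "$1" "$2" --name ci-test --full 1   # full clone of the template
  run qm cloudinit dump "$2" user                  # show the generated user data
  run qm start "$2"
  # Needs the guest to finish booting and qemu-guest-agent to be installed in it;
  # repeat the ping later if it fails right after start.
  run qm guest cmd "$2" ping
}

verify_template 9998 9999
```

When the checks pass, the test clone can be removed and the template used for real deployments.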

2. Adding Cloud-Init to an existing template

If you recently learned about Cloud-Init and have already managed to build templates for typical virtual machines for your infrastructure, don’t worry!

There is a way to add Cloud-Init support to existing templates. Let’s look at an example with Debian 13 Trixie.

  1. Make a full copy of the template (Full Clone).
  2. Connect to the terminal of the new machine via VNC or SSH.
  3. Switch to the root user (if using a custom account):
```bash
sudo -i
```
  4. Update the system:
```bash
apt update && apt full-upgrade -y
```
  5. Install the guest agent (if you haven’t done so before) and cloud-init package:
```bash
apt install -y qemu-guest-agent cloud-init
```
  6. Create a new Cloud-Init configuration file for Proxmox VE and open it in a text editor:
```text
/etc/cloud/cloud.cfg.d/99-pve.cfg
```

Insert a line like this into the file:

```yaml
datasource_list: [ NoCloud, ConfigDrive ]
```

Save the changes and exit the text editor.

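  7. Clean up logs and machine-id:
```bash
# Reset cloud-init state and remove its logs so it runs again on first boot:
cloud-init clean --logs
# Leave an empty /etc/machine-id so a new ID is generated on first boot:
rm -f /etc/machine-id
truncate -s 0 /etc/machine-id
```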
  8. If desired, you can clean up system logs and apt cache:
```bash
journalctl --rotate
journalctl --vacuum-time=1s
apt clean
```
  9. Shut down the machine:
```bash
poweroff
```
  10. Switch to the hypervisor terminal and add the Cloud-Init disk to this machine:
```bash
qm set <VMID> --ide2 <storage>:cloudinit
```
  11. Then add a serial port:
```bash
qm set <VMID> --serial0 socket --vga serial0
```
  12. Enable the agent:
```bash
qm set <VMID> --agent enabled=1
```
  13. Save this machine as a template:
```bash
qm template <VMID>
```

After this you can add additional Cloud-Init parameters and verify the updated template works.
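The datasource file created inside the guest can also be written without opening an editor, which is handy when preparing several templates. A minimal sketch: the function name `write_pve_datasource_cfg` and its optional root-directory argument are hypothetical, added so the function can be tried against a scratch directory first.

```shell
# write_pve_datasource_cfg [ROOT]: write the datasource list used in this article
# to ROOT/etc/cloud/cloud.cfg.d/99-pve.cfg (ROOT defaults to empty, i.e. the real /etc).
write_pve_datasource_cfg() (
  dir="${1:-}/etc/cloud/cloud.cfg.d"
  mkdir -p "$dir" || exit 1
  printf '%s\n' 'datasource_list: [ NoCloud, ConfigDrive ]' > "$dir/99-pve.cfg"
)

# Try it against a scratch directory before touching the guest's real /etc:
scratch=$(mktemp -d)
write_pve_datasource_cfg "$scratch"
cat "$scratch/etc/cloud/cloud.cfg.d/99-pve.cfg"
rm -rf "$scratch"
```

Inside the guest, calling the function as root without arguments writes the file to its real location.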

Typical errors

The most common issues:

These errors often don’t manifest immediately, but create problems at scale.

Conclusion

If you use Proxmox VE with ZFS, the optimal scheme would be:

ZFS already implements CoW, snapshots, and thin provisioning. Using qcow2 on top of ZFS duplicates these mechanisms, which degrades performance and complicates the architecture.

Copyright Notice

Author: Kirill Reshetnikov

Link: https://r4ven.me/en/virtualization/building-cloud-init-template-on-zfs/

License: CC BY-NC-SA 4.0

Blog materials may be used with attribution to the author and source, for non-commercial purposes, and under the same license.
