Powerful and manageable clusters need a consistent and lightweight node OS. Qlustar’s unique 3rd-generation image generator is built exactly for this purpose. It is designed so that, starting from the standard Linux distribution packages, the resulting Qlustar OS images contain exactly the minimum set of programs and files that a specific node needs for its assigned tasks. Incidentally, Qlustar’s core OS existed long before the (ex-)company of that name was even founded.
Nodes in Qlustar clusters always run their OS in RAM (apart from the head-node, or both head-nodes in a redundant HA setup). OS images are compressed squashfs file systems that can be distributed to thousands of nodes without network congestion, thanks to their extremely small size (150-250 MB) and QluMan’s unique multicast distribution technique.
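To put the effect of multicast into numbers, here is a rough back-of-the-envelope sketch. It is an idealized model, not a description of QluMan’s actual protocol: it assumes a 200 MB image and ignores retransmissions and protocol overhead. The function name `server_traffic_mb` is made up for this illustration.

```python
def server_traffic_mb(image_mb, nodes, multicast):
    """Idealized data volume (in MB) the boot server has to send."""
    # Unicast: one full copy of the image per node.
    # Multicast: a single copy that all nodes receive at the same time
    # (retransmissions and protocol overhead are ignored in this model).
    return image_mb if multicast else image_mb * nodes


if __name__ == "__main__":
    for nodes in (100, 1000):
        unicast = server_traffic_mb(200, nodes, multicast=False)
        multi = server_traffic_mb(200, nodes, multicast=True)
        print(f"{nodes} nodes: unicast {unicast} MB, multicast {multi} MB")
```

In this simplified model the unicast volume grows linearly with the node count, while the multicast volume stays at the size of one image.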
Qlustar OS images are modular: they are composed of a core needed by any type of node and optional modules for additional functionality (e.g. to integrate into a particular workload manager or to activate support for InfiniBand, parallel file-systems, High-Availability, GPU computing, etc.). By selecting from these modules, an OS image with just the functionality needed for a particular node is easily created using the QluMan GUI.
Creating images is as simple as it can get: Select the required edge platform (a.k.a. flavor, e.g. Ubuntu xenial, CentOS 7, …), the Qlustar version and optionally a chroot in the QluMan image dialog. Then add the image modules needed for the desired functionality of the nodes this image should run on. Finally, press the /Write/ button. The image is then assembled and ready to be used in less than a minute.
Fig.1 - The image creation process.
While simplicity in the creation process is crucial, it’s equally important that the content of the resulting images is reproducible. Due to Qlustar’s unique technology, where image modules are delivered as OS packages, it’s guaranteed that any two images generated from the same set of image modules with equal versions will be identical to the last bit. Versioning of images is thereby implicit, with the version numbers being inherited from the versions of the corresponding modules. Hence, Qlustar extends the familiar Debian/RPM packaging concept from application packages to image components.
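The idea of implicit versioning can be illustrated with a small sketch. This is not how Qlustar computes or stores image versions; it only shows that an identifier derived purely from the edge platform and the set of module versions is the same whenever those inputs are the same. The function `image_id`, the module names and the version strings are all hypothetical.

```python
import hashlib


def image_id(edge_platform, modules):
    """Derive a short identifier from the platform and a {module: version} dict."""
    # Sort the modules so that the order in which they were selected
    # has no influence on the result.
    parts = [edge_platform]
    parts += [f"{name}={version}" for name, version in sorted(modules.items())]
    digest = hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
    return digest[:16]


if __name__ == "__main__":
    # Hypothetical module names and versions, for illustration only.
    a = image_id("ubuntu-xenial", {"core": "1.0", "slurm": "2.0"})
    b = image_id("ubuntu-xenial", {"slurm": "2.0", "core": "1.0"})
    c = image_id("ubuntu-xenial", {"core": "1.0", "slurm": "2.1"})
    print(a == b)  # same modules and versions
    print(a == c)  # one module version differs
```

Two selections with equal modules and versions map to the same identifier, while bumping a single module version yields a different one.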
As opposed to other /Core OS/ approaches, a number of advantages result from the fact that Qlustar images are assembled from the same packages that are used with conventional installations (using the default installer of the distribution, e.g. Ubuntu, CentOS, …):
Another Qlustar feature that stands out is the simplicity with which the disk(s) of a node can be auto-configured. Whether you just have single disks that should serve a scratch and/or var filesystem, or a sophisticated ZFS RAID setup for Lustre/BeeGFS storage targets: Assigning a so-called QluMan disk-config to the nodes and rebooting them is all it takes to have them fully configured.
Obviously, diskless clusters are supported by design, given that the Qlustar node OS runs in RAM. It’s just a selectable option in the QluMan disk setup dialog.
Using net-boot images is the method of choice to deliver the OS to cluster nodes. But there are two major challenges when working with them: a) keeping them small while still providing all the software and services that will be needed to run on them, and b) providing simple configurations while still having the flexibility to also support exotic ones. Qlustar provides several elegant and powerful solutions to overcome these obstacles:
A matching chroot can be selected when creating a Qlustar image. During the node boot phase, the chosen chroot is then loaded underneath the root filesystem (from the image) using an NFS/FUSE based UnionFS mechanism. The chroot holds a normal installation of the chosen edge platform, and any software package can be installed in it using standard apt-get methods. The contents of such installed packages are available immediately on any node that has booted the image.
When working on a node booted with a loaded UnionFS chroot, you won’t be able to distinguish whether an executable/file comes from the image itself or from the chroot. This mechanism transforms the minimal node OS into a full-blown installation with thousands of installable distribution packages at your fingertips.
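The lookup behaviour of such a layered filesystem can be modelled in a few lines. The following is a conceptual sketch only, not the NFS/FUSE mechanism Qlustar uses: it assumes two plain directories standing in for the image (upper layer) and the chroot (lower layer), and the helper `resolve` is hypothetical.

```python
import os
import tempfile


def resolve(rel_path, layers):
    """Return the first existing path for rel_path, searching the layers in order."""
    for layer in layers:
        candidate = os.path.join(layer, rel_path)
        if os.path.lexists(candidate):
            return candidate
    return None  # not present in any layer


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as image, tempfile.TemporaryDirectory() as chroot:
        # "bin/sh" exists in both layers, "usr/bin/gcc" only in the chroot.
        for layer, rel in ((image, "bin/sh"), (chroot, "bin/sh"), (chroot, "usr/bin/gcc")):
            path = os.path.join(layer, rel)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "w").close()
        layers = [image, chroot]  # the image sits on top of the chroot
        print(resolve("bin/sh", layers).startswith(image))
        print(resolve("usr/bin/gcc", layers).startswith(chroot))
```

A caller of `resolve` only sees one merged tree: files from the image take precedence, and everything else is transparently served from the chroot.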
To add further flexibility, a specific NFS directory is searched for executable scripts in a late phase of the boot process. The scripts located in that directory are then executed one by one. This mechanism can be used to apply arbitrary modifications and customizations to the node OS. It can equally well be employed to start custom services that are not part of the booted image.
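A minimal sketch of such a late-boot hook might look as follows. It is not Qlustar’s implementation: the function name `run_boot_scripts` is hypothetical, and running the scripts in alphabetical order is an assumption made here so that the sequence is predictable.

```python
import os
import subprocess


def run_boot_scripts(script_dir):
    """Run every executable file in script_dir, one after the other.

    Returns a list of (script name, exit code) tuples.
    """
    results = []
    # Alphabetical order is an assumption of this sketch.
    for name in sorted(os.listdir(script_dir)):
        path = os.path.join(script_dir, name)
        # Skip sub-directories and files without the executable bit.
        if os.path.isfile(path) and os.access(path, os.X_OK):
            proc = subprocess.run([path], check=False)
            results.append((name, proc.returncode))
    return results
```

Because `check=False` is used, a failing script is recorded with its exit code but does not prevent the remaining scripts from running.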
Controlled content customization is achieved in two possible ways: a) by putting the content in a certain path that is checked during the image creation process (modification before node boot) and b) by a particular boot job that consults an NFS configuration file (modification during node boot). In that file, one can specify directories to be created, files to be copied from an NFS path to a local path, or links that need to be created in the RAM-based root filesystem.
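To illustrate variant b), here is a sketch of a boot job that processes such a configuration. The line format used (`dir`, `copy`, `link` followed by paths) is invented for this example and is not Qlustar’s actual syntax; the function `apply_boot_config` and its `root` parameter are hypothetical as well. All local paths are created below `root`, so the sketch can be tried in a scratch directory.

```python
import os
import shutil


def apply_boot_config(lines, root):
    """Apply 'dir', 'copy' and 'link' entries below the directory root."""

    def local(path):
        # Map an absolute node path to a path below root.
        return os.path.join(root, path.lstrip("/"))

    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        action, *args = line.split()
        if action == "dir" and len(args) == 1:
            os.makedirs(local(args[0]), exist_ok=True)
        elif action == "copy" and len(args) == 2:
            # args[0]: source file (e.g. on NFS), args[1]: local destination
            dst = local(args[1])
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(args[0], dst)
        elif action == "link" and len(args) == 2:
            # args[0]: link target, args[1]: name of the symlink to create
            link = local(args[1])
            os.makedirs(os.path.dirname(link), exist_ok=True)
            os.symlink(args[0], link)
        else:
            raise ValueError(f"unsupported entry: {line!r}")
```

A configuration for this sketch could contain lines such as `dir /scratch/tmp` or `link /scratch/tmp /var/tmp-link`; unknown entries raise an error rather than being silently ignored.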
Fig.2 - The node boot process.