# CemAcpt

CemAcpt is an acceptance testing library and command-line application for running acceptance tests for the CEM modules.
CemAcpt uses Terraform for provisioning test nodes, and Goss to execute acceptance tests. For provisioning nodes in GCP, CemAcpt also uses the gcloud CLI.
## Installation

```sh
gem install cem_acpt
```

`cem_acpt` was developed using Ruby 3.2.1, but other Ruby versions >= 3.0.0 should work.
## Usage

### Quickstart

Make sure Terraform and the gcloud CLI are installed and in your `PATH`. Instructions for installing Terraform can be found here, and instructions for installing the gcloud CLI can be found here. Then, navigate to the root of the cem_linux module and run the following command:

```sh
cem_acpt --config ./cem_acpt_config.yaml
```
### Command-line

```sh
cem_acpt -h | --help
cem_acpt_image -h | --help
```

If you do not delete `Gemfile.lock` before running `bundle install`, you may encounter dependency version errors. Just delete `Gemfile.lock` and run `bundle install` again to get past these.
### Test directory structure

CemAcpt expects a specific directory structure for your acceptance tests. The directory structure is as follows:

```
spec
└── acceptance
    └── <framework>_<os>-<version>_firewalld_<profile>_<level>
        ├── goss.yaml # Goss test file
        └── manifest.pp # Puppet manifest to apply
```

For example, the following directory structure would be valid:

```
spec
└── acceptance
    └── cis_rhel-8_firewalld_server_2
        ├── goss.yaml
        └── manifest.pp
```

The directory name is used to generate test data. See Test data for more information. The file `goss.yaml` is used as the Goss test file, and the file `manifest.pp` is used as the Puppet manifest to apply.
## Configuration

There are several ways to configure CemAcpt, outlined below in order of precedence. Options specified later in the list will be merged with or override options specified earlier in the list. If an option is specified in multiple places, the last place it is specified will be used.

- Environment variables
- User-specific config file
- Config file specified by the `--config` option
- Command-line options

You can view your working config by adding the `-Y` flag to any command. This will print the merged config to STDOUT as a YAML document. You can also use the `-X` flag to print an explanation of how the config was merged.
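Conceptually, the precedence rules amount to a deep merge in which later sources win on conflicting keys while nested hashes are combined. A minimal sketch of that behavior (hypothetical, not CemAcpt's actual implementation):

```ruby
# Hypothetical sketch of precedence-by-deep-merge; not CemAcpt's actual code.
# Later sources win on conflicting keys; nested hashes are merged recursively.
def deep_merge(base, override)
  base.merge(override) do |_key, old_val, new_val|
    if old_val.is_a?(Hash) && new_val.is_a?(Hash)
      deep_merge(old_val, new_val)
    else
      new_val
    end
  end
end

env_config  = { 'node_data' => { 'disk_size' => 20, 'machine_type' => 'e2-small' } }
file_config = { 'node_data' => { 'disk_size' => 40 } }

deep_merge(env_config, file_config)
# => { 'node_data' => { 'disk_size' => 40, 'machine_type' => 'e2-small' } }
```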
### Environment variables

Environment variables are the most basic way to configure CemAcpt. They are useful for setting sensitive information like API keys or passwords. Environment variables are prefixed with `CEM_ACPT_` and are converted to the nested structure of the config file, if applicable. Double underscores (`__`) are used to separate key levels. For example, the environment variable `CEM_ACPT_NODE_DATA__DISK_SIZE=40` would be converted to the following config:

```yaml
node_data:
  disk_size: 40
```

All environment variables are optional. If an environment variable is not set, the value will be taken from the config file or command-line option. If an environment variable is set, it will be overridden by the same value from the config file or command-line option.
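The prefix-stripping and double-underscore nesting described above can be sketched with a hypothetical helper (not CemAcpt's actual implementation):

```ruby
# Hypothetical sketch of the CEM_ACPT_ prefix and double-underscore convention.
# Not CemAcpt's actual implementation.
def env_to_config(env)
  env.each_with_object({}) do |(key, value), config|
    next unless key.start_with?('CEM_ACPT_')

    # Strip the prefix, lowercase, then split on '__' to build the nesting
    path = key.delete_prefix('CEM_ACPT_').downcase.split('__')
    leaf = path.pop
    path.reduce(config) { |level, k| level[k] ||= {} }[leaf] = value
  end
end

env_to_config('CEM_ACPT_NODE_DATA__DISK_SIZE' => '40')
# => { 'node_data' => { 'disk_size' => '40' } }
```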
### Config file

The most common way is to create a config file and pass it to CemAcpt using the `--config` option. The config file should be a YAML file. See sample_config.yaml for an example.

You can also create a user-specific config file at `$HOME/.cem_acpt/config.yaml`. This file will be merged with the config file specified by the `--config` option. This is useful for storing sensitive information or making config changes that you don't want to commit to a repo.

Options set in the user-specific config file will be overridden by options set in the config file specified by the `--config` option. Both config files will override options specified by environment variables.
### Command-line options

CemAcpt can be configured using command-line options. These options are specified using the `--<option>` syntax. For a full list of options, run `cem_acpt -h`. Options specified on the command line will override options specified in the config file and environment variables.
### Tracing

To aid in development, you can enable tracing of CemAcpt's Ruby code execution by using the `--trace` flag. Traces are logged only for events executed by CemAcpt code itself, and the logger code is exempt from tracing. **CAUTION:** Tracing produces a lot of logs, and it's advised to only use it when logging to a file. Additionally, the specific trace events that are logged can be specified using the `--trace-events` flag. Tracing is implemented with Ruby's `TracePoint` class.
## Goss

Goss is the core infrastructure testing tool used by CemAcpt. The `goss.yaml` file is used to specify the tests to run. Please see the Goss documentation for more information on how to write Goss tests.
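As an illustration, a `goss.yaml` for a firewalld-focused test might look like the following. The resource types (`service`, `port`, `file`) are standard Goss; the specific checks here are hypothetical:

```yaml
# Illustrative Goss tests; the specific checks are examples only
service:
  firewalld:
    enabled: true
    running: true
port:
  tcp:22:
    listening: true
file:
  /etc/firewalld/firewalld.conf:
    exists: true
```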
## Terraform

CemAcpt uses Terraform for managing the lifecycle of test nodes. Users don't interact with Terraform directly, but you will need it installed to use CemAcpt.
## Bolt task testing

CemAcpt can execute Bolt tasks against the test nodes and perform basic validation against their status and outputs.

### Configuring Bolt tests

Bolt tests expose the following configuration options that can be set via a config file:

- `bolt.inventory_path` - A relative or absolute path to an existing inventory file, or where the inventory file will be created
- `bolt.project.name` - The name of the Bolt project to be created and used during the Bolt tests
- `bolt.project.analytics` - Whether or not to enable Bolt analytics. Should normally be `false`.
- `bolt.project.path` - A relative or absolute path to an existing project file, or where the project file will be created
- `bolt.tests.only` - An array of acceptance tests to only run Bolt tests for. When acceptance test names are specified here, Bolt tasks will only be run against the nodes created for those acceptance tests.
- `bolt.tests.ignore` - An array of acceptance tests to not run Bolt tests for. When acceptance test names are specified here, Bolt tasks will not be run against the nodes created for those acceptance tests.
- `bolt.tasks.only` - An array of Bolt tasks to only run against nodes. Bolt tasks not listed here will not be run.
- `bolt.tasks.ignore` - An array of Bolt tasks to not run against nodes. Bolt tasks specified here will not be run at all.
- `bolt.tasks.module_pattern` - If specified, only Bolt tasks whose module prefix matches the specified pattern will be run. The module prefix is the first part of the full task name before the first `::`. This option is interpreted as a regex pattern. For example, if `bolt.tasks.module_pattern` is set to `^our_module`, and we have two tasks, `our_module::the_task` and `other_module::another_task`, only `our_module::the_task` will be run.
- `bolt.tasks.name_filter` - Similar to `module_pattern`, but works on the portion of the task name after the module prefix and is exclusionary. For example, if `bolt.tasks.name_filter` is set to `^another_`, and we have two tasks, `our_module::the_task` and `other_module::another_task`, only `our_module::the_task` will be run.
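Put together, a config file using these options might look like the following (all values are illustrative):

```yaml
# Illustrative values only
bolt:
  inventory_path: ./inventory.yaml
  project:
    name: cem_acpt_bolt
    analytics: false
    path: ./bolt-project.yaml
  tests:
    ignore:
      - cis_rhel-8_firewalld_server_2
  tasks:
    module_pattern: '^cem_linux'
```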
### Creating Bolt tests

Bolt tests are not mandatory, as all Bolt tasks that run are automatically validated based on whether they run successfully or not. However, sometimes we may want to test certain aspects of a Bolt task's output or pass parameters to a Bolt task. To do this, we need to create a Bolt test file.

Bolt test files are YAML files named `bolt.yaml` that live inside the individual acceptance test directories. For each acceptance test, only one `bolt.yaml` file should exist. The `bolt.yaml` file consists of one or more YAML hashes that take the following form:

```yaml
'module_name::task_name':
  params: # Optional, if you want to specify params to pass to the task at runtime
    param_name: 'param_string_value'
  status: <success | failure | skipped> # Optional, this is set to 'success' by default
  other_bolt_json_output_keys: # Optional
    match: '^a string RegEx pattern$' # Optional
    not_match: '^a string RegEx pattern$' # Optional
```

All Bolt tasks are run with the `--format json` flag, which returns a JSON document of their output. The keys of this document can be validated to either match a specified pattern, not match a specified pattern, or both. You can also specify a string value for a key, and then the key's value will be compared for simple string equality with the given value.
Example:

In this example, we will go over how a Bolt task's output would be evaluated by a Bolt test hash. This example assumes one Bolt task named `cem_linux::audit_sssd_certmap` that takes no parameters was run successfully against two nodes.

Bolt task JSON output:

```json
{
  "items": [
    {
      "target": "35.212.146.14",
      "action": "task",
      "object": "cem_linux::audit_sssd_certmap",
      "status": "success",
      "value": {
        "sssd_certmap_exists": false
      }
    },
    {
      "target": "35.212.197.53",
      "action": "task",
      "object": "cem_linux::audit_sssd_certmap",
      "status": "success",
      "value": {
        "sssd_certmap_exists": false
      }
    }
  ],
  "target_count": 2,
  "elapsed_time": 7
}
```

Bolt test hash in `bolt.yaml`:

```yaml
'cem_linux::audit_sssd_certmap':
  status: 'success'
  value:
    match: 'false'
```

This should result in the following log message after a run:

```
INFO: CemAcpt: SUMMARY: Bolt tests: status: passed, tests total: 1, tests succeeded: 1, tests failed: 0
```
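The match/not-match/equality rules described above can be sketched as follows. This is a simplified illustration, not CemAcpt's actual validation code:

```ruby
# Simplified sketch of validating one output item against a bolt.yaml hash.
# Hypothetical; CemAcpt's real validation logic may differ.
def item_passes?(item, expectations)
  expectations.all? do |key, rule|
    actual = item[key].to_s
    if rule.is_a?(Hash)
      passes = true
      passes &&= actual.match?(Regexp.new(rule['match'])) if rule['match']
      passes &&= !actual.match?(Regexp.new(rule['not_match'])) if rule['not_match']
      passes
    else
      actual == rule.to_s # simple string equality
    end
  end
end

item = {
  'status' => 'success',
  'value'  => { 'sssd_certmap_exists' => false },
}
item_passes?(item, 'status' => 'success', 'value' => { 'match' => 'false' })
# => true
```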
## Platforms

Platforms are the underlying infrastructure that nodes are provisioned on. Currently, only GCP is supported. Each platform has two parts to it: the platform and the node data.

### Platform

A platform represents where nodes are provisioned. Platforms are configured using the top-level key `platform`. For example, the following config would configure a GCP platform:

```yaml
platform:
  name: gcp
  project: my-project
  region: us-west1
  zone: us-west1-b
  subnetwork: my-subnetwork
```
### Node data

Node data is a more generic complement to the platform that specifies node-level details for test nodes. Node data is configured using the top-level key `node_data`. For example, the following config would configure node data:

```yaml
node_data:
  disk_size: 40
  machine_type: e2-small
```
### Developing new platforms

Platforms are defined as Ruby files in the `lib/cem_acpt/platform` directory. Each platform file should define a module called `Platform` that implements the following methods:

```ruby
module Platform
  # @return [Hash] A hash of platform data
  def platform_data
    # Return a hash of platform data
  end

  # @return [Hash] A hash of node data
  def node_data
    # Return a hash of node data
  end
end
```

Both the `#platform_data` and `#node_data` methods should return a hash. Each method should also ensure the config is queried for values for each key in each hash. This is enabled by the exposed instance variables `@config` and `@test_data` in the `Platform` module. For an example, see the GCP platform.
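As a rough illustration of that contract, a platform module might query `@config` like this. The keys and the `FakeRunner` harness are hypothetical; see the GCP platform for a real implementation:

```ruby
# Hypothetical platform module that reads its values from @config.
# Keys and the FakeRunner harness are illustrative, not CemAcpt's code.
module Platform
  def platform_data
    platform = @config['platform'] || {}
    {
      'name'    => platform['name'],
      'project' => platform['project'],
      'region'  => platform['region'],
    }
  end

  def node_data
    node = @config['node_data'] || {}
    {
      'disk_size'    => node['disk_size'],
      'machine_type' => node['machine_type'],
    }
  end
end

# Tiny harness standing in for the code that exposes @config
class FakeRunner
  include Platform
  def initialize(config)
    @config = config
  end
end

runner = FakeRunner.new(
  'platform'  => { 'name' => 'gcp', 'project' => 'my-project', 'region' => 'us-west1' },
  'node_data' => { 'disk_size' => 40, 'machine_type' => 'e2-small' }
)
runner.node_data
# => { 'disk_size' => 40, 'machine_type' => 'e2-small' }
```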
## Tests

Each acceptance test should be specified in the config under the top-level key `tests`. These tests SHOULD NOT be paths, just the test directory name. For example, if you have an acceptance test directory `spec/acceptance/cis_rhel_acceptance_test/*`, the tests config would look like this:

```yaml
tests:
  - cis_rhel_acceptance_test
```

Aside from the ways of manipulating test data outlined below, tests are typically matched one-to-one with a test node.
## Test data

Test data is a collection of data about a specific test that persists through the entire acceptance test suite lifecycle. Specifically, test data is implemented as an array of hashes, with each hash representing a single run through the acceptance test suite lifecycle. This means that each item in the test data array is coupled with a single test node that is provisioned and a single test run against that node. Test data is used to store values that are test-specific, as opposed to node data, which is generic.

Test data can be configured using the top-level key `test_data`. There are several supported options for manipulating test data:

### for_each

When specified, should be a hash of key -> `Array` pairs. For each of these pairs, a copy of the test data is made for each item in the array, with that item becoming the value of a variable named after the key.
Example:

```yaml
test_data:
  for_each:
    collection:
      - puppet6
      - puppet7
tests:
  - our_acceptance_test
```

In the above config, instead of the test data consisting of a single hash, it will have two hashes with the `collection` variable set to `puppet6` and `puppet7` respectively. This means that two test nodes will be provisioned and the test `our_acceptance_test` will run twice, once against each node.
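The fan-out behavior of `for_each` can be sketched with a hypothetical helper (not CemAcpt's implementation):

```ruby
# Hypothetical sketch of for_each fan-out: one copy of each test data hash
# per item in each array, with the item stored under the pair's key.
def expand_for_each(test_data, for_each)
  for_each.reduce(test_data) do |data, (key, values)|
    data.flat_map { |td| values.map { |v| td.merge(key => v) } }
  end
end

expand_for_each([{ 'test_name' => 'our_acceptance_test' }],
                'collection' => %w[puppet6 puppet7])
# => [{ 'test_name' => 'our_acceptance_test', 'collection' => 'puppet6' },
#     { 'test_name' => 'our_acceptance_test', 'collection' => 'puppet7' }]
```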
### vars

Arbitrary key-value pairs that are injected into the test data hashes. Think of these as constants.
### name_pattern_vars

A Ruby regex pattern that is matched against the test name with the intent of creating variables from named capture groups. See sample_config.yaml for an example.
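For instance, a pattern with named capture groups matched against the example directory name from earlier would yield variables like these. The pattern itself is illustrative; real patterns live in your config:

```ruby
# Illustrative pattern with named capture groups; not from sample_config.yaml
pattern = /^(?<framework>[a-z]+)_(?<os>[a-z]+)-(?<version>\d+)/
match = pattern.match('cis_rhel-8_firewalld_server_2')

match[:framework] # => 'cis'
match[:os]        # => 'rhel'
match[:version]   # => '8'
```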
### vars_post_processing

Rules that allow for processing variables after all other test data rules are run. See sample_config.yaml for an example.
## Image name builder

Much like `name_pattern_vars`, specifying the `image_name_builder` top-level key in the config allows you to manipulate acceptance test names to create a special test data variable called `image_name`. This is helpful for when you have multiple platform base images and want to use the correct image with the correct test. See sample_config.yaml for an example.
## The acceptance test lifecycle

- Load and merge the config
- Create the local test directory under `$HOME/.cem_acpt`
- Build the Puppet module. Uses the current directory if no `module_dir` is specified in the config
- Copy all relevant files into the local test directory
  - This includes the Terraform files provided by CemAcpt, the files under the specified acceptance test directory, and the built Puppet module
- Provision test nodes using Terraform
  - After a node is created, the contents of the local test directory are copied to the node
  - Additionally, the Puppet module is installed on the node and Puppet is run once to apply `manifest.pp`
  - Afterwards, the Goss server endpoints are started and exposed on the node
- Once a node is provisioned, make HTTP GET requests to the Goss server endpoints to run the tests
- Destroy the test nodes
- Report the results of the tests
## Generating acceptance test node images

CemAcpt provides the command `cem_acpt_image`, which allows you to generate new acceptance test node images that are then used with the `cem_acpt` command. This is useful for when you want to test a new version of Puppet or a new OS.

### Configuring the image builder

Images are built according to the entries in the `images` config key. Each entry in the `images` hash should follow this format:

```yaml
images:
  <image family name>:
    os: <os key string (rhel, alma, etc.)>
    os_major_version: <os major version integer>
    puppet_version: <puppet major version integer>
    base_image: <base image name / family string>
    provision_commands: <array of commands to run on the image>
```

For example, the following config would build an image for Puppet 7 on RHEL 8:

```yaml
images:
  cem-acpt-rhel-8-puppet7-firewalld:
    os: rhel
    os_major_version: 8
    puppet_version: 7
    base_image: 'rhel-cloud/rhel-8'
    provision_commands:
      - 'systemctl enable firewalld'
      - 'systemctl start firewalld'
      - 'firewall-cmd --permanent --add-service=ssh'
      - 'firewall-cmd --reload'
      - 'useradd testuser1'
      - "echo 'testuser1:P@s5W-rd$' | chpasswd"
```

See sample_config.yaml for a more complete example.
## Testing with cem_windows

While testing with cem_windows follows pretty much the same outline as cem_linux, there are a few extra steps we have to take when bootstrapping the generated node. Two of the most prominent steps are enabling long paths and using NSSM to start the Goss service. Enabling long paths on Windows allows the command `puppet module install` to install dependencies correctly (specifically the dsc modules). As for NSSM, which stands for the Non-Sucking Service Manager, it is necessary because Windows services cannot run directly from an arbitrary executable; they normally require a Windows service project that can be built with Visual Studio. NSSM allows us to bypass creating a Windows service project for Goss and create services directly from the Goss executable.
to install dependencies correctly (specifically the dsc modules). As for NSSM, stands for Non-sucking service manager, it is necessary for us to use this tool because without it, we will not be able to create services for Goss. Windows services cannot run from an executable but rather through a Windows service project that can be build with Visual Studio. NSSM allows us to bypass having to create Windows service project for Goss and create services directly from the Goss executable.