Class: ProxmoxWaiter

Inherits: Object
Defined in:
lib/hybrid_platforms_conductor/hpc_plugins/provisioner/proxmox/proxmox_waiter.rb

Overview

Serves Proxmox reservation requests, like a waiter in a restaurant ;-) Multi-process safe.

Constant Summary

FUTEX_TIMEOUT =

Integer: Timeout in seconds to get the futex. Take into account that some processes can be lengthy while the futex is taken:

  • POST/DELETE operations in the Proxmox API require tasks to be performed, which can take a few seconds depending on the load.

  • The Proxmox API sometimes fails to respond when containers are temporarily locked (there is a 30-second timeout for each one).

600
RETRY_QUEUE_WAIT =

Integer: Maximum timeout in seconds before retrying to get the futex when we are not first in the queue (a random delay bounded by this value is applied). A minimal acquisition sketch follows this constant summary.

30
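
The futex is simply a lock taken on the configured futex_file; the actual acquisition loop lives in the private start method, which is not shown in this section. The following is a minimal, hypothetical sketch of how these two constants could work together, assuming a flock-based lock and a made-up with_futex helper:

# Hypothetical sketch, not the gem's actual implementation.
require 'timeout'

def with_futex(futex_file)
  # Give up if the futex cannot be obtained within FUTEX_TIMEOUT seconds.
  Timeout.timeout(FUTEX_TIMEOUT) do
    loop do
      File.open(futex_file, File::RDWR | File::CREAT) do |file|
        # Try to take the lock without blocking.
        if file.flock(File::LOCK_EX | File::LOCK_NB)
          begin
            return yield
          ensure
            file.flock(File::LOCK_UN)
          end
        end
      end
      # We are not first in the queue: wait a random delay bounded by
      # RETRY_QUEUE_WAIT before retrying, to spread concurrent retries.
      sleep(rand(RETRY_QUEUE_WAIT))
    end
  end
end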

Instance Method Summary

Constructor Details

#initialize(config_file, proxmox_user, proxmox_password, proxmox_realm) ⇒ ProxmoxWaiter

Constructor

Parameters
  • config_file (String): Path to a JSON file containing a configuration for ProxmoxWaiter. Here is the file structure (an illustrative example is given after the parameter list):

    • proxmox_api_url (String): Proxmox API URL.

    • futex_file (String): Path to the file serving as a futex.

    • logs_dir (String): Path to the directory containing logs [default: '.']

    • api_max_retries (Integer): Max number of API retries

    • api_wait_between_retries_secs (Integer): Number of seconds to wait between API retries

    • pve_nodes (Array<String>): List of PVE nodes allowed to spawn new containers [default: all]

    • vm_ips_list (Array<String>): The list of IPs that are available for the Proxmox containers.

    • vm_ids_range ([Integer, Integer]): Minimum and maximum reservable VM ID

    • coeff_ram_consumption (Integer): Importance coefficient to assign to the RAM consumption when selecting available PVE nodes

    • coeff_disk_consumption (Integer): Importance coefficient to assign to the disk consumption when selecting available PVE nodes

    • expiration_period_secs (Integer): Number of seconds defining the expiration period

    • expire_stopped_vm_timeout_secs (Integer): Number of seconds before defining stopped VMs as expired

    • limits (Hash): Limits to be taken into account while reserving resources. Each property is optional and no property means no limit.

      • nbr_vms_max (Integer): Max number of VMs we can reserve.

      • cpu_loads_thresholds ([Float, Float, Float]): CPU load thresholds above which a PVE node should not be used (as soon as one of the load values is greater than one of those thresholds, the node is discarded).

      • ram_percent_used_max (Float): Max percentage (between 0 and 1) of RAM that can be reserved on a PVE node.

      • disk_percent_used_max (Float): Max percentage (between 0 and 1) of disk that can be reserved on a PVE node.

  • proxmox_user (String): Proxmox user to be used to connect to the API.

  • proxmox_password (String): Proxmox password to be used to connect to the API.

  • proxmox_realm (String): Proxmox realm to be used to connect to the API.
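
For illustration, here is a hedged example of such a configuration file; every value below is a placeholder to adapt to your Proxmox setup:

{
  "proxmox_api_url": "https://my-proxmox.my-domain.com:8006",
  "futex_file": "/tmp/proxmox_waiter.futex",
  "logs_dir": "/var/log/proxmox_waiter",
  "api_max_retries": 3,
  "api_wait_between_retries_secs": 10,
  "pve_nodes": ["pve_node_1", "pve_node_2"],
  "vm_ips_list": ["192.168.0.100", "192.168.0.101", "192.168.0.102"],
  "vm_ids_range": [1000, 1100],
  "coeff_ram_consumption": 10,
  "coeff_disk_consumption": 1,
  "expiration_period_secs": 86400,
  "expire_stopped_vm_timeout_secs": 3600,
  "limits": {
    "nbr_vms_max": 5,
    "cpu_loads_thresholds": [10.0, 10.0, 10.0],
    "ram_percent_used_max": 0.75,
    "disk_percent_used_max": 0.75
  }
}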



# File 'lib/hybrid_platforms_conductor/hpc_plugins/provisioner/proxmox/proxmox_waiter.rb', line 46

def initialize(config_file, proxmox_user, proxmox_password, proxmox_realm)
  @config = JSON.parse(File.read(config_file))
  @proxmox_user = proxmox_user
  @proxmox_password = proxmox_password
  @proxmox_realm = proxmox_realm
  # Keep a memory of non-debug stopped containers, so that we can guess if they are expired or not after some time.
  # Time when we noticed a given container is stopped, per creation date, per VM ID, per PVE node
  # We add the creation date as a VM ID can be reused (with a different creation date) and we want to make sure we don't think a newly created VM is here for longer than it should.
  # Hash< String,   Hash< Integer, Hash< String,        Time                 > > >
  # Hash< pve_node, Hash< vm_id,   Hash< creation_date, time_seen_as_stopped > > >
  @non_debug_stopped_containers = {}
  @log_file = "#{@config['logs_dir'] || '.'}/proxmox_waiter_#{Time.now.utc.strftime('%Y%m%d%H%M%S')}_pid_#{Process.pid}_#{File.basename(config_file, '.json')}.log"
  FileUtils.mkdir_p File.dirname(@log_file)
end
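
A hedged usage sketch of the constructor; the require path assumes the gem's lib directory is on the load path, and the config path, credentials, and realm are placeholders (pam is a common Proxmox realm):

require 'hybrid_platforms_conductor/hpc_plugins/provisioner/proxmox/proxmox_waiter'

# All values below are placeholders.
waiter = ProxmoxWaiter.new(
  '/path/to/proxmox_waiter_config.json',
  'my_proxmox_user',
  'my_proxmox_password',
  'pam'
)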

Instance Method Details

#create(vm_info) ⇒ Object

Reserve resources for a new container. Check resource availability.

Parameters
  • vm_info (Hash<String,Object>): The VM info to be created, using the same properties as LXC container creation through the Proxmox API.

Result
  • Hash<Symbol, Object> or Symbol: Reserved resource info, or Symbol in case of error. The following properties are set as resource info:

    • pve_node (String): Node on which the container has been created.

    • vm_id (Integer): The VM ID

    • vm_ip (String): The VM IP

    Possible error codes returned are:

    • not_enough_resources: There are no free resources available to be reserved

    • no_available_ip: There is no available IP to be reserved

    • no_available_vm_id: There is no available VM ID to be reserved

    • exceeded_number_of_vms: There are already too many VMs running



# File 'lib/hybrid_platforms_conductor/hpc_plugins/provisioner/proxmox/proxmox_waiter.rb', line 77

def create(vm_info)
  log "Ask to create #{vm_info}"
  # Extract the required resources from the desired VM info
  nbr_cpus = vm_info['cpulimit']
  ram_mb = vm_info['memory']
  disk_gb = Integer(vm_info['rootfs'].split(':').last)
  reserved_resource = nil
  start do
    pve_node_scores = pve_scores_for(nbr_cpus, ram_mb, disk_gb)
    # Check if we are not exceeding hard-limits:
    # * the number of vms to be created
    # * the free IPs
    # * the free VM IDs
    # In such a case, even when the free resources on PVE nodes are enough to host the new container, we still need to clean up first.
    nbr_vms = nbr_vms_handled_by_us
    if nbr_vms >= @config['limits']['nbr_vms_max'] || free_ips.empty? || free_vm_ids.empty?
      log 'Hitting at least 1 hard-limit. Check if we can destroy expired containers.'
      log "[ Hard limit reached ] - Already #{nbr_vms} are created (max is #{@config['limits']['nbr_vms_max']})." if nbr_vms >= @config['limits']['nbr_vms_max']
      log '[ Hard limit reached ] - No more available IPs.' if free_ips.empty?
      log '[ Hard limit reached ] - No more available VM IDs.' if free_vm_ids.empty?
      clean_up_done = false
      # Check if we can remove some expired ones
      @config['pve_nodes'].each do |pve_node|
        if api_get("nodes/#{pve_node}/lxc").any? { |lxc_info| is_vm_expired?(pve_node, Integer(lxc_info['vmid'])) }
          destroy_expired_vms_on(pve_node)
          clean_up_done = true
        end
      end
      if clean_up_done
        nbr_vms = nbr_vms_handled_by_us
        if nbr_vms >= @config['limits']['nbr_vms_max']
          log "[ Hard limit reached ] - Still too many running VMs after clean-up: #{nbr_vms}."
          reserved_resource = :exceeded_number_of_vms
        elsif free_ips.empty?
          log '[ Hard limit reached ] - Still no available IP'
          reserved_resource = :no_available_ip
        elsif free_vm_ids.empty?
          log '[ Hard limit reached ] - Still no available VM ID'
          reserved_resource = :no_available_vm_id
        end
      else
        log 'Could not find any expired VM to destroy.'
        # There was nothing to clean. So wait for other processes to destroy their containers.
        reserved_resource =
          if nbr_vms >= @config['limits']['nbr_vms_max']
            :exceeded_number_of_vms
          elsif free_ips.empty?
            :no_available_ip
          else
            :no_available_vm_id
          end
      end
    end
    if reserved_resource.nil?
      # Select the best node, first keeping expired VMs if possible.
      # This is the index of the scores to be checked: if we can choose without recycling VMs, do it by considering score index 0.
      score_idx =
        if pve_node_scores.all? { |_pve_node, pve_node_scores| pve_node_scores[0].nil? }
          # No node was available without removing expired VMs.
          # Therefore we consider only scores without expired VMs.
          log 'No PVE node has enough free resources without removing any expired VMs'
          1
        else
          0
        end
      selected_pve_node, selected_pve_node_score = pve_node_scores.inject([nil, nil]) do |(best_pve_node, best_score), (pve_node, pve_node_scores)|
        if pve_node_scores[score_idx].nil? ||
          (!best_score.nil? && pve_node_scores[score_idx] >= best_score)
          [best_pve_node, best_score]
        else
          [pve_node, pve_node_scores[score_idx]]
        end
      end
      if selected_pve_node.nil?
        # No PVE node can host our request.
        log 'Could not find any PVE node with enough free resources'
        reserved_resource = :not_enough_resources
      else
        log "[ #{selected_pve_node} ] - PVE node selected with score #{selected_pve_node_score}"
        # We know on which PVE node we can instantiate our new container.
        # We have to purge expired VMs on this PVE node before reserving a new creation.
        destroy_expired_vms_on(selected_pve_node) if score_idx == 1
        # Now select the correct VM ID and VM IP.
        vm_id_or_error, ip = reserve_on(selected_pve_node, nbr_cpus, ram_mb, disk_gb)
        if ip.nil?
          # We have an error
          reserved_resource = vm_id_or_error
        else
          # Create the container for real
          completed_vm_info = vm_info.dup
          completed_vm_info['vmid'] = vm_id_or_error
          completed_vm_info['net0'] = "#{completed_vm_info['net0']},ip=#{ip}/32"
          completed_vm_info['description'] = "#{completed_vm_info['description']}creation_date: #{Time.now.utc.strftime('%FT%T')}\n"
          log "[ #{selected_pve_node}/#{vm_id_or_error} ] - Create LXC container"
          wait_for_proxmox_task(selected_pve_node, @proxmox.post("nodes/#{selected_pve_node}/lxc", completed_vm_info))
          reserved_resource = {
            pve_node: selected_pve_node,
            vm_id: vm_id_or_error,
            vm_ip: ip
          }
        end
      end
    end
  end
  reserved_resource
end
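
A hedged usage sketch, reusing the waiter instance from the constructor example above. The vm_info properties mirror the Proxmox API's LXC creation parameters and every value is a placeholder; note that rootfs encodes the requested disk size in GB after the colon, and net0 is given without an IP since #create appends the reserved one:

# All values below are placeholders.
result = waiter.create(
  'ostemplate' => 'local:vztmpl/debian-10-standard_10.7-1_amd64.tar.gz',
  'hostname' => 'my-test-node.my-domain.com',
  'description' => "Test container\n",
  'cores' => 2,
  'cpulimit' => 2,
  'memory' => 2048,
  'rootfs' => 'local-lvm:10',
  'net0' => 'name=eth0,bridge=vmbr0,gw=192.168.0.1'
)
if result.is_a?(Symbol)
  # One of :not_enough_resources, :no_available_ip, :no_available_vm_id, :exceeded_number_of_vms
  puts "Reservation failed: #{result}"
else
  puts "Created VM #{result[:vm_id]} on #{result[:pve_node]} with IP #{result[:vm_ip]}"
end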

#destroy(vm_info) ⇒ Object

Destroy a VM.

Parameters
  • vm_info (Hash<String,Object>): The VM info to be destroyed:

    • vm_id (Integer): The VM ID

    • node (String): The node for which this VM has been created

    • environment (String): The environment for which this VM has been created

Result
  • Hash<Symbol, Object> or Symbol: Released resource info, or Symbol in case of error. The following properties are set as resource info:

    • pve_node (String): Node on which the container has been released (if found).

    Possible error codes returned are: None



# File 'lib/hybrid_platforms_conductor/hpc_plugins/provisioner/proxmox/proxmox_waiter.rb', line 197

def destroy(vm_info)
  log "Ask to destroy #{vm_info}"
  found_pve_node = nil
  start do
    vm_id_str = vm_info['vm_id'].to_s
    # Destroy the VM ID
    # Find which PVE node hosts this VM
    unless @config['pve_nodes'].any? do |pve_node|
        api_get("nodes/#{pve_node}/lxc").any? do |lxc_info|
          if lxc_info['vmid'] == vm_id_str
            # Make sure this VM is still used for the node and environment we want.
            # It could have been deleted manually and re-affected to another node/environment automatically, and in this case we should not remove it.
            metadata = vm_metadata(pve_node, vm_info['vm_id'])
            if metadata[:node] == vm_info['node'] && metadata[:environment] == vm_info['environment']
              destroy_vm_on(pve_node, vm_info['vm_id'])
              found_pve_node = pve_node
              true
            else
              log "[ #{pve_node}/#{vm_info['vm_id']} ] - This container is not hosting the node/environment to be destroyed: #{[:node]}/#{[:environment]} != #{vm_info['node']}/#{vm_info['environment']}"
              false
            end
          else
            false
          end
        end
      end
      log "Could not find any PVE node hosting VM #{vm_info['vm_id']}"
    end
  end
  reserved_resource = {}
  reserved_resource[:pve_node] = found_pve_node unless found_pve_node.nil?
  reserved_resource
end
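
A hedged usage sketch, again with placeholder values; the returned hash contains :pve_node only when the hosting node was found:

released = waiter.destroy(
  'vm_id' => 1050,
  'node' => 'my-test-node',
  'environment' => 'test'
)
if released.key?(:pve_node)
  puts "Container destroyed on #{released[:pve_node]}"
else
  puts 'No PVE node was found hosting this VM'
end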