Class: ComputeUnit::Gpu

Inherits:
ComputeBase
Defined in:
lib/compute_unit/gpu.rb

Direct Known Subclasses

AmdGpu, NvidiaGpu

Constant Summary

DEVICE_CLASS = '030000'
DEVICE_CLASS_NAME = 'GPU'

Constants inherited from ComputeBase

ComputeBase::CACHE_TIMEOUT

Constants inherited from Device

Device::PROC_PATH, Device::SYSFS_DEVICES_PATH

Instance Attribute Summary

Attributes inherited from ComputeBase

#index, #meta, #power_offset, #serial, #timestamp, #type, #uuid

Attributes inherited from Device

#device_class_id, #device_id, #device_path, #device_vendor_id, #make, #model, #subsystem_device_id, #subsystem_vendor_id, #vendor

Class Method Summary

Instance Method Summary

Methods inherited from ComputeBase

#attached_processes, compute_classes, #device_class_name, #expired_metadata?, #top_processes

Methods included from Logger

color, log_file, log_level, logger, #logger

Methods inherited from Device

#base_hwmon_path, create_from_path, device, device_class, device_lookup, device_vendor, #expired_metadata?, #generic_model, #hwmon_path, #lock_rom, logger, manual_device_database, manual_device_lookup, manual_vendor_lookup, manual_vendors, name_map, name_translation, pci_database, #read_file, #read_hwmon_data, #read_kernel_setting, read_kernel_setting, #rom_data, #rom_path, subsystem_device, subsystem_device_lookup, subsystem_vendor, subsystem_vendor_lookup, #sysfs_model_name, system_checksum, #to_json, #unlock_rom, vendor_lookup, #write_hwmon_data, #write_kernel_setting, write_kernel_setting

Methods included from Utils

check_for_root, #root?, root?

Constructor Details

#initialize(device_path, opts = {}) ⇒ Gpu

Returns a new instance of Gpu.

Parameters:

  • device_path (String)
    • the PCI bus path to the device

  • opts (Hash) (defaults to: {})
    • a customizable set of options

Options Hash (opts):

  • :bios
    • stored upcased as bios when supplied

  • :model
    • stored as model

  • :serial
    • stored as serial; also the fallback for uuid

  • :busid
    • stored as pci_loc

  • :meta
    • stored as meta

  • :index
    • converted with to_i and stored as index

  • :uuid
    • stored as uuid; falls back to :serial when omitted

  • :use_opencl
    • enables the OpenCL lookups; defaults to false



# File 'lib/compute_unit/gpu.rb', line 71

def initialize(device_path, opts = {})
  super(device_path, opts)
  @type = :GPU
  @bios = opts[:bios].upcase if opts[:bios]
  @model = opts[:model]
  @serial = opts[:serial]
  @pci_loc = opts[:busid]
  @meta = opts[:meta]
  @index = opts[:index].to_i
  @uuid = opts[:uuid] || opts[:serial]
  @name = model
  @power_offset = 0
  @use_opencl = opts[:use_opencl] || false
end
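
The option handling in this constructor can be tried in isolation. What follows is a minimal sketch, not the library's code: `normalize_gpu_opts` is a hypothetical helper that mirrors only the assignments shown above and skips the `super(device_path, opts)` call.

```ruby
# Hypothetical helper: mirrors the option handling of the constructor above
# without touching ComputeUnit itself.
def normalize_gpu_opts(opts = {})
  {
    bios: opts[:bios]&.upcase,            # upcased only when supplied
    model: opts[:model],
    serial: opts[:serial],
    pci_loc: opts[:busid],                # :busid is exposed as pci_loc
    index: opts[:index].to_i,             # nil.to_i is 0, so index defaults to 0
    uuid: opts[:uuid] || opts[:serial],   # serial is the fallback identifier
    use_opencl: opts[:use_opencl] || false
  }
end

attrs = normalize_gpu_opts(bios: 'xyz-123', serial: 'SN42', index: '3', busid: '01:00.0')
puts attrs[:bios] # the upcased BIOS string
puts attrs[:uuid] # the serial, because no :uuid was given
```

Passing the index as a string is deliberate here: the `to_i` call means callers that parse the index from command output do not need to convert it first.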

Instance Attribute Details

#bios ⇒ Object (readonly)

Returns the value of attribute bios.

# File 'lib/compute_unit/gpu.rb', line 7

def bios
  @bios
end

#name ⇒ Object (readonly)

Returns the value of attribute name.

# File 'lib/compute_unit/gpu.rb', line 7

def name
  @name
end

#pci_loc ⇒ Object (readonly)

Returns the value of attribute pci_loc.

# File 'lib/compute_unit/gpu.rb', line 7

def pci_loc
  @pci_loc
end

#power_limit ⇒ Object

Returns the value of attribute power_limit.

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 10

def power_limit
  @power_limit
end

#use_opencl ⇒ Object

Returns the value of attribute use_opencl.

# File 'lib/compute_unit/gpu.rb', line 10

def use_opencl
  @use_opencl
end

Class Method Details

.attached_processes(field = :pctcpu, filter = %r{/dev/dri|nvidia\d+}) ⇒ Array

Returns an array of attached processes.

Parameters:

  • field (Symbol) (defaults to: :pctcpu)
    • the field to sort by

  • filter (Regexp) (defaults to: %r{/dev/dri|nvidia\d+})
    • only processes with an open file descriptor matching this pattern are returned

Returns:

  • (Array)
    • an array of attached processes

# File 'lib/compute_unit/gpu.rb', line 20

def self.attached_processes(field = :pctcpu, filter = %r{/dev/dri|nvidia\d+})
  filter ||= %r{/dev/dri|nvidia\d+}
  # looks for any fd device with dri or nvidia in the name
  p = Sys::ProcTable.ps(smaps: false).find_all do |p|
    p.fd.values.find { |f| f =~ filter }
  end
  p.sort_by(&field)
end
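
The filter-then-sort logic above can be exercised without a live process table. This is a sketch under assumptions: `attached` is a hypothetical free-standing version of the method, and the `Struct` records stand in for whatever `Sys::ProcTable.ps` returns, modelling only the `fd` and sort-field members.

```ruby
# Sketch only: `procs` stands in for the records Sys::ProcTable.ps would return.
def attached(procs, field = :pctcpu, filter = %r{/dev/dri|nvidia\d+})
  matching = procs.select do |process|
    # keep a process when any open file descriptor target matches the filter
    process.fd.values.any? { |target| target =~ filter }
  end
  matching.sort_by(&field) # ascending, so the largest value comes last
end

fake_proc = Struct.new(:pid, :pctcpu, :fd) # hypothetical record shape
procs = [
  fake_proc.new(10, 55.0, { '3' => '/dev/nvidia0' }),
  fake_proc.new(11, 1.5, { '4' => '/dev/dri/renderD128' }),
  fake_proc.new(12, 90.0, { '0' => '/dev/null' })
]
puts attached(procs).map(&:pid).inspect
```

Because `sort_by` sorts in ascending order, a caller that wants the busiest process first would need to reverse the result.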

.devices ⇒ Array

Note:

the devices are sorted by the device path

Note:

this can mean AMD, NVIDIA, Intel or other crappy embedded devices

Returns a list of device paths of all devices considered for display.

Returns:

  • (Array)
    • a list of device paths of all devices considered for display

# File 'lib/compute_unit/gpu.rb', line 55

def self.devices
  @devices ||= ComputeUnit::ComputeBase.devices.find_all do |device|
    ComputeUnit::Device.device_class(device) == DEVICE_CLASS
  end.sort
end
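
The selection above amounts to keeping the paths whose device class equals `DEVICE_CLASS` and sorting them. The sketch below shows that in isolation; `display_devices` and the `class_of` callable are hypothetical stand-ins (the latter for `ComputeUnit::Device.device_class`), and the paths are made up.

```ruby
# Sketch: `class_of` is a hypothetical stand-in for ComputeUnit::Device.device_class.
def display_devices(paths, class_of, wanted = '030000')
  paths.select { |path| class_of.call(path) == wanted }.sort
end

classes = {
  '/sys/bus/pci/devices/0000:03:00.0' => '030000', # matches DEVICE_CLASS
  '/sys/bus/pci/devices/0000:00:1f.3' => '040300', # some other device class
  '/sys/bus/pci/devices/0000:01:00.0' => '030000'
}
puts display_devices(classes.keys, ->(path) { classes[path] })
```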

.find_all(use_opencl = false) ⇒ Array

Returns an array of GPU objects, sorted by index.

Returns:

  • (Array)
    • an array of GPU objects, sorted by index

# File 'lib/compute_unit/gpu.rb', line 278

def self.find_all(use_opencl = false)
  require 'compute_unit/gpus/amd_gpu'
  require 'compute_unit/gpus/nvidia_gpu'
  g = compute_classes.map { |klass| klass.find_all(use_opencl) }.flatten
  g.sort_by(&:index)
end
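
The aggregation pattern above (ask every vendor class for its devices, flatten, sort by index) can be sketched with plain callables. Everything here is illustrative: `collect_sorted` is a hypothetical helper, and the lambdas stand in for the `find_all` methods of the vendor subclasses.

```ruby
# Sketch: each finder is a callable standing in for a vendor class's find_all.
def collect_sorted(finders, use_opencl = false)
  finders.flat_map { |finder| finder.call(use_opencl) }
         .sort_by { |gpu| gpu[:index] }
end

amd = ->(_use_opencl) { [{ make: 'AMD', index: 2 }, { make: 'AMD', index: 0 }] }
nvidia = ->(_use_opencl) { [{ make: 'Nvidia', index: 1 }] }
puts collect_sorted([amd, nvidia]).map { |gpu| gpu[:index] }.inspect
```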

.found_devices ⇒ Array

Returns an array of device paths from the AMD and NVIDIA device lists.

Returns:

  • (Array)
    • array of device paths from the AMD and NVIDIA device lists

# File 'lib/compute_unit/gpu.rb', line 335

def self.found_devices
  @found_devices ||= ComputeUnit::AmdGpu.devices + ComputeUnit::NvidiaGpu.devices
end

.opencl_cache ⇒ CacheStore

Returns an instance of the cache store used for storing the OpenCL cache.

Returns:

  • (CacheStore)
    • an instance of the cache store used for storing the OpenCL cache

# File 'lib/compute_unit/gpu.rb', line 286

def self.opencl_cache
  @opencl_cache ||= ComputeUnit::CacheStore.new('opencl_cache')
end

.opencl_devices ⇒ Array

Overwrites the cache if new devices are found. OpenCL should only be used when necessary, as it can sometimes freeze. Note that OpenCL indexes items differently.

Returns:

  • (Array)
    • an array of OpenCL devices

# File 'lib/compute_unit/gpu.rb', line 343

def self.opencl_devices
  @opencl_devices ||= opencl_devices_from_cache || begin
    items = opencl_devices_from_platform
    opencl_cache.write_cache('opencl_compute_units', ComputeUnit::Device.system_checksum.to_s => items)
    items
  end
end
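
The read-through caching above can be sketched with a plain Hash in place of the cache store. This is an illustration under assumptions: `cached_devices` is a hypothetical helper, the Hash stands in for `ComputeUnit::CacheStore`, and the sketch converts the checksum to a string for both the read and the write so the lookup key always has the same type.

```ruby
# store:    any Hash-like object standing in for the cache store
# checksum: identifies the current hardware set
# The block is the expensive enumeration and runs only on a cache miss.
def cached_devices(store, checksum)
  key = checksum.to_s # same key type for reads and writes
  entry = store.fetch('opencl_compute_units', {})
  return entry[key] if entry[key]

  items = yield
  # a new checksum replaces the previous entry instead of adding to it
  store['opencl_compute_units'] = { key => items }
  items
end

store = {}
calls = 0
cached_devices(store, 'abc123') { calls += 1; ['RX 580'] }
cached_devices(store, 'abc123') { calls += 1; ['RX 580'] }            # served from the store
cached_devices(store, 'def456') { calls += 1; ['RX 580', 'Vega 56'] } # hardware changed
puts calls
puts store['opencl_compute_units'].keys.inspect
```

Keying the entry by a checksum of the hardware means a changed set of devices misses the cache and triggers a fresh enumeration, which matches the "overwrites cache if new devices are found" behaviour described above.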

.opencl_devices_from_cache ⇒ Array

Returns an array of OpenStruct objects, or nil.

Returns:

  • (Array)
    • array of OpenStruct objects, or nil

# File 'lib/compute_unit/gpu.rb', line 291

def self.opencl_devices_from_cache
  data = opencl_cache.read_cache('opencl_compute_units', {})
  data[ComputeUnit::Device.system_checksum]
end

.opencl_devices_from_platform ⇒ Object

# File 'lib/compute_unit/gpu.rb', line 297

def self.opencl_devices_from_platform
  require 'ostruct'
  # opencl takes a second to load so we cache later in the process
  # which is why we need the openstruct object here
  # opencl can also freeze the system if it tries to enumerate a dead GPU
  # opencl should be used sparingly as a result and only read when absolutely
  # necessary and no dead GPUs.
  # TODO: warn when dead gpus detected
  begin
    require 'opencl_ruby_ffi'
    ComputeUnit::Logger.logger.debug('Searching for openCL devices')
    OpenCL.platforms.map(&:devices).flatten.map do |d|
      type = d.platform.name.include?('AMD') ? 'AMD' : 'Nvidia'
      board_name = type == 'AMD' ? d.board_name_amd : ''
      max_computes = d.respond_to?(:max_compute_units) ? d.max_compute_units : 0
      OpenStruct.new(
        name: d.name,
        type: type,
        board_name: board_name,
        max_compute_units: max_computes
      )
    end
  rescue OpenCL::Error::DEVICE_NOT_FOUND => e
    ComputeUnit::Logger.logger.debug("OpenCL error: #{e.message}, are you root?")
    []
  rescue RuntimeError => e # OpenCL::Error::PLATFORM_NOT_FOUND_KHR,
    ComputeUnit::Logger.logger.debug("OpenCL error: #{e.message}")
    ComputeUnit::Logger.logger.debug("OpenCL error: #{e.backtrace}")
    []
  end
end
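
The mapping step above reduces each OpenCL device to a small record and falls back to an empty list on failure. The sketch below reproduces that shape without loading OpenCL: `summarize_devices` is a hypothetical helper, plain Hashes stand in for the OpenCL device objects, and a generic `StandardError` rescue replaces the OpenCL-specific error classes.

```ruby
# raw: Hashes standing in for OpenCL device objects (hypothetical shape)
def summarize_devices(raw)
  raw.map do |d|
    # as in the listing above, anything that is not AMD is labelled Nvidia
    type = d[:platform].include?('AMD') ? 'AMD' : 'Nvidia'
    {
      name: d[:name],
      type: type,
      board_name: type == 'AMD' ? d[:board_name].to_s : '',
      max_compute_units: d.fetch(:max_compute_units, 0)
    }
  end
rescue StandardError => e
  warn "device enumeration failed: #{e.message}"
  [] # callers always get an array back
end

devices = summarize_devices(
  [
    { platform: 'AMD Accelerated Parallel Processing', name: 'Ellesmere',
      board_name: 'Radeon RX 580', max_compute_units: 36 },
    { platform: 'NVIDIA CUDA', name: 'GeForce GTX 1070' }
  ]
)
devices.each { |device| puts device.inspect }
```

Returning an empty array from the rescue keeps callers free of nil checks, at the cost of hiding the failure unless the log is read.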

Instance Method Details

#asic_temp ⇒ Integer

Returns the temperature of the ASIC chip.

Returns:

  • (Integer)
    • the temperature of the ASIC chip

# File 'lib/compute_unit/gpu.rb', line 218

def asic_temp
  0
end

#compute_type ⇒ Object

# File 'lib/compute_unit/gpu.rb', line 12

def compute_type
  type
end

#configured_core_voltage ⇒ Numeric

Returns the voltage of the core in mV.

Returns:

  • (Numeric)
    • the voltage of the core in mV

# File 'lib/compute_unit/gpu.rb', line 165

def configured_core_voltage
  0
end

#core_clock ⇒ Integer

Returns the core clock speed.

Returns:

  • (Integer)
    • the core clock speed

# File 'lib/compute_unit/gpu.rb', line 155

def core_clock
  0
end

#core_voltage ⇒ Numeric

Returns the voltage of the core in mV.

Returns:

  • (Numeric)
    • the voltage of the core in mV

# File 'lib/compute_unit/gpu.rb', line 160

def core_voltage
  0
end

#fan ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 86

def fan
  raise NotImplementedError
end

#fan_limit ⇒ Integer

Returns a percentage value of the current fan limit.

Returns:

  • (Integer)
    • a percentage value of the current fan limit

# File 'lib/compute_unit/gpu.rb', line 106

def fan_limit
  fan
end

#fan_max_limit ⇒ Integer

Returns a percentage value of the max fan limit.

Returns:

  • (Integer)
    • a percentage value of the max fan limit

# File 'lib/compute_unit/gpu.rb', line 116

def fan_max_limit
  nil
end

#fan_min_limit ⇒ Integer

Returns a percentage value of the min fan limit.

Returns:

  • (Integer)
    • a percentage value of the min fan limit

# File 'lib/compute_unit/gpu.rb', line 111

def fan_min_limit
  nil
end

#hardware_info ⇒ Hash

Returns a hash of information about the GPU.

Returns:

  • (Hash)
    • hash of information about the GPU

# File 'lib/compute_unit/gpu.rb', line 202

def hardware_info
  {
    uuid: uuid,
    gpuId: "GPU#{index}",
    syspath: device_path,
    pciLoc: pci_loc,
    name: name,
    bios: bios,
    subType: subtype,
    make: make,
    model: model,
    vendor: vendor
  }
end

#mem_info ⇒ Object

# File 'lib/compute_unit/gpu.rb', line 169

def mem_info
  {
    index: "#{device_class_name}#{index}",
    name: name,
    volt: memory_volt,
    clock: memory_clock,
    memory_name: nil,
    memory_type: nil,
    memory_used: memory_used,
    memory_free: memory_free,
    memory_total: memory_total,
    mem_temp: mem_temp
  }
end

#mem_temp ⇒ Integer

Returns the temperature of the memory.

Returns:

  • (Integer)
    • temperature of the memory

# File 'lib/compute_unit/gpu.rb', line 223

def mem_temp
  0
end

#memory_clock ⇒ Integer

Returns the memory speed.

Returns:

  • (Integer)
    • the memory speed

# File 'lib/compute_unit/gpu.rb', line 145

def memory_clock
  0
end

#memory_free ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 136

def memory_free
  raise NotImplementedError
end

#memory_total ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 128

def memory_total
  raise NotImplementedError
end

#memory_used ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 132

def memory_used
  raise NotImplementedError
end

#memory_volt ⇒ Integer

Returns the memory voltage.

Returns:

  • (Integer)
    • the memory voltage

# File 'lib/compute_unit/gpu.rb', line 150

def memory_volt
  0
end

#opencl_board_name ⇒ String

Returns the raw board name from OpenCL, or nil if there is no device.

Returns:

  • (String)
    • the raw board name from OpenCL, or nil if there is no device

# File 'lib/compute_unit/gpu.rb', line 35

def opencl_board_name
  @opencl_board_name ||= opencl_device&.board_name if use_opencl
end

#opencl_device ⇒ OpenCL_Device

Returns:

  • (OpenCL_Device)

# File 'lib/compute_unit/gpu.rb', line 30

def opencl_device
  @opencl_device ||= self.class.opencl_devices.find_all { |cu| cu[:type] == make }[index] if use_opencl
end
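
The lookup above narrows the OpenCL device list to entries of the same make and then takes the entry at this GPU's index. A minimal sketch of that selection, assuming Hash records in place of the cached device objects (`pick_device` is a hypothetical name):

```ruby
# Sketch: select the devices of one make, then take the index-th of those.
def pick_device(devices, make, index)
  devices.select { |device| device[:type] == make }[index]
end

devices = [
  { type: 'Nvidia', name: 'GeForce GTX 1070' },
  { type: 'AMD', name: 'RX 580' },
  { type: 'AMD', name: 'Vega 56' }
]
puts pick_device(devices, 'AMD', 1).inspect
puts pick_device(devices, 'AMD', 5).inspect # out of range yields nil
```

The index counts only devices of the matching make, so it is a position within that vendor's list rather than within the full list. An out-of-range index yields nil, so readers built on this lookup need to tolerate a missing device.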

#opencl_name ⇒ String

Note:

not really needed for Nvidia types since nvidia-smi returns really complete information

e.g. GeForce GTX 1070 or RX 580

Returns:

  • (String)
    • the device name

# File 'lib/compute_unit/gpu.rb', line 48

def opencl_name
  @opencl_name ||= opencl_device.name if use_opencl
end

#opencl_units ⇒ Integer

Returns the number of compute units detected by OpenCL, not to be confused with stream processors. Can be helpful when determining which product a card is, e.g. vega56 or vega64.

Returns:

  • (Integer)
    • the number of compute units detected by OpenCL

    not to be confused with stream processors. Can be helpful when determining which product a card is, e.g. vega56 or vega64

# File 'lib/compute_unit/gpu.rb', line 41

def opencl_units
  @opencl_units ||= opencl_device.max_compute_units.to_i if use_opencl
end

#power ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 97

def power
  raise NotImplementedError
end

#power_max_limit ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 124

def power_max_limit
  raise NotImplementedError
end

#pstate ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 101

def pstate
  raise NotImplementedError
end

#status ⇒ Object

# File 'lib/compute_unit/gpu.rb', line 90

def status
  return 0 if utilization > 20 && power >= 50
  return 2 if power < 20

  1
end
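
The thresholds above are easier to follow as a pure function. This sketch copies the comparisons from the listing into a hypothetical `gpu_status` helper; the source does not name the three codes, so they are left as bare integers here.

```ruby
# Sketch: same comparisons as the listing above, with the readings passed in.
def gpu_status(utilization:, power:)
  return 0 if utilization > 20 && power >= 50 # busy and drawing power
  return 2 if power < 20                      # drawing very little power

  1 # everything in between
end

puts gpu_status(utilization: 80, power: 120)
puts gpu_status(utilization: 5, power: 10)
puts gpu_status(utilization: 80, power: 30)
```

Note that a high utilization alone is not enough for code 0: with power between 20 and 49 the result is 1 regardless of utilization.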

#status_info ⇒ Hash

Returns a hash of hardware status information about the GPU.

Returns:

  • (Hash)
    • hash of hardware status information about the GPU

# File 'lib/compute_unit/gpu.rb', line 185

def status_info
  {
    index: "#{device_class_name}#{index}",
    name: name,
    bios: bios,
    core_clock: core_clock,
    memory_clock: memory_clock,
    power: power,
    fan: fan,
    core_volt: core_voltage,
    temp: temp,
    mem_temp: mem_temp,
    status: status
  }
end

#temp ⇒ Object

# File 'lib/compute_unit/gpu.rb', line 232

def temp
  0
end

#to_h ⇒ Object

# File 'lib/compute_unit/gpu.rb', line 236

def to_h
  {
    uuid: uuid,
    gpuId: "GPU#{index}",
    syspath: device_path,
    pciLoc: pci_loc,
    name: name,
    bios: bios,
    subType: subtype,
    make: make,
    model: model,
    vendor: vendor,
    # memory_name: nil,
    # memory_type: nil,
    # gpu_platform: nil,
    power: power,
    # power_limit: power_limit,
    # power_max_limit: power_max_limit,
    utilization: utilization,
    # memory_used: memory_used ,
    # memory_free: memory_free,
    # memory_total: memory_total,
    temperature: temp,
    status: status,
    pstate: pstate,
    fanSpeed: fan,
    type: compute_type,
    maxTemp: nil,
    mem: memory_clock,
    cor: core_clock,
    vlt: core_voltage,
    mem_temp: mem_temp,
    maxFan: nil,
    dpm: nil,
    vddci: nil,
    maxPower: nil,
    ocProfile: nil,
    opencl_enabled: use_opencl
  }
end
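
The hash above calls `power`, `utilization`, `pstate` and `fan`, all of which raise NotImplementedError in this base class, so it is presumably meant to be called on subclasses that override them. For code that has to handle a reader that may be unimplemented, the following sketch shows one way to do it; `safe_reading` and the `base_like` object are hypothetical and not part of the library.

```ruby
# Sketch: read one metric, treating an unimplemented reader as "no data".
# NotImplementedError descends from ScriptError, not StandardError, so a
# bare `rescue` would not catch it; it has to be named explicitly.
def safe_reading(gpu, reader)
  gpu.public_send(reader)
rescue NotImplementedError
  nil
end

# Hypothetical object that behaves like the base class for two readers.
base_like = Object.new
def base_like.power
  raise NotImplementedError
end

def base_like.temp
  0
end

puts safe_reading(base_like, :power).inspect
puts safe_reading(base_like, :temp).inspect
```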

#utilization ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 140

def utilization
  raise NotImplementedError
end

#vddgfx ⇒ Integer

Returns the voltage reading of the card in mV, possibly for AMD cards only.

Returns:

  • (Integer)
    • the voltage reading of the card in mV, possibly for AMD cards only

# File 'lib/compute_unit/gpu.rb', line 228

def vddgfx
  0
end

#voltage_table ⇒ Hash

Returns a hash of voltages per the voltage table, nil if no table is available. The base implementation listed here returns an empty array.

Returns:

  • (Hash)
    • a hash of voltages per the voltage table, nil if no table is available

# File 'lib/compute_unit/gpu.rb', line 330

def voltage_table
  []
end