Torch.rb

:fire: Deep learning for Ruby, powered by LibTorch

Check out:


Installation

First, download LibTorch. For Mac arm64, use:

curl -L https://download.pytorch.org/libtorch/cpu/libtorch-macos-arm64-2.5.0.zip > libtorch.zip
unzip -q libtorch.zip

For Linux x86-64, use the cxx11 ABI version. For other platforms, build LibTorch from source.

Then run:

bundle config build.torch-rb --with-torch-dir=/path/to/libtorch

And add this line to your application’s Gemfile:

gem "torch-rb"

It can take 5-10 minutes to compile the extension. Windows is not currently supported.

Getting Started

A good place to start is Deep Learning with Torch.rb: A 60 Minute Blitz.

Tutorials

Examples

API

This library follows the PyTorch API. There are a few changes to make it more Ruby-like:

  • Methods that perform in-place modifications end with ! instead of _ (add! instead of add_)
  • Methods that return booleans use ? instead of is_ (tensor? instead of is_tensor)
  • Numo is used instead of NumPy (x.numo instead of x.numpy())

You can follow PyTorch tutorials and convert the code to Ruby in many cases. Feel free to open an issue if you run into problems.
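When converting PyTorch code by hand, the two renaming rules above can be sketched as a small plain-Ruby helper. `ruby_method_name` is a hypothetical function written for this illustration; it is not part of Torch.rb, it needs no Torch.rb install to run, and individual methods may not follow the rules exactly.

```ruby
# Hypothetical helper (not part of Torch.rb): applies the two renaming
# rules listed above to a PyTorch method name.
def ruby_method_name(pytorch_name)
  name = pytorch_name.to_s
  if name.start_with?("is_")
    # Boolean methods: drop the is_ prefix and append ?
    "#{name.delete_prefix("is_")}?"
  elsif name.end_with?("_") && !name.start_with?("_")
    # In-place methods: replace the trailing underscore with !
    "#{name.chomp("_")}!"
  else
    # Everything else keeps its name
    name
  end
end

puts ruby_method_name("add_")      # add!
puts ruby_method_name("is_tensor") # tensor?
puts ruby_method_name("mean")      # mean
```

The Numo rule is a library substitution rather than a rename, so it is left out of the helper.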

Overview

Some examples below are from Deep Learning with PyTorch: A 60 Minute Blitz.

Tensors

Create a tensor from a Ruby array

x = Torch.tensor([[1, 2, 3], [4, 5, 6]])

Get the shape of a tensor

x.shape

There are many functions to create tensors, like

a = Torch.rand(3)
b = Torch.zeros(2, 3)

Each tensor has four properties

  • dtype - the data type - :uint8, :int8, :int16, :int32, :int64, :float32, :float64, or :bool
  • layout - :strided (dense) or :sparse
  • device - the compute device, like CPU or GPU
  • requires_grad - whether or not to record gradients

You can specify properties when creating a tensor

Torch.rand(2, 3, dtype: :float64, layout: :strided, device: "cpu", requires_grad: true)

Operations

Create a tensor

x = Torch.tensor([10, 20, 30])

Add

x + 5 # tensor([15, 25, 35])

Subtract

x - 5 # tensor([5, 15, 25])

Multiply

x * 5 # tensor([50, 100, 150])

Divide

x / 5 # tensor([2, 4, 6])

Get the remainder

x % 3 # tensor([1, 2, 0])

Raise to a power

x**2 # tensor([100, 400, 900])

Perform operations with other tensors

y = Torch.tensor([1, 2, 3])
x + y # tensor([11, 22, 33])

Perform operations in-place

x.add!(5)
x # tensor([15, 25, 35])
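The `!` suffix mirrors a convention Ruby already uses for its own collections. As an analogy only, using plain Ruby arrays rather than tensors, the difference between returning a new value and modifying the receiver looks like this:

```ruby
# Plain-Ruby analogy with arrays; no Torch.rb involved.
x = [10, 20, 30]

y = x.map { |v| v + 5 } # returns a new array, x is unchanged
p x # [10, 20, 30]
p y # [15, 25, 35]

x.map! { |v| v + 5 }    # bang method: modifies x itself
p x # [15, 25, 35]
```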

You can also specify an output tensor

result = Torch.empty(3)
Torch.add(x, y, out: result)
result # tensor([16., 27., 38.])

Numo

Convert a tensor to a Numo array

a = Torch.ones(5)
a.numo

Convert a Numo array to a tensor

b = Numo::NArray.cast([1, 2, 3])
Torch.from_numo(b)

Autograd

Create a tensor with requires_grad: true

x = Torch.ones(2, 2, requires_grad: true)

Perform operations

y = x + 2
z = y * y * 3
out = z.mean

Backprop

out.backward

Get gradients

x.grad # tensor([[4.5, 4.5], [4.5, 4.5]])
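The value 4.5 can be checked by hand: `out` is the mean of 3 * (x + 2)^2 over four elements, so each partial derivative is 6 * (x + 2) / 4, which is 4.5 at x = 1. The sketch below verifies this numerically in plain Ruby with a central finite difference. It does not use Torch.rb, and `objective` is a name chosen for this illustration.

```ruby
# Plain-Ruby check of the gradient above; no Torch.rb required.
# objective mirrors: y = x + 2; z = y * y * 3; out = z.mean
def objective(values)
  values.sum { |v| 3.0 * (v + 2.0)**2 } / values.size
end

x = [1.0, 1.0, 1.0, 1.0] # the 2x2 tensor of ones, flattened
eps = 1e-6

# Central finite difference with respect to each element
grad = x.each_index.map do |i|
  plus = x.dup
  minus = x.dup
  plus[i] += eps
  minus[i] -= eps
  (objective(plus) - objective(minus)) / (2 * eps)
end

# Analytic gradient: d(out)/dx_i = 6 * (x_i + 2) / 4
analytic = x.map { |v| 6.0 * (v + 2.0) / x.size }

p grad.map { |g| g.round(4) } # each entry is approximately 4.5
p analytic
```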

Stop autograd from tracking history

x.requires_grad # true
(x**2).requires_grad # true

Torch.no_grad do
  (x**2).requires_grad # false
end

Neural Networks

Define a neural network

class MyNet < Torch::NN::Module
  def initialize
    super()
    @conv1 = Torch::NN::Conv2d.new(1, 6, 3)
    @conv2 = Torch::NN::Conv2d.new(6, 16, 3)
    @fc1 = Torch::NN::Linear.new(16 * 6 * 6, 120)
    @fc2 = Torch::NN::Linear.new(120, 84)
    @fc3 = Torch::NN::Linear.new(84, 10)
  end

  def forward(x)
    x = Torch::NN::F.max_pool2d(Torch::NN::F.relu(@conv1.call(x)), [2, 2])
    x = Torch::NN::F.max_pool2d(Torch::NN::F.relu(@conv2.call(x)), 2)
    x = Torch.flatten(x, 1)
    x = Torch::NN::F.relu(@fc1.call(x))
    x = Torch::NN::F.relu(@fc2.call(x))
    @fc3.call(x)
  end
end
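The input size of `@fc1` (16 * 6 * 6) follows from the 32x32 input used in the next step. A plain-Ruby sketch of the arithmetic, assuming the PyTorch defaults of stride 1 and no padding for the convolutions and a pooling stride equal to the window size; `output_size` is a hypothetical helper, not a Torch.rb method:

```ruby
# Output size along one dimension for a convolution or pooling layer,
# assuming no padding and no dilation (integer division floors the result).
def output_size(input, kernel, stride: 1)
  (input - kernel) / stride + 1
end

size = 32                              # input is 32x32
size = output_size(size, 3)            # conv1, 3x3 kernel   -> 30
size = output_size(size, 2, stride: 2) # max_pool2d, 2x2     -> 15
size = output_size(size, 3)            # conv2, 3x3 kernel   -> 13
size = output_size(size, 2, stride: 2) # max_pool2d, 2x2     -> 6

channels = 16                          # output channels of conv2
puts channels * size * size            # 576, the input size of fc1
```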

Create an instance of it

net = MyNet.new
input = Torch.randn(1, 1, 32, 32)
out = net.call(input)

Get trainable parameters

net.parameters

Zero the gradient buffers and backprop with random gradients

net.zero_grad
out.backward(Torch.randn(1, 10))

Define a loss function

output = net.call(input)
target = Torch.randn(10)
target = target.view(1, -1)
criterion = Torch::NN::MSELoss.new
loss = criterion.call(output, target)
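With its default settings, mean squared error loss averages the squared differences between output and target. A plain-Ruby sketch of that formula on made-up numbers; `mse` is a hypothetical helper for illustration and does not call Torch.rb:

```ruby
# Mean squared error over two equally sized arrays of numbers.
def mse(output, target)
  output.zip(target).sum { |o, t| (o - t)**2 } / output.size.to_f
end

output = [1.0, 2.0, 3.0] # made-up predictions
target = [1.0, 0.0, 5.0] # made-up targets

# Squared differences are 0, 4, and 4, so the mean is 8 / 3
puts mse(output, target)
```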

Backprop

net.zero_grad
p net.conv1.bias.grad
loss.backward
p net.conv1.bias.grad

Update the weights

learning_rate = 0.01
net.parameters.each do |f|
  f.data.sub!(f.grad.data * learning_rate)
end
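The loop above applies the plain gradient descent rule, weight = weight - learning_rate * gradient, to every parameter. The same arithmetic on ordinary Ruby arrays, with made-up weights and gradients chosen only for illustration:

```ruby
# Plain-Ruby sketch of the update rule; no Torch.rb involved.
learning_rate = 0.01
weights = [0.5, -1.0, 2.0]  # made-up parameter values
grads = [10.0, -20.0, 0.0]  # made-up gradients

# Step each weight against its gradient
updated = weights.zip(grads).map { |w, g| w - learning_rate * g }
p updated
```

A positive gradient lowers the weight, a negative gradient raises it, and a zero gradient leaves it unchanged.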

Use an optimizer

optimizer = Torch::Optim::SGD.new(net.parameters, lr: 0.01)
optimizer.zero_grad
output = net.call(input)
loss = criterion.call(output, target)
loss.backward
optimizer.step

Saving and Loading Models

Save a model

Torch.save(net.state_dict, "net.pth")

Load a model

net = MyNet.new
net.load_state_dict(Torch.load("net.pth"))
net.eval

When saving a model in Python to load in Ruby, convert parameters to tensors (due to outstanding bugs in LibTorch)

state_dict = {k: v.data if isinstance(v, torch.nn.Parameter) else v for k, v in state_dict.items()}
torch.save(state_dict, "net.pth")

Tensor Creation

Here’s a list of functions to create tensors (descriptions from the C++ docs):

  • arange returns a tensor with a sequence of integers
  Torch.arange(3) # tensor([0, 1, 2])
  • empty returns a tensor with uninitialized values
  Torch.empty(3) # tensor([7.0054e-45, 0.0000e+00, 0.0000e+00])
  • eye returns an identity matrix
  Torch.eye(2) # tensor([[1, 0], [0, 1]])
  • full returns a tensor filled with a single value
  Torch.full([3], 5) # tensor([5, 5, 5])
  • linspace returns a tensor with values linearly spaced in some interval
  Torch.linspace(0, 10, 5) # tensor([0.0, 2.5, 5.0, 7.5, 10.0])
  • logspace returns a tensor with values logarithmically spaced in some interval
  Torch.logspace(0, 10, 5) # tensor([1.0000e+00, 3.1623e+02, 1.0000e+05, 3.1623e+07, 1.0000e+10])
  • ones returns a tensor filled with all ones
  Torch.ones(3) # tensor([1, 1, 1])
  • rand returns a tensor filled with values drawn from a uniform distribution on [0, 1)
  Torch.rand(3) # tensor([0.5444, 0.8799, 0.5571])
  • randint returns a tensor with integers randomly drawn from an interval
  Torch.randint(1, 10, [3]) # tensor([7, 6, 4])
  • randn returns a tensor filled with values drawn from a standard normal distribution (mean 0, variance 1)
  Torch.randn(3) # tensor([-0.7147,  0.6614,  1.1453])
  • randperm returns a tensor filled with a random permutation of integers in some interval
  Torch.randperm(3) # tensor([2, 0, 1])
  • zeros returns a tensor filled with all zeros
  Torch.zeros(3) # tensor([0, 0, 0])
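For linspace and logspace, the third argument is the number of points, including both endpoints. A plain-Ruby sketch of the spacing rules, assuming base 10 for logspace as in PyTorch; these helpers are illustrative and are not Torch.rb methods:

```ruby
# Evenly spaced values from start to stop, inclusive, with `steps` points.
def linspace(start, stop, steps)
  return [start.to_f] if steps == 1
  step = (stop - start).to_f / (steps - 1)
  Array.new(steps) { |i| start + i * step }
end

# Values base**e for exponents e evenly spaced from start to stop.
def logspace(start, stop, steps, base: 10.0)
  linspace(start, stop, steps).map { |e| base**e }
end

p linspace(0, 10, 5) # [0.0, 2.5, 5.0, 7.5, 10.0]
p logspace(0, 10, 5) # powers of 10 with exponents 0, 2.5, 5, 7.5, 10
```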

LibTorch Compatibility

Here’s the list of compatible versions.

Torch.rb    LibTorch
0.18.x      2.5.x
0.17.x      2.4.x
0.16.x      2.3.x
0.15.x      2.2.x
0.14.x      2.1.x
0.13.x      2.0.x
0.12.x      1.13.x
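If a script needs to pick a LibTorch download for a given gem version, the table can be encoded as a lookup. This is a sketch only: `COMPATIBILITY` and `libtorch_series_for` are hypothetical names for this illustration, and the hash simply copies the rows above.

```ruby
require "rubygems" # for Gem::Version (loaded by default in most setups)

# Rows from the table above: Torch.rb series => LibTorch series
COMPATIBILITY = {
  "0.18" => "2.5",
  "0.17" => "2.4",
  "0.16" => "2.3",
  "0.15" => "2.2",
  "0.14" => "2.1",
  "0.13" => "2.0",
  "0.12" => "1.13"
}.freeze

# Hypothetical helper: returns the LibTorch series for a gem version
# string, or nil when the version is not in the table.
def libtorch_series_for(gem_version)
  major_minor = Gem::Version.new(gem_version).segments.first(2).join(".")
  COMPATIBILITY[major_minor]
end

puts libtorch_series_for("0.18.0") # 2.5
puts libtorch_series_for("0.13.2") # 2.0
```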

Performance

Deep learning is significantly faster on a GPU.

Linux

With Linux, install CUDA and cuDNN and reinstall the gem.

Check if CUDA is available

Torch::CUDA.available?

Move a neural network to a GPU

net.cuda

If you don’t have a GPU that supports CUDA, we recommend using a cloud service. Paperspace has a great free plan. We’ve put together a Docker image to make it easy to get started. On Paperspace, create a notebook with a custom container. Under advanced options, set the container name to:

ankane/ml-stack:torch-gpu

And leave the other fields in that section blank. Once the notebook is running, you can run the MNIST example.

Mac

With Apple silicon, check if Metal Performance Shaders (MPS) is available

Torch::Backends::MPS.available?

Move a neural network to a GPU

device = Torch.device("mps")
net.to(device)
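Code that should run on Linux with CUDA, on Apple silicon, or on a CPU-only machine can choose the device once and reuse it. A minimal sketch of one possible preference order: `pick_device` is a hypothetical helper, not part of Torch.rb, and the Torch.rb calls appear only in comments so the block runs without the gem installed.

```ruby
# Hypothetical helper: choose a device name from availability flags,
# preferring CUDA, then MPS, then falling back to the CPU.
def pick_device(cuda:, mps:)
  return "cuda" if cuda
  return "mps" if mps
  "cpu"
end

# With Torch.rb loaded, the flags could come from the checks shown above:
#   name = pick_device(
#     cuda: Torch::CUDA.available?,
#     mps: Torch::Backends::MPS.available?
#   )
#   net.to(Torch.device(name))

puts pick_device(cuda: true, mps: false)  # cuda
puts pick_device(cuda: false, mps: true)  # mps
puts pick_device(cuda: false, mps: false) # cpu
```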

History

View the changelog

Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help:

To get started with development:

git clone https://github.com/ankane/torch.rb.git
cd torch.rb
bundle install
bundle exec rake compile -- --with-torch-dir=/path/to/libtorch
bundle exec rake test

You can use this script to test on GPUs with the AWS Deep Learning Base AMI (Ubuntu 18.04).

Here are some good resources for contributors: