AthenaUDF

A Ruby implementation of Athena User Defined Functions (UDFs).

This gem is heavily inspired by the Python implementation of Athena UDF.

See the official example implementation for more details on writing a Lambda function for Athena UDFs.

Installation

Install the gem and add it to the application's Gemfile by executing:

$ bundle add athena-udf

If bundler is not being used to manage dependencies, install the gem by executing:

$ gem install athena-udf

Usage

Create a subclass of AthenaUDF::BaseUDF and implement the function logic in handle_athena_record, which receives one input record and returns the output values as an array.

require "athena-udf"

class SimpleVarcharUDF < AthenaUDF::BaseUDF
  def self.handle_athena_record(_input_schema, _output_schema, record)
    [record[0].downcase]
  end
end

The class can then be invoked as SimpleVarcharUDF.lambda_handler from your Lambda function for Athena UDF workloads.
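Because handle_athena_record is a plain class method, the per-record logic can be checked locally before anything is deployed. The following is a minimal sketch, not part of the gem: JoinVarcharUDF is a hypothetical two-column UDF, the empty AthenaUDF::BaseUDF definition is only a stand-in so the snippet runs without the gem installed, and the handling of NULL as nil is an assumption about how values arrive.

```ruby
# Stand-in for the gem's base class so this sketch runs without the gem
# installed. In a real project, replace this module with: require "athena-udf"
module AthenaUDF
  class BaseUDF; end
end

# Hypothetical UDF that joins two varchar columns with a hyphen.
class JoinVarcharUDF < AthenaUDF::BaseUDF
  def self.handle_athena_record(_input_schema, _output_schema, record)
    first, second = record
    # Assumption: SQL NULL arrives as nil; propagate it instead of raising.
    return [nil] if first.nil? || second.nil?

    ["#{first}-#{second}"]
  end
end

# Call the class method directly, passing nil for the unused schema arguments.
p JoinVarcharUDF.handle_athena_record(nil, nil, ["foo", "bar"])
p JoinVarcharUDF.handle_athena_record(nil, nil, ["foo", nil])
```

Calling the method directly like this only exercises the row-level logic; it does not go through lambda_handler or the data format Athena sends to Lambda.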

After pushing an image to Amazon ECR and creating the Lambda function (see Deployment), you can call the function from Athena with SQL like the following:

USING EXTERNAL FUNCTION my_udf(col1 varchar) RETURNS varchar LAMBDA 'athena-udf-simple-varchar'

SELECT my_udf('FooBar');

See the official documentation for details on UDF usage.

Development

To contribute to this library, first check out the code. Then, install the dependencies:

$ bundle install

To run the tests:

$ bundle exec rspec

Deployment

You can try the example with the following steps.

First, push a container image to Amazon ECR:

$ aws ecr get-login-password | docker login --username AWS --password-stdin https://<ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com
$ docker build --platform=linux/amd64 -t <ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com/athena-udf-test -f Dockerfile.example .
$ docker push <ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com/athena-udf-test

Then, create an IAM role and a Lambda function with the AWS CLI:

$ aws iam create-role --role-name athena-udf-simple-varchar --assume-role-policy-document '{"Version": "2012-10-17","Statement": [{ "Effect": "Allow", "Principal": {"Service": "lambda.amazonaws.com"}, "Action": "sts:AssumeRole"}]}'
$ aws iam attach-role-policy --role-name athena-udf-simple-varchar --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
$ aws lambda create-function --function-name athena-udf-simple-varchar --package-type Image --role arn:aws:iam::<ACCOUNT_ID>:role/athena-udf-simple-varchar --code ImageUri=<ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com/athena-udf-test:latest --publish

Development Container

You can use the dev container image, which includes the necessary packages, to develop this library:

$ docker build -t ruby-athena-udf-dev -f Dockerfile.dev .
$ docker run -v $PWD:/src -it ruby-athena-udf-dev

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/dtaniwaki/ruby-athena-udf.

License

The gem is available as open source under the terms of the MIT License.