link-header-parser-ruby

A Ruby gem for parsing HTTP Link headers.

Getting Started

Before installing and using link-header-parser-ruby, you'll want to have Ruby 3.0 (or newer) installed. Using a Ruby version management tool like rbenv, chruby, or rvm is recommended.

link-header-parser-ruby is developed using Ruby 3.3.1 and is tested against additional Ruby versions using GitHub Actions.

Installation

Add link-header-parser-ruby to your project's Gemfile and run bundle install:

source "https://rubygems.org"

gem "link-header-parser"

Usage

With link-header-parser-ruby added to your project's Gemfile and installed, you can parse a URL's HTTP Link headers like so:

require "net/http"
require "link-header-parser"

url = "https://sixtwothree.org"
link_headers = Net::HTTP.get_response(URI.parse(url)).get_fields("Link")

collection = LinkHeaderParser.parse(link_headers, base: url)

The parse method accepts two arguments:

  1. an Array of strings representing HTTP Link headers (e.g. ['</>; rel="home"', '</chapters/1>; anchor="#copyright"; rel="license"'])
  2. a String (or any String-like object) representing the absolute URL of the resource providing the HTTP Link headers
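The first argument usually comes from Net::HTTP's get_fields method, which returns nil when a response carries no Link header at all. The sketch below builds a response object locally (no network request) to show both cases. Wrapping the result in Array() is a cautious assumption on our part; this README does not say how parse treats nil.

```ruby
require "net/http"

# Build a response by hand so the example runs offline. This constructor is
# normally called by Net::HTTP itself; it is used here for illustration only.
response = Net::HTTPOK.new("1.1", "200", "OK")

# No Link header yet: get_fields returns nil, and Array(nil) yields [].
missing = Array(response.get_fields("Link"))

response.add_field("Link", '</>; rel="home"')
response.add_field("Link", '</chapters/1>; anchor="#copyright"; rel="license"')

# Two Link headers: get_fields returns one String per header field.
present = Array(response.get_fields("Link"))

puts missing.inspect
puts present.inspect
```

Either Array can then be handed to LinkHeaderParser.parse along with the base URL.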

In the example above, collection is an instance of LinkHeadersCollection, which includes Ruby's Enumerable mixin. This mixin provides common methods like each, first/last, and map.

For example, you could retrieve an array of target_uris:

puts collection.map(&:target_uri)
#=> ["https://assets.sixtwothree.org/", "https://fonts.googleapis.com/", "https://fonts.gstatic.com/", "https://sixtwothree.org/webmentions"]

Working with a LinkHeadersCollection

In addition to the included Enumerable methods, the following methods may be used to interact with a LinkHeadersCollection:

The relation_types Method

puts collection.relation_types
#=> ["preconnect", "webmention"]

The group_by_relation_type Method

Using the collection from above, the group_by_relation_type method returns a Hash:

{
  preconnect: [
    #<LinkHeaderParser::LinkHeader target_uri: "https://assets.sixtwothree.org/", relation_types: ["preconnect"]>,
    #<LinkHeaderParser::LinkHeader target_uri: "https://fonts.googleapis.com/", relation_types: ["preconnect"]>,
    #<LinkHeaderParser::LinkHeader target_uri: "https://fonts.gstatic.com/", relation_types: ["preconnect"]>
  ],
  webmention: [
    #<LinkHeaderParser::LinkHeader target_uri: "https://sixtwothree.org/webmentions", relation_types: ["webmention"]>
  ]
}
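To make the shape of that Hash concrete, here is a rough, standard-library-only sketch of the grouping idea. The Link Struct and the sample data are hypothetical stand-ins for LinkHeader objects, and this is not the gem's actual implementation.

```ruby
# Hypothetical stand-in for LinkHeaderParser::LinkHeader.
Link = Struct.new(:target_uri, :relation_types)

links = [
  Link.new("https://assets.sixtwothree.org/", ["preconnect"]),
  Link.new("https://sixtwothree.org/webmentions", ["webmention"]),
  Link.new("https://fonts.gstatic.com/", ["preconnect"])
]

# Collect every distinct relation type across all links.
relation_types = links.flat_map(&:relation_types).uniq.sort

# Build a Hash keyed by relation type (as a Symbol), where each value is the
# list of links that carry that relation type.
grouped = relation_types.to_h do |relation_type|
  [relation_type.to_sym, links.select { |link| link.relation_types.include?(relation_type) }]
end

puts grouped.keys.inspect
puts grouped[:preconnect].map(&:target_uri).inspect
```

A link with several relation types (e.g. rel="prev start") would appear under each of its keys in this sketch.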

Working with a LinkHeader

You may interact with one or more LinkHeaders in a LinkHeadersCollection using the methods outlined below. The naming conventions for these methods draw heavily on the terminology established in RFC-5988 and RFC-8288.

Link Target (§ 3.1)

link_header = LinkHeaderParser.parse(%(</index.html>; rel="home"), base: "https://example.com/").first

link_header.target_string
#=> "/index.html"

link_header.target_uri
#=> "https://example.com/index.html"

The target_string method returns a string of the value between the opening and closing angle brackets at the beginning of the Link header. The target_uri method returns a string representing the resolved URL.
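The relationship between the two values can be approximated with Ruby's standard library alone. In this sketch, extract_target_string and resolve_target_uri are hypothetical helper names, not part of the gem's API, and the regular expression is a simplification that only looks at the leading angle brackets.

```ruby
require "uri"

# Hypothetical helper: return the text between the leading "<" and ">".
def extract_target_string(header)
  header[/\A\s*<([^>]*)>/, 1]
end

# Hypothetical helper: resolve the target against the base URL.
def resolve_target_uri(header, base:)
  target_string = extract_target_string(header)
  return nil if target_string.nil?

  URI.join(base, target_string).to_s
end

header = %(</index.html>; rel="home")

puts extract_target_string(header)
puts resolve_target_uri(header, base: "https://example.com/")
```

A header that does not start with an angle-bracketed target yields nil from both helpers in this sketch.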

Link Context (§ 3.2)

link_header = LinkHeaderParser.parse(%(</chapters/1>; anchor="#copyright"; rel="license"), base: "https://example.com/").first

link_header.context_string
#=> "#copyright"

link_header.context_uri
#=> "https://example.com/chapters/1#copyright"

The anchor parameter's value may be a fragment identifier (e.g. #foo), a relative URL (e.g. /foo), or an absolute URL (e.g. https://context.example.com). The context_string method returns the anchor parameter's value (when present) and defaults to the target_string value. The context_uri method returns a string representing the resolved URL.
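The three anchor forms resolve quite differently. The sketch below uses URI.join from the standard library and assumes, as the context_uri output above suggests, that the anchor is resolved against the target URL. It illustrates the resolution rules only and is not the gem's code.

```ruby
require "uri"

target_uri = "https://example.com/chapters/1"

# One example of each anchor form described above.
anchors = ["#foo", "/foo", "https://context.example.com"]

resolved = anchors.to_h do |anchor|
  [anchor, URI.join(target_uri, anchor).to_s]
end

# A fragment identifier keeps the target's path and appends the fragment.
puts resolved["#foo"]

# A relative URL with a leading slash replaces the target's path.
puts resolved["/foo"]

# An absolute URL replaces the target entirely.
puts resolved["https://context.example.com"]
```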

Relation Type (§ 3.3)

link_header = LinkHeaderParser.parse(%(</chapters/1>; rel="prev start"), base: "https://example.com/").first

link_header.relations_string
#=> "prev start"

link_header.relation_types
#=> ["prev", "start"]
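Since a rel value is a whitespace-separated list, the step from relations_string to relation_types can be sketched as a simple split. split_relation_types is a hypothetical helper, and the lowercasing and de-duplication are assumptions of this sketch rather than documented behavior of the gem.

```ruby
# Hypothetical helper, not part of the gem's API.
def split_relation_types(relations_string)
  # String#split with no arguments splits on runs of whitespace.
  relations_string.split.map(&:downcase).uniq
end

puts split_relation_types("prev start").inspect
puts split_relation_types("Prev  START prev").inspect
```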

Link Parameters

link_header = LinkHeaderParser.parse(%(</posts.rss>; rel="alternate"; hreflang="en-US"; title="sixtwothree.org: Posts"; type="application/rss+xml"), base: "https://sixtwothree.org").first

link_header.link_parameters
#=> [#<LinkHeaderParser::LinkHeaderParameter name: "rel", value: "alternate">, #<LinkHeaderParser::LinkHeaderParameter name: "hreflang", value: "en-US">, #<LinkHeaderParser::LinkHeaderParameter name: "title", value: "sixtwothree.org: Posts">, #<LinkHeaderParser::LinkHeaderParameter name: "type", value: "application/rss+xml">]

Note that the Array returned by the link_parameters method may include multiple LinkHeaderParameters with the same name depending on the provided Link header. Certain methods on LinkHeader will return values from the first occurrence of a parameter name (e.g. link_header.relations_string) in accordance with RFC-8288.
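The first-occurrence rule can be illustrated with plain Ruby. The Parameter Struct below is a hypothetical stand-in for LinkHeaderParameter, and the duplicate rel value is made up for the example.

```ruby
# Hypothetical stand-in for LinkHeaderParser::LinkHeaderParameter.
Parameter = Struct.new(:name, :value)

parameters = [
  Parameter.new("rel", "alternate"),
  Parameter.new("type", "application/rss+xml"),
  Parameter.new("rel", "ignored")
]

# Enumerable#find stops at the first match, so later duplicates are ignored.
first_rel = parameters.find { |parameter| parameter.name == "rel" }&.value

# Grouping by name keeps every occurrence, in order, if all values are needed.
grouped = parameters.group_by(&:name).transform_values { |group| group.map(&:value) }

puts first_rel
puts grouped.inspect
```

The same find-based lookup works on the Array returned by link_parameters when you need the first value of any other parameter name.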

Acknowledgments

link-header-parser-ruby is written and maintained by Jason Garber.

License

link-header-parser-ruby is freely available under the MIT License. Use it, learn from it, fork it, improve it, change it, tailor it to your needs.