A Ruby binding to re2, an "efficient, principled regular expression library".
Current version: 1.4.0
Supported Ruby versions: 1.8.7, 1.9.3, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 3.0
Supported re2 versions: libre2.0 (< 2020-03-02), libre2.1 (2020-03-02), libre2.6 (2020-03-03), libre2.7 (2020-05-01), libre2.8 (2020-07-06), libre2.9 (2020-11-01)
You will need re2 installed as well as a C++ compiler such as gcc (on Debian and Ubuntu, this is provided by the build-essential package). If you are using Mac OS X, I recommend installing re2 with Homebrew by running the following:
$ brew install re2
If you are using Debian, you can install the libre2-dev package like so:
$ sudo apt-get install libre2-dev
Recent versions of re2 require a compiler with C++11 support such as clang 3.4 or gcc 4.8.
If you are using a packaged Ruby distribution, make sure you also have the Ruby header files installed such as those provided by the ruby-dev package on Debian and Ubuntu.
You can then install the library via RubyGems with
gem install re2, or with
gem install re2 -- --with-re2-dir=/path/to/re2/prefix if re2 is not installed in
one of the default locations.
Full documentation automatically generated from the latest version is available at http://mudge.name/re2/.
Note that re2's regular expression syntax differs from PCRE and Ruby's
Regexp library; see the official syntax page for more details.
While re2 uses the same naming scheme as Ruby's built-in regular expression
library (with Regexp and MatchData), its API is slightly different:
$ irb -rubygems
> require 're2'
> r = RE2::Regexp.new('w(\d)(\d+)')
=> #<RE2::Regexp /w(\d)(\d+)/>
> m = r.match("w1234")
=> #<RE2::MatchData "w1234" 1:"1" 2:"234">
> m[1]
=> "1"
> m.string
=> "w1234"
> m.begin(1)
=> 1
> m.end(1)
=> 2
> r =~ "w1234"
=> true
> r !~ "bob"
=> true
> r.match("bob")
=> nil
As RE2::Regexp.new (or RE2::Regexp.compile) can be quite verbose, a helper method has been
defined against Kernel so you can use a shorter version to create regular
expressions:
> RE2('(\d+)')
=> #<RE2::Regexp /(\d+)/>
Note the use of single quotes: double quotes will interpret
\d as d, as in the following example:
> RE2("(\d+)")
=> #<RE2::Regexp /(d+)/>
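The difference comes from Ruby's string literal rules rather than from re2 itself. A minimal sketch in plain Ruby (no re2 required; the variable names are purely illustrative) showing what each kind of literal actually contains before it ever reaches the regular expression compiler:

```ruby
# Single quotes keep the backslash, so the pattern text is (\d+).
single = '(\d+)'

# Double quotes process escape sequences; \d is not a recognised
# escape, so Ruby drops the backslash and the text becomes (d+).
double = "(\d+)"

# Doubling the backslash preserves it inside a double-quoted string.
escaped = "(\\d+)"

puts single             # (\d+)
puts double             # (d+)
puts escaped            # (\d+)
puts single == escaped  # true
```

If a pattern has to be built with interpolation, and therefore with double quotes, doubling each backslash as in the last literal is one way to keep the escapes intact.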
As of 0.3.0, you can use named groups:
> r = RE2::Regexp.new('(?P<name>\w+) (?P<age>\d+)')
=> #<RE2::Regexp /(?P<name>\w+) (?P<age>\d+)/>
> m = r.match("Bob 40")
=> #<RE2::MatchData "Bob 40" 1:"Bob" 2:"40">
> m[:name]
=> "Bob"
> m["age"]
=> "40"
As of 0.6.0, you can use
RE2::Regexp#scan to incrementally scan text for
matches (similar in purpose to Ruby's String#scan).
Calling scan will return an
RE2::Scanner which is
enumerable, meaning you can use
each to iterate through the matches (and even use
Enumerator::Lazy):
re = RE2('(\w+)')
scanner = re.scan("It is a truth universally acknowledged")
scanner.each do |match|
  puts match
end

scanner.rewind

enum = scanner.to_enum
enum.next #=> ["It"]
enum.next #=> ["is"]
- Pre-compiling regular expressions with RE2::Regexp.new(re), RE2::Regexp.compile(re) or RE2(re) (including specifying options, e.g. RE2::Regexp.new("pattern", :case_sensitive => false))
- Extracting matches with re2.match(text) (and an exact number of matches with re2.match(text, number_of_matches))
- Extracting matches by name (both with strings and symbols)
- Checking for matches with re2 =~ text, re2 === text (for use in case statements) and re2 !~ text
- Incrementally scanning text with re2.scan(text)
- Checking regular expression compilation with re2.ok?, re2.error and re2.error_arg
- Checking regular expression "cost" with re2.program_size
- Checking the options for an expression with re2.options or individually (e.g. with re2.case_sensitive?)
- Performing a single string replacement
- Performing a global string replacement
- Escaping regular expressions
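A few of the features above can be sketched together as follows. This is only an illustration, assuming the re2 gem is installed: the patterns, input strings and variable names are made up for the example, and the LoadError guard is there purely so the snippet degrades gracefully when the gem is missing.

```ruby
begin
  require 're2'
rescue LoadError
  warn 're2 gem not installed; skipping the example'
end

if defined?(RE2::Regexp)
  # Pre-compile with an option, then inspect the compiled pattern.
  greeting  = RE2::Regexp.new('hello (\w+)', :case_sensitive => false)
  compiled  = greeting.ok?              # true when the pattern compiled
  sensitive = greeting.case_sensitive?  # false, as requested above

  # Ask for an exact number of submatches: 1 keeps the whole match
  # plus the first group only; 0 just reports whether there is a match.
  range      = RE2::Regexp.new('(\d+)-(\d+)')
  first_only = range.match("123-234", 1).to_a  # ["123-234", "123"]
  any_match  = range.match("123-234", 0)       # true

  # === lets a pattern be used directly in a case statement.
  label = case "HELLO world"
          when range    then :range
          when greeting then :greeting
          else :unknown
          end

  puts label  # greeting
end
```

Requesting fewer submatches than the pattern defines can be useful when only the leading groups are of interest, since re2 then has less to extract.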
- Thanks to Jason Woods who contributed the
original implementations of RE2::MatchData#begin and RE2::MatchData#end;
- Thanks to Stefano Rivera who first contributed C++11 support;
- Thanks to Stan Hu for reporting a bug with empty patterns and RE2::Regexp#scan;
- Thanks to Sebastian Reitenbach for reporting
the deprecation and removal of the
utf8 encoding option in re2;
- Thanks to Sergio Medina for reporting a bug when
using RE2::Scanner#scan with an invalid regular expression.
All issues and suggestions should go to GitHub Issues.