Xapit

Xapit (pronounced “zap it”) is a high level interface for working with a Xapian database.

Note: This project is early in development and the API is subject to change.

Install

If you haven’t already, first install Xapian and the Xapian Bindings for Ruby. wiki.github.com/Overbryd/acts_as_xapian/installation

To install as a Rails plugin, run this command.

script/plugin install git://github.com/ryanb/xapit.git

Note: this will soon also be available as a Ruby gem, but not yet.

Setup

Simply call “xapit” in the model and pass a block to define the indexed attributes.

class Article < ActiveRecord::Base
  xapit do |index|
    index.text :name, :content
    index.field :category_id
    index.facet :author_name, "Author"
    index.sortable :id, :category_id
  end
end

First we index “name” and “content” attributes for full text searching. The “category_id” field is indexed for :conditions searching. The “author_name” is indexed as a facet with “Author” being the display name of the facet. See the facets section below for details. Finally the “id” and “category_id” attributes are indexed as sortable attributes so they can be included in the :order option in a search.

Because the indexing happens in Ruby these attributes do no have to be database columns. They can be simple Ruby methods. For example, the “author_name” attribute mentioned above can be defined like this.

def author_name
  author.name
end

This way you can create a completely custom facet by simply defining your own method. Multiple facet options or field values per record are supported if you return an array.

def author_names
  authors.map(&:name) # => ["John", "Bob"]
end

Finally, you can pass any find options to the xapit method to determine what gets indexed or improve performance with eager loading or a different batch size.

xapit(:batch_size => 100, :include => :author, :conditions => { :visible => true })

You can specify a :weight option to give a text attribute more importance. This will cause search terms matching that attribute to have a higher rank. The default weight is 1. Decimal (0.5) weight values are not supported.

index.text :name, :weight => 10

Index

To perform the indexing, run the xapit:index rake task.

rake xapit:index

It can also be triggered through Ruby code using this command.

Xapit::Config.remove_database
Xapit.index_all

You can then perform a search on the model.

# perform a simple full text search
@articles = Article.search("phone")

# add pagination if you're using will_paginate
@articles = Article.search("phone", :per_page => 10, :page => params[:page])

# search based on indexed fields
@articles = Article.search("phone", :conditions => { :category_id => params[:category_id] })

# manually sort based on any number of indexed fields, sort defaults to most relevant
@articles = Article.search("phone", :order => [:category_id, :id], :descending => true)

# basic boolean matching is supported
@articles = Article.search("phone OR fax NOT email")

You can also search all indexed models through Xapit.search.

# search all indexed models
@records = Xapit.search("phone")

Results

Simply iterate through the returned set to display the results.

<% for article in @articles %>
  <%= article.name %>
  <%= article.xapit_relevance %>
<% end %>

The “xapit_relevance” holds a percentage (between 0 and 100) determining how relevant the given document is to the user’s search query.

Spelling

If the searched term isn’t found, but it is similar to another term then it will show up as a spelling suggestion.

<% if @articles.spelling_suggestion %>
  Did you mean <%= link_to h(@articles.spelling_suggestion), :overwrite_params => { :keywords => @articles.spelling_suggestion } %>?
<% end %>

Facets

Facets allow you to further filter the result set based on certain attributes.

<% for facet in @articles.facets %>
  <%= facet.name %>
  <% for option in facet.options %>
    <%= link_to option.name, :overwrite_params => { :facets => option } %>
    (<%= option.count %>)
  <% end %>
<% end %>

The to_param method is defined on option to return an identifier which will be passed through the URL. Use this in the search.

Article.search("phone", :facets => params[:facets])

You can also list the applied facets along with a remove link.

<% for option in @articles.applied_facet_options %>
  <%=h option.name %>
  <%= link_to "remove", :overwrite_params => { :facets => option } %>
<% end %>

Config

When installing Xapit as a Rails plugin, an initializer file is automatically created to setup. It looks like this.

Xapit::Config.setup(:database_path => "#{Rails.root}/db/xapiandb")

There are many other options you can pass into here. This is a more advanced configuration setting which changes the stemming language, disables spelling, and changes the indexer and parser to a classic variation. The classic ones use Xapian’s built-in term generator and query parser instead of the ones offered by Xapit.

Xapit::Config.setup(
  :database_path => "#{Rails.root}/db/external/xapiandb",
  :spelling => false,
  :stemming => "german",
  :indexer => ClassicIndexer,
  :query_parser => ClassicQueryParser
)

Outside of Rails

Xapit can be used outside of Rails too. You’ll just need to include the membership module:

class Product # not Active Record
  include Xapit::Membership

  xapit # ...
end

This class is expected to respond to “find_each” (Product.find_each) which is used to iterate through the records when indexing and “find” for fetching a record by “id”.

Bug Reports

If you have found a bug to report or a feature to request, please add it to the GitHub issue tracker if it is not there already.

github.com/ryanb/xapit/issues/

Development

This project can be found on github at the following URL.

github.com/ryanb/xapit

If you would like to contribute to this project, please fork the repository and send me a pull request.