MakeTextSearch
MakeTextSearch is a tool that lets you run full-text searches easily, using the engine built into your RDBMS.
Tools like Sphinx or Lucene are very powerful and fast, but they take extra effort to configure and maintain because they run outside the RDBMS. Some RDBMSs, such as PostgreSQL and MySQL, have their own full-text search engine, so less time goes into configuration and maintenance.
This first version supports PostgreSQL. Support for MySQL and other databases is planned for the near future. If the database has no full-text search engine, MakeTextSearch will use an equivalent written in plain SQL.
MakeTextSearch works with Rails 3.
Installation
Add this line to your Gemfile:
gem "make-text-search"
After running bundle install, generate the migration that creates the documents table:
rails generate text_search:migration
Usage
In each model where you want to run full-text searches, declare the indexed fields with has_text_search:
class Post < ActiveRecord::Base
  has_text_search :title, :content
end
The fields added to the index can be virtual, that is, instance methods rather than database columns.
class Post < ActiveRecord::Base
  belongs_to :user

  # Add the user name to the index
  def user_name
    user.try :name
  end

  has_text_search :title, :content, :user_name
end
Filters
The content added to the index can be filtered. Right now there are two filters: :substrings and :strip_html.
class Post < ActiveRecord::Base
  has_text_search :title, :filter => :substrings
  has_text_search :content, :intro, :filter => [:strip_html, :substrings]
end
You can apply several filters by passing an array. The order matters: if you use both :substrings and :strip_html, :strip_html should come first.
:substrings lets you search inside words. For example, the word knowledge can be found with owled if you filter the content with :substrings.
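One common way to get this behaviour is to index every suffix of each word, so that a prefix match on a suffix acts like a substring match. The sketch below illustrates that idea in plain Ruby. It is not taken from MakeTextSearch; the helper name word_suffixes and the min_length parameter are assumptions made for the example.

```ruby
# Hypothetical helper, not part of MakeTextSearch: returns every suffix of
# each word, so a prefix match on a suffix behaves like a substring match.
def word_suffixes(text, min_length: 3)
  text.downcase.scan(/[[:alnum:]]+/).flat_map do |word|
    # Words shorter than min_length are kept whole.
    last_start = [word.length - min_length, 0].max
    (0..last_start).map { |start| word[start..-1] }
  end.uniq
end

suffixes = word_suffixes("knowledge")
# "owled" is a prefix of the suffix "owledge", so a prefix search finds it.
puts suffixes.any? { |suffix| suffix.start_with?("owled") }
```

The trade-off of this approach is index size: a word of n characters produces up to n entries, which is why a minimum suffix length is used here.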
:strip_html removes the HTML tags and translates HTML entities to their UTF-8 equivalents:
Ir a <a href="http://www.google.es">Google Espa&ntilde;a</a>
will be
Ir a Google España
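As a rough illustration of what such a filter does, here is a minimal plain-Ruby sketch. It is not the gem's implementation: the method name strip_html_sketch and the small ENTITIES table are assumptions, and a real filter would need a complete entity list and a proper HTML parser.

```ruby
# Hypothetical table covering only the entities this sketch needs; a real
# filter would use a complete entity list.
ENTITIES = {
  "amp" => "&", "lt" => "<", "gt" => ">", "quot" => '"',
  "ntilde" => "ñ", "aacute" => "á"
}.freeze

def strip_html_sketch(html)
  # Drop anything that looks like a tag.
  without_tags = html.gsub(/<[^>]*>/, "")
  # Replace numeric (&#241;) and named (&ntilde;) entities in a single pass.
  without_tags.gsub(/&(#\d+|[a-zA-Z]+);/) do |entity|
    name = Regexp.last_match(1)
    if name.start_with?("#")
      [name[1..-1].to_i].pack("U")  # numeric entity -> UTF-8 character
    else
      ENTITIES.fetch(name, entity)  # unknown entities are left untouched
    end
  end
end

puts strip_html_sketch('Ir a <a href="http://www.google.es">Google Espa&ntilde;a</a>')
# prints "Ir a Google España"
```

Running this kind of cleanup before generating substrings keeps tag names and attribute values out of the index, which is presumably why :strip_html should come first.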
Language
Documents can be parsed using a specific language. You can set the default language with config.make_text_search.default_language. The initial value is nil, which means that documents are parsed in a language-agnostic way.
To set the default language, add this line to config/application.rb:
config.make_text_search.default_language = "spanish"
If you want a different language for each record, implement the text_search_language instance method. For example:
class Post < ActiveRecord::Base
  has_text_search :title, :filter => :substrings
  has_text_search :content, :intro, :filter => [:strip_html, :substrings]

  # Pick the parsing language from the record's locale
  def text_search_language
    case locale
    when "es" then "spanish"
    when "en" then "english"
    when "de" then "german"
    when "it" then "italian"
    else
      Rails.application.config.make_text_search.default_language
    end
  end
end
You can list the languages available on your PostgreSQL server with:
select * from pg_ts_dict;
Search
To perform a search, use the #search_text scope:
Post.search_text("foo")
Post.published.search_text("foo & bar").paginate(:page => params[:page])
The query language is the same as the one used by PostgreSQL. See www.postgresql.org/docs/8.4/static/datatype-textsearch.html#DATATYPE-TSQUERY
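Because the query uses tsquery syntax, free-form user input such as "foo bar" may need operators inserted before it is passed to #search_text. The sketch below shows one way to do that in plain Ruby. The helper build_and_query is hypothetical and not part of MakeTextSearch; it simply requires every term.

```ruby
# Hypothetical helper, not part of MakeTextSearch: turns free-form user
# input into a tsquery string in which every term is required.
def build_and_query(input)
  # Keep only runs of letters and digits, so stray operators such as
  # "&", "|", "!" or parentheses typed by the user are dropped.
  terms = input.scan(/[[:alnum:]]+/)
  terms.join(" & ")
end

puts build_and_query("foo bar")
# prints "foo & bar"
```

With a helper like this, a controller could call something along the lines of Post.search_text(build_and_query(params[:q])), where params[:q] is an assumed parameter name. An empty result means the input had no usable terms, so the caller should skip the search in that case.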
Resources
TODO
- Query builder. Add & and | operators
- Option :language in #search_text
- In PostgreSQL, use ts_headline and ts_rank
- RDoc-ize methods