RSolr
A Ruby client for Apache Solr. RSolr has been developed to be simple and extendable. It features transparent JRuby DirectSolrConnection support and a simple Hash-in, Hash-out architecture.
Installation:
gem sources -a http://gemcutter.org
sudo gem install rsolr
Related Resources & Projects
-
RSolr::Ext – an extension kit for RSolr
-
Sunspot – an awesome Solr DSL, built with RSolr
-
Blacklight – a next generation Library OPAC, built with RSolr
-
solr-ruby – the original Solr Ruby Gem
Simple usage:
require 'rubygems'
require 'rsolr'
solr = RSolr.connect :url=>'http://solrserver.com'
# send a request to /select
response = rsolr.select :q=>'*:*'
# send a request to a custom request handler; /catalog
response = rsolr.request '/catalog', :q=>'*:*'
# alternative to above:
response = rsolr.catalog :q=>'*:*'
To use a DirectSolrConnection (no http) in JRuby:
# "apache-solr" should be a path to a solr build.
Dir['apache-solr/dist/*.jar'].each{|jar|require jar}
Dir['apache-solr/lib/*.jar'].each{|jar|require jar}
opts = {:home_dir=>'/path/to/solr/home'}
# note: you'll have to close the direct connection yourself unless using a block.
solr = RSolr.direct_connect(opts)
solr.select :q=>'*:*'
solr.connection.close
# OR using a block for automatic connection closing:
RSolr.direct_connect opts do |solr|
solr.select :q=>'*:*'
end
In general, the direct connection is less than ideal in most applications. You’ll be missing out on Http caching, and it’ll be impossible to do distributed searches. The direct connection could possibly come in handy though, for quickly indexing large numbers of documents.
For more information about DirectSolrConnection, see the API.
Querying
Use the #select method to send requests to the /select handler:
response = solr.select({
:q=>'washington',
:start=>0,
:rows=>10
})
The params sent into the method are sent to Solr as-is. The one exception is if a value is an array. When an array is used, multiple parameters are generated for the Solr query. Example:
solr.select :q=>'roses', :fq=>['red', 'violet']
The above statement generates this Solr query:
?q=roses&fq=red&fq=violet
Use the #request method for a custom request handler path:
response = solr.request '/documents', :q=>'test'
A shortcut for the above example:
response = solr.documents :q=>'test'
Doc Pagination
RSolr is compatible with WillPaginate. To use pagination, call the “paginate” method:
response = solr.paginate current_page, per_page, :q=>'*:*'
You can also set a handler path:
response = solr.paginate current_page, per_page, '/music', :q=>'*:*'
Handler paths can also be set using a paginate_* method call like so:
response = solr.paginate_music current_page, per_page, :q=>'testing'
Pagination Responses
The response array from a paginate method has the following methods:
start, per_page, total, current_page, total_pages, previous_page, next_page, has_next?, has_previous?
For example:
result = solr.paginate 1, 2, :q=>'*:*'
result['response']['docs'].has_next?
To use with WillPaginate:
<%= will_paginate result['response']['docs'] %>
Updating Solr
Updating can be done using native Ruby structures. Hashes are used for single documents and arrays are used for a collection of documents (hashes). These structures get turned into simple XML “messages”. Raw XML strings can also be used.
Raw XML via #update
solr.update '</commit>'
solr.update '</optimize>'
Single document via #add
solr.add :id=>1, :price=>1.00
Multiple documents via #add
documents = [{:id=>1, :price=>1.00}, {:id=>2, :price=>10.50}]
solr.add documents
When adding, you can also supply “add” xml element attributes and/or a block for manipulating other “add” related elements (docs and fields) when using the #add method:
doc = {:id=>1, :price=>1.00}
add_attributes = {:allowDups=>false, :commitWithin=>10.0}
solr.add(doc, add_attributes) do |doc|
# boost each document
doc.attrs[:boost] = 1.5
# boost the price field:
doc.field_by_name(:price).attrs[:boost] = 2.0
end
Delete by id
solr.delete_by_id 1
or an array of ids
solr.delete_by_id [1, 2, 3, 4]
Delete by query:
solr.delete_by_query 'price:1.00'
Delete by array of queries
solr.delete_by_query ['price:1.00', 'price:10.00']
Commit & optimize shortcuts
solr.commit
solr.optimize
Response Formats
The default response format is Ruby. When the :wt param is set to :ruby, the response is eval’d resulting in a Hash. You can get a raw response by setting the :wt to “ruby” - notice, the string – not a symbol. RSolr will eval the Ruby string ONLY if the :wt value is :ruby. All other response formats are available as expected, :wt=>‘xml’ etc..
Evaluated Ruby (default)
solr.select(:wt=>:ruby) # notice :ruby is a Symbol
Raw Ruby
solr.select(:wt=>'ruby') # notice 'ruby' is a String
XML:
solr.select(:wt=>:xml)
JSON:
solr.select(:wt=>:json)
You can access the original request context (path, params, url etc.) by calling the #raw method:
response = solr.select :q=>'*:*'
response.raw[:status_code]
response.raw[:body]
response.raw[:url]
The raw is a hash that contains the generated params, url, path, post data, headers etc., very useful for debugging and testing.