Sunspot Cell (gem)

Note by Zheileman

This gem adds Cell support (for indexing rich documents like pdf, docs, html, etc…) to Sunspot (developed against Sunspot 1.3.0). Support Paperclip and S3 Storage

The code is based on the patch included here: outoftime.lighthouseapp.com/projects/20339/tickets/98-solr-cell

Requirements

  • Sunspot gem installed (>= 1.3.0)

  • Solr Cell libraries (dist/apache-solr-cell-1.4.X.jar and contrib/extraction/lib/*.jar from the standard Solr distribution) placed in the /solr/lib directory as created by the Sunspot gem, in development environment. Your production setup might vary.

  • Adjustments to the Solr schema.xml:

<fieldType name=“ignored” stored=“false” indexed=“false” multiValued=“true” class=“solr.StrField” />

and

<dynamicField name=“*_attachment” stored=“true” type=“text” multiValued=“true” indexed=“true”/> <dynamicField name=“ignored_*” type=“ignored”/>

Install Plugin

Add sunspot gem and sunspot_cell to Gemfile:

gem 'sunspot_rails', '~> 1.3.0'
gem 'sunspot_cell', :git => 'git://github.com/zheileman/sunspot_cell.git'

Usage

class Doc
  searchable do
     text :title
     attachment :file
   end
end

Paperclip & S3 Storage

require 'open-uri'

class Doc
  searchable do
     text :title
     attachment :attached_file
   end

private
  def attached_file
    URI.parse(remote_full_url)
  end
end