Class: Document
- Inherits: Object
- Defined in: lib/picolena/templates/app/models/document.rb
Overview
The Document class retrieves information about any given document from the filesystem and from the index.
Instance Attribute Summary
- #complete_path ⇒ Object (readonly)
  Returns the value of attribute complete_path.
- #matching_content ⇒ Object
  Returns the value of attribute matching_content.
- #score ⇒ Object
  Returns the value of attribute score.
Class Method Summary
- .default_fields_for(complete_path) ⇒ Object
  Fields that are shared between every document.
Instance Method Summary
- #alias_path ⇒ Object
  End users do not always need to know where documents are stored internally.
- #basename ⇒ Object
  Returns the filename without its extension ("buildings.odt" => "buildings").
- #cached ⇒ Object
  Cache à la Google.
- #content ⇒ Object
  Retrieves content as it is now.
- #filename ⇒ Object
- #highlighted_cache(raw_query) ⇒ Object
- #initialize(path) ⇒ Document (constructor)
  A new instance of Document.
- #language ⇒ Object
  Returns language.
- #mtime ⇒ Object
- #pretty_date ⇒ Object
  Returns the last modification date before the document got indexed.
- #pretty_mtime ⇒ Object
- #probably_unique_id ⇒ Object
  Returns an id for this document.
- #supported? ⇒ Boolean
  Returns true iff some PlainTextExtractor has been defined to convert it to plain text.
Constructor Details
#initialize(path) ⇒ Document
Returns a new instance of Document.
# File 'lib/picolena/templates/app/models/document.rb', line 6
def initialize(path)
  # Ensure @complete_path is an absolute path.
  @complete_path=File.expand_path(path)
  validate_existence_of_file
  validate_in_indexed_directory
end
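The comment in the constructor says the stored path must be absolute. A minimal sketch of that normalization step, assuming it relies on Ruby's File.expand_path; the helper name absolute_path_for and the sample paths are hypothetical, and the two validation methods are not reproduced here:

```ruby
# Normalize a user-supplied path the way the constructor is assumed to:
# File.expand_path resolves "." and ".." segments and anchors relative
# paths to a base directory (the current directory by default).
def absolute_path_for(path, base_dir = Dir.pwd)
  File.expand_path(path, base_dir)
end

puts absolute_path_for("docs/../report.txt", "/srv/share")  # prints "/srv/share/report.txt"
```

Because every later lookup (index key, alias, hash id) is derived from complete_path, normalizing once in the constructor keeps two spellings of the same file from producing two different documents.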
Instance Attribute Details
#complete_path ⇒ Object (readonly)
Returns the value of attribute complete_path.
# File 'lib/picolena/templates/app/models/document.rb', line 3
def complete_path
  @complete_path
end
#matching_content ⇒ Object
Returns the value of attribute matching_content.
# File 'lib/picolena/templates/app/models/document.rb', line 4
def matching_content
  @matching_content
end
#score ⇒ Object
Returns the value of attribute score.
# File 'lib/picolena/templates/app/models/document.rb', line 4
def score
  @score
end
Class Method Details
.default_fields_for(complete_path) ⇒ Object
Fields that are shared between every document.
# File 'lib/picolena/templates/app/models/document.rb', line 95
def self.default_fields_for(complete_path)
  {
    :complete_path => complete_path,
    :probably_unique_id => complete_path.base26_hash,
    :filename => File.basename(complete_path),
    :basename => File.basename(complete_path, File.extname(complete_path)).gsub(/_/,' '),
    :filetype => File.extname(complete_path),
    :modified => File.mtime(complete_path).strftime("%Y%m%d%H%M%S")
  }
end
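To see what these shared fields look like for a concrete file, here is a simplified sketch using only core Ruby. The method name shared_fields_for is hypothetical, and :probably_unique_id is left out because String#base26_hash is a Picolena extension rather than part of Ruby itself:

```ruby
require "tempfile"

# Simplified sketch of the shared fields. :probably_unique_id is omitted
# because String#base26_hash is not available outside Picolena.
def shared_fields_for(complete_path)
  extension = File.extname(complete_path)
  {
    :complete_path => complete_path,
    :filename      => File.basename(complete_path),
    # Underscores become spaces so the basename is easier to search.
    :basename      => File.basename(complete_path, extension).gsub(/_/, ' '),
    :filetype      => extension,
    # Fixed-width timestamp: sorts chronologically even as a string.
    :modified      => File.mtime(complete_path).strftime("%Y%m%d%H%M%S")
  }
end

# File.mtime needs a real file, so the example uses a temporary one.
Tempfile.create(["annual_report", ".txt"]) do |file|
  fields = shared_fields_for(file.path)
  puts fields[:filetype]  # prints ".txt"
  puts fields[:modified]
end
```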
Instance Method Details
#alias_path ⇒ Object
End users do not always need to know where documents are stored internally. An alias path can be specified in config/indexed_directories.yml.
For example, with:
"/media/wiki_dump/" : "http://www.mycompany.com/wiki/"
the document
"/media/wiki_dump/organigram.odp"
will be displayed as:
"http://www.mycompany.com/wiki/organigram.odp"
# File 'lib/picolena/templates/app/models/document.rb', line 35
def alias_path
  original_dir=indexed_directory
  alias_dir=Picolena::IndexedDirectories[original_dir]
  dirname.sub(original_dir,alias_dir)
end
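The substitution described above can be sketched in isolation. This is a simplification under stated assumptions: the INDEXED_DIRECTORIES constant and the aliased helper are hypothetical stand-ins for Picolena::IndexedDirectories and #alias_path, and the sketch rewrites the full path, whereas the listed source applies the substitution to dirname:

```ruby
# Hypothetical mapping, mirroring the config/indexed_directories.yml example.
INDEXED_DIRECTORIES = {
  "/media/wiki_dump/" => "http://www.mycompany.com/wiki/"
}.freeze

# Replace the indexed directory that prefixes the path with its alias.
# Returns the path unchanged when no indexed directory matches.
def aliased(path, mapping = INDEXED_DIRECTORIES)
  original_dir, alias_dir = mapping.find { |dir, _| path.start_with?(dir) }
  return path unless original_dir
  path.sub(original_dir, alias_dir)
end

puts aliased("/media/wiki_dump/organigram.odp")
# prints "http://www.mycompany.com/wiki/organigram.odp"
```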
#basename ⇒ Object
Returns filename without extension
"buildings.odt" => "buildings"
# File 'lib/picolena/templates/app/models/document.rb', line 21
def basename
  filename.chomp(extname)
end
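The same computation can be tried outside the class with core File methods. In this sketch the helper name basename_without_extension and the sample path are hypothetical:

```ruby
# Strip the extension from a file name, mirroring filename.chomp(extname).
def basename_without_extension(complete_path)
  filename = File.basename(complete_path)  # e.g. "buildings.odt"
  extname  = File.extname(complete_path)   # e.g. ".odt"
  filename.chomp(extname)
end

puts basename_without_extension("/media/docs/buildings.odt")  # prints "buildings"
```

Only the last extension is removed, so "archive.tar.gz" becomes "archive.tar".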
#cached ⇒ Object
Cache à la Google. Returns content as it was at the time it was indexed.
# File 'lib/picolena/templates/app/models/document.rb', line 63
def cached
  from_index[:content]
end
#content ⇒ Object
Retrieves content as it is now.
# File 'lib/picolena/templates/app/models/document.rb', line 57
def content
  PlainTextExtractor.extract_content_from(complete_path)
end
#filename ⇒ Object
# File 'lib/picolena/templates/app/models/document.rb', line 17
alias_method :filename, :basename
#highlighted_cache(raw_query) ⇒ Object
# File 'lib/picolena/templates/app/models/document.rb', line 67
def highlighted_cache(raw_query)
  #TODO: Report to Ferret. Highlight should accept :key and not only :doc_id.
  Indexer.index.highlight(Query.extract_from(raw_query), doc_id,
                          :field => :content, :excerpt_length => :all,
                          :pre_tag => "<<", :post_tag => ">>"
  ).first
end
#language ⇒ Object
Returns language.
# File 'lib/picolena/templates/app/models/document.rb', line 90
def language
  from_index[:language]
end
#mtime ⇒ Object
# File 'lib/picolena/templates/app/models/document.rb', line 85
def mtime
  from_index[:modified].to_i
end
#pretty_date ⇒ Object
Returns the last modification date before the document got indexed. Useful to know how old a document is, and to which version the cache corresponds.
# File 'lib/picolena/templates/app/models/document.rb', line 77
def pretty_date
  from_index[:modified].sub(/(\d{4})(\d{2})(\d{2})\d{6}/,'\1-\2-\3')
end
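The regular expression works on the 14-digit "YYYYMMDDHHMMSS" string stored in the :modified field. A standalone sketch of the same substitution, where the function name format_index_date and the sample timestamp are hypothetical:

```ruby
# Keep the date part of a "YYYYMMDDHHMMSS" timestamp and drop the time.
# The three capture groups are year, month and day; \d{6} swallows HHMMSS.
def format_index_date(modified)
  modified.sub(/(\d{4})(\d{2})(\d{2})\d{6}/, '\1-\2-\3')
end

puts format_index_date("20080314153045")  # prints "2008-03-14"
```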
#pretty_mtime ⇒ Object
# File 'lib/picolena/templates/app/models/document.rb', line 81
def pretty_mtime
  from_index[:modified].sub(/(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})/,'\1-\2-\3 \4:\5:\6')
end
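#pretty_mtime and #mtime read the same indexed :modified string in two ways: one as a human-readable timestamp, the other as an integer. The following sketch traces one value through both, assuming the string was produced with the strftime format shown in .default_fields_for; the helper names and the sample time are hypothetical:

```ruby
# Build the index representation of a modification time.
def index_timestamp(time)
  time.strftime("%Y%m%d%H%M%S")
end

# Human-readable form, mirroring the substitution in #pretty_mtime.
def format_index_mtime(modified)
  modified.sub(/(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})/, '\1-\2-\3 \4:\5:\6')
end

modified = index_timestamp(Time.utc(2008, 3, 14, 15, 30, 45))
puts modified                      # prints "20080314153045"
puts format_index_mtime(modified)  # prints "2008-03-14 15:30:45"
puts modified.to_i                 # prints 20080314153045
```

The integer is not a Unix epoch value, but because the fields run from most to least significant, comparing two such integers orders documents chronologically.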
#probably_unique_id ⇒ Object
Returns an id for this document. Controllers use this id to build short URLs. Since it is a base26 hash of the absolute filename, it can only be “probably unique”. For a very large number of indexed documents, it would be wise to increase HashLength in config/custom/picolena.rb
# File 'lib/picolena/templates/app/models/document.rb', line 45
def probably_unique_id
  @probably_unique_id||=complete_path.base26_hash
end
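The source of String#base26_hash is not shown on this page, so the following is only an illustration of what a base26 hash of a path could look like, not Picolena's actual implementation. It assumes an MD5 digest re-encoded with the letters a to z and truncated to a fixed length; the function name base26_id and the length parameter (standing in for HashLength) are hypothetical:

```ruby
require "digest/md5"

# Hypothetical base26 id: hash the string, re-encode the digest using only
# the letters a-z, and keep the first `length` characters.
def base26_id(string, length = 10)
  number = Digest::MD5.hexdigest(string).to_i(16)
  # Integer#to_s(26) uses the digits 0-9 and a-p; map them onto a-z.
  number.to_s(26).tr('0-9a-p', 'a-z')[0, length]
end

id = base26_id("/media/wiki_dump/organigram.odp")
puts id         # ten lowercase letters
puts id.length  # prints 10
```

With 26 possible letters per position, an id of length n can take 26**n values, which is why a longer hash lowers the chance of two paths colliding.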
#supported? ⇒ Boolean
# File 'lib/picolena/templates/app/models/document.rb', line 52
def supported?
  PlainTextExtractor.supported_extensions.include?(self.ext_as_sym)
end