Class: DocStorage::SimpleDocument
- Inherits: Object
- Defined in: lib/doc_storage/simple_document.rb
Overview
The SimpleDocument class represents a simple RFC 822-like document, suitable for storing text associated with some metadata (e.g. a blog article with a title and a publication date). The SimpleDocument class lets you create a document programmatically, parse it from a file, manipulate its structure, and save it to a file.
Each document consists of headers and a body. The headers are a dictionary mapping string names to string values. The body is free-form text. Header names can contain only alphanumeric characters and hyphens (“-”), and they are case-sensitive. Header values can contain any text.
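The header name rule above can be checked with a short regular expression. The following is a minimal sketch, not part of DocStorage: the constant `HEADER_NAME` and the method `valid_header_name?` are hypothetical names, and the sketch anchors with `\z` as an assumption so that a trailing newline is also rejected.

```ruby
# Hypothetical helper, not part of the DocStorage API.
# Accepts only names made of ASCII letters, digits and hyphens.
HEADER_NAME = /\A[a-zA-Z0-9-]+\z/

def valid_header_name?(name)
  name.is_a?(String) && HEADER_NAME.match?(name)
end

valid_header_name?("Content-Type")  # => true
valid_header_name?("content type")  # => false (contains a space)
valid_header_name?("")              # => false (empty name)

# Names are case-sensitive because headers are plain Hash keys.
headers = { "Title" => "My blog article" }
headers["title"]                    # => nil
```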
Document Format
In serialized form, a simple document looks like this:
Title: My blog article
Datetime: 2009-11-01 18:03:27

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc vel lorem
massa. Sed blandit orci id leo blandit ut fermentum lacus ullamcorper.
Suspendisse metus sapien, consectetur vitae imperdiet vel, ornare a metus.
In imperdiet euismod mi, nec volutpat lorem porta id.
Headers come first, each on its own line. Header names are separated from values by a colon (“:”) and any amount of whitespace; trailing whitespace after values is ignored. Values containing special characters (especially newlines or leading/trailing whitespace) must be enclosed in single or double quotes. Quoted values can contain the usual C-like escape sequences (e.g. “\n”, “\xFF”, etc.). Duplicate headers are allowed, with a later value overwriting the earlier one. Other than that, the order of headers does not matter. The body is separated from the headers by an empty line.
Documents without any headers are perfectly legal and so are documents with an empty body. However, the separating line must always be present. This means that an empty file is not a valid document, but a file containing a single newline is.
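To make the format rules concrete, here is a simplified parsing sketch. It is not the library's parser: `parse_simple` is a hypothetical name, it handles only unquoted header values, and it raises `ArgumentError` rather than the `SyntaxError` the library documents.

```ruby
require "stringio"

# Hypothetical, simplified parser for illustration only.
# Handles unquoted header values; quoted values and escapes are not covered.
def parse_simple(text)
  io = StringIO.new(text)
  headers = {}
  terminated = false

  io.each_line do |line|
    line = line.chomp
    if line.empty?
      terminated = true          # the empty line ends the header section
      break
    end
    match = /\A([a-zA-Z0-9-]+):\s*(.*?)\s*\z/.match(line)
    raise ArgumentError, "Invalid header: #{line.inspect}" unless match
    headers[match[1]] = match[2] # a later duplicate overwrites an earlier one
  end

  raise ArgumentError, "Headers are not terminated" unless terminated
  [headers, io.read]             # the rest of the input is the body
end

headers, body = parse_simple("Title: First\nTitle: Second\n\nLorem ipsum\n")
headers  # => { "Title" => "Second" }
body     # => "Lorem ipsum\n"
```

With this sketch, `parse_simple("\n")` returns empty headers and an empty body, while `parse_simple("")` raises, matching the rule that the separating line must always be present.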
Example Usage
require "lib/doc_storage"
# Create a new document with headers and body
document = DocStorage::SimpleDocument.new(
  {
    "Title"    => "Finishing the documentation",
    "Priority" => "urgent"
  },
  "We should finish the documentation ASAP."
)
# Load from a file
document = DocStorage::SimpleDocument.load_file("examples/simple.txt")
# Document manipulation
document.headers["Tags"] = "example"
document.body += "Nulla mi dui, pellentesque et accumsan vitae, mattis et velit."
# Save the modified document
document.save_file("examples/simple_modified.txt")
Instance Attribute Summary
- #body ⇒ Object
  Document body (String).
- #headers ⇒ Object
  Document headers (Hash).
Class Method Summary
- .load(source, boundary = nil) ⇒ Object
  Loads a simple document from its serialized form and returns a new SimpleDocument instance.
- .load_file(file, boundary = nil) ⇒ Object
  Loads a simple document from a file and returns a new SimpleDocument instance.
Instance Method Summary
- #==(other) ⇒ Object
  Tests if two documents are equal.
- #initialize(headers, body) ⇒ SimpleDocument (constructor)
  Creates a new SimpleDocument with the given headers and body.
- #save(io) ⇒ Object
  Saves this document to an IO-like object.
- #save_file(file) ⇒ Object
  Saves this document to a file.
- #to_s ⇒ Object
  Returns a string representation of this document.
Constructor Details
#initialize(headers, body) ⇒ SimpleDocument
Creates a new SimpleDocument with the given headers and body.
# File 'lib/doc_storage/simple_document.rb', line 213
def initialize(headers, body)
  @headers, @body = headers, body
end
Instance Attribute Details
#body ⇒ Object
Document body (String).
# File 'lib/doc_storage/simple_document.rb', line 65
def body
  @body
end
#headers ⇒ Object
Document headers (Hash).
# File 'lib/doc_storage/simple_document.rb', line 63
def headers
  @headers
end
Class Method Details
.load(source, boundary = nil) ⇒ Object
Loads a simple document from its serialized form and returns a new SimpleDocument instance.
The source can be either an IO-like object or a String. In the latter case, it is assumed that the string contains a serialized document (not a file name).
The boundary determines how the end of the document body is detected:
- If boundary is nil, the document is read until the end of file.
- If boundary is :detect, the document is read until the end of file or until a line containing only a boundary string is read. The boundary string is the value of the “Boundary” header prefixed with “--”.
- Otherwise, it is assumed that boundary contains a boundary string without the “--” prefix (the “Boundary” header value is ignored for the purpose of boundary detection). The document is read until the end of file or until a line containing only the boundary string is read.
The boundary parameter is provided mainly for parsing parts of multipart documents (see the MultipartDocument class documentation) and usually should not be used.
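The boundary handling described above can be pictured with a small sketch. It is an illustration under assumptions, not the library's code: `read_body` is a hypothetical helper, the boundary prefix is assumed to be two hyphens, and the sketch keeps the newline that precedes the boundary line, which the text above does not specify.

```ruby
require "stringio"

# Hypothetical helper: collects body lines until EOF, or until a line that
# consists only of "--" followed by the boundary string (assumed prefix).
def read_body(io, boundary = nil)
  terminator = boundary && "--#{boundary}"
  body = String.new
  io.each_line do |line|
    break if terminator && line.chomp == terminator
    body << line
  end
  body
end

input = "first part\n--xyz\nsecond part\n"
read_body(StringIO.new(input), "xyz")  # => "first part\n"
read_body(StringIO.new(input))         # => "first part\n--xyz\nsecond part\n"
```

The :detect mode would additionally take the boundary string from the already parsed “Boundary” header before reading the body; that step is omitted here.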
If any syntax error occurs, a SyntaxError exception is raised. This can happen when an invalid header is encountered, when headers are not terminated (no empty line separating headers and body is found before the end of file), or when no “Boundary” header is found while detecting a boundary.
See the SimpleDocument class documentation for a detailed document format description.
# File 'lib/doc_storage/simple_document.rb', line 193
def load(source, boundary = nil)
  load_from_io(
    source.is_a?(String) ? StringIO.new(source) : source,
    boundary
  )
end
.load_file(file, boundary = nil) ⇒ Object
Loads a simple document from a file and returns a new SimpleDocument instance. This method is just a thin wrapper around SimpleDocument.load; see its documentation for a description of the behavior and parameters of this method.
See the SimpleDocument class documentation for a detailed document format description.
# File 'lib/doc_storage/simple_document.rb', line 207
def load_file(file, boundary = nil)
  File.open(file, "r") { |f| load(f, boundary) }
end
Instance Method Details
#==(other) ⇒ Object
Tests if two documents are equal, i.e. whether they have the same class and equal headers and body (in the == sense).
# File 'lib/doc_storage/simple_document.rb', line 219
def ==(other)
  other.instance_of?(self.class) &&
    @headers == other.headers &&
    @body == other.body
end
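One consequence of the `instance_of?` check is that a subclass instance never compares equal to a base-class instance, even with identical content. The sketch below shows this with stand-in classes (`Doc` and `SpecialDoc` are hypothetical and only mirror the comparison above).

```ruby
# Stand-in classes for illustration; not the DocStorage classes.
class Doc
  attr_reader :headers, :body

  def initialize(headers, body)
    @headers, @body = headers, body
  end

  # Same comparison as shown above: exact class match plus equal content.
  def ==(other)
    other.instance_of?(self.class) &&
      @headers == other.headers &&
      @body == other.body
  end
end

class SpecialDoc < Doc; end

a = Doc.new({ "Title" => "x" }, "body")
b = Doc.new({ "Title" => "x" }, "body")
c = SpecialDoc.new({ "Title" => "x" }, "body")

a == b  # => true  (same class, equal headers and body)
a == c  # => false (c is an instance of a subclass)
c == a  # => false (a is not an instance of SpecialDoc)
```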
#save(io) ⇒ Object
Saves this document to an IO-like object. The result is in the format described in the SimpleDocument class documentation.
Raises SyntaxError if any document header has an invalid name.
# File 'lib/doc_storage/simple_document.rb', line 252
def save(io)
  io.write(to_s)
end
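Because `save` only calls `write` on its argument, any object that responds to `write` should work, for example a `StringIO` buffer. The sketch below uses a stand-in class (`FakeDocument` is hypothetical) so that it runs without the library.

```ruby
require "stringio"

# Stand-in for a document; only to_s and save matter for this illustration.
class FakeDocument
  def to_s
    "Title: Example\n\nBody text.\n"
  end

  # Same shape as the save method shown above.
  def save(io)
    io.write(to_s)
  end
end

buffer = StringIO.new
FakeDocument.new.save(buffer)
buffer.string  # => "Title: Example\n\nBody text.\n"
```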
#save_file(file) ⇒ Object
Saves this document to a file. The result is in the format described in the SimpleDocument class documentation.
Raises SyntaxError if any document header has an invalid name.
# File 'lib/doc_storage/simple_document.rb', line 260
def save_file(file)
  File.open(file, "w") { |f| save(f) }
end
#to_s ⇒ Object
Returns a string representation of this document. The result is in the format described in the SimpleDocument class documentation.
Raises SyntaxError if any document header has an invalid name.
# File 'lib/doc_storage/simple_document.rb', line 229
def to_s
  @headers.keys.each do |name|
    if name !~ /\A[a-zA-Z0-9-]+\Z/
      raise SyntaxError, "Invalid header name: #{name.inspect}."
    end
  end

  serialized_headers = @headers.keys.sort.inject("") do |acc, key|
    value_is_simple = @headers[key] !~ /\A\s+/ &&
      @headers[key] !~ /\s+\Z/ &&
      @headers[key] !~ /[\n\r]/

    value = value_is_simple ? @headers[key] : @headers[key].inspect

    acc + "#{key}: #{value}\n"
  end

  serialized_headers + "\n" + @body
end
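The quoting decision in `to_s` can be tried in isolation. The sketch below copies the three regular expressions from the source above into a hypothetical helper (`serialize_header_value` is not a library method) to show which values are written verbatim and which are written through `String#inspect`.

```ruby
# Hypothetical helper mirroring the value_is_simple test from to_s.
def serialize_header_value(value)
  simple = value !~ /\A\s+/ &&   # no leading whitespace
    value !~ /\s+\Z/ &&          # no trailing whitespace
    value !~ /[\n\r]/            # no line breaks
  simple ? value : value.inspect
end

serialize_header_value("urgent")      # => "urgent"
serialize_header_value(" padded ")    # => "\" padded \""
serialize_header_value("two\nlines")  # => "\"two\\nlines\""
```

Written to a file, the last two values appear as `" padded "` and `"two\nlines"`, that is, double-quoted with the newline stored as an escape sequence.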