Class: RangesIO
- Inherits: Object
  - Object
  - RangesIO
- Defined in: lib/ole/ranges_io.rb
Overview
Introduction
RangesIO is a basic class for wrapping another IO object, allowing you to arbitrarily reorder slices of the input file by providing a list of ranges. It was intended as an initial measure to curb the inefficiency of the Dirent#data method, which reads all of a file’s data in one hit and offers no way to stream it.
This class encapsulates the ranges (corresponding to big or small blocks) of any ole file and thus allows reading/writing directly to the source bytes, in a streamed fashion (so just getting 16 bytes doesn’t read the whole thing).
In the simplest case it can be used with a single range to provide a limited io to a section of a file.
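As a rough illustration of the idea only (this is not the class's implementation), the sketch below uses plain StringIO from the standard library. The helper name `read_ranges` is hypothetical, and ranges are written as [offset, length] pairs.

```ruby
require 'stringio'

# Hypothetical helper, not part of the library: read each [offset, length]
# pair from +io+ in the order given and join the pieces.
def read_ranges(io, ranges)
  ranges.map do |offset, length|
    io.seek(offset)
    io.read(length).to_s
  end.join
end

source = StringIO.new('0123456789')

# Two slices of the source, presented in reverse order.
reordered = read_ranges(source, [[5, 5], [0, 5]])   # "5678901234"

# A single range gives a limited view onto one section of the source.
window = read_ranges(source, [[2, 3]])              # "234"
```

The point of the class is that the caller sees one continuous stream, while the bytes may live in scattered places in the parent io.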
Limitations
- No buffering, by design at the moment. Intended for large reads.
TODO
On further reflection, this class is something of a joining/optimization of two separate IO classes: a SubfileIO, for providing access to a range within a File as a separate IO object, and a ConcatIO, allowing the presentation of a bunch of io objects as a single unified whole.
I will need such a ConcatIO if I’m to provide Mime#to_io, a method that will convert a whole mime message into an IO stream that can be read from. It will just be the concatenation of a series of IO objects: StringIOs corresponding to headers and boundaries, and SubfileIO objects coming from the original message proper, or a RangesIO as provided by Attachment#data, which will then get wrapped by Mime in a Base64IO or similar, to get encoded on-the-fly. Thus the attachment, in its plain or encoded form, and the message as a whole never exist as a single string in memory, as they do now. This is a fair bit of work to achieve, but generally useful I believe.
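To make the ConcatIO idea concrete, here is a minimal read-only sketch. No such class exists in the documented code; the class name is taken from the note above and every detail of the implementation is an assumption.

```ruby
require 'stringio'

# Hypothetical ConcatIO: presents several io objects as one read-only stream.
class ConcatIO
  def initialize(*ios)
    @ios = ios
    @index = 0   # index of the part currently being read
  end

  # Read up to +limit+ bytes, or everything that is left if limit is nil.
  def read(limit = nil)
    data = +''
    while @index < @ios.length
      want = limit && limit - data.length
      break if want == 0
      chunk = @ios[@index].read(want)
      if chunk.nil? || chunk.empty?
        @index += 1   # current part exhausted, move on to the next one
      else
        data << chunk
      end
    end
    data
  end
end

message = ConcatIO.new(StringIO.new('abc'), StringIO.new('def'))
first = message.read(4)   # "abcd"
rest  = message.read      # "ef"
```

Each part is only read when the caller asks for it, so the concatenated whole never has to exist as one string.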
This class isn’t ole specific, maybe move it to my general ruby stream project.
Direct Known Subclasses
Instance Attribute Summary
-
#io ⇒ Object
readonly
Returns the value of attribute io.
-
#mode ⇒ Object
readonly
Returns the value of attribute mode.
-
#pos ⇒ Object
(also: #tell)
Returns the value of attribute pos.
-
#ranges ⇒ Object
readonly
Returns the value of attribute ranges.
-
#size ⇒ Object
Returns the value of attribute size.
Class Method Summary
-
.open(*args, &block) ⇒ Object
Adds a block form: yields the new object and closes it when the block returns.
Instance Method Summary
- #close ⇒ Object
- #eof? ⇒ Boolean
-
#gets ⇒ Object
(also: #readline)
I could wrap it in a buffered io stream that provides gets and handles pos and truncate appropriately.
-
#initialize(io, mode = 'r', params = {}) ⇒ RangesIO
constructor
io: the parent io object that we are wrapping.
- #inspect ⇒ Object
-
#offset_and_size(pos) ⇒ Object
returns the [offset, size] pair in order to read/write at pos (like a partial range), and its index.
-
#read(limit = nil) ⇒ Object
read bytes from file, to a maximum of limit, or all available if unspecified.
-
#truncate(size) ⇒ Object
you may override this call to update @ranges and @size, if applicable.
- #write(data) ⇒ Object
Constructor Details
#initialize(io, mode = 'r', params = {}) ⇒ RangesIO
- io: the parent io object that we are wrapping.
- mode: the mode to use
- params: hash of params.
  - :ranges - byte offsets, either:
    - an array of ranges [1..2, 4..5, 6..8] or
    - an array of arrays, where the second is length [[1, 1], [4, 1], [6, 2]] for the above (think the way String indexing works)
  - :close_parent - boolean to close parent when this object is closed
NOTE: the ranges can overlap.
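The two accepted forms for :ranges can be related with a small standalone sketch. The helper name `normalize_ranges` is hypothetical; its body copies the conversion expression from the constructor source, where a Range r becomes [r.begin, r.end - r.begin].

```ruby
# Hypothetical helper wrapping the conversion used by the constructor:
# Range objects become [offset, length] pairs, arrays pass through unchanged.
def normalize_ranges(ranges)
  ranges.map { |r| Range === r ? [r.begin, r.end - r.begin] : r }
end

pairs = normalize_ranges([1..2, 4..5, 6..8])
# pairs is [[1, 1], [4, 1], [6, 2]]

# The logical size of the stream is the sum of the lengths.
total = pairs.inject(0) { |sum, (_pos, len)| sum + len }
# total is 4
```

Note that with this conversion the end of each Range acts as an exclusive bound: 1..2 covers a single byte at offset 1.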
# File 'lib/ole/ranges_io.rb', line 54
def initialize io, mode='r', params={}
  mode, params = 'r', mode if Hash === mode
  ranges = params[:ranges]
  @params = {:close_parent => false}.merge params
  @mode = IO::Mode.new mode
  @io = io
  # convert ranges to arrays. check for negative ranges?
  ranges ||= [0, io.size]
  @ranges = ranges.map { |r| Range === r ? [r.begin, r.end - r.begin] : r }
  # calculate size
  @size = @ranges.inject(0) { |total, (pos, len)| total + len }
  # initial position in the file
  @pos = 0

  # handle some mode flags
  truncate 0 if @mode.truncate?
  seek size if @mode.append?
end
Instance Attribute Details
#io ⇒ Object (readonly)
Returns the value of attribute io.
# File 'lib/ole/ranges_io.rb', line 43
def io
  @io
end
#mode ⇒ Object (readonly)
Returns the value of attribute mode.
# File 'lib/ole/ranges_io.rb', line 43
def mode
  @mode
end
#pos ⇒ Object Also known as: tell
Returns the value of attribute pos.
# File 'lib/ole/ranges_io.rb', line 43
def pos
  @pos
end
#ranges ⇒ Object (readonly)
Returns the value of attribute ranges.
# File 'lib/ole/ranges_io.rb', line 43
def ranges
  @ranges
end
#size ⇒ Object
Returns the value of attribute size.
# File 'lib/ole/ranges_io.rb', line 43
def size
  @size
end
Class Method Details
.open(*args, &block) ⇒ Object
Adds a block form: yields the new object and closes it when the block returns. TODO: add a test for this.
# File 'lib/ole/ranges_io.rb', line 78
def self.open(*args, &block)
  ranges_io = new(*args)
  if block_given?
    begin;  yield ranges_io
    ensure; ranges_io.close
    end
  else
    ranges_io
  end
end
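The block form follows the same convention as File.open. The sketch below shows the pattern on a hypothetical stand-in class named `Resource`, so that it runs without the library; only the shape of `open` is taken from the source above.

```ruby
# Hypothetical stand-in class, used only to show the open-with-block pattern.
class Resource
  attr_reader :closed

  def initialize
    @closed = false
  end

  def close
    @closed = true
  end

  # Without a block, return the new object. With a block, yield it and
  # close it afterwards, even if the block raises.
  def self.open(*args)
    resource = new(*args)
    return resource unless block_given?
    begin
      yield resource
    ensure
      resource.close
    end
  end
end

inside = nil
result = Resource.open { |r| inside = r; :done }   # result is :done
plain  = Resource.open                             # caller must close this one
```

In the block form the return value is whatever the block returns, and the object is already closed by the time the call returns.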
Instance Method Details
#close ⇒ Object
# File 'lib/ole/ranges_io.rb', line 105
def close
  @io.close if @params[:close_parent]
end
#eof? ⇒ Boolean
# File 'lib/ole/ranges_io.rb', line 124
def eof?
  @pos == @size
end
#gets ⇒ Object Also known as: readline
I could wrap it in a buffered io stream that provides gets and handles pos and truncate appropriately. Mostly added just to pass the tests. FIXME
# File 'lib/ole/ranges_io.rb', line 200
def gets
  s = read 1024
  i = s.index "\n"
  @pos -= s.length - (i+1)
  s[0..i]
end
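The method reads a fixed-size chunk and then moves the position back to just after the first newline. The sketch below shows the same read-then-rewind idea on a plain StringIO, with two extra guards that are assumptions of this sketch rather than behaviour of the library: it returns nil at end of input and returns the whole chunk when no newline is found. The helper name `chunked_gets` is hypothetical.

```ruby
require 'stringio'

# Hypothetical helper showing the read-then-rewind idea on a plain io.
def chunked_gets(io, chunk_size = 1024)
  start = io.pos
  s = io.read(chunk_size)
  return nil if s.nil? || s.empty?   # nothing left to read
  i = s.index("\n")
  # no newline in this chunk: return what was read (a simplification)
  return s if i.nil?
  io.seek(start + i + 1)             # rewind to just after the newline
  s[0..i]
end

io = StringIO.new("ab\ncd")
line1 = chunked_gets(io)   # "ab\n"
line2 = chunked_gets(io)   # "cd"
line3 = chunked_gets(io)   # nil
```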
#inspect ⇒ Object
# File 'lib/ole/ranges_io.rb', line 208
def inspect
  # the rescue is for empty files
  pos, len = (@ranges[offset_and_size(@pos).last] rescue [nil, nil])
  range_str = pos ? "#{pos}..#{pos+len}" : 'nil'
  "#<#{self.class} io=#{io.inspect}, size=#@size, pos=#@pos, "\
    "range=#{range_str}>"
end
#offset_and_size(pos) ⇒ Object
returns the [offset, size] pair in order to read/write at pos (like a partial range), and its index.
# File 'lib/ole/ranges_io.rb', line 111
def offset_and_size pos
  total = 0
  ranges.each_with_index do |(offset, size), i|
    if pos <= total + size
      diff = pos - total
      return [offset + diff, size - diff], i
    end
    total += size
  end
  # should be impossible for any valid pos, (0...size) === pos
  raise ArgumentError, "no range for pos #{pos.inspect}"
end
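A worked example may help with the return value. The sketch below restates the same logic as a free function that takes the ranges as an argument (that signature is an assumption made so the example runs without the library), then evaluates it for a few positions.

```ruby
# Standalone restatement of the lookup, taking +ranges+ as a parameter.
def offset_and_size(ranges, pos)
  total = 0
  ranges.each_with_index do |(offset, size), i|
    if pos <= total + size
      diff = pos - total
      # remaining part of range i, starting at the requested position
      return [offset + diff, size - diff], i
    end
    total += size
  end
  raise ArgumentError, "no range for pos #{pos.inspect}"
end

ranges = [[10, 4], [30, 6]]   # logical size is 10

at_start = offset_and_size(ranges, 0)   # [[10, 4], 0]
inside   = offset_and_size(ranges, 2)   # [[12, 2], 0]
boundary = offset_and_size(ranges, 4)   # [[14, 0], 0]
second   = offset_and_size(ranges, 5)   # [[31, 5], 1]
```

Because the comparison is `<=`, a position that sits exactly on a boundary resolves to the end of the earlier range, with zero bytes remaining in it.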
#read(limit = nil) ⇒ Object
read bytes from file, to a maximum of limit, or all available if unspecified.
# File 'lib/ole/ranges_io.rb', line 129
def read limit=nil
  data = ''
  return data if eof?
  limit ||= size
  partial_range, i = offset_and_size @pos
  # this may be conceptually nice (create sub-range starting where we are), but
  # for a large range array its pretty wasteful. even the previous way was. but
  # i'm not trying to optimize this atm. it may even go to c later if necessary.
  ([partial_range] + ranges[i+1..-1]).each do |pos, len|
    @io.seek pos
    if limit < len
      # convoluted, to handle read errors. s may be nil
      s = @io.read limit
      @pos += s.length if s
      break data << s
    end
    # convoluted, to handle ranges beyond the size of the file
    s = @io.read len
    @pos += s.length if s
    data << s
    break if s.length != len
    limit -= len
  end
  data
end
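The effect of limit, and of a range that runs past the end of the parent io, can be seen in a simplified standalone sketch. The function name `read_from` and its signature are hypothetical, and the loop is written differently from the source above; it is meant to show the behaviour, not to reproduce the implementation.

```ruby
require 'stringio'

# Hypothetical standalone read: start at logical position +pos+ and return
# at most +limit+ bytes, walking the [offset, length] pairs in order.
def read_from(io, ranges, pos, limit)
  data = +''
  total = 0   # logical offset at which the current range starts
  ranges.each do |offset, len|
    # skip ranges that lie entirely before pos
    if pos >= total + len
      total += len
      next
    end
    skip = pos - total
    want = [len - skip, limit].min
    io.seek(offset + skip)
    chunk = io.read(want)
    break if chunk.nil?
    data << chunk
    limit -= chunk.length
    pos += chunk.length
    total += len
    # stop when the limit is met or the parent io came up short
    break if limit.zero? || chunk.length < want
  end
  data
end

source = StringIO.new('0123456789')

across = read_from(source, [[0, 3], [6, 4]], 1, 4)   # "1267"
short  = read_from(source, [[8, 5]], 0, 5)           # "89"
```

The second call asks for five bytes from a range that extends beyond the ten-byte source, so only the two available bytes come back.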
#truncate(size) ⇒ Object
you may override this call to update @ranges and @size, if applicable.
# File 'lib/ole/ranges_io.rb', line 156
def truncate size
  raise NotImplementedError, 'truncate not supported'
end
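One way a subclass might use this hook is sketched below. Both classes are hypothetical stand-ins so that the example runs without the library, and the allocation policy (append one new range directly after the furthest byte in use) is purely an assumption for illustration.

```ruby
# Hypothetical stand-in for the base class: fixed ranges, no resizing.
class FixedStream
  attr_reader :ranges, :size

  def initialize(ranges)
    @ranges = ranges
    @size = ranges.inject(0) { |total, (_pos, len)| total + len }
  end

  def truncate(_size)
    raise NotImplementedError, 'truncate not supported'
  end
end

# Hypothetical subclass that overrides the hook to update @ranges and @size.
class GrowableStream < FixedStream
  def truncate(new_size)
    if new_size > @size
      # assumed policy: add one range right after the furthest byte in use
      tail = @ranges.map { |pos, len| pos + len }.max || 0
      @ranges << [tail, new_size - @size]
      @size = new_size
    end
    # shrinking is left out of this sketch
    @size
  end
end

stream = GrowableStream.new([[0, 4]])
stream.truncate(10)
# stream.ranges is now [[0, 4], [4, 6]] and stream.size is 10
```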
#write(data) ⇒ Object
# File 'lib/ole/ranges_io.rb', line 166
def write data
  # short cut. needed because truncate 0 may return no ranges, instead of empty range,
  # thus offset_and_size fails.
  return 0 if data.empty?
  data_pos = 0
  # if we don't have room, we can use the truncate hook to make more space.
  if data.length > @size - @pos
    begin
      truncate @pos + data.length
    rescue NotImplementedError
      raise IOError, "unable to grow #{inspect} to write #{data.length} bytes"
    end
  end
  partial_range, i = offset_and_size @pos
  ([partial_range] + ranges[i+1..-1]).each do |pos, len|
    @io.seek pos
    if data_pos + len > data.length
      chunk = data[data_pos..-1]
      @io.write chunk
      @pos += chunk.length
      data_pos = data.length
      break
    end
    @io.write data[data_pos, len]
    @pos += len
    data_pos += len
  end
  data_pos
end
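How a single write is split across ranges can be shown with a small standalone sketch. The helper `write_ranges` is hypothetical and always starts at logical position 0; it leaves out the position tracking and the truncate hook that the method above handles.

```ruby
require 'stringio'

# Hypothetical helper: spread +data+ over the [offset, length] pairs in
# order and return the number of bytes written.
def write_ranges(io, ranges, data)
  written = 0
  ranges.each do |offset, len|
    break if written >= data.length
    chunk = data[written, len]   # at most len bytes, fewer at the tail
    io.seek(offset)
    io.write(chunk)
    written += chunk.length
  end
  written
end

target = StringIO.new('.' * 10)
count = write_ranges(target, [[6, 2], [0, 3]], 'ABCDE')
# count is 5 and target.string is "CDE...AB.."
```

The first two bytes land in the range at offset 6 and the remaining three in the range at offset 0, so the parent io holds the data in a different order than the logical stream.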