Class: XZ::Stream
Overview
The base class for XZ::StreamReader and XZ::StreamWriter. This is an abstract class that is not meant to be used directly. You can, however, test against this class in kind_of? tests.
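A minimal sketch of such a kind_of? test. The classes below are hypothetical stand-ins defined only for this example (they are not the ruby-xz classes), so the snippet runs without the gem installed.

```ruby
# Hypothetical stand-ins for XZ::Stream and one of its subclasses.
class SketchStream
  # The abstract base refuses to read, like the base class documented here.
  def read(*_args)
    raise IOError, "Stream not opened for reading"
  end
end

class SketchReader < SketchStream
  def read(*_args)
    "decompressed data"
  end
end

# Accept any object that descends from the abstract base class.
def stream_like?(obj)
  obj.kind_of?(SketchStream)
end

puts stream_like?(SketchReader.new) # a subclass instance passes the test
puts stream_like?("plain string")   # unrelated objects do not
```

With the real library, the same check would presumably be written against XZ::Stream itself.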
XZ::StreamReader and XZ::StreamWriter are IO-like classes that let you access XZ-compressed data the same way you access an IO object, which makes it easy to hand them to other libraries that expect IO objects. The most noticeable example may be reading and writing XZ-compressed tarballs with the minitar RubyGem; see the README.md file for an example.
Most of IO’s methods are implemented in this class or one of the subclasses. The most notable exception is that it is not possible to seek in XZ archives (#seek and #pos= are not defined). Many methods that are not expressly documented in the RDoc still exist; this class uses Ruby’s Forwardable module to forward them to the underlying IO object.
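The forwarding technique can be sketched as follows. This is an illustration only: `ForwardingWrapper` is a hypothetical class, and the three forwarded methods are picked for the example; which methods the library actually forwards is not shown on this page.

```ruby
require "forwardable"
require "stringio"

# Hypothetical wrapper (not ruby-xz code): forwards a few harmless
# methods to the wrapped IO and deliberately leaves out seeking.
class ForwardingWrapper
  extend Forwardable

  def_delegators :@delegate_io, :closed?, :sync, :sync=

  def initialize(delegate_io)
    @delegate_io = delegate_io
  end
end

wrapper = ForwardingWrapper.new(StringIO.new("data"))
puts wrapper.closed?            # answered by the wrapped StringIO
puts wrapper.respond_to?(:seek) # seek was neither defined nor forwarded
```

Leaving a method out of the delegator list is enough to make it unavailable on the wrapper, which is how an IO-like class can omit #seek and #pos=.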
Stream and its subclasses honour Ruby’s external+internal encoding system just like Ruby’s own IO does. All of what the Ruby docs say about external and internal encodings applies to this class with one important difference. The “external encoding” does not refer to the encoding of the file on the hard disk (this file is always a binary file as it’s compressed data), but to the encoding of the decompressed data inside the compressed file.
As with Ruby’s IO class, instances of this class and its subclasses default their external encoding to Encoding.default_external and their internal encoding to Encoding.default_internal. You can use #set_encoding or pass appropriate arguments to the new method to change these encodings per-instance.
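The difference between the two encodings can be illustrated without any compression. A minimal sketch using only core String methods; `transcode_chunk` is a hypothetical helper and not part of ruby-xz.

```ruby
# Hypothetical helper (not part of ruby-xz): interpret raw decompressed
# bytes in the external encoding, then transcode to the internal one.
def transcode_chunk(raw_bytes, external, internal = nil)
  str = raw_bytes.dup.force_encoding(external) # tag the bytes, no conversion
  internal ? str.encode(internal) : str        # convert only if requested
end

# "é" is the single byte 0xE9 in ISO-8859-1.
raw = "caf\xE9".b
text = transcode_chunk(raw, Encoding::ISO_8859_1, Encoding::UTF_8)
puts text.encoding
puts text.bytesize
```

When no internal encoding is given, the helper returns the string tagged with the external encoding and performs no transcoding, which matches the behaviour described for a nil internal encoding.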
Direct Known Subclasses
StreamReader, StreamWriter
Instance Attribute Summary
-
#external_encoding ⇒ Object
readonly
Returns the encoding used inside the compressed data stream.
-
#internal_encoding ⇒ Object
readonly
When compressed data is read, the decompressed data is transcoded from the external_encoding to this encoding.
-
#lineno ⇒ Object
Like IO#lineno and IO#lineno=.
Instance Method Summary
-
#<<(obj) ⇒ Object
Like IO#<<.
-
#advise ⇒ Object
Like IO#advise.
-
#close ⇒ Object
If not done yet, call #finish.
-
#close_read ⇒ Object
Always raises IOError, because XZ streams can never be duplex.
-
#close_write ⇒ Object
Always raises IOError, because XZ streams can never be duplex.
-
#closed? ⇒ Boolean
True if the delegate IO has been closed.
-
#each(*args) ⇒ Object
(also: #each_line)
Like IO#each.
-
#each_byte ⇒ Object
Like IO#each_byte.
-
#each_char ⇒ Object
Like IO#each_char.
-
#each_codepoint ⇒ Object
Like IO#each_codepoint.
-
#eof ⇒ Object
Alias for #eof?.
-
#eof? ⇒ Boolean
Overridden in StreamReader to be like IO#eof?.
-
#finish ⇒ Object
Free internal liblzma memory.
-
#finished? ⇒ Boolean
True if liblzma’s internal memory has been freed.
-
#getbyte ⇒ Object
Like IO#getbyte.
-
#getc ⇒ Object
Like IO#getc.
-
#gets(separator = $/, limit = nil) ⇒ Object
Like IO#gets.
-
#initialize(delegate_io) ⇒ Stream
constructor
Private API only for use by subclasses.
-
#lzma_code(str, action) ⇒ Object
Pass the given str into liblzma’s lzma_code() function.
-
#pos ⇒ Object
(also: #tell)
Returns the position in the decompressed data (regardless of whether this is a reader or a writer instance).
-
#print(*objs) ⇒ Object
Like IO#print.
-
#printf(*args) ⇒ Object
Like IO#printf.
-
#putc(obj) ⇒ Object
Like IO#putc.
-
#puts(*objs) ⇒ Object
-
#read(*args) ⇒ Object
Overridden in StreamReader to be like IO#read.
-
#readbyte ⇒ Object
Like IO#readbyte.
-
#readchar ⇒ Object
Like IO#readchar.
-
#readline(*args) ⇒ Object
Like IO#readline.
-
#reopen(*args) ⇒ Object
It is not possible to reopen an lzma stream, hence this method always raises NotImplementedError.
-
#rewind ⇒ Object
Partial implementation of rewind abstracting common operations.
-
#set_encoding(*args) ⇒ Object
Like IO#set_encoding.
-
#to_io ⇒ Object
You can mostly treat this as if it were an IO object.
-
#write(*args) ⇒ Object
Overridden in StreamWriter to be like IO#write.
Constructor Details
#initialize(delegate_io) ⇒ Stream
Private API only for use by subclasses.
# File 'lib/xz/stream.rb', line 95

def initialize(delegate_io) # :nodoc:
  @delegate_io = delegate_io
  @lzma_stream = XZ::LibLZMA::LZMAStream.malloc
  XZ::LibLZMA::LZMA_STREAM_INIT(@lzma_stream)

  @finished = false
  @lineno = 0
  @pos = 0
  @external_encoding = Encoding.default_external
  @internal_encoding = Encoding.default_internal
  @transcode_options = {}
  @input_buffer_p = Fiddle::Pointer.malloc(XZ::CHUNK_SIZE)
  @output_buffer_p = Fiddle::Pointer.malloc(XZ::CHUNK_SIZE)
end
Instance Attribute Details
#external_encoding ⇒ Object (readonly)
Returns the encoding used inside the compressed data stream. Like IO#external_encoding.
# File 'lib/xz/stream.rb', line 87

def external_encoding
  @external_encoding
end
#internal_encoding ⇒ Object (readonly)
When compressed data is read, the decompressed data is transcoded from the external_encoding to this encoding. If this encoding is nil, no transcoding happens.
# File 'lib/xz/stream.rb', line 92

def internal_encoding
  @internal_encoding
end
#lineno ⇒ Object
Like IO#lineno and IO#lineno=.
# File 'lib/xz/stream.rb', line 83

def lineno
  @lineno
end
Instance Method Details
#<<(obj) ⇒ Object
Like IO#<<.
# File 'lib/xz/stream.rb', line 289

def <<(obj)
  write(obj.to_s)
end
#advise ⇒ Object
Like IO#advise. No-op, because not meaningful on compressed data.
# File 'lib/xz/stream.rb', line 294

def advise
  nil
end
#close ⇒ Object
If not done yet, call #finish. Then close the delegate IO. The latter action is going to cause the delegate IO to flush its buffer. After this method returns, it is guaranteed that all pending data has been flushed to the OS’ kernel.
# File 'lib/xz/stream.rb', line 227

def close
  finish unless @finished
  @delegate_io.close unless @delegate_io.closed?
  nil
end
#close_read ⇒ Object
Always raises IOError, because XZ streams can never be duplex.
# File 'lib/xz/stream.rb', line 234

def close_read
  raise(IOError, "Not a duplex I/O stream")
end
#close_write ⇒ Object
Always raises IOError, because XZ streams can never be duplex.
# File 'lib/xz/stream.rb', line 239

def close_write
  raise(IOError, "Not a duplex I/O stream")
end
#closed? ⇒ Boolean
True if the delegate IO has been closed.
# File 'lib/xz/stream.rb', line 187

def closed?
  @delegate_io.closed?
end
#each(*args) ⇒ Object Also known as: each_line
Like IO#each.
# File 'lib/xz/stream.rb', line 365

def each(*args)
  return enum_for __method__ unless block_given?

  while line = gets(*args)
    yield(line)
  end
end
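The pattern used here (return an Enumerator when no block is given, otherwise loop over #gets) can be tried in isolation. In this sketch `LineSource` is a hypothetical stand-in backed by a StringIO, not a ruby-xz class. Unlike the listing above, it also passes the arguments on to enum_for, which is a choice made for this example.

```ruby
require "stringio"

# Hypothetical stand-in for a stream (not ruby-xz code) that exposes
# #gets and builds #each on top of it.
class LineSource
  def initialize(io)
    @io = io
  end

  def gets(*args)
    @io.gets(*args)
  end

  def each(*args)
    # Without a block, hand back an Enumerator that replays this call.
    return enum_for(__method__, *args) unless block_given?

    while (line = gets(*args))
      yield(line)
    end
  end
end

source = LineSource.new(StringIO.new("a\nb\nc\n"))
p source.each.to_a
```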
#each_byte ⇒ Object
Like IO#each_byte.
# File 'lib/xz/stream.rb', line 375

def each_byte
  return enum_for __method__ unless block_given?

  while byte = getbyte
    yield(byte)
  end
end
#each_char ⇒ Object
Like IO#each_char.
# File 'lib/xz/stream.rb', line 384

def each_char
  return enum_for __method__ unless block_given?

  while char = getc
    yield(char)
  end
end
#each_codepoint ⇒ Object
Like IO#each_codepoint.
# File 'lib/xz/stream.rb', line 393

def each_codepoint
  return enum_for __method__ unless block_given?

  each_char{|c| yield(c.ord)}
end
#eof ⇒ Object
Alias for #eof?
# File 'lib/xz/stream.rb', line 182

def eof
  eof?
end
#eof? ⇒ Boolean
Overridden in StreamReader to be like IO#eof?. This abstract implementation only raises IOError.
# File 'lib/xz/stream.rb', line 177

def eof?
  raise(IOError, "Stream not opened for reading")
end
#finish ⇒ Object
Free internal liblzma memory. This needs to be called before you leave this object for the GC. If you used a block-form initializer, this is done automatically for you.
Subsequent calls to #read or #write will cause an IOError.
Returns the underlying IO object. This allows you to retrieve the File instance that was automatically created when using the open method’s block form.
# File 'lib/xz/stream.rb', line 208

def finish
  return if @finished

  # Clean up the lzma_stream structure's internal memory.
  # This would belong into a destructor if Ruby had that.
  XZ::LibLZMA.lzma_end(@lzma_stream)
  Fiddle.free @lzma_stream.to_ptr
  Fiddle.free @input_buffer_p
  Fiddle.free @output_buffer_p

  @finished = true
  @delegate_io
end
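The finish/close protocol can be sketched with a pure-Ruby stand-in. `SketchResource` is hypothetical and frees no native memory; it only mirrors the control flow shown above, including the detail that a repeated finish returns nil rather than the IO.

```ruby
require "stringio"

# Hypothetical wrapper (not ruby-xz code) that follows the same
# finish/close protocol: finish is idempotent and hands back the IO.
class SketchResource
  def initialize(io)
    @io = io
    @finished = false
  end

  def finished?
    @finished
  end

  def finish
    return if @finished # a second call is a no-op and returns nil

    @finished = true    # real code would free native memory here
    @io
  end

  def close
    finish unless @finished
    @io.close unless @io.closed?
    nil
  end
end

io = StringIO.new
resource = SketchResource.new(io)
returned = resource.finish # first call returns the wrapped IO
resource.close             # closes the IO; finish is not repeated
puts returned.equal?(io)
puts io.closed?
```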
#finished? ⇒ Boolean
True if liblzma’s internal memory has been freed. For writer instances, receiving true from this method also means that all of liblzma’s compressed data has been flushed to the underlying IO object.
# File 'lib/xz/stream.rb', line 195

def finished?
  @finished
end
#getbyte ⇒ Object
Like IO#getbyte. Note this method isn’t exactly performant, because it actually reads compressed data as a string and then needs to figure out the bytes from that again.
# File 'lib/xz/stream.rb', line 301

def getbyte
  return nil if eof?
  read(1).bytes.first
end
#getc ⇒ Object
Like IO#getc.
# File 'lib/xz/stream.rb', line 312

def getc
  str = String.new

  # Read byte-by-byte until a valid character in the external
  # encoding was built.
  loop do
    str.force_encoding(Encoding::BINARY)
    str << read(1)
    str.force_encoding(@external_encoding)

    break if str.valid_encoding? || eof?
  end

  # Transcode to internal encoding if one was requested
  if @internal_encoding
    str.encode(@internal_encoding)
  else
    str
  end
end
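The byte-by-byte assembly of one character can be tried on an ordinary StringIO. `read_one_char` is a hypothetical helper written for this sketch; it copies the loop above but leaves out the transcoding step.

```ruby
require "stringio"

# Hypothetical helper (not ruby-xz code): pull bytes from `io` until they
# form one valid character in `external`, mirroring the loop above.
def read_one_char(io, external)
  str = String.new
  loop do
    str.force_encoding(Encoding::BINARY) # append raw bytes safely
    str << io.read(1)
    str.force_encoding(external)         # re-tag and test validity
    break if str.valid_encoding? || io.eof?
  end
  str
end

io = StringIO.new("\u00E9x".b) # "é" occupies two bytes in UTF-8
first = read_one_char(io, Encoding::UTF_8)
second = read_one_char(io, Encoding::UTF_8)
puts first.bytesize
puts second
```

The helper must not be called at end of input, because io.read(1) then returns nil; callers are assumed to check for EOF first.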
#gets(separator = $/, limit = nil) ⇒ Object
Like IO#gets.
# File 'lib/xz/stream.rb', line 339

def gets(separator = $/, limit = nil)
  return nil if eof?
  @lineno += 1

  # Mirror IO#gets' weird call-seq
  if separator.respond_to?(:to_int)
    limit = separator.to_int
    separator = $/
  end

  buf = String.new
  buf.force_encoding(target_encoding)
  until eof? || (limit && buf.length >= limit)
    buf << getc
    return buf if buf[-1] == separator
  end

  buf
end
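The argument handling that mirrors IO#gets can be isolated into a small function. `normalize_gets_args` is a hypothetical name used only for this sketch; it reproduces the rule that a lone integer argument is treated as the limit.

```ruby
# Hypothetical helper (not ruby-xz code): normalise the arguments the way
# the method above does, so that gets(10) means "limit 10, default separator".
def normalize_gets_args(separator = $/, limit = nil)
  if separator.respond_to?(:to_int)
    limit = separator.to_int
    separator = $/
  end
  [separator, limit]
end

p normalize_gets_args      # default separator, no limit
p normalize_gets_args(10)  # integer is taken as the limit
p normalize_gets_args(";") # string is taken as the separator
```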
#lzma_code(str, action) ⇒ Object
Pass the given str into liblzma’s lzma_code() function. action is either LibLZMA::LZMA_RUN (still working) or LibLZMA::LZMA_FINISH (this is the last piece).
# File 'lib/xz/stream.rb', line 113

def lzma_code(str, action) # :nodoc:
  previous_encoding = str.encoding
  str.force_encoding(Encoding::BINARY) # Need to operate on bytes now

  begin
    pos = 0
    until pos > str.bytesize # Do not use >=, that conflicts with #lzma_finish
      substr = str[pos, XZ::CHUNK_SIZE]
      @input_buffer_p[0, substr.bytesize] = substr
      pos += XZ::CHUNK_SIZE

      @lzma_stream.next_in = @input_buffer_p
      @lzma_stream.avail_in = substr.bytesize

      loop do
        @lzma_stream.next_out = @output_buffer_p
        @lzma_stream.avail_out = XZ::CHUNK_SIZE
        res = XZ::LibLZMA.lzma_code(@lzma_stream.to_ptr, action)
        XZ.send :check_lzma_code_retval, res # call package-private method

        data = @output_buffer_p[0, XZ::CHUNK_SIZE - @lzma_stream.avail_out]
        yield(data)

        break unless @lzma_stream.avail_out == 0
      end
    end
  ensure
    str.force_encoding(previous_encoding)
  end
end
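The effect of the `until pos > str.bytesize` condition can be seen without liblzma. The sketch below uses a hypothetical helper, `each_chunk`, that only collects the slices the outer loop would feed to the compressor; the chunk size of 4 is an arbitrary value for the example.

```ruby
# Hypothetical helper (not ruby-xz code): split `str` into pieces of at most
# `chunk_size` bytes using the same loop condition as above.
def each_chunk(str, chunk_size)
  bytes = str.b # work on a binary copy, leave the caller's string alone
  chunks = []
  pos = 0
  until pos > bytes.bytesize
    chunks << bytes[pos, chunk_size]
    pos += chunk_size
  end
  chunks
end

p each_chunk("abcdef", 4)
p each_chunk("abcd", 4)
p each_chunk("", 4)
```

Because the comparison is `>` rather than `>=`, an empty string still produces one (empty) chunk, and an input whose size is an exact multiple of the chunk size gets a trailing empty chunk. This is presumably what the source comment about #lzma_finish refers to.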
#pos ⇒ Object Also known as: tell
Returns the position in the decompressed data (regardless of whether this is a reader or a writer instance).
# File 'lib/xz/stream.rb', line 257

def pos
  @pos
end
#print(*objs) ⇒ Object
Like IO#print.
# File 'lib/xz/stream.rb', line 439

def print(*objs)
  if objs.empty?
    write($_)
  else
    objs.each do |obj|
      write(obj.to_s)
      write($,) if $,
    end
  end

  write($\) if $\
  nil
end
#printf(*args) ⇒ Object
Like IO#printf.
# File 'lib/xz/stream.rb', line 400

def printf(*args)
  write(sprintf(*args))
  nil
end
#putc(obj) ⇒ Object
Like IO#putc.
# File 'lib/xz/stream.rb', line 406

def putc(obj)
  if obj.respond_to? :chr
    write(obj.chr)
  elsif obj.respond_to? :to_str
    write(obj.to_str)
  else
    raise(TypeError, "Can only #putc strings and numbers")
  end
end
#puts(*objs) ⇒ Object
# File 'lib/xz/stream.rb', line 416

def puts(*objs)
  if objs.empty?
    write("\n")
    return nil
  end

  objs.each do |obj|
    if obj.respond_to? :to_ary
      puts(*obj.to_ary)
    else
      # Don't squeeze multiple subsequent trailing newlines in `obj'
      obj = obj.to_s
      if obj.end_with?("\n".encode(obj.encoding))
        write(obj)
      else
        write(obj + "\n".encode(obj.encoding))
      end
    end
  end
  nil
end
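The newline rules implemented above can be sketched as a pure function that returns the strings which would be written. `puts_lines` is a hypothetical helper for this illustration, and it skips the per-encoding newline conversion for brevity.

```ruby
# Hypothetical helper (not ruby-xz code): compute what a puts-like method
# would write. Arrays are flattened recursively and a newline is appended
# only when the string does not already end with one.
def puts_lines(*objs)
  return ["\n"] if objs.empty?

  objs.flat_map do |obj|
    if obj.respond_to?(:to_ary)
      puts_lines(*obj.to_ary)
    else
      str = obj.to_s
      str.end_with?("\n") ? [str] : [str + "\n"]
    end
  end
end

p puts_lines("a", "b\n", ["c", ["d"]], 1)
p puts_lines
```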
#read(*args) ⇒ Object
Overridden in StreamReader to be like IO#read. This abstract implementation only raises IOError.
# File 'lib/xz/stream.rb', line 245

def read(*args)
  raise(IOError, "Stream not opened for reading")
end
#readbyte ⇒ Object
Like IO#readbyte.
# File 'lib/xz/stream.rb', line 307

def readbyte
  getbyte || raise(EOFError, "End of stream reached")
end
#readchar ⇒ Object
Like IO#readchar.
# File 'lib/xz/stream.rb', line 334

def readchar
  getc || raise(EOFError, "End of stream reached")
end
#readline(*args) ⇒ Object
Like IO#readline.
# File 'lib/xz/stream.rb', line 360

def readline(*args)
  gets(*args) || raise(EOFError, "End of stream reached")
end
#reopen(*args) ⇒ Object
It is not possible to reopen an lzma stream, hence this method always raises NotImplementedError.
# File 'lib/xz/stream.rb', line 455

def reopen(*args)
  raise(NotImplementedError, "Can't reopen an lzma stream")
end
#rewind ⇒ Object
Partial implementation of rewind abstracting common operations. The subclasses implement the rest.
# File 'lib/xz/stream.rb', line 146

def rewind # :nodoc:
  # Free the current lzma stream and rewind the underlying IO.
  # It is required to call #rewind before allocating a new lzma
  # stream, because if #rewind raises an exception (because the
  # underlying IO is not rewindable), a memory leak would occur
  # with regard to an allocated-but-never-freed lzma stream.
  finish
  @delegate_io.rewind

  # Reset internal state
  @pos = @lineno = 0
  @finished = false
  @lzma_stream = XZ::LibLZMA::LZMAStream.malloc
  @input_buffer_p = Fiddle::Pointer.malloc(XZ::CHUNK_SIZE)
  @output_buffer_p = Fiddle::Pointer.malloc(XZ::CHUNK_SIZE)
  XZ::LibLZMA::LZMA_STREAM_INIT(@lzma_stream)

  0 # Mimic IO#rewind's return value
end
#set_encoding(*args) ⇒ Object
Like IO#set_encoding.
# File 'lib/xz/stream.rb', line 263

def set_encoding(*args)
  if args.count < 1 || args.count > 3
    raise ArgumentError, "Wrong number of arguments: Expected 1-3, got #{args.count}"
  end

  # Clean `args' to [external_encoding, internal_encoding],
  # and @transcode_options.
  return set_encoding($`, $', *args[1..-1]) if args[0].respond_to?(:to_str) && args[0].to_str =~ /:/
  @transcode_options = args.delete_at(-1) if args[-1].kind_of?(Hash)

  # `args' is always [external, internal] or [external] at this point
  @external_encoding = args[0].kind_of?(Encoding) ? args[0] : Encoding.find(args[0])
  if args[1]
    @internal_encoding = args[1].kind_of?(Encoding) ? args[1] : Encoding.find(args[1])
  else
    @internal_encoding = Encoding.default_internal # Encoding.default_internal defaults to nil
  end

  self
end
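How the accepted argument forms resolve to a pair of encodings can be sketched with a standalone function. `resolve_encodings` is a hypothetical helper for this example; it handles the "external:internal" string form and the two-argument form, and ignores the options hash.

```ruby
# Hypothetical helper (not ruby-xz code): resolve set_encoding-style
# arguments into [external, internal]. Names are looked up with
# Encoding.find; Encoding objects are used as they are.
def resolve_encodings(external, internal = nil)
  # "ext:int" in a single string is split into its two halves.
  if external.respond_to?(:to_str) && external.to_str.include?(":")
    external, internal = external.to_str.split(":", 2)
  end

  to_enc = ->(enc) { enc.kind_of?(Encoding) ? enc : Encoding.find(enc) }
  [to_enc.call(external),
   internal ? to_enc.call(internal) : Encoding.default_internal]
end

p resolve_encodings("ISO-8859-1:UTF-8")
p resolve_encodings(Encoding::UTF_16LE, "UTF-8")
p resolve_encodings("UTF-8")
```

As in the listing above, a missing internal encoding falls back to Encoding.default_internal, which is nil unless the program has changed it.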
#to_io ⇒ Object
You can mostly treat this as if it were an IO object, at least for subclasses. This class itself is abstract; you shouldn’t use it directly at all.
Returns the receiver.
# File 'lib/xz/stream.rb', line 171

def to_io
  self
end
#write(*args) ⇒ Object
Overridden in StreamWriter to be like IO#write. This abstract implementation only raises IOError.
# File 'lib/xz/stream.rb', line 251

def write(*args)
  raise(IOError, "Stream not opened for writing")
end