Class: Bzip2::FFI::Reader

Inherits:
IO
  • Object
Defined in:
lib/bzip2/ffi/reader.rb

Overview

Reader reads and decompresses a bzip2 compressed stream or file. The public instance methods of Reader are intended to be equivalent to those of a standard IO object.

Data can be read as a stream using Reader.open and #read, for example:

Bzip2::FFI::Reader.open(io_or_path) do |reader|
  while buffer = reader.read(1024) do
    # process uncompressed bytes in buffer
  end
end

Alternatively, without passing a block to open:

reader = Bzip2::FFI::Reader.open(io_or_path)
begin
  while buffer = reader.read(1024) do
    # process uncompressed bytes in buffer
  end
ensure
  reader.close
end
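
Because each call returns at most the requested number of bytes, the loop can feed a consumer incrementally without holding the whole payload in memory. A minimal sketch, assuming a hypothetical helper named sha256_of and using a StringIO as a stand-in for the reader (neither is part of Bzip2::FFI):

```ruby
require 'digest'
require 'stringio'

# Hypothetical helper: hashes everything a reader yields, one chunk at a time.
# Works with any object whose read(length) returns nil at the end of the data.
def sha256_of(reader, chunk_size = 1024)
  digest = Digest::SHA256.new
  total = 0
  while (buffer = reader.read(chunk_size))
    digest.update(buffer)      # fold the chunk into the running hash
    total += buffer.bytesize   # count the bytes seen so far
  end
  [digest.hexdigest, total]
end

# Stand-in for the reader passed to the block of Bzip2::FFI::Reader.open.
hex, total = sha256_of(StringIO.new('x' * 5000))
```

The same helper could be called inside the block passed to Reader.open, with the yielded reader as its argument.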

An entire bzip2 structure can be read in a single step using Reader.read:

uncompressed = Bzip2::FFI::Reader.read(io_or_path)

The Reader.open and Reader.read methods accept either an IO-like object or a file path. IO-like objects must have a read method. Paths can be given as either a String or Pathname.

No character conversion is performed on decompressed bytes. The Reader.read and #read methods return instances of String that represent the raw decompressed bytes, with encoding set to Encoding::ASCII_8BIT (also known as Encoding::BINARY).
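
When the decompressed payload is known to be text, the caller has to re-tag the bytes. A minimal sketch, assuming the payload is UTF-8; as_utf8 is a hypothetical helper and the literal String stands in for a value returned by Reader.read:

```ruby
# Hypothetical helper: re-tag binary bytes as UTF-8 without transcoding them.
def as_utf8(raw)
  text = raw.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, 'payload is not valid UTF-8' unless text.valid_encoding?
  text
end

# Stand-in for the binary String returned by Bzip2::FFI::Reader.read.
raw = "caf\u00e9\n".b
text = as_utf8(raw)   # same bytes, now tagged as UTF-8
```

The helper works on a copy, so the original String keeps its binary encoding.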

Reader will read a single bzip2 compressed structure from the given stream or file. If the stream or file contains data beyond the end of the bzip2 structure, such data may be read during decompression. If such an overread has occurred and the IO-like object being read from has a seek method, Reader will use it to reposition the stream to the byte immediately following the end of the bzip2 structure. If seek raises an IOError, it will be caught and the stream position will be left unchanged.
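
The repositioning step can be pictured with plain Ruby objects. This is a sketch of the idea rather than the library's code: rewind_overread is a hypothetical helper, and the 10-byte "structure" and its trailer are made-up data.

```ruby
require 'stringio'

# Hypothetical helper: step back over bytes that were read but not needed,
# provided the source supports seek. Returns true if the stream was moved.
def rewind_overread(io, unused)
  return false unless unused > 0 && io.respond_to?(:seek)
  io.seek(-unused, IO::SEEK_CUR)
  true
rescue IOError
  false   # the stream position is left unchanged
end

# Made-up layout: a 10-byte structure followed by unrelated trailing data.
io = StringIO.new(('S' * 10) + 'TRAILER')
chunk = io.read(16)             # a buffered read overshoots the structure
unused = chunk.bytesize - 10    # bytes read past the end of the structure
rewind_overread(io, unused)
trailing = io.read              # starts immediately after the structure
```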

Methods inherited from IO

#autoclose=, #autoclose?, #binmode, #binmode?, #closed?, #external_encoding, #internal_encoding

Constructor Details

#initialize(io, options = {}) ⇒ Reader

Initializes a Bzip2::FFI::Reader to read compressed data from an IO-like object (io). io must have a read method.

The following options can be specified using the options Hash:

  • :autoclose - Set to true to close io when the Reader instance is closed.
  • :small - Set to true to use an alternative decompression algorithm that uses less memory, but at the cost of decompressing more slowly (roughly 2,300 kB less memory at about half the speed).

binmode is called on io if io responds to binmode.

After use, the Reader instance should be closed using the #close method.
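
Any object that responds to read satisfies the constructor's check. A minimal sketch of such an object follows; ChunkSource is a hypothetical class, and it is an assumption that read is called with a byte count in the same way as IO#read:

```ruby
# Hypothetical IO-like source. Only read is required; binmode and seek are
# optional extras that this class does not provide.
class ChunkSource
  def initialize(data)
    @data = data.b
    @pos = 0
  end

  # Mirrors IO#read(length): up to length bytes, or nil at the end of the data.
  def read(length)
    return nil if @pos >= @data.bytesize

    chunk = @data.byteslice(@pos, length)
    @pos += chunk.bytesize
    chunk
  end
end

source = ChunkSource.new('compressed bytes would go here')
source.respond_to?(:read)   # the condition checked by the constructor
```

An instance could then be passed as the io argument in place of a File or StringIO.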

Parameters:

  • io (Object)

    An IO-like object with a read method.

  • options (Hash) (defaults to: {})

    Optional parameters (:autoclose and :small).

Raises:

  • (ArgumentError)

    If io is nil or does not respond to read.

  • (Error::Bzip2Error)

    If an error occurs when initializing libbz2.


# File 'lib/bzip2/ffi/reader.rb', line 191

def initialize(io, options = {})
  super
  raise ArgumentError, 'io must respond to read' unless io.respond_to?(:read)

  small = options[:small]

  @in_eof = false
  @out_eof = false
  @in_buffer = nil

  check_error(Libbz2::BZ2_bzDecompressInit(stream, 0, small ? 1 : 0))

  ObjectSpace.define_finalizer(self, self.class.send(:finalize, stream))
end

Class Method Details

.open(io_or_path, options = {}) ⇒ Object

Opens a Bzip2::FFI::Reader to read and decompress data from either an IO-like object or a file. IO-like objects must have a read method. Files can be specified using either a String containing the file path or a Pathname.

If no block is given, the opened Reader instance is returned. After use, the instance should be closed using the #close method.

If a block is given, it will be passed the opened Reader instance as an argument. After the block terminates, the Reader instance will automatically be closed. open will then return the result of the block.
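
The block and non-block forms follow a common Ruby resource convention. The sketch below is not the library's implementation; Resource is a hypothetical class that only illustrates the calling convention described above:

```ruby
# Hypothetical class illustrating the open convention.
class Resource
  # Without a block: return the instance; the caller must close it.
  # With a block: yield the instance, always close it, return the block result.
  def self.open
    instance = new
    return instance unless block_given?

    begin
      yield instance
    ensure
      instance.close
    end
  end

  def initialize
    @closed = false
  end

  def closed?
    @closed
  end

  def close
    @closed = true
    nil
  end
end

result = Resource.open { |resource| resource.closed? ? 'closed' : 'open' }
```

The ensure clause is what guarantees the close call even when the block raises.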

The following options can be specified using the options Hash:

  • :autoclose - When passing an IO-like object, set to true to close the IO when the Reader instance is closed.
  • :small - Set to true to use an alternative decompression algorithm that uses less memory, but at the cost of decompressing more slowly (roughly 2,300 kB less memory at about half the speed).

If an IO-like object that has a binmode method is passed to open, binmode will be called on io_or_path before yielding to the block or returning.

Parameters:

  • io_or_path (Object)

    Either an IO-like object with a read method or a file path as a String or Pathname.

  • options (Hash) (defaults to: {})

    Optional parameters (:autoclose and :small).

Returns:

  • (Object)

    The opened Reader instance if no block is given, or the result of the block if a block is given.

Raises:

  • (ArgumentError)

    If io_or_path is not a String, Pathname or an IO-like object with a read method.

  • (Errno::ENOENT)

    If the specified file does not exist.

  • (Error::Bzip2Error)

    If an error occurs when initializing libbz2.


# File 'lib/bzip2/ffi/reader.rb', line 103

def open(io_or_path, options = {})
  if io_or_path.kind_of?(String) || io_or_path.kind_of?(Pathname)
    options = options.merge(autoclose: true)
    proc = -> { open_bzip_file(io_or_path.to_s, 'rb') }
    super(proc, options)
  elsif !io_or_path.kind_of?(Proc)
    super
  else
    raise ArgumentError, 'io_or_path must be an IO-like object or a path'
  end
end

.read(io_or_path, options = {}) ⇒ String

Reads and decompresses an entire bzip2 compressed structure from either an IO-like object or a file and returns the decompressed bytes as a String. IO-like objects must have a read method. Files can be specified using either a String containing the file path or a Pathname.

The following options can be specified using the options Hash:

  • :autoclose - When passing an IO-like object, set to true to close the IO when the compressed data has been read.
  • :small - Set to true to use an alternative decompression algorithm that uses less memory, but at the cost of decompressing more slowly (roughly 2,300 kB less memory at about half the speed).

No character conversion is performed on decompressed bytes. read returns a String that represents the raw decompressed bytes, with encoding set to Encoding::ASCII_8BIT (also known as Encoding::BINARY).

If an IO-like object that has a binmode method is passed to read, binmode will be called on io_or_path before any compressed data is read.

Parameters:

  • io_or_path (Object)

    Either an IO-like object with a read method or a file path as a String or Pathname.

  • options (Hash) (defaults to: {})

    Optional parameters (:autoclose and :small).

Returns:

  • (String)

    The decompressed data.

Raises:

  • (ArgumentError)

    If io_or_path is not a String, Pathname or an IO-like object with a read method.

  • (Errno::ENOENT)

    If the specified file does not exist.

  • (Error::Bzip2Error)

    If an error occurs when initializing libbz2 or decompressing data.


# File 'lib/bzip2/ffi/reader.rb', line 150

def read(io_or_path, options = {})
  open(io_or_path, options) do |reader|
    reader.read
  end
end

Instance Method Details

#close ⇒ NilType

Ends decompression and closes the Bzip2::FFI::Reader.

If the open method is used with a block, it is not necessary to call close. Otherwise, close should be called once the Reader is no longer needed.

Returns:

  • (NilType)

    nil.

Raises:

  • (IOError)

    If the Reader has already been closed.


# File 'lib/bzip2/ffi/reader.rb', line 214

def close
  s = stream

  unless @out_eof
    decompress_end(s)
  end

  s[:next_in] = nil
  s[:next_out] = nil
  
  if @in_buffer
    @in_buffer.free
    @in_buffer = nil
  end
  
  super
end

#read(length = nil, buffer = nil) ⇒ String

Reads and decompresses data from the bzip2 compressed stream or file, returning the uncompressed bytes.

length must be a non-negative integer or nil.

If length is a positive integer, it specifies the maximum number of uncompressed bytes to return. read will return nil or a String with a length of 1 to length bytes containing the decompressed data. A result of nil or a String with a length less than length bytes indicates that the end of the decompressed data has been reached.

If length is nil, read reads until the end of the decompressed data, returning the uncompressed bytes as a String.

If length is 0, read returns an empty String.

If the optional buffer argument is present, it must reference a String that will receive the decompressed data. buffer will contain only the decompressed data after the call to read, even if it is not empty beforehand.

No character conversion is performed on decompressed bytes. read returns a String that represents the raw decompressed bytes, with encoding set to Encoding::ASCII_8BIT (also known as Encoding::BINARY).
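
The length and buffer rules match those of IO#read, so they can be tried out with a StringIO. In this sketch the StringIO is only a stand-in for a Reader, on the assumption that the two behave alike for these calls:

```ruby
require 'stringio'

# Stand-in for a Reader positioned at the start of 8 decompressed bytes.
io = StringIO.new('abcdefgh'.b)

first = io.read(5)       # positive length: up to 5 bytes
zero = io.read(0)        # length 0: empty String, nothing consumed

buffer = +'old contents'
io.read(5, buffer)       # short read of the last 3 bytes; buffer is overwritten

at_end = io.read(5)      # positive length at the end of the data: nil
rest = io.read           # nil length at the end of the data: empty String
```

A short or nil result from a call with a positive length is the signal to stop reading.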

Parameters:

  • length (Integer) (defaults to: nil)

    Must be a non-negative integer or nil. Set to a positive integer to specify the maximum number of uncompressed bytes to return. Set to nil to return the remaining decompressed data. Set to 0 to return an empty String.

  • buffer (String) (defaults to: nil)

    An optional buffer to receive the decompressed data.

Returns:

  • (String, nil)

    The decompressed data as a String with ASCII-8BIT encoding, or nil if length was a positive integer and the end of the decompressed data has been reached.

Raises:

  • (ArgumentError)

    If length is negative.

  • (Error::Bzip2Error)

    If an error occurs during decompression.

  • (IOError)

    If the Reader has been closed.


# File 'lib/bzip2/ffi/reader.rb', line 271

def read(length = nil, buffer = nil)
  if buffer
    buffer.clear
    buffer.force_encoding(Encoding::ASCII_8BIT)
  end

  if length
    raise ArgumentError, 'length must be a non-negative integer or nil' if length < 0

    if length == 0
      check_closed
      return buffer || ''
    end

    decompressed = decompress(length)
    
    return nil unless decompressed
    buffer ? buffer << decompressed : decompressed
  else
    result = buffer ? StringIO.new(buffer) : StringIO.new

    # StringIO#binmode is a no-op, but call in case it is implemented in
    # future versions.
    result.binmode
    
    result.set_encoding(Encoding::ASCII_8BIT)

    loop do
      decompressed = decompress(DEFAULT_DECOMPRESS_COUNT)
      break unless decompressed
      result.write(decompressed)
      break if decompressed.bytesize < DEFAULT_DECOMPRESS_COUNT
    end

    result.string
  end
end