Module: XZ

Defined in:
lib/xz.rb,
lib/xz/lib_lzma.rb

Overview

The MIT License

Basic liblzma-bindings for Ruby.

Copyright © 2011,2013 Marvin Gülker et al.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the ‘Software’), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Defined Under Namespace

Modules: LibLZMA

Classes: LZMAError, LZMAStream, Stream, StreamReader, StreamWriter

Constant Summary

VERSION = "0.2.1"

The version of this library.

CHUNK_SIZE = 4096

Number of bytes read in one chunk.

Class Method Details

.compress(str, compression_level = 6, check = :crc64, extreme = false) ⇒ Object

Compresses arbitrary data using the XZ algorithm.

Parameters

str

The data to compress.

For the other parameters, see the compress_stream method.

Return value

The compressed data as a BINARY-encoded string.

Example

data = "I love Ruby"
comp = XZ.compress(data) #=> binary blob

Remarks

Don’t use this method for large amounts of data: the whole input and the whole result are held in memory, so you may run out of it. Use compress_file or compress_stream instead.

Raises:

  • (NotImplementedError)


# File 'lib/xz.rb', line 217

def compress(str, compression_level = 6, check = :crc64, extreme = false)
  raise(NotImplementedError, "StringIO isn't available!") unless defined? StringIO
  s = StringIO.new(str)
  compress_stream(s, compression_level, check, extreme)
end

.compress_file(in_file, out_file, compression_level = 6, check = :crc64, extreme = false) ⇒ Object

Compresses in_file and writes the result to out_file.

Parameters

in_file

The path to the file to read from.

out_file

The path of the file to write to. If it exists, it will be overwritten.

For the other parameters, see the compress_stream method.

Return value

The number of bytes written, i.e. the size of the archive.

Example

XZ.compress_file("myfile.txt", "myfile.txt.xz")
XZ.compress_file("myarchive.tar", "myarchive.tar.xz")

Remarks

This method is safe to use with big files, because files are not loaded into memory completely at once.



# File 'lib/xz.rb', line 195

def compress_file(in_file, out_file, compression_level = 6, check = :crc64, extreme = false)
  File.open(in_file, "rb") do |i_file|
    File.open(out_file, "wb") do |o_file|
      compress_stream(i_file, compression_level, check, extreme) do |chunk|
        o_file.write(chunk)
      end
    end
  end
end

.compress_stream(io, compression_level = 6, check = :crc64, extreme = false, &block) ⇒ Object

Also known as: encode_stream

call-seq:

compress_stream(io [, compression_level [, check [, extreme ] ] ] ) → a_string
compress_stream(io [, compression_level [, check [, extreme ] ] ] ){|chunk| ... } → an_integer
encode_stream(io [, compression_level [, check [, extreme ] ] ] ) → a_string
encode_stream(io [, compression_level [, check [, extreme ] ] ] ){|chunk| ... } → an_integer

Compresses a stream of data into XZ-compressed data.

Parameters

io

The IO to read the data from. Must be opened for reading.

compression_level

(6) Compression strength. Higher values produce a smaller result but take longer to compress. Valid values are 0 to 9.

check

(:crc64) The checksum algorithm to use for verifying the data inside the archive. Possible values are:

* :none
* :crc32
* :crc64
* :sha256

extreme

(false) Tries to squeeze the last bit out of the compression. This may succeed, but you can end up with *very* long computation times.

chunk

(Block argument) One piece of compressed data.

Return value

If a block was given, returns the number of bytes written. Otherwise, returns the compressed data as a BINARY-encoded string.

Example

data = File.read("file.txt")
i = StringIO.new(data)
XZ.compress_stream(i) #=> Some binary blob
i.rewind
str = ""
XZ.compress_stream(i, 4, :sha256){|c| str << c} #=> 123
str #=> Some binary blob

Remarks

The block form uses much less memory, because it doesn’t have to hold the whole result in RAM at once. If you don’t know how large your data will get, or if you want to compress a lot of data, use the block form. Of course, you shouldn’t then accumulate the chunks in RAM the way the example above does.
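The two return conventions described here can be sketched with a stand-in compressor. This is a rough illustration rather than the library's implementation: it assumes Ruby's bundled Zlib::Deflate in place of liblzma, and deflate_stream is a hypothetical name. It reads the input in CHUNK_SIZE pieces and either hands each compressed piece to the block (returning a byte count) or collects the pieces into a BINARY-encoded string.

```ruby
require "stringio"
require "zlib"

CHUNK_SIZE = 4096 # same value as XZ::CHUNK_SIZE

# Hypothetical stand-in for compress_stream; uses Zlib, NOT liblzma.
def deflate_stream(io, &block)
  deflater = Zlib::Deflate.new
  buffer   = String.new(encoding: Encoding::BINARY)
  # With a block, pieces go to the caller; without one, they are collected.
  sink  = block || ->(piece) { buffer << piece }
  total = 0

  emit = lambda do |piece|
    next if piece.empty? # the compressor may buffer and return nothing yet
    total += piece.bytesize
    sink.call(piece)
  end

  # IO#read(n) returns nil at end of input, which ends the loop.
  while (chunk = io.read(CHUNK_SIZE))
    emit.call(deflater.deflate(chunk))
  end
  emit.call(deflater.finish) # flush whatever the compressor still holds

  block ? total : buffer
ensure
  deflater&.close
end

input = StringIO.new("I love Ruby" * 10_000)
blob  = deflate_stream(input)  # string form: whole result in memory

input.rewind
seen  = 0
count = deflate_stream(input) { |piece| seen += piece.bytesize }
# block form: count is the number of compressed bytes handed to the block
```

In the block form only one chunk of input and one piece of output need to be alive at a time, which is why it is the safer choice for data of unknown size.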

Raises:

  • (ArgumentError)


# File 'lib/xz.rb', line 156

def compress_stream(io, compression_level = 6, check = :crc64, extreme = false, &block)
  raise(ArgumentError, "Invalid compression level!") unless (0..9).include?(compression_level)
  raise(ArgumentError, "Invalid checksum specified!") unless [:none, :crc32, :crc64, :sha256].include?(check)

  stream = LZMAStream.new
  res = LibLZMA.lzma_easy_encoder(stream.pointer,
                                  compression_level | (extreme ? LibLZMA::LZMA_PRESET_EXTREME : 0),
                                  LibLZMA::LZMA_CHECK[:"lzma_check_#{check}"])

  LZMAError.raise_if_necessary(res)

  res = ""
  res.encode!(Encoding::BINARY)
  if block_given?
    res = lzma_code(io, stream, &block)
  else
    lzma_code(io, stream){|chunk| res << chunk}
  end

  LibLZMA.lzma_end(stream.pointer)

  block_given? ? stream[:total_out] : res
end
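How compress_stream turns its arguments into the values passed to lzma_easy_encoder can be sketched in isolation. The following is a minimal illustration under stated assumptions: encoder_arguments is a hypothetical helper, the plain Hash stands in for the LibLZMA::LZMA_CHECK lookup, and the numeric values are the ones published in liblzma's headers rather than anything read from this gem.

```ruby
# Assumed values from liblzma's public headers (lzma/container.h, lzma/check.h).
LZMA_PRESET_EXTREME = 0x80000000
LZMA_CHECK = {
  lzma_check_none:   0,
  lzma_check_crc32:  1,
  lzma_check_crc64:  4,
  lzma_check_sha256: 10
}.freeze

# Hypothetical helper: validates the arguments the way compress_stream does
# and returns the [preset, check_id] pair that would be handed to liblzma.
def encoder_arguments(compression_level = 6, check = :crc64, extreme = false)
  raise ArgumentError, "Invalid compression level!" unless (0..9).include?(compression_level)
  raise ArgumentError, "Invalid checksum specified!" unless %i[none crc32 crc64 sha256].include?(check)

  # The extreme flag is a single high bit OR-ed onto the 0..9 level.
  preset = compression_level | (extreme ? LZMA_PRESET_EXTREME : 0)
  [preset, LZMA_CHECK.fetch(:"lzma_check_#{check}")]
end

encoder_arguments                   # => [6, 4]
encoder_arguments(9, :sha256, true) # => [2147483657, 10]
```

Because the level occupies only the low bits, the extreme flag never collides with it, so one integer can carry both settings.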

.decompress(str, memory_limit = LibLZMA::UINT64_MAX, flags = [:tell_unsupported_check]) ⇒ Object

Decompresses data in XZ format.

Parameters

str

The data to decompress.

For the other parameters, see the decompress_stream method.

Return value

The decompressed data as a BINARY-encoded string.

Example

comp = File.open("data.xz", "rb"){|f| f.read}
data = XZ.decompress(comp) #=> "I love Ruby"

Remarks

Don’t use this method for large amounts of data: the whole input and the whole result are held in memory, so you may run out of it. Use decompress_file or decompress_stream instead.

Raises:

  • (NotImplementedError)


# File 'lib/xz.rb', line 235

def decompress(str, memory_limit = LibLZMA::UINT64_MAX, flags = [:tell_unsupported_check])
  raise(NotImplementedError, "StringIO isn't available!") unless defined? StringIO
  s = StringIO.new(str)
  decompress_stream(s, memory_limit, flags)
end

.decompress_file(in_file, out_file, memory_limit = LibLZMA::UINT64_MAX, flags = [:tell_unsupported_check]) ⇒ Object

Decompresses in_file and writes the result to out_file.

Parameters

in_file

The path to the file to read from.

out_file

The path of the file to write to. If it exists, it will be overwritten.

For the other parameters, see the decompress_stream method.

Return value

The number of bytes written, i.e. the size of the uncompressed data.

Example

XZ.decompress_file("myfile.txt.xz", "myfile.txt")
XZ.decompress_file("myarchive.tar.xz", "myarchive.tar")

Remarks

This method is safe to use with big files, because files are not loaded into memory completely at once.



# File 'lib/xz.rb', line 255

def decompress_file(in_file, out_file, memory_limit = LibLZMA::UINT64_MAX, flags = [:tell_unsupported_check])
  File.open(in_file, "rb") do |i_file|
    File.open(out_file, "wb") do |o_file|
      decompress_stream(i_file, memory_limit, flags) do |chunk|
        o_file.write(chunk)
      end
    end
  end
end

.decompress_stream(io, memory_limit = LibLZMA::UINT64_MAX, flags = [:tell_unsupported_check], &block) ⇒ Object

Also known as: decode_stream

call-seq:

decompress_stream(io [, memory_limit [, flags ] ] )               → a_string
decompress_stream(io [, memory_limit [, flags ] ] ){|chunk| ... } → an_integer
decode_stream(io [, memory_limit [, flags ] ] )                   → a_string
decode_stream(io [, memory_limit [, flags ] ] ){|chunk| ... }     → an_integer

Decompresses a stream containing XZ-compressed data.

Parameters

io

The IO to read from. It must be opened for reading.

memory_limit

(UINT64_MAX) If not XZ::LibLZMA::UINT64_MAX, makes liblzma use no more memory than memory_limit bytes.

flags

([:tell_unsupported_check]) Additional flags passed to liblzma (an array). Possible flags are:

[:tell_no_check] Emit a warning if the archive has no integrity checksum.
[:tell_unsupported_check] Emit a warning if the archive has an unsupported checksum type.
[:tell_any_check] Passed through to liblzma as LZMA_TELL_ANY_CHECK.
[:concatenated] Decompress concatenated archives.

chunk

(Block argument) One piece of decompressed data.

Return value

If a block was given, returns the number of bytes written. Otherwise, returns the decompressed data as a BINARY-encoded string.

Example

data = File.open("archive.xz", "rb"){|f| f.read}
io = StringIO.new(data)
XZ.decompress_stream(io) #=> "I AM THE DATA"
io.rewind
str = ""
XZ.decompress_stream(io, XZ::LibLZMA::UINT64_MAX, [:tell_no_check]){|c| str << c} #=> 13
str #=> "I AM THE DATA"

Remarks

The block form uses much less memory, because it doesn’t have to hold the whole result in RAM at once. If you don’t know how large your data will get, or if you want to decompress a lot of data, use the block form. Of course, you shouldn’t then accumulate the chunks in RAM the way the example above does.

Raises:

  • (ArgumentError)


# File 'lib/xz.rb', line 87

def decompress_stream(io, memory_limit = LibLZMA::UINT64_MAX, flags = [:tell_unsupported_check], &block)
  raise(ArgumentError, "Invalid memory limit set!") unless (0..LibLZMA::UINT64_MAX).include?(memory_limit)
  flags.each do |flag|
    raise(ArgumentError, "Unknown flag #{flag}!") unless [:tell_no_check, :tell_unsupported_check, :tell_any_check, :concatenated].include?(flag)
  end

  stream = LZMAStream.new
  res = LibLZMA.lzma_stream_decoder(
    stream.pointer,
    memory_limit,
    flags.inject(0){|val, flag| val | LibLZMA.const_get(:"LZMA_#{flag.to_s.upcase}")}
  )

  LZMAError.raise_if_necessary(res)

  res = ""
  res.encode!(Encoding::BINARY)
  if block_given?
    res = lzma_code(io, stream, &block)
  else
    lzma_code(io, stream){|chunk| res << chunk}
  end

  LibLZMA.lzma_end(stream.pointer)

  block_given? ? stream[:total_out] : res
end
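The argument handling at the top of decompress_stream can be sketched on its own: the memory limit is range-checked, and the flag symbols are folded into one bitmask with a bitwise OR. This is an illustration under assumptions, not the gem's code: decoder_arguments is a hypothetical helper, the Hash replaces the LibLZMA.const_get lookup, and the numeric values are those published in liblzma's headers.

```ruby
UINT64_MAX = 2**64 - 1

# Assumed values of liblzma's LZMA_TELL_* / LZMA_CONCATENATED constants.
DECODER_FLAGS = {
  tell_no_check:          0x01,
  tell_unsupported_check: 0x02,
  tell_any_check:         0x04,
  concatenated:           0x08
}.freeze

# Hypothetical helper: validates like decompress_stream and returns the
# [memory_limit, bitmask] pair that would be handed to lzma_stream_decoder.
def decoder_arguments(memory_limit = UINT64_MAX, flags = [:tell_unsupported_check])
  raise ArgumentError, "Invalid memory limit set!" unless (0..UINT64_MAX).include?(memory_limit)
  flags.each do |flag|
    raise ArgumentError, "Unknown flag #{flag}!" unless DECODER_FLAGS.key?(flag)
  end

  # Each flag is a distinct bit, so OR-ing them yields a combined mask.
  bitmask = flags.inject(0) { |val, flag| val | DECODER_FLAGS.fetch(flag) }
  [memory_limit, bitmask]
end

decoder_arguments
# => [18446744073709551615, 2]
decoder_arguments(64 * 1024 * 1024, [:tell_no_check, :concatenated])
# => [67108864, 9]
```

Passing an empty flags array produces a mask of 0, which corresponds to requesting no warnings and no concatenated-archive handling.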