Module: Amp::RevlogSupport::Support

Extended by: Support

Included in: Index, Support

Defined in: lib/amp/revlogs/revlog_support.rb

Constant Summary

REVLOG_VERSION_0 = Old version of the revlog file format

REVLOG_VERSION_NG = Current version of the revlog file format

REVLOG_NG_INLINE_DATA = A flag marking that the data is stored with the index

(1 << 16)

REVLOG_DEFAULT_FLAGS = Default flags - always start inline (turn off inline if file is huge)

REVLOG_NG_INLINE_DATA

REVLOG_DEFAULT_FORMAT = Default format - the most recent

REVLOG_VERSION_NG

REVLOG_DEFAULT_VERSION = Default version in general

REVLOG_DEFAULT_FORMAT | REVLOG_DEFAULT_FLAGS

Instance Method Summary

#compress(text) ⇒ Hash

Returns the possibly-compressed version of the text, in a hash.

#decompress(binary) ⇒ String

Decompresses the given binary text.

#get_offset(o) ⇒ Object

Extracts the offset from a packed index value (value >> 16).

#get_version(t) ⇒ Object

Extracts the version from a packed index value (value & 0xFFFF).

#history_hash(text, p1, p2) ⇒ String

Generates a hash from the given text and its parent hashes.

#offset_version(offset, type) ⇒ Object

Combines an offset and a version into a single packed value.

Instance Method Details

#compress(text) ⇒ `Hash`

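Returns the possibly-compressed version of the text, in a hash.

Returns:

(Hash) —

:compression => 'u' when the text is stored uncompressed and does not start with a NUL byte, '' otherwise; :text => the text itself or its zlib-compressed form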

# File 'lib/amp/revlogs/revlog_support.rb', line 77

def compress(text)
  return {:compression => "", :text => text} if text.empty?
  size = text.size
  binary = nil
  if size < 44
    # too small for zlib to pay off; leave binary nil so the text is stored as-is
  elsif size > 1000000 # big file: deflate it in 1 MiB slices
    deflater = Zlib::Deflate.new
    parts = []
    position = 0
    while position < size
      newposition = position + 2**20
      parts << deflater.deflate(text[position..(newposition-1)], Zlib::NO_FLUSH)
      position = newposition
    end
    # finish (rather than flush, which only sync-flushes) so the zlib stream is terminated
    parts << deflater.finish
    # only keep the result if compression made it smaller
    binary = parts.join if parts.map {|e| e.size}.sum < size
  else # small enough to compress in one call
    binary = Zlib::Deflate.deflate text
  end

  if binary.nil? || binary.size > size
    return {:compression => "",  :text => text} if text[0,1] == "\0"
    return {:compression => 'u', :text => text}
  end
  {:compression => "", :text => binary}
end
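The large-file branch above feeds the text to a single deflater in fixed-size slices. A minimal standalone sketch of that technique follows; `deflate_in_chunks` and its `chunk_size` parameter are hypothetical names used for illustration and are not part of Amp.

```ruby
require 'zlib'

# Hypothetical helper (not part of Amp): deflate a string slice by slice
# through one Zlib::Deflate stream, then terminate the stream.
def deflate_in_chunks(text, chunk_size = 2**20)
  deflater = Zlib::Deflate.new
  parts = []
  position = 0
  while position < text.bytesize
    # NO_FLUSH lets zlib buffer across slices for better compression
    parts << deflater.deflate(text.byteslice(position, chunk_size), Zlib::NO_FLUSH)
    position += chunk_size
  end
  parts << deflater.finish   # writes the final block and checksum
  deflater.close
  parts.join
end

data = "abc" * 5000
packed = deflate_in_chunks(data, 1024)   # small chunk size just for the demo
restored = Zlib::Inflate.inflate(packed)
```

Because every slice goes through the same stream, the joined output is one ordinary zlib stream that a single inflate call can read back.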

#decompress(binary) ⇒ `String`

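Decompresses the given binary text. The text may have been stored uncompressed, and the first byte says which case applies: "\0" means raw binary returned as-is, "x" means a zlib stream, and "u" means uncompressed text with the marker byte stripped off.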

Parameters:

binary (String) —

the text to (possibly) decompress

Returns:

(String) —

the text decompressed

# File 'lib/amp/revlogs/revlog_support.rb', line 111

def decompress(binary)
  return binary if binary.empty?
  case binary[0,1]
  when "\0"
    binary #we're just stored as binary
  when "x"
    Zlib::Inflate.inflate(binary) #we're zlibbed
  when "u"
    binary[1..-1] #we're uncompressed text
  else
    raise LookupError.new("Unknown compression type #{binary[0,1]}")
  end
end
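The first-byte dispatch can be tried in isolation with the sketch below. `pack_chunk` and `unpack_chunk` are hypothetical standalone helpers, not Amp methods: `pack_chunk` assumes the stored form is simply the marker followed by the text, and `unpack_chunk` raises `ArgumentError` where the original raises Amp's `LookupError`.

```ruby
require 'zlib'

# Hypothetical helper: build the stored form of a chunk.
# Assumption: marker and payload are simply concatenated.
def pack_chunk(text)
  return "" if text.empty?
  deflated = Zlib::Deflate.deflate(text)
  if deflated.bytesize < text.bytesize
    deflated            # default zlib streams begin with "x" (0x78)
  elsif text[0, 1] == "\0"
    text                # a leading NUL already marks raw binary
  else
    "u" + text          # explicit "uncompressed" marker
  end
end

# Hypothetical helper: same dispatch as #decompress above.
def unpack_chunk(chunk)
  return chunk if chunk.empty?
  case chunk[0, 1]
  when "\0" then chunk
  when "x"  then Zlib::Inflate.inflate(chunk)
  when "u"  then chunk[1..-1]
  else raise ArgumentError, "Unknown compression type #{chunk[0, 1].inspect}"
  end
end

long_text  = "a" * 200
short_text = "hi"
stored_long  = pack_chunk(long_text)    # compressible, so stored as zlib
stored_short = pack_chunk(short_text)   # zlib would be larger, so stored with "u"
```

The scheme works without a separate header because a zlib stream produced with default settings always starts with the byte "x", so the three cases never collide.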

#get_offset(o) ⇒ `Object`

This bears some explanation.

Rather than simply having a 4-byte header for the index file format, the Mercurial format takes the first entry in the index and stores the header in its offset field. (The offset field is a 64-bit unsigned integer which stores the offset into the data or index of the associated record’s data.) This works because the first entry’s offset is always 0, so it’s safe to store data there.

The format is ((flags << 16) | (version)), where flags is a bitmask (up to 48 bits) and version is a 16-bit unsigned short.

The worst part is, EVERY SINGLE ENTRY has its offset shifted 16 bits to the left, apparently all because of this. It fucking baffles my mind.

So yeah. offset = value >> 16.

# File 'lib/amp/revlogs/revlog_support.rb', line 42

def get_offset(o); o >> 16; end

#get_version(t) ⇒ `Object`

And yeah. version = value & 0xFFFF (last 16 bits)

# File 'lib/amp/revlogs/revlog_support.rb', line 44

def get_version(t); t & 0xFFFF; end
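As a worked example of the header layout, the sketch below builds a header value from a version and the inline-data flag and then splits it apart again. `INLINE_DATA` mirrors `REVLOG_NG_INLINE_DATA` from the constant summary; `VERSION_NG = 1` is an assumption, since this page does not show that constant's value.

```ruby
# VERSION_NG = 1 is an assumed value; the page does not list it.
VERSION_NG  = 1
INLINE_DATA = 1 << 16        # same expression as REVLOG_NG_INLINE_DATA

# The flag constant is already shifted past the low 16 bits,
# so a plain OR combines it with the version.
header = VERSION_NG | INLINE_DATA

version = header & 0xFFFF    # low 16 bits
flags   = header & ~0xFFFF   # everything above the low 16 bits
inline  = (flags & INLINE_DATA) != 0
```

Note that the flag constant carries its own shift, which is why the default version in the constant summary is a simple `REVLOG_DEFAULT_FORMAT | REVLOG_DEFAULT_FLAGS`.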

#history_hash(text, p1, p2) ⇒ `String`

Generates a hash from the given text and its parent hashes.

This hash combines both the current file contents and its history in a manner that makes it easy to distinguish nodes with the same content in the revision graph.

Since an entry in a revlog is pretty much [parent1, parent2, text], we use a hash of the previous entry as a reference to that previous entry. To create a reference to this entry, we make a hash of the first parent (which is just its ID), the second parent, and the text. The two parent IDs are sorted first, so the result does not depend on the order in which they are passed.

Returns:

(String) —

the digest of the two parents and the extra text

# File 'lib/amp/revlogs/revlog_support.rb', line 65

def history_hash(text, p1, p2)
  list = [p1, p2].sort
  s = list[0].sha1
  s.update list[1]
  s.update text
  s.digest
end
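The same computation can be sketched with only the standard library. This assumes that the `String#sha1` call above returns a SHA-1 digest object seeded with the string, which this page does not confirm; `node_id` and the sample parent IDs are hypothetical.

```ruby
require 'digest/sha1'

# Hypothetical standalone equivalent of history_hash.
# Assumption: String#sha1 in Amp yields a SHA-1 digest seeded with the string.
def node_id(text, p1, p2)
  first, second = [p1, p2].sort   # order-independent in the parents
  digest = Digest::SHA1.new
  digest.update(first)
  digest.update(second)
  digest.update(text)
  digest.digest                   # 20 raw bytes
end

parent_a = "\x01" * 20            # made-up 20-byte parent IDs
parent_b = "\x02" * 20

id_ab = node_id("contents", parent_a, parent_b)
id_ba = node_id("contents", parent_b, parent_a)
id_other = node_id("contents", parent_a, parent_a)
```

Swapping the parents gives the same ID, while the same text under different parents gives a different one, which is what lets identical contents be told apart in the revision graph.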

#offset_version(offset, type) ⇒ `Object`

Combines an offset and a version (the type argument) into a single packed value: (offset << 16) | type.



# File 'lib/amp/revlogs/revlog_support.rb', line 47

def offset_version(offset,type)
  (offset << 16) | type
end
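A quick round trip shows how the three one-liners fit together. The methods are copied from this page so the sketch runs on its own; the offset 1234 and type 1 are arbitrary sample values.

```ruby
# Standalone copies of the one-liners documented on this page.
def offset_version(offset, type); (offset << 16) | type; end
def get_offset(o);  o >> 16;    end
def get_version(t); t & 0xFFFF; end

packed = offset_version(1234, 1)   # arbitrary sample values

offset  = get_offset(packed)       # recovers the upper bits
version = get_version(packed)      # recovers the low 16 bits
```

The round trip only holds while the second argument fits in 16 bits; anything larger would spill into the offset bits.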
