Class: Cabriolet::HLP::WinHelp::Decompressor

Inherits:
Object
Defined in:
lib/cabriolet/hlp/winhelp/decompressor.rb

Overview

Decompressor for Windows Help files

Extracts and decompresses content from WinHelp files using:

  • WinHelp::Parser for file structure

  • ZeckLZ77 for topic decompression

Handles both WinHelp 3.x and 4.x formats.

Constructor Details

#initialize(filename, io_system = nil) ⇒ Decompressor

Initialize decompressor

Parameters:

  • filename (String)

    Path to WinHelp file

  • io_system (System::IOSystem, nil) (defaults to: nil)

    Custom I/O system



# File 'lib/cabriolet/hlp/winhelp/decompressor.rb', line 25

def initialize(filename, io_system = nil)
  @filename = filename
  @io_system = io_system || System::IOSystem.new
  @parser = Parser.new(@io_system)
  @zeck = ZeckLZ77.new
  @header = nil
end
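
The `io_system` argument lets callers substitute their own I/O layer. As a minimal sketch, the class below implements only the four calls that `#extract_internal_file` makes on this page (`open`, `seek`, `read`, `close`). `MemoryIOSystem` and the two constants are hypothetical stand-ins, not part of Cabriolet, and `WinHelp::Parser` may well require further methods that are not shown here.

```ruby
require "stringio"

# Hypothetical stand-ins for Constants::MODE_READ and Constants::SEEK_START;
# the real values are not shown on this page.
MODE_READ = :read
SEEK_START = :start

# Hypothetical in-memory I/O system (an assumption, not a Cabriolet class).
# It serves file contents from a Hash instead of the file system.
class MemoryIOSystem
  def initialize(files)
    @files = files # { "name" => "binary contents" }
  end

  # Returns a StringIO handle; raises KeyError for unknown names.
  def open(filename, _mode)
    StringIO.new(@files.fetch(filename).b)
  end

  # Only absolute seeks are modelled, so the whence argument is ignored.
  def seek(handle, offset, _whence)
    handle.seek(offset, IO::SEEK_SET)
  end

  def read(handle, size)
    handle.read(size)
  end

  def close(handle)
    handle.close
  end
end

io = MemoryIOSystem.new("demo.hlp" => "HEADERpayload")
handle = io.open("demo.hlp", MODE_READ)
begin
  io.seek(handle, 6, SEEK_START)
  puts io.read(handle, 7) # prints "payload"
ensure
  io.close(handle)
end
```

An object like this could be handy in tests, since it avoids touching the disk, provided it covers whatever else the parser calls.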

Instance Attribute Details

#header ⇒ Object (readonly)

Returns the value of attribute header.



# File 'lib/cabriolet/hlp/winhelp/decompressor.rb', line 19

def header
  @header
end

#io_system ⇒ Object (readonly)

Returns the value of attribute io_system.



# File 'lib/cabriolet/hlp/winhelp/decompressor.rb', line 19

def io_system
  @io_system
end

Instance Method Details

#decompress_topic(compressed_data, output_size) ⇒ String

Decompress topic data using Zeck LZ77

Parameters:

  • compressed_data (String)

    Compressed topic data

  • output_size (Integer)

    Expected decompressed size

Returns:

  • (String)

    Decompressed topic text



# File 'lib/cabriolet/hlp/winhelp/decompressor.rb', line 89

def decompress_topic(compressed_data, output_size)
  @zeck.decompress(compressed_data, output_size)
end
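
The method delegates entirely to `ZeckLZ77`, whose internals are not shown on this page. To illustrate the general idea of LZ77-style decoding with an expected output size, here is a minimal sketch. The token layout is an assumption chosen for illustration (a flag byte read least-significant bit first, with each set bit introducing a 16-bit little-endian token holding a 4-bit length and a 12-bit distance); it may not match what `ZeckLZ77` actually implements, and `lz77_sketch_decompress` is a hypothetical name.

```ruby
# Decode a flag-byte LZ77 stream (token layout is an assumption, see above).
def lz77_sketch_decompress(input, output_size)
  bytes = input.bytes
  out = []
  pos = 0

  while pos < bytes.length && out.length < output_size
    flags = bytes[pos]
    pos += 1

    8.times do |bit|
      break if pos >= bytes.length || out.length >= output_size

      if flags[bit] == 1
        raise ArgumentError, "truncated back-reference" if pos + 1 >= bytes.length

        token = bytes[pos] | (bytes[pos + 1] << 8)
        pos += 2
        length = (token >> 12) + 3
        distance = (token & 0x0FFF) + 1
        raise ArgumentError, "back-reference before start of output" if distance > out.length

        # Copy byte by byte so overlapping references repeat recent output.
        length.times do
          break if out.length >= output_size
          out << out[-distance]
        end
      else
        # Clear flag bit: the next input byte is a literal.
        out << bytes[pos]
        pos += 1
      end
    end
  end

  out.pack("C*")
end

# Three literals ("abc") followed by one back-reference (distance 3, length 6).
packed = [0x08, 0x61, 0x62, 0x63, 0x02, 0x30].pack("C*")
puts lz77_sketch_decompress(packed, 9) # prints "abcabcabc"
```

Passing the expected size up front lets the decoder stop as soon as the output is complete, which is presumably why `#decompress_topic` takes `output_size`.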

#extract_all(output_dir) ⇒ Integer

Extract all files to a directory

Parameters:

  • output_dir (String)

    Output directory path

Returns:

  • (Integer)

    Number of files extracted



# File 'lib/cabriolet/hlp/winhelp/decompressor.rb', line 118

def extract_all(output_dir)
  parse unless @header

  FileUtils.mkdir_p(output_dir)

  count = 0
  @header.internal_files.each do |file_entry|
    data = extract_internal_file(file_entry[:filename])
    next unless data

    # Sanitize filename for file system
    safe_name = sanitize_filename(file_entry[:filename])
    output_path = File.join(output_dir, safe_name)

    File.binwrite(output_path, data)
    count += 1
  end

  count
end

#extract_internal_file(filename) ⇒ String?

Extract a specific internal file by name

Parameters:

  • filename (String)

    Internal filename (e.g., “|SYSTEM”, “|TOPIC”)

Returns:

  • (String, nil)

    Raw file data or nil if not found



# File 'lib/cabriolet/hlp/winhelp/decompressor.rb', line 44

def extract_internal_file(filename)
  parse unless @header

  file_entry = @header.find_file(filename)
  return nil unless file_entry

  # Use file_offset if available (B+ tree format), otherwise fall back to starting_block
  if file_entry[:file_offset]
    file_offset = file_entry[:file_offset]
  else
    # Calculate file offset from starting block (WinHelp 3.x format)
    # Block size is typically 4096 bytes
    block_size = 4096
    file_offset = file_entry[:starting_block] * block_size
  end

  # Open the WinHelp file and seek to file data
  handle = @io_system.open(@filename, Constants::MODE_READ)
  begin
    @io_system.seek(handle, file_offset, Constants::SEEK_START)
    @io_system.read(handle, file_entry[:file_size])
  ensure
    @io_system.close(handle)
  end
end
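
The offset selection above can be tried in isolation. The sketch below mirrors that logic against an in-memory buffer: it prefers an explicit `:file_offset` and otherwise multiplies `:starting_block` by the 4096-byte block size. `resolve_file_offset`, `read_entry`, and `DEFAULT_BLOCK_SIZE` are hypothetical names, and the entry hashes are made-up examples that use the same keys as the method.

```ruby
require "stringio"

DEFAULT_BLOCK_SIZE = 4096 # same fallback block size as the method above

# Prefer an explicit byte offset; otherwise derive it from the block index.
def resolve_file_offset(entry, block_size = DEFAULT_BLOCK_SIZE)
  return entry[:file_offset] if entry[:file_offset]

  entry.fetch(:starting_block) * block_size
end

# Seek to the resolved offset and read exactly :file_size bytes.
def read_entry(io, entry)
  io.seek(resolve_file_offset(entry))
  io.read(entry.fetch(:file_size))
end

# Fake file image: two empty 4096-byte blocks followed by ten data bytes.
image = ("\0" * 8192 + "SYSTEMDATA").b
io = StringIO.new(image)

puts read_entry(io, { starting_block: 2, file_size: 10 }) # prints "SYSTEMDATA"
puts read_entry(io, { file_offset: 8198, file_size: 4 })  # prints "DATA"
```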

#extract_system_file ⇒ String?

Extract |SYSTEM file data

Returns:

  • (String, nil)

    System file data



# File 'lib/cabriolet/hlp/winhelp/decompressor.rb', line 73

def extract_system_file
  extract_internal_file("|SYSTEM")
end

#extract_topic_file ⇒ String?

Extract |TOPIC file data

Returns:

  • (String, nil)

    Topic file data (compressed)



# File 'lib/cabriolet/hlp/winhelp/decompressor.rb', line 80

def extract_topic_file
  extract_internal_file("|TOPIC")
end

#extract_topics ⇒ Array<Hash>

Extract all topics from |TOPIC file

This is a simplified implementation: it returns the whole |TOPIC file as a single entry of raw, still-compressed data. A full implementation would parse the topic block headers and extract individual topics.

Returns:

  • (Array<Hash>)

    Array of topic hashes with :index, :data, and :compressed keys (empty if there is no |TOPIC file)



# File 'lib/cabriolet/hlp/winhelp/decompressor.rb', line 99

def extract_topics
  parse unless @header

  topic_data = extract_topic_file
  return [] unless topic_data

  # For now, return the raw topic data
  # Full implementation would parse topic block headers
  [{
    index: 0,
    data: topic_data,
    compressed: true,
  }]
end

#has_system_file? ⇒ Boolean

Check if |SYSTEM file exists

Returns:

  • (Boolean)

    true if |SYSTEM present



# File 'lib/cabriolet/hlp/winhelp/decompressor.rb', line 177

def has_system_file?
  parse unless @header
  @header.has_system_file?
end

#has_topic_file? ⇒ Boolean

Check if |TOPIC file exists

Returns:

  • (Boolean)

    true if |TOPIC present



# File 'lib/cabriolet/hlp/winhelp/decompressor.rb', line 185

def has_topic_file?
  parse unless @header
  @header.has_topic_file?
end

#internal_filenames ⇒ Array<String>

Get list of internal filenames

Returns:

  • (Array<String>)

    Internal file names



# File 'lib/cabriolet/hlp/winhelp/decompressor.rb', line 169

def internal_filenames
  parse unless @header
  @header.internal_filenames
end

#parse ⇒ Models::WinHelpHeader

Parse the WinHelp file structure

Returns:

  • (Models::WinHelpHeader)

    Parsed header



# File 'lib/cabriolet/hlp/winhelp/decompressor.rb', line 36

def parse
  @header = @parser.parse(@filename)
end

#sanitize_filename(filename) ⇒ String

Sanitize filename for file system

Parameters:

  • filename (String)

    Internal filename

Returns:

  • (String)

    Safe filename



# File 'lib/cabriolet/hlp/winhelp/decompressor.rb', line 143

def sanitize_filename(filename)
  # Encode to ASCII, replacing invalid bytes and non-ASCII characters with _
  # (ASCII control characters pass through unchanged)
  sanitized = filename.encode("ASCII", invalid: :replace,
                                       undef: :replace, replace: "_")

  # Replace | with _pipe_ (after encoding to handle | correctly)
  sanitized = sanitized.gsub("|", "_pipe_")

  # Replace remaining invalid filename characters with _
  sanitized = sanitized.gsub(/[\/\\:<>"|?*]/, "_")

  # Replace multiple consecutive underscores with single underscore
  sanitized = sanitized.squeeze("_")

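  # Remove leading/trailing underscores (\A and \z anchor to the whole string,
  # whereas ^ and $ would also match at embedded line breaks)
  sanitized = sanitized.gsub(/\A_+|_+\z/, "")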

  # Use default name if empty
  sanitized = "_unnamed_file_" if sanitized.empty?

  sanitized
end
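
To make the effect of each step concrete, the sketch below repeats the same transformations as a standalone method and runs them on a few sample names. It is a copy for illustration only; the sample inputs are made up.

```ruby
# Standalone copy of the sanitizing steps shown above, for experimentation.
def sanitize_filename(filename)
  sanitized = filename.encode("ASCII", invalid: :replace,
                                       undef: :replace, replace: "_")
  sanitized = sanitized.gsub("|", "_pipe_")          # "|SYSTEM" -> "_pipe_SYSTEM"
  sanitized = sanitized.gsub(/[\/\\:<>"|?*]/, "_")   # path and shell characters
  sanitized = sanitized.squeeze("_")                 # collapse runs of underscores
  sanitized = sanitized.gsub(/^_+|_+$/, "")          # trim leading/trailing _
  sanitized = "_unnamed_file_" if sanitized.empty?
  sanitized
end

puts sanitize_filename("|SYSTEM")     # prints "pipe_SYSTEM"
puts sanitize_filename("a/b\\c:d")    # prints "a_b_c_d"
puts sanitize_filename("h\u00e9llo")  # prints "h_llo"
puts sanitize_filename("|")           # prints "pipe"
puts sanitize_filename("")            # prints "_unnamed_file_"
```

Note that the leading underscore of `_pipe_` is trimmed by the final cleanup step, so `|SYSTEM` ends up as `pipe_SYSTEM` rather than `_pipe_SYSTEM`.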