Class: Mapi::Pst
Defined Under Namespace
Modules: Desc2, Index2
Classes: Attachment, AttachmentTable, BlockParser, CompressibleEncryption, Desc, Desc64, FormatError, Header, ID2Assoc, ID2Assoc64, ID2Mapping, Index, Index64, Item, RangesIOEncryptable, RangesIOID2, RangesIOIdxChain, RawPropertyStore, RawPropertyStoreTable, Recipient, RecipientTable, TablePtr
Constant Summary
- ToTree = Module.new
  this is the index and desc record loading code
- ITEM_COUNT_OFFSET = 0x1f0
  more constants from libpst.c; these relate to the index block. this one is the count byte.
- LEVEL_INDICATOR_OFFSET = 0x1f3
  node or leaf.
- BACKLINK_OFFSET = 0x1f8
  backlink id, which the loading code checks against the value its caller expects.
- ITEM_COUNT_OFFSET_64 = 0x1e8
  mostly guesses.
- LEVEL_INDICATOR_OFFSET_64 = 0x1eb
  a difference of 3 between these two, as above.
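As a rough illustration of how these offsets are used, the sketch below reads the count byte, the level indicator and the backlink out of a block trailer. The 512-byte block size, the 4-byte little-endian backlink and the helper name `describe_block` are assumptions made for this example; they are not taken from the library.

```ruby
# Offsets copied from the constant summary. Everything else here is hypothetical.
ITEM_COUNT_OFFSET      = 0x1f0
LEVEL_INDICATOR_OFFSET = 0x1f3
BACKLINK_OFFSET        = 0x1f8

# Decode the trailer fields of one index block (assumed to be 512 bytes).
def describe_block(buf)
  {
    item_count: buf.getbyte(ITEM_COUNT_OFFSET),            # count byte
    leaf:       buf.getbyte(LEVEL_INDICATOR_OFFSET) == 0,  # 0 means leaf, otherwise node
    backlink:   buf[BACKLINK_OFFSET, 4].unpack('V').first  # assumed 32-bit little endian
  }
end

# Build a fake block so the example runs without a pst file.
block = ("\0" * 512).b
block.setbyte(ITEM_COUNT_OFFSET, 3)
block.setbyte(LEVEL_INDICATOR_OFFSET, 1)
block[BACKLINK_OFFSET, 4] = [0x21].pack('V')

p describe_block(block)
```

`String#getbyte` is used instead of `buf[offset]` because the latter returns a one-character string rather than an integer on Ruby 1.9 and later.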
Instance Attribute Summary
- #desc ⇒ Object (readonly)
  Returns the value of attribute desc.
- #header ⇒ Object (readonly)
  Returns the value of attribute header.
- #idx ⇒ Object (readonly)
  Returns the value of attribute idx.
- #io ⇒ Object (readonly)
  Returns the value of attribute io.
- #special_folder_ids ⇒ Object (readonly)
  Returns the value of attribute special_folder_ids.
Class Method Summary
- .make_property_set(property_list) ⇒ Object
  higher level item code.
- .unpack(str, unpack_spec) ⇒ Object
  unfortunately there is no Q analogue which is little endian only.
Instance Method Summary
- #desc_from_id(id) ⇒ Object
  as for idx_from_id, but for desc records.
- #dump_debug_info ⇒ Object
  other random code.
- #each(&block) ⇒ Object
- #encrypted? ⇒ Boolean
- #id2_block_idx_chain(idx) ⇒ Object
  corresponds to: _pst_ff_getID2block, _pst_ff_getID2data, _pst_ff_compile_ID.
- #idx_from_id(id) ⇒ Object
  most access to idx objects will use this function.
- #initialize(io) ⇒ Pst (constructor)
  corresponds to: pst_open, pst_load_index.
- #inspect ⇒ Object
- #load_desc ⇒ Object
  corresponds to: _pst_build_desc_ptr, record_descriptor.
- #load_desc_rec(offset, linku1, start_val) ⇒ Object
  load the flat list of desc records recursively.
- #load_idx ⇒ Object
  corresponds to: _pst_build_id_ptr.
- #load_idx2(idx) ⇒ Object
- #load_idx2_rec(idx) ⇒ Object
  corresponds to: _pst_build_id2.
- #load_idx_rec(offset, linku1, start_val) ⇒ Object
  load the flat idx table, which maps ids to file ranges.
- #load_xattrib ⇒ Object
  corresponds to: pst_load_extended_attributes.
- #name ⇒ Object
- #pst_parse_item(desc) ⇒ Object
  corresponds to: _pst_parse_item.
- #pst_read_block_size(offset, size, decrypt = true) ⇒ Object
  corresponds to: _pst_read_block_size, _pst_read_block ??, _pst_ff_getIDblock_dec ??, _pst_ff_getIDblock ??.
- #root ⇒ Object
- #root_desc ⇒ Object
- #root_item ⇒ Object
- #warn(s) ⇒ Object
  until i properly fix logging…
Constructor Details
#initialize(io) ⇒ Pst
corresponds to
- pst_open
- pst_load_index

# File 'lib/mapi/pst.rb', line 265

def initialize io
  @io = io
  io.pos = 0
  @header = Header.new io.read(Header::SIZE)

  # would prefer this to be in Header#validate, but it doesn't have the io size.
  # should perhaps downgrade this to just be a warning...
  raise FormatError, "header size field invalid (#{header.size} != #{io.size})" unless header.size == io.size

  load_idx
  load_desc
  load_xattrib

  @special_folder_ids = {}
end
Instance Attribute Details
#desc ⇒ Object (readonly)
Returns the value of attribute desc.

# File 'lib/mapi/pst.rb', line 260

def desc
  @desc
end

#header ⇒ Object (readonly)
Returns the value of attribute header.

# File 'lib/mapi/pst.rb', line 260

def header
  @header
end

#idx ⇒ Object (readonly)
Returns the value of attribute idx.

# File 'lib/mapi/pst.rb', line 260

def idx
  @idx
end

#io ⇒ Object (readonly)
Returns the value of attribute io.

# File 'lib/mapi/pst.rb', line 260

def io
  @io
end

#special_folder_ids ⇒ Object (readonly)
Returns the value of attribute special_folder_ids.

# File 'lib/mapi/pst.rb', line 260

def special_folder_ids
  @special_folder_ids
end
Class Method Details
.make_property_set(property_list) ⇒ Object
higher level item code. wraps up the raw properties above, and gives nice objects to work with. handles item relationships too.

# File 'lib/mapi/pst.rb', line 1502

def self.make_property_set property_list
  hash = property_list.inject({}) do |hash, (key, type, value)|
    hash.update PropertySet::Key.new(key) => value
  end
  PropertySet.new hash
end
.unpack(str, unpack_spec) ⇒ Object
unfortunately there is no Q analogue which is little endian only. this method translates T as an unsigned quad word in little endian byte order, so that the workaround does not pollute the rest of the code.
didn't want to override String#unpack, because that would be too hacky, and incomplete.

# File 'lib/mapi/pst.rb', line 74

def self.unpack str, unpack_spec
  return str.unpack(unpack_spec) unless unpack_spec['T']
  @unpack_cache ||= {}
  t_offsets, new_spec = @unpack_cache[unpack_spec]
  unless t_offsets
    t_offsets = []
    offset = 0
    new_spec = ''
    unpack_spec.scan(/([^\d])_?(\*|\d+)?/o) do
      num_elems = $1.downcase == 'a' ? 1 : ($2 || 1).to_i
      if $1 == 'T'
        num_elems.times { |i| t_offsets << offset + i }
        new_spec << "V#{num_elems * 2}"
      else
        new_spec << $~[0]
      end
      offset += num_elems
    end
    @unpack_cache[unpack_spec] = [t_offsets, new_spec]
  end
  a = str.unpack(new_spec)
  t_offsets.each do |offset|
    low, high = a[offset, 2]
    a[offset, 2] = low && high ? low + (high << 32) : nil
  end
  a
end
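The core of the T translation is reading two 32-bit little endian words and recombining them. The following is a minimal sketch of that idea on its own; `unpack_le64` is a hypothetical helper and not part of Mapi::Pst. Ruby 1.9.3 and later also accept the `Q<` directive, which the last line uses as a cross-check.

```ruby
# Hypothetical helper: decode `count` unsigned 64-bit little endian integers
# by unpacking pairs of 32-bit words, low word first.
def unpack_le64(str, count)
  words = str.unpack("V#{count * 2}")
  words.each_slice(2).map do |low, high|
    # a short buffer yields nil words; report nil for that value
    low && high ? low + (high << 32) : nil
  end
end

data = [1, 0, 0xffffffff, 0xffffffff].pack('V4')
p unpack_le64(data, 2)
p unpack_le64(data, 2) == data.unpack('Q<2')
```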
Instance Method Details
#desc_from_id(id) ⇒ Object
as for idx
corresponds to:
- _pst_getDptr

# File 'lib/mapi/pst.rb', line 748

def desc_from_id id
  @desc_from_id[id]
end
#dump_debug_info ⇒ Object
other random code

# File 'lib/mapi/pst.rb', line 1689

def dump_debug_info
  puts "* pst header"
  p header

=begin
Looking at the output of this, for blank-o1997.pst, i see this part:
...
- (26624,516) desc block data (overlap of 4 bytes)
- (27136,516) desc block data (gap of 508 bytes)
- (28160,516) desc block data (gap of 2620 bytes)
...

which confirms my belief that the block size for idx and desc is more likely 512
=end

  if 0 + 0 == 0
    puts '* file range usage'
    file_ranges =
      # these 3 things, should account for most of the data in the file.
      [[0, Header::SIZE, 'pst file header']] +
      @idx_offsets.map { |offset| [offset, Index::BLOCK_SIZE, 'idx block data'] } +
      @desc_offsets.map { |offset| [offset, Desc::BLOCK_SIZE, 'desc block data'] } +
      @idx.map { |idx| [idx.offset, idx.size, 'idx id=0x%x (%s)' % [idx.id, idx.type]] }
    (file_ranges.sort_by { |idx| idx.first } + [nil]).to_enum(:each_cons, 2).each do |(offset, size, name), next_record|
      # i think there is a padding of the size out to 64 bytes
      # which is equivalent to padding out the final offset, because i think the offset is
      # similarly oriented
      pad_amount = 64
      warn 'i am wrong about the offset padding' if offset % pad_amount != 0
      # so, assuming i'm not wrong about that, then we can calculate how much padding is needed.
      pad = pad_amount - (size % pad_amount)
      pad = 0 if pad == pad_amount
      gap = next_record ? next_record.first - (offset + size + pad) : 0
      extra = case gap <=> 0
        when -1; ["overlap of #{gap.abs} bytes"]
        when  0; []
        when +1; ["gap of #{gap} bytes"]
      end
      # how about we check that padding
      @io.pos = offset + size
      pad_bytes = @io.read(pad)
      extra += ["padding not all zero"] unless pad_bytes == 0.chr * pad
      puts "- #{offset}:#{size}+#{pad} #{name.inspect}" + (extra.empty? ? '' : ' [' + extra * ', ' + ']')
    end
  end

  # i think the idea of the idx, and indeed the idx2, is just to be able to
  # refer to data indirectly, which means it can get moved around, and you just update
  # the idx table. it is simply a list of file offsets and sizes.
  # not sure i get how id2 plays into it though....
  # the sizes seem to be all even. is that a coincidence? and the ids are all even. that
  # seems to be related to something else (see the (id & 2) == 1 stuff)
  puts '* idx entries'
  @idx.each { |idx| puts "- #{idx.inspect}" }

  # if you look at the desc tree, you notice a few things:
  # 1. there is a desc that seems to be the parent of all the folders, messages etc.
  #    it is the one whose parent is itself.
  #    one of its children is referenced as the subtree_entryid of the first desc item,
  #    the root.
  # 2. typically only 2 types of desc records have idx2_id != 0. messages themselves,
  #    and the desc with id = 0x61 - the xattrib container. everything else uses the
  #    regular ids to find its data. i think it should be reframed as small blocks and
  #    big blocks, but i'll look into it more.
  #
  # idx_id and idx2_id are for getting to the data. desc_id and parent_desc_id just define
  # the parent <-> child relationship, and the desc_ids are how the items are referred to in
  # entryids.
  # note that these aren't unique! eg for 0, 4 etc. i expect these'd never change, as the ids
  # are stored in entryids. whereas the idx and idx2 could be a bit more volatile.

  puts '* desc tree'
  # make a dummy root hold everything just for convenience
  root = Desc.new ''
  def root.inspect; "#<Pst::Root>"; end
  root.children.replace @orphans
  # this still loads the whole thing as a string for gsub. should use direct output io
  # version.
  puts root.to_tree.gsub(/, (parent_desc_id|idx2_id)=0x0(?!\d)/, '')

  # this is fairly easy to understand, it's just an attempt to display the pst items in a tree form
  # which resembles what you'd see in outlook.
  puts '* item tree'
  # now streams directly
  root_item.to_tree STDOUT
end
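The padding and gap arithmetic in the file range report can be tried in isolation. This is a small sketch with hypothetical helper names (`padding_for`, `gap_between`); the 64-byte alignment is the same guess that the method's comments make, not an established fact about the format.

```ruby
# Hypothetical helper: bytes needed to round size up to a multiple of pad_amount.
def padding_for(size, pad_amount = 64)
  (pad_amount - size % pad_amount) % pad_amount
end

# Positive result: gap before the next range. Negative result: overlap.
def gap_between(offset, size, next_offset)
  next_offset - (offset + size + padding_for(size))
end

p padding_for(100)           # padding for a 100 byte range
p gap_between(0, 100, 256)   # next range starts well after the padded end
p gap_between(0, 100, 64)    # next range starts inside the padded range
```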
#each(&block) ⇒ Object

# File 'lib/mapi/pst.rb', line 1791

def each(&block)
  root = self.root
  block[root]
  root.each_recursive(&block)
end
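The traversal pattern here is "yield the root, then yield every descendant". The sketch below reproduces it on a toy tree. `Node`, `each_item` and the depth-first behaviour of `each_recursive` are assumptions for illustration, since the real `each_recursive` is not shown on this page.

```ruby
Node = Struct.new(:name, :children) do
  # Assumed behaviour: yield every descendant depth first, but not the receiver.
  def each_recursive(&block)
    children.each do |child|
      block[child]
      child.each_recursive(&block)
    end
  end
end

# Same shape as Pst#each: the root first, then everything below it.
def each_item(root, &block)
  block[root]
  root.each_recursive(&block)
end

tree = Node.new('root', [
  Node.new('inbox', [Node.new('mail', [])]),
  Node.new('sent', [])
])

names = []
each_item(tree) { |node| names << node.name }
p names
```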
#encrypted? ⇒ Boolean

# File 'lib/mapi/pst.rb', line 281

def encrypted?
  @header.encrypted?
end
#id2_block_idx_chain(idx) ⇒ Object
corresponds to:
- _pst_ff_getID2block
- _pst_ff_getID2data
- _pst_ff_compile_ID

# File 'lib/mapi/pst.rb', line 911

def id2_block_idx_chain idx
  if (idx.id & 0x2) == 0
    [idx]
  else
    buf = idx.read
    type, fdepth, count = buf[0, 4].unpack 'CCv'
    unless type == 1 # libpst.c:3958
      warn 'Error in idx_chain - %p, %p, %p - attempting to ignore' % [type, fdepth, count]
      return [idx]
    end
    # there are 4 unaccounted for bytes here, 4...8
    if header.version_2003?
      # T is only understood by Pst.unpack, not by String#unpack
      ids = Pst.unpack buf[8, count * 8], "T#{count}"
    else
      ids = buf[8, count * 4].unpack('V*')
    end
    if fdepth == 1
      ids.map { |id| idx_from_id id }
    else
      ids.map { |id| id2_block_idx_chain idx_from_id(id) }.flatten
    end
  end
end
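The chain expansion can be modelled without a pst file. In this sketch a plain hash stands in for the idx table, ids are assumed to be 32-bit, and `chain_ids` and `chain_block` are hypothetical names; it only mirrors the control flow of the method above.

```ruby
# Expand an id into the list of data block ids it ultimately refers to.
def chain_ids(blocks, id)
  return [id] if (id & 0x2) == 0             # bit 1 clear: plain data block
  buf = blocks.fetch(id)
  type, fdepth, count = buf[0, 4].unpack('CCv')
  return [id] unless type == 1               # not a chain block after all
  ids = buf[8, count * 4].unpack('V*')       # bytes 4...8 are skipped
  fdepth == 1 ? ids : ids.flat_map { |child| chain_ids(blocks, child) }
end

# Build the raw bytes of a chain block for the toy table.
def chain_block(fdepth, ids)
  [1, fdepth, ids.size].pack('CCv') + "\0" * 4 + ids.pack('V*')
end

blocks = {
  0x12 => chain_block(2, [0x16, 0x1a]),  # two levels deep
  0x16 => chain_block(1, [0x20, 0x24]),
  0x1a => chain_block(1, [0x28])
}

p chain_ids(blocks, 0x12)
p chain_ids(blocks, 0x20)
```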
#idx_from_id(id) ⇒ Object
most access to idx objects will use this function
corresponds to
- _pst_getID

# File 'lib/mapi/pst.rb', line 652

def idx_from_id id
  @idx_from_id[id]
end
#inspect ⇒ Object

# File 'lib/mapi/pst.rb', line 1801

def inspect
  "#<Pst name=#{name.inspect} io=#{io.inspect}>"
end
#load_desc ⇒ Object
corresponds to
- _pst_build_desc_ptr
- record_descriptor

# File 'lib/mapi/pst.rb', line 659

def load_desc
  @desc = []
  @desc_offsets = []
  if header.version_2003?
    @desc = Desc64.load_chain io, header
    @desc.each { |desc| desc.pst = self }
  else
    load_desc_rec header.index2, header.index2_count, 0x21
  end

  # first create a lookup cache
  @desc_from_id = {}
  @desc.each do |desc|
    desc.pst = self
    warn "there are duplicate desc records with id #{desc.desc_id}" if @desc_from_id[desc.desc_id]
    @desc_from_id[desc.desc_id] = desc
  end

  # now turn the flat list of loaded desc records into a tree

  # well, they have no parent, so they're more like, the toplevel descs.
  @orphans = []
  # now assign each node to the parents child array, putting the orphans in the above
  @desc.each do |desc|
    parent = @desc_from_id[desc.parent_desc_id]
    # note, besides this, its possible to create other circular structures.
    if parent == desc
      # this actually happens usually, for the root_item it appears.
      #warn "desc record's parent is itself (#{desc.inspect})"
    # maybe add some more checks in here for circular structures
    elsif parent
      parent.children << desc
      next
    end
    @orphans << desc
  end

  # maybe change this to some sort of sane-ness check. orphans are expected
  # warn "have #{@orphans.length} orphan desc record(s)." unless @orphans.empty?
end
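The second half of load_desc turns a flat list into a tree by looking up each record's parent id. A stripped-down sketch of that step follows; `Rec` and `build_tree` are hypothetical stand-ins for Desc and the instance variables, and the ids are made up.

```ruby
Rec = Struct.new(:id, :parent_id, :children)

# Attach each record to its parent and return the records left without one.
# A record whose parent is itself is treated as top level, as in load_desc.
def build_tree(records)
  by_id = {}
  records.each { |rec| by_id[rec.id] = rec }

  orphans = []
  records.each do |rec|
    parent = by_id[rec.parent_id]
    if parent && !parent.equal?(rec)
      parent.children << rec
    else
      orphans << rec
    end
  end
  orphans
end

records = [[0x21, 0x21], [0x122, 0x21], [0x8022, 0x122], [0x61, 0]].map do |id, parent_id|
  Rec.new(id, parent_id, [])
end

tops = build_tree(records)
p tops.map(&:id)
p records[0].children.map(&:id)
```

Identity (`equal?`) is used for the self-parent check so that the comparison does not recurse through the `children` arrays.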
#load_desc_rec(offset, linku1, start_val) ⇒ Object
load the flat list of desc records recursively
corresponds to
- _pst_build_desc_ptr
- record_descriptor

# File 'lib/mapi/pst.rb', line 705

def load_desc_rec offset, linku1, start_val
  @desc_offsets << offset

  buf = pst_read_block_size offset, Desc::BLOCK_SIZE, false
  item_count = buf[ITEM_COUNT_OFFSET]

  # not real desc
  desc = Desc.new buf[BACKLINK_OFFSET, 4]
  raise 'blah 1' unless desc.desc_id == linku1

  if buf[LEVEL_INDICATOR_OFFSET] == 0
    # leaf pointers
    raise "have too many active items in index (#{item_count})" if item_count > Desc::COUNT_MAX
    # split the data into item_count desc objects
    buf[0, Desc::SIZE * item_count].scan(/.{#{Desc::SIZE}}/mo).each_with_index do |data, i|
      desc = Desc.new data
      # first entry
      raise 'blah 3' if i == 0 and start_val != 0 and desc.desc_id != start_val
      # this shouldn't really happen i'd imagine
      break if desc.desc_id == 0
      @desc << desc
    end
  else
    # node pointers
    raise "have too many active items in index (#{item_count})" if item_count > Index::COUNT_MAX
    # split the data into item_count table pointers
    buf[0, TablePtr::SIZE * item_count].scan(/.{#{TablePtr::SIZE}}/mo).each_with_index do |data, i|
      table = TablePtr.new data
      # for the first value, we expect the start to be equal note that ids -1, so even for the
      # first we expect it to be equal. thats the 0x21 (dec 33) desc record. this means we assert
      # that the first desc record is always 33...
      raise 'blah 3' if i == 0 and start_val != -1 and table.start != start_val
      # this shouldn't really happen i'd imagine
      break if table.start == 0
      load_desc_rec table.offset, table.u1, table.start
    end
  end
end
#load_idx ⇒ Object
corresponds to
- _pst_build_id_ptr

# File 'lib/mapi/pst.rb', line 588

def load_idx
  @idx = []
  @idx_offsets = []
  if header.version_2003?
    @idx = Index64.load_chain io, header
    @idx.each { |idx| idx.pst = self }
  else
    load_idx_rec header.index1, header.index1_count, 0
  end

  # we'll typically be accessing by id, so create a hash as a lookup cache
  @idx_from_id = {}
  @idx.each do |idx|
    warn "there are duplicate idx records with id #{idx.id}" if @idx_from_id[idx.id]
    @idx_from_id[idx.id] = idx
  end
end
#load_idx2(idx) ⇒ Object

# File 'lib/mapi/pst.rb', line 856

def load_idx2 idx
  if header.version_2003?
    id2 = ID2Assoc64.load_chain idx
  else
    id2 = load_idx2_rec idx
  end
  ID2Mapping.new self, id2
end
#load_idx2_rec(idx) ⇒ Object
corresponds to
- _pst_build_id2

# File 'lib/mapi/pst.rb', line 867

def load_idx2_rec idx
  # i should perhaps use a idx chain style read here?
  buf = pst_read_block_size idx.offset, idx.size, false
  type, count = buf.unpack 'v2'
  unless type == 0x0002
    raise 'unknown id2 type 0x%04x' % type
    #return
  end
  id2 = []
  count.times do |i|
    assoc = ID2Assoc.new buf[4 + ID2Assoc::SIZE * i, ID2Assoc::SIZE]
    id2 << assoc
    if assoc.table2 != 0
      id2 += load_idx2_rec idx_from_id(assoc.table2)
    end
  end
  id2
end
#load_idx_rec(offset, linku1, start_val) ⇒ Object
load the flat idx table, which maps ids to file ranges. this is the recursive helper
corresponds to
- _pst_build_id_ptr

# File 'lib/mapi/pst.rb', line 610

def load_idx_rec offset, linku1, start_val
  @idx_offsets << offset

  #_pst_read_block_size(pf, offset, BLOCK_SIZE, &buf, 0, 0) < BLOCK_SIZE)
  buf = pst_read_block_size offset, Index::BLOCK_SIZE, false

  item_count = buf[ITEM_COUNT_OFFSET]
  raise "have too many active items in index (#{item_count})" if item_count > Index::COUNT_MAX

  idx = Index.new buf[BACKLINK_OFFSET, Index::SIZE]
  raise 'blah 1' unless idx.id == linku1

  if buf[LEVEL_INDICATOR_OFFSET] == 0
    # leaf pointers
    # split the data into item_count index objects
    buf[0, Index::SIZE * item_count].scan(/.{#{Index::SIZE}}/mo).each_with_index do |data, i|
      idx = Index.new data
      # first entry
      raise 'blah 3' if i == 0 and start_val != 0 and idx.id != start_val
      idx.pst = self
      # this shouldn't really happen i'd imagine
      break if idx.id == 0
      @idx << idx
    end
  else
    # node pointers
    # split the data into item_count table pointers
    buf[0, TablePtr::SIZE * item_count].scan(/.{#{TablePtr::SIZE}}/mo).each_with_index do |data, i|
      table = TablePtr.new data
      # for the first value, we expect the start to be equal
      raise 'blah 3' if i == 0 and start_val != 0 and table.start != start_val
      # this shouldn't really happen i'd imagine
      break if table.start == 0
      load_idx_rec table.offset, table.u1, table.start
    end
  end
end
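The leaf branch splits the block into fixed-size records and stops at the first record whose id is zero. A minimal sketch of that splitting step is below. `RECORD_SIZE`, the record layout (a leading 32-bit little endian id) and the helper `leaf_ids` are assumptions for this example rather than the library's definitions.

```ruby
RECORD_SIZE = 12 # assumed record width, for this sketch only

# Split buf into item_count records and collect the leading id of each,
# stopping at the first zero id.
def leaf_ids(buf, item_count)
  ids = []
  buf[0, RECORD_SIZE * item_count].scan(/.{#{RECORD_SIZE}}/m) do |data|
    id = data.unpack('V').first
    break if id == 0
    ids << id
  end
  ids
end

# Four records; the third has id 0, so the fourth is never reached.
buf = [4, 8, 0, 12].map { |id| [id, 0, 0].pack('V3') }.join
p leaf_ids(buf, 4)
```

The `m` flag matters here: without it `.` would not match a newline byte (0x0a) inside the binary data and records would be split in the wrong place.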
#load_xattrib ⇒ Object
corresponds to
- pst_load_extended_attributes

# File 'lib/mapi/pst.rb', line 754

def load_xattrib
  unless desc = desc_from_id(0x61)
    warn "no extended attributes desc record found"
    return
  end
  unless desc.desc
    warn "no desc idx for extended attributes"
    return
  end
  if desc.list_index
  end
  #warn "skipping loading xattribs"
  # FIXME implement loading xattribs
end
#name ⇒ Object

# File 'lib/mapi/pst.rb', line 1797

def name
  @name ||= root_item.props.display_name
end
#pst_parse_item(desc) ⇒ Object
corresponds to
- _pst_parse_item

# File 'lib/mapi/pst.rb', line 1680

def pst_parse_item desc
  Item.new desc, RawPropertyStore.new(desc).to_a
end
#pst_read_block_size(offset, size, decrypt = true) ⇒ Object
corresponds to:
- _pst_read_block_size
- _pst_read_block ??
- _pst_ff_getIDblock_dec ??
- _pst_ff_getIDblock ??

# File 'lib/mapi/pst.rb', line 774

def pst_read_block_size offset, size, decrypt=true
  io.seek offset
  buf = io.read size
  warn "tried to read #{size} bytes but only got #{buf.length}" if buf.length != size
  encrypted? && decrypt ? CompressibleEncryption.decrypt(buf) : buf
end
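The seek-and-read step can be exercised against a StringIO. This sketch leaves out decryption entirely, and `read_block` is a hypothetical stand-alone helper. It also guards against `IO#read` returning nil at end of file, which the listing above would turn into a NoMethodError on `buf.length`.

```ruby
require 'stringio'

# Hypothetical stand-alone version of the read step, without decryption.
def read_block(io, offset, size)
  io.seek offset
  buf = io.read(size) || ''.b # IO#read(size) returns nil at end of file
  warn "tried to read #{size} bytes but only got #{buf.length}" if buf.length != size
  buf
end

io = StringIO.new('0123456789')
p read_block(io, 2, 4)   # full read
p read_block(io, 8, 4)   # short read, warns on stderr
p read_block(io, 20, 4)  # past the end, warns and returns an empty string
```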
#root ⇒ Object

# File 'lib/mapi/pst.rb', line 1784

def root
  root_item
end
#root_desc ⇒ Object

# File 'lib/mapi/pst.rb', line 1774

def root_desc
  @desc.first
end
#root_item ⇒ Object

# File 'lib/mapi/pst.rb', line 1778

def root_item
  item = pst_parse_item root_desc
  item.type = :root
  item
end