Searcher
Searcher is experimental now. Note that all interfaces are not stable at all.
Example
epub = EPUB::Parser.parse('childrens-literature-20130206.epub')
search_word = 'INTRODUCTORY'
results = EPUB::Searcher.search(epub, search_word)
# => [#<EPUB::Searcher::Result:0x007f938ed517a8
# @end_steps=[#<EPUB::Searcher::Result::Step:0x007f938ed51a50 @index=12, @info={}, @type=:character>],
# @parent_steps=
# [#<EPUB::Searcher::Result::Step:0x007f938f1c1e78 @index=2, @info={:name=>"spine", :id=>nil}, @type=:element>,
# #<EPUB::Searcher::Result::Step:0x007f938f1caa78 @index=1, @info={:id=>nil}, @type=:itemref>,
# #<EPUB::Searcher::Result::Step:0x007f938ed521d0 @index=1, @info={:name=>"body", :id=>nil}, @type=:element>,
# #<EPUB::Searcher::Result::Step:0x007f938ed52158 @index=0, @info={:name=>"nav", :id=>"toc"}, @type=:element>,
# #<EPUB::Searcher::Result::Step:0x007f938ed52108 @index=1, @info={:name=>"ol", :id=>"tocList"}, @type=:element>,
# #<EPUB::Searcher::Result::Step:0x007f938ed52090 @index=0, @info={:name=>"li", :id=>"np-313"}, @type=:element>,
# #<EPUB::Searcher::Result::Step:0x007f938ed52040 @index=1, @info={:name=>"ol", :id=>nil}, @type=:element>,
# #<EPUB::Searcher::Result::Step:0x007f938ed51ff0 @index=1, @info={:name=>"li", :id=>"np-317"}, @type=:element>,
# #<EPUB::Searcher::Result::Step:0x007f938ed51f78 @index=0, @info={:name=>"a", :id=>nil}, @type=:element>,
# #<EPUB::Searcher::Result::Step:0x007f938ed51f28 @index=0, @info={}, @type=:text>],
# @start_steps=[#<EPUB::Searcher::Result::Step:0x007f938ed51e88 @index=0, @info={}, @type=:character>]>,
# #<EPUB::Searcher::Result:0x007f938ef8f5d8
# @end_steps=[#<EPUB::Searcher::Result::Step:0x007f938ef8f808 @index=12, @info={}, @type=:character>],
# @parent_steps=
# [#<EPUB::Searcher::Result::Step:0x007f938f1c1e78 @index=2, @info={:name=>"spine", :id=>nil}, @type=:element>,
# #<EPUB::Searcher::Result::Step:0x007f938ed51730 @index=2, @info={:id=>nil}, @type=:itemref>,
# #<EPUB::Searcher::Result::Step:0x007f938ef8fce0 @index=1, @info={:name=>"body", :id=>nil}, @type=:element>,
# #<EPUB::Searcher::Result::Step:0x007f938ef8fc90 @index=0, @info={:name=>"section", :id=>"pgepubid00492"}, @type=:element>,
# #<EPUB::Searcher::Result::Step:0x007f938ef8fc40 @index=3, @info={:name=>"section", :id=>"pgepubid00498"}, @type=:element>,
# #<EPUB::Searcher::Result::Step:0x007f938ef8fbf0 @index=1, @info={:name=>"h3", :id=>nil}, @type=:element>,
# #<EPUB::Searcher::Result::Step:0x007f938ef8fb28 @index=0, @info={}, @type=:text>],
# @start_steps=[#<EPUB::Searcher::Result::Step:0x007f938ef8fa88 @index=0, @info={}, @type=:character>]>]
puts results.collect(&:to_cfi_s)
# /6/4!/4/2[toc]/4[tocList]/2[np-313]/4/4[np-317]/2/1,:0,:12
# /6/6!/4/2[pgepubid00492]/8[pgepubid00498]/4/1,:0,:12
# => nil
Search result
Search result is an array of EPUB::Searcher::Result and it may be converted to an EPUBCFI string by EPUB::Searcher::Result#to_cfi_s.
Seamless XHTML Searcher
Now default searcher for XHTML is seamless searcher, which ignores tags when searching.
You can search words 'search word' from XHTML document below:
<html>
<head>
<title>Sample document</title>
</head>
<body>
<p><em>search</em> word</p>
</body>
</html>
Restricted XHTML Searcher
You can also use restricted searcher, which means that it can search from only single elements. For instance, it can find 'search word' from XHTML document below:
<html>
<head>
<title>Sample document</title>
</head>
<body>
<p>search word</p>
</body>
</html>
But cannot from document below:
<html>
<head>
<title>Sample document</title>
</head>
<body>
<p><em>search</em> word</p>
</body>
</html>
because the words 'search' and 'word' are not in the same element.
To use restricted searcher, specify algorithm
option for search
method:
results = EPUB::Searcher.search(epub, search_word, algorithm: :restricted)