Class: Care::Cache
- Inherits: Object
  - Object
    - Care::Cache
- Defined in: lib/care.rb
Overview
Stores cached pages of data from the given IO as strings. Each page is `page_size` bytes long, except the last page, which may be shorter.
Instance Method Summary
- #byteslice(io, at, n_bytes) ⇒ String?
  Returns the longest byte string, up to `n_bytes` long, that can be recovered from the given `io` at the given offset.
- #clear ⇒ Object
  Clears the page cache of all strings with data.
- #hydrate_page(io, page_i) ⇒ Object
  Hydrates the page at the given index, or returns the contents of that page if it is already in the cache.
- #initialize(page_size = DEFAULT_PAGE_SIZE) ⇒ Cache (constructor)
  Initializes a new page cache with pages of the given size.
- #inspect ⇒ Object
  Overrides #inspect so that the actual contents of the cached pages are not printed.
- #read_page(io, page_i) ⇒ Object
  Reads the requested page from the given IO.
Constructor Details
#initialize(page_size = DEFAULT_PAGE_SIZE) ⇒ Cache
Initializes a new page cache with pages of the given size.
# File 'lib/care.rb', line 80

def initialize(page_size = DEFAULT_PAGE_SIZE)
  @page_size = page_size.to_i
  raise ArgumentError, 'The page size must be a positive Integer' unless @page_size > 0
  @pages = {}
  @lowest_known_empty_page = nil
end
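Because the constructor calls `to_i` before validating, anything that coerces to zero or less is rejected. The sketch below reproduces only that validation step in a standalone helper. The name `normalize_page_size` is hypothetical and is not part of `lib/care.rb`.

```ruby
# Hypothetical helper (not part of lib/care.rb) that reproduces the
# coercion and validation performed by Care::Cache#initialize.
def normalize_page_size(page_size)
  size = page_size.to_i
  raise ArgumentError, 'The page size must be a positive Integer' unless size > 0
  size
end

puts normalize_page_size('4096') # numeric strings are coerced by to_i
puts normalize_page_size(512.9)  # floats are truncated by to_i

# nil and non-numeric strings coerce to 0, so they are rejected
# together with zero and negative values.
[0, -1, nil, 'abc'].each do |bad|
  begin
    normalize_page_size(bad)
  rescue ArgumentError => e
    puts "#{bad.inspect}: #{e.message}"
  end
end
```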
Instance Method Details
#byteslice(io, at, n_bytes) ⇒ String?
Returns the longest byte string, up to `n_bytes` long, that can be recovered from the given `io` starting at offset `at`. If the IO has been exhausted, `nil` is returned instead. Uses the cached pages where available and fetches pages where necessary.
# File 'lib/care.rb', line 98

def byteslice(io, at, n_bytes)
  raise ArgumentError, "The number of bytes to fetch must be a positive Integer, but was #{n_bytes}" if n_bytes < 1
  raise ArgumentError, "Negative offsets are not supported (got #{at})" if at < 0

  first_page = at / @page_size
  last_page = (at + n_bytes) / @page_size

  relevant_pages = (first_page..last_page).map { |i| hydrate_page(io, i) }

  # Create one string combining all the pages which are relevant for
  # us - it is much easier to address that string instead of piecing
  # the output together page by page, and joining arrays of strings
  # is supposed to be optimized.
  slab = if relevant_pages.length > 1
    # If our read overlaps multiple pages, we do have to join them, this is
    # the general case
    relevant_pages.join
  else # We only have one page
    # Optimize a little. If we only have one page that we need to read from
    # - which is likely going to be the case *often* we can avoid allocating
    # a new string for the joined pages and just use the only page
    # directly as the slab. Since it might contain a `nil` and we do
    # not join (which casts nils to strings) we take care of that too
    relevant_pages.first || ''
  end

  offset_in_slab = at % @page_size
  slice = slab.byteslice(offset_in_slab, n_bytes)

  # Returning an empty string from read() is very confusing for the caller,
  # and no builtins do this - if we are at EOF we should return nil
  slice if slice && !slice.empty?
end
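The page arithmetic in `byteslice` can be checked by hand. The following is a minimal sketch that mirrors the three expressions used in the method. The helper name `page_span` is hypothetical and is not part of `lib/care.rb`.

```ruby
# Hypothetical helper (not part of lib/care.rb) mirroring the page
# arithmetic used in Care::Cache#byteslice.
def page_span(at, n_bytes, page_size)
  first_page = at / page_size
  last_page = (at + n_bytes) / page_size
  offset_in_slab = at % page_size
  { pages: (first_page..last_page).to_a, offset_in_slab: offset_in_slab }
end

# With 16-byte pages, a 10-byte read at offset 30 starts 14 bytes into
# page 1 and continues into page 2.
span = page_span(30, 10, 16)
puts span.inspect

# A read that ends exactly on a page boundary still lists the following
# page, because last_page is computed from the exclusive end offset
# (at + n_bytes).
aligned = page_span(0, 16, 16)
puts aligned.inspect
```

The offset into the joined slab only depends on `at`, because the slab always begins at the start of the first relevant page.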
#clear ⇒ Object
Clears the page cache of all strings with data.
# File 'lib/care.rb', line 135

def clear
  @pages.map { |maybe_page_str| maybe_page_str.clear if maybe_page_str.respond_to?(:clear) }
  @pages.clear
end
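One detail is worth keeping in mind when reading this listing: `@pages` is a Hash, and iterating a Hash with a single block parameter yields `[key, value]` pairs rather than the values. The standalone sketch below illustrates that Ruby behaviour only. The variable names are illustrative, and the `each_value` variant is an assumption about how the cached strings could be emptied in place, not a description of the library.

```ruby
# Illustrative data standing in for the page cache; .dup gives us
# mutable strings regardless of frozen-string-literal settings.
pages = { 0 => 'abcd'.dup, 1 => 'efgh'.dup }
first_page = pages[0]

# With a single block parameter, Hash#map yields [key, value] pairs,
# so clearing the block argument empties a temporary Array and leaves
# the cached String untouched.
pages.map { |pair| pair.clear if pair.respond_to?(:clear) }
after_map = first_page.dup
puts after_map.inspect

# each_value yields the cached Strings themselves, so they are emptied
# in place before the Hash itself is cleared.
pages.each_value { |page| page.clear if page.respond_to?(:clear) }
puts first_page.inspect

pages.clear
```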
#hydrate_page(io, page_i) ⇒ Object
Hydrates the page at the given index, or returns the contents of that page if it is already in the cache.
# File 'lib/care.rb', line 145

def hydrate_page(io, page_i)
  # Avoid trying to read the page if we know there is no content to fill it
  # in the underlying IO
  return if @lowest_known_empty_page && page_i >= @lowest_known_empty_page

  @pages[page_i] ||= read_page(io, page_i)
end
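The two ideas in `hydrate_page` are memoization with `||=` and an early return for pages at or beyond a known end of data. The sketch below shows them in isolation. `TinyPageCache` is a hypothetical stand-in that reads from an Array instead of an IO and counts upstream reads; it is not the library class.

```ruby
# Hypothetical stand-in (not Care::Cache) that isolates the memoization
# and early-return logic of hydrate_page.
class TinyPageCache
  attr_reader :reads

  def initialize(pages_upstream)
    @pages_upstream = pages_upstream # Array standing in for the IO
    @pages = {}
    @lowest_known_empty_page = nil
    @reads = 0
  end

  def hydrate_page(page_i)
    # Skip the read entirely if this page is known to lie past the data
    return if @lowest_known_empty_page && page_i >= @lowest_known_empty_page

    @pages[page_i] ||= read_page(page_i)
  end

  private

  def read_page(page_i)
    @reads += 1
    data = @pages_upstream[page_i]
    if data.nil? && (@lowest_known_empty_page.nil? || @lowest_known_empty_page > page_i)
      @lowest_known_empty_page = page_i
    end
    data
  end
end

cache = TinyPageCache.new(%w[aaaa bbbb])
cache.hydrate_page(0) # read from upstream
cache.hydrate_page(0) # served from the cache
cache.hydrate_page(5) # nothing upstream; page 5 is remembered as empty
cache.hydrate_page(7) # skipped without touching upstream
puts cache.reads
```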
#inspect ⇒ Object
Overrides #inspect so that the actual contents of the cached pages are not printed.
# File 'lib/care.rb', line 155

def inspect
  # Simulate the builtin object ID output https://stackoverflow.com/a/11765495/153886
  oid_str = (object_id << 1).to_s(16).rjust(16, '0')

  ivars = instance_variables
  ivars.delete(:@pages)
  ivars_str = ivars.map do |ivar|
    "#{ivar}=#{instance_variable_get(ivar).inspect}"
  end.join(' ')
  synthetic_vars = 'num_hydrated_pages=%d' % @pages.length
  '#<%s:%s %s %s>' % [self.class, oid_str, synthetic_vars, ivars_str]
end
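The same approach can be tried on a small class: list the instance variables, drop the bulky one, and report a summary count in its place. `Holder` below is a hypothetical class written for illustration. For brevity it omits the object ID portion of the output.

```ruby
# Hypothetical class (not part of lib/care.rb) using the same approach
# as Care::Cache#inspect, minus the object ID.
class Holder
  def initialize
    @label = 'demo'
    @pages = { 0 => 'x' * 1_000_000 } # large payload we do not want printed
  end

  def inspect
    # instance_variables returns a fresh Array, so deleting from it
    # does not affect the object itself
    ivars = instance_variables
    ivars.delete(:@pages)
    ivars_str = ivars.map { |ivar| "#{ivar}=#{instance_variable_get(ivar).inspect}" }.join(' ')
    format('#<%s num_hydrated_pages=%d %s>', self.class, @pages.length, ivars_str)
  end
end

summary = Holder.new.inspect
puts summary
```

Without the override, printing such an object in a console or an exception message would include the full megabyte of page data.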
#read_page(io, page_i) ⇒ Object
Reads the requested page from the given IO.
# File 'lib/care.rb', line 172

def read_page(io, page_i)
  Measurometer.increment_counter('format_parser.parser.care.page_reads_from_upsteam', 1)

  io.seek(page_i * @page_size)
  read_result = Measurometer.instrument('format_parser.care.read_page') { io.read(@page_size) }
  if read_result.nil?
    # If the read went past the end of the IO the read result will be nil,
    # so we know our IO is exhausted here
    @lowest_known_empty_page = page_i if @lowest_known_empty_page.nil? || @lowest_known_empty_page > page_i
  elsif read_result.bytesize < @page_size
    # If we read less than we initially wanted we know there are no pages
    # to read following this one, so we can also optimize
    @lowest_known_empty_page = page_i + 1
  end

  read_result
end
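The two end-of-data branches rely on how `read` with a length behaves near the end of an IO. The sketch below shows that behaviour with a `StringIO` from the standard library. It leaves out the Measurometer instrumentation, and the 4-byte page size and sample data are chosen purely for illustration.

```ruby
require 'stringio'

page_size = 4
io = StringIO.new('abcdefghij') # 10 bytes: two full pages and one short page

# Mirror the seek-then-read sequence of read_page for pages 0 through 3.
reads = (0..3).map do |page_i|
  io.seek(page_i * page_size)
  io.read(page_size)
end

# reads[2] is shorter than page_size, so no page can follow it.
# reads[3] is nil because the read started at or past the end of the IO.
reads.each_with_index { |data, page_i| puts "page #{page_i}: #{data.inspect}" }
```

In terms of the method above, the short read of page 2 would set `@lowest_known_empty_page` to 3, and a `nil` read of page 3 would set it to 3 as well if it were not already known.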