Class: Gitgo::Index
- Inherits:
-
Object
- Object
- Gitgo::Index
- Defined in:
- lib/gitgo/index.rb,
lib/gitgo/index/idx_file.rb,
lib/gitgo/index/sha_file.rb
Overview
Index provides an index of documents used to expedite searches. Index structures it’s data into a branch-specific directory structure:
.git/gitgo/refs/[branch]/index
|- filter
| `- key
| `- value
|
|- head
|- list
`- graph_map
The files contain the following data (in conceptual order):
head The user commit sha at which the last reindex occured.
list A list of H* packed shas representing all of the documents
accessible by the gitgo branch. The index (idx) of the sha in
list serves as an identifier for the sha in map and filters.
map A list of L* packed idx pairs mapping a document to its graph
head.
[value] A list of L* packed idx that match the key-value pair. These
lists act as filters in searches.
The packing format for each of the index files was chosen for performance; both to minimize the footprint of the file and optimize the usage of the file data. – Index also maintains a cache of temporary files that auto-expire after a certain period of time. The temporary files contain H* packed shas and represent the results of various queries, such as rev-lists. ++
Usage
Index files are used primarily to select documents based on various filters. For example, to select the shas for all comments tagged as ‘important’ you would do this:
index = Index.new('path')
comment = index['type']['comment']
important = index['tag']['important']
selected = comment & important
heads = selected.collect {|id| idx.map[id] }
shas = heads.collect {|id| idx.list[id] }.uniq
The array operations are very quick because the filters are composed of integers, as is the map. The final step resolves the integers to shas, but this too is simply an array lookup. The select method encapsulates this logic:
shas = index.select(
:all => {'type' => 'comment', 'tag' => 'important'},
:map => true,
:shas => true
)
Importantly, any of the index files can contain duplication without affecting the results of select; this allows new documents to be quickly added into a filter or appended to list/map. As needed or convenient, the index can compact itself and remove duplication.
Defined Under Namespace
Constant Summary collapse
- HEAD =
A file containing the ref at which the last index was performed; used to determine when a reindex is required relative to some other ref.
'head'
- MAP =
An idx file mapping shas to their graph heads
'map'
- LIST =
A sha file listing indexed shas; the index of the sha is used as an identifer for the sha in all idx files.
'list'
- FILTER =
The filter directory
'filter'
Instance Attribute Summary collapse
-
#cache ⇒ Object
readonly
Returns an in-memory, self-filling cache of idx files.
-
#head_file ⇒ Object
readonly
The head file for self.
-
#list_file ⇒ Object
readonly
The list file for self.
-
#map_file ⇒ Object
readonly
The map file for self.
Instance Method Summary collapse
-
#[](key) ⇒ Object
Returns cache, a self-filling hash of filter values.
- #all(*keys) ⇒ Object
- #assoc(source, target, type) ⇒ Object
-
#clear ⇒ Object
Clears all index files, and the cache.
- #compact ⇒ Object
- #create(source) ⇒ Object
- #delete(source) ⇒ Object
- #each_idx(key, values) ⇒ Object
- #each_sha(key, values, filter = delete_filter) ⇒ Object
-
#graph_head_idx(idx) ⇒ Object
Return the graph head idx for the specified idx.
-
#head ⇒ Object
Returns the sha in the head_file, if it exists, and nil otherwise.
-
#idx(sha) ⇒ Object
Returns the idx for sha, as specified in list.
-
#initialize(path) ⇒ Index
constructor
A new instance of Index.
- #join(key, *values) ⇒ Object
-
#keys ⇒ Object
Returns a list of possible index keys.
- #link(source, target) ⇒ Object
-
#list ⇒ Object
Returns the contents of list_file, as an array.
-
#map ⇒ Object
Returns the contents of map_file, as a hash.
-
#path(*segments) ⇒ Object
Returns the segments joined to the path used to initialize self.
-
#reset ⇒ Object
Clears the cache.
- #select(options = {}) ⇒ Object
- #update(source, target) ⇒ Object
-
#values(key) ⇒ Object
Returns a list of possible values for the specified index key.
-
#write(sha = nil) ⇒ Object
Writes cached changes.
Constructor Details
#initialize(path) ⇒ Index
Returns a new instance of Index.
97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 |
# File 'lib/gitgo/index.rb', line 97 def initialize(path) @path = path @head_file = File.(HEAD, path) @map_file = File.(MAP, path) @list_file = File.(LIST, path) @cache = Hash.new do |key_hash, key| key_hash[key] = Hash.new do |value_hash, value| value_hash[value] = begin index = self.path(FILTER, key, value) File.file?(index) ? IdxFile.read(index) : [] end end end end |
Instance Attribute Details
#cache ⇒ Object (readonly)
Returns an in-memory, self-filling cache of idx files
95 96 97 |
# File 'lib/gitgo/index.rb', line 95 def cache @cache end |
#head_file ⇒ Object (readonly)
The head file for self
86 87 88 |
# File 'lib/gitgo/index.rb', line 86 def head_file @head_file end |
#list_file ⇒ Object (readonly)
The list file for self
92 93 94 |
# File 'lib/gitgo/index.rb', line 92 def list_file @list_file end |
#map_file ⇒ Object (readonly)
The map file for self
89 90 91 |
# File 'lib/gitgo/index.rb', line 89 def map_file @map_file end |
Instance Method Details
#[](key) ⇒ Object
138 139 140 |
# File 'lib/gitgo/index.rb', line 138 def [](key) cache[key] end |
#all(*keys) ⇒ Object
226 227 228 229 230 231 232 233 234 235 |
# File 'lib/gitgo/index.rb', line 226 def all(*keys) results = [] keys.collect do |key| values(key).each do |value| results.concat(cache[key][value]) end end results.uniq! results end |
#assoc(source, target, type) ⇒ Object
162 163 164 165 166 167 168 169 |
# File 'lib/gitgo/index.rb', line 162 def assoc(source, target, type) source_idx = idx(source) target_idx = idx(target) map[target_idx] = source_idx tail_filter << source_idx unless type == :create delete_filter << source_idx if type == :delete self end |
#clear ⇒ Object
Clears all index files, and the cache.
345 346 347 348 349 350 |
# File 'lib/gitgo/index.rb', line 345 def clear if File.exists?(path) FileUtils.rm_r(path) end reset end |
#compact ⇒ Object
290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 |
# File 'lib/gitgo/index.rb', line 290 def compact # reindex shas in list, and create an idx map for updating idxs old_list = {} list.each {|sha| old_list[old_list.length] = sha } list.uniq! new_list = {} list.each {|sha| new_list[sha] = new_list.length} idx_map = {} old_list.each_pair {|idx, sha| idx_map[idx] = new_list[sha]} # update/deconvolute mapped idx values new_map = {} map.each_pair {|idx, head_idx| new_map[idx_map[idx]] = idx_map[head_idx] } new_map.keys.each {|idx| new_map[idx] = deconvolute(idx, new_map) || idx } @map = new_map # update filter values @cache.values.each do |value_hash| value_hash.values.each do |idxs| idxs.collect! {|idx| idx_map[idx] }.uniq! end end self end |
#create(source) ⇒ Object
171 172 173 174 175 176 177 178 179 180 181 |
# File 'lib/gitgo/index.rb', line 171 def create(source) source_idx = idx(source) head_idx = graph_head_idx(source_idx) unless head_idx.nil? || head_idx == source_idx raise "create graph fail: #{source} (already associated with graph #{list[head_idx]})" end map[source_idx] = source_idx self end |
#delete(source) ⇒ Object
191 192 193 194 195 196 |
# File 'lib/gitgo/index.rb', line 191 def delete(source) source_idx = idx(source) tail_filter << source_idx delete_filter << source_idx self end |
#each_idx(key, values) ⇒ Object
237 238 239 240 241 242 243 244 245 246 247 |
# File 'lib/gitgo/index.rb', line 237 def each_idx(key, values) unless values.respond_to?(:each) values = [values] end values.each do |value| cache[key][value].each do |idx| yield idx end end end |
#each_sha(key, values, filter = delete_filter) ⇒ Object
249 250 251 252 253 254 |
# File 'lib/gitgo/index.rb', line 249 def each_sha(key, values, filter=delete_filter) each_idx(key, values) do |idx| next if filter.include?(idx) yield list[idx] end end |
#graph_head_idx(idx) ⇒ Object
Return the graph head idx for the specified idx.
158 159 160 |
# File 'lib/gitgo/index.rb', line 158 def graph_head_idx(idx) deconvolute(idx, map) end |
#head ⇒ Object
Returns the sha in the head_file, if it exists, and nil otherwise.
114 115 116 |
# File 'lib/gitgo/index.rb', line 114 def head File.exists?(head_file) ? File.open(head_file) {|io| io.read(40) } : nil end |
#idx(sha) ⇒ Object
Returns the idx for sha, as specified in list. If the sha is not in list then it is appended to list.
144 145 146 147 148 149 150 151 152 153 154 155 |
# File 'lib/gitgo/index.rb', line 144 def idx(sha) case when list[-1] == sha list.length - 1 when idx = list.index(sha) idx else idx = list.length list << sha idx end end |
#join(key, *values) ⇒ Object
256 257 258 |
# File 'lib/gitgo/index.rb', line 256 def join(key, *values) values.collect {|value| cache[key][value] }.flatten end |
#keys ⇒ Object
Returns a list of possible index keys.
199 200 201 202 203 204 205 206 207 208 209 210 |
# File 'lib/gitgo/index.rb', line 199 def keys keys = cache.keys Dir.glob(self.path(FILTER, '*')).select do |path| File.directory?(path) end.each do |path| keys << File.basename(path) end keys.uniq! keys end |
#link(source, target) ⇒ Object
183 184 185 |
# File 'lib/gitgo/index.rb', line 183 def link(source, target) associate :link, source, target end |
#list ⇒ Object
Returns the contents of list_file, as an array.
127 128 129 |
# File 'lib/gitgo/index.rb', line 127 def list @list ||= (File.exists?(list_file) ? ShaFile.read(list_file) : []) end |
#map ⇒ Object
Returns the contents of map_file, as a hash.
119 120 121 122 123 124 |
# File 'lib/gitgo/index.rb', line 119 def map @map ||= begin entries = File.exists?(map_file) ? IdxFile.read(map_file) : [] Hash[*entries] end end |
#path(*segments) ⇒ Object
Returns the segments joined to the path used to initialize self.
132 133 134 |
# File 'lib/gitgo/index.rb', line 132 def path(*segments) File.join(@path, *segments) end |
#reset ⇒ Object
Clears the cache.
337 338 339 340 341 342 |
# File 'lib/gitgo/index.rb', line 337 def reset @list = nil @map = nil @cache.clear self end |
#select(options = {}) ⇒ Object
260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 |
# File 'lib/gitgo/index.rb', line 260 def select(={}) basis = [:basis] || (0..list.length).to_a if all = [:all] each_pair(all) do |key, value| basis = basis & cache[key][value] break if basis.empty? end end if any = [:any] matches = [] each_pair(any) do |key, value| matches.concat cache[key][value] end basis = basis & matches end if [:map] basis.collect! {|idx| map[idx] } end if [:shas] basis.collect! {|idx| list[idx] } end basis.uniq! basis end |
#update(source, target) ⇒ Object
187 188 189 |
# File 'lib/gitgo/index.rb', line 187 def update(source, target) associate :update, source, target end |
#values(key) ⇒ Object
Returns a list of possible values for the specified index key.
213 214 215 216 217 218 219 220 221 222 223 224 |
# File 'lib/gitgo/index.rb', line 213 def values(key) values = cache[key].keys base = path(FILTER, key) start = base.length + 1 Dir.glob("#{base}/**/*").each do |value_path| values << value_path[start, value_path.length-start] end values.uniq! values end |
#write(sha = nil) ⇒ Object
Writes cached changes.
321 322 323 324 325 326 327 328 329 330 331 332 333 334 |
# File 'lib/gitgo/index.rb', line 321 def write(sha=nil) @cache.each_pair do |key, value_hash| value_hash.each_pair do |value, idx| IdxFile.write(path(FILTER, key, value), idx) end end FileUtils.mkdir_p(path) unless File.exists?(path) File.open(head_file, "w") {|io| io.write(sha) } if sha ShaFile.write(list_file, list.join) IdxFile.write(map_file, map.to_a.flatten) self end |