Module: Repertoire::Faceting::Model::ClassMethods
- Defined in:
- lib/repertoire-faceting/model.rb
Overview
Facet declarations
Facet declarations consist of a facet name and an optional ActiveRecord relation that describes the attribute column to facet over, any joins necessary to reach it, and other defaults. They work similarly to Rails scoped queries. For example:
class Nobelist < ActiveRecord::Base
include Repertoire::Faceting::Model
has_many :affiliations
facet :discipline
facet :degree, joins(:affiliations).group('affiliations.degree')
facet :birthdate, order('birthdate ASC')
end
Implicitly, any facet declaration is an SQL aggregate that divides the attribute values into discrete groups. When no relation is provided, /model/.group(/facet name/) is assumed by default. So the discipline facet declaration above is equivalent to
facet :discipline, group(:discipline)
and the grouping on degree could be left out. You can use this behavior to construct a facet from differently-named columns:
facet :balloon_color, group(:color)
or to synthesize values using an SQL expression:
facet :birth_year, group('EXTRACT(year FROM birthdate)')
As shown above, facets can be constructed from an arbitrary set of joined tables.
Nested facets
Facets can be built from a nested hierarchy of values by providing multiple group columns. In this case, value counts are aggregated at each level in turn.
facet :birth_place, group(:birth_country, :birth_state, :birth_city)
As with basic facets, nested facets may consist of SQL expressions. This is particularly useful when faceting over data in more complex types such as dates or geographical regions:
facet :birth_date, group('EXTRACT(year FROM birthdate)', 'EXTRACT(month FROM birthdate)', 'EXTRACT(day FROM birthdate)')
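To make "aggregated at each level in turn" concrete, here is a minimal plain-Ruby sketch that runs without the library or a database. The `nested_counts` helper and the sample rows are hypothetical and only illustrate the idea; the library itself computes these counts in SQL.

```ruby
# Hypothetical sample rows standing in for model records.
ROWS = [
  { birth_country: 'Ukraine', birth_state: 'Kiev Oblast',   birth_city: 'Kiev' },
  { birth_country: 'Ukraine', birth_state: 'Kiev Oblast',   birth_city: 'Kiev' },
  { birth_country: 'Ukraine', birth_state: 'Lviv Oblast',   birth_city: 'Lviv' },
  { birth_country: 'France',  birth_state: 'Ile-de-France', birth_city: 'Paris' }
].freeze

# Count rows for every prefix of the group columns, so that each level of
# the hierarchy (country, then state, then city) gets its own totals.
def nested_counts(rows, columns)
  counts = Hash.new(0)
  rows.each do |row|
    path = columns.map { |column| row[column] }
    (1..path.length).each { |depth| counts[path.first(depth)] += 1 }
  end
  counts
end

counts = nested_counts(ROWS, [:birth_country, :birth_state, :birth_city])
counts.each { |path, n| puts "#{path.join(' / ')}: #{n}" }
```

Each key is a path into the hierarchy, so a count at the country level is the sum of the counts beneath it.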
Facet options
The following query options can also be specified in the facet declaration.
- order
-
Order for facet value counts. Two computed columns are available, “count” and another with the facet’s name. For example, to order a genre facet alphanumerically within each descending count:
facet :genre, order('count DESC', 'genre ASC')
- nils
-
Whether to include null facet values in the results or not. Defaults to true:
facet :genre, nils(false)
- minimum
-
Cut-off below which facet value counts should not be listed:
facet :genre, minimum(5)
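The combined effect of these three options can be sketched in plain Ruby. This is an illustration only: `apply_facet_options` and the sample counts are hypothetical, the default of 0 for `minimum` is an assumption, and treating the cut-off as inclusive (counts equal to the minimum are kept) is likewise an assumption about the library's behavior.

```ruby
# Hypothetical raw value counts for a genre facet; nil stands for SQL NULL.
RAW_COUNTS = { 'jazz' => 12, 'blues' => 12, 'folk' => 3, nil => 7 }.freeze

# Mimics the effect of the nils, minimum and order options on value counts.
def apply_facet_options(counts, nils: true, minimum: 0)
  rows = counts.to_a
  rows = rows.reject { |value, _count| value.nil? } unless nils
  rows = rows.select { |_value, count| count >= minimum }
  # Plays the role of order('count DESC', 'genre ASC'); nil sorts as ''.
  rows.sort_by { |value, count| [-count, value.to_s] }
end

p apply_facet_options(RAW_COUNTS)
p apply_facet_options(RAW_COUNTS, nils: false, minimum: 5)
```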
Executing Queries
Facet value count and result queries follow the format familiar from ActiveRecord group and count aggregation. This allows you to execute a facet value count query given a base set of records.
Nobelist.where("name LIKE 'Robert%'").count(:discipline)
To incorporate refinements on other facets on this model, use refine:
Nobelist.refine(:nobel_year => 2001, :degree => 'Ph.D.').count(:discipline)
If you provide multiple values for a simple facet refinement, they are interpreted as a logical “or”:
Nobelist.refine(:nobel_year => [2000, 2001]) # => 'WHERE nobel_year IN (2000, 2001)'
In the case of a nested facet, multiple values identify levels in the taxonomy:
Nobelist.refine(:birth_place => [ 'Ukraine', 'Kiev' ]).count(:nobel_year)
Refinements are integrated into result queries automatically:
Nobelist.refine(:birth_place => [ 'Ukraine', 'Kiev' ]).all
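The two readings of a multi-value refinement (a logical "or" for simple facets, a path into the taxonomy for nested ones) can be sketched in plain Ruby. The `refine_rows` helper, the `NESTED_FACETS` list and the sample data are hypothetical; the library expresses the same logic in SQL.

```ruby
# Hypothetical in-memory records; birth_place holds the nested path.
NOBELISTS = [
  { name: 'A', nobel_year: 2000, birth_place: ['Ukraine', 'Kiev'] },
  { name: 'B', nobel_year: 2001, birth_place: ['Ukraine', 'Lviv'] },
  { name: 'C', nobel_year: 2002, birth_place: ['France', 'Paris'] }
].freeze

# Facets treated as nested in this sketch.
NESTED_FACETS = [:birth_place].freeze

def refine_rows(rows, refinements)
  rows.select do |row|
    refinements.all? do |facet, wanted|
      wanted = Array(wanted)
      if NESTED_FACETS.include?(facet)
        # Nested facet: the values name successive levels of the hierarchy.
        row[facet].first(wanted.length) == wanted
      else
        # Simple facet: any one of the values may match (logical "or").
        wanted.include?(row[facet])
      end
    end
  end
end

p refine_rows(NOBELISTS, { nobel_year: [2000, 2001] }).map { |r| r[:name] }
p refine_rows(NOBELISTS, { birth_place: ['Ukraine', 'Kiev'] }).map { |r| r[:name] }
```

Refinements on different facets combine with a logical "and", which is why the sketch uses `all?` across facets.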
Index access
As you will have noted already, facet counts and queries are quite similar to their ActiveRecord/SQL counterparts. Behind the scenes, the Repertoire faceting code re-writes your query.
Facets defined on associations are joined and limited automatically, and facet indices in the database are used wherever possible rather than querying the model table.
Facet registration
The system supports plugins for new facet type implementations. When a new facet is declared, the available facet implementations are polled until one claims the new relation. For example, of the built-in facet implementations, BasicFacet claims facets with a single group column, and NestedFacet claims those with several group columns. If several facet implementations claim a facet, the one that registered later wins.
See AbstractFacet for more details.
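The polling behavior described above might look roughly like the following plain-Ruby sketch. All names here (`RelationSketch`, `BasicFacetSketch`, `NestedFacetSketch`, `REGISTRY`, `implementation_for`) are hypothetical stand-ins, not the library's API; consult AbstractFacet for the real interface.

```ruby
# Hypothetical stand-in for a relation: only the group columns matter here.
RelationSketch = Struct.new(:group_values)

# Claims relations that group on exactly one column.
module BasicFacetSketch
  def self.claim?(relation)
    relation.group_values.size == 1
  end
end

# Claims relations that group on several columns.
module NestedFacetSketch
  def self.claim?(relation)
    relation.group_values.size > 1
  end
end

# Implementations in registration order.
REGISTRY = [BasicFacetSketch, NestedFacetSketch]

# Poll from the most recently registered implementation backwards, so that
# a later registration wins when several implementations claim a relation.
def implementation_for(relation)
  impl = REGISTRY.reverse.find { |candidate| candidate.claim?(relation) }
  raise ArgumentError, 'no facet implementation claims this relation' if impl.nil?
  impl
end

p implementation_for(RelationSketch.new([:discipline]))
p implementation_for(RelationSketch.new([:birth_country, :birth_state]))
```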
Instance Method Summary
-
#facet(name, rel = nil) ⇒ Object
Declare a facet by name.
-
#facet?(name) ⇒ Boolean
Is there a facet by this name?
-
#facet_cache_key(facet = nil) ⇒ Object
Return a key suitable for use in HTTP headers, either for the base model table or one of its facets.
-
#facet_names ⇒ Object
All defined facets by name.
-
#faceting_id ⇒ Object
Returns the name of the id column to use for constructing bitset signatures over this model.
-
#index_facets(next_indexes = nil, next_faceting_id = nil) ⇒ Object
Drops any unused facet indices, updates its packed ids, then recreates indices for the facets with the provided names.
-
#indexed_facets ⇒ Object
Returns a list of the facets that currently have indices declared.
-
#reset_column_information ⇒ Object
Overrides reset_column_information in ActiveRecord::ModelSchema.
-
#scoped_all ⇒ Object
Once clients have migrated to Rails 4, delete and replace with ‘all’ where this is called.
-
#signature_wastage(signature_column = nil) ⇒ Object
Returns the proportion of wasted slots in 0..max(id).
-
#stat_table(timestamp_column = nil) ⇒ Object
Returns the row count and most recent update timestamp for a model table; or nil if there is no updated_at field.
Instance Method Details
#facet(name, rel = nil) ⇒ Object
Declare a facet by name
# File 'lib/repertoire-faceting/model.rb', line 125
def facet(name, rel=nil)
  name = name.to_sym

  # default: group by column with facet name, order by count descending
  rel ||= scoped_all
  rel = rel.group(name) if rel.group_values.empty?
  rel = rel.order(["count DESC", "#{name} ASC"]) if rel.order_values.empty?

  # locate facet implementation that can handle relation
  facets[name] = Facets::AbstractFacet.mixin(name, rel)
end
#facet?(name) ⇒ Boolean
Is there a facet by this name?
# File 'lib/repertoire-faceting/model.rb', line 138
def facet?(name)
  facets.key?(name.to_sym)
end
#facet_cache_key(facet = nil) ⇒ Object
Return a key suitable for use in HTTP headers, either for the base model table or one of its facets.
If the name of an indexed facet is provided, the timestamp of the facet index table is used to construct the facet’s cache key. This ensures that facet counts expire when the facet is re-indexed (and not before).
If the name of an unindexed facet is given, a cache key for the entire model table is provided. This ensures that faceted browsers over live data expire facet count caches when the base model table is updated.
Calling facet_cache_key with no arguments returns a cache key for the entire model table. The key combines the most recent updated_at column value with the table row count. If the model has no updated_at attribute, nil is returned and caching is effectively disabled.
See the FAQ for additional information.
# File 'lib/repertoire-faceting/model.rb', line 280
def facet_cache_key(facet = nil)
  facet = facet.try(:to_s)
  result = nil

  stats = facets[facet].stat_table if facet_names.include?(facet)
  stats ||= stat_table
  result = { :etag => stats[:count], :last_modified => stats[:timestamp] } if stats.present?

  result
end
#facet_names ⇒ Object
All defined facets by name
# File 'lib/repertoire-faceting/model.rb', line 143
def facet_names
  facets.keys
end
#faceting_id ⇒ Object
Returns the name of the id column to use for constructing bitset signatures over this model.
# File 'lib/repertoire-faceting/model.rb', line 247
def faceting_id
  @faceting_id ||= [PACKED_SIGNATURE_COLUMN, DEFAULT_SIGNATURE_COLUMN].detect { |c| column_names.include?(c) }
end
#index_facets(next_indexes = nil, next_faceting_id = nil) ⇒ Object
Drops any unused facet indices, updates its packed ids, then recreates indices for the facets with the provided names. If no names are provided, then the existing facet indices are refreshed.
If a signature id column name is provided, it will be used to build the bitset indices. Otherwise the indexer will add or remove a new packed id column as appropriate.
Examples:
Refresh existing facet indices
Nobelist.index_facets
Adjust which facets are indexed
Nobelist.index_facets([:degree, :nobel_year])
Drop all facet indices, but add/remove packed id as necessary
Nobelist.index_facets([])
Drop absolutely everything, force manual faceting using ‘id’ column
Nobelist.index_facets([], 'id')
# File 'lib/repertoire-faceting/model.rb', line 179
def index_facets(next_indexes=nil, next_faceting_id=nil)
  # default: update existing facets
  current_indexes = indexed_facets
  next_indexes ||= current_indexes

  # sanity checks
  current_indexes = current_indexes.map { |name| name.to_sym }
  next_indexes = next_indexes.map { |name| name.to_sym }
  (current_indexes | next_indexes).each do |name|
    raise QueryError, "Unknown facet #{name}" unless facet?(name)
  end

  # determine best column for signature bitsets, unless set manually
  next_faceting_id ||= if signature_wastage(DEFAULT_SIGNATURE_COLUMN) < SIGNATURE_WASTAGE_THRESHOLD
    DEFAULT_SIGNATURE_COLUMN
  else
    PACKED_SIGNATURE_COLUMN
  end

  # default behavior: no changes to packed id column
  drop_packed_id = create_packed_id = false

  # default behavior: adjust facet indexes
  drop_list = current_indexes - next_indexes
  refresh_list = next_indexes & current_indexes
  create_list = next_indexes - current_indexes

  # adding or removing a packed id column
  if next_faceting_id != faceting_id
    drop_packed_id = (next_faceting_id == DEFAULT_SIGNATURE_COLUMN)
    create_packed_id = (next_faceting_id != DEFAULT_SIGNATURE_COLUMN)
  end

  # special case: repacking an existing packed id column
  if next_faceting_id == faceting_id && next_faceting_id != DEFAULT_SIGNATURE_COLUMN
    drop_packed_id = create_packed_id = (signature_wastage > SIGNATURE_WASTAGE_THRESHOLD)
  end

  # changing item ids invalidates all existing facet indices
  if drop_packed_id || create_packed_id
    drop_list, refresh_list, create_list = [ current_indexes, [], next_indexes ]
  end

  connection.transaction do
    # adjust faceting id column
    connection.remove_column(table_name, PACKED_SIGNATURE_COLUMN) if drop_packed_id
    connection.add_column(table_name, PACKED_SIGNATURE_COLUMN, "SERIAL") if create_packed_id
    @faceting_id = next_faceting_id

    # adjust facet indices
    drop_list.each { |name| facets[name].drop_index }
    refresh_list.each { |name| facets[name].refresh_index }
    create_list.each { |name| facets[name].create_index }
  end

  # TODO. in a nested transaction, this would need to fire after the final commit...
  reset_column_information
end
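The way index_facets sorts facets into drop, refresh and create lists is plain set arithmetic, which can be tried in isolation. The `index_plan` helper below is a hypothetical extraction written for illustration; its `ids_changed` flag stands in for the case where the packed id column is added, removed or repacked.

```ruby
# Hypothetical helper mirroring the list arithmetic in index_facets.
def index_plan(current_indexes, next_indexes, ids_changed: false)
  current = current_indexes.map(&:to_sym)
  wanted  = next_indexes.map(&:to_sym)

  if ids_changed
    # Changing item ids invalidates every existing index: rebuild them all.
    { drop: current, refresh: [], create: wanted }
  else
    {
      drop:    current - wanted,   # indexed now, no longer wanted
      refresh: wanted & current,   # indexed now and still wanted
      create:  wanted - current    # wanted but not yet indexed
    }
  end
end

p index_plan([:degree, :discipline], [:degree, :nobel_year])
p index_plan([:degree, :discipline], [:degree, :nobel_year], ids_changed: true)
```

In the first call only the difference between the two lists is touched; in the second every index is dropped and recreated, matching the "changing item ids" branch in the source.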
#indexed_facets ⇒ Object
Returns a list of the facets that currently have indices declared
# File 'lib/repertoire-faceting/model.rb', line 148
def indexed_facets
  connection.indexed_facets(table_name)
end
#reset_column_information ⇒ Object
Overrides reset_column_information in ActiveRecord::ModelSchema
# File 'lib/repertoire-faceting/model.rb', line 240
def reset_column_information
  @faceting_id = nil
  super
end
#scoped_all ⇒ Object
Once clients have migrated to Rails 4, delete and replace with ‘all’ where this is called
# File 'lib/repertoire-faceting/model.rb', line 295
def scoped_all
  where(nil)
end
#signature_wastage(signature_column = nil) ⇒ Object
Returns the proportion of wasted slots in 0..max(id)
# File 'lib/repertoire-faceting/model.rb', line 252
def signature_wastage(signature_column = nil)
  signature_column ||= faceting_id
  connection.signature_wastage(table_name, signature_column)
end
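As a rough illustration of what "proportion of wasted slots in 0..max(id)" could mean, the sketch below computes it over an in-memory list of ids. The formula (unused positions divided by max(id) + 1) is an assumption drawn from the description above; the adapter's actual SQL may differ, and `wastage` is a hypothetical helper.

```ruby
# Hypothetical helper: fraction of positions in 0..max(id) with no record.
def wastage(ids)
  return 0.0 if ids.empty?

  slots = ids.max + 1               # one bit position per id in 0..max(id)
  used  = ids.uniq.size             # positions actually occupied
  (slots - used).to_f / slots
end

puts wastage([0, 1, 2, 3])   # densely packed ids
puts wastage([1, 5, 9])      # sparse ids leave most positions unused
```

Sparse ids make bitset signatures longer than necessary, which is presumably why index_facets switches to a packed id column once wastage crosses SIGNATURE_WASTAGE_THRESHOLD.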
#stat_table(timestamp_column = nil) ⇒ Object
Returns the row count and most recent update timestamp for a model table; or nil if there is no updated_at field.
# File 'lib/repertoire-faceting/model.rb', line 259
def stat_table(timestamp_column = nil)
  timestamp_column ||= "updated_at"
  connection.stat_table(table_name) if column_names.include?(timestamp_column)
end