Class: AcceptLanguage::Matcher Private
- Inherits: Object
  - Object
    - AcceptLanguage::Matcher
- Defined in: lib/accept_language/matcher.rb
Overview
This class is part of a private API. You should avoid using this class if possible, as it may be removed or be changed in the future.
This class is used internally by Parser#match and should not be instantiated directly. Use parse followed by Parser#match instead.
Language Preference Matcher
Matcher implements the Basic Filtering matching scheme defined in RFC 4647 Section 3.3.1. It takes parsed language preferences (from Parser) and determines the optimal language choice from a set of available languages.
Overview
The matching process balances multiple factors:
- **Quality values**: Higher q-values indicate stronger user preference
- **Declaration order**: Tie-breaker when q-values are equal
- **Prefix matching**: Allows en to match en-US, en-GB, etc.
- **Wildcards**: The * range matches any otherwise unmatched language
- **Exclusions**: Languages with q=0 are explicitly unacceptable
RFC 4647 Section 3.3.1 Compliance
This implementation follows the Basic Filtering matching rules:
> A language-range matches a language-tag if it exactly equals the tag,
> or if it exactly equals a prefix of the tag such that the first tag
> character following the prefix is “-”.
This means:
- en matches en, en-US, en-GB, en-Latn-US
- en-US matches only en-US (not en or en-GB)
- en does NOT match eng (no hyphen boundary)
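The hyphen-boundary rule can be sketched in a few lines of Ruby. This is an illustration only: `basic_filter_match?` is a hypothetical helper name, not the library's method, and the case-insensitive comparison is an assumption of this sketch.

```ruby
HYPHEN = "-"

# Hypothetical helper (not part of the library): returns true when +range+
# equals +tag+, or is a prefix of +tag+ immediately followed by a hyphen.
# Both sides are downcased first, which is an assumption of this sketch.
def basic_filter_match?(range, tag)
  range = range.to_s.downcase
  tag   = tag.to_s.downcase

  tag == range || tag.start_with?(range + HYPHEN)
end

basic_filter_match?("en", "en-US")      # => true
basic_filter_match?("en", "en-Latn-US") # => true
basic_filter_match?("en-US", "en")      # => false
basic_filter_match?("en", "eng")        # => false (no hyphen boundary)
```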
Quality Value Semantics
Quality values have specific meanings per RFC 7231 Section 5.3.1:
- q=1 (or omitted): Most preferred
- 0 < q < 1: Acceptable with relative preference
- q=0: Explicitly NOT acceptable
The q=0 case is special: it doesn’t just indicate low preference, it completely excludes the language from consideration. This is used with wildcards to express “any language except X”:
Accept-Language: *, en;q=0
Wildcard Behavior
The wildcard * matches any language not explicitly matched by another language range. This behavior is specific to HTTP, as noted in RFC 4647 Section 3.3.1. When processing a wildcard:
- Collect all explicitly listed language ranges (excluding the wildcard)
- Find available languages that don’t match any explicit range
- Return the first such language
This ensures explicit preferences always take priority over the wildcard.
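The three wildcard steps above can be sketched as follows. The helper names `range_matches?` and `wildcard_pick` are hypothetical, and the sketch assumes ranges are Strings and available languages are Symbols; the library's internals may differ.

```ruby
WILDCARD = "*"

# Hypothetical helper: exact or hyphen-bounded prefix match, case-insensitive.
def range_matches?(range, tag)
  range = range.downcase
  tag   = tag.to_s.downcase
  tag == range || tag.start_with?("#{range}-")
end

# Hypothetical helper: resolves a wildcard against the available languages.
def wildcard_pick(preferred_ranges, available)
  # Step 1: collect the explicitly listed ranges, leaving out the wildcard.
  explicit = preferred_ranges.reject { |range| range == WILDCARD }

  # Steps 2 and 3: return the first available language that no explicit
  # range matches, or nil when every available language is already covered.
  available.find do |tag|
    explicit.none? { |range| range_matches?(range, tag) }
  end
end

wildcard_pick(["fr", "*"], %i[fr-CA de ja]) # => :de
wildcard_pick(["fr", "*"], %i[fr])          # => nil
```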
Internal Design
The Matcher separates languages into two categories during initialization:
- preferred_langtags: Languages with q > 0, sorted by descending quality
- excluded_langtags: Languages with q = 0 (explicitly unacceptable)
This separation optimizes the matching algorithm by allowing quick filtering of excluded languages before attempting matches.
Thread Safety
Matcher instances are immutable after initialization. Both preferred_langtags and excluded_langtags are frozen, making instances safe for concurrent use.
Constant Summary
- HYPHEN = "-"
  This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.
  The hyphen character used as a subtag delimiter in language tags.
  Per RFC 4647 Section 3.3.1, prefix matching must respect hyphen boundaries. A language range matches a language tag only if the character immediately following the prefix is a hyphen.
- LANGTAG_TYPE_ERROR = "Language tag must be a Symbol"
  This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.
  Error message raised when an available language tag is not a Symbol.
  This guards against accidental non-Symbol values in the available languages array, which would cause unexpected behavior during matching.
- WILDCARD = "*"
  This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.
  The wildcard character that matches any language not explicitly listed.
  Per RFC 4647 Section 3.3.1, the wildcard has special semantics in HTTP:
  - It matches any language not matched by other ranges
  - *;q=0 makes all unlisted languages unacceptable
  - It has lower effective priority than explicit language ranges
Instance Attribute Summary
- #excluded_langtags ⇒ Set<String> (readonly, private)
  Language ranges explicitly marked as unacceptable (q=0).
- #preferred_langtags ⇒ Array<String> (readonly, private)
  Language ranges sorted by preference (descending quality value).
Instance Method Summary
- #call(*available_langtags) ⇒ Symbol? (private)
  Finds the best matching language from the available options.
- #initialize(**languages_range) ⇒ Matcher (constructor, private)
  Creates a new Matcher instance from parsed language preferences.
Constructor Details
#initialize(**languages_range) ⇒ Matcher
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Creates a new Matcher instance from parsed language preferences.
The initialization process:
- Separates excluded ranges (q=0) from preferred ranges (q > 0)
- Sorts preferred ranges by descending quality value
- Preserves original order for ranges with equal quality (stable sort)
Exclusion Rules
Only specific language ranges with q=0 are added to the exclusion set. The wildcard * is explicitly NOT added even when *;q=0 is present, because:
- Adding * to exclusions would break prefix matching logic
- *;q=0 semantics are: “no unlisted language is acceptable”
- This is achieved by having an empty preferred_langtags (no wildcards)
Stable Sorting
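The implementation sorts with sort_by and relies on elements with equal sort keys keeping their relative order, so that when multiple languages have the same quality value, the first one declared in the Accept-Language header wins. Ruby's documentation does not guarantee that sort_by is stable, so this ordering depends on the behavior of the sort implementation rather than on a language-level guarantee.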
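One way to make the declaration-order tie-break explicit, independent of how the sort algorithm treats equal keys, is to include each range's original position in the sort key. The following is a sketch and not the library's code: `sort_by_preference` is a hypothetical name, and the quality values are illustrative.

```ruby
# Hypothetical helper: orders ranges by descending quality, breaking ties
# by the position at which each range was declared.
def sort_by_preference(ranges)
  ranges
    .reject { |_, quality| quality.zero? }               # drop q=0 entries
    .each_with_index                                     # pair each entry with its position
    .sort_by { |(_, quality), index| [-quality, index] } # quality first, then position
    .map { |(range, _), _| range }                       # keep only the range
end

sort_by_preference({ "da" => 0.8, "en" => 1.0, "fr" => 0.8, "de" => 0.8, "zh" => 0 })
# => ["en", "da", "fr", "de"]
```

Because the index is part of the key, no two keys compare as equal, so the result does not depend on whether the underlying sort is stable.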
    # File 'lib/accept_language/matcher.rb', line 191
    def initialize(**languages_range)
      @excluded_langtags = ::Set[]

      languages_range.each do |langtag, quality|
        next unless quality.zero? && !wildcard?(langtag)

        # Exclude specific language ranges, but NOT the wildcard.
        # When "*;q=0" is specified, all non-listed languages become
        # unacceptable implicitly (they won't match any preferred_langtags).
        # Adding "*" to excluded_langtags would break prefix_match? logic.
        @excluded_langtags << langtag
      end

      # Sort by descending quality. Ruby's sort_by is stable, so languages
      # with identical quality values preserve their original order from
      # the Accept-Language header (first declared = higher priority).
      @preferred_langtags = languages_range
        .reject { |_, quality| quality.zero? }
        .sort_by { |_, quality| -quality }
        .map(&:first)
    end
Instance Attribute Details
#excluded_langtags ⇒ Set<String> (readonly)
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
The wildcard * is never added to this set, even when *;q=0 is specified. Wildcard exclusion is handled implicitly: when *;q=0 and no other languages have q > 0, the preferred_langtags list is empty, resulting in no matches.
Language ranges explicitly marked as unacceptable (q=0).
These ranges are filtered out from available languages before any matching occurs. Exclusions apply via prefix matching, so excluding en also excludes en-US, en-GB, etc.
    # File 'lib/accept_language/matcher.rb', line 137
    def excluded_langtags
      @excluded_langtags
    end
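To illustrate how exclusion by prefix matching behaves, here is a minimal sketch. The names `range_matches?` and `drop_excluded` are hypothetical stand-ins for the filtering step, not the library's own helpers, and the case-insensitive comparison is an assumption.

```ruby
require "set"

# Hypothetical helper: exact or hyphen-bounded prefix match, case-insensitive.
def range_matches?(range, tag)
  range = range.downcase
  tag   = tag.to_s.downcase
  tag == range || tag.start_with?("#{range}-")
end

# Hypothetical helper: removes every available language that matches an
# excluded range, either exactly or by prefix.
def drop_excluded(available, excluded_ranges)
  available.reject do |tag|
    excluded_ranges.any? { |range| range_matches?(range, tag) }
  end
end

drop_excluded(%i[en en-US fr eng], Set["en"]) # => [:fr, :eng]
```

Excluding en removes both :en and :"en-US", while :eng survives because it does not sit on a hyphen boundary.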
#preferred_langtags ⇒ Array<String> (readonly)
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Language ranges sorted by preference (descending quality value).
This array contains only ranges with q > 0, ordered from most preferred to least preferred. When quality values are equal, the original declaration order from the Accept-Language header is preserved.
The stable sort guarantee ensures deterministic matching: given the same header and available languages, the result is always the same.
    # File 'lib/accept_language/matcher.rb', line 156
    def preferred_langtags
      @preferred_langtags
    end
Instance Method Details
#call(*available_langtags) ⇒ Symbol?
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Finds the best matching language from the available options.
Algorithm
- Filter: Remove available languages that match any excluded range
- Match: For each preferred range (in quality order):
  - If it’s a wildcard, return the first available language not matching any other preferred range
  - Otherwise, return the first available language that matches via exact match or prefix match
- Result: Return the first match found, or nil if none
Return Value
The returned value preserves the exact form (case) of the matched element from available_langtags. This is important for direct use with APIs like I18n.locale= that may be case-sensitive.
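A small sketch of what case preservation means in practice: the comparison is done on downcased copies, but the element returned is the caller's original Symbol. `find_preserving_case` is a hypothetical name, and the case-insensitive comparison is an assumption of this sketch rather than a statement about the library's internals.

```ruby
# Hypothetical helper: finds the first available tag matching +range+ and
# returns it exactly as the caller supplied it.
def find_preserving_case(range, available)
  wanted = range.downcase

  available.find do |tag|
    candidate = tag.to_s.downcase # downcased copy, used for comparison only
    candidate == wanted || candidate.start_with?("#{wanted}-")
  end
end

find_preserving_case("en-gb", [:fr, :"en-GB"]) # => :"en-GB"
find_preserving_case("de", [:fr])              # => nil
```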
    # File 'lib/accept_language/matcher.rb', line 250
    def call(*available_langtags)
      available_langtags = drop_unacceptable(*available_langtags)
      return if available_langtags.empty?

      find_best_match(available_langtags)
    end