Class: AcceptLanguage::Matcher Private

Inherits:
Object
Defined in:
lib/accept_language/matcher.rb

Overview

This class is part of a private API. You should avoid using this class if possible, as it may be removed or be changed in the future.

Note:

This class is used internally by Parser#match and should not be instantiated directly. Use AcceptLanguage.parse followed by Parser#match instead.

Language Preference Matcher

Matcher implements the Basic Filtering matching scheme defined in RFC 4647 Section 3.3.1. It takes parsed language preferences (from Parser) and determines the optimal language choice from a set of available languages.

Overview

The matching process balances multiple factors:

  1. Quality values: Higher q-values indicate stronger user preference

  2. Declaration order: Tie-breaker when q-values are equal

  3. Prefix matching: Allows en to match en-US, en-GB, etc.

  4. Wildcards: The * range matches any otherwise unmatched language

  5. Exclusions: Languages with q=0 are explicitly unacceptable

RFC 4647 Section 3.3.1 Compliance

This implementation follows the Basic Filtering matching rules:

> A language-range matches a language-tag if it exactly equals the tag, or if it exactly equals a prefix of the tag such that the first tag character following the prefix is “-”.

This means:

  • en matches en, en-US, en-GB, en-Latn-US

  • en-US matches only en-US (not en or en-GB)

  • en does NOT match eng (no hyphen boundary)
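The rules above can be sketched as a small predicate. This is a minimal illustration, not the library's own code: the method name basic_filter_match? is hypothetical, and the sketch assumes both arguments are plain strings compared case-insensitively.

```ruby
# Hypothetical helper (not part of the library): returns true when the
# language range equals the tag, or equals a prefix of the tag that is
# immediately followed by a hyphen.
def basic_filter_match?(range, tag)
  range = range.downcase
  tag = tag.downcase

  tag == range || tag.start_with?("#{range}-")
end

basic_filter_match?("en", "en-Latn-US") # => true
basic_filter_match?("en-US", "en")      # => false
basic_filter_match?("en", "eng")        # => false
```

Appending the hyphen to the range before the prefix test is what keeps en from matching eng.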

Quality Value Semantics

Quality values have specific meanings per RFC 7231 Section 5.3.1:

  • q=1 (or omitted): Most preferred

  • 0 < q < 1: Acceptable with relative preference

  • q=0: Explicitly NOT acceptable

The q=0 case is special: it doesn’t just indicate low preference, it completely excludes the language from consideration. This is used with wildcards to express “any language except X”:

Accept-Language: *, en;q=0

Wildcard Behavior

The wildcard * matches any language not explicitly matched by another language range. This behavior is specific to HTTP, as noted in RFC 4647 Section 3.3.1. When processing a wildcard:

  1. Collect all explicitly listed language ranges (excluding the wildcard)

  2. Find available languages that don’t match any explicit range

  3. Return the first such language

This ensures explicit preferences always take priority over the wildcard.
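The three steps above can be sketched as follows. The helper names range_covers? and wildcard_pick are hypothetical, and the sketch assumes ranges are downcased strings and available languages are Symbols; the library's private methods may be organized differently.

```ruby
# Hypothetical helper: Basic Filtering test between a downcased range
# and an available language tag (Symbol or String).
def range_covers?(range, tag)
  tag = tag.to_s.downcase
  tag == range || tag.start_with?("#{range}-")
end

# Hypothetical helper: resolves a wildcard against the available tags.
def wildcard_pick(ranges, available)
  # Step 1: collect the explicit ranges, leaving the wildcard out.
  explicit = ranges.reject { |range| range == "*" }

  # Steps 2 and 3: return the first available language that no
  # explicit range matches (nil when every language is covered).
  available.find do |tag|
    explicit.none? { |range| range_covers?(range, tag) }
  end
end

wildcard_pick(["en", "*"], [:"en-US", :fr, :de]) # => :fr
```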

Internal Design

The Matcher separates languages into two categories during initialization:

  • preferred_langtags: Languages with q > 0, sorted by descending quality

  • excluded_langtags: Languages with q = 0 (explicitly unacceptable)

This separation optimizes the matching algorithm by allowing quick filtering of excluded languages before attempting matches.

Thread Safety

Matcher instances are immutable after initialization. Both preferred_langtags and excluded_langtags are frozen, making instances safe for concurrent use.

Examples:

Internal usage (via Parser)

# Don't do this:
matcher = AcceptLanguage::Matcher.new("en" => 1000, "fr" => 800)

# Do this instead:
AcceptLanguage.parse("en, fr;q=0.8").match(:en, :fr)

Since:

  • 1.0.0

Constant Summary

HYPHEN =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

The hyphen character used as a subtag delimiter in language tags.

Per RFC 4647 Section 3.3.1, prefix matching must respect hyphen boundaries. A language range matches a language tag only if the character immediately following the prefix is a hyphen.

Returns:

  • (String)

    “-”

Since:

  • 1.0.0

"-"
LANGTAG_TYPE_ERROR =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

Error message raised when an available language tag is not a Symbol.

This guards against accidental non-Symbol values in the available languages array, which would cause unexpected behavior during matching.
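A guard of this kind might look like the sketch below. The method name ensure_symbols! is hypothetical, and the exact point at which the library raises, as well as the final message format, are assumptions; only the message text is taken from the constant documented here.

```ruby
# Hypothetical guard (not the library's code): rejects any available
# language tag that is not a Symbol before matching starts.
def ensure_symbols!(available_langtags)
  available_langtags.each do |langtag|
    next if langtag.is_a?(Symbol)

    raise TypeError, "Language tag must be a Symbol"
  end
end

ensure_symbols!([:en, :"fr-CA"]) # passes silently
# ensure_symbols!([:en, "fr"])   # would raise TypeError
```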

Returns:

  • (String)

Since:

  • 1.0.0

"Language tag must be a Symbol"
WILDCARD =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

The wildcard character that matches any language not explicitly listed.

Per RFC 4647 Section 3.3.1, the wildcard has special semantics in HTTP:

  • It matches any language not matched by other ranges

  • *;q=0 makes all unlisted languages unacceptable

  • It has lower effective priority than explicit language ranges

Returns:

  • (String)

    “*”

Since:

  • 1.0.0

"*"

Constructor Details

#initialize(**languages_range) ⇒ Matcher

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Creates a new Matcher instance from parsed language preferences.

The initialization process:

  1. Separates excluded ranges (q=0) from preferred ranges (q > 0)

  2. Sorts preferred ranges by descending quality value

  3. Preserves original order for ranges with equal quality (stable sort)

Exclusion Rules

Only specific language ranges with q=0 are added to the exclusion set. The wildcard * is explicitly NOT added even when *;q=0 is present, because:

  • Adding * to exclusions would break prefix matching logic

  • *;q=0 semantics are: “no unlisted language is acceptable”

  • This is achieved by leaving the wildcard out of preferred_langtags, so unlisted languages match no preferred range

Stable Sorting

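The sort must keep elements with equal sort keys in their original relative order, so that when multiple languages have the same quality value, the first one declared in the Accept-Language header wins. The implementation relies on sort_by for this. Note that Ruby’s documentation for Enumerable#sort_by states that the result is not guaranteed to be stable, so this tie-break depends on the behavior of the underlying sort.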
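The initialization steps can be sketched outside the class as shown below. This is an illustration under stated assumptions, not the library's code: split_ranges is a hypothetical name, and the sketch adds the declaration index as an explicit secondary sort key so that the tie-break does not depend on how the sort treats equal keys.

```ruby
require "set"

# Hypothetical stand-in for the constructor's partitioning step.
# Takes ranges mapped to integer qualities (0..1000).
def split_ranges(languages_range)
  # Only specific ranges with q=0 are excluded; the wildcard is skipped.
  excluded = Set.new
  languages_range.each do |range, quality|
    excluded << range if quality.zero? && range != "*"
  end

  # Sort by descending quality, then by declaration index, so equal
  # qualities keep their order from the header.
  preferred = languages_range
              .reject { |_, quality| quality.zero? }
              .each_with_index
              .sort_by { |(_, quality), index| [-quality, index] }
              .map { |(range, _), _| range }

  [preferred, excluded]
end

preferred, excluded = split_ranges(
  { "fr" => 800, "en" => 1000, "de" => 800, "it" => 0, "*" => 0 }
)
preferred     # => ["en", "fr", "de"]
excluded.to_a # => ["it"]
```

Sorting on the pair [-quality, index] makes the ordering deterministic on any Ruby implementation, at the cost of building one small array per entry.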

Examples:

Matcher.new("en" => 1000, "fr" => 800, "de" => 0)
# preferred_langtags: ["en", "fr"]
# excluded_langtags: #<Set: {"de"}>

Parameters:

  • languages_range (Hash{String => Integer})

    language ranges mapped to quality values (0-1000), as produced by Parser

Since:

  • 1.0.0



# File 'lib/accept_language/matcher.rb', line 191

def initialize(**languages_range)
  @excluded_langtags = ::Set[]

  languages_range.each do |langtag, quality|
    next unless quality.zero? && !wildcard?(langtag)

    # Exclude specific language ranges, but NOT the wildcard.
    # When "*;q=0" is specified, all non-listed languages become
    # unacceptable implicitly (they won't match any preferred_langtags).
    # Adding "*" to excluded_langtags would break prefix_match? logic.
    @excluded_langtags << langtag
  end

  # Sort by descending quality. Ruby's sort_by is stable, so languages
  # with identical quality values preserve their original order from
  # the Accept-Language header (first declared = higher priority).
  @preferred_langtags = languages_range
                        .reject { |_, quality| quality.zero? }
                        .sort_by { |_, quality| -quality }
                        .map(&:first)
end

Instance Attribute Details

#excluded_langtags ⇒ Set<String> (readonly)

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Note:

The wildcard * is never added to this set, even when *;q=0 is specified. Wildcard exclusion is handled implicitly: when *;q=0 is specified and no other languages have q > 0, the preferred_langtags list is empty, resulting in no matches.

Language ranges explicitly marked as unacceptable (q=0).

These ranges are filtered out from available languages before any matching occurs. Exclusions apply via prefix matching, so excluding en also excludes en-US, en-GB, etc.
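The filtering step might look like the following sketch. The name drop_excluded is hypothetical and the library's private helper may differ; the sketch assumes excluded ranges are downcased strings and available languages are Symbols.

```ruby
require "set"

# Hypothetical filter (not the library's code): removes every available
# tag that an excluded range matches exactly or as a hyphen-delimited
# prefix.
def drop_excluded(available, excluded)
  available.reject do |tag|
    downcased = tag.to_s.downcase
    excluded.any? do |range|
      downcased == range || downcased.start_with?("#{range}-")
    end
  end
end

drop_excluded([:en, :"en-US", :eng, :fr], Set["en"]) # => [:eng, :fr]
```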

Examples:

# For "*, en;q=0, de;q=0"
matcher.excluded_langtags
# => #<Set: {"en", "de"}>

Returns:

  • (Set<String>)

    downcased language ranges with q=0

Since:

  • 1.0.0



# File 'lib/accept_language/matcher.rb', line 137

def excluded_langtags
  @excluded_langtags
end

#preferred_langtags ⇒ Array<String> (readonly)

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Language ranges sorted by preference (descending quality value).

This array contains only ranges with q > 0, ordered from most preferred to least preferred. When quality values are equal, the original declaration order from the Accept-Language header is preserved.

The stable sort guarantee ensures deterministic matching: given the same header and available languages, the result is always the same.

Examples:

# For "fr;q=0.8, en, de;q=0.9"
# Sorted: en (q=1), de (q=0.9), fr (q=0.8)
matcher.preferred_langtags
# => ["en", "de", "fr"]

Returns:

  • (Array<String>)

    downcased language ranges, highest quality first

Since:

  • 1.0.0



# File 'lib/accept_language/matcher.rb', line 156

def preferred_langtags
  @preferred_langtags
end

Instance Method Details

#call(*available_langtags) ⇒ Symbol?

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Finds the best matching language from the available options.

Algorithm

  1. Filter: Remove available languages that match any excluded range

  2. Match: For each preferred range (in quality order):

    • If it’s a wildcard, return the first available language not matching any other preferred range

    • Otherwise, return the first available language that matches via exact match or prefix match

  3. Result: Return the first match found, or nil if none

Return Value

The returned value preserves the exact form (case) of the matched element from available_langtags. This is important for direct use with APIs like I18n.locale= that may be case-sensitive.
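The algorithm and the case-preserving return value can be sketched end to end as below. The names covers? and best_match are hypothetical, the inputs are assumed to be downcased range strings plus Symbol tags, and the real private methods (drop_unacceptable, find_best_match) may be structured differently.

```ruby
# Hypothetical helper: Basic Filtering test against a downcased range.
def covers?(range, tag)
  tag = tag.to_s.downcase
  tag == range || tag.start_with?("#{range}-")
end

# Hypothetical end-to-end sketch of the matching algorithm.
def best_match(preferred, excluded, available)
  # 1. Filter: drop available tags matched by an excluded range.
  candidates = available.reject do |tag|
    excluded.any? { |range| covers?(range, tag) }
  end

  # 2. Match: walk the preferred ranges in quality order.
  explicit = preferred.reject { |range| range == "*" }
  preferred.each do |range|
    found =
      if range == "*"
        candidates.find { |tag| explicit.none? { |other| covers?(other, tag) } }
      else
        candidates.find { |tag| covers?(range, tag) }
      end
    # 3. Result: return the original element, case untouched.
    return found if found
  end

  nil
end

best_match(["en"], [], [:"en-US", :"en-GB"]) # => :"en-US"
best_match(["*"], ["en"], [:en, :fr])        # => :fr
best_match(["zh-tw"], [], [:"zh-TW"])        # => :"zh-TW"
```

Comparison happens on a downcased copy of each tag, while the element handed back is the caller's original Symbol, which is why :"zh-TW" comes back unchanged.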

Examples:

Basic matching

matcher = Matcher.new("en" => 1000, "fr" => 800)
matcher.call(:en, :fr, :de)
# => :en

Prefix matching

matcher = Matcher.new("en" => 1000)
matcher.call(:"en-US", :"en-GB")
# => :"en-US"

With exclusion

matcher = Matcher.new("*" => 500, "en" => 0)
matcher.call(:en, :fr)
# => :fr

Parameters:

  • available_langtags (Array<Symbol>)

    languages to match against

Returns:

  • (Symbol, nil)

    the best matching language, or nil

Raises:

  • (TypeError)

    if any available language tag is not a Symbol

Since:

  • 1.0.0



# File 'lib/accept_language/matcher.rb', line 250

def call(*available_langtags)
  filtered_tags = drop_unacceptable(*available_langtags)
  return if filtered_tags.empty?

  find_best_match(filtered_tags)
end