Class: Ur::ContentType
- Inherits: String
  - Object
    - String
      - Ur::ContentType
- Defined in: lib/ur/content_type.rb
Overview
Ur::ContentType represents a Content-Type header field. it parses the media type and its components, as well as any parameters.
this class aims to be permissive in what it will parse. it will not raise any error when given a malformed or syntactically invalid Content-Type string. fields and parameters parsed from invalid Content-Type strings are undefined, but this class generally tries to make the most sense of what it's given.
this class is based on RFCs:
- Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content. Section 3.1.1.1. Media Type https://tools.ietf.org/html/rfc7231#section-3.1.1.1
- Media Type Specifications and Registration Procedures https://tools.ietf.org/html/rfc6838
- Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies. Section 5.1. Syntax of the Content-Type Header Field https://tools.ietf.org/html/rfc2045#section-5.1
- Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types https://tools.ietf.org/html/rfc2046
- Additional Media Type Structured Syntax Suffixes https://tools.ietf.org/html/rfc6839
Constant Summary
- MEDIA_TYPE_REGEXP =
the character ranges in this SHOULD be significantly more restrictive, and the /<subtype> construct should not be optional. however, we'll aim to match whatever media type we are given.
example:
  MEDIA_TYPE_REGEXP.match('application/vnd.github+json').named_captures
  => {
    "media_type" => "application/vnd.github+json",
    "type" => "application",
    "subtype" => "vnd.github+json",
    "facet" => "vnd",
    "suffix" => "json",
  }
example of being more permissive than the spec allows:
  MEDIA_TYPE_REGEXP.match('where the %$*! am I').named_captures
  => {
    "media_type" => "where the %$*! am I",
    "type" => "where the %$*! am I",
    "subtype" => nil,
    "facet" => nil,
    "suffix" => nil
  }
  %r{
    (?<media_type>           # the media type includes the type and subtype
      (?<type>[^\/;\"]*)     # the type precedes the first slash
      (?:\/                  # slash
        (?<subtype>          # the subtype includes the facet, the suffix, and bits in between
          (?:
            (?<facet>[^.+;\"]*) # the facet name comes before the first . in the subtype
            \.               # dot
          )?
          [^\+;\"]*          # anything between facet and suffix
          (?:\+              # plus
            (?<suffix>[^;\"]*) # optional suffix
          )?
        )
      )?                     # the subtype should not be optional, but we will match a type without subtype anyway
    )
  }x
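The two documented examples can be extended with a few more inputs. The following is a minimal sketch, not part of the library: it copies the regular expression above into a hypothetical constant `MEDIA_TYPE_SKETCH` so it runs without the gem, and the helper name `media_type_parts` is likewise an assumption made for illustration.

```ruby
# Copy of the regexp shown above (comments dropped), under a hypothetical
# name so the sketch is self-contained and does not require the ur gem.
MEDIA_TYPE_SKETCH = %r{
  (?<media_type>
    (?<type>[^\/;\"]*)
    (?:\/
      (?<subtype>
        (?:
          (?<facet>[^.+;\"]*)
          \.
        )?
        [^\+;\"]*
        (?:\+
          (?<suffix>[^;\"]*)
        )?
      )
    )?
  )
}x

# Hypothetical helper: returns the named captures as a Hash.
def media_type_parts(header)
  MEDIA_TYPE_SKETCH.match(header).named_captures
end

# The match stops at the first ';', so parameters are not captured.
plain = media_type_parts('text/plain; charset=utf-8')
plain['media_type'] # => "text/plain"
plain['facet']      # => nil (no '.' in the subtype)

# Only the text before the first '.' is the facet.
nested = media_type_parts('application/vnd.api.v2+json')
nested['facet']     # => "vnd"
nested['suffix']    # => "json"

# Whitespace before ';' is captured as-is; the constructor strips it later.
spaced = media_type_parts('text/html ; charset=utf-8')
spaced['subtype']   # => "html "
```

Because every part of the pattern is optional, the match never fails on a String input; an empty string simply yields empty or nil captures.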
- SOME_TEXT_SUBTYPES =
  %w(
    x-www-form-urlencoded
    json
    json-seq
    jwt
    jose
    yaml
    x-yaml
    xml
    html
    css
    javascript
    ecmascript
  ).map(&:freeze).freeze
Instance Attribute Summary
-
#facet ⇒ String?
readonly
the 'facet' portion of our media type.
-
#media_type ⇒ String?
readonly
the media type of this content type.
-
#parameters ⇒ Hash<String, String>
readonly
parameters of this content type.
-
#subtype ⇒ String?
readonly
the 'subtype' portion of our media type.
-
#suffix ⇒ String?
readonly
the 'suffix' portion of our media type.
-
#type ⇒ String?
readonly
the 'type' portion of our media type.
Instance Method Summary
-
#binary?(unknown: true) ⇒ Boolean
does this content type appear to be binary? this library makes its best guess based on a very incomplete knowledge of which media types indicate binary or text.
-
#form_urlencoded? ⇒ Boolean
is this an x-www-form-urlencoded content type?
-
#initialize(*a) ⇒ ContentType
constructor
A new instance of ContentType.
-
#json? ⇒ Boolean
is this a JSON content type?
-
#subtype?(other_subtype) ⇒ Boolean
is the 'subtype' portion of our media type equal (case-insensitive) to the given other_subtype.
-
#suffix?(other_suffix) ⇒ Boolean
is the 'suffix' portion of our media type equal (case-insensitive) to the given other_suffix.
-
#type?(other_type) ⇒ Boolean
is the 'type' portion of our media type equal (case-insensitive) to the given other_type.
-
#type_application? ⇒ Boolean
is the 'type' portion of our media type 'application'.
-
#type_audio? ⇒ Boolean
is the 'type' portion of our media type 'audio'.
-
#type_image? ⇒ Boolean
is the 'type' portion of our media type 'image'.
-
#type_message? ⇒ Boolean
is the 'type' portion of our media type 'message'.
-
#type_multipart? ⇒ Boolean
is the 'type' portion of our media type 'multipart'.
-
#type_text? ⇒ Boolean
is the 'type' portion of our media type 'text'.
-
#type_video? ⇒ Boolean
is the 'type' portion of our media type 'video'.
-
#xml? ⇒ Boolean
is this an XML content type?
Constructor Details
#initialize(*a) ⇒ ContentType
Returns a new instance of ContentType.
  # File 'lib/ur/content_type.rb', line 73

  def initialize(*a)
    super

    scanner = StringScanner.new(self)

    if scanner.scan(MEDIA_TYPE_REGEXP)
      @media_type = scanner[:media_type].strip.freeze if scanner[:media_type]
      @type       = scanner[:type].strip.freeze       if scanner[:type]
      @subtype    = scanner[:subtype].strip.freeze    if scanner[:subtype]
      @facet      = scanner[:facet].strip.freeze      if scanner[:facet]
      @suffix     = scanner[:suffix].strip.freeze     if scanner[:suffix]
    end

    @parameters = Hash.new do |h, k|
      if k.respond_to?(:downcase) && k != k.downcase
        h[k.downcase]
      else
        nil
      end
    end

    while scanner.scan(/(;\s*)+/)
      key = scanner.scan(/[^;=\"]*/)
      if key && scanner.scan(/=/)
        value = String.new
        until scanner.eos? || scanner.check(/;/)
          if scanner.scan(/\s+/)
            ws = scanner[0]
            # discard trailing whitespace.
            # other whitespace isn't technically valid but we are permissive so we put it in the value.
            value << ws unless scanner.eos? || scanner.check(/;/)
          elsif scanner.scan(/"/)
            until scanner.eos? || scanner.scan(/"/)
              if scanner.scan(/\\/)
                value << scanner.getch unless scanner.eos?
              end
              value << scanner.scan(/[^\"\\]*/)
            end
          else
            value << scanner.scan(/[^\s;\"]*/)
          end
        end
        @parameters[key.downcase.freeze] = value.freeze
      end
    end

    @parameters.freeze

    freeze
  end
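To see how the parameter loop treats quoted strings, escapes, and stray whitespace, the sketch below lifts that loop out of the constructor into a hypothetical helper named `parse_parameters`. The helper and its plain Hash return value are assumptions for illustration only; the class itself stores the result in a frozen Hash with a case-insensitive default block.

```ruby
require 'strscan'

# Hypothetical helper: the parameter-scanning loop from the constructor,
# applied to the part of a Content-Type string that follows the media type.
def parse_parameters(rest)
  scanner = StringScanner.new(rest)
  parameters = {}
  while scanner.scan(/(;\s*)+/)
    key = scanner.scan(/[^;=\"]*/)
    if key && scanner.scan(/=/)
      value = String.new
      until scanner.eos? || scanner.check(/;/)
        if scanner.scan(/\s+/)
          ws = scanner[0]
          # keep interior whitespace, drop it when it trails the value
          value << ws unless scanner.eos? || scanner.check(/;/)
        elsif scanner.scan(/"/)
          # quoted string: a backslash escapes the next character
          until scanner.eos? || scanner.scan(/"/)
            if scanner.scan(/\\/)
              value << scanner.getch unless scanner.eos?
            end
            value << scanner.scan(/[^\"\\]*/)
          end
        else
          value << scanner.scan(/[^\s;\"]*/)
        end
      end
      parameters[key.downcase] = value
    end
  end
  parameters
end

# Single-quoted Ruby string, so \" below is a literal backslash and quote.
params = parse_parameters('; charset="utf-8"; Boundary="a\"b;c"; title=hello world ')
params['charset']  # => "utf-8"
params['boundary'] # => "a\"b;c"  (key downcased; ';' inside quotes is kept)
params['title']    # => "hello world"  (trailing whitespace discarded)
```

Note that an unterminated quoted string does not raise: the inner loop simply stops at the end of input, which is consistent with the permissive behaviour described in the overview.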
Instance Attribute Details
#facet ⇒ String? (readonly)
the 'facet' portion of our media type.
e.g. "vnd"
in content-type: application/vnd.github+json; charset="utf-8"
  # File 'lib/ur/content_type.rb', line 142

  def facet
    @facet
  end
#media_type ⇒ String? (readonly)
the media type of this content type.
e.g. "application/vnd.github+json"
in content-type: application/vnd.github+json; charset="utf-8"
  # File 'lib/ur/content_type.rb', line 127

  def media_type
    @media_type
  end
#parameters ⇒ Hash<String, String> (readonly)
parameters of this content type.
e.g. {"charset" => "utf-8"}
in content-type: application/vnd.github+json; charset="utf-8"
  # File 'lib/ur/content_type.rb', line 152

  def parameters
    @parameters
  end
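The constructor stores parameter keys downcased and gives the Hash a default block that retries a lookup with the downcased key. The following standalone sketch reproduces only that default block (the variable name `params` is an assumption) to show which lookups are case-insensitive and which are not.

```ruby
# Same default block as in the constructor: on a missing key that has a
# different downcased form, retry the lookup with the downcased key.
params = Hash.new do |h, k|
  if k.respond_to?(:downcase) && k != k.downcase
    h[k.downcase]
  else
    nil
  end
end
params['charset'] = 'utf-8'
params.freeze # the block only reads, so it still works on a frozen Hash

params['charset']      # => "utf-8"
params['Charset']      # => "utf-8"  (falls through to the default block)
params['CHARSET']      # => "utf-8"
params['boundary']     # => nil
params.key?('Charset') # => false    (key? does not consult the default block)
```

Only `#[]` goes through the default block; methods such as `key?` and `fetch` see the stored, downcased keys only.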
#subtype ⇒ String? (readonly)
the 'subtype' portion of our media type.
e.g. "vnd.github+json"
in content-type: application/vnd.github+json; charset="utf-8"
  # File 'lib/ur/content_type.rb', line 137

  def subtype
    @subtype
  end
#suffix ⇒ String? (readonly)
the 'suffix' portion of our media type.
e.g. "json"
in content-type: application/vnd.github+json; charset="utf-8"
  # File 'lib/ur/content_type.rb', line 147

  def suffix
    @suffix
  end
#type ⇒ String? (readonly)
the 'type' portion of our media type.
e.g. "application"
in content-type: application/vnd.github+json; charset="utf-8"
  # File 'lib/ur/content_type.rb', line 132

  def type
    @type
  end
Instance Method Details
#binary?(unknown: true) ⇒ Boolean
does this content type appear to be binary? this library makes its best guess based on a very incomplete knowledge of which media types indicate binary or text.
  # File 'lib/ur/content_type.rb', line 196

  def binary?(unknown: true)
    return false if type_text?

    SOME_TEXT_SUBTYPES.each do |cmpsubtype|
      return false if (suffix ? suffix.casecmp?(cmpsubtype) : subtype ? subtype.casecmp?(cmpsubtype) : false)
    end

    # these are generally binary
    return true if type_image? || type_audio? || type_video?

    # we're out of ideas
    return unknown
  end
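The order of the checks in `binary?` matters: a known text suffix wins over a binary-looking top-level type, and the `unknown:` keyword decides the result only when nothing else matched. Below is a minimal standalone sketch of that decision order. The function `binary_guess` and the constant `TEXT_SUBTYPES_SKETCH` are hypothetical names; the function takes the already-parsed parts as arguments instead of reading them from a ContentType instance.

```ruby
# Copy of SOME_TEXT_SUBTYPES under a hypothetical name.
TEXT_SUBTYPES_SKETCH = %w(
  x-www-form-urlencoded json json-seq jwt jose
  yaml x-yaml xml html css javascript ecmascript
).freeze

# Hypothetical stand-in for binary?, taking the parsed parts as arguments.
def binary_guess(type, subtype, suffix = nil, unknown: true)
  # 1. anything under text/ is treated as text
  return false if type.casecmp?('text')

  # 2. a known text name in the suffix (or, without a suffix, the subtype)
  name = suffix || subtype
  return false if name && TEXT_SUBTYPES_SKETCH.any? { |s| name.casecmp?(s) }

  # 3. image, audio and video are generally binary
  return true if %w(image audio video).any? { |t| type.casecmp?(t) }

  # 4. no idea: return whatever the caller asked for
  unknown
end

binary_guess('text', 'plain')                                # => false
binary_guess('application', 'vnd.github+json', 'json')       # => false
binary_guess('image', 'png')                                 # => true
binary_guess('image', 'svg+xml', 'xml')                      # => false (step 2 runs before step 3)
binary_guess('application', 'octet-stream')                  # => true  (the default for unknown)
binary_guess('application', 'octet-stream', unknown: false)  # => false
```

Passing `unknown: nil` lets a caller distinguish "known binary", "known text", and "could not tell".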
#form_urlencoded? ⇒ Boolean
is this an x-www-form-urlencoded content type?

  # File 'lib/ur/content_type.rb', line 224

  def form_urlencoded?
    suffix ? suffix.casecmp?('x-www-form-urlencoded') : subtype ? subtype.casecmp?('x-www-form-urlencoded') : false
  end
#json? ⇒ Boolean
is this a JSON content type?
  # File 'lib/ur/content_type.rb', line 212

  def json?
    suffix ? suffix.casecmp?('json') : subtype ? subtype.casecmp?('json') : false
  end
#subtype?(other_subtype) ⇒ Boolean
is the 'subtype' portion of our media type equal (case-insensitive) to the given other_subtype
  # File 'lib/ur/content_type.rb', line 164

  def subtype?(other_subtype)
    subtype ? subtype.casecmp?(other_subtype) : false
  end
#suffix?(other_suffix) ⇒ Boolean
is the 'suffix' portion of our media type equal (case-insensitive) to the given other_suffix
  # File 'lib/ur/content_type.rb', line 171

  def suffix?(other_suffix)
    suffix ? suffix.casecmp?(other_suffix) : false
  end
#type?(other_type) ⇒ Boolean
is the 'type' portion of our media type equal (case-insensitive) to the given other_type
  # File 'lib/ur/content_type.rb', line 157

  def type?(other_type)
    type ? type.casecmp?(other_type) : false
  end
#type_application? ⇒ Boolean
is the 'type' portion of our media type 'application'
  # File 'lib/ur/content_type.rb', line 254

  def type_application?
    type ? type.casecmp?('application') : false
  end
#type_audio? ⇒ Boolean
is the 'type' portion of our media type 'audio'
  # File 'lib/ur/content_type.rb', line 242

  def type_audio?
    type ? type.casecmp?('audio') : false
  end
#type_image? ⇒ Boolean
is the 'type' portion of our media type 'image'
  # File 'lib/ur/content_type.rb', line 236

  def type_image?
    type ? type.casecmp?('image') : false
  end
#type_message? ⇒ Boolean
is the 'type' portion of our media type 'message'
  # File 'lib/ur/content_type.rb', line 260

  def type_message?
    type ? type.casecmp?('message') : false
  end
#type_multipart? ⇒ Boolean
is the 'type' portion of our media type 'multipart'
  # File 'lib/ur/content_type.rb', line 266

  def type_multipart?
    type ? type.casecmp?('multipart') : false
  end
#type_text? ⇒ Boolean
is the 'type' portion of our media type 'text'
  # File 'lib/ur/content_type.rb', line 230

  def type_text?
    type ? type.casecmp?('text') : false
  end
#type_video? ⇒ Boolean
is the 'type' portion of our media type 'video'
  # File 'lib/ur/content_type.rb', line 248

  def type_video?
    type ? type.casecmp?('video') : false
  end
#xml? ⇒ Boolean
is this an XML content type?
  # File 'lib/ur/content_type.rb', line 218

  def xml?
    suffix ? suffix.casecmp?('xml') : subtype ? subtype.casecmp?('xml') : false
  end
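`json?`, `xml?`, and `form_urlencoded?` share one pattern: when a suffix is present only the suffix is compared, otherwise the whole subtype is compared, and a missing subtype yields false. A minimal sketch of that pattern follows; the helper name `suffix_or_subtype?` is hypothetical and does not exist in the library.

```ruby
# Hypothetical helper spelling out the nested ternary used by
# json?, xml? and form_urlencoded?.
def suffix_or_subtype?(subtype, suffix, name)
  if suffix
    suffix.casecmp?(name)   # a suffix, when present, is the only thing compared
  elsif subtype
    subtype.casecmp?(name)  # otherwise the whole subtype must match
  else
    false                   # no subtype was parsed at all
  end
end

suffix_or_subtype?('json', nil, 'json')               # => true   application/json
suffix_or_subtype?('JSON', nil, 'json')               # => true   comparison ignores case
suffix_or_subtype?('vnd.github+json', 'json', 'json') # => true   application/vnd.github+json
suffix_or_subtype?('svg+xml', 'xml', 'json')          # => false  image/svg+xml
suffix_or_subtype?(nil, nil, 'json')                  # => false  no subtype
```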