Class: Jekyll::Embed
- Inherits:
- Object
- Object
- Jekyll::Embed
- Defined in:
- lib/jekyll/embed.rb,
lib/jekyll/embed/cache.rb,
lib/jekyll/embed/filter.rb,
lib/jekyll/embed/content.rb
Overview
The idea behind this class is to find the best safe representation of a link. For a YouTube video, that could be a sandboxed iframe, which loads the video and lets you play it while preventing YouTube from calling home and sending data about your users. Other social networks, however, try to take control of their containers by modifying the page; they resist sandboxing and don’t work correctly inside it. For those, we clean up unwanted HTML tags such as <script> and return the remaining HTML, which you can style using CSS. Twitter is one example.
Others are only available through OGP, so we retrieve the metadata and render a template, which you can provide in your own theme too.
We also try microformats, and we would look at Schema.org too, but there doesn’t seem to be a gem for it yet.
If the URL doesn’t provide anything at all we get the URL, title and date of last visit.
Isn’t it nice that the corporations that require us to use OEmbed, OGP, Twitter Cards, Schema.org and other metadata don’t use them themselves?
Also we’re going to use heavy caching so we don’t hit rate limits or lose the representation if the service is down or the URL is removed. We may be tempted to store the resources locally (images, videos, audio) but we have to take into account that people have legitimate reasons to remove media from the Internet.
Defined Under Namespace
Modules: Filter
Classes: Cache, Content
Constant Summary
- IFRAME_ATTRIBUTES =
Attributes to apply, per HTML element
%w[allow sandbox referrerpolicy loading height width].freeze
- IMAGE_ATTRIBUTES =
%w[referrerpolicy loading height width].freeze
- MEDIA_ATTRIBUTES =
%w[controls height width].freeze
- A_ATTRIBUTES =
%w[referrerpolicy rel target].freeze
- DIRECTIVES =
Directives from Feature Policy
%w[accelerometer ambient-light-sensor autoplay battery camera display-capture document-domain encrypted-media execution-while-not-rendered execution-while-out-of-viewport fullscreen gamepad geolocation gyroscope layout-animations legacy-image-formats magnetometer microphone midi navigation-override oversized-images payment picture-in-picture publickey-credentials-get speaker-selection sync-xhr usb screen-wake-lock web-share xr-spatial-tracking].freeze
- INCLUDE_OGP =
Templates
'{% include ogp.html %}'
- INCLUDE_FALLBACK =
'{% include fallback.html %}'
- INCLUDE_EMBED =
'{% include embed.html %}'
- DEFAULT_CONFIG =
The default referrer policy only sends the origin URL (not the full URL, only the protocol/scheme and domain part) if the remote URL is HTTPS.
The default sandbox restrictions only allow scripts in the context of the iframe and opening new tabs.
{
  'scrub' => %w[form input textarea button fieldset select option optgroup canvas area map],
  'attributes' => {
    'referrerpolicy' => 'strict-origin-when-cross-origin',
    'sandbox' => %w[allow-scripts allow-popups allow-popups-to-escape-sandbox],
    'allow' => %w[fullscreen; gyroscope; picture-in-picture; clipboard-write;],
    'loading' => 'lazy',
    'controls' => true,
    'rel' => %w[noopener noreferrer],
    'target' => '_blank',
    'height' => nil,
    'width' => nil
  }
}.freeze
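As a minimal sketch of how a site-level override could combine with these defaults: the `deep_merge` helper below is a simplified, hypothetical stand-in for `Jekyll::Utils.deep_merge_hashes` (written in plain Ruby so it runs without Jekyll), and the values under `site_embed` are assumed, not taken from any real site.

```ruby
# Simplified, hypothetical stand-in for Jekyll::Utils.deep_merge_hashes.
# Nested hashes are merged recursively; any other value is replaced.
def deep_merge(defaults, overrides)
  defaults.merge(overrides) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# A trimmed copy of the defaults.
defaults = {
  'scrub' => %w[form input],
  'attributes' => {
    'referrerpolicy' => 'strict-origin-when-cross-origin',
    'loading' => 'lazy',
    'target' => '_blank'
  }
}

# What a site might place under the `embed` key of its configuration
# (assumed values, for illustration only).
site_embed = {
  'attributes' => { 'referrerpolicy' => 'no-referrer', 'width' => 560 }
}

merged = deep_merge(defaults, site_embed)
puts merged['attributes']
```

Only the keys the site sets are replaced; everything else keeps its default.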
Class Method Summary
- .cache ⇒ Jekyll::Embed::Cache
- .cleanup(html_fragment, url) ⇒ String
- .config ⇒ Hash
- .embed(url) ⇒ String
Render the URL as HTML.
- .fallback(url) ⇒ Object
Last resort: build a card from the page title, description and first image.
- .get(url) ⇒ Faraday::Response
- .http_client ⇒ Faraday::Connection
- .oembed(url) ⇒ String, NilClass
Try for OEmbed.
- .ogp(url) ⇒ String, NilClass
Try for OGP.
- .reset ⇒ nil
Reset variables.
- .site ⇒ Object
- .site=(site) ⇒ Jekyll::Site
This is an initializer of sorts.
- .text(node) ⇒ Object
Class Method Details
.cache ⇒ Jekyll::Embed::Cache
# File 'lib/jekyll/embed.rb', line 276
def cache
  @cache ||= Jekyll::Embed::Cache.new('Jekyll::Embed')
end
.cleanup(html_fragment, url) ⇒ String
# File 'lib/jekyll/embed.rb', line 295
def cleanup(html_fragment, url)
  html = Loofah.fragment(html_fragment).scrub!(:prune)

  # Add our own attributes
  html.css('iframe').each do |iframe|
    IFRAME_ATTRIBUTES.each do |attr|
      set_value_for_attr(iframe, attr)
    end

    # Embedding itself require allow-same-origin
    iframe['sandbox'] += allow_same_origin(url)
  end

  html.css('audio, video').each do |media|
    MEDIA_ATTRIBUTES.each do |attr|
      set_value_for_attr(media, attr)
    end

    media['src'] = UrlPrivacy.clean media['src']
  end

  html.css('img').each do |img|
    IMAGE_ATTRIBUTES.each do |attr|
      set_value_for_attr(img, attr)
    end
  end

  html.css('a').each do |a|
    A_ATTRIBUTES.each do |attr|
      set_value_for_attr(a, attr)
    end
  end

  html.css('[src]').each do |element|
    element['src'] = UrlPrivacy.clean(element['src'])
  end

  html.css('[href]').each do |element|
    element['href'] = UrlPrivacy.clean(element['href'])
  end

  # Return the cleaned up HTML as a String
  html.to_s
end
.config ⇒ Hash
# File 'lib/jekyll/embed.rb', line 178
def config
  @config ||= Jekyll::Utils.deep_merge_hashes(DEFAULT_CONFIG, (site.config['embed'] || {})).tap do |c|
    c['attributes']['allow'].concat (DIRECTIVES - c.dig('attributes', 'allow').join.split(';').map do |s|
      s.split(' ').first
    end).join(" 'none';|").split('|')
  end
end
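The `allow` handling in `.config` is dense, so here is a hedged, standalone sketch of the idea: every directive the configuration does not mention is added with `'none'`. `DIRECTIVES_SUBSET` and the sample `allow` entries are assumptions chosen to keep the output short, and joining the result with spaces is only for display.

```ruby
# A small subset of DIRECTIVES, to keep the example short (assumed values).
DIRECTIVES_SUBSET = %w[camera fullscreen geolocation microphone].freeze

# Entries as they might appear under `attributes.allow`.
allow = ['fullscreen;', 'clipboard-write;']

# Names the configuration already mentions: drop the trailing ";" and any
# allowlist that follows the directive name.
mentioned = allow.join.split(';').map { |s| s.split(' ').first }

# Every directive that was not mentioned is disabled explicitly.
denied = (DIRECTIVES_SUBSET - mentioned).map { |d| "#{d} 'none';" }

policy = (allow + denied).join(' ')
puts policy
```

The listing above builds the same entries with `join` and `split`, which leaves the last joined directive without the `'none';` suffix; the `map` form used in this sketch appends it to every entry.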
.embed(url) ⇒ String
Render the URL as HTML
- Try oembed for video and image
- If rich oembed, cleanup
- If OGP, render templates
- Else, render fallback template
# File 'lib/jekyll/embed.rb', line 158
def embed(url)
  raise URI::Error unless url.is_a? String

  url = url.strip

  # Quick check
  raise URI::Error unless url.start_with? 'http'

  # Just to verify the URL is valid
  # TODO: Use Addressable
  URI.parse url

  oembed(url) || ogp(url) || fallback(url) || url
rescue URI::Error
  Jekyll.logger.warn "#{url.inspect} is not a valid URL"

  url
end
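The strategy chain can be illustrated without any network access. In this sketch the three lambdas are hypothetical stand-ins for `oembed`, `ogp` and `fallback` (each returns HTML or `nil`), and `embed_sketch` is an assumed name, not part of the gem.

```ruby
require 'uri'

# Hypothetical stand-ins for the real strategies: each returns an HTML
# string, or nil when it cannot handle the URL.
oembed   = ->(url) { url.include?('video') ? "<iframe src=\"#{url}\"></iframe>" : nil }
ogp      = ->(url) { url.include?('article') ? "<article>#{url}</article>" : nil }
fallback = ->(_url) { nil }
strategies = [oembed, ogp, fallback]

def embed_sketch(url, strategies)
  raise URI::Error unless url.is_a?(String)

  url = url.strip
  raise URI::Error unless url.start_with?('http')

  # Raises URI::InvalidURIError (a URI::Error) on malformed input.
  URI.parse(url)

  # First strategy that produces something wins.
  strategies.each do |strategy|
    html = strategy.call(url)
    return html if html
  end

  # Nothing matched: hand back the bare URL.
  url
rescue URI::Error
  url
end

puts embed_sketch('https://example.com/video/1', strategies)
puts embed_sketch(' https://example.com/page ', strategies)
```

Returning the original URL when every strategy fails means a template can still render a plain link.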
.fallback(url) ⇒ Object
Last resort: build a card from the page title, description and first image
# File 'lib/jekyll/embed.rb', line 240
def fallback(url)
  cache.getset("fallback+#{url}") do
    html = Nokogiri::HTML.fragment get(url).body
    element = html.css('article').first
    element ||= html.css('section').first
    element ||= html.css('main').first
    element ||= html.css('body').first
    title = html.css('title').first
    description = html.css('meta[name="description"]').first

    context = info.dup
    context[:registers][:page] = payload['page'] = {
      'title' => text(title),
      'description' => text(description),
      'url' => url,
      'image' => element&.css('img')&.first&.public_send(:[], 'src'),
      'locale' => html.css('html')&.first&.public_send(:[], 'lang')
    }

    cleanup fallback_template.render!(payload, context), url
  end
rescue ArgumentError
  Jekyll.logger.warn 'Invalid contents (fallback):', url
  nil
rescue Faraday::Error, Nokogiri::SyntaxError
  nil
end
.get(url) ⇒ Faraday::Response
# File 'lib/jekyll/embed.rb', line 270
def get(url)
  @get_cache ||= {}
  @get_cache[url] ||= http_client.get url
end
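`.get` memoizes responses per URL, so the OGP and fallback strategies can share one request for the same page. A minimal sketch of that pattern, where `MemoizedFetcher` and its `fetch` method are hypothetical stand-ins for the module and `http_client.get`:

```ruby
class MemoizedFetcher
  attr_reader :requests

  def initialize
    @requests = 0
  end

  # Same shape as the listing: one hash, one entry per URL.
  def get(url)
    @get_cache ||= {}
    @get_cache[url] ||= fetch(url)
  end

  private

  # Hypothetical stand-in for `http_client.get url`; counts real requests.
  def fetch(url)
    @requests += 1
    "response for #{url}"
  end
end

fetcher = MemoizedFetcher.new
fetcher.get('https://example.com/a')
fetcher.get('https://example.com/a') # served from the hash
fetcher.get('https://example.com/b')
puts fetcher.requests
```

Because `||=` only assigns when the stored value is `nil` or `false`, a truthy response object is fetched once per URL.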
.http_client ⇒ Faraday::Connection
# File 'lib/jekyll/embed.rb', line 281
def http_client
  @http_client ||= Faraday.new do |builder|
    builder.options.timeout = 4
    builder.options.open_timeout = 1
    builder.options.read_timeout = 1
    builder.options.write_timeout = 1
    builder.use FaradayMiddleware::FollowRedirects
    builder.use :http_cache, shared_cache: false, store: cache, serializer: Marshal
  end
end
.oembed(url) ⇒ String, NilClass
Try for OEmbed
# File 'lib/jekyll/embed.rb', line 190
def oembed(url)
  cache.getset("oembed+#{url}") do
    oembed = OEmbed::Providers.get url

    # Prevent caching of nil?
    raise OEmbed::Error unless oembed.respond_to? :html

    context = info.dup
    context[:registers][:page] = payload['page'] = cleanup(oembed.html, url)

    embed_template.render!(payload, context)
  end
rescue OEmbed::Error
  nil
end
.ogp(url) ⇒ String, NilClass
Try for OGP.
# File 'lib/jekyll/embed.rb', line 209
def ogp(url)
  cache.getset("ogp+#{url}") do
    ogp = OGP::OpenGraph.new get(url).body
    page = {
      locale: ogp.locales.first,
      title: ogp.title,
      url: ogp.url,
      description: ogp.description,
      type: ogp.type,
      data: ogp.data
    }.transform_keys(&:to_s)

    %w[image video audio].each do |attr|
      page[attr] = ogp.public_send(:"#{attr}s").find do |a|
        a && a.url && http?(a.url)
      end&.url
    end

    context = info.dup
    context[:registers][:page] = payload['page'] = page

    cleanup ogp_template.render!(payload, context), url
  end
rescue ArgumentError
  Jekyll.logger.warn 'Invalid contents (OGP):', url
  nil
rescue LL::ParserError, OGP::MalformedSourceError, OGP::MissingAttributeError, Faraday::Error
  nil
end
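The media selection inside `.ogp` picks the first image, video or audio entry whose URL is HTTP(S). The sketch below isolates that step; the `Media` struct and this version of `http?` are assumptions, since the real helper is not shown on this page.

```ruby
# Minimal stand-in for an OGP media entry.
Media = Struct.new(:url)

# Hypothetical version of the `http?` helper referenced in the listing.
def http?(url)
  url.start_with?('http://', 'https://')
end

# Skip nil entries, entries without a URL, and non-HTTP(S) URLs.
def first_http_url(items)
  items.find { |item| item && item.url && http?(item.url) }&.url
end

images = [
  nil,
  Media.new(nil),
  Media.new('data:image/png;base64,AAAA'),
  Media.new('https://example.com/cover.png')
]

puts first_http_url(images)
```

When nothing qualifies, `find` returns `nil` and the safe-navigation `&.url` keeps the result `nil`, so the template can simply omit the media.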
.reset ⇒ nil
Reset variables
# File 'lib/jekyll/embed.rb', line 135
def reset
  @allow_same_origin =
    @cache =
    @config =
    @fallback_template =
    @get_cache =
    @http_client =
    @info =
    @ogp_template =
    @payload =
    @value_for_attr =
    nil
end
.site ⇒ Object
# File 'lib/jekyll/embed.rb', line 95
def site
  unless @site
    raise Jekyll::Errors::InvalidConfigurationError,
          'Site is missing, configure with `Jekyll::Embed.site = site`'
  end

  @site
end
.site=(site) ⇒ Jekyll::Site
This is an initializer of sorts
# File 'lib/jekyll/embed.rb', line 108
def site=(site)
  raise ArgumentError, 'Site must be a Jekyll::Site' unless site.is_a? Jekyll::Site

  @site = site

  # Add the _includes dir so we can provide default templates that
  # can be overriden locally or by the theme.
  includes_dir = File.expand_path(File.join(__dir__, '..', '..', '_includes'))
  site.includes_load_paths << includes_dir unless site.includes_load_paths.include? includes_dir

  # Since we're embedding, we're allowing iframes
  Loofah::HTML5::SafeList::ALLOWED_ELEMENTS_WITH_LIBXML2 << 'iframe'

  reset

  # Other elements that are disallowed
  config['scrub']&.each do |scrub|
    Loofah::HTML5::SafeList::ALLOWED_ELEMENTS_WITH_LIBXML2.delete(scrub)
  end

  payload['embed'] = config['attributes']
  site
end
.text(node) ⇒ Object
# File 'lib/jekyll/embed.rb', line 340
def text(node)
  node&.text&.tr("\n", '')&.tr("\r", '')&.strip&.squeeze(' ')
end
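To see what `.text` does to typical markup content, the sketch below runs the same expression on plain strings. The `Node` struct is a hypothetical stand-in for a Nokogiri node; only its `text` method matters here.

```ruby
# Stand-in for a Nokogiri node: anything that responds to #text.
Node = Struct.new(:text)

def text(node)
  node&.text&.tr("\n", '')&.tr("\r", '')&.strip&.squeeze(' ')
end

puts text(Node.new("  Jekyll \r\n   Embed  "))
p text(nil)
```

Line breaks are deleted rather than replaced with a space, so `"a\nb"` becomes `"ab"`; words stay separated only when the markup also has whitespace around the break. A missing node yields `nil` instead of raising.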