Module: S33r

Defined in:
lib/s33r/bucket.rb,
lib/s33r/client.rb,
lib/s33r/s3_acl.rb,
lib/s33r/s3_obj.rb,
lib/s33r/utility.rb,
lib/s33r/networking.rb,
lib/s33r/s3_logging.rb,
lib/s33r/bucket_listing.rb,
lib/s33r/s33r_exception.rb,
lib/s33r/orderly_xml_markup.rb

Overview

Core functionality for managing HTTP requests to S3.

Defined Under Namespace

Modules: InBucket, Networking, S3ACL, S3Exception, S3Logging

Classes: Bucket, BucketListing, Client, OrderlyXmlMarkup, S3Object

Constant Summary

HOST =
's3.amazonaws.com'
PORT =
443
NON_SSL_PORT =
80
METADATA_PREFIX =
'x-amz-meta-'
DEFAULT_CHUNK_SIZE =

Size in bytes of each chunk sent per request when putting files (1 MiB).

1048576
AWS_HEADER_PREFIX =
'x-amz-'
AWS_AUTH_HEADER_VALUE =
"AWS %s:%s"
INTERESTING_HEADERS =
['content-md5', 'content-type', 'date']
REQUIRED_HEADERS =

Headers which must be included with every request to S3.

['Content-Type', 'Date']
CANNED_ACLS =

Canned ACLs made available by S3.

['private', 'public-read', 'public-read-write', 'authenticated-read']
METHOD_VERBS =

HTTP methods which S3 will respond to.

['GET', 'PUT', 'HEAD', 'DELETE']
BUCKET_LIST_MAX_MAX_KEYS =

Maximum number which can be passed in max-keys parameter when GETting bucket list.

1000
DEFAULT_EXPIRY_SECS =

Default number of seconds an authenticated URL will last for (15 minutes).

60 * 15
FAR_FUTURE =

Number of years to use for expiry date when :expires is set to :far_flung_future.

20
RESPONSE_NAMESPACE_URI =

The namespace used for response body XML documents.

"http://s3.amazonaws.com/doc/2006-03-01/"
PERMISSIONS =

Permissions which can be set within a <Grant> (see docs.amazonwebservices.com/AmazonS3/2006-03-01/UsingPermissions.html).

NB I’ve missed out the WRITE_ACP permission as this is functionally equivalent to FULL_CONTROL.

{ 
  :read => 'READ',  # permission to read
  :write => 'WRITE',  # permission to write
  :read_acl => 'READ_ACP',  # permission to read ACL settings
  :all => 'FULL_CONTROL'  # do anything
}
NAMESPACE =

Used for generating ACL XML documents.

'xsi'
NAMESPACE_URI =
'http://www.w3.org/2001/XMLSchema-instance'
GRANTEE_TYPES =
{
  :amazon_customer => 'AmazonCustomerByEmail', 
  :canonical_user => 'CanonicalUser',
  :group => 'Group'
}
S3_GROUP_TYPES =
{
  :all_users => 'global/AllUsers',
  :authenticated_users => 'global/AuthenticatedUsers',
  :log_delivery => 's3/LogDelivery'
}
GROUP_ACL_URI_BASE =
'http://acs.amazonaws.com/groups/'

Class Method Details

.keys_to_symbols(hsh) ⇒ Object

Return the hash hsh with keys converted to symbols.



# File 'lib/s33r/utility.rb', line 407

def self.keys_to_symbols(hsh)
  symbolised = {}
  hsh.each_pair do |key, value|
    symbolised[key.to_sym] = value
  end
  symbolised
end

.load_config(config_file) ⇒ Object

Load a YAML configuration file for S33r operations. The configuration file looks like this:

:include: test/files/config.yaml

The options section of the YAML file is optional; use it to add application-specific settings.

Note that the loader also runs the configuration file through ERB, so you can embed dynamic blocks of ERB (Ruby) code in your files.

config_file is the path to the configuration file.

Returns an array [config, options], where config is a hash of standard S33r options (:access, :secret), and options is a hash of general application options.

The keys for both hashes are converted from strings into symbols.



# File 'lib/s33r/utility.rb', line 83

def self.load_config(config_file)
  config = YAML::load(ERB.new(IO.read(config_file)).result)
  
  # The options section is optional, so fall back to an empty hash.
  options = config.delete('options') || {}
  options = S33r.keys_to_symbols(options)
  
  config = S33r.keys_to_symbols(config)
  
  [config, options]
end
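
As a rough illustration of the loading steps described above, here is a standalone sketch that does not need the s33r gem. The configuration text is hypothetical (the real sample lives in test/files/config.yaml and is not reproduced here), and the sketch falls back to an empty hash when the options section is absent.

```ruby
require 'yaml'
require 'erb'

# Hypothetical configuration text; keys other than access/secret/options
# are made up for illustration.
config_text = <<~CONFIG
  access: <%= 'AKIA' + 'EXAMPLE' %>
  secret: example-secret
  options:
    default_bucket: my-bucket
CONFIG

# Same steps as the listing: render ERB, parse the YAML, split off the
# 'options' section, then convert the top-level keys to symbols.
parsed  = YAML.load(ERB.new(config_text).result)
options = (parsed.delete('options') || {}).transform_keys(&:to_sym)
config  = parsed.transform_keys(&:to_sym)
```

Only top-level keys are symbolised; any nested hashes would keep their string keys.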

.parse_expiry(expires = nil) ⇒ Object

Parse an expiry date into seconds since the epoch.

expires can be set to :far_flung_future to get a time FAR_FUTURE years in the future; to a date/time string (parseable by Time.parse); or to an integer representing seconds since the epoch. If you leave it blank, you’ll get the current time + DEFAULT_EXPIRY_SECS.

Returns an integer representing seconds since the epoch.



# File 'lib/s33r/utility.rb', line 423

def self.parse_expiry(expires=nil)
  unless expires.kind_of?(Integer)
    if expires.is_a?(String)
      expires = Time.parse(expires).to_i
    else
      base_expires = Time.now.to_i
      if :far_flung_future == expires
        # FAR_FUTURE years from now (same as forever in computer terms)
        expires = (base_expires + (60 * 60 * 24 * 365.25 * FAR_FUTURE)).to_i
      else
        # default to DEFAULT_EXPIRY_SECS seconds from now if expires not set
        expires = base_expires + DEFAULT_EXPIRY_SECS
      end
    end
  end
  expires
end
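
The branches above can be exercised with a small standalone sketch. The module name ExpirySketch and the injectable now argument are assumptions made for this example (they keep the results reproducible); the constants copy the values from the Constant Summary.

```ruby
require 'time'

# Hypothetical stand-in for S33r.parse_expiry, for illustration only.
module ExpirySketch
  DEFAULT_EXPIRY_SECS = 60 * 15
  FAR_FUTURE = 20

  def self.parse_expiry(expires = nil, now = Time.now)
    return expires if expires.is_a?(Integer)                 # already epoch seconds
    return Time.parse(expires).to_i if expires.is_a?(String) # date/time string
    if expires == :far_flung_future
      (now.to_i + 60 * 60 * 24 * 365.25 * FAR_FUTURE).to_i
    else
      now.to_i + DEFAULT_EXPIRY_SECS                         # nil or anything else
    end
  end
end

now = Time.at(1_000_000_000)
from_int     = ExpirySketch.parse_expiry(1_200_000_000, now)
from_string  = ExpirySketch.parse_expiry('2007-03-27T19:36:42Z', now)
from_default = ExpirySketch.parse_expiry(nil, now)
far_future   = ExpirySketch.parse_expiry(:far_flung_future, now)
```

Note that, as in the listing, a Time object passed as expires is neither an Integer nor a String, so it falls through to the default branch.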

.remove_namespace(xml_in) ⇒ Object

Remove the namespace declaration from S3 XML response bodies (libxml isn’t fond of it).



# File 'lib/s33r/utility.rb', line 443

def self.remove_namespace(xml_in)
  # Escape the URI so that dots and other metacharacters match literally.
  namespace = Regexp.escape(S33r::RESPONSE_NAMESPACE_URI)
  xml_in.gsub(/ xmlns="#{namespace}"/, '')
end
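
A short standalone sketch of the effect on a response body follows. The module name NamespaceSketch and the sample XML are made up for illustration; the pattern here is built with Regexp.escape.

```ruby
# Hypothetical stand-in for S33r.remove_namespace, for illustration only.
module NamespaceSketch
  RESPONSE_NAMESPACE_URI = 'http://s3.amazonaws.com/doc/2006-03-01/'

  def self.remove_namespace(xml_in)
    xml_in.gsub(/ xmlns="#{Regexp.escape(RESPONSE_NAMESPACE_URI)}"/, '')
  end
end

xml = '<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">' \
      '<Name>my-bucket</Name></ListBucketResult>'
stripped = NamespaceSketch.remove_namespace(xml)
```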

Instance Method Details

#bucket_name_valid?(bucket_name) ⇒ Boolean

Ensure that a bucket_name is well-formed: it must be a String with no leading or trailing slash. Raises MalformedBucketName otherwise.

Returns:

  • (Boolean)


# File 'lib/s33r/utility.rb', line 263

def bucket_name_valid?(bucket_name)
  if !bucket_name.is_a?(String)
    raise MalformedBucketName, "Bucket name must be a string"
  elsif ('/' == bucket_name[0,1] || '/' == bucket_name[-1,1])
    raise MalformedBucketName, "Bucket name cannot have a leading or trailing slash"
  end
  # Return true for a well-formed name, matching the documented Boolean result.
  true
end

#canned_acl_header(canned_acl) ⇒ Object

Get a canned ACL setter header.



# File 'lib/s33r/utility.rb', line 243

def canned_acl_header(canned_acl)
  headers = {}
  unless canned_acl.nil?
    unless CANNED_ACLS.include?(canned_acl)
      raise UnsupportedCannedACL, "The canned ACL #{canned_acl} is not supported"
    end
    headers[AWS_HEADER_PREFIX + 'acl'] = canned_acl
  end
  headers
end
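
To make the three outcomes concrete (header set, no header, unsupported value), here is a standalone sketch. AclHeaderSketch and its nested exception class are hypothetical stand-ins; in the gem the exception is defined elsewhere.

```ruby
# Hypothetical stand-in for #canned_acl_header, for illustration only.
module AclHeaderSketch
  CANNED_ACLS = ['private', 'public-read', 'public-read-write', 'authenticated-read']
  AWS_HEADER_PREFIX = 'x-amz-'

  class UnsupportedCannedACL < StandardError; end

  def self.canned_acl_header(canned_acl)
    return {} if canned_acl.nil?
    unless CANNED_ACLS.include?(canned_acl)
      raise UnsupportedCannedACL, "The canned ACL #{canned_acl} is not supported"
    end
    { AWS_HEADER_PREFIX + 'acl' => canned_acl }
  end
end

with_acl    = AclHeaderSketch.canned_acl_header('public-read')
without_acl = AclHeaderSketch.canned_acl_header(nil)

rejected = begin
  AclHeaderSketch.canned_acl_header('world-writable')
  false
rescue AclHeaderSketch::UnsupportedCannedACL
  true
end
```

The canned ACL must be passed as a string; a symbol such as :private is not in CANNED_ACLS and would be rejected.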

#content_headers(content_type, key = 'download', render_as_attachment = false) ⇒ Object

Content transfer headers: set the Content-Type, Content-Transfer-Encoding and Content-Disposition headers.

content_type is the content type string to send in the header, e.g. ‘text/html’; ‘text/plain’ is sent if it is nil.

key is the key for the object, used as the filename if the file is downloaded; it defaults to ‘download’ if not set. If you use a path (e.g. ‘/home/you/photos/me.jpg’), just the last part (‘me.jpg’) is used as the name of the download file.

render_as_attachment: set to true to add a Content-Disposition header, so that a browser downloads the object rather than showing it inline.



# File 'lib/s33r/utility.rb', line 227

def content_headers(content_type, key='download', render_as_attachment=false)
  headers = {}
  
  # Apply the default before the MIME lookup so a nil content_type is never looked up.
  content_type ||= 'text/plain'
  headers['Content-Type'] = content_type
  mime_type = MIME::Types[content_type][0]
  if mime_type
    headers['Content-Transfer-Encoding'] = 'binary' if mime_type.binary?
  end
  if render_as_attachment
    headers['Content-Disposition'] = "attachment; filename=#{File.basename(key)}"
  end

  headers
end

#default_headers(existing_headers, options = {}) ⇒ Object

Build the headers required with every S3 request (Date and Content-Type). Only the required headers missing from existing_headers are returned. The options hash can supply values through :date and :content_type; if these are omitted, Date defaults to the current time and Content-Type to an empty string.



# File 'lib/s33r/utility.rb', line 180

def default_headers(existing_headers, options={})
  headers = {}
  
  # which default headers required by AWS are missing?
  missing_headers = REQUIRED_HEADERS - existing_headers.keys

  if missing_headers.include?('Content-Type')
    headers['Content-Type'] = options[:content_type] || ''
  end

  if missing_headers.include?('Date')
    date = options[:date] || Time.now
    headers['Date'] = date.httpdate
  end

  headers
end
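
A standalone sketch of how the missing headers are filled in is shown below. DefaultHeadersSketch is a hypothetical name, and the fixed date is used only to keep the example reproducible.

```ruby
require 'time' # provides Time#httpdate

# Hypothetical stand-in for #default_headers, for illustration only.
module DefaultHeadersSketch
  REQUIRED_HEADERS = ['Content-Type', 'Date']

  def self.default_headers(existing_headers, options = {})
    headers = {}
    missing = REQUIRED_HEADERS - existing_headers.keys
    headers['Content-Type'] = options[:content_type] || '' if missing.include?('Content-Type')
    headers['Date'] = (options[:date] || Time.now).httpdate if missing.include?('Date')
    headers
  end
end

existing   = { 'Content-Type' => 'text/html' }
fixed_date = Time.utc(2007, 3, 27, 19, 36, 42)

# Only the missing Date header comes back; the caller merges it in.
added  = DefaultHeadersSketch.default_headers(existing, :date => fixed_date)
merged = existing.merge(added)
```

The comparison against REQUIRED_HEADERS is case-sensitive, so an existing 'content-type' key (lower case) would be treated as missing.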

#generate_auth_header_value(method, path, headers, aws_access_key, aws_secret_access_key, subdomain) ⇒ Object

Get the value for the AWS authentication header.

Raises:

  • (MethodNotAllowed)

  • (MissingRequiredHeaders)

  • (KeysIncomplete)


# File 'lib/s33r/utility.rb', line 150

def generate_auth_header_value(method, path, headers, aws_access_key, aws_secret_access_key, subdomain)
  raise MethodNotAllowed, "Method %s not available" % method if !METHOD_VERBS.include?(method)

  # check the headers needed for authentication have been set
  missing_headers = REQUIRED_HEADERS - headers.keys
  if !(missing_headers.empty?)
    raise MissingRequiredHeaders,
      "Headers required for AWS auth value are missing: " + missing_headers.join(', ')
  end

  raise KeysIncomplete, "Access key or secret access key nil" \
  if aws_access_key.nil? or aws_secret_access_key.nil?

  # get the AWS header
  canonical_string = generate_canonical_string(method, path, headers, nil, subdomain)
  signature = generate_signature(aws_secret_access_key, canonical_string)
  AWS_AUTH_HEADER_VALUE % [aws_access_key, signature]
end
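
The final step, formatting the header value from the access key and signature, can be sketched on its own. The keys and canonical string below are hypothetical, and the Base64 step uses Array#pack so the sketch has no extra dependencies.

```ruby
require 'openssl'

AUTH_VALUE_FORMAT = 'AWS %s:%s' # same template as AWS_AUTH_HEADER_VALUE

# Hypothetical credentials and canonical string, for illustration only.
access_key       = 'EXAMPLE-ACCESS-KEY'
secret_key       = 'example-secret'
canonical_string = "GET\n\n\nTue, 27 Mar 2007 19:36:42 +0000\n/my-bucket/my-key"

# HMAC-SHA1 over the canonical string, then Base64 without line breaks.
digest    = OpenSSL::HMAC.digest(OpenSSL::Digest.new('SHA1'), secret_key, canonical_string)
signature = [digest].pack('m0')

auth_value      = AUTH_VALUE_FORMAT % [access_key, signature]
request_headers = { 'Authorization' => auth_value }
```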

#generate_canonical_string(method, path, headers = {}, expires = nil, subdomain = nil) ⇒ Object

Build canonical string for signing; modified (slightly) from the Amazon sample code.

  • method is one of the available METHOD_VERBS.

  • path is the path part of the URL to generate the canonical string for.

  • headers is a hash of headers which are going to be sent with the request.

  • expires is the expiry time set in the querystring for authenticated URLs: if supplied, it is used for the date header.

  • subdomain is the bucket name when using “virtual hosts”.



# File 'lib/s33r/utility.rb', line 103

def generate_canonical_string(method, path, headers={}, expires=nil, subdomain=nil)
  interesting_headers = {}
  headers.each do |key, value|
    lk = key.downcase
    if (INTERESTING_HEADERS.include?(lk) or lk =~ /^#{AWS_HEADER_PREFIX}/o)
      interesting_headers[lk] = value
    end
  end

  # These fields get empty strings if they don't exist.
  interesting_headers['content-type'] ||= ''
  interesting_headers['content-md5'] ||= ''

  # If you're using expires for query string auth, then it trumps date.
  if not expires.nil?
    interesting_headers['date'] = expires
  end

  buf = ''

  buf << "#{method}\n"
  interesting_headers.sort { |a, b| a[0] <=> b[0] }.each do |key, value|
    if key =~ /^#{AWS_HEADER_PREFIX}/o
      buf << "#{key}:#{value}\n"
    else
      buf << "#{value}\n"
    end
  end

  buf << '/' + subdomain unless subdomain.nil?

  # Ignore everything after the question mark...
  buf << path.gsub(/\?.*$/, '')

  # ...unless there is an acl, logging or torrent parameter
  if path =~ /[&?]acl($|&|=)/
    buf << '?acl'
  elsif path =~ /[&?]torrent($|&|=)/
    buf << '?torrent'
  elsif path =~ /[&?]logging($|&|=)/
    buf << '?logging'
  end

  buf
end
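
To show what the resulting string looks like, here is a simplified standalone sketch. It leaves out the subdomain argument and the torrent and logging cases, and the module name CanonicalStringSketch plus the sample headers are assumptions for illustration.

```ruby
# Simplified, hypothetical stand-in for #generate_canonical_string.
module CanonicalStringSketch
  INTERESTING_HEADERS = ['content-md5', 'content-type', 'date']
  AWS_HEADER_PREFIX = 'x-amz-'

  def self.build(method, path, headers = {}, expires = nil)
    # content-type and content-md5 are always present, as empty strings if unset.
    interesting = { 'content-type' => '', 'content-md5' => '' }
    headers.each do |key, value|
      lk = key.downcase
      if INTERESTING_HEADERS.include?(lk) || lk.start_with?(AWS_HEADER_PREFIX)
        interesting[lk] = value
      end
    end
    interesting['date'] = expires unless expires.nil? # expires trumps date

    buf = "#{method}\n"
    interesting.sort.each do |key, value|
      # x-amz-* headers appear as "name:value"; the others as the bare value.
      buf << (key.start_with?(AWS_HEADER_PREFIX) ? "#{key}:#{value}\n" : "#{value}\n")
    end
    buf << path.sub(/\?.*$/, '')                   # drop the querystring...
    buf << '?acl' if path =~ /[&?]acl($|&|=)/      # ...but keep an acl marker
    buf
  end
end

headers = {
  'Date'              => 'Tue, 27 Mar 2007 19:36:42 +0000',
  'x-amz-meta-myname' => 'elliot'
}
canonical = CanonicalStringSketch.build('GET', '/my-bucket/my-key?acl', headers)
```

The two blank lines after the method are the empty content-md5 and content-type values; the headers are emitted in sorted key order.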

#generate_querystring(pairs = nil) ⇒ Object

Convert a hash of name/value pairs to querystring variables. Name for a variable can be a string or symbol.



# File 'lib/s33r/utility.rb', line 273

def generate_querystring(pairs=nil)
  str = ''
  pairs ||= {}
  if pairs.size > 0
    name_value_pairs = pairs.map do |key, value|
      if value.nil?
        key
      else
        "#{key}=#{CGI::escape(value.to_s)}"
      end
    end
    str += name_value_pairs.join('&')
  end
  str
end
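
A compact standalone sketch of the same conversion follows; the method name querystring_sketch and the sample pairs are made up for illustration.

```ruby
require 'cgi'

# Hypothetical stand-in for #generate_querystring, for illustration only.
def querystring_sketch(pairs = nil)
  (pairs || {}).map do |key, value|
    # nil values produce a bare key; other values are CGI-escaped.
    value.nil? ? key.to_s : "#{key}=#{CGI.escape(value.to_s)}"
  end.join('&')
end

querystring = querystring_sketch(:acl => nil, 'prefix' => 'photos/2007', 'max-keys' => 10)
empty       = querystring_sketch(nil)
```

Only values are escaped; keys are written out as given, and no leading '?' is added.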

#generate_signature(aws_secret_access_key, str) ⇒ Object

Sign the given string with the aws_secret_access_key by computing its HMAC-SHA1 digest and then Base64-encoding the result.



# File 'lib/s33r/utility.rb', line 171

def generate_signature(aws_secret_access_key, str)
  # OpenSSL::Digest::Digest is no longer available in current Ruby; use OpenSSL::Digest.
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA1"), aws_secret_access_key, str)
  Base64.encode64(digest).strip
end
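
The signing step can be tried in isolation with the sketch below. The secret and the string to sign are hypothetical, and Array#pack('m0') is swapped in for Base64.encode64(...).strip so that only the openssl library is needed.

```ruby
require 'openssl'

# Hypothetical stand-in for #generate_signature, for illustration only.
def signature_sketch(secret, str)
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new('SHA1'), secret, str)
  [digest].pack('m0') # Base64 without trailing newline
end

string_to_sign = "GET\n\n\n1175139620\n/my-bucket/my-key"
signature      = signature_sketch('example-secret', string_to_sign)
other          = signature_sketch('another-secret', string_to_sign)
```

A SHA1 digest is 20 bytes, so the Base64 signature is always 28 characters long; the same inputs always give the same signature.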

#guess_mime_type(file_name) ⇒ Object

Guess a file’s mime type. If the mime_type for a file cannot be guessed, “text/plain” is used.



# File 'lib/s33r/utility.rb', line 256

def guess_mime_type(file_name)
  mime_type = MIME::Types.type_for(file_name)[0]
  mime_type ||= MIME::Types['text/plain'][0]
  mime_type
end

#metadata_headers(metadata = {}) ⇒ Object

Add metadata headers, correctly prefixing them first, e.g. you might do metadata_headers(‘myname’ => ‘elliot’, ‘myage’ => 36) to add two headers to the request:

x-amz-meta-myname: elliot
x-amz-meta-myage: 36

Keys shouldn’t have spaces; they can also be represented using symbols.

Returns a hash of the prefixed metadata headers, with both keys and values as strings.



# File 'lib/s33r/utility.rb', line 208

def metadata_headers(metadata={})
  headers = {}
  unless metadata.empty?
    metadata.each { |key, value| headers[METADATA_PREFIX + key.to_s] = value.to_s }
  end
  end
  headers
end
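
A standalone sketch of the prefixing, mixing a string key and a symbol key, is given below; the method name metadata_headers_sketch is hypothetical.

```ruby
# Hypothetical stand-in for #metadata_headers, for illustration only.
def metadata_headers_sketch(metadata = {})
  prefix = 'x-amz-meta-' # same value as METADATA_PREFIX
  metadata.each_with_object({}) do |(key, value), headers|
    headers[prefix + key.to_s] = value.to_s
  end
end

meta_headers = metadata_headers_sketch('myname' => 'elliot', :myage => 36)
```

The integer 36 is converted to the string '36', since header values are sent as text.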

#s3_authenticated_url(aws_access_key, aws_secret_access_key, options = {}) ⇒ Object

Generate a GET-able URL for an S3 resource which passes authentication details in the querystring. Note that this will correctly generate authenticated URLs for logging and ACL resources.

options are passed through to s3_path and s3_url; an :expires option is also available:

  • :expires => <date time>: when the URL expires (seconds since the epoch); S33r.parse_expiry is used to generate a suitable value from a date/time string, or you can use an int. Use :far_flung_future to get some time in the distant future. Defaults to current time + S33r::DEFAULT_EXPIRY_SECS.

Raises:

  • (KeysIncomplete)


# File 'lib/s33r/utility.rb', line 386

def s3_authenticated_url(aws_access_key, aws_secret_access_key, options={})
  raise KeysIncomplete, "You must supply both an AWS access key and secret access key " \
    "to create an authenticated URL" if aws_access_key.nil? or aws_secret_access_key.nil?

  path = s3_path(options)
  expires = S33r.parse_expiry(options[:expires])
  
  canonical_string = generate_canonical_string('GET', path, {}, expires)
  signature = generate_signature(aws_secret_access_key, canonical_string)
  
  querystring = generate_querystring({'Signature' => signature, 'Expires' => expires,
  'AWSAccessKeyId' => aws_access_key })
  
  options[:path] = path
  base_url = s3_public_url(options)
  # Join with '&' if the base URL already has a querystring, otherwise with '?'.
  base_url += base_url.include?('?') ? '&' : '?'
  base_url + querystring
end
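
The steps above (path, expiry, canonical string, signature, querystring) can be traced end to end in a standalone sketch. The credentials, path and expiry value are hypothetical, and the path is written out by hand rather than built with s3_path.

```ruby
require 'openssl'
require 'cgi'

# Hypothetical inputs, for illustration only.
access_key = 'EXAMPLE-ACCESS-KEY'
secret_key = 'example-secret'
path       = '/my-bucket/my-key'
expires    = 1_175_139_620 # seconds since the epoch

# For query string authentication the expiry takes the place of the Date header.
canonical = "GET\n\n\n#{expires}\n#{path}"
digest    = OpenSSL::HMAC.digest(OpenSSL::Digest.new('SHA1'), secret_key, canonical)
signature = [digest].pack('m0')

pairs = { 'Signature' => signature, 'Expires' => expires, 'AWSAccessKeyId' => access_key }
query = pairs.map { |name, value| "#{name}=#{CGI.escape(value.to_s)}" }.join('&')

url = "http://s3.amazonaws.com#{path}?#{query}"
```

The signature may contain '+', '/' and '=' characters, which is why it is CGI-escaped before being placed in the URL.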

#s3_path(options = {}) ⇒ Object

Returns the path for this bucket and key. By default, keys are not CGI-escaped; if you want escaping, use the :escape => true option.

options:

  • :bucket => 'my-bucket': get a path which includes the bucket (unless :subdomain => true is also passed in)

  • :key => 'my-key': get a path including a key

  • :querystring => {'acl' => nil, 'page' => 2, ...}: adds a querystring to path (when generating a signature for a URL, any ‘?acl’ or ‘?logging’ parameters must be included as part of the path before hashing)

  • :subdomain => true: don’t include the bucket name in the path.

  • :acl => true: add ?acl at the start of the querystring.

  • :logging => true: add ?logging at the start of the querystring.

  • :escape => true: CGI::escape keys when they are appended to the path.



# File 'lib/s33r/utility.rb', line 303

def s3_path(options={})
  bucket = options[:bucket]
  key = options[:key]
  
  qstring_pairs = options[:querystring] || {}
  if options[:acl]
    qstring_pairs = {:acl => nil}.merge(qstring_pairs)
  elsif options[:logging]
    qstring_pairs = {:logging => nil}.merge(qstring_pairs)
  end
  
  qstring = generate_querystring(qstring_pairs)

  path = '/'
  path += (bucket + '/') if bucket and !options[:subdomain]
  if key
    key = CGI::escape(key) if options[:escape]
    path += key
  end
  path += '?' + qstring unless '' == qstring
  path
end
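
The effect of the main options can be seen with a condensed standalone sketch. The method name s3_path_sketch and the sample bucket and key names are hypothetical, and the :logging option is omitted for brevity.

```ruby
require 'cgi'

# Condensed, hypothetical stand-in for #s3_path, for illustration only.
def s3_path_sketch(options = {})
  pairs = options[:querystring] || {}
  pairs = { :acl => nil }.merge(pairs) if options[:acl] # acl goes first
  qstring = pairs.map { |k, v| v.nil? ? k.to_s : "#{k}=#{CGI.escape(v.to_s)}" }.join('&')

  path = '/'
  path += (options[:bucket] + '/') if options[:bucket] && !options[:subdomain]
  if options[:key]
    path += options[:escape] ? CGI.escape(options[:key]) : options[:key]
  end
  path += '?' + qstring unless qstring.empty?
  path
end

plain     = s3_path_sketch(:bucket => 'my-bucket', :key => 'my-key')
subdomain = s3_path_sketch(:bucket => 'my-bucket', :key => 'my-key', :subdomain => true)
with_acl  = s3_path_sketch(:bucket => 'my-bucket', :key => 'my-key', :acl => true,
                           :querystring => { 'page' => 2 })
escaped   = s3_path_sketch(:bucket => 'my-bucket', :key => 'photos/me.jpg', :escape => true)
```

With :escape => true, slashes inside the key are encoded as %2F, which may not be what you want for keys that are meant to look like paths.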

#s3_public_url(options) ⇒ Object

Public readable URL for a bucket and resource.

options are passed through from s3_url; of the options available to s3_url, :access and :secret are not used here.

Note that if a :path option is not set, a path is generated from any :bucket and/or :key parameters supplied.



# File 'lib/s33r/utility.rb', line 370

def s3_public_url(options)
  options[:use_ssl] = false if options[:subdomain]
  scheme = options[:use_ssl] ? 'https' : 'http'
  path = options[:path] || s3_path(options)
  host = HOST
  host = (options[:bucket] + "." + host ) if options[:subdomain] and options[:bucket]
  "#{scheme}://" + host + path
end
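
The interaction between :use_ssl and :subdomain is easier to see with concrete URLs. This standalone sketch assumes the :path option is always supplied; the method name public_url_sketch and the sample bucket are made up for illustration.

```ruby
# Hypothetical stand-in for #s3_public_url, for illustration only.
def public_url_sketch(options)
  # As in the listing, a subdomain URL is never served over SSL.
  use_ssl = options[:use_ssl] && !options[:subdomain]
  scheme  = use_ssl ? 'https' : 'http'
  host    = 's3.amazonaws.com' # same value as HOST
  host    = "#{options[:bucket]}.#{host}" if options[:subdomain] && options[:bucket]
  "#{scheme}://#{host}#{options[:path]}"
end

path_style = public_url_sketch(:bucket => 'elliot', :path => '/elliot/photo.jpg')
ssl_style  = public_url_sketch(:bucket => 'elliot', :path => '/elliot/photo.jpg', :use_ssl => true)
vhost      = public_url_sketch(:bucket => 'elliot', :path => '/photo.jpg',
                               :subdomain => true, :use_ssl => true)
```

In the last call :use_ssl is ignored because :subdomain is set, so the URL comes back with the http scheme.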

#s3_url(options = {}) ⇒ Object

Build a URL for a bucket or object on S3.

options are passed through to either s3_authenticated_url or s3_public_url (if :authenticated, :access and :secret options are passed, s3_authenticated_url is used):

  • :bucket => 'my-bucket': bucket the URL is for.

  • :key => 'my-key': the key to produce a URL for.

  • :use_ssl => true: return an https:// URL.

  • :subdomain => true: use :bucket as the subdomain to produce a bucket URL like ‘elliot.s3.amazonaws.com’ instead of ‘s3.amazonaws.com/elliot’. Note that this is NOT SUPPORTED for authenticated requests or SSL requests.

  • :path => '/bucket/key': include given path on end of URL; if not set, a path is generated from any bucket and/or key given

  • :access => 'aws access key': Generate authenticated URL.

  • :secret => 'aws secret access key': Generate authenticated URL.

  • :authenticated => true: Produce an authenticated URL.

  • :querystring => {'name' => 'value', 'test' => nil, ...}: add querystring parameters to the URL; NB any keys with a nil value are added to the querystring as keys without values. Note that querystring parameters are just appended in the order they are returned by the map iterator for a hash.

  • :acl => true: add ?acl at the start of the querystring.

  • :logging => true: add ?logging at the start of the querystring.



# File 'lib/s33r/utility.rb', line 348

def s3_url(options={})
  # Turn off the subdomain option if using SSL.
  options[:subdomain] = false if options[:use_ssl]
  
  access = options[:access]
  secret = options[:secret]
  if access and secret and options[:authenticated]
    # Turn off the subdomain option (it doesn't work with authenticated URLs).
    options[:subdomain] = false
    s3_authenticated_url(access, secret, options)
  else
    s3_public_url(options)
  end
end