Module: NexosisApi::Client::Imports

Included in:
NexosisApi::Client
Defined in:
lib/nexosis_api/client/imports.rb

Overview

Imports-based API operations

Instance Method Details

#import_from_azure(dataset_name, connection_string, container, blob_name, column_metadata = []) ⇒ NexosisApi::ImportsResponse

Note:

the connection string provided will be encrypted at the server, used once, and then removed from storage.

Import a csv or json file (gzip’d or raw) from a Microsoft Azure storage blob.

If folders have been used, blob_name should contain the full path to the file within the container.

Parameters:

  • dataset_name (String)

    the name to give to the new dataset or existing dataset to which this data will be upserted

  • connection_string (String)

    the azure blob storage connection string providing access to the file resource

  • container (String)

    the container in which the object is located.

  • blob_name (String)

    the name of the object to import, usually a file. Always csv or json content.

  • column_metadata (Array of NexosisApi::Column) (defaults to: [])

    description of each column in target dataset. Optional.

Returns:

  • (NexosisApi::ImportsResponse)

Raises:

  • (ArgumentError)

Since:

  • 2.0.0


# File 'lib/nexosis_api/client/imports.rb', line 92

def import_from_azure(dataset_name, connection_string, container, blob_name, column_metadata = [])
  raise ArgumentError, 'dataset_name was not provided and is not optional ' unless dataset_name.empty? == false
  raise ArgumentError, 'connection_string was not provided and is not optional ' unless connection_string.empty? == false
  raise ArgumentError, 'container was not provided and is not optional ' unless container.empty? == false
  raise ArgumentError, 'blob_name was not provided and is not optional ' unless blob_name.empty? == false
  azure_url = '/imports/azure'
  column_json = Column.to_json(column_metadata)
  body = {
    'dataSetName' => dataset_name,
    'connectionString' => connection_string,
    'container' => container,
    'blob' => blob_name,
    'columns' => column_json
  }
  response = self.class.post(azure_url, headers: @headers, body: body.to_json)
  raise HttpException.new("There was a problem importing from azure: #{response.code}.",
                          "uploading dataset from azure #{dataset_name}",
                          response) unless response.success?
  NexosisApi::ImportsResponse.new(response.parsed_response)
end
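
To make the request shape easier to inspect, the following standalone sketch builds the same JSON body as the listing above without contacting the service. The helper name azure_import_body and all sample values are hypothetical, and the columns payload is simplified to an empty hash rather than the output of Column.to_json.

```ruby
require 'json'

# Hypothetical helper (not part of NexosisApi): mirrors the body hash that
# import_from_azure assembles so the payload can be inspected offline.
def azure_import_body(dataset_name, connection_string, container, blob_name, columns = {})
  required = {
    'dataset_name' => dataset_name,
    'connection_string' => connection_string,
    'container' => container,
    'blob_name' => blob_name
  }
  # Reject nil or empty values for every required argument.
  required.each do |name, value|
    raise ArgumentError, "#{name} was not provided and is not optional" if value.to_s.empty?
  end
  {
    'dataSetName' => dataset_name,
    'connectionString' => connection_string,
    'container' => container,
    'blob' => blob_name,
    'columns' => columns
  }.to_json
end

# Sample values are placeholders; the blob name carries the folder path.
payload = azure_import_body('sales',
                            'DefaultEndpointsProtocol=https;AccountName=example',
                            'imports',
                            'folder/sales.csv.gz')
puts payload
```

Note that the blob name here includes the folder prefix, matching the guidance that the full path within the container is expected.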

#import_from_s3(dataset_name, bucket_name, path, region = 'us-east-1', credentials = {}, column_metadata = []) ⇒ NexosisApi::ImportsResponse

Note:

If credentials are provided they will be encrypted at the server, used once, and then removed from storage.

Import a file from AWS s3 as your dataset

Parameters:

  • dataset_name (String)

    the name to give to the new dataset or existing dataset to which this data will be upserted

  • bucket_name (String)

    the AWS S3 bucket name in which the path will be found

  • path (String)

    the path within the bucket (usually file name)

  • region (String) (defaults to: 'us-east-1')

    the region in which your bucket exists. Defaults to us-east-1

  • credentials (Hash) (defaults to: {})

    :access_key_id and :secret_access_key for user with rights to read the target file.

  • column_metadata (Array of NexosisApi::Column) (defaults to: [])

    description of each column in target dataset. Optional.

Returns:

  • (NexosisApi::ImportsResponse)

Raises:

  • (ArgumentError)


# File 'lib/nexosis_api/client/imports.rb', line 44

def import_from_s3(dataset_name, bucket_name, path, region = 'us-east-1', credentials = {}, column_metadata = [])
  raise ArgumentError, 'dataset_name was not provided and is not optional ' unless dataset_name.to_s.empty? == false
  raise ArgumentError, 'bucket_name was not provided and is not optional ' unless bucket_name.to_s.empty? == false
  raise ArgumentError, 'path was not provided and is not optional ' unless path.to_s.empty? == false
  s3_import_url = '/imports/s3'
  column_json = Column.to_json(column_metadata)
  body = {
    'dataSetName' => dataset_name,
    'bucket' => bucket_name,
    'path' => path,
    'region' => region,
    'columns' => column_json
  }
  body['accessKeyId'] = credentials[:access_key_id] unless credentials.nil? || credentials[:access_key_id].nil?
  body['secretAccessKey'] = credentials[:secret_access_key] unless credentials.nil? || credentials[:secret_access_key].nil?
  response = self.class.post(s3_import_url, headers: @headers, body: body.to_json)
  raise HttpException.new("There was a problem importing from s3: #{response.code}.",
                          "uploading dataset from s3 #{dataset_name}",
                          response) unless response.success?
  NexosisApi::ImportsResponse.new(response.parsed_response)
end
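
The listing above only adds credential keys to the request when they are supplied. A minimal sketch of that behavior, assuming a hypothetical helper s3_import_body and placeholder bucket and key values, and omitting the columns entry for brevity:

```ruby
# Hypothetical helper (not part of NexosisApi): reproduces how import_from_s3
# adds accessKeyId / secretAccessKey only when the caller provides them.
def s3_import_body(dataset_name, bucket_name, path, region = 'us-east-1', credentials = {})
  body = {
    'dataSetName' => dataset_name,
    'bucket' => bucket_name,
    'path' => path,
    'region' => region
  }
  creds = credentials || {} # tolerate an explicit nil, as the listing does
  body['accessKeyId'] = creds[:access_key_id] unless creds[:access_key_id].nil?
  body['secretAccessKey'] = creds[:secret_access_key] unless creds[:secret_access_key].nil?
  body
end

# Publicly readable object: no credential keys are sent.
public_body = s3_import_body('sales', 'example-bucket', 'data/sales.csv')

# Private object: both keys are included (values are placeholders).
private_body = s3_import_body('sales', 'example-bucket', 'data/sales.csv', 'us-west-2',
                              { access_key_id: 'EXAMPLE_KEY_ID', secret_access_key: 'example-secret' })

puts public_body.keys.inspect
puts private_body.keys.inspect
```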

#import_from_url(dataset_name, url, column_metadata = [], options = {}) ⇒ NexosisApi::ImportsResponse

Note:

imports depend on file extensions, so use a content type indicator if json or csv cannot be inferred.

Note:

Urls protected by basic auth can be accessed if given a userid and password in options

Import a csv or json file directly from any available and reachable public endpoint.

Parameters:

  • dataset_name (String)

    the name to give to the new dataset or existing dataset to which this data will be upserted

  • url (String)

    the url indicating where to find the file resource to import

  • column_metadata (Array of NexosisApi::Column) (defaults to: [])

    description of each column in target dataset. Optional.

  • options (Hash) (defaults to: {})

    may provide basic auth credentials or a ‘content-type’ value to identify csv or json content.

Returns:

  • (NexosisApi::ImportsResponse)

Raises:

  • (ArgumentError)

Since:

  • 2.0.0


# File 'lib/nexosis_api/client/imports.rb', line 123

def import_from_url(dataset_name, url, column_metadata = [], options = {})
  raise ArgumentError, 'dataset_name was not provided and is not optional ' unless dataset_name.empty? == false
  raise ArgumentError, 'url was not provided and is not optional ' unless url.empty? == false
  endpoint_url = '/imports/url'
  column_json = Column.to_json(column_metadata)
  body = {
    'dataSetName' => dataset_name,
    'url' => url,
    'columns' => column_json
  }
  response = self.class.post(endpoint_url, headers: @headers, body: body.to_json)
  raise HttpException.new("There was a problem importing from url: #{response.code}.",
                          "uploading dataset from #{url}",
                          response) unless response.success?
  NexosisApi::ImportsResponse.new(response.parsed_response)
end
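
The argument guards in these listings come in two forms: import_from_url and import_from_azure call empty? directly, while import_from_s3 and retrieve_import call to_s.empty?. The sketch below isolates the two forms to show how each reacts to nil. The method names strict_guard, tolerant_guard, and error_class_for are hypothetical and exist only for this illustration.

```ruby
# Guard in the form used by import_from_url and import_from_azure.
def strict_guard(value)
  raise ArgumentError, 'value was not provided and is not optional' unless value.empty? == false
  value
end

# Guard in the form used by import_from_s3 and retrieve_import.
def tolerant_guard(value)
  raise ArgumentError, 'value was not provided and is not optional' unless value.to_s.empty? == false
  value
end

# Returns the class of the error raised by the block, or nil if none was raised.
def error_class_for
  yield
  nil
rescue StandardError => e
  e.class
end

puts error_class_for { strict_guard('') }    # ArgumentError
puts error_class_for { strict_guard(nil) }   # NoMethodError, since nil has no empty?
puts error_class_for { tolerant_guard(nil) } # ArgumentError
```

Callers who may pass nil to the stricter methods should therefore expect a NoMethodError rather than the documented ArgumentError.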

#list_imports(dataset_name = '', page = 0, page_size = 50) ⇒ NexosisApi::PagedArray of NexosisApi::ImportsResponse

List all existing import requests

Parameters:

  • dataset_name (String) (defaults to: '')

    optional name filter of dataset which was imported

  • page (int) (defaults to: 0)

    page number for items in list

  • page_size (int) (defaults to: 50)

    number of items in each page

Returns:

  • (NexosisApi::PagedArray of NexosisApi::ImportsResponse)

Since:

  • 1.4 added paging parameters


# File 'lib/nexosis_api/client/imports.rb', line 16

def list_imports(dataset_name = '', page = 0, page_size = 50)
  imports_url = '/imports'
  query = {
    dataSetName: dataset_name,
    page: page,
    pageSize: page_size
  }
  response = self.class.get(imports_url, headers: @headers, query: query)
  if (response.success?)
    NexosisApi::PagedArray.new(response.parsed_response,
                               response.parsed_response['items']
                               .map { |i| NexosisApi::ImportsResponse.new(i) })
  else
    raise HttpException.new("There was a problem getting the imports: #{response.code}.", "listing imports for dataset #{dataset_name}", response)
  end
end
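
Because results are paged, collecting every import means requesting successive pages starting at page 0. A minimal sketch of that loop follows, using a hypothetical fetch_page lambda as a stand-in for a call such as list_imports(dataset_name, page, page_size). The stop condition (an empty or short page) and the in-memory sample data are assumptions of this sketch, not documented behavior of the service.

```ruby
# Placeholder data standing in for the server-side list of imports.
IMPORT_IDS = (1..7).map { |n| "import-#{n}" }.freeze

# Hypothetical stand-in for client.list_imports('', page, page_size).
fetch_page = lambda do |page, page_size|
  IMPORT_IDS.each_slice(page_size).to_a[page] || []
end

# Yields every item across pages; stops on an empty or partially filled page.
def each_import(fetch_page, page_size = 50)
  return enum_for(:each_import, fetch_page, page_size) unless block_given?

  page = 0 # pages are zero-based, matching the default page = 0
  loop do
    items = fetch_page.call(page, page_size)
    break if items.empty?

    items.each { |item| yield item }
    break if items.length < page_size

    page += 1
  end
end

all_imports = each_import(fetch_page, 3).to_a
puts all_imports.length
```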

#retrieve_import(import_id) ⇒ NexosisApi::ImportsResponse

Retrieve the response for a previously created import, typically to check its status.

Examples:

get import

NexosisApi.client.retrieve_import('740dca2a-b488-4322-887e-fa473b1caa54')

Parameters:

  • import_id (String)

    The id returned from a previous request to import

Returns:

  • (NexosisApi::ImportsResponse)

Raises:

  • (ArgumentError)

# File 'lib/nexosis_api/client/imports.rb', line 72

def retrieve_import(import_id)
  raise ArgumentError, 'import_id was not provided and is not optional ' unless import_id.to_s.empty? == false
  imports_url = "/imports/#{import_id}"
  response = self.class.get(imports_url, headers: @headers)
  raise HttpException.new("There was a problem getting the import #{response.code}.",
                          "requesting an import #{import_id}", response) unless response.success?
  NexosisApi::ImportsResponse.new(response.parsed_response)
end
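
Since this method is mainly useful for checking status, a caller will often poll it until the import finishes. The sketch below shows one way to structure such a loop. The wait_for_import helper, the fetch_status lambda, and the status strings are all assumptions made for illustration; in real use the lambda might wrap retrieve_import and read a status value from the response, if the response exposes one.

```ruby
# Assumed terminal status values; the actual values are not given in this page.
TERMINAL_STATUSES = %w[completed failed cancelled].freeze

# Hypothetical helper: calls fetch_status until a terminal status appears or
# max_attempts is reached. Returns the terminal status, or nil on timeout.
def wait_for_import(fetch_status, max_attempts: 10, delay: 0)
  max_attempts.times do
    status = fetch_status.call
    return status if TERMINAL_STATUSES.include?(status)

    sleep(delay)
  end
  nil
end

# Stand-in for repeated status checks against the service.
simulated = %w[requested started completed]
fetch_status = -> { simulated.shift }

result = wait_for_import(fetch_status)
puts result.inspect
```

A non-zero delay would normally be used against a live service to avoid issuing requests in a tight loop.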