Module: NexosisApi::Client::Datasets

Included in:
NexosisApi::Client
Defined in:
lib/nexosis_api/client/datasets.rb

Overview

Dataset-based API operations

Instance Method Details

#create_dataset_csv(dataset_name, csv) ⇒ NexosisApi::DatasetSummary

Save data in a named dataset from CSV content.

Parameters:

  • dataset_name (String)

    name under which to save the dataset

  • csv (CSV)

    csv content ready for reading

Returns:

  • (NexosisApi::DatasetSummary)

# File 'lib/nexosis_api/client/datasets.rb', line 24

def create_dataset_csv(dataset_name, csv)
  content = process_csv_to_s csv
  create_dataset dataset_name, content, 'text/csv'
end
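
As a minimal sketch of preparing the csv argument, the snippet below wraps CSV text in a CSV object from Ruby's standard library. The helper name build_sample_csv, the column names, and the values are hypothetical, and the upload call is shown only as a comment because it needs a configured client.

```ruby
require 'csv'

# Hypothetical helper: wraps sample text in a CSV object "ready for reading".
# Column names and values are illustrative only.
def build_sample_csv
  raw = "timestamp,sales\n2017-01-01,100\n2017-01-02,120\n"
  CSV.new(raw, headers: true)
end

csv = build_sample_csv

# With a configured client, the upload would then be (not run here):
#   NexosisApi.client.create_dataset_csv('MyDataset', csv)

# Reading the object locally shows the rows the client would receive.
table = build_sample_csv.read
puts table.headers.inspect
puts table.size
```

A CSV object can only be read once from its current position, which is why the sketch builds a fresh object for the local inspection instead of reusing the one intended for upload.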

#create_dataset_json(dataset_name, json_data) ⇒ NexosisApi::DatasetSummary

Note:

Pass json_data as a Hash. Do not pass a JSON string produced by to_json, because the method calls to_json itself.

Save data in a named dataset from JSON content.

Parameters:

  • dataset_name (String)

    name under which to save the dataset

  • json_data (Hash)

    parsed json data

Returns:

  • (NexosisApi::DatasetSummary)

# File 'lib/nexosis_api/client/datasets.rb', line 15

def create_dataset_json(dataset_name, json_data)
  create_dataset dataset_name, json_data.to_json, 'application/json'
end
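
The note about passing a Hash can be illustrated with a small sketch. The helper encode_for_upload is hypothetical and only mirrors the single to_json call in the method body; the payload shape is illustrative and not taken from the API.

```ruby
require 'json'

# Hypothetical stand-in for the encoding step inside create_dataset_json.
def encode_for_upload(json_data)
  json_data.to_json
end

# Illustrative payload; key names are assumptions.
payload = { 'data' => [{ 'timestamp' => '2017-01-01', 'sales' => 100 }] }

good = encode_for_upload(payload)          # Hash in, JSON object out
bad  = encode_for_upload(payload.to_json)  # String in, JSON string out (double-encoded)

puts JSON.parse(good).class
puts JSON.parse(bad).class
```

When a pre-serialized string is passed, the body sent to the server is a quoted JSON string and not a JSON object, which is the situation the note warns against.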

#get_dataset(dataset_name, page_number = 0, page_size = 50, query_options = {}) ⇒ Object

Note:

query_options accepts start_date and end_date, each a DateTime or an ISO 8601 compliant string, and :include, an Array of strings naming the columns to return. The dates can be used independently and are inclusive. With no options, all values within the given page are returned.

Get the data in the set, with paging, and optional projection.

Parameters:

  • dataset_name (String)

    name of the dataset for which to retrieve data.

  • page_number (Integer) (defaults to: 0)

    zero-based page number of results to retrieve

  • page_size (Integer) (defaults to: 50)

    Count of results to retrieve in each page (max 1000).

  • query_options (Hash) (defaults to: {})

    options hash for limiting and projecting returned results

Raises:

  • (HttpException)

    if the response from the API is not successful

# File 'lib/nexosis_api/client/datasets.rb', line 62

def get_dataset(dataset_name, page_number = 0, page_size = 50, query_options = {})
  response = get_dataset_internal(dataset_name, page_number, page_size, query_options)
  return NexosisApi::DatasetData.new(response.parsed_response) if response.success?
  raise HttpException.new("There was a problem getting the dataset: #{response.code}.", "getting dataset #{dataset_name}", response)
end
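
A query_options hash matching the note above might be built as follows. This is a sketch that assumes symbol keys, following the :include key named in the note; the helper name january_sales_options is hypothetical, and the 'sales' column is borrowed from the example elsewhere on this page.

```ruby
require 'date'

# Hypothetical helper: builds options limiting results to January 2017
# and projecting a single column. Symbol keys are an assumption.
def january_sales_options
  {
    start_date: DateTime.new(2017, 1, 1),   # a DateTime is accepted
    end_date: '2017-01-31T00:00:00Z',       # so is an ISO 8601 string
    include: %w[sales]                      # Array of column names
  }
end

options = january_sales_options
puts options[:start_date].iso8601

# With a configured client, the call would then be (not run here):
#   NexosisApi.client.get_dataset('MyDataset', 0, 50, options)
```

Either date can be omitted, since the note states that the two dates can be used independently.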

#get_dataset_csv(dataset_name, page_number = 0, page_size = 50, query_options = {}) ⇒ Object

Note:

query_options accepts start_date and end_date, each a DateTime or an ISO 8601 compliant string, and :include, an Array of strings naming the columns to return. The dates can be used independently and are inclusive. With no options, all values within the given page are returned.

Get the data in the set as CSV-formatted text, optionally filtering it.

Examples:

get page 1 (zero-based) with 20 results per page

NexosisApi.client.get_dataset_csv('MyDataset', 1, 20, {:include => 'sales'})

Parameters:

  • dataset_name (String)

    name of the dataset for which to retrieve data.

  • page_number (Integer) (defaults to: 0)

    zero-based page number of results to retrieve

  • page_size (Integer) (defaults to: 50)

    Count of results to retrieve in each page (max 1000).

  • query_options (Hash) (defaults to: {})

    options hash for limiting and projecting returned results

Raises:

  • (HttpException)

    if the response from the API is not successful

# File 'lib/nexosis_api/client/datasets.rb', line 79

def get_dataset_csv(dataset_name, page_number = 0, page_size = 50, query_options = {})
  response = get_dataset_internal(dataset_name, page_number, page_size, query_options, 'text/csv')
  return response.body if response.success?
  raise HttpException.new("There was a problem getting the dataset: #{response.code}.", "getting dataset #{dataset_name}", response)
end
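
Because the method returns the response body as text, the caller is left to parse or store it. The sketch below parses such text with Ruby's standard CSV library; parse_dataset_csv is a hypothetical helper and the sample text stands in for a real response.

```ruby
require 'csv'

# Hypothetical helper: turns CSV text into an Array of Hashes keyed by header.
def parse_dataset_csv(csv_text)
  CSV.parse(csv_text, headers: true).map(&:to_h)
end

# Stand-in for the String returned by get_dataset_csv (illustrative data).
sample = "timestamp,sales\n2017-01-01,100\n2017-01-02,120\n"

rows = parse_dataset_csv(sample)
rows.each { |row| puts "#{row['timestamp']}: #{row['sales']}" }

# To keep the raw text instead, it could be written out with File.write.
```

All values come back as Strings when parsed this way; converting numeric columns is left to the caller.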

#list_datasets(partial_name = '', page = 0, page_size = 50) ⇒ NexosisApi::PagedArray of NexosisApi::DatasetSummary

Gets the list of data sets that have been saved to the system, optionally filtering by partial name match.

Parameters:

  • partial_name (String) (defaults to: '')

    if provided, all datasets returned will contain this string

  • page (Integer) (defaults to: 0)

    page number for items in list

  • page_size (Integer) (defaults to: 50)

    number of items in each page

Returns:

  • (NexosisApi::PagedArray of NexosisApi::DatasetSummary)

Since:

  • 1.4 - added paging parameters


# File 'lib/nexosis_api/client/datasets.rb', line 36

def list_datasets(partial_name = '', page = 0, page_size = 50)
  list_dataset_url = '/data'
  query = {
    page: page,
    pageSize: page_size
  }
  query['partialName'] = partial_name unless partial_name.empty?
  response = self.class.get(list_dataset_url, headers: @headers, query: query)
  if response.success?
    NexosisApi::PagedArray.new(response.parsed_response,
                               response.parsed_response['items']
                               .map { |dr| NexosisApi::DatasetSummary.new(dr) })
  else
    raise HttpException.new("There was a problem listing datasets: #{response.code}.", "listing datasets with partial name #{partial_name}", response)
  end
end
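
The query-building step in the listing above can be isolated to show how the partial name filter is applied. The helper list_query is hypothetical; it copies only the hash construction and makes no HTTP request.

```ruby
# Hypothetical helper mirroring the query hash built inside list_datasets.
def list_query(partial_name = '', page = 0, page_size = 50)
  query = { page: page, pageSize: page_size }
  # The filter is sent only when a non-empty partial name is given.
  query['partialName'] = partial_name unless partial_name.empty?
  query
end

puts list_query.inspect
puts list_query('sales', 2, 10).inspect
```

An empty partial_name therefore lists all datasets for the requested page, with no filter parameter in the request.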

#remove_dataset(dataset_name, filter_options = {}) ⇒ Object

Note:

Options: start_date, end_date, cascade_forecast, cascade_session, cascade_view, cascade_model, cascade

  • start_date - the first date from which to remove data

  • end_date - the last date up to which to remove data

  • cascade_forecast - will cascade deletes to all related forecasts

  • cascade_session - will cascade deletes to all related sessions

  • cascade_view - will cascade deletes to all related views (any part of join - think twice)

  • cascade_model - will cascade deletes to all models created from this dataset

  • cascade - will cascade deletes to all related forecasts and sessions

Remove data from a data set or the entire set.

Examples:

  • request delete with cascade forecast

NexosisApi.client.remove_dataset('mydataset', {:cascade_forecast => true})

Parameters:

  • dataset_name (String)

    the name of the dataset from which to remove data

  • filter_options (Hash) (defaults to: {})

    filtering which data to remove

Raises:

  • (ArgumentError)

# File 'lib/nexosis_api/client/datasets.rb', line 99

def remove_dataset(dataset_name, filter_options = {})
  raise ArgumentError, 'dataset_name was not provided and is not optional ' if dataset_name.to_s.empty?
  dataset_remove_url = "/data/#{dataset_name}"
  query = {}
  unless filter_options.empty?
    cascade_query = create_cascade_options(filter_options)
    query['cascade'] = cascade_query unless cascade_query.nil?
    query['startDate'] = [filter_options[:start_date].to_s] unless filter_options[:start_date].nil?
    query['endDate'] = [filter_options[:end_date].to_s] unless filter_options[:end_date].nil?
  end
  # normalizer = proc { |query_set| query_set.map { |key, value| value.map { |v| "#{key}=#{v}" } }.join('&') }
  response = self.class.delete(dataset_remove_url,
                               headers: @headers,
                               query: query,
                               query_string_normalizer: ->(query_map) {array_query_normalizer(query_map)})
  return if response.success?
  raise HttpException.new("There was a problem removing the dataset: #{response.code}.", "removing dataset #{dataset_name}", response)
end
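
The array-valued query above is serialized by array_query_normalizer, which is not shown on this page. As a sketch, the commented-out proc in the listing suggests repeated key=value pairs; the helper below assumes that behaviour, and the cascade values 'forecast' and 'session' are hypothetical.

```ruby
# Hypothetical stand-in for array_query_normalizer, based on the
# commented-out proc in the method body. Every value must be an Array.
def normalize_array_query(query_set)
  query_set.map { |key, value| value.map { |v| "#{key}=#{v}" } }.join('&')
end

# Illustrative query; the cascade values are assumptions.
query = {
  'cascade' => %w[forecast session],
  'startDate' => ['2017-01-01']
}

puts normalize_array_query(query)
```

Under this assumption each array element becomes its own parameter with the same key. The sketch does no URL encoding, so values containing reserved characters would need escaping before use.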