Class: Biomart::Dataset

Inherits:
Object
Includes:
Biomart
Defined in:
lib/biomart/dataset.rb

Overview

Class representation of a Biomart dataset. Can belong to a Biomart::Database and a Biomart::Server.

Constant Summary

Constants included from Biomart

VERSION

Instance Attribute Summary

Instance Method Summary

Methods included from Biomart

#request

Constructor Details

#initialize(url, args) ⇒ Dataset

Creates a new Biomart::Dataset object.

arguments hash:

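{
  :name         => String,      # dataset name (also accepted as "name")
  :display_name => String,      # display name (also accepted as "displayName")
  :visible      => true/false   # visibility flag (also accepted as "visible")
}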

Parameters:

  • url (String)

    The URL location of the biomart server.

  • args (Hash)

    An arguments hash giving details of the dataset.



# File 'lib/biomart/dataset.rb', line 22

def initialize( url, args )
  @url = url or raise ArgumentError, "must pass :url"
  unless @url =~ /martservice/
    @url = @url + "/martservice"
  end
  
  @name         = args["name"] || args[:name]
  @display_name = args["displayName"] || args[:display_name]
  @visible      = ( args["visible"] || args[:visible] ) ? true : false
  
  @filters      = {}
  @attributes   = {}
  @importables  = {}
  @exportables  = {}
end
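
The two normalization steps in the constructor (appending "/martservice" when it is absent, and accepting either string or symbol keys) can be tried in isolation. The following is a minimal plain-Ruby sketch, not part of the library: `normalize_dataset_args` is a hypothetical name, and the URL and dataset name are placeholders.

```ruby
# Hypothetical stand-alone helper mirroring the normalization done in
# Biomart::Dataset#initialize; it is not part of the biomart gem.
def normalize_dataset_args(url, args)
  raise ArgumentError, "must pass :url" unless url

  # Append the service path only when it is not already present.
  url += "/martservice" unless url =~ /martservice/

  {
    :url          => url,
    # String keys are checked before symbol keys.
    :name         => args["name"] || args[:name],
    :display_name => args["displayName"] || args[:display_name],
    # Any truthy value is coerced to true; nil and false become false.
    :visible      => (args["visible"] || args[:visible]) ? true : false
  }
end

normalized = normalize_dataset_args(
  "http://www.example.org/biomart",                 # placeholder URL
  { "name" => "example_dataset", :visible => 1 }    # placeholder values
)
puts normalized[:url]
```

Because the visibility flag is coerced by truthiness, any non-nil, non-false value (including a string such as "0") would be treated as visible.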

Instance Attribute Details

#display_name ⇒ Object (readonly)

Returns the value of attribute display_name.



# File 'lib/biomart/dataset.rb', line 7

def display_name
  @display_name
end

#name ⇒ Object (readonly)

Returns the value of attribute name.



# File 'lib/biomart/dataset.rb', line 7

def name
  @name
end

#visible ⇒ Object (readonly)

Returns the value of attribute visible.



# File 'lib/biomart/dataset.rb', line 7

def visible
  @visible
end

Instance Method Details

#alive? ⇒ Boolean

Simple heartbeat function to test that a Biomart server is online.

Returns:

  • (Boolean)

    true/false



# File 'lib/biomart/dataset.rb', line 207

def alive?
  server = Biomart::Server.new( @url )
  return server.alive?
end

#attributes ⇒ Hash

Returns a hash (keyed by the biomart ‘internal_name’ for the attribute) of all of the Biomart::Attribute objects belonging to this dataset.

Returns:

  • (Hash)

    A hash of Biomart::Attribute objects keyed by ‘internal_name’



# File 'lib/biomart/dataset.rb', line 64

def attributes
  if @attributes.empty?
    fetch_configuration()
  end
  return @attributes
end
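
The fetch-on-first-access pattern used by #attributes (and likewise #filters) can be sketched with a stand-in class. This is an illustration only: `LazyConfigDemo` is a hypothetical name, and its `fetch_configuration` body is a placeholder, since the real method is not shown on this page.

```ruby
# Stand-in class illustrating the fetch-on-first-access pattern.
# The fetch body is a placeholder for the remote configuration request.
class LazyConfigDemo
  attr_reader :fetch_count

  def initialize
    @attributes  = {}
    @fetch_count = 0
  end

  # Populates the hash on the first call, then returns the cached copy.
  def attributes
    fetch_configuration if @attributes.empty?
    @attributes
  end

  private

  # Placeholder: records the call and stores one made-up entry.
  def fetch_configuration
    @fetch_count += 1
    @attributes["example_attribute"] = :placeholder_attribute_object
  end
end

demo = LazyConfigDemo.new
demo.attributes
demo.attributes
puts demo.fetch_count  # prints 1
```

Since the cache check is `empty?`, a dataset whose configuration genuinely contains no attributes would trigger a new fetch on every call.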

#count(args = {}) ⇒ Object

Function to perform a Biomart count. Returns an integer value for the result of the count query.

arguments:

{
  :timeout => integer,     # set a timeout length for the request (secs) - optional
  :filters => {}           # hash of key-value pairs (filter => search term) - optional
}

Parameters:

  • args (Hash) (defaults to: {})

    The arguments hash

Raises:

  • (Biomart::ArgumentError) Raised when unsupported arguments (:federate or :required_attributes) are passed



# File 'lib/biomart/dataset.rb', line 94

def count( args={} )
  if args[:federate]
    raise Biomart::ArgumentError, "You cannot federate a count query."
  end
  
  if args[:required_attributes]
    raise Biomart::ArgumentError, "The :required_attributes option is not allowed on count queries."
  end
  
  result = request(
    :method  => 'post',
    :url     => @url,
    :timeout => args[:timeout],
    :query   => generate_xml(
      :filters    => args[:filters], 
      :attributes => args[:attributes], 
      :count      => "1"
    )
  )
  return result.to_i
end

#filters ⇒ Hash

Returns a hash (keyed by the biomart ‘internal_name’ for the filter) of all of the Biomart::Filter objects belonging to this dataset.

Returns:

  • (Hash)

    A hash of Biomart::Filter objects keyed by ‘internal_name’



# File 'lib/biomart/dataset.rb', line 42

def filters
  if @filters.empty?
    fetch_configuration()
  end
  return @filters
end

#generate_xml(args = {}) ⇒ Object

Utility function to build the Biomart query XML - used by #count and #search.

# File 'lib/biomart/dataset.rb', line 176

def generate_xml( args={} )
  biomart_xml = ""
  xml = Builder::XmlMarkup.new( :target => biomart_xml, :indent => 2 )
  
  xml.instruct!
  xml.declare!( :DOCTYPE, :Query )
  xml.Query( :virtualSchemaName => "default", :formatter => "TSV", :header => "0", :uniqueRows => "1", :count => args[:count], :datasetConfigVersion => "0.6" ) {
    dataset_xml( xml, self, { :filters => args[:filters], :attributes => args[:attributes] } )
    
    if args[:federate]
      args[:federate].each do |joined_dataset|
        unless joined_dataset[:dataset].is_a?(Biomart::Dataset)
          raise Biomart::ArgumentError, "You must pass a Biomart::Dataset object to the :federate[:dataset] option."
        end
        
        dataset_xml(
          xml,
          joined_dataset[:dataset],
          { :filters => joined_dataset[:filters], :attributes => joined_dataset[:attributes] }
        )
      end
    end
    
  }
  
  return biomart_xml
end
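
To give a feel for the document this method produces, here is a rough sketch that builds a similar query with plain strings instead of Builder. The `<Query>` attributes are taken from the source above; the `<Dataset>`, `<Filter>` and `<Attribute>` elements follow the general BioMart query format and are an assumption about what `dataset_xml` emits, since that helper is not shown here. `sketch_query_xml`, `xml_escape` and all dataset, filter and attribute names are hypothetical.

```ruby
XML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }.freeze

# Escapes the characters that are unsafe inside an XML attribute value.
def xml_escape(value)
  value.to_s.gsub(/[&<>"]/, XML_ESCAPES)
end

# Hypothetical stand-in for generate_xml for a single dataset (no federation).
# Element names inside <Query> are assumed, not taken from the library source.
def sketch_query_xml(dataset_name, filters: {}, attributes: [], count: "")
  lines = []
  lines << '<?xml version="1.0" encoding="UTF-8"?>'
  lines << "<!DOCTYPE Query>"
  lines << %(<Query virtualSchemaName="default" formatter="TSV" header="0" ) +
           %(uniqueRows="1" count="#{xml_escape(count)}" datasetConfigVersion="0.6">)
  lines << %(  <Dataset name="#{xml_escape(dataset_name)}" interface="default">)
  filters.each do |name, value|
    lines << %(    <Filter name="#{xml_escape(name)}" value="#{xml_escape(value)}"/>)
  end
  attributes.each do |name|
    lines << %(    <Attribute name="#{xml_escape(name)}"/>)
  end
  lines << "  </Dataset>"
  lines << "</Query>"
  lines.join("\n")
end

query = sketch_query_xml(
  "example_dataset",
  filters:    { "example_filter" => "a&b" },
  attributes: ["example_attribute"],
  count:      "1"
)
puts query
```

A federated query would add one further dataset block per entry in :federate, as the loop in the source above suggests.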

#list_attributes ⇒ Array

Returns an array of the attribute names (biomart ‘internal_name’) for this dataset.

Returns:

  • (Array)

    An array of attributes (their ‘internal_name’s)



# File 'lib/biomart/dataset.rb', line 75

def list_attributes
  if @attributes.empty?
    fetch_configuration()
  end
  return @attributes.keys
end

#list_filters ⇒ Array

Returns an array of the filter names (biomart ‘internal_name’) for this dataset.

Returns:

  • (Array)

    An array of filters (their ‘internal_name’s)



# File 'lib/biomart/dataset.rb', line 53

def list_filters
  if @filters.empty?
    fetch_configuration()
  end
  return @filters.keys
end

#search(args = {}) ⇒ Hash/Array

Function to perform a Biomart search.

optional arguments:

{
  :process_results     => true/false,   # convert search results to an array of hashes
  :timeout             => integer,      # set a timeout length for the request (secs)
  :filters             => {},           # hash of key-value pairs (filter => search term)
  :attributes          => [],           # array of attributes to retrieve
  :required_attributes => [],           # array of attributes that are required
  :federate => [
    {
      :dataset    => Biomart::Dataset, # A dataset object to federate with
      :filters    => {},               # hash of key-value pairs (filter => search term)
      :attributes => []                # array of attributes to retrieve
    }
  ]
}

Note: if you do not pass any :filters or :attributes arguments, the defaults for the dataset are used.

The :required_attributes option applies AND logic: a row is returned only if it contains data for every one of the listed attributes.

By default, #search returns a hash of the following form:

{
  :headers => [],   # array of headers
  :data    => []    # array of arrays containing search results
}

With the :process_results option, it instead returns an array of hashes, where each hash represents one row of results (keyed by the attribute name).
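
The relationship between the two return shapes can be illustrated in plain Ruby. This is a sketch of the described conversion, not the library's own code: `rows_to_hashes` is a hypothetical helper, and the headers and values are made up.

```ruby
# Hypothetical helper showing the row-to-hash conversion that the
# :process_results option is described as performing.
def rows_to_hashes(result)
  result[:data].map { |row| result[:headers].zip(row).to_h }
end

# Made-up result in the default { :headers, :data } shape.
raw = {
  :headers => ["example_id", "example_symbol"],
  :data    => [["ID1", "SYM1"], ["ID2", nil]]
}

processed = rows_to_hashes(raw)
processed.each { |row| p row }
```

Each element of `processed` pairs the headers with one data row, so values can be looked up by attribute name rather than by column position.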

Parameters:

  • args (Hash) (defaults to: {})

    The arguments hash

Returns:

  • (Hash/Array)

    A hash of unprocessed data by default, or an array of hashes when :process_results is set

Raises:

  • (Biomart::ArgumentError) Raised if incorrect arguments are passed, for example when :required_attributes is not an array



# File 'lib/biomart/dataset.rb', line 154

def search( args={} )
  if args[:required_attributes] and !args[:required_attributes].is_a?(Array)
    raise Biomart::ArgumentError, "The :required_attributes option must be passed as an array."
  end
  
  response = request(
    :method  => 'post',
    :url     => @url,
    :timeout => args[:timeout],
    :query   => generate_xml( process_xml_args(args) )
  )
  
  result = process_tsv( args, response )
  result = filter_data_rows( args, result ) if args[:required_attributes]
  result = conv_results_to_a_of_h( result ) if args[:process_results]
  return result
end
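
In the source above, the :required_attributes filter runs on the parsed result before the optional conversion to hashes. The AND logic it is described as applying might look roughly like the sketch below. `keep_rows_with_all` is a hypothetical helper; in particular, treating nil and empty strings as "no data" is an assumption, since `filter_data_rows` itself is not shown on this page.

```ruby
# Hypothetical filter: keep a row only if every required attribute has a
# non-empty value. A required attribute that is missing from the headers
# causes every row to be dropped.
def keep_rows_with_all(result, required)
  indexes = required.map { |name| result[:headers].index(name) }
  kept = result[:data].select do |row|
    indexes.all? { |i| i && !row[i].nil? && row[i].to_s != "" }
  end
  { :headers => result[:headers], :data => kept }
end

# Made-up result in the default { :headers, :data } shape.
raw = {
  :headers => ["example_id", "example_symbol"],
  :data    => [["ID1", "SYM1"], ["ID2", ""], ["ID3", nil]]
}

filtered = keep_rows_with_all(raw, ["example_id", "example_symbol"])
p filtered[:data]
```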