Class: Veritable::Prediction

Inherits: Hash

Inheritance chain: Object, Hash, Veritable::Prediction

Defined in: lib/veritable/api.rb

Overview

Represents the result of a Veritable prediction

A Veritable::Prediction is a Hash whose keys are the columns in the prediction request, and whose values are standard point estimates for predicted columns. For fixed (conditioning) columns, the value is the fixed value. For predicted values, the point estimate varies by datatype:

real – mean
count – mean rounded to the nearest integer
categorical – mode
boolean – mode
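The per-datatype rules above can be illustrated with a minimal sketch. The helper name `sketch_point_estimate` and the sample data are hypothetical; the library's own point-estimate routine is private and not shown on this page, so this only mirrors the documented rules.

```ruby
# Hypothetical helper, not part of the Veritable API: computes a point
# estimate for one column from an Array of sample Hashes, following the
# documented rules (mean, rounded mean, or mode).
def sketch_point_estimate(samples, column, type)
  values = samples.map { |row| row[column] }
  case type
  when 'real'
    values.sum.to_f / values.size
  when 'count'
    (values.sum.to_f / values.size).round
  when 'categorical', 'boolean'
    # Count occurrences of each value and pick the most frequent one.
    counts = values.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }
    counts.max_by { |_value, n| n }.first
  else
    raise ArgumentError, "unknown datatype: #{type}"
  end
end

# Made-up samples standing in for draws from a predictive distribution.
samples = [
  { 'age' => 30.0, 'visits' => 2, 'color' => 'red' },
  { 'age' => 34.0, 'visits' => 3, 'color' => 'blue' },
  { 'age' => 35.0, 'visits' => 3, 'color' => 'red' },
  { 'age' => 37.0, 'visits' => 5, 'color' => 'red' }
]

age_estimate    = sketch_point_estimate(samples, 'age', 'real')
visits_estimate = sketch_point_estimate(samples, 'visits', 'count')
color_estimate  = sketch_point_estimate(samples, 'color', 'categorical')
```

Here the mean of the four `age` draws is 34.0, the mean of `visits` (3.25) rounds to 3, and the most frequent `color` is `'red'`.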

The object also gives access to the original predictions request, the predicted distribution on missing values, the schema of the analysis used to make predictions, and standard measures of uncertainty for the predicted values.

Attributes

request – a Hash containing the original predictions request. Keys are column names; conditioning values are present, predicted values are nil.
distribution – the underlying predicted distribution as an Array of Hashes, each of which represents a single sample from the predictive distribution.
schema – the schema for the columns in the predictions request
uncertainty – a Hash containing measures of uncertainty for each predicted value.
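To make the "Hash plus attributes" shape concrete, here is a small stand-in class. `SketchPrediction` is hypothetical and only handles real-valued columns; it is meant to show how a Hash subclass can hold point estimates as its entries while exposing the request and samples through readers, not to reproduce the Veritable implementation.

```ruby
# Hypothetical stand-in for illustration only; not the Veritable class.
class SketchPrediction < Hash
  attr_reader :request, :distribution

  def initialize(request, distribution)
    super()
    @request = request
    @distribution = distribution
    # Fixed columns keep their value; nil columns get a point estimate.
    request.each do |column, value|
      self[column] = value.nil? ? mean_of(column) : value
    end
  end

  private

  # Assumes the column is real-valued, so the point estimate is the mean.
  def mean_of(column)
    values = distribution.map { |row| row[column] }
    values.sum.to_f / values.size
  end
end

prediction = SketchPrediction.new(
  { 'age' => 40, 'income' => nil },
  [{ 'income' => 48_000.0 }, { 'income' => 52_000.0 }]
)

fixed_value     = prediction['age']             # conditioning value, passed through
estimated_value = prediction['income']          # mean of the two draws
original_entry  = prediction.request['income']  # still nil in the request
```

The object can be used anywhere a Hash is expected, while the readers keep the original request and the raw draws available.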

Methods

prob_within – calculates the probability a column’s value lies within a range
credible_values – calculates a credible range for the value of a column

Instance Attribute Summary

#distribution ⇒ Object readonly

The underlying predicted distribution, as an Array of Hashes.
#request ⇒ Object readonly

The original predictions request, as a Hash.
#request_id ⇒ Object readonly

The original prediction ‘_request_id’, nil if none was specified.
#schema ⇒ Object readonly

The schema for the columns in the predictions request.
#uncertainty ⇒ Object readonly

A Hash of standard uncertainty measures.

Instance Method Summary

#credible_values(column, p = nil) ⇒ Object

Based on the underlying predicted distribution, calculates a range within which the predicted value for the column lies with the specified probability.
#initialize(request, distribution, schema, request_id = nil) ⇒ Prediction constructor

Initializes a Veritable::Prediction.
#inspect ⇒ Object

Returns a string representation of the prediction results.
#prob_within(column, range) ⇒ Object

Calculates the probability a column’s value lies within a range.
#to_s ⇒ Object

Returns a string representation of the prediction results.

Constructor Details

#initialize(request, distribution, schema, request_id = nil) ⇒ `Prediction`

Initializes a Veritable::Prediction

Users should not call this constructor directly. Instead, call Veritable::Analysis#predict.

# File 'lib/veritable/api.rb', line 896

def initialize(request, distribution, schema, request_id=nil)
  @request = request
  @request.delete '_request_id'

  @schema = Schema.new(schema)
  @request_id = request_id

  fixed = {}
  @request.each { |k,v| 
    if not v.nil?
        fixed[k] = v
    end
  }
  @distribution = distribution
  @distribution.each {|d| 
    d.delete '_request_id'
    d.update(fixed)
  }

  @uncertainty = Hash.new()
  @request.each { |k,v|
    if v.nil?
      self[k] = point_estimate k
      @uncertainty[k] = calculate_uncertainty k
    else
      self[k] = v
      @uncertainty[k] = 0.0
    end
  }
end
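The first half of the constructor copies every fixed (non-nil) request value into each draw and drops the '_request_id' key. A minimal sketch of that step on made-up data follows; unlike the listing above, which mutates the draws in place, this version builds new Hashes.

```ruby
# Made-up request: 'age' is a conditioning value, 'income' is to be predicted.
request = { 'age' => 40, 'income' => nil }

# Made-up draws, as they might arrive before the fixed values are merged in.
draws = [
  { 'income' => 51_000.0, '_request_id' => 'r1' },
  { 'income' => 49_000.0, '_request_id' => 'r1' }
]

# Collect the fixed (non-nil) columns from the request.
fixed = request.reject { |_column, value| value.nil? }

# Drop the bookkeeping key and add the fixed columns to every draw.
merged = draws.map do |draw|
  draw.reject { |key, _value| key == '_request_id' }.merge(fixed)
end
```

After this step every draw carries the full set of columns, so later calculations can read fixed and predicted columns the same way.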

Instance Attribute Details

#distribution ⇒ `Object` (readonly)

The underlying predicted distribution, as an Array of Hashes

Each Hash represents a single draw from the predictive distribution, and should be regarded as equiprobable with the others.



# File 'lib/veritable/api.rb', line 875

def distribution
  @distribution
end

#request ⇒ `Object` (readonly)

The original predictions request, as a Hash



# File 'lib/veritable/api.rb', line 865

def request
  @request
end

#request_id ⇒ `Object` (readonly)

The original prediction ‘_request_id’, nil if none was specified



# File 'lib/veritable/api.rb', line 868

def request_id
  @request_id
end

#schema ⇒ `Object` (readonly)

The schema for the columns in the predictions request



# File 'lib/veritable/api.rb', line 878

def schema
  @schema
end

#uncertainty ⇒ `Object` (readonly)

A Hash of standard uncertainty measures

Keys are the columns in the prediction request and values are uncertainty measures associated with each point estimate. A higher value indicates greater uncertainty. These measures vary by datatype:

real – length of 90% credible interval
count – length of 90% credible interval
categorical – total probability of all non-modal values
boolean – probability of the non-modal value
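The measures listed above could be computed from the draws roughly as follows. This is a sketch under assumptions: the library's own uncertainty routine is private and not shown on this page, `sketch_uncertainty` is a hypothetical name, and the interval endpoints reuse the index arithmetic from #credible_values with p = 0.9.

```ruby
# Hypothetical helper, not part of the Veritable API.
def sketch_uncertainty(samples, column, type)
  values = samples.map { |row| row[column] }
  case type
  when 'real', 'count'
    # Length of a central 90% interval taken from the sorted draws.
    sorted = values.sort
    a = (sorted.size * (1.0 - 0.9) / 2.0).round
    sorted[sorted.size - 1 - a] - sorted[a]
  when 'categorical', 'boolean'
    # Total probability of everything except the most frequent value.
    counts = values.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }
    1.0 - counts.values.max.to_f / values.size
  else
    raise ArgumentError, "unknown datatype: #{type}"
  end
end

# Made-up draws: 'x' takes the values 1.0..20.0, 'flag' is true in 15 of 20.
samples = (1..20).map { |i| { 'x' => i.to_f, 'flag' => (i % 4 != 0) } }

x_uncertainty    = sketch_uncertainty(samples, 'x', 'real')
flag_uncertainty = sketch_uncertainty(samples, 'flag', 'boolean')
```

With 20 draws the interval trims one draw from each tail, giving a length of 19.0 - 2.0 = 17.0, and the boolean measure is the share of non-modal draws, 5 / 20 = 0.25.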



# File 'lib/veritable/api.rb', line 889

def uncertainty
  @uncertainty
end

Instance Method Details

#credible_values(column, p = nil) ⇒ `Object`

Based on the underlying predicted distribution, calculates a range within which the predicted value for the column lies with the specified probability.

Arguments

column – the column for which to calculate the range
p – the desired probability level. Defaults to nil, in which case 0.5 is used for boolean and categorical columns and 0.9 for count and real columns.

Returns

For boolean and categorical columns, a Hash whose keys are categorical values in the calculated range and whose values are probabilities; for real and count columns, an Array of the [min, max] values for the calculated range.

# File 'lib/veritable/api.rb', line 978

def credible_values(column, p=nil)
  col_type = schema.type column
  Veritable::Util.check_datatype(col_type, "Credible values -- ")
  if col_type == 'boolean' or col_type == 'categorical'
    p = 0.5 if p.nil?
    tf = Hash.new
    ((freqs(counts(column)).sort_by {|k, v| v}).reject {|c, a| a < p}).each {|k, v| tf[k] = v}
    tf
  elsif col_type == 'count' or col_type == 'real'
    p = 0.9 if p.nil?
    n = distribution.size
    a = (n * (1.0 - p) / 2.0).round.to_i
    sv = sorted_values column
    n = sv.size
    lo = sv[a]
    hi = sv[n - 1 - a]
    [lo, hi]
  end
end
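The two branches of the listing above can be tried on plain Arrays with the following sketch. The helper names are hypothetical. The private helpers `counts`, `freqs` and `sorted_values` are not shown on this page, so the sketch assumes they yield per-value counts, relative frequencies, and the column's draws in ascending order.

```ruby
# Numeric branch: trim a draws from each tail of the sorted values.
def sketch_credible_interval(values, p = 0.9)
  sorted = values.sort
  n = sorted.size
  a = (n * (1.0 - p) / 2.0).round
  [sorted[a], sorted[n - 1 - a]]
end

# Discrete branch: keep each value whose relative frequency is at least p.
def sketch_credible_set(values, p = 0.5)
  counts = values.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }
  counts.each_with_object({}) do |(value, n), kept|
    freq = n.to_f / values.size
    kept[value] = freq if freq >= p
  end
end

numeric_draws = (1..20).map(&:to_f)           # made-up draws 1.0..20.0
interval_90 = sketch_credible_interval(numeric_draws)
interval_50 = sketch_credible_interval(numeric_draws, 0.5)

color_draws = %w[red red red blue]            # made-up categorical draws
set_default = sketch_credible_set(color_draws)
set_loose   = sketch_credible_set(color_draws, 0.2)
```

Lowering p narrows the numeric interval (more draws are trimmed from each tail) but widens the discrete result, since more values clear the frequency threshold.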

#inspect ⇒ `Object`

Returns a string representation of the prediction results

# File 'lib/veritable/api.rb', line 999

def inspect; to_s; end

#prob_within(column, range) ⇒ `Object`

Calculates the probability a column’s value lies within a range.

Based on the underlying predicted distribution, calculates the marginal probability that the predicted value for the given column lies within the specified range.

Arguments

column – the column for which to calculate probabilities
range – a representation of the range for which to calculate probabilities. For real and count columns, this is an Array of [start, end] representing a closed interval; a nil endpoint leaves that side unbounded. For boolean and categorical columns, this is an Array of discrete values.

Returns

A probability as a Float

# File 'lib/veritable/api.rb', line 939

def prob_within(column, range)
  col_type = schema.type column
  Veritable::Util.check_datatype(col_type, "Probability within -- ")
  if col_type == 'boolean' or col_type == 'categorical'
    count = distribution.inject(0) {|memo, row|
      if range.include? row[column]
        memo + 1 
      else
        memo
      end
    }
    count.to_f / distribution.size
  elsif col_type == 'count' or col_type == 'real'
    mn = range[0]
    mx = range[1]
    count = distribution.inject(0) {|memo, row|
      v = row[column]
      if (mn.nil? or v >= mn) and (mx.nil? or v <=mx)
        memo + 1 
      else
        memo
      end
    }
    count.to_f / distribution.size
  end
end
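Because the draws are treated as equiprobable, the probability is simply the fraction of draws that satisfy the condition. A compact sketch of the same counting logic on made-up data follows; `sketch_prob_within` is a hypothetical name, and its `numeric` flag stands in for the schema lookup done in the listing above.

```ruby
# Hypothetical helper, not part of the Veritable API.
def sketch_prob_within(samples, column, range, numeric)
  hits = samples.count do |row|
    v = row[column]
    if numeric
      lo, hi = range
      # Closed interval; a nil endpoint leaves that side unbounded.
      (lo.nil? || v >= lo) && (hi.nil? || v <= hi)
    else
      range.include?(v)
    end
  end
  hits.to_f / samples.size
end

# Made-up draws standing in for a predictive distribution.
samples = [
  { 'age' => 30.0, 'color' => 'red' },
  { 'age' => 34.0, 'color' => 'blue' },
  { 'age' => 35.0, 'color' => 'red' },
  { 'age' => 37.0, 'color' => 'red' }
]

p_between = sketch_prob_within(samples, 'age', [33, 36], true)
p_at_most = sketch_prob_within(samples, 'age', [nil, 35], true)
p_colors  = sketch_prob_within(samples, 'color', %w[blue green], false)
```

Two of the four `age` draws fall in [33, 36], three are at most 35, and one `color` draw is in the given set.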

#to_s ⇒ `Object`

Returns a string representation of the prediction results

# File 'lib/veritable/api.rb', line 1002

def to_s; "<Veritable::Prediction #{super}>"; end
