Class: Veritable::Prediction
- Inherits: Hash (Object < Hash < Veritable::Prediction)
- Defined in: lib/veritable/api.rb
Overview
Represents the result of a Veritable prediction
A Veritable::Prediction is a Hash whose keys are the columns in the prediction request, and whose values are standard point estimates for predicted columns. For fixed (conditioning) columns, the value is the fixed value. For predicted values, the point estimate varies by datatype:
- real – mean
- count – mean rounded to the nearest integer
- categorical – mode
- boolean – mode
The object also gives access to the original predictions request, the predicted distribution on missing values, the schema of the analysis used to make predictions, and standard measures of uncertainty for the predicted values.
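The point-estimate rules listed above can be sketched in plain Ruby. This is only an illustration under stated assumptions: `sketch_point_estimate` is a hypothetical helper, not the gem's own code (the library's point-estimate routine is not shown on this page), and it takes a plain Array of sampled values for a single column.

```ruby
# Hypothetical helper for illustration only; not part of the veritable gem.
# samples: Array of sampled values for a single column
# type:    one of 'real', 'count', 'categorical', 'boolean'
def sketch_point_estimate(samples, type)
  case type
  when 'real'
    # mean of the samples
    samples.inject(0.0) { |sum, v| sum + v } / samples.size
  when 'count'
    # mean rounded to the nearest integer
    (samples.inject(0.0) { |sum, v| sum + v } / samples.size).round
  when 'categorical', 'boolean'
    # mode: the most frequent value
    counts = Hash.new(0)
    samples.each { |v| counts[v] += 1 }
    counts.max_by { |_value, n| n }.first
  else
    raise ArgumentError, "unknown datatype: #{type}"
  end
end

sketch_point_estimate([1.0, 2.0, 3.0], 'real')
sketch_point_estimate(%w[red blue blue], 'categorical')
```

How the real library breaks ties between equally frequent values is not stated here, so the sketch makes no promise about that case.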
Attributes
- request – a Hash containing the original predictions request. Keys are column names; conditioning values are present, predicted values are nil.
- distribution – the underlying predicted distribution as an Array of Hashes, each of which represents a single sample from the predictive distribution.
- schema – the schema for the columns in the predictions request.
- uncertainty – a Hash containing measures of uncertainty for each predicted value.
Methods
- prob_within – calculates the probability a column’s value lies within a range
- credible_values – calculates a credible range for the value of a column
See also: dev.priorknowledge.com/docs/client/ruby
Instance Attribute Summary
- #distribution ⇒ Object (readonly) – The underlying predicted distribution, as an Array of Hashes.
- #request ⇒ Object (readonly) – The original predictions request, as a Hash.
- #request_id ⇒ Object (readonly) – The original prediction ‘_request_id’, nil if none was specified.
- #schema ⇒ Object (readonly) – The schema for the columns in the predictions request.
- #uncertainty ⇒ Object (readonly) – A Hash of standard uncertainty measures.
Instance Method Summary
- #credible_values(column, p = nil) ⇒ Object – Based on the underlying predicted distribution, calculates a range within which the predicted value for the column lies with the specified probability.
- #initialize(request, distribution, schema, request_id = nil) ⇒ Prediction (constructor) – Initializes a Veritable::Prediction.
- #inspect ⇒ Object – Returns a string representation of the prediction results.
- #prob_within(column, range) ⇒ Object – Calculates the probability a column’s value lies within a range.
- #to_s ⇒ Object – Returns a string representation of the prediction results.
Constructor Details
#initialize(request, distribution, schema, request_id = nil) ⇒ Prediction
Initializes a Veritable::Prediction
Users should not call this constructor directly. Instead, call Veritable::Analysis#predict.
See also: dev.priorknowledge.com/docs/client/ruby
# File 'lib/veritable/api.rb', line 896
def initialize(request, distribution, schema, request_id=nil)
  @request = request
  @request.delete '_request_id'
  @schema = Schema.new(schema)
  @request_id = request_id

  fixed = {}
  @request.each { |k,v|
    if not v.nil?
      fixed[k] = v
    end
  }

  @distribution = distribution
  @distribution.each {|d|
    d.delete '_request_id'
    d.update(fixed)
  }

  @uncertainty = Hash.new()
  @request.each { |k,v|
    if v.nil?
      self[k] = point_estimate k
      @uncertainty[k] = calculate_uncertainty k
    else
      self[k] = v
      @uncertainty[k] = 0.0
    end
  }
end
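The way the constructor separates conditioning columns from predicted ones, then copies the fixed values into every sample, can be reproduced on plain Hashes. The sketch below is a hedged illustration only: the column names and values are made up, and it uses `merge` (which returns copies) where the constructor updates each sample in place.

```ruby
# Illustrative data; the column names are made up.
request = { 'age' => 30, 'income' => nil }   # nil marks a column to predict
distribution = [
  { 'income' => 50_000.0 },
  { 'income' => 62_000.0 }
]

# Conditioning columns are those with a non-nil value in the request.
fixed = request.reject { |_column, value| value.nil? }

# Every sample then carries the fixed values alongside its predicted ones.
samples = distribution.map { |draw| draw.merge(fixed) }
```

As the listing above shows, fixed columns are stored on the prediction unchanged and are assigned an uncertainty of 0.0.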
Instance Attribute Details
#distribution ⇒ Object (readonly)
The underlying predicted distribution, as an Array of Hashes
Each Hash represents a single draw from the predictive distribution, and should be regarded as equiprobable with the others.
See also: dev.priorknowledge.com/docs/client/ruby
# File 'lib/veritable/api.rb', line 875
def distribution
  @distribution
end
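Because each draw is treated as equally likely, the probability of any event can be estimated as the fraction of draws in which it holds. That includes joint events across several columns, which the per-column methods on this class do not cover. A minimal sketch, assuming samples shaped like the Array of Hashes described above (the column names and values are invented for illustration):

```ruby
# Illustrative samples shaped like Prediction#distribution (made-up columns).
samples = [
  { 'income' => 48_000.0, 'owns_home' => false },
  { 'income' => 71_000.0, 'owns_home' => true },
  { 'income' => 55_000.0, 'owns_home' => true },
  { 'income' => 90_000.0, 'owns_home' => false }
]

# Fraction of draws in which both conditions hold at once.
hits = samples.count { |draw| draw['owns_home'] && draw['income'] >= 50_000.0 }
joint_probability = hits.to_f / samples.size
```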
#request ⇒ Object (readonly)
The original predictions request, as a Hash
# File 'lib/veritable/api.rb', line 865
def request
  @request
end
#request_id ⇒ Object (readonly)
The original prediction ‘_request_id’, nil if none was specified
# File 'lib/veritable/api.rb', line 868
def request_id
  @request_id
end
#schema ⇒ Object (readonly)
The schema for the columns in the predictions request
# File 'lib/veritable/api.rb', line 878
def schema
  @schema
end
#uncertainty ⇒ Object (readonly)
A Hash of standard uncertainty measures
Keys are the columns in the prediction request and values are uncertainty measures associated with each point estimate. A higher value indicates greater uncertainty. These measures vary by datatype:
- real – length of 90% credible interval
- count – length of 90% credible interval
- categorical – total probability of all non-modal values
- boolean – probability of the non-modal value
See also: dev.priorknowledge.com/docs/client/ruby
# File 'lib/veritable/api.rb', line 889
def uncertainty
  @uncertainty
end
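The uncertainty measures described above can be approximated from a column's samples. The library's own uncertainty routine is not shown on this page, so the following is a sketch under assumptions: `sketch_uncertainty` is a hypothetical helper, and it borrows the interval indexing used by `credible_values` (trim the same number of sorted samples from each tail).

```ruby
# Hypothetical helper for illustration only; not part of the veritable gem.
# samples: Array of sampled values for a single column
# type:    one of 'real', 'count', 'categorical', 'boolean'
def sketch_uncertainty(samples, type, p = 0.9)
  case type
  when 'real', 'count'
    sorted = samples.sort
    a = (sorted.size * (1.0 - p) / 2.0).round     # samples trimmed from each tail
    sorted[sorted.size - 1 - a] - sorted[a]       # length of the credible interval
  when 'categorical', 'boolean'
    counts = Hash.new(0)
    samples.each { |v| counts[v] += 1 }
    1.0 - counts.values.max.to_f / samples.size   # probability mass off the mode
  else
    raise ArgumentError, "unknown datatype: #{type}"
  end
end

sketch_uncertainty((1..20).to_a, 'count')
sketch_uncertainty(%w[a a a b], 'categorical')
```

In both branches a larger result means a more spread-out predictive distribution, which matches the statement that higher values indicate greater uncertainty.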
Instance Method Details
#credible_values(column, p = nil) ⇒ Object
Based on the underlying predicted distribution, calculates a range within which the predicted value for the column lies with the specified probability.
Arguments
- column – the column for which to calculate the range
- p – the desired degree of probability. Default is nil, in which case it defaults to 0.5 for boolean and categorical columns, and to 0.9 for count and real columns.
Returns
For boolean and categorical columns, a Hash whose keys are categorical values in the calculated range and whose values are probabilities; for real and count columns, an Array of the [min, max] values for the calculated range.
See also: dev.priorknowledge.com/docs/client/ruby
# File 'lib/veritable/api.rb', line 978
def credible_values(column, p=nil)
  col_type = schema.type column
  Veritable::Util.check_datatype(col_type, "Credible values -- ")
  if col_type == 'boolean' or col_type == 'categorical'
    p = 0.5 if p.nil?
    tf = Hash.new
    ((freqs(counts(column)).sort_by {|k, v| v}).reject {|c, a| a < p}).each {|k, v| tf[k] = v}
    tf
  elsif col_type == 'count' or col_type == 'real'
    p = 0.9 if p.nil?
    n = distribution.size
    a = (n * (1.0 - p) / 2.0).round.to_i
    sv = sorted_values column
    n = sv.size
    lo = sv[a]
    hi = sv[n - 1 - a]
    [lo, hi]
  end
end
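For boolean and categorical columns, the listing above keeps each value whose own frequency is at least p, ordered from least to most frequent. The sketch below mirrors that branch on a plain Array of sampled values. It is an illustration, not the gem's code: `sketch_credible_categories` is a hypothetical helper, and it assumes the library's `counts` and `freqs` helpers (not shown here) return raw counts and relative frequencies respectively.

```ruby
# Hypothetical helper for illustration only; not part of the veritable gem.
# samples: Array of sampled values for one boolean or categorical column
# p:       minimum relative frequency a value needs to be kept
def sketch_credible_categories(samples, p = 0.5)
  counts = Hash.new(0)
  samples.each { |v| counts[v] += 1 }

  result = {}
  # Walk values from least to most frequent, keeping those at or above p.
  counts.sort_by { |_value, n| n }.each do |value, n|
    freq = n.to_f / samples.size
    result[value] = freq unless freq < p
  end
  result
end

sketch_credible_categories(%w[a a a b])        # default threshold of 0.5
sketch_credible_categories(%w[a a a b], 0.2)   # lower threshold keeps more values
```

Under this reading, the result can be empty when no single value reaches the threshold, for example four distinct values with p = 0.5.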
#inspect ⇒ Object
Returns a string representation of the prediction results
# File 'lib/veritable/api.rb', line 999
def inspect; to_s; end
#prob_within(column, range) ⇒ Object
Calculates the probability a column’s value lies within a range.
Based on the underlying predicted distribution, calculates the marginal probability that the predicted value for the given column lies within the specified range.
Arguments
- column – the column for which to calculate probabilities
- range – a representation of the range for which to calculate probabilities. For real and count columns, this is an Array of [start, end] representing a closed interval; a nil bound leaves that side of the interval unbounded. For boolean and categorical columns, this is an Array of discrete values.
Returns
A probability as a Float
# File 'lib/veritable/api.rb', line 939
def prob_within(column, range)
  col_type = schema.type column
  Veritable::Util.check_datatype(col_type, "Probability within -- ")
  if col_type == 'boolean' or col_type == 'categorical'
    count = distribution.inject(0) {|memo, row|
      if range.include? row[column]
        memo + 1
      else
        memo
      end
    }
    count.to_f / distribution.size
  elsif col_type == 'count' or col_type == 'real'
    mn = range[0]
    mx = range[1]
    count = distribution.inject(0) {|memo, row|
      v = row[column]
      if (mn.nil? or v >= mn) and (mx.nil? or v <= mx)
        memo + 1
      else
        memo
      end
    }
    count.to_f / distribution.size
  end
end
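The counting logic in the listing above can be tried out without a live analysis. The sketch below is a simplified stand-in, not the gem's method: `sketch_prob_within` is a hypothetical helper that takes the samples directly and uses an explicit `discrete` flag in place of the schema lookup. The sample data is invented.

```ruby
# Hypothetical helper for illustration only; not part of the veritable gem.
# samples:  Array of Hashes, one per draw
# range:    [start, end] for numeric columns, or an Array of allowed values
# discrete: true for boolean/categorical columns, false for real/count
def sketch_prob_within(samples, column, range, discrete)
  hits = samples.count do |draw|
    v = draw[column]
    if discrete
      range.include?(v)
    else
      lo, hi = range
      # Closed interval; a nil bound leaves that side open-ended.
      (lo.nil? || v >= lo) && (hi.nil? || v <= hi)
    end
  end
  hits.to_f / samples.size
end

draws = [
  { 'height' => 1.0, 'color' => 'red' },
  { 'height' => 2.0, 'color' => 'blue' },
  { 'height' => 3.0, 'color' => 'red' },
  { 'height' => 4.0, 'color' => 'green' }
]

sketch_prob_within(draws, 'height', [2.0, 3.0], false)    # both endpoints count
sketch_prob_within(draws, 'height', [nil, 3.0], false)    # no lower bound
sketch_prob_within(draws, 'color', %w[red green], true)   # set membership
```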
#to_s ⇒ Object
Returns a string representation of the prediction results
# File 'lib/veritable/api.rb', line 1002
def to_s; "<Veritable::Prediction #{super}>"; end