Class: FiniteMDP::HashModel
 Inherits:
 Object
 Includes:
 Model
 Defined in:
 lib/finite_mdp/hash_model.rb
Overview
A finite Markov decision process model for which the transition probabilities and rewards are specified using nested hash tables.
The structure of the nested hash is as follows:
hash[:s] #=> a Hash that maps actions to successor states
hash[:s][:a] #=> a Hash from successor states to pairs (see next)
hash[:s][:a][:t] #=> an Array [probability, reward] for transition (s,a,t)
The states and actions can be arbitrary objects; see notes for Model.
The TableModel is an alternative way of storing these data.
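The nested structure above can be built with plain Ruby hashes, no gem required. The states and actions in this sketch are arbitrary symbols chosen for illustration:

```ruby
# Nested hash for a two-state model: state => action => successor => [pr, reward].
# From :s1, action :a leads to :s1 with probability 0.3 (reward 0)
# or to :s2 with probability 0.7 (reward 1); :s2 is absorbing.
hash = {
  s1: {
    a: {
      s1: [0.3, 0],
      s2: [0.7, 1]
    }
  },
  s2: {
    a: {
      s2: [1.0, 0]
    }
  }
}

# Looking up a transition yields the [probability, reward] pair.
probability, reward = hash[:s1][:a][:s2]
```

A hash in this form can be passed directly to `FiniteMDP::HashModel.new`.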
Instance Attribute Summary collapse

#hash ⇒ Hash<state, Hash<action, Hash<state, [Float, Float]>>>
See notes for HashModel for an explanation of this structure.
Class Method Summary collapse

.from_model(model, sparse = true) ⇒ HashModel
Convert a generic model into a hash model.
Instance Method Summary collapse

#actions(state) ⇒ Array<action>
Actions that are valid for the given state; see Model#actions.

#initialize(hash) ⇒ HashModel
constructor
A new instance of HashModel.

#next_states(state, action) ⇒ Array<state>
Possible successor states after taking the given action in the given state; see Model#next_states.

#reward(state, action, next_state) ⇒ Float?
Reward for a given transition; see Model#reward.

#states ⇒ Array<state>
States in this model; see Model#states.

#transition_probability(state, action, next_state) ⇒ Float
Probability of the given transition; see Model#transition_probability.
Methods included from Model
#check_transition_probabilities_sum, #terminal_states, #transition_probability_sums
Constructor Details
#initialize(hash) ⇒ HashModel
Returns a new instance of HashModel.
# File 'lib/finite_mdp/hash_model.rb', line 23
def initialize(hash)
  @hash = hash
end
Instance Attribute Details
#hash ⇒ Hash<state, Hash<action, Hash<state, [Float, Float]>>>
Returns see notes for FiniteMDP::HashModel for an explanation of this structure.
# File 'lib/finite_mdp/hash_model.rb', line 31
def hash
  @hash
end
Class Method Details
.from_model(model, sparse = true) ⇒ HashModel
Convert a generic model into a hash model.
# File 'lib/finite_mdp/hash_model.rb', line 109
def self.from_model(model, sparse = true)
  hash = {}
  model.states.each do |state|
    hash[state] = {}
    model.actions(state).each do |action|
      hash[state][action] = {}
      model.next_states(state, action).each do |next_state|
        pr = model.transition_probability(state, action, next_state)
        next unless pr > 0 || !sparse
        hash[state][action][next_state] =
          [pr, model.reward(state, action, next_state)]
      end
    end
  end
  FiniteMDP::HashModel.new(hash)
end
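The `sparse` flag controls whether zero-probability transitions are stored: when it is true, only entries with positive probability survive. The filtering step can be sketched in plain Ruby (no gem required), with a hypothetical dense table for one (state, action) pair as input:

```ruby
# Dense transition data for one (state, action) pair: successor => [pr, reward].
dense = { s1: [0.0, 0], s2: [0.7, 1], s3: [0.3, 0] }

# Mimic from_model's filter: keep an entry when pr > 0 || !sparse.
def filter_transitions(dense, sparse)
  dense.select { |_t, (pr, _reward)| pr > 0 || !sparse }
end

sparse_result = filter_transitions(dense, true)   # drops the :s1 entry
dense_result  = filter_transitions(dense, false)  # keeps everything
```

A sparse model saves space but means that `hash[state][action][next_state]` can be `nil` for valid transitions, which is why `transition_probability` falls back to 0 (see below).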
Instance Method Details
#actions(state) ⇒ Array<action>
Actions that are valid for the given state; see Model#actions.
# File 'lib/finite_mdp/hash_model.rb', line 49
def actions(state)
  hash[state].keys
end
#next_states(state, action) ⇒ Array<state>
Possible successor states after taking the given action in the given state; see Model#next_states.
# File 'lib/finite_mdp/hash_model.rb', line 63
def next_states(state, action)
  hash[state][action].keys
end
#reward(state, action, next_state) ⇒ Float?
Reward for a given transition; see Model#reward.
# File 'lib/finite_mdp/hash_model.rb', line 94
def reward(state, action, next_state)
  _probability, reward = hash[state][action][next_state]
  reward
end
#states ⇒ Array<state>
States in this model; see Model#states.
# File 'lib/finite_mdp/hash_model.rb', line 38
def states
  hash.keys
end
#transition_probability(state, action, next_state) ⇒ Float
Probability of the given transition; see Model#transition_probability.
# File 'lib/finite_mdp/hash_model.rb', line 78
def transition_probability(state, action, next_state)
  probability, _reward = hash[state][action][next_state]
  probability || 0
end
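The `|| 0` fallback matters for sparse models: a successor state that was never stored looks up to `nil`, and destructuring `nil` leaves `probability` as `nil`, which the fallback turns into 0. A plain-Ruby sketch of that behavior, using a made-up transitions hash:

```ruby
# Transitions stored for one (state, action) pair: successor => [pr, reward].
transitions = { s2: [0.7, 1] }

# :s3 was never stored, so the lookup returns nil and destructuring
# yields nil for both components.
probability, _reward = transitions[:s3]

# The nil-to-zero fallback reports a zero-probability transition.
probability ||= 0
```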