Class: FiniteMDP::ArrayModel
- Inherits: Object
- Includes: Model
- Defined in: lib/finite_mdp/array_model.rb
Overview
A finite Markov decision process model in which the states, transition probabilities and rewards are stored in a sparse nested array format:
model[state_num][action_num] = [[next_state_num, probability, reward], ...]
Note: The action_num is not comparable across states; each state's action array contains only the actions that apply in that state.
This class also maintains a StateActionMap to map between the state and action numbers and the original states and actions.
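For concreteness, here is what this layout looks like for a small hypothetical two-state model, built from plain Ruby arrays; the state and action numbering and all values are invented for illustration:

```ruby
# Sparse nested array format: model[state_num][action_num] is a list of
# [next_state_num, probability, reward] entries.
model = [
  [ # state 0
    [[0, 0.25, 0.0], [1, 0.75, 1.0]], # action 0 of state 0
    [[1, 1.0, -1.0]]                  # action 1 of state 0
  ],
  [ # state 1
    [[0, 1.0, 0.0]]                   # state 1 has only one action
  ]
]

# Look up the transitions for taking action 0 in state 0:
model[0][0].each do |next_state_num, probability, reward|
  puts "-> #{next_state_num} with pr #{probability}, reward #{reward}"
end
```

Note that state 1 has a shorter action array than state 0, which is why action numbers cannot be compared across states.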
Defined Under Namespace
Classes: OrderedStateActionMap, StateActionMap
Instance Attribute Summary collapse
- #array ⇒ Array<Array<Array>> readonly
The sparse transition array; see the notes for ArrayModel.
- #state_action_map ⇒ StateActionMap readonly
Class Method Summary collapse
-
.from_model(model, sparse = true, ordered = nil) ⇒ ArrayModel
Convert a generic model into an array model.
Instance Method Summary collapse
-
#actions(state) ⇒ Array<action>
Actions that are valid for the given state; see Model#actions.
-
#initialize(array, state_action_map) ⇒ ArrayModel
constructor
A new instance of ArrayModel.
-
#next_states(state, action) ⇒ Array<state>
Possible successor states after taking the given action in the given state; see Model#next_states.
-
#num_states ⇒ Fixnum
Number of states in the model.
-
#reward(state, action, next_state) ⇒ Float?
Reward for a given transition; see Model#reward.
-
#states ⇒ Array<state>
States in this model; see Model#states.
-
#transition_probability(state, action, next_state) ⇒ Float
Probability of the given transition; see Model#transition_probability.
Methods included from Model
#check_transition_probabilities_sum, #terminal_states, #transition_probability_sums
Constructor Details
#initialize(array, state_action_map) ⇒ ArrayModel
Returns a new instance of ArrayModel.
# File 'lib/finite_mdp/array_model.rb', line 94

def initialize(array, state_action_map)
  @array = array
  @state_action_map = state_action_map
end
Instance Attribute Details
#array ⇒ Array<Array<Array>> (readonly)
Returns the sparse transition array; see the notes for FiniteMDP::ArrayModel.
# File 'lib/finite_mdp/array_model.rb', line 102

def array
  @array
end
#state_action_map ⇒ StateActionMap (readonly)
# File 'lib/finite_mdp/array_model.rb', line 107

def state_action_map
  @state_action_map
end
Class Method Details
.from_model(model, sparse = true, ordered = nil) ⇒ ArrayModel
Convert a generic model into an array model.
# File 'lib/finite_mdp/array_model.rb', line 211

def self.from_model(model, sparse = true, ordered = nil)
  state_action_map = StateActionMap.from_model(model, ordered)

  array = state_action_map.states.map do |state|
    state_action_map.actions(state).map do |action|
      model.next_states(state, action).map do |next_state|
        pr = model.transition_probability(state, action, next_state)
        next unless pr > 0 || !sparse
        reward = model.reward(state, action, next_state)
        next_index = state_action_map.state_index(next_state)
        raise "successor state not found: #{next_state}" unless next_index
        [next_index, pr, reward]
      end.compact
    end
  end

  FiniteMDP::ArrayModel.new(array, state_action_map)
end
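The conversion that from_model performs can be sketched standalone. The sketch below assumes a hypothetical hash-of-hashes model layout in place of the real Model interface; `hash_model`, `state_index`, and the layout itself are illustrative names, not part of the finite_mdp API:

```ruby
# Hypothetical input layout (not the finite_mdp Model interface):
# hash_model[state][action] = { next_state => [probability, reward] }
hash_model = {
  a: { stay: { a: [0.9, 0.0], b: [0.1, 1.0] } },
  b: { leave: { a: [1.0, 0.0] } }
}

# Number the states, playing the role of the StateActionMap.
state_index = hash_model.keys.each_with_index.to_h # {a: 0, b: 1}

# Build the sparse nested array: drop zero-probability transitions and
# replace state objects with their indexes.
array = hash_model.map do |_state, actions|
  actions.map do |_action, successors|
    successors.map do |next_state, (pr, reward)|
      next if pr.zero? # sparse: omit impossible transitions
      [state_index.fetch(next_state), pr, reward]
    end.compact
  end
end
```

As in the real method, passing through an index map catches dangling successor states: `state_index.fetch` raises if a successor was never numbered, mirroring the `raise "successor state not found"` guard above.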
Instance Method Details
#actions(state) ⇒ Array<action>
Actions that are valid for the given state; see Model#actions.
# File 'lib/finite_mdp/array_model.rb', line 134

def actions(state)
  @state_action_map.actions(state)
end
#next_states(state, action) ⇒ Array<state>
Possible successor states after taking the given action in the given state; see Model#next_states.
# File 'lib/finite_mdp/array_model.rb', line 148

def next_states(state, action)
  state_index, action_index =
    @state_action_map.state_action_index(state, action)
  @array[state_index][action_index].map do |next_state_index, _pr, _reward|
    @state_action_map.state(next_state_index)
  end
end
#num_states ⇒ Fixnum
Number of states in the model.
# File 'lib/finite_mdp/array_model.rb', line 123

def num_states
  @state_action_map.map.size
end
#reward(state, action, next_state) ⇒ Float?
Reward for a given transition; see Model#reward.
# File 'lib/finite_mdp/array_model.rb', line 188

def reward(state, action, next_state)
  state_index, action_index =
    @state_action_map.state_action_index(state, action)
  next_state_index = @state_action_map.state_index(next_state)
  @array[state_index][action_index].each do |index, _probability, reward|
    return reward if index == next_state_index
  end
  nil
end
#states ⇒ Array<state>
States in this model; see Model#states.
# File 'lib/finite_mdp/array_model.rb', line 114

def states
  @state_action_map.states
end
#transition_probability(state, action, next_state) ⇒ Float
Probability of the given transition; see Model#transition_probability.
# File 'lib/finite_mdp/array_model.rb', line 167

def transition_probability(state, action, next_state)
  state_index, action_index =
    @state_action_map.state_action_index(state, action)
  next_state_index = @state_action_map.state_index(next_state)
  @array[state_index][action_index].each do |index, probability, _reward|
    return probability if index == next_state_index
  end
  0
end
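The lookup is a linear scan of the sparse row for the matching next-state index, with transitions absent from the row treated as probability zero. A minimal sketch over plain arrays (the `row` data and `probability_of` helper are invented for illustration):

```ruby
# One row of the sparse array: the entries for a single (state, action)
# pair, each of the form [next_state_index, probability, reward].
row = [[0, 0.25, 0.0], [1, 0.75, 1.0]]

# Scan for the matching next-state index; entries not stored in the
# sparse row are implicitly zero-probability transitions.
def probability_of(row, next_state_index)
  row.each do |index, probability, _reward|
    return probability if index == next_state_index
  end
  0
end

probability_of(row, 1) # => 0.75
probability_of(row, 2) # => 0 (not stored, so implicitly zero)
```

The #reward method above performs the same scan but returns the entry's reward, or nil when no entry matches.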