Class: LiterateRandomizer::MarkovModel

Inherits:
Object
Defined in:
lib/literate_randomizer/markov.rb

Overview

The Markov-chain bi-gram model. Its primary purpose is, given a word, to return a next word that is “likely” based on the source material.

Constructor Details

#initialize(options = {}) ⇒ MarkovModel

Initialize a new instance.

Options:

  • :randomizer => Random.new # must respond to #rand(limit)

  • :source_parser => SourceParser.new(options)



# File 'lib/literate_randomizer/markov.rb', line 91

def initialize(options={})
  @randomizer = options[:randomizer] || Random.new
  @source_parser = options[:source_parser] || SourceParser.new(options)

  populate
end

Instance Attribute Details

#first_words ⇒ Object (readonly)

An array of all words that appear at the beginning of sentences in the source-material.



# File 'lib/literate_randomizer/markov.rb', line 20

def first_words
  @first_words
end

#markov_chains ⇒ Object (readonly)

Data structure encoding all Markov chains (bi-grams) found in the source-material.

markov_chains is a hash of hashes. The top-level keys are the “first words” of each chain. For each first word, there are one or more words that followed it in the text; these second words are the keys of the second-level hash. Each second-level value is the number of times that second word followed the first.

Summary: {first_word => {second_word => found-in-source-material-in-sequence-count}}



# File 'lib/literate_randomizer/markov.rb', line 29

def markov_chains
  @markov_chains
end
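
As an illustration of the structure described above, the following minimal sketch builds a comparable hash of hashes from a list of sentences. The helper name build_chains and the whitespace tokenization are assumptions made for this example; they are not taken from the library, whose SourceParser may split text differently.

```ruby
# Hypothetical helper (not part of LiterateRandomizer): builds a
# {first_word => {second_word => count}} hash from an array of sentences.
def build_chains(sentences)
  chains = Hash.new { |hash, key| hash[key] = Hash.new(0) }
  sentences.each do |sentence|
    # Assumption: words are separated by whitespace.
    sentence.split.each_cons(2) do |first, second|
      chains[first][second] += 1
    end
  end
  chains
end

chains = build_chains(["the cat sat", "the cat ran", "the dog sat"])
# chains["the"] => {"cat" => 2, "dog" => 1}
# chains["cat"] => {"sat" => 1, "ran" => 1}
```

In this sketch, "the" was followed by "cat" twice and by "dog" once, so "cat" should be roughly twice as likely as "dog" when choosing a word after "the".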

#randomizer ⇒ Object

The source of all random values. Must implement: #rand(limit)

Default: Random.new()



# File 'lib/literate_randomizer/markov.rb', line 14

def randomizer
  @randomizer
end
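
Because the only stated requirement is that the randomizer responds to #rand(limit), any object with that method should work in place of Random. The sketch below shows one such stand-in that replays a fixed list of values, which can make output repeatable in tests. The class name FixedRandomizer is hypothetical and not part of the library.

```ruby
# Hypothetical stand-in for Random: replays a fixed list of values.
# It only needs to respond to #rand(limit), as the attribute requires.
class FixedRandomizer
  def initialize(values)
    @values = values.cycle
  end

  # Returns the next canned value, wrapped into 0...limit like Random#rand.
  def rand(limit)
    @values.next % limit
  end
end

fixed = FixedRandomizer.new([0, 5, 2])
fixed.rand(3) # => 0
fixed.rand(3) # => 2 (5 % 3)
fixed.rand(3) # => 2
```

An object like this could, for example, be passed as the second argument to #next_word.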

#source_parser ⇒ Object

An instance of SourceParser attached to the source_material.



# File 'lib/literate_randomizer/markov.rb', line 32

def source_parser
  @source_parser
end

#words ⇒ Object (readonly)

A hash (string => true) of all unique words found in the source-material.



# File 'lib/literate_randomizer/markov.rb', line 17

def words
  @words
end

Instance Method Details

#next_word(word, randomizer = @randomizer) ⇒ Object

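Given a word, return a randomly selected next word, weighted by how often each candidate followed that word in the source material. Returns nil if the word has no entry in markov_chains. The given word itself is never returned.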



# File 'lib/literate_randomizer/markov.rb', line 99

def next_word(word, randomizer = @randomizer)
  return unless markov_chains[word]
  sum = @markov_weighted_sum[word]
  # rand(sum) is in 0...sum, so random is in 1..sum
  random = randomizer.rand(sum) + 1
  partial_sum = 0
  # Walk the cumulative counts; never return the input word itself.
  (markov_chains[word].find do |w, count|
    partial_sum += count
    w != word && partial_sum >= random
  end || []).first
end
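
The selection step above can be sketched in isolation as a walk over cumulative counts. In the example below, pick_weighted is a hypothetical standalone function, and the total is computed directly from the counts on the assumption that @markov_weighted_sum holds the sum of follower counts for each word (its construction is not shown on this page). For simplicity the sketch omits the check that skips the input word.

```ruby
# Hypothetical standalone version of the selection step.
# followers is a {second_word => count} hash; roll is an integer in 1..total.
def pick_weighted(followers, roll)
  partial_sum = 0
  pair = followers.find do |_word, count|
    partial_sum += count
    partial_sum >= roll
  end
  pair && pair.first
end

followers = { "cat" => 2, "dog" => 1 }
total = followers.values.sum # => 3

pick_weighted(followers, 1) # => "cat"
pick_weighted(followers, 2) # => "cat"
pick_weighted(followers, 3) # => "dog"
```

Two of the three possible rolls land on "cat" and one on "dog", so each word is chosen in proportion to its count.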