Class: Bio::Pathway
Overview
Bio::Pathway is a general graph object initially constructed by the list of the ((<Bio::Relation>)) objects. The basic concept of the Bio::Pathway object is to store a graph as an adjacency list (in the instance variable @graph), and converting the list into an adjacency matrix by calling to_matrix method on demand. However, in some cases, it is convenient to have the original list of the ((<Bio::Relation>))s, Bio::Pathway object also stores the list (as the instance variable @relations) redundantly.
Note: you can clear the @relations list by calling clear_relations! method to reduce the memory usage, and the content of the @relations can be re-generated from the @graph by to_relations method.
Direct Known Subclasses
Instance Attribute Summary collapse
-
#graph ⇒ Object
readonly
Read-only accessor for the adjacency list of the graph.
-
#index ⇒ Object
readonly
Read-only accessor for the row/column index (@index) of the adjacency matrix.
-
#label ⇒ Object
Accessor for the hash of the label assigned to the each node.
-
#relations ⇒ Object
readonly
Read-only accessor for the internal list of the Bio::Relation objects.
Instance Method Summary collapse
-
#append(rel, add_rel = true) ⇒ Object
Add an Bio::Relation object ‘rel’ to the @graph and @relations.
-
#bellman_ford(root) ⇒ Object
Bellman-Ford method for solving the single-source shortest-paths problem in the graph in which edge weights can be negative.
-
#bfs_shortest_path(node1, node2) ⇒ Object
Calculates the shortest path between two nodes by using breadth_first_search method and returns steps and the path as Array.
-
#breadth_first_search(root) ⇒ Object
(also: #bfs)
Breadth first search solves steps and path to the each node and forms a tree contains all reachable vertices from the root node.
-
#clear_relations! ⇒ Object
Clear @relations array to reduce the memory usage.
-
#clique ⇒ Object
Not implemented yet.
-
#cliquishness(node) ⇒ Object
Returns completeness of the edge density among the surrounded nodes.
-
#common_subgraph(graph) ⇒ Object
Not implemented yet.
-
#delete(rel) ⇒ Object
Remove an edge indicated by the Bio::Relation object ‘rel’ from the.
-
#depth_first_search ⇒ Object
(also: #dfs)
Depth first search yields much information about the structure of the graph especially on the classification of the edges.
-
#dfs_topological_sort ⇒ Object
Topological sort of the directed acyclic graphs (“dags”) by using depth_first_search.
-
#dijkstra(root) ⇒ Object
Dijkstra method to solve the shortest path problem in the weighted graph.
-
#directed ⇒ Object
Changes the internal state of the graph from ‘undirected’ to ‘directed’ and re-generate adjacency list.
-
#directed? ⇒ Boolean
Returns true or false respond to the internal state of the graph.
-
#dump_list ⇒ Object
Pretty printer of the adjacency list.
-
#dump_matrix(*arg) ⇒ Object
Pretty printer of the adjacency matrix.
-
#edges ⇒ Object
Returns the number of the edges in the graph.
-
#floyd_warshall ⇒ Object
(also: #floyd)
Floyd-Wardshall alogrithm for solving the all-pairs shortest-paths problem on a directed graph G = (V, E).
-
#initialize(relations, undirected = false) ⇒ Pathway
constructor
Initial graph (adjacency list) generation from the list of Relation.
-
#kruskal ⇒ Object
Kruskal method for finding minimam spaninng trees.
-
#nodes ⇒ Object
Returns the number of the nodes in the graph.
-
#small_world ⇒ Object
Returns frequency of the nodes having same number of edges as hash.
-
#subgraph(list = nil) ⇒ Object
Select labeled nodes and generate subgraph.
-
#to_list ⇒ Object
Graph (adjacency list) generation from the Relations.
-
#to_matrix(default_value = nil, diagonal_value = nil) ⇒ Object
Convert adjacency list to adjacency matrix.
-
#to_relations ⇒ Object
Reconstruct @relations from the adjacency list @graph.
-
#undirected ⇒ Object
Changes the internal state of the graph from ‘directed’ to ‘undirected’ and re-generate adjacency list.
-
#undirected? ⇒ Boolean
Returns true or false respond to the internal state of the graph.
Constructor Details
#initialize(relations, undirected = false) ⇒ Pathway
Initial graph (adjacency list) generation from the list of Relation.
Generate Bio::Pathway object from the list of Bio::Relation objects. If the second argument is true, undirected graph is generated.
r1 = Bio::Relation.new('a', 'b', 1)
r2 = Bio::Relation.new('a', 'c', 5)
r3 = Bio::Relation.new('b', 'c', 3)
list = [ r1, r2, r3 ]
g = Bio::Pathway.new(list, 'undirected')
41 42 43 44 45 46 47 48 |
# File 'lib/bio/pathway.rb', line 41 def initialize(relations, undirected = false) @undirected = undirected @relations = relations @graph = {} # adjacency list expression of the graph @index = {} # numbering each node in matrix @label = {} # additional information on each node self.to_list # generate adjacency list end |
Instance Attribute Details
#graph ⇒ Object (readonly)
Read-only accessor for the adjacency list of the graph.
54 55 56 |
# File 'lib/bio/pathway.rb', line 54 def graph @graph end |
#index ⇒ Object (readonly)
Read-only accessor for the row/column index (@index) of the adjacency matrix. Contents of the hash @index is created by calling to_matrix method.
59 60 61 |
# File 'lib/bio/pathway.rb', line 59 def index @index end |
#label ⇒ Object
Accessor for the hash of the label assigned to the each node. You can label some of the nodes in the graph by passing a hash to the label and select subgraphs which contain labeled nodes only by subgraph method.
hash = { 1 => 'red', 2 => 'green', 5 => 'black' }
g.label = hash
g.label
g.subgraph # => new graph consists of the node 1, 2, 5 only
70 71 72 |
# File 'lib/bio/pathway.rb', line 70 def label @label end |
#relations ⇒ Object (readonly)
Read-only accessor for the internal list of the Bio::Relation objects
51 52 53 |
# File 'lib/bio/pathway.rb', line 51 def relations @relations end |
Instance Method Details
#append(rel, add_rel = true) ⇒ Object
Add an Bio::Relation object ‘rel’ to the @graph and @relations. If the second argument is false, @relations is not modified (only useful when genarating @graph from @relations internally).
144 145 146 147 148 149 150 151 152 153 154 |
# File 'lib/bio/pathway.rb', line 144 def append(rel, add_rel = true) @relations.push(rel) if add_rel if @graph[rel.from].nil? @graph[rel.from] = {} end if @graph[rel.to].nil? @graph[rel.to] = {} end @graph[rel.from][rel.to] = rel.relation @graph[rel.to][rel.from] = rel.relation if @undirected end |
#bellman_ford(root) ⇒ Object
Bellman-Ford method for solving the single-source shortest-paths problem in the graph in which edge weights can be negative.
592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 |
# File 'lib/bio/pathway.rb', line 592 def bellman_ford(root) distance, predecessor = initialize_single_source(root) for i in 1 ..(self.nodes - 1) do @graph.each_key do |u| @graph[u].each do |v, w| # relaxing procedure of root -> 'u' -> 'v' if distance[v] > distance[u] + w distance[v] = distance[u] + w predecessor[v] = u end end end end # negative cyclic loop check @graph.each_key do |u| @graph[u].each do |v, w| if distance[v] > distance[u] + w return false end end end return distance, predecessor end |
#bfs_shortest_path(node1, node2) ⇒ Object
Calculates the shortest path between two nodes by using breadth_first_search method and returns steps and the path as Array.
450 451 452 453 454 455 456 457 458 459 460 |
# File 'lib/bio/pathway.rb', line 450 def bfs_shortest_path(node1, node2) distance, route = breadth_first_search(node1) step = distance[node2] node = node2 path = [ node2 ] while node != node1 and route[node] node = route[node] path.unshift(node) end return step, path end |
#breadth_first_search(root) ⇒ Object Also known as: bfs
Breadth first search solves steps and path to the each node and forms a tree contains all reachable vertices from the root node. This method returns the result in 2 hashes - 1st one shows the steps from root node and 2nd hash shows the structure of the tree.
The weight of the edges are not considered in this method.
419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 |
# File 'lib/bio/pathway.rb', line 419 def breadth_first_search(root) visited = {} distance = {} predecessor = {} visited[root] = true distance[root] = 0 predecessor[root] = nil queue = [ root ] while from = queue.shift next unless @graph[from] @graph[from].each_key do |to| unless visited[to] visited[to] = true distance[to] = distance[from] + 1 predecessor[to] = from queue.push(to) end end end return distance, predecessor end |
#clear_relations! ⇒ Object
Clear @relations array to reduce the memory usage.
114 115 116 |
# File 'lib/bio/pathway.rb', line 114 def clear_relations! @relations.clear end |
#clique ⇒ Object
Not implemented yet.
370 371 372 |
# File 'lib/bio/pathway.rb', line 370 def clique raise NotImplementedError end |
#cliquishness(node) ⇒ Object
Returns completeness of the edge density among the surrounded nodes.
Calculates the value of cliquishness around the ‘node’. This value indicates completeness of the edge density among the surrounded nodes.
Note: cliquishness (clustering coefficient) for a directed graph is also calculated. Reference: en.wikipedia.org/wiki/Clustering_coefficient
Note: Cliquishness (clustering coefficient) for a node that has only one neighbor node is undefined. Currently, it returns NaN, but the behavior may be changed in the future.
388 389 390 391 392 393 394 395 396 397 398 399 |
# File 'lib/bio/pathway.rb', line 388 def cliquishness(node) neighbors = @graph[node].keys sg = subgraph(neighbors) if sg.graph.size != 0 edges = sg.edges nodes = neighbors.size complete = (nodes * (nodes - 1)) return edges.quo(complete) else return 0.0 end end |
#common_subgraph(graph) ⇒ Object
Not implemented yet.
364 365 366 |
# File 'lib/bio/pathway.rb', line 364 def common_subgraph(graph) raise NotImplementedError end |
#delete(rel) ⇒ Object
Remove an edge indicated by the Bio::Relation object ‘rel’ from the
158 159 160 161 162 163 164 |
# File 'lib/bio/pathway.rb', line 158 def delete(rel) @relations.delete_if do |x| x === rel end @graph[rel.from].delete(rel.to) @graph[rel.to].delete(rel.from) if @undirected end |
#depth_first_search ⇒ Object Also known as: dfs
Depth first search yields much information about the structure of the graph especially on the classification of the edges. This method returns 5 hashes - 1st one shows the timestamps of each node containing the first discoverd time and the search finished time in an array. The 2nd, 3rd, 4th, and 5th hashes contain ‘tree edges’, ‘back edges’, ‘cross edges’, ‘forward edges’ respectively.
If $DEBUG is true (e.g. ruby -d), this method prints the progression of the search.
The weight of the edges are not considered in this method.
Note: The result of this method depends on the order of Hash#each (and each_key, etc.), which may be variable with Ruby version and Ruby interpreter variations (JRuby, etc.). For a workaround to remove such dependency, you can use @index to set order of Hash keys. Note that this bahavior might be changed in the future.
481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 |
# File 'lib/bio/pathway.rb', line 481 def depth_first_search visited = {} = {} tree_edges = {} back_edges = {} cross_edges = {} forward_edges = {} count = 0 # begin workaround removing depencency to order of Hash#each if @index.empty? then preference_of_nodes = nil else preference_of_nodes = {}.merge(@index) i = preference_of_nodes.values.max @graph.each_key do |node0| preference_of_nodes[node0] ||= (i += 1) end end # end workaround removing depencency to order of Hash#each dfs_visit = Proc.new { |from| visited[from] = true [from] = [count += 1] ary = @graph[from].keys # begin workaround removing depencency to order of Hash#each if preference_of_nodes then ary = ary.sort_by { |node0| preference_of_nodes[node0] } end # end workaround removing depencency to order of Hash#each ary.each do |to| if visited[to] if [to].size > 1 if [from].first < [to].first # forward edge (black) p "#{from} -> #{to} : forward edge" if $DEBUG forward_edges[from] = to else # cross edge (black) p "#{from} -> #{to} : cross edge" if $DEBUG cross_edges[from] = to end else # back edge (gray) p "#{from} -> #{to} : back edge" if $DEBUG back_edges[from] = to end else # tree edge (white) p "#{from} -> #{to} : tree edge" if $DEBUG tree_edges[to] = from dfs_visit.call(to) end end [from].push(count += 1) } ary = @graph.keys # begin workaround removing depencency to order of Hash#each if preference_of_nodes then ary = ary.sort_by { |node0| preference_of_nodes[node0] } end # end workaround removing depencency to order of Hash#each ary.each do |node| unless visited[node] dfs_visit.call(node) end end return , tree_edges, back_edges, cross_edges, forward_edges end |
#dfs_topological_sort ⇒ Object
Topological sort of the directed acyclic graphs (“dags”) by using depth_first_search.
558 559 560 561 562 |
# File 'lib/bio/pathway.rb', line 558 def dfs_topological_sort # sorted by finished time reversely and collect node names only , = self.depth_first_search .sort {|a,b| b[1][1] <=> a[1][1]}.collect {|x| x.first } end |
#dijkstra(root) ⇒ Object
Dijkstra method to solve the shortest path problem in the weighted graph.
566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 |
# File 'lib/bio/pathway.rb', line 566 def dijkstra(root) distance, predecessor = initialize_single_source(root) @graph[root].each do |k, v| distance[k] = v predecessor[k] = root end queue = distance.dup queue.delete(root) while queue.size != 0 min = queue.min {|a, b| a[1] <=> b[1]} u = min[0] # extranct a node having minimal distance @graph[u].each do |k, v| # relaxing procedure of root -> 'u' -> 'k' if distance[k] > distance[u] + v distance[k] = distance[u] + v predecessor[k] = u end end queue.delete(u) end return distance, predecessor end |
#directed ⇒ Object
Changes the internal state of the graph from ‘undirected’ to ‘directed’ and re-generate adjacency list. The undirected graph can be converted to directed graph, however, the edge between two nodes will be simply doubled to both ends.
Note: this method can not be used without the list of the Bio::Relation objects (internally stored in @relations variable). Thus if you already called clear_relations! method, call to_relations first.
92 93 94 95 96 97 |
# File 'lib/bio/pathway.rb', line 92 def directed if undirected? @undirected = false self.to_list end end |
#directed? ⇒ Boolean
Returns true or false respond to the internal state of the graph.
74 75 76 |
# File 'lib/bio/pathway.rb', line 74 def directed? @undirected ? false : true end |
#dump_list ⇒ Object
Pretty printer of the adjacency list.
Useful when you want to check the internal state of the adjacency list (for debug purpose etc.) easily.
The result of this method depends on the order of Hash#each (and each_key, etc.), which may be variable with Ruby version and Ruby interpreter variations (JRuby, etc.). For a workaround to remove such dependency, you can use @index to set order of Hash keys. Note that this behavior might be changed in the future.
293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 |
# File 'lib/bio/pathway.rb', line 293 def dump_list # begin workaround removing depencency to order of Hash#each if @index.empty? then pref = nil enum = @graph else pref = {}.merge(@index) i = pref.values.max @graph.each_key do |node| pref[node] ||= (i += 1) end graph_to_a = @graph.to_a graph_to_a.sort! { |x, y| pref[x[0]] <=> pref[y[0]] } enum = graph_to_a end # end workaround removing depencency to order of Hash#each list = "" enum.each do |from, hash| list << "#{from} => " # begin workaround removing depencency to order of Hash#each if pref then ary = hash.to_a ary.sort! { |x,y| pref[x[0]] <=> pref[y[0]] } hash = ary end # end workaround removing depencency to order of Hash#each a = [] hash.each do |to, relation| a.push("#{to} (#{relation})") end list << a.join(", ") + "\n" end list end |
#dump_matrix(*arg) ⇒ Object
Pretty printer of the adjacency matrix.
The dump_matrix method accepts the same arguments as to_matrix. Useful when you want to check the internal state of the matrix (for debug purpose etc.) easily.
This method internally calls to_matrix method. Read documents of to_matrix for important informations.
274 275 276 277 278 279 |
# File 'lib/bio/pathway.rb', line 274 def dump_matrix(*arg) matrix = self.to_matrix(*arg) sorted = @index.sort {|a,b| a[1] <=> b[1]} "[# " + sorted.collect{|x| x[0]}.join(", ") + "\n" + matrix.to_a.collect{|row| ' ' + row.inspect}.join(",\n") + "\n]" end |
#edges ⇒ Object
Returns the number of the edges in the graph.
172 173 174 175 176 177 178 |
# File 'lib/bio/pathway.rb', line 172 def edges edges = 0 @graph.each_value do |v| edges += v.size end edges end |
#floyd_warshall ⇒ Object Also known as: floyd
Floyd-Wardshall alogrithm for solving the all-pairs shortest-paths problem on a directed graph G = (V, E).
619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 |
# File 'lib/bio/pathway.rb', line 619 def floyd_warshall inf = 1 / 0.0 m = self.to_matrix(inf, 0) d = m.dup n = self.nodes for k in 0 .. n - 1 do for i in 0 .. n - 1 do for j in 0 .. n - 1 do if d[i, j] > d[i, k] + d[k, j] d[i, j] = d[i, k] + d[k, j] end end end end return d end |
#kruskal ⇒ Object
Kruskal method for finding minimam spaninng trees
641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 |
# File 'lib/bio/pathway.rb', line 641 def kruskal # initialize rel = self.to_relations.sort{|a, b| a <=> b} index = [] for i in 0 .. (rel.size - 1) do for j in (i + 1) .. (rel.size - 1) do if rel[i] == rel[j] index << j end end end index.sort{|x, y| y<=>x}.each do |idx| rel[idx, 1] = [] end mst = [] seen = Hash.new() @graph.each_key do |x| seen[x] = nil end i = 1 # initialize end rel.each do |r| if seen[r.node[0]] == nil seen[r.node[0]] = 0 end if seen[r.node[1]] == nil seen[r.node[1]] = 0 end if seen[r.node[0]] == seen[r.node[1]] && seen[r.node[0]] == 0 mst << r seen[r.node[0]] = i seen[r.node[1]] = i elsif seen[r.node[0]] != seen[r.node[1]] mst << r v1 = seen[r.node[0]].dup v2 = seen[r.node[1]].dup seen.each do |k, v| if v == v1 || v == v2 seen[k] = i end end end i += 1 end return Pathway.new(mst) end |
#nodes ⇒ Object
Returns the number of the nodes in the graph.
167 168 169 |
# File 'lib/bio/pathway.rb', line 167 def nodes @graph.keys.length end |
#small_world ⇒ Object
Returns frequency of the nodes having same number of edges as hash
Calculates the frequency of the nodes having the same number of edges and returns the value as Hash.
405 406 407 408 409 410 411 |
# File 'lib/bio/pathway.rb', line 405 def small_world freq = Hash.new(0) @graph.each_value do |v| freq[v.size] += 1 end return freq end |
#subgraph(list = nil) ⇒ Object
Select labeled nodes and generate subgraph
This method select some nodes and returns new Bio::Pathway object consists of selected nodes only. If the list of the nodes (as Array) is assigned as the argument, use the list to select the nodes from the graph. If no argument is assigned, internal property of the graph @label is used to select the nodes.
hash = { 'a' => 'secret', 'b' => 'important', 'c' => 'important' }
g.label = hash
g.subgraph
list = [ 'a', 'b', 'c' ]
g.subgraph(list)
343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 |
# File 'lib/bio/pathway.rb', line 343 def subgraph(list = nil) if list @label.clear list.each do |node| @label[node] = true end end sub_graph = Pathway.new([], @undirected) @graph.each do |from, hash| next unless @label[from] sub_graph.graph[from] ||= {} hash.each do |to, relation| next unless @label[to] sub_graph.append(Relation.new(from, to, relation)) end end return sub_graph end |
#to_list ⇒ Object
Graph (adjacency list) generation from the Relations
Generate the adjcancecy list @graph from @relations (called by initialize and in some other cases when @relations has been changed).
134 135 136 137 138 139 |
# File 'lib/bio/pathway.rb', line 134 def to_list @graph.clear @relations.each do |rel| append(rel, false) # append to @graph without push to @relations end end |
#to_matrix(default_value = nil, diagonal_value = nil) ⇒ Object
Convert adjacency list to adjacency matrix
Returns the adjacency matrix expression of the graph as a Matrix object. If the first argument was assigned, the matrix will be filled with the given value. The second argument indicates the value of the diagonal constituents of the matrix besides the above.
The result of this method depends on the order of Hash#each (and each_key, etc.), which may be variable with Ruby version and Ruby interpreter variations (JRuby, etc.). For a workaround to remove such dependency, you can use @index to set order of Hash keys. Note that this behavior might be changed in the future. Be careful that @index is overwritten by this method.
196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 |
# File 'lib/bio/pathway.rb', line 196 def to_matrix(default_value = nil, diagonal_value = nil) #-- # Note: following code only fills the outer Array with the reference # to the same inner Array object. # # matrix = Array.new(nodes, Array.new(nodes)) # # so create a new Array object for each row as follows: #++ matrix = Array.new nodes.times do matrix.push(Array.new(nodes, default_value)) end if diagonal_value nodes.times do |i| matrix[i][i] = diagonal_value end end # assign index number if @index.empty? then # assign index number for each node @graph.keys.each_with_index do |k, i| @index[k] = i end else # begin workaround removing depencency to order of Hash#each # assign index number from the preset @index indices = @index.to_a indices.sort! { |i0, i1| i0[1] <=> i1[1] } indices.collect! { |i0| i0[0] } @index.clear v = 0 indices.each do |k, i| if @graph[k] and !@index[k] then @index[k] = v; v += 1 end end @graph.each_key do |k| unless @index[k] then @index[k] = v; v += 1 end end # end workaround removing depencency to order of Hash#each end if @relations.empty? # only used after clear_relations! @graph.each do |from, hash| hash.each do |to, relation| x = @index[from] y = @index[to] matrix[x][y] = relation end end else @relations.each do |rel| x = @index[rel.from] y = @index[rel.to] matrix[x][y] = rel.relation matrix[y][x] = rel.relation if @undirected end end Matrix[*matrix] end |
#to_relations ⇒ Object
Reconstruct @relations from the adjacency list @graph.
119 120 121 122 123 124 125 126 127 |
# File 'lib/bio/pathway.rb', line 119 def to_relations @relations.clear @graph.each_key do |from| @graph[from].each do |to, w| @relations << Relation.new(from, to, w) end end return @relations end |
#undirected ⇒ Object
Changes the internal state of the graph from ‘directed’ to ‘undirected’ and re-generate adjacency list.
Note: this method can not be used without the list of the Bio::Relation objects (internally stored in @relations variable). Thus if you already called clear_relations! method, call to_relations first.
106 107 108 109 110 111 |
# File 'lib/bio/pathway.rb', line 106 def undirected if directed? @undirected = true self.to_list end end |
#undirected? ⇒ Boolean
Returns true or false respond to the internal state of the graph.
79 80 81 |
# File 'lib/bio/pathway.rb', line 79 def undirected? @undirected ? true : false end |