Module: Grep

Defined in:
lib/code_zauker/grep.rb

Instance Method Summary

Instance Method Details

#grep(file, pattern, pre_context = 0, post_context = 0, print_filename = true) ⇒ Object

Grep works like a shell grep. ‘file’ can be either a String containing the name of a file to open, or an IO object (such as $stdin). ‘pattern’ can be either a String or a Regexp. A String pattern is matched literally: characters such as ‘.’ or ‘?’ carry no special meaning, unlike in a regex. ‘pre_context’ and ‘post_context’ set the number of lines returned before and after each matching line, respectively. If contexts overlap, no line is returned twice. The result is an array of strings, each prefixed with its line number.
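The context behaviour described above can be illustrated with a small standalone sketch. It is not the CodeZauker implementation: `context_grep` is a hypothetical helper that reads the whole input into memory, skips the UTF-8 handling done by CodeZauker::Util, and accepts any object that responds to `each_line` (here a StringIO), which is an assumption made to keep the example self-contained.

```ruby
require "stringio"

# Hypothetical helper, not part of CodeZauker: returns "N:line" strings for
# every match plus the requested context, without duplicates.
def context_grep(io, pattern, pre_context = 0, post_context = 0)
  # A String pattern is matched literally, as in the description above.
  regexp = pattern.is_a?(Regexp) ? pattern : Regexp.new(Regexp.escape(pattern))
  lines = io.each_line.map(&:chomp)
  keep = []
  lines.each_with_index do |line, index|
    next unless line =~ regexp
    first = [index - pre_context, 0].max
    last = [index + post_context, lines.length - 1].min
    keep.concat((first..last).to_a)
  end
  # uniq drops lines shared by overlapping contexts; sort restores file order.
  keep.uniq.sort.map { |index| "#{index + 1}:#{lines[index]}" }
end

sample = StringIO.new("alpha\nbeta\ngamma\ndelta\nepsilon\n")
puts context_grep(sample, "gamma", 1, 1)
# 2:beta
# 3:gamma
# 4:delta
```

Collecting line indexes first and de-duplicating them afterwards is only one way to avoid repeated context lines; a streaming approach, as in the listing below, trades that simplicity for lower memory use.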



# File 'lib/code_zauker/grep.rb', line 58

def grep(file, pattern, pre_context=0, post_context=0, print_filename=true)
  currentline=0
  if file.kind_of? String
    fileName=file
    file = File.new(file, "r")
  else
    fileName=""
  end

  if ! file.kind_of? IO
    # raise, not throw: throw is for catch/throw control flow, not for exceptions
    raise IOError, "File must be the name of an existing file or IO object"
  end

  if pattern.kind_of? String
    pattern = /#{Regexp.escape(pattern)}/
  end

  if ! pattern.kind_of? Regexp
    # raise, not throw: throw is for catch/throw control flow, not for exceptions
    raise StandardError, "Pattern must be string or regexp"
  end

  cache = []
  lines = []

  util=CodeZauker::Util.new()

  loop do
    begin
      line = util.ensureUTF8(file.readline)
      
      currentline +=1
      cache.shift unless cache.length < pre_context

      cache.push("#{currentline}:#{line}")
      
      if line =~ pattern
        lines += cache
        cache = []
        if post_context > 0
          post_context.times do
            begin                
              utf8line=util.ensureUTF8(file.readline)
              # Advance the counter first so the prefix is this line's own number
              currentline +=1
              lines.push("#{currentline}:#{utf8line}")
            rescue IOError => e
              break
            end
          end
        end
      end
    rescue IOError => e
      break
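    rescue ArgumentError =>e2
      # Most likely an invalid UTF-8 byte sequence: report it and move on to the next line
      puts "Pattern Matching failed on \n\t#{fileName}\n\tLine:#{line}"
      # line is nil when ensureUTF8 itself raised, so guard the encoding report
      puts "Encoding of line:#{line.encoding.name} Valid? #{line.valid_encoding?}" if line
      #raise e2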
    end
  end
  

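  # NOTE: the loop above has already read the stream to EOF, so this second
  # pass yields no lines; it appears to be a leftover from an earlier version.
  file.each_line do |untrustedLine|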
    cache.shift unless cache.length < pre_context
    line=util.ensureUTF8(untrustedLine)
    cache.push(line)

    if line =~ pattern
      lines += cache
      if post_context > 0
        post_context.times do
          begin
            utf8line=util.ensureUTF8(file.readline)
            lines.push("#{currentline}:#{utf8line}")  
            currentline +=1
          rescue Exception => e
            break
          end
        end
      end
    end
  end

  return lines
end
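
The pattern handling in the listing above (a String is escaped and wrapped in a Regexp, a Regexp is used as is) can be tried in isolation. The sketch below uses a hypothetical helper, `normalize_pattern`, which is not part of CodeZauker, and raises ArgumentError for unsupported types; that exception class is a choice made for this example only.

```ruby
# Hypothetical helper mirroring the pattern check in the listing above.
def normalize_pattern(pattern)
  case pattern
  when Regexp then pattern
  # Regexp.escape neutralises metacharacters such as "." and "?"
  when String then Regexp.new(Regexp.escape(pattern))
  else raise ArgumentError, "Pattern must be string or regexp"
  end
end

literal = normalize_pattern("a.c")
regex = normalize_pattern(/a.c/)

puts literal.match?("abc")  # false: the dot is matched literally
puts literal.match?("a.c")  # true
puts regex.match?("abc")    # true: the dot matches any character
```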