Module: Utils

Defined in:
lib/utils.rb

Overview

A collection of methods to support checking and converting different document file types.

Constant Summary

CompressedFileExtensions =

The list of permitted compressed file extensions, built according to which commands are available on the system.

[]

Class Method Summary

Class Method Details

.command_present?(command) ⇒ Boolean

Check whether the given command is present on the system, by testing whether which prints a path for it.

Returns:

  • (Boolean)


# File 'lib/utils.rb', line 30

def Utils.command_present? command
  `which #{command}` != ""
end
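
The check above shells out to which. As a minimal sketch of the same idea without a subshell, the PATH directories can be scanned directly. The helper name command_on_path? and its optional path parameter are assumptions made for illustration; they are not part of lib/utils.rb.

```ruby
# Hypothetical alternative to Utils.command_present?, not part of lib/utils.rb.
# Looks for an executable file named `command` in each directory of `path`
# (defaults to the PATH environment variable) without invoking a shell.
def command_on_path?(command, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, command)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# The result depends on what is installed on the machine running this.
puts command_on_path?("pdftotext")
```

Because no shell is involved, the command name is never interpreted as shell syntax, which is the main reason one might prefer this form.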

.convert_pdf_document(filename) ⇒ Object

Use pdftotext to convert the PDF file to text. Returns the converted filename, obtained by adding .txt to the given filename. If pdftotext is not available, the given filename is returned unchanged.



# File 'lib/utils.rb', line 71

def Utils.convert_pdf_document filename
  if Utils.command_present?("pdftotext")
    output_filename = "#{filename}.txt"
    `pdftotext -layout -enc Latin1 -nopgbrk #{filename} #{output_filename}` 
    return output_filename
  else
    return filename
  end
end
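
The method above interpolates the filename directly into a shell command, so a filename containing spaces or shell metacharacters would be split or interpreted by the shell. The following sketch shows one way the same command line could be built with each argument escaped, using Ruby's standard Shellwords library. The helper name pdftotext_command is hypothetical; the pdftotext flags are copied from the source above.

```ruby
require "shellwords"

# Hypothetical helper, not part of lib/utils.rb: builds the pdftotext command
# line used by Utils.convert_pdf_document, with every argument shell-escaped.
def pdftotext_command(filename)
  output_filename = "#{filename}.txt"
  ["pdftotext", "-layout", "-enc", "Latin1", "-nopgbrk",
   filename, output_filename].shelljoin
end

puts pdftotext_command("my report.pdf")
# prints: pdftotext -layout -enc Latin1 -nopgbrk my\ report.pdf my\ report.pdf.txt
```

The resulting string could be passed to backticks in place of the interpolated one. The same approach would apply to the abiword command in convert_wp_document.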

.convert_wp_document(filename) ⇒ Object

Use abiword to convert the word-processed file to text. Returns the converted filename, obtained by adding .txt to the given filename. If abiword is not available, the given filename is returned unchanged.



# File 'lib/utils.rb', line 84

def Utils.convert_wp_document filename
  if Utils.command_present?("abiword")
    output_filename = "#{filename}.txt"
    `abiword --to=txt #{filename} -o #{output_filename}` 
    return output_filename
  else
    return filename 
  end
end

.is_code?(filename) ⇒ Boolean

Return true if the filename has a source-code extension (.c, .h, .cpp, or .java).

Returns:

  • (Boolean)


# File 'lib/utils.rb', line 46

def Utils.is_code? filename
  [".c", ".h", ".cpp", ".java"].include? File.extname(filename)
end

.is_pdf_document?(filename) ⇒ Boolean

Return true if the filename's extension is exactly .pdf (the comparison is case-sensitive), which marks it as a PDF document.

Returns:

  • (Boolean)


# File 'lib/utils.rb', line 59

def Utils.is_pdf_document? filename
  ".pdf" == File.extname(filename)
end
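
Because File.extname returns the extension exactly as written, the comparison above does not match upper-case extensions. The short sketch below illustrates that behaviour and one possible case-insensitive variant; the helper name pdf_extension? is hypothetical and not part of lib/utils.rb.

```ruby
# Hypothetical helper, not part of lib/utils.rb: compares the extension
# case-insensitively by downcasing it first.
def pdf_extension?(filename)
  File.extname(filename).downcase == ".pdf"
end

puts ".pdf" == File.extname("report.pdf")  # prints true  (same test as is_pdf_document?)
puts ".pdf" == File.extname("REPORT.PDF")  # prints false (File.extname returns ".PDF")
puts pdf_extension?("REPORT.PDF")          # prints true
```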

.is_wp_document?(filename) ⇒ Boolean

Return true if the filename has a known word-processor extension (.doc, .rtf, .docx, or .abw).

Returns:

  • (Boolean)


# File 'lib/utils.rb', line 64

def Utils.is_wp_document? filename
  [".doc", ".rtf", ".docx", ".abw"].include? File.extname(filename)
end

.valid_document?(filename) ⇒ Boolean

Return true if the filename has a supported extension: a code file, a .txt file, a PDF document, or a word-processor document.

Returns:

  • (Boolean)


# File 'lib/utils.rb', line 51

def Utils.valid_document? filename
  Utils.is_code? filename or 
  (".txt" == File.extname(filename)) or
  Utils.is_pdf_document? filename or 
  Utils.is_wp_document? filename
end