Class: Pdfmdrename
Overview
Class for renaming a file according to its metadata.
Constant Summary
- @@keymapping =
Document key mappings used to determine the document type from the string in the metadata field 'title'.
{
  'cno' => ['Customer', 'Customernumber'],
  'con' => ['Contract'],
  'inf' => ['Information'],
  'inv' => ['Invoice', 'Invoicenumber'],
  'man' => ['Manual'],
  'off' => ['Offer', 'Offernumber'],
  'ord' => ['Order', 'Ordernumber'],
  'rec' => ['Receipt', 'Receiptnumber'],
  'tic' => ['Ticket'],
}
Instance Attribute Summary
- #allkeywords ⇒ Object
Returns the value of attribute allkeywords.
- #copy ⇒ Object
Returns the value of attribute copy.
- #dryrun ⇒ Object
Returns the value of attribute dryrun.
- #filename ⇒ Object
Returns the value of attribute filename.
- #nrkeywords ⇒ Object
Returns the value of attribute nrkeywords.
- #outputdir ⇒ Object
Returns the value of attribute outputdir.
Attributes inherited from Pdfmd
Instance Method Summary
- #get_author ⇒ Object
Get the author from the metatags and normalize the string.
- #get_doctype ⇒ Object
Get the doctype from the title.
- #get_filename(filedata = {}) ⇒ Object
Return the filename from the available filedata.
- #get_keywords(preface = '') ⇒ Object
Get the keywords. This method tries to handle the keywords intelligently and return them as a single string.
- #get_keywordsPreface(filedata = {}) ⇒ Object
Get the preface for the keywords. If the title is meaningful, the subject becomes the preface (= first keyword). If the subject matches a number/character combination and contains no spaces, the preface is combined with the doctype.
- #get_outputdir(outputdir = '') ⇒ Object
Validate the output directory.
- #initialize(filename) ⇒ Pdfmdrename
constructor
A new instance of Pdfmdrename.
- #rename ⇒ Object
- #verifyDocumentData(filedata = {}) ⇒ Object
Data verification: returns false if any metatag other than keywords is missing.
Methods inherited from Pdfmd
#check_metatags, #metadata, #readUserInput, #read_metatags
Methods included from Pdfmdmethods
#determineValidSetting, #log, #queryHiera
Constructor Details
#initialize(filename) ⇒ Pdfmdrename
Returns a new instance of Pdfmdrename.
# File 'lib/pdfmd/pdfmdrename.rb', line 23
def initialize(filename)
  super(filename)
  @nrkeywords ||= 3

  # Find the valid keymapping
  # Use @@keymapping as default and only overwrite when provided by hiera.
  hierakeymapping = self.determineValidSetting(nil, 'rename:keys')
  hierakeymapping ? @@keymapping = hierakeymapping : ''

  # FIXME: this default doctype assignment might need to be rewritten as the keymapping above.
  @defaultDoctype = self.determineValidSetting('doc', 'rename:defaultdoctype')
  @fileextension = 'pdf'
end
Instance Attribute Details
#allkeywords ⇒ Object
Returns the value of attribute allkeywords.
# File 'lib/pdfmd/pdfmdrename.rb', line 7
def allkeywords
  @allkeywords
end
#copy ⇒ Object
Returns the value of attribute copy.
# File 'lib/pdfmd/pdfmdrename.rb', line 7
def copy
  @copy
end
#dryrun ⇒ Object
Returns the value of attribute dryrun.
# File 'lib/pdfmd/pdfmdrename.rb', line 7
def dryrun
  @dryrun
end
#filename ⇒ Object
Returns the value of attribute filename.
# File 'lib/pdfmd/pdfmdrename.rb', line 7
def filename
  @filename
end
#nrkeywords ⇒ Object
Returns the value of attribute nrkeywords.
# File 'lib/pdfmd/pdfmdrename.rb', line 7
def nrkeywords
  @nrkeywords
end
#outputdir ⇒ Object
Returns the value of attribute outputdir.
# File 'lib/pdfmd/pdfmdrename.rb', line 7
def outputdir
  @outputdir
end
Instance Method Details
#get_author ⇒ Object
Get the author from the metatags and normalize the string
# File 'lib/pdfmd/pdfmdrename.rb', line 272
def get_author()
  author = @@metadata['author'].gsub(/\./,'_').gsub(/\&/,'').gsub(/\-/,'_').gsub(/\s|\//,'_').gsub(/\,/,'_').gsub(/\_\_/,'_')
  I18n.enforce_available_locales = false
  I18n.transliterate(author).downcase # Normalising
end
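The substitution chain used for the author can be tried outside the class. The following is a minimal sketch, not the library's code: `normalize_author` is a hypothetical helper name, and the `I18n.transliterate` step is left out on purpose so that the example runs with the standard library only.

```ruby
# Hypothetical stand-alone helper; not part of pdfmd.
# Mirrors the gsub chain of get_author, without I18n.transliterate.
def normalize_author(author)
  author
    .gsub(/\./, '_')     # dots become underscores
    .gsub(/&/, '')       # ampersands are dropped
    .gsub(/-/, '_')      # hyphens become underscores
    .gsub(/\s|\//, '_')  # whitespace and slashes become underscores
    .gsub(/,/, '_')      # commas become underscores
    .gsub(/__/, '_')     # collapse pairs of underscores (single pass)
    .downcase
end

puts normalize_author('John Doe')
puts normalize_author('Doe, John')
```

Because the last substitution is a single pass over pairs, longer runs of underscores are only halved, not fully collapsed.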
#get_doctype ⇒ Object
Get the doctype from the title
# File 'lib/pdfmd/pdfmdrename.rb', line 259
def get_doctype()
  doctype = @defaultDoctype
  @@keymapping.each do |key,value|
    value.kind_of?(String) ? value = value.split : ''
    value.each do |keyword|
      @@metadata['title'].match(/#{keyword}/i) ? doctype = key : ''
    end
  end
  doctype.downcase
end
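The title lookup can be illustrated in isolation. This is a sketch under assumptions: `DOCTYPE_KEYS` is a hypothetical subset of the key mappings, `detect_doctype` is a hypothetical name, and `'doc'` stands in for the configured default doctype.

```ruby
# Hypothetical subset of the key mappings, for illustration only.
DOCTYPE_KEYS = {
  'inv' => ['Invoice', 'Invoicenumber'],
  'man' => ['Manual'],
  'tic' => ['Ticket']
}.freeze

# Returns the abbreviation whose keywords occur in the title
# (case-insensitive), or the default doctype when nothing matches.
def detect_doctype(title, default = 'doc')
  doctype = default
  DOCTYPE_KEYS.each do |abbreviation, keywords|
    keywords.each do |keyword|
      doctype = abbreviation if title.match(/#{keyword}/i)
    end
  end
  doctype.downcase
end

puts detect_doctype('Invoice 2015-03')
puts detect_doctype('Holiday pictures')
```

The loop does not stop at the first hit, so when a title contains keywords of several mappings, the mapping iterated last determines the result.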
#get_filename(filedata = {}) ⇒ Object
Return the filename from the available filedata
# File 'lib/pdfmd/pdfmdrename.rb', line 107
def get_filename(filedata = {})
  if filedata.size > 0
    # Create the filename out of all with some exceptions
    #
    # If the doctype is the default one, the first keywords are the
    # title and the subject
    if filedata[:doctype] == @defaultDoctype
      # The subject and title is part of the keywords and handled there.
      filedata[:outputdir] + '/' + filedata[:date] + '-' +
        filedata[:author] + '-' +
        filedata[:doctype] + '-' +
        filedata[:keywords] + '.' +
        filedata[:extension]
    else
      filedata[:outputdir] + '/' +
        filedata.except(:extension, :title, :subject, :outputdir).values.join('-') + '.' +
        filedata[:extension]
    end
  else
    false
  end
end
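How the non-default branch assembles the name can be sketched as follows. The sample values are invented, `build_filename` is a hypothetical name, and `reject` is used here in place of `Hash#except` so the sketch does not depend on the Ruby version or on ActiveSupport.

```ruby
# Hypothetical sample data; keys are inserted in the order used by #rename.
SAMPLE_FILEDATA = {
  date: '20150301',
  author: 'john_doe',
  doctype: 'inv',
  title: 'invoice',
  subject: '12345',
  keywords: 'inv12345-hardware',
  extension: 'pdf',
  outputdir: '/tmp'
}.freeze

# Joins every value except the skipped keys with '-', relying on
# the insertion order of the hash for the order of the name parts.
def build_filename(filedata)
  skipped = [:extension, :title, :subject, :outputdir]
  parts = filedata.reject { |key, _value| skipped.include?(key) }.values
  filedata[:outputdir] + '/' + parts.join('-') + '.' + filedata[:extension]
end

puts build_filename(SAMPLE_FILEDATA)
```

Since the parts are taken in insertion order, changing the order in which the hash is filled changes the resulting filename.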
#get_keywords(preface = '') ⇒ Object
Get the keywords. This method tries to handle the keywords intelligently and return them as a single string. Abbreviations are taken into account along the way. Word combinations that merely contain some of the abbreviation keywords, on the other hand, should not be changed, which is what makes this a bit tricky.
# File 'lib/pdfmd/pdfmdrename.rb', line 160
def get_keywords(preface = '')
  if !@@metadata['keywords'].empty?
    keywordsarray = @@metadata['keywords'].split(',')

    # Replace leading spaces and strings from the keymappings
    # if the value is identical it will be placed at the beginning
    # of the array (and therefore be right after the preface in the filename)
    keywordsarraySorted = Array.new
    keywordsarray.each_with_index do |value,index|
      value = value.lstrip.chomp

      @@keymapping.each do |abbreviation,keyvaluearray|
        if keyvaluearray.kind_of?(String)
          keyvaluearray = keyvaluearray.split(',')
        end
        keyvaluearray = keyvaluearray.sort_by{|size| -size.length}
        keyvaluearray.each do |keystring|
          value = value.gsub(/^#{keystring.lstrip.chomp}\s?/i, abbreviation.to_s)
        end
      end

      # Remove special characters from string
      keywordsarray[index] = value.gsub(/\s|\/|\-|\./,'_').gsub(/_+/,'_')

      # If the current value matches some of the replacement abbreviations,
      # put the value at index 0 in the array. It will then be listed earlier in the filename.
      if value.match(/^#{@@keymapping.keys.join('|')} /i)
        keywordsarraySorted.insert(0, keywordsarray[index])
      else
        keywordsarraySorted.push(keywordsarray[index])
      end
    end

    # Insert the preface if it is not empty
    if !preface.to_s.empty?
      keywordsarraySorted.insert(0, preface)
    end

    # Convert the keyword array to a string and limit the number
    # of keywords according to @nrkeywords or the parameter 'all'.
    # The condition only tests @allkeywords; it no longer assigns to
    # @@metadata['keywords'] as a side effect.
    if !@allkeywords
      keywords = keywordsarraySorted.values_at(*(0..@nrkeywords-1)).join('-')
    else
      keywords = keywordsarraySorted.join('-')
    end

    # Normalize all keywords and return value
    I18n.enforce_available_locales = false
    I18n.transliterate(keywords).downcase.chomp('-')
  else
    # Keywords metafield is empty :(
    # So we return nothing or the preface (if available)
    !preface.empty? ? preface : ''
  end
end
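The abbreviation step is the part that is easiest to get wrong, so a small stand-alone sketch may help. `ABBREVIATIONS` is a hypothetical subset of the key mappings and `abbreviate_keyword` is a hypothetical name; transliteration, reordering and the keyword limit are left out.

```ruby
# Hypothetical subset of the key mappings, for illustration only.
ABBREVIATIONS = {
  'cno' => ['Customer', 'Customernumber'],
  'inv' => ['Invoice', 'Invoicenumber']
}.freeze

# Replaces a leading mapped word with its abbreviation and turns
# whitespace, slashes, hyphens and dots into underscores.
def abbreviate_keyword(keyword, mapping = ABBREVIATIONS)
  value = keyword.lstrip.chomp
  mapping.each do |abbreviation, names|
    # Try longer names first so 'Invoicenumber' is matched before 'Invoice'.
    names.sort_by { |name| -name.length }.each do |name|
      value = value.gsub(/^#{name}\s?/i, abbreviation)
    end
  end
  value.gsub(/\s|\/|-|\./, '_').gsub(/_+/, '_')
end

keywords = ' Invoicenumber 12345, Hardware store'.split(',')
puts keywords.map { |keyword| abbreviate_keyword(keyword) }.join('-')
```

Sorting by descending length matters: if 'Invoice' were tried first, 'Invoicenumber 12345' would turn into 'invnumber 12345' instead of 'inv12345'. The pattern is anchored at the start of the keyword, so a mapped word later in a word combination is left alone.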
#get_keywordsPreface(filedata = {}) ⇒ Object
Get the preface for the keywords. If the title is meaningful, the subject becomes the preface (= first keyword). If the subject matches a number/character combination and contains no spaces, the preface is combined with the doctype. Otherwise the preface contains the whole subject, with dots and spaces replaced by underscores.
# File 'lib/pdfmd/pdfmdrename.rb', line 230
def get_keywordsPreface(filedata = {})
  I18n.enforce_available_locales = false

  if filedata[:doctype].nil? or filedata[:doctype].empty?
    filedata[:doctype] = @defaultDoctype
  end

  if !filedata[:subject].nil? and !filedata[:subject].empty? and filedata[:doctype] != @defaultDoctype
    I18n.transliterate(filedata[:subject])
  else
    # Document matches standard document type.
    # title and subject are being returned.
    # Normalize special characters
    title = filedata[:title].downcase
    subject = !filedata[:subject].empty? ? '_' + filedata[:subject].downcase : ''
    subject = subject.gsub(/\s|\-|\&/, '_')
    I18n.transliterate(title + subject)
  end
end
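The two branches of the preface logic can be sketched without the class. `keywords_preface` is a hypothetical name, `'doc'` is assumed as the default doctype, and the transliteration step is omitted so the example needs no gems.

```ruby
# Hypothetical stand-alone sketch of the preface decision.
def keywords_preface(doctype, title, subject, default_doctype = 'doc')
  if !subject.empty? && doctype != default_doctype
    # A recognised doctype with a subject: the subject is the preface.
    subject
  else
    # Default doctype: title, plus the subject when one is present.
    suffix = subject.empty? ? '' : '_' + subject.downcase
    title.downcase + suffix.gsub(/\s|-|&/, '_')
  end
end

puts keywords_preface('inv', 'invoice', '12345')
puts keywords_preface('doc', 'Letter', 'Tax office')
```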
#get_outputdir(outputdir = '') ⇒ Object
Validate the output directory
# File 'lib/pdfmd/pdfmdrename.rb', line 137
def get_outputdir(outputdir = '')
  if !outputdir
    # outputdir is set to false, assume pwd
    self.log('debug','No outputdir specified. Taking current pwd of file.')
    outputdir = File.dirname(@filename)
  elsif outputdir and !File.exist?(outputdir)
    puts "Error: output directory '#{outputdir}' not found. Abort."
    self.log('error',"Output directory '#{outputdir}' not accessible. Abort.")
    exit 1
  elsif outputdir and File.exist?(outputdir)
    outputdir
  else
    false
  end
end
#rename ⇒ Object
# File 'lib/pdfmd/pdfmdrename.rb', line 37
def rename
  # Build new filename elements
  newFilename = Hash.new
  newFilename[:date] = @@metadata['createdate'].gsub(/\ \d{2}\:\d{2}\:\d{2}.*$/,'').gsub(/\:/,'')
  newFilename[:author] = get_author().gsub(/\'/,'').gsub(/\_$/, '')
  newFilename[:doctype] = get_doctype()
  newFilename[:title] = @@metadata['title'].downcase.gsub(/(\s|\-|\.|\&|\%|\,)/, '_').gsub(/\_+/, '_')
  newFilename[:subject] = @@metadata['subject'].downcase.gsub(/(\s|\-|\.|\&|\%|\,)/, '_').gsub(/\_+/, '_')
  newFilename[:keywords] = get_keywords(get_keywordsPreface(newFilename))
  newFilename[:extension] = @fileextension
  newFilename[:outputdir] = get_outputdir(@outputdir)

  # Verify that all data is available for renaming and fail otherwise
  if !verifyDocumentData(newFilename)
    abort 'Document metadata not complete. Abort renaming.'
  end

  command = @copy ? 'cp' : 'mv'
  filetarget = get_filename(newFilename)
  puts filetarget

  if @dryrun # Do nothing on dryrun
    if @filename == filetarget
      self.log('info', "Dryrun: File '#{@filename}' already has the correct name. Doing nothing.")
    else
      self.log('info',"Dryrun: Renaming '#{@filename}' to '#{get_filename(newFilename)}'.")
    end
  elsif @filename == filetarget # Do nothing when name is already correct.
    self.log('info',"File '#{@filename}' already has the correct name. Doing nothing.")
  else
    self.log('info',"Renaming '#{@filename}' to '#{filetarget}'.")
    command = command + " '#{@filename}' #{filetarget} 2>/dev/null"
    system(command)
    # exitstatus is an Integer and therefore always truthy in Ruby,
    # so test for success explicitly.
    if !$?.success?
      log('error', "Error renaming '#{@filename}' to '#{filetarget}'.")
      abort
    else
      log('info', "Successfully renamed file to '#{filetarget}'.")
    end
  end
end
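The regular expressions that turn the creation date and the title into filename parts can be checked on their own. In this sketch `date_part` and `name_part` are hypothetical helper names, and the sample `createdate` string is an assumed format of the form 'YYYY:MM:DD HH:MM:SS' followed by a timezone offset.

```ruby
# Hypothetical helpers; the sample createdate format is an assumption.
def date_part(createdate)
  # Drop the time of day (and anything after it), then the colons.
  createdate.gsub(/ \d{2}:\d{2}:\d{2}.*$/, '').gsub(/:/, '')
end

def name_part(text)
  # Lowercase, replace separators with underscores, collapse repeats.
  text.downcase.gsub(/(\s|-|\.|&|%|,)/, '_').gsub(/_+/, '_')
end

puts date_part('2015:03:01 12:34:56+01:00')
puts name_part('Invoice No. 5')
```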
#verifyDocumentData(filedata = {}) ⇒ Object
Data verification: returns false if any metatag other than keywords is missing.
# File 'lib/pdfmd/pdfmdrename.rb', line 85
def verifyDocumentData(filedata = {})
  @@default_tags.each do |current_tag|
    # Skip over keywords (optional tag)
    next if current_tag.match(/keywords/)

    # Fail as soon as a required tag is missing or empty
    if @@metadata[current_tag].nil? or @@metadata[current_tag] == ''
      return false
    end
  end
end
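The check can be tried against plain hashes. This is a sketch under assumptions: the contents of `@@default_tags` are not shown on this page, so `REQUIRED_TAGS` below is a guessed list, and `metadata_complete?` is a hypothetical name. Unlike the method above, the sketch returns an explicit `true` on success.

```ruby
# Assumed tag list; the real @@default_tags may differ.
REQUIRED_TAGS = %w[author title subject createdate keywords].freeze

# Returns false when any tag other than 'keywords' is nil or empty.
def metadata_complete?(metadata, tags = REQUIRED_TAGS)
  tags.each do |tag|
    next if tag.match(/keywords/) # keywords are optional
    value = metadata[tag]
    return false if value.nil? || value == ''
  end
  true
end

complete = { 'author' => 'John Doe', 'title' => 'Invoice',
             'subject' => '12345', 'createdate' => '2015:03:01' }
puts metadata_complete?(complete)
puts metadata_complete?(complete.merge('subject' => ''))
```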