fedora_2_to_3_pid_renamer
A small Ruby app that applies alternative PID names to the configuration files created when migrating Fedora 2 data to Fedora 3.
Migration process
Fedora is an open source repository system for the management and dissemination of digital content.
Migrating from version 2 to version 3 involves running the Analyzer, which outputs a series of files. Among them are XML files that describe the data objects in the existing Fedora 2 database. These files need to be modified so that the Fedora 2 objects are suitable for insertion into a Fedora 3 database. The Analyzer also creates a set of cmodel-n-deployments.txt files that also need to be modified.
This app carries out the manipulation of the files generated by the Analyzer.
Installation
In an environment with Ruby installed:
gem install fedora_2_to_3_pid_renamer
Configuration
To use this app, you must first create a config.yml file. For example:
changes:
  CModel1: book
  CModel1-SDep1: book-SDep1
  CModel2: thesis
  CModel2-SDep1: thesis-SDep1
changeme: foo
folders:
  input: 'path/to/analyser/files'
  output: 'path/to/output/location'
locations:
  - "//foxml:digitalObject/@PID"
  - "//rdf:Description/@rdf:about"
  - "//fedora-model:isContractorOf/@rdf:resource"
namespaces:
  foxml: "info:fedora/fedora-system:def/foxml#"
  rdf: "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  fedora-model: "info:fedora/fedora-system:def/model#"
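As a rough illustration of how a file like this can be read, the sketch below parses a shortened inline copy of the example with Ruby's standard YAML library. The load_config helper and the decision to treat all five top-level keys as required are assumptions made for this example, not a description of the app's internals.

```ruby
require 'yaml'

# Shortened inline copy of the example above; the app itself reads config.yml.
EXAMPLE_CONFIG = <<~YAML
  changes:
    CModel1: book
    CModel1-SDep1: book-SDep1
  changeme: foo
  folders:
    input: 'path/to/analyser/files'
    output: 'path/to/output/location'
  locations:
    - "//foxml:digitalObject/@PID"
  namespaces:
    foxml: "info:fedora/fedora-system:def/foxml#"
YAML

# Assumption: every top-level key in the example is treated as required.
EXPECTED_KEYS = %w[changes changeme folders locations namespaces].freeze

# Hypothetical helper: parse the YAML text and fail early if a key is absent.
def load_config(yaml_text)
  config = YAML.safe_load(yaml_text)
  missing = EXPECTED_KEYS - config.keys
  raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?
  config
end

config = load_config(EXAMPLE_CONFIG)
puts config['changes']['CModel1']   # the replacement name for CModel1
puts config['folders']['output']    # where changed files would be written
```

Checking the keys up front means a mistyped section name surfaces as one clear error rather than as a file that silently goes unchanged.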
changes
The Analyzer creates model names from the Fedora 2 data. These often need to be changed to domain-specific names.
Each change has a key and a value: the key is the Analyzer-generated name and the value is the domain-specific name that should replace it.
changeme
The Analyzer flags the elements that need to be changed by adding a 'changeme' namespace to each one. This needs to be replaced with a domain-specific namespace.
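The combined effect of changes and changeme on a single value can be sketched as follows. This assumes the Analyzer writes values of the form changeme:CModel1, optionally preceded by a prefix such as info:fedora/. The rename_pid helper is hypothetical and is not the app's actual API.

```ruby
# Hypothetical helper: swap the 'changeme' namespace and the generated name
# for their configured replacements, leaving unrelated values untouched.
def rename_pid(value, changes, namespace)
  # rpartition returns ["", "", value] when 'changeme:' is absent.
  prefix, marker, old_name = value.rpartition('changeme:')
  return value if marker.empty? || !changes.key?(old_name)

  "#{prefix}#{namespace}:#{changes[old_name]}"
end

changes = {
  'CModel1' => 'book',
  'CModel1-SDep1' => 'book-SDep1',
  'CModel2' => 'thesis',
  'CModel2-SDep1' => 'thesis-SDep1'
}

puts rename_pid('changeme:CModel1', changes, 'foo')                   # foo:book
puts rename_pid('info:fedora/changeme:CModel2-SDep1', changes, 'foo') # info:fedora/foo:thesis-SDep1
puts rename_pid('demo:123', changes, 'foo')                           # demo:123
```

Looking up the whole name after the namespace, rather than substituting substrings, keeps CModel1 from accidentally matching inside CModel1-SDep1.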
folders
Defines the paths to the input and output folders.
The input folder should contain the files generated by the Analyzer.
An output folder is used so that the Analyzer files are not modified directly by this process. The output files can be deleted and the process repeated any number of times until the output is satisfactory.
Only files that are changed by the process will be copied to the output folder.
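The input/output behaviour described here can be sketched as below. The write_changed_files helper, the file names, and the simple string substitution are all hypothetical. The point is only that input files are read but never written, and that unchanged files produce no output.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: pass each input file's text to the block and write the
# result to output_dir only when the block actually changed it.
def write_changed_files(input_dir, output_dir)
  written = []
  Dir.glob(File.join(input_dir, '*')).sort.each do |path|
    next unless File.file?(path)

    original = File.read(path)
    updated = yield(original)
    next if updated == original # unchanged files are not copied

    FileUtils.mkdir_p(output_dir)
    File.write(File.join(output_dir, File.basename(path)), updated)
    written << File.basename(path)
  end
  written
end

# Demonstration in a throwaway directory with two made-up files.
Dir.mktmpdir do |dir|
  input = File.join(dir, 'in')
  output = File.join(dir, 'out')
  FileUtils.mkdir_p(input)
  File.write(File.join(input, 'a.xml'), 'changeme:CModel1')
  File.write(File.join(input, 'b.xml'), 'demo:1')

  p write_changed_files(input, output) { |text| text.gsub('changeme:CModel1', 'foo:book') }
  # prints ["a.xml"]
end
```

Because the input folder is left untouched, deleting the output folder and running again always starts from the same state.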
locations
The locations in the XML files where content should be modified. Each location is defined by an XPath expression. If a location contains no content matching changes or changeme, that location is ignored.
The locations given in the example above should work for most cases.
namespaces
For the XML files to be parsed correctly, the namespaces used in the locations need to be specified.
The namespaces in the example above should work in most cases.
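To show how locations and namespaces work together, here is a minimal sketch using REXML, chosen only because it ships with Ruby. Which XML library the app actually uses is not stated here. The sample document is made up, and splitting each location into an element path and an attribute name is a simplification for this example.

```ruby
require 'rexml/document'

# Hypothetical input; real Analyzer output contains much more than this.
xml = '<foxml:digitalObject xmlns:foxml="info:fedora/fedora-system:def/foxml#" ' \
      'PID="changeme:CModel1"/>'
doc = REXML::Document.new(xml)

namespaces = {
  'foxml' => 'info:fedora/fedora-system:def/foxml#',
  'rdf' => 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
}
locations = ['//foxml:digitalObject/@PID', '//rdf:Description/@rdf:about']

# For each location, count the elements found and update matching attributes.
visited = locations.map do |location|
  # Simplification: split "//element/@attribute" into its two parts.
  element_path, attribute = location.split('/@', 2)
  elements = REXML::XPath.match(doc, element_path, namespaces)
  elements.each do |element|
    next unless element.attributes[attribute] == 'changeme:CModel1'

    element.attributes[attribute] = 'foo:book'
  end
  elements.length
end

p visited                      # [1, 0]: the rdf location is absent, so it is skipped
puts doc.root.attributes['PID'] # foo:book
```

The namespaces hash lets the prefixes in each XPath expression resolve to the right URIs, and a location that matches nothing simply yields no elements to change.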
Execution
To run the process, enter the following command from the directory containing the config.yml file:
fedora_2_to_3_pid_renamer
Alternatively, you can specify the location of the config.yml:
fedora_2_to_3_pid_renamer -c /path/to/config.yml