marc4j4r
A Ruby wrapper around marc4j.jar (as forked by javamarc), a Java library for working with library MARC data.
JRuby version alert
MARC4J4R::Record#each throws an error in JRuby versions before 1.7.1 when running in --1.9 mode. Your best bet is to use JRuby 1.7.1 or higher.
[The error in question is, I think, JRUBY-6581.]
Deprecation alert
I'm giving up on this standalone module and focusing my efforts on making a marc4j add-on for the standard ruby-marc distribution.
Getting a MARC reader
marc4j4r provides four readers out of the box: :strictmarc (binary), :permissivemarc (binary), :marcxml (MARC-XML), and :alephsequential (Ex Libris's AlephSequential format).
You can pass either a filename or an open IO object (either a Ruby IO or a java.io.InputStream).
require 'marc4j4r'
binreader = MARC4J4R::Reader.new('test.mrc') # defaults to :strictmarc
binreader = MARC4J4R::Reader.new('test.mrc', :strictmarc)
permissivereader = MARC4J4R::Reader.new('test.mrc', :permissivemarc)
xmlreader = MARC4J4R::Reader.new('test.xml', :marcxml)
asreader = MARC4J4R::Reader.new('test.seq', :alephsequential)
# Or use a file object
reader = MARC4J4R::Reader.new(File.open('test.mrc'))
# Or a java.io.InputStream
jurl = Java::java.net.URL.new('http://my.machine.com/test.mrc')
istream = jurl.openConnection.getInputStream
reader = MARC4J4R::Reader.new(istream)
Using the reader
A MARC4J4R::Reader is an Enumerable, so you can do:
reader.each do |record|
# do stuff with the record
end
Or, if you're using jruby_threach:
reader.threach(2) do |record|
# do stuff with records in two threads
end
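Because the reader is an Enumerable, the usual Enumerable methods (map, select, each_slice, first, and so on) should be available as well. The following is a minimal sketch of that idea, not marc4j4r code: StubRecord and StubReader are hypothetical stand-ins so the example runs without JRuby or the gem. Note that a real reader consumes an input stream, so it may only support a single pass; the stub below can be iterated repeatedly.

```ruby
# Hypothetical stand-ins for a MARC record and MARC4J4R::Reader.
# They are NOT part of marc4j4r; they only mimic "an Enumerable of records".
StubRecord = Struct.new(:id, :title)

class StubReader
  include Enumerable

  def initialize(records)
    @records = records
  end

  # Enumerable only requires #each; map, select, etc. are built on it.
  def each
    return enum_for(:each) unless block_given?
    @records.each { |record| yield record }
  end
end

reader = StubReader.new([
  StubRecord.new('001', 'Alpha'),
  StubRecord.new('002', 'Beta'),
  StubRecord.new('003', 'Gamma')
])

titles        = reader.map(&:title)                           # transform each record
starts_with_b = reader.select { |r| r.title.start_with?('B') } # filter records
batches       = reader.each_slice(2).to_a                     # process in groups of two
first_two     = reader.first(2)                               # stop early
```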
Using the writer
writer = MARC4J4R::Writer.new(filename, :strictmarc) # binary output
# or, for MARC-XML output:
# writer = MARC4J4R::Writer.new(filename, :marcxml)
writer.write(record)
# repeat for each record
writer.close
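Reading and writing can be combined into a simple copy or conversion loop. This is a sketch only: copy_records is a hypothetical helper, not part of marc4j4r, and it relies on nothing beyond the reader's each and the writer's write and close shown above.

```ruby
# copy_records is a hypothetical helper, not part of marc4j4r.
# It accepts any reader that responds to #each and any writer that
# responds to #write and #close, and returns the number of records copied.
def copy_records(reader, writer)
  count = 0
  begin
    reader.each do |record|
      writer.write(record)
      count += 1
    end
  ensure
    # Close the writer even if reading or writing raises.
    writer.close
  end
  count
end

# Under JRuby with marc4j4r loaded, a binary-to-XML conversion might look like
# this (the file names are placeholders):
#   reader = MARC4J4R::Reader.new('test.mrc', :strictmarc)
#   writer = MARC4J4R::Writer.new('out.xml', :marcxml)
#   copy_records(reader, writer)
```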
Working with records and fields
In addition to all the normal marc4j methods, MARC4J4R::Record exposes some additional methods and syntaxes.
See the classes themselves and/or the specs for more examples.
- MARC4J4R::Reader
- MARC4J4R::Writer
- MARC4J4R::Record
- MARC4J4R::ControlField
- MARC4J4R::DataField
leader = record.leader
# All fields are available via #each or #fields
fields = record.fields
record.each do |field|
  # do something with each controlfield/datafield; returned in the order they were added
end
# Controlfields have a tag and a value
idfield = record['001']
idfield.tag # => '001'
id = idfield.value # or idfield.data, same thing
# Get the first datafield with a given tag
first700 = record['700'] # Note: need to use strings, not integers
# Stringify a field to get all the subfields joined with spaces
fullTitle = record['245'].to_s
all700s = record.find_by_tag '700'
all700and856s = record.find_by_tag ['700', '856']
# Construct and add a controlfield
record << MARC4J4R::ControlField.new('001', '0000333234')
# Construct and add a datafield
df = MARC4J4R::DataField.new(tag, ind1, ind2)
ind1 = df.ind1
ind2 = df.ind2
df << MARC4J4R::Subfield.new('a', 'the $a value')
df << MARC4J4R::Subfield.new('b', 'the $b value')
# Add it to a record
record << df
# Get subfields or their values
firstSubfieldAValue = df['a']
allSubfields = df.subs
allSubfieldAs = df.subs('a')
allSubfieldAorBs = df.subs(['a', 'b'])
allSubfieldAorBValues = df.sub_values(['a', 'b'])
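The accessors above compose naturally. As a sketch, the hypothetical helper below (subfield_values is not part of marc4j4r) gathers subfield values across every datafield with the given tags. It assumes find_by_tag returns an enumerable collection of datafields and sub_values returns an array of strings; it should only be given datafield tags, since controlfields have no subfields.

```ruby
# subfield_values is a hypothetical helper, not part of marc4j4r.
# Assumptions: record.find_by_tag(tags) returns an enumerable of datafields,
# and field.sub_values(codes) returns an array of strings.
def subfield_values(record, tags, codes)
  record.find_by_tag(tags).flat_map { |field| field.sub_values(codes) }
end

# Possible usage with a real record (datafield tags only):
#   names = subfield_values(record, '700', ['a'])
#   mixed = subfield_values(record, ['700', '856'], ['a', 'u'])
```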
Install
$ gem install marc4j4r
Note on Patches/Pull Requests
- Fork the project.
- Make your feature addition or bug fix.
- Add tests for it. This is important so I don't break it in a future version unintentionally.
- Commit. Do not change the Rakefile, version, or history. (If you want to have your own version, that is fine, but bump the version in a separate commit that I can ignore when I pull.)
- Send me a pull request. Bonus points for topic branches.
Copyright
Copyright (c) 2012 Bill Dueber
See LICENSE for details.