Class: RDF::CLI

Inherits:
Object
Defined in:
lib/rdf/cli.rb

Overview

Individual formats can modify options by updating Reader.options or Writer.options. Format-specific commands are taken from Format.cli_commands for each loaded format, which returns a hash mapping each command name to either a lambda or a hash of command settings (including a `:lambda`); each lambda takes arguments and options.

Status updates should be logged to `opts[:logger].info`. More complicated information can be added to the `:messages` key within `opts`, if present.

Other than `help`, all commands parse an input file.

Multiple commands may be added in sequence to execute a pipeline.

Format-specific commands should verify that the reader and/or output format are appropriate for the command.
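
The pipeline behaviour described above can be sketched without RDF.rb at all. The snippet below is only a standalone illustration under stated assumptions: `COMMANDS_SKETCH`, `run_pipeline` and the two toy commands are hypothetical names and are not part of the library.

```ruby
require 'stringio'

# Hypothetical registry: command name => lambda taking (argv, opts),
# mirroring the (argv, opts) signature used by RDF::CLI commands.
COMMANDS_SKETCH = {
  count: ->(argv, opts) { opts[:output].puts "#{argv.length} argument(s)" },
  list:  ->(argv, opts) { argv.each { |a| opts[:output].puts a } }
}

# Split the argument list into known commands and everything else,
# then run each command in the order it was given.
def run_pipeline(args, output: $stdout)
  cmds, rest = args.partition { |e| COMMANDS_SKETCH.key?(e.to_sym) }
  raise ArgumentError, "No command given" if cmds.empty?
  cmds.each { |c| COMMANDS_SKETCH[c.to_sym].call(rest, {output: output}) }
  cmds
end

run_pipeline(%w[count a.nt list b.nt])
```

Commands and file arguments may be interleaved on the command line; the commands still run in the order given, and every command sees the same remaining arguments.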

Examples:

Creating Reader-specific options:

class Reader
  def self.options
    [
      RDF::CLI::Option.new(
        symbol: :canonicalize,
        on: ["--canonicalize"],
        description: "Canonicalize input/output.") {true},
      RDF::CLI::Option.new(
        symbol: :uri,
        on: ["--uri STRING"],
        description: "URI.") {|v| RDF::URI(v)},
    ]
  end
end

Creating Format-specific commands:

class Format
  def self.cli_commands
    {
      count: {
        description: "",
        parse: true,
        lambda: ->(argv, opts) {}
      },
    }
  end
end

Adding a command manually:

class MyCommand
  RDF::CLI.add_command(:count, description: "Count statements") do |argv, opts|
    count = 0
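    # parse takes keyword options, so the opts hash is splatted
    RDF::CLI.parse(argv, **opts) do |reader|
      reader.each_statement do |statement|
        count += 1
      end
    end
    # the block receives `opts`; the logger is stored under :logger
    opts[:logger].info "Parsed #{count} statements"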
  end
end

Defined Under Namespace

Classes: Option

Constant Summary

COMMANDS =

Built-in commands. Other commands are imported from the Format class of different readers/writers using Format.cli_commands. `COMMANDS` is a Hash whose keys are commands that may be executed by exec. The value is a hash containing the following keys:

  • `description` used for providing information about the command.

  • `parse` Boolean value to determine if input files should automatically be parsed into `repository`.

  • `help` used for the CLI help output.

  • `lambda` code run to execute the command.

  • `filter` option values that must match for the command to be used.

  • `control` used to indicate how (if) the command is displayed.

  • `repository` use this repository, if set.

  • `options` an optional array of `RDF::CLI::Option` describing command-specific options.

  • `option_use` a hash of option symbol to option usage, used for overriding the default status of an option for this command.

Returns:

  • (Hash{Symbol => Hash{Symbol => Object}})
{
  count: {
    description: "Count statements in parsed input",
    parse: false,
    control: :none,
    help: "count [options] [args...]\nreturns number of parsed statements",
    lambda: ->(argv, opts) do
      unless repository.count > 0
        start = Time.new
        count = 0
        self.parse(argv, **opts) do |reader|
          reader.each_statement do |statement|
            count += 1
          end
        end
        secs = Time.new - start
        opts[:output].puts "Parsed #{count} statements with #{@readers.join(', ')} in #{secs} seconds @ #{count/secs} statements/second."
      end
    end,
    option_use: {output_format: :disabled}
  },
  help: {
    description: "This message",
    parse: false,
    control: :none,
    lambda: ->(argv, opts) {self.usage(self.options)}
  },
  lengths: {
    description: "Lengths of each parsed statement",
    parse: true,
    control: :none,
    help: "lengths [options] [args...]\nreturns lengths of each parsed statement",
    lambda: ->(argv, opts) do
      opts[:output].puts "Lengths"
      repository.each_statement do |statement|
        opts[:output].puts statement.to_s.size
      end
    end,
    option_use: {output_format: :disabled}
  },
  objects: {
    description: "Serialize each parsed object to N-Triples",
    parse: true,
    control: :none,
    help: "objects [options] [args...]\nreturns unique objects serialized in N-Triples format",
    lambda: ->(argv, opts) do
      opts[:output].puts "Objects"
      repository.each_object do |object|
        opts[:output].puts object.to_ntriples
      end
    end,
    option_use: {output_format: :disabled}
  },
  predicates: {
    parse: true,
    description: "Serialize each parsed predicate to N-Triples",
    control: :none,
    help: "predicates [options] [args...]\nreturns unique predicates serialized in N-Triples format",
    lambda: ->(argv, opts) do
      opts[:output].puts "Predicates"
      repository.each_predicate do |predicate|
        opts[:output].puts predicate.to_ntriples
      end
    end,
    option_use: {output_format: :disabled}
  },
  serialize: {
    description: "Serialize using output-format (or N-Triples)",
    parse: true,
    help: "serialize [options] [args...]\nserialize output using specified format (or N-Triples if not specified)",
    lambda: ->(argv, opts) do
      writer_class = RDF::Writer.for(opts[:output_format]) || RDF::NTriples::Writer
      out = opts[:output]
      writer_opts = {prefixes: {}, standard_prefixes: true}.merge(opts)
      writer_class.new(out, **writer_opts) do |writer|
        writer << repository
      end
    end
  },
  subjects: {
    parse: true,
    control: :none,
    description: "Serialize each parsed subject to N-Triples",
    help: "subjects [options] [args...]\nreturns unique subjects serialized in N-Triples format",
    lambda: ->(argv, opts) do
      opts[:output].puts "Subjects"
      repository.each_subject do |subject|
        opts[:output].puts subject.to_ntriples
      end
    end,
    option_use: {output_format: :disabled}
  },
  validate: {
    description: "Validate parsed input",
    control: :none,
    parse: true,
    help: "validate [options] [args...]\nvalidates resulting repository (may also be used with --validate to check for parse-time errors)",
    lambda: ->(argv, opts) do
      opts[:output].puts "Input is " + (repository.valid? ? "" : "in") + "valid"
    end,
    option_use: {output_format: :disabled}
  }
}
OPTIONS =

Options to set up, which may be modified by the selected command. Options are also read from Reader.options and Writer.options. When a specific input- or output-format is selected, options are also discovered from the associated subclass reader or writer.

Returns:

([
  RDF::CLI::Option.new(
    symbol: :debug,
    control: :checkbox,
    datatype: TrueClass,
    on: ["-d", "--debug"],
    description: 'Enable debug output for troubleshooting.'),
  RDF::CLI::Option.new(
    symbol: :verbose,
    control: :checkbox,
    datatype: TrueClass,
    on: ['-v', '--verbose'],
    description: 'Enable verbose output. May be given more than once.'),
  RDF::CLI::Option.new(
    symbol: :evaluate,
    control: :none,
    datatype: TrueClass,
    on: ["-e", "--evaluate STRING"],
    description: "Evaluate argument as RDF input, if no files are specified"),
  RDF::CLI::Option.new(
    symbol: :output,
    control: :none,
    on: ["-o", "--output FILE"],
    description: "File to write output, defaults to STDOUT") {|arg| File.open(arg, "w")},
  RDF::CLI::Option.new(
    symbol: :ordered,
    control: :checkbox,
    datatype: TrueClass,
    on: ["--ordered"],
    description: "Use order preserving repository"),
  RDF::CLI::Option.new(
    symbol: :format,
    control: :select,
    datatype: RDF::Format.select {|ft| ft.reader}.map(&:to_sym).sort,
    on: ["--input-format FORMAT", "--format FORMAT"],
    description: "Format of input file, uses heuristic if not specified"
  ) do |arg, options|
      unless reader = RDF::Reader.for(arg.downcase.to_sym)
        RDF::CLI.abort "No reader found for #{arg.downcase.to_sym}. Available readers:\n  #{RDF::CLI.formats(reader: true).join("\n  ")}"
      end

      # Add format-specific reader options
      reader.options.each do |cli_opt|
        next if options.options.key?(cli_opt.symbol)
        on_args = cli_opt.on || []
        on_args << cli_opt.description if cli_opt.description
        options.on(*on_args) do |opt_arg|
          options.options[cli_opt.symbol] = cli_opt.call(opt_arg, options)
        end
      end if reader
      arg.downcase.to_sym
    end,
  RDF::CLI::Option.new(
    symbol: :output_format,
    control: :select,
    datatype: RDF::Format.select {|ft| ft.writer}.map(&:to_sym).sort,
    on: ["--output-format FORMAT"],
    description: "Format of output file, defaults to NTriples"
  ) do |arg, options|
      unless writer = RDF::Writer.for(arg.downcase.to_sym)
        RDF::CLI.abort "No writer found for #{arg.downcase.to_sym}. Available writers:\n  #{self.formats(writer: true).join("\n  ")}"
      end

      # Add format-specific writer options
      writer.options.each do |cli_opt|
        next if options.options.key?(cli_opt.symbol)
        on_args = cli_opt.on || []
        on_args << cli_opt.description if cli_opt.description
        options.on(*on_args) do |opt_arg|
          options.options[cli_opt.symbol] = cli_opt.call(opt_arg, options)
        end
      end if writer
      arg.downcase.to_sym
    end,
] + RDF::Reader.options + RDF::Writer.options).uniq(&:symbol)
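
The trailing `uniq(&:symbol)` means that when two options share a symbol, the one listed first is kept, so the built-in options take precedence over reader and writer options. The following is a minimal sketch of that behaviour; `SketchOption` and the sample data are hypothetical stand-ins, not the library's classes.

```ruby
# Hypothetical stand-in for RDF::CLI::Option, with just the fields needed here.
SketchOption = Struct.new(:symbol, :description)

BUILT_IN = [
  SketchOption.new(:debug, "built-in debug"),
  SketchOption.new(:verbose, "built-in verbose")
]
FROM_READER = [
  SketchOption.new(:verbose, "reader verbose"),
  SketchOption.new(:canonicalize, "reader canonicalize")
]

# Array#uniq with a block keeps the first element for each block value,
# so entries listed earlier take precedence.
MERGED = (BUILT_IN + FROM_READER).uniq(&:symbol)

MERGED.each { |o| puts "#{o.symbol}: #{o.description}" }
```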

Class Attribute Details

.repository ⇒ RDF::Repository

Repository containing parsed statements

Returns:

  • (RDF::Repository)

# File 'lib/rdf/cli.rb', line 371

def repository
  @repository
end

Class Method Details

.abort(msg) ⇒ void

This method returns an undefined value.

Parameters:

  • msg (String)

# File 'lib/rdf/cli.rb', line 694

def self.abort(msg)
  Kernel.abort "#{basename}: #{msg}"
end

.add_command(command, **options) {|argv, opts| ... } ⇒ Object

Add a command.

Parameters:

  • command (#to_sym)
  • options (Hash{Symbol => String})

Options Hash (**options):

  • description (String)
  • help (String)

    string to display for help

  • parse (Boolean)

    parse input files in to Repository, or not.

  • options (Array<RDF::CLI::Option>)

    specific to this command

Yields:

  • argv, opts

Yield Parameters:

  • argv (Array<String>)
  • opts (Hash)

Yield Returns:

  • (void)


# File 'lib/rdf/cli.rb', line 636

def self.add_command(command, **options, &block)
  options[:lambda] = block if block_given?
  COMMANDS[command.to_sym] ||= options
end
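
Because registration uses `||=`, a command name that is already present is left untouched: the first registration wins and later ones are silently ignored. A minimal sketch of that behaviour, using a hypothetical `REGISTRY` hash and `register` helper in place of `COMMANDS` and `add_command`:

```ruby
# Hypothetical registry standing in for RDF::CLI::COMMANDS.
REGISTRY = {}

# Same shape as the method above: store the block under :lambda and
# register the command only if the name is not already taken.
def register(command, **options, &block)
  options[:lambda] = block if block_given?
  REGISTRY[command.to_sym] ||= options
end

register(:hello, description: "first") { |argv, opts| "hello #{argv.join(' ')}" }
register("hello", description: "second") { |argv, opts| "ignored" }

puts REGISTRY[:hello][:description]
```

One consequence is that a format cannot replace a built-in command simply by registering another command under the same name.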

.basename ⇒ String

Returns:

  • (String)

# File 'lib/rdf/cli.rb', line 376

def self.basename() File.basename($0) end

.commands(**options) ⇒ Array<String> .commands(format: :json, **options) ⇒ Array{Object}

Overloads:

  • .commands(**options) ⇒ Array<String>

    Returns list of executable commands.

    Parameters:

    • options (Hash{Symbol => Object})

      already set

    Returns:

    • (Array<String>)

      list of executable commands

  • .commands(format: :json, **options) ⇒ Array{Object}

    Returns an array of commands including the command symbol

    Parameters:

    • format (:json) (defaults to: :json)

      (:json)

    • options (Hash{Symbol => Object})

      already set

    Returns:

    • (Array{Object})

      Returns an array of commands including the command symbol



# File 'lib/rdf/cli.rb', line 579

def self.commands(format: nil, **options)
  # First, load commands from other formats
  load_commands

  case format
  when :json
    COMMANDS.map do |k, v|
      v = v.merge(symbol: k, options: v.fetch(:options, []).map(&:to_hash))
      v.delete(:lambda)
      v.delete(:help)
      v.delete(:options) if v[:options].empty?
      v[:control] == :none ? nil : v
    end.compact
  else
    # Subset commands based on filter options
    cmds = COMMANDS.reject do |k, c|
      c.fetch(:filter, {}).any? do |opt, val|
        options[opt].to_s != val.to_s
      end
    end

    sym_len = cmds.keys.map {|k| k.to_s.length}.max
    cmds.keys.sort.map do |k|
      "%*s: %s" % [sym_len, k, cmds[k][:description]]
    end
  end
end
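
The non-JSON branch can be illustrated in isolation: a command is hidden as soon as any of its `filter` values differs from the options already set, and the remaining names are right-aligned to a common width. This is a standalone sketch; `SKETCH_CMDS`, `visible_commands` and the sample commands are hypothetical.

```ruby
# Hypothetical command table; only :description and :filter matter here.
SKETCH_CMDS = {
  serialize: {description: "Serialize output"},
  entail:    {description: "Entailment", filter: {format: :ntriples}},
  lint:      {description: "Lint", filter: {output_format: :ttl}}
}

# Drop commands whose filter values differ from the options already set,
# then format the survivors as right-aligned "name: description" lines.
def visible_commands(**options)
  cmds = SKETCH_CMDS.reject do |_name, c|
    c.fetch(:filter, {}).any? { |opt, val| options[opt].to_s != val.to_s }
  end
  width = cmds.keys.map { |k| k.to_s.length }.max
  cmds.keys.sort.map { |k| "%*s: %s" % [width, k, cmds[k][:description]] }
end

puts visible_commands(format: "ntriples")
```

Both sides of the comparison go through `to_s`, so a filter value of `:ntriples` matches an option given as the string `"ntriples"`.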

.exec(args, output: $stdout, option_parser: nil, messages: {}, **options) ⇒ Boolean

Execute one or more commands, parsing input as necessary

Parameters:

  • args (Array<String>)
  • output (IO) (defaults to: $stdout)
  • option_parser (OptionParser) (defaults to: nil)
  • messages (Hash{Symbol => Hash{Symbol => Array[String]}}) (defaults to: {})

    used for conveying non primary-output which is structured.

  • options (Hash{Symbol => Object})

Returns:

  • (Boolean)


# File 'lib/rdf/cli.rb', line 479

def self.exec(args, output: $stdout, option_parser: nil, messages: {}, **options)
  option_parser ||= self.options(args)
  options[:logger] ||= option_parser.options[:logger]
  output.set_encoding(Encoding::UTF_8) if output.respond_to?(:set_encoding) && RUBY_PLATFORM == "java"

  # Separate commands from file options; arguments already extracted
  cmds, args = args.partition {|e| COMMANDS.include?(e.to_sym)}

  if cmds.empty?
    usage(option_parser)
    raise ArgumentError, "No command given"
  end

  if cmds.first == 'help'
    on_cmd = cmds[1]
    cmd_opts = COMMANDS.fetch(on_cmd.to_s.to_sym, {})
    if on_cmd && cmd_opts[:help]
      usage(option_parser, cmd_opts: cmd_opts, banner: "Usage: #{self.basename.split('/').last} #{COMMANDS[on_cmd.to_sym][:help]}")
    elsif on_cmd
      usage(option_parser, cmd_opts: cmd_opts)
    else
      usage(option_parser)
    end
    return
  end

  # Make sure any selected command isn't filtered out
  cmds.each do |c|
    COMMANDS[c.to_sym].fetch(:filter, {}).each do |opt, val|
      if options[opt].to_s != val.to_s
        usage(option_parser, banner: "Command #{c.inspect} requires #{opt}: #{val}, not #{options.fetch(opt, 'null')}")
        raise ArgumentError, "Incompatible command #{c} used with option #{opt}=#{options[opt]}"
      end
    end

    # The command may specify a repository instance to use
    options[:repository] ||= COMMANDS[c.to_sym][:repository]
  end

  # Hacks for specific options
  options[:logger].level = Logger::INFO if options[:verbose]
  options[:logger].level = Logger::DEBUG if options[:debug]
  options[:format] = options[:format].to_sym if options[:format]
  options[:output_format] = options[:output_format].to_sym if options[:output_format]

  # Allow repository to be set via option.
  # If RDF::OrderedRepo is present, use it if the `ordered` option is specified, otherwise extend an Array.
  @repository = options[:repository] || case
    when RDF.const_defined?(:OrderedRepo) then RDF::OrderedRepo.new
    when options[:ordered] then [].extend(RDF::Enumerable, RDF::Queryable)
    else RDF::Repository.new
  end

  # Parse input files if any command requires it
  if cmds.any? {|c| COMMANDS[c.to_sym][:parse]}
    start = Time.new
    count = 0
    self.parse(args, **options) do |reader|
      reader.each_statement {|st| @repository << st}
      # Remember prefixes from reading
      options[:prefixes] ||= reader.prefixes
    end
    secs = Time.new - start
    options[:logger].info "Parsed #{repository.count} statements with #{@readers.join(', ')} in #{secs} seconds @ #{count/secs} statements/second."
  end

  # Run each command in sequence
  cmds.each do |command|
    COMMANDS[command.to_sym][:lambda].call(args,
      output: output,
      messages: messages,
      **options.merge(repository: repository))
  end

  # Normalize messages
  messages.each do |kind, term_messages|
    case term_messages
    when Hash
    when Array
      messages[kind] = {result: term_messages}
    else
      messages[kind] = {result: [term_messages]}
    end
  end

  if options[:statistics]
    options[:statistics][:reader] = @readers.first unless (@readers || []).empty?
    options[:statistics][:count] = @repository.count
  end
end
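
The "Normalize messages" step near the end of the method can be read on its own. The sketch below repeats the same three cases in a hypothetical helper, `normalize_messages`, so the resulting shape is easy to see; the sample keys are illustrative only.

```ruby
# Normalize a messages hash so every value is a Hash, following the same
# three cases as the loop at the end of .exec.
def normalize_messages(messages)
  messages.each do |kind, term_messages|
    case term_messages
    when Hash
      # already structured; left as-is
    when Array
      messages[kind] = {result: term_messages}
    else
      messages[kind] = {result: [term_messages]}
    end
  end
  messages
end

p normalize_messages(info: ["a", "b"], note: "c")
```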

.formats(reader: false, writer: false) ⇒ Array<String>

Returns list of available formats.

Returns:

  • (Array<String>)

    list of available formats



# File 'lib/rdf/cli.rb', line 643

def self.formats(reader: false, writer: false)
  f = RDF::Format.sort_by(&:to_sym).
    select {|ft| (reader ? ft.reader : (writer ? ft.writer : (ft.reader || ft.writer)))}.
    inject({}) do |memo, r|
      memo.merge(r.to_sym => r.name)
  end
  sym_len = f.keys.map {|k| k.to_s.length}.max
  f.map {|s, t| "%*s: %s" % [sym_len, s, t]}
end
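
The nested conditional in the `select` gives `reader:` priority over `writer:`, and with neither flag set any format that has a reader or a writer is listed. A minimal sketch of just that selection logic, with a hypothetical `SketchFormat` struct and made-up formats:

```ruby
# Hypothetical stand-in for RDF::Format: a symbol plus reader/writer flags.
SketchFormat = Struct.new(:sym, :name, :reader, :writer)

SKETCH_FORMATS = [
  SketchFormat.new(:gamma, "Gamma", false, true),  # writer only
  SketchFormat.new(:alpha, "Alpha", true, true),   # reader and writer
  SketchFormat.new(:beta, "Beta", true, false)     # reader only
]

# Same selection rule as .formats, returning only the format symbols.
def list_formats(reader: false, writer: false)
  SKETCH_FORMATS.sort_by(&:sym).select do |ft|
    reader ? ft.reader : (writer ? ft.writer : (ft.reader || ft.writer))
  end.map(&:sym)
end

p list_formats(writer: true)
```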

.load_commands ⇒ Hash{Symbol => Hash{Symbol => Object}}

Load commands from formats

Returns:

  • (Hash{Symbol => Hash{Symbol => Object}})


# File 'lib/rdf/cli.rb', line 610

def self.load_commands
  unless @commands_loaded
    RDF::Format.each do |format|
      format.cli_commands.each do |command, options|
        options = {lambda: options} unless options.is_a?(Hash)
        add_command(command, **options)
      end
    end
    @commands_loaded = true
  end
  COMMANDS
end
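
Two details of this method are worth seeing in isolation: a bare lambda is wrapped as `{lambda: ...}` before registration, and the scan over formats happens only once. The following standalone sketch assumes two hypothetical format classes and a hypothetical `SketchLoader` module; none of these names exist in RDF.rb.

```ruby
# Hypothetical format classes exposing cli_commands in the two accepted shapes.
class SketchFormatA
  def self.cli_commands
    {ping: ->(argv, opts) { "pong" }}  # bare lambda
  end
end

class SketchFormatB
  def self.cli_commands
    {echo: {description: "Echo", lambda: ->(argv, opts) { argv.join(" ") }}}
  end
end

module SketchLoader
  COMMANDS = {}

  # Scan the formats once, normalizing bare lambdas into option hashes.
  def self.load_commands
    unless @loaded
      [SketchFormatA, SketchFormatB].each do |format|
        format.cli_commands.each do |command, options|
          options = {lambda: options} unless options.is_a?(Hash)
          COMMANDS[command.to_sym] ||= options
        end
      end
      @loaded = true
    end
    COMMANDS
  end
end

p SketchLoader.load_commands.keys
```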

.options(argv) ⇒ OptionParser .options(argv, format: :json) ⇒ Array<RDF::CLI::Option>

Return OptionParser set with appropriate options

The yield return should provide one or more commands from which additional options will be extracted.

Overloads:

  • .options(argv) ⇒ OptionParser

  • .options(argv, format: :json) ⇒ Array<RDF::CLI::Option>

# File 'lib/rdf/cli.rb', line 390

def self.options(argv, format: nil)
  options = OptionParser.new
  cli_opts = OPTIONS.map(&:dup)
  logger = Logger.new($stderr)
  logger.level = Logger::WARN
  logger.formatter = lambda {|severity, datetime, progname, msg| "#{severity} #{msg}\n"}
  opts = options.options = {logger: logger}

  # Pre-load commands
  load_commands

  # Add options for the specified command(s)
  cmds, args = argv.partition {|e| COMMANDS.include?(e.to_sym)}
  cmds.each do |cmd|
    Array(RDF::CLI::COMMANDS[cmd.to_sym][:options]).each do |option|
      # Replace any existing option with the same symbol
      cli_opts.delete_if {|cli_opt| cli_opt.symbol == option.symbol}

      # Add the option, unless disabled or removed
      cli_opts.unshift(option)
    end

    # Update usage of options for this command
    RDF::CLI::COMMANDS[cmd.to_sym].fetch(:option_use, {}).each do |sym, use|
      if opt = cli_opts.find {|cli_opt| cli_opt.symbol == sym}
        opt.use = use
      end
    end
  end

  cli_opts.each do |cli_opt|
    next if opts.key?(cli_opt.symbol)
    on_args = cli_opt.on || []
    on_args << cli_opt.description if cli_opt.description
    options.on(*on_args) do |arg|
      opts[cli_opt.symbol] = cli_opt.call(arg, options)
    end
  end

  if format == :json
    # Return options
    cli_opts.map(&:to_hash)
  else
    options.banner = "Usage: #{self.basename} command+ [options] [args...]"

    options.on_tail('-V', '--version', 'Display the RDF.rb version and exit.') do
      puts RDF::VERSION; exit(0)
    end

    show_help = false
    options.on_tail("-h", "--help", "Show this message") do
      show_help = true
    end

    begin
      args = options.parse!(args)
    rescue OptionParser::InvalidOption, OptionParser::InvalidArgument, ArgumentError => e
      abort e
    end

    # Make sure options are processed first
    if show_help
      self.usage(options); exit(0)
    end

    options.args = cmds + args
    options
  end
end
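
The central loop, which registers each option's switches and description on an `OptionParser` and stores the converted value under the option's symbol, can be sketched with the standard library alone. `SketchOpt`, `SKETCH_OPTS` and `parse_sketch` below are hypothetical and much simpler than `RDF::CLI::Option`.

```ruby
require 'optparse'

# Hypothetical stand-in for RDF::CLI::Option: switch definitions, a
# description, and an optional block that converts the raw argument.
SketchOpt = Struct.new(:symbol, :on, :description, :block) do
  def call(arg)
    block ? block.call(arg) : arg
  end
end

SKETCH_OPTS = [
  SketchOpt.new(:verbose, ["-v", "--verbose"], "Verbose output", nil),
  SketchOpt.new(:limit, ["--limit N"], "Maximum count", ->(v) { Integer(v) })
]

# Register every option on an OptionParser and collect parsed values in a Hash.
def parse_sketch(argv)
  values = {}
  parser = OptionParser.new
  SKETCH_OPTS.each do |opt|
    on_args = opt.on.dup  # copy so the option's own array is not modified
    on_args << opt.description if opt.description
    parser.on(*on_args) { |arg| values[opt.symbol] = opt.call(arg) }
  end
  rest = parser.parse(argv)
  [values, rest]
end

p parse_sketch(%w[count -v --limit 5 file.nt])
```

The non-option arguments that remain after parsing correspond to the commands and file names that the real method places in `options.args`.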

.parse(files, evaluate: nil, format: nil, encoding: Encoding::UTF_8, **options) {|reader| ... } ⇒ nil

Parse each file, $stdin or specified string in `options` yielding a reader

Parameters:

  • files (Array<String>)
  • evaluate (String) (defaults to: nil)

    from command-line, rather than referenced file

  • format (Symbol) (defaults to: nil)

    (:ntriples) Reader symbol for finding reader

  • encoding (Encoding) (defaults to: Encoding::UTF_8)

    set on the input

  • options (Hash{Symbol => Object})

    sent to reader

Yields:

  • (reader)

Yield Parameters:

  • reader (RDF::Reader)

Returns:

  • (nil)


# File 'lib/rdf/cli.rb', line 665

def self.parse(files, evaluate: nil, format: nil, encoding: Encoding::UTF_8, **options, &block)
  if files.empty?
    # With no files, read from the `evaluate` string if given, otherwise from $stdin
    input = evaluate ? StringIO.new(evaluate) : $stdin
    input.set_encoding(encoding )
    if !format
      sample = input.read
      input.rewind
    end
    r = RDF::Reader.for(format|| {sample: sample})
    raise ArgumentError, "Unknown format for evaluated input" unless r
    (@readers ||= []) << r
    r.new(input, **options) do |reader|
      yield(reader)
    end
  else
    options[:format] = format if format
    files.each do |file|
      RDF::Reader.open(file, **options) do |reader|
        (@readers ||= []) << reader.class.to_s
        yield(reader)
      end
    end
  end
end
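
The branch taken when no files are given does two things before a reader is chosen: it picks the input stream, and, if no format was specified, it reads a sample and rewinds. A minimal standard-library sketch of those two steps follows; `choose_input` and `sample_and_rewind` are hypothetical helpers, and the actual format detection done by `RDF::Reader.for` is not reproduced.

```ruby
require 'stringio'

# Pick the input stream the way the empty-files branch does: an evaluated
# string wins, otherwise fall back to a default stream ($stdin in the CLI).
def choose_input(evaluate: nil, fallback: $stdin, encoding: Encoding::UTF_8)
  input = evaluate ? StringIO.new(evaluate) : fallback
  input.set_encoding(encoding)
  input
end

# Read a sample for format sniffing, then rewind so that a reader
# created afterwards still sees the whole input.
def sample_and_rewind(input)
  sample = input.read
  input.rewind
  sample
end

io = choose_input(evaluate: String.new("<s> <p> <o> .\n"))
puts sample_and_rewind(io)
```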

.usage(options, cmd_opts: {}, banner: nil) ⇒ Object

Output usage message



# File 'lib/rdf/cli.rb', line 462

def self.usage(options, cmd_opts: {}, banner: nil)
  options.banner = banner if banner
  $stdout.puts options
  $stdout.puts "Note: available commands and options may be different depending on selected --input-format and/or --output-format."
  $stdout.puts "Available commands:\n\t#{self.commands(**options.options).join("\n\t")}"
  $stdout.puts "Available formats:\n\t#{(self.formats).join("\n\t")}"
end