Class: RDF::CLI

Inherits:
Object
Defined in:
lib/rdf/cli.rb

Overview

Individual formats can modify options by updating Reader.options or Writer.options. Format-specific commands are taken from Format.cli_commands for each loaded format, which returns an array of lambdas taking arguments and options.

Status updates should be logged to `opts.info`. More complicated information can be added to the `:messages` key within `opts`, if present.

Other than `help`, all commands parse an input file.

Multiple commands may be added in sequence to execute a pipeline.
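
A pipeline can be pictured as follows. This is a minimal plain-Ruby sketch that does not use RDF.rb; the `known` table and its lambdas are hypothetical stand-ins for registered commands. It assumes, as exec does, that command names are separated from the remaining arguments with `partition` and then run in the order given.

```ruby
# Hypothetical command table; each lambda records what it was asked to do.
known = {
  count:    ->(args, log) { log << "count #{args.join(',')}" },
  validate: ->(args, log) { log << "validate #{args.join(',')}" }
}

argv = %w(validate data.nt count)

# Split known command names from everything else (here, a file name).
cmds, files = argv.partition { |e| known.include?(e.to_sym) }

# Run each command in the order it appeared on the command line.
log = []
cmds.each { |c| known[c.to_sym].call(files, log) }
```

Both commands receive the same remaining arguments, so one parsed input can feed every step of the pipeline.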

Format-specific commands should verify that the reader and/or output format are appropriate for the command.

Examples:

Creating Reader-specific options:

class Reader
  def self.options
    [
      RDF::CLI::Option.new(
        symbol: :canonicalize,
        on: ["--canonicalize"],
        description: "Canonicalize URI/literal forms.") {true},
      RDF::CLI::Option.new(
        symbol: :uri,
        on: ["--uri STRING"],
        description: "URI.") {|v| RDF::URI(v)},
    ]
  end
end

Creating Format-specific commands:

class Format
  def self.cli_commands
    {
      count: {
        description: "",
        parse: true,
        lambda: ->(argv, opts) {}
      },
    }
  end
end

Adding a command manually:

class MyCommand
  RDF::CLI.add_command(:count, description: "Count statements") do |argv, opts|
    count = 0
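    # parse takes keyword options, so the opts hash is splatted
    RDF::CLI.parse(argv, **opts) do |reader|
      reader.each_statement do |statement|
        count += 1
      end
    end
    # The block parameter is `opts`; the logger is passed in under :logger
    opts[:logger].info "Parsed #{count} statements"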
  end
end

Defined Under Namespace

Classes: Option

Constant Summary

COMMANDS =

Built-in commands. Other commands are imported from the Format class of different readers/writers using Format.cli_commands. `COMMANDS` is a Hash whose keys are commands that may be executed by exec. The value is a hash containing the following keys:

  • `description` used for providing information about the command.

  • `parse` Boolean value to determine if input files should automatically be parsed into `repository`.

  • `help` used for the CLI help output.

  • `lambda` code run to execute command.

  • `filter` value is a Hash whose keys are matched against selected command options. All specified `key/value` pairs are compared against the equivalent key in the current invocation.

    • If an Array, the option value (as a string) must match any value of the array (as a string).
    • If a Proc, it is passed the option value and must return `true`.
    • Otherwise, the option value (as a string) must equal the `value` (as a string).

  • `control` Used to indicate how (if) command is displayed.

  • `repository` Use this repository, if set.

  • `options` an optional array of `RDF::CLI::Option` describing command-specific options.

  • `option_use` A hash of option symbol to option usage, used for overriding the default status of an option for this command.
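
The three `filter` matching rules can be sketched in plain Ruby. This is an illustration only, not the library's code path: `filter_matches?` is a hypothetical helper name, and both the filter and the current options are assumed to be plain hashes keyed by option symbol. Following the source of commands and exec, a Proc is treated as matching when it returns a truthy value.

```ruby
# Hypothetical helper: true when every filter entry is satisfied by `options`.
def filter_matches?(filter, options)
  filter.all? do |opt, val|
    case val
    when Array
      # Any array member may match, compared as strings
      val.map(&:to_s).include?(options[opt].to_s)
    when Proc
      # The proc receives the raw option value
      !!val.call(options[opt])
    else
      # Scalar values are compared as strings
      val.to_s == options[opt].to_s
    end
  end
end

filter = {
  output_format: [:ntriples, :nquads],
  validate: ->(v) { v == true },
  format: :turtle
}

filter_matches?(filter, {output_format: "nquads", validate: true, format: "turtle"})
filter_matches?(filter, {output_format: "turtle", validate: true, format: "turtle"})
```

The first call satisfies all three rules; the second fails the Array rule because `turtle` is not among the listed output formats.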

Returns:

  • (Hash{Symbol => Hash{Symbol => Object}})
{
  count: {
    description: "Count statements in parsed input",
    parse: false,
    control: :none,
    help: "count [options] [args...]\nreturns number of parsed statements",
    lambda: ->(argv, opts) do
      unless repository.count > 0
        start = Time.new
        count = 0
        self.parse(argv, **opts) do |reader|
          reader.each_statement do |statement|
            count += 1
          end
        end
        secs = Time.new - start
        opts[:output].puts "Parsed #{count} statements with #{@readers.join(', ')} in #{secs} seconds @ #{count/secs} statements/second."
      end
    end,
    option_use: {output_format: :disabled}
  },
  help: {
    description: "This message",
    parse: false,
    control: :none,
    lambda: ->(argv, opts) {self.usage(self.options)}
  },
  lengths: {
    description: "Lengths of each parsed statement",
    parse: true,
    control: :none,
    help: "lengths [options] [args...]\nreturns lengths of each parsed statement",
    lambda: ->(argv, opts) do
      opts[:output].puts "Lengths"
      repository.each_statement do |statement|
        opts[:output].puts statement.to_s.size
      end
    end,
    option_use: {output_format: :disabled}
  },
  objects: {
    description: "Serialize each parsed object to N-Triples",
    parse: true,
    control: :none,
    help: "objects [options] [args...]\nreturns unique objects serialized in N-Triples format",
    lambda: ->(argv, opts) do
      opts[:output].puts "Objects"
      repository.each_object do |object|
        opts[:output].puts object.to_ntriples
      end
    end,
    option_use: {output_format: :disabled}
  },
  predicates: {
    parse: true,
    description: "Serialize each parsed predicate to N-Triples",
    control: :none,
    help: "predicates [options] [args...]\nreturns unique predicates serialized in N-Triples format",
    lambda: ->(argv, opts) do
      opts[:output].puts "Predicates"
      repository.each_predicate do |predicate|
        opts[:output].puts predicate.to_ntriples
      end
    end,
    option_use: {output_format: :disabled}
  },
  serialize: {
    description: "Serialize using output-format (or N-Triples)",
    parse: true,
    help: "serialize [options] [args...]\nserialize output using specified format (or N-Triples if not specified)",
    lambda: ->(argv, opts) do
      writer_class = RDF::Writer.for(opts[:output_format]) || RDF::NTriples::Writer
      out = opts[:output]
      writer_opts = {prefixes: {}, standard_prefixes: true}.merge(opts)
      writer_class.new(out, **writer_opts) do |writer|
        writer << repository
      end
    end
  },
  subjects: {
    parse: true,
    control: :none,
    description: "Serialize each parsed subject to N-Triples",
    help: "subjects [options] [args...]\nreturns unique subjects serialized in N-Triples format",
    lambda: ->(argv, opts) do
      opts[:output].puts "Subjects"
      repository.each_subject do |subject|
        opts[:output].puts subject.to_ntriples
      end
    end,
    option_use: {output_format: :disabled}
  },
  validate: {
    description: "Validate parsed input",
    control: :none,
    parse: true,
    help: "validate [options] [args...]\nvalidates resulting repository (may also be used with --validate to check for parse-time errors)",
    lambda: ->(argv, opts) do
      opts[:output].puts "Input is " + (repository.valid? ? "" : "in") + "valid"
    end,
    option_use: {output_format: :disabled}
  }
}
OPTIONS =

Options to set up; these may be modified by the selected command. Options are also read from Reader#options and Writer#options. When a specific input- or output-format is selected, options are also discovered from the associated subclass reader or writer.

Returns:

([
  RDF::CLI::Option.new(
    symbol: :debug,
    control: :checkbox,
    datatype: TrueClass,
    on: ["-d", "--debug"],
    description: 'Enable debug output for troubleshooting.'),
  RDF::CLI::Option.new(
    symbol: :verbose,
    control: :checkbox,
    datatype: TrueClass,
    on: ['-v', '--verbose'],
    description: 'Enable verbose output. May be given more than once.'),
  RDF::CLI::Option.new(
    symbol: :evaluate,
    control: :none,
    datatype: TrueClass,
    on: ["-e", "--evaluate STRING"],
    description: "Evaluate argument as RDF input, if no files are specified"),
  RDF::CLI::Option.new(
    symbol: :output,
    control: :none,
    on: ["-o", "--output FILE"],
    description: "File to write output, defaults to STDOUT") {|arg| File.open(arg, "w")},
  RDF::CLI::Option.new(
    symbol: :ordered,
    control: :checkbox,
    datatype: TrueClass,
    on: ["--ordered"],
    description: "Use order preserving repository"),
  RDF::CLI::Option.new(
    symbol: :format,
    control: :select,
    datatype: RDF::Format.select {|ft| ft.reader}.map(&:to_sym).sort,
    on: ["--input-format FORMAT", "--format FORMAT"],
    description: "Format of input file, uses heuristic if not specified"
  ) do |arg, options|
      unless reader = RDF::Reader.for(arg.downcase.to_sym)
        RDF::CLI.abort "No reader found for #{arg.downcase.to_sym}. Available readers:\n  #{RDF::CLI.formats(reader: true).join("\n  ")}"
      end

      # Add format-specific reader options
      reader.options.each do |cli_opt|
        next if options.options.key?(cli_opt.symbol)
        on_args = cli_opt.on || []
        on_args << cli_opt.description if cli_opt.description
        options.on(*on_args) do |opt_arg|
          options.options[cli_opt.symbol] = cli_opt.call(opt_arg, options)
        end
      end if reader
      arg.downcase.to_sym
    end,
  RDF::CLI::Option.new(
    symbol: :output_format,
    control: :select,
    datatype: RDF::Format.select {|ft| ft.writer}.map(&:to_sym).sort,
    on: ["--output-format FORMAT"],
    description: "Format of output file, defaults to NTriples"
  ) do |arg, options|
      unless writer = RDF::Writer.for(arg.downcase.to_sym)
        RDF::CLI.abort "No writer found for #{arg.downcase.to_sym}. Available writers:\n  #{self.formats(writer: true).join("\n  ")}"
      end

      # Add format-specific writer options
      writer.options.each do |cli_opt|
        next if options.options.key?(cli_opt.symbol)
        on_args = cli_opt.on || []
        on_args << cli_opt.description if cli_opt.description
        options.on(*on_args) do |opt_arg|
          options.options[cli_opt.symbol] = cli_opt.call(opt_arg, options)
        end
      end if writer
      arg.downcase.to_sym
    end,
] + RDF::Reader.options + RDF::Writer.options).uniq(&:symbol)

Class Attribute Details

.repository ⇒ RDF::Repository

Repository containing parsed statements

Returns:

  • (RDF::Repository)

# File 'lib/rdf/cli.rb', line 375

def repository
  @repository
end

Class Method Details

.abort(msg) ⇒ void

This method returns an undefined value.

Parameters:

  • msg (String)


# File 'lib/rdf/cli.rb', line 719

def self.abort(msg)
  Kernel.abort "#{basename}: #{msg}"
end

.add_command(command, **options) {|argv, opts| ... } ⇒ Object

Add a command.

Parameters:

  • command (#to_sym)
  • options (Hash{Symbol => String})

Options Hash (**options):

  • description (String)
  • help (String)

    string to display for help

  • parse (Boolean)

    whether to parse input files into the repository.

  • options (Array<RDF::CLI::Option>)

    specific to this command

Yields:

  • argv, opts

Yield Parameters:

  • argv (Array<String>)
  • opts (Hash)

Yield Returns:

  • (void)


# File 'lib/rdf/cli.rb', line 661

def self.add_command(command, **options, &block)
  options[:lambda] = block if block_given?
  COMMANDS[command.to_sym] ||= options
end
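
Because add_command assigns with `||=`, a command name that is already registered keeps its existing definition. The sketch below shows that behaviour in isolation; `DemoCLI` is a hypothetical stand-in module with an empty registry, not RDF::CLI itself.

```ruby
# Hypothetical stand-in that mirrors the shape of add_command.
module DemoCLI
  COMMANDS = {}

  def self.add_command(command, **options, &block)
    options[:lambda] = block if block_given?
    # ||= keeps an existing entry, so the first registration wins
    COMMANDS[command.to_sym] ||= options
  end
end

DemoCLI.add_command("count", description: "first") { |argv, opts| :first }
DemoCLI.add_command(:count, description: "second") { |argv, opts| :second }

DemoCLI::COMMANDS[:count][:description]          # description from the first call
DemoCLI::COMMANDS[:count][:lambda].call([], {})  # runs the first block
```

Under this reading, a format or caller cannot replace a built-in command simply by registering the same name again.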

.basename ⇒ String

Returns:

  • (String)


# File 'lib/rdf/cli.rb', line 380

def self.basename() File.basename($0) end

.commands(**options) ⇒ Array<String>
.commands(format: :json, **options) ⇒ Array{Object}

Overloads:

  • .commands(**options) ⇒ Array<String>

    Returns list of executable commands.

    Parameters:

    • options (Hash{Symbol => Object})

      already set

    Returns:

    • (Array<String>)

      list of executable commands

  • .commands(format: :json, **options) ⇒ Array{Object}

    Returns commands as JSON, for API usage.

    Parameters:

    • format (:json) (defaults to: :json)
    • options (Hash{Symbol => Object})

      already set

    Returns:

    • (Array{Object})

      Returns an array of commands including the command symbol



# File 'lib/rdf/cli.rb', line 597

def self.commands(format: nil, **options)
  # First, load commands from other formats
  load_commands

  case format
  when :json
    COMMANDS.map do |k, v|
      v = v.merge(symbol: k, options: v.fetch(:options, []).map(&:to_hash))
      v.delete(:lambda)
      v.delete(:help)
      v.delete(:options) if v[:options].empty?
      v[:control] == :none ? nil : v
    end.compact
  else
    # Subset commands based on filter options
    cmds = COMMANDS.reject do |k, c|
      c.fetch(:filter, {}).any? do |opt, val|
        case val
        when Array
          !val.map(&:to_s).include?(options[opt].to_s)
        when Proc
          !val.call(options[opt])
        else
          val.to_s != options[opt].to_s
        end
      end
    end

    sym_len = cmds.keys.map {|k| k.to_s.length}.max
    cmds.keys.sort.map do |k|
      "%*s: %s" % [sym_len, k, cmds[k][:description]]
    end
  end
end
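
The non-JSON branch right-aligns command names with the `%*s` format directive, which takes its field width from the argument list. A small stand-alone sketch of that formatting step, using a hypothetical `commands` hash of names and descriptions:

```ruby
# Hypothetical subset of command names and descriptions.
commands = {
  serialize: "Serialize using output-format (or N-Triples)",
  count: "Count statements in parsed input",
  help: "This message"
}

# Width of the longest command name, used to right-align every name.
width = commands.keys.map { |k| k.to_s.length }.max

lines = commands.keys.sort.map do |k|
  "%*s: %s" % [width, k, commands[k]]
end

puts lines
```

Each name is padded on the left to the width of the longest one, so the colons line up in the usage output.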

.exec(args, output: $stdout, option_parser: nil, messages: {}, **options) ⇒ Boolean

Execute one or more commands, parsing input as necessary

Parameters:

  • args (Array<String>)
  • output (IO) (defaults to: $stdout)
  • option_parser (OptionParser) (defaults to: nil)
  • messages (Hash{Symbol => Hash{Symbol => Array[String]}}) (defaults to: {})

    used for conveying non primary-output which is structured.

  • options (Hash{Symbol => Object})

Returns:

  • (Boolean)


# File 'lib/rdf/cli.rb', line 483

def self.exec(args, output: $stdout, option_parser: nil, messages: {}, **options)
  option_parser ||= self.options(args)
  options[:logger] ||= option_parser.options[:logger]
  output.set_encoding(Encoding::UTF_8) if output.respond_to?(:set_encoding) && RUBY_PLATFORM == "java"

  # Separate commands from file options; arguments already extracted
  cmds, args = args.partition {|e| COMMANDS.include?(e.to_sym)}

  if cmds.empty?
    usage(option_parser)
    raise ArgumentError, "No command given"
  end

  if cmds.first == 'help'
    on_cmd = cmds[1]
    cmd_opts = COMMANDS.fetch(on_cmd.to_s.to_sym, {})
    if on_cmd && cmd_opts[:help]
      usage(option_parser, cmd_opts: cmd_opts, banner: "Usage: #{self.basename.split('/').last} #{COMMANDS[on_cmd.to_sym][:help]}")
    elsif on_cmd
      usage(option_parser, cmd_opts: cmd_opts)
    else
      usage(option_parser)
    end
    return
  end

  # Make sure any selected command isn't filtered out
  cmds.each do |c|
    COMMANDS[c.to_sym].fetch(:filter, {}).each do |opt, val|
      case val
      when Array
        unless val.map(&:to_s).include?(options[opt].to_s)
          usage(option_parser, banner: "Command #{c.inspect} requires #{opt} in #{val.map(&:to_s).inspect}, not #{options.fetch(opt, 'null')}")
          raise ArgumentError, "Incompatible command #{c} used with option #{opt}=#{options[opt]}"
        end
      when Proc
        unless val.call(options[opt])
          usage(option_parser, banner: "Command #{c.inspect} #{opt} inconsistent with #{options.fetch(opt, 'null')}")
          raise ArgumentError, "Incompatible command #{c} used with option #{opt}=#{options[opt]}"
        end
      else
        unless val.to_s == options[opt].to_s
          usage(option_parser, banner: "Command #{c.inspect} requires compatible value for #{opt}, not #{options.fetch(opt, 'null')}")
          raise ArgumentError, "Incompatible command #{c} used with option #{opt}=#{options[opt]}"
        end
      end
    end

    # The command may specify a repository instance to use
    options[:repository] ||= COMMANDS[c.to_sym][:repository]
  end

  # Hacks for specific options
  options[:logger].level = Logger::INFO if options[:verbose]
  options[:logger].level = Logger::DEBUG if options[:debug]
  options[:format] = options[:format].to_sym if options[:format]
  options[:output_format] = options[:output_format].to_sym if options[:output_format]

  # Allow repository to be set via option.
  # If RDF::OrderedRepo is present, use it if the `ordered` option is specified, otherwise extend an Array.
  @repository = options[:repository] || case
    when RDF.const_defined?(:OrderedRepo) then RDF::OrderedRepo.new
    when options[:ordered] then [].extend(RDF::Enumerable, RDF::Queryable)
    else RDF::Repository.new
  end

  # Parse input files if any command requires it
  if cmds.any? {|c| COMMANDS[c.to_sym][:parse]}
    start = Time.new
    count = 0
    self.parse(args, **options) do |reader|
      reader.each_statement {|st| @repository << st}
      # Remember prefixes from reading
      options[:prefixes] ||= reader.prefixes
    end
    secs = Time.new - start
    options[:logger].info "Parsed #{repository.count} statements with #{@readers.join(', ')} in #{secs} seconds @ #{count/secs} statements/second."
  end

  # Run each command in sequence
  cmds.each do |command|
    COMMANDS[command.to_sym][:lambda].call(args,
      output: output,
      messages: messages,
      **options.merge(repository: repository))
  end

  # Normalize messages
  messages.each do |kind, term_messages|
    case term_messages
    when Hash
    when Array
      messages[kind] = {result: term_messages}
    else
      messages[kind] = {result: [term_messages]}
    end
  end

  if options[:statistics]
    options[:statistics][:reader] = @readers.first unless (@readers || []).empty?
    options[:statistics][:count] = @repository.count
  end
end
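
The "Normalize messages" step near the end of exec can be read in isolation. The sketch below wraps the same case statement in a hypothetical helper, `normalize_messages`, and applies it to a sample hash; the message kinds and contents are invented for illustration.

```ruby
# Hypothetical helper wrapping the normalization step shown in exec.
def normalize_messages(messages)
  messages.each do |kind, term_messages|
    case term_messages
    when Hash
      # Already structured; leave as-is
    when Array
      messages[kind] = {result: term_messages}
    else
      messages[kind] = {result: [term_messages]}
    end
  end
end

messages = {
  info: "done",
  warnings: ["w1", "w2"],
  validation: {errors: ["e1"]}
}

normalize_messages(messages)
```

After the call, every value is a Hash whose values are Arrays, which matches the `messages` parameter type documented for exec.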

.formats(reader: false, writer: false) ⇒ Array<String>

Returns list of available formats.

Returns:

  • (Array<String>)

    list of available formats



# File 'lib/rdf/cli.rb', line 668

def self.formats(reader: false, writer: false)
  f = RDF::Format.sort_by(&:to_sym).
    select {|ft| (reader ? ft.reader : (writer ? ft.writer : (ft.reader || ft.writer)))}.
    inject({}) do |memo, r|
      memo.merge(r.to_sym => r.name)
  end
  sym_len = f.keys.map {|k| k.to_s.length}.max
  f.map {|s, t| "%*s: %s" % [sym_len, s, t]}
end

.load_commands ⇒ Hash{Symbol => Hash{Symbol => Object}}

Load commands from formats

Returns:

  • (Hash{Symbol => Hash{Symbol => Object}})


# File 'lib/rdf/cli.rb', line 635

def self.load_commands
  unless @commands_loaded
    RDF::Format.each do |format|
      format.cli_commands.each do |command, options|
        options = {lambda: options} unless options.is_a?(Hash)
        add_command(command, **options)
      end
    end
    @commands_loaded = true
  end
  COMMANDS
end

.options(argv) ⇒ OptionParser
.options(argv, format: :json) ⇒ Array<RDF::CLI::Option>

Return OptionParser set with appropriate options

The yield return should provide one or more commands from which additional options will be extracted.

Overloads:

  • .options(argv) ⇒ OptionParser

    Parameters:

    • argv (Array<String>)

    Returns:

    • (OptionParser)

  • .options(argv, format: :json) ⇒ Array<RDF::CLI::Option>

    Returns discovered options

    Parameters:

    • argv (Array<String>)
    • format (:json) (defaults to: :json)

      (:json)

    Returns:

    • (Array<RDF::CLI::Option>)



# File 'lib/rdf/cli.rb', line 394

def self.options(argv, format: nil)
  options = OptionParser.new
  cli_opts = OPTIONS.map(&:dup)
  logger = Logger.new($stderr)
  logger.level = Logger::WARN
  logger.formatter = lambda {|severity, datetime, progname, msg| "#{severity} #{msg}\n"}
  opts = options.options = {logger: logger}

  # Pre-load commands
  load_commands

  # Add options for the specified command(s)
  cmds, args = argv.partition {|e| COMMANDS.include?(e.to_sym)}
  cmds.each do |cmd|
    Array(RDF::CLI::COMMANDS[cmd.to_sym][:options]).each do |option|
      # Replace any existing option with the same symbol
      cli_opts.delete_if {|cli_opt| cli_opt.symbol == option.symbol}

      # Add the option, unless disabled or removed
      cli_opts.unshift(option)
    end

    # Update usage of options for this command
    RDF::CLI::COMMANDS[cmd.to_sym].fetch(:option_use, {}).each do |sym, use|
      if opt = cli_opts.find {|cli_opt| cli_opt.symbol == sym}
        opt.use = use
      end
    end
  end

  cli_opts.each do |cli_opt|
    next if opts.key?(cli_opt.symbol)
    on_args = cli_opt.on || []
    on_args << cli_opt.description if cli_opt.description
    options.on(*on_args) do |arg|
      opts[cli_opt.symbol] = cli_opt.call(arg, options)
    end
  end

  if format == :json
    # Return options
    cli_opts.map(&:to_hash)
  else
    options.banner = "Usage: #{self.basename} command+ [options] [args...]"

    options.on_tail('-V', '--version', 'Display the RDF.rb version and exit.') do
      puts RDF::VERSION; exit(0)
    end

    show_help = false
    options.on_tail("-h", "--help", "Show this message") do
      show_help = true
    end

    begin
      args = options.parse!(args)
    rescue OptionParser::InvalidOption, OptionParser::InvalidArgument, ArgumentError => e
      abort e
    end

    # Make sure options are processed first
    if show_help
      self.usage(options); exit(0)
    end

    options.args = cmds + args
    options
  end
end
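
The core of this method is registering each option's `on` arguments and description with OptionParser and collecting converted values in a hash. The sketch below reproduces only that step with the standard library; `DemoOption` is a hypothetical stand-in for RDF::CLI::Option, and the `--limit` option is invented for illustration.

```ruby
require 'optparse'

# Hypothetical stand-in for RDF::CLI::Option: a symbol, OptionParser `on`
# arguments, a description, and an optional converter for the raw argument.
DemoOption = Struct.new(:symbol, :on, :description, :converter) do
  def call(arg)
    converter ? converter.call(arg) : arg
  end
end

demo_opts = [
  DemoOption.new(:verbose, ["-v", "--verbose"], "Enable verbose output.", nil),
  DemoOption.new(:limit, ["--limit N"], "Maximum statements.", ->(v) { Integer(v) })
]

parser = OptionParser.new
values = {}

demo_opts.each do |opt|
  # The description is passed to OptionParser#on after the switch definitions
  on_args = opt.on + [opt.description]
  parser.on(*on_args) { |arg| values[opt.symbol] = opt.call(arg) }
end

# parse! removes recognized switches and returns what is left
rest = parser.parse!(%w(count --limit 10 -v data.nt))
```

What remains after `parse!` is the command name and the file argument, which is the kind of list exec then splits into commands and inputs.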

.parse(files, evaluate: nil, format: nil, encoding: Encoding::UTF_8, **options) {|reader| ... } ⇒ nil

Parse each file, $stdin, or the string specified in `options`, yielding a reader

Parameters:

  • files (Array<String>)
  • evaluate (String) (defaults to: nil)

    from command-line, rather than referenced file

  • format (Symbol) (defaults to: nil)

    (:ntriples) Reader symbol for finding reader

  • encoding (Encoding) (defaults to: Encoding::UTF_8)

    set on the input

  • options (Hash{Symbol => Object})

    sent to reader

Yields:

  • (reader)

Yield Parameters:

  • reader (RDF::Reader)

Returns:

  • (nil)


# File 'lib/rdf/cli.rb', line 690

def self.parse(files, evaluate: nil, format: nil, encoding: Encoding::UTF_8, **options, &block)
  if files.empty?
    # If files are empty, read from the `evaluate` string if given, otherwise from $stdin
    input = evaluate ? StringIO.new(evaluate) : $stdin
    input.set_encoding(encoding)
    if !format
      sample = input.read
      input.rewind
    end
    r = RDF::Reader.for(format || {sample: sample})
    raise ArgumentError, "Unknown format for evaluated input" unless r
    (@readers ||= []) << r
    r.new(input, **options) do |reader|
      yield(reader)
    end
  else
    options[:format] = format if format
    files.each do |file|
      RDF::Reader.open(file, **options) do |reader|
        (@readers ||= []) << reader.class.to_s
        yield(reader)
      end
    end
  end
end

.usage(options, cmd_opts: {}, banner: nil) ⇒ Object

Output usage message



# File 'lib/rdf/cli.rb', line 466

def self.usage(options, cmd_opts: {}, banner: nil)
  options.banner = banner if banner
  $stdout.puts options
  $stdout.puts "Note: available commands and options may be different depending on selected --input-format and/or --output-format."
  $stdout.puts "Available commands:\n\t#{self.commands(**options.options).join("\n\t")}"
  $stdout.puts "Available formats:\n\t#{(self.formats).join("\n\t")}"
end