Class: RDF::SAK::Context

Inherits:
  Object
Includes:
  Util, XML::Mixup
Defined in:
  lib/rdf/sak.rb

Defined Under Namespace
  Classes: Document

Constant Summary
- SKOS_HIER =
    [ { element: :subject,
        pattern: -> c, p { [nil, p, c] },
        preds: [RDF::Vocab::SKOS.broader, RDF::Vocab::SKOS.broaderTransitive],
      },
      { element: :object,
        pattern: -> c, p { [c, p, nil] },
        preds: [RDF::Vocab::SKOS.narrower, RDF::Vocab::SKOS.narrowerTransitive],
      } ]
- AUTHOR_SPEC =
    [ ['By:',            [RDF::Vocab::BIBO.authorList, RDF::Vocab::DC.creator]],
      ['With:',          [RDF::Vocab::BIBO.contributorList, RDF::Vocab::DC.contributor]],
      ['Edited by:',     [RDF::Vocab::BIBO.editorList, RDF::Vocab::BIBO.editor]],
      ['Translated by:', [RDF::Vocab::BIBO.translator]],
    ].freeze
- CONCEPTS =
    # generate skos concept schemes
    Util.all_related(RDF::Vocab::SKOS.Concept).to_set
- CSV_PRED =
    { audience:    RDF::Vocab::DC.audience,
      nonaudience: CI['non-audience'],
      subject:     RDF::Vocab::DC.subject,
      assumes:     CI.assumes,
      introduces:  CI.introduces,
      mentions:    CI.mentions,
    }
- DSD_SEQ =
%i[characters words blocks sections min low-quartile median high-quartile max mean sd].freeze
- TH_SEQ =
%w[Document Abstract Created Modified Characters Words Blocks Sections Min Q1 Median Q3 Max Mean SD].map { |t| { [t] => :th } }
Constants included from Util
Util::SCHEME_RANK, Util::XHTMLNS, Util::XHV, Util::XPATHNS
Instance Attribute Summary
-
#base ⇒ Object
readonly
Returns the value of attribute base.
-
#config ⇒ Object
readonly
Returns the value of attribute config.
-
#graph ⇒ Object
readonly
Returns the value of attribute graph.
Instance Method Summary
- #abbreviate(term, prefixes: @config[:prefixes], vocab: nil, noop: true, sort: true) ⇒ String
- #all_internal_docs(published: true, exclude: []) ⇒ Object
-
#all_of_type(rdftype, exclude: []) ⇒ Array
Obtain every subject that is rdf:type the given type or its subtypes.
-
#all_types ⇒ Array
Obtain everything in the graph that is an rdf:type of something.
-
#asserted_types(subject, type = nil) ⇒ Array
Obtain all and only the rdf:types directly asserted on the subject.
- #audiences_for(uuid, proximate: false, invert: false) ⇒ Object
-
#authors_for(subject, unique: false, contrib: false) ⇒ RDF::Value, Array
Assuming the subject is a thing that has authors, return the list of authors.
-
#canonical_uri(subject, unique: true, rdf: true, slugs: false, fragment: false) ⇒ RDF::URI, ...
Obtain the “best” dereferenceable URI for the subject.
-
#canonical_uuid(uri, unique: true, published: false) ⇒ RDF::URI, Array
Obtain the canonical UUID for the given URI.
-
#dates_for(subject, predicate: RDF::Vocab::DC.date, datatype: [RDF::XSD.date, RDF::XSD.dateTime]) ⇒ Array
Obtain dates for the subject as instances of Date(Time).
-
#formats_for(subject, predicate: RDF::Vocab::DC.format, datatype: [RDF::XSD.token]) ⇒ Array
Obtain any specified MIME types for the subject.
- #generate_atom_feed(id, published: true, related: []) ⇒ Object
- #generate_audience_csv(file = nil, published: true) ⇒ Object
- #generate_backlinks(subject, published: true, ignore: nil) ⇒ Object
- #generate_bibliography(id, published: true) ⇒ Object
- #generate_gone_map(published: false, docs: nil) ⇒ Object
- #generate_reading_list(subject, published: true) ⇒ Object
-
#generate_redirect_map(published: false, docs: nil) ⇒ Object
you know what, it’s entirely possible that these ought never be called individually and the work to get one would duplicate the work of getting the other, so maybe just do ‘em both at once.
-
#generate_rewrite_map(published: false, docs: nil) ⇒ Object
generate rewrite map(s).
- #generate_sitemap(published: true) ⇒ Object
-
#generate_slug_redirect_map(published: false, docs: nil) ⇒ Object
find all URIs/slugs that are not canonical, map them to slugs that are canonical.
- #generate_stats(published: true) ⇒ Object
- #generate_twitter_meta(subject) ⇒ Object
-
#generate_uuid_redirect_map(published: false, docs: nil) ⇒ Object
give me all UUIDs of all documents, filter for published if applicable.
-
#head_links(subject, struct: nil, nodes: nil, prefixes: {}, ignore: [], uris: {}, labels: {}, vocab: nil) ⇒ Object
generate indexes of books, not-books, and other external links.
- #head_meta(subject, struct: nil, nodes: nil, prefixes: {}, ignore: [], meta_names: {}, vocab: nil, lang: nil, xhtml: true) ⇒ Object
- #ingest_csv(file) ⇒ Object
-
#initialize(graph: nil, base: nil, config: nil, type: nil) ⇒ RDF::SAK::Context
constructor
Initialize a context.
-
#label_for(subject, candidates: nil, unique: true, type: nil, lang: nil, desc: false, alt: false) ⇒ Array
Obtain the most appropriate label(s) for the subject’s type(s).
-
#locate(uri) ⇒ Pathname
Locate the file in the source directory associated with the given URI.
-
#map_location(type) ⇒ Object
private?.
-
#objects_for(subject, predicate, entail: true, only: [], datatype: nil) ⇒ RDF::Term
Returns objects from the graph with entailment.
-
#prefixes ⇒ Hash
Get the prefix mappings from the configuration.
-
#published?(uri, circulated: false) ⇒ true, false
Determine whether the URI represents a published document.
-
#reachable(published: false) ⇒ Object
Get all “reachable” UUID-identified entities (subjects which are also objects).
-
#reading_lists(published: true) ⇒ Object
whoops lol we forgot the book list.
-
#replacements_for(subject, published: true) ⇒ Set
Find the terminal replacements for the given subject, if any exist.
-
#resolve_documents ⇒ Object
resolve documents from source.
- #resolve_file(path) ⇒ Object
-
#struct_for(subject, rev: false, only: [], uuids: false, canon: false) ⇒ Hash
Obtain a key-value structure for the given subject, optionally constraining the result by node type (:resource, :uri/:iri, :blank/:bnode, :literal).
- #sub_concepts(concept, extra: []) ⇒ Object
-
#subjects_for(predicate, object, entail: true, only: []) ⇒ RDF::Resource
Returns subjects from the graph with entailment.
-
#target_for(uri, published: false) ⇒ Pathname
Find a destination pathname for the document.
-
#visit(uri) ⇒ RDF::SAK::Context::Document
Visit (open) the document at the given URI.
- #write_feeds(type: RDF::Vocab::DCAT.Distribution, published: true) ⇒ Object
- #write_gone_map(published: false, docs: nil) ⇒ Object
-
#write_map_file(location, data) ⇒ Object
private?.
- #write_maps(published: true, docs: nil) ⇒ Object
- #write_reading_lists(published: true) ⇒ Object
- #write_redirect_map(published: false, docs: nil) ⇒ Object
-
#write_rewrite_map(published: false, docs: nil) ⇒ Object
public again.
- #write_sitemap(published: true) ⇒ Object
- #write_stats(published: true) ⇒ Object
-
#write_xhtml(published: true) ⇒ Object
write public and private variants to target.
Methods included from Util
#all_related, asserted_types, base_for, canonical_uri, canonical_uuid, cmp_label, #cmp_resource, #coerce_node_spec, dates_for, #dehydrate, #get_base, #get_prefixes, #invert_struct, label_for, #modernize, #node_matches?, objects_for, #predicate_set, #prefix_subset, #prepare_collation, published?, #rehydrate, #reindent, replacements_for, #resolve_curie, #smush_struct, #split_pp, #split_pp2, #split_qp, struct_for, #subject_for, subjects_for, #subtree, #terminal_slug, #title_tag, traverse_links, #type_strata, #uri_pp
Constructor Details
#initialize(graph: nil, base: nil, config: nil, type: nil) ⇒ RDF::SAK::Context
Initialize a context.
# File 'lib/rdf/sak.rb', line 229

def initialize graph: nil, base: nil, config: nil, type: nil
  # RDF::Reasoner.apply(:rdfs, :owl)
  @config = coerce_config config

  graph ||= @config[:graph] if @config[:graph]
  base  ||= @config[:base]  if @config[:base]

  @graph  = coerce_graph graph, type: type
  @base   = RDF::URI.new base.to_s if base
  @ucache = RDF::Util::Cache.new(-1)
  @scache = {} # wtf rdf util cache doesn't like booleans
end
Instance Attribute Details
#base ⇒ Object (readonly)
Returns the value of attribute base.
# File 'lib/rdf/sak.rb', line 218

def base
  @base
end
#config ⇒ Object (readonly)
Returns the value of attribute config.
# File 'lib/rdf/sak.rb', line 218

def config
  @config
end
#graph ⇒ Object (readonly)
Returns the value of attribute graph.
# File 'lib/rdf/sak.rb', line 218

def graph
  @graph
end
Instance Method Details
#abbreviate(term, prefixes: @config[:prefixes], vocab: nil, noop: true, sort: true) ⇒ String
# File 'lib/rdf/sak.rb', line 260

def abbreviate term, prefixes: @config[:prefixes],
    vocab: nil, noop: true, sort: true
  super term, prefixes: prefixes || {}, vocab: vocab, noop: noop, sort: sort
end
#all_internal_docs(published: true, exclude: []) ⇒ Object
# File 'lib/rdf/sak.rb', line 1173

def all_internal_docs published: true, exclude: []
  # find all UUIDs that are documents
  docs = all_of_type(RDF::Vocab::FOAF.Document,
    exclude: exclude).select { |x| x =~ /^urn:uuid:/ }

  # prune out all but the published documents if specified
  if published
    p = RDF::Vocab::BIBO.status
    o = RDF::Vocabulary.find_term(
      'http://purl.org/ontology/bibo/status/published')
    docs = docs.select do |s|
      @graph.has_statement? RDF::Statement(s, p, o)
    end
  end

  docs
end
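The published-document pruning above boils down to a membership test against bibo:status statements. A minimal plain-Ruby sketch, with the graph mocked as a Set of subject/predicate/object arrays (the hash-based graph and the stand-in URIs here are illustrative, not the real RDF::Graph API):

```ruby
require 'set'

# stand-ins for the predicate and the published-status term
STATUS    = 'bibo:status'
PUBLISHED = 'http://purl.org/ontology/bibo/status/published'

# mock graph: a Set of [subject, predicate, object] triples
graph = Set[
  ['urn:uuid:aaa', STATUS, PUBLISHED],
  ['urn:uuid:bbb', STATUS, 'draft'],
]
docs = ['urn:uuid:aaa', 'urn:uuid:bbb', 'urn:uuid:ccc']

# keep only docs with an explicit published status statement
published = docs.select { |s| graph.include? [s, STATUS, PUBLISHED] }
# published == ['urn:uuid:aaa']
```

Documents with no status statement at all (like the third one here) are pruned along with drafts, which matches the select-on-statement behaviour above.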
#all_of_type(rdftype, exclude: []) ⇒ Array
Obtain every subject that is rdf:type the given type or its subtypes.
# File 'lib/rdf/sak.rb', line 295

def all_of_type rdftype, exclude: []
  exclude = term_list exclude
  t = RDF::Vocabulary.find_term(rdftype) or raise "No type #{rdftype.to_s}"
  out = []
  (all_types & (t) - exclude).each do |type|
    out += @graph.query([nil, RDF.type, type]).subjects
  end

  out.uniq
end
#all_types ⇒ Array
Obtain everything in the graph that is an rdf:type of something.
# File 'lib/rdf/sak.rb', line 285

def all_types
  @graph.query([nil, RDF.type, nil]).objects.uniq
end
#asserted_types(subject, type = nil) ⇒ Array
Obtain all and only the rdf:types directly asserted on the subject.
# File 'lib/rdf/sak.rb', line 313

def asserted_types subject, type = nil
  Util.asserted_types @graph, subject, type
end
#audiences_for(uuid, proximate: false, invert: false) ⇒ Object
# File 'lib/rdf/sak.rb', line 511

def audiences_for uuid, proximate: false, invert: false
  p = invert ? CI['non-audience'] : RDF::Vocab::DC.audience
  return @graph.query([uuid, p, nil]).objects if proximate

  out = []
  @graph.query([uuid, p, nil]).objects.each do |o|
    out += sub_concepts o
  end

  out
end
#authors_for(subject, unique: false, contrib: false) ⇒ RDF::Value, Array
Assuming the subject is a thing that has authors, return the list of authors. Try bibo:authorList first for an explicit ordering, then continue to the various other predicates.
# File 'lib/rdf/sak.rb', line 423

def authors_for subject, unique: false, contrib: false
  Util.authors_for @graph, subject, unique: unique, contrib: contrib
end
#canonical_uri(subject, unique: true, rdf: true, slugs: false, fragment: false) ⇒ RDF::URI, ...
Obtain the “best” dereferenceable URI for the subject. Optionally returns all candidates.
# File 'lib/rdf/sak.rb', line 341

def canonical_uri subject,
    unique: true, rdf: true, slugs: false, fragment: false
  Util.canonical_uri @graph, subject, base: @base,
    unique: unique, rdf: rdf, slugs: slugs, fragment: fragment
end
#canonical_uuid(uri, unique: true, published: false) ⇒ RDF::URI, Array
Obtain the canonical UUID for the given URI.
# File 'lib/rdf/sak.rb', line 325

def canonical_uuid uri, unique: true, published: false
  Util.canonical_uuid @graph, uri, unique: unique,
    published: published, scache: @scache, ucache: @ucache, base: @base
end
#dates_for(subject, predicate: RDF::Vocab::DC.date, datatype: [RDF::XSD.date, RDF::XSD.dateTime]) ⇒ Array
Obtain dates for the subject as instances of Date(Time). This is just shorthand for a common application of objects_for.
# File 'lib/rdf/sak.rb', line 394

def dates_for subject, predicate: RDF::Vocab::DC.date,
    datatype: [RDF::XSD.date, RDF::XSD.dateTime]
  Util.dates_for @graph, subject, predicate: predicate, datatype: datatype
end
#formats_for(subject, predicate: RDF::Vocab::DC.format, datatype: [RDF::XSD.token]) ⇒ Array
Obtain any specified MIME types for the subject. Just shorthand for a common application of objects_for.
# File 'lib/rdf/sak.rb', line 408

def formats_for subject, predicate: RDF::Vocab::DC.format,
    datatype: [RDF::XSD.token]
  Util.objects_for @graph, subject, predicate: predicate, datatype: datatype
end
#generate_atom_feed(id, published: true, related: []) ⇒ Object
# File 'lib/rdf/sak.rb', line 1191

def generate_atom_feed id, published: true, related: []
  raise 'ID must be a resource' unless id.is_a? RDF::Resource

  # prepare relateds
  raise 'related must be an array' unless related.is_a? Array
  related -= [id]

  # feed = struct_for id

  faudy = audiences_for id
  faudn = audiences_for id, invert: true
  faudy -= faudn

  docs = all_internal_docs published: published

  # now we create a hash keyed by uuid containing the metadata
  authors = {}
  titles  = {}
  dates   = {}
  entries = {}
  latest  = nil
  docs.each do |uu|
    # basically make a jsonld-like structure
    # rsrc = struct_for uu

    indexed = objects_for uu, RDF::SAK::CI.indexed, only: :literal
    next if !indexed.empty? and indexed.any? { |f| f == false }

    # get id (got it already duh)

    # get audiences
    audy = audiences_for uu, proximate: true
    audn = audiences_for uu, proximate: true, invert: true

    # warn "#{faudy.to_s} & #{faud"

    skip = false
    if audy.empty?
      # an unspecified audience implies "everybody", but if the
      # feed's audience *is* specified, then it's not for everybody
      skip = true unless faudy.empty?
    else
      # if document audience matches feed non-audience, disqualify
      skip = true unless (faudn & audy).empty?

      # absence of an explicit feed audience implies "everybody"
      if faudy.empty?
        # if document audience minus feed non-audience has
        # members, re-qualify
        skip = false unless (audy - faudn).empty?
      else
        # if document audience matches feed audience, re-qualify
        skip = false unless (faudy & audy).empty?
      end
    end

    # if document non-audience matches feed audience, re-disqualify
    skip = true if !(audn.empty? || faudy.empty?) && !(faudy & audn).empty?

    next if skip

    canon = URI.parse(canonical_uri(uu).to_s)

    xml = { '#entry' => [
      { '#link' => nil, rel: :alternate, href: canon, type: 'text/html' },
      { '#id' => uu.to_s }
    ] }

    # get published date first
    published = (objects_for uu,
      [RDF::Vocab::DC.issued, RDF::Vocab::DC.created],
      datatype: RDF::XSD.dateTime)[0]

    # get latest updated date
    updated = (objects_for uu, RDF::Vocab::DC.modified,
      datatype: RDF::XSD.dateTime).sort[-1]
    updated ||= published || RDF::Literal::DateTime.new(DateTime.now)
    updated = Time.parse(updated.to_s).utc
    latest  = updated if !latest or latest < updated

    xml['#entry'].push({ '#updated' => updated.iso8601 })

    if published
      published = Time.parse(published.to_s).utc
      xml['#entry'].push({ '#published' => published.iso8601 })
      dates[uu] = [published, updated]
    else
      dates[uu] = [updated, updated]
    end

    # get author(s)
    al = []
    authors_for(uu).each do |a|
      unless authors[a]
        n = label_for a
        x = authors[a] = { '#author' => [{ '#name' => n[1].to_s }] }
        hp = @graph.first_object [a, RDF::Vocab::FOAF.homepage, nil]
        hp ||= canonical_uri a
        x['#author'].push({ '#uri' => hp.to_s }) if hp
      end
      al.push authors[a]
    end
    xml['#entry'] += al unless al.empty?

    # get title (note unshift)
    if (t = label_for uu)
      titles[uu] = t[1].to_s
      xml['#entry'].unshift({ '#title' => t[1].to_s })
    else
      titles[uu] = uu.to_s
    end

    # get abstract
    if (d = label_for uu, desc: true)
      xml['#entry'].push({ '#summary' => d[1].to_s })
    end

    entries[uu] = xml
  end

  # note we overwrite the entries hash here with a sorted array
  entrycmp = -> a, b {
    # first we sort by published date
    p = dates[a][0] <=> dates[b][0]
    # if the published dates are the same, sort by updated date
    u = dates[a][1] <=> dates[b][1]
    # to break any ties, finally sort by title
    p == 0 ? u == 0 ? titles[a] <=> titles[b] : u : p }
  entries = entries.values_at(
    *entries.keys.sort { |a, b| entrycmp.call(a, b) })
  # ugggh god forgot the asterisk and lost an hour

  # now we punt out the doc

  preamble = [
    { '#id' => id.to_s },
    { '#updated' => latest.iso8601 },
    { '#generator' => 'RDF::SAK', version: RDF::SAK::VERSION,
      uri: "https://github.com/doriantaylor/rb-rdf-sak" },
    { nil => :link, rel: :self, type: 'application/atom+xml',
      href: canonical_uri(id) },
    { nil => :link, rel: :alternate, type: 'text/html', href: @base },
  ] + related.map do |r|
    { nil => :link, rel: :related, type: 'application/atom+xml',
      href: canonical_uri(r) }
  end

  if (t = label_for id)
    preamble.unshift({ '#title' => t[1].to_s })
  end

  if (r = @graph.first_literal [id, RDF::Vocab::DC.rights, nil])
    rh = { '#rights' => r.to_s, type: :text }
    rh['xml:lang'] = r.language if r.has_language?
    preamble.push rh
  end

  markup(spec: { '#feed' => preamble + entries,
    xmlns: 'http://www.w3.org/2005/Atom' }).document
end
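The audience-qualification cascade in the middle of this method can be distilled into a pure function over sets. This is an illustrative restatement of the skip logic, not the library's API; the parameter names follow the method's locals (doc audience/non-audience, feed audience/non-audience):

```ruby
require 'set'

# skip_entry? mirrors generate_atom_feed's qualification rules:
# an unspecified doc audience means "everybody" unless the feed names
# one; a doc audience hitting the feed's non-audience disqualifies;
# a doc non-audience hitting the feed's audience re-disqualifies.
def skip_entry? audy, audn, faudy, faudn
  skip = false
  if audy.empty?
    skip = true unless faudy.empty?
  else
    skip = true unless (faudn & audy).empty?
    if faudy.empty?
      skip = false unless (audy - faudn).empty?
    else
      skip = false unless (faudy & audy).empty?
    end
  end
  skip = true if !(audn.empty? || faudy.empty?) && !(faudy & audn).empty?
  skip
end

# feed for 'ops' skips a document with no stated audience
skip_entry?(Set[], Set[], Set['ops'], Set[])        # => true
# feed non-audience 'dev' vetoes a 'dev' document
skip_entry?(Set['dev'], Set[], Set[], Set['dev'])   # => true
# an audience-free feed keeps an audience-bearing document
skip_entry?(Set['dev'], Set[], Set[], Set[])        # => false
```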
#generate_audience_csv(file = nil, published: true) ⇒ Object
# File 'lib/rdf/sak.rb', line 959

def generate_audience_csv file = nil, published: true
  require 'csv'
  file = coerce_to_path_or_io file if file
  lab = {}

  out = all_internal_docs(published: published,
    exclude: RDF::Vocab::FOAF.Image).map do |s|
    u = canonical_uri s
    x = struct_for s
    c = x[RDF::Vocab::DC.created] ? x[RDF::Vocab::DC.created][0] : nil
    _, t = label_for s, candidates: x
    _, d = label_for s, candidates: x, desc: true

    # # audience(s)
    # a = objects_for(s, RDF::Vocab::DC.audience).map do |au|
    #   next lab[au] if lab[au]
    #   _, al = label_for au
    #   lab[au] = al
    # end.map(&:to_s).sort.join '; '

    # # explicit non-audience(s)
    # n = objects_for(s, RDF::SAK::CI['non-audience']).map do |au|
    #   next lab[au] if lab[au]
    #   _, al = label_for au
    #   lab[au] = al
    # end.map(&:to_s).sort.join '; '

    # audience and non-audience
    a, n = [RDF::Vocab::DC.audience, CI['non-audience']].map do |ap|
      objects_for(s, ap).map do |au|
        next lab[au] if lab[au]
        _, al = label_for au
        lab[au] = al
      end.map(&:to_s).sort.join '; '
    end

    # concepts???
    concepts = [RDF::Vocab::DC.subject, CI.introduces,
                CI.assumes, CI.mentions].map do |pred|
      objects_for(s, pred, only: :resource).map do |o|
        con = self.objects_for(o, RDF.type).to_set & CONCEPTS
        next if con.empty?
        next lab[o] if lab[o]
        _, ol = label_for o
        lab[o] = ol
      end.compact.map(&:to_s).sort.join '; '
    end

    [s, u, c, t, d, a, n].map(&:to_s) + concepts
  end.sort { |a, b| a[2] <=> b[2] }

  out.unshift ['ID', 'URL', 'Created', 'Title', 'Description',
    'Audience', 'Non-Audience', 'Subject', 'Introduces',
    'Assumes', 'Mentions']

  if file
    # don't open until now
    file = file..open('wb') unless file.is_a? IO
    csv = CSV.new file
    out.each { |x| csv << x }
    file.flush
  end

  out
end
#generate_backlinks(subject, published: true, ignore: nil) ⇒ Object
# File 'lib/rdf/sak.rb', line 678

def generate_backlinks subject, published: true, ignore: nil
  uri = canonical_uri(subject, rdf: false) || URI(uri_pp subject)
  ignore ||= Set.new
  raise 'ignore must be amenable to a set' unless ignore.respond_to? :to_set
  ignore = ignore.to_set
  nodes  = {}
  labels = {}
  types  = {}
  @graph.query([nil, nil, subject]).each do |stmt|
    next if ignore.include?(sj = stmt.subject)
    preds = nodes[sj] ||= Set.new
    preds << (pr = stmt.predicate)
    types[sj]  ||= asserted_types sj
    labels[sj] ||= label_for sj
    labels[pr] ||= label_for pr
  end

  # prune out
  nodes.select! { |k, _| published? k } if published
  return if nodes.empty?

  li = nodes.sort do |a, b|
    cmp_label a[0], b[0], labels: labels
  end.map do |rsrc, preds|
    cu  = canonical_uri(rsrc, rdf: false) or next
    lab = labels[rsrc] || [nil, rsrc]
    lp  = abbreviate(lab[0]) if lab[0]
    ty  = abbreviate(types[rsrc]) if types[rsrc]

    { [{ [{ [lab[1].to_s] => :span, property: lp }] => :a,
      href: uri.route_to(cu), typeof: ty, rev: abbreviate(preds) }] => :li }
  end.compact

  { [{ li => :ul }] => :nav }
end
#generate_bibliography(id, published: true) ⇒ Object
# File 'lib/rdf/sak.rb', line 760

def generate_bibliography id, published: true
  id     = canonical_uuid id
  uri    = canonical_uri id
  struct = struct_for id
  nodes  = Set[id] + smush_struct(struct)
  bodynodes = Set.new
  parts     = {}
  referents = {}
  labels    = { id => label_for(id, candidates: struct) }
  canon     = {}

  # uggh put these somewhere
  preds = {
    hp:    predicate_set(RDF::Vocab::DC.hasPart),
    sa:    predicate_set(RDF::RDFS.seeAlso),
    canon: predicate_set([RDF::OWL.sameAs, CI.canonical]),
    ref:   predicate_set(RDF::Vocab::DC.references),
    al:    predicate_set(RDF::Vocab::BIBO.contributorList),
    cont:  predicate_set(RDF::Vocab::DC.contributor),
  }

  # collect up all the parts (as in dct:hasPart)
  objects_for(id, preds[:hp], entail: false, only: :resource).each do |part|
    bodynodes << part

    # gather up all the possible alias urls this thing can have
    sa = ([part] + objects_for(part,
      preds[:sa], only: :uri, entail: false)).map do |x|
      [x] + subjects_for(preds[:canon], x, only: :uri, entail: false)
    end.flatten.uniq

    # collect all the referents
    reftmp = {}
    sa.each do |u|
      subjects_for preds[:ref], u, only: :uri, entail: false do |s, *p|
        reftmp[s] ||= Set.new
        reftmp[s] += p[0].to_set
      end
    end

    # if we are producing a list of references identified by only
    # published resources, prune out all the unpublished referents
    reftmp.select! { |x, _| published? x } if published

    # unconditionally skip this item if nothing references it
    next if reftmp.empty?

    referents[part] = reftmp

    reftmp.each do |r, _|
      labels[r] ||= label_for r
      canon[r]  ||= canonical_uri r
    end

    # collect all the authors and author lists
    objects_for(part, preds[:al], only: :resource, entail: false) do |o|
      RDF::List.new(subject: o, graph: @graph).each do |a|
        labels[a] ||= label_for a
      end
    end
    objects_for(part, preds[:cont], only: :uri, entail: false) do |a|
      labels[a] ||= label_for a
    end

    ps = struct_for part
    labels[part] = label_for part, candidates: ps
    nodes |= smush_struct ps

    parts[part] = ps
  end

  bmap = prepare_collation struct
  pf = -> x { abbreviate bmap[x.literal? ? :literals : :resources][x] }

  body = []
  parts.sort { |a, b| cmp_label a[0], b[0], labels: labels }.each do |k, v|
    mapping = prepare_collation v
    p = -> x { abbreviate mapping[x.literal? ? :literals : :resources][x] }
    t = abbreviate mapping[:types]

    lp  = label_for k, candidates: v
    h2c = [lp[1].to_s]
    h2  = { h2c => :h2 }

    cu  = canonical_uri k
    rel = nil
    unless cu.scheme.downcase.start_with? 'http'
      if sa = v[RDF::RDFS.seeAlso]
        rel = p.call sa[0]
        cu  = canonical_uri sa[0]
      else
        cu = nil
      end
    end

    if cu
      h2c[0] = { [lp[1].to_s] => :a, rel: rel,
        property: p.call(lp[1]), href: cu.to_s }
    else
      h2[:property] = p.call(lp[1])
    end

    # authors &c
    # authors contributors editors translators
    al = []
    AUTHOR_SPEC.each do |label, pl|
      dd = []
      seen = Set.new
      pl.each do |pred|
        # first check if the struct has the predicate
        next unless v[pred]
        li = []
        ul = { li => :ul, rel: abbreviate(pred) }
        v[pred].sort { |a, b| cmp_label a, b, labels: labels }.each do |o|
          # check if this is a list
          tl = RDF::List.new subject: o, graph: @graph
          if tl.empty? and !seen.include? o
            seen << o
            lab = labels[o] ? { [labels[o][1]] => :span,
              property: abbreviate(labels[o][0]) } : o
            li << { [lab] => :li, resource: o }
          else
            # XXX this will actually not be right if there are
            # multiple lists but FINE FOR NOW
            ul[:inlist] ||= ''
            tl.each do |a|
              seen << a
              lab = labels[a] ? { [labels[a][1]] => :span,
                property: abbreviate(labels[a][0]) } : a
              li << { [lab] => :li, resource: a }
            end
          end
        end
        dd << ul unless li.empty?
      end
      al += [{ [label] => :dt }, { dd => :dd }] unless dd.empty?
    end

    # ref list
    rl = referents[k].sort do |a, b|
      cmp_label a[0], b[0], labels: labels
    end.map do |ref, pset|
      lab = labels[ref] ? { [labels[ref][1]] => :span,
        property: abbreviate(labels[ref][0]) } : ref
      { [{ [lab] => :a, rev: abbreviate(pset), href: canon[ref] }] => :li }
    end

    contents = [h2, { al + [{ ['Referenced in:'] => :dt },
      { [{ rl => :ul }] => :dd }] => :dl }]

    body << { contents => :section,
      rel: pf.call(k), resource: k.to_s, typeof: t }
  end

  # prepend abstract to body if it exists
  abs = label_for id, candidates: struct, desc: true
  if abs
    tag = { '#p' => abs[1], property: abbreviate(abs[0]) }
    body.unshift tag
  end

  # add labels to nodes
  nodes += smush_struct labels

  # get prefixes
  pfx = prefix_subset prefixes, nodes

  # get title tag
  title = title_tag labels[id][0], labels[id][1],
    prefixes: prefixes, lang: 'en'

  # get links
  link = head_links id, struct: struct, ignore: bodynodes,
    labels: labels, vocab: XHV

  # get metas
  mn = {}
  mn[abs[1]] = :description if abs
  mi = Set.new
  mi << labels[id][1] if labels[id]
  meta = head_meta id, struct: struct, lang: 'en', ignore: mi,
    meta_names: mn, vocab: XHV
  meta += generate_twitter_meta(id) || []

  xhtml_stub(base: uri, prefix: pfx, lang: 'en', title: title, vocab: XHV,
    link: link, meta: meta, transform: @config[:transform],
    body: { body => :body, about: '',
      typeof: abbreviate(struct[RDF::RDFV.type] || []) }).document
end
#generate_gone_map(published: false, docs: nil) ⇒ Object
# File 'lib/rdf/sak.rb', line 1483

def generate_gone_map published: false, docs: nil
  # published is a no-op for this one because these docs are by
  # definition not published
  docs ||= reachable published: false
  p    = RDF::Vocab::BIBO.status
  base = URI(@base.to_s)
  out  = {}
  docs.select { |s|
    @graph.has_statement? RDF::Statement(s, p, CI.retired) }.each do |doc|
    canon = canonical_uri doc, rdf: false
    next unless base.route_to(canon).relative?
    canon = canon.request_uri.delete_prefix '/'
    # value of the gone map doesn't matter
    out[canon] = canon
  end

  out
end
#generate_reading_list(subject, published: true) ⇒ Object
# File 'lib/rdf/sak.rb', line 1560

def generate_reading_list subject, published: true
  # struct = struct_for subject

  # find all the books, sort them by title

  # for each book, give title, authors, inbound references

  # punt out xhtml
end
#generate_redirect_map(published: false, docs: nil) ⇒ Object
you know what, it’s entirely possible that these ought never be called individually and the work to get one would duplicate the work of getting the other, so maybe just do ‘em both at once
# File 'lib/rdf/sak.rb', line 1478

def generate_redirect_map published: false, docs: nil
  generate_uuid_redirect_map(published: published, docs: docs).merge(
    generate_slug_redirect_map(published: published, docs: docs))
end
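Because the two maps are combined with Hash#merge, slug-derived entries win on any key collision with UUID-derived entries. A toy illustration with made-up keys and values:

```ruby
# Hash#merge: the receiver's entries are overridden by the argument's
# on duplicate keys, so the slug redirect map takes precedence here.
uuid_map = { 'old-slug' => '/aaaa-uuid', 'stale' => '/bbbb-uuid' }
slug_map = { 'old-slug' => '/new-slug' }

combined = uuid_map.merge(slug_map)
# combined['old-slug'] == '/new-slug'   (slug map wins)
# combined['stale']    == '/bbbb-uuid'  (uuid entry survives)
```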
#generate_rewrite_map(published: false, docs: nil) ⇒ Object
generate rewrite map(s)
# File 'lib/rdf/sak.rb', line 1372

def generate_rewrite_map published: false, docs: nil
  docs ||= reachable published: published
  base = URI(@base.to_s)
  rwm  = {}
  docs.each do |doc|
    tu = URI(doc.to_s)
    cu = canonical_uri doc, rdf: false
    next unless tu.respond_to?(:uuid) and cu.respond_to?(:request_uri)

    # skip external links obvs
    next unless base.route_to(cu).relative?

    # skip /uuid form
    cp = cu.request_uri.delete_prefix '/'
    next if cu.host == base.host and tu.uuid == cp

    rwm[cp] = tu.uuid
  end

  rwm
end
#generate_sitemap(published: true) ⇒ Object
# File 'lib/rdf/sak.rb', line 1126

def generate_sitemap published: true
  urls = {}

  # do feeds separately
  feeds = all_of_type RDF::Vocab::DCAT.Distribution
  # feeds.select! { |f| published? f } if published
  feeds.each do |f|
    uri = canonical_uri(f)
    f = generate_atom_feed f, published: published, related: feeds
    mt = f.at_xpath('/atom:feed/atom:updated[1]/text()',
      { atom: 'http://www.w3.org/2005/Atom' })
    urls[uri] = { [{ [uri.to_s] => :loc }, { [mt] => :lastmod }] => :url }
  end

  # build up hash of urls
  all_internal_docs(published: published).each do |doc|
    next if asserted_types(doc).include? RDF::Vocab::FOAF.Image
    uri = canonical_uri(doc)
    next unless uri. && @base && uri. == base.
    mods = objects_for(doc, [RDF::Vocab::DC.created,
      RDF::Vocab::DC.modified, RDF::Vocab::DC.issued],
      datatype: RDF::XSD.dateTime).sort
    nodes = [{ [uri.to_s] => :loc }]
    nodes << { [mods[-1].to_s] => :lastmod } unless mods.empty?
    urls[uri] = { nodes => :url }
  end

  urls = urls.sort.map { |_, v| v }

  markup(spec: { urls => :urlset,
    xmlns: 'http://www.sitemaps.org/schemas/sitemap/0.9' }).document
end
#generate_slug_redirect_map(published: false, docs: nil) ⇒ Object
find all URIs/slugs that are not canonical, map them to slugs that are canonical
# File 'lib/rdf/sak.rb', line 1423

def generate_slug_redirect_map published: false, docs: nil
  docs ||= reachable published: published
  base = URI(@base.to_s)

  # for redirects we collect all the docs, plus all their URIs,
  # separate canonical from the rest

  # actually an easy way to do this is just harvest all the
  # multi-addressed docs, remove the first one, then ask for the
  # canonical uuid back,

  fwd = {}
  rev = {}
  out = {}

  docs.each do |doc|
    uris  = canonical_uri doc, unique: false, rdf: false
    canon = uris.shift
    next unless canon.respond_to? :request_uri

    # cache the forward direction
    fwd[doc] = canon

    unless uris.empty?
      uris.each do |uri|
        next unless uri.respond_to? :request_uri
        next if canon == uri
        next unless base.route_to(uri).relative?

        # warn "#{canon} <=> #{uri}"

        requri = uri.request_uri.delete_prefix '/'
        next if requri == '' ||
          requri =~ /^[0-9a-f]{8}(?:-[0-9a-f]{4}){4}[0-9a-f]{8}$/

        # cache the reverse direction
        rev[uri] = requri
      end
    end
  end

  rev.each do |uri, requri|
    if (doc = canonical_uuid(uri, published: published)) and
        fwd[doc] and fwd[doc] != uri
      out[requri] = fwd[doc].to_s
    end
  end

  out
end
#generate_stats(published: true) ⇒ Object
# File 'lib/rdf/sak.rb', line 1585

def generate_stats published: true
  out = {}
  all_of_type(QB.DataSet).map do |s|
    base  = canonical_uri s, rdf: false
    types = abbreviate asserted_types(s)
    title = if t = label_for(s)
              [t[1].to_s, abbreviate(t[0])]
            end
    cache = {}
    subjects_for(QB.dataSet, s, only: :resource).each do |o|
      if d = objects_for(o, CI.document, only: :resource).first
        if !published or published?(d)
          # include a "sort" time that defaults to epoch zero
          c = cache[o] ||= {
            doc: d, stime: Time.at(0).getgm, struct: struct_for(o) }

          if t = label_for(d)
            c[:title] = t
          end
          if a = label_for(d, desc: true)
            c[:abstract] = a
          end
          if ct = objects_for(d, RDF::Vocab::DC.created,
            datatype: RDF::XSD.dateTime).first
            c[:stime] = c[:ctime] = ct.object.to_time.getgm
          end
          if mt = objects_for(d, RDF::Vocab::DC.modified,
            datatype: RDF::XSD.dateTime)
            c[:mtime] = mt.map { |m| m.object.to_time.getgm }.sort
            c[:stime] = c[:mtime].last unless mt.empty?
          end
        end
      end
    end

    # sort lambda closure
    sl = -> a, b do
      x = cache[b][:stime] <=> cache[a][:stime]
      return x unless x == 0
      x = cache[b][:ctime] <=> cache[a][:ctime]
      return x unless x == 0
      ta = cache[a][:title] || Array.new(2, cache[a][:uri])
      tb = cache[b][:title] || Array.new(2, cache[b][:uri])
      ta[1].to_s <=> tb[1].to_s
    end

    rows = []
    cache.keys.sort(&sl).each do |k|
      c    = cache[k]
      href = base.route_to canonical_uri(c[:doc], rdf: false)
      dt   = abbreviate asserted_types(c[:doc])
      uu   = URI(k.to_s).uuid
      nc   = UUID::NCName.to_ncname uu, version: 1
      tp, tt = c[:title] || []
      ab = if c[:abstract]
             { [c[:abstract][1].to_s] => :th, about: href,
               property: abbreviate(c[:abstract].first) }
           else
             { [] => :th }
           end

      td = [{ { { [tt.to_s] => :span, property: abbreviate(tp) } => :a,
               rel: 'ci:document', href: href } => :th },
            ab,
            { [c[:ctime].iso8601] => :th, property: 'dct:created',
              datatype: 'xsd:dateTime', about: href, typeof: dt },
            { c[:mtime].reverse.map { |m| { [m.iso8601] => :span,
                property: 'dct:modified', datatype: 'xsd:dateTime' } } => :th,
              about: href },
           ] + DSD_SEQ.map do |f|
        h = []
        x = { h => :td }
        p = CI[f]
        if y = c[:struct][p] and !y.empty?
          h << y = y.first
          x[:property] = abbreviate p
          x[:datatype] = abbreviate y.datatype if y.datatype?
        end
        x
      end

      rows << { td => :tr, id: nc, about: "##{nc}",
        typeof: 'qb:Observation' }
    end

    out[s] = xhtml_stub(base: base, title: title,
      transform: config[:transform], attr: { about: '', typeof: types },
      prefix: prefixes, content: {
        [{ [{ [{ ['About'] => :th, colspan: 4 },
               { ['Counts'] => :th, colspan: 4 },
               { ['Words per Block'] => :th, colspan: 7 }] => :tr },
            { TH_SEQ => :tr }] => :thead },
         { rows => :tbody, rev: 'qb:dataSet' }] => :table }).document
  end

  out
end
#generate_twitter_meta(subject) ⇒ Object
# File 'lib/rdf/sak.rb', line 715

def generate_twitter_meta subject
  # get author
  author = authors_for(subject, unique: true) or return

  # get author's twitter account
  twitter = objects_for(author, RDF::Vocab::FOAF.account,
    only: :resource).select { |t| t.to_s =~ /twitter\.com/
  }.sort.first or return
  twitter = URI(twitter.to_s).path.split(/\/+/)[1]
  twitter = ?@ + twitter unless twitter.start_with? ?@

  # get title
  title = label_for(subject) or return

  out = [
    { nil => :meta, name: 'twitter:card', content: :summary },
    { nil => :meta, name: 'twitter:site', content: twitter },
    { nil => :meta, name: 'twitter:title', content: title[1].to_s }
  ]

  # get abstract
  if desc = label_for(subject, desc: true)
    out.push({ nil => :meta, name: 'twitter:description',
      content: desc[1].to_s })
  end

  # get image (foaf:depiction)
  img = objects_for(subject, RDF::Vocab::FOAF.depiction, only: :resource)
  unless img.empty?
    img = img[0].to_s
    out.push({ nil => :meta, name: 'twitter:image', content: img })
    out[0][:content] = :summary_large_image
  end

  # return the appropriate xml-mixup structure
  out
end
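The handle is pulled from the account URL by taking the first path segment and prepending `@` if it is missing. A minimal sketch of just that step (the method name `twitter_handle` is hypothetical, for illustration):

```ruby
require 'uri'

# Extract an @-prefixed handle from a twitter.com account URL.
def twitter_handle account_url
  handle = URI(account_url).path.split(/\/+/)[1] # first path segment
  handle = ?@ + handle unless handle.start_with? ?@
  handle
end
```

Splitting on `/\/+/` rather than a single slash tolerates doubled slashes in the path, and an already-prefixed `@handle` passes through unchanged.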
#generate_uuid_redirect_map(published: false, docs: nil) ⇒ Object
Collect the UUIDs of all documents, filtering for published if applicable; find the "best" (relative) URL for each UUID and map the pair together.
# File 'lib/rdf/sak.rb', line 1399

def generate_uuid_redirect_map published: false, docs: nil
  docs ||= reachable published: published
  base = URI(@base.to_s)

  # keys are /uuid, values are the canonical URIs
  out = {}
  docs.each do |doc|
    tu = URI(doc.to_s)
    cu = canonical_uri doc, rdf: false
    next unless tu.respond_to?(:uuid) and cu.respond_to?(:request_uri)

    # skip /uuid form
    cp = cu.request_uri.delete_prefix '/'
    next if cu.host == base.host && tu.uuid == cp

    # all redirect links are absolute
    out[tu.uuid] = cu.to_s
  end
  out
end
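A sketch of the mapping logic with hypothetical data: UUID keys map to absolute canonical URLs, except when the canonical path is itself the `/uuid` form, which is skipped. (Stdlib `URI` has no `#uuid` accessor on URNs, so this sketch strips the prefix by hand.)

```ruby
require 'uri'

base = URI('https://example.com/')
pairs = {
  'urn:uuid:33f79465-ee52-4ba1-9ebc-8e823ef1b103' =>
    'https://example.com/my-essay',
  # canonical URL is the /uuid form itself, so it should be skipped:
  'urn:uuid:d65c1dff-b8ae-4e2c-bfc8-0a3a2b60d810' =>
    'https://example.com/d65c1dff-b8ae-4e2c-bfc8-0a3a2b60d810',
}

out = pairs.each_with_object({}) do |(urn, canon), acc|
  uuid = urn.delete_prefix 'urn:uuid:'
  cu   = URI(canon)
  next if cu.host == base.host &&
    uuid == cu.request_uri.delete_prefix('/')
  acc[uuid] = canon
end
# out has one entry: the first UUID => its canonical URL
```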
#head_links(subject, struct: nil, nodes: nil, prefixes: {}, ignore: [], uris: {}, labels: {}, vocab: nil) ⇒ Object
generate indexes of books, not-books, and other external links
# File 'lib/rdf/sak.rb', line 567

def head_links subject, struct: nil, nodes: nil, prefixes: {},
    ignore: [], uris: {}, labels: {}, vocab: nil

  raise 'ignore must be Array or Set' unless
    [Array, Set].any? { |c| ignore.is_a? c }

  struct ||= struct_for subject
  nodes  ||= invert_struct struct

  # make sure these are actually URI objects not RDF::URI
  uris = uris.transform_values { |v| URI(uri_pp v.to_s) }
  uri  = uris[subject] || canonical_uri(subject, rdf: false)

  ignore = ignore.to_set

  # output
  links = []

  nodes.reject { |n, _| ignore.include?(n) || !n.uri? }.each do |k, v|
    # first nuke rdf:type, that's never in there
    v = v.dup.delete RDF::RDFV.type
    next if v.empty?

    unless uris[k]
      cu = canonical_uri k
      uris[k] = cu || uri_pp(k.to_s)
    end

    # munge the url and make the tag
    rel = abbreviate v.to_a, vocab: vocab
    ru  = uri.route_to(uris[k])
    ln  = { nil => :link, rel: rel, href: ru.to_s }

    # add the title
    if lab = labels[k]
      ln[:title] = lab[1].to_s
    end

    # add type attribute
    unless (mts = formats_for k).empty?
      ln[:type] = mts.first.to_s

      if ln[:type] =~ /(java|ecma)script/i ||
          !(v.to_set & Set[RDF::Vocab::DC.requires]).empty?
        ln[:src] = ln.delete :href
        # make sure we pass in an empty string so there is a closing tag
        ln.delete nil
        ln[['']] = :script
      end
    end

    # finally add the link
    links.push ln
  end

  links.sort! do |a, b|
    # sort by rel, then by href
    # warn a.inspect, b.inspect
    s = 0
    [nil, :rel, :rev, :href, :title].each do |k|
      s = a.fetch(k, '').to_s <=> b.fetch(k, '').to_s
      break if s != 0
    end
    s
  end

  links
end
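The sort at the end is a cascading multi-key comparison over the tag hashes, ordering by `rel`, then `href`, and so on. A standalone sketch of that comparator over plain link hashes:

```ruby
links = [
  { nil => :link, rel: 'stylesheet', href: 'b.css' },
  { nil => :link, rel: 'alternate',  href: 'feed.xml' },
  { nil => :link, rel: 'stylesheet', href: 'a.css' },
]

# Compare key by key; the first unequal key decides the order.
links.sort! do |a, b|
  s = 0
  [nil, :rel, :rev, :href, :title].each do |k|
    s = a.fetch(k, '').to_s <=> b.fetch(k, '').to_s
    break if s != 0
  end
  s
end
# order is now: alternate/feed.xml, stylesheet/a.css, stylesheet/b.css
```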
#head_meta(subject, struct: nil, nodes: nil, prefixes: {}, ignore: [], meta_names: {}, vocab: nil, lang: nil, xhtml: true) ⇒ Object
# File 'lib/rdf/sak.rb', line 636

def head_meta subject, struct: nil, nodes: nil, prefixes: {},
    ignore: [], meta_names: {}, vocab: nil, lang: nil, xhtml: true

  raise 'ignore must be Array or Set' unless
    [Array, Set].any? { |c| ignore.is_a? c }

  struct ||= struct_for subject
  nodes  ||= invert_struct struct

  ignore = ignore.to_set

  meta = []
  nodes.select { |n, _| n.literal? && !ignore.include?(n) }.each do |k, v|
    rel = abbreviate v.to_a, vocab: vocab
    tag = { nil => :meta, property: rel, content: k.to_s }

    lang = (k.language? && k.language != lang ? k.language : nil) ||
      (k.datatype == RDF::XSD.string && lang ? '' : nil)
    if lang
      tag['xml:lang'] = lang if xhtml
      tag[:lang] = lang
    end

    tag[:datatype] = abbreviate k.datatype, vocab: XHV if k.datatype?
    tag[:name] = meta_names[k] if meta_names[k]

    meta << tag
  end

  meta.sort! do |a, b|
    s = 0
    [:about, :property, :datatype, :content, :name].each do |k|
      # warn a.inspect, b.inspect
      s = a.fetch(k, '').to_s <=> b.fetch(k, '').to_s
      break if s != 0
    end
    s
  end

  meta
end
#ingest_csv(file) ⇒ Object
# File 'lib/rdf/sak.rb', line 1034

def ingest_csv file
  file = coerce_to_path_or_io file

  require 'csv'

  # key mapper
  km = { uuid: :id, url: :uri }
  kt = -> (k) { km[k] || k }

  # grab all the concepts and audiences

  audiences = {}
  all_of_type(CI.Audience).map do |c|
    s = struct_for c

    # homogenize the labels
    lab = [false, true].map do |b|
      label_for(c, candidates: s, unique: false, alt: b).map { |x| x[1] }
    end.flatten.map { |x| x.to_s.strip.downcase }

    # we want all the keys to share the same set
    set = nil
    lab.each { |t| set = audiences[t] ||= set || Set.new }
    set << c
  end

  concepts = {}
  all_of_type(RDF::Vocab::SKOS.Concept).map do |c|
    s = struct_for c

    # homogenize the labels
    lab = [false, true].map do |b|
      label_for(c, candidates: s, unique: false, alt: b).map { |x| x[1] }
    end.flatten.map { |x| x.to_s.strip.downcase }

    # we want all the keys to share the same set
    set = nil
    lab.each { |t| set = concepts[t] ||= set || Set.new }
    set << c
  end

  data = CSV.read(file, headers: true,
    header_converters: :symbol).map do |o|
    o = o.to_h.transform_keys(&kt)
    s = canonical_uuid(o.delete :id) or next # LOLOL wtf

    # handle audience
    [:audience, :nonaudience].each do |a|
      if o[a]
        o[a] = o[a].strip.split(/\s*[;,]+\s*/, -1).map do |t|
          if t =~ /^[a-z+-]+:[^[:space:]]+$/
            u = RDF::URI(t)
            canonical_uuid(u) || u
          elsif audiences[t.downcase]
            audiences[t.downcase].to_a
          end
        end.flatten.compact.uniq
      else
        o[a] = []
      end
    end

    # handle concepts
    [:subject, :introduces, :assumes, :mentions].each do |a|
      if o[a]
        o[a] = o[a].strip.split(/\s*[;,]+\s*/, -1).map do |t|
          if t =~ /^[a-z+-]+:[^[:space:]]+$/
            u = RDF::URI(t)
            canonical_uuid(u) || u
          elsif concepts[t.downcase]
            concepts[t.downcase].to_a
          end
        end.flatten.compact.uniq
      else
        o[a] = []
      end
    end

    CSV_PRED.each do |sym, pred|
      o[sym].each do |obj|
        @graph << [s, pred, obj]
      end
    end

    [s, o]
  end.compact.to_h

  data
end
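The header handling combines two steps: `header_converters: :symbol` downcases the CSV headers into symbols, and the key mapper renames `:uuid` to `:id` and `:url` to `:uri`. A self-contained sketch with a hypothetical two-row CSV:

```ruby
require 'csv'

# key mapper, as in the method above
km = { uuid: :id, url: :uri }
kt = -> (k) { km[k] || k }

csv = <<~CSV
  UUID,URL,Audience
  33f79465-ee52-4ba1-9ebc-8e823ef1b103,/my-doc,website operators
CSV

rows = CSV.parse(csv, headers: true,
  header_converters: :symbol).map do |o|
  o.to_h.transform_keys(&kt)
end
# each row is now keyed :id, :uri, :audience
```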
#label_for(subject, candidates: nil, unique: true, type: nil, lang: nil, desc: false, alt: false) ⇒ Array
Obtain the most appropriate label(s) for the subject’s type(s). Returns one or more (depending on the ‘unique` flag) predicate-object pairs in order of preference.
# File 'lib/rdf/sak.rb', line 440

def label_for subject, candidates: nil, unique: true, type: nil,
    lang: nil, desc: false, alt: false
  Util.label_for @graph, subject, candidates: candidates,
    unique: unique, type: type, lang: lang, desc: desc, alt: alt
end
#locate(uri) ⇒ Pathname
Locate the file in the source directory associated with the given URI.
# File 'lib/rdf/sak.rb', line 1703

def locate uri
  uri = coerce_resource uri

  base = URI(@base.to_s)

  tu = URI(uri) # copy of uri for testing content
  unless tu.scheme == 'urn' and tu.nid == 'uuid'
    raise "could not find UUID for #{uri}" unless
      uuid = canonical_uuid(uri)
    tu = URI(uri = uuid)
  end

  # xxx bail if the uri isn't a subject in the graph

  candidates = [@config[:source] + tu.uuid]

  # try all canonical URIs
  (canonical_uri uri, unique: false, slugs: true).each do |u|
    u = URI(u.to_s)
    next unless u.hostname == base.hostname
    p = URI.unescape u.path[/^\/*(.*?)$/, 1]
    candidates.push(@config[:source] + p)
  end

  # warn candidates

  files = candidates.uniq.map do |c|
    Pathname.glob(c.to_s + '{,.*,/index{,.*}}')
  end.reduce(:+).reject do |x|
    x.directory? or
      RDF::SAK::MimeMagic.by_path(x).to_s !~ /.*(?:markdown|(?:x?ht|x)ml).*/i
  end.uniq

  # warn files

  # XXX implement negotiation algorithm
  return files[0]

  # return the filename from the source
  # nil
end
#map_location(type) ⇒ Object
private?
# File 'lib/rdf/sak.rb', line 1504

def map_location type
  # find file name in config
  fn = @config[:maps][type] or return

  # concatenate to target directory
  @config[:target] + fn
end
#objects_for(subject, predicate, entail: true, only: [], datatype: nil) ⇒ RDF::Term
Returns objects from the graph with entailment.
# File 'lib/rdf/sak.rb', line 370

def objects_for subject, predicate, entail: true, only: [], datatype: nil
  Util.objects_for @graph, subject, predicate,
    entail: entail, only: only, datatype: datatype
end
#prefixes ⇒ Hash
Get the prefix mappings from the configuration.
# File 'lib/rdf/sak.rb', line 247

def prefixes
  @config[:prefixes] || {}
end
#published?(uri, circulated: false) ⇒ true, false
Determine whether the URI represents a published document.
# File 'lib/rdf/sak.rb', line 1808

def published? uri, circulated: false
  RDF::SAK::Util.published? @graph, uri,
    circulated: circulated, base: @base
end
#reachable(published: false) ⇒ Object
Get all “reachable” UUID-identified entities (subjects which are also objects)
# File 'lib/rdf/sak.rb', line 525

def reachable published: false
  p = published ? -> x { published?(x) } : -> x { true }
  # now get the subjects which are also objects
  @graph.subjects.select do |s|
    s.uri? && s =~ /^urn:uuid:/ && @graph.has_object?(s) && p.call(s)
  end
end
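The core idea is keeping only UUID-denominated subjects that also appear as the object of some statement. A toy illustration over triples as plain arrays (hypothetical URN labels, standing in for real UUIDs; the actual method queries the RDF graph):

```ruby
triples = [
  %w[urn:uuid:doc-1 dct:references urn:uuid:doc-2],
  %w[urn:uuid:doc-2 dct:title Hello],
  %w[urn:uuid:doc-3 dct:title Orphan],
]

objects = triples.map(&:last)

# doc-2 is "reachable": it is a subject AND the object of a statement.
reachable = triples.map(&:first).uniq.select do |s|
  s.start_with?('urn:uuid:') && objects.include?(s)
end
# reachable == ['urn:uuid:doc-2']
```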
#reading_lists(published: true) ⇒ Object
whoops lol we forgot the book list
# File 'lib/rdf/sak.rb', line 1554

def reading_lists published: true
  out = all_of_type RDF::Vocab::SiocTypes.ReadingList
  return out unless published
  out.select { |r| published? r }
end
#replacements_for(subject, published: true) ⇒ Set
Find the terminal replacements for the given subject, if any exist.
# File 'lib/rdf/sak.rb', line 382

def replacements_for subject, published: true
  Util.replacements_for @graph, subject, published: published
end
#resolve_documents ⇒ Object
resolve documents from source
# File 'lib/rdf/sak.rb', line 1758

def resolve_documents
  src = @config[:source]
  out = []
  src.find do |f|
    Find.prune if f.basename.to_s[0] == ?.
    next if f.directory?
    out << f
  end

  out
end
#resolve_file(path) ⇒ Object
# File 'lib/rdf/sak.rb', line 1770

def resolve_file path
  return unless path.file?
  path = Pathname('/') + path.relative_path_from(@config[:source])
  base = URI(@base.to_s)
  uri  = base + path.to_s

  # warn "trying #{uri}"

  until (out = canonical_uuid uri)
    # iteratively strip off
    break if uri.path.end_with? '/'

    dn = path.dirname
    bn = path.basename '.*'

    # try index first
    if bn.to_s == 'index'
      p = dn.to_s
      p << '/' unless p.end_with? '/'
      uri = base + p
    elsif bn == path.basename
      break
    else
      path = dn + bn
      uri = base + path.to_s
    end

    # warn "trying #{uri}"
  end

  out
end
#struct_for(subject, rev: false, only: [], uuids: false, canon: false) ⇒ Hash
Obtain a key-value structure for the given subject, optionally constraining the result by node type (:resource, :uri/:iri, :blank/:bnode, :literal)
# File 'lib/rdf/sak.rb', line 276

def struct_for subject, rev: false, only: [], uuids: false, canon: false
  Util.struct_for @graph, subject,
    rev: rev, only: only, uuids: uuids, canon: canon
end
#sub_concepts(concept, extra: []) ⇒ Object
# File 'lib/rdf/sak.rb', line 470

def sub_concepts concept, extra: []
  raise 'Concept must be exactly one concept' unless
    concept.is_a? RDF::Resource
  extra = term_list extra

  # we need an array for a queue, and a set to accumulate the
  # output as well as a separate 'seen' set
  queue = [concept]
  seen  = Set.new queue.dup
  out   = seen.dup

  # it turns out that the main SKOS hierarchy terms, while not
  # being transitive themselves, are subproperties of transitive
  # relations which means they are as good as being transitive.

  while c = queue.shift
    SKOS_HIER.each do |struct|
      elem, pat, preds = struct.values_at(:element, :pattern, :preds)
      preds.each do |p|
        @graph.query(pat.call c, p).each do |stmt|
          # obtain hierarchical element
          hierc = stmt.send elem

          # skip any further processing if we have seen this concept
          next if seen.include? hierc
          seen << hierc

          next if !extra.empty? and !extra.any? do |t|
            @graph.has_statement? RDF::Statement.new(hierc, RDF.type, t)
          end

          queue << hierc
          out   << hierc
        end
      end
    end
  end

  out.to_a.sort
end
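The traversal above is a plain breadth-first search over the two SKOS hierarchy directions. A toy version over an adjacency hash, with hypothetical concept names standing in for `skos:narrower` links (the method name here mirrors the original but takes the hash explicitly):

```ruby
require 'set'

narrower = {
  'animals' => %w[mammals birds],
  'mammals' => %w[cats dogs],
  'birds'   => [],
  'cats'    => [],
  'dogs'    => [],
}

def sub_concepts concept, narrower
  queue = [concept]
  seen  = Set.new queue
  while c = queue.shift
    (narrower[c] || []).each do |n|
      next if seen.include? n # skip anything already visited
      seen  << n
      queue << n
    end
  end
  seen.to_a.sort
end
# like the original, the queried concept seeds the output set,
# so it appears in its own result.
```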
#subjects_for(predicate, object, entail: true, only: []) ⇒ RDF::Resource
Returns subjects from the graph with entailment.
# File 'lib/rdf/sak.rb', line 356

def subjects_for predicate, object, entail: true, only: []
  Util.subjects_for @graph, predicate, object, entail: entail, only: only
end
#target_for(uri, published: false) ⇒ Pathname
Find a destination pathname for the document
# File 'lib/rdf/sak.rb', line 1819

def target_for uri, published: false
  uri = coerce_resource uri
  uri = canonical_uuid uri
  target = @config[published?(uri) && published ? :target : :private]

  # target is a pathname so this makes a pathname
  target + "#{URI(uri.to_s).uuid}.xml"
end
#visit(uri) ⇒ RDF::SAK::Context::Document
Visit (open) the document at the given URI.
# File 'lib/rdf/sak.rb', line 1750

def visit uri
  uri  = canonical_uuid uri
  path = locate uri
  return unless path
  Document.new self, uri, uri: canonical_uri(uri), doc: path
end
#write_feeds(type: RDF::Vocab::DCAT.Distribution, published: true) ⇒ Object
# File 'lib/rdf/sak.rb', line 1357

def write_feeds type: RDF::Vocab::DCAT.Distribution, published: true
  feeds  = all_of_type type
  target = @config[published ? :target : :private]
  feeds.each do |feed|
    tu  = URI(feed.to_s)
    doc = generate_atom_feed feed, published: published, related: feeds
    fh  = (target + "#{tu.uuid}.xml").open('w')
    doc.write_to fh
    fh.close
  end
end
#write_gone_map(published: false, docs: nil) ⇒ Object
# File 'lib/rdf/sak.rb', line 1535

def write_gone_map published: false, docs: nil
  data = generate_gone_map published: published, docs: docs
  loc  = map_location :gone
  write_map_file loc, data
end
#write_map_file(location, data) ⇒ Object
private?
# File 'lib/rdf/sak.rb', line 1514

def write_map_file location, data
  # open file
  fh = File.new location, 'w'
  data.sort.each { |k, v| fh.write "#{k}\t#{v}\n" }
  fh.close # return value is return value from close
end
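The maps are plain two-column, tab-separated files sorted by key, which is the shape a web server rewrite map (e.g. Apache's `RewriteMap`, presumably the intended consumer given the 308/410 comments in `write_maps`) expects. A sketch of the serialization against a `StringIO` instead of a real file:

```ruby
require 'stringio'

# hypothetical slug pairs
data = { 'old-slug' => 'new-slug', 'a-slug' => 'b-slug' }

io = StringIO.new
# Hash#sort yields [key, value] pairs in key order
data.sort.each { |k, v| io.write "#{k}\t#{v}\n" }
# io.string == "a-slug\tb-slug\nold-slug\tnew-slug\n"
```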
#write_maps(published: true, docs: nil) ⇒ Object
# File 'lib/rdf/sak.rb', line 1541

def write_maps published: true, docs: nil
  docs ||= reachable published: false
  # slug to uuid (internal)
  write_rewrite_map docs: docs
  # uuid/slug to canonical slug (308)
  write_redirect_map docs: docs
  # retired slugs/uuids (410)
  write_gone_map docs: docs
  true
end
#write_reading_lists(published: true) ⇒ Object
# File 'lib/rdf/sak.rb', line 1570

def write_reading_lists published: true
  reading_lists(published: published).each do |rl|
    tu  = URI(rl.to_s)
    doc = generate_reading_list rl, published: published
    fh  = (target + "#{tu.uuid}.xml").open('w')
    doc.write_to fh
    fh.close
  end
end
#write_redirect_map(published: false, docs: nil) ⇒ Object
# File 'lib/rdf/sak.rb', line 1529

def write_redirect_map published: false, docs: nil
  data = generate_redirect_map published: published, docs: docs
  loc  = map_location :redirect
  write_map_file loc, data
end
#write_rewrite_map(published: false, docs: nil) ⇒ Object
public again
# File 'lib/rdf/sak.rb', line 1523

def write_rewrite_map published: false, docs: nil
  data = generate_rewrite_map published: published, docs: docs
  loc  = map_location :rewrite
  write_map_file loc, data
end
#write_sitemap(published: true) ⇒ Object
# File 'lib/rdf/sak.rb', line 1159

def write_sitemap published: true
  sitemap = generate_sitemap published: published
  file    = @config[:sitemap] || '.well-known/sitemap.xml'
  target  = @config[published ? :target : :private]
  target.mkpath unless target.directory?

  fh = (target + file).open(?w)
  sitemap.write_to fh
  fh.close
end
#write_stats(published: true) ⇒ Object
# File 'lib/rdf/sak.rb', line 1683

def write_stats published: true
  target = @config[published ? :target : :private]
  target.mkpath unless target.directory?

  generate_stats(published: published).each do |uu, doc|
    bn = URI(uu.to_s).uuid + '.xml'
    fh = (target + bn).open ?w
    doc.write_to fh
    fh.flush
    fh.close
  end
end
#write_xhtml(published: true) ⇒ Object
write public and private variants to target
# File 'lib/rdf/sak.rb', line 1834

def write_xhtml published: true
end