Class: CBETA

Inherits:
Object
Defined in:
lib/cbeta.rb

Defined Under Namespace

Classes: BMToText, CharCount, CharFrequency, Gaiji, HTMLToPDF, HTMLToText, P5aParser, P5aToHTML, P5aToHTMLForEveryEdition, P5aToHTMLForPDF, P5aToSimpleHTML, P5aToText, P5aValidator

Constant Summary

DATA =
File.join(File.dirname(__FILE__), 'data')
PUNCS =
'.[]。,、?「」『』《》<>〈〉〔〕[]【】〖〗'

Class Method Summary

Instance Method Summary

Constructor Details

#initialize ⇒ CBETA

Loads the canon data.



# File 'lib/cbeta.rb', line 52

def initialize
  # Load canon nicknames and abbreviation symbols from data/canons.csv.
  fn = File.join(File.dirname(__FILE__), 'data/canons.csv')
  text = File.read(fn)
  @canon_abbr = {}
  @canon_nickname = {}
  CSV.parse(text, :headers => true) do |row|
    id = row['id']
    unless row['nickname'].nil?
      @canon_nickname[id] = row['nickname']
    end
    # Skip rows that have no abbreviation.
    next if row['abbreviation'].nil?
    next if row['abbreviation'].empty?
    @canon_abbr[id] = row['abbreviation']
  end

  # Load the book ID => category mapping from data/categories.json.
  fn = File.join(File.dirname(__FILE__), 'data/categories.json')
  s = File.read(fn)
  @categories = JSON.parse(s)
end
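
To make the CSV loop above concrete, here is a minimal standalone sketch. The helper name `parse_canons`, the constant `SAMPLE_CANONS`, and the rows for `X` and `Y` are hypothetical; only the column names and the `T` values come from this page, and the real `data/canons.csv` may look different.

```ruby
require 'csv'

# Hypothetical sample rows; the real data/canons.csv is not reproduced here.
SAMPLE_CANONS = <<~ROWS
  id,nickname,abbreviation
  T,大正藏,【大】
  X,,
  Y,示例藏,
ROWS

# Hypothetical helper mirroring the constructor's CSV loop.
# Returns [abbreviations, nicknames], both keyed by canon ID.
def parse_canons(text)
  abbr = {}
  nickname = {}
  CSV.parse(text, headers: true) do |row|
    id = row['id']
    nickname[id] = row['nickname'] unless row['nickname'].nil?
    value = row['abbreviation']
    next if value.nil? || value.empty?
    abbr[id] = value
  end
  [abbr, nickname]
end

abbr, nickname = parse_canons(SAMPLE_CANONS)
p abbr      # only T has an abbreviation
p nickname  # T and Y have nicknames; X has neither
```

Rows with an empty nickname or abbreviation field are simply left out of the corresponding hash, which is why the getters further down can return nil.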

Class Method Details

.linehead_to_s(linehead) ⇒ String

Converts line-head information into citation format.

Examples:

CBETA.linehead_to_s('T85n2838_p1291a03')
# => "T85, no. 2838, p. 1291, a03"

Parameters:

  • linehead (String)

    Line-head information, for example T85n2838_p1291a03

Returns:

  • (String)

    Source reference in citation format, for example: T85, no. 2838, p. 1291, a03



# File 'lib/cbeta.rb', line 20

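def self.linehead_to_s(linehead)
  # Captures: canon and volume, work number, page, column and line.
  linehead.match(/^([A-Z]\d+)n(.*)_p(\d+)([a-z]\d+)$/) {
    return "#{$1}, no. #{$2}, p. #{$3}, #{$4}"
  }
  nil # the line head did not match the expected pattern
end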
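
The listing returns nil when the input does not match the pattern, which the documented example does not show. The following standalone sketch reuses the same regular expression under a hypothetical name, `format_linehead`, so it can be run without the library:

```ruby
# Standalone copy of the listing's logic; `format_linehead` is a hypothetical name.
def format_linehead(linehead)
  m = linehead.match(/^([A-Z]\d+)n(.*)_p(\d+)([a-z]\d+)$/)
  return nil if m.nil?
  "#{m[1]}, no. #{m[2]}, p. #{m[3]}, #{m[4]}"
end

puts format_linehead('T85n2838_p1291a03')  # prints "T85, no. 2838, p. 1291, a03"
p format_linehead('not a line head')       # prints nil
```

Callers should therefore check for nil before using the result.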

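.open_xml(fn) ⇒ Object

Reads the XML file fn, parses it with Nokogiri, and returns the document with all namespaces removed.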



# File 'lib/cbeta.rb', line 27

def self.open_xml(fn)
  s = File.read(fn)
  doc = Nokogiri::XML(s)
  # Strip namespaces so elements can be queried by their bare names.
  doc.remove_namespaces!
  doc
end

.pua(gid) ⇒ Object

Takes a gaiji (missing-character) code and returns the corresponding Unicode PUA character.



# File 'lib/cbeta.rb', line 35

def self.pua(gid)
  # Drop the first two characters and read the rest as a decimal offset from U+F0000.
  [0xf0000 + gid[2..-1].to_i].pack('U')
end
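
The arithmetic in the listing can be checked with a small standalone sketch. The helper name `gaiji_to_pua` is hypothetical, and the ID `'CB00001'` is an assumed example of the input format; this page does not state what a gaiji code looks like.

```ruby
# Hypothetical standalone copy of the listing's arithmetic.
def gaiji_to_pua(gid)
  # Drop the first two characters, read the rest as a decimal number,
  # and offset it into Supplementary Private Use Area-A (starts at U+F0000).
  [0xf0000 + gid[2..-1].to_i].pack('U')
end

ch = gaiji_to_pua('CB00001')   # 'CB00001' is an assumed ID format
puts format('U+%04X', ch.ord)  # prints "U+F0001"
```

Note that the numeric part is parsed as decimal (`to_i`), unlike the hexadecimal parsing used by the Ranjana and Siddham variants.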

.ranjana_pua(gid) ⇒ Object

Takes a Ranjana-script gaiji code and returns the corresponding Unicode PUA character.



# File 'lib/cbeta.rb', line 40

def self.ranjana_pua(gid)
  i = 0x10000 + gid[-4..-1].hex
  [i].pack("U")
end

.siddham_pua(gid) ⇒ Object

Takes a Siddham-script gaiji code and returns the corresponding Unicode PUA character.



# File 'lib/cbeta.rb', line 46

def self.siddham_pua(gid)
  i = 0xFA000 + gid[-4..-1].hex
  [i].pack("U")
end
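
Unlike .pua, this method reads the last four characters of the code as a hexadecimal number. A minimal sketch of that calculation follows; the helper name `siddham_to_pua` is hypothetical and `'SD-A440'` is an assumed example of the ID format, not one taken from this page.

```ruby
# Hypothetical standalone copy of the listing's arithmetic.
def siddham_to_pua(gid)
  # Read the last four characters as hexadecimal and add them to the base 0xFA000.
  [0xFA000 + gid[-4..-1].hex].pack('U')
end

ch = siddham_to_pua('SD-A440')  # 'SD-A440' is an assumed ID format
puts format('U+%04X', ch.ord)   # prints "U+104440"
```

The .ranjana_pua listing follows the same last-four-hex-digits pattern with a different base value.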

Instance Method Details

#get_canon_abbr(id) ⇒ String

Returns the abbreviated name of a canon.

Examples:

cbeta = CBETA.new
cbeta.get_canon_abbr('T') # => "大"

Parameters:

  • id (String)

    Canon ID; for example, the ID of the Taishō canon (大正藏) is “T”

Returns:

  • (String)

    The canon's abbreviated name, for example “大”



# File 'lib/cbeta.rb', line 100

def get_canon_abbr(id)
  r = get_canon_symbol(id)
  return nil if r.nil?
  r.sub(/^【(.*?)】$/, '\1')
end
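
The only transformation applied here is stripping the surrounding 【】 brackets from the canon symbol. A minimal sketch of that substitution, using a hypothetical helper name `strip_brackets`:

```ruby
# Hypothetical helper isolating the substitution used in the listing.
def strip_brackets(symbol)
  symbol.sub(/^【(.*?)】$/, '\1')
end

puts strip_brackets('【大】')  # prints "大"
puts strip_brackets('大')      # prints "大" (no brackets, so the string is unchanged)
```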

#get_canon_nickname(id) ⇒ String

Returns the canon's nickname, for example “大正藏”.

Parameters:

  • id (String)

    Canon ID; for example, the ID of the Taishō canon (大正藏) is “T”

Returns:

  • (String)

    The canon's nickname, for example “大正藏”



# File 'lib/cbeta.rb', line 74

def get_canon_nickname(id)
  return nil unless @canon_nickname.key? id
  @canon_nickname[id]
end

#get_canon_symbol(id) ⇒ String

Returns the abbreviation symbol of a canon.

Examples:

cbeta = CBETA.new
cbeta.get_canon_symbol('T') # => "【大】"

Parameters:

  • id (String)

    Canon ID; for example, the ID of the Taishō canon (大正藏) is “T”

Returns:

  • (String)

    The canon's abbreviation symbol, for example “【大】”



# File 'lib/cbeta.rb', line 87

def get_canon_symbol(id)
  return nil unless @canon_abbr.key? id
  @canon_abbr[id]
end

#get_category(book_id) ⇒ String

Takes a book ID (work number) and returns its category.

Examples:

cbeta = CBETA.new
cbeta.get_category('T0220') # => '般若部類'

Parameters:

  • book_id (String)

    Book ID (經號), e.g. “T0220”

Returns:

  • (String)

    Category name, for example “阿含部類”



# File 'lib/cbeta.rb', line 113

def get_category(book_id)
  @categories[book_id]
end
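
Because the method is a plain hash lookup on the parsed JSON, an unknown book ID yields nil. The sketch below assumes `data/categories.json` is a flat object keyed by book ID, which is consistent with the listing but not confirmed here; `SAMPLE_CATEGORIES` and `lookup_category` are hypothetical names, and the single entry is taken from the documented example.

```ruby
require 'json'

# Hypothetical excerpt; the real data/categories.json is not reproduced here.
SAMPLE_CATEGORIES = JSON.parse('{"T0220": "般若部類"}')

# Hypothetical helper mirroring the hash lookup in the listing.
def lookup_category(book_id)
  SAMPLE_CATEGORIES[book_id]
end

puts lookup_category('T0220')  # prints "般若部類"
p lookup_category('T9999')     # prints nil
```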