Class: CBETA::Gaiji

Inherits:
Object
Defined in:
lib/cbeta/gaiji.rb

Overview

Provides access to the CBETA gaiji (missing-character) database.

Constructor Details

#initialize ⇒ Gaiji

Loads the CBETA gaiji database.



# File 'lib/cbeta/gaiji.rb', line 8

def initialize
  @us = CBETA::UnicodeService.new

  # Load the main gaiji table from the data folder.
  folder = File.join(File.dirname(__FILE__), '../data')
  fn = File.join(folder, 'cbeta_gaiji.json')
  @gaijis = JSON.parse(File.read(fn))

  # Merge the Sanskrit entries into the same table.
  fn = File.join(folder, 'cbeta_sanskrit.json')
  h = JSON.parse(File.read(fn))
  @gaijis.merge!(h)

  # Build reverse indexes: composition => ID and Unicode character => ID.
  @zzs = {}
  @uni2cb = {}
  @gaijis.each do |k,v|
    if v.key? 'composition'
      zzs = v['composition']
      @zzs[zzs] = k
    end

    if v.key? 'uni_char'
      c = v['uni_char']
      @uni2cb[c] = k
    end
  end
end
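
The reverse lookup tables built by the constructor can be illustrated without the bundled JSON files. The following is a minimal sketch, assuming a small hand-written hash in place of `cbeta_gaiji.json`; the `SAMPLE_GAIJIS` data (including the made-up `CB99999` entry) and the `build_indexes` helper are hypothetical and not part of the library.

```ruby
# Hypothetical sample data standing in for the parsed JSON files.
SAMPLE_GAIJIS = {
  'CB01002' => { 'composition' => '[得-彳]', 'unicode' => '3775', 'uni_char' => "\u3775" },
  'CB99999' => { 'composition' => '[A*B]' } # made-up entry without a Unicode character
}.freeze

# Builds the two reverse indexes the constructor keeps:
# composition => ID, and Unicode character => ID.
def build_indexes(gaijis)
  zzs = {}
  uni2cb = {}
  gaijis.each do |code, info|
    zzs[info['composition']] = code if info.key?('composition')
    uni2cb[info['uni_char']] = code if info.key?('uni_char')
  end
  [zzs, uni2cb]
end

zzs_index, uni_index = build_indexes(SAMPLE_GAIJIS)
puts zzs_index['[得-彳]']
puts uni_index["\u3775"]
```

Entries without a `uni_char` key simply do not appear in the second index, which is why a lookup by Unicode character can miss even for a known ID.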

Instance Method Details

#[](cb) ⇒ Hash{String => String, Array<String>}?

Returns the information recorded for a gaiji.

Return:

{
  "composition": "[得-彳]",
  "unicode": "3775",
  "uni_char": "㝵",
  "zhuyin": [ "ㄉㄜˊ", "ㄞˋ" ]
}

Examples:

g = CBETA::Gaiji.new
g["CB01002"]

Parameters:

  • cb (String)

    CB code of the gaiji

Returns:

  • (Hash{String => String, Array<String>})

    information about the gaiji

  • (nil)

    if the CB code does not exist in the CBETA gaiji database



# File 'lib/cbeta/gaiji.rb', line 50

def [](cb)
  @gaijis[cb]
end

#key?(cb) ⇒ Boolean

Checks whether a gaiji code exists in the database.

Returns:

  • (Boolean)


# File 'lib/cbeta/gaiji.rb', line 55

def key?(cb)
  @gaijis.key? cb
end

#to_s(gid, cb_priority: nil, skt_priority: nil) ⇒ String

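Renders a gaiji as a string, trying each representation in priority order.

Parameters:

  • cb_priority (Array<String>) (defaults to: nil)

    priority order for IDs starting with CB; when nil: uni_char, norm_uni_char, norm_big5_char, composition

  • skt_priority (Array<String>) (defaults to: nil)

    priority order for all other (Sanskrit) IDs; when nil: symbol, romanized, PUA

Returns:

  • (String)

    may be nil, if the ID is unknown or no representation in the priority list is available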



# File 'lib/cbeta/gaiji.rb', line 64

def to_s(gid, cb_priority: nil, skt_priority: nil)
  # Default priority for IDs starting with 'CB'.
  if cb_priority.nil?
    cb_priority = %w(uni_char norm_uni_char norm_big5_char composition)
  end

  # Default priority for all other IDs.
  if skt_priority.nil?
    skt_priority = %w(symbol romanized PUA)
  end

  g = @gaijis[gid]
  return nil if g.nil?

  if gid.start_with? 'CB'
    cb_priority.each do |k|
      case k
      when 'PUA'
        return CBETA.pua(gid)
      when 'uni_char', 'norm_uni_char'
        # Unicode forms are used only when the Unicode service accepts them.
        return g[k] if @us.level2?(g[k])
      else
        return g[k] if g.key?(k) and not g[k].empty?
      end
    end
  else
    skt_priority.each do |k|
      if k == 'PUA'
        # Convert a 'U+XXXX' code point string into the character itself.
        s = g['pua'].sub(/^U\+(.*)$/, '\1')
        i = s.to_i(16)
        return [i].pack("U")
      else
        if g.key? k
          return g[k] unless g[k].empty?
        end
      end
    end
  end
  nil
end
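
The priority fallback can be tried in isolation. The following is a simplified sketch, not the library's code: `first_available` and `pua_char` are hypothetical helpers, the sample entry is made up, and the `CBETA::UnicodeService` check is left out, so every non-empty value is accepted.

```ruby
# Returns the first non-empty representation of +entry+ among +priority+,
# or nil when none of the listed keys has a usable value.
def first_available(entry, priority)
  priority.each do |key|
    value = entry[key]
    return value unless value.nil? || value.empty?
  end
  nil
end

# Converts a code point written as "U+XXXX" into a one-character string,
# using the same hex parsing and pack('U') step as the 'PUA' branch above.
def pua_char(code_point)
  [code_point.sub(/\AU\+/, '').to_i(16)].pack('U')
end

entry = { 'uni_char' => '', 'composition' => '[得-彳]' } # hypothetical entry
puts first_available(entry, %w[uni_char norm_uni_char composition])
puts pua_char('U+F0001') == "\u{F0001}"
```

Passing a shorter or reordered list plays the role of the `cb_priority:` and `skt_priority:` keyword arguments: the first key with a usable value wins.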

#unicode_to_cb(unicode_char) ⇒ Object

Returns the ID of the gaiji whose uni_char is the given Unicode character, or nil if there is none.



# File 'lib/cbeta/gaiji.rb', line 103

def unicode_to_cb(unicode_char)
  @uni2cb[unicode_char]
end

#zhuyin(cb) ⇒ Array<String>

Takes a gaiji CB code and returns an array of its zhuyin (bopomofo) readings.

Data source: the MS Access gaiji database provided by CBETA on 2015-05-15.

Examples:

g = CBETA::Gaiji.new
g.zhuyin("CB00023") # returns [ "ㄍㄢˇ", "ㄍㄢ", "ㄧㄤˊ", "ㄇㄧˇ", "ㄇㄧㄝ", "ㄒㄧㄤˊ" ]

Parameters:

  • cb (String)

    CB code of the gaiji

Returns:

  • (Array<String>)

  • (nil)

    if the CB code does not exist in the database


# File 'lib/cbeta/gaiji.rb', line 117

def zhuyin(cb)
  return nil unless @gaijis.key? cb
  @gaijis[cb]['zhuyin']
end
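
Because the lookup returns nil for an unknown code, callers that iterate over the readings may prefer an empty array. The following is a small sketch under that assumption; `GAIJI_SAMPLE` is a hypothetical stand-in for the loaded database, and `zhuyin_or_empty` is not part of the library.

```ruby
# Hypothetical stand-in for the loaded database (one entry only).
GAIJI_SAMPLE = {
  'CB01002' => { 'zhuyin' => ['ㄉㄜˊ', 'ㄞˋ'] }
}.freeze

# Same lookup as #zhuyin, but returns [] instead of nil when the code
# is unknown or the entry has no 'zhuyin' key.
def zhuyin_or_empty(gaijis, cb)
  gaijis.dig(cb, 'zhuyin') || []
end

puts zhuyin_or_empty(GAIJI_SAMPLE, 'CB01002').join(', ')
puts zhuyin_or_empty(GAIJI_SAMPLE, 'CB00000').empty?
```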

#zzs2pua(zzs) ⇒ Object

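Takes a composition expression (組字式) and returns the PUA (Private Use Area) form of the matching gaiji, or nil if the expression is unknown.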



# File 'lib/cbeta/gaiji.rb', line 123

def zzs2pua(zzs)
  return nil unless @zzs.key? zzs
  gid = @zzs[zzs]
  CBETA.pua(gid)
end