Class: Swot
- Inherits:
- Object
- Extended by:
- SwotCollectionMethods
- Includes:
- NaughtyOrNice
- Defined in:
- lib/swot.rb,
lib/swot/academic_tlds.rb
Constant Summary
- VERSION =
"0.4.2"
- BLACKLIST =
These are domains that snuck into the edu registry but don’t pass the education sniff test. Note: a validated domain must not end with a blacklisted string.
%w( si.edu america.edu californiacolleges.edu australia.edu cet.edu folger.edu ).freeze
- ACADEMIC_TLDS =
Domains under these top-level domains are guaranteed to belong to academic institutions.
%w( ac.ae ac.at ac.bd ac.be ac.cn ac.cr ac.cy ac.fj ac.gg ac.gn ac.id ac.il ac.in ac.ir ac.jp ac.ke ac.kr ac.ma ac.me ac.mu ac.mw ac.mz ac.ni ac.nz ac.om ac.pa ac.pg ac.pr ac.rs ac.ru ac.rw ac.sz ac.th ac.tz ac.ug ac.uk ac.yu ac.za ac.zm ac.zw cc.al.us cc.ar.us cc.az.us cc.ca.us cc.co.us cc.fl.us cc.ga.us cc.hi.us cc.ia.us cc.id.us cc.il.us cc.in.us cc.ks.us cc.ky.us cc.la.us cc.md.us cc.me.us cc.mi.us cc.mn.us cc.mo.us cc.ms.us cc.mt.us cc.nc.us cc.nd.us cc.ne.us cc.nj.us cc.nm.us cc.nv.us cc.ny.us cc.oh.us cc.ok.us cc.or.us cc.pa.us cc.ri.us cc.sc.us cc.sd.us cc.tx.us cc.va.us cc.vi.us cc.wa.us cc.wi.us cc.wv.us cc.wy.us ed.ao ed.cr ed.jp edu edu.af edu.al edu.ar edu.au edu.az edu.ba edu.bb edu.bd edu.bh edu.bi edu.bn edu.bo edu.br edu.bs edu.bt edu.bz edu.ck edu.cn edu.co edu.cu edu.do edu.dz edu.ec edu.ee edu.eg edu.er edu.es edu.et edu.ge edu.gh edu.gr edu.gt edu.hk edu.hn edu.ht edu.in edu.iq edu.jm edu.jo edu.kg edu.kh edu.kn edu.kw edu.ky edu.kz edu.la edu.lb edu.lr edu.lv edu.ly edu.me edu.mg edu.mk edu.ml edu.mm edu.mn edu.mo edu.mt edu.mv edu.mw edu.mx edu.my edu.ni edu.np edu.om edu.pa edu.pe edu.ph edu.pk edu.pl edu.pr edu.ps edu.pt edu.pw edu.py edu.qa edu.rs edu.ru edu.sa edu.sc edu.sd edu.sg edu.sh edu.sl edu.sv edu.sy edu.tr edu.tt edu.tw edu.ua edu.uy edu.ve edu.vn edu.ws edu.ye edu.zm es.kr g12.br hs.kr ms.kr sc.kr sc.ug sch.ae sch.gg sch.id sch.ir sch.je sch.jo sch.lk sch.ly sch.my sch.om sch.ps sch.sa sch.uk school.nz school.za tec.ar.us tec.az.us tec.co.us tec.fl.us tec.ga.us tec.ia.us tec.id.us tec.il.us tec.in.us tec.ks.us tec.ky.us tec.la.us tec.ma.us tec.md.us tec.me.us tec.mi.us tec.mn.us tec.mo.us tec.ms.us tec.mt.us tec.nc.us tec.nd.us tec.nh.us tec.nm.us tec.nv.us tec.ny.us tec.oh.us tec.ok.us tec.pa.us tec.sc.us tec.sd.us tec.tx.us tec.ut.us tec.vi.us tec.wa.us tec.wi.us tec.wv.us vic.edu.au ).to_set.freeze
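The blacklist note can be illustrated in isolation. This is a minimal sketch, not part of Swot's API: `BLACKLIST_SAMPLE` and `blacklisted?` are hypothetical names, and the regular expression is the one that appears in the `#valid?` source. It suggests that "ends with" means ending at a label boundary, so a host is rejected when it equals a blacklisted domain or is a subdomain of one.

```ruby
# A two-entry subset of the BLACKLIST constant, for illustration only.
BLACKLIST_SAMPLE = %w(si.edu america.edu).freeze

# Hypothetical helper (not defined by Swot): true when the host is a
# blacklisted domain or a subdomain of one.
def blacklisted?(host)
  BLACKLIST_SAMPLE.any? { |d| host =~ /(\A|\.)#{Regexp.escape(d)}\z/ }
end

puts blacklisted?("si.edu")       # exact match
puts blacklisted?("mail.si.edu")  # subdomain of a blacklisted domain
puts blacklisted?("ksi.edu")      # same trailing characters, different label
```

The first two calls print `true` and the third prints `false`, because `(\A|\.)` requires the blacklisted string to start at the beginning of the host or right after a dot.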
Class Method Summary
- .academic? ⇒ Object
- .domains_path ⇒ Object
- .from_path(path_string_or_path) ⇒ Object
Returns a new Swot instance for the domain file at the given path.
- .get_institution_name(text) ⇒ Object (also: school_name)
- .is_academic? ⇒ Object
Instance Method Summary
- #academic_domain? ⇒ Boolean
Figure out if a domain name is a known academic institution.
- #institution_name ⇒ Object (also: #school_name, #name)
Figure out the institution name based on the email address/domain.
- #valid? ⇒ Boolean
Figure out if an email or domain belongs to an academic institution.
Methods included from SwotCollectionMethods
Class Method Details
.academic? ⇒ Object
# File 'lib/swot.rb', line 26
alias_method :academic?, :valid?
.domains_path ⇒ Object
# File 'lib/swot.rb', line 33
def domains_path
  @domains_path ||= File.expand_path "domains", File.dirname(__FILE__)
end
.from_path(path_string_or_path) ⇒ Object
Returns a new Swot instance for the domain file at the given path.
Note that the path must be absolute.
Returns a Swot instance, or false if no domain is found at the given path.
# File 'lib/swot.rb', line 41
def from_path(path_string_or_path)
  path = Pathname.new(path_string_or_path)
  return false unless path.exist?
  path_dir, file = path.relative_path_from(Pathname.new(domains_path)).split
  backwards_path = path_dir.to_s.split('/').push(file.basename('.txt').to_s)
  domain = backwards_path.reverse.join('.')
  Swot.new(domain)
end
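The path-to-domain conversion in `from_path` can be tried without the gem. This is a sketch under assumptions: `domain_for` is a hypothetical helper, and the root directory below is an invented stand-in for whatever `domains_path` returns. It skips the `path.exist?` check, so no files need to exist.

```ruby
require "pathname"

# Hypothetical helper (not part of Swot): turns a domain-file path into a
# domain name by reversing the directory components, as from_path does.
def domain_for(path_string, domains_root)
  path = Pathname.new(path_string)
  path_dir, file = path.relative_path_from(Pathname.new(domains_root)).split
  parts = path_dir.to_s.split("/").push(file.basename(".txt").to_s)
  parts.reverse.join(".")
end

root = "/opt/swot/lib/domains" # assumed location, for illustration only
puts domain_for("#{root}/edu/stanford.txt", root)
puts domain_for("#{root}/uk/ac/ox.txt", root)
```

This prints `stanford.edu` and `ox.ac.uk`: the directory layout stores each domain with its labels reversed, and the file name without `.txt` supplies the leftmost label.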
.get_institution_name(text) ⇒ Object Also known as: school_name
# File 'lib/swot.rb', line 28
def get_institution_name(text)
  Swot.new(text).institution_name
end
.is_academic? ⇒ Object
# File 'lib/swot.rb', line 25
alias_method :is_academic?, :valid?
Instance Method Details
#academic_domain? ⇒ Boolean
Figure out if a domain name is a known academic institution.
Returns true if the domain name belongs to a known academic institution;
false otherwise.
# File 'lib/swot.rb', line 84
def academic_domain?
  @academic_domain ||= File.exist?(file_path)
end
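A note on the `||=` idiom used here: it caches only truthy values, so a `false` result is recomputed on every call. The sketch below uses a hypothetical `Probe` class (not part of Swot) that counts how often the underlying check runs. It illustrates the idiom in general and makes no claim about how often Swot calls `academic_domain?` in practice.

```ruby
# Hypothetical class for illustration; `expensive` stands in for a call
# such as File.exist?(file_path).
class Probe
  attr_reader :calls

  def initialize(result)
    @result = result
    @calls = 0
  end

  # Same memoization pattern as academic_domain?.
  def check
    @check ||= expensive
  end

  private

  def expensive
    @calls += 1
    @result
  end
end

hit = Probe.new(true)
2.times { hit.check }
puts hit.calls   # the truthy result was cached after the first call

miss = Probe.new(false)
2.times { miss.check }
puts miss.calls  # the false result was recomputed each time
```

This prints `1` and then `2`. For a cheap check like a file-existence test the repeated work is small, which may be why the simple pattern is acceptable here.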
#institution_name ⇒ Object Also known as: school_name, name
Figure out the institution name based on the email address/domain.
Returns a string with the institution name; nil if nothing is found.
# File 'lib/swot.rb', line 72
def institution_name
  @institution_name ||= File.read(file_path, :mode => "rb", :external_encoding => "UTF-8").strip
rescue
  nil
end
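The read-and-strip behaviour, including the `nil` fallback, can be reproduced with the standard library alone. This is a sketch: `read_name` is a hypothetical stand-in for the body of `institution_name`, and the temporary file plays the role of a domain file.

```ruby
require "tempfile"

# Hypothetical helper mirroring institution_name: returns the stripped
# file contents as UTF-8, or nil if the file cannot be read.
def read_name(file_path)
  File.read(file_path, :mode => "rb", :external_encoding => "UTF-8").strip
rescue
  nil
end

Tempfile.create("swot-demo") do |f|
  f.write("Stanford University\n")
  f.flush
  puts read_name(f.path).inspect
end

puts read_name("/no/such/dir/missing.txt").inspect
```

The first call prints `"Stanford University"` and the second prints `nil`, since the bare `rescue` catches the error raised for the missing file.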
#valid? ⇒ Boolean
Figure out if an email or domain belongs to an academic institution.
Returns true if the domain name belongs to an academic institution;
false otherwise.
# File 'lib/swot.rb', line 55
def valid?
  if domain.nil?
    false
  elsif BLACKLIST.any? { |d| to_s =~ /(\A|\.)#{Regexp.escape(d)}\z/ }
    false
  elsif ACADEMIC_TLDS.include?(domain.tld)
    true
  elsif academic_domain?
    true
  else
    false
  end
end
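The order of the checks matters: the blacklist is consulted before the TLD list, so a blacklisted `.edu` domain is rejected even though `edu` is an academic TLD. The sketch below mimics that order with hypothetical stand-ins. `academic_sketch?` is not a Swot method, the host and TLD are passed in as plain strings instead of being parsed, and `known_hosts` replaces the domain-file lookup.

```ruby
require "set"

# Small subsets of the real constants, for illustration only.
BLACKLIST_DEMO = %w(si.edu).freeze
ACADEMIC_TLDS_DEMO = %w(edu ac.uk).to_set.freeze

# Hypothetical stand-in for #valid?, applying the same checks in order.
def academic_sketch?(host, tld, known_hosts = [])
  return false if host.nil?
  return false if BLACKLIST_DEMO.any? { |d| host =~ /(\A|\.)#{Regexp.escape(d)}\z/ }
  return true if ACADEMIC_TLDS_DEMO.include?(tld)

  known_hosts.include?(host) # stands in for academic_domain?
end

puts academic_sketch?("stanford.edu", "edu")                  # academic TLD
puts academic_sketch?("si.edu", "edu")                        # blacklist wins
puts academic_sketch?("example.com", "com")                   # no match
puts academic_sketch?("example.com", "com", ["example.com"])  # known host
```

This prints `true`, `false`, `false`, `true`.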