Module: Linguistics::EN::TitleCase

Defined in:
lib/linguistics/en/titlecase.rb

Overview

Methods for capitalizing a sentence as a title, nouns as proper nouns, and for turning a sentence into its equivalent CamelCaseSentence and vice-versa. It’s part of the English-language Linguistics module.

Constant Summary collapse

ARTICLES =

Exceptions: Indefinite articles

%w[a and the]
SHORT_PREPOSITIONS =

Exceptions: Prepositions shorter than five letters

["amid", "at", "but", "by", "down", "for", "from", "in",
"into", "like", "near", "of", "off", "on", "onto", "out", "over",
"past", "save", "with", "till", "to", "unto", "up", "upon", "with"]
COORD_CONJUNCTIONS =

Exceptions: Coordinating conjunctions

%w[and but as]
TITLE_CASE_EXCEPTIONS =

Titlecase exceptions: “In titles, capitalize the first word, the last word, and all words in between except articles (a, an, and the), prepositions under five letters (in, of, to), and coordinating conjunctions (and, but). These rules apply to titles of long, short, and partial works as well as your own papers” (Anson, Schwegler, and Muth. The Longman Writer’s Companion 240).

ARTICLES | SHORT_PREPOSITIONS | COORD_CONJUNCTIONS
PROPER_NOUN_EXCEPTIONS =

The words which don’t get capitalized in a compound proper noun

%w{and the of}

Instance Method Summary collapse

Instance Method Details

#proper_nounObject

Returns the proper noun form of the inflected object by capitalizing most of the words.

Some examples:

"bosnia and herzegovina".en.proper_noun
# => "Bosnia and Herzegovina"
"macedonia, the former yugoslav republic of".en.proper_noun
# => "Macedonia, the Former Yugoslav Republic of"
"virgin islands, u.s.".en.proper_noun
# => "Virgin Islands, U.S."


110
111
112
113
114
115
116
117
# File 'lib/linguistics/en/titlecase.rb', line 110

def proper_noun
	return self.to_s.split(/([ .]+)/).collect do |word|
		next word unless
			/^[a-z]/.match( word ) &&
			! (PROPER_NOUN_EXCEPTIONS.include?( word ))
		word.capitalize
	end.join
end

#titlecaseObject

Returns the inflected object as a title-cased String.

Some examples:

"a portrait of the artist as a young man".en.titlecase
# => "A Portrait of the Artist as a Young Man"

"a seven-sided romance".en.titlecase
# => "A Seven-Sided Romance"

"the curious incident of the dog in the night-time".en.titlecase
# => "The Curious Incident of the Dog in the Night-Time"

"the rats of n.i.m.h.".en.titlecase
# => "The Rats of N.I.M.H."


68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
# File 'lib/linguistics/en/titlecase.rb', line 68

def titlecase

	# Split on word-boundaries
	words = self.to_s.split( /\b/ )

	# Always capitalize the first and last words
	words.first.capitalize!
	words.last.capitalize!

	# Now scan the rest of the tokens, skipping non-words and capitalization
	# exceptions.
	words.each_with_index do |word, i|

		# Non-words
		next unless /^\w+$/.match( word )

		# Skip exception-words
		next if TITLE_CASE_EXCEPTIONS.include?( word )

		# Skip second parts of contractions
		next if words[i - 1] == "'" && /\w/.match( words[i - 2] )

		# Have to do it this way instead of capitalize! because that method
		# also downcases all other letters.
		word.gsub!( /^(\w)(.*)/ ) { $1.upcase + $2 }
	end

	return words.join
end

#to_camel_caseObject

Turns an English language string into a CamelCase word.



48
49
50
# File 'lib/linguistics/en/titlecase.rb', line 48

def to_camel_case
	self.to_s.gsub( /\s+([a-z])/i ) { $1.upcase }
end

#un_camel_caseObject

Turns a camel-case string (“camelCaseToEnglish”) to plain English (“camel case to english”). Each word is decapitalized.



40
41
42
43
44
# File 'lib/linguistics/en/titlecase.rb', line 40

def un_camel_case
	self.to_s.
		gsub( /([A-Z])([A-Z])/ ) { "#$1 #$2" }.
		gsub( /([a-z])([A-Z])/ ) { "#$1 #$2" }.downcase
end