Wordlist

Description

Wordlist is a Ruby library and CLI for efficiently reading, combining, mutating, and building wordlists.

Features

  • Supports reading .txt wordlists, and .gz, .bz2, .xz, .zip, and .7z compressed wordlists.
  • Supports building wordlists from arbitrary text, with optional .gz, .bz2, .xz, .zip, or .7z compression.
  • Provides an advanced lexer for parsing text into words.
    • Can parse or skip digits, special characters, whole numbers, and acronyms.
    • Can normalize case, apostrophes, and acronyms.
  • Supports wordlist operations for combining multiple wordlists together.
  • Supports wordlist modifiers for mutating the words in a wordlist on the fly.
  • Also provides a wordlist command.
  • Fast-ish

Examples

Reading

Open a wordlist for reading:

wordlist = Wordlist.open("passwords.txt")

Open a compressed wordlist for reading:

wordlist = Wordlist.open("rockyou.txt.gz")

Enumerate through a wordlist:

wordlist.each do |word|
  puts word
end

Create an in-memory list of literal words:

words = Wordlist::Words["foo", "bar", "baz"]

List Operations

Concat two wordlists together:

(wordlist1 + wordlist2).each do |word|
  puts word
end

Union two wordlists together:

(wordlist1 | wordlist2).each do |word|
  puts word
end

Subtract one wordlist from the other:

(wordlist1 - wordlist2).each do |word|
  puts word
end

Combine every word from wordlist1 with the words from wordlist2:

(wordlist1 * wordlist2).each do |word|
  puts word
end

Combine the wordlist with itself multiple times:

(wordlist ** 3).each do |word|
  puts word
end

Filter out duplicates from multiple wordlists:

(wordlist1 + wordlist2 + wordlist3).uniq.each do |word|
  puts word
end
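
To make the difference between these operators concrete, here is a minimal plain-Ruby sketch that uses ordinary Arrays instead of wordlists. It assumes the wordlist operators behave like their Array counterparts (and that product and power join the combined words into one string); it is an illustration of the semantics, not the gem's implementation.

```ruby
# Plain Arrays standing in for two small wordlists (illustrative data only).
list1 = %w[foo bar]
list2 = %w[bar baz]

concat   = list1 + list2                     # keeps the duplicate "bar"
union    = list1 | list2                     # drops the duplicate "bar"
subtract = list1 - list2                     # words only in list1
product  = list1.product(list2).map(&:join)  # every word of list1 + every word of list2
power    = list1.product(list1, list1).map(&:join)  # comparable to (list ** 3)

p concat    # ["foo", "bar", "bar", "baz"]
p union     # ["foo", "bar", "baz"]
p subtract  # ["foo"]
p product   # ["foobar", "foobaz", "barbar", "barbaz"]
p power.length  # 8
```

Note that the product of two lists grows multiplicatively and a power grows exponentially, so both are best consumed lazily with each rather than collected into memory.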

String Manipulation

Convert every word in a wordlist to lowercase:

wordlist.downcase.each do |word|
  puts word
end

Convert every word in a wordlist to UPPERCASE:

wordlist.upcase.each do |word|
  puts word
end

Capitalize every word in a wordlist:

wordlist.capitalize.each do |word|
  puts word
end

Run String#tr on every word in a wordlist:

wordlist.tr('_','-').each do |word|
  puts word
end

Run String#sub on every word in a wordlist:

wordlist.sub("fish","phish").each do |word|
  puts word
end

Run String#gsub on every word in a wordlist:

wordlist.gsub(/\d+/,"").each do |word|
  puts word
end

Perform every possible mutation of each word in a wordlist:

wordlist.mutate(/[oae]/, {'o' => '0', 'a' => '@', 'e' => '3'}).each do |word|
  puts word
end
# dog
# d0g
# firefox
# fir3fox
# firef0x
# fir3f0x
# ...
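
One way to picture what "every possible mutation" means is to treat each matching character as a switch that is either replaced or left alone. The sketch below is a hypothetical helper (each_mutation is not part of the gem) written in plain Ruby, and it assumes the pattern matches single characters that all appear as keys in the substitution Hash.

```ruby
# Hypothetical helper, not the gem's code: yields every combination of
# applying or skipping the substitution at each matching character.
def each_mutation(word, pattern, substitutions)
  return enum_for(__method__, word, pattern, substitutions) unless block_given?

  # Positions of the characters that match the pattern.
  indexes = (0...word.length).select { |i| word[i].match?(pattern) }

  # Each bit of the mask decides whether one matching position is replaced.
  (0...(1 << indexes.length)).each do |mask|
    chars = word.chars
    indexes.each_with_index do |index, bit|
      chars[index] = substitutions.fetch(chars[index]) if mask[bit] == 1
    end
    yield chars.join
  end
end

subs = {'o' => '0', 'a' => '@', 'e' => '3'}
each_mutation("firefox", /[oae]/, subs) { |word| puts word }
# firefox
# fir3fox
# firef0x
# fir3f0x
```

A word with n matching characters produces 2**n mutations under this scheme, including the unmodified word itself.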

Enumerate over every possible case variation of every word in a wordlist:

wordlist.mutate_case.each do |word|
  puts word
end
# cat
# Cat
# cAt
# caT
# CAt
# CaT
# cAT
# CAT
# ...
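
The number of case variations doubles with every letter in the word, which is worth keeping in mind for long words. The following plain-Ruby sketch shows the idea with a hypothetical helper (each_case_variation is an illustrative name, not the gem's API); the order in which it yields variations may differ from the order shown above.

```ruby
# Hypothetical helper, not the gem's code: yields every upper/lower case
# combination of the letters in a word. Non-letters are left untouched.
def each_case_variation(word)
  return enum_for(__method__, word) unless block_given?

  # Positions of the alphabetic characters.
  letters = (0...word.length).select { |i| word[i].match?(/[[:alpha:]]/) }

  # Each bit of the mask decides whether one letter has its case swapped.
  (0...(1 << letters.length)).each do |mask|
    chars = word.chars
    letters.each_with_index do |index, bit|
      chars[index] = chars[index].swapcase if mask[bit] == 1
    end
    yield chars.join
  end
end

puts each_case_variation("cat").count       # 8
puts each_case_variation("password").count  # 256
```

Because a word with n letters has 2**n variations, this modifier produces far more output per input word than the other string modifiers.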

Building a Wordlist

Wordlist::Builder.open('path/to/file.txt.gz') do |builder|
  # ...
end

Add individual words:

builder.add(word)

Adding an Array of words:

builder.append(words)

Parsing text:

builder.parse(text)

Parsing a file's content:

builder.parse_file(path)
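
To give a rough idea of what adding, appending, and parsing involve, here is a heavily simplified stand-in written in plain Ruby. TinyBuilder is a hypothetical class, not the gem's Builder: it assumes that a builder lowercases parsed words, skips stop words, and writes each distinct word only once, and its one-line regular expression is far less capable than the lexer described under Features.

```ruby
require 'set'
require 'stringio'

# Hypothetical, simplified stand-in for a wordlist builder.
class TinyBuilder
  def initialize(io, stop_words: [])
    @io         = io
    @seen       = Set.new
    @stop_words = stop_words.to_set
  end

  # Writes a single word, unless it is a stop word or was already written.
  def add(word)
    return if @stop_words.include?(word)

    # Set#add? returns nil when the word is already in the set.
    @io.puts(word) if @seen.add?(word)
  end

  # Adds every word from an Array (or any Enumerable).
  def append(words)
    words.each { |word| add(word) }
  end

  # Splits text into lowercase words, keeping inner apostrophes.
  def parse(text)
    text.scan(/[[:alpha:]]+(?:'[[:alpha:]]+)?/) { |token| add(token.downcase) }
  end
end

out     = StringIO.new
builder = TinyBuilder.new(out, stop_words: %w[the])
builder.parse("The quick fox. The lazy dog's fox.")
builder.append(%w[fox wolf])
puts out.string
# quick
# fox
# lazy
# dog's
# wolf
```

Tracking seen words in a Set keeps the output free of duplicates at the cost of holding every distinct word in memory.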

Requirements

  • ruby >= 3.0.0
  • zcat/gzip (for reading/writing .gz wordlists)
  • bzcat/bzip2 (for reading/writing .bz2 wordlists)
  • xzcat/xz (for reading/writing .xz wordlists)
  • unzip/zip (for reading/writing .zip wordlists)
  • 7za (for reading/writing .7z wordlists)

Install

$ gem install wordlist

gemspec

gem.add_dependency 'wordlist', '~> 1.0'

Gemfile

gem 'wordlist', '~> 1.0'

Synopsis

usage: wordlist { [options] WORDLIST ... | --build WORDLIST [FILE ...] }

Wordlist Reading Options:
    -f {txt|gzip|bz2|xz|zip|7zip},   Sets the desired wordlist format
        --format
        --exec COMMAND               Runs the command with each word from the wordlist.
                                     The string "{}" will be replaced with each word.

Wordlist Operations:
    -U, --union WORDLIST             Unions the wordlist with the other WORDLIST
    -I, --intersect WORDLIST         Intersects the wordlist with the other WORDLIST
    -S, --subtract WORDLIST          Subtracts the words from the WORDLIST
    -p, --product WORDLIST           Combines every word with the other words from WORDLIST
    -P, --power NUM                  Combines every word with itself NUM times
    -u, --unique                     Filters out duplicate words

Wordlist Modifiers:
    -C, --capitalize                 Capitalize each word
        --uppercase, --upcase        Converts each word to UPPERCASE
        --lowercase, --downcase      Converts each word to lowercase
    -t, --tr CHARS:REPLACE           Translates the characters of each word
    -s, --sub PATTERN:SUB            Replaces PATTERN with SUB in each word
    -g, --gsub PATTERN:SUB           Replaces all PATTERNs with SUB in each word
    -m, --mutate PATTERN:SUB         Performs every possible substitution on each word
    -M, --mutate-case                Switches the case of each letter in each word

Wordlist Building Options:
    -b, --build WORDLIST             Builds a wordlist
    -a, --[no-]append                Appends to the new wordlist instead of overwriting it
    -L, --lang LANG                  The language to expect
        --stop-words WORDS...        Ignores the stop words
        --ignore-words WORDS...      Ignore the words
        --[no-]digits                Allow digits in the middle of words
        --special-chars CHARS        Allows the given special characters inside of words
        --[no-]numbers               Parses whole numbers in addition to words
        --[no-]acronyms              Parses acronyms in addition to words
        --[no-]normalize-case        Converts all words to lowercase
        --[no-]normalize-apostrophes Removes "'s" from words
        --[no-]normalize-acronyms    Removes the dots from acronyms

General Options:
    -V, --version                    Print the version
    -h, --help                       Print the help output

Examples:
    wordlist rockyou.txt.gz
    wordlist passwords_short.txt passwords_long.txt
    wordlist sport_teams.txt -p beers.txt -p digits.txt
    cat *.txt | wordlist --build custom.txt

Reading a wordlist:

$ wordlist rockyou.txt.gz

Reading multiple wordlists:

$ wordlist sport_teams.txt beers.txt

Combining every word from one wordlist with another:

$ wordlist sport_teams.txt -p beers.txt -p all_four_digits.txt
coors0000
coors0001
coors0002
coors0003
...

Combining every word from one wordlist with itself, N times:

$ wordlist words.txt -P 3

Mutating every word in a wordlist:

$ wordlist passwords.txt -m o:0 -m e:3 -m a:@
dog
d0g
firefox
fir3fox
firef0x
fir3f0x
...

Executing a command on each word in the wordlist:

$ wordlist directories.txt --exec "curl -X POST -F 'user=joe&password={}' -o /dev/null -w '%{http_code} {}' https://$TARGET/login"

Building a wordlist from a directory of .txt files:

$ wordlist --build wordlist.txt dir/*.txt

Building a wordlist from STDIN:

$ cat *.txt | wordlist --build wordlist.txt

Benchmarks

                                               user     system      total        real
Wordlist::Builder#parse_text (size=5.4M)   1.943605   0.003809   1.947414 (  1.955960)
Wordlist::File#each (N=1000)               0.000544   0.000000   0.000544 (  0.000559)
Wordlist::File#concat (N=1000)             0.001143   0.000000   0.001143 (  0.001153)
Wordlist::File#subtract (N=1000)           0.001360   0.000000   0.001360 (  0.001375)
Wordlist::File#product (N=1000)            0.536518   0.005959   0.542477 (  0.545536)
Wordlist::File#power (N=1000)              0.000015   0.000001   0.000016 (  0.000014)
Wordlist::File#intersect (N=1000)          0.001389   0.000000   0.001389 (  0.001407)
Wordlist::File#union (N=1000)              0.001310   0.000000   0.001310 (  0.001317)
Wordlist::File#uniq (N=1000)               0.000941   0.000000   0.000941 (  0.000948)
Wordlist::File#tr (N=1000)                 0.000725   0.000000   0.000725 (  0.000736)
Wordlist::File#sub (N=1000)                0.000863   0.000000   0.000863 (  0.000870)
Wordlist::File#gsub (N=1000)               0.001240   0.000000   0.001240 (  0.001249)
Wordlist::File#capitalize (N=1000)         0.000821   0.000000   0.000821 (  0.000828)
Wordlist::File#upcase (N=1000)             0.000760   0.000000   0.000760 (  0.000769)
Wordlist::File#downcase (N=1000)           0.000544   0.000001   0.000545 (  0.000545)
Wordlist::File#mutate (N=1000)             0.004656   0.000000   0.004656 (  0.004692)
Wordlist::File#mutate_case (N=1000)       24.178521   0.000000  24.178521 ( 24.294962)

License

Copyright (c) 2009-2023 Hal Brodigan

See LICENSE for details.