2012-01-19 00:01:12 by chort
A coworker once told me he imagined immigration officials handing Chinese immigrants two bags with slips of paper, asking them to pick a paper from each bag and put them together to form the name of their restaurant. This is how he imagined names like "Green Dragon," or "Golden Lotus," or "China Garden" got created. While it might not be a very accurate way to describe culinary establishment marketing, it is similar to how many users choose passwords. I'm calling this method the "Chinese Take-out Attack."
Due to password complexity policies that require at least one of (fill in the blank), many users choose passwords by adding a pattern to the end (or less frequently, the beginning) of a word. A typical thought process seems to be:
I think about Carlos a lot, so I'll start my password with 'carlos.'
Now I need a number, so I'll choose the year we met (2006), but that's long to type so I'll shorten it to '06.'
I like him a lot, so I'll add an exclamation!
My password will be 'carlos06!'
You might call this 3 sections, but the entropy of such a suffix is low enough that 2-5 character password suffixes collide heavily, so I prefer to think of it as two parts: the base word and the padding. The same holds for prefixes: people either reverse the order (padding, then base word) or choose from a predictably small pool of base words (names of people, places, teams, schools, years, zip codes, keyboard-walking patterns, etc). This is essentially the same process as picking two halves of a name from different bags and connecting them.
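The two-bag process can be sketched with a toy generator. The bases.txt and pads.txt files below are hypothetical, tiny illustrative "bags" -- real pools are much larger, but the mechanics are the same:

```shell
# Hypothetical bags: a handful of base words and a handful of padding patterns.
printf '%s\n' carlos maria tigers > bases.txt
printf '%s\n' '06!' 123 2006 > pads.txt

# Emit every base+padding combination -- the "pick one from each bag" step.
while read -r b; do
  while read -r p; do
    printf '%s%s\n' "$b" "$p"
  done < pads.txt
done < bases.txt
```

Three bases and three paddings already yield nine candidates, 'carlos06!' among them; real pools are larger, but still tiny compared to a true random keyspace.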
You might think a typical list of cracked or leaked passwords would already contain a lot of these "Chinese Take-out" passwords, especially after rules have been applied. Remember, though, that there are a lot of ways to connect even a small number of patterns, so it's unlikely that even a majority of the common variations will show up in password lists. Rules are much better at finding character substitutions and other minor variations than they are at finding words swapped to different positions in a string.
I'd been analyzing a list of hashes recently and had reached the point where every attack feasible with my hardware was returning meager results. I had run through all the high-yield attacks, such as fully brute-forcing short passwords, using wordlists + rules, using wordlists + masks, etc. I was down to trying different full-mask attacks (a sub-set of brute-forcing) and using wordlists with thousands of randomly generated rules. While I was finding passwords, it was at the rate of several per hour. Clearly I was not going to find the last 100,000 passwords at this rate. Enter cutb.
The 0.6 release of hashcat-utils will include a new program called cutb. It's essentially a small binary that performs the common string-library operations of returning the first n characters, the last n characters, or some other sub-section of a string. When you recall that password-cracking wordlists are giant files with millions and millions of strings, you may get an idea of how this would be useful. To spell it out: you can lop off the prefix and/or suffix from every entry in a password list, creating new files of specific-length prefixes and suffixes. You can then splice those sub-strings together (pick one from each bag) to create new wordlists, or simply use two lists simultaneously with something like the oclHashcat-plus combinator attack. The combinator attack fuses two strings together on the fly, repeating the process until all combinations of the two wordlists are exhausted.
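If you don't have cutb handy, its first-n/last-n behavior can be approximated with standard tools. This is only a sketch of the same idea, not the actual cutb code, and words.txt is a stand-in for a real wordlist:

```shell
printf '%s\n' 'carlos06!' '1VANhalen0' > words.txt

# First 4 characters of every entry (roughly what cutb.bin 0 4 does).
cut -c1-4 words.txt

# Last 4 characters of every entry (roughly what cutb.bin -4 does).
rev words.txt | cut -c1-4 | rev
```

The rev trick works because "last 4 characters" is just "first 4 characters of the reversed string," reversed back.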
Now you may understand this at a conceptual level, but if you're anything like me you'll need to visualize it to really understand the power. I'll show some real examples of plaintext passwords I recovered using Chinese Take-out Attacks on hashlists that had resisted all previous attacks.
a26armorer = a26a + rmorer
1VANhalen0 = 1VAN + halen0
107Travel. = 107T + ravel.
Dubai007.  = Dubai + 007.
g0tsm0k3d  = g0tsm + 0k3d
Some of these were probably obvious, but some are not. Who would have thought a password starting with '107T' would contain the word 'Travel' as the base word? Moreover, this is not a password likely to be found even by tens of thousands of custom and randomly generated rules (it survived just such attacks), because it has both prefix and suffix padding.
In case it's not obvious, here's how I set up the attack:
chort@hydra:~/fun/tmp$ ./cutb.bin 0 3 < combine.txt | sort -u > 3-first.txt
chort@hydra:~/fun/tmp$ ./cutb.bin 0 4 < combine.txt | sort -u > 4-first.txt
chort@hydra:~/fun/tmp$ ./cutb.bin 0 5 < combine.txt | sort -u > 5-first.txt
chort@hydra:~/fun/tmp$ ./cutb.bin 0 6 < combine.txt | sort -u > 6-first.txt
chort@hydra:~/fun/tmp$ ./cutb.bin -3 < combine.txt | sort -u > 3-last.txt
chort@hydra:~/fun/tmp$ ./cutb.bin -4 < combine.txt | sort -u > 4-last.txt
chort@hydra:~/fun/tmp$ ./cutb.bin -5 < combine.txt | sort -u > 5-last.txt
chort@hydra:~/fun/tmp$ ./cutb.bin -6 < combine.txt | sort -u > 6-last.txt
This gave me all the unique 3-6 character prefix & suffix strings from combine.txt (my main wordlist). To see how bad human entropy really is, compare the number of lines in combine.txt to the number of unique prefixes and suffixes.
chort@hydra:~/fun/tmp$ wc -l combine.txt
76485167 combine.txt      # That's 76.5 MILLION entries, mostly recovered passwords
chort@hydra:~/fun/tmp$ for i in ?-first.txt ; do echo $i ; wc -l $i ; done
3-first.txt
297231 3-first.txt        # That's 300,000 unique, out of 857,375 possible in ASCII
4-first.txt
3188252 4-first.txt       # 3.2 million out of 81.5 million possible in ASCII
5-first.txt
10947267 5-first.txt      # 11 million out of 7.74 BILLION possible in ASCII
6-first.txt
24121023 6-first.txt      # 24 million out of 735.09 BILLION possible in ASCII
chort@hydra:~/fun/tmp$ for i in ?-last.txt ; do echo $i ; wc -l $i ; done
3-last.txt
314039 3-last.txt         # 314,000
4-last.txt
2989287 4-last.txt        # 3 million
5-last.txt
9395507 5-last.txt        # 9.4 million
6-last.txt
20610518 6-last.txt       # 21 million
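The "possible in ASCII" figures in those comments follow from the 95 printable ASCII characters: a length-n string has 95^n possibilities. Bash arithmetic reproduces them:

```shell
# 95 printable ASCII characters; keyspace for an n-character string is 95^n.
for n in 3 4 5 6; do
  echo "length $n: $((95 ** n)) possible strings"
done
```

That runs from 857,375 at length 3 up to 735,091,890,625 (735.09 billion) at length 6, which is why the observed prefix and suffix counts are such a vanishingly small slice of the keyspace.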
Putting it into action:
chort@hydra:~$ ./oclhp64 -m 0 -n 160 --gpu-loops=1024 -d 2 -c 128 \
  -o /fun/out/cracked.out2 -a 1 /fun/hash/hashlist2.md5 \
  /fun/tmp/6-first.txt /fun/tmp/4-last.txt
This is just barely scratching the surface of what's possible with cutb output. You can also use the resulting wordlists, one at a time, combined with a mask attack to form a hybrid attack. In essence, this runs a limited-charset brute-force as either a prefix or a suffix for existing words read from a file.
chort@hydra:~$ ./oclhp64 -m 0 -n 160 --gpu-loops=1024 -d 1 -c 128 \
  -o /fun/out/cracked.out1 -a 7 /fun/hash/hashlist1.md5 \
  -1 ?l?u?d -2 ?l?d?s ?1?2?2?2 /fun/tmp/4-last.txt
Another method I intend to explore is piping the raw output (i.e. without sort -u) through sort | uniq -c to get a frequency count of each prefix or suffix. Extremely common prefixes and suffixes could be incorporated into rules, suitable for use in multi-rule attacks. The idea behind multi-rules is very similar to combinator, but rather than combining two wordlists in every possible way, it combines two sets of rules in every possible way, and applies them to a wordlist.
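As a rough sketch of that frequency-count idea (sample.txt is a made-up stand-in for a real wordlist, and rev | cut approximates cutb's last-n extraction):

```shell
printf '%s\n' carlos2006 maria2006 tigers123 > sample.txt

# Take the last 4 characters of each entry, then count how often each suffix occurs.
rev sample.txt | cut -c1-4 | rev | sort | uniq -c | sort -rn
```

The most common suffixes bubble to the top ('2006' appears twice in this toy sample); on a real wordlist, the top entries are the ones worth turning into rules.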
So is it really worth the effort of processing your wordlists with cutb, taking the time to sort -u the results, and going through all the permutations? Absolutely! In less than a day I recovered an additional 4,000 passwords from a hashlist that had been yielding a few hundred a day at best via mask attacks and random rules. On another hashlist, from which I hadn't cracked any hashes in roughly 8 days, I got two within the first 12 hours using the Chinese Take-out Attack. I intend to exploit it to the fullest extent.
This article was only possible thanks to mountains of prior research. Here are some of my sources of inspiration:
On the evolving security of password schemes (Rootlabs)
More on the evolution of password security (Rootlabs)
Salt The Fries: Some Notes On Password Complexity (Kaminsky)
Crack Me If You Can 2010 - Team hashcat (atom)
The Password Project (arex1337)
Question Defense Tutorials (purehate_)