Cewl wordlist generator: Complete guide

Hello, aspiring ethical hackers. This blogpost is a complete guide to the Cewl wordlist generator. Cewl stands for Custom Word List generator. It is written in Ruby. Unlike crunch, which takes input from us to generate a wordlist, this tool spiders a given website to the depth we specify and then creates a dictionary or wordlist from it. This wordlist can then be used for password cracking with tools like Brutus, Hydra, Medusa and John The Ripper. It was designed by Robin Wood based on a discussion on PaulDotCom about creating wordlists by spidering a target website.

Cewl can be downloaded from here. Kali Linux has cewl installed by default, and that is what we are using for this tutorial. Metasploitable 2 will be used as the target since it has a lot of applications installed that can be used for testing.

Cewl 1

To use cewl, simply supply the URL you want to spider, as shown below.

Cewl 2

Then it will create the wordlist for you as shown below.

Cewl 3
Cewl 4

Keeping the downloaded file (-k)

By specifying the ‘-k’ option, the downloaded files can be kept.

Cewl 5

Depth of spidering (-d)

When you specify a target to Cewl, it spiders to a depth of 2 by default. You can change it using this option. For example, let’s set the depth to 1.

Cewl 6

Or maybe 3.

Cewl 7
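To picture what the depth setting controls, here is a minimal Python sketch of depth-limited spidering over a made-up, in-memory link graph. It is only an illustration and not cewl's own code: the SITE dictionary and the spider function are hypothetical, and the sketch assumes the start page counts as depth 0.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
SITE = {
    "/": ["/about", "/blog"],
    "/about": ["/team"],
    "/blog": ["/blog/post1"],
    "/team": [],
    "/blog/post1": ["/blog/post1/comments"],
    "/blog/post1/comments": [],
}


def spider(start, max_depth):
    """Return pages reachable from start within max_depth link hops (breadth-first)."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached: do not follow this page's links
        for link in SITE.get(page, []):
            if link not in seen:
                seen.add(link)
                order.append(link)
                queue.append((link, depth + 1))
    return order


print(spider("/", 1))
print(spider("/", 2))
```

A larger depth visits more pages, so it yields more words but also takes longer and makes more requests to the target.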

Minimum word length (-m)

This option can be used to set the minimum length of the words cewl collects. For example, nowadays, most services set the minimum length of the password to at least 5. So there’s no use creating passwords shorter than the minimum length your target has set. If you have noticed the above results, the default minimum length for cewl is 3. Let’s set it to 5 for example.

Cewl 8
Cewl 9

As you can see in the above image, there are no longer passwords of length less than 5.
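The effect of a minimum word length can be sketched in a few lines of Python. This is an assumption-laden illustration rather than cewl's implementation: the function name and the sample sentence are made up, and words are taken to be plain runs of ASCII letters.

```python
import re


def words_with_min_length(text, min_length=3):
    """Extract alphabetic words and keep those with at least min_length letters."""
    words = re.findall(r"[A-Za-z]+", text)
    return [w for w in words if len(w) >= min_length]


sample = "Log in to the admin panel with your user name"
print(words_with_min_length(sample))     # minimum of 3, like cewl's default
print(words_with_min_length(sample, 5))  # minimum of 5
```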

Spidering other sites (-o)

Setting this option allows cewl to spider other, external sites while spidering the target.

Cewl 10

Don’t scan specified paths (--exclude)

If the target website is too large, you may want to skip some paths, or maybe you just want to skip some paths because they are unlikely to contain any passwords. You can use this option to tell cewl which paths to exclude from the scan. These paths should be supplied in a file.

Cewl 11
Cewl 12

--allowed

This option is used to specify a regex pattern that a path must match to be followed. If a path doesn’t match the pattern, cewl does not follow it.

Cewl 13
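The two path filters can be combined in a small Python sketch. Treat it as a rough model only: the should_follow function is hypothetical, the paths are illustrative, and it assumes excluded paths are compared exactly while the allowed pattern is searched within the path. Cewl's own matching rules may differ.

```python
import re


def should_follow(path, excluded_paths, allowed_pattern=None):
    """Decide whether a spider should visit path.

    Assumption: excluded paths match exactly, and the allowed pattern
    may match anywhere in the path.
    """
    if path in excluded_paths:
        return False
    if allowed_pattern is not None and not re.search(allowed_pattern, path):
        return False
    return True


# Hypothetical contents of an exclude file, one path per line.
exclude_file_text = "/phpMyAdmin/\n/dav/\n"
excluded = {line.strip() for line in exclude_file_text.splitlines() if line.strip()}

for path in ["/mutillidae/index.php", "/phpMyAdmin/", "/dvwa/login.php"]:
    print(path, should_follow(path, excluded, allowed_pattern=r"^/(mutillidae|dvwa)/"))
```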

No wordlist please (-n)

If you don’t want the wordlist to be printed, you can use the “-n” option.

Cewl 17

Saving the output (-w)

Till now, you have seen that Cewl prints the wordlists it creates to the terminal. The output can be copied and pasted into another file to be saved for use as a wordlist. But why take all this effort when we can just save the output to a file using the ‘-w’ option?

Cewl 14
Cewl 15
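When cewl is run from a script, the options covered so far can be assembled into one command. The sketch below is one way to do that in Python, under stated assumptions: build_cewl_command is a hypothetical helper, and the target address and output file name are placeholders. It only builds and prints the command; nothing is executed.

```python
import shlex


def build_cewl_command(url, depth=None, min_length=None, output_file=None):
    """Build an argument list for cewl using the -d, -m and -w options."""
    args = ["cewl"]
    if depth is not None:
        args += ["-d", str(depth)]
    if min_length is not None:
        args += ["-m", str(min_length)]
    if output_file is not None:
        args += ["-w", output_file]
    args.append(url)
    return args


# Placeholder target address and output file name.
command = build_cewl_command(
    "http://192.168.1.10/", depth=1, min_length=5, output_file="wordlist.txt"
)
print(shlex.join(command))
```

On a machine with cewl installed, the returned list could be passed to subprocess.run(command, check=True).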

Use User agent (-u)

A user agent is a characteristic string that allows a web server to identify the browser and operating system we are using. You can see your user agent here. Using the ‘-u’ option, cewl allows you to send a specific user agent while spidering the website.

Cewl 16

Groups of words (-g)

Setting this option allows cewl to return groups of words along with single words. The option takes the number of words per group.

Cewl 18
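A group of words can be thought of as a run of consecutive words. The Python sketch below shows that idea. It assumes groups are formed from adjacent words joined by spaces, which may not match exactly how cewl builds them, and word_groups is a hypothetical name.

```python
def word_groups(words, group_size):
    """Return every run of group_size consecutive words, joined by spaces."""
    return [
        " ".join(words[i:i + group_size])
        for i in range(len(words) - group_size + 1)
    ]


words = ["damn", "vulnerable", "web", "application"]
print(word_groups(words, 2))
print(word_groups(words, 3))
```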

Turn all generated passwords to lowercase (--lowercase)

Setting this option allows us to turn all the passwords cewl generates into lowercase.

Cewl 19

Here is the result.

Cewl 20

Include words with numbers too (--with-numbers)

While spidering, if cewl finds any words with numbers in them, it just ignores them. But many passwords contain numbers too. By setting this option, cewl collects words containing numbers as well.

Cewl 21
Cewl 22

--convert-umlauts

Setting this option makes cewl convert common ISO-8859-1 (Latin-1) umlauts: ä to ae, ö to oe, ü to ue and ß to ss.

Cewl 23
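The conversion itself is a simple character substitution. Here is a minimal Python sketch of it, using the mapping ä to ae, ö to oe, ü to ue and ß to ss. The function and the sample words are illustrative only, and the sketch handles just the lowercase characters.

```python
# Lowercase umlauts and sharp s mapped to their ASCII spellings.
UMLAUT_MAP = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def convert_umlauts(word):
    """Replace the mapped characters; every other character is left untouched."""
    return word.translate(UMLAUT_MAP)


for word in ["Müller", "Straße", "schön"]:
    print(word, "->", convert_umlauts(word))
```

This is useful when the target site is in German but the users are likely to type their passwords on an ASCII keyboard layout.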

Include meta data too (-a)

When we set this option, Cewl includes metadata found while spidering the website apart from the words.

Cewl 24

The metadata found is included at the end of the output file as shown below.

Cewl 25

Save metadata found to a different file (--meta_file)

You can even save all the metadata found on the target site to a different file as shown below using the “--meta_file” option.

Cewl 26

In the above image, you can see that cewl, while processing metadata, creates a temporary file. This file, known as cewl-temp.pdf, is created in the /tmp directory by default. We can change the location of this file by using the “--meta-temp-dir” option as shown below.

Cewl 27

Gather emails too (-e)

Not just metadata, you can even collect email addresses using cewl with the ‘-e’ option.

Cewl 28
Cewl 29
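Collecting email addresses from page text usually comes down to pattern matching. The Python sketch below uses a deliberately simple regular expression to show the idea. It is not the pattern cewl uses, and the find_emails function and sample text are made up.

```python
import re

# A simple, permissive pattern for email-like strings (illustrative only).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text):
    """Return unique email-like strings in order of first appearance."""
    found = []
    for match in EMAIL_PATTERN.findall(text):
        if match not in found:
            found.append(match)
    return found


page = "Contact admin@example.com or support@example.com. Write to admin@example.com today."
print(find_emails(page))
```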

Save collected emails in another file (--email_file)

These collected emails can also be saved in another file using the “--email_file” option.

Cewl 30
Cewl 31

Show the count for each word (-c)

Setting this option shows how many times each word was found.

Cewl 32
Cewl 33
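Word counting can be sketched with Python's collections.Counter. This illustrates the idea rather than cewl's exact behaviour: the count_words function, the sample text and the "word, count" print format are assumptions.

```python
import re
from collections import Counter


def count_words(text, min_length=3):
    """Count alphabetic words of at least min_length letters, most frequent first."""
    words = [w for w in re.findall(r"[A-Za-z]+", text) if len(w) >= min_length]
    return Counter(words).most_common()


text = "admin login admin password login admin"
for word, count in count_words(text):
    print(f"{word}, {count}")
```

Words that appear often on the target site are good candidates to try first in a password attack.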

Verbose mode (-v)

Just like every other tool, this tool too has a verbose mode that gives detailed information.

Cewl 34
Cewl 35

Debug mode (--debug)

Similarly, the “--debug” option gives detailed debugging information too.

Cewl 36
Cewl 37

Authentication

Some sites require authentication before they can be spidered. Cewl supports authenticating to websites too.

Cewl 38

It supports both basic authentication and digest authentication. The “--auth_type” option is used to set the type of authentication. The “--auth_user” and “--auth_pass” options can be used to set the username and password respectively.

Cewl 39

Here’s the output.

Cewl 40
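For scripted runs, the three authentication options can be put together as in the Python sketch below. The build_auth_command helper is hypothetical, and the URL, username and password are placeholders. The sketch only builds the argument list and does not run cewl.

```python
def build_auth_command(url, auth_type, user, password):
    """Build a cewl argument list that authenticates to the target site."""
    if auth_type not in ("basic", "digest"):
        raise ValueError("auth_type must be 'basic' or 'digest'")
    return [
        "cewl",
        "--auth_type", auth_type,
        "--auth_user", user,
        "--auth_pass", password,
        url,
    ]


# Placeholder credentials and target address.
command = build_auth_command("http://192.168.1.10/private/", "basic", "admin", "secret")
print(" ".join(command))
```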

Spidering through proxy

Spidering, no matter what the purpose, is very noisy. Cewl has an option to spider through a proxy using the “--proxy_host” option.

Cewl 41

You can even specify the port of the proxy using the “--proxy_port” option.

Cewl 42

Cewl even has a feature for authenticating to the proxy server using the “--proxy_username” and “--proxy_password” options.

Cewl 43
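The proxy options can be combined in the same way. The Python sketch below is a hypothetical helper that builds the argument list, with placeholder values for the proxy address, port and credentials. It does not run cewl.

```python
def build_proxy_command(url, host, port, username=None, password=None):
    """Build a cewl argument list that spiders through a proxy."""
    args = ["cewl", "--proxy_host", host, "--proxy_port", str(port)]
    # Proxy credentials are optional and only added when supplied.
    if username is not None:
        args += ["--proxy_username", username]
    if password is not None:
        args += ["--proxy_password", password]
    args.append(url)
    return args


# Placeholder proxy address, port and credentials.
command = build_proxy_command(
    "http://192.168.1.10/", "127.0.0.1", 8080, username="proxyuser", password="proxypass"
)
print(" ".join(command))
```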