CeWL Wordlists

CeWL is a custom wordlist generator that crawls a website and collects the words it finds, producing a target-specific list for password brute-force attacks.

Phase 1

Installation

1. Check whether CeWL is installed (Kali Linux usually ships it pre-installed)
cewl --help
2. Install CeWL on Debian/Ubuntu/Kali
sudo apt update && sudo apt install cewl
3. Install from source for the latest version (needs Ruby and Bundler; run it as ./cewl.rb afterwards)
git clone https://github.com/digininja/CeWL.git && cd CeWL && bundle install
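Before choosing an install route, it can help to test whether the binary is already on PATH. The following is a minimal sketch; the helper name have_tool is hypothetical, and the suggested install command assumes a Debian-based system.

```shell
#!/bin/sh
# have_tool NAME: succeed if NAME resolves to a command on PATH.
# (have_tool is an illustrative helper name, not part of CeWL.)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool cewl; then
    echo "cewl found at: $(command -v cewl)"
else
    # Assumes a Debian/Ubuntu/Kali system with apt available.
    echo "cewl not found; install it with: sudo apt install cewl"
fi
```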
Phase 2

Basic Wordlist Generation

1. Basic wordlist generation from the target website (words are printed to the terminal)
cewl https://example.com
2. -w = write output to a file
cewl https://example.com -w wordlist.txt
3. -d = crawl depth, i.e. how many levels of links to follow from the start page (default: 2)
cewl https://example.com -d 2
4. -m = minimum word length (here, words shorter than 6 characters are ignored; default: 3)
cewl https://example.com -m 6
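If a wordlist has already been saved, the same minimum-length rule can be applied afterwards without crawling again. This is a rough sketch using awk rather than CeWL itself; the function name filter_min_len and the file names in the usage comment are assumptions for illustration.

```shell
#!/bin/sh
# filter_min_len MIN FILE: print the lines of FILE that are at least
# MIN characters long (length is reliable for plain ASCII words).
# (filter_min_len is an illustrative helper, not a CeWL feature.)
filter_min_len() {
    awk -v min="$1" 'length($0) >= min' "$2"
}

# Usage (hypothetical file names):
#   filter_min_len 6 wordlist.txt > wordlist_min6.txt
```

Filtering after the fact keeps the original crawl output intact, so different length thresholds can be tried against the same file.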
Phase 3

Advanced Options

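1. Deep crawl (depth 3) with a minimum word length of 5, saved to final.txt
cewl https://example.com -d 3 -m 5 -w final.txt
2. --email (-e) = also collect email addresses found on the site
cewl https://example.com --email
3. --meta (-a) = include metadata extracted from documents found on the site (for example author names in Office or PDF files)
cewl https://example.com --meta
4. SSL certificate errors: CeWL has no documented --no-check-certificate flag (that is a wget option); check which options your version supports
cewl --help
5. Reduce noise with a shallower crawl and a higher minimum word length (CeWL has no documented per-page link limit)
cewl https://example.com -d 1 -m 6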
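Another way to cut noise is to keep only the most frequent words. CeWL's -c (--count) option, which is not covered in the list above, adds a count to each word. The sketch below assumes that output has the form "word, count" on each line, so verify this against your version first. The function name top_words and the file counts.txt are hypothetical.

```shell
#!/bin/sh
# top_words N FILE: print the N most frequent words from FILE.
# Assumption: FILE holds lines of the form "word, count", as produced by
#   cewl https://example.com -c -w counts.txt
# (top_words is an illustrative helper, not a CeWL feature.)
top_words() {
    awk -F', ' 'NF == 2 { print $2 "\t" $1 }' "$2" |  # put the count first
        sort -k1,1nr |                                # highest count first
        head -n "$1" |                                # keep the top N
        cut -f2                                       # drop the count
}

# Usage (hypothetical file name):
#   top_words 100 counts.txt > top100.txt
```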
Phase 4

CTF & Bug Bounty Workflow

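1. Step 1: Generate a targeted wordlist from the target
cewl http://target.com -d 2 -m 5 -w cewl.txt
2. Step 2: Review the extracted words
cat cewl.txt
3. Step 3: Combine with a generic wordlist such as rockyou.txt
cat cewl.txt rockyou.txt > final_wordlist.txt
4. Step 4: Use the combined list with Hydra to brute-force a login form (the target host goes before the module name; adjust the form fields and failure string to match the real login page)
hydra -l admin -P final_wordlist.txt target.com http-post-form "/login:username=^USER^&password=^PASS^:F=invalid"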
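Plain cat in Step 3 keeps duplicates, so a word that appears in both lists is tried twice. A possible refinement is to merge with awk, which drops repeats while keeping the targeted words at the top of the file. This is a sketch; merge_wordlists is a hypothetical helper name.

```shell
#!/bin/sh
# merge_wordlists FILE...: concatenate wordlists in the order given,
# skipping blank lines and any entry that has already been printed.
# The first occurrence wins, so listing cewl.txt first keeps the
# targeted words ahead of the generic ones.
# (merge_wordlists is an illustrative helper name.)
merge_wordlists() {
    awk 'NF && !seen[$0]++' "$@"
}

# Usage, with the file names from the steps above:
#   merge_wordlists cewl.txt rockyou.txt > final_wordlist.txt
```

Keeping the targeted words first matters because the attack tool tries candidates in file order. Note that awk holds every unique line in memory, which is noticeable with a list the size of rockyou.txt.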
Phase 5

Example Output

1. Sample words extracted from a corporate website
admin dashboard password secure user account login settings profile admin-panel
2. Generate the wordlist and view it immediately
cewl https://example.com -d 3 -m 5 -w words.txt && cat words.txt
Phase 6

When CeWL is Powerful

1. Corporate websites with structured content: yields company-specific terms, product names, department names
2. Web applications with repeated keywords: surfaces recurring terms used in the application
3. Admin dashboards and panels: yields navigation terms, menu items, common paths
4. E-commerce sites: yields product names, categories, brand-specific terms
5. Platforms with structured content: CMS, forums, and documentation sites with rich text
Phase 7

Why CeWL Over Generic Lists

1. Problem with generic lists: too many words unrelated to the specific target
Generic: rockyou.txt = 14M+ passwords, most of them irrelevant to any one target
2. Advantage of CeWL: high relevance, focused on the site's actual content
CeWL: often only 50-500 targeted words, all taken from the actual target site
3. Strategy: combine targeted and generic lists for maximum coverage
cewl + rockyou = best of both worlds
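To see what the targeted list actually contributes, one option is to list the CeWL words that the generic list does not already contain. The sketch below does this with awk; the function name only_in_targeted is hypothetical, and it assumes the generic file is not empty.

```shell
#!/bin/sh
# only_in_targeted GENERIC TARGETED: print the lines of TARGETED that do
# not appear anywhere in GENERIC (exact, case-sensitive match).
# Assumption: GENERIC is non-empty, since the NR == FNR test is what
# tells the first file apart from the second.
# (only_in_targeted is an illustrative helper name.)
only_in_targeted() {
    awk 'NR == FNR { generic[$0] = 1; next } !($0 in generic)' "$1" "$2"
}

# Usage, with the file names used earlier:
#   only_in_targeted rockyou.txt cewl.txt | wc -l
```

A high count suggests the crawl found many site-specific terms that a generic list alone would miss.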