TheKevinWang/EmailScraper


EmailScraper

A Scrapy script that spiders a website and scrapes all email addresses using a regex. EmailScraper outputs each email address and the URL it was found on, in JSON format. The output is written as the website is spidered and contains no duplicates.
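The regex-and-deduplicate idea can be sketched in plain Python without Scrapy. This is a minimal illustration, not the code in EmailScraper.py: the pattern `EMAIL_RE`, the helper `extract_new_emails`, and the `email`/`url` field names are all assumptions made for the example.

```python
import re

# Deliberately simple pattern (an assumption, not the regex EmailScraper.py uses).
# It will miss some valid addresses and accept some invalid ones.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_new_emails(text, url, seen):
    """Yield one record per address in `text` that is not already in `seen`."""
    for match in EMAIL_RE.findall(text):
        email = match.lower()  # treat addresses as case-insensitive for deduplication
        if email not in seen:
            seen.add(email)
            yield {"email": email, "url": url}


# One shared set across pages keeps the combined output free of duplicates.
seen = set()
page_one = list(extract_new_emails(
    "Contact bob@example.com or Ann@Example.com",
    "http://example.com/", seen))
page_two = list(extract_new_emails(
    "bob@example.com again, plus carol@example.org.",
    "http://example.com/about", seen))
```

Because records are yielded as soon as a page is processed, output can be produced while the crawl is still running. Lowercasing before the lookup is a design choice for this sketch; strictly speaking, the local part of an address may be case-sensitive.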

Requirements

Scrapy

pip install scrapy

Usage

Scrape all emails from example.com, save the output to emails.json, and print only the spider's status (not every GET request).

scrapy runspider EmailScraper.py -a url=http://example.com/ -o emails.json -L INFO
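Once the crawl has finished, emails.json can be read back with the standard library. The sketch below assumes the file holds a JSON array of objects and that each object has `email` and `url` keys; those key names are a guess for illustration, so check a real output file before relying on them. The helper `emails_by_url` is hypothetical.

```python
import json
from collections import defaultdict


def emails_by_url(path):
    """Group scraped addresses by the page they were found on."""
    with open(path, encoding="utf-8") as f:
        records = json.load(f)  # assumes a JSON array of objects
    grouped = defaultdict(list)
    for record in records:
        # "url" and "email" are assumed field names, not confirmed by the README.
        grouped[record["url"]].append(record["email"])
    return dict(grouped)
```

Calling `emails_by_url("emails.json")` would then return a dictionary mapping each URL to the list of addresses found there.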

License

MIT License
