Merge pull request #5 from darkrho/readme-update
Removed reference to ``crawl URL`` command as it's no longer supported.
pablohoffman committed Oct 21, 2013
2 parents 0edb745 + 4bbc16b commit 0d875f1
Showing 1 changed file with 1 addition and 9 deletions.
10 changes: 1 addition & 9 deletions README.rst
@@ -36,15 +36,7 @@ default (defined in the ``start_pages`` attribute). These pages are:
* http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/

So, if you run the spider regularly (with ``scrapy crawl dmoz``) it will scrape
-only those two pages. However, you can scrape any dmoz.org page by passing the
-url instead of the spider name. Scrapy internally resolves the spider to use by
-looking at the allowed domains of each spider.
-
-For example, to scrape a different URL use::
-
-    scrapy crawl http://www.dmoz.org/Computers/Programming/Languages/Erlang/
-
-You can scrape any URL from dmoz.org using this spider
+only those two pages.

.. _Scrapy tutorial: http://doc.scrapy.org/intro/tutorial.html
