This project is a fully automated crawler for the Golestan website (https://golestan.iust.ac.ir/). It uses Selenium to drive the browser and solves the CAPTCHA with the CaptchaCrack repository (https://github.com/AmirH-Moosavi/CaptchaCrack). The crawler can access the 102 report on Golestan and is configured to retrieve undergraduate course information without a username, a password, or any manual intervention. The only thing you need to choose is which semesters to retrieve.
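As a rough illustration of the Selenium part, the sketch below opens the Golestan landing page in headless Chrome and returns the page title. It is not the project's code: `open_golestan` is a hypothetical helper, it assumes Selenium and a matching Chrome driver are installed, and the CAPTCHA solving and report navigation are left out.

```python
GOLESTAN_URL = "https://golestan.iust.ac.ir/"


def open_golestan(url=GOLESTAN_URL, headless=True):
    """Open the page in Chrome and return its title (hypothetical helper)."""
    # Imported here so this file still loads where Selenium is not installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    if headless:
        options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()  # always release the browser, even on errors
```

Calling `open_golestan()` should return the page title when Chrome and its driver are available; pass `headless=False` to watch the browser window while it loads.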
- Python 3.6 or higher
- A browser (preferably Chrome)
- Clone the repository to your local machine.
git clone https://github.com/KamyarMoradian/Golestan-Crawler.git
- Navigate to the project directory.
cd Golestan-Crawler
- Install the required packages.
pip install -r requirements.txt
- Set the driver version in the config.py file.
DRIVER_VERSION = "YOUR_CHROME_DRIVER_VERSION"
- Set the semesters you want to retrieve in the main.py file.
- Uncomment the commented lines you need in main.py, following the guide given above each of them.
- Run the main.py file.
python main.py
- The retrieved data will be saved in the data directory.
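The configuration steps above can be sketched as a small pre-flight check. This is only an illustration under assumptions: `check_settings` and `list_outputs` are hypothetical names that do not come from the repository, and the version string and four-digit semester codes in the usage part are made-up sample values. Only `DRIVER_VERSION`, its placeholder value, and the data directory are taken from the steps above.

```python
from pathlib import Path

# Placeholder value shown in the config.py step above.
PLACEHOLDER = "YOUR_CHROME_DRIVER_VERSION"


def check_settings(driver_version, semesters):
    """Return a list of problems found in the settings (empty if none)."""
    problems = []
    if not driver_version or driver_version == PLACEHOLDER:
        problems.append("DRIVER_VERSION is still the placeholder")
    if not semesters:
        problems.append("no semesters selected")
    return problems


def list_outputs(data_dir="data"):
    """Return the files directly under data_dir, sorted by path."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.iterdir() if p.is_file())


if __name__ == "__main__":
    # Hypothetical sample values; replace them with your own settings.
    issues = check_settings("114.0.5735.90", ["4021", "4022"])
    print(issues or "settings look complete")
    for path in list_outputs():
        print(path.name)
```

Running a check like this before `python main.py` catches a forgotten placeholder early, and `list_outputs()` gives a quick way to confirm that a run actually wrote files to the data directory.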
This project is licensed under the MIT License - see the LICENSE file for details.