mShubham18/GitHub-Top30-Repositories-Data-Scrapping


GitHub Topics Scraper 🧑‍💻

A Python web scraping project that collects topic titles, descriptions, and URLs from GitHub's topics page and organizes them into a Pandas DataFrame or a .csv file.

Features

  • Scrapes information on the top 30 topics from GitHub.
  • Cleans and structures the data for easy analysis.
  • Outputs the data as a Pandas DataFrame and a .csv file.
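
The scraping and structuring steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names and the CSS selectors (`div.topic-card`, `p.topic-title`, `p.topic-description`) are placeholders invented for the example and do not reflect GitHub's real markup, which has to be inspected in the browser and may change over time.

```python
from urllib.parse import urljoin

import pandas as pd
import requests
from bs4 import BeautifulSoup

BASE_URL = "https://github.com"
TOPICS_URL = "https://github.com/topics"


def fetch_html(url: str = TOPICS_URL) -> str:
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def parse_topics(
    html: str,
    card_sel: str = "div.topic-card",          # hypothetical selector
    title_sel: str = "p.topic-title",          # hypothetical selector
    desc_sel: str = "p.topic-description",     # hypothetical selector
) -> pd.DataFrame:
    """Extract title, description, and URL for each topic card in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select(card_sel):
        title = card.select_one(title_sel)
        desc = card.select_one(desc_sel)
        link = card.select_one("a[href]")
        if not (title and desc and link):
            continue  # skip cards that are missing any of the three fields
        rows.append(
            {
                "title": title.get_text(strip=True),
                "description": desc.get_text(strip=True),
                # Topic links are assumed to be relative, so make them absolute.
                "url": urljoin(BASE_URL, link["href"]),
            }
        )
    return pd.DataFrame(rows, columns=["title", "description", "url"])


# Small made-up HTML snippet so the parser can be tried without network access.
SAMPLE_HTML = """
<div class="topic-card">
  <a href="/topics/3d">
    <p class="topic-title">3D</p>
    <p class="topic-description">  3D modeling and rendering.  </p>
  </a>
</div>
<div class="topic-card">
  <p class="topic-title">Incomplete card without link or description</p>
</div>
"""

if __name__ == "__main__":
    print(parse_topics(SAMPLE_HTML))
```

To run this against the live page, pass `fetch_html()` to `parse_topics` together with selectors that match the current page structure.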

Technologies Used

  • Python
  • Requests
  • os
  • BeautifulSoup
  • Pandas

Installation

Clone the repo:

git clone https://github.com/mShubham18/DataScience-Projects-GitHub-Top30-Repositories-Data-Scrapping.git
cd DataScience-Projects-GitHub-Top30-Repositories-Data-Scrapping

Notes

  • The Data folder contains all the output CSV files.
  • Notebook.ipynb is the Jupyter notebook containing the entire project source code.
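
As an illustration of how a DataFrame might end up as a CSV in the Data folder, here is a short sketch using `os` and Pandas. The helper name `save_topics` and the file name `topics.csv` are assumptions made for the example; the notebook may name its outputs differently.

```python
import os

import pandas as pd


def save_topics(df: pd.DataFrame, folder: str = "Data",
                filename: str = "topics.csv") -> str:
    """Write the DataFrame to <folder>/<filename> and return the path."""
    # Create the output folder if it does not exist yet.
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, filename)
    # index=False keeps the row index out of the CSV file.
    df.to_csv(path, index=False)
    return path


if __name__ == "__main__":
    # Made-up example row, only to show the call.
    example = pd.DataFrame(
        [{"title": "3D", "description": "3D modeling and rendering.",
          "url": "https://github.com/topics/3d"}]
    )
    print(save_topics(example))
```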

Keep Coding ;)

About

This repository contains a GitHub web scraping project written in Python.
