
DeduDLB - Deduplication of Names and Co-Authoring Networks in the DBLP


marianaossilva/DSW2017


DSW 2017 - DATASET SHOWCASE WORKSHOP

DeduDLB - Deduplication of Names and Co-Authoring Networks in the DBLP

Dataset Information

This repository stores a dataset collected from the DBLP computer science bibliography, an online reference for bibliographic information on major computer science publications. The dataset contains approximately 15 million records collected in September 2016.

From this dataset, two sub-datasets were created:

  • The first contains the original data collected from DBLP after name deduplication.
  • The second contains three co-authorship social networks built with the snowball sampling technique.
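For readers unfamiliar with snowball sampling, the idea can be sketched as a breadth-first expansion from a few seed authors: each wave adds the co-authors of everyone sampled so far. This is a minimal illustration only, not the procedure used to build the networks in this repository; the function name `snowball_sample`, the adjacency-dictionary layout, and the toy author labels are all assumptions made for the example.

```python
from collections import deque


def snowball_sample(adjacency, seeds, waves):
    """Return the set of nodes reachable from `seeds` within `waves` hops.

    `adjacency` maps each author to the set of their co-authors
    (hypothetical layout, chosen for this sketch).
    """
    sampled = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == waves:
            # Nodes found in the last wave are kept but not expanded further.
            continue
        for neighbor in adjacency.get(node, ()):
            if neighbor not in sampled:
                sampled.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return sampled


# Toy co-authorship graph with hypothetical author labels.
coauthors = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A"},
    "D": {"B", "E"},
    "E": {"D"},
}

one_wave = snowball_sample(coauthors, ["A"], 1)   # seed plus direct co-authors
two_waves = snowball_sample(coauthors, ["A"], 2)  # also co-authors of co-authors
```

The choice of seeds and the number of waves determine how large each sampled network becomes, so both would need to be reported alongside any network built this way.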

Dataset Statistics


| Data | # Records |
| --- | ---: |
| Publications in articles | 1,505,020 |
| Authors | 1,779,971 |
| Publications in proceedings | 31,549 |
| Publications in inproceedings | 1,861,226 |
| Relation between authors and publications | 9,707,161 |
| Total | 14,884,927 |
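An author-publication relation like the one counted above is the usual starting point for a co-authorship network: two authors are linked whenever they appear on the same publication. The sketch below shows one way to derive weighted edges from such pairs. The tuple layout `(author, publication)`, the function name `coauthor_edges`, and the sample identifiers are assumptions for illustration; the actual file format of this dataset is not described here.

```python
from collections import defaultdict
from itertools import combinations


def coauthor_edges(author_publication_pairs):
    """Build weighted co-authorship edges from (author, publication) pairs.

    Returns a dict mapping a sorted author pair to the number of
    publications the two authors share.
    """
    # Group authors by publication.
    authors_by_publication = defaultdict(set)
    for author, publication in author_publication_pairs:
        authors_by_publication[publication].add(author)

    # Every pair of authors on the same publication gets one unit of weight.
    weights = defaultdict(int)
    for authors in authors_by_publication.values():
        for first, second in combinations(sorted(authors), 2):
            weights[(first, second)] += 1
    return dict(weights)


# Hypothetical sample rows; real identifiers would come from the dataset files.
pairs = [
    ("alice", "p1"), ("bob", "p1"),
    ("alice", "p2"), ("bob", "p2"), ("carol", "p2"),
]
edges = coauthor_edges(pairs)
```

Sorting each author pair before counting keeps the graph undirected, so `("alice", "bob")` and `("bob", "alice")` are never stored as separate edges.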

Files

Source (citation)

[1] Mariana O. Silva, Michele A. Brandão. “Deduplicação de Nomes e Redes de Co-autoria na DBLP”. In: SBBD Dataset Showcase Workshop, pp. 203-211. Uberlândia, MG.

Usage

If you would like to use the datasets, please cite our paper [1].
