Datasets related to Complex Networks

A complex network is a graph (network) with features that do not occur in simple networks such as lattices or random graphs, but that often occur in graphs modelling real systems. The study of complex networks is a young and active area of scientific research, inspired largely by the empirical study of real-world networks such as computer networks and social networks.


Below is a list of datasets related to complex networks that we have come across over time.



KONECT (the Koblenz Network Collection) is a project of the Institute of Web Science and Technologies at the University of Koblenz–Landau that collects large network datasets of all types for research in network science and related fields. KONECT contains over a hundred network datasets of various types, including directed, undirected, bipartite, weighted, unweighted, signed and rating networks. The networks come from many diverse areas, such as social networks, hyperlink networks, authorship networks, physical networks, interaction networks and communication networks.


Pajek datasets

Protein-protein interaction network in budding yeast. Interaction detection methods have led to the discovery of thousands of interactions between proteins, and discerning relevance within large-scale data sets is important to present-day biology.



The datasets that were used in the Web Science 2014 Data Challenge. There are four datasets in this collection:

  • A collection of Web (HTTP) requests from Indiana University for the month of November 2009

  • A collection of records extracted from tweets for the month of November 2012

  • A collection of bookmarks from for the month of November 2009

  • Metadata for the complete set of all PubMed records through 2012


Each is available as a .tar.gz file containing either .json or .csv files. When the JSON format is used, each .json file contains a single JSON object. The format of that object is dependent on the dataset. The datasets have been prepared by Dimitar Nikolov.
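As a rough illustration of the packaging described above, the following Python sketch walks through a .tar.gz archive and parses each .json member as a single JSON object and each .csv member as a list of rows. The function name `iter_records`, the UTF-8 encoding and the archive name in the usage note are assumptions made for this example; they are not taken from the challenge documentation, and the structure of each JSON object still depends on the dataset.

```python
import csv
import io
import json
import tarfile


def iter_records(archive_path):
    """Yield (member_name, parsed_data) for each .json or .csv file in a .tar.gz archive.

    Hypothetical helper: assumes UTF-8 text and one JSON object per .json file.
    """
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            # Skip directories and other non-regular entries.
            if not member.isfile():
                continue
            name = member.name.lower()
            if not name.endswith((".json", ".csv")):
                continue
            raw = archive.extractfile(member)
            # newline="" lets the csv module handle line endings itself.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            if name.endswith(".json"):
                # Each .json file holds a single JSON object.
                yield member.name, json.load(text)
            else:
                # Each .csv file is returned as a list of rows (lists of strings).
                yield member.name, list(csv.reader(text))
```

A call such as `for name, data in iter_records("dataset.tar.gz"): ...` (with a placeholder file name) would then give access to each file's contents without unpacking the archive to disk. Note that `json.load` reads a whole file into memory, which may not be practical for very large files.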


UMassAmherst DBLP Citation dataset

The Proximity DBLP database presents information on computer science publications listed in the DBLP Computer Science Bibliography. The data in this dataset were derived from a snapshot of the bibliography as of April 12, 2006. The Proximity DBLP dataset maps each entry in the original DBLP data to one of six types of objects representing different types of publications. It includes links from publications to their authors and editors and from papers to the journal, proceedings, or book in which they appear, as well as citation links from one publication to another.


WSU Graph Datasets

This page provides information on, and pointers to, datasets that are either already represented as graphs or are relational in nature and lend themselves to a graph representation.


Small Network Data

This page contains links to some network data sets I've compiled over the years.


UCI Network Data Repository

The UCI Network Data Repository is an effort to facilitate the scientific study of networks.


The Nexus Network Repository

Nexus is a repository of freely available network data sets, to promote network science research and education. Real-world data is essential for the testing of hypotheses and the development of algorithms.