Compact Data Structures and Query Processing for Temporal Graphs

PhD Thesis
by Diego Caro A. <diegocaro at udec cl>

Advisor: Dr. M. Andrea Rodríguez
Co-Advisor: Dr. Nieves R. Brisaboa
Department of Computer Science
University of Concepción, Chile

The thesis + slides are freely available.

Abstract

Temporal graphs represent vertices and binary relations that change along time. A temporal graph could be represented as several static graphs (or snapshots), one per each time point in the lifetime of the graph. The main problem of this representation is the space usage when edges remain unchanged for long periods of time, as consecutive snapshots tend to be very similar between them. Consequently, most of the space correspond to a replication of previous snapshots.

In this thesis, we explore alternative representations of temporal graphs that are efficient in space and in time to answer adjacency operations. On one hand, a representation based on a log of changes stores the time instant when edges appear or disappear. A multidimensional binary matrix, on the other hand, stores data of edges and their valid time intervals as cells in a 4D matrix, which reduces space while maintaining a good retrieval time.

Publications

Journal

International Conferences

International Workshops

Code and datasets

Visit http://github.com/diegocaro/temporalgraphs.

Acknowledgements

Through the journey of pursuing the PhD I received support from people all over the world. With these words I would like to say thank you to all of them. In particular, I would like to say thanks to Diego Seco for his help in the preliminary discussions of the structures, to Antonio Fariña for his infinite support in the development and implementation of many data structures, to Guillermo de Bernardo and Susana Ladra for introduce me to the \(k^2\) world, to Gonzalo Navarro and Simon Gog for their suggestions on the improvement of the data structures and the experimental evaluation, and to Claudio Sanhueza from Yahoo! Labs Chile who helped me to obtain datasets for the experimental evaluation.

And of course, I would like to say thank to my co-phd-workers José Fuentes, Erick Elejalde, Natalia Jaña, Sandra Álvarez, Guillermo de Bernardo and Leticia González.

This work was partially supported by a CONICYT doctoral fellowship 21120569, by Fondecyt 1140428, and by Fondef D09I1185.

The idea of creating this web site was borrowed from my friend and colleage José Fuentes.