Penn DB group at SIGMOD/PODS 2017

It’s a pleasure to share the accomplishments of two members of the Penn database group at the PODS 2017 conference here in Chicago. Prof. Susan Davidson gave a wonderful keynote talk about our work on data citation and provenance. Prof. Val Tannen received the Test of Time (ToT) award for his work on semiring provenance, “The Semiring Framework for Database Provenance”. Prof. Tannen gave a great talk about the influence of the semiring provenance work on various modern data problems.

Automating data citation: the eagle-i experience

Great news! Our paper “Automating data citation: the eagle-i experience” has been accepted for inclusion in the JCDL 2017 conference.


Data citation is of growing concern for owners of curated databases, who wish to give credit to the contributors and curators responsible for portions of the dataset and enable the data retrieved by a query to be later examined. While several databases specify how data should be cited, they leave it to users to manually construct the citations and do not generate them automatically.

We report our experiences in automating data citation for an RDF dataset called eagle-i, and discuss how to generalize this to a citation framework that can work across a variety of different types of databases (e.g. relational, XML, and RDF). We also describe how a database administrator would use this framework to automate citation for a particular dataset.

Automating Data Citation in CiteDB

Our paper “Automating Data Citation in CiteDB” will appear in PVLDB 2017 (Munich, Germany).


An increasing amount of information is being collected in structured, evolving, curated databases, driving the question of how information extracted from such datasets via queries should be cited. While several databases say how data should be cited for web-page views of the database, they leave it to users to manually construct the citations. Furthermore, they do not say how data extracted by queries other than web-page views (general queries) should be cited. This demo shows how citations can be specified for a small set of views of the database, and used to automatically generate citations for general queries against the database.