Brave new data, revisited
In the brave new world of big data, of which Linked Data is a part, one of the top issues raised by analysts, researchers, and data consumers alike is concern about data quality. In part, this concern arises because the new sources and types of data might indeed be of poorer quality, but another cause of this perception is that it has become increasingly difficult to understand, measure, and control this data and thus its quality: data no longer comes exclusively from inside an organization, where its creation, management, and analysis are well understood, but from outside sources, be it sensor data, web data, scientific data, or Linked Data from public sources. In this talk I want to highlight the achievements (and failures) of data quality research in the past, examine why data quality considerations are more important than ever when creating, managing, and using Linked Data, and show how the computer science community can contribute to the goal of creating a useful and relevant Web of Linked Open Data.
About the presenter
Felix Naumann studied mathematics, economics, and computer science at the University of Technology in Berlin. After receiving his diploma (MA) in 1997, he joined the graduate school "Distributed Information Systems" at Humboldt University of Berlin. He completed his PhD thesis on "Quality-driven Query Answering" in 2000. In 2001 and 2002 he worked at the IBM Almaden Research Center on topics around data integration. From 2003 to 2006 he was assistant professor for information integration at Humboldt University of Berlin. Since then he has held the chair for information systems at the Hasso Plattner Institute at the University of Potsdam in Germany, where he focuses on the topics of data profiling and web science.