
ACM Journal of Data and Information Quality (JDIQ)

Latest Articles

An Introduction to Dynamic Data Quality Challenges

The Challenge of Test Data Quality in Data Processing

From Content to Context

Research in data and information quality has made significant strides over the last 20 years. It has become a unified body of knowledge incorporating techniques, methods, and applications from a variety of disciplines including information systems, computer science, operations management, organizational behavior, psychology, and statistics. With...

A Probabilistically Integrated System for Crowd-Assisted Text Labeling and Extraction

The amount of text data has been growing exponentially in recent years, giving rise to automatic information extraction methods that store text...

NEWS

Feb. 2017 -- Call for Nominations: Editor-in-Chief, ACM Journal on Data and Information Quality

Feb. 2017 -- Call for Papers: Special Issue on Improving the Veracity and Value of Big Data
Extended submission deadline: April 1st, 2017

Jan. 2016 -- New Book Announcement
Carlo Batini and Monica Scannapieco have a new book:

Data and Information Quality: Dimensions, Principles and Techniques 

Springer Series: Data-Centric Systems and Applications, soon available from the Springer shop

Experience and Challenge papers: JDIQ now accepts two new types of papers. Experience papers describe real-world applications, datasets, and other experiences in handling poor-quality data. Challenge papers briefly describe a novel problem or challenge for the IQ community. See Author Guidelines for details.

Forthcoming Articles
An Exploratory Case Study to Understand Primary Care Users and Their Data Quality Tradeoffs

Primary care data is an important piece of the evolving healthcare ecosystem. In addition to supporting the provision of patient care, primary care data can be used for a number of important secondary purposes. Understanding the tradeoffs between timeliness, accuracy, completeness, and usefulness of primary care data is important for designing systems that generate high-quality data. As a case study, data quality measures and metrics were developed with a focus group of managers from a primary care organization. After calculating and extracting measurements of data quality, each measure was modeled with logit binomial regression to characterize tradeoffs and data quality interactions. Measures for accuracy, completeness, and timeliness were calculated for 196,967 patient encounters. Report generation was measured as a proxy for the usefulness dimension. Based on the analysis, there was a positive relationship between accuracy and completeness, and a negative relationship between timeliness and usefulness. Importantly, the use of data was associated with an increase in completeness and accuracy. There were limitations to the measures and metrics developed with the focus group; however, it was agreed that the measures were reasonable proxies for the data quality dimensions under study. The results provide meaningful insight into user tradeoffs and can be used in the design of systems in primary care.

Dependable Data Repairing with Fixing Rules

One of the main challenges that data cleaning systems face is to automatically identify and repair data errors in a dependable manner. Though data dependencies (a.k.a. integrity constraints) have been widely studied to capture errors in data, automated and dependable data repairing on these errors has remained a notoriously hard problem. In this work, we introduce an automated approach for dependably repairing data errors, based on a novel class of fixing rules. A fixing rule contains an evidence pattern, a set of negative patterns, and a fact value. The heart of fixing rules is deterministic: given a tuple, the evidence pattern and the negative patterns of a fixing rule are combined to precisely capture which attribute is wrong, and the fact indicates how to correct this error. We study several fundamental problems associated with fixing rules, and establish their complexity. We develop efficient algorithms to check whether a set of fixing rules is consistent, and discuss approaches to resolve inconsistent fixing rules. We also devise efficient algorithms for repairing data errors using fixing rules. Moreover, we discuss approaches for generating a large number of fixing rules from examples or available knowledge bases. We experimentally demonstrate that our techniques outperform other automated algorithms in terms of the accuracy of repairing data errors, using both real-life and synthetic data.
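
As a rough illustration of the rule structure described in this abstract, the sketch below models a fixing rule as an evidence pattern, a set of negative patterns for one target attribute, and a fact value. It is a simplified reading of the abstract, not the authors' implementation: the names FixingRule, matches, apply, and repair are hypothetical, the country/capital data is invented for the example, and the consistency checking and efficient repair algorithms mentioned in the abstract are not attempted here.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

Row = Dict[str, str]


@dataclass
class FixingRule:
    """Hypothetical representation of one fixing rule (names are assumptions)."""
    evidence: Dict[str, str]   # attribute -> value the tuple must match exactly
    target: str                # the attribute this rule is allowed to correct
    negatives: FrozenSet[str]  # values of the target attribute known to be wrong
    fact: str                  # value written to the target when the rule fires

    def matches(self, row: Row) -> bool:
        # The rule fires only if every evidence attribute agrees with the row
        # AND the target attribute currently holds one of the negative patterns.
        evidence_ok = all(row.get(attr) == val for attr, val in self.evidence.items())
        return evidence_ok and row.get(self.target) in self.negatives

    def apply(self, row: Row) -> Row:
        # Return a corrected copy when the rule fires; otherwise return the row unchanged.
        if not self.matches(row):
            return row
        fixed = dict(row)
        fixed[self.target] = self.fact
        return fixed


def repair(row: Row, rules: List[FixingRule]) -> Row:
    # Single pass over the rules in order. This assumes the rule set is
    # consistent, so the order of application does not change the result.
    for rule in rules:
        row = rule.apply(row)
    return row


# Invented example: if country is "China" and capital is a known-wrong value,
# set capital to "Beijing".
capital_rule = FixingRule(
    evidence={"country": "China"},
    target="capital",
    negatives=frozenset({"Shanghai", "Hongkong"}),
    fact="Beijing",
)

dirty = {"country": "China", "capital": "Shanghai"}
print(repair(dirty, [capital_rule]))
```

A row whose target value is not among the negative patterns is left untouched even when the evidence matches, which is what makes this style of repair conservative: the rule changes a value only when it has explicit grounds to consider that value wrong.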

The Challenge of Quality in Social Computation

In applications where machine intelligence falls short (e.g. alignment of taxonomies on the Semantic Web, image annotation, label sorting), so-called social computation approaches that utilise crowds of interconnected human workers offer a viable solution. Computations such as these can be modelled as a collection of structured activities (i.e. workflows) that represent a blend of human and machine tasks. From a data quality perspective, social computations cannot be treated as traditional computational systems and existing quality models will need to be adapted or redesigned to accommodate the unique characteristics of such systems. We argue that only by enhancing the transparency of social computation systems will we be able to realize such novel quality assessment processes.

Bibliometrics

Publication Years 2009-2017
Publication Count 117
Citation Count 196
Available for Download 117
Downloads (6 weeks) 1561
Downloads (12 Months) 12043
Downloads (cumulative) 114845
Average downloads per article 982
Average citations per article 2
First Name Last Name Award
Peter Aiken ACM Senior Member (2011)
Ahmed Elmagarmid ACM Distinguished Member (2009)
Daniel S Katz ACM Senior Member (2011)
Beth A. Plale ACM Senior Member (2006)

First Name Last Name Paper Counts
Yang Lee 4
John Talburt 3
Stuart Madnick 3
G Shankaranarayanan 3
Peter Christen 3
Eitel Lauría 2
Sherali Zeadally 2
Arnon Rosenthal 2
Christan Grant 2
Daisy Zhe Wang 2
Ross Gayler 2
Ali Sunyaev 2
Nan Tang 2
Vassilios Verykios 2
Carolyn Matheus 2
Wolfgang Lehner 2
Dinusha Vatsalan 2
Roger Blake 2
Xiaobai Li 2
Roman Lukyanenko 2
Pierpaolo Vittorini 1
Karthikeyan Ramamurthy 1
Ralf Tönjes 1
Laurent Lecornu 1
Dov Biran 1
Edward Anderson 1
Shelly Sachdeva 1
Stuart Madnick 1
Monica Tremblay 1
Debra Vandermeer 1
Foster Provost 1
Nicola Ferro 1
Christian Becker 1
Sandra Sampaio 1
Jianyong Wang 1
Wenfei Fan 1
Dustin Lange 1
Therese Williams 1
Chintan Amrit 1
John Krogstie 1
Banda Ramadan 1
John O’Donoghue 1
Axel Polleres 1
Venkata Meduri 1
Wenjun Li 1
Khoi Tran 1
Lan Cao 1
Jeffrey Vaughan 1
Melanie Herschel 1
Payam Barnaghi 1
Jean Caillec 1
Rashid Ansari 1
Davide Ceolin 1
Arputharaj Kannan 1
Anupkumar Sen 1
Hubert Österle 1
Suzanne Embury 1
Lizhu Zhou 1
Erhard Rahm 1
Shuai Ma 1
Nigel Martin 1
Huizhi Liang 1
Paolo Coletti 1
Mirko Cesarini 1
Hongjiang Xu 1
Vincenzo Maltese 1
Yuheng Hu 1
Robert Meusel 1
Xiaoping Liu 1
Valentina Maccatrozzo 1
Maurice Van Keulen 1
Stephen Chong 1
Edoardo Pignotti 1
A Borthick 1
Sara Tonelli 1
Kush Varshney 1
Rahul Basole 1
Jimeng Sun 1
Ashfaq Khokhar 1
Dmitry Chornyi 1
Fred Morstatter 1
Paul Groth 1
Mohamed Yakout 1
Danilo Montesi 1
Omar Alonso 1
Alan Labouseur 1
Irit Askira Gelman 1
Alexandra Poulovassilis 1
Eric Medvet 1
Fabiano Tarlao 1
John Herbert 1
Juan Augusto 1
Maurice Mulvenna 1
Paul McCullagh 1
Fabio Mercorio 1
Fei Chiang 1
Siddharth Sitaramachandran 1
J Jha 1
Laure Berti-Équille 1
Jürgen Umbrich 1
Sven Weber 1
Fabian Panse 1
Fumiko Kobayashi 1
Johann Freytag 1
María Bermúdez-Edo 1
Maria Alvarez 1
Richard Briotta 1
Kristin Weber 1
Panagiotis Ipeirotis 1
Paolo Missier 1
Xu Pu 1
Yi Chen 1
Benjamin Ngugi 1
Beverly Kahn 1
Paul Glowalla 1
Wenyuan Yu 1
Felix Naumann 1
Wenyuan Yu 1
Fausto Giunchiglia 1
Jeremy Debattista 1
Sushovan De 1
Dominique Ritze 1
Heiko Paulheim 1
Christoph Quix 1
Matthias Jarke 1
Wan Fokkink 1
Jeffrey Fisher 1
Adriane Chapman 1
Jeremy Millar 1
Hilko Donker 1
Dezhao Song 1
Rabia Nuray-Turan 1
Dmitri Kalashnikov 1
Yinle Zhou 1
Heiko Müller 1
Youwei Cheah 1
Steven Brown 1
Terry Clark 1
Adir Even 1
H Nehemiah 1
Matthew Jensen 1
Jay Nunamaker 1
Daniel Dalip 1
Tobias Vogel 1
Arvid Heise 1
Uwe Draisbach 1
Fons Wijnhoven 1
Pável Calado 1
Olivier Curé 1
Claire Collins 1
Ioannis Anagnostopoulos 1
Patricia Franklin 1
Willem Van Hage 1
Len Seligman 1
Gilbert Peterson 1
Robert Ulbricht 1
Martin Hahmann 1
Eric Nelson 1
Hongwei Zhu 1
Ulf Leser 1
Irit Gelman 1
Yanjuan Yang 1
Paul Bowen 1
Dennis Wei 1
Aleksandra Mojsilović 1
Ion Todoran 1
Ali Khenchaf 1
Valerie Sessions 1
Trent Rosenbloom 1
Shawn Hardenbrook 1
Huan Liu 1
Peter Aiken 1
Michael Zack 1
Nitin Joglekar 1
Mikhail Atallah 1
Subhash Bhalla 1
D Elizabeth 1
Kaushik Dutta 1
M Kaiser 1
Kresimir Duretec 1
Manoranjan Dash 1
Xiaoming Fan 1
Floris Geerts 1
Thomas Redman 1
David Becker 1
Wenfei Fan 1
Pim Dietz 1
Jeffrey Parsons 1
Giannis Haralabopoulos 1
Sebastian Neumaier 1
Kyle Niemeyer 1
Arfon Smith 1
Archana Nottamkandath 1
Darryl Ahner 1
Claudio Hartmann 1
Hongwei Zhu 1
Norbert Ritter 1
Cihan Varol 1
Coşkun Bayrak 1
David Robb 1
Rosella Gennari 1
Mark Braunstein 1
Marta Zárraga-Rodríguez 1
Craig Fisher 1
Sufyan Ababneh 1
Peter Elkin 1
C Raj 1
Matteo Magnani 1
Hema Meda 1
Amitava Bagchi 1
Bernd Heinrich 1
Mathias Klier 1
R Greenwood 1
Ayush Singhania 1
George Moustakides 1
Bing Lv 1
Paul Mangiameli 1
Jianing Wang 1
Dirk Ahlers 1
Marcos Gonçalves 1
Alberto Bartoli 1
Hongwei Zhu 1
James McNaull 1
Kelly Janssens 1
Judith Gelernter 1
Mouhamadou Lamine Ba 1
Ciro D'Urso 1
Subbarao Kambhampati 1
Hua Zheng 1
Jeff Heflin 1
Christian Skalka 1
Michael Mannino 1
Fiona Rohde 1
Kewei Sha 1
Marco Valtorta 1
Elliot Fielstein 1
Theodore Speroff 1
Ahmed Elmagarmid 1
Yang Lee 1
Judee Burgoon 1
Boris Otto 1
Josh Attenberg 1
Sean Goldberg 1
Andreas Rauber 1
Alun Preece 1
Anja Klein 1
Marilyn Tremaine 1
Alan March 1
Felix Naumann 1
Marco Cristo 1
Andrea Lorenzo 1
Richard Wang 1
Maurizio Murgia 1
Mario Mezzanzanica 1
Roberto Boselli 1
Christoph Lange 1
Sören Auer 1
Luvai Motiwalla 1
Sandra Geisler 1
Daniel Katz 1
Douglas Hodson 1
Sharad Mehrotra 1
Chris Baillie 1
Peter Edwards 1
Beth Plale 1

Affiliation Paper Counts
University of Padua 1
Université Paris-Est 1
Federal University of Amazonas 1
Florida State University 1
Virginia Commonwealth University 1
Vanderbilt University 1
Instituto Superior Técnico 1
Google Inc. 1
University of Leipzig 1
Hospital Universitario Austral 1
Harvard University 1
University of Colorado at Denver 1
Oklahoma City University 1
University of Rhode Island 1
State University of New York at Albany 1
Georgia State University 1
University of Antwerp 1
University of Texas at Austin 1
Oregon State University 1
University of Massachusetts System 1
Indian Institute of Science 1
Elsevier 1
University of Augsburg 1
Vienna University of Technology 1
University of South Carolina 1
Memorial University of Newfoundland 1
Boston University 1
Technical University of Munich 1
Butler University 1
New Jersey Institute of Technology 1
National Institute of Standards and Technology 1
Cardiff University 1
Sam Houston State University 1
University College Cork 1
Microsoft Corporation 1
Ben-Gurion University of the Negev 1
University of Edinburgh 1
Charleston Southern University 1
Commonwealth Scientific and Industrial Research Organization 1
Rutgers, The State University of New Jersey 1
University of Patras 1
Hellenic Open University 1
University of Illinois at Urbana-Champaign 1
Lehigh University 2
Humboldt University of Berlin 2
Fraunhofer Institute for Applied Information Technology 2
Nanyang Technological University 2
Old Dominion University 2
Suffolk University 2
Free University of Bozen-Bolzano 2
University of Innsbruck 2
University of Arizona 2
Norwegian University of Science and Technology 2
University of Kentucky 2
University of Trento 2
RWTH Aachen University 2
University of Toronto 2
University of Surrey 2
Indiana University 2
New York University 2
Massachusetts Institute of Technology 2
University of Massachusetts Boston 2
University of Bologna 2
University of Hamburg 2
Federal University of Minas Gerais 2
University of Oklahoma 2
University of Queensland 2
University of Aizu 2
McMaster University 2
Universidad de Navarra 2
Indian Institute of Management Calcutta 2
Vienna University of Economics and Business Administration 3
University of Massachusetts Medical School 3
Qatar Computing Research Institute 3
Northeastern University 3
University of St. Gallen 3
University of Thessaly 3
Babson College 3
University of Cologne 3
Georgia Institute of Technology 3
University of Aberdeen 3
Beihang University 3
Telecom Bretagne 3
Purdue University 3
Birkbeck University of London 3
University of Bonn 3
University of California, Irvine 3
University of Mannheim 3
University of Twente 4
University of Ulster 4
Vrije Universiteit Amsterdam 4
United States Air Force Institute of Technology 4
University of Trieste 4
University of Manchester 4
Anna University 4
University of Illinois at Chicago 4
IBM Thomas J. Watson Research Center 4
Technical University of Dresden 4
University of Milan - Bicocca 4
United States Department of Veterans Affairs 4
University of Florida 4
MITRE Corporation 5
University of Massachusetts Lowell 5
Marist College 5
Florida International University 5
Tsinghua University 5
Arizona State University 5
Hasso-Plattner-Institut für Softwaresystemtechnik GmbH 6
University of Arkansas at Little Rock 8
Australian National University 9
 