Primary care data is an important piece of the evolving healthcare ecosystem. In addition to supporting the provision of patient care, primary care data can be used for a number of important secondary purposes. Understanding the tradeoffs between the timeliness, accuracy, completeness and usefulness of primary care data is essential to designing systems that generate high-quality data. As a case study, data quality measures and metrics were developed with a focus group of managers from a primary care organization. After the data quality measurements were extracted and calculated, each measure was modeled with binomial logistic (logit) regression to characterize tradeoffs and interactions among the data quality dimensions. Measures of accuracy, completeness and timeliness were calculated for 196,967 patient encounters, and report generation was measured as a proxy for the usefulness dimension. The analysis showed a positive relationship between accuracy and completeness and a negative relationship between timeliness and usefulness. Importantly, the use of the data was associated with an increase in both completeness and accuracy. The measures and metrics developed with the focus group had limitations; however, the group agreed that they were reasonable proxies for the data quality dimensions under study. The results provide meaningful insight into user tradeoffs and can inform the design of systems in primary care.
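The abstract describes the modeling step only at a high level. As a rough, hedged sketch (the column names, synthetic data and effect sizes below are illustrative assumptions, not the study's data), a binomial logistic regression of one quality measure on the others could be set up along these lines:

```python
# Hypothetical sketch: relate a per-encounter completeness flag (0/1) to accuracy,
# timeliness and data-use indicators with a binomial logistic regression.
# All column names and the synthetic data are assumptions for illustration only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 1000  # small stand-in for the 196,967 encounters in the study
encounters = pd.DataFrame({
    "accurate": rng.integers(0, 2, n),  # 1 = encounter passed the accuracy check
    "timely":   rng.integers(0, 2, n),  # 1 = recorded within the timeliness window
    "used":     rng.integers(0, 2, n),  # 1 = encounter data appeared in a report
})
# Simulate a positive accuracy effect and a data-use effect on completeness.
linear_predictor = -0.5 + 1.0 * encounters["accurate"] + 0.6 * encounters["used"]
encounters["complete"] = rng.binomial(1, 1 / (1 + np.exp(-linear_predictor)))

# Logistic (binomial logit) model of completeness on the other quality measures.
model = smf.logit("complete ~ accurate + timely + used", data=encounters).fit(disp=False)
print(model.summary())
```

The fitted coefficients would then indicate the direction and strength of each tradeoff, analogous to the positive and negative relationships reported in the study.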
One of the main challenges that data cleaning systems face is to automatically identify and repair data errors in a dependable manner. Although data dependencies (a.k.a. integrity constraints) have been widely studied as a way to capture errors in data, automated and dependable repairing of those errors has remained a notoriously hard problem. In this work, we introduce an automated approach for dependably repairing data errors, based on a novel class of fixing rules. A fixing rule contains an evidence pattern, a set of negative patterns, and a fact value. The heart of a fixing rule is deterministic: given a tuple, the evidence pattern and the negative patterns are combined to precisely capture which attribute is wrong, and the fact value indicates how to correct that error. We study several fundamental problems associated with fixing rules and establish their complexity. We develop efficient algorithms to check whether a set of fixing rules is consistent and discuss approaches for resolving inconsistent fixing rules. We also devise efficient algorithms for repairing data errors using fixing rules. Moreover, we discuss how to generate large numbers of fixing rules from examples or from available knowledge bases. We experimentally demonstrate, on both real-life and synthetic data, that our techniques outperform other automated algorithms in the accuracy of repairing data errors.
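To make the structure of a fixing rule concrete, the following is a minimal sketch based only on the description above; the attribute names and example values are illustrative assumptions, not drawn from the paper's datasets:

```python
# Hypothetical sketch of a fixing rule as described in the abstract:
# an evidence pattern, a set of negative patterns on one attribute, and a
# fact value used to correct that attribute when an error is detected.
from dataclasses import dataclass

@dataclass
class FixingRule:
    evidence: dict   # attribute -> required value (evidence pattern)
    target: str      # attribute the rule may repair
    negatives: set   # known-wrong values for the target attribute
    fact: str        # correct value to write when an error is detected

    def apply(self, tuple_: dict) -> dict:
        """Return a repaired copy of tuple_ if the rule fires, else tuple_ unchanged."""
        evidence_holds = all(tuple_.get(a) == v for a, v in self.evidence.items())
        if evidence_holds and tuple_.get(self.target) in self.negatives:
            repaired = dict(tuple_)
            repaired[self.target] = self.fact
            return repaired
        return tuple_

# Illustrative rule: if the country is China but the capital is a known-wrong
# value, deterministically repair the capital to Beijing.
rule = FixingRule(
    evidence={"country": "China"},
    target="capital",
    negatives={"Shanghai", "Hong Kong"},
    fact="Beijing",
)
print(rule.apply({"country": "China", "capital": "Shanghai"}))
# -> {'country': 'China', 'capital': 'Beijing'}
```

Because both the evidence pattern and a negative pattern must match before the fact value is applied, the repair is deterministic rather than heuristic, which is the property the abstract emphasizes.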
In applications where machine intelligence falls short (e.g. alignment of taxonomies on the Semantic Web, image annotation, label sorting), so-called social computation approaches that utilise crowds of interconnected human workers offer a viable solution. Such computations can be modelled as collections of structured activities (i.e. workflows) that blend human and machine tasks. From a data quality perspective, social computations cannot be treated as traditional computational systems, and existing quality models will need to be adapted or redesigned to accommodate their unique characteristics. We argue that only by enhancing the transparency of social computation systems will we be able to realise these adapted quality assessment processes.
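As a loose illustration of the workflow framing (every name, field and task below is a hypothetical assumption, not part of the cited work), a blended human/machine workflow that records a provenance trace for later quality assessment might be sketched as:

```python
# Hypothetical sketch: a social computation workflow as an ordered list of tasks,
# each flagged as human or machine work, with a trace that a quality model could
# inspect for transparency. All names and fields are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Task:
    name: str
    performer: str                  # "human" (crowd worker) or "machine"
    run: Callable[[object], object]

def run_workflow(tasks: List[Task], item):
    """Run tasks in order, recording which kind of actor produced each result."""
    trace = []
    for task in tasks:
        item = task.run(item)
        trace.append({"task": task.name, "performer": task.performer, "output": item})
    return item, trace

# e.g. a machine pre-labels an image, then a crowd worker verifies the label.
workflow = [
    Task("auto_label", "machine", lambda img: {"image": img, "label": "cat?"}),
    Task("crowd_verify", "human", lambda rec: {**rec, "label": "cat", "verified": True}),
]
result, trace = run_workflow(workflow, "image_001.jpg")
print(result)
print(trace)
```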






















