Abstract
The effectiveness of information retrieval technology in electronic discovery (E-discovery) has become the subject of judicial
rulings and practitioner controversy. The scale and nature of E-discovery tasks, however, have pushed traditional information
retrieval evaluation approaches to their limits. This paper reviews the legal and operational context of E-discovery and the
approaches to evaluating search technology that have evolved in the research community. It then describes a multi-year effort
carried out as part of the Text Retrieval Conference to develop evaluation methods for responsive review tasks in E-discovery.
This work has led to new approaches to measuring effectiveness in both batch and interactive frameworks, large data sets,
and some surprising results for the recall and precision of Boolean and statistical information retrieval methods. The paper
concludes by offering some thoughts about future research in both the legal and technical communities toward the goal of reliable,
effective use of information retrieval in E-discovery.
- Content Type Journal Article
- Pages 347-386
- DOI 10.1007/s10506-010-9093-9
- Authors
- Douglas W. Oard, College of Information Studies and Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA
- Jason R. Baron, Office of the General Counsel, National Archives and Records Administration, College Park, MD 20740, USA
- Bruce Hedin, H5, 71 Stevenson St., San Francisco, CA 94105, USA
- David D. Lewis, David D. Lewis Consulting, 1341 W. Fullerton Ave., #251, Chicago, IL 60614, USA
- Stephen Tomlinson, Open Text Corporation, Ottawa, ON Canada
- Journal Artificial Intelligence and Law
- Online ISSN 1572-8382
- Print ISSN 0924-8463
- Journal Volume Volume 18
- Journal Issue Volume 18, Number 4
