Event Details

The Federal Committee on Statistical Methodology (FCSM) and the Washington Statistical Society (WSS) are sponsoring a series of workshops about transparent reporting on the quality of data produced through the integration of multiple data sources. As government agencies, academia, and the private sector develop new approaches to maximizing the information available from the ever-growing body of existing data, integrated data are becoming more common. Many of the prospective sources include data developed originally for administrative or commercial work, rather than for statistical work as such.

Integrated data provide tremendous opportunities to learn more than is possible from evaluating single data sets in isolation. They also pose many challenges. Several of these challenges revolve around transparently documenting: 1) the quality of the input data sources; 2) how the input data were processed into an integrated data set and how the integrated data are structured; and 3) how to convey information about the quality of the resulting output data and information drawn from those data. These challenges are compounded to the extent that the input data are in formats whose quality metrics are less familiar to federal agencies and the statistical community, and to the extent that analyses of the integrated data require analytic decisions guided by fewer well-accepted standards than agencies have typically followed.

This third workshop will focus on issues related to item (3) above. A draft agenda is given below.

We are asking all attendees to register, so we can send information on logistics and workshop materials. Here is the link to attend remotely.


WebEx audio is by phone. Call-in toll-free number: 1-866-865-9536; access code: 744 124 3


Opening Remarks 9:00-9:40

  • John Eltinge, Chris Chapman, Joe Schafer: Introduction and Summary of earlier workshops
  • Linda Young: Overview of Third Workshop

Session 1: Break in Series 9:40-10:45

  • Main Speaker: Lynn Langton, Bureau of Justice Statistics
  • Discussant: TBD

Break 10:45-11:00

Session 2: Combining Data From Disparate Sources 11:00-12:05

  • Main Speaker: Trivellore Raghunathan, University of Michigan
  • Discussant: TBD

Lunch 12:05-1:15

Session 3: Framework for Assessing Data Quality 1:15-2:35

  • Speaker 1: Paul Biemer, RTI
  • Speaker 2: John Czajka, Mathematica

Break 2:35-2:55

Session 4: Summary 2:55-4:15

  • Main Speaker: Frauke Kreuter