Assessing the Reliability of Facebook’s Advertising Data for Use in Demographic Research

André Grow, University of Leuven (KU Leuven)
René Flores, University of Chicago
Ilana Ventura, University of Chicago
Ingmar Weber, Qatar Computing Research Institute
Kiran Garimella, MIT Institute for Data, Systems, and Society
Emilio Zagheni, Max Planck Institute for Demographic Research (MPIDR)

An increasing number of scholars advocate the use of Facebook’s advertising platform in demographic research, either for conducting ‘digital censuses’ that provide information about the composition of the population at large, or for recruiting participants for survey research. The efficacy of both approaches depends on the accuracy of the data that Facebook provides about its users, but so far little is known about the reliability of these data. We address this lacuna by comparing Facebook’s user classification with users’ self-reported information in a demographic survey. Our focus is on two types of classification criteria that Facebook employs: mandatory, user-provided information (e.g., gender) and non-mandatory information that is partially inferred (e.g., place of birth). To the extent that Facebook’s user classification contains errors, we expect these errors to be larger for classifications based on non-mandatory, partially inferred information than for classifications based on mandatory information.
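The comparison described above can be illustrated with a minimal sketch. It assumes each survey response is stored as a tuple of (attribute, Facebook's classification, respondent's self-report) and computes the share of matching records per attribute. The function name, the record layout, and the toy values are hypothetical and made up for illustration; they are not the authors' data, code, or results.

```python
from collections import defaultdict


def agreement_rates(records):
    """Return, per attribute, the share of records in which the
    platform's classification equals the respondent's self-report."""
    matches = defaultdict(int)
    totals = defaultdict(int)
    for attribute, platform_value, self_report in records:
        totals[attribute] += 1
        if platform_value == self_report:
            matches[attribute] += 1
    return {attr: matches[attr] / totals[attr] for attr in totals}


# Made-up toy records (hypothetical, not from the study):
# (attribute, platform classification, self-reported value)
toy_records = [
    ("gender", "female", "female"),
    ("gender", "male", "male"),
    ("gender", "female", "female"),
    ("gender", "male", "female"),
    ("place_of_birth", "abroad", "abroad"),
    ("place_of_birth", "abroad", "domestic"),
    ("place_of_birth", "domestic", "domestic"),
    ("place_of_birth", "abroad", "domestic"),
]

rates = agreement_rates(toy_records)
for attribute, rate in rates.items():
    print(f"{attribute}: {rate:.2f}")
```

Under the stated expectation, an attribute based on mandatory information would show a higher agreement rate than one based on partially inferred information; the toy values here are constructed only to show how such a comparison would be read.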


Presented in Session 4. Innovations in Demographic Data and Methods