Introduction

This report provides results from the first and second waves of the Distancia-Covid Survey launched on 14 May 2020 under the CSIC-funded project “Impacto de las medidas de distanciamiento social sobre la expansión de la epidemia de Covid-19 en España.” It relies on the survey responses received from the launch date through 31 August 2020. This period encompasses two “waves” during which the survey was disseminated through social media and other channels. As described further below, Wave 1 ran from 14 May 2020 through 10 June 2020, and Wave 2 ran from 24 July 2020 through 31 August 2020.

The vast majority of the responses received during Wave 1 came during the time in which Spain still had social distancing measures in effect but was transitioning away from the extensive restrictions on mobility and social contacts that had been put into place with the state of alarm decreed on 14 March 2020. The state of alarm lasted until 21 June 2020 and Spanish territories were moving, at varying rates, through the three phases of the de-escalation process during the Wave 1 period analyzed here. All of the responses in Wave 2 were received when most restrictions had been lifted and there was no longer a state of alarm in effect. In addition, Wave 2 ends on 31 August in order to coincide with the end of the traditional summer vacation period and avoid overlapping with the September transition back to work and school.

Survey Design

The survey was designed by the Distancia-Covid team in order to better understand changing patterns of human mobility and social contacts in Spain in the context of the Covid-19 pandemic. Many of the questions draw on the approach taken by the POLYMOD study (Mossong et al. 2008; Prem, Cook, and Jit 2017), and were developed in coordination with researchers in other countries working on similar surveys related to social mixing (Del Fava et al. 2020; Feehan and Mahmud 2020; Perrotta et al. 2020).

The survey was distributed in Spanish, Catalan, Galician, Basque, and English using Kobo Toolbox. Respondents accessed the survey at https://distancia-covid.csic.es/encuesta, where it remains available at present. Respondents can access the survey questions only after providing informed consent.

The sampling design was non-random, based entirely on people self-selecting into the respondent pool by connecting to the survey URL online. The survey URL was distributed through press releases, Twitter, WhatsApp, and other channels by members of the project team and institutional press offices. It appears to have propagated through digital networks reasonably well, reaching all provinces in Spain and a relatively wide segment of the population (see further below).

As of 12 November 2020 there were 6952 valid submissions, 4392 in Wave 1 and 2560 in Wave 2. Initial data cleaning was done to improve the interpretability of variable names and generate additional variables calculated from the original ones. Among other things, an imputed usual postal code variable was created based on the two postal code questions in the survey, which asked respondents to list their current and usual postal codes. The imputed variable takes the value of the usual postal code when this has been provided. When it has not been provided it takes the value of the current postal code on the assumption that these are the same in these cases. In addition, province variables were created based on the first two digits of the postal code responses.
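The imputation rule and the province derivation described above can be sketched as follows. This is a minimal illustration, not the project's cleaning code: the function names are hypothetical, and treating blank or malformed codes as missing is an assumption that the report does not state.

```python
from typing import Optional


def impute_usual_postal_code(usual: Optional[str], current: Optional[str]) -> Optional[str]:
    """Return the usual postal code when provided, otherwise the current one."""
    for code in (usual, current):
        # Assumption: empty or whitespace-only answers count as "not provided".
        if code is not None and code.strip():
            return code.strip()
    return None


def province_code(postal_code: Optional[str]) -> Optional[str]:
    """Return the two-digit province prefix of a five-digit postal code."""
    if postal_code is None:
        return None
    code = postal_code.strip()
    # Assumption: anything that is not exactly five digits is treated as missing.
    if len(code) != 5 or not code.isdigit():
        return None
    return code[:2]


# A respondent who gave only a current postal code.
imputed = impute_usual_postal_code(None, "28013")
print(imputed, province_code(imputed))
```

A real cleaning step would also need a decision on codes that lost a leading zero during data entry; here they are simply treated as missing.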

Descriptive Statistics

This section provides descriptive statistics of the survey submissions received to date, with distinctions made between the two waves as appropriate. Throughout the text and plots, “NA” is used to denote missing data due to respondents declining to answer certain questions on the survey. It should be noted that these statistics are not necessarily representative of the population given the non-random sampling design. Population estimates are now being made using multilevel regression with poststratification, as described in the next section.

Temporal Distribution

Most survey submissions were made soon after the survey was released and promoted in each wave. Figure 1 shows the timing of submissions as a histogram with the data aggregated in 1-hour bins. There were several sub-waves of submissions within each of the two main waves. There is also a clear daily cycle, with submissions dropping off at night (as one would expect); this can be seen by clicking on the plot to zoom in.

Figure 1: Histogram of survey start times binned by the hour. Wave 1 is shown in red; Wave 2 is shown in blue.

Geographic Distribution

Based on the imputed usual postal code variable, survey respondents appear to have had their usual places of residence distributed across Spain, with at least one respondent in each province. (This mostly also corresponded to their current places of residence, although 792 respondents listed different current and usual postal codes, and of these, 479 are in different provinces.)

In absolute terms, most respondents reported their usual places of residence in Madrid or Barcelona, as shown in Figure 2. Relative to the province residential populations (taken from the padron), the greatest sampling fraction is from Girona, followed by Toledo, Bizkaia, Barcelona, and Castellon, as shown in Figure 3.
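As a small worked example of the sampling fraction used here (respondents as a percentage of the province's resident population), the sketch below ranks provinces by that fraction. The province labels and all numbers are made up for illustration and are not survey or padron figures.

```python
def sampling_fraction_pct(n_respondents: int, population: int) -> float:
    """Respondents as a percentage of the resident population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * n_respondents / population


# Hypothetical inputs, for illustration only.
respondents = {"province_a": 120, "province_b": 45}
populations = {"province_a": 1_200_000, "province_b": 150_000}

fractions = {
    name: sampling_fraction_pct(respondents[name], populations[name])
    for name in respondents
}

# A province with fewer respondents can still have the larger sampling fraction.
ranked = sorted(fractions, key=fractions.get, reverse=True)
print(ranked, fractions)
```

This is why the ordering of provinces in absolute terms can differ from the ordering by sampling fraction.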

Figure 2: Number of people sampled in each province during each wave.

Figure 3: Province sampling fractions. Percentage of each province’s residential population sampled in each wave. Province populations are based on padron.

Age and Gender

The survey respondents also represented a broad cross-section of ages, ranging from 18 (the requirement for participation) up to 92. The median age of respondents was 46 in Wave 1 and 47 in Wave 2. The middle 50% of ages of respondents was 36 to 57 in Wave 1 and 38 to 56 in Wave 2. The survey’s gender question provided binary response options of male or female in order to match the phrasing of Spain’s labor force survey (Encuesta de población activa), which is being used for poststratification. There were both male and female respondents in nearly every age group in both waves. In Wave 1, 62% of respondents identified themselves as female, 35% as male, and 2% declined to respond to the gender question. In Wave 2, 65% of respondents identified themselves as female, 34% as male, and 1% declined to respond to the gender question. Figure 4 provides a population pyramid of male and female respondents. Note that Wave 2 had relatively fewer respondents below 38 years old.

Figure 4: Age and gender distribution of survey respondents.

Education

The survey asked respondents to report their highest level of education, divided into four levels. Submissions were received from people reporting all four levels, with most reporting undergraduate or graduate level. Figure 5 shows reported education levels by gender. A relatively large proportion of the respondents had high education levels.

Figure 5: Distribution of respondent education levels by gender.

Work

The survey also asked respondents to report their “occupation or type of work” as well as “the activity of the establishment in which [they] work.” The distribution of responses to the occupation question is shown in Figure 6, with the labels on the x-axis corresponding to the following response options (abbreviated version for chart in italics, followed by full response option shown in English version to respondents):

  • military: Military occupations; Armed forces
  • directors: Directors and managers; Business and Public Administration Management
  • scientists: Scientific and intellectual technicians and professionals
  • support techs: Support Technicians and Professionals
  • admin: Accounting, Administrative, and Other Office Employees; Administrative type employees
  • caterers: Catering, personal, protection and trade vendor workers
  • skilled agricultural: Skilled workers in the agricultural, livestock, forestry and fishing sectors; Skilled workers in agriculture and fishing
  • skilled manufacturers: Artisans and skilled workers in manufacturing and construction industries (except facilities and machinery operators); Artisans and skilled workers in manufacturing, construction, and mining industries, except facilities and machinery operators
  • machinery operators: Plant and machinery operators and assemblers
  • unskilled: Elementary occupations; Unskilled workers
  • other: Other

The categories in Figure 6 are ordered according to their relative prevalence during Wave 1. We observe a relatively large proportion of scientists in both waves, likely related to the dissemination strategy of the survey, which mainly relied on academic social networks. We can also see similar distributions of the other occupational categories, with the exception of the “other” category, which dropped from Wave 1 to Wave 2 (in proportion to the others), and the non-response “NA” category, which rose. Presumably this reflects some combination of (1) variation among respondents in whether they chose “other” or simply did not respond when they did not see a category fitting their occupation, (2) an increase in unemployment and job instability leading respondents to see themselves as less attached to a particular occupational category, (3) survey fatigue or loss of motivation to respond to all questions due to the length and changing nature of the pandemic, and (4) changes in the networks through which the survey propagated.

Figure 6: Distribution of respondent occupations by gender.

The distribution of responses to the work activity question is shown in Figure 7, with the labels on the x-axis corresponding to the following response options (abbreviated version for chart in italics, followed by full response option shown in English version to respondents):

  • Agriculture: Agriculture, forestry and fishing
  • Food: Food, textile, leather, wood and paper industry
  • Extractive: Extractive industries, oil refining, chemical, pharmaceutical, rubber and plastics industries, electricity, gas, steam and air conditioning supply, water supply, waste management. Metallurgy
  • Construction: Construction of machinery, electrical equipment and transport material. Industrial installation and repair
  • Building: Building
  • Wholesale: Wholesale and retail trade and its facilities and repairs. Auto repair, hospitality
  • Transportation: Transportation and storage. Information and communications
  • Financial: Financial intermediation, insurance, real estate activities, professional, scientific, administrative and other services
  • Public: Public administration and education
  • Health: Health activities
  • Other services: Other services
  • Other: Other

As with the occupation plot, Figure 7 shows the activity categories on the x-axis in the order of their prevalence in the Wave 1 responses. In this case, we see the highest proportion of responses coming in the Public category, again very likely reflecting the networks through which the survey was distributed. We see somewhat less stability in the relative proportion of other categories across the two waves and we again see a large increase in the non-response “NA” category, which may be explained in the same way as in the occupation case above.

Figure 7: Distribution of respondent work activities by gender.

Country of Birth

Most respondents reported that they were born in Spain (93%). Of those who reported being born outside Spain, the top 5 countries of birth were Argentina (11% of non-natives), Italy (8% of non-natives), Germany (7% of non-natives), the UK (5% of non-natives), and France (5% of non-natives).

Continuing to work

For Wave 1, the survey asked respondents, “Are you continuing to work during the lockdown?” The distribution of responses is summarized in Figure 8 with the labels on the x-axis corresponding to the following response options (abbreviated version for chart in italics, followed by full response option shown in English version to respondents):

  • no: No
  • all remote: Yes, I work remotely from home 100%
  • some remote: Yes, I work remotely part time
  • all in-person: Yes, I work outside my home

For Wave 2, the question was modified to reflect the ending of the “lockdown” and also to better account for the variety of working/non-working situations. The question in this wave was, “What is your current employment status?” The distribution of responses is summarized in Figure 9 with the labels on the x-axis corresponding to the following response options (abbreviated version for chart in italics, followed by full response option shown in English version to respondents):

  • all remote: I am employed (full time or part time) and do all my work from home
  • all in-person: I am employed (full time or part time) and do all of my work outside my home
  • some remote: I am employed (full time or part time) and do some of my work from home and some outside my home
  • unemployed: I am unemployed
  • retired: I am retired or unable to work
  • student: I am a student and not working

Figure 8: Responses in Wave 1 to question: Are you continuing to work during the lockdown?

Figure 9: Responses in Wave 2 to question: What is your current employment status?

ICT Resources

Nearly all respondents reported owning or living with someone who owns an information and communication technology (ICT) device, with personal computers being most prevalent, followed by smart phones and then tablets (Figure 10). Respondents mostly reported multiple devices. Most respondents (>60%) also reported being constantly connected to the internet, and most of the rest reported being connected several times per day (Figure 11).

Figure 10: Proportion of respondents reporting that they or someone they live with owns particular ICT devices. The x-axis shows options from which respondents could select one or more; bars show proportion of respondents who included each option in their response. (In many cases respondents included more than one option.)

Figure 11: Distribution of responses to the question about how many times per day respondents connect to the internet.

Trips out of home

As one way of assessing levels of mobility, respondents were asked about the trips they had taken out of their dwellings during the past week. Figure 12 shows the distribution of number of trips reported. The maximum value listed in the responses was 10,000, but this was omitted from the analysis as obviously erroneous. Several people reported 50 or more trips (including two reporting 100) and these were retained, as they reflect plausible behavioral patterns (e.g., delivery work). In Wave 1, the mean and median number of reported trips were both 5. For Wave 2, the mean was 9 and the median was 7. Overall, 80% reported having gone out between 1 and 7 times during Wave 1 and 61% reported this during Wave 2. The mode of the distribution (most frequent value) in both waves was 7 trips, presumably because many people actually tend to go out once per day (even during the confinement period) or because 7 is simply the rough estimate many people use to answer the question. Reports of more than 7 trips accounted for 15% of responses in Wave 1 and 37% in Wave 2.
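The summary statistics in this paragraph can be reproduced with a few lines of code once the erroneous value has been dropped. The sketch below uses made-up trip counts, not survey data, and the numeric cutoff is a hypothetical stand-in: the report states only that the 10,000 response was omitted while values of 50 to 100 were retained.

```python
from statistics import mean, median, mode

# Made-up weekly trip counts, for illustration only.
reported_trips = [0, 3, 5, 7, 7, 7, 10, 14, 50, 10_000]

# Hypothetical cutoff separating plausible from obviously erroneous answers.
MAX_PLAUSIBLE_TRIPS = 1_000
trips = [t for t in reported_trips if t <= MAX_PLAUSIBLE_TRIPS]

summary = {
    "n": len(trips),
    "mean": mean(trips),
    "median": median(trips),
    "mode": mode(trips),
    # Shares of respondents in the two ranges discussed in the text.
    "share_1_to_7": sum(1 <= t <= 7 for t in trips) / len(trips),
    "share_over_7": sum(t > 7 for t in trips) / len(trips),
}
print(summary)
```

Because a few very high values are retained, the mean is pulled above the median, which is the same pattern the Wave 2 figures show.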

Figure 12: Number of trips out of dwelling during past week.

Respondents were also asked about the farthest distance they had traveled on any of these trips as well as all of their destinations and safety precautions. The distributions of responses are shown in Figures 13, 14, and 15.

In Wave 1, nearly 80% of respondents reported having traveled less than 10 km from their home and nearly 40% reported having traveled less than 1 km. The most frequent destination was stores, followed by public spaces and workplaces. Nearly all respondents reported taking some sort of safety precaution, with masks, social distancing, and handwashing being the most frequent. In Wave 2, there were proportionally fewer displacements below 1 and 10 km, and proportionally more displacements above 10 km among the respondents. Final destinations in Wave 2 were more diverse compared to Wave 1, but stores remained the most frequent destination. In terms of safety precautions while traveling during Wave 2, again masks, social distancing, and handwashing were the most frequent. There was also a decrease in the proportion of respondents reporting use of gloves in Wave 2 compared to Wave 1.

Figure 13: Farthest distance traveled out of home during past week.

Figure 14: Destinations of trips during past week. The x-axis shows options from which respondents could select one or more; bars show proportion of respondents who included each option in their response. (In many cases respondents included more than one option.)

Figure 15: Precautions employed on trips taken during past week. The x-axis shows options from which respondents could select one or more; bars show proportion of respondents who included each option in their response. (In many cases respondents included more than one option.)

Households

An important source of information about social mixing comes from the sizes and age structures of people’s households (defined here as the group of people with whom they were residing at the time of the survey submission). Figure 16 shows the number of co-residents reported by each respondent by autonomous community and city. This raw data is very noisy due in part to the non-random sampling design and the small number of respondents from some autonomous communities/cities (particularly, for example, Ceuta and Melilla). (Modeled population estimates are provided in Figure 20.)

Figure 17 shows the age-structure of respondents and co-residents. Note that the x-axis here indicates the respondents’ age groups (in 5-year ranges, based on the number of years respondents reported on the survey), while the y-axis indicates the age groups of the reported co-residents (in the 10-year ranges that respondents were given on the survey as choice options). Cell colors indicate the mean number of co-residents reported by respondents in each of the respective age groups. Hovering the cursor over the cells will also show medians and the central 90% of the distributions. The matrices are not symmetrical because population sizes vary by age group.
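The asymmetry can be made explicit with a simple reciprocity argument. The notation below is introduced here for illustration and is not taken from the survey analysis: \(m_{ij}\) is the mean number of co-residents in age group \(j\) per person in age group \(i\), and \(N_i\) is the population size of age group \(i\). In an idealized closed population in which both axes use the same age groups, every co-residence pair is counted once from each side, so

\[
m_{ij} N_i = m_{ji} N_j
\quad\Longrightarrow\quad
\frac{m_{ij}}{m_{ji}} = \frac{N_j}{N_i}.
\]

The means \(m_{ij}\) and \(m_{ji}\) therefore coincide only when the two groups are the same size. In the figures here the two axes also use different age ranges (5-year versus 10-year groups), so the relation holds only approximately and after regrouping.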

Figure 16: Distribution of co-residents by autonomous community/city.

Figure 17: Mean number of co-residents by respondent and co-resident age group.

Out-of-home contacts

Relevant social mixing also occurs outside the home. Respondents were asked to report the number and ages of the people with whom they had contact on the previous day. Following the POLYMOD approach, contacts were defined for respondents as: “EITHER a two-way conversation with three or more words in the physical presence of another person, OR physical skin-to-skin contact (for example a handshake, hug, kiss or contact sports).” The distribution of the reported numbers of contacts is shown in Figure 18. Note the relatively large proportion of respondents reporting 0 contacts in Wave 1 compared to Wave 2. Although all of the responses in Wave 1 were received at the time of the de-escalation process, this appears to reflect the effect of the extensive restrictions on mobility and social contacts of the previous months. As with all of these descriptive statistics, however, we need to be extremely cautious in making any population inferences directly from the raw data as we know the samples are not representative. (Modeled population estimates are provided below in Figures 23, 24, and 25.)

Figure 19 shows the matrix of reported contacts by age group. In this figure (similar to the household matrix above) the x-axis indicates the respondents’ age groups (in 5-year ranges, based on the number of years respondents reported on the survey), while the y-axis indicates the age groups of the reported contacts (in the 10-year ranges that respondents were given on the survey as choice options). Cell colors indicate the mean number of contacts respondents in each of the respective age groups reported having had in the day prior to their response. Hovering the cursor over the cells will also show medians and the central 90% of the contact distributions. As with the household co-resident matrices, these matrices are not symmetrical because population sizes vary by age group.

It should also be noted that these surveys are cross-sectional, and thus we are not able to observe variation in the number of contacts each person has over time. Thus, for example, a mean of 0.5 could reflect variation within individual contact patterns over time, with a respondent of age X having contact with a person of age Y on average every 2 days. Alternatively, it could reflect variation across respondents, with some having one or more contacts on a daily basis and others having no contacts on a daily basis. In fact, the data surely reflect variation at both levels, but it is not possible to differentiate between them from the available data.
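The ambiguity can be illustrated with two toy populations that produce exactly the same cross-sectional mean. The data below are invented for the example (rows are people, columns are days) and do not come from the survey.

```python
from statistics import mean

# Scenario A: every person has one contact every second day.
within_person = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

# Scenario B: half of the people have one contact every day, the rest never do.
between_person = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]


def cross_section_mean(panel, day):
    """Mean contacts observed when everyone is surveyed once, on the same day."""
    return mean(person[day] for person in panel)


for day in range(4):
    print(day, cross_section_mean(within_person, day), cross_section_mean(between_person, day))
```

Both scenarios yield a mean of 0.5 on every day, so a single cross-section cannot tell them apart; repeated observations of the same respondents would be needed.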

Figure 18: Distribution of daily out-of-home contacts by autonomous community/city.

Figure 19: Mean number of daily out-of-home contacts by respondent and contact age group.

Population Estimates

The project team is now using multilevel regression with poststratification (MRP) (Zhang et al. 2014; Downes et al. 2018; Park, Gelman, and Bafumi 2004) to make population-level estimates from the survey data. Some preliminary results are offered here but the models on which they rely are still being tuned and improved. These figures should, therefore, be treated with caution. We focus here on social mixing patterns, both in and out of home, because of the obvious relevance to understanding Covid-19 dynamics.

MRP is a statistical method that has the potential to produce reliable population-level estimates from non-representative samples (Downes et al. 2018; Wang et al. 2015; Del Fava et al. 2020). The approach relies on multilevel modeling to first estimate an outcome of interest for different combinations or cells of respondent characteristics. MRP then uses model predictions and poststratification to generate population-level estimates based on knowledge of the relative proportion of each cell in the total population (Downes et al. 2018).
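In its simplest form, the poststratification step is a population-weighted average of the cell-level model estimates. The notation here is introduced for illustration: \(\hat{\theta}_j\) is the model-based estimate for cell \(j\), \(N_j\) is the number of people in that cell in the reference population, and \(J\) is the number of cells.

\[
\hat{\theta}^{\mathrm{PS}} = \frac{\sum_{j=1}^{J} N_j \, \hat{\theta}_j}{\sum_{j=1}^{J} N_j}
\]

An estimate for a subpopulation, such as a single province, restricts both sums to the cells belonging to that subpopulation.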

In our case, key outcomes of interest are (1) the number of co-residents in each household, (2) the probability of having had an out-of-home contact during a given 24-hour period, and (3) the number of such contacts in that period. The respondent characteristics used to create the cells are taken from survey questions that provide information also obtained from Spain’s large, representative labor force survey (Encuesta de población activa), from which the population proportions needed for poststratification are taken.

We use multilevel negative binomial regression models for the mean of the count response variables (both in-home co-residents and out-of-home contacts), conditional on poststratification cells. We use a multilevel logistic regression model to estimate the probability of having any out-of-home contact (again conditional on these cells).

We assume the random variable representing the number of co-residents or out-of-home contacts for each individual \(i\) follows a negative binomial distribution. We use a log link to relate the expectation of this variable, which must be positive, to the linear predictor, and we define the multilevel model for the expected number of co-residents or out-of-home contacts using random intercepts for occupation, province of residence, and response date. (The date intercept is included to account for potential temporal autocorrelation arising from the network structure along which the survey was distributed; model predictions are then made for a hypothetical unobserved date within each wave.)

In the co-resident model, fixed effects were included for gender and five-year age group, whereas in the out-of-home contacts model fixed effects were included for education level and five-year age group.
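One way to write the count model is sketched below. The notation is introduced here for illustration and assumes the mean and dispersion parameterization of the negative binomial, in which \(\mathrm{Var}(Y_i) = \mu_i + \mu_i^2 / \phi\). \(Y_i\) is the count for individual \(i\), \(\mathbf{x}_i\) holds the fixed-effect covariates for the relevant model, and \(o[i]\), \(p[i]\), and \(d[i]\) index the occupation, province, and response date of individual \(i\).

\[
\begin{aligned}
Y_i &\sim \mathrm{NegBin}(\mu_i, \phi), \\
\log \mu_i &= \beta_0 + \mathbf{x}_i^{\top} \boldsymbol{\beta} + a^{\mathrm{occ}}_{o[i]} + a^{\mathrm{prov}}_{p[i]} + a^{\mathrm{date}}_{d[i]}, \\
a^{\mathrm{occ}}_{o} &\sim \mathrm{N}(0, \sigma^2_{\mathrm{occ}}), \quad
a^{\mathrm{prov}}_{p} \sim \mathrm{N}(0, \sigma^2_{\mathrm{prov}}), \quad
a^{\mathrm{date}}_{d} \sim \mathrm{N}(0, \sigma^2_{\mathrm{date}}).
\end{aligned}
\]

The independent normal distributions for the random intercepts are the usual multilevel assumption; the priors on the remaining parameters are not written out here.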

In order to model the probability of any out-of-home contact, we first defined a random variable representing the occurrence of any contact for any individual \(i\), following a Bernoulli distribution with probability \(\pi_i\). We then fit a multilevel logistic regression with random intercepts for occupation, province of residence, and date (as in the count models) and fixed effects of education and five-year age group.
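Written out, with notation introduced here for illustration, \(Z_i\) indicates whether individual \(i\) reported any out-of-home contact, \(\mathbf{x}_i\) holds the education and age-group covariates, and \(o[i]\), \(p[i]\), and \(d[i]\) index occupation, province, and date:

\[
Z_i \sim \mathrm{Bernoulli}(\pi_i), \qquad
\log \frac{\pi_i}{1 - \pi_i} = \gamma_0 + \mathbf{x}_i^{\top} \boldsymbol{\gamma} + b^{\mathrm{occ}}_{o[i]} + b^{\mathrm{prov}}_{p[i]} + b^{\mathrm{date}}_{d[i]},
\]

with each set of random intercepts \(b\) assumed to follow a normal distribution centered at zero with its own variance.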

We fit all models in R (R Core Team 2020) using Stan and the rstanarm package (Stan Development Team 2015, 2016), with the default priors described in the rstanarm 2.21.1 documentation.

After fitting these models, we made population level estimates by sampling from the posterior predictive distributions according to the corresponding cell size in the labor force survey data. As a comparison, we also modeled the co-resident outcome directly from the labor force survey, using the same count model described above.
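The sampling step can be sketched as follows, under stated assumptions. This is not the project's R code: the function name, the cell labels, and all numbers are hypothetical, and the sketch assumes that posterior predictive draws are already available for each cell. Cells are selected with probability proportional to their population size, and one predictive draw is then taken from the selected cell.

```python
import random
from collections import Counter


def poststratified_draws(predictive_draws, cell_sizes, n_draws, seed=0):
    """Draw outcomes so that each cell contributes in proportion to its size."""
    rng = random.Random(seed)
    cells = list(cell_sizes)
    weights = [cell_sizes[cell] for cell in cells]
    # Pick a cell for every draw, with probability proportional to cell size.
    sampled_cells = rng.choices(cells, weights=weights, k=n_draws)
    # Take one posterior predictive draw from each selected cell.
    return [rng.choice(predictive_draws[cell]) for cell in sampled_cells]


# Made-up predictive draws (e.g. contact counts) and population cell sizes.
predictive_draws = {"cell_a": [0, 1, 1, 2], "cell_b": [2, 3, 3, 4]}
cell_sizes = {"cell_a": 3000, "cell_b": 1000}

draws = poststratified_draws(predictive_draws, cell_sizes, n_draws=10_000)
counts = Counter(draws)
distribution = {value: counts[value] / len(draws) for value in sorted(counts)}
print(distribution)
```

The resulting draws approximate the population-level distribution of the outcome, from which proportions, means, and intervals can be computed.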

Households

Starting with the number of co-residents each respondent reported, we estimate a population-level distribution of co-resident counts for people aged 20 and over. This is shown in Figure 20. As a comparison, Figure 21 shows the same estimates based directly on Spain’s labor force survey (Encuesta de población activa) for each quarter during 2019 and 2020. Comparing Figures 20 and 21, we see that the Distancia-Covid survey estimates (using MRP) closely match the estimates obtained from the much larger, more representative labor force survey. Looking at the two waves of the Distancia-Covid survey in Figure 20, we see very little difference in the distribution of co-residents. Looking at the labor force survey estimates in Figure 21, we see that this pattern appears to have been quite stable over the past two years.

Adding age groups, we estimate the matrix shown in Figure 22. As with the descriptive matrix drawn from the raw household data above, the x-axis here indicates the respondents’ age groups (in 5-year ranges, based on the number of years respondents reported on the survey), while the y-axis indicates the age groups of the reported co-residents (in the 10-year ranges that respondents were given on the survey as choice options). Cell colors indicate the mean number of co-residents reported by respondents in each of the respective age groups. Hovering the cursor over the cells will also show medians and the central 90% of the distributions. The matrices are not symmetrical because population sizes vary by age group. One can observe here that people of a given age tend to reside with people of a similar age (main diagonal), and that there are clear generational groupings (grandparents, parents, children).

Figure 20: Distancia-Covid Survey Waves 1 and 2: Estimated distribution of co-residents by autonomous community/city. Estimates are limited to population aged 20 and over. X-axis shows number of co-residents and bars indicate estimated proportion of each population residing with this number of co-residents.

Figure 21: EPA 2019-20: Estimated distribution of co-residents by autonomous community/city. Estimates are limited to population aged 20 and over in order to match the sample used in the Distancia-Covid survey.

Figure 22: Estimated mean number of co-residents by age groups. Estimates are limited to population aged 20 and over.

Out-of-Home Contacts

For out-of-home contacts we use the survey responses to estimate the distribution and age-structured contact matrix for the population aged 20 and over.

Probability of Any Contact

Since a large number of respondents in Wave 1 reported no out-of-home contacts at all on the previous day, we start by simply estimating the probability of any out-of-home contact. Figure 23 shows the estimated probabilities and the 90% credible intervals for these estimates for each province in each wave. From Wave 1 to Wave 2, we see a clear increase in all provinces in the probability of having had any out-of-home contact.

Figure 23: Estimated probability of having any out-of-home contact on the previous day for each province. Estimates are limited to population aged 20 and over. They are indicated by the points with 90% credible intervals shown by the lines.

Number of Contacts

Figure 24 shows the estimated distribution of the number of out-of-home contacts for the total population aged 20 and over. Preliminary fits suggest that in Wave 1 this closely follows a zero-inflated discrete exponential distribution with a relatively broad tail, whereas in Wave 2 it is closer to a simple discrete exponential with a relatively thinner tail. This means that more people had zero daily contacts during the period of mobility restrictions (Wave 1), but there were also larger proportions of the population having more contacts during that period compared to the less restrictive period (Wave 2). This evidence of relatively greater numbers of high-contact people in Wave 1 may have to do with the dynamics of specific occupations and population sectors (e.g. delivery, health care) during that time. We will investigate this further. Figure 25 provides distributions for each age group in each wave. The main qualitative features described above also appear to hold within each age group.
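For reference, the two distribution families named above can be written as follows. This is one common parameterization, assumed here for illustration; the report does not specify the exact form used in the preliminary fits. Let \(Y\) be the number of daily contacts, \(q = e^{-\lambda}\) with \(\lambda > 0\) the decay parameter, and \(\omega \in [0, 1)\) the extra probability mass at zero.

\[
P(Y = k) =
\begin{cases}
\omega + (1 - \omega)(1 - q), & k = 0, \\
(1 - \omega)(1 - q)\, q^{k}, & k = 1, 2, \dots
\end{cases}
\]

Setting \(\omega = 0\) recovers the simple discrete exponential (geometric) distribution. A larger \(q\) (smaller \(\lambda\)) gives a broader tail, while a larger \(\omega\) gives more zeros, so the two features can vary independently.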

Figure 26 shows the estimated age-structured contact matrix for the population in each wave (again with ego-side limited to people 20 and over). The x-axis here indicates the 5-year age group of reference for the estimate (“self age group”), while the y-axis indicates the 10-year age groups of the estimated contacts for the reference groups. Cell colors indicate the mean number of contacts each of the respective reference groups are estimated to have had with each of the respective contact age groups in some hypothetical day during the wave period. Hovering the cursor over the cells will also show medians and the central 90% of the contact distributions. As with the other contact and household matrices, these matrices are not symmetrical because population sizes vary by age group.

We can observe here a high mean number of daily contacts along the diagonal (as is also the case in the estimated household contact matrix). That is, people from one age group tend to have contact with people from the same age group. We also observe some apparent differences between reported contacts in Waves 1 and 2. Most notably, in Wave 2 (with mobility restrictions lifted) there appears to have been an increase in contacts among young people, seen in Figure 26 as the larger number of estimated contacts that people aged 20 through 24 had with people aged 20 through 29. In addition, Wave 2 appears to show an increase in the contacts people aged 65 and older had with those aged 30 through 49.

In general, the estimated mean numbers of daily contacts are rather small. It should also be noted (as with the descriptive statistics) that these estimates are based on cross-sectional data that do not incorporate information about variation in the number of contacts each person has over time. Thus, an estimate of 0.5 could reflect variation within individual contact patterns over time, with a person of age X having contact with a person of age Y on average every 2 days. Alternatively, it could reflect variation at the population level, with some people having one or more contacts on a daily basis and others having no contacts on a daily basis. In fact, these estimates surely reflect variation at both levels, but it is not possible to differentiate between them from the available data.

Figure 24: Distribution of estimated number of daily contacts per person. Estimates are limited to population aged 20 and over.

Figure 25: Distribution of estimated number of daily contacts per person by 5-year age group of the reference person. Estimates are limited to population aged 20 and over. Panels are labelled by the lower age of each group.

Figure 26: Estimated mean number of daily out-of-home contacts by age group. Estimates are limited to population aged 20 and over.

Next Steps

The Distancia-Covid Group will further analyze these data to better understand social mixing patterns across ages, occupations, and other collected variables, to build up a network-focused analysis, and to prepare data to feed into a variety of epidemiological models, ranging from agent-based models to classical SEIR compartmental models. The data will help make the models more realistic, provide insight into how they may be affected by changing contact networks, and improve predictions about future scenarios. Keeping in mind the words of George E.P. Box that “all models are wrong but some are useful,” we hope that these data and estimates will increase the usefulness of the models our public health systems are relying on to make decisions during the present pandemic.

Acknowledgements

Special thanks to Ane Calvo, Jose A. Costoya, and Manuel Pereira for translating the survey into Basque and Galician, and to Wiebke Weber, Dennis Feehan, Ayesha Mahmud, Emilio Zagheni, and Jorge Cimentada for suggestions and feedback on the survey design.

References

Del Fava, Emanuele, Jorge Cimentada, Daniela Perrotta, André Grow, Francesco Rampazzo, Sofia Gil-Clavel, and Emilio Zagheni. 2020. “The Differential Impact of Physical Distancing Strategies on Social Contacts Relevant for the Spread of Covid-19.” medRxiv. https://doi.org/10.1101/2020.05.15.20102657.

Downes, Marnie, Lyle C Gurrin, Dallas R English, Jane Pirkis, Dianne Currier, Matthew J Spittal, and John B Carlin. 2018. “Multilevel Regression and Poststratification: A Modeling Approach to Estimating Population Quantities From Highly Selected Survey Samples.” American Journal of Epidemiology 187 (8): 1780–90. https://doi.org/10.1093/aje/kwy070.

Feehan, Dennis, and Ayesha Mahmud. 2020. “Quantifying Interpersonal Contact in the United States During the Spread of Covid-19: First Results from the Berkeley Interpersonal Contact Study.” medRxiv. https://doi.org/10.1101/2020.04.13.20064014.

Mossong, Joël, Niel Hens, Mark Jit, Philippe Beutels, Kari Auranen, Rafael Mikolajczyk, Marco Massari, et al. 2008. “Social Contacts and Mixing Patterns Relevant to the Spread of Infectious Diseases.” Edited by Steven Riley. PLoS Medicine 5 (3): 0381–91. https://doi.org/10.1371/journal.pmed.0050074.

Park, David K., Andrew Gelman, and Joseph Bafumi. 2004. “Bayesian Multilevel Estimation with Poststratification: State-Level Estimates from National Polls.” Political Analysis 12 (4): 375–85. https://doi.org/10.1093/pan/mph024.

Perrotta, Daniela, André Grow, Francesco Rampazzo, Jorge Cimentada, Emanuele Del Fava, Sofia Gil-Clavel, and Emilio Zagheni. 2020. “Behaviors and Attitudes in Response to the Covid-19 Pandemic: Insights from a Cross-National Facebook Survey.” medRxiv. https://doi.org/10.1101/2020.05.09.20096388.

Prem, Kiesha, Alex R. Cook, and Mark Jit. 2017. “Projecting Social Contact Matrices in 152 Countries Using Contact Surveys and Demographic Data.” Edited by Betz Halloran. PLoS Computational Biology 13 (9): e1005697. https://doi.org/10.1371/journal.pcbi.1005697.

R Core Team. 2020. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Stan Development Team. 2015. Stan Modeling Language User’s Guide and Reference Manual, Version 2.10.0. http://mc-stan.org/.

Stan Development Team. 2016. “rstanarm: Bayesian Applied Regression Modeling via Stan.” http://mc-stan.org/.

Wang, Wei, David Rothschild, Sharad Goel, and Andrew Gelman. 2015. “Forecasting Elections with Non-Representative Polls.” International Journal of Forecasting 31 (3): 980–91.

Zhang, X., J. B. Holt, H. Lu, A. G. Wheaton, E. S. Ford, K. J. Greenlund, and J. B. Croft. 2014. “Multilevel Regression and Poststratification for Small-Area Estimation of Population Health Outcomes: A Case Study of Chronic Obstructive Pulmonary Disease Prevalence Using the Behavioral Risk Factor Surveillance System.” American Journal of Epidemiology 179 (8): 1025–33. https://doi.org/10.1093/aje/kwu018.