When interpreting the results of any sample survey it makes sense to be circumspect about the data we are working with. In the paragraphs below we will look at what needs to be treated with caution, watched out for, or avoided. Using practical examples we will outline some of the problems that can arise in the process of collecting and interpreting data, and we will offer some practical recommendations.
Whenever we are trying to interpret data of any kind, and this does not apply only to public opinion surveys, we should first of all have a very good idea of what kind of data we are dealing with. We should know who acquired the data, how, when, and for what purpose; we should be interested in the method used to sample respondents and know whether the sample is representative and what target population it represents. Another important piece of information is how many people were interviewed as part of the given survey. Every serious output that interprets data from a sample survey must include this information, which is typically summarised in the survey's technical specifications.
Selecting respondents
In the case of a survey we should be interested in what method was used to select respondents, whether the sample is representative, and what target population it represents. For example, the public opinion surveys conducted as part of the Public Opinion Research Centre's continuous research project ‘Czech Society’ are always representative of the total population of the Czech Republic over the age of 15, but it is common to come across other kinds of samples as well. It is always necessary to know this, and every serious output that interprets data from a sample survey must include this information.
The timing and duration of data collection
The timing and duration of the actual data collection in the field play an important role in public opinion research, as this type of research is for the most part concerned with attitudes and opinions, which can change very quickly under the influence of current events.
The standardisation of questioning
When interpreting data from public opinion research it is also necessary to take into account the method of questioning used. In quantitative surveys an important role is played not just by sample representativeness but also by the standardisation of questioning, which essentially means that all respondents should be asked, and should answer, the questions under the same conditions in terms of the wording and order of the questions, and so on. These conditions should be defined by the researcher in advance and carefully monitored, and they should be neutral so that they do not influence the responses; this can sometimes be difficult to achieve. The main instrument for standardising the questioning is the questionnaire itself, which determines in advance which questions respondents will be asked and in what order.
Self-administered surveys vs guided questioning through an interviewer
The results of a survey can be significantly influenced by the way the survey is administered, that is, whether respondents filled in the questionnaire themselves or whether the questions were posed in some form of guided interview conducted by an interviewer using the prepared questionnaire. In some cases there is an advantage to respondents filling in the questionnaire themselves, especially if the questions are short and simple or if the survey includes questions that might be unpleasant to answer in front of an interviewer. In general, though, this mode of questioning has many disadvantages that negatively affect the quality of the data. Besides a usually low response rate, more complicated questions cannot be asked or are ill suited to this mode, filters and cards cannot be used, and the researcher has no control over the situation in which the questions are answered, which in an interview is controlled by the interviewer. With a self-administered questionnaire we do not know whether respondents answered the questions on their own or with the assistance of, say, other family members. And, with the exception of electronic questionnaires, which can record such information automatically, we usually do not even know how long it took a respondent to complete the questionnaire, whether the process of answering was interrupted, or whether the respondent answered the questions in the order in which they were listed. Most surveys are for this reason administered as a guided interview with the respondent conducted by a trained interviewer using the questionnaire.
Constructing the questions
A vital role in the interpretation of public opinion research is played by how the survey questions are worded. It is not just the wording of the questions that matters; the scale of response items used is also of key importance. In the case of symmetric ordinal scales that ask respondents to rate something (good/bad) or to indicate their level of satisfaction with something (happy/unhappy), and so forth, another important factor is the number of response categories/items and whether it is an even or odd number. When an odd number of categories is used, there will be a middle, neutral response item indicating something like ‘neither good nor bad’, ‘neither happy nor unhappy’, or perhaps ‘somewhat happy, somewhat unhappy’, and this middle category has a direct impact on the distribution of responses on the positive and negative sides of the scale, as it has the potential to draw in a larger share of respondents from one side of the scale than from the other. Also, unlike scales with an even number of response items, scales with an odd number can behave differently over time in relation to a question that otherwise does not change. For example, shifts between the two categories/items adjacent to the dividing line between the positive and negative sides of the scale may go entirely unnoticed on a scale with an odd number of categories/items, because they will be hidden inside the middle category. Conversely, a shift from the middle category into one of the adjacent categories on a scale with an odd number of categories may have no direct analogue on a scale with an even number of categories, where it need not appear as a shift between the two categories adjacent to the dividing line between the positive and negative sides of the scale.
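The first effect can be illustrated with a minimal numerical sketch in Python. The latent-attitude model and all the numbers below are entirely hypothetical, not survey data: the same small opinion shift, concentrated just around the midpoint of opinion, is clearly visible on a four-point scale but vanishes into the middle category of a five-point scale.

```python
import numpy as np

rng = np.random.default_rng(0)

def categorise(latent, n_points):
    """Map latent attitudes in [-1, 1] onto an n-point ordinal scale
    with evenly spaced cut-points (category 0 = most negative)."""
    edges = np.linspace(-1.0, 1.0, n_points + 1)[1:-1]
    return np.digitize(latent, edges)

def pos_neg_shares(latent, n_points):
    """Shares of responses on the positive and negative sides of the
    scale (the middle category of an odd scale counts as neither)."""
    cats = categorise(latent, n_points)
    positive = (cats >= (n_points + 1) // 2).mean()
    negative = (cats < n_points // 2).mean()
    return positive, negative

# Wave 1: hypothetical attitudes spread evenly across the latent range.
wave1 = rng.uniform(-1.0, 1.0, 100_000)

# Wave 2: only the mildly positive respondents sour slightly,
# drifting just across the midpoint of the latent attitude.
wave2 = wave1.copy()
mildly_positive = (wave2 > 0.0) & (wave2 < 0.15)
wave2[mildly_positive] -= 0.2

for n in (4, 5):
    for name, latent in (("wave 1", wave1), ("wave 2", wave2)):
        pos, neg = pos_neg_shares(latent, n)
        print(f"{n}-point scale, {name}: {pos:.0%} positive, {neg:.0%} negative")
# On the 4-point scale the movers cross from 'somewhat good' to
# 'somewhat bad', so the positive share visibly drops; on the 5-point
# scale the same movers start and end inside the middle category,
# so the positive and negative shares do not change at all.
```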
Substantial differences can also appear when an ostensibly similar question is used to measure the share of people who rate something positively or negatively on a ‘good/bad’ scale and when it is used to measure the share of people who are ‘happy/unhappy’ with the same thing. For example, in January 2017, 51% of households rated their standard of living on a five-point scale as ‘very’ or ‘somewhat good’, 37% as ‘neither good nor bad’, and 12% as ‘somewhat’ or ‘very bad’. That same month the very same respondents were asked to indicate on a five-point scale whether they were happy or unhappy with the current situation in the country in terms of their standard of living, and the results showed that 36% of them were very or somewhat happy, 40% were ‘somewhat happy and somewhat unhappy’, and 23% were somewhat or very unhappy, which is a noticeably less optimistic outcome. What is more, in October 2016 households rated their standard of living almost identically to the results in 2017, but one-half of respondents nonetheless indicated that their household had difficulty getting by on its income. If the results in each of these cases were interpreted in isolation from the others, we would clearly have three very different pictures of the socio-economic situation of households.
Another thing that has a big influence on the distribution of responses on a scale is whether the items at every point on the scale are verbally described or whether only the items at the extreme ends of the scale are verbally defined. For example, a question that measures how satisfied people are with their job using a seven-point scale ranging from very happy to very unhappy will almost always have a very different and much more polarised distribution of responses if only the categories/items at the extreme ends are described than if all the items on the scale are fully described, with three categories expressing greater or lesser degrees of satisfaction on one side, a middle category expressing a neutral position like ‘neither happy nor unhappy’, and three categories expressing greater or lesser degrees of dissatisfaction on the other side. On the fully described scale the responses will be much more concentrated in the first three categories and in the middle of the scale, while only a small share of responses will fall into the remaining three categories. This is consistent with the empirically demonstrated fact that most people are more or less satisfied with their job. Although both scales are symmetrical in the same way, many people do not perceive the scale this way when the verbal descriptions are not included, and do not automatically interpret the undescribed items next to, for example, the extreme response of ‘very unhappy’ as also expressing dissatisfaction, or the middle item as expressing a neutral viewpoint.
Recommendations for interpreting the results of public opinion research
If we are to correctly interpret the results of any public opinion research, we need to know the basic information about the process and method of data collection, and we should not try to draw interpretations from any isolated, individually measured piece of data. Whenever possible we should base our interpretations on multiple indicators drawn from variously formulated questions using different types of response scales, and rather than individual figures from an isolated survey, we should focus on trends observed in time series of these indicators. It is also necessary to bear in mind that the results of a survey are not ‘true values’ but always just estimates, with the true values very likely lying somewhere within close range of these estimates. When the size of a sample is around a thousand respondents, small differences of up to three percentage points are not statistically significant, as they may simply be the result of the sampling error that necessarily arises from having to select just a sample of respondents from the total population.
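The roughly three-point threshold follows from the standard formula for the sampling error of a proportion. A minimal sketch in Python, assuming a simple random sample and a 95% confidence level (real quota or cluster samples typically have a somewhat larger effective error):

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate half-width of the 95% confidence interval for a
    proportion p estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# Worst case (p = 0.5) for a sample of about a thousand respondents:
moe = margin_of_error(0.5, 1000)
print(f"margin of error: +/- {moe:.1%}")  # prints roughly +/- 3.1%
```

A measured share of 50% from such a sample is thus compatible with a true value anywhere from about 47% to 53%, which is why differences within that range should not be over-interpreted.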