

 
 
 


2. Background


08/04: A Comparison of Qualitative and Quantitative Firm Performance Measures

Richard Fabling (Reserve Bank of New Zealand), Arthur Grimes (Motu Economic & Public Policy Research), Philip Stevens (Ministry of Economic Development)
[ Last Updated 31 March 2008 ]


Business surveys form an important basis for academic and policy analysis. They are designed to provide information on important theoretical or policy questions. As such, they have the advantage of being directly targeted at the issue of interest. Many analyses of firm performance are based upon self-reported measures from such surveys (e.g. Machin and Stewart, 1990; Fabling and Grimes, 2007). However, there are reasons to suspect self-reported measures of firm performance to be subject to reporting error and/or perception biases.

From a statistical perspective, data issues may arise from sampling and measurement errors. Sampling error is the statistical imprecision due to using a random sample instead of the entire population. It depends on the nature of both the overall population (e.g. how it is distributed) and the sample taken (e.g. its size, whether it is stratified). Measurement error, on the other hand, results from the failure of the recorded responses to reflect the true characteristics of the respondents. Whereas statisticians have been considering issues of sampling errors for many years – the classic texts of sampling theory being over half a century old (e.g. Cochran, 1953; Deming, 1950; Hansen, Hurwitz and Madow, 1953) – the systematic consideration of the influence of the design of the survey instrument itself is considerably younger (e.g. Tanur, 1992; Sudman, Bradburn and Schwarz, 1996; Tourangeau, Rips, and Rasinski, 2000).1

2.1 Ask Me No Questions…

There are a number of reasons why the results from surveys and/or administrative data may not represent a true picture of the quantities which they purport to capture or in which the researcher is interested. The first of these is that respondents to surveys do not have the same incentives as those providing data for official purposes. For many official purposes, such as tax reporting, respondents are under legal obligations to provide correct data. Even when returning a duly completed questionnaire is compulsory, the incentives to respond accurately, and thus the time taken to fill in the survey, are generally lower.

Second, in a survey, the respondent may not hold all of the necessary information. Thus, she is required to either discuss with (or pass the survey to) staff that do, or estimate it herself. If the survey is filled out by more than one respondent, this may create other difficulties. For example, the possibility arises that the reference points for each respondent may differ. They may, for example, be referring to different time periods. Each link makes the chain weaker. On the other hand, if the survey is completed by the same person, it may be subject to the "common-rater" problem discussed in section 2.2 below.

Another problem with survey information is that much of it is subjective, rather than objective. Examples of subjective information include questions on job and life satisfaction, or assessments of the business environment. These questions may be subject to cognitive problems (e.g. related to the ordering or framing of questions), social desirability issues ("what do you want me to say?", "what do I want you to hear me say?") and/or situations in which objective answers simply do not exist or for which people cannot make the relevant choices (Bertrand and Mullainathan, 2001). Certainly the processes respondents go through in order to provide survey information may be more complex in a cognitive sense than those required to provide information in administrative forms (Tourangeau et al., 2000). This is true despite the fact that the latter may involve considerably more complex external data retrieval and processing. One reason for this is that administrative forms tend to be more tightly defined (for legal reasons, among others) and so create less potential for error in response.

Note that whilst administrative data such as tax data may be considered superior because, for example, firms could be made subject to audits with penalties for inaccurate filing, survey data may in turn be considered better than tax data because the questions are designed to collect the right conceptual variable. The data collected for administrative purposes may not correspond to the theoretical construct. For example, tax accountants and economists may have different definitions of the term "profit".

2.2 Measurement Error and Microeconometrics

Although much empirical work in microeconomics is dependent on survey data, the quality of the data is not always explicitly considered. When it is, it is in the context of the impact of measurement error on estimated models. In these cases it is often assumed that the error is "classical" – i.e. that the error in a variable is uncorrelated with the true value of that variable, with the true values of other variables, and with any errors in measuring those other variables.2
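The classical assumption can be written compactly. The notation below is introduced here purely for illustration and is not the authors': $x_j$ is the reported value of variable $j$, $x_j^{*}$ its true value, and $u_j$ the measurement error.

```latex
x_{j} = x_{j}^{*} + u_{j}, \qquad
\operatorname{Cov}(u_{j}, x_{k}^{*}) = 0 \ \text{for all } k, \qquad
\operatorname{Cov}(u_{j}, u_{k}) = 0 \ \text{for all } k \neq j.
```

The first condition with $k = j$ says the error is unrelated to the true value of the variable itself; with $k \neq j$ it covers the true values of other variables. The second covers the errors in measuring those other variables.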

Whilst survey response errors have recently received increasing attention, the studies concerned have tended to deal with individuals' responses to questions regarding personal issues. The topics have been as diverse as labour market transitions (Poterba and Summers, 1986), earnings (Bound and Krueger, 1991), consumption (Battistin, 2003), and nursing home expenses (McFadden, Schwarz and Winter, 2004). The literature on survey responses for firm-level information, however, is much sparser,3 although Brown and Medoff (1996) do consider the reporting of firm age and size by its workers. The work that does exist on how managers respond to surveys finds that respondents reply differently to subjective and objective questions (Hillage et al., 2002; Mason, 2005; Forth and McNabb, 2008a,b). This has important ramifications for how such measures are interpreted, in particular whether they are equivalent. Another concern raised in this work is the "common-rater" problem (Forth and McNabb, 2008b). This arises when the respondent who provides information on what the researcher may regard as key determinants of performance (e.g. management practices) is also the person who provides the firm performance data. This may generate a spurious correlation between the two.
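The common-rater problem can be illustrated with a small simulation. This is a minimal sketch under strong assumptions that are not taken from the literature cited above: true management practices and true performance are unrelated by construction, and a single hypothetical respondent trait (called `optimism` here) is added to both reported values. All names and parameter values are illustrative.

```python
import random
from math import sqrt


def correlation(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


rng = random.Random(42)
n_firms = 5000

# Assumption: true practices and true performance are independent.
true_practice = [rng.gauss(0, 1) for _ in range(n_firms)]
true_performance = [rng.gauss(0, 1) for _ in range(n_firms)]

# Hypothetical respondent-level trait that colours every answer the
# same person gives, whatever the question.
optimism = [rng.gauss(0, 1) for _ in range(n_firms)]
reported_practice = [p + o for p, o in zip(true_practice, optimism)]
reported_performance = [q + o for q, o in zip(true_performance, optimism)]

true_corr = correlation(true_practice, true_performance)
reported_corr = correlation(reported_practice, reported_performance)
print(f"true correlation:     {true_corr:.3f}")
print(f"reported correlation: {reported_corr:.3f}")
```

Under these assumptions the reported measures are clearly positively correlated even though the underlying quantities are not, because the shared respondent trait enters both.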

It is not entirely clear how the measurement error literature, which mainly concerns individuals reporting personal information about themselves, carries over to survey data on firms, where individuals report on their firms. The error that arises when individuals describe the qualities of their firms on the basis of imperfect information may be rather more like Hyslop and Imbens' (2001) "optimal prediction error" than the more common "classical measurement error". Under the former, the measurement error is independent of the reported value rather than of the true value, as it is in the classical case. Each has different implications for estimation.
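The difference between the two error structures can be seen in a simple simulated regression. The sketch below is illustrative only: it assumes a single mismeasured explanatory variable, normally distributed values, and a hypothetical true slope of 2. In the classical case the report is the truth plus independent noise; in the prediction-error case the truth is the report plus noise that is independent of the report.

```python
import random


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def simulate(n=20000, beta=2.0, seed=0):
    """Return the estimated slopes under the two error structures."""
    rng = random.Random(seed)

    # Classical error: report = truth + noise, noise independent of truth.
    truth_c = [rng.gauss(0, 1) for _ in range(n)]
    report_c = [t + rng.gauss(0, 1) for t in truth_c]
    y_c = [beta * t + rng.gauss(0, 1) for t in truth_c]

    # Prediction error: truth = report + noise, noise independent of report.
    report_o = [rng.gauss(0, 1) for _ in range(n)]
    truth_o = [r + rng.gauss(0, 1) for r in report_o]
    y_o = [beta * t + rng.gauss(0, 1) for t in truth_o]

    return ols_slope(report_c, y_c), ols_slope(report_o, y_o)


classical_slope, prediction_slope = simulate()
print(f"classical error:  slope = {classical_slope:.2f}")
print(f"prediction error: slope = {prediction_slope:.2f}")
```

With these settings the classical case gives a slope of roughly half the true value, while the prediction-error case recovers a slope close to 2. The result applies to error in the explanatory variable only; error in the dependent variable behaves differently.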

In order to use the data appropriately, the measurement error literature raises the following questions. First, is there measurement or reporting error? Second, is it systematically biased? If it is, with which other variables is it correlated? Note that non-response is also a particular source of potentially systematic error. There is the potential for overcoming this by re-weighting, but this is not always as straightforward as it might seem (Horowitz and Manski, 1998).

So what have we learned? Clearly, we want the error in measurement to be as small as possible. Even if the error is not correlated with anything else, it will still bias estimates towards zero ("attenuation bias") and reduce the statistical precision and, hence, significance of any tests we conduct (e.g. t-tests). It will also bias the coefficients on accurately-measured variables (Bound et al., 2001). Bias one way or the other can lead us to incorrectly reject or accept hypotheses. Once errors are correlated with other variables of interest, the difficulties multiply. In particular, the effects of such multi-correlated measurement error can quickly become complex and unpredictable. Relationships can appear in the data where none exist, or disappear where they do exist.
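For the simplest case, attenuation bias can be written out explicitly. The sketch below assumes a single explanatory variable measured with error that is uncorrelated with everything else. The symbols are introduced here for illustration: $x^{*}$ is the true value, $x$ the reported value, $u$ the error and $\varepsilon$ the regression disturbance.

```latex
\begin{aligned}
y &= \beta x^{*} + \varepsilon, \qquad x = x^{*} + u, \qquad
\operatorname{Cov}(u, x^{*}) = \operatorname{Cov}(u, \varepsilon) = \operatorname{Cov}(x^{*}, \varepsilon) = 0, \\
\operatorname{plim}\,\hat{\beta}_{\mathrm{OLS}}
  &= \frac{\operatorname{Cov}(x, y)}{\operatorname{Var}(x)}
   = \beta\,\frac{\sigma^{2}_{x^{*}}}{\sigma^{2}_{x^{*}} + \sigma^{2}_{u}}.
\end{aligned}
```

Because the ratio on the right lies between zero and one whenever $\sigma^{2}_{u} > 0$, the estimated coefficient is pulled towards zero, and the pull grows with the error variance relative to the variance of the true value.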

If we have more than one measure of a quantity, we have the ability to understand the problem a little more. What effects will there be? Evidence of what in the literature is called "pure classical measurement error" might include higher variance in one of the estimates or a reduction in the correlation between the two. If there is a bias in the data that is uncorrelated with other variables, we would expect to see a difference in mean values, but a high degree of correlation. The indicators for the multi-correlated error types are much more complex; even if one version of a variable is measured with error and the other is not, they depend on the relationships between the errors and the variables, and also those between the respective variables.
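These indicators are straightforward to compute when two measures of the same quantity are available. The sketch below is illustrative only and uses simulated data with hypothetical parameter values: one measure adds independent noise to a "true" series (pure classical error) and another adds a constant (an uncorrelated bias). The function name `compare_measures` is introduced here for the example.

```python
import random
from math import sqrt


def compare_measures(a, b):
    """Summarise how measure b differs from measure a of the same quantity."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((v - mean_a) ** 2 for v in a) / (n - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (n - 1)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)
    return {
        "mean_gap": mean_b - mean_a,          # difference in mean values
        "variance_ratio": var_b / var_a,      # >1 suggests extra noise in b
        "correlation": cov / sqrt(var_a * var_b),
    }


rng = random.Random(1)
truth = [rng.gauss(10, 2) for _ in range(5000)]
noisy = [t + rng.gauss(0, 2) for t in truth]  # pure classical error
shifted = [t + 1.5 for t in truth]            # constant bias, no noise

noisy_stats = compare_measures(truth, noisy)
shifted_stats = compare_measures(truth, shifted)
print("classical error:", noisy_stats)
print("constant bias:  ", shifted_stats)
```

In this setup the noisy measure shows a higher variance and a lower correlation with little change in the mean, while the shifted measure shows a gap in means with the correlation essentially unchanged. Real data will rarely separate this cleanly, particularly when the errors are correlated with other variables.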

This tells us something about the effects of measurement error, but not much about its causes. For this we need to think about how people interpret and respond to surveys and other methods of data collection.

2.3 Cognitive Psychology

Modern cognitive psychology approaches to understanding respondents' responses to surveys break the process down into four or five components (which roughly correspond to sequential stages).4 For example, Tourangeau et al. (2000) distinguish between:

  • comprehension,
  • retrieval,
  • judgement, and
  • response.5

Comprehension involves processes such as understanding the language of the question itself and attendant instructions ("syntax"), identifying the question's focus or the information that is sought ("semantics") and linking the terms used to actual concepts ("pragmatics"). Next the individual retrieves information (internally or externally) and may fill in any missing details. These are then assessed and a judgement is made as to how the information retrieved corresponds with the respondent's comprehension of what is required. In doing so, they may make an estimate based on partial retrieval. Finally, there is the response, which may involve translating the retrieved or generated information into response categories provided.

Survey design focuses on making questions as comprehensible as possible. Cognitive testing can relatively easily uncover misunderstandings of syntax. With skilled testers, difficulties with the semantic aspects of the question can be uncovered. The BOS sought to minimise problems with comprehension of the pragmatics and reduce obstacles to effective data retrieval by using classifications in the financial questions that accorded with the much longer running Annual Enterprise Survey. Nevertheless, it is clear that methods of data retrieval will vary across firm types. In small firms, the general manager may also be the accountant, or they may contract such work out to another firm. In larger firms, it may be done in separate departments.6 The "quality" of a respondent's judgement will in part depend on the job at hand. If the terms and definitions included in the question differ from those by which the respondent or their colleague knows them, or how they are recorded in the books, they must exercise greater judgement. This introduces a potential for error in response.

The consideration of incentives is economists' bread and butter. So it comes as no surprise to learn that it is important to consider respondents' incentives to provide information and to spend the time required to provide information of high quality. This is also considered in the cognitive psychology survey literature. Krosnick's (1991) theory of survey satisficing relates the decisions of respondents to Tourangeau et al.'s (2000) four components of processing. Not all respondents are sufficiently motivated or able to execute each of the four components of processing as carefully as would be hoped. The three important factors in the decision to satisfice rather than optimise (provide the best response they possibly can) are – perhaps rather unsurprisingly – the complexity of the task, the respondent's ability and their motivation.

2.4 Are we Better than Average, on Average?

One potential bias that is likely to affect self-reported measures of relative performance is the tendency for people to believe (or at least report) that they are above average.7 According to a survey of the psychology literature by Taylor and Brown (1988), people have unrealistically positive views of the self. For example, evidence suggests that managers are inclined to believe they are superior to the average manager (Larwood and Whittaker, 1977), and entrepreneurs perceive their own chance of success as being higher than that of their peers (Cooper, Woo and Dunkelberg, 1988). It might be expected, therefore, that there is an upward bias to estimates of firm performance.8 Alternatively, one might expect the ability to perceive one's business environment correctly to be part of the set of skills required of management. Thus one would expect this misperception bias to be correlated with management quality and hence firm performance. For more on this subject see Fabling, Grimes and Stevens (2007).

2.5 Counting What Counts

In summary, there is no "golden bullet" for extracting the information required by researchers and policy-makers from firms. Administrative data such as tax data may be considered desirable because, for example, firms could be made subject to audits with penalties for inaccurate filing. However, in some cases survey data may be considered superior. This is because questionnaires can be designed to collect the right conceptual variable. The measurement error literature considers this mainly in the light of impact on estimated models – i.e. the size and direction of the bias and whether this is correlated with other variables of interest. The cognitive psychology literature explicitly considers the processes respondents undertake and their influence on response. This influence will depend on the complexity of the task being asked, as well as on the ability and motivation of the respondent to undertake it.

Much of our discussion thus far has focused on the problems of subjectivity for obtaining unbiased estimates of objective quantities. Of course the subjectivity of such data is not always a weakness. Indeed, it can be very informative about the perceptions of the respondent. Measures of what firms – or, rather, their employees – observe tell us something about their behaviour that objective measures may not. They allow us to understand firm behaviour in terms of active responses to the environment they observe, rather than merely considering the firm as a passive part of a system being acted upon by abstract forces. Nevertheless, subjective data should not be confused with objective data by the user. For this purpose, it is at best an estimate that is likely to be measured with error.


1 According to Bradburn (2004), the meeting of the cognitive psychology and survey literatures only came about around 30 years ago. He suggests that the earliest such meeting was a seminar held in 1978 by the British Social Science Research Council and the Royal Statistical Society on problems in the collection and interpretation of recall data in social surveys (p.5).

2 For more on the causes and effects of, and solutions to, measurement error in econometric studies, see Bound, Brown and Mathiowetz (2001).

3 Forth and McNabb (2008a,b). Forth and McNabb (2008a) discuss some previous tentative investigations in this area using the UK Workplace Employment Relations Survey, or WERS.

4 Although the overall process may include feedback loops.

5 Cannell, Miller and Oksenberg (1981), on the other hand, consider five components: comprehension of the question; cognitive processing; evaluation of the accuracy of the response; evaluation based on other criteria; and accurate responding.

6 We consider instructions as to who should fill out which sections of the form in Module A of the BOS in section 3 below.

7 Note that this may be an Anglo-Saxon trait. Comparative work between Britain and France suggests that there is no tendency for the majority of French firms to say they are above average, unlike their counterparts across the Channel (source: personal correspondence with John Forth).

8 Note, however, that the issue of overconfidence is often modelled as underestimation of the variance of signals, rather than an overestimation of mean values (De Bondt and Thaler, 1995).


