3. Data
The survey data considered in this paper come from the Business Operations Survey (BOS) 2005 and 2006. The BOS data are matched to data obtained from Statistics New Zealand's prototype Longitudinal Business Database (LBD). The LBD is built around the Longitudinal Business Frame (LBF). Attached to this are, among other things, Goods and Services Tax (GST) returns, financial accounts (IR10) and aggregated Pay-As-You-Earn (PAYE) returns, all provided by the Inland Revenue Department (IRD). The full prototype LBD is described in more detail in Fabling, Grimes, Sanderson and Stevens (2008).
The BOS is an annual three-part modular survey, which began in 2005. The first module focuses on firm characteristics and performance. The second module alternates between biennial collections on innovation and on business use of ICT. The third module is a contestable module that enables specific policy-relevant data to be collected on an ad hoc basis.9 The BOS is conducted using two-way stratified sampling, with stratification on rolling-mean-employment (RME) and two-digit industry according to the ANZSIC system.10 The survey excludes firms with fewer than six RME and firms in the following industries: M81 Government Administration, M82 Defence, P92 Libraries, Museums and the Arts, Q95 Personal Services, Q96 Other Services, and Q97 Private Households Employing Staff. The 2005 survey was sent to 6,979 enterprises, with a total of 5,595 usable responses returned (a response rate of 80.2% after adjusting for ceases). The 2006 survey achieved an 81.7% response rate, with a total of 6,066 responses.
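The two exclusion rules above (size and industry) can be sketched as a simple eligibility check. This is only an illustration: the function name, the record layout and the example firms are hypothetical, and the sketch does not reproduce the stratified sample selection itself.

```python
# Industry codes excluded from the BOS, as listed in the text.
EXCLUDED_INDUSTRIES = {"M81", "M82", "P92", "Q95", "Q96", "Q97"}
MIN_RME = 6  # firms with fewer than six RME are excluded


def in_bos_population(rme: float, industry: str) -> bool:
    """Return True if a firm passes both exclusion rules (hypothetical helper)."""
    return rme >= MIN_RME and industry not in EXCLUDED_INDUSTRIES


# Hypothetical firms: (identifier, RME, two-digit industry code)
firms = [
    ("A", 12.0, "C21"),
    ("B", 4.5, "C21"),   # fewer than six RME, so excluded
    ("C", 30.0, "M81"),  # excluded industry
    ("D", 6.0, "F45"),   # exactly six RME is retained
]

eligible = [fid for fid, rme, ind in firms if in_bos_population(rme, ind)]
print(eligible)
```

Only firms that satisfy both conditions remain in scope; a firm with exactly six RME is kept because the text excludes only those with fewer than six.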
It is important to note that, in common with many surveys conducted by Statistics New Zealand (SNZ), the survey is statutory. The front page of the BOS bears the imprimatur: "The taking of this survey has been approved by the Minister of Statistics and the return of this questionnaire, duly filled in and signed, is a compulsory requirement under the Statistics Act 1975". Because of this, the BOS has a considerably higher response rate than comparable surveys internationally (the 2004 Workplace Employment Relations Survey achieved a response rate of 64%, for example). The implications for data quality are uncertain. Whether this requirement increases the quality of responses or simply brings into the sample a number of firms that will spend less time and effort on the survey remains to be seen.11 However, according to the cognitive psychology literature discussed in the previous section, the nature of respondent motivation is an important input into the ability of surveys to generate good-quality data.
The quantitative financial information is reported in the first part of the BOS, "Part i: Financial information". The qualitative performance information is contained in the third part of the BOS, "Part iii: Business performance". According to the instructions contained in the survey, Part i should be completed by the finance department or the accountant. If the firm does not have an accountant on-site, Part i should be completed by the General Manager. We do not have information on who completed each section of the survey, which creates the possibility of reporting bias. With this caveat in mind, the instructions clearly state that the quantitative financial information should be completed either by someone who has responsibility for finance or by a general manager with reference to an accountant, and so we feel reasonably confident that such information is as objective as is possible in such a survey.
The BOS approaches international best practice for such surveys. It has removed replication of surveys12, which reduces respondent load and simplifies sampling. It is explicitly designed with a panel element, which enables more sophisticated analysis and allows us to better understand issues of causality and, as the panel lengthens, dynamic issues.13
The administrative data with which we compare the BOS come from three sources: counts of employees from PAYE returns,14 the Business Activity Indicator (BAI) dataset and IR10 forms. The BAI is derived from GST data, with the main manipulations applied being temporal and group return apportionment and limited imputation for single missing returns. In this paper, the BAI is used for data on sales of goods and services, and purchases. Financial accounts returns (IR10) are the source for information on purchases, profits, and opening and closing stock. We use IR10 sales for comparative purposes, with the difference between the two alternative administrative sources providing some context for our comparison with the BOS. We also use the IR10 data to examine income and expenses in greater detail than is available in the financial module of the BOS, in order to aid our understanding of any differences. The variables used in this paper are discussed in more detail alongside the comparisons themselves in the following section and in the data appendix.
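To illustrate what temporal apportionment of GST returns might involve, the sketch below pro-rates each return's sales into a financial year by the share of the return period's days that fall within that year. This day-weighted rule is an assumption made for illustration only; the actual BAI apportionment rules are not described in this section. The function names and figures are hypothetical.

```python
from datetime import date


def overlap_days(start_a, end_a, start_b, end_b):
    """Number of days shared by two inclusive date ranges."""
    start = max(start_a, start_b)
    end = min(end_a, end_b)
    return max((end - start).days + 1, 0)


def apportion(returns, fy_start, fy_end):
    """Day-weighted share of each return's sales falling in the financial year.

    Assumed rule for illustration, not the documented BAI method.
    """
    total = 0.0
    for period_start, period_end, sales in returns:
        period_len = (period_end - period_start).days + 1
        shared = overlap_days(period_start, period_end, fy_start, fy_end)
        total += sales * shared / period_len
    return total


# Hypothetical two-monthly returns: (period start, period end, sales)
gst_returns = [
    (date(2005, 3, 1), date(2005, 4, 30), 61000.0),  # straddles the year start
    (date(2005, 5, 1), date(2005, 6, 30), 50000.0),  # wholly inside the year
]

# Financial year ending on a 31 March balance date.
fy_sales = apportion(gst_returns, date(2005, 4, 1), date(2006, 3, 31))
print(fy_sales)
```

In this example the first return covers 61 days, of which 30 fall in the financial year, so only that fraction of its sales is attributed to the year; the second return is counted in full.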
In order to make appropriate comparisons, it is important to ensure that the information relates to the same financial year. Respondents to the BOS are asked to state "the balance date of the financial accounts which you will use for this questionnaire" (Section A, part I, Question 6 in the 2005 survey). This date is used to match PAYE, IR10 and BAI data to the appropriate financial year. Note that some firms report information relating to the same financial year in both surveys. Because of this, we remove some observations to enable matching to take place. If firms report that their information relates to the same financial year when completing both the 2005 and 2006 surveys, we use the response to the 2006 survey.15 We do not discard these observations altogether; in the appendix to this paper we compare the sales reported in the financial section of each of the BOS surveys with that of official sources for firms that supplied the "same" data for both years (Table 19 and Table 20).16
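The rule for firms that report the same financial year in both surveys can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout (firm identifier, survey year, reported balance date) and the function name are hypothetical, and "same financial year" is taken to mean an identical reported balance date.

```python
from datetime import date

# Hypothetical responses: (firm_id, survey_year, reported balance date)
responses = [
    ("A", 2005, date(2005, 3, 31)),
    ("A", 2006, date(2006, 3, 31)),
    ("B", 2005, date(2005, 6, 30)),
    ("B", 2006, date(2005, 6, 30)),  # same financial year in both surveys
    ("C", 2006, date(2006, 3, 31)),
]


def split_duplicates(rows):
    """Keep the latest survey's response for each (firm, balance date) pair.

    Earlier responses for the same pair are set aside rather than deleted,
    so they remain available for a separate comparison.
    """
    latest = {}
    for firm, year, balance in rows:
        key = (firm, balance)
        if key not in latest or year > latest[key]:
            latest[key] = year
    kept = [r for r in rows if latest[(r[0], r[2])] == r[1]]
    set_aside = [r for r in rows if latest[(r[0], r[2])] != r[1]]
    return kept, set_aside


kept, set_aside = split_duplicates(responses)
print(len(kept), set_aside)
```

Firm B's 2005 response is set aside because its 2006 response refers to the same balance date; all other responses are kept for matching to the administrative data.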
Given the difficulties with applying the appropriate input and output price deflators and the fact that we only consider two consecutive years of survey data, in what follows we consider nominal figures only. In order to make our comparisons as transparent as possible, we also only consider firms that are in existence for the whole of the financial year. The nature of the data collected for business start-ups and failures, in their first and final years of existence respectively, is an important issue. It matters both from a data quality perspective and because what happens to firms when they are born and die is of particular interest to researchers and policy-makers alike. However, it is beyond the scope of this piece of work.