2. The Data
Two main data sources are used in this paper, both based on data provided to Statistics New Zealand by the New Zealand Customs Service. The first is detailed data on New Zealand's aggregate merchandise exports at the product level. The second is unit-record firm data on merchandise trade, linked to simple firm characteristics from the newly compiled prototype Longitudinal Business Database (LBD).
2.1 Aggregate Data
The aggregate data was provided by Statistics New Zealand and forms the basis of their official Merchandise Trade statistics. It provides annual data on the value of New Zealand's merchandise trade, disaggregated by partner country and product type. Each line entry shows the value of exports of a specific product, to a given destination. Values are given in current New Zealand dollars. Exports are valued free on board and are classified to the country of final destination, as shown on the Customs declaration. Exports are defined as "goods which add to or subtract from the material resources in New Zealand as a result of their movement in or out of the country", with a small number of exclusions, such as currency transactions, goods consigned for modification or repair, and temporary trade goods such as the effects of New Zealand tourists going overseas.2
Two versions of the aggregate data are used. The first uses the SITC 5-digit product classifications, and covers the years from 1988 to 2005. The second uses the HS 10-digit product classification and covers the period 1996-2005. Although HS10 data is available at the aggregate level back to 1988, there have been numerous changes in product classifications over time, and it has not yet been possible to create a full concordance beyond 1996. Section 2.3 discusses the classification changes and their impact on the analysis in more detail. In addition, restricting the period to 1996 onwards allows for better comparison to the firm level data. The focus will be on the 10 digit product data from 1996-2005 as this gives a more detailed picture of the product and firm level changes in export behaviour. The SITC 5-digit aggregate data is provided mainly to situate the more detailed data within a longer time period and for comparison with existing international research (see Section 5.2.1).3
2.2 Firm-Level Data
The firm-level data was provided through Statistics New Zealand's LBD. The complete dataset, described in more detail in Fabling et al (2007), combines administrative and survey data over various time frames and frequencies.
For this paper, we make use of four data sources – the Longitudinal Business Frame (LBF), the Business Activity Indicator (BAI) and Customs data for the calendar years 1996 to 2005, and IR4 Company Tax data from 1999 to 2004. This time frame is chosen to maximize the available time period while maintaining a high level of comparability between the sources and mitigating the effect of changes in product classifications and political changes in destination countries. The approach taken to political and classification changes is discussed further in Section 2.3, below.
The main focus of the paper is the Customs data on merchandise exports. Customs data is available on a shipment-level basis, and has been rolled up to calendar years for comparability with the aggregate data. Customs data is available over the period from 1988 to 2006, but the quality of matches between Customs clients and firms falls prior to 1998.4
It is uncertain how many unmatched firms there are, as it is not possible to distinguish between an unmatched exporting firm and an individual client. However, it is likely that the data used here underestimate the true value of enterprise exports for the first two years of the sample, as a significant proportion of total exports is attributed to "non-enterprises" in those years. Still, it is notable that the 13,092 Customs clients which are not matched to business data account for only 1.4% of merchandise export value over the ten year period, with the remainder spread among 23,778 identifiable firms. Table 1 shows the percentage of total export value attributed to New Zealand located firms, overseas located firms, and individuals and unmatched firms respectively. From 1998 onwards, over 99% of aggregate export value can be matched to firms, with at least 96.6% of value attributable to New Zealand based firms.
The LBF, BAI and IR4 data are used to provide information about firm characteristics, and about the population of non-exporting firms. The population used in the analysis covers all 'active' firms, where activity is defined broadly to include all firms in which we observe output, purchases of inputs or factors of production. Specifically, a firm is classed as active if it has at least one of: a positive export value; positive BAI sales or purchases value; positive employee count or PAYE salaries and wages; and/or positive total income, total expenditure or total fixed assets.
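The activity rule above can be written as a simple predicate. The following is a minimal sketch only: the record layout and all field names are hypothetical and are not taken from the LBD, and a missing value is assumed to be stored as None.

```python
from dataclasses import astuple, dataclass
from typing import Optional


@dataclass
class FirmYear:
    """One firm in one calendar year. All field names are hypothetical."""
    export_value: Optional[float] = None       # Customs
    bai_sales: Optional[float] = None          # BAI (GST returns)
    bai_purchases: Optional[float] = None      # BAI (GST returns)
    employee_count: Optional[float] = None
    paye_wages: Optional[float] = None         # PAYE salaries and wages
    total_income: Optional[float] = None       # IR4
    total_expenditure: Optional[float] = None  # IR4
    total_fixed_assets: Optional[float] = None # IR4


def is_active(record: FirmYear) -> bool:
    """A firm-year is active if at least one indicator is strictly positive."""
    return any(v is not None and v > 0 for v in astuple(record))
```

Under this sketch a firm with only a zero-valued GST return, or with no data at all, is not counted as active in that year.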
From the LBF, we make use of firms' ANZSIC industry classification. The LBF covers the financial years 1999/00 to 2005/06. ANZSIC codes are attributed to the calendar year which has the greatest overlap with the financial year period recorded in the LBF. For the years outside the LBF, we use the ANZSIC from the closest available year for that firm. As such, for most firms the ANZSICs reported in 1996-1998 will be assumed to be the same as their 1999 industry.
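The two attribution steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually applied to the LBF: the function names are hypothetical, financial years are assumed to be given as start and end dates, and ties are broken in favour of the earlier year.

```python
from datetime import date
from typing import Dict


def calendar_year_for(fy_start: date, fy_end: date) -> int:
    """Return the calendar year sharing the most days with the financial year."""
    best_year, best_days = fy_start.year, -1
    for year in range(fy_start.year, fy_end.year + 1):
        lo = max(fy_start, date(year, 1, 1))
        hi = min(fy_end, date(year, 12, 31))
        days = (hi - lo).days + 1  # inclusive day count of the overlap
        if days > best_days:       # strict '>' keeps the earlier year on ties
            best_year, best_days = year, days
    return best_year


def nearest_year_code(codes_by_year: Dict[int, str], year: int) -> str:
    """For a year with no LBF record, borrow the code from the closest year."""
    nearest = min(codes_by_year, key=lambda y: (abs(y - year), y))
    return codes_by_year[nearest]
```

For a financial year running from 1 April 1999 to 31 March 2000, the first function assigns the code to calendar year 1999, and the second would carry that code back to 1996-1998 for a firm first observed in the LBF in 1999.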
The BAI includes data on firms' total sales, which are sourced from GST returns. This is used as a measure of firm size. Total sales are rolled up to calendar years.5
We also consider differences in export behaviour between domestic and foreign owned and controlled companies. Our indicator of foreign control is defined as those firms which answer "yes" to the question "is this company controlled or owned by non-residents?" on the IR4 company tax return. Any analysis based on ownership is therefore limited to companies, with other business types (sole traders, partnerships etc) excluded as they do not complete IR4 returns.
Differences in business types across industries may affect the observed exporting rate among foreign owned firms. For example, the Agriculture, Forestry and Fishing industries have both a relatively low proportion of companies and a low proportion of (direct) exporters.6 By concentrating on companies we therefore exclude most Agricultural, Forestry and Fishing firms and thereby increase the proportion of the sample with observed exports.
Table 2 indicates the level of data coverage for each year, as the proportion of active firms which have data available in each of three key areas – ANZSIC industry codes, GST sales and purchases data, and an indicator for foreign ownership and control. Many of the observed missing variables are due to the differing time periods over which each data source is available. In the years prior to 1999, only GST and exports data is available, so both the ANZSIC and foreign control indicators are missing for all firms. This also impacts on the total observed population, as firms which do not have either GST data or exports data for these years will not be in the sample of active firms. This situation is mirrored for 2005, as very few firms have LBF or IR4 data available for the 2005 calendar year.
Although the aggregate and firm level data are based on the same source, export value totals in each of the two datasets do not match exactly. There are three sources of this variance. The first two involve restrictions which we place on the firm-level data, to include only enterprises located in New Zealand. That is, exports by individuals and by overseas located firms are excluded. The third discrepancy arises due to imperfect matching of firms to Customs clients, as discussed above.
An additional concern which should be noted with the longitudinal data in the LBD is that it is subject to discontinuities where there are changes in the structure or legal status of firms over time.7 These problems arise because the LBD tracks firms according to their enterprise number, which is based on legal units. This has implications in the later parts of the analysis presented in this paper, as some firm entry and exit may in fact be spurious, due to changes in legal status rather than changes in the operational aspects of the firm. Similarly, restructuring within multi-enterprise groups may result in spurious entry into and exit from export markets, as exports may appear to shift between members of the group. Future development work on the LBD may reduce the severity of this issue, through implementing repairs to the longitudinal firm links and developing methods to deal with restructuring within groups.
2.3 Classification Changes
Attempts to distinguish between new and continuing products and markets are complicated by political and classification changes over time. For example, the fall of the Berlin Wall would be reflected in export data by an apparent cessation of trade with two countries (East Germany and West Germany) and the commencement of trade with another country (Germany). In order to prevent spurious entries and exits from being observed in the analysis, a number of amendments must be made to the export data.
2.3.1 Classification of Destinations
The destinations shown in the dataset reflect a combination of geographical boundaries and political control. Two distinct destinations will be listed if they are significantly geographically separate (eg. the UK is separated from British Indian Ocean Territories, despite being part of the same political entity) and/or politically separate (eg. Vatican City State is listed separately from Italy). This gives a dataset with 224 discrete destinations for New Zealand exports over the period from 1996-2005. Non-country destinations, such as 'ships stores', 'ships bunkering' or 'passenger effects' are dropped from the analysis.
Where there have been changes in political boundaries (eg. the creation of the countries which were formerly part of Yugoslavia or the re-unification of Germany) destinations have been amalgamated to the smallest possible consistent entity, regardless of whether the political change involved a split or a unification. That is, the Former Yugoslav republics are classified as one destination (Former Yugoslavia), as are East and West Germany (Germany). This is salient only for the SITC 5-digit data, which covers a period of significant political changes in the late 1980s and early 1990s, but does not affect the 1996-2005 sample period. After these reclassifications, the SITC5 dataset shows a total of 214 distinct destinations, but there are no changes to the HS10 dataset.
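The amalgamation step can be illustrated with a simple lookup applied before values are summed. This is a sketch only: the destination labels and the partial mapping below are hypothetical and do not reproduce the labels or the full reclassification used in the dataset.

```python
from collections import defaultdict

# Hypothetical, partial mapping from recorded destination to the smallest
# entity that is consistent over the whole period.
AMALGAMATION = {
    "East Germany": "Germany",
    "West Germany": "Germany",
    "Yugoslavia": "Former Yugoslavia",
    "Croatia": "Former Yugoslavia",
    "Slovenia": "Former Yugoslavia",
}


def amalgamate(rows):
    """Sum export value by (year, consistent destination).

    `rows` is an iterable of (year, destination, value) tuples.
    Destinations not in the mapping are kept unchanged.
    """
    totals = defaultdict(float)
    for year, destination, value in rows:
        consistent = AMALGAMATION.get(destination, destination)
        totals[(year, consistent)] += value
    return dict(totals)
```

Because splits and unifications are both mapped to the same consistent entity, exports to East Germany and West Germany before re-unification and to Germany afterwards appear as one continuing destination rather than two exits and one entry.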
The decision to treat countries which split identically with those which join together is based on the assumption that the political changes which take place are exogenous to the changing composition of trade.8 In contrast, product classification concordances are restricted to their 1996 classifications, as changes in classifications may be directly related to trade performance, as discussed below.
2.3.2 Classification of Products
The SITC data is available to the 5 digit level. This implies a reasonable, though not high, level of differentiation. For example, 'meat of sheep' can be distinguished at the level of 'fresh or chilled' vs 'frozen', but not by cut.
The HS10 data provides a much greater level of detail on products traded. At the 10-digit level, we can differentiate between, for example, baseball caps, bowlers' hats and other cloth or stitched hats; and between motorcycle helmets, bicycle helmets and fireman's helmets.
One drawback of the HS10 data is that it has gone through a number of revisions over time. Over the period from 1 January 1996 to 31 December 2005, there have been 30 separate revisions to the data, including one full revision in 2001.
These revisions often include the introduction of classifications for new technologies or varieties. However, the introduction of a new category to the HS classification system does not necessarily correspond with the introduction of a new product to New Zealand's trade flows. For example, until October 2005, fresh pears were distinguished only with respect to whether they were European or nashi pears. Since that date, the 'Pears' category is distinguished by variety – Belle de Jumet, Beurre Bosc, Doyenne du Comice, Taylor's Gold, plus separate categories for 'green pears', 'red pears' and 'pears, not elsewhere specified'. Other products have been reclassified into wider groupings, or simply into different combinations. Until April 1998, dairy spreads were distinguished according to fat content, but are now listed as a single category. Wools have always been distinguished according to thickness (microns), but the relevant gradations have changed over time.
Classification changes may also reflect changes in recognition of salient points in terms of product characteristics - for example from July 1997, the classification distinguishes between digital audio tape and audio tape other than digital, whereas previously these were both listed as 'Media, unrecorded; magnetic tapes, prepared, (of a width not exceeding 4mm), for sound or similar recording of other phenomena (excluding products of chapter 37), audio tape'.
Changes to the classification system also reflect changes in New Zealand's tariff system. For example in May 2003, a distinction between synthetic and hog bristle paint brushes was introduced, while in July 2005 a distinction based purely on the tariff schedule was dropped – between 'Pens; felt tipped and other porous-tipped pens and markers' and 'Pens; felt tipped and other porous-tipped pens and markers; alternative rate'.
In summary, there are three main reasons for classification revisions:
- changes in the perceived nature or importance of the goods eg. consolidation of several varieties into a single good, due to falling relevance of the differences
- the development/recognition of new goods
- changes in the tariff rates and conditions for a given product
This creates substantial difficulties for comparing product composition over time. In the basic trade data, many apparently new products are actually products which have been traded previously but have been reclassified. We therefore need to create a consistent concordance of all products over time. As changes in classifications are often endogenous (eg they occur because a new good has been introduced) we do this by attributing all product codes back to their HS1996 code.
Although this approach is suitable for product groups which have split over time, it cannot be used on products which have merged. For example, a 1996 category "Apples" which later splits into "Red Delicious" and "Granny Smith" could be rolled back to its original 1996 group, but if a pair of categories "electric toothbrushes with revolving heads" and "electric toothbrushes without revolving heads" had merged to "electric toothbrushes", they could not be tracked to a unique 1996 code. Products whose classifications have merged over time are therefore excluded from the analysis of product dynamics, with a small number of exceptions, described below.
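The asymmetry between splits and merges can be made concrete with a small sketch. It assumes, hypothetically, that each revision is recorded as a map from a new code to the code or codes it replaced, that this map contains no cycles, and that any code without a recorded predecessor is a 1996 code. The product labels stand in for HS10 codes and are illustrative only.

```python
def trace_to_1996(code, predecessors):
    """Return the set of 1996 codes from which `code` descends."""
    parents = predecessors.get(code)
    if not parents:  # assumption: no recorded predecessor means a 1996 code
        return {code}
    roots = set()
    for parent in parents:
        roots |= trace_to_1996(parent, predecessors)
    return roots


def hs1996_code(code, predecessors):
    """Return the unique 1996 code, or None if the code results from a merge."""
    roots = trace_to_1996(code, predecessors)
    return next(iter(roots)) if len(roots) == 1 else None


# Hypothetical revision history mirroring the examples in the text.
PREDECESSORS = {
    "red_delicious": ["apples"],  # split: rolls back cleanly
    "granny_smith": ["apples"],
    "electric_toothbrushes": [    # merge: no unique 1996 code
        "toothbrushes_revolving",
        "toothbrushes_non_revolving",
    ],
}
```

Here both apple varieties roll back to "apples", while "electric_toothbrushes" returns None and would be set aside. A fuller implementation would also need to flag the pre-merge codes themselves, so that the whole product group is treated consistently across years.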
In examining the materiality of merged classifications, we find that around 6.3% of total export value between 1996 and 2005 is in products which are subject to mergers. This value ranges from 4.5 to 9.3% in any given year. Much of the value in merged products is associated with a small number of products. Over half the value of merged products is accounted for by just 10 HS10 codes. A further 15% is accounted for by the next 10 codes.
In order to capture this value, we therefore apply a manual adjustment for the top 20 HS10 products. This involves creating a small number of codes which merge several groupings into a single HS1996 code. These adjustments affect 3 groups of products – simple leather products, wool, and newsprint/paperboard. In the case of wool, for example, the HS10 classification changes over time related to the fineness of the wool. In 1996, classes of wool were grouped according to three fineness categories – less than 28 microns in diameter, 29 to 34 microns, and 35 microns and over. On the first of July 1999, these classifications changed to four classes: less than 24.5 microns, 24.5 to 31.4 microns, 31.4 to 35.4 microns, and greater than 35.4 microns. As it is impossible to distinguish what value of each grade of wool is attributable to the other categories, we group all grades together, while maintaining the distinction between, for example, whether or not wool has been degreased or carbonized. Given these adjustments, our revised dataset allows us to use 98.6% of total export value over the 10 years of the sample in the aggregate data, with a maximum of 1.8% of value excluded due to mergers in any one year. In the micro data, around 7.5% of total export value is affected by merged codes. After the manual corrections are applied, this falls to 1.3%. Products are differentiable into 10,090 distinct products, plus the aggregated code for those products which have been merged. In the analysis below, merged codes are included in aggregate export statistics but are excluded from analysis at the product level.
In contrast, the SITC 5 data was last updated in 1987/88, and shows a total of 3,030 products exported or re-exported from New Zealand over the 18 years to 2005. While this allows for a consistent set of definitions, any products which did not exist in 1988 will be subsumed in categories "not elsewhere specified". As a rough guide to the extent to which new products are encompassed in these 'other' codes, the value in product codes labeled 'not elsewhere specified' trebled over the 18 years in question, while the 'specified' product codes doubled. As such, the problem of identifying truly new goods in New Zealand's trade flows is hidden, rather than actually solved.