
If the integrity of economic data rests on a single pillar, it is probably the accuracy of industry classification. For decades, national statistical agencies have relied on the International Standard Industrial Classification (ISIC) system to bring order to the messy diversity of economic activity. ISIC is now the bedrock of headline statistics—GDP, productivity, employment, wage growth. Yet, beneath the surface, the process of assigning ISIC codes is not as automatic or as foolproof as the casual reader of statistical yearbooks might assume.
The most obvious challenge arises from survey design itself. When firms or individuals are asked to self-report their ISIC classification, the room for error is significant. Many businesses, particularly smaller ones, do not see themselves as neatly fitting into a single code. Activities often straddle multiple categories—manufacturing and retail, for example, or transportation and warehousing. In practice, a business’s “primary” activity may shift over time or be interpreted differently depending on the respondent. The result: misclassification, sometimes subtle, sometimes glaring.
This matters more than one might think. At scale, misclassification can distort productivity and wage statistics. Imagine a survey where a group of highly productive tech consultancies (properly ISIC 6202) are mistakenly reported as “general business services” (ISIC 8299). Productivity for the former appears artificially low, and for the latter, perhaps inexplicably high. Such shifts, multiplied across an economy, can alter sectoral growth rates, skew wage dispersion measures, and even affect policy decisions.
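To make the mechanism concrete, here is a minimal simulation sketch with invented numbers: a subset of high-productivity firms that truly belong in ISIC 6202 is recorded under ISIC 8299, and the measured sector averages shift exactly as described, with 6202 pulled down and 8299 pulled up. The productivity levels, the misreporting rule, and the rates are all illustrative assumptions, not estimates of any real economy.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1_000

# True activity of each firm (hypothetical mix of the two codes)
true_code = rng.choice(["6202", "8299"], size=n, p=[0.3, 0.7])

# Hypothetical value added per worker, in thousands
prod = np.where(true_code == "6202",
                rng.normal(120, 20, n),   # computer consultancy
                rng.normal(60, 10, n))    # other business support services

# Assumption: the most productive consultancies are also the most diversified,
# and most of them describe themselves as "general business services"
cutoff = np.quantile(prod[true_code == "6202"], 0.7)
misreported = (true_code == "6202") & (prod > cutoff) & (rng.random(n) < 0.8)
reported_code = np.where(misreported, "8299", true_code)

for code in ("6202", "8299"):
    true_mean = prod[true_code == code].mean()
    measured_mean = prod[reported_code == code].mean()
    print(f"ISIC {code}: true mean {true_mean:6.1f}, measured mean {measured_mean:6.1f}")
```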
What, then, can statistical agencies do to mitigate these risks? There are, broadly speaking, three strategies: smarter survey design, better coder training, and the adoption of automated tools.
Survey design is, in many ways, the first line of defense. The wording of survey questions can influence the accuracy of self-reported ISIC codes. Vague or overly broad descriptions invite confusion. Agencies with experience in this area often use detailed prompts, sector-specific examples, and follow-up questions to clarify ambiguous cases. Instead of simply asking, “What is your business’s main activity?” the survey might offer a list of concrete activities, mapped to ISIC codes, and probe further when the response is unclear. While this lengthens the questionnaire, the payoff in data quality is substantial.
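As a sketch of what such a structured question might look like in practice, the snippet below maps concrete activity descriptions to ISIC Rev.4 codes and routes ambiguous choices to a follow-up probe. The activity wordings, the mapping, and the follow-up text are hypothetical examples, not taken from any agency's questionnaire.

```python
# Concrete activities offered to the respondent, pre-mapped to ISIC Rev.4 codes.
# None marks an option that cannot be coded without a follow-up question.
ACTIVITY_OPTIONS = {
    "Writing custom software for clients": "6201",
    "Advising clients on their IT systems": "6202",
    "Selling goods to the public over the internet": "4791",
    "Making goods that we also sell ourselves": None,  # manufacturing vs. retail
}

FOLLOW_UP = (
    "Your answer could fall under more than one industry. "
    "Which accounts for the larger share of your turnover: "
    "producing the goods, or selling them?"
)

def provisional_code(selected_activity):
    """Return a provisional ISIC code, or None if the case needs the follow-up probe."""
    return ACTIVITY_OPTIONS.get(selected_activity)

answer = "Making goods that we also sell ourselves"
code = provisional_code(answer)
if code is None:
    print(FOLLOW_UP)                      # route the respondent to the clarifying question
else:
    print(f"Provisional ISIC code: {code}")
```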
Coder training is another critical component, and one that is sometimes underappreciated. In many countries, survey responses are reviewed and coded by staff who assign the final ISIC classification. The consistency and accuracy of these coders are essential. Regular training, standardized guidelines, and inter-coder reliability checks are established best practices. Some agencies run periodic audits, comparing coder decisions against a “gold standard” sample and flagging anomalies for review. While no training program can eliminate all subjectivity, it narrows the range of possible errors and helps maintain standards across survey rounds.
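The snippet below sketches two of the checks mentioned here, computed with scikit-learn: agreement between two coders on the same batch of responses (Cohen's kappa) and one coder's accuracy against a gold-standard sample, with disagreements flagged for review. The assigned codes are invented for illustration.

```python
from sklearn.metrics import accuracy_score, cohen_kappa_score

# ISIC codes assigned independently by two coders to the same ten responses
coder_a = ["6202", "8299", "4791", "6202", "4923", "8299", "6201", "4791", "6202", "8299"]
coder_b = ["6202", "8299", "4791", "6201", "4923", "6202", "6201", "4791", "6202", "8299"]

print(f"Inter-coder agreement (Cohen's kappa): {cohen_kappa_score(coder_a, coder_b):.2f}")

# Audit coder A against a gold-standard sample coded by senior staff
gold = ["6202", "8299", "4791", "6202", "4923", "6202", "6201", "4791", "6202", "8299"]
print(f"Coder A accuracy vs. gold standard: {accuracy_score(gold, coder_a):.0%}")

# Flag the responses where coder A departs from the gold standard
flagged = [i for i, (g, a) in enumerate(zip(gold, coder_a)) if g != a]
print(f"Responses flagged for review: {flagged}")
```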
A third, increasingly important, strategy is the use of automated coding tools. Advances in machine learning and natural language processing have made it possible to scan narrative descriptions of business activities and suggest likely ISIC codes. Such systems can process thousands of responses quickly and flag cases where confidence is low or human intervention is needed. These tools are not without their own biases—algorithms are only as good as the training data they receive—but when used alongside human coders, they often improve both speed and accuracy. The best implementations are hybrid: machines handle the routine cases, while humans focus on the ambiguous outliers.
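Here is a minimal sketch of that hybrid routing, using a simple TF-IDF and logistic-regression classifier from scikit-learn: the model proposes a code for each free-text activity description, and any case below an assumed confidence threshold is referred to a human coder. The training descriptions, codes, and threshold are illustrative placeholders; production systems are trained on far larger, curated corpora.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative training set: free-text descriptions with known ISIC codes
train_texts = [
    "we advise companies on their it systems and infrastructure",
    "custom software development for client companies",
    "road transport of freight for other businesses",
    "online retail of clothing and accessories",
    "general administrative and office support services",
]
train_codes = ["6202", "6201", "4923", "4791", "8299"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_texts, train_codes)

CONFIDENCE_THRESHOLD = 0.5  # assumed cut-off for automatic coding

for text in ["we build software and also sell goods online",
             "freight transport of goods by lorry"]:
    probs = model.predict_proba([text])[0]
    best = probs.argmax()
    code, conf = model.classes_[best], probs[best]
    route = "auto-code" if conf >= CONFIDENCE_THRESHOLD else "refer to human coder"
    print(f"{text!r} -> ISIC {code} (confidence {conf:.2f}): {route}")
```

With a training set this small, most cases fall below the threshold and are referred onward, which is exactly the behaviour a cautious hybrid workflow wants at the margin.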
Still, even with all these safeguards, classification errors persist. The impact can be surprising. For instance, when misclassification systematically affects high-growth sectors, headline indicators like GDP composition or employment shares can be misleading. Policymakers, seeing growth in the “wrong” sectors, may adjust incentives, tax regimes, or workforce training programs in suboptimal directions. It’s a classic case of “garbage in, garbage out”—only in this context, the “garbage” is hard to detect until after the fact.
A practical example: in the early 2010s, several European countries noted unexplained discrepancies between survey-based wage data and administrative payroll records. Further investigation revealed that rapid growth in IT and creative industries (where activities evolve quickly and defy old labels) led to persistent misclassification. Productivity and wage growth appeared to stagnate until reclassification revealed a hidden surge in the sectors where the activity actually belonged.
What lessons can be drawn? First, agencies must treat ISIC implementation as an ongoing process, not a one-time technical fix. Regular review of classification protocols, updated examples and guidelines that reflect new industries, and engagement with respondents when codes seem ambiguous are all essential. Second, transparency matters. When publishing data, agencies should document the extent of coding uncertainty and, when feasible, publish sensitivity analyses showing how plausible reclassifications would affect headline indicators, as sketched below.
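One simple form such a sensitivity analysis could take, shown here with invented employment figures: recompute sectoral employment shares under an assumed rate of reclassification between two codes and report both sets of shares side by side.

```python
# Invented baseline employment by ISIC code
employment = {"6202": 40_000, "8299": 160_000, "4791": 100_000}

def shares(emp):
    """Employment share of each code."""
    total = sum(emp.values())
    return {code: n / total for code, n in emp.items()}

def reclassify(emp, source, target, rate):
    """Move a given share of employment from one code to another."""
    moved = emp[source] * rate
    adjusted = dict(emp)
    adjusted[source] -= moved
    adjusted[target] += moved
    return adjusted

baseline = shares(employment)
# Scenario: 20% of jobs currently coded 8299 actually belong in 6202
scenario = shares(reclassify(employment, source="8299", target="6202", rate=0.20))

for code in employment:
    print(f"ISIC {code}: baseline share {baseline[code]:.1%}, scenario share {scenario[code]:.1%}")
```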
Finally, collaboration with sector experts can help. Industry associations, chambers of commerce, and business consultants often have a more granular understanding of evolving business models than statistical agencies can hope to maintain alone. Periodic consultation can surface “edge cases” that standard coding manuals miss.
Robust ISIC implementation is vital for trustworthy statistics. Without it, the edifice of economic policymaking stands on uncertain ground. Getting the coding right—through smarter survey design, better training, and judicious use of automation—is not glamorous work. But it is, in its quiet way, one of the most consequential tasks that statistical agencies perform.