Tax Compliance among Managers: Evidence from Randomized Audits1

By: Thomas Lange and  Anne May Melsom  
Open Access
|May 2025

Nontechnical summary: Our study examines tax compliance among managers using data from random audits. We found associations between tax compliance and the use of an external accountant, age, manager’s place of origin and employees’ conflict exposure. However, the manager’s own conflict exposure and corruption levels in their home country did not affect compliance. External accountants seem to encourage managers to comply with tax rules. Machine-learning models confirmed these findings and suggested other relevant factors.

1
Introduction

Norway has one of the world’s most generous welfare states, financed primarily by tax revenues. Since the expansion of the European Union (EU) in 2004, Norway has seen a significant increase in the number and mobility of foreigners in the work force. After decades in which the mix of countries represented in the work force remained relatively stable following World War II, this composition has shifted markedly in the past twenty years. In 1990, approximately 3.5 percent of the Norwegian population was foreign born, compared to 14.8 percent in 2021 (Statistics Norway, 2022). Since the year 2000, a large number of immigrants have come from countries with a different taxpayer culture.

Obtaining a high level of tax compliance is essential for tax administrations all over the world, and managers play an important role in setting the standards for adequate compliance levels.

Managers are liable for several additional reporting requirements, and so their compliance behavior may be affected by factors other than their individual compliance preferences. Managers’ tax compliance may be driven both by intrinsic factors, such as cultural and social norms, and by factors related to their tax reporting duties. Immigrants are of particular interest to study in this context because they typically bring cultural and social norms from their country of origin into a new social and cultural environment in their destination country (Foner, 2014; Potocky and Naseh, 2020). Such effects are of rising importance to modern tax administrations in the era of globalization and international exchange in the work force.

Whereas the literature by and large focuses on individual taxpayer compliance, the drivers behind tax compliance among managers are less studied. In the third-party reporting regimes of modern tax jurisdictions, a better understanding of manager compliance is important because pre-filled individual tax declarations have enabled the tax administration to shift the burden of upholding its trust from individual taxpayers to managers.

In this paper, we use data from random audits carried out by the Norwegian Tax Administration (NTA) to ask which factors drive compliance among firm managers in labor-intensive sectors. The purpose is twofold. First, we use an Ordinary Least Squares (OLS) fixed effects model to estimate the marginal effects of manager characteristics, known from the previous literature, on the likelihood of compliance. Second, we introduce two machine-learning (ML) models, namely the genetic algorithm (GA) and the least absolute shrinkage and selection operator (LASSO), to perform variable selection on the full data set, which allows us both to confirm the findings from the OLS model and to study a much wider array of factors associated with manager compliance.

While we specify the OLS on individual characteristics of the manager, the variable selection from the ML models points toward characteristics of the firm. As specified, our ML models also do a good job of selecting control variables, which we then estimate with OLS and compare with the results of our “standard” OLS model. This twofold strategy allows us both to test “common” independent variables from other compliance areas, and to systematically explore a multitude of reporting variables not previously tested scientifically in the literature. In this context, we do not use the ML models for prediction, but to perform variable selection, even though the results could be interpreted as predictions (Battaglini et al., 2022).

The randomized audits of Norwegian companies concern wage and labor regulation compliance. The on-site audits were executed on a stratified random sample of 1,974 firms in labor-intensive businesses in Norway during the filing and auditing season of 2018. The audits included many questions, checkpoints, and control actions covering several reporting liabilities. Audited firms were notified shortly prior to the audit about which information the NTA would collect. The empirical analysis is divided into two main parts.

The first part studies how managers’ cultural background and residence time is associated with compliance using audit data combined with the Corruption Perception Index (CPI) (Transparency International, 2019) of birth country and the UCDP/PRIO Armed Conflict Dataset (Gleditsch et al., 2002) as institutional proxies for taxpayer culture.

The second part of our analysis is a suggested solution to the problem of variable selection, where we allow for nonparametric inference. We investigate 100 prospective independent variables suggested by NTA auditors contained in our data set. Unlike previous work based on random audit samples, such as Kleven et al. (2011), we extend our analysis by using two different supervised algorithms to perform variable selection from our sample.
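To make the idea of LASSO-based variable selection concrete, the following is a minimal sketch under stated assumptions: a squared-error loss, predictors and outcome already centered, and a fixed penalty chosen by hand. The function names, the toy design matrix, and the penalty value are hypothetical and serve only to illustrate how an L1 penalty sets some coefficients exactly to zero; they do not reproduce the estimation in this paper.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; values within the penalty become 0.0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_sweeps=100):
    """Minimise (1/2n) * sum((y - X b)^2) + penalty * sum(|b|) by cyclic
    coordinate descent. X is a list of rows; X and y are assumed centered."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column carries no information
            # Covariance of column j with the residual that excludes j's own fit
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Hypothetical toy data: three orthogonal, centered predictors.
x0 = [1, -1, 1, -1, 1, -1, 1, -1]
x1 = [1, 1, -1, -1, 1, 1, -1, -1]
x2 = [1, 1, 1, 1, -1, -1, -1, -1]
X = [list(row) for row in zip(x0, x1, x2)]
# The outcome depends strongly on x0 and only weakly on x2.
y = [2.0 * a + 0.1 * c for a, c in zip(x0, x2)]

beta = lasso_coordinate_descent(X, y, penalty=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
```

In this toy example only the first predictor survives the penalty, which is the sense in which the LASSO performs variable selection: weakly related candidates are dropped rather than merely shrunk.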

Our approach contributes to the current research field by using a wide array of randomized audit data, which enhances the generalizability of findings. Our study is more representative of real-world tax compliance scenarios among managers, making the results applicable to a broader range of contexts. The data from managers’ “natural” tax environment allows for identifying patterns and trends that may not be apparent in smaller or more narrowly focused studies. By utilizing ML models from other research fields, we gain a more complete understanding of tax compliance among managers. Both LASSO and GA are known for their variable selection capability from other fields of research, which helps in identifying the most relevant features influencing tax compliance, given both the complexity and quality of our data set. As such, the application of advanced ML models introduces methodological diversity to tax compliance research. By using well-defined and tested algorithms, it broadens the methodological tool kit available for studying tax compliance, enabling researchers to develop new approaches to the research field.

In our OLS model framework, we find that employees’ previous exposure to armed conflict is associated with a lower likelihood of compliance, but we find no such association between managers’ own conflict exposure and compliance. We find positive associations between compliance and both age and the use of an external accountant, robust across all model specifications. However, we find no significant effects of home country corruption level. These results are consistent with Kleven et al. (2011), who found that the impact of social and cultural variables is small compared to variables that capture information and incentives. The result also indicates that there are other factors driving tax compliance among managers compared to individual taxpayers. The larger number of reporting liabilities, and thus contact points with the tax administration, may increase both the learning effect from frequent reporting, and perception of tax authority presence and enforcement.

We find that managers of private limited companies (hereafter Private Ltd.) are significantly more likely to be compliant than managers of other organizational forms. GA and LASSO reproduce the positive effects from the OLS models of using an external accountant. Furthermore, having a salary system and the number of terms the firm reports salaries or social benefits for employees both have a positive impact on managers’ compliance. We find a negative effect of foreign employees’ exposure to armed conflict. One explanation could be that the potential stress, trauma, or altered perspectives resulting from conflict experiences among employees may indirectly affect the decision-making within the managerial hierarchy. These findings are robust across all three models.

By providing insights into the key determinants of tax compliance among managers, our findings can inform policymaking and strategies for tax administration. Understanding which factors significantly influence compliance allows policymakers to develop more targeted and effective interventions.

The remainder of the paper is organized into five sections. Section 2 gives a short review of the relevant literature, and how this paper fills the gaps in the existing body of empirical work studying tax compliance. Section 3 presents the data. Section 4 describes the three models, namely, the OLS, the GA, and the LASSO. Section 5 estimates the effects of cultural background and residence time (OLS), defines two models from variable selection performed by the GA and LASSO, respectively, and then estimates the effects from these. Section 6 concludes the paper.

2
Related Literature

Since Torgler (2007) and Torgler et al. (2008) studied cross-cultural differences in tax compliance behavior, research on tax morale is typically concerned with both the underlying morale as a cause of tax evasion (Cummings et al. 2009), and, slightly contrary, why tax morale in Western countries is so high, given the low probability of audits and sanctions (Frey and Torgler 2007). While there is some evidence that increasing ethnic fractionalization decreases voluntary tax compliance (Lassen 2007), the historically homogeneous population in, for example, Norway, has traditionally left a whole generation of taxpayers with trusted responsibility to declare their income truthfully.

Previous literature suggests several “cultural” mechanisms that may come into play when studying the associations between culture and tax compliance. Individualistic cultures tend to emphasize personal freedom, autonomy, and self-interest, while collectivist cultures prioritize social cohesion, harmony, and group interests. The review by Marandu, Mbekomize, and Ifezue (2015) suggests that individualistic cultures may exhibit lower tax compliance due to a greater focus on self-interest and a weaker sense of duty toward the broader society. Trust and social capital may also influence tax compliance. Societies with higher levels of trust and strong social networks may exhibit higher tax compliance, as individuals are more likely to cooperate and comply with tax obligations due to social norms and expectations. Gangl, Torgler, and Kirchler (2016) give some support for this claim, in that social capital fosters important prosocial behavior and increases citizens’ cooperation with the state. Bornman (2015) finds that positive attitudes toward government and perceptions of government legitimacy are likely to increase tax compliance. Conversely, if individuals perceive the government as corrupt, ineffective, or illegitimate, they may be less inclined to voluntarily comply with tax laws (Torgler 2004).

Societies with a stronger emphasis on fairness and support for wealth redistribution and social justice may exhibit higher tax compliance rates (Hofmann, Hoelzl, and Kirchler 2008). Bobek, Hageman, and Kelliher (2013) find that social norms have direct as well as indirect influences on tax compliance behavior. Societies with strong norms of honesty, fairness, and cooperation may thus exhibit higher levels of tax compliance due to the influence of social pressure and reputational concerns.

To our knowledge, there are few previous studies of drivers behind poor tax compliance among managers except for Joulfaian (2000), which attributes manager tax noncompliance to understatements in their own personal income tax, and some developing country perspectives on small and medium enterprises (SMEs) under a completely different tax liability regime, with little or no third-party reporting (Atawodi and Ojeka, 2012; Musimenta et al. 2017). While Joulfaian (2000) finds that noncompliant firms are more likely to be managed by executives understating their personal taxes, the paper suggests few other controls that may form managers’ preference for noncompliance.

Alm, McClelland, and Schulze (1992), Andreoni, Erard, and Feinstein (1998), Feld and Frey (2002), and Feld and Frey (2007) argue that factors such as social norms, tax morale, patriotism, guilt, and shame explain variances in tax compliance. Tsakumis, Curatola, and Porcano (2007) find that national culture, as proposed by Hofstede (1984), is a significant factor in explaining tax evasion levels across fifty countries. Thus, depending on the options for noncompliance, such as the level of third-party reporting, taxpayers may be affected by noneconomic drivers in their compliance behavior.

While studies have found tax morale in the country of origin to be a significant determinant of tax morale among immigrants in the destination country (Kountouris and Remoundou, 2013), few attempts have been made to test the common assumption in various tax administrations that immigrants become more compliant with time spent in the destination country.

The study of corruption norms and tax morale has gained increased attention, but the field is still small. Alm, Martinez-Vazquez, and McClellan (2016) and Jahnke and Weisser (2019) find that corruption drives higher levels of evasion. But their focus is the evasive effects of bribery of tax officials in the home country context, rather than the corruption level of the environment in which these tax officials operate. There are some examples of a focus similar to ours in the literature, however. Cummings et al. (2009) find a significant correlation between tax compliance behavior and tax morale in South Africa and Botswana, using CPI scores as one proxy for tax morale. Fisman and Miguel (2007) use another index for corruption, namely, that of Kaufmann, Kraay, and Mastruzzi (2005), and find a strong effect of corruption norms on diplomat parking violations, and a significant effect on enforcement. However, their environment deviates from ours in several respects. They study less complex parking regulations, whereas our focus is complex tax reporting liabilities. In their context, diplomats have few contact points with legal authorities, whereas managers in our context have more frequent contact points with the NTA. Parking regulations are not part of diplomats’ day-to-day business, whereas reporting obligations are integrated in the manager’s responsibilities running a legal business.

DeBacker, Heim, and Tran (2015) find that corporations with owners from more corrupt countries evade more US tax than owners from less corrupt ones. Common to earlier approaches is the use of either survey data or risk-based tax audit data, an exception being Bastani, Giebe, and Miao (2020), who utilize population-wide register data to investigate differences in the use of commuter deduction in tax filing between immigrants and Swedish natives. They find less filing among recently arrived immigrants than among their native-born counterparts. But their study concerns neither tax noncompliance nor manager reporting liabilities, but rather the filing of legitimate claims or lack thereof. However, the time effects they describe may be relevant for our study as well, since residence time in Norway may also affect managers’ compliance.

Alm, Deskins, and McKee (2006); Kleven et al. (2011); Bjørneby, Alstadsæter, and Telle (2018); and Alm (2019b) find that third-party reporting has increased compliance among individual taxpayers, simply because the opportunities for evasion have been reduced. Whenever third-party reporting is low, Kleven et al. (2011) show Allingham and Sandmo (1972) still has merit; when the options for evasion are present, evaded tax is a function of probability of detection and a penalty for withholding. Some have shown that various tax administration measures, such as shifting tax remittance (Kopczuk et al. 2016) and public disclosure (Bø, Slemrod, and Thoresen 2015) lead to improvements in tax compliance. While the opportunities for evasion diminish with increasing levels of third-party reporting, this may not be the case for managers.

Shifts in tax compliance due to temporary or permanent demographic changes in the taxpayer population have, to our knowledge, not been under scrutiny. There may be several reasons for this. Rapid demographic changes in a population are often caused by immigration. Some studies exist on the effects of tax revenue from migration, but these few focus either on macro revenue effects of immigration (e.g., Harding and Mutascu, 2016), differences in the use of tax deduction among various immigrant groups in Sweden (Bastani, Giebe, and Miao, 2020), or the effects of tax rates on the migration of top income earners (e.g., Young and Varner, 2011; Kleven et al., 2014). None of these discuss compliance-related effects from migration or demographic changes. In other words, in the economic literature there are few contributions on the connection between migration, demographic changes, and tax compliance, especially at the management level.

ML and GA in economics have previously been applied to stock market price prediction (Sable, Porwal, and Singh 2017), financial asset portfolio selection (Sefiane and Benbouziane 2012), and the determination of real estate prices (Del Giudice, De Paola, and Forte 2017). Such models have various applications in variable selection problems (Broadhurst et al., 1997; Tolvi, 2004; Liu and Ong, 2008; Cateni, Colla, and Vannucci 2010), but to our knowledge, GA models have not been used in the field of taxation. A more commonly used ML algorithm in econometrics is the LASSO (Belloni et al., 2012; Chalfin et al., 2016; Pereira, Basto, and da Silva, 2016; Fonti and Belitser, 2017; Mullainathan and Spiess, 2017; Hansen and Liao, 2019).

3
Data

We use data from randomized audits of Norwegian companies on wage and labor regulation compliance. Auditors collected company information and documentation on 1,974 randomly selected firms in labor-intensive businesses in Norway. The audits covered many reporting liabilities, primarily within the area of wage reporting, such as salary accounts, documented time use, tax deductions, staff registers, employment contracts, time sheets, and overtime payments. Interviews and meetings with NTA auditors in the early stage of the random audit program resulted in a comprehensive list of 303 variables, of which 100 could be used for regression analysis.2 The audits in our study focus on business reporting routines and standards rather than undeclared income. Thus, we use correct tax deduction, the existence of payroll accounts, correct monthly reporting, and the existence of general accounts to define compliance. The employer/manager bears the full responsibility to ensure compliance in all areas covered by our dependent variable.

Combining the data with information on residence time in Norway for the manager (owner or CEO) of the firm, and with the CPI of birth country and the UCDP/PRIO Armed Conflict Dataset (Gleditsch et al. 2002) as institutional proxies for taxpayer culture, we can study the influence of residence time on tax compliance among managers. The audited companies have been randomly selected, but they were not randomly assigned to auditors. Thus, we use auditor fixed effects to control for systematic differences between auditors, and fixed effects on two-digit NACE-code (Statistical classification of economic activities in the European Community) to control for systematic differences between sectors.
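As a sketch, and with notation assumed here purely for illustration, a specification with auditor and sector fixed effects can be written as a linear probability model:

```latex
y_{i} = \alpha + \mathbf{x}_{i}^{\top}\boldsymbol{\beta} + \gamma_{a(i)} + \delta_{s(i)} + \varepsilon_{i}
```

Here $y_{i}$ is the binary compliance indicator for manager $i$, $\mathbf{x}_{i}$ collects the manager characteristics, $\gamma_{a(i)}$ is a fixed effect for the auditor assigned to firm $i$, $\delta_{s(i)}$ is a fixed effect for the firm's two-digit NACE sector, and $\varepsilon_{i}$ is the error term. The symbols are hypothetical shorthand; the exact model is the one described in the model section.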

3.1
Population and Sample

The 1,974 audited companies were randomly drawn from a target population of about 31,000 companies, representing 13 percent of all Norwegian businesses. When defining this target population, the NTA started off with all companies defined as legal employers who reported working conditions and other tax-relevant information in the monthly reporting (A-melding) to the NTA during the year 2017 (Norwegian Tax Administration 2019). The NTA then restricted this main population in order to target the audits to industries and businesses more at risk of noncompliance. Hence, the randomization is not valid for the whole taxpayer population, but rather a subpopulation where the risk perception is higher than average. This limitation is based on a total of fourteen criteria that are explained in more detail in appendix, table A1. It implies that the number of industries was reduced from 38 to 22, and that the number of businesses was reduced from 231,753 to 30,961.

Country of birth and citizenship are taken from the National Population Register. Information about the company’s revenues and expenses in 2016 is taken from business reports to the Tax Administration (Income Statement I and II), while turnover is taken from the VAT Register. Information on NACE codes, establishment dates, and termination dates is obtained from the Register of Legal Entities.

The target population is stratified according to the industry branch specified in the appendix, i.e., each industry branch represents a stratum, or industry sector. The purpose of the stratification is primarily to enable a more effective selection for the audits, that is, to draw more businesses from industries with a large proportion of working conditions associated with increased risk of noncompliance. The sample selection is proportional to the number of reported foreign employees who arrived in the last three years, combined with a minimum and maximum number of businesses in the lower and upper part of the distribution.
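One possible reading of this allocation rule can be sketched as follows. The sketch assumes that each stratum receives a share of the sample proportional to its count of recently arrived foreign employees, and that the share is then clamped between a floor and a ceiling. The stratum names, counts, and bounds are hypothetical, and the actual NTA procedure may differ in its details.

```python
def allocate_sample(foreign_employees_by_stratum, total_sample, min_firms, max_firms):
    """Allocate audits to strata in proportion to a weight, then clamp each
    allocation to [min_firms, max_firms]. After clamping, the allocations
    need not sum exactly to total_sample."""
    total_weight = sum(foreign_employees_by_stratum.values())
    allocation = {}
    for stratum, weight in foreign_employees_by_stratum.items():
        proportional = round(total_sample * weight / total_weight)
        allocation[stratum] = min(max_firms, max(min_firms, proportional))
    return allocation


# Hypothetical strata and counts, for illustration only.
weights = {"construction": 600, "cleaning": 300, "hairdressing": 100}
allocation = allocate_sample(weights, total_sample=100, min_firms=20, max_firms=50)
```

With these made-up numbers, the largest stratum is capped at the maximum and the smallest is lifted to the minimum, which is the effect the floor and ceiling are meant to have at the two ends of the distribution.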

3.2
Variables

The main dependent variable is firm compliance, which is an individual score of the most central, responsible person in the firm, i.e., typically a CEO or an owner. The dependent variable is not a continuous variable in reported income or tax liability as in, for example, Kleven et al. (2011), but rather a binary variable intended to measure compliance by any individual manager responsible for firm reporting liabilities, equivalent to that of Fellner, Sausgruber, and Traxler (2013). It takes the value 1 if there is no error in the four variables for the year 2017: tax deduction, existence of payroll accounts, correct monthly reporting, and existence of general accounts.
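The construction of the dependent variable can be sketched as follows. The field names and the coding convention (1 for an error-free check, 0 for an error) are assumptions made for illustration and do not reflect the actual layout of the audit data.

```python
# Hypothetical field names for the four audit checks in 2017.
CHECKS = ("tax_deduction", "payroll_accounts", "monthly_reporting", "general_accounts")


def firm_compliance(audit_record):
    """Return 1 if all four checks are error-free (coded 1), and 0 otherwise."""
    return int(all(audit_record[check] == 1 for check in CHECKS))


compliant = {"tax_deduction": 1, "payroll_accounts": 1,
             "monthly_reporting": 1, "general_accounts": 1}
one_error = dict(compliant, monthly_reporting=0)

scores = [firm_compliance(compliant), firm_compliance(one_error)]
```

A single error in any of the four checks is enough to set the indicator to 0, so the firm-level compliance rate can never exceed the lowest of the four component rates.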

Managers are liable for more reporting and contact points with the tax administration, and thus more subject to regulatory oversight. Even if the perceived audit probability is low, the frequency of contact points (e.g., monthly reporting) may create a stronger incentive for managers to comply, although the evidence on this effect seems to go in both directions (Snow and Warren 2005). Furthermore, managers are often responsible for establishing and implementing internal governance mechanisms within the firm. They have the authority to develop and enforce policies and procedures to ensure compliance with various regulations, including tax liabilities. Failure to enforce tax compliance within the organization can be seen as a failure of managerial responsibility, leading to potential internal control and governance issues (Alm 2019a). Managers, therefore, have a vested interest in promoting tax compliance as part of their overall responsibilities. Finally, managers with ownership or financial holdings in the firm have a direct financial interest in maintaining compliance. Tax compliance ensures the stability and sustainability of the firm’s operations (Bird and Davis-Nozemack 2018), but there is mixed evidence for a positive association between tax compliance and a firm’s financial performance and value (Watson 2015).

In our context, manager compliance is closely related to reporting obligations where only the manager is liable, not the employee. Thus, there are different evasion schemes available to the manager than to an employee, and thus more opportunities to evade. However, the extent to which noncompliance is due to honest mistakes or deliberate withholding of information is still difficult to assess. The random audits focus on business reporting routines and standards rather than undeclared income. Tax deduction is one direct route by which the manager may underreport. Not having payroll accounts and general accounts for the firm represent two other opportunities for noncompliance. The firm may intentionally underreport the number of employees or misclassify them as independent contractors or consultants. By doing so, it can avoid the obligation of deducting and remitting payroll taxes. The firm also may engage in cash transactions or pay employees “off the books” without proper documentation or recording. This allows it to avoid reporting the income and associated payroll taxes.

An error in the firm’s monthly reporting can involve an employer deliberately reporting lower income for employees than what is actually paid. This can lead to employees evading taxes on the unreported income, but it presumes some sort of collusion between the employer and employee, especially in a tax regime with high levels of third-party reporting (cf. Bjørneby, Alstadsæter, and Telle (2021)). Other items in the monthly reporting may also provide opportunities for evasion, such as providing incorrect information about deductible expenses or attempting to claim deductions for expenses that are not eligible. Failure to report the correct number of employees can be an attempt to evade tax liabilities and other tax-related obligations related to the firms’ workforce.

Despite other opportunities to evade, managers may also have stronger incentives or reasons for compliance than employees. For instance, managers may not themselves gain much from noncompliance, as they mainly report for others, i.e., their employees and owners, which could partly explain a high compliance rate among managers. Also, firms whose owners care about compliance may hire managers who are likely to be compliant or monitor their compliance more. Managers typically possess a higher level of responsibility and accountability within organizations. Noncompliance with tax obligations may expose the firm to legal risks, including penalties, fines, and potential legal actions. Moreover, a manager’s tax noncompliance can tarnish the firm’s reputation, leading to negative publicity, loss of customer trust, and damage to long-term business relationships. Managers, being responsible for the overall functioning and success of the organization, have a greater stake in safeguarding the firm’s legal and reputational standing.

As control variables, we add the managers’ gender (men as 0 and women as 1), age as a continuous variable, and whether they have a foreign background. Managers with a different birth country than Norway are defined as “Foreign.” If information on birth country is missing, we use citizenship. For foreign managers, residence time is calculated based on arrival date to Norway. For Norwegian managers without any registered long-term stays abroad, residence time coincides with age. In the regression models, we do not assume a linear relationship between residence time and compliance, but divide the sample into three different residence time groups, in line with Bastani, Giebe, and Miao (2020) (i.e., 0–5 years, 5–10 years, and over 10 years). There are no Norwegian managers in the first two residence groups.
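The grouping of residence time can be sketched as a simple binning function. Because the stated groups share their endpoints, the treatment of exactly 5 and exactly 10 years is an assumption here (5 falls in the middle group, and "over 10" is read strictly), as are the function name and labels.

```python
def residence_group(years):
    """Map residence time in years to one of three groups.
    Boundary handling is an assumption: [0, 5), [5, 10], and (10, inf)."""
    if years < 0:
        raise ValueError("residence time cannot be negative")
    if years < 5:
        return "0-5 years"
    if years <= 10:
        return "5-10 years"
    return "over 10 years"


groups = [residence_group(t) for t in (3, 5, 10, 17.6)]
```

Entering the three groups as dummies, rather than residence time as a single continuous regressor, is what allows the association with compliance to differ between recent and long-settled arrivals without imposing linearity.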

A foreign manager’s CPI score gives the score for this person’s birth country at the year of arrival to Norway. Norwegian managers are given the CPI score for Norway in 2018. We want to capture the manager’s “last impression” of his or her country of origin at the time of arrival in Norway. To do so, we use the CPI score of the year of arrival in Norway as a representation of a time-precise image of the tax morale in the country of origin. We assume that any development in this index, following arrival in Norway, will not affect individual tax morale, and that immigrants assimilate the tax morale of Norway with time. Whenever this value is missing for that particular year, we have used the nearest value available. We have also inverted the CPI scale so that a higher number reflects a higher corruption level as a continuous variable. Testing a CPI dummy, where managers with CPI scores at the same level as Norway or lower are given the value 1, and 0 otherwise, renders no significant CPI coefficients.
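The lookup and inversion described above can be sketched as follows. Three details are assumptions made for illustration: the index is taken to be on a 0 to 100 scale so that the inversion is 100 minus the score, ties between two equally near years are resolved in favor of the earlier year, and the example scores are invented rather than taken from the CPI.

```python
def inverted_cpi(cpi_by_year, arrival_year, scale_max=100):
    """Return the inverted CPI score for the arrival year, falling back to the
    nearest year with data. Higher values mean more perceived corruption.
    Ties between equally near years go to the earlier year (an assumption)."""
    nearest_year = min(cpi_by_year, key=lambda year: (abs(year - arrival_year), year))
    return scale_max - cpi_by_year[nearest_year]


# Hypothetical scores for one birth country, with gaps in coverage.
scores_by_year = {1998: 30, 2001: 34, 2005: 40}

exact_match = inverted_cpi(scores_by_year, 2005)   # score available for that year
gap_filled = inverted_cpi(scores_by_year, 2003)    # 2001 and 2005 are equally near
```

The tie-breaking rule matters only when a missing year sits exactly between two observed years; any deterministic rule would serve, provided it is applied consistently.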

One may argue that the last year in the home country is not representative of the corruption level experienced in the country of origin, especially if regime change or government volatility leads people to leave the country. In that case, a cumulative average or the level at a certain age might be more appropriate. But there are also caveats with such an approach, as an average will not capture recent changes (in either direction), and such changes may influence a person more than their lifetime perception. Nevertheless, CPI scores do not change much over time for the countries in the sample; see appendix, table A1B.

We include armed conflict as a dummy. Conflict is synonymous with the destruction of trust between ethnic, religious, or other groups, and institutions typically do not bounce back immediately after a cycle of violence (Collier and Sambanis, 2002; Bellows and Miguel, 2006; Miguel, Saiegh, and Satyanath, 2011). Feldman and Slemrod (2009) find a positive effect of external threats on compliance attitudes because external threats may affect social identification and patriotism. However, war exposure in their setting is limited to violence outside the country of residence and will most likely neither reduce trust nor increase trauma in the population in the way we expect civil war to do. Lange and Melsom (2022) find a counterintuitive positive effect on employees’ compliance from the ratio of employees in the enterprise exposed to armed conflict, and so we include this variable in our OLS models as well.

We use a recovery period of twenty-five years, i.e., an immigrant exposed to armed conflict from zero to twenty-five years prior to registration in Norway gets the value 1, and 0 otherwise. Thus, we do not measure armed conflicts older than twenty-five years. This specification follows that of Miguel, Saiegh, and Satyanath (2011). The restriction entails that 37.8 percent of the foreign managers in our sample come from countries involved in armed conflicts. Some of these are Western countries, such as the US and the UK, which have intervened in armed conflicts outside their own territory. However, the number of managers from these countries is small, and unlikely to drive the results. For details on the conflict dummy and which countries it comprises, see appendix, table A4. We have also tried different versions of the conflict dummy, measuring exposure up to five, ten, fifteen, and twenty years prior to migration. None of these yield significant coefficients.
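The conflict dummy and its shorter-window variants can be sketched as follows. The sketch assumes that conflict exposure is available as a set of calendar years with armed conflict in the birth country, and that both ends of the window are inclusive. The function name and the example years are hypothetical.

```python
def conflict_dummy(arrival_year, conflict_years, window=25):
    """Return 1 if any conflict year falls between zero and `window` years
    before arrival (inclusive at both ends, an assumption), and 0 otherwise."""
    return int(any(0 <= arrival_year - year <= window for year in conflict_years))


# Hypothetical manager arriving in 2010 from a country with conflict in 1992-1993.
exposed = conflict_dummy(2010, {1992, 1993})
# Conflicts older than the recovery period, or after arrival, do not count.
too_old = conflict_dummy(2010, {1980})
after_arrival = conflict_dummy(2010, {2012})
# Shorter windows, as in the alternative versions of the dummy.
by_window = {w: conflict_dummy(2010, {1992, 1993}, window=w) for w in (5, 10, 15, 20, 25)}
```

In this example the manager counts as exposed under the twenty- and twenty-five-year windows but not under the shorter ones, which shows how the choice of recovery period changes who is classified as conflict-exposed.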

We include the use of an external accountant as a control variable. An external accountant provides an “arm’s length” distance from the manager, and typically has more accounting competence than the latter.

The first stage of the GA and LASSO performs variable selection among the 100 independent variables in our data set. The selected variables from these model runs are included in the summary statistics table. Thus, Table 1 gives summary statistics for variables in all models, broken down by Norwegian versus foreign managers. In appendix table A2, summary statistics are broken down by compliant versus noncompliant managers.

Table 1: Descriptive statistics.

| Variable | Norwegian (N=1,580) | Foreign (N=394) | t | p > \|t\| |
| --- | --- | --- | --- | --- |
| Independent variables | | | | |
| Female | 0.161 | 0.224 | –2.936 | 0.003 |
| Age | 50.926 | 44.237 | 10.951 | 0.000 |
| Residence time | 51.515 | 17.612 | 53.652 | 0.000 |
| Conflict | 0.000 | 0.378 | –25.422 | 0.000 |
| CPI score | 15.750 | 49.139 | –52.278 | 0.000 |
| External accountant | 0.786 | 0.873 | –3.889 | 0.000 |
| Conflict employees | 0.050 | 0.202 | –16.398 | 0.000 |
| Ltd. company | 0.833 | 0.843 | –0.465 | 0.642 |
| Salary system | 0.700 | 0.683 | 0.667 | 0.505 |
| Work training | 0.109 | 0.112 | –0.180 | 0.857 |
| Job advertisement | 0.216 | 0.155 | 2.715 | 0.007 |
| Timesheet | 0.477 | 0.531 | –0.819 | 0.413 |
| Terms | 10.288 | 10.251 | 0.201 | 0.840 |
| Audit employees | 3.925 | 4.225 | –4.607 | 0.000 |
| Self-employed | 0.163 | 0.137 | 11.933 | 0.000 |
| Dependent variable (2017) | | | | |
| Firm compliance | 0.895 | 0.858 | 2.084 | 0.037 |
| Tax deduction | 0.950 | 0.921 | 2.220 | 0.026 |
| Payroll accounts | 0.965 | 0.967 | –0.177 | 0.860 |
| Monthly reporting | 0.966 | 0.964 | 0.132 | 0.895 |
| General accounts | 0.997 | 0.997 | 0.002 | 0.998 |

Note: Columns 2 and 3 show the mean value of each variable (Column 1) among Norwegian and foreign managers. Column 4 shows the test statistic for the difference between Norwegian and foreign managers. We use the z-test (test of proportion) for binary outcomes and the t-test for continuous outcomes. Column 5 shows the p-value for the test. Female, Conflict, External accountant, Ltd. company, Salary system, Work training, Job advertisement, and Self-employed are dummy variables. Age, Residence time, CPI score, Conflict employees, Timesheet, Terms, and Audit employees are continuous variables. Female takes the value 1 if the manager is female. Age is measured in years. Residence time is measured in years. Conflict takes the value 1 if the manager was exposed to armed conflict in their home country up to twenty-five years prior to migration. CPI score is the inverted CPI score of the manager’s country of origin in the year of arrival in Norway. External accountant takes the value 1 if the firm has outsourced accounting to an external accountant. Conflict employees is the fraction of employees in the firm exposed to armed conflict in their home country up to twenty-five years prior to migration. Ltd. company takes the value 1 if the firm is registered as a private limited company. Salary system takes the value 1 if the firm has a digital salary system. Work training takes the value 1 if the firm has registered employees in the work training program subsidized by the Norwegian Labour and Welfare Administration (NAV). Job advertisement takes the value 1 if the firm has advertised vacancies in Norway. Timesheet describes the number of employees with incorrect timesheets. Terms describes the number of terms (one through twelve) for which the firm reports salaries or benefits for any employee to the NTA. Audit employees is the number of audited employees per firm (zero through nine). Self-employed takes the value 1 if the firm is registered as a sole proprietorship at the NTA.
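As an illustration of the test of proportions used for the binary variables, the following sketch computes a pooled two-sample z-statistic from the rounded shares and group sizes reported for Female in Table 1. The function name is hypothetical, and the pooled-variance form is an assumption about how the test was carried out.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-sample z-statistic for a difference in proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Share of female managers: Norwegian (N=1,580) versus foreign (N=394)
z_female = two_proportion_z(0.161, 1580, 0.224, 394)
print(round(z_female, 2))
```

The result lies close to the –2.936 reported in the table; a small gap is plausible because the sketch starts from means rounded to three decimals.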

In the sample, 89.50 percent are compliant. In addition, there are only 394 foreign managers, who constitute about 20 percent of the sample. The results on residence time, CPI score, and conflict exposure are thus based on a quite small sample with relatively little variation. Both appropriate sample sizes and sufficient variation in the dependent variables are necessary to provide reliable, reproducible, and valid results. Even though a larger sample with more variation is always preferable, our sample is significantly larger than those of other studies using randomized data (Blackford, 2017; Jenkins and Quintana-Ascencio, 2020). It has been difficult to obtain consistent and clear guidelines for minimum N in regression analyses, but the sample size in our study far exceeds the recommendations in a recent study attempting to provide them (Jenkins and Quintana-Ascencio 2020).

Of the foreign managers, 22.4 percent are female, compared to 16.1 percent of the Norwegian managers. 87.3 percent of the foreign managers use external accountants, compared to 78.6 percent for the Norwegian managers. The fraction of employees exposed to armed conflict is larger in foreign-managed firms (20.2 percent) than in Norwegian-managed firms (5.0 percent). Company characteristics such as company type, salary system, and work training are quite evenly distributed between the two manager groups. Foreign managers advertise less for vacancies than their Norwegian counterparts.

4
Empirical Strategy

Commonly used dependent variables in studies of tax compliance are differences in self-reported income tax pre- and post-audit (Kleven et al. 2011), tax deficiency to revenue ratio (DeBacker, Heim, and Tran 2015), changes in federal income tax deposits (Boning et al. 2020) or changes in employer-reported income tax (Bjørneby, Alstadsæter, and Telle, 2021).

Extensive use of third-party reporting and employers’ tax withholding are powerful mechanisms to ensure compliance, unless the employer and employee collude to evade (Bjørneby, Alstadsæter, and Telle, 2021). Thus, tax evasion may very well be partly influenced by the employees’ decisions, and not solely by the managers. Unlike previous contributions that seek to measure tax evasion through changes in individual tax remittance, we measure managers’ tax compliance through their own, direct reporting liabilities. While we cannot infer that variance in these liabilities is due to evasion, this variance is typically attributable to the managers’ own behavior rather than to that of other parties. Noncompliance with the reporting requirements of our dependent variable leads to revenue losses through lower remittance of employee tax and payroll tax.

In this paper, we use two strategies for variable selection. The first is the "traditional" approach of testing established relationships from adjacent research literature, using independent variables from other compliance areas or from the tax compliance literature. This is the main model. Second, given the large number of variables available in our sample, we use a GA to guide the variable selection and then run a linear regression on the restricted selection. We also run a LASSO model to test whether the LASSO algorithm selects other independent variables. We use both models to validate the results from the OLS model. To obtain inference relevant to the target population, we cluster standard errors by stratum (NACE codes) (Solon, Haider, and Wooldridge 2015).

The choice of our ML strategy requires a justification. Battaglini et al. (2022) test a random forest model to predict and improve tax auditing efficiency using nonrandomized administrative data. While random forest models may also be used for variable selection, our assessment is that variable selection is not where such models have their advantage. Random forest does not explicitly handle noisy or irrelevant variables. While it can indirectly identify such variables by assigning them lower importance scores, there is no built-in mechanism to explicitly filter out noisy features, as is the case with both GA and LASSO. In some cases, irrelevant variables may still contribute to the variable importance scores and lead to suboptimal selection outcomes. The performance of random forest, including variable importance, can also be sensitive to the choice of hyperparameters such as the number of trees, tree depth, and feature subsampling rate. Suboptimal hyperparameter tuning may affect the accuracy of variable importance rankings and subsequent variable selection decisions.

The data used in this paper provide detailed information on managers’ compliance behavior, firm characteristics, and certain individual characteristics such as age, gender, foreign background, and residence time in Norway. However, there may be many individual characteristics that may be correlated with compliance but are not observed in these data, and so the estimates will be plagued by omitted variables bias. As an example, the quality of the education is probably positively correlated with a manager’s ability to navigate complicated tax laws (which may both make the manager more likely to be able to comply, but probably also more likely to be good at evading taxes), and correlated with age, gender, residence time in Norway, and whether the manager comes from a conflict zone. So unobserved variables may confound the associations we observe if they are related to both compliance behavior and our key independent variables. The Tax Administration has no information on, for example, the managers’ field or level of education. Further studies are needed to determine whether there is a causal relationship between residence time, foreign background, and tax compliance. Nevertheless, the supervised ML approach to variable selection allows us to test many variables from the audits that could affect compliance as well, ruling out potential bias from these. The novelty of our contribution is thus that we have information on variables omitted in other papers.

In econometric regression analysis, several hypothesis tests are often performed simultaneously, and this applies to the regressions in this paper as well. The problem then becomes how to decide which hypotheses to reject, or more precisely, whether significant effects that emerge after many different hypothesis tests are in fact real or spurious. Romano and Wolf (2005) proposed a stepwise test procedure which, compared to related test methods, is “more powerful” and will more often reject false hypotheses. We have run Romano–Wolf tests according to the procedure described in Clarke, Romano, and Wolf (2019) on all model specifications, and find that most results are robust to the correction. The Romano–Wolf procedure implicitly captures the common dependence structure of the test statistics, resulting in an increased ability to detect false hypotheses. Hence, it gives more robust tests than traditional corrections for multiple hypotheses such as Benjamini and Hochberg (1995) or Bonferroni (1936). We include the Romano–Wolf test statistics in the results tables.
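The step-down logic behind such adjusted p-values can be sketched in a few lines. This is a minimal illustration and not the implementation used for the results tables: it assumes that a matrix of bootstrap test statistics, already recentred to mimic the null hypotheses, is available, and all names and numbers are hypothetical.

```python
def romano_wolf_stepdown(t_obs, t_boot):
    """Step-down adjusted p-values from observed and bootstrap statistics.

    t_obs  : list of S observed test statistics
    t_boot : list of B bootstrap draws, each a list of S null-centred statistics
    """
    n_boot = len(t_boot)
    # Work from the largest observed |t| to the smallest.
    order = sorted(range(len(t_obs)), key=lambda s: -abs(t_obs[s]))
    adjusted = [0.0] * len(t_obs)
    running_max = 0.0
    for step, s in enumerate(order):
        remaining = order[step:]  # hypotheses not yet dealt with
        exceed = sum(
            1 for draw in t_boot
            if max(abs(draw[r]) for r in remaining) >= abs(t_obs[s])
        )
        p = (exceed + 1) / (n_boot + 1)
        running_max = max(running_max, p)  # enforce monotone p-values
        adjusted[s] = running_max
    return adjusted


# Hypothetical toy input: three hypotheses, nine bootstrap draws
t_obs = [3.0, 2.0, 0.5]
t_boot = [
    [0.2, 0.4, 0.1], [1.0, 0.3, 0.6], [0.5, 2.5, 0.2],
    [0.1, 0.9, 0.4], [3.5, 0.2, 0.3], [0.7, 0.1, 1.2],
    [0.3, 1.1, 0.9], [0.4, 0.6, 0.2], [2.1, 0.5, 0.7],
]
p_adj = romano_wolf_stepdown(t_obs, t_boot)
```

In this toy input the second hypothesis receives an adjusted p-value of 0.2. Comparing its statistic with the maximum over all three hypotheses in every draw, as a single-step correction would, gives 0.4; dropping hypotheses that have already been handled is where the additional power comes from.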

4.1
OLS Model Specification

In the OLS model, we explain manager compliance through three independent variables, namely, residence time in Norway, CPI score of native country upon migration to Norway, and armed conflict exposure in the past twenty-five years upon migration to Norway. We include the control variables gender, age, foreign-born, fraction of employees exposed to armed conflicts, and the use of an external accountant. We use fixed effect regressions on four-digit NACE codes to control for differences between industries and, as mentioned, we also use fixed effect estimations to control for systematic differences between auditors.

We run one OLS without fixed effects to study the overall effects, irrespective of industry sector variance, and then two OLS with fixed effects, first on NTA auditor ID, and then on both NTA auditor ID and industry sector (NACE code on the two-digit level). To obtain consistent estimates, we cluster standard errors by NACE code (two-digit level). We estimate specifications of the following type:

$$Y_i = \beta_1 T_i + \beta_2 CPI_i + \beta_3 War_i + \beta_4 X_i^\prime + \varepsilon_i, \tag{1}$$

where the dependent variable $Y_i$ is compliance by manager $i$. $T_i$ is the residence time of that person, $CPI_i$ is his or her CPI score, and $War_i$ is the exposure to armed conflict. $X_i$ is a vector of controls, including gender, age, and other variables of interest. As managers are exposed to the Norwegian compliance environment over time, we expect the time coefficient to be positive. Both CPI and war should influence compliance negatively because they reflect lack of trust in government, and so we expect the coefficients of these two variables to be negative. There may be problems with potential multicollinearity, which we address by estimating the variance inflation factor (VIF) in the robustness section.
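What a fixed effect on auditor ID does can be seen in a minimal sketch with a single regressor: subtracting group means from both sides (the "within" transformation) removes any level difference between auditors before the slope is estimated. The data below are hypothetical toy numbers, not values from the audit sample, and the helper names are assumptions.

```python
from collections import defaultdict


def slope(x, y):
    """OLS slope of y on x (with intercept) for a single regressor."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's mean from its members (within transformation)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


# Hypothetical toy data: two auditors with different baseline levels
auditor = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [12.0, 14.0, 16.0, 8.0, 10.0, 12.0]  # y = 10 + 2x for A, y = 2x for B

pooled = slope(x, y)
within = slope(demean_by_group(x, auditor), demean_by_group(y, auditor))
```

Here the pooled slope is negative even though the relationship within each auditor has slope 2, because auditor B has both higher x values and a lower baseline. The within estimate recovers the common slope.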

4.2
GA Model Specification

We rely on Broadhurst et al. (1997) in specifying our GA. This is a stochastic optimization technique in which a population of n subsets is created, each containing a random selection of variables. Then, the cost function for each subset is evaluated. We then apply a weighted random selection to the original population, where the probability of a particular subset being selected is a function of its cost function response (i.e., the better the cost function response, the greater the chance of selection). The selection is repeated until n new subsets are created, forming a new population, and their cost functions are evaluated; this cycle is repeated until a stopping criterion is reached.

The GA model consists of two parts: first, a variable selection, where independent variables are selected through a specified number of iterations, and second, a linear regression is run over these selected variables. We limit the stopping criterion to eight independent variables. Allowing a higher stopping criterion will, eo ipso, increase the explanatory power, but also include variables with very low coefficients, despite their being statistically significant. LASSO deals with this problem through the tuning parameter. We build the model on a training set using .50 of the data and validate the results on the remaining .50. A total of 100 independent variables (continuous and dummies) from our random audit data set are included in the iterations. We set the number of iterations to 150, as the fitness function does not improve beyond this cut-off; see appendix figure A1.

The GA model aims to find the best OLS model on an eight-variable subset of the 100 independent variables that we have available. To do so, it randomly generates an initial population of 200 possible models of eight variables from our 100 available variables and performs the OLS on all of them. The resulting set of 200 eight-variable OLS models is then ranked, and the most promising ones, those with the lowest mean squared error (MSE), are crossed to form a new population of 600 eight-variable models. The best of these are again crossed to form 600 new possible models, and the algorithm continues. This algorithm eventually converges to a local optimum, i.e., a locally best OLS model on eight variables, and halts. To explore the different local optima, the GA is re-initialized 2,000 times.
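The search described above can be sketched on a much smaller scale. The following is an illustrative toy version, not the specification used in the paper: it selects three out of twelve simulated variables, uses rank-based selection of the lower-MSE half as parents, and all function names, the mutation rate, and the simulated data are assumptions.

```python
import random


def ols_mse(X, y, cols):
    """Mean squared error of an OLS fit of y on an intercept plus columns `cols`."""
    Z = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(Z[0])
    # Augmented normal equations [Z'Z | Z'y], solved by Gaussian elimination.
    A = [[sum(r[i] * r[j] for r in Z) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(Z, y))] for i in range(k)]
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(A[r][i]))  # partial pivoting
        A[i], A[p] = A[p], A[i]
        for r in range(i + 1, k):
            f = A[r][i] / A[i][i]
            A[r] = [a - f * b for a, b in zip(A[r], A[i])]
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (A[i][k] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return sum((t - sum(bj * zj for bj, zj in zip(b, r))) ** 2
               for r, t in zip(Z, y)) / len(y)


def ga_select(X, y, n_vars, size, pop_size=30, generations=15, seed=0):
    """Search for a low-MSE subset of `size` variables with a simple GA."""
    rng = random.Random(seed)
    cache = {}

    def cost(subset):
        if subset not in cache:
            cache[subset] = ols_mse(X, y, subset)
        return cache[subset]

    pop = [tuple(sorted(rng.sample(range(n_vars), size))) for _ in range(pop_size)]
    best = min(pop, key=cost)
    for _ in range(generations):
        parents = sorted(pop, key=cost)[: pop_size // 2]  # keep the lowest-MSE half
        pop = []
        while len(pop) < pop_size:
            a, b = rng.sample(parents, 2)
            child = rng.sample(sorted(set(a) | set(b)), size)  # crossover
            if rng.random() < 0.2:                              # mutation
                unused = [v for v in range(n_vars) if v not in child]
                child[rng.randrange(size)] = rng.choice(unused)
            pop.append(tuple(sorted(child)))
        best = min(pop + [best], key=cost)
    return best


# Simulated data (hypothetical): only variables 0, 5 and 9 matter.
data_rng = random.Random(1)
n_obs, n_vars = 150, 12
X = [[data_rng.gauss(0, 1) for _ in range(n_vars)] for _ in range(n_obs)]
y = [2 * r[0] - 3 * r[5] + r[9] + data_rng.gauss(0, 0.5) for r in X]

best = ga_select(X, y, n_vars, size=3)
print(best, round(ols_mse(X, y, best), 3))
```

Because the search is stochastic, different seeds can end in different local optima, which is the reason for restarting the algorithm many times and recording how often each model is selected.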

The GA selects the following unique linear model in 754 out of 2,000 runs:

$$Y_i = \alpha + \beta_1 X_i^\prime + \varepsilon_i, \tag{2}$$

where the dependent binary variable $Y_i$ is compliance by manager $i$, $\alpha$ is a constant, and $X_i$ is a vector of the following variables: private limited company, salary system, work training, external accountant, job advertisement, timesheet, terms, and conflict employees. $\varepsilon_i$ is the error term. Note that residence time is not selected by the GA, and thus is not estimated. The results from the first stage of the model (variable selection) are shown in the appendix, table A5. In the final step, we run this model on the total sample with fixed effects on NTA auditor ID, and then on industry sector (NACE code), and cluster standard errors by NACE code (two-digit level).

4.3
LASSO Model Specification

More common than the GA model in economics are Ridge and LASSO regressions (Pereira, Basto, and da Silva 2016; Hansen and Liao, 2019). While Ridge regressions are more suitable in multicollinear data containing a higher number of independent variables than observations, LASSO regressions are suited both for models with high levels of multicollinearity and for cases in which we want to automate variable selection, perhaps because no theory is available to guide this selection. In our data set, the observations outnumber the variables by far, so a Ridge model is not suitable. A LASSO model will in our case also fit the purpose, but our assessment is that the LASSO entails more preconditions (such as “best subset” under a regularized lp-norm; Zhou, Zhang, and So, 2015) than the GA, so that the variable selection is more restricted. The upside of the LASSO compared to the GA is that the latter typically includes some variables with little predictive power in the regression equation, which are likely to be removed in a LASSO model.

Like the GA model, a LASSO also performs variable selection. The results are often easier to interpret than those of a linear regression because the dependent variable will only be explained by a small subset of the predictors, i.e., those with nonzero coefficient estimates (James et al. 2013). The LASSO coefficients minimize the quantity:

$$\sum_{i=1}^{n}\left(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\right)^2 + \lambda\sum_{j=1}^{p}\left|\beta_j\right| = RSS + \lambda\sum_{j=1}^{p}\left|\beta_j\right|, \tag{3}$$

where $\sum_{i=1}^{n}\left(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\right)^2$ is the residual sum of squares (RSS), and $\lambda$ is the tuning parameter determining the penalty on large coefficients, i.e., if $\lambda = 0$, then the model is equal to a standard OLS. The only hyperparameter we need to determine when running the LASSO algorithm is the λ-value. Following James et al. (2013), we use tenfold cross-validation optimized for the MSE to select the optimal λ-value. However, choosing the folds for the cross-validation introduces randomness in the LASSO algorithm, and hence cross-validation generates different values for the optimal λ. To avoid selecting a λ-value at random, we run 2,000 tenfold cross-validations and end up with fifteen unique values for λ, and hence fifteen different models, estimating the effects of between nine and twenty-nine independent variables.
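How the penalty sets some coefficients exactly to zero can be shown with a small coordinate-descent sketch that minimizes the RSS plus λ times the sum of absolute coefficients. This is an illustrative solver on hypothetical toy data, not the routine used for the estimates in this paper. For this form of the objective, each coordinate update applies a soft threshold of λ/2.

```python
def soft_threshold(rho, t):
    """Shrink rho toward zero by t; values within [-t, t] become exactly 0."""
    if rho > t:
        return rho - t
    if rho < -t:
        return rho + t
    return 0.0


def lasso(X, y, lam, n_iter=100):
    """Minimise RSS + lam * sum(|beta_j|) by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta0, beta = sum(y) / n, [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: y with the intercept and all other terms removed.
            partial = [y[i] - beta0
                       - sum(beta[k] * X[i][k] for k in range(p) if k != j)
                       for i in range(n)]
            rho = sum(X[i][j] * partial[i] for i in range(n))
            z = sum(X[i][j] ** 2 for i in range(n))
            beta[j] = soft_threshold(rho, lam / 2) / z
        # The intercept is not penalised.
        beta0 = sum(y[i] - sum(beta[k] * X[i][k] for k in range(p))
                    for i in range(n)) / n
    return beta0, beta


# Hypothetical toy data with orthogonal regressors: y = 1 + 2*x1 + 0.1*x2
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [3.1, -0.9, 2.9, -1.1]

ols_like = lasso(X, y, lam=0.0)   # no penalty: reproduces the OLS fit
shrunk = lasso(X, y, lam=2.0)     # penalty removes the weak predictor
```

With λ = 0 the sketch returns the OLS coefficients (2 and 0.1). With λ = 2 the weak predictor drops out entirely and the strong one shrinks from 2 to 1.75, which is the variable-selection behavior described above.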

After 2,000 runs of tenfold cross-validation and selecting the model with the closest approximation to the OLS and GA on the number of variables, the resulting model is

$$Y_i = \alpha + \beta_1 X_i^\prime + \varepsilon_i, \tag{4}$$

where the dependent binary variable $Y_i$ is compliance by employer $i$, $\alpha$ is a constant, and $X_i$ is a vector of the following variables: private limited company, self-employed, salary system, work training, external accountant, audited employees, timesheet, terms, and conflict employees. $\varepsilon_i$ is the error term. As for the GA, we run this LASSO model on the total sample with fixed effects on NTA auditor ID, and then on industry sector (NACE code), and cluster standard errors by NACE code (two-digit level). The results from the LASSO variable selection are shown in the appendix, table A6.

5
Results
5.1
OLS Model Results

The results from the OLS model runs are presented in Table 2.

Table 2: Results from the OLS and fixed effects models.

| | (1) | (2) | (3) |
| --- | --- | --- | --- |
| Female | 0.029 (0.020) | 0.033+ (0.019) | 0.014 (0.021) |
| Age | –0.002*** (0.001) | –0.003*** (0.000) | –0.003*** (0.000) |
| Foreign | –0.043+ (0.024) | –0.045+ (0.024) | –0.050* (0.023) |
| Residence time <5 years | –0.121 (0.086) | –0.166 (0.101) | –0.165+ (0.095) |
| Residence time 5–10 years | –0.025 (0.040) | –0.035 (0.037) | –0.039 (0.036) |
| Conflict | –0.005 (0.030) | 0.053* (0.026) | 0.056* (0.027) |
| CPI score | 0.001 (0.001) | 0.000 (0.000) | –0.000 (0.000) |
| External accountant | 0.096*** (0.021) | 0.092*** (0.022) | 0.092*** (0.023) |
| Conflict employees | –0.108 (0.086) | –0.125 (0.075) | –0.152* (0.066) |
| Constant | 0.938*** (0.027) | | |
| Observations | 1,959 | 1,936 | 1,928 |
| R-squared | 0.029 | 0.166 | 0.236 |

| Romano–Wolf bootstrap p-values | Original | Romano–Wolf |
| --- | --- | --- |
| Female | 0.069 | 0.079 |
| Age | 0.001 | 0.000 |
| Foreign | 0.037 | 0.050 |
| Residence time | 0.069 | 0.069 |
| Conflict | 0.253 | 0.228 |
| CPI score | 0.490 | 0.446 |
| External accountant | 0.000 | 0.000 |
| Conflict employees | 0.042 | 0.040 |

Note: Estimated coefficients from OLS model runs. Standard errors in parentheses. Column (1) is from OLS without fixed effects, column (2) is from OLS with fixed effects on NTA auditor, and column (3) is from OLS with fixed effects on NTA auditor and NACE code (two-digit). Residence time >10 years is the reference category and therefore is omitted in the table. Standard errors are clustered by NACE code. Romano–Wolf test statistics are given by the original and Romano–Wolf p-values of each independent variable.

*** p<0.001, ** p<0.01, * p<0.05, + p<0.10

Foreign managers are 5 percentage points less likely to reach full compliance (3). Bastani, Giebe, and Miao (2020) reveal that immigrants are more likely to miss the declaration deadline and to be fined for noncompliance, regardless of their residence time in Sweden, but no such effects are replicated in this context. Part of their explanation is the language barrier among immigrants, which may be lower among non-native managers in our sample. On the contrary, we find a significant negative effect only among the most recently arrived (<5 years), suggesting a learning effect after five years of stay. We find lower coefficients for those managers with residence time < 5 years than those with 5–10 years, compared to those with >10 years of stay in Norway. This tendency again resembles the findings of Bastani, Giebe, and Miao (2020), who find that the probability of taking up the commuting deduction in the Swedish tax system is lower among immigrants with residence time < 5 years than those with 5–10 years of stay in Sweden, compared to Swedish natives. As the significance level of our result is weak, one should be careful about suggesting a common influence from confounding variable(s), even though the patterns appear similar to those of Bastani, Giebe, and Miao (2020).

We find a negative association between employees’ conflict exposure and a manager’s compliance (3). While Lange and Melsom (2022) find a positive effect on employees’ compliance from other peer employees’ conflict exposure, the effect on manager compliance is negative. Because the variable is a fraction between 0 and 1, the coefficient of –0.152 means that moving the fraction of employees exposed to armed conflict from 0 to 1 is associated with a 15.2 percentage point decrease in the probability of management compliance, or roughly 1.5 percentage points for each 10 percentage point increase in the fraction (3). As a robustness test, we have run all model specifications with each of the four components of the dependent variable. From the appendix, table A7, we see that only the component “Tax deduction” is driving this result, with a corresponding decrease of 14.9 percentage points in the probability of correct tax deduction filing by the manager. One explanation could be selection. Firms with many marginalized employees, such as war refugees, may have other characteristics that result in poorer compliance, e.g., higher staff turnover or less experienced managers.

We find no stable associations between the manager’s own conflict exposure and compliance, and no effects from CPI score in this sample. Fisman and Miguel (2007) find that home country corruption norms are an important predictor of propensity to behave corruptly among diplomats, but the cultural mechanism explaining noncompliance, i.e., unpaid parking violations, in their sample, seems not to be at play in our case. One explanation may be that codes of conduct in the workplace environment “eradicate” or “neutralize” cultural background characteristics such as corruption because managers are exposed to tax reporting standards as an integrated part of their everyday business. Diplomats’ parking routines, or lack thereof, on the other hand, are not an integrated part of their occupations as diplomats, but rather their personal character. Indeed, Fisman and Miguel (2007) find that the time and space of many violations is strong evidence that these are not even work-related and that third-party reporting closes the incentives to misreport.

We find a small negative, but statistically significant, effect of age on management compliance across all three model specifications. The age effect is stable, i.e., the probability of management compliance decreases with a ratio of .003 (3) as the manager gets one year older. Nevertheless, the sign of the age coefficient may not be surprising, as age correlates with seniority in any position, and thus the more senior, the more knowledge about the loopholes in the tax system. Furthermore, perhaps managers over time get a more realistic view of the (low) probability of audit selection by the tax authorities. This effect may also be explained by a higher understanding of tax legislation among managers in our sample than in the population in general. The age-learning effect may be equivalent to the age gradient identified by Bastani, Giebe, and Miao (2020). They find an age gradient in the take-up of commuter deductions for natives and immigrants with long residence time in Sweden, who presumably have adapted more to the tax system compared to newly arrived and younger immigrants. Furthermore, this learning effect resembles a learning effect in Fisman and Miguel (2007), where diplomats become bolder in their violations once they “successfully ‘got away with it’ a few times (or heard stories about others doing so).” (Fisman and Miguel 2007, 1042)

However, the negative age-effect partly contradicts more recent findings. Hofmann et al. (2017) conducted a comprehensive metastudy on tax compliance across sociodemographic categories, including age, and find a small positive, but significant relation between the age of taxpayers and their tax compliance. This confirms findings in Nordblom and Žamac (2012) and Kirchler (2007), as well as older studies, such as Tittle (1980), Witte and Woodbury (1985), Dubin and Wilde (1988), Feinstein (1991), and Hanno and Violette (1996). Ashby, Webley, and Haslam (2009), Braithwaite and Ahmed (2005), and Muehlbacher, Kirchler, and Schwarzenberger (2011) find no age effect, however.

Furthermore, we find a significant positive effect of an external accountant across all model specifications. In column (3) of Table 2 we observe that hiring an external accountant is associated with an increase in compliance with a ratio of .092 when controlled for auditor and industry sector. Most of the effect stems from the mandatory monthly reporting; see appendix, table A7. The effect is expected, as one would typically infer that accountants possess more knowledge on tax filing and liability than business managers in particular and people in general. The positive effect may both be causal and due to selection. Using an external accountant means hiring someone who is authorized to keep accounts and is obliged to ensure compliance to rules and regulations. In addition, using an external accountant is not mandatory. Thus, it seems likely that firms more concerned with compliance will do so to a greater extent than other firms. Saad (2014) finds some degree of trust in accountants’ tax knowledge as reasons for outsourcing tax filing. Further to this, managers with different characteristics and compliance attitudes may be selected to different sectors. However, adding fixed effects on NACE code (two-digit) on the third model specification, does not alter the significance level, and so this prospective selection bias is unlikely to drive the results.

For foreign managers, residence time is calculated based on arrival date in Norway. For Norwegian managers without any registered long-term stays abroad, residence time coincides with age. As about 80 percent of the managers are Norwegian, age and residence time are highly correlated in the full sample. In addition, residence time has a slightly different meaning for foreign and Norwegian managers. To address these issues, we have tested interaction terms between residence time and foreign background to see if the association between residence time and compliance differs for Norwegian and foreign managers, but find no significant interaction effects. We have also tried to run the regressions on residence time separately on Norwegian and foreign managers. Limiting the sample to Norwegian managers, the variables on CPI, conflict, and residence time are omitted, leaving only gender, age, and external accountant as independent variables. The coefficient for age is still significant for Norwegian managers. For foreign managers, neither age nor residence time yields significant coefficients.

Conflict and corruption variables are also correlated. Not surprisingly, the average CPI score is substantially higher in countries that have records of armed conflict. To check this association more closely, we have regressed compliance on these two variables separately and then combined them. The coefficients remain stable in all of these models; see appendix, table A10. This indicates that CPI scores measure something different than armed conflict, and we thus argue that it is possible to separate their associations with compliance from one another.

There is also considerable variation in CPI scores in both groups. For managers from countries exposed to armed conflict, the CPI score ranges from 12 to 85. For managers from nonconflict countries, it ranges from .6 to 80. Except for the vast majority of Norwegian managers with the CPI score for Norway in 2018, the CPI scores are quite evenly distributed in both groups.

5.2
GA Model Results

The results from the GA model specification are displayed in Table 3.

Table 3: GA model results.

| | (1) | (2) | (3) |
| --- | --- | --- | --- |
| Ltd. company | 0.132*** (0.030) | 0.145*** (0.029) | 0.103*** (0.022) |
| Salary system | 0.085** (0.025) | 0.108*** (0.029) | 0.103*** (0.029) |
| Work training | 0.037+ (0.020) | 0.028+ (0.017) | 0.024 (0.017) |
| External accountant | 0.129*** (0.018) | 0.115*** (0.020) | 0.111*** (0.021) |
| Job advertisement | 0.024* (0.010) | 0.024** (0.009) | 0.019+ (0.010) |
| Timesheet | 0.003 (0.005) | 0.000 (0.006) | 0.003 (0.007) |
| Terms | 0.022*** (0.003) | 0.023*** (0.003) | 0.021*** (0.003) |
| Conflict employees | –0.126** (0.047) | –0.094* (0.041) | –0.119* (0.048) |
| Constant | 0.390*** (0.078) | | |
| Observations | 1,897 | 1,872 | 1,864 |
| R-squared | 0.154 | 0.283 | 0.313 |

| Romano–Wolf bootstrap p-values | Original | Romano–Wolf |
| --- | --- | --- |
| Ltd. company | 0.000 | 0.000 |
| Salary system | 0.000 | 0.000 |
| Work training | 0.000 | 0.000 |
| External accountant | 0.000 | 0.000 |
| Job advertisement | 0.000 | 0.000 |
| Timesheet | 0.479 | 0.475 |
| Terms | 0.000 | 0.000 |
| Conflict employees | 0.042 | 0.050 |

Note: Estimated coefficients from OLS model runs, after GA variable selection. Standard errors in parentheses. Column (1) is from OLS without fixed effects, column (2) is from OLS with fixed effects on NTA auditor, and column (3) is from OLS with fixed effects on NTA auditor and NACE code (two-digit). Standard errors are clustered by NACE code. Romano–Wolf test statistics are given by the original and Romano–Wolf p-values of each independent variable.

*** p<0.001, ** p<0.01, * p<0.05, + p<0.10

We observe that holding a private limited company increases the probability of compliance by a ratio of .103 (3) and using an external accountant increases the probability of compliance by a ratio of .111 (3). The number of terms has a positive effect of .021 (3) on firm compliance. A firm reporting salaries or benefits for employees has sustained activity and therefore reporting liabilities, but, perhaps equally important, more contact points with the NTA throughout the year. The negative effect from the fraction of firm employees exposed to armed conflict is reproduced in this model, although it is slightly weaker.

The returns from the GA model yield intuitive results. A private limited company is more transparent, has more reporting liabilities, and is under easier surveillance and scrutiny by the tax authorities than self-employed or registered foreign companies. Thus, one would expect a higher probability of compliance for private limited companies. The use of an external accountant also increases the probability of compliance, and this coefficient replicates the effects from the OLS models. Most of the effect of holding an external accountant is driven by monthly reporting; see appendix, tables A7–A9.

5.3
LASSO Model Results

The results from the LASSO model specification are displayed in Table 4.

Table 4:

LASSO model results.

| | (1) | (2) | (3) |
|---|---|---|---|
| Ltd. company | 0.108 (0.113) | 0.131 (0.104) | 0.145 (0.103) |
| Self-employed | –0.026 (0.118) | –0.014 (0.108) | 0.050 (0.107) |
| Salary system | 0.085** (0.025) | 0.107*** (0.029) | 0.102*** (0.029) |
| Work training | 0.039+ (0.020) | 0.030+ (0.016) | 0.024 (0.017) |
| External accountant | 0.128*** (0.019) | 0.115*** (0.021) | 0.113*** (0.023) |
| Audit employees | 0.003 (0.004) | 0.005 (0.004) | 0.010* (0.004) |
| Timesheet | 0.003 (0.005) | –0.001 (0.006) | 0.001 (0.006) |
| Terms | 0.022*** (0.004) | 0.022*** (0.003) | 0.020*** (0.003) |
| Conflict employees | –0.127** (0.046) | –0.096* (0.040) | –0.124* (0.047) |
| Constant | 0.408*** (0.112) | | |
| Observations | 1,897 | 1,872 | 1,864 |
| R-squared | 0.153 | 0.283 | 0.314 |

| Romano–Wolf bootstrap p-values | Original | Romano–Wolf |
|---|---|---|
| Ltd. company | 0.000 | 0.000 |
| Self-employed | 0.000 | 0.000 |
| Salary system | 0.000 | 0.000 |
| Work training | 0.000 | 0.000 |
| External accountant | 0.000 | 0.000 |
| Audit employees | 0.000 | 0.000 |
| Timesheet | 0.479 | 0.495 |
| Terms | 0.000 | 0.000 |
| Conflict employees | 0.042 | 0.040 |

Note: Estimated coefficients from OLS model runs, after LASSO variable selection. Standard errors in parentheses. Column (1) is from OLS without fixed effects, column (2) is from OLS with fixed effects on NTA auditor, and column (3) is from OLS with fixed effects on NTA auditor and NACE code (two-digit). Standard errors are clustered by NACE code. Both the original and the Romano–Wolf p-values are reported for each independent variable.

*** p<0.001, ** p<0.01, * p<0.05, + p<0.10

The LASSO model results by and large reproduce the effects from the GA model, except for the significant effect of holding a private limited company. Neither company type, Ltd. company nor self-employed, is a significant predictor of management compliance in this model. This difference is not surprising, since the two algorithms optimize different functions defined over different spaces. The LASSO algorithm optimizes a carefully modified linear regression problem, while the GA optimizes over the space of all possible linear regressions.

These different goals mean that there is no reason to expect the algorithms to agree on everything, nor does their disagreement indicate an arbitrary result. The surprise is rather that the GA and LASSO agree on seven out of eight variables and that the coefficient sizes vary little between the two.
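To illustrate what a "modified linear regression problem" can look like, the following is a minimal sketch of LASSO variable selection by cyclic coordinate descent. It assumes the standard L1-penalized least-squares objective and centered covariates; the function names, the toy data, and the penalty value are hypothetical and are not taken from the paper, which does not describe its implementation at this level of detail.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(columns, y, penalty, sweeps=100):
    """Minimize (1/2n) * sum((y - Xb)^2) + penalty * sum(|b|) by cyclic updates.

    columns: list of covariate columns, assumed centered (no intercept is fitted).
    """
    n = len(y)
    p = len(columns)
    coefs = [0.0] * p
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Partial residual: remove the fit of every covariate except j.
            partial = [
                y[i] - sum(coefs[k] * columns[k][i] for k in range(p) if k != j)
                for i in range(n)
            ]
            rho = sum(col[i] * partial[i] for i in range(n)) / n
            scale = sum(v * v for v in col) / n
            coefs[j] = soft_threshold(rho, penalty) / scale
    return coefs


# Toy data (hypothetical): y depends strongly on x_strong and weakly on x_weak.
x_strong = [1.0, -1.0, 1.0, -1.0]
x_weak = [1.0, 1.0, -1.0, -1.0]
y = [2.0 * a + 0.1 * b for a, b in zip(x_strong, x_weak)]

coefs = lasso_coordinate_descent([x_strong, x_weak], y, penalty=0.5)
# Covariates with a nonzero coefficient are the ones "selected" by the penalty.
selected = [j for j, c in enumerate(coefs) if c != 0.0]
print(coefs, selected)
```

In this toy case the penalty sets the weak covariate's coefficient exactly to zero, which is the mechanism that lets LASSO act as a variable selector. The selected covariates could then be passed to an ordinary OLS regression, in line with the table notes describing OLS runs after variable selection.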

Table A11 in the appendix presents the results of the OLS, GA, and LASSO model runs with fixed effects on NTA auditor and NACE code (two-digit) side by side.

5.4
GA and LASSO Model Performance Comparison

There are a number of predictive performance indicators in the literature, such as the Akaike Information Criterion (Sakamoto, Ishiguro, and Kitagawa 1986), the Bayesian Information Criterion (Watanabe 2013), or adjusted R-squared (Mullainathan and Spiess 2017). The latter authors demonstrate the need for a hold-out sample to assess performance. The tendency of certain ML algorithms to overfit is also present in our GA model. Thus, one may expect performance to be overstated in the training sample. A second lesson from Mullainathan and Spiess (2017) is that ML algorithms can perform significantly better than OLS, even when sample sizes and the number of covariates are limited. In our setting, it makes less sense to compare the performance of predictive models with OLS, as we are mainly using the ML models to guide variable selection. However, as we want to check how well the ML models' predictions match the observed data, we find the mean squared error (MSE) to be appropriate (James et al. 2013). The performance of the models is displayed in Table 5, where we also include a column for adjusted R-squared for illustration purposes.

Table 5:

Model performance comparison.

| Model | MSE | Adjusted R2 |
|---|---|---|
| GA OLS | 0.0748 | 0.1501 |
| GA auditor | 0.0705 | 0.2833 |
| GA auditor NACE | 0.0699 | 0.2146 |
| LASSO OLS | 0.0749 | 0.1490 |
| LASSO auditor | 0.0706 | 0.2041 |
| LASSO auditor NACE | 0.0698 | 0.2157 |

Note: MSE and adjusted R-squared for the GA and LASSO models.
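For readers who want to see how the two indicators in Table 5 are defined, the sketch below computes MSE and adjusted R-squared from observed and predicted values. The function names and the toy numbers are hypothetical, and the degrees-of-freedom correction shown is the textbook one; how fixed effects enter the covariate count in the paper's own calculation is not stated, so that part is an assumption.

```python
def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted outcomes."""
    n = len(observed)
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n


def adjusted_r_squared(observed, predicted, n_covariates):
    """R-squared penalized for the number of covariates (intercept excluded)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_covariates - 1)


# Toy values (hypothetical), standing in for hold-out outcomes and predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]

mse = mean_squared_error(observed, predicted)
adj_r2 = adjusted_r_squared(observed, predicted, n_covariates=1)
print(mse, adj_r2)
```

Evaluating both functions on a hold-out sample rather than on the training sample guards against the overstated performance that overfitting would otherwise produce.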

Although ML models have obvious limitations with respect to causal inference (Pearl 2018), there are still lessons to be drawn from comparing two ML models that perform variable selection in the first stage when neither previous empirical findings nor theory can guide explanations of the relationships. We see that the performance is very similar, so the choice of model should be guided by other criteria, such as the number of parameters one would have to set “arbitrarily.”

5.5
Further Robustness Tests

To get a clearer picture of the origins of significant effects, we have run all model specifications on each component of the dependent variable, namely, tax deduction, payroll accounts, monthly reporting, and general accounts. An unambiguous result of this test is that most of the effect of holding an external accountant is driven by monthly reporting. All results are given in the appendix, tables A7–A9.

As NTA auditors have suggested the independent variables, it is likely that some multicollinearity between variables exists. We estimate the variance inflation factor (VIF) of the individual variables in all regressions and find that none of the variables in the standard OLS regressions has a VIF higher than 4.22, and none in the GA- and LASSO-specified regressions has a VIF higher than 1.21. Hence, we can disregard multicollinearity.
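The VIF check can be sketched as follows, under the usual definition: regress each covariate on all the others (with an intercept) and compute 1 / (1 - R-squared). The helper names and the toy columns are hypothetical, and the small normal-equations solver is only meant to keep the example self-contained; it is not the software used in the paper.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(y, columns):
    """R-squared from regressing y on an intercept plus the given columns."""
    n = len(y)
    z = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(z[0])
    ztz = [[sum(z[i][r] * z[i][c] for i in range(n)) for c in range(k)] for r in range(k)]
    zty = [sum(z[i][r] * y[i] for i in range(n)) for r in range(k)]
    beta = solve_linear_system(ztz, zty)
    fitted = [sum(z[i][r] * beta[r] for r in range(k)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((y[i] - fitted[i]) ** 2 for i in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


def variance_inflation_factors(columns):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the rest."""
    vifs = []
    for j, target in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        vifs.append(1.0 / (1.0 - r_squared(target, others)))
    return vifs


# Toy covariates (hypothetical): the two columns have a correlation of 0.6.
correlated = variance_inflation_factors([[1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0]])
# Orthogonal columns carry no shared variance, so their VIF is 1.
orthogonal = variance_inflation_factors([[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]])
print(correlated, orthogonal)
```

A VIF of 1 means a covariate is uncorrelated with the others, and larger values indicate that more of its variance is explained by the remaining covariates. The reported maxima of 4.22 and 1.21 can be read against that scale.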

For the same reason related to variable selection by NTA auditors, we may expect some endogeneity in the final model specifications. Thus, for all fixed effects regressions, we have run the Hausman test for endogeneity (Hausman 1978), and find that no difference in coefficients is systematic.

6
Conclusion

We find a small, negative association between a manager’s age and compliance, but a positive association between the use of an external accountant and compliance with reporting requirements. Whereas exposure to armed conflict among the employees in the firm also reduces compliance, our findings from the ML models suggest positive associations between compliance and company characteristics such as holding a private limited company, and internal firm characteristics such as an established salary system and the frequency of terms with salary or benefit payments to employees. We find no associations between managers’ own conflict exposure and compliance.

The OLS, the GA, and the LASSO models all show higher compliance among managers who use an external accountant. However, the use of an external accountant is most likely an endogenous variable. Unlike true independent variables such as gender and age, the use of an external accountant is a choice the managers make. Whether the manager chooses to use an external accountant or not may also be seen as a part of their compliance behavior. It is not surprising that managers with external accountants are more compliant. However, it is difficult to determine whether this is a result of the managers’ inherent inclination to comply or of the services provided by the external accountant. More research is needed to establish causal relationships.

We find no evidence that home country corruption level has any effect on compliance, nor a clear effect of residence time, except that managers with <5 years length of stay in Norway are less compliant than those with residence time >10 years, suggesting a “learning effect” after five years of stay.

The negative age-effect in the OLS models partly contradicts recent findings. Thus, the age effect is likely more context- and application-specific. The positive sign of the coefficient on an external accountant is expected, as we infer that accountants possess more advanced knowledge on tax filing and liability than business managers in particular and people in general.

Allowing for nonparametric inference, using the pool of 100 prospective variables suggested by NTA tax auditors, a second contribution of this paper is to test two ML algorithms, namely, GA and LASSO. We find that both the GA and LASSO model specifications select other independent variables than the standard OLS, except for the use of an external accountant and the fraction of employees exposed to armed conflict.

The policy implications for the Tax Administration are twofold. First, the models in this paper have produced significant explanations of some factors driving manager compliance, namely, holding an external accountant, having a salary system, and managing a private limited company. These factors contribute positively to manager compliance. Age and previous armed conflict exposure of the employees in the firm, on the other hand, contribute negatively to manager compliance. Audit selection should thus take these characteristics into account when limited audit resources are allocated.

Second, when the Tax Administration possesses representative data, supervised ML models such as the GA and LASSO may provide useful tools in both understanding the drivers behind noncompliance and guiding audit selection.

The results of our analysis hint at the use of individual-level variables to target audits. On the one hand, this might lead to an improvement in whatever audit performance measure is used. On the other hand, it might lead to discrimination against a particular sociodemographic group by a state institution. Therefore, we place more confidence in the ML results, which tend to uncover firm-level variables.

The decision of whom to select for an audit is based on a comprehensive assessment in which information about entirely different factors is certainly more important. Nevertheless, knowledge about the characteristics that distinguish compliant from non-compliant businesses can be useful, even when considering these types of characteristics. It can provide relevant information when selecting audit targets, but more importantly, it can be valuable knowledge when designing other measures and initiatives to enhance compliance.

Language: English
Page range: 1 - 29
Submitted on: Oct 24, 2022
Accepted on: Apr 11, 2024
Published on: May 20, 2025
Published by: DJØF Publishing, Nordic Tax Research Council
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Thomas Lange, Anne May Melsom, published by DJØF Publishing, Nordic Tax Research Council
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.