Audits and information letters are two enforcement strategies with distinct effects on a firm's behavior and significantly different administrative costs for tax authorities. On-site audits are more intrusive than impersonal letters requesting needed information to clarify a firm's report or alerting the firm about possible errors in their report. While a scheduled visit from a tax auditor typically evokes a sense of being subject to the disclosure of potential errors, a letter stating the firm's reporting liabilities is perceived more as a reminder of the applicable rules than as a sense of being inspected. Thus, behavioral changes following an audit likely stem from a sense of enforcement, whereas changes after receiving a letter requesting information may feel more voluntary. From the tax administration's perspective, audits are time-consuming, depending on their scope and nature, while letters, once written, are practically cost-free to distribute. A change in enforcement strategy could therefore result in significant cost savings for the tax administration.
Firms play a central part in the economy and account for a significant fraction of the collected tax revenue. In 2021, the tax revenue ratio of firms to individual taxpayers in Norway was 0.75, not including value added tax (VAT) (Ministry of Finance, 2022). The payroll tax represents a significant contribution to the Norwegian tax revenue. In 2021, Norwegian firms paid NOK 216.4 billion in total payroll tax, equaling 14.2 percent of the total tax revenue, or 22.9 percent of the taxes levied on firms only (Ministry of Finance, 2022). Employers are responsible for the remittance and the reporting of the payroll tax on behalf of their employees, as part of the financing of the National Insurance Scheme. The tax is levied on employees' salaries and other taxable remuneration for work and assignments in and outside of an employment relationship. The tax rates are set by the Norwegian Parliament every year and range from 5.1 percent to 14.1 percent depending on the sector and geographical location.
Firms' tax behavior and exposure to the tax administration differ from those of individual taxpayers. While the latter are obligated to submit a tax return only once a year, a firm has several reporting and payment obligations throughout the year, such as the bimonthly reporting of the payroll tax remittance and the monthly submission of the VAT return. Thus, the higher frequency of encounters with the tax authorities may affect firms' compliance differently than it affects individual taxpayers' compliance, requiring different enforcement strategies.
This paper contributes to the recent literature on the effects of different tax enforcement strategies. Our experimental setting captures central aspects of the real-world reporting environment, such as firms' bimonthly remittance of the payroll tax. We compare the effects on firms' payroll tax remittance of a scalable strategy, standardized information letters, with the effects of on-site audits. Unlike most other contributions to the literature, we study the effects of these two instruments on the same population and in the same institutional setting. The population consists of 30,961 firms in different labor-intensive businesses in the Norwegian economy.
Since our two treatments (on-site audits and standardized, electronic information letters) are directed toward the firm's reporting obligations, we seek to uncover whether the firm responds to and corrects its payment of payroll tax after treatment with an audit or a letter stating the required obligations. We measure compliance through remitted amounts of payroll tax. We believe payroll tax is of particular relevance in this context, both because it is a tax levied only on firms and because the message conveyed in the treatments is targeted towards employee reporting obligations, which indirectly influence payroll tax remittance.
The audits were implemented on a stratified random sample of 1,974 firms during the filing and auditing season of 2018. The audited firms were given an up-front notification about the content of the audit so that the firms could prepare relevant documentation. The letters were sent to the same population, but to a different sample of 8,000 recipients simultaneously in June 2018. A 14 percent fraction (1,130) of the letter recipients never opened the letter and were therefore not treated.
The empirical analysis is divided into two main parts. The first part studies the average treatment effects (ATE) of audits and letters on the firms' remittance of payroll tax. As we are interested in a comparison with the article by Bjørneby, Alstadsæter, and Telle (2021) on employee effects, we have kept the number of employees as a dependent variable alongside the analysis of the payroll tax. While on-site audits are more comprehensive in that they involve more scrutiny than standardized letters, we expect considerable variation in remitted payroll tax across firms depending on the treatment assigned.
We find that the audit treatment increased the firm's payroll tax remittance in 2018 by 13.20 percent compared to the pre-treatment level in 2017, and we find a statistically significant increase in remitted payroll tax in all post-treatment years (2018–2020) for the audited firms compared to the reference group receiving no treatment. Furthermore, we find a similar but less strong effect of 1.82 percent for the firms receiving a letter. This confirms the findings of Ortega and Scartascini (2020) who found stronger effects from physical visits than emails. The effect seems to be more stable for the audited firms, two years following the treatment (2020). This is consistent with previous findings such as Kleven et al. (2011), DeBacker et al. (2018) and Advani, Elming, and Shaw (2023).
The second part of the analysis studies the local average treatment effects (LATE) of the letters. This allows us to isolate the effects on the firms that read the letters, i.e., the treated (or “compliers”) in this experiment, from the effects on the entire group of firms that were intended to be treated (ITT). The nature of the letter treatment differs from that of audits in this respect: there is often one-sided noncompliance in that a fraction of firms receiving a digital letter will leave it unread. Thus, we study the effects of being treated, using an IV model capturing this effect. We find that the compliers remit significantly higher payroll taxes in all subsequent years following the treatment. This is consistent with the findings of Bjørneby, Alstadsæter, and Telle (2021).
The policy implications of the results in this paper are twofold. First, our analysis reveals that there is a significant compliance effect for alternative enforcement strategies like information letters. Second, since information letters are scalable to a larger population at low cost, such enforcement may be preferable to traditional, more expensive on-site audits. We include in this part a limited cost-benefit analysis, indicating that letters may be preferable in the short run.
The paper is organized as follows: Section 2 reviews adjacent literature, Section 3 describes the institutional background and the random audit program, Section 4 describes the data, Section 5 lays out the experimental design, Section 6 estimates the effect of the two treatments, Section 7 suggests some policy implications, and Section 8 provides the conclusion.
The rational agent-based theory of Allingham and Sandmo (1972) has shaped the modern tax administration's approach to enforcement by increasing taxpayers' perceived costs of evasion, decreasing the cost of compliance, and tailoring enforcement strategies towards different taxpayer segments (Baer and Silvani, 1997). More recently, behavioral and moral aspects of tax compliance have also found their way into tax research (e.g., Dhami and Al-Nowaihi 2007 and Luttmer and Singhal 2014) and have gained increased attention among tax administrations. Rather than increasing audit frequency, modern tax administrations have sought to gain knowledge of compliance effects from other, prospectively cheaper and more scalable enforcement strategies (Murphy 2019, Alm 2019a, Keen and Slemrod 2017).
The compliance effects of enforcement strategies in general, and audits in particular, have recently been studied in experimental designs, using randomized samples (Kotsadam et al. 2021; Kleven et al. 2011; Advani, Elming, and Shaw 2023; DeBacker et al. 2018; Hebous et al. 2020; Bjørneby, Alstadsæter, and Telle 2021). Most of these studies use samples of individual taxpayers, except Almunia and Lopez-Rodriguez (2018), which uses a nonrandom sample of firms from the large taxpayers unit (LTU) in Spain; Boning et al. (2020), which uses total employer tax deposits; and Bjørneby, Alstadsæter, and Telle (2021), which uses on-site audits of Norwegian firms from a stratified sample. Unlike these contributions, which study the effects of enforcement on the individual taxpayer, this article studies the effects of audits and letters as enforcements on the firm.
The novelty of this contribution is the study of how the firm as an entity may respond in a very different way than individual taxpayers. Furthermore, the effects of two enforcements on the payroll tax are studied—a tax to which only the firm is liable—in the same environment. D'Agosto et al. (2018) also studied the compliance effect on small businesses of two different enforcement strategies, namely on-site and desk-based audits. They use risk-based audits, however, not random selection. Our approach also differs from random designs like that of Boning et al. (2020) who studied the effects on overall employment taxes, i.e., payroll taxes and employee income taxes withheld and remitted by IRS-assigned at-risk firms. Since the latter can be manipulated by the employee, even in a third-party remittance regime (Bjørneby et al., 2021), part of the effects in Boning et al. (2020) may be explained by employee noncompliance or collusive actions involving both employer and employee. We overcome these issues by focusing on the payroll tax alone.
The effects of audits on compliance vary across studies and are sensitive to which dependent variable is chosen, but Kleven et al. (2011), DeBacker et al. (2018), and Advani, Elming, and Shaw (2023) find lasting effects on subsequent tax compliance among audited taxpayers. These effects are confirmed by Løyland et al. (2019) using risk-based audits and Hebous et al. (2020). Because audits are costly, tax administrations also use cheaper and less intrusive enforcement policies such as information campaigns, reminders, enforcement emails (Brockmeyer et al. 2019), letters (Pomeranz 2015, Doerrenberg and Schmitz 2015), and encouragements (Kotsadam et al. 2021). The effects of such soft treatments are also mixed, depending on the message portrayed in the treatment (Alm 2019b, Slemrod 2019, Meiselman 2018, Pomeranz and Vila-Belda 2019). Information that increases perceived detection probability seems to have positive short-term effects on reporting (Slemrod, Blumenthal, and Christian 2001; Kleven et al. 2011; Fellner, Sausgruber, and Traxler 2013; Bott et al. 2020), while general appeals to tax morale and social norms seem to have little or no effect (Hallsworth et al. 2017), or even negative effects (De Neve et al. 2021). Using four different letter treatments, Bergolo et al. (2023) found that information on audits decreased the perceived probability of being audited, while, at the same time, inducing a significant deterrent effect on tax evasion.
Our design is similar to that of Boning et al. (2020), except that the 12,172 firms they studied were ex-ante suspected of noncompliance, selected by an algorithm based on payments before and during the fourth quarter of 2014, i.e., firms showing signs of noncompliance before treatment. While it is highly useful for tax administrations to acquire knowledge about treatment effects on potentially noncompliant firms, our contribution differs in that our random selection is not limited to a population of firms suspected of noncompliance, but rather to a fraction of the economy, irrespective of previous risk-based selection. Our findings are representative of a population of 30,961 firms from labor-intensive businesses. Our approach provides a broader view of the compliance effects of the treatments within the entire sector. Boning et al. (2020) is a more targeted approach, allowing for confirmation or rejection of the initial suspicions, and providing evidence of different behavior among high-risk firms. In their setting, a fraction of the firms also received an information letter, and another fraction had an on-site IRS Revenue Officer visit.
The unit of study in Kotsadam et al. (2021) is a sample of individual taxpayers claiming deductions. Taxpayers claiming deductions are more prone to errors than the rest of the taxpayer population, and so the external validity of the effects observed in Kotsadam et al. (2021) may not extend to a population of firms. What motivates individual taxpayer behavior may also differ from the determinants behind firm compliance.
Bjørneby et al. (2021) study a compliance mechanism evolving from third-party reporting using randomized audits of firms in Norway, and they find compliance effects with the audit treatment. We may expect similar effects from the audits in our study, but we move beyond their setup and compare the effects from the audits with prospective effects from letters. This may give tax administrations information about more cost-efficient enforcement strategies.
In 2017, the Norwegian Tax Administration (NTA) introduced the random audit program to build more systematic knowledge about tax compliance. The program is conducted along three thematic strands: labor market regulations, VAT compliance, and quality of third-party data reporting. We focus on data from the first strand.
The audits were directed toward disclosing the scope and magnitude of formal compliance errors in labor-intensive businesses in Norway. This focus originated from risk-based audits, where experience indicates that formal noncompliance is particularly high where foreign labor is involved. Risk-based audits are biased, however, and so the inference from the audit sample to the general population is limited. This random audit program, therefore, sought to measure the compliance gap in labor-intensive industries to make more robust conclusions about compliance risk in the population of these industries. The audits had reporting objects on, for example, firms' accounting and salary systems, recruitment routines, board and lodging for foreign employees, use of foreign workers and subcontractors, tax withholding accounts, salary reporting, and staff registers, if applicable.
The program was thus established to gain knowledge of reporting noncompliance rather than to enforce tax remittance. Correct reporting will inevitably entail correct tax remittance, but rather than disclosing noncompliance as evaded taxes, the program aimed at disclosing errors in the firms' reporting procedures.
The random audit program started with a test pilot in 2017. A total of 60 test audits were performed in this initial phase. These audits are not included in the data used in this paper. Following an evaluation of this pilot, adjustments were made before the start of the program. In its full scope, 187 auditors from the NTA have been involved in executing the audits. Each audit averaged three to five days of work, with an average cost of 12,740 NOK per audit. All 1,974 audits were executed in 2018.
The second treatment in the experiment was information letters. The NTA sent 8,000 randomly selected firms letters containing descriptive information on seven relevant reporting duties for the firms in these sectors but no “moral” statements (e.g., Bott et al. 2020). The letter recipients were randomly drawn from the same target population as the audited firms. While the audits were more extensive than the letters, the reporting duties stated in the letters reflected some of the same procedures that the audited firms were asked to provide information about, namely accounting standards, monthly reporting (A-melding), employee tax deductions, documentation of salary expenses, documentation of elapsed time, reporting duties to the International Tax Collection Office, and staff register obligations. A copy of the letter can be found in Appendix 3. The marginal cost of electronically distributing the letters and the average cost per letter are both negligible.
The target population for both the audits and the letters is based on the selection criteria in Table 1 and consists of 30,961 firms.
Table 1: Population selection criteria
| Criteria |
|---|
| 1. Total revenue in 2017 > NOK 100,000 |
| 2. If not 1, then total revenue in 2016 ≥ NOK 200,000. |
| 3. Mean active work relations per month is ≤ 20 |
| 4. Mean active employees per month is ≥ 5 |
| 5. If not 4, then registered > NOK 100,000 on subcontracts and/or foreign services in 2016 |
| 6. Private limited companies (AS), self-employed (ENK), and Norwegian-registered foreign enterprises (NUF) |
| 7. No termination date registered |
| 8. Sectors in public service, defense and public social security are excluded |
Note: The table describes the selection criteria for firms in the strata described in Appendix Table A2, which resulted in the 30,961 firms in the population of study. The selection criteria are set to secure representative firm activity (criteria 1–2; 7), employment (3–5), and organizational form (6), and to exclude the public sector (8).
The 8,000 firms receiving letters were randomly drawn from this target population using simple random selection and no stratification or any further selection criteria. With this increased sample size, compared to the audits, the likelihood of obtaining a representative and diverse sample that reflects the overall population characteristics increases, and hence stratification becomes less important (Solon, Haider, and Wooldridge 2015). No audited firms received a letter, however. Thus, no firms received two treatments, and there was no interference between treatment groups. The electronic letters were simultaneously sent out through the NTA e-portal in the digital government dialogue (“Altinn”) to all 8,000 firms on 1 June 2018, among which 6,870 firms did open/read the letters. The NTA can read off which firms opened the letters and which firms left them unopened.
Out of 2,000 randomly selected firms, 26 were found “unworthy” of an audit for different, unsystematic reasons. The 1,974 audited firms were selected through a stratified random sample to capture the relative sizes of the different strata in the total population of 30,961 firms. A total number of 22 strata representing different industries were constructed from 65 different NACE codes on 1- and 2-digit levels, cf. Appendix 1. The sample size allocated to each stratum was determined by the method of proportional allocation (based on the number of foreign employees in each industry). The samples were then selected from each industry (stratum) for each of the five tax regions in Norway separately, considering the audit resource capacity in each region. The aim was to perform at least five audits in each stratum in each region, but the number of audits in each stratum could not exceed 15 percent of the total resource capacity in the region (proportional allocation method with lower and upper cut-off). A list of the industry sector and stratum is included in the Appendix.
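The allocation rule described above can be sketched in a few lines. This is a minimal illustration, not the NTA's actual procedure: the function name, the stratum labels, the weights, and the regional capacity are all hypothetical, and only the floor of five audits and the 15 percent cap are taken from the text.

```python
def allocate_audits(weights, capacity, floor=5, cap_pct=15):
    """Proportional allocation with a lower and an upper cut-off.

    weights  : dict stratum -> allocation weight (hypothetical numbers
               standing in for foreign employees per industry)
    capacity : number of audits a region can carry out (hypothetical)
    floor    : minimum number of audits per stratum (5 in the text)
    cap_pct  : maximum share of regional capacity per stratum, in percent
               (15 in the text)
    """
    total = sum(weights.values())
    upper = capacity * cap_pct // 100  # integer cap avoids float rounding issues
    plan = {}
    for stratum, weight in weights.items():
        proportional = round(capacity * weight / total)
        # Clip the proportional share to the [floor, upper] interval.
        plan[stratum] = min(max(proportional, floor), upper)
    return plan


# Hypothetical region with seven strata and capacity for 200 audits.
weights = {"A": 50, "B": 1, "C": 10, "D": 12, "E": 9, "F": 8, "G": 10}
plan = allocate_audits(weights, capacity=200)
print(plan)
```

In this sketch the clipped allocations need not sum to the regional capacity; how any slack was redistributed in practice is not described in the text.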
The data set covers the period from 2017 to 2020. While the shortcoming of this limited period is of some concern for establishing post-treatment trends, it is evident from other studies that the strongest effects occur immediately or within a couple of years post-treatment (e.g. Advani, Elming, and Shaw 2023; Brockmeyer et al. 2019; Pomeranz 2015), DeBacker et al. (2018) being an exception, finding stronger effects on total income in years 3 and 4 compared to years 1 and 2 post audit. Our data set covers a wider period than, for example, Boning et al. (2020) and Brockmeyer et al. (2019), and an additional post-treatment year compared to Bjørneby, Alstadsæter, and Telle (2021).
To provide an overview of the different sectors' representation in the sample, the descriptive statistics are broken down by stratum, as displayed in Appendix Table A1, while Table A2 displays which sectors are included within each stratum. Construction of buildings and civil engineering (S26); human health, residential care, and social work (S15); publishing, broadcasting, telecommunications, and information services (S17); food and beverage service activities (S31); and manufacturing (S16) are by far the largest strata in the population concerning the number of firms in each stratum.
Payroll tax represents a significant tax revenue contribution from the firms in our population, where the total reported contributions vary from NOK 8.5 to 9.8 billion over the years of study. Table 2 shows reported remittance of payroll tax in NOK, and firms' workforce, broken down by population, reference, audit, and letter groups for the years 2017–2020.
Table 2: Reported remittance of payroll tax (NOK) and number of employees (2017–2020)
| Group | Year | Payroll tax: N | Payroll tax: Mean | Payroll tax: SD | Employees: N | Employees: Mean | Employees: SD |
|---|---|---|---|---|---|---|---|
| Population | 2017 | 30 869 | 300 651 | 416 841 | 30 961 | 11.95 | 11.41 |
| | 2018 | 30 813 | 311 326 | 550 902 | 30 961 | 11.99 | 14.25 |
| | 2019 | 30 807 | 318 893 | 702 543 | 30 961 | 11.46 | 17.6 |
| | 2020 | 30 667 | 285 473 | 733 289 | 30 961 | 10.22 | 15.6 |
| Reference | 2017 | 20 919 | 300 638 | 423 045 | 20 987 | 11.72 | 11.07 |
| | 2018 | 20 893 | 310 461 | 586 372 | 20 987 | 11.71 | 13.68 |
| | 2019 | 20 884 | 319 539 | 776 792 | 20 987 | 11.15 | 16.84 |
| | 2020 | 20 786 | 287 767 | 815 942 | 20 987 | 9.99 | 16.06 |
| Audit Arm | 2017 | 1 970 | 284 822 | 342 069 | 1 974 | 14.05 | 14.42 |
| | 2018 | 1 964 | 299 351 | 388 336 | 1 974 | 14.21 | 14.92 |
| | 2019 | 1 966 | 299 264 | 434 835 | 1 974 | 13.49 | 15.67 |
| | 2020 | 1 964 | 253 812 | 417 955 | 1 974 | 11.82 | 14.24 |
| Letter Arm | 2017 | 7 980 | 304 594 | 417 213 | 8 000 | 12.04 | 11.42 |
| | 2018 | 7 956 | 316 552 | 485 016 | 8 000 | 12.18 | 15.47 |
| | 2019 | 7 957 | 322 046 | 529 603 | 8 000 | 11.78 | 19.82 |
| | 2020 | 7 917 | 287 302 | 539 802 | 8 000 | 10.41 | 14.63 |
Note: Reported remittance of payroll tax in NOK. Column N denotes total number of observations/firms in the respective years given by the first column. Column “Mean” represents sample means, and column SD gives the standard deviation. 735 erroneous, negative values are omitted. Random checks on negative values indicate no evidence that omitted observations are systematic.
The letter group is four times as large as the audit group, and so we expect lower standard errors on the estimated effects of letters than of the audits. The reference group is the same population of firms for both audit and letter arms. The standard deviation is high for all groups, and similar across groups. All groups' payroll tax remittances increased from the pretreatment year 2017 to the treatment year 2018. The decrease in firms' payroll tax remittance in 2020 is an effect of the COVID-19 pandemic. An Ashenfelter dip would arise if the firms assigned to either treatment had temporarily depressed payroll tax remittance before treatment that would have reverted upward toward its longer-term mean even absent treatment (Ashenfelter, 1978). As expected for a true experimental design, we found no evidence of such a dip when adding a pretreatment year (2016), cf. Appendix Figure AF1. As expected, due to randomization, the treatment groups are similar before treatment, except that the audit group had a systematically higher number of employees compared to all other groups during the entire period of study. This is due to the stratification of the audit sample, where overrepresented strata had firms with higher numbers of employees, cf. t-test statistics with strata fixed effects in Table 3.
Table 3: t-statistics for payroll tax and employees in the base year 2017
| Arm | Payroll Tax | t | P > \|t\| | Employees | t | P > \|t\| |
|---|---|---|---|---|---|---|
| Audit | 3658.342 (6721.756) | 0.54 | 0.592 | 0.297 (0.321) | 0.92 | 0.366 |
| Letter | 5317.589 (3322.063) | 1.60 | 0.124 | 0.116 (0.190) | 0.61 | 0.548 |
| Observations | 30869 | | | 30961 | | |
| R-squared Audit | 0.055 | | | 0.124 | | |
| R-squared Letter | 0.000 | | | 0.000 | | |
Note: OLS estimation coefficients and test statistics for the audit arm compared to the stratified audit reference arm, and letter arm compared to the unstratified letter reference arm. Estimated coefficients on Y = Payroll Tax and Y = Employees. Fixed effects on strata for audit arm. Standard errors are adjusted for 22 clusters in strata. Robustness test is run with standard error clusters on firm ID, and the result stands.
A simple way to test the effect of a treatment is to track the means of the outcome variables, pre- and post-treatment, for the treatment and control groups respectively, cf. Figure 1.

Figure 1: Means of payroll tax and number of employees for the reference, audit, and letter groups.
Notes: This figure plots point means and 95% confidence intervals from payroll tax remittance (in NOK), and the number of employees for years 2017 to 2020. The treatment, either audits or letters, takes place in year 2018, as visualized by the vertical, dotted line. The black line represents the annual mean of the audit group, the blue line represents the annual mean of the letter group, and the dotted line represents the mean of the reference group. The 95% confidence intervals are represented by the vertical error bars for each group.
From 2018 to 2019, all three groups slightly increased their payroll tax remittance compared to the pretreatment year of 2017, and then decreased it toward 2020. Thus, on average, we cannot infer that the treatments affect firms' payment of payroll tax. The firms' workforce is also downward sloping from 2017 onwards, with a sharper decline from the treatment year of 2018 for all three groups. On average, we do not see any effect of treatment on the firms' workforce.
As we can infer from both panels, randomization fails on raw data because there are significant differences between the groups, pretreatment. Since the audit sample was stratified, however, we need to control for strata fixed effects when testing randomization.
To test whether the treatment groups and reference group were significantly different in the pre-treatment year of 2017, we ran a simple OLS regression with fixed effects on strata for the audit arm to control for the stratification bias, but no fixed effects for the letter arm because the letter sample selection was not stratified:
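One way to write the balance regression described here, with notation that is assumed for illustration rather than taken from the original, is:

$$
Y_{i,2017} = \alpha + \beta\, T_i + \gamma_{s(i)} + \varepsilon_i
$$

where $Y_{i,2017}$ is firm $i$'s payroll tax (or number of employees) in 2017, $T_i$ indicates assignment to the treatment arm being tested, and $\gamma_{s(i)}$ is a fixed effect for the firm's stratum, included for the audit arm and omitted for the letter arm. If randomization holds, $\beta$ should not differ significantly from zero.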
Furthermore, there are reasons for clustering the standard errors, either as the result of the sampling design because we have sampled data from a population using clustered sampling (Liang and Zeger, 1993) or to accommodate the experimental design (Weiss, Lockwood, and McCaffrey 2016) because of the clustered assignment mechanism for the audit treatment. Although both reasons are relevant for this study, at this stage of the analysis we cluster the standard errors for the first reason. The test results are displayed in Table 3.
As we can see from the t-statistics in Table 3, randomization holds for both audit and letter arms on payroll tax and employees.
Because we are also interested in a comparison of treatment effects on the two treatment arms, we have run a two-sample t-test with equal variances on payroll tax and employees to reveal possible pre-treatment differences between the treatment arms on the two variables. The test results are displayed in Table 4.
Table 4: Two-sample t-statistics for payroll tax and employees in the base year 2017
| Arm | Payroll Tax | t | P > \|t\| | Employees | t | P > \|t\| |
|---|---|---|---|---|---|---|
| Audit | 284821.6 (342068.8) | | | 14.07107 (14.42238) | | |
| Letter | 304594 (417212.9) | | | 12.05564 (11.42717) | | |
| Combined | 300679.2 (403508.8) | 1.9480 | 0.0514 | 12.45467 (12.10517) | −6.6322 | 0.0 |
Note: Test statistics for Audit (N=1,970) and Letter (N=7,980) arms on Y = Payroll Tax and Y = Employees. 24 Erroneous, negative values are omitted. Standard deviations in parentheses.
The audit and letter arms are statistically significantly different on employees, but not on payroll tax.
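The two-sample test with equal variances can be reproduced from the summary statistics alone. The sketch below is illustrative: the helper `pooled_t` is a hypothetical name, and the inputs are the means, standard deviations, and sample sizes reported in Table 4. The sign of the statistic depends only on the order of subtraction, so the comparison with the table is on magnitudes.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic (group 1 minus group 2) assuming equal variances."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se


# Summary statistics from Table 4 (audit N = 1,970; letter N = 7,980).
t_payroll = pooled_t(284821.6, 342068.8, 1970, 304594.0, 417212.9, 7980)
t_employees = pooled_t(14.07107, 14.42238, 1970, 12.05564, 11.42717, 7980)
print(round(abs(t_payroll), 3), round(abs(t_employees), 3))
```

The magnitudes come out close to the tabulated 1.9480 and 6.6322, which suggests the reported statistics are standard pooled-variance t-tests on the base-year samples.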
In a randomized experiment, where treatments are assigned to test and control groups by random selection, an OLS model with fixed effects would give unbiased estimates. We nevertheless use a difference-in-differences (DiD) design, which exploits the panel structure and adjusts for the fact that the two groups have different average levels of compliance in the base year 2017. There are two reasons for this choice. First, a DiD increases the precision of the estimates (Angrist and Pischke, 2009). Second, we want to estimate LATE for the letter treatment group. The results are equivalent to those from OLS: as a robustness check, we have run simple OLS fixed effects models, and the results are displayed in Appendix Table A3.
The two treatment samples have different properties in three relevant respects. Firstly, the audit sample is stratified, and so we fix the effects on strata for this sample. Secondly, all firms assigned to audit treatment were actually audited. Thus, for the audit treatment group, treatment assignment is identical to the treatment status, and so ITT estimations give us the ATE. Lastly, the letter sample is not stratified but includes noncompliers, i.e., a known sub-sample of those who received the letter but did not open it. Therefore, they are untreated but nevertheless assigned to the letter treatment sample. We estimate both ATE for this group and LATE for those who were actually treated.
The letter recipients and the audited firms are drawn from the same target population, using simple random sampling for the letters and stratified random sampling for the audits. To ensure that the treatment effects are comparable, we include fixed effects on strata in models including both treatments. This allows for a comparison of treatment effects within each stratum.
While evaded tax or changes in reported income tax are commonly used dependent variables (Kotsadam et al. 2021; Bjørneby, Alstadsæter, and Telle 2021), we are interested in changes in payroll tax as an expression of a correction following the treatments. As less reported payroll leads to more profits, more dividends, and then an increase in income, the economic incentive for withholding is present.
Our first model utilizes payroll tax as the dependent variable, with fixed effects on strata, firm ID, and year, to isolate the effects from the treatment:
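A specification consistent with this description can be sketched as follows. The symbols are assumptions for illustration and not necessarily the original notation:

$$
\ln Y_{it} = \alpha_i + \gamma_{s(i)} + \lambda_t + \sum_{k=2018}^{2020} \beta_k \,\bigl(T_i \times \mathbf{1}[t = k]\bigr) + \varepsilon_{it}
$$

Here $Y_{it}$ is the payroll tax remitted by firm $i$ in year $t$, $\alpha_i$, $\gamma_{s(i)}$, and $\lambda_t$ are firm, stratum, and year fixed effects, $T_i$ indicates assignment to the treatment arm, and $\beta_k$ measures the treatment effect in year $k$ relative to the base year 2017.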
This regression estimates the treatment effect by comparing the changes in outcomes (logged payroll tax remittance and logged number of employees) over time between the treatment group and the reference group. Specifically, it looks at the difference in these changes from the base year (2017) to the subsequent years (2018, 2019, 2020). The DiD approach inherently controls for time-invariant differences between the treatment and reference groups by focusing on the changes over time. The coefficients represent the interaction between the treatment group and the post-treatment periods, capturing the differential effect of the treatment over time. By using 2017 as the base year, the model adjusts for any pre-existing differences between the groups in that year.
There is a difference between the effect on the group one intended to treat with the letter and the effect on the group in fact treated, that is, the group opening/reading the letter. The former is the effect of being assigned to a treatment group, whereas the latter is the effect of being treated.
This difference arises from the fact that there may be firms assigned to the treatment group that end up not getting treated. To allow for a comparison between ATE and LATE, we run a separate model on the annual effects of the letter treatment compared to the reference group. We include fixed effects on firm ID:
This approach involves running separate regressions for each year (2017–2020) and comparing the outcomes to the reference group each year. Each regression compares the treatment group to the reference group within the same year, without explicitly accounting for changes over time. Since the regressions are run separately for each year and group, the coefficients reflect the annual differences between the treatment and reference groups, rather than the changes over time. Unlike the DiD approach, this method does not adjust for preexisting differences in a baseline year (2017). Each year's coefficient is independent of the others.
Local average treatment effect (LATE) is an estimate that focuses on a specific subgroup within the population, i.e., the compliers. LATE estimates the average effect of the treatment on this subgroup, which is typically smaller than the overall population. It is based on the assumption that there are no unmeasured confounding variables affecting treatment assignment for the compliers. In other words, it assumes that the treatment effect is constant for this subgroup, regardless of the level of treatment received by others. Average treatment effect (ATE), on the other hand, aims to estimate the average effect of the treatment on the entire population, including both compliers and non-compliers. It considers the effect of treatment on all individuals, regardless of whether they fully comply, partially comply, or do not comply with the treatment. For the audits, we have no such differences, since all firms assigned to the audit treatment actually did get audited.
For the letter treatment group, there was one-sided noncompliance: some firms assigned to the letter treatment received the letter but never opened it, whereas no firm assigned to the reference group received or read the letter. The ATE gives the effects for the whole group intended to get the letters, but there is no reason to expect effects among firms that never opened the letter, although minor spillovers through informal contacts between firms may occur. To estimate the effect only among firms who were actually treated, that is, the LATE, we use the IV-model described in the following two equations:
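A generic two-stage sketch of such an IV model is given below. The notation is assumed for illustration and is not the original pair of equations: $Z_i$ indicates assignment to the letter group, $D_i$ indicates that the firm opened/read the letter, and fixed effects are left out for brevity.

```latex
\begin{aligned}
D_i &= \pi_0 + \pi_1 Z_i + \nu_i && \text{(first stage)} \\
\ln(Y_i) &= \alpha + \beta^{\mathrm{LATE}} \,\widehat{D}_i + \varepsilon_i && \text{(second stage)}
\end{aligned}
```

Here $\widehat{D}_i$ is the fitted value from the first stage, so $\beta^{\mathrm{LATE}}$ is identified only from the variation in letter reading that is induced by random assignment.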
Letter assignment is the instrumental variable. It affects treatment, as only letter recipients may read the letters. Because firms are randomly assigned to treatment, it is not otherwise associated with the dependent variables (payroll tax and number of employees).
Given that the assignment to the letter group was truly random with respect to payroll tax, our reduced form ATE estimate (4) can be given a causal interpretation. The IV estimates (5) and (6) rely on two additional assumptions (Angrist and Pischke 2009; Gerber and Green 2012).
The first is the noninterference assumption, which consists of two parts. Part A stipulates that whether a firm is treated depends only on the firm's own treatment group assignment. Because no firm outside the letter treatment group receives a letter, this condition holds. Part B stipulates that potential outcomes are affected by the firm's assignment and the treatment the firm receives as a consequence of that assignment (Gerber & Green, 2012, p. 138).
Second, the exclusion restriction stipulates that potential outcomes respond to actual treatments, not treatment assignments. That is, firms respond to the letter only if they open/read it, not by merely receiving it. There are good reasons to assume that the exclusion restriction holds, because assignment does not affect the outcome variables through any channel other than the direct reading of the letter. Since communication between the NTA and firms in Altinn is not unusual, it is unlikely that firms would respond to merely receiving a letter from the NTA.
The two treatments may affect a firm's reporting liabilities other than the payroll tax. Revenue, salary expenses, and number of employees (workforce) may be affected by both the audit and the letter treatments since these are specific items contained in both treatments. We have run all the models in this paper with revenue, salary expenses, and number of employees as dependent variables as well, but we find no major deviations from the main model using payroll tax remittance as the dependent variable. Audits may affect payroll tax remittance both through better documentation of the payroll tax itself and through better documentation of sales and other variables. We are estimating the total effect of all these possible channels, cf. correlation matrix in Appendix Table A4. Nevertheless, as we are interested in a comparison with Bjørneby, Alstadsæter, and Telle (2021) on employee effects, we keep the number of employees as a dependent variable alongside the analysis of the payroll tax.
The regression results from our combined model (3) are presented in Figure 2. The results reveal a positive effect of audit and letters compared to the reference group when we use the pretreatment year of 2017 as the base year, but the audit and letter curves have opposite shapes post-treatment.

Figure 2. Estimated audit and letter effects (base year 2017).
Notes: This figure plots DiD point estimates from regressions of logged payroll tax remittance and logged number of employees for the years 2017–2020, compared to the base year 2017 and to the reference group (that is, the difference between treatment and reference in the years 2018, 2019, and 2020, minus the difference between treatment and reference in 2017). The treatment, either audits or letters, takes place in 2018, visualized by the vertical line. The black line represents the estimated coefficients of the audit group, and the grey line represents the estimated coefficients of the letter group. Both sets of estimates are compared to the reference group, represented by the dotted line. The specification includes fixed effects on strata and firm ID. Standard errors are adjusted for 22 clusters in strata. Table A5 in the Appendix displays the coefficients and their standard errors.
The audit treatment had an immediate effect in 2018 on firms' payroll tax remittance compared to the reference group, albeit not statistically significant. In 2018, the audited firms increased payroll tax remittance by .124 log points (equivalent to 13.20 percent), compared to the reference group when we use the pretreatment base year 2017.
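The conversion from log points to percent used throughout follows from the log specification: a coefficient β on a logged outcome corresponds to a change of 100·(exp(β) − 1) percent. A minimal sketch, with a helper name that is ours and purely illustrative:

```python
import math


def log_points_to_percent(beta):
    """Convert a coefficient on a logged outcome into a percentage change."""
    return 100.0 * (math.exp(beta) - 1.0)


# Coefficients quoted in the text for payroll tax remittance.
for beta in (0.124, 0.018, 0.110):
    print(f"{beta:.3f} log points -> {log_points_to_percent(beta):.2f} percent")
```

For small coefficients the two scales nearly coincide (.018 log points is about 1.82 percent), while the gap widens for larger ones (.124 log points is about 13.20 percent).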
Increased remittance following an audit may be considered “mechanical” when firms only adjust their behavior and report accurate information in response to the specific discrepancies or errors uncovered during the audit. The increased remittance may thus not be driven by a genuine compliance commitment or improved understanding of reporting liabilities, but rather a reaction to the fear of penalties or consequences for noncompliance. This identification problem pertains to the distinction between actual behavioral changes in taxpayers (real effects) and mere adjustments made to comply with tax liabilities in response to the audit (reporting effects) (Advani, Elming, and Shaw 2023; Kleven et al. 2011). Real effects refer to substantive changes in the behavior of taxpayers resulting from the audit. These changes may include a genuine improvement in tax compliance practices, a better understanding of tax regulations, and a shift towards more accurate and honest reporting of financial information (Kausar, Shroff, and White 2016). Reporting effects, on the other hand, occur solely in response to the audit itself. Firms may correct errors or discrepancies discovered during the audit, but these changes might not reflect any meaningful change in their overall tax compliance behavior beyond the specific issues identified during the audit. Reporting effects are often temporary, and taxpayers may revert to noncompliant behavior once the audit process is over. The immediate effect in 2018 and the slight dip in 2019 may indicate reporting effects, but the following increase in 2020 makes this difficult to establish.
The effect of letters on payroll tax remittance is .018 log points (equivalent to 1.82 percent) in 2018, but this result is not statistically significant either. The audited firms also increased their workforce in 2018 by .019 log points (equivalent to 1.92 percent), compared to the reference group, and the letter group increased their workforce by .005 log points (equivalent to .50 percent) in the treatment year 2018, but these effects are not statistically significant. We see the strongest effects of audits on payroll tax remittance in 2018, decreasing in 2019, and then increasing in 2020. The letter treatment follows the opposite pattern: we find the strongest and statistically significant effects in 2019, followed by a decrease in 2020. In 2019, the effect of letters on payroll tax remittance is .110 log points (equivalent to 11.63 percent), which is close to the 2018 estimate for the audited firms. Thus, there appears to be a lag in the letter treatment effect that is not apparent in the audit treatment.
The time-lagged compliance effect from the information letter, as compared to the immediate effect from the on-site audits, can be attributed to the different nature of these two compliance interventions and how they influence taxpayer behavior over time. The letters are nonbinding and serve as educational tools to inform firms about their compliance obligations. Unlike the audits, the letters do not involve direct enforcement actions or a perceived threat of penalties for noncompliance. As a result, firms may take longer to internalize the information and voluntarily adjust their behavior. Behavioral change takes time, regardless of the method used to convey information. After receiving an information letter, firms may need time to process the information, assess its implications, and gradually adopt better practices.
An on-site audit typically leads to more immediate compliance adjustments due to the direct scrutiny and enforcement actions involved. The fear of penalties and the immediate presence of auditors can compel firms to address identified compliance issues more promptly.
The treatment effect on the workforce is consistent with the findings of Bjørneby, Alstadsæter, and Telle (2021), who estimate an average increase of 1.12 in the number of employees between treated and untreated firms. The stronger effect from the letter treatment compared to the audit treatment on payroll tax remittance in 2019 can be seen in light of a similar finding in D'Agosto et al. (2018), which reveals a higher effect from the “soft” on-site audit compared to the “deep” desk-based audit treatments in their study. But as our treatments differ in nature, i.e., our audit treatment resembles their on-site audits rather than their desk-based audits, a direct comparison should be made with caution.
The letter treatment has significant effects on both outcome variables in the first post-treatment year of 2019. Letters affect payroll tax and workforce less than audits two years post-treatment. This confirms the findings of Boning et al. (2020), where letters conveying the same message as on-site visits by the revenue officers in the U.S. have smaller direct effects. The same pattern is evident in Ortega and Scartascini (2020), who find stronger effects from on-site visits than from emails. Although Kotsadam et al. (2021) find a similar response to letter treatment (diminishing with time), the results are not directly comparable; their unit of study is the individual, not the firm, and their letters address the actual source of error more specifically, namely deductions on the tax return. Doerrenberg and Schmitz (2015) also present evidence suggesting that a letter which reminds small firms of the civic duty to pay taxes and informs about an audit probability following the letter may increase tax compliance, but their results are not statistically significant.
The over-compliance effects discussed in, for example, Slemrod, Blumenthal, and Christian (2001) and Slemrod and Yitzhaki (2002) cannot be entirely ruled out in our context. When a firm anticipates a potential audit, it may opt to overstate its tax liabilities, and this strategy is often used as a safeguard against the risks of underreporting. However, in a third-party tax reporting regime involving 12 annual report submissions (Amelding), this potential effect seems limited. The regularity of monthly declarations increases the visibility of the firm's financial activities. This frequent oversight encourages more accurate reporting, as discrepancies or patterns of over-reporting are more easily detected over short intervals. Over-reporting taxes consistently across multiple declarations would increase the firm's administrative workload, requiring more effort to track and justify inflated figures. The regular correction process in the next monthly reporting increases transparency and seems to discourage firms from over-compliance due to an awareness of increased audit probability.
To allow for a comparison between ATE and LATE for the letter treatment, we have run a separate annual regression model (4). The results are displayed in Figure 3.

Figure 3. Estimated annual letter effects (ATE) 2017–2020.
Notes: This figure plots point estimates from annual letter regressions of measures of logged payroll tax remittance and logged number of employees for the years 2017–2020, compared to the reference group (x-line). The treatment takes place in the year 2018, visualized by the vertical line. The grey line represents the estimated coefficients of the letter group compared to the reference group, represented by the dotted line. Standard errors are adjusted for 22 clusters in strata. Table A6 in the Appendix displays the coefficients and their standard errors.
Receiving a letter would increase the average firm's payroll tax remittance by .127 log points more than the nontreated firms in the treatment year 2018 (equivalent to 13.54 percent). The effect is stronger one year post-treatment (2019) than two years post-treatment (2020). In the post-treatment years 2019–2020, the treatment effect on employees is .039 and .055, respectively (equivalent to 3.98 and 5.65 percent).
The differences in coefficients between the DiD model and the annual regressions are typically due to temporal dynamics, interaction effects, and time-invariant factors. Whereas the DiD model captures the dynamic changes over time, the annual regressions provide a snapshot comparison for each year. This implies that the DiD coefficients reflect the cumulative effect of the treatment over multiple years, whereas the annual regressions show the effect within each specific year. Furthermore, the DiD model includes interaction terms between the treatment and time, capturing how the treatment effect evolves. Annual regressions do not include these interaction terms. Finally, DiD inherently controls for time-invariant, unobserved heterogeneity between the treatment and reference groups, which may also explain the difference in coefficient estimates compared to annual regressions that do not control for these factors. Nevertheless, the focal point here is to quantify the difference between those who actually read the letters and those who did not.
To get a clearer picture of the letter treatment effects, we estimate the LATE. Of the 8,000 firms who received a letter, 1,130 never opened/read it. Thus, the ATE estimates presented above may be biased, since just above 14 percent of the letter treatment group was in fact never treated. While the ATE estimates measure the effects of receiving a letter, the LATE estimates measure the effects of actually reading the letter. This is equivalent to the set-up in Bjørneby, Alstadsæter, and Telle (2021), except that we deal only with one-sided noncompliance, as there were no firms in the reference group receiving a letter.
The regression results, presented in Figure 4, reveal a positive effect of reading the letters (compliers) compared to the reference group.

Figure 4. Estimated LATE of letter treatment (base year 2017).
Notes: This figure plots annual point IV estimates from regressions of measures of logged payroll tax remittance and logged number of employees for the years 2017–2020, compared to the reference group (x-line). The treatment takes place in year 2018, visualized by the vertical line. The dark grey line represents the estimated coefficients of the 6,870 letter openers (LATE), and the light grey line represents the estimated coefficients of the 8,000 assigned to the letter group. Standard errors are adjusted for 22 clusters in strata. Table A7 in the Appendix displays the coefficients and their standard errors.
In both the treatment year 2018 and the post-treatment year 2019, we observe that the compliers report significantly higher payroll tax (.148 and .255 log points, equaling 16.0 and 29.0 percent, respectively) and number of employees (.046 and .064 log points, equaling 4.7 and 6.6 percent, respectively) compared to the reference group. The effects seem to remain for two years post-treatment (2020). Furthermore, the statistically significant estimates may indicate that being assigned to the letter treatment is a strong instrument for actually being treated.
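With one-sided noncompliance and no covariates, an IV estimate of this kind reduces to the Wald ratio: the effect of assignment divided by the share of assigned firms that were actually treated. The sketch below applies that ratio to the figures quoted in the text. It is a back-of-the-envelope check under those simplifying assumptions, not the estimation itself, which also involves fixed effects and clustered standard errors; the function name `wald_late` is hypothetical.

```python
def wald_late(assignment_effect, n_assigned, n_treated):
    """Wald ratio under one-sided noncompliance.

    assignment_effect: estimated effect of being assigned to the letter group.
    n_assigned: number of firms assigned to the letter group.
    n_treated: number of assigned firms that actually opened/read the letter.
    """
    if not 0 < n_treated <= n_assigned:
        raise ValueError("n_treated must be in (0, n_assigned]")
    compliance_rate = n_treated / n_assigned
    return assignment_effect / compliance_rate


assigned = 8000          # firms that were sent a letter
never_opened = 1130      # firms that never opened/read it
opened = assigned - never_opened  # 6,870 letter openers

# 2018 effect of letter assignment on logged payroll tax, as quoted in the text.
late_2018 = wald_late(0.127, assigned, opened)
print(f"compliance rate: {opened / assigned:.3f}")
print(f"implied LATE 2018: {late_2018:.3f}")
```

Dividing .127 by the compliance rate of roughly .859 gives about .148, in line with the 2018 complier estimate for payroll tax reported above.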
The significant difference between the LATE and the ATE estimates tells us that there may be further gains by taking low-cost measures to facilitate letter reading, such as an automated reminder.
In the second chapter of this thesis, we found significant differences in compliance effects between firm types. To study prospective heterogeneous treatment effects, we have run all models on samples restricted to the three firm types represented in the population, namely private limited liability company (private Ltd), self-employed and Norwegian-registered foreign company. By and large, private Ltds appear to drive the main results. The significance levels of estimated coefficients are higher on this subsample, which is over 80 percent of the firms in the population, cf. Appendix Table A9. This heterogeneity resembles the results of Almunia and Lopez-Rodriguez (2018) who also find that the impact of monitoring on tax compliance varies across different types of firms. Larger and more profitable firms are more responsive to increased monitoring, whereas smaller and less profitable firms show a weaker response. Brockmeyer et al. (2019) find an ambiguous heterogeneous effect between corporations and the self-employed in that the filing rate of corporations responds less strongly but their payment rate responds more strongly to the treatment compared to the self-employed. Pomeranz (2015) finds a stronger response on VAT compliance for smaller firms, following a letter providing information about a random audit selection among 400,000 Chilean firms.
We find some small significant negative treatment effects among the self-employed, cf. Appendix Table A10. This is contrary to the findings of Mittone, Panebianco, and Santoro (2017). They suggest that tax audits have a significant short-term impact on increasing compliance. Their treatment group tends to immediately adjust its behavior and report more accurately to address the specific issues identified during the audit. However, the study also reveals a gradual decline in tax compliance levels after the initial surge following the audit. Over time, the fear of an audit diminishes, and taxpayers may return to their previous, less compliant behavior or adopt tax avoidance strategies to reduce the risk of future audits, known as “the bomb-crater effect” (Mittone, Panebianco, and Santoro 2017). As our significance levels are low, and not reproduced for the letter treatment, this result should be treated with caution. Baer and Silvani (2020) find both pro-deterrent and counter-deterrent effects of audits on future reporting behavior among the self-employed. They suggest that the observed reduction in reported income among self-employed U.S. taxpayers may be associated with dishonesty caused by undetected misreporting during the audit. According to their study, such taxpayers may infer that audits are ineffective, which causes the self-employed to understate their income even more aggressively in subsequent years. Since the sub-sample of self-employed persons is limited in our experiment, and the significant negative treatment effect is found on the reported number of employees, not on payroll tax remittance, we cannot infer that the mechanism suggested by Baer and Silvani (2020) is at play in our environment.
We find no significant effects among Norwegian-registered foreign companies (cf. Appendix Table A11), which is expected since this company type makes up less than 1 percent of the total sample.
We have also run all models on samples restricted to firms in the lower and upper 25th percentiles of firm revenue. We find no significant effects on either payroll tax or employees from either treatment in these subsamples, cf. Appendix Table A8. Using number of employees as a proxy for firm size, Bergolo et al. (2023) find no substantial or statistically significant difference between firms below and above the median number of employees.
These findings may inform resource allocation decisions. We study only the direct effects on payroll tax remittance and the workforce of the firms and assume few or small network effects as studied by, for example, Boning et al. (2020). As only payroll taxes are studied, the benefit estimates should be considered lower bounds.
Any treatment would increase net revenue if the marginal revenue it raises exceeds its marginal administrative costs. In this paper, we limit the revenue component to payroll tax only, even if there may be other benefits, such as network effects, and a general deterrent effect in the overall population of firms. The revenue raised should be compared to the marginal administrative cost of the treatments. The equation which must be true for implementing either treatment is thus:
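In the notation of the cost-benefit table, where marginal revenue is the product β·Δ, the condition can be sketched as follows. The index $j$ is our assumption for illustration; it stands for the audit (A) or the letter (L) treatment.

```latex
MR_j = \beta_j \,\Delta_j > MC_j, \qquad j \in \{A, L\}
```

A treatment raises net revenue only if this inequality holds for it.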
Cost-benefit analysis
| | Audit (A) | Letter (L) | Reference (R) | A-R | L-R |
|---|---|---|---|---|---|
| Marginal cost (MC) | 12 740 | 0 | 0 | 12 740 | 0 |
| β (%) 2018 | 0.1320 | 0.0192 | 0 | 0.1320 | 0.0192 |
| Δ (NOK) 2017–2018 | 14 529 | 11 958 | 9 823 | 4 706 | 2 135 |
| β*Δ = Marginal revenue (MR) | 1 918 | 230 | 0 | 1 918 | 230 |
| MR-MC | −10 822 | 230 | 0 | −10 822 | 230 |
Note: Marginal cost of an audit is NOK 12,740. β is the estimated coefficient (%) in year 2018 from models (2) and (3), respectively. Δ is the increment in NOK from 2017 to 2018. A-R gives the difference in NOK between audit and reference, and L-R gives the difference in NOK between letter and reference.
An audit will increase payroll tax remittance by 13.20 percent more than the reference group, whereas the corresponding figure for the letter treatment is 1.92 percent. These changes are due to the treatment. On average, a firm in the audit group increased its payroll tax remittance from 2017 to 2018 by NOK 14,529, whereas the figure for a firm in the letter group was NOK 11,958. By comparison, a firm in the reference group increased its payroll tax remittance by NOK 9,823. Thus, the difference in payroll tax remittance between an audited firm and the reference group is NOK 4,706 on average, and the corresponding figure for a firm receiving a letter is NOK 2,135. If we apply the estimated coefficients to these figures, NOK 1,918 of the payroll tax increment is due to the audit and NOK 230 of the increment is due to the letter. Hence, the administrative cost of the audit exceeds the revenue it generates, and equation (7) is rejected, whereas the opposite is true for the letter treatment, and equation (8) holds.
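The arithmetic behind the table can be reproduced in a few lines. The sketch below uses only the figures from the table; the variable and function names are ours and purely illustrative.

```python
# Figures from the cost-benefit table (NOK per firm, 2018).
MARGINAL_COST = {"audit": 12_740, "letter": 0}
BETA_2018 = {"audit": 0.1320, "letter": 0.0192}   # estimated coefficients
DELTA = {"audit": 14_529, "letter": 11_958}       # payroll tax increment 2017-2018


def marginal_revenue(treatment):
    """Share of the payroll tax increment attributed to the treatment."""
    return BETA_2018[treatment] * DELTA[treatment]


def net_revenue(treatment):
    """Marginal revenue minus marginal administrative cost."""
    return marginal_revenue(treatment) - MARGINAL_COST[treatment]


for name in ("audit", "letter"):
    print(
        f"{name}: MR = {marginal_revenue(name):.0f}, "
        f"MR - MC = {net_revenue(name):.0f}"
    )
```

Rounded to whole NOK, this gives a marginal revenue of 1,918 and a net revenue of −10,822 for the audit, against 230 for the letter, matching the last two rows of the table.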
Some caveats remain, however: First, it is too early to draw any conclusions on the long-term effects of both treatments, and previous literature suggests either declining effects over time (Boning et al., 2020) or mixed long-term effects (Bott et al., 2020). Second, there may also be other effects from both treatments, like general deterrence effects, which may increase the benefits. Third, even without such a deterrence effect, the letters can still help guide the firms that misreport due to honest mistakes, but the additional information will do very little to reduce tax evasion. Fourth, the rejection of (7) might also reflect that randomized audits are wasteful if the objective of those audits is increased compliance measured by revenue on the firm level, which is, however, not the case. Randomized audits are used by tax administrations for objectives other than noncompliance disclosure on the firm level, like disclosing new areas of noncompliance or building datasets for predictive modelling (Alm 2019a; Micci-Barreca and Ramachandran 2004). Finally, an electronic letter is easily scalable to a larger population, which, all else being equal, will increase the revenue further.
There is a need to robustly evaluate the effectiveness of different enforcement strategies in order to choose the most efficient and cost-effective one. Tax authorities may save scarce resources by switching from hard to soft interventions. The main contribution of this paper is a documentation of the firm's response to two interventions in an experimental setting involving the same population of firms. We utilized two randomized experiments, one with stratified on-site audits and one using electronic letters. We demonstrate unbiased, positive effects on firms' remittance of payroll tax from both audits and letters, compared to a reference group with no treatment. While audits have stronger effects than letters, they are by far the more expensive enforcement strategy.
The results are specific to a population of labor-intensive firms in a relatively advanced tax-reporting environment by international standards. Thus, the external validity of the results is not restricted to the Norwegian setting as such, as one can assume that these effects will be reproduced in several advanced, Western tax jurisdictions with a gold standard system of information reporting. Furthermore, the results suggest that tax authorities may test and compare these two enforcement strategies in other sectors as well.
A cost–benefit assessment including estimations of deterrence effects and other prospective treatment benefits is necessary to make clear recommendations on which enforcement strategy to use. Such an assessment should still be a core priority of the modern tax administration, in order to increase the effects of limited enforcement resources.
Sometimes referred to as complier average causal effect (CACE) (Gerber and Green, 2012).