Insurance and Taxation over the Life Cycle
Abstract
We consider a dynamic Mirrlees economy in a life-cycle context and study the optimal insurance arrangement. Individual productivity evolves as a Markov process and is private information. We use a first-order approach in discrete and continuous time and obtain novel theoretical and numerical results. Our main contribution is a formula describing the dynamics of the labour-income tax rate. When productivity is an AR(1), our formula resembles an AR(1) with a trend where: (i) the auto-regressive coefficient equals that of productivity; (ii) the trend term equals the covariance of productivity with consumption growth divided by the Frisch elasticity of labour; and (iii) the innovations in the tax rate are the negative of consumption growth. The last property implies a form of short-run regressivity. Our simulations illustrate these results and deliver some novel insights. The average labour tax rises from 0% to 37% over 40 years, whereas the average tax on savings falls from 12% to 0% at retirement. We compare the second-best solution to simple history-independent tax systems, calibrated to mimic these average tax rates.
INTRODUCTION
To a twenty-five-year-old entering the labour market, the landscape must feel full of uncertainties. Will they land a good job relatively quickly or will they initially bounce from one job to another in search of a good match? What opportunities for on-the-job training and other forms of skill accumulation will they find? They face significant uncertainty in their lifetime earnings which is slowly resolved over time. This article investigates the optimal design of a tax system that efficiently shares these risks. With a few notable exceptions, since Mirrlees, optimal tax theory has mostly worked with a static model that treats heterogeneity and uncertainty symmetrically, since redistribution can be seen as insurance behind the "veil of ignorance".
To date, this more dynamic approach has focused on savings distortions, or considered special cases, such as two periods or i.i.d. shocks. Little is known in more realistic settings about the pattern of labour-income taxes when uncertainty is gradually revealed over time.
The drift in continuous time (the terms multiplying dt) is the exact counterpart of the discrete-time expectation formula above. The new result here is that the innovations to the labour wedge are related one to one with innovations in the marginal utility of consumption. Economically, this result describes a form of regressivity. When productivity rises, consumption rises, so the marginal utility of consumption falls and the labour wedge must then fall by the same amount, at least in the short run. This induces a negative short-run relation between productivity and the labour wedge. This regressive taxation result is novel and due to the dynamic aspects of our model. In static optimal taxation settings with a Utilitarian welfare function, no general results on regressive or progressive taxation are available, since the optimal tax schedule depends delicately on the skill distribution.
For our numerical exploration, we adopt a random walk for productivity. This choice is motivated by two considerations. First, the evidence in Storesletten et al. (2004) points to a near random walk for labour earnings, which requires a near random walk for productivity. Second, by focusing on a random walk we are considering the opposite end of the spectrum from the well-explored i.i.d. case (Albanesi and Sleet, 2006).
Our tax system comes out to be slightly regressive in the sense that marginal tax rates are higher for agents with currently low productivity shocks. Our short-run regressivity result seems to explain at least part of this regressivity. In terms of average tax rates the optimal tax system is progressive: the present value of taxes paid relative to income is increasing in productivity. This captures the insurance nature of the solution.
Specifically, we compute the equilibrium with history-independent linear taxes on labour and capital income, and consider both age-dependent and age-independent taxes. When age-dependent linear taxes are allowed, the optimal tax rates come out to be indistinguishable from the average, for each age group, of the fully optimal (history-dependent) marginal tax rates. Surprisingly, the welfare loss of such a system, relative to the fully optimal one, is minuscule, around 0.15% of lifetime consumption. In this way, our theoretical results do provide guidance for more restrictive tax systems.
THE INSURANCE
+ The environment and planning
++ Preferences, uncertainty, and information
The economy is populated by a continuum of agents who live for T periods. The ex-ante utility
.
We allow the utility function and the density to depend on the period t to be able to incorporate life-cycle considerations. For example, an economy where agents work for periods and retire for periods can be captured by setting for and for . The realization of the state for all is privately observed by the agent. Without loss of generality, we initialize to some arbitrary value. Note that this does not constrain the initial density . More explicitly, an allocation is and utility is
++ Incentive compatibility.
Consider an allocation . Let denote the equilibrium continuation utility after history, defined as the unique solution to
, for all t=1,...,T with . For any strategy , let continuation utility solve
, with .
We say that an allocation {c,y} is incentive compatible if and only if for all That is, an allocation is incentive compatible if truth telling, with , is optimal. Let IC denote the set of all incentive-compatible allocations {c,y}.
++ Planning problem.
To keep things simple, we work in partial equilibrium, that is, assuming a linear technology that converts labour into consumption goods one for one and a linear storage technology with gross rate of return . This allows us to study the contracting problem for a single cohort in isolation. The relevant cost of an allocation is then its expected present value:
.
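As a rough illustration of the cost concept just described, the sketch below computes a Monte Carlo estimate of the expected present value of net resources (consumption minus output) for simulated paths. The function name, the array layout (agents by periods), and the convention that period 1 is undiscounted and later periods are discounted by the gross return are assumptions made for this example, not the paper's notation.

```python
import numpy as np

def expected_present_value_cost(consumption, output, gross_return):
    """Average over agents of the discounted sum of (c_t - y_t).

    consumption, output: arrays of shape (n_agents, n_periods).
    gross_return: gross rate of return on storage (assumed constant).
    Period 1 carries weight 1; period t carries weight R**-(t-1).
    """
    c = np.asarray(consumption, dtype=float)
    y = np.asarray(output, dtype=float)
    n_periods = c.shape[1]
    discount = float(gross_return) ** (-np.arange(n_periods))
    return float(((c - y) * discount).mean(axis=0).sum())

# Toy usage: every agent consumes 1 and produces 0 in two periods, R = 2.
toy_cost = expected_present_value_cost(np.ones((3, 2)), np.zeros((3, 2)), 2.0)
```

In the toy usage the cost is 1 + 1/2, since the second period is discounted once.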
Efficient allocations solve the following program:
s.t. , .
Our notion of incentive compatibility here is stronger than the ex-ante optimality of truth telling (ex-ante incentive compatibility). We also require the ex-post optimality, after any history of shocks and reports, of subsequent truth telling (ex-post incentive compatibility). This is without loss of generality. To see this, note that ex-ante incentive compatibility implies ex-post incentive compatibility almost everywhere. Then, note that one can always insist on ex-post incentive compatibility on the remaining set of measure-zero histories, without any effect on welfare for the agent or costs for the planner.
++ Initial heterogeneity and redistribution.
We have interpreted the planning problem as involving a single agent facing uncertainty. Under this interpretation, the planning problem is purely about social insurance and not about redistribution. However, it is simple to add initial heterogeneity and consider redistribution. The simplest way to model heterogeneity is to reinterpret the first shock . Instead of thinking of the value of the first shock as the realization of uncertainty, we now interpret as indexing some initial hidden characteristic of an agent. The agent is not alive before the realization of and faces uncertainty only regarding future shocks , Recall that we allow the density to depend flexibly on the period t, so that could accommodate any desired initial dispersion in productivity types. If the social welfare function is Utilitarian, then the analysis requires no change: insurance behind the veil of ignorance and Utilitarian redistribution are equivalent. Formally, the social welfare in this case coincides with the expected utility calculation when is interpreted as uncertainty. Both integrate utility over using the density . Thus, the planning problem at t=1 remains unchanged.
However, when it comes to redistribution, a Utilitarian welfare function is a special case. Indeed, we can allow for any social welfare function, or, more generally, characterize the entire set of constrained Pareto-efficient allocations. This does require treating the planning problem in the initial period t=1 differently. It turns out that this only affects the optimal allocation at t=1, as well as the optimal values for the endogenous state variables and . These values for and are inherited at t=2 by the planner, but given these values, the problem from t=2 onwards remains unchanged. Thus, the dynamics of the allocation and taxes for t=2,3,... remain unchanged.
+ OPTIMALITY CONDITIONS
Given an allocation {c,y}, and a history , define the intertemporal wedge
,
and the labour wedge
.
Our model is quite different. In particular, no exogenous restriction on tax instruments is imposed, and distortionary taxes arise endogenously out of a desire to provide social insurance. Our tax-smoothing formula has both differences and similarities with the corresponding results in the Ramsey literature. An important difference is that it applies to the marginal tax rate faced by a given individual in response to idiosyncratic shocks versus aggregate tax rates in response to aggregate shocks. An interesting similarity with Lucas and Stokey (1983) is that taxes inherit the serial correlation of the shocks. The drift is positive whenever provided that consumption is increasing in productivity. Compared to the case with , the additional shocks to productivity create an additional motive for insurance. This pushes the labour wedge up. Interestingly, the size of the drift is precisely the covariance of the log of productivity with the inverse growth rate in marginal utility, divided by , where e is the Frisch elasticity of labour supply. The covariance captures the benefit of added insurance, since it depends on the variability of consumption as well as on the degree of risk aversion. Insurance comes at the cost of lower incentives for work. This effect is stronger the more elastic is labour supply, explaining the role of the Frisch elasticity.
++ Labour wedge at the top and bottom
The only modification to Program FOA is that now incorporates two terms to capture the movements in the support:
The second way of proceeding is simpler. Without loss of generality one can work with an extended allocation, which specifies consumption and labour for all histories . One then proceeds as in the full support case, imposing incentive compatibility after any history including those that lie outside the moving support. This is without loss of generality because we can always perform the extension by assigning bundles for consumption and labour that were already offered. Thus, it does not impose any additional constraints, nor does it affect the planning problem. Using this extended-allocation approach, the derivation of our necessary condition is valid.
+ CONTINUOUS-TIME APPROACH
++ A HJB equation
Having re-expressed the constraints in the relaxed planning problem as stochastic differential equations for the state variables, we can write the HJB equation for the cost .
An interesting feature of this alternative parametrization of the state space is the existence of a sufficient statistic, the volatility process . This volatility controls how much innovations to productivity are passed through to consumption. It can therefore be thought of as a local proxy for the amount of insurance that is provided at the optimal allocation. Higher values for provide more incentives at the expense of insurance.
Our regressivity result contrasts with the absence of such results in static settings. As is well understood, the skill distribution is key in shaping the tax schedule in the static model (Mirrlees, 1971; Diamond, 1998; Saez, 2001). In contrast, in our dynamic model, the regressivity result holds with virtually no restrictions for a large class of productivity processes.
+ GENERAL PREFERENCES
It is well known that when consumption and labour are not additively separable, the Inverse Euler equation does not hold. Actually, even when there is no additional uncertainty between t-1 and t, so that the Euler and the Inverse Euler equations coincide, the Euler equation might not hold. As is well known, with non-separable preferences, the no-capital-tax result of Atkinson and Stiglitz (1976) does not hold. The reason for this is that income and productivity now directly affect the intertemporal rate of substitution for consumption. Taxing or subsidizing capital therefore helps separate types. Saez (2002) argues that these non-separabilities are relevant in practice. In particular, he suggests that poor agents have a lower propensity to save, and shows in that context that optimal capital taxes are positive. These forces also upset the Inverse Euler equation when there is additional uncertainty between t-1 and t.
++ A Life-cycle economy
Agents live for T = 60 years, working for 40 years and then retiring for 20 years. Their period utility function with a > 1 and k >0 during working years t=1,...,40, and during retirement t=41,42, ...,60.
A fundamental primitive in our exercise is the stochastic process for productivity. Most empirical studies estimate an AR(1) plus white noise, where the white noise is sometimes interpreted as measurement error. Typically, the coefficient of auto-correlation is estimated to be very close to one. We therefore adopt a geometric random walk:
, with .
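To make the geometric random walk concrete, the following sketch simulates log productivity for a cross section of agents and computes its cross-sectional variance at each age. The lognormal innovation with mean normalized so that the multiplicative shock has expectation one, the value sigma = 0.1, the sample size, and all names are assumptions chosen for illustration; they are not taken from the calibration in the text.

```python
import numpy as np

def simulate_log_productivity(n_agents, n_periods, sigma, seed=0):
    """Simulate log productivity under a geometric random walk.

    Assumption: log innovations are i.i.d. normal with variance sigma**2
    and mean -sigma**2 / 2, so the multiplicative shock has mean one.
    Log productivity before the first shock is normalized to zero.
    """
    rng = np.random.default_rng(seed)
    log_shocks = rng.normal(
        loc=-0.5 * sigma**2, scale=sigma, size=(n_agents, n_periods)
    )
    # Cumulating the log shocks gives log productivity at each age.
    return np.cumsum(log_shocks, axis=1)

sigma = 0.1  # hypothetical innovation standard deviation
log_theta = simulate_log_productivity(100_000, 40, sigma)
cross_section_var = log_theta.var(axis=0)      # one value per age
theoretical_var = sigma**2 * np.arange(1, 41)  # linear in age
```

Under these assumptions the cross-sectional variance of log productivity grows linearly with age, up to sampling error, which is the sense in which a random walk builds in steadily rising dispersion.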
The value function satisfies
.
Within each period t , we compute the average in the cross section for a number of variables of interest, such as consumption, output, and the labour and intertemporal wedges. During retirement each agent's consumption is constant, while output and wedges are zero. Thus, we focus on the working periods t=1,2, . . . ,40. As the figure shows, the average variance of consumption growth falls over time and reaches zero at retirement. There are two key forces at play. First, as retirement nears, productivity shocks have a smaller effect on the present value of earnings, since they affect earnings for fewer periods. Since consumption is smoothed over the entire lifetime, including retirement, the impact of shocks on consumption falls, and approaches zero at retirement.
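The first force can be illustrated with a back-of-the-envelope calculation. The sketch below is a permanent-income heuristic, not the optimal allocation: it assumes a permanent unit increase in earnings at a given age, lasting until retirement, is spread evenly over all remaining years of life, and reports the implied change in consumption per year. The discount factor q = 0.95 and the function names are hypothetical.

```python
def annuity_factor(n_periods, q):
    """Present value of one unit received in each of n_periods periods."""
    return sum(q**k for k in range(n_periods))

def pass_through(age, work_years=40, retire_years=20, q=0.95):
    """Consumption response to a permanent unit earnings shock at `age`.

    Heuristic assumption: the present value of the extra earnings over
    the remaining working years is annuitized over the remaining
    lifetime, including retirement.
    """
    remaining_work = work_years - age + 1
    remaining_life = remaining_work + retire_years
    return annuity_factor(remaining_work, q) / annuity_factor(remaining_life, q)

# Pass-through for each working age t = 1, ..., 40.
profile = [pass_through(t) for t in range(1, 41)]
```

In this heuristic the pass-through declines monotonically with age, because a shock late in working life raises earnings for few periods but is still smoothed over the whole retirement span.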
Turning to the wedges, panel (a) in Figure 1 shows that the labour wedge starts near zero and increases over time, asymptoting around 37% at retirement. Panel (b) displays the intertemporal wedge, which follows the reverse pattern. It is decreasing over time, starting around 0.6% (which represents an implicit tax on net interest of around 12%) and falling to zero at retirement. Both of these findings are easily explained by our theoretical results, together with the behaviour of the average variance of consumption growth.
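The conversion from a 0.6% intertemporal wedge to a tax of roughly 12% on net interest can be checked with a short calculation. The sketch assumes the wedge acts as a proportional tax on the gross return R, so that the equivalent tax on net interest solves 1 + (R - 1)(1 - tax) = R(1 - wedge). The gross return R = 1/0.95 is a hypothetical value used only for this example; it is not stated in this section.

```python
def net_interest_tax(intertemporal_wedge, gross_return):
    """Tax rate on net interest equivalent to a wedge on the gross return.

    Solves 1 + (R - 1) * (1 - tax) = R * (1 - wedge) for tax,
    which gives tax = wedge * R / (R - 1).
    """
    return intertemporal_wedge * gross_return / (gross_return - 1.0)

R = 1 / 0.95  # hypothetical gross return, about 5.3% net interest
implied_tax = net_interest_tax(0.006, R)  # approximately 0.12
```

Because net interest is a small fraction of the gross return, a small wedge on the gross return corresponds to a much larger tax rate on interest income; here the scaling factor R / (R - 1) is about 20.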
Panel (b) shows the cross-sectional variance for consumption, productivity, and output. The variance of productivity grows, by assumption, linearly. The variance of output is higher and grows in a convex manner. The variance of consumption, on the other hand, is lower than the variance of productivity and grows in a concave manner. For reference, note that in autarky, with no taxes and no savings, since c = y is proportional to productivity, the variances of consumption, output, and productivity are equal to each other. At the other end of the spectrum, the first-best solution has zero variance in consumption and since .
In the last working period, t = 40, the scatter plot shows an almost perfect relationship between the previous tax and the current one, with a slope of one. Taxes on labour are almost perfectly smoothed near retirement. Recall that the variance of consumption growth drops to zero as retirement approaches. Near retirement, consumption becomes almost perfectly predictable, so the labour wedge does as well.
It is important to keep in mind that a history-independent tax system, with a fixed non-linear tax schedule that allows for savings, can also produce a history-dependent labour wedge. The history of productivity shocks affects savings decisions. The accumulated wealth, in turn, affects the current labour choice, determining the position, and marginal tax rate, along the fixed non-linear tax schedule.
With an age-dependent labour tax, an age-independent tax on capital provides modest but non-negligible benefits, equal to 0.08%. However, the addition of an age-dependent capital tax provides little extra benefit, equal to 0.01% of lifetime consumption. In contrast, age-dependent taxes on labour provide a sizable improvement of 0.33% over the completely age-independent tax system. Allowing for age-dependent labour taxes is more important in this simulation than allowing for age-dependent capital taxes.
CONCLUSION
We consider a dynamic Mirrlees economy in a life-cycle context and study the optimal insurance arrangement. Individual productivity evolves as a general Markov process and is private information. We allow for a very general class of preferences. We use a first-order approach in discrete and continuous time and obtain novel theoretical and numerical results.
Our simulations illustrate these results and deliver some novel insights. The average labour tax rises from 0% to 37% over 40 years, whereas the average tax on savings falls from 12% to 0% at retirement. We compare the second-best solution to simple history-independent tax systems, calibrated to mimic these average tax rates. We find that age-dependent taxes capture a sizable fraction of the welfare gains. Hence, it seems that numerically, the history dependence of taxes that is required to implement the full optimum is not an important feature in terms of welfare. Moreover, our simulations emphasize that from a welfare perspective, labour taxes play a more important role than capital taxes (setting capital taxes to zero does not lead to a large deterioration of welfare).
In future work, we plan to enrich the model to incorporate important life-cycle considerations that are absent in our present model: human capital accumulation, endogenous retirement, a more realistic life-cycle profile of earnings, etc. We also plan to continue our numerical explorations by thoroughly investigating the quantitative comparative statics of our model with respect to the stochastic process of earnings, preference parameters, and tastes for initial redistribution.