Optimal Income Taxation 3

The document outlines the core principles of optimal income taxation, focusing on linear and nonlinear tax models, and the Mirrlees mechanism design approach. It discusses the trade-offs between equity and efficiency in tax policy, the implications of behavioral responses, and the role of social welfare functions. Additionally, it examines historical developments in tax theory and their connections to empirical studies and policy recommendations.
Optimal Income Taxation

Stefanie Stantcheva

GOALS OF THESE LECTURES

1) Understand the core optimal income tax model: linear and nonlinear
taxes in the Saez (2001) framework.

General method, intuitive, sufficient statistics.

2) Introduce the mechanism design approach of Mirrlees (1971).

Incentive compatibility, optimal control.

With and without income effects.

3) Extensions: Migration and rent-seeking

4) Should commodity taxes be used in addition to income taxes?


Atkinson-Stiglitz Theorem

OPTIMAL TAXATION: SIMPLE MODEL WITH NO BEHAVIORAL
RESPONSES

Utility $u(c)$ strictly increasing and concave

Same for everybody, where $c$ is after-tax income.

Income is $z$ and is fixed for each individual, $c = z - T(z)$ where $T(z)$ is the tax on $z$. $z$ has density $h(z)$.

Government maximizes Utilitarian objective:

$$\int_0^\infty u(z - T(z))\,h(z)\,dz$$

subject to budget constraint $\int T(z)\,h(z)\,dz \ge E$ (multiplier $\lambda$)

SIMPLE MODEL WITH NO BEHAVIORAL RESPONSES

Form Lagrangian: $L = [u(z - T(z)) + \lambda \cdot T(z)] \cdot h(z)$

First order condition (FOC) in $T(z)$:

$$0 = \frac{\partial L}{\partial T(z)} = [-u'(z - T(z)) + \lambda] \cdot h(z) \;\Rightarrow\; u'(z - T(z)) = \lambda$$

⇒ $z - T(z)$ = constant for all $z$.

⇒ $c = \bar z - E$ where $\bar z = \int z\,h(z)\,dz$ is average income.

100% marginal tax rate. Perfect equalization of after-tax income.

Utilitarianism with decreasing marginal utility leads to perfect egalitarianism [Edgeworth, 1897]

Utilitarianism and Redistribution
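[Figure: concave utility $u(c)$ plotted against consumption $c$. For two consumption levels $c_1 < c_2$, the utility of average consumption, $u\left(\frac{c_1 + c_2}{2}\right)$, lies above the average utility, $\frac{u(c_1) + u(c_2)}{2}$.]
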
ISSUES WITH SIMPLE MODEL

1) No behavioral responses: Obvious missing piece: 100% redistribution would destroy incentives to work and thus the assumption that $z$ is exogenous is unrealistic

⇒ Optimal income tax theory incorporates behavioral responses (Mirrlees REStud ’71): equity-efficiency trade-off

2) Issue with Utilitarianism: Even absent behavioral responses, many people would object to 100% redistribution [perceived as confiscatory]

⇒ Citizens’ views on fairness impose bounds on redistribution.

The issue is the restricted nature of social preferences that can be captured by most social welfare functions.

We will discuss preferences for redistribution in another lecture! For now we remain agnostic about the “$g_i$”.

MIRRLEES OPTIMAL INCOME TAX MODEL

We will solve the Mirrleesian model later. For now, let’s look at the spirit of
optimal tax evolution.

1) Standard labor supply model: Individual maximizes $u(c, l)$ subject to $c = wl - T(wl)$ where $c$ is consumption, $l$ labor supply, $w$ wage rate, $T(.)$ nonlinear income tax ⇒ taxes affect labor supply

2) Individuals differ in ability $w$, which is private information; $w$ is distributed with density $f(w)$.

3) Govt social welfare maximization: Govt maximizes

$$SWF = \int G(u(c, l))\,f(w)\,dw$$

($G(.)$ increasing and concave) subject to

(a) budget constraint $\int T(wl)\,f(w)\,dw \ge E$ (multiplier $\lambda$)

(b) individuals’ labor supply $l$ depends on $T(.)$

MIRRLEES MODEL RESULTS

Optimal income tax trades off redistribution and efficiency (as tax based on $w$ only is not feasible)

⇒ $T(.) < 0$ at bottom (transfer) and $T(.) > 0$ further up (tax) [full integration of taxes/transfers]

Mirrlees formulas complex, only a couple of fairly general results:

1) $0 \le T'(.) \le 1$; $T'(.) \ge 0$ is non-trivial (rules out EITC) [Seade ’77]

2) Marginal tax rate $T'(.)$ should be zero at the top (if skill distribution bounded) [Sadka ’76-Seade ’77]

3) If everybody works and lowest $wl > 0$, $T'(.) = 0$ at bottom

HISTORY: BEYOND MIRRLEES

Mirrlees ’71 had a huge impact on information economics: models with asymmetric information in contract theory

Discrete 2-type version of Mirrlees model developed by Stiglitz JpubE ’82 with individual FOC replaced by Incentive Compatibility constraint [high type should not mimic low type]

Till late 1990s, Mirrlees results not closely connected to empirical tax studies and little impact on tax policy recommendations

Since late 1990s, Diamond AER’98, Piketty ’97, Saez ReStud ’01 have connected Mirrlees model to practical tax policy / empirical tax studies

[new approach summarized in Diamond-Saez JEP’11 and Piketty-Saez Handbook’13]

INTENSIVE LABOR SUPPLY CONCEPTS

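$$\max_{c,z}\; u(c, z) \quad \text{subject to} \quad c = z \cdot (1 - \tau) + R$$

with $u$ increasing in $c$ and decreasing in $z$.

Imagine a linearized budget constraint: $R$ is virtual income (why virtual?) and $\tau$ marginal tax rate.

FOC in $c, z$ ⇒ $(1 - \tau)u_c + u_z = 0$ ⇒ Marshallian labor supply $z = z(1 - \tau, R)$

Uncompensated elasticity

$$\varepsilon^u = \frac{1 - \tau}{z}\,\frac{\partial z}{\partial(1 - \tau)}$$

Income effects

$$\eta = (1 - \tau)\,\frac{\partial z}{\partial R} \le 0$$
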
INTENSIVE LABOR SUPPLY CONCEPTS (II)

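Substitution effects: Hicksian labor supply: $z^c(1 - \tau, u)$ minimizes cost needed to reach $u$ given slope $1 - \tau$ ⇒

Compensated elasticity

$$\varepsilon^c = \frac{1 - \tau}{z}\,\frac{\partial z^c}{\partial(1 - \tau)} > 0$$

Slutsky equation

$$\frac{\partial z}{\partial(1 - \tau)} = \frac{\partial z^c}{\partial(1 - \tau)} + z\,\frac{\partial z}{\partial R} \;\Rightarrow\; \varepsilon^u = \varepsilon^c + \eta$$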
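The elasticity definitions above can be illustrated numerically. The sketch below assumes a specific utility function that is not in the lecture, $u(c, z) = a\log c + (1 - a)\log(\bar{L} - z)$ with a hypothetical time endowment $\bar{L}$, chosen only because it gives a closed-form Marshallian earnings function with a non-zero income effect. The function names and parameter values are illustrative. The compensated elasticity is backed out from the Slutsky equation rather than computed independently.

```python
def earnings(net_rate, virtual_income, a=0.6, time_endowment=1.0):
    """Marshallian earnings z(1 - tau, R) for the assumed utility
    u(c, z) = a*log(c) + (1 - a)*log(time_endowment - z),
    with budget constraint c = net_rate * z + virtual_income."""
    return a * time_endowment - (1 - a) * virtual_income / net_rate


def elasticities(net_rate, virtual_income, step=1e-6):
    """Return (eps_u, eta, eps_c) using central finite differences."""
    z = earnings(net_rate, virtual_income)
    dz_dnet = (earnings(net_rate + step, virtual_income)
               - earnings(net_rate - step, virtual_income)) / (2 * step)
    dz_dR = (earnings(net_rate, virtual_income + step)
             - earnings(net_rate, virtual_income - step)) / (2 * step)
    eps_u = net_rate / z * dz_dnet   # uncompensated elasticity
    eta = net_rate * dz_dR           # income effect (non-positive)
    eps_c = eps_u - eta              # compensated elasticity via Slutsky
    return eps_u, eta, eps_c


eps_u, eta, eps_c = elasticities(net_rate=0.7, virtual_income=0.1)
print(f"eps_u={eps_u:.4f}, eta={eta:.4f}, eps_c={eps_c:.4f}")
```

With this assumed utility the income effect is the constant $-(1 - a)$, so the uncompensated elasticity is smaller than the compensated one, as the Slutsky equation requires.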

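[Figure: Labor Supply Theory. Consumption $c$ against earnings supply $z$; budget line $c = (1 - \tau)z + R$ with slope $1 - \tau$ and intercept $R$; indifference curves; Marshallian labor supply $z(1 - \tau, R)$.]

[Figure: Labor Supply Theory. Indifference curve for utility $u$ and a line of slope $1 - \tau$; Hicksian labor supply $z^c(1 - \tau, u)$.]

[Figure: Labor Supply Income Effect. Raising virtual income from $R$ to $R + \Delta R$ moves earnings from $z(1 - \tau, R)$ to $z(1 - \tau, R + \Delta R)$; $\eta = (1 - \tau)\,\partial z/\partial R \le 0$.]

[Figure: Labor Supply Substitution Effect. Along the indifference curve for utility $u$, the slope changes from $1 - \tau$ to $1 - \tau + d\tau$ and earnings move from $z^c(1 - \tau, u)$ to $z^c(1 - \tau + d\tau, u)$; $\varepsilon^c = \frac{1 - \tau}{z}\,\frac{\partial z^c}{\partial(1 - \tau)} > 0$.]

[Figure: Uncompensated Labor Supply Effect. The slope changes from $1 - \tau$ to $1 - \tau + d\tau$; the total response combines the substitution effect ($\varepsilon^c > 0$) and the income effect ($\eta \le 0$). Slutsky equation: $\varepsilon^u = \varepsilon^c + \eta$.]
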
Labor Supply Effects of Taxes and Transfers

Taxes and transfers change the slope $1 - T'(z)$ of the budget constraint and net disposable income $z - T(z)$ (relative to the no tax situation where $c = z$)

Positive MTR $T'(z) > 0$ reduces labor supply through substitution effects

Net transfer ($T(z) < 0$) reduces labor supply through income effects

Net tax ($T(z) > 0$) increases labor supply through income effects

Effect of Tax on Labor Supply

[Figure: budget constraint $c = z - T(z)$ against pre-tax income $z$, with intercept $-T(0)$ and slope $1 - T'(z)$. Where $T(z) < 0$, the income effect lowers $z$; where $T(z) > 0$, the income effect raises $z$; wherever $T'(z) > 0$, the substitution effect lowers $z$.]

WELFARE EFFECT OF SMALL TAX REFORM

Indirect utility: $V(1 - \tau, R) = \max_z u((1 - \tau)z + R, z)$ where $R$ is virtual income intercept

Small tax reform: $d\tau$ and $dR$:

$$dV = u_c \cdot [-z\,d\tau + dR] + dz \cdot [(1 - \tau)u_c + u_z] = u_c \cdot [-z\,d\tau + dR]$$

Envelope theorem: no effect of $dz$ on $V$ because $z$ is already chosen to maximize utility ($(1 - \tau)u_c + u_z = 0$)

$[-z\,d\tau + dR]$ is the mechanical change in disposable income due to tax reform

Welfare impact of a small tax reform is given by $u_c$ times the money metric mechanical change in tax

WELFARE EFFECT OF SMALL TAX REFORM (II)

!! Remains true of any nonlinear tax system $T(z)$

Just need to look at $dT(z)$, mechanical change in taxes, or $dT_i$ for agent $i$.

Welfare impact is $dV_i = -u_c\,dT(z_i)$.

When is the welfare impact not just the mechanical change in disposable income?

Envelope Theorem: For a constrained problem

$$V(\theta) = \max_x F(x, \theta) \quad \text{s.t.} \quad c \ge G(x, \theta)$$

$$V'(\theta) = \frac{\partial F}{\partial \theta}(x^*(\theta), \theta) - \lambda^*(\theta)\,\frac{\partial G}{\partial \theta}(x^*(\theta), \theta)$$

SOCIAL WELFARE FUNCTIONS (SWF)

Welfarism = social welfare based solely on individual utilities

Any other social objective will lead to Pareto dominated outcomes in some circumstances (Kaplow and Shavell JPE’01) Why?

Most widely used welfarist SWF:

1) Utilitarian: $SWF = \int_i u^i$

2) Rawlsian (also called Maxi-Min): $SWF = \min_i u^i$

3) $SWF = \int_i G(u^i)$ with $G(.)$ increasing and concave, e.g., $G(u) = u^{1-\gamma}/(1 - \gamma)$ (Utilitarian is $\gamma = 0$, Rawlsian is $\gamma = \infty$)

4) General Pareto weights: $SWF = \int_i \mu_i \cdot u^i$ with $\mu_i \ge 0$ exogenously given

SOCIAL MARGINAL WELFARE WEIGHTS

Key sufficient statistics in optimal tax formulas are Social Marginal Welfare Weights for each individual:

Social Marginal Welfare Weight on individual $i$ is $g_i = G'(u^i)\,u_c^i/\lambda$ ($\lambda$ multiplier of govt budget constraint); it measures the \$ value for govt of giving \$1 extra to person $i$

No income effects ⇒ $\int_i g_i = 1$: giving \$1 to all costs \$1 (population has measure 1) and increases SWF (in \$ terms) by $\int_i g_i$

$g_i$ typically depend on tax system (endogenous variable)

Utilitarian case: $g_i$ decreases with $z_i$ due to decreasing marginal utility of consumption

Rawlsian case: $g_i$ concentrated on most disadvantaged (typically those with $z_i = 0$)

OPTIMAL LINEAR TAX RATE: INDIVIDUAL PROBLEM

Disposable income (consumption): $c = (1 - \tau) \cdot z + R$ with $\tau$ linear tax rate and $R$ demogrant funded by taxes $\tau Z$ with $Z$ aggregate earnings

Population of size one (continuum) with heterogeneous preferences $u^i(c, z)$ [differences in earnings ability are built in utility function]

Individual $i$ chooses $z$ to maximize $u^i((1 - \tau) \cdot z + R, z)$; labor supply choices $z^i(1 - \tau, R)$ aggregate to economy-wide earnings $Z(1 - \tau) = \int_i z^i$ (a function of the net-of-tax rate).

Tax Revenue $R(\tau) = \tau \cdot Z(1 - \tau)$ is inversely U-shaped with $\tau$: $R(\tau = 0) = 0$ (no taxes) and $R(\tau = 1) = 0$ (nobody works): called the Laffer Curve

OPTIMAL LINEAR TAX RATES: PLAN

Let’s look at:

1) The optimal linear tax formula (on all income $z \in [0, \infty)$).

2) The revenue-maximizing rate (special case).

3) The top revenue-maximizing tax rate (nests previous case if top bracket starts at $z = 0$).

OPTIMAL LINEAR TAX RATE: FORMULA

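Government chooses $\tau$ to maximize

$$\int_i G\left[u^i\big((1 - \tau)z^i + \tau Z(1 - \tau),\, z^i\big)\right]$$

Govt FOC (using the envelope theorem as $z^i$ maximizes $u^i$):

$$0 = \int_i G'(u^i)\,u_c^i \cdot \left[-z^i + Z - \tau\,\frac{dZ}{d(1 - \tau)}\right],$$

$$0 = \int_i G'(u^i)\,u_c^i \cdot \left[(Z - z^i) - \frac{\tau}{1 - \tau}\,e\,Z\right],$$

where $e = \frac{1 - \tau}{Z}\,\frac{dZ}{d(1 - \tau)}$ is the elasticity of aggregate earnings $Z$ with respect to the net-of-tax rate.

First term $(Z - z^i)$ is mechanical redistributive effect of $d\tau$, second term is efficiency cost due to behavioral response of $Z$

⇒ we obtain the following optimal linear income tax formula

$$\tau = \frac{1 - \bar g}{1 - \bar g + e} \quad \text{with} \quad \bar g = \frac{\int g_i \cdot z^i}{Z \cdot \int g_i}, \quad g_i = G'(u^i)\,u_c^i$$
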
OPTIMAL LINEAR TAX RATE: FORMULA

$$\tau = \frac{1 - \bar g}{1 - \bar g + e} \quad \text{with} \quad \bar g = \frac{\int g_i \cdot z^i}{Z \cdot \int g_i}, \quad g_i = G'(u^i)\,u_c^i$$

$0 \le \bar g < 1$ if $g_i$ is decreasing with $z_i$ (social marginal welfare weights fall with $z_i$).

$\bar g$ low when (a) inequality is high, (b) $g_i$ falls sharply with $z_i$

Formula captures the equity-efficiency trade-off robustly ($\tau$ decreases with $\bar g$, $\tau$ decreases with $e$)

Rawlsian case: $g_i \equiv 0$ for all $z_i > 0$ so $\bar g = 0$ and $\tau = 1/(1 + e)$

Rawlsian optimum = top of Laffer curve if the $\min_i u^i$ agent earns $z_i = 0$.
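The revenue-maximizing rate $\tau = 1/(1 + e)$ can be checked against a brute-force search over the Laffer curve. The sketch below assumes a constant-elasticity aggregate earnings function $Z(1 - \tau) = (1 - \tau)^e$ (a functional form and normalization chosen here for illustration, not taken from the lecture); the function names are hypothetical.

```python
def revenue(tau, e):
    """Tax revenue tau * Z(1 - tau), assuming Z(1 - tau) = (1 - tau)**e."""
    return tau * (1 - tau) ** e


def revenue_maximizing_rate(e, grid_size=10_000):
    """Grid search over tau in [0, 1] for the top of the Laffer curve."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return max(grid, key=lambda tau: revenue(tau, e))


e = 0.25
tau_grid = revenue_maximizing_rate(e)
tau_formula = 1 / (1 + e)
print(tau_grid, tau_formula)
```

Revenue is zero at both $\tau = 0$ and $\tau = 1$, and the grid maximum agrees with the formula up to the grid spacing.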

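Laffer Curve

[Figure: tax revenue $R = \tau \cdot Z(1 - \tau)$ against the tax rate $\tau \in [0, 1]$; revenue is zero at $\tau = 0$ and $\tau = 1$ and peaks at $\tau^* = \frac{1}{1 + e}$ with $e = \frac{1 - \tau}{Z} \cdot \frac{dZ}{d(1 - \tau)}$.]
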
OPTIMAL TOP INCOME TAX RATE (SAEZ ’01)

Consider constant MTR $\tau$ above fixed $z^*$. Goal is to derive optimal $\tau$

Assume w.l.o.g. there is a continuum of measure one of individuals above $z^*$

Let $z(1 - \tau)$ be their average income [depends on net-of-tax rate $1 - \tau$], with elasticity $e = [(1 - \tau)/z] \cdot dz/d(1 - \tau)$

! Careful, what is $e$?

Note that $e$ is a mix of income and substitution effects (see Saez ’01)

Optimal Top Income Tax Rate (Mirrlees ’71 model)

[Figure: disposable income $c = z - T(z)$ against market income $z$. Above $z^*$ the top bracket has slope $1 - \tau$; the reform lowers the slope to $1 - \tau - d\tau$. Mechanical tax increase: $d\tau\,[z - z^*]$. Behavioral response tax loss: $\tau\,dz = -d\tau \cdot e \cdot z \cdot \tau/(1 - \tau)$. Source: Diamond and Saez JEP'11]

OPTIMAL TOP INCOME TAX RATE

Consider small $d\tau > 0$ reform above $z^*$.

1) Mechanical increase in tax revenue:

$$dM = [z - z^*]\,d\tau$$

2) Welfare effect:

$$dW = -\bar g\,dM = -\bar g\,[z - z^*]\,d\tau$$

where $\bar g$ is the social marginal welfare weight for top earners

3) Behavioral response reduces tax revenue:

$$dB = \tau \cdot dz = -\tau\,\frac{dz}{d(1 - \tau)}\,d\tau = -\frac{\tau}{1 - \tau} \cdot \frac{1 - \tau}{z} \cdot \frac{dz}{d(1 - \tau)} \cdot z\,d\tau$$

$$\Rightarrow dB = -\frac{\tau}{1 - \tau} \cdot e \cdot z\,d\tau$$

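OPTIMAL TOP INCOME TAX RATE

$$dM + dW + dB = d\tau\left[(1 - \bar g)[z - z^*] - e\,\frac{\tau}{1 - \tau}\,z\right]$$

Optimal $\tau$ such that $dM + dW + dB = 0$ ⇒

$$\frac{\tau}{1 - \tau} = \frac{(1 - \bar g)[z - z^*]}{e \cdot z}$$

$$\tau = \frac{1 - \bar g}{1 - \bar g + a \cdot e} \quad \text{with} \quad a = \frac{z}{z - z^*}$$

Optimal $\tau$ decreases with $\bar g$ [redistributive tastes]

Optimal $\tau$ decreases with $e$ [efficiency]

Optimal $\tau$ decreases with $a$ [thinness of top tail]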
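A small numerical check of the perturbation argument: at the rate given by the formula, the mechanical, welfare, and behavioral effects of a small $d\tau$ should sum to zero. The sketch below uses illustrative numbers for $z^*$ and average top income (hypothetical values chosen so that $a = 2$), and the function names are not from the lecture.

```python
def optimal_top_rate(g_bar, a, e):
    """tau = (1 - g_bar) / (1 - g_bar + a * e)."""
    return (1 - g_bar) / (1 - g_bar + a * e)


def reform_effect(tau, g_bar, z_mean, z_star, e):
    """(dM + dW + dB) / dtau for a small increase in the top rate."""
    mechanical_net_of_welfare = (1 - g_bar) * (z_mean - z_star)
    behavioral = -e * tau / (1 - tau) * z_mean
    return mechanical_net_of_welfare + behavioral


# Illustrative values only: threshold, average income above it, elasticity.
z_star, z_mean, e, g_bar = 400_000, 800_000, 0.25, 0.0
a = z_mean / (z_mean - z_star)
tau = optimal_top_rate(g_bar, a, e)
print(a, tau, reform_effect(tau, g_bar, z_mean, z_star, e))
```

Below the optimal rate the net effect of raising $\tau$ is positive (the reform is desirable); above it the net effect is negative.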

OPTIMAL LINEAR RATES: RECAP

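1) The optimal linear tax formula (on all income $z \in [0, \infty)$):

$$\tau^* = \frac{1 - \bar g}{1 - \bar g + e} \quad \text{with} \quad \bar g = \frac{\int g_i \cdot z_i}{Z \cdot \int g_i}, \quad g_i = G'(u^i)\,u_c^i$$

2) The revenue-maximizing rate (special case if $\bar g = 0$, i.e., if $g_i = 0$ for all $z_i \ne 0$).

$$\tau^R = \frac{1}{1 + e}$$

3) The top revenue-maximizing tax rate (equal to $\tau^*$ if $z^* = 0$).

$$\tau^{top} = \frac{1 - \bar g}{1 - \bar g + a \cdot e} \quad \text{with} \quad a = \frac{z}{z - z^*}$$
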
SUFFICIENT STATS FORMULA

Pause for a bit: did we say anything about underlying characteristics of people?

Note how general the formula is!

Sufficient statistics, observables only.

ZERO TOP RATE RESULT

Suppose top earner earns $z^T$

When $z^* \to z^T$ ⇒ $z \to z^T$

$$dM = d\tau\,[z - z^*] \ll |dB| = d\tau \cdot e \cdot \frac{\tau}{1 - \tau} \cdot z \quad \text{when } z^* \to z^T$$

Intuition: extra tax applies only to earnings above $z^*$ but behavioral response applies to full $z$ ⇒

Optimal $\tau$ should be zero when $z^*$ close to $z^T$ (Sadka-Seade zero top rate result) but result applies only to top earner

Top is uncertain: If actual distribution is finite draw from an underlying Pareto distribution then expected revenue maximizing rate is $1/(1 + a \cdot e)$ (Diamond and Saez JEP’11)

[Figure: Empirical Pareto coefficients (vertical axis from 1 to 2.5) against $z^*$ = Adjusted Gross Income (current 2005 \$, from 0 to 1,000,000). Two series: $a = z_m/(z_m - z^*)$ with $z_m = E(z \mid z > z^*)$, and $\alpha = z^* h(z^*)/(1 - H(z^*))$. Source: Diamond and Saez JEP'11]


OPTIMAL TOP INCOME TAX RATE

Empirically: $a = z/(z - z^*)$ very stable above $z^*$ = \$400K

Pareto distribution $1 - F(z) = (k/z)^\alpha$, $f(z) = \alpha \cdot k^\alpha / z^{1+\alpha}$, with $\alpha$ Pareto parameter

$$z(z^*) = \frac{\int_{z^*}^\infty s f(s)\,ds}{\int_{z^*}^\infty f(s)\,ds} = \frac{\int_{z^*}^\infty s^{-\alpha}\,ds}{\int_{z^*}^\infty s^{-\alpha - 1}\,ds} = \frac{\alpha}{\alpha - 1} \cdot z^*$$

$\alpha = z/(z - z^*) = a$ measures thinness of top tail of the distribution

Empirically $a \in (1.5, 3)$, US has $a = 1.5$, Denmark has $a = 3$

$$\tau = \frac{1 - \bar g}{1 - \bar g + a \cdot e}$$

Only difficult parameter to estimate is $e$
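To see why $a = z/(z - z^*)$ is stable across thresholds under a Pareto tail, the sketch below builds a deterministic grid of Pareto quantiles (an assumption made here to avoid sampling noise; the lecture uses actual tax data) and computes $a$ at several hypothetical thresholds. Function names and the choice $\alpha = 2$, $k = 1$ are illustrative.

```python
def pareto_quantile_sample(alpha, k=1.0, size=200_000):
    """Deterministic 'sample': Pareto quantiles at evenly spaced probabilities.
    Inverts 1 - F(z) = (k / z)**alpha at u = (i + 0.5) / size."""
    return [k / ((i + 0.5) / size) ** (1 / alpha) for i in range(size)]


def empirical_a(incomes, z_star):
    """a = z_m / (z_m - z_star), with z_m the mean income above z_star."""
    top = [z for z in incomes if z > z_star]
    z_m = sum(top) / len(top)
    return z_m / (z_m - z_star)


incomes = pareto_quantile_sample(alpha=2.0)
for z_star in (2.0, 5.0, 10.0):
    print(z_star, round(empirical_a(incomes, z_star), 3))
```

For each threshold the computed $a$ is close to the Pareto parameter $\alpha = 2$, up to a small discretization error from the finite grid.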

[Figure: Empirical Pareto coefficients (vertical axis from 0 to 3) for total income $y$, against thresholds from \$200,000 to \$1,000,000. Two series: $y_m/(y_m - y^*)$ with $y_m = E(y \mid y > y^*)$, and $\alpha_Y = y^* h_Y(y^*)/(1 - H_Y(y^*))$.]

[Figure: Empirical Pareto coefficients (vertical axis from 0 to 3) for capital income $rk$, against thresholds from \$200,000 to \$1,000,000. Two series: $rk_m/(rk_m - rk^*)$ with $rk_m = E(rk \mid rk > rk^*)$, and $\alpha_K = rk^* h_K(rk^*)/(1 - H_K(rk^*))$.]

TOP TAX REVENUE MAXIMIZING TAX RATE

Utilitarian criterion with $u_c \to 0$ when $c \to \infty$ ⇒ $\bar g \to 0$ when $z^* \to \infty$

Rawlsian criterion (maximize utility of worst off person) ⇒ $\bar g = 0$ for any $z^* > \min(z)$

In the end, $\bar g$ reflects the value that society puts on marginal consumption of the rich

$\bar g = 0$ ⇒ Tax Revenue Maximizing Rate $\tau = 1/(1 + a \cdot e)$ (upper bound on top tax rate)

Example: $a = 2$ and $e = 0.25$ ⇒ $\tau = 2/3 = 66.7\%$

Laffer linear rate is a special case with $z^* = 0$, $z^m/z^* = \infty = a/(a - 1)$ and hence $a = 1$, $\tau = 1/(1 + e)$

EXTENSIONS AND LIMITATIONS

1) Model includes only intensive earnings response. Extensive earnings responses [entrepreneurship decisions, migration decisions] ⇒ Formulas can be modified

2) Model does not include fiscal externalities: part of the response to $d\tau$ comes from income shifting which affects other taxes ⇒ Formulas can be modified

3) Model does not include classical externalities: (a) charitable contributions, (b) positive spillovers (trickle down) [top earners underpaid], (c) negative spillovers [top earners overpaid]

Classical general equilibrium effects on prices are NOT externalities and do not affect formulas [Diamond-Mirrlees AER ’71, Saez JpubE ’04]

GENERAL NON-LINEAR INCOME TAX T (z )

(1) Lump-sum grant given to everybody equal to $-T(0)$

(2) Marginal tax rate schedule $T'(z)$ describing how (a) lump-sum grant is taxed away, (b) how tax liability increases with income

Let $H(z)$ be the income CDF [population normalized to 1] and $h(z)$ its density [endogenous to $T(.)$]

Let $g(z)$ be the social marginal value of consumption for taxpayers with income $z$ in terms of public funds [formally $g(z) = G'(u) \cdot u_c/\lambda$]: no income effects ⇒ $\int g(z)h(z)\,dz = 1$

Redistribution valued ⇒ $g(z)$ decreases with $z$

Let $G(z)$ be the average social marginal value of $c$ for taxpayers with income above $z$ [$G(z) = \int_z^\infty g(s)h(s)\,ds/(1 - H(z))$]

[Figure: disposable income $c = z - T(z)$ against pre-tax income $z$. In the small band $(z, z + dz)$ the slope is $1 - T'(z)$; the reform changes it to $1 - T'(z) - d\tau$. Mechanical tax increase: $d\tau\,dz\,[1 - H(z)]$. Social welfare effect: $-d\tau\,dz\,[1 - H(z)]\,G(z)$. Behavioral response: $\delta z = -d\tau \cdot e \cdot z/(1 - T'(z))$. Tax loss: $T'(z)\,\delta z\,h(z)\,dz = -h(z) \cdot e \cdot z \cdot T'(z)/(1 - T'(z))\,dz\,d\tau$. Source: Diamond and Saez JEP'11]


GENERAL NON-LINEAR INCOME TAX

Assume away income effects $\varepsilon^c = \varepsilon^u = e$ [Diamond AER’98 shows this is the key theoretical simplification]

Consider small reform: increase $T'$ by $d\tau$ in small band $z$ and $z + dz$

Mechanical effect $dM = dz\,d\tau\,[1 - H(z)]$

Welfare effect $dW = -dz\,d\tau\,[1 - H(z)]\,G(z)$

Behavioral effect: substitution effect $\delta z$ inside small band $[z, z + dz]$:

$$dB = h(z)\,dz \cdot T' \cdot \delta z = -h(z)\,dz \cdot T' \cdot d\tau \cdot z \cdot e(z)/(1 - T')$$

Optimum $dM + dW + dB = 0$
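The algebra from the optimum condition to the marginal tax rate formula can be spelled out in one pass, using only the three effects defined above and the local Pareto parameter $\alpha(z) = z h(z)/(1 - H(z))$:

```latex
\begin{aligned}
0 &= dM + dW + dB
   = dz\,d\tau \left\{ [1 - H(z)]\,[1 - G(z)]
     - h(z)\,z\,e(z)\,\frac{T'(z)}{1 - T'(z)} \right\} \\[4pt]
\frac{T'(z)}{1 - T'(z)}
  &= \frac{1 - G(z)}{e(z)} \cdot \frac{1 - H(z)}{z\,h(z)}
   = \frac{1 - G(z)}{\alpha(z)\,e(z)} \\[4pt]
T'(z) &= \frac{1 - G(z)}{1 - G(z) + \alpha(z)\,e(z)}
\end{aligned}
```

The last step solves $x/(1 - x) = y$ for $x = y/(1 + y)$.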

GENERAL NON-LINEAR INCOME TAX

$$T'(z) = \frac{1 - G(z)}{1 - G(z) + \alpha(z) \cdot e(z)}$$

1) $T'(z)$ decreases with $e(z)$ (elasticity efficiency effects)

2) $T'(z)$ decreases with $\alpha(z) = (z h(z))/(1 - H(z))$ (local Pareto parameter)

3) $T'(z)$ decreases with $G(z)$ (redistributive tastes)

Asymptotics: $G(z) \to \bar g$, $\alpha(z) \to a$, $e(z) \to e$ ⇒ Recover top rate formula $\tau = (1 - \bar g)/(1 - \bar g + a \cdot e)$


Negative Marginal Tax Rates Never Optimal

Suppose $T' < 0$ in band $[z, z + dz]$

Increase $T'$ by $d\tau > 0$ in band $[z, z + dz]$: $dM + dW > 0$ and $dB > 0$ because $T'(z) < 0$

⇒ Desirable reform

⇒ $T'(z) < 0$ cannot be optimal

EITC schemes are not desirable in Mirrlees ’71 model

MIRRLEES MODEL
The difference to before: we need to specify the structural primitives.

Key simplification is the lack of income effects (Diamond, 1998). We look into income effects next time.

Individual utility: $c - v(l)$, $l$ is labor supply.

Skill $n$ is exogenously given, equal to marginal productivity. Earnings are $z = nl$.

Density is $f(n)$ and CDF $F(n)$ on $[0, \infty)$.

Entry into contract theory/mechanism design here: The government does not observe skill. Tax is based on income $z$, $T(z)$.

What happens if we had a tax $T(n)$ available?

Why did we not talk about this in the earlier derivations? Did we ignore the incentive compatibility constraints?

Elasticity of labor to taxes
Recall we derive elasticities on the linearized budget set. If marginal tax rate is $\tau$, labor supply is: $l = l(n(1 - \tau))$. Why the $n(1 - \tau)$? Why only $n(1 - \tau)$?

FOC of the agent for labor supply:

$$n(1 - \tau) = v'(l)$$

Totally differentiate this (key thing: skill is fixed!)

$$d(n(1 - \tau)) = v''(l)\,dl$$

$$\Rightarrow e = \frac{dl}{d(n(1 - \tau))} \cdot \frac{(1 - \tau)n}{l} = \frac{(1 - \tau)n}{l\,v''(l)} = \frac{v'(l)}{l\,v''(l)}$$

Is this compensated? uncompensated?
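As a worked example, take an isoelastic disutility of labor (a functional form assumed here for illustration; the parameter $k > 0$ is hypothetical notation). The formula $e = v'(l)/(l\,v''(l))$ then gives a constant elasticity equal to $k$:

```latex
\begin{aligned}
v(l) &= \frac{l^{\,1 + 1/k}}{1 + 1/k}, \qquad
v'(l) = l^{1/k}, \qquad
v''(l) = \frac{1}{k}\, l^{1/k - 1} \\[4pt]
e &= \frac{v'(l)}{l\,v''(l)}
   = \frac{l^{1/k}}{l \cdot \frac{1}{k}\, l^{1/k - 1}} = k \\[4pt]
n(1 - \tau) &= v'(l) = l^{1/k}
  \;\Rightarrow\; l = \big(n(1 - \tau)\big)^{k}
\end{aligned}
```

Labor supply depends on $n$ and $\tau$ only through the net wage $n(1 - \tau)$, because utility $c - v(l)$ is quasi-linear in consumption.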


Direct Revelation Mechanism and Incentive Compatibility
We want to max social welfare and have exogenous revenue requirement (non transfer-related $E$).

We imagine a direct revelation mechanism. Every agent comes to government, reports a type $n'$. We assign allocations as a function of the report: $c(n')$, $z(n')$, $u(n')$. Why are we not assigning labor $l(n')$?

What are the constraints in this problem?

Feasibility (net resources sum to zero): $\int_n c_n f(n)\,dn \le \int_n n l_n f(n)\,dn - E$.

Incentive compatibility:

$$c(n) - v\!\left(\frac{z(n)}{n}\right) \ge c(n') - v\!\left(\frac{z(n')}{n}\right) \quad \forall n, n'$$

That’s a lot of constraints!

Envelope Theorem and First order Approach
Replace the infinity of constraints with agents’ first-order condition. If we take derivative of utility wrt type $n$ at truth-telling (with $n'$ the report)

$$\frac{du_n}{dn} = \left[c'(n) - \frac{z'(n)}{n}\,v'\!\left(\frac{z(n)}{n}\right)\right]\frac{dn'}{dn} + \frac{z(n)}{n^2}\,v'\!\left(\frac{z(n)}{n}\right)$$

What if report is optimally chosen?

Envelope condition:

$$\frac{du_n}{dn} = \frac{l_n\,v'(l_n)}{n}$$

Will replace infinity of constraints.

Is necessary, but what about sufficiency?

Full Optimization Program

$$\max_{c_n, u_n, z_n} \int_n G(u_n) f(n)\,dn \quad \text{s.t.} \quad \int_n c_n f(n)\,dn \le \int_n n l_n f(n)\,dn - E$$

$$\text{and s.t.} \quad \frac{du_n}{dn} = \frac{l_n\,v'(l_n)}{n}$$

State variable: $u_n$.

Control variables: $l_n$, with $c_n = u_n + v(l_n)$.

Why am I suddenly saying $l_n$ is a control?

Use optimal control.

Hamiltonian and Optimal Control

The Hamiltonian is:

$$H = [G(u_n) + p \cdot (n l_n - u_n - v(l_n))]\,f(n) + \phi(n) \cdot \frac{l_n\,v'(l_n)}{n}$$

$p$: multiplier on the resource constraint.

$\phi(n)$: multiplier on the envelope condition (“costate”). Depends on $n$!

FOCs:

$$\frac{\partial H}{\partial l_n} = p \cdot [n - v'(l_n)]\,f(n) + \frac{\phi(n)}{n} \cdot [v'(l_n) + l_n v''(l_n)] = 0$$

$$\frac{\partial H}{\partial u_n} = [G'(u_n) - p]\,f(n) = -\frac{d\phi(n)}{dn}$$

Transversality: $\lim_{n \to \infty} \phi(n) = 0$ and $\phi(0) = 0$.

Rearranging the FOCs
Take the integral of the FOC wrt $u_n$ to solve for $\phi(n)$:

$$-\phi(n) = \int_n^\infty [p - G'(u_m)]\,f(m)\,dm$$

Integrate this same FOC over the full space, using transversality conditions:

$$p = \int_0^\infty G'(u_m)\,f(m)\,dm$$

What does this say?

How can we make the tax rate appear? Use the agent’s FOC.

$$n - v'(l_n) = n\,T'(z_n)$$

Obtaining the Optimal Tax Formula

Recall that $e = \frac{(1 - T'(z_n))\,n}{l\,v''(l)}$

Rearranging the last term in the FOC for $l_n$:

$$[v'(l_n) + l_n v''(l_n)]/n = [1 - T'(z_n)][1 + 1/e]$$

Let $g_m \equiv G'(u_m)/p$ be the marginal social welfare weight on type $m$.

Then, the FOC for $l_n$ becomes:

$$\frac{T'(z_n)}{1 - T'(z_n)} = \left(1 + \frac{1}{e}\right) \cdot \frac{\int_n^\infty (1 - g_m)\,dF(m)}{n f(n)}$$

This is the Diamond (1998) formula.

What is different from the previously derived formula à la Saez (2001)?

Let’s go from types to observable income
How do we go from type distribution to income distribution?

Under linearized tax schedule, earnings are a function $z_n = n\,l(n(1 - \tau))$.

How do earnings vary with type?

$$\frac{dz_n}{dn} = l + (1 - \tau)\,n\,\frac{dl}{d(n(1 - \tau))} = l_n \cdot (1 + e)$$

(intuition?)

Let $h(z)$ be the density of earnings, with CDF $H(z)$. The following relation must hold:

$$h(z_n)\,dz_n = f(n)\,dn$$

$$f(n) = h(z_n)\,l_n\,(1 + e) \;\Rightarrow\; n f(n) = z_n h(z_n)(1 + e)$$

Let’s substitute income distributions for type distributions in the formula.



Optimal Tax Formula with No Income Effects

$$\frac{T'(z_n)}{1 - T'(z_n)} = \left(1 + \frac{1}{e(z_n)}\right) \frac{\int_n^\infty (1 - g_m)\,dF(m)}{n f(n)} \quad \text{(primitives)}$$

$$= \frac{1}{e(z_n)} \cdot \frac{1 - H(z_n)}{z_n h(z_n)} \cdot (1 - G(z_n)) \quad \text{(incomes)}$$

where:

$$G(z_n) = \frac{\int_n^\infty g_m\,dF(m)}{1 - F(n)} = \frac{\int_{z_n}^\infty g_m\,dH(z_m)}{1 - H(z_n)}$$

is the average marginal social welfare weight on individuals with income above $z_n$ (change of variables to income distributions in last equality).

Rearrange, use definition of Pareto parameter $\alpha(z) = (z h(z))/(1 - H(z))$ to get same formula as before:

$$T'(z) = \frac{1 - G(z)}{1 - G(z) + \alpha(z) \cdot e(z)}$$
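The equivalence of the two forms can be checked in a special case. The sketch below assumes zero welfare weights above the point considered ($g_m = 0$, so $G = 0$), a constant elasticity $e$, and a Pareto skill distribution with parameter `alpha_n`, for which $(1 - F(n))/(n f(n)) = 1/\alpha_n$. These assumptions and the function names are illustrative, not from the lecture. The relation $n f(n) = z_n h(z_n)(1 + e)$ together with $1 - F(n) = 1 - H(z_n)$ implies that the income Pareto parameter is $\alpha_n/(1 + e)$.

```python
def rate_from_primitives(e, alpha_n):
    """Type-based formula with g_m = 0 and Pareto skills, where
    (1 - F(n)) / (n * f(n)) = 1 / alpha_n."""
    ratio = (1 + 1 / e) / alpha_n      # T' / (1 - T')
    return ratio / (1 + ratio)


def rate_from_incomes(e, alpha_z):
    """Income-based formula with G(z) = 0: T' = 1 / (1 + alpha_z * e)."""
    return 1 / (1 + alpha_z * e)


e, alpha_n = 0.25, 2.5
alpha_z = alpha_n / (1 + e)            # Pareto parameter of earnings
print(rate_from_primitives(e, alpha_n), rate_from_incomes(e, alpha_z))
```

Both functions return the same marginal tax rate, which is the point of the change of variables from types to incomes.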

Recap:

“Mechanism design approach” requires you to specify primitives (utility function, uni-dimensional heterogeneity) as done in Mirrlees (1971).

“Sufficient stats approach” captures arbitrary heterogeneity conditional on $z$ as long as elasticities are well-behaved.

Both yield the same formula if one can make the link between type and income distributions.

$$\frac{T'(z_n)}{1 - T'(z_n)} = \left(1 + \frac{1}{e(z_n)}\right) \frac{\int_n^\infty (1 - g_m)\,dF(m)}{n f(n)} \quad \text{(primitives)}$$

$$= \frac{1}{e(z_n)} \cdot \frac{1 - H(z_n)}{z_n h(z_n)} \cdot (1 - G(z_n)) \quad \text{(incomes)}$$

NUMERICAL SIMULATIONS

$H(z)$ [and also $G(z)$] endogenous to $T(.)$. Calibration method (Saez Restud ’01):

Specify utility function (e.g. constant elasticity):

$$u(c, z) = c - \frac{1}{1 + \frac{1}{e}} \cdot \left(\frac{z}{n}\right)^{1 + \frac{1}{e}}$$

Individual FOC ⇒ $z = n^{1+e}(1 - T')^e$

Calibrate the exogenous skill distribution $F(n)$ so that, using actual $T'(.)$, you recover empirical $H(z)$

Use Mirrlees ’71 tax formula (expressed in terms of $F(n)$) to obtain the optimal tax rate schedule $T'$.
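The individual FOC used in the calibration can be verified by brute force. The sketch below assumes a constant marginal tax rate $\tau$ and a lump-sum term $R$ (a linearized schedule, assumed here for simplicity), maximizes the constant-elasticity utility over a grid of earnings, and compares the result with the closed form $z = n^{1+e}(1 - \tau)^e$. Function names and parameter values are illustrative.

```python
def utility(z, n, e, tau, R=0.0):
    """u = c - (z/n)**(1 + 1/e) / (1 + 1/e), with c = (1 - tau) * z + R."""
    k = 1 + 1 / e
    return (1 - tau) * z + R - (z / n) ** k / k


def earnings_closed_form(n, e, tau):
    """z = n**(1 + e) * (1 - tau)**e from the individual FOC."""
    return n ** (1 + e) * (1 - tau) ** e


def earnings_grid_search(n, e, tau, z_max=10.0, steps=100_000):
    """Earnings level on an evenly spaced grid that maximizes utility."""
    grid = [z_max * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda z: utility(z, n, e, tau))


n, e, tau = 2.0, 0.5, 0.3
print(earnings_closed_form(n, e, tau), earnings_grid_search(n, e, tau))
```

Because utility is quasi-linear in $c$, the lump-sum term $R$ does not affect the chosen earnings, which is the no-income-effects property the calibration relies on.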

NUMERICAL SIMULATIONS

$$\frac{T'(z(n))}{1 - T'(z(n))} = \left(1 + \frac{1}{e}\right) \frac{1}{n f(n)} \int_n^\infty \left[1 - \frac{G'(u(m))}{\lambda}\right] f(m)\,dm$$

Iterative Fixed Point method: start with $T'_0$, compute $z^0(n)$ using individual FOC, get $T^0(0)$ using govt budget, compute $u^0(n)$, get $\lambda$ using $\lambda = \int G'(u) f$, use formula to estimate $T'_1$, iterate till convergence

Fast and effective method (Brewer-Saez-Shepard ’10)

NUMERICAL SIMULATION RESULTS

$$T'(z) = \frac{1 - G(z)}{1 - G(z) + \alpha(z) \cdot e(z)}$$

1) Take utility function with $e$ constant

2) $\alpha(z) = (z h(z))/(1 - H(z))$ is inversely U-shaped empirically

3) $1 - G(z)$ increases with $z$ from 0 to 1 ($\bar g = 0$)

⇒ Numerical optimal $T'(z)$ is U-shaped with $z$: reverse of the general results $T' = 0$ at top and bottom [Diamond AER’98 gives theoretical conditions to get U-shape]

[FIGURE 5: Optimal Tax Simulations. Four panels (Utilitarian Criterion with Utility type I and type II; Rawlsian Criterion with Utility type I and type II), each plotting the marginal tax rate (0 to 1) against wage income $z$ (\$0 to \$300,000) for $\zeta^c = 0.25$ and $\zeta^c = 0.5$. Source: Saez (2001), p. 224]
EXTENSION 1: MIGRATION EFFECTS

Tax rates may affect migration (evidence on this next time).

Migration issues may be particularly important at the top end (brain drain).

Some theory papers (Mirrlees ’82, Lehmann-Simula QJE’14). Here: Simplified Mirrlees (1982) model.

Earnings $z$ are fixed, conditional on residence.

$P(c|z)$ is number of residents earning $z$ when disposable income is $c$, with $c = z - T(z)$.

Consider small tax reform $dT(z)$ for those earning $z$.

What is migration responding to? Marginal taxes?

ELASTICITY OF MIGRATION TO TAXES

Mechanical effect net of welfare is: $M + W = (1 - g(z))\,P(c|z)\,dT$.

Why? Where is utility effect of changing country induced by taxes?

Migration responds to average taxes (or total taxes, since income fixed).

$$\eta_m(z) = \frac{\partial P(c|z)}{\partial c} \cdot \frac{z - T(z)}{P(c|z)}$$

Fiscal cost of raising taxes by $dT(z)$ is: $B = -\frac{T(z)}{z - T(z)} \cdot P(c|z) \cdot \eta_m \cdot dT$

Optimal tax is where $M + W + B = 0$:

$$\frac{T(z)}{z - T(z)} = \frac{1}{\eta_m(z)} \cdot (1 - g(z))$$

What determines the elasticity $\eta_m(z)$?
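The optimality condition is stated in terms of tax relative to disposable income. Solving it for the average tax rate $T(z)/z$ gives an expression with the same shape as the earlier inverse-elasticity formulas (a rearrangement added here for illustration, with hypothetical example values):

```latex
\begin{aligned}
\frac{T(z)}{z - T(z)} &= \frac{1 - g(z)}{\eta_m(z)}
\;\Longrightarrow\;
\frac{T(z)}{z} = \frac{1 - g(z)}{1 - g(z) + \eta_m(z)} \\[4pt]
\text{e.g. } g(z) &= 0,\ \eta_m(z) = 1
\;\Longrightarrow\; \frac{T(z)}{z} = \frac{1}{2}
\end{aligned}
```

A higher migration elasticity or a higher welfare weight on those earning $z$ lowers the optimal average tax rate at $z$.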

MIGRATION EFFECTS IN THE STANDARD MODEL

$\eta_m(z)$ depends on size of jurisdiction: large for cities, zero worldwide ⇒ (1) Redistribution easier in large jurisdictions, (2) Tax coordination across countries increases ability to redistribute (big issue currently in EU), (3) visa system, cost of migration, ...

Top revenue maximizing tax rate formula (Brewer-Saez-Shepard ’10):

$$\tau = \frac{1}{1 + a \cdot e + \bar\eta_m}$$

where $\bar\eta_m$ is the elasticity of top earners to disposable income.

EXTENSION 2: RENT SEEKING EFFECTS

Pay may not be equal to the marginal economic product for top income earners. Why? Overpaid or underpaid?

Piketty, Saez, and Stantcheva (2014) “A Tale of Three Elasticities.”

Actual output is $y$, but individual only receives share $\eta$ of actual output. To increase either productive effort or rent-seeking, effort is required.

$$u^i(c, \eta, y) = c - h_i(y) - k_i(\eta)$$

Define bargained earnings: $b = (\eta - 1)y$.

Average bargaining is $E(b)$, extracted equally from everyone else (good assumption?) Means $E(b)$ can be perfectly canceled by $-T(0)$.

RENT SEEKING ELASTICITIES
Given tax, individual maximizes:

$$u^i(c, y, \eta) = \eta \cdot y - T(\eta \cdot y) - h_i(y) - k_i(\eta)$$

What will $y_i$ and $\eta_i$ depend on?

Average reported income, productive income and bargained earnings in the top bracket: $z(1 - \tau)$, $y(1 - \tau)$, $\eta(1 - \tau)$

Total compensation elasticity $e$: $e = \frac{1 - \tau}{z}\,\frac{dz}{d(1 - \tau)}$ (what is it driven by?)

Real labor supply elasticity $e_y$: $e_y = \frac{1 - \tau}{y}\,\frac{dy}{d(1 - \tau)} \ge 0$.

Thus the bargaining elasticity component $e_b = \frac{1 - \tau}{z}\,\frac{db}{d(1 - \tau)} = s \cdot e$ with $s = \frac{db/d(1 - \tau)}{dz/d(1 - \tau)}$

$s$ and $e_b$ positive if $\eta > 1$.

OPTIMAL TAX RATE WITH RENT SEEKING
Suppose rent-seeking only at the top, $E(b) = q\,b(1 - \tau)$ where $q$ is the fraction of top earners.

Government maximizes tax revenues from top bracket earners:

$$T = \tau\,[y(1 - \tau) + b(1 - \tau) - z^*]\,q - E(b)$$

Why does $E(b)$ enter?

$$\tau^* = \frac{1 + a \cdot e_b}{1 + a \cdot e} = 1 - \frac{a\,(y/z)\,e_y}{1 + a \cdot e}$$

How does $\tau$ change with $e$, $e_y$, and $e_b$? When is $\tau^* = 1$ optimal?

Trickle up vs trickle down: what happens to $\tau^*$ when top earners are overpaid? Underpaid?

How would you measure $e_b$ (even $b$ itself?)
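The two expressions for $\tau^*$ agree because reported income is $z = y + b$, so the total elasticity decomposes as $e = (y/z)\,e_y + e_b$. The sketch below checks this numerically with hypothetical parameter values chosen only for illustration; the function names are not from the lecture.

```python
def top_rate_with_rent_seeking(a, e, e_b):
    """tau* = (1 + a * e_b) / (1 + a * e)."""
    return (1 + a * e_b) / (1 + a * e)


def top_rate_alternative(a, e, e_y, y_over_z):
    """Equivalent form: tau* = 1 - a * (y/z) * e_y / (1 + a * e)."""
    return 1 - a * y_over_z * e_y / (1 + a * e)


# Hypothetical values for illustration.
a, e_y, e_b, y_over_z = 1.5, 0.2, 0.3, 1.0
e = y_over_z * e_y + e_b   # decomposition of the total elasticity (z = y + b)
print(top_rate_with_rent_seeking(a, e, e_b),
      top_rate_alternative(a, e, e_y, y_over_z))
```

Setting `e_b = 0` recovers the standard revenue-maximizing rate $1/(1 + a \cdot e)$, and for a given total elasticity $e$ a larger bargaining component raises $\tau^*$.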


OPTIMAL NON-LINEAR TAX WITH INCOME EFFECTS

Consider effect of small reform where marginal tax rates increased by $d\tau$ in $[z^*, z^* + dz^*]$.

What are the effects on tax receipts?

Mechanical effect net of welfare loss, $M$:

Every taxpayer with income $z$ above $z^*$ pays additional $d\tau\,dz^*$, valued at $(1 - g(z))\,d\tau\,dz^*$.

$$M = d\tau\,dz^* \int_{z^*}^\infty (1 - g(z))\,h(z)\,dz$$

BEHAVIORAL EFFECT PART 1: SUBSTITUTION
In $[z^*, z^* + dz^*]$, income changes by $dz$.

Marginal tax rate changes directly by $d\tau$, but also additionally indirectly by $dT'(z) = T''(z)\,dz$. Why? When is this not the case?

$$dz = -\varepsilon^c_{(z)}\,z^*\,\frac{d\tau + dT'(z)}{1 - T'(z)} \;\Rightarrow\; dz = -\varepsilon^c_{(z)}\,z^*\,\frac{d\tau}{1 - T'(z) + \varepsilon^c_{(z)}\,z^*\,T''(z)}$$

Define the virtual density: density that would occur at $z$ if tax schedule replaced by linearized tax schedule. What is the linearized schedule $(\tau, R)$ such that income is $(1 - \tau)z + R$?

$$\frac{h^*(z)}{1 - T'(z)} = \frac{h(z)}{1 - T'(z) + \varepsilon^c_{(z)}\,z^*\,T''(z)}$$

BEHAVIORAL EFFECT PART 1: SUBSTITUTION

Overall elasticity/substitution effect is then:

$$E = -\varepsilon^c_{(z)}\,z^*\,\frac{T'(z)}{1 - T'(z)}\,h^*(z^*)\,d\tau\,dz^*$$

Can derive expression without taking into account endogenous (indirect) change in marginal tax rates if use the virtual density instead of true one.

BEHAVIORAL EFFECT PART 2: INCOME EFFECT

Taxpayers with income above $z^*$ pay $-dR = d\tau\,dz^*$ additional taxes. Their change in income is:

$$dz = -\varepsilon^c_{(z)}\,z\,\frac{T''\,dz}{1 - T'} - \eta\,\frac{d\tau\,dz^*}{1 - T'(z)} \;\Rightarrow\; dz = -\eta\,\frac{d\tau\,dz^*}{1 - T'(z) + z\,\varepsilon^c_{(z)}\,T''(z)}$$

Why?

Total income effect response:

$$I = d\tau\,dz^* \int_{z^*}^\infty -\eta(z)\,\frac{T'(z)}{1 - T'(z)}\,h^*(z)\,dz$$

At the optimum: $M + E + I = 0$.

PUTTING THE EFFECTS TOGETHER
 
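$$\frac{T'(z^*)}{1 - T'(z^*)} = \frac{1}{\varepsilon^c_{(z^*)}} \left(\frac{1 - H(z^*)}{z^*\, h^*(z^*)}\right) \times \left[\int_{z^*}^{\infty} (1 - g(z)) \frac{h(z)}{1 - H(z^*)}\,dz + \int_{z^*}^{\infty} -\eta \frac{T'(z)}{1 - T'(z)} \frac{h^*(z)}{1 - H(z^*)}\,dz\right]$$

This is a first-order differential equation. See the Saez (2001) Appendix for the solution (it is standard).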
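The general solution is in the Saez (2001) Appendix, but the formula has a closed form in one special case. As an assumption of this sketch (not a statement from the slide), take a constant marginal rate above $z^*$ (so $T'' = 0$ and $h^* = h$), a Pareto tail with parameter $a$ (so $(1 - H(z^*)) / (z^* h(z^*)) = 1/a$), and constant $g$, $\varepsilon^c$, $\eta$ above $z^*$. The formula then reduces to $T'/(1 - T') = (1 - g)/(a\,\varepsilon^c + \eta)$. The function name is hypothetical.

```python
def top_rate(g, a, eps_c, eta=0.0):
    """Constant top marginal rate under a Pareto tail (illustrative sketch).

    g     : constant social welfare weight on incomes above z*
    a     : Pareto parameter of the income distribution
    eps_c : compensated elasticity
    eta   : income effect parameter (eta <= 0 when leisure is normal)
    """
    denom = a * eps_c + eta
    if denom <= 0:
        raise ValueError("a * eps_c + eta must be positive")
    x = (1.0 - g) / denom  # x = T' / (1 - T')
    return x / (1.0 + x)


# No income effects, zero weight on top earners: tau = 1 / (1 + a * eps_c).
tau_no_income_effect = top_rate(g=0.0, a=2.0, eps_c=0.25)

# With income effects and a positive weight on top earners.
tau_with_income_effect = top_rate(g=0.5, a=2.0, eps_c=0.25, eta=-0.1)
```

Using the Slutsky relation $\varepsilon^u = \varepsilon^c + \eta$, the denominator $a\,\varepsilon^c + \eta$ can be rewritten as $\varepsilon^u + (a - 1)\,\varepsilon^c$.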

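Change of variable from $z$ to $n$?

Recall that with a linear tax: $\dfrac{\dot z_n}{z_n} = \dfrac{1 + \varepsilon^u_{(z_n)}}{n}$.

What happens with a nonlinear tax? See the Saez (2001) Appendix for the derivation.

$$\frac{\dot z_n}{z_n} = \frac{1 + \varepsilon^u_{(z_n)}}{n} - \dot z_n \frac{T''(z_n)}{1 - T'(z_n)}\, \varepsilon^c_{(z_n)}$$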
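Because $\dot z_n$ appears on both sides of the nonlinear-tax relation, it can be solved for explicitly. The sketch below does that algebra; the function name and the numbers are illustrative assumptions.

```python
def income_growth_in_skill(z, n, eps_u, eps_c, T1, T2):
    """Solve for z-dot (dz/dn) at skill n from the implicit relation

        zdot / z = (1 + eps_u) / n - zdot * T2 / (1 - T1) * eps_c

    where T1 = T'(z_n) and T2 = T''(z_n).
    """
    return ((1.0 + eps_u) / n) / (1.0 / z + eps_c * T2 / (1.0 - T1))


# Hypothetical values for illustration.
z, n = 50.0, 2.0

# Linear tax (T'' = 0): recovers zdot / z = (1 + eps_u) / n.
zdot_linear = income_growth_in_skill(z, n, eps_u=0.2, eps_c=0.3, T1=0.4, T2=0.0)

# Progressive schedule (T'' > 0): income rises more slowly with skill.
zdot_convex = income_growth_in_skill(z, n, eps_u=0.2, eps_c=0.3, T1=0.4, T2=0.002)
```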
ATKINSON-STIGLITZ THEOREM

The famous Atkinson-Stiglitz theorem (JpubE '76) shows that

$$\max_{t,\,T(\cdot)} SWF = \max_{t = 0,\,T(\cdot)} SWF$$

(i.e., commodity taxes are not useful) under two assumptions on utility functions $u^h(c_1, \dots, c_K, z)$:

1) Weak separability between $(c_1, \dots, c_K)$ and $z$ in utility

2) Homogeneity across individuals in the sub-utility of consumption $v(c_1, \dots, c_K)$ [does not vary with $h$]

(1) and (2): $u^h(c_1, \dots, c_K, z) = U^h(v(c_1, \dots, c_K), z)$

The original proof was based on optimum conditions; new, straightforward proofs are given by Laroque (EL '05) and Kaplow (JpubE '06).
ATKINSON-STIGLITZ THEOREM PROOF

Let $V(y, p + t) = \max_c v(c_1, \dots, c_K)$ s.t. $(p + t) \cdot c \le y$ be the indirect utility of consumption $c$ [common to all individuals].

Start with $(T(\cdot), t)$. Let $c(t)$ be the consumer's choice.

Replace $(T(\cdot), t)$ with $(\bar T(\cdot), t = 0)$, where $\bar T(z)$ is such that $V(z - T(z), p + t) = V(z - \bar T(z), p)$ ⇒ utility $U^h(V, z)$ and labor supply choices $z$ are unchanged for all individuals.

Attaining $V(z - \bar T(z), p)$ at price $p$ costs at least $z - \bar T(z)$.

The consumer also attains $V(z - \bar T(z), p) = V(z - T(z), p + t)$ when choosing $c(t)$ ⇒ $z - \bar T(z) \le p \cdot c(t) = z - T(z) - t \cdot c(t)$

⇒ $\bar T(z) \ge T(z) + t \cdot c(t)$: the government collects more taxes with $(\bar T(\cdot), t = 0)$.
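The proof can be checked numerically in a simple case. The sketch below assumes a Cobb-Douglas sub-utility $v(c) = \prod_k c_k^{a_k}$ with $\sum_k a_k = 1$, which is an illustrative choice and not part of the slides; the function names and numbers are hypothetical.

```python
import math


def indirect_utility(y, q, shares):
    """V(y, q) for Cobb-Douglas v(c) = prod_k c_k^{a_k}, sum_k a_k = 1."""
    return y * math.prod((a / qk) ** a for a, qk in zip(shares, q))


def demands(y, q, shares):
    """Cobb-Douglas demands: spend the share a_k of income y on good k."""
    return [a * y / qk for a, qk in zip(shares, q)]


def compensating_income_tax(z, T, p, t, shares):
    """T_bar(z) such that V(z - T, p + t) = V(z - T_bar, p)."""
    q = [pk + tk for pk, tk in zip(p, t)]
    y_bar = (z - T) * math.prod(
        (pk / qk) ** a for a, pk, qk in zip(shares, p, q)
    )
    return z - y_bar


# Hypothetical economy: two goods, a 50% tax on good 1 only.
z, T = 150.0, 50.0
p, t, shares = [1.0, 1.0], [0.5, 0.0], [0.5, 0.5]
q = [pk + tk for pk, tk in zip(p, t)]

c = demands(z - T, q, shares)
old_revenue = T + sum(tk * ck for tk, ck in zip(t, c))  # T(z) + t . c(t)
T_bar = compensating_income_tax(z, T, p, t, shares)     # about 68.35
```

Here `old_revenue` is about 66.67, so the income-tax-only system raises more revenue while leaving consumption utility, and hence labor supply, unchanged.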
ATKINSON-STIGLITZ INTUITION

With separability and homogeneity, conditional on earnings $z$, consumption choices $c = (c_1, \dots, c_K)$ do not provide any information on ability

⇒ Differentiated commodity taxes $t_1, \dots, t_K$ create a tax distortion with no benefit ⇒ Better to do all the redistribution with the individual income tax

Note: With the weaker linear income taxation tool (Diamond-Mirrlees AER '71, Diamond JpubE '75), need $v(c_1, \dots, c_K)$ homothetic (linear Engel curves, Deaton EMA '81) to obtain the no-commodity-tax result

[Unless Engel curves are linear, commodity taxation can be useful to “non-linearize” the tax system]

Generalization of Atkinson-Stiglitz to Heterogeneous Tastes – Saez (2002)

Can we generalize AS to the case with heterogeneous consumption preferences?

Individuals indexed by $h$, utility $u^h(c, z)$ with $c = (c_1, \dots, c_K)$.

Nonlinear income tax $T(z)$.

Pre-tax prices: $p$; post-tax prices: $q = p + t$.

Budget constraint: $q \cdot c \le z - T(z)$.

Demands: $c^h(q, R, z)$, labor supply $z^h(q, T)$, indirect utility $v^h(q, R, z)$.

Suppose that $T(z)$ is optimally chosen at zero commodity taxation ($p = q$) to maximize

$$W = \sum_h \alpha^h v^h(p, z^h - T(z^h), z^h) \quad \text{s.t.} \quad \sum_h T(z^h) \ge E$$

Marginal social welfare weight: $g^h = \alpha^h v^h_R / \lambda$.


Can commodity taxation improve welfare?

Imagine a small tax $dt_1$ on good 1.

Mechanical revenue effect: $dM_1 = \sum_h c_1^h\, dt_1 = C_1\, dt_1$.

Welfare effect (envelope theorem): $dU_1 = -\sum_h g^h c_1^h\, dt_1$.

Behavioral labor supply response: $dB_1 = -\sum_h T'(z^h)\, dz^h_{t_1}$ with $dz^h_{t_1} = dt_1 \dfrac{\partial z^h}{\partial q_1}$.

Why is there no behavioral response on revenue from changes in consumption?

If no commodity tax introduction can increase welfare, we need $dW = dM_1 + dU_1 + dB_1 = 0$.

Can commodity taxation improve welfare? (II)

Find a small income tax reform that “mimics” the commodity tax change: $dT(z) = C_1(z)\, dt_1$, where $C_1(z)$ is the average consumption of good 1 among individuals earning $z$.

This reform has zero first-order welfare impact. Why?

Mechanical revenue effect: $dM_T = \sum_h dT(z^h) = \sum_h C_1(z^h)\, dt_1 = C_1\, dt_1 = dM_1$ (why?)

Welfare effect: $dU_T = -\sum_h g^h C_1(z^h)\, dt_1$

Behavioral effect: $dB_T = -\sum_h T'(z^h)\, dz^h_T$.

Subtract one from the other (using that $dW_T = dM_T + dU_T + dB_T = 0$).

Can commodity taxation improve welfare? (III)

$$\frac{dW}{dt_1} = \underbrace{-\sum_h g^h \left[c_1^h - C_1(z^h)\right]}_{\text{Pure Welfare Effect}} + \underbrace{\sum_h T'(z^h) \cdot \left[\frac{dz^h_T}{dT(z^h)} \frac{dT(z^h)}{dt_1} - \frac{dz^h_{t_1}}{dt_1}\right]}_{\text{Behavioral Effect}}$$

The pure welfare effect is zero if, conditional on $z$, $g^h$ and $c_1^h$ are uncorrelated.

What does this mean? Is this reasonable? (Recall that the weights are "generalized", since $\alpha^h$ depends on $h$ directly.)

Young/old? Medical expenses?

Always satisfied under the "standard AS" assumptions.
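The pure welfare effect can be computed directly on a toy population. In the sketch below, the data, the function name, and the use of a per-income-level mean for $C_1(z)$ are illustrative assumptions; the result is expressed per unit of $dt_1$.

```python
from collections import defaultdict


def pure_welfare_effect(population):
    """-sum_h g^h [c1^h - C1(z^h)], with C1(z) the mean of c1 at income z.

    population: list of (z, c1, g) tuples (hypothetical data).
    """
    by_income = defaultdict(list)
    for z, c1, g in population:
        by_income[z].append(c1)
    C1 = {z: sum(vals) / len(vals) for z, vals in by_income.items()}
    return -sum(g * (c1 - C1[z]) for z, c1, g in population)


# Same income, different consumption of good 1, identical welfare weights:
# g and c1 are uncorrelated conditional on z, so the effect vanishes.
uncorrelated = [(100.0, 10.0, 1.0), (100.0, 20.0, 1.0)]

# Same income, but the individual who consumes more of good 1 (say, medical
# expenses) also carries a higher weight: taxing good 1 lowers welfare.
correlated = [(100.0, 10.0, 0.8), (100.0, 20.0, 1.2)]

effect_uncorrelated = pure_welfare_effect(uncorrelated)
effect_correlated = pure_welfare_effect(correlated)
```

In the second population the effect is negative, so a subsidy on good 1 (rather than a tax) would raise welfare through this channel.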

Behavioral Effects under Commodity and Income Taxation
Since $T'(z^h) \ge 0$ (remember), increasing $dt_1 > 0$ is more efficient than the equivalent income tax increase if the labor supply increase from the commodity tax change is larger than that from the income tax change.

When is this the case? Can show that:

$$E[dz^h_{t_1}] = -dt_1 \left\{ E\left[\frac{z^h_c}{1 + T''(z)\, z^h_c} \frac{dc_1^h}{dz}\right] + E\left[\frac{z^h_R}{1 + T''(z)\, z^h_c}\, c_1^h\right] \right\}$$

$$E[dz^h_T] = -dt_1 \left\{ E\left[\frac{z^h_c}{1 + T''(z)\, z^h_c}\right] \frac{dC_1(z)}{dz} + E\left[\frac{z^h_R}{1 + T''(z)\, z^h_c}\right] C_1(z) \right\}$$

Need $E[dz^h_{t_1}] = E[dz^h_T]$ for no commodity taxation.


Assumptions needed for behavioral effects to be the same under
Commodity and Income Taxation

Assumption 2: Conditional on $z$, the behavioral responses $z^h_c$ and $z^h_R$ are independent of the consumption patterns $c_1^h$ and $\dfrac{dc_1^h}{dz}$.

Do you think this holds?

Assumption 3: For any income level $z$, $E\left[\dfrac{dc_1^h}{dz} \,\middle|\, z^h = z\right] = \dfrac{dC_1(z)}{dz}$.

This is the key assumption. What does it say? Why is this not mechanically true?

$\dfrac{dC_1(z)}{dz} = \lim_{dz \to 0} \dfrac{E(c_1^h \mid z^h = z + dz) - E(c_1^h \mid z^h = z)}{dz}$ is the cross-sectional variation in consumption of good 1 when income changes.

What is $E\left[\dfrac{dc_1^h}{dz} \,\middle|\, z^h = z\right]$?

Assumptions needed for behavioral effects to be the same under
Commodity and Income Taxation
Imagine 2 groups:

Group A: $z^h = z$. Consumes $C_1(z)$ of good 1 on average.

Group B: $z^h = z - dz$. Consumes on average $dc_1 = \dfrac{dC_1(z)}{dz}\, dz$ less of good 1.

Group A': Individuals from group A who are forced to reduce their income to $z - dz$. They reduce their consumption relative to group A by $dc_1' = E\left[\dfrac{dc_1^h}{dz} \,\middle|\, z^h = z\right] dz$.

Assumption 3 says: Group B = Group A' for consumption of good 1.

Always true in AS since, with separability, consumption depends only on after-tax income: $C_1(z) = c_1^h(z)$.

Why would Group A' not have the same consumption of good 1 as Group B?
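A stylized example of how Group A' and Group B can differ. The sketch assumes, purely for illustration, that each individual spends a fixed share of after-tax income on good 1, that the income tax is linear with no transfer, and that high and low earners may have different shares. None of these assumptions come from the slides, and the names are hypothetical.

```python
def cross_sectional_slope(z_low, z_high, share_low, share_high, tax_rate):
    """[C1(z_high) - C1(z_low)] / (z_high - z_low): Group A versus Group B."""
    c_high = share_high * (1.0 - tax_rate) * z_high
    c_low = share_low * (1.0 - tax_rate) * z_low
    return (c_high - c_low) / (z_high - z_low)


def individual_slope(share_high, tax_rate):
    """dc1^h/dz for a high earner forced to earn less: Group A versus A'."""
    return share_high * (1.0 - tax_rate)


z_low, z_high, tax_rate = 100.0, 200.0, 0.3

# Homogeneous tastes: the two slopes coincide, so Assumption 3 holds.
same_cross = cross_sectional_slope(z_low, z_high, 0.1, 0.1, tax_rate)
same_indiv = individual_slope(0.1, tax_rate)

# High earners have a stronger taste for good 1: the cross-sectional slope
# overstates how much they would cut back if forced down to z_low.
diff_cross = cross_sectional_slope(z_low, z_high, 0.1, 0.2, tax_rate)
diff_indiv = individual_slope(0.2, tax_rate)
```

In the heterogeneous case, high earners forced down to `z_low` still consume more of good 1 than the original low earners, which is the situation where Assumption 3 fails.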

WHEN ATKINSON-STIGLITZ ASSUMPTIONS FAIL

The thought experiment we just did was: force high earners to work less and earn only as much as low earners. If high earners then consume more of good 1 than low earners, taxing good 1 is desirable.

1) High earners are “different” (since, if left to choose, they choose to work more). If they have a relatively higher/lower taste for good 1 (independently of income), tax good 1 more/less. [indirect tagging] Cigarettes? Fancy wine? How would you see this empirically?

2) High earners now have more leisure. If good 1 is positively related to leisure (consumption of good 1 increases when leisure increases, keeping after-tax income constant), tax it! [tax on holiday trips, subsidy on work-related expenses such as child care]

In general, the Atkinson-Stiglitz assumption is a good starting place for most goods ⇒ Zero-rating some goods under VAT for redistribution is inefficient and administratively burdensome [Mirrlees Review]
REFERENCES (for lectures 2 and 3)

Akerlof, G. “The Economics of Tagging as Applied to the Optimal Income Tax, Welfare Programs, and Manpower Planning”, American Economic Review, Vol. 68, 1978, 8-19. (web)

Atkinson, A.B. and J. Stiglitz “The design of tax structure: Direct versus indirect
taxation”, Journal of Public Economics, Vol. 6, 1976, 55-75. (web)

Besley, T. and S. Coate “Workfare versus Welfare: Incentive Arguments for Work Requirements in Poverty-Alleviation Programs”, American Economic Review, Vol. 82, 1992, 249-261. (web)

Boskin, M. and E. Sheshinski “Optimal tax treatment of the family: Married couples”, Journal of Public Economics, Vol. 20, 1983, 281-297. (web)

Brewer, M., E. Saez, and A. Shephard “Means Testing and Tax Rates on Earnings”,
in The Mirrlees Review: Reforming the Tax System for the 21st Century, Oxford
University Press, 2010. (web)

Cremer, H., F. Gahvari, and N. Ladoux “Externalities and optimal taxation”, Journal
of Public Economics, Vol. 70, 1998, 343-364. (web)

Deaton, A. “Optimal Taxes and the Structure of Preferences”, Econometrica, Vol. 49, 1981, 1245-1260. (web)

Diamond, P. “A many-person Ramsey tax rule”, Journal of Public Economics, Vol. 4, 1975, 335-342. (web)

Diamond, P. “Income Taxation with Fixed Hours of Work”, Journal of Public Economics, Vol. 13, 1980, 101-110. (web)

Diamond, P. “Optimal Income Taxation: An Example with a U-Shaped Pattern of Optimal Marginal Tax Rates”, American Economic Review, Vol. 88, 1998, 83-95. (web) Skim this for historical reasons.

Diamond, P. and E. Saez “From Basic Research to Policy Recommendations: The Case for a Progressive Tax”, Journal of Economic Perspectives, 25(4), 2011, 165-190. (web)

Edgeworth, F. “The Pure Theory of Taxation”, The Economic Journal, Vol. 7, 1897,
550-571. (web)
Kaplow, L. “On the undesirability of commodity taxation even when income taxation
is not optimal”, Journal of Public Economics, Vol. 90, 2006, 1235-1250. (web)

Kaplow, L. The Theory of Taxation and Public Economics. Princeton University Press, 2008.

Kaplow, L. and S. Shavell “Any Non-welfarist Method of Policy Assessment Violates the Pareto Principle”, Journal of Political Economy, 109(2), 2001, 281-286. (web)

Kleven, H., C. Kreiner and E. Saez “The Optimal Income Taxation of Couples”,
Econometrica, Vol. 77, 2009, 537-560. (web)

Laroque, G. “Indirect Taxation is Superfluous under Separability and Taste Homogeneity: A Simple Proof”, Economics Letters, Vol. 87, 2005, 141-144. (web)

Lehmann, E., L. Simula, A. Trannoy “Tax Me if You Can! Optimal Nonlinear Income
Tax between Competing Governments,” Quarterly Journal of Economics 129(4),
2014, 1995-2030. (web)

Mankiw, G. and M. Weinzierl “The Optimal Taxation of Height: A Case Study of
Utilitarian Income Redistribution”, AEJ: Economic Policy, Vol. 2, 2010, 155-176.
(web)

Mirrlees, J. “An Exploration in the Theory of Optimal Income Taxation”, Review of Economic Studies, Vol. 38, 1971, 175-208. (web) Read it for historical reasons, it is not very easy to follow.

Mirrlees, J. “Migration and Optimal Income Taxes”, Journal of Public Economics, Vol. 18, 1982, 319-341. (web)

Nichols, A. and R. Zeckhauser “Targeting Transfers Through Restrictions on Recipients”, American Economic Review, Vol. 72, 1982, 372-377. (web)

Piketty, T. and E. Saez “Optimal Labor Income Taxation”, Handbook of Public Economics, Volume 5, Amsterdam: Elsevier-North Holland, 2013. (web)

Piketty, T., E. Saez, and S. Stantcheva “Optimal Taxation of Top Labor Incomes: A Tale of Three Elasticities”, American Economic Journal: Economic Policy, 6(1), 2014, 230-271. (web)

Sadka, E. “On Income Distribution, Incentive Effects and Optimal Income Taxation”, Review of Economic Studies, Vol. 43, 1976, 261-268. (web)

Saez, E. “Using Elasticities to Derive Optimal Income Tax Rates”, Review of Economic Studies, Vol. 68, 2001, 205-229. (web)

Saez, E. “The Desirability of Commodity Taxation under Non-linear Income Taxation and Heterogeneous Tastes”, Journal of Public Economics, Vol. 83, 2002, 217-230. (web)

Sandmo, A. “Optimal Taxation in the Presence of Externalities”, The Swedish Journal of Economics, Vol. 77, 1975, 86-98. (web)

Seade, J. “On the Shape of Optimal Tax Schedules”, Journal of Public Economics, Vol. 7, 1977, 203-235. (web)

Stiglitz, J. “Self-selection and Pareto Efficient Taxation”, Journal of Public Economics, Vol. 17, 1982, 213-240. (web)
