Catalan Combinatorics
Catalan Combinatorics
1 ε
111
2 211
11
3 21 311
221
22
321
31
32
1
FRANÇOIS BERGERON
Contents
Introduction 1
1. A very brief history 2
2. Dealing with Catalan numbers 3
3. More combinatorial structures counted by Catalan numbers 3
4. Polynomial count 6
Appendix A. Partitions and Diagrams 7
Appendix B. q-Analogs 8
Appendix C. Generalizations 9
Appendix D. Problems 13
References 14
Introduction
The study of the interplay between Catalan combinatorics and abstract algebra has interesting
ties with many subjects including: Computer Science, Algebraic Geometry, Knot Theory, and
Quantum Mechanics. On the combinatorial side, the story involves many classical structures
including triangulation of polygons, binary trees, Dyck paths, partitions contained in a staircase,
well-parenthesized expressions, two-row standard Young tableaux, and pattern avoiding permutations;
to mention but a few. The enumeration of all of these objects involves Catalan numbers 1{pn`1q 2n
` ˘
n .
Several interesting families of polynomials arise when one considers weighted enumerations of the
combinatorial objects considered in this broad subject. As a matter of fact, since they arise in
so many areas, several generalizations of these families of object have been introduced over the
years. On top of this rich combinatorics, ties with abstract algebra also abound. Typically, one
sees the Catalan numbers appear as the dimensions of interesting vector spaces (for more on this,
see [1, 2, 3]).
the Catalan numbers, is the set of partitions3 contained in a staircase partition pn ´ 1, n ´ 2, . . . 2, 1q.
For n “ 4, there are 14 such partitions:
t321, 32, 311, 31, 3, 221, 22, 211, 21, 2, 111, 11, 1, εu,
where ε stands for the empty partition. As we have already mentioned, there are many families
pFn qnPN of “nice” combinatorial structures that are counted by the Catalan numbers: i.e.
ˆ ˙
1 2n
cardpFn q “ , for all n P N.
n`1 n
1For Eugène Charles Catalan (1814–1894).
2Such as Johann Andreas Segner (1704–1777).
3See the relevant section for definitions.
2
For any other family pGn qnPN having the same number of elements, one may certainly find bijections4
between the sets Fn and Gn . However, finding an explicit description of such a bijection is often an
interesting challenge (see Problem 2), which may prove to be very hard in some instances.
When studying sequences of numbers, one looks for interesting ways to describe them. Different
descriptions often add interesting new perspectives. Some classical ways of calculating Catalan
numbers include:
n
ź n`k
Cn “ , (2.1)
k“2
k
n´1
ÿ
“ Ck Cn´1´k , C0 :“ 1, (2.2)
k“0
2 p2n ´ 1q
“ Cn´1 ; (2.3)
n`1
as well as via their generating series
8
ÿ
Cpxq :“ Cn x n (2.4)
0
“ 1 ` x ` 2x2 ` 5x3 ` 14x4 ` 42x5 ` 132x6 ` 429x7 ` 1430x8 ` . . . ,
which is such that
Cpxq “ 1 ` x Cpxq2 ; (2.5)
as may be show using Equation 2.2. It follows that
?
1 ´ x 1 ´ 4x
Cpxq “ . (2.6)
2x
Using this, one may prove that Catalan numbers grow asymptotically as
4n
Cn „ ? . (2.7)
n3{2 π
Equation 2.5 may also be exploited to calculate Cpxq as the limit of “iterations” of f pzq “ z 2 ` x:
f p0q, f pf p0qq, f pf pf p0qqq, . . . f ăną p0q, . . .
This is closely linked to the definition of the Mandelbrot5 set (see Figure 2.1).
In his book Catalan Numbers, Richard Stanley presents 214 different families of structures counted
by Catalan numbers. On top of those already mentioned, some of the best known are discussed
below.
4Recall that two sets have the same number of elements if and only if there exists a bijection between them.
5Benoit Mandelbrot (1924–2010).
3
Fig. 2.1. The Mandelbrot set.
3.1. Dyck paths. A southeast lattice path is a sequence of points pak , bk q P N2 , for 0 ď k ď `,
such that #
pak´1 , bk´1 q ` p0, ´1q or,
pak , bk q “
pak´1 , bk´1 q ` p1, 0q.
The path is said to start at pa0 , b0 q and end at pa` , b` q. A Dyck path of size n is southeast path
going from p0, nq to pn, 0q, that stays below the diagonal joining p0, nq to pn, 0q. In other terms,
for every point on the path, we have ak ` bk ď n. The number of size n Dyck paths is the Catalan
number. The 14 Dyck paths of size 4 appear in Figure 3.1.
$ ,
’
’ /
/
’
’ /
/
’
’ /
/
’ /
, , , , , , , ,
& .
’
’ /
/
’
’ /
/
’
’ /
/
’
% /
-
, , , , , ,
A size n (unlabelled) binary tree is the “shape” of such a tree on a set of n vertices. Informally,
all labels are replaced by a ‚. The number of unlabeled binary tree is the Catalan number. For
instance, Figure 3.2 illustrates the 14 binary trees with 4 nodes, with the root being the topmost
node (in red).
‚ ‚ ‚ ‚ ‚ ‚ ‚ ‚
‚ ‚ ‚ ‚ ‚ ‚ ‚ ‚
‚ ‚ ‚ ‚ ‚ ‚ ‚ ‚
‚ ‚ ‚ ‚ ‚ ‚ ‚ ‚
‚ ‚ ‚ ‚ ‚ ‚
‚ ‚ ‚ ‚ ‚ ‚ ‚ ‚ ‚ ‚
‚ ‚ ‚ ‚ ‚ ‚ ‚ ‚
3.4. Excluded pattern permutations. A permutation of the set rns “ t1, 2, . . . , nu may be
described (in one line notation) as a rearrangement a1 a2 ¨ ¨ ¨ an of the numbers from 1 to n. Hence,
the two sets t1, 2, . . . , nu and ta1 , a2 , . . . , an u are the same. A permutation is said to avoid the
pattern 312, if there does not exist i ă j ă k such that aj ă ak ă ai . The number of permutations
of rns that avoid the pattern 312 is the Catalan number. The 14 such permutations for n “ 4 are:
1234, 1243, 1324, 1342, 1432, 2134, 2143, 2314, 2341, 2431, 3214, 3241, 3421, 4321.
6One in which all vertices are different.
5
This is also true for permutations that avoid any length 3 pattern. However, the number of
permutations that avoid a pattern of length 4 may depend on which permutation of 4 is chosen as a
pattern (see wiki/Permutation pattern).
4. Polynomial count
One may shed more light on specific properties of elements of a family Fn , via a “weighted”
enumeration involving some associated measure, ω : Fn Ñ N, on the combinatorial structures
considered. As it happens, the distribution of this measure may be nicely described via the
polynomial ÿ
Fn pxq :“ xωpf q , (4.1)
f PFn
in a variable x. Clearly, the coefficient of xk
is the number of elements in Fn having ω-measure equal
to k. Setting x “ 1, we evidently get back the cardinality of Fn , i.e. Fn p1q “ cardpFn q. Observe in
particular that7
Fn1 p1q 1 ÿ
“ ωpf q,
Fn p1q cardpFn q f PF
n
gives the mean value of ωpf q, as f runs over all elements of Fn . One may also calculate the
variance of ω in a similar manner (with a formula that also involves the second derivative of Fn pxq).
Let us consider the family Fn of Ferrers’ diagrams (see Appendix A) of partitions contained in
the staircase δn . For instance, we have the cardinality 14 set of Ferrers’ diagrams: For a given
! )
F4 :“ .
, , , , , , , , , , , , , , H.
Appendix B. q-Analogs
The notion of q-analog goes back to the 19th -century. The idea is to replace integers by
polynomials in q, that play a role that is “analogous” to the a corresponding numbers. By
assumption the polynomials have positive integer coefficient, and specialize to the corresponding
number when the variable q is set to 1. One of the aim is to find formulas/identities for the
polynomials obtained, that “mimick” formulas/identities for the corresponding numbers. Note that
a number may have different analogs, depending on the “role” it plays. For instance, the number 6
affords (at least) the two different q-analogs:
‚ q 5 ` q 4 ` q 3 ` q 2 ` q ` 1, and
‚ q 3 ` 2q 2 ` 2q ` 1 “ pq 2 ` q ` 1qpq ` 1q.
where d runs over the divisors of n (different from 1, but including n itself). For instance,
r6sq “ ϕ2 pqqϕ3 pqqϕ6 pqq. (B.2)
As it happens, one may recursively calculate ϕn pqq using Equation B.1, as
rnsq
ϕn pqq “ ś . (B.3)
d | n d“n ϕd pqq
The first few values (which are also already defined in Sage) of the cyclotomic polynomials are:
ϕ2 pqq “ q ` 1,
ϕ3 pqq “ q 2 ` q ` 1,
ϕ4 pqq “ q 2 ` 1,
ϕ5 pqq “ q 4 ` q 3 ` q 2 ` q ` 1,
ϕ6 pqq “ q 2 ´ q ` 1,
ϕ7 pqq “ q 6 ` q 5 ` q 4 ` q 3 ` q 2 ` q ` 1,
ϕ8 pqq “ q 4 ` 1,
ϕ9 pqq “ q 6 ` q 3 ` 1,
ϕ10 pqq “ q 4 ´ q 3 ` q 2 ´ q ` 1.
In particular, one may directly check that Equation B.2 holds. As is apparent here, it follows from
the definition that for n a prime number, the cyclotomic polynomial is simply rnsq itself.
Appendix C. Generalizations
For any pair of positive real numbers pr, sq, X consider\ the partition s
τrs :“ t1 t2 ¨ ¨ ¨ tn having parts equal to tj :“ r ´ j r{s . Then, τrs is
said to be a triangular partition. In other words, the 4-partition
τrs contains all the cells γ “ pi, jq lying below the line joining p0, sq γ
to pr, 0q, hence
i j
pi, jq P τr,s iff ` ă 1. r
r s Fig. C.1. Triangular partition τrs .
Many different pairs pr, sq may give rise to a given 4-partition τ .
Indeed, as illustrated in Figure C.2, we have an equality τ “ τrs “ τr1 s1 when there are no positive
integer coordinate points outside the partition12 between the lines x{r ` y{s “ 1 and x{r1 ` y{s1 “ 1.
Sometimes, but not always, there exist integers m and n in N such τ “ τmn .
The beginning of the sequence of number of triangular partitions of size n is
1, 1, 2, 3, 4, 6, 7, 8, 10, 12, 13, 16, 16, 18, 20, 23, . . .
for m ě 0. Table C.1 displays all triangular partitions of m ď 6.
0, , , , , , , , , , ,
, , , , , , , , , , , , .
n is 1 is of particular interest. We then say that m and n are coprime. The ν-area of a τ -Dyck
path α is the number of cells lying in the skew shape ν{α, hence areaτ pαq :“ |τ | ´ |α|.
‹‹
r
Fig. C.4. A dinv cell
For a τ -Dyck path α, we set dinvτ pαq to be equal to the cardinality of the set of τ -dinv-cells (or
simply dinv-cells) of α:
! )
lpγq
Dinvτ pαq :“ γ P α | apγq`1 ă σ τ ` ε ă lpγq`1
apγq ,
where any division by 0 gives `8. Here ε ą 0 is some small (enough) irrational value, that removes
possible ambiguities in inequalities. The respective dinv-cells of the partitions contained in τ “ 32
are marked by stars in Figure C.5.
C.2. Counting Triangular Dyck Paths. The enumeration of triangular Dyck paths has an
interesting (ongoing) history. The simple counting in the case of any integer pairs pm, nq had
10
‹ ‹ ‹‹ ‹ ‹‹
‹ ‹‹ ‹‹ ‹‹‹ ‹‹‹ ‹‹ ‹
Fig. C.5. Dinv-cells for subpartitions of 32. The first 6 subpartitions have full-dinv.
already been worked out in the 1950s (see [?]). However, it is only rather recently that the overall
enumerative combinatorics community has become aware of that fact. For a long time, only the
“coprime case” was deemed really understood. Sample values for the integral case of pm, nq-Catalan
numbers Cmn are given in Table C.2. These include the classical Fuss-Catalan numbers when
m{n 1 2 3 4 5 6 7
1 1 1 1 1 1 1 1
2 1 2 2 3 3 4 4
3 1 2 5 5 7 12 12
4 1 3 5 14 14 23 30
5 1 3 7 14 42 42 66
6 1 4 12 23 42 132 132
7 1 4 12 30 66 132 429
1
`pk`1qn
˘
m “ kn (or equivalently when m “ kn ` 1): Ckn,n “ Ckn`1,n “ kn`1 n . This formula is
k{n 0 1 2 3 4 5 6 7 8 9
a special case of the following lemma. Indeed, it is a direct extension of the classical “cycling”
argument proving that the number Dyck paths corresponds to the Catalan numbers.
1
`m`n˘
Lemma C.1. For m and n coprime integers, we have Cmn “ m`n n .
For all triangular partitions τ of size at most 10, the number of τ -Dyck paths are given in the
Table C.4. A formula for the non-coprime case of τmn is described in the next subsection.
C.3. Bizley formula. In the general “integral” situation, the enumeration formula of pm ˆ nq-Dyck
paths takes the form of a sum of terms indexed by partitions of the greatest common divisor of
m and n. This is what makes it harder to “guess” a formula13, since the numbers obtained do
not factor nicely in general, even if they are effectively sums of nicely factorized numbers. The
13Most guessing approaches rely (directly or indirectly) on the fact the numbers considered have nice factorization
in small prime numbers.
11
1
2
12 2
3 3
13 21 3
4 5 4
14 212 31 4
5 7 7 5
15 213 22 1 32 41 5
6 9 9 9 9 6
16 214 22 12 321 42 51 6
7 11 12 14 12 11 7
17 215 22 13 3211 421 52 61 7
8 13 15 19 19 15 13 8
18 216 22 14 23 12 322 1 431 53 62 71 8
9 15 18 18 23 23 18 18 15 9
19 217 22 15 23 13 322 12 32 21 432 531 63 72 81 9
10 17 21 22 30 28 28 30 22 21 17 10
110 218 22 16 23 14 322 13 32 212 4321 532 631 73 82 91 p10q
11 19 24 26 37 37 42 37 37 26 24 19 11
where pm, nq “ pad, bdq with a and b coprime, so that d “ gcdpm, nq. It is worth recalling that d!{zµ
is the number of size d permutations of cycle type µ, where
ź
zµ :“ k dk dk !.
k
Observe that, for fixed coprime numbers a and b, all the formulas for pm, nq “ pad, bdq, with 0 ď d,
may be presented in the form of the generating function:
8
ÿ ´ÿ ¯
1 kpa`bq
Cad,bd xd “ exp
` ˘ k
a`b ka x {k . (C.3)
d“1 kě1
C.4. Area enumerator. For any triangular partition τ , the q-area enumerator of τ -Dyck paths, is
ÿ
Cτ pqq :“ q areaτ pαq . (C.4)
αĎτ
Since conjugation is an area-preserving bijection between the set of τ -Dyck paths and the set of
τ 1 -Dyck paths, we clearly have Cτ pqq “ Cτ 1 pqq. The q-area enumerators for all triangular partitions
12
of size at most 6 are as follows (avoiding repetitions for conjugate partitions):
C1 “ q ` 1, C2 “ q 2 ` q ` 1,
C21 “ q 3 ` q 2 ` 2q ` 1, C3 “ q 3 ` q 2 ` q ` 1,
C31 “ q 4 ` q 3 ` 2q 2 ` 2q ` 1, C4 “ q 4 ` q 3 ` q 2 ` q ` 1,
C32 “ q 5 ` q 4 ` 2q 3 ` 2q 2 ` 2q ` 1, C41 “ q 5 ` q 4 ` 2q 3 ` 2q 2 ` 2q ` 1,
C5 “ q 5 ` q 4 ` q 3 ` q 2 ` q ` 1, C321 “ q 6 ` q 5 ` 2q 4 ` 3q 3 ` 3q 2 ` 3q ` 1,
C42 “ q 6 ` q 5 ` 2q 4 ` 2q 3 ` 3q 2 ` 2q ` 1, C51 “ q 6 ` q 5 ` 2q 4 ` 2q 3 ` 2q 2 ` 2q ` 1,
C6 “ q 6 ` q 5 ` q 4 ` q 3 ` q 2 ` q ` 1.
C.5. Counting by Area and Dinv. We now consider enumeration of τ -Dyck paths with respect
to both statistics: “area” and “dinv”, in the triangular case. The resulting pq, tq-polynomials:
ÿ
Cτ pq, tq :“ q areaτ pαq tdinvτ pαq , (C.5)
αĎτ
plays a central role in a wide range of subjects. Finding formulas for these is not easy, and many
properties are still being studied. In the case of τab , with a and b coprime integers, we have the nice
q-analog (obtained via setting t “ 1{q)
„
|τab | 1 a`b
q Cτab pq, 1{qq “ . (C.6)
ra ` bsq a
Recall that, for all k and n we have τkn,n “ τkn`1,n , hence the above also covers such cases.
Appendix D. Problems
Problem 1. Directly prove that at least one of the following family of structures is counted by
Catalan numbers: triangulations of n-gons, subpartitions of the staircase δn , Dyck paths, balanced
words, binary trees, permutations avoiding 321, two equal row standard tableaux.
Problem 2. Give explicit bijections between size n structures of the following Catalan numbers
enumerated structures: triangulations of n-gons, subpartitions of the staircase δn , Dyck paths,
balanced words, binary trees, permutations avoiding 321, two equal row standard tableaux. Conclude
that all of them are counted by the Catalan numbers.
Problem 3. Prove that the recurrence of Equation 2.2 holds, as well as that of Equation 4.3.
Exploiting the fact that two series are equal if and only their coefficients are the same, prove
Equation 2.5 by comparing coefficients on both sides of the equation. Deduce from this Formula (2.6)
(being careful to give an explanation for the sign).
Problem 4. In each case, directly prove that the number of following family of structures satisfies
the recurrence of Equation 2.2: triangulations of n-gons, subpartitions of the staircase δn , Dyck
paths, balanced words, binary trees, permutations avoiding 321, two equal row standard tableaux.
Conclude that all of them are counted by the Catalan numbers.
Problem 5. Assuming that there exist integer coefficient polynomials ϕd pqq such that
ź
qn ´ 1 “ ϕd pqq,
d|n
for all n, check for small values that Bn pqq defined by Equation 4.4 is indeed a polynomial. Using
the values provided for these in Appendix B, also verify that the result is indeed a positive integer
polynomial. Generalize if you can.
14