A (Slightly) Different Proof of Burnside’s Theorem

A pdf of this post is available here.

1. Introduction

Burnside’s theorem, first proved in the early 20th century by William Burnside, shows that a group of order ${p^aq^b}$, where ${p}$ and ${q}$ are primes and ${a, b \in \mathbf{N} \cup \{0\}}$, is solvable. The original proof of Burnside’s theorem utilized representation theory in an essential way. A proof using purely group theoretical tools was given by D. Goldschmidt around 1970. The aim of this note is to provide yet another proof of Burnside’s theorem, via representation theory. First we establish a fundamental definition: A group ${G}$, with identity ${1}$, is solvable, if there exists a chain of subgroups

$\displaystyle \begin{array}{rcl} G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_{k-1} \supseteq G_k = \{1\} \end{array}$

such that ${G_i}$ is normal in ${G_{i-1}}$ and ${G_i/G_{i-1}}$ is abelian for ${i = 1, \cdots, k}$. Such a chain is called an abelian tower. Thus, for instance, all abelian subgroups are solvable. We also get the easy result: If ${P}$ is a ${p}$-group, then ${P}$ is solvable and has a non-trivial center. Proof: Note that ${P}$ acts on itself by conjugation, so by the class equation:

$\displaystyle \begin{array}{rcl} |P| &=& |Z(P)| + \sum (P : \text{Stab}(x)). \end{array}$

where the sum is over distinct conjugacy classes, and ${Z(P)}$ is the center of ${P}$. Because each index ${(P:\text{Stab}(x))}$ is divisible ${p}$ and ${|Z(P)| \geq 1}$, it follows that ${|Z(P)| >1}$. Now, note that a group of order ${p}$ is abelian, so it is solvable. Thus, by induction, ${P/Z(P)}$ is solvable. We can lift any abelian tower in ${P/Z(P)}$ to get an abelian tower in ${P}$, as desired. $\Box$

We fix some notation. If ${\rho : G \rightarrow \text{GL}(V)}$ is a complex representation of ${G}$, then we omit the homomorphism ${\rho}$ and simply write ${V}$ instead. We denote the character of the representation ${V}$ by ${\chi_V}$. If ${g \in G}$, then ${\text{Cent}_G(g)}$, is the centralizer of ${g}$ in ${G}$ and ${C(g)}$ is the conjugacy class of ${g}$ in ${G}$. Note that ${(G : \text{Cent}_G(g)) = |C(g)|}$, by the orbit-stabilizer formula. We write ${\text{Stab}(g)}$ for the stabilizer of ${g}$ in ${G}$.

2. The Proof of Burnside’s Theorem

The proof will proceed by a series of claims. We assume that ${G}$ is non-abelian and ${a > 0}$, otherwise ${G}$ is solvable by section one. We wish to derive a contradiction.

Claim 1 We can, without loss of generality, assume that ${G}$ is simple.

Proof: If ${N}$ is a normal subgroup of ${G}$, then ${N}$ and ${G/N}$ are solvable, by induction, therefore, ${G}$ is solvable. $\Box$

Assuming this, we note that ${Z(G) = 1}$ because ${Z(G)}$ is normal, and ${G}$ is non-abelian. This also implies that every non-trivial representation of ${G}$ is faithful.

Claim 2 Let ${V}$ be an irreducible representation of ${G}$. Then, for all ${g \in G}$

$\displaystyle \begin{array}{rcl} \frac{|C(g)|\chi_V(g)}{\chi_V(1)} \end{array}$

is an algebraic integer.

Proof: Let ${C(g_1), \cdots C(g_k)}$ be the conjugacy classes of ${G}$. Then

$\displaystyle \begin{array}{rcl} \varphi_{g_i} &=& \sum_{g \in C(g_i)} g \end{array}$

is in the center of ${\mathbf{Z}[G]}$ because it is invariant under conjugation. Further, the center of ${\mathbf{Z}[G]}$ is generated by these elements. Indeed, suppose ${ x= \sum_{g\in G} a(g)g}$ is in the center of ${G}$. Write ${x = \sum_{i=1}^k \sum_{g \in C(g_i)} a(g)g}$. Suppose that ${h,h' \in C(g_i)}$, and choose ${y}$ such that ${h' = yh'y^{-1}}$. Then conjugation by ${y}$ fixes ${\sum_{g \in C(g_i)} a(g)g}$, so ${a(h) = a(h')}$. It follows that ${a(g)}$ is a class function and so ${\sum_{g \in C(g_i)} a(g)g = a(C(g_i)) \varphi_{g_i}}$. We have shown that the center of ${\mathbf{Z}[G]}$ is finitely generated over ${G}$, and hence it is integral over ${\mathbf{Z}}$. Note that ${\varphi_{g} : V \rightarrow V}$ is a ${G}$ invariant map. Thus, by Schur’s lemma there exists ${\lambda \in \mathbf{C}}$ such that ${\varphi_{g} = \lambda\text{id}_V}$. Because ${\varphi_{g}}$ is integral over ${G}$, and ${\lambda}$ satisfies the same polynomial over ${\mathbf{Z}}$ that ${\varphi_{g}}$ does, it follows that ${\lambda}$ is integral over ${\mathbf{Z}}$. Thus, ${\lambda \text{dim} V = \text{Tr}(\varphi_{g}) = |C(g)|\chi_V(g)}$ and so ${\lambda = \frac{|C(g)|\chi_{V}(g)}{\chi_V(1)}}$ is integral over ${\mathbf{Z}}$. $\Box$

Claim 3 Let ${V}$ be an irreducible representation of ${G}$. Then ${\chi_V(1) \mid |G|}$.

Proof: We maintain the notation from the previous claim. Note that by the second orthogonality relations:

$\displaystyle \begin{array}{rcl} \sum_{i=1}^k |C(g_i)|\overline{\chi_{V}(C(g_i))}\chi_V(C(g_i)) &=& |G| \end{array}$

If we divide both sides by ${\text{dim} V}$, we get a sum of algebraic integers on the left, and a rational number on the right. Thus, ${\text{dim} V \mid |G|}$. $\Box$

Claim 4 G has no non-trivial 1-dimensional representations.

Proof: If ${V}$ is a non-trivial one 1-dimensional representation, then ${G \hookrightarrow \mathbf{C}^\ast}$, so ${G}$ is abelian. This is a contradiction. $\Box$

Claim 5 For every ${g \in G}$, ${g \neq 1}$, there exists an irreducible representation ${V}$ such that ${\chi_{V}(g) \neq 0}$ and ${\text{dim} V = p^r}$ for some ${r > 0}$.

Proof: Let ${V_1, \cdots, V_k}$ be the irreducible representations of ${G}$, where ${V_1}$ is the trivial representation. By claim 2.3, ${\text{dim}(V_i) \mid |G|}$ for all ${i}$, so ${\chi_{V_i}(1) = p^{c}q^d}$ for some ${c \leq a, b \leq d}$. If ${r_G}$ is the character of the regular representation, then ${r_G(g) = \sum_{i} \chi_{V_i}(1) \chi_{V_i}(g)}$ is ${0}$ if ${g \neq 1}$, and ${|G|}$ otherwise. Now, suppose ${q}$ divides ${\chi_{V_i}(1)}$ for ${i = 2, \cdots, n}$. Then

$\displaystyle \begin{array}{rcl} \sum_{i=2}^k \chi_{V_i}(1) \chi_{V_i}(g) &=& -1 \end{array}$

because ${V_1}$ is the only 1-dimensional representation of ${G}$. This is a contradiction: if we divide both sides by ${q}$, we get an algebraic integer on the left, and a rational number on the right. In particular, it follows that there is an ${i}$ such that ${\chi_{V_i}(1) = p^r}$ for some ${ r > 0}$ and ${\chi_{V_i}(g) \neq 0}$. $\Box$

The following claim holds for any finite group.

Claim 6 If there exists ${g \in G}$ and a representation ${V}$ such that ${|\chi_V(g)| = \chi_V(1)}$, then ${g \in Z(G)}$.

Proof: Because ${g}$ is unitary, it is diagonalizable: ${g = \text{diag}(\lambda_1, \cdots, \lambda_{\chi_V(1)})}$. Note that ${|\lambda_1 + \lambda_2| \leq |\lambda_1| + |\lambda_2|}$ with equality if, and only if, ${\lambda_1 = \lambda_2}$. Thus, because

$\displaystyle \begin{array}{rcl} \chi_V(1) = \left|\sum_{i=1}^{\chi_V(1)}\lambda_i \right| \leq \sum_{i=1}^{\chi_{V}(1)} |\lambda_i|= \chi_{V}(1), \end{array}$

it follows by induction that ${\lambda_i = \lambda_1}$ for all ${i}$. Thus, ${g = \lambda_{1}\text{id}}$, and so ${g}$ is in the center of the image of ${G}$, but ${G}$ is simple so it is isomorphic to its image and hence ${g \in Z(G) }$. $\Box$

By the Sylow theorems, ${G}$ has a ${p}$-Sylow subgroup ${P}$. By proposition 1.2, ${Z(P)}$ is non-trivial. Thus, we can choose ${g \in Z(P)}$. By taking an appropriate power of ${g}$, we can assume that ${\text{ord} (g) = p}$. Note that ${P \subseteq \text{Cent}_G(g)}$, so ${(G : \text{Cent}_G(g)) = |C(g)|}$ is not divisible by ${p}$, i.e. ${|C(g)| = q^k}$ for some ${k \geq 0}$. Let ${V}$ be a representation such that ${\text{dim} V = p^r}$ and ${\chi_{V}(g) \neq 0}$. Note that the the map

$\displaystyle \begin{array}{rcl} G \hookrightarrow \text{GL}(V) \stackrel{\det}{\rightarrow} \mathbf{C}^\ast \end{array}$

is a 1-dimensional representation of ${G}$, thus, it is trivial, so ${\det(h) = 1}$, for all ${h \in G}$. Let ${\lambda_1, \cdots, \lambda_{p^r}}$ be the eigenvalues of ${g}$ listed with multiplicity. Note that ${\lambda_i}$ is a ${p}$-th root of unity for all ${i}$, because ${g^{p} = 1}$ but ${g^i \neq 1}$ for all ${ 1\leq i < p}$. Let ${\lambda_1, \cdots, \lambda_{j_0} = 1}$ and ${\lambda_{j_0+1}, \cdots, \lambda_{p^r} \neq 1}$. By the formula

$\displaystyle \begin{array}{rcl} \prod_{i=j_0}^{p^r} \lambda_i = \det (g) = 1, \end{array}$

each conjugate of ${\lambda_i}$, ${i \geq j_0}$ appear with the constant multiplicity because the product is stable under the action of ${\text{Gal}(\overline{\mathbf{Q}}/ \mathbf{Q})}$, in other words, if ${\lambda_i}$ has multiplicity ${l}$, then each ${\lambda_{i'}}$ has multiplicity ${l}$ for ${i' \geq j_0}$. Therefore, because each ${\lambda_i}$, ${i \geq j_0}$ has ${p-1}$ conjugates, we have ${l(p-1) = p^r - j_0}$. In particular, ${l \equiv j_0 \mod p}$ Recall that for any primitive ${p}$-th root of unity ${\zeta_{p}}$, ${\sum_{i=1}^{p-1} \zeta_{p}^i = -1}$, thus,

$\displaystyle \begin{array}{rcl} \chi_V(g) &=& \sum_{i=1}^{p^r} \lambda_i \\ &=& \sum_{i=1}^{j_0} \lambda_i + \sum_{i=j_0+1}^{p^r} \lambda_{i} \\ &=& j_0 - l \\ &\equiv& 0 \mod p. \end{array}$

By claim 2.2, it follows that ${\frac{|C(g)| \chi_V(g)}{\chi_V(1)}}$ is an algebraic integer, and hence an integer. However, ${|C(g)| = q^k}$, is relatively prime to ${p}$, so ${\frac{\chi_{V}(g)}{\chi_V(1)} \in \mathbf{Z}}$, and ${\chi_V(g) = p^rb}$ for some ${b \in \mathbf{Z}}$, because ${\chi_V(g) \neq 0}$. Now, because

$\displaystyle \begin{array}{rcl} |\chi_V(g)| \leq \sum_{i=1}^{p^r} |\lambda_i| = \chi_V(1) \end{array}$

it follows that ${b = \pm 1}$ and ${|\chi_V(g)| = \chi_V(1)}$. By claim 2.6, it follows that ${g \in Z(G) = \{1\}}$, which is a contradiction. Thus, ${G}$ is solvable.

3. Consequences

There are several key steps in this argument. The most important assumptions are that ${G}$ is simple, and for any ${g \in G}$, there is a prime power dimensional representation ${V}$ such that ${\chi_V(g) \neq 0}$. Retracing the steps of the proof yields the following consequence:

Corollary 1 Suppose ${G}$ is a simple group and ${V}$ is an irreducible representation of ${G}$ of dimension ${p^r}$, for some ${ r > 0}$. Let ${P}$ be a ${p}$-Sylow subgroup of ${G}$. If ${g \in Z(P)}$ of order ${p}$, and ${\chi_V(g) \neq 0}$, then ${G}$ is abelian.

${\Box}$

Parker’s Lemma is Equivalent to Burnside’s Lemma

You can view a pdf of this entry here.

Let ${G}$ be a finite group that acts on a finite set, ${X}$. Given elements ${g \in G}$ and ${x \in X}$, we introduce the cycle notation, ${(xgx\cdots g^{k-1}x)}$ to denote that ${g^kx = x}$, but ${g^jx \neq x}$ for all ${0. We say that ${(xgx\cdots g^{k-1}x)}$ is a ${k}$-cycle in ${g}$. Conceptually, this is a natural construction: the action of ${G}$ on ${X}$ induces a map, ${G \rightarrow S_{X}}$, of ${G}$ into the symmetric group of ${X}$. This allows us to decompose the image of any element ${g \in G}$ into disjoint cycles.

Let the expression ${c_{k}(g)}$ denote the number of ${k}$-cycles of ${g}$. Let ${X^{(k,G)}}$ be the set of all ${k}$-cycles contained in elements of ${G}$. In particular, for ${k = 1}$, ${X^{(1, G)} = \{(x) : x\in X\}}$ because ${1}$-cycles are fixed points of elements of ${G}$, and ${1\in G}$ fixes every element of ${X}$. Note that if ${X^{(k, G)}}$ is nonempty, then ${G}$ acts on ${X^{(k, G)}}$ by conjugation: if ${c = (xhx \cdots h^{k-1}x)}$, then ${g\cdot c = (gx ghx \cdots gh^{k-1}x)}$ is a cycle in ${ghg^{-1}}$, so ${g\cdot c \in X^{(k, G)}}$. Parker’s lemma is the following:

Lemma 1 (Parker’s Lemma) The number of orbits of ${G}$ on ${X^{(k,G)}}$ is

$\displaystyle \begin{array}{rcl} \left|X^{(k, G)}/G\right| &=& \frac{1}{|G|} \sum_{g \in G} kc_k(g). \end{array}$

At first glance, this lemma appears to be a straightforward application of Burnside’s orbit counting lemma:

$\displaystyle \begin{array}{rcl} \left| X/G\right| &=& \frac{1}{|G|} \sum_{g \in G} \text{Fix}_{X}(g) \end{array}$

In fact, the case ${k = 1}$ is Burnside’s lemma: the number, ${c_1(g)}$, is the number of ${1}$-cycles of ${g}$, i.e. ${\text{Fix}_{X}(g)}$. However, for ${k \geq 2}$, if ${X^{(k, G)}}$ is nonempty, then ${kc_{k}(1) = 0 \neq \text{Fix}_{X^{(k, G)}}(1) = \left|X^{(k, G)}\right|}$. Thus, we cannot simply equate terms: ${\text{Fix}_{X^{(k, G)}}(g) \neq kc_k(g)}$, in general. The difficulties described above suggest that Parker’s lemma is a stronger statement than Burnside’s lemma. In fact, I am unaware of any proof in the literature that deduces Parker’s lemma as a consequence of Burnside’s Lemma. The aim of this post is to establish Parker’s lemma as a simple consequence of Burnside’s lemma.

Proof: We assume that ${X^{(k, G)}}$ is nonempty, otherwise the result is trivial. We note that by Burnside’s lemma

$\displaystyle \begin{array}{rcl} \left|X^{(k, G)}/G\right| &=& \frac{1}{|G|} \sum_{g \in G} \text{Fix}_{X^{(k, G)}} (g). \end{array}$

Define subsets ${Z_1, Z_2 \subseteq G\times X^{(k, G)}}$ by

$\displaystyle \begin{array}{rcl} Z_1 &=& \{ (g, c) \in G \times X^{(k, G)} : g \cdot c = c \} \\ Z_2 &=& \{ (g, c) \in G \times X^{(k, G)} : c \text{ is a cycle in } g \}. \end{array}$

For each ${c \in X^{(k, G)}}$, choose ${g_c}$ such that ${(g_c, c) \in Z_2}$. Let ${H_c}$ be the pointwise stabilizer of ${c}$. Then ${g_cH_c}$ is the set of elements of ${G}$ that contain ${c}$, and ${\langle g_c\rangle H_c}$ is the stabilizer of ${c}$. Note that ${g_c^i H_c \neq g_c^j H_c}$ for ${0 < i,j \leq k}$, with ${i \neq j}$. Indeed, if ${g_c^i = g_c^j h}$, where ${h \in H_c}$, then ${g_c^{i-j} = h}$ is disjoint from ${c}$, i.e. ${i-j \equiv 0 \mod k}$, which is a contradiction. Thus, because ${g_c^k \in H_c}$, we see that ${\text{Stab}(c) = \langle g_c \rangle H_c = \bigcup_{i = 1}^k g_c^i H_c}$ is a partition of ${\text{Stab}(c)}$. Therefore, ${|\text{Stab}(c)| = k |g_c H_c|}$, and hence

$\displaystyle \begin{array}{rcl} |Z_1| &=& |\bigcup_{c} \text{Stab}(c) \times \{c\}| \\ &=& \sum_{c} k\left| g_c H_c\right| \\ &=& k |Z_2|, \end{array}$

because ${ Z_2 = \bigcup_{c} \left(g_c H_c \times \{c\}\right)}$. Now, we count the order of ${Z_1}$ and ${Z_2}$ by counting points along the fiber of the canonical projection

$\displaystyle \begin{array}{rcl} \pi_G : G \times X^{(k, G)} &\rightarrow& G. \end{array}$

Indeed,

$\displaystyle \begin{array}{rcl} |Z_1| &=& \sum_{g \in G} \left|\pi^{-1}_G (g) \cap Z_1\right| = \sum_{g \in G} \text{Fix}_{X^{(k, G)}}; \\ |Z_2| &=& \sum_{g \in G} \left|\pi^{-1}_G (g) \cap Z_2\right| = \sum_{g \in G} c_k(g). \end{array}$

Thus,

$\displaystyle \begin{array}{rcl} \left|X^{(k, G)}/G\right| &=& \frac{1}{|G|} \sum_{g \in G} \text{Fix}_{X^{(k, G)}} (g) = \frac{|Z_1|}{|G|} = \frac{k|Z_2|}{|G|} = \frac{1}{|G|} \sum_{g \in G} k c_k(g), \end{array}$

as desired.

${\Box}$