A Note On Sheafification

You can view a pdf of this entry here.

A presheaf {\mathcal{F}} on a topological space {X} with values in a category {\mathcal{D}} (usually with values in GrpCommRing, or Set), can be associated to a sheaf {\widetilde{\mathcal{F}}} on {X} in a natural way. In fact, this association is left adjoint to the inclusion functor from the category of sheaves on {X} to the category of presheaves on {X}.

The standard construction of {\widetilde{\mathcal{F}}} is fairly straightforward: consider the topology on the disjoint union of stalks {\text{Tot}(\mathcal{F}) = \bigcup_{x \in X} \mathcal{F}_x} generated by the collection of sets

\displaystyle \begin{array}{rcl} \mathfrak{V}(b, U) &=& \{ (b_x, x) : x \in U\}, \end{array}

where {U \subseteq X} is open, {b\in \mathcal{F}(U)} and {b_x} is the image of {b} in {\mathcal{F}_x}. Let {p : \text{Tot}(\mathcal{F}) \rightarrow X} be the natural projection. For every {U \subseteq X} define {\widetilde{\mathcal{F}}(U)} to be the set of continuous sections {s : U \rightarrow \text{Tot}(\mathcal{F})} of {p}, i.e. {p\circ s = \text{id}_U}. It follows that {\widetilde{\mathcal{F}}} is a sheaf and there is a natural morphism {i : \mathcal{F} \rightarrow \widetilde{\mathcal{F}}}, such that {\mathcal{F}} is a sheaf if, and only if, {i} is an isomorphism, (this is a fairly easy exercise).

This shows that every sheaf arises as the sheaf of sections of some continuous map onto {X}. This also justifies the terminology: an element {a \in \mathcal{F}(U)} is called a section. At first glance, it is hard to see how this sheaf relates to {\mathcal{F}}. Indeed, the canonical map {i} fails to be either surjective or injective, in general. This says that we cannot “complete” {\mathcal{F}} by simply adding more elements, or by glueing together elements; we must do both. Thus, we can can’t view {\mathcal{F}} as a subpresheaf of {\widetilde{\mathcal{F}}} or as a cover. However, there is another natural way to think of {\widetilde{\mathcal{F}}}.

Let {\mathcal{G}} be a sheaf on {X} with values in some category {\mathcal{D}}, as above. One of the advantages of having a sheaf on {X} is that a section {s \in \mathcal{G}(U)} is uniquely determined by the collection of elements {\{s_x \in \mathcal{G}_x : x \in U\}}. This can fail for presheaves. Thus, we need {i(U): \mathcal{F}(U) \rightarrow \widetilde{\mathcal{F}}(U)} to glue together elements {a, b \in \mathcal{F}(U)} such that {a_x = b_x} for all {x \in U}. This is evident in the commutative diagram:

\displaystyle \begin{array}{rcl} \begin{array}{ccc} \mathcal{F}(U) & \longrightarrow & \widetilde{\mathcal{F}}(U) \\ \downarrow & \; & \downarrow \\ \mathcal{F}_x & \longrightarrow & \widetilde{\mathcal{F}}_x \end{array} \end{array}

This says that {i(U)(a)} and {i(U)(b)} have the same image in {\widetilde{\mathcal{F}}_x} for all {x \in U}. Thus, because {\widetilde{\mathcal{F}}} is a sheaf, {i(U)(a) = i(U)(b)}.

Aside from glueing sections together that “should” be equal, we also need to add more elements to our presheaf {\mathcal{F}}. Suppose we have an open cover of an open subset {U \subseteq X}: {U = \bigcup_\alpha U_\alpha}. If we can find a collection {\{b_\alpha\}, b_\alpha \in U_\alpha}, such that {b_\alpha|_{U_\alpha \cap U_\beta} = b_{\beta}|_{U_\alpha \cap U_\beta}}, then we should be able to find a “global” section {b} on {U} such that {b|_{U_\alpha} = b_\alpha}. Because our presheaf may be too “small” and may not contain this element, we need to ensure that our sheafification {\widetilde{\mathcal{F}}} does. The set {\{b_\alpha\}} is compatible in the obvious sense, so we can glue them together to form a section {\hat{b} \in \widetilde{\mathcal{F}}(U), \hat{b} : x \mapsto (b_{\alpha})_x}. This map is continuous because {\hat{b}^{-1}(\mathfrak{V}(a, W)) = \{x \in W \cap U : \hat{b}(x) = a_x\}}. is open. Indeed, if {a_x = b(x) = (b_\alpha)_x} for some {x}, then there is an open subsets {W' \subseteq W \cap U} such that {x\in W'} and {a|_{W'} = (b_\alpha)|_{W'}} in {\mathcal{F}(W')}, so {W' \subseteq \hat{b}^{-1}(\mathfrak{V}(a, W))}. It is also clear that {p \circ \hat{b} = \text{id}_U}.

To sum up our construction, we note the following: to construct {\widetilde{\mathcal{F}}} we find the most “efficient” sheaf such that

  1. sections that have the same germ at every point are equal
  2. compatible sets of sections have a common extension.

This gives the following alternative description of {\widetilde{\mathcal{F}}}:

\displaystyle \begin{array}{rcl} \widetilde{\mathcal{F}}(U) &=& \{ \{(b_{\alpha}, W_\alpha)\}_{\alpha} : U = \bigcup_{\alpha} W_\alpha, b_\alpha \in \mathcal{F}(W_{\alpha}), (b_\alpha)_x = (b_\beta)_x \text{ for all } x \in W_\alpha \cap W_\beta\} /\sim \end{array}

where {\{(b_\alpha, W_\alpha)\}_\alpha \sim \{(a_\beta, V_{\beta})\}_\beta}, if for all {\alpha, \beta} and {x \in W_\alpha \cap V_{\beta}}, {(b_\alpha)_x = (a_\beta)_x}. Note that condition {2} and the assumption that this extension is unique is just the sheaf axiom and, hence, it implies {1}.

We claim that this definition of {\widetilde{\mathcal{F}}} is equivalent to the definition given above. Indeed, we can view each collection {\{(b_\alpha, W_\alpha)\}_\alpha} as a section {\hat{b}} in the natural way: {\hat{b}(x) = (b_\alpha)_x} if {x \in W_\alpha}. This is independent of the representative of {\{(b_\alpha, W_\alpha)\}_\alpha} by {\sim}. Note that {\hat{b}(x)} is continuous, as we showed earlier. Further, any section {s : U \rightarrow \text{Tot}(\mathcal{F})} arises in this way. Indeed, let {x \in U}, then {s(x) \in \mathfrak{V}(b^x, W_x)} for a neighborhood {W_x} of {x} and an element {b^x \in \mathcal{F}(W_x)}. Thus, because {s} is continuous there exists {W_x' \subseteq s^{-1}( \mathfrak{V}(b^x, W_x)) = \{ y \in W \cap U : s(y) = (b^x)_y\}}. Consider the collection {b = \{(b^x, W_x')\}_{x\in U}}. Because {(b^x)_y = s(y) = (b^z)_y} for all {y \in W_x' \cap W_z'}, it follows that {b} corresponds to {s}. This shows that the correspondence is surjective. We claim that this correspondence is also injective. Indeed, if {\{(b_\alpha, W_\alpha)\}_\alpha} and {\{(a_\beta, V_\beta)\}_\beta} correspond to the same section {s}, then {(b_\alpha)_x = s(x) = (a_\beta)_x} for all {x \in W_\alpha \cap V_\beta}. Thus, {\{(b_\alpha, W_\alpha)\}_\alpha = \{(a_\beta, V_\beta)\}_\beta} Finally we note that this correspondence is actually a morphism of presheaves: Restricting a collection {\{(b_\alpha, W_\alpha)\}_\alpha \in \widetilde{\mathcal{F}}(U)} to a subset {U'\subseteq U}, yields the element {\{(b_\alpha|_{U'}, W_\alpha\cap U')\}_{\alpha}} and this corresponds to the restricted section {\hat{b}|_{U'}}.

This new definition has some utility. It helps to illuminate the following claims:

Claim 1 Let {i : \mathcal{F} \rightarrow \widetilde{\mathcal{F}}} be the natural map: {i(U)(b) = \{(b, U)\}}. Then {\mathcal{F}} is a sheaf if, and only if, {i} is an isomorphism.

Proof: If {\mathcal{F}} is a sheaf then each collection {\{(b_{\alpha}, W_\alpha)\}_{\alpha} \in \widetilde{\mathcal{F}}(U)} can be written uniquely as {\{(b, U)\}} for some {b \in \mathcal{F}(U)}. Conversely, suppose {i} is an isomorphism, let {U \subseteq X} be an open subset, let {U = \bigcup W_\alpha}, and let {\{b_\alpha : b_\alpha \in \mathcal{F}(W_\alpha)\}} be a collection such that {b_{\alpha}|_{W_\alpha \cap W_\beta} = b_{\beta}|_{W_\alpha \cap W_\beta}}. Then {(b_{\alpha})_x = (b_\beta)_x} for all {x \in W_\alpha \cap W_\beta}, so {\{(b_\alpha, W_\alpha)\}_\alpha \in \widetilde{\mathcal{F}}(U)}. Thus, there exists a unique element {b \in \mathcal{F}(U)} such that {i(U)(b)= \{(b_\alpha, W_\alpha)\}_\alpha}. Therefore, for each {\alpha}, {b_x = (b_\alpha)_x} for all {x \in W_\alpha}. In particular, for each fixed {\alpha}, {i(W_\alpha)(b|_{W_\alpha}) = \{(b|_{W_\alpha}, W_\alpha)\} = \{(b_\alpha, W_\alpha)\} = i(W_\alpha)(b_\alpha)}, whence, {b|_{W_\alpha} = b_\alpha}, as desired. \Box

Now, suppose that {\mathcal{G}} is a sheaf on {X}. Let {j : \mathcal{F} \rightarrow \mathcal{G}} be a morphism of presheaves.

Claim 2 There exists a unique map {\widetilde{j} : \widetilde{\mathcal{F}} \rightarrow \mathcal{G}}, such that {\widetilde{j} \circ i = j}.

Proof: Let {U \subseteq X} be an open subset and let {\{(b_\alpha, W_\alpha)\}_\alpha \in \widetilde{\mathcal{F}}(U)}. Observe that the commutative diagrams

\displaystyle \begin{array}{rcl} \begin{array}{ccc} \mathcal{F}(W_\alpha) & \stackrel{j(W_\alpha)}{\longrightarrow} & \mathcal{G}(W_\alpha) \\ \downarrow & \; & \downarrow \\ \mathcal{F}(W_\alpha\cap W_\beta) & \stackrel{j(W_\alpha\cap W_\beta)}{\longrightarrow} & \mathcal{G}(W_\alpha\cap W_\beta) \\ \downarrow & \; & \downarrow \\ \mathcal{F}_x & \longrightarrow & \mathcal{G}_x \end{array} \end{array}

ensure that {j(W_\alpha)(b_\alpha)_x = j(W_\beta)(b_\beta)_x} for all {x \in W_\alpha \cap W_\beta}. Thus, {j(W_\alpha)(b_\alpha)|_{W_\alpha \cap W_\beta} = j(W_\beta)(b_\beta)|_{W_\alpha \cap W_\beta}} and so by the sheaf axiom for {\mathcal{G}}, there is a unique extension {\widetilde{j}(U)(\{(b_\alpha, W_\alpha)\}_\alpha) \in \mathcal{G}(U)}. For convenience, we can view {\mathcal{G}} as {\widetilde{\mathcal{G}}}, with {\widetilde{j}(U)(\{(b_\alpha, W_\alpha)\}_\alpha) =\{( j(U)(b_\alpha), W_\alpha)\}_\alpha}. Now, if {U' \subseteq U} is an open set, then

\displaystyle \begin{array}{rcl} \widetilde{j}(U')( \{(b_\alpha, W_\alpha)\}_\alpha|_{U'}) &=& \widetilde{j}(U')(\{(b_\alpha|_{U'}, W_\alpha\cap U')\}_\alpha) \\ &=& \{(j(U')(b_\alpha|_{U'}), W_\alpha\cap U')\}_\alpha \\ &=& \{(j(U)(b_\alpha)|_{U'}, W_\alpha\cap U')\}_\alpha \\ &=& \{(j(U)(b_\alpha), W_\alpha )\}_\alpha|_{U'} \\ &=& \widetilde{j}(U) (\{(b_\alpha, W_\alpha)\})|_{U'}. \end{array}

Thus, {\widetilde{j}} is a morphism {\widetilde{j} : \widetilde{\mathcal{F}} \rightarrow \mathcal{G}}. From the definition of {\widetilde{j}}, it is clear that {\widetilde{j} \circ i = j} and {\widetilde{j}} is unique. \Box

A (Slightly) Different Proof of Burnside’s Theorem

A pdf of this post is available here.

1. Introduction

Burnside’s theorem, first proved in the early 20th century by William Burnside, shows that a group of order {p^aq^b}, where {p} and {q} are primes and {a, b \in \mathbf{N} \cup \{0\}}, is solvable. The original proof of Burnside’s theorem utilized representation theory in an essential way. A proof using purely group theoretical tools was given by D. Goldschmidt around 1970. The aim of this note is to provide yet another proof of Burnside’s theorem, via representation theory. First we establish a fundamental definition: A group {G}, with identity {1}, is solvable, if there exists a chain of subgroups

\displaystyle \begin{array}{rcl} G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_{k-1} \supseteq G_k = \{1\} \end{array}

such that {G_i} is normal in {G_{i-1}} and {G_i/G_{i-1}} is abelian for {i = 1, \cdots, k}. Such a chain is called an abelian tower. Thus, for instance, all abelian subgroups are solvable. We also get the easy result: If {P} is a {p}-group, then {P} is solvable and has a non-trivial center. Proof: Note that {P} acts on itself by conjugation, so by the class equation:

\displaystyle \begin{array}{rcl} |P| &=& |Z(P)| + \sum (P : \text{Stab}(x)). \end{array}

where the sum is over distinct conjugacy classes, and {Z(P)} is the center of {P}. Because each index {(P:\text{Stab}(x))} is divisible {p} and {|Z(P)| \geq 1}, it follows that {|Z(P)| >1}. Now, note that a group of order {p} is abelian, so it is solvable. Thus, by induction, {P/Z(P)} is solvable. We can lift any abelian tower in {P/Z(P)} to get an abelian tower in {P}, as desired. \Box

We fix some notation. If {\rho : G \rightarrow \text{GL}(V)} is a complex representation of {G}, then we omit the homomorphism {\rho} and simply write {V} instead. We denote the character of the representation {V} by {\chi_V}. If {g \in G}, then {\text{Cent}_G(g)}, is the centralizer of {g} in {G} and {C(g)} is the conjugacy class of {g} in {G}. Note that {(G : \text{Cent}_G(g)) = |C(g)|}, by the orbit-stabilizer formula. We write {\text{Stab}(g)} for the stabilizer of {g} in {G}.

2. The Proof of Burnside’s Theorem

The proof will proceed by a series of claims. We assume that {G} is non-abelian and {a > 0}, otherwise {G} is solvable by section one. We wish to derive a contradiction.

Claim 1 We can, without loss of generality, assume that {G} is simple.

Proof: If {N} is a normal subgroup of {G}, then {N} and {G/N} are solvable, by induction, therefore, {G} is solvable. \Box

Assuming this, we note that {Z(G) = 1} because {Z(G)} is normal, and {G} is non-abelian. This also implies that every non-trivial representation of {G} is faithful.

Claim 2 Let {V} be an irreducible representation of {G}. Then, for all {g \in G}

\displaystyle \begin{array}{rcl} \frac{|C(g)|\chi_V(g)}{\chi_V(1)} \end{array}

is an algebraic integer.

Proof: Let {C(g_1), \cdots C(g_k)} be the conjugacy classes of {G}. Then

\displaystyle \begin{array}{rcl} \varphi_{g_i} &=& \sum_{g \in C(g_i)} g \end{array}

is in the center of {\mathbf{Z}[G]} because it is invariant under conjugation. Further, the center of {\mathbf{Z}[G]} is generated by these elements. Indeed, suppose { x= \sum_{g\in G} a(g)g} is in the center of {G}. Write {x = \sum_{i=1}^k \sum_{g \in C(g_i)} a(g)g}. Suppose that {h,h' \in C(g_i)}, and choose {y} such that {h' = yh'y^{-1}}. Then conjugation by {y} fixes {\sum_{g \in C(g_i)} a(g)g}, so {a(h) = a(h')}. It follows that {a(g)} is a class function and so {\sum_{g \in C(g_i)} a(g)g = a(C(g_i)) \varphi_{g_i}}. We have shown that the center of {\mathbf{Z}[G]} is finitely generated over {G}, and hence it is integral over {\mathbf{Z}}. Note that {\varphi_{g} : V \rightarrow V} is a {G} invariant map. Thus, by Schur’s lemma there exists {\lambda \in \mathbf{C}} such that {\varphi_{g} = \lambda\text{id}_V}. Because {\varphi_{g}} is integral over {G}, and {\lambda} satisfies the same polynomial over {\mathbf{Z}} that {\varphi_{g}} does, it follows that {\lambda} is integral over {\mathbf{Z}}. Thus, {\lambda \text{dim} V = \text{Tr}(\varphi_{g}) = |C(g)|\chi_V(g)} and so {\lambda = \frac{|C(g)|\chi_{V}(g)}{\chi_V(1)}} is integral over {\mathbf{Z}}. \Box

Claim 3 Let {V} be an irreducible representation of {G}. Then {\chi_V(1) \mid |G|}.

Proof: We maintain the notation from the previous claim. Note that by the second orthogonality relations:

\displaystyle \begin{array}{rcl} \sum_{i=1}^k |C(g_i)|\overline{\chi_{V}(C(g_i))}\chi_V(C(g_i)) &=& |G| \end{array}

If we divide both sides by {\text{dim} V}, we get a sum of algebraic integers on the left, and a rational number on the right. Thus, {\text{dim} V \mid |G|}. \Box

Claim 4 G has no non-trivial 1-dimensional representations.

Proof: If {V} is a non-trivial one 1-dimensional representation, then {G \hookrightarrow \mathbf{C}^\ast}, so {G} is abelian. This is a contradiction. \Box

Claim 5 For every {g \in G}, {g \neq 1}, there exists an irreducible representation {V} such that {\chi_{V}(g) \neq 0} and {\text{dim} V = p^r} for some {r > 0}.

Proof: Let {V_1, \cdots, V_k} be the irreducible representations of {G}, where {V_1} is the trivial representation. By claim 2.3, {\text{dim}(V_i) \mid |G|} for all {i}, so {\chi_{V_i}(1) = p^{c}q^d} for some {c \leq a, b \leq d}. If {r_G} is the character of the regular representation, then {r_G(g) = \sum_{i} \chi_{V_i}(1) \chi_{V_i}(g)} is {0} if {g \neq 1}, and {|G|} otherwise. Now, suppose {q} divides {\chi_{V_i}(1)} for {i = 2, \cdots, n}. Then

\displaystyle \begin{array}{rcl} \sum_{i=2}^k \chi_{V_i}(1) \chi_{V_i}(g) &=& -1 \end{array}

because {V_1} is the only 1-dimensional representation of {G}. This is a contradiction: if we divide both sides by {q}, we get an algebraic integer on the left, and a rational number on the right. In particular, it follows that there is an {i} such that {\chi_{V_i}(1) = p^r} for some { r > 0} and {\chi_{V_i}(g) \neq 0}. \Box

The following claim holds for any finite group.

Claim 6 If there exists {g \in G} and a representation {V} such that {|\chi_V(g)| = \chi_V(1)}, then {g \in Z(G)}.

Proof: Because {g} is unitary, it is diagonalizable: {g = \text{diag}(\lambda_1, \cdots, \lambda_{\chi_V(1)})}. Note that {|\lambda_1 + \lambda_2| \leq |\lambda_1| + |\lambda_2|} with equality if, and only if, {\lambda_1 = \lambda_2}. Thus, because

\displaystyle \begin{array}{rcl} \chi_V(1) = \left|\sum_{i=1}^{\chi_V(1)}\lambda_i \right| \leq \sum_{i=1}^{\chi_{V}(1)} |\lambda_i|= \chi_{V}(1), \end{array}

it follows by induction that {\lambda_i = \lambda_1} for all {i}. Thus, {g = \lambda_{1}\text{id}}, and so {g} is in the center of the image of {G}, but {G} is simple so it is isomorphic to its image and hence {g \in Z(G) }. \Box

By the Sylow theorems, {G} has a {p}-Sylow subgroup {P}. By proposition 1.2, {Z(P)} is non-trivial. Thus, we can choose {g \in Z(P)}. By taking an appropriate power of {g}, we can assume that {\text{ord} (g) = p}. Note that {P \subseteq \text{Cent}_G(g)}, so {(G : \text{Cent}_G(g)) = |C(g)|} is not divisible by {p}, i.e. {|C(g)| = q^k} for some {k \geq 0}. Let {V} be a representation such that {\text{dim} V = p^r} and {\chi_{V}(g) \neq 0}. Note that the the map

\displaystyle \begin{array}{rcl} G \hookrightarrow \text{GL}(V) \stackrel{\det}{\rightarrow} \mathbf{C}^\ast \end{array}

is a 1-dimensional representation of {G}, thus, it is trivial, so {\det(h) = 1}, for all {h \in G}. Let {\lambda_1, \cdots, \lambda_{p^r}} be the eigenvalues of {g} listed with multiplicity. Note that {\lambda_i} is a {p}-th root of unity for all {i}, because {g^{p} = 1} but {g^i \neq 1} for all { 1\leq i < p}. Let {\lambda_1, \cdots, \lambda_{j_0} = 1} and {\lambda_{j_0+1}, \cdots, \lambda_{p^r} \neq 1}. By the formula

\displaystyle \begin{array}{rcl} \prod_{i=j_0}^{p^r} \lambda_i = \det (g) = 1, \end{array}

each conjugate of {\lambda_i}, {i \geq j_0} appear with the constant multiplicity because the product is stable under the action of {\text{Gal}(\overline{\mathbf{Q}}/ \mathbf{Q})}, in other words, if {\lambda_i} has multiplicity {l}, then each {\lambda_{i'}} has multiplicity {l} for {i' \geq j_0}. Therefore, because each {\lambda_i}, {i \geq j_0} has {p-1} conjugates, we have {l(p-1) = p^r - j_0}. In particular, {l \equiv j_0 \mod p} Recall that for any primitive {p}-th root of unity {\zeta_{p}}, {\sum_{i=1}^{p-1} \zeta_{p}^i = -1}, thus,

\displaystyle \begin{array}{rcl} \chi_V(g) &=& \sum_{i=1}^{p^r} \lambda_i \\ &=& \sum_{i=1}^{j_0} \lambda_i + \sum_{i=j_0+1}^{p^r} \lambda_{i} \\ &=& j_0 - l \\ &\equiv& 0 \mod p. \end{array}

By claim 2.2, it follows that {\frac{|C(g)| \chi_V(g)}{\chi_V(1)}} is an algebraic integer, and hence an integer. However, {|C(g)| = q^k}, is relatively prime to {p}, so {\frac{\chi_{V}(g)}{\chi_V(1)} \in \mathbf{Z}}, and {\chi_V(g) = p^rb} for some {b \in \mathbf{Z}}, because {\chi_V(g) \neq 0}. Now, because

\displaystyle \begin{array}{rcl} |\chi_V(g)| \leq \sum_{i=1}^{p^r} |\lambda_i| = \chi_V(1) \end{array}

it follows that {b = \pm 1} and {|\chi_V(g)| = \chi_V(1)}. By claim 2.6, it follows that {g \in Z(G) = \{1\}}, which is a contradiction. Thus, {G} is solvable.

3. Consequences

There are several key steps in this argument. The most important assumptions are that {G} is simple, and for any {g \in G}, there is a prime power dimensional representation {V} such that {\chi_V(g) \neq 0}. Retracing the steps of the proof yields the following consequence:

Corollary 1 Suppose {G} is a simple group and {V} is an irreducible representation of {G} of dimension {p^r}, for some { r > 0}. Let {P} be a {p}-Sylow subgroup of {G}. If {g \in Z(P)} of order {p}, and {\chi_V(g) \neq 0}, then {G} is abelian.


Parker’s Lemma is Equivalent to Burnside’s Lemma

You can view a pdf of this entry here.

Let {G} be a finite group that acts on a finite set, {X}. Given elements {g \in G} and {x \in X}, we introduce the cycle notation, {(xgx\cdots g^{k-1}x)} to denote that {g^kx = x}, but {g^jx \neq x} for all {0<j < k}. We say that {(xgx\cdots g^{k-1}x)} is a {k}-cycle in {g}. Conceptually, this is a natural construction: the action of {G} on {X} induces a map, {G \rightarrow S_{X}}, of {G} into the symmetric group of {X}. This allows us to decompose the image of any element {g \in G} into disjoint cycles.

Let the expression {c_{k}(g)} denote the number of {k}-cycles of {g}. Let {X^{(k,G)}} be the set of all {k}-cycles contained in elements of {G}. In particular, for {k = 1}, {X^{(1, G)} = \{(x) : x\in X\}} because {1}-cycles are fixed points of elements of {G}, and {1\in G} fixes every element of {X}. Note that if {X^{(k, G)}} is nonempty, then {G} acts on {X^{(k, G)}} by conjugation: if {c = (xhx \cdots h^{k-1}x)}, then {g\cdot c = (gx ghx \cdots gh^{k-1}x)} is a cycle in {ghg^{-1}}, so {g\cdot c \in X^{(k, G)}}. Parker’s lemma is the following:

Lemma 1 (Parker’s Lemma) The number of orbits of {G} on {X^{(k,G)}} is

\displaystyle \begin{array}{rcl} \left|X^{(k, G)}/G\right| &=& \frac{1}{|G|} \sum_{g \in G} kc_k(g). \end{array}

At first glance, this lemma appears to be a straightforward application of Burnside’s orbit counting lemma:

\displaystyle \begin{array}{rcl} \left| X/G\right| &=& \frac{1}{|G|} \sum_{g \in G} \text{Fix}_{X}(g) \end{array}

In fact, the case {k = 1} is Burnside’s lemma: the number, {c_1(g)}, is the number of {1}-cycles of {g}, i.e. {\text{Fix}_{X}(g)}. However, for {k \geq 2}, if {X^{(k, G)}} is nonempty, then {kc_{k}(1) = 0 \neq \text{Fix}_{X^{(k, G)}}(1) = \left|X^{(k, G)}\right|}. Thus, we cannot simply equate terms: {\text{Fix}_{X^{(k, G)}}(g) \neq kc_k(g)}, in general. The difficulties described above suggest that Parker’s lemma is a stronger statement than Burnside’s lemma. In fact, I am unaware of any proof in the literature that deduces Parker’s lemma as a consequence of Burnside’s Lemma. The aim of this post is to establish Parker’s lemma as a simple consequence of Burnside’s lemma.

Proof: We assume that {X^{(k, G)}} is nonempty, otherwise the result is trivial. We note that by Burnside’s lemma

\displaystyle \begin{array}{rcl} \left|X^{(k, G)}/G\right| &=& \frac{1}{|G|} \sum_{g \in G} \text{Fix}_{X^{(k, G)}} (g). \end{array}

Define subsets {Z_1, Z_2 \subseteq G\times X^{(k, G)}} by

\displaystyle \begin{array}{rcl} Z_1 &=& \{ (g, c) \in G \times X^{(k, G)} : g \cdot c = c \} \\ Z_2 &=& \{ (g, c) \in G \times X^{(k, G)} : c \text{ is a cycle in } g \}. \end{array}

For each {c \in X^{(k, G)}}, choose {g_c} such that {(g_c, c) \in Z_2}. Let {H_c} be the pointwise stabilizer of {c}. Then {g_cH_c} is the set of elements of {G} that contain {c}, and {\langle g_c\rangle H_c} is the stabilizer of {c}. Note that {g_c^i H_c \neq g_c^j H_c} for {0 < i,j \leq k}, with {i \neq j}. Indeed, if {g_c^i = g_c^j h}, where {h \in H_c}, then {g_c^{i-j} = h} is disjoint from {c}, i.e. {i-j \equiv 0 \mod k}, which is a contradiction. Thus, because {g_c^k \in H_c}, we see that {\text{Stab}(c) = \langle g_c \rangle H_c = \bigcup_{i = 1}^k g_c^i H_c} is a partition of {\text{Stab}(c)}. Therefore, {|\text{Stab}(c)| = k |g_c H_c|}, and hence

\displaystyle \begin{array}{rcl} |Z_1| &=& |\bigcup_{c} \text{Stab}(c) \times \{c\}| \\ &=& \sum_{c} k\left| g_c H_c\right| \\ &=& k |Z_2|, \end{array}

because { Z_2 = \bigcup_{c} \left(g_c H_c \times \{c\}\right)}. Now, we count the order of {Z_1} and {Z_2} by counting points along the fiber of the canonical projection

\displaystyle \begin{array}{rcl} \pi_G : G \times X^{(k, G)} &\rightarrow& G. \end{array}


\displaystyle \begin{array}{rcl} |Z_1| &=& \sum_{g \in G} \left|\pi^{-1}_G (g) \cap Z_1\right| = \sum_{g \in G} \text{Fix}_{X^{(k, G)}}; \\ |Z_2| &=& \sum_{g \in G} \left|\pi^{-1}_G (g) \cap Z_2\right| = \sum_{g \in G} c_k(g). \end{array}


\displaystyle \begin{array}{rcl} \left|X^{(k, G)}/G\right| &=& \frac{1}{|G|} \sum_{g \in G} \text{Fix}_{X^{(k, G)}} (g) = \frac{|Z_1|}{|G|} = \frac{k|Z_2|}{|G|} = \frac{1}{|G|} \sum_{g \in G} k c_k(g), \end{array}

as desired.


A Probabilistic Approach to Group Commutativity

A few weeks ago I went to a talk by Igor Pak. The talk discussed methods of generating random elements in finite groups. During the talk, Professor Pak mentioned an interesting probabilistic result:

Let {G} be a finite group. The probability that two random group elements commute is equal to {\frac{k}{n}}, where {k} is the number of conjugacy classes in {G}.

The proof is surprisingly straight forward. Indeed, we are trying to calculate the following:

\displaystyle \begin{array}{rcl} \text{\bf{Pr}}(xy = yx) &=& \frac{\#\{(x, y) \in G \times G : xy = yx\}}{(\# G)^2}. \end{array}


{C = \{(x, y) \in G \times G : xy = yx\}}.

For each {x \in G}, the number of {y \in G} such that {(x,y) \in G} is equal to the cardinality of the centralizer of {x}: {\#C_G(x)}. Thus,

{\#C = \sum_{x \in G} \#C_G(x)}.

If {x} and {x'} are conjugate, then {C_g(x)} and {C_g(x')} are conjugate. Thus,

{\#C= \sum_{x_i} \#C_G(x_i)(G : C_G(x_i)) = \sum_{x_i}|G|},

where the sum is over distinct conjugacy classes. Therefore, {\#C = k\cdot n} where {k} is the number of conjugacy classes in {G}, as desired.

When {G} is non-abelian, it’s easy to show that {\frac{k}{n} \leq \frac{5}{8}}, with equality if, and only if, the index of the center of {G} is {4}. See the following paper for some interesting exercises, and the continuous version of this theorem: http://www.jstor.org/pss/2318778

Universality and Characteristic Polynomials

Today I’m going to describe a simple, yet powerful, trick to deduce specific problems from a “universal” case. The construction is common in algebra; given an object {O} and a property {P} that we would like to deduce about this object, we construct a certain “universal” object {U} such that

  1. {U} has property {P};
  2. there exists a surjective homomorphism {\varphi : U \rightarrow  O} that “preserves” {P}.

In order to justify this abstraction, it should be easier to show that {U} satisfies {P} than it is to show that {O} satisfies {P}. However, {U} should not be a “trivial” object that vacuously satisfies {P}. Instead, {U} should have several “nice” properties that {O} does not have, but it should not have properties that you feel are irrelevant to the current problem. Terry tao discusses a similar trick in analysis (trick 10) in his “problem solving strategies,” post.

One of my favorite examples of this strategy is the following standard qual problem:

Problem 1 Let {R} be a commutative ring with unity. Let {M} and {N} be {n\times n} matrices with entries in {R}. Then {MN} and {NM} have the same characteristic polynomial.

The first thing we notice is that if {M} or {N} is invertible, the problem is easy. Indeed, if {M} is invertible, then {MN =  MNMM^{-1}} so

\displaystyle   \begin{array}{rcl}  \det(I X- MN) &=& \det(IX-MNMM^{-1}) \\  &=& \det(MM^{-1}X - M\left(NM\right)M^{-1}) \\ &=&  \det(M)\det(IX - NM)\det(M)^{-1} \\ &=& \det(IX - NM).  \end{array}

This observation shows that a possible candidate for a universal space {U}, should contain a lot of invertible matrices. This may not be the case for {R} because it may have many zero divisors and few units. Even if {M} has a non-zero determinant, it might not be invertible. Another possible reduction could involve working over a field. Indeed, in this case, a proof involving row and column operations exists, but it is tedious.

A clever way of proceeding, is to consider the following ring:

\displaystyle  \begin{array}{rcl}  U &:=& \mathbf{Z}[X_{ij},  Y_{kl} : i,j, k, l \in \{1, \cdots, n\}], \end{array}

where {X_{ij}} and {Y_{lk}} are indeterminates. Define the matrices {M' = (X_{ij})_{i,j \in  \{1, \cdots, n\}}} and {N' = (Y_{kl})_{k, l \in  \{1, \cdots, n\}}}. We consider the the quotient field

\displaystyle  \begin{array}{rcl}  K = \text{quot}(U) =  \mathbf{Q}(X_{ij}, Y_{kl} : i,j, k, l \in \{1, \cdots, n\}) \end{array}

Observe that over {K}, {\det(M')} and {\det(N')} are non-zero. Hence, {M'} is invertible. Our remark before shows that the characteristic polynomial of {M'N'} is the characteristic polynomial of {N'M'}. Moreover, the characteristic polynomial of {M'N'} has integer coefficients by the definition of the determinant. Thus, the remark is almost trivially true over {U}. Further, we can define a map

\displaystyle  \begin{array}{rcl}  \varphi : U &\rightarrow& R  \\ X_{ij} &\mapsto& a_{ij} \\ Y_{kl} &\mapsto& b_{kl} \\  1 &\mapsto& 1, \end{array}

where {M = (a_{ij})_{i, j \in \{1, \cdots, n\}}} and {N = (b_{kl})_{k, l \in \{1, \cdots, n\}}}. Because the coefficients of {P(X)}, the characteristic polynomial of {M'N' }, has coefficients that are polynomials in the entries of {M'N'}, with a bit of thought, we see that the characteristic polynomial of {MN} is equal to {P^\varphi}. The same statement holds for {N'M'} and {NM}. Thus, {MN} and {NM} have the same characteristic polynomial.

Because {\mathbf{Z}} is the initial object in the category of commutative rings with unity, we can always define such a map. If {R} does not contain an identity element, is it possible to apply this proof to that case as well? The map {\varphi} is not so well defined in this case. However, this technique exposes this problem for what it really is, (symbol manipulation), so it appears that the result should continue to hold.