Introduction

These are the lecture notes and class materials for Math 817 Introduction to Modern Algebra I in Fall 2025. This is the first of a two-part course on groups, rings, modules, and fields. In this first half, we will discuss group theory, including group actions, and introduce rings. A major goal of this course is to prepare graduate students for the PhD qualifying exam in algebra.

The lecture notes draw heavily on Eloísa Grifo’s Algebra Notes, which in turn draw from earlier lecture notes of Mark Walker and Alexandra Seceleanu. The textbook Abstract Algebra by Dummit and Foote is a good resource covering similar material.

Groups

1. An Introduction to Groups

This class has four major topics: Groups, Rings, Modules, and Fields. Let us begin with group theory.

A group is a basic algebraic structure that appears in many objects we care about, yet carries enough structure that we can deduce general statements and theorems about it.

1.1 Definitions and first examples

Definition 1.1.1 (Binary operation)

A binary operation on a set \(S\) is a function \(S \times S \to S\). If the binary operation is denoted by \(\cdot\), we write \(x \cdot y\) for the image of \((x,y)\).

Remark 1.1.2

We often write \(xy\) instead of \(x \cdot y\) if the operation is clear from context.

Remark 1.1.3 ("Closed under \(\cdot\)")

We say that a set \(S\) is closed under the operation \(\cdot\) when we want to emphasize that for any \(x,y\in S\) the result \(xy\) belongs to \(S\). Closure is part of the definition of a binary operation and is implicitly assumed.

Definition 1.1.4 (Group)

A group is a set \(G\) with a binary operation \(\cdot\) (group multiplication) such that:

  • Associativity: For all \(x,y,z \in G\), \((x\cdot y)\cdot z = x\cdot(y\cdot z)\).
  • Identity element: There exists \(e\in G\) with \(e\cdot x = x = x\cdot e\) for all \(x\in G\).
  • Inverses: For each \(x\in G\) there is \(y\in G\) with \(xy=e=yx\).

The element \(e\) is the identity. For each \(x\in G\), an element \(y\) with \(xy=e=yx\) is an inverse of \(x\). The order of \(G\) is the number of elements in \(G\).

Example 1.1.5 (General linear group)

\[ GL_n(\mathbb{R}) := \{\, \text{invertible } n\times n \text{ matrices with entries in } \mathbb{R}\,\}. \]

This is a group under matrix multiplication: associativity holds, the identity matrix is the identity, and every element has an inverse by definition.

Vaguely, the definition of a group is motivated by the idea that a collection of invertible functions from a set to itself that preserve some extra structure naturally satisfies the three axioms: for example, the general linear group consists of the invertible functions from the vector space \(\mathbb{R}^n\) to itself that preserve the linear structure.

Remark 1.1.6 (Naming groups)

Although a group is the set and the operation, we often refer to it by the underlying set \(G\).

Remark 1.1.7 (Semigroups and monoids)

A set with an associative binary operation is a semigroup; if it has an identity, it is a monoid. While we will not study non-group monoids in this course, they’re useful objects.

Lemma 1.1.8 (Identity and inverses are unique)

  1. The identity \(e\) is unique.
  2. For each \(x\in G\), the inverse of \(x\) is unique.

Proof of Lemma 1.1.8

If \(e\) and \(e'\) are both identities, then \(e = ee' = e'\), which proves the first claim.

Now given \(x \in G\), suppose \(y\) and \(z\) are two inverses for \(x\), meaning that \(yx = xy = e\) and \(zx = xz = e\). Then

\[ \begin{aligned} z & = ez &&& \textrm{since $e$ is the identity}\\ & = (yx)z &&& \textrm{since $y$ is an inverse for $x$}\\ & = y(xz) &&& \textrm{by associativity}\\ & = ye &&& \textrm{since $z$ is an inverse for $x$}\\ & = y &&& \textrm{since $e$ is the identity}. \quad \square \end{aligned} \]

Remark 1.1.9

The same argument shows the identity in a monoid is unique (no inverses needed).

Notation 1.1.10

For \(x\in G\), write \(x^{-1}\) for its unique inverse.

Remark 1.1.11 (Left/right inverses in monoids)

In a monoid, an element might have left or right inverses (possibly several). If it has both a left and a right inverse, they are unique and equal (by adapting the uniqueness proof).

Exercise 1.1.12

Give an example of a monoid \(M\) and an element with a left inverse but not a right inverse.

Definition 1.1.13 (Powers)

For \(x\in G\) and integer \(n\ge 1\), define \(x^n\) as the product of \(x\) with itself \(n\) times.

Exercise 1.1.14 (Properties of group elements)

  1. If \(xy=xz\), then \(y=z\).
  2. If \(yx=zx\), then \(y=z\).
  3. \((x^{-1})^{-1}=x\).
  4. \((a_1\cdots a_n)^{-1}=a_n^{-1}\cdots a_1^{-1}\).
  5. \((x^{-1}yx)^n=x^{-1}y^n x\) for \(n\ge 1\).
  6. \((x^{-1})^n=(x^n)^{-1}\).

Notation 1.1.15

For \(n>0\), we define negative powers by the rule \(x^{-n}:= (x^n)^{-1}\). Note that \(x^{-n}=(x^{-1})^n\).

Exercise 1.1.16

Show that \(x^a x^b = x^{a+b}\) for all integers \(a,b\).

Definition 1.1.17 (Abelian group)

A group \(G\) is abelian if \(xy=yx\) for all \(x,y\in G\).

For abelian groups, we often write the operation as “\(+\)”, identity as \(0\), and inverse as \(-x\).

Example 1.1.18

The following are examples of abelian groups.
  • The trivial group \(\{e\}\).
  • \((\mathbb{Z},+), (\mathbb{Q},+), (\mathbb{R},+), (\mathbb{C},+)\).
  • \((\mathbb{Z}/n, +)\) (addition modulo \(n\)).
  • Let \(F\) be a field. If you are unfamiliar with fields, it is good enough for now to keep in mind the following examples: the real numbers \(\mathbb{R}\), the complex numbers \(\mathbb{C}\), and modular rings \(\mathbb{Z}/(p)\) for a prime number \(p\). Then \(F^\times = F\setminus\{0\}\) under multiplication.

Example 1.1.19 (\(GL_n(F)\))

For a field \(F\), \(\displaystyle GL_n(F) = \{\text{invertible } n\times n \text{ matrices over } F\}\). If an \(n\times n\) matrix has a left inverse, it also has a right inverse (and vice-versa). For \(n\ge 2\), \(GL_n(F)\) is generally nonabelian; \(GL_1(F) = F^\times\).
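That \(GL_n(F)\) is nonabelian for \(n \geqslant 2\) can be checked by a single concrete computation. The snippet below is an illustrative sketch (the helper `mat_mul` is ad hoc, not from any library): it exhibits two invertible \(2\times 2\) matrices over \(\mathbb{Q}\) that do not commute.

```python
# A quick sanity check that GL_2 is nonabelian: two invertible 2x2
# matrices (as nested tuples of rows) whose products differ.

def mat_mul(A, B):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

A = ((1, 1), (0, 1))  # a shear; det = 1, so invertible
B = ((0, 1), (1, 0))  # swaps the two coordinates; det = -1

print(mat_mul(A, B))  # ((1, 1), (1, 0))
print(mat_mul(B, A))  # ((0, 1), (1, 1))
assert mat_mul(A, B) != mat_mul(B, A)
```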

Definition 1.1.20 (Center)

The center of a group is the subset \[ \mathcal{Z}(G) := \{\, x\in G \mid xy=yx \text{ for all } y\in G \,\}. \]

Remark 1.1.21

The center always contains the identity. If \(\mathcal{Z}(G)=\{e_G\}\), the center is said to be trivial.

\(G\) is abelian if and only if \(\mathcal{Z}(G)=G\).

Informal Definition 1.1.22 (Presentation)

A presentation specifies a group via generators and relations:

\[ G = \langle \text{generators} \mid \text{relations} \rangle . \]

A set \(S\) generates \(G\) if every element of \(G\) can be expressed as a finite product of elements of \(S\) and their inverses. A relation is an identity among such words.


We will return to presentations later in the class when we can give a proper definition.

Remark 1.1.23

We only take products of finitely many generators and their inverses; infinite products are not defined here.

Example 1.1.24

The group \(\mathbb{Z}\) has one generator \(1\) and no relations.

Example 1.1.25

The group of integers modulo \( n \) has presentation \[ \mathbb{Z}/n \cong \langle x \mid x^n = e \rangle . \]

Definition 1.1.26 (Cyclic / finitely generated)

A group is cyclic if generated by one element. It is finitely generated if generated by finitely many elements.

Example 1.1.27

\(\mathbb{Z}\) and \(\mathbb{Z}/n\) are cyclic.

Exercise 1.1.28

Prove that every cyclic group is abelian.

Exercise 1.1.29

Prove that \((\mathbb{Q}, +)\) and \(\operatorname{GL}_2(\mathbb{Z}/2)\) are not cyclic.

There is no algorithm that, given any group presentation as input, can decide whether the group is actually the trivial group with just one element. In fact, there exists a presentation with finitely many generators and finitely many relations such that whether or not the resulting group is trivial is independent of the standard axioms of mathematics! Nonetheless, finding and working with presentations of groups is a crucial technique in group theory.

1.2 Permutation groups

Definition 1.2.1 (Permutation group)

For any set \(X\), the permutation group on \(X\) is the set \(\mathrm{Perm}(X)\) of all bijective functions from \(X\) to itself equipped with the binary operation given by composition of functions.

Notation 1.2.2

For an integer \(n \geqslant 1\), we write \([n] := \{1,\ldots,n\}\) and \(S_n := \mathrm{Perm}([n])\). An element of \(S_n\) is called a permutation on \(n\) symbols, sometimes also called a permutation on \(n\) letters or \(n\) elements. The group \(S_n\) is also called the symmetric group on \(n\) symbols.

We can write an element \(\sigma\) of \(S_n\) as a table of values:

\[ \begin{array}{c||c|c|c|c|c} i & 1 & 2 & 3 & \cdots & n \\ \hline \sigma(i) & \sigma(1) & \sigma(2) & \sigma(3) & \cdots & \sigma(n) \\ \end{array} \]

We may also represent this using arrows, as follows:

\[ \begin{array}{rcl} 1 & \mapsto & \sigma(1) \\ 2 & \mapsto & \sigma(2) \\ \vdots & & \vdots \\ n & \mapsto & \sigma(n) \end{array} \]

Remark 1.2.3

To count the elements \(\sigma \in S_n\), note that

  • there are \(n\) choices for \(\sigma(1)\);
  • once \(\sigma(1)\) has been chosen, we have \(n-1\) choices for \(\sigma(2)\);
\(\vdots\)
  • once \(\sigma(1), \ldots, \sigma(n-1)\) have been chosen, there is a unique possible value for \(\sigma(n)\).

Thus the group \(S_n\) has \(n!\) elements.
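For small \(n\), this count can be confirmed directly on a computer; the following is a purely illustrative check using the Python standard library, realizing each \(\sigma \in S_n\) as the tuple \((\sigma(1), \ldots, \sigma(n))\).

```python
# Verify Remark 1.2.3 for small n: S_n has exactly n! elements.
from itertools import permutations
from math import factorial

for n in range(1, 7):
    # permutations() enumerates all bijections of {1, ..., n} as tuples
    assert len(list(permutations(range(1, n + 1)))) == factorial(n)
print("verified |S_n| = n! for n = 1, ..., 6")
```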

It is customary to use cycle notation for permutations.

Definition 1.2.4 (Cycle, \(m\)-cycle)

If \(i_1, \dots, i_m\) are distinct integers between \(1\) and \(n\), then \(\sigma=(i_1 \, i_2 \, \cdots i_m)\) denotes the element of \(S_n\) determined by

\[ \sigma(i_1)=i_2, \quad \sigma(i_2)=i_3, \quad \ldots, \quad \sigma(i_{m-1})=i_m, \quad \sigma(i_m)=i_1, \]

and which fixes all elements of \([n] \setminus \{i_1, \dots, i_m\}\):

\[ \sigma(j) = j \quad \text{for all } j \in [n] \setminus \{i_1, \dots, i_m\}. \]

Such a permutation is called a cycle or an \(m\)-cycle. In particular, we say that \(\sigma\) has length \(m\).

Remark 1.2.5

A 1-cycle is the identity permutation.

Notation 1.2.6

A 2-cycle is often called a transposition.

Remark 1.2.7

The cycles \((i_1 \cdots i_m)\) and \((j_1 \cdots j_m)\) represent the same permutation if and only if one list is a cyclic rearrangement of the other. For example, \((1 \, 2 \, 3) = (2 \, 3 \, 1)\) but \((1 \, 2 \, 3) \neq (2 \, 1 \, 3)\).

Remark 1.2.8

For \(\sigma = (i_1 \ldots i_m)\), any integer \(k\) gives

\[ \sigma^k(i_j) = i_{\,j+k \pmod{m}}. \]

Here we interpret \(j+k \!\pmod{m}\) to denote the unique integer \(1 \leqslant s \leqslant m\) such that

\[ s \equiv j+k \pmod m. \]

Notation 1.2.9

The product of cycles \((i_1 \cdots i_s)\) and \((j_1 \cdots j_t)\) is written \((i_1 \cdots i_s)(j_1 \cdots j_t)\), composed right-to-left.

Example 1.2.10

We claim that the permutation group \(\mathrm{Perm}(X)\) is nonabelian whenever the set \(X\) has \(3\) or more elements. Indeed, given three distinct elements \(x, y, z \in X\), consider the transpositions \((xy)\) and \((yz)\), and the products \((yz)(xy)\) and \((xy)(yz)\), where the composition is read from right to left, as with function composition. Then

\[ \begin{array}{c|ccc} (yz)(xy): & x \mapsto z & y \mapsto x & z \mapsto y \\ (xy)(yz): & x \mapsto y & y \mapsto z & z \mapsto x \end{array} \]

Note that \((yz)(xy) \neq (xy)(yz)\), since for example the first one takes \(x\) to \(z\) while the second one takes \(x\) to \(y\).
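The table above can be replayed in Python. The dictionaries and the helper `compose` below are ad hoc choices for illustration, taking \(x, y, z = 1, 2, 3\) inside \(S_3\).

```python
# Verifying Example 1.2.10 concretely: permutations as dicts,
# composed right-to-left like functions.

def compose(f, g):
    """Right-to-left composition: (f . g)(i) = f(g(i))."""
    return {i: f[g[i]] for i in g}

xy = {1: 2, 2: 1, 3: 3}  # the transposition (x y) = (1 2)
yz = {1: 1, 2: 3, 3: 2}  # the transposition (y z) = (2 3)

print(compose(yz, xy))  # {1: 3, 2: 1, 3: 2}, i.e. x -> z, y -> x, z -> y
print(compose(xy, yz))  # {1: 2, 2: 3, 3: 1}, i.e. x -> y, y -> z, z -> x
assert compose(yz, xy) != compose(xy, yz)
```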

Lemma 1.2.11 (Disjoint cycles commute)

If \(\sigma_1 = (i_1 \, \cdots \, i_m)\) and \(\sigma_2 = (j_1 \, \cdots \, j_k)\) are cycles in \(S_n\) involving disjoint sets of elements, meaning \(\{i_1, \dots, i_m\} \cap \{j_1, \dots, j_k\} = \varnothing\), then \(\sigma_1 \sigma_2 = \sigma_2 \sigma_1\).

Proof of Lemma 1.2.11

We need to show \(\sigma_1(\sigma_2(l)) = \sigma_2(\sigma_1(l))\) for all \(l \in [n]\). If \(l \notin \{i_1, \ldots, i_m, j_1, \dots, j_k\}\), then \(\sigma_1(l) = l = \sigma_2(l)\), so

\[ \sigma_1(\sigma_2(l)) = \sigma_1(l) = l \qquad \textrm{and} \qquad \sigma_2(\sigma_1(l)) = \sigma_2(l) = l. \]

If \(l \in \{j_1, \dots, j_k\}\), then \(\sigma_2(l) \in \{j_1, \dots, j_k\}\) and hence, since the subsets are disjoint, \(l\) and \(\sigma_2(l)\) are not in the set \(\{i_1, i_2, \dots, i_m\}\). It follows that \(\sigma_1\) fixes both \(l\) and \(\sigma_2(l)\), and thus

\[ \sigma_1(\sigma_2(l)) = \sigma_2(l) \quad \textrm{and} \quad \sigma_2(\sigma_1(l)) = \sigma_2(l). \]

The case when \(l \in \{i_1, \dots, i_m\}\) is analogous. \(\square\)

Theorem 1.2.12 (Products of disjoint cycles)

Every \(\sigma\in S_n\) can be written uniquely (up to order) as a product of disjoint cycles.

Remark 1.2.13

For the uniqueness part of the Theorem, one needs to establish a convention regarding 1-cycles: we need to decide whether the 1-cycles will be recorded. If we decide not to record \(1\)-cycles, this gives the shorter version of our factorization into cycles. If all the 1-cycles are recorded, this gives a longer version of our factorization, but this option has the advantage that it makes it clear what the size \(n\) of our group \(S_n\) is.

We will follow the first convention: we will write only \(m\)-cycles with \(m \geqslant 2\). Under this convention, the identity element of \(S_n\) is the empty product of disjoint cycles. We will, however, sometimes denote the identity by \((1)\) for convenience.

Proof of Theorem 1.2.12

Fix a permutation \(\sigma\). The key idea is to look at the orbits of \(\sigma\): for each \(x \in [n]\), its orbit under \(\sigma\) is the subset of \([n]\) given by

\[ O_x=\{ \sigma(x), \sigma^2(x), \sigma^3(x), \ldots \} = \{\sigma^i(x) \mid i \geqslant 1 \}. \]

Notice that the orbits of two elements \(x\) and \(y\) are either the same orbit, which happens precisely when \(y \in O_x\), or disjoint. Since \([n]\) is a finite set and \(\sigma\) is a bijection of \([n]\), we will eventually have \(\sigma^i(x) = \sigma^j(x)\) for some \(j > i\); applying \(\sigma^{-i}\) to both sides gives

\[ \sigma^{j-i}(x) = \sigma^0(x) = x. \]

Thus we can find the smallest positive integer \(n_x\) such that \(\sigma^{n_x}(x)=x\). Now for each \(x \in [n]\), we consider the cycle

\[ \tau_x = (\sigma(x) \,\, \sigma^2(x) \,\, \sigma^3(x) \, \cdots \, \sigma^{n_x}(x)). \]

Now let \(S\) be a set of indices for the distinct \(\tau_x\), excluding those \(\tau_x\) that are \(1\)-cycles. We claim that we can factor \(\sigma\) as

\[ \sigma=\prod_{i\in S}\tau_i. \]

To show this, consider any \(x \in [n]\). If \(\sigma(x) = x\), then no \(\tau_i\) moves \(x\), and both sides fix \(x\). Otherwise, \(x\) must be of the form \(\sigma^j(i)\) for some \(i \in S\), given that our choice of \(S\) was exhaustive. On the right-hand side, only \(\tau_i\) moves \(x\), and indeed by definition of \(\tau_i\) we have

\[ \tau_i(x) = \sigma^{j+1}(i) = \sigma(\sigma^j(i)) = \sigma(x). \]

This proves that

\[ \sigma=\prod_{i\in S}\tau_i. \]

As for uniqueness, note that if \(\sigma = \tau_1 \cdots \tau_s\) is a product of disjoint cycles, then each \(x \in [n]\) is moved by at most one of the cycles \(\tau_i\), since the cycles are all disjoint. Fix \(i\) such that \(\tau_i\) moves \(x\). We claim that

\[ \tau_i = (\sigma(x) \,\, \sigma^2(x) \,\,\sigma^3(x) \, \cdots \, \sigma^{n_x}(x)) = \tau_x. \]

This will show that our product of disjoint cycles giving \(\sigma\) is the same (unique) product we constructed above. To do this, note that we do know that there is some integer \(m\) such that \(\tau_i^{m}(x) = x\), and

\[ \tau_i = (\tau_i(x) \,\, \tau_i^2(x) \,\, \tau_i^3(x) \, \cdots \, \tau_i^{m}(x)). \]

Thus we need only to prove that

\[ \tau_i^k(x) = \sigma^k(x) \]

for all integers \(k \geqslant 1\). Now by Lemma 1.2.11, disjoint cycles commute, and thus for each integer \(k \geqslant 1\) we have

\[ \sigma^k = \tau_1^k \cdots \tau_s^k. \]

But \(\tau_j\) fixes \(x\) whenever \(j \neq i\), so

\[ \sigma^k(x) = \tau_i^k (x). \]

We conclude that the integer \(n_x\) we defined before is the length of the cycle \(\tau_i\), and that

\[ \tau_i = (x \, \tau_i(x) \, \tau_i^2(x) \cdots \tau_i^{n_x-1}(x)) = (x \, \sigma(x) \, \sigma^2(x) \cdots \sigma^{n_x-1}(x)). \]

Thus this decomposition of \(\sigma\) as a product of disjoint cycles is the same decomposition we described above. \(\square\)

Example 1.2.14

For \(\sigma\in S_5\) given by

\[ \begin{array}{rcl} 1 & \mapsto & 3 \\ 2 & \mapsto & 4 \\ 3 & \mapsto & 5 \\ 4 & \mapsto & 2 \\ 5 & \mapsto & 1 \end{array} \]
its decomposition is \((135)(24)\).
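The existence part of the proof is effectively an algorithm: walk the orbit of each not-yet-visited element. The sketch below is an illustrative implementation (the name `cycle_decomposition` and the dict representation are ad hoc), omitting \(1\)-cycles per our convention.

```python
# Disjoint cycle decomposition, following the proof of Theorem 1.2.12:
# for each unvisited x, follow sigma until the orbit closes up.

def cycle_decomposition(sigma):
    """sigma: dict mapping [n] to [n]; returns a list of cycles as tuples."""
    cycles, seen = [], set()
    for x in sorted(sigma):
        if x in seen:
            continue
        orbit = [x]
        seen.add(x)
        y = sigma[x]
        while y != x:          # walk the orbit of x under sigma
            orbit.append(y)
            seen.add(y)
            y = sigma[y]
        if len(orbit) > 1:     # skip 1-cycles, per our convention
            cycles.append(tuple(orbit))
    return cycles

# The permutation from Example 1.2.14:
sigma = {1: 3, 2: 4, 3: 5, 4: 2, 5: 1}
print(cycle_decomposition(sigma))  # [(1, 3, 5), (2, 4)]
```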

Definition 1.2.15 (Cycle type)

The cycle type of \(\sigma\) is the unordered list of lengths of cycles in its disjoint decomposition.

Example 1.2.16

\((3\,4)(1\,5)(2\,6\,7)(9\,8\,11)(15\,16\,17\,105\,114)\) in \(S_{156}\) has cycle type \(2,2,3,3,5\).

Exercise 1.2.17

Show \((i_1 \, i_2 \, \cdots \, i_p) = (i_1 \, i_2) (i_2 \, i_3) \cdots (i_{p-2} \, i_{p-1}) (i_{p-1} \, i_p)\).

Corollary 1.2.18

\(S_n\) is generated by transpositions.

Proof of Corollary 1.2.18

Given any permutation, we can decompose it as a product of disjoint cycles by Theorem 1.2.12. Thus it suffices to show that each cycle can be written as a product of transpositions. For a cycle \((i_1 \, i_2 \, \cdots \, i_p)\), one can show that

\[ (i_1 \, i_2 \, \cdots \, i_p) = (i_1 \, i_2)(i_2 \, i_3)\cdots(i_{p-2} \, i_{p-1})(i_{p-1} \, i_p), \]

which we leave as an exercise.

Remark 1.2.19

Note however that when we write a permutation as a product of transpositions, such a product is no longer necessarily unique.

Example 1.2.20

If \(n \geqslant 2\), the identity in \(S_n\) can be written as \( (1 2) (1 2) \). In fact, any transposition is its own inverse, so we can write the identity as \( (i j)(i j) \) for any \(i \neq j\).

Exercise 1.2.21

Show \((cd)(ab) = (ab)(cd)\) and \((bc)(ab) = (ac)(bc)\) for distinct \(a,b,c,d\).
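These two identities can be sanity-checked by brute force for sample values (not a proof, which must cover all distinct \(a, b, c, d\), but a useful check); the helpers below are ad hoc, with \(a, b, c, d = 1, 2, 3, 4\) in \(S_4\) and right-to-left composition.

```python
# Check the two rewriting rules of Exercise 1.2.21 inside S_4.

def transposition(i, j, n=4):
    """The transposition (i j) in S_n, as a dict."""
    f = {k: k for k in range(1, n + 1)}
    f[i], f[j] = j, i
    return f

def compose(f, g):
    """Right-to-left composition: (f . g)(i) = f(g(i))."""
    return {i: f[g[i]] for i in g}

a, b, c, d = 1, 2, 3, 4
# (cd)(ab) = (ab)(cd): disjoint transpositions commute
assert compose(transposition(c, d), transposition(a, b)) == \
       compose(transposition(a, b), transposition(c, d))
# (bc)(ab) = (ac)(bc)
assert compose(transposition(b, c), transposition(a, b)) == \
       compose(transposition(a, c), transposition(b, c))
print("both identities hold")
```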

Theorem 1.2.22

Given a permutation \(\sigma \in S_n\), the parity (even vs. odd) of the number of transpositions in any representation of \(\sigma\) as a product of transpositions depends only on \(\sigma\).

Proof of Theorem 1.2.22

Suppose that \(\sigma\) is a permutation that can be written as a product of transpositions \(\beta_i\) and \(\lambda_j\) in two ways,

\[ \sigma = \beta_1 \cdots \beta_s = \lambda_1 \cdots \lambda_t \]

where \(s\) is even and \(t\) is odd. As we noted in Exercise 1.2.21, every transposition is its own inverse, so we conclude that

\[ e_{S_n} = \beta_1 \cdots \beta_s \lambda_t \cdots \lambda_1, \]

which is a product of \(s+t\) transpositions. This is an odd number, so it suffices to show that it is not possible to write the identity as a product of an odd number of transpositions.

Suppose the identity can be written as the product \((a_1 b_1) \cdots (a_k b_k)\), where each \(a_i \neq b_i\). A single transposition cannot be the identity, and thus \(k \neq 1\). So assume, for the sake of induction, that for a fixed \(k\), every product of fewer than \(k\) transpositions that equals the identity must use an even number of transpositions. Since \(2\) is even, we might as well assume \(k \geqslant 3\). Now note that since \(k > 1\) and our product is the identity, some transposition \((a_i b_i)\) with \(i > 1\) must move \(a_1\); otherwise, the product would send \(a_1\) to \(b_1\), and would not be the identity.

Of all the possible ways we can write the identity as a product of \(k\) transpositions \((a_1 b_1) \cdots (a_k b_k)\) with the first transposition \((a_1 b_1)\) fixed, choose one in which the number \(N\) of times that \(a_1\) appears in one of the transpositions is smallest. The two rules in Exercise 1.2.21 allow us to rewrite the overall product, without changing the number of transpositions, in such a way that the transposition \((a_2 b_2)\) moves \(a_1\), meaning \(a_2 = a_1\) or \(b_2 = a_1\). So let us assume that our product of transpositions has already been put in this form. Note also that \((a_i b_i) = (b_i a_i)\), so we might as well assume without loss of generality that \(a_2 = a_1\).

Case 1: When \(b_1 = b_2\), our product is

\[ (a_1 b_1) (a_1 b_1) (a_3 b_3) \cdots (a_k b_k), \]

but \((a_1 b_1) (a_1 b_1)\) is the identity, so we can rewrite our product using only \(k-2\) transpositions. By induction hypothesis, \(k-2\) is even, and thus \(k\) is even.

Case 2: When \(b_1 \neq b_2\), we can use Exercise 1.2.21 to write

\[ (a_1 b_1) (a_1 b_2) = (a_1 b_1) (b_2 a_1) = (a_1b_2)(b_1b_2). \]

Notice here that it matters that \(a_1\), \(b_1\), and \(b_2\) are all distinct, so that we can apply Exercise 1.2.21. So our product, which equals the identity, is

\[ (a_1 b_2)(b_1 b_2)(a_3 b_3) \cdots (a_k b_k). \]

The advantage of this shuffling is that while we have only changed the first two transpositions, we have decreased the number \(N\) of times that \(a_1\) appears, contradicting our choice of \(N\) as smallest possible. Hence Case 2 cannot occur, so we are in Case 1 and \(k\) is even. \(\square\)

Definition 1.2.23 (Parity of a permutation)

Consider a permutation \(\sigma \in S_n\). If \(\sigma = \tau_1 \cdots \tau_s\) is a product of transpositions, the sign of \(\sigma\) is given by \((-1)^s\). Permutations with sign \(1\) are called even and those with sign \(-1\) are called odd. This is also called the parity of the permutation.

Example 1.2.24

The identity is even. Every transposition is odd.

Example 1.2.25

The 3-cycle \( (123) \) can be rewritten as \( (12)(23) \), a product of 2 transpositions, so the sign of \( (123) \) is \(1\). That is, it is an even permutation.
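Combining Theorem 1.2.12 with Exercise 1.2.17 gives a practical way to compute signs: an \(m\)-cycle is a product of \(m-1\) transpositions, so its sign is \((-1)^{m-1}\), and signs multiply. The Python sketch below (with ad hoc helper names) implements exactly this.

```python
# Sign of a permutation via its disjoint cycle decomposition:
# each orbit of length m contributes m - 1 transpositions.

def sign(sigma):
    """sigma: dict mapping [n] to [n]; returns +1 or -1."""
    total, seen = 0, set()
    for x in sigma:
        if x in seen:
            continue
        length, y = 1, sigma[x]    # walk the orbit through x
        seen.add(x)
        while y != x:
            length += 1
            seen.add(y)
            y = sigma[y]
        total += length - 1        # an m-cycle gives m - 1 transpositions
    return (-1) ** total

assert sign({1: 1, 2: 2}) == 1           # the identity is even
assert sign({1: 2, 2: 1}) == -1          # a transposition is odd
assert sign({1: 2, 2: 3, 3: 1}) == 1     # the 3-cycle (1 2 3) is even
print("signs agree with Examples 1.2.24 and 1.2.25")
```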

Exercise 1.2.26

Show every permutation is a product of adjacent transpositions of the form \((i \,\, i+1)\).

1.3 Dihedral groups

For any integer \(n \geqslant 3\), let \(P_n\) denote a regular \(n\)-gon. For the sake of concreteness, let us imagine \(P_n\) is centered at the origin with one of its vertices located on the positive \(x\)-axis. Note that the size of the polygon will not matter. Here are some examples:

Example 1.3.1

[Figures: \(P_3\) (an equilateral triangle), \(P_4\) (a square), and \(P_5\) (a regular pentagon), each drawn upright.]

Definition 1.3.2

The dihedral group \(D_n\) is the set of symmetries of the regular \(n\)-gon \(P_n\) equipped with the binary operation given by composition.

Remark 1.3.3

There are competing notations for the group of symmetries of the \(n\)-gon. Some authors prefer to write it as \(D_{2n}\), since, as we will show, that is the order of the group. Democracy has dictated that we will be denoting it by \(D_n\), which indicates that we are talking about the symmetries of the \(n\)-gon. Some authors like to write \(D_{2 \times n}\), always keeping the \(2\), for example with \(D_{2 \times 3}\), to satisfy both camps.

Let \(d(-,-)\) denote the usual Euclidean distance between two points on the plane \(\mathbb{R}^2\). An isometry of the plane is a function \(f\!: \mathbb{R}^2 \to \mathbb{R}^2\) that is bijective and preserves the Euclidean distance, meaning that

\[ d(f(A),f(B))=d(A,B) \quad \textrm{ for all } A,B \in \mathbb{R}^2. \]

Though not obvious, it is a fact that if \(f\) preserves the distance between every pair of points in the plane, then it must be a bijection.

A symmetry of \(P_n\) is an isometry of the plane that maps \(P_n\) to itself. By this I do not mean that \(f\) fixes each point of \(P_n\), but rather that we have an equality of sets \(f(P_n) = P_n\), meaning every point of \(P_n\) is mapped to a (possibly different) point of \(P_n\) and every point of \(P_n\) is the image of some point in \(P_n\) via \(f\).

This makes Definition 1.3.2 precise.

Remark 1.3.4

Let us informally verify that this really is a group. If \(f\) and \(g\) are in \(D_n\), then \(f \circ g\) is an isometry (since the composition of any two isometries is again an isometry) and

\[ (f \circ g)(P_n) = f(g(P_n)) = f(P_n) = P_n, \]

so that \(f \circ g \in D_n\). This proves composition is a binary operation on \(D_n\). Now note that associativity of composition is a general property of functions. The identity function on \(\mathbb{R}^2\), denoted \(\mathrm{id}_{\mathbb{R}^2}\), belongs to \(D_n\) and it is the identity element of \(D_n\). Finally, the inverse function of an isometry is also an isometry. Using this, we see that every element of \(D_n\) has an inverse.

Lemma 1.3.5

Every point on a regular polygon is completely determined, among all points on the polygon, by its distances to two adjacent vertices of the polygon.

Exercise 1.3.6

Prove Lemma 1.3.5.

Definition 1.3.7 (Rotations in \(D_n\))

Assume that the regular \(n\)-gon \(P_n\) is drawn in the plane with its center at the origin and one vertex on the positive \(x\)-axis. Let \(r\) denote the rotation about the origin by \(\frac{2\pi}{n}\) radians counterclockwise; this is an element of \(D_n\). Its inverse is the clockwise rotation by \(\frac{2 \pi}{n}\). This gives us rotations \(r^i\), where \(r^i\) is the counterclockwise rotation by \(\frac{2 \pi i}{n}\), for each \(i = 1, \ldots, n\). Notice that when \(i=n\) this is simply the identity map.

Example 1.3.8

Here are the rotations of \(D_3\).

[Figures: the three rotations of \(P_3\), shown on triangles with vertices labeled \(1\), \(2\), \(3\): the identity, the rotation by \(2\pi/3\), and the rotation by \(4\pi/3\).]

Definition 1.3.9 (Reflections in \(D_n\))

For any line of symmetry of \(P_n\), reflection about that line gives an element of \(D_n\). When \(n\) is odd, the lines of symmetry are the lines connecting a vertex to the midpoint of the opposite side of \(P_n\). When \(n\) is even, there are two types of reflections: the ones about a line connecting two opposite vertices, and the ones about a line connecting the midpoints of opposite sides.

In both cases, these give us a total of \(n\) reflections.

Example 1.3.10

[Figures: the reflection lines in \(D_3\) (three lines, each through a vertex and the midpoint of the opposite side) and in \(D_4\) (the vertical and horizontal lines through the center, and the two diagonals).]

Notation 1.3.11 (Defining \(r\) and \(s\))

Fix \(n \geqslant 3\). We will consider two special elements of \(D_n\):

  • Let \(r\) denote the symmetry of \(P_n\) given by counterclockwise rotation by \(\frac{2 \pi}{n}\).
  • Let \(s\) denote a reflection symmetry of \(P_n\) that fixes at least one of the vertices of \(P_n\), as described in Definition 1.3.9. Let \(V_1\) be a vertex of \(P_n\) that is fixed by \(s\), and label the remaining vertices of \(P_n\) with \(V_2, \ldots, V_{n}\) by going counterclockwise from \(V_1\).

From now on, whenever we are talking about \(D_n\), the letters \(r\) and \(s\) will refer only to these specific elements. Finally, we will sometimes denote the identity element of \(D_n\) by \(\mathrm{id}\), since it is the identity map.

Theorem 1.3.12

The dihedral group \(D_n\) has \(2n\) elements.

Proof of Theorem 1.3.12

First, we show that \(D_n\) has order at most \(2n\). Any element \(\sigma \in D_n\) takes the polygon \(P_n\) to itself, and must in particular send vertices to vertices and preserve adjacencies, meaning that any two adjacent vertices remain adjacent after applying \(\sigma\). Fix two adjacent vertices \(A\) and \(B\). By Lemma 1.3.5, the location of every other point \(P\) on the polygon after applying \(\sigma\) is completely determined by the locations of \(\sigma(A)\) and \(\sigma(B)\). There are \(n\) distinct possibilities for \(\sigma(A)\), since it must be one of the \(n\) vertices of the polygon. But once \(\sigma(A)\) is fixed, \(\sigma(B)\) must be a vertex adjacent to \(\sigma(A)\), so there are at most \(2\) possibilities for \(\sigma(B)\). This gives us at most \(2n\) elements in \(D_n\).

Now we need only to present \(2n\) distinct elements in \(D_n\). We have described \(n\) reflections and \(n\) rotations for \(D_n\); we need only to see that they are all distinct. First, note that the only rotation that fixes any vertices of \(P_n\) is the identity. Moreover, if we label the vertices of \(P_n\) in order with \(1, 2, \ldots, n\), say by starting at a fixed vertex and going counterclockwise through each adjacent vertex, then the rotation by an angle of \(\frac{2 \pi i}{n}\) sends \(V_{1}\) to \(V_{i+1}\) for each \(i \lt n \), showing these \(n\) rotations are distinct. Now when \(n\) is odd, each of the \(n\) reflections fixes exactly one vertex, and so they are all distinct and disjoint from the rotations. Finally, when \(n\) is even, we have two kinds of reflections to consider. The reflections through a line connecting opposite vertices have exactly two fixed vertices, and are completely determined by which two vertices are fixed; since the identity fixes all \(n\) vertices and every other rotation fixes none, none of these matches any of the rotations we have already considered. The other reflections, the ones through the midpoints of two opposite sides, are completely determined by (one of) the two pairs of adjacent vertices that they switch. No rotation switches two adjacent vertices, and thus these give us brand new elements of \(D_n\).

In both cases, we have a total of \(2n\) distinct elements of \(D_n\) given by the \(n\) rotations and the \(n\) reflections.

Remark 1.3.13

Given an element of \(D_n\), we now know that it must be a rotation or a reflection. The rotations are the elements of \(D_n\) that preserve orientation, while the reflections are the elements of \(D_n\) that reverse orientation.

Remark 1.3.14

Any reflection is its own inverse. In particular, \(s^2 = \mathrm{id}\).

Remark 1.3.15

Note that \(r^j(V_1) = V_{1+j \pmod{n}}\) for any \(j\). Thus if \(r^j = r^i\) for some \(1 \leqslant i,j \leqslant n\), then we must have \(i=j\).

In fact, we have seen that \(r^n = \mathrm{id}\) and that the rotations \(\mathrm{id}, r, r^2, \ldots, r^{n-1}\) are all distinct, so \(|r| = n\). In particular, the inverse of \(r\) is \(r^{n-1}.\)

Lemma 1.3.16

Following Notation 1.3.11, we have \(srs^{-1} = r^{-1}\).

Proof of Lemma 1.3.16

First, we claim that \(rs\) is a reflection. To see this, observe that \(s(V_1) = V_1\), so

\[ rs(V_1) = r(V_1) = V_2 \]

and

\[ rs(V_{2}) = r(V_{n}) = V_1. \]

This shows that \(rs\) swaps the two adjacent vertices \(V_1\) and \(V_2\). No rotation swaps two adjacent vertices, so \(rs\) must be a reflection. Reflections have order \(2\), so \(rsrs = (rs)^2 = \mathrm{id}\); multiplying on the left by \(r^{-1}\) gives \(srs = r^{-1}\), that is, \(srs^{-1} = r^{-1}\), since \(s^{-1} = s\). \(\square\)

Remark 1.3.17

Given \(|r| = n\) and \(|s| = 2\), as noted in Remarks 1.3.14 and 1.3.15, we can rewrite Lemma 1.3.16 as \[ srs^{-1} = r^{n-1}. \]

Exercise 1.3.18

Show that \(s r^i s^{-1} = r^{n-i}\) for all \(i\).

Theorem 1.3.19

Every element in \(D_n\) can be written uniquely as \(r^j\) or \(r^j s\) for \(0 \leqslant j \leqslant n-1\).

Proof of Theorem 1.3.19

Let \(\alpha\) be an arbitrary symmetry of \(P_n\). Note \(\alpha\) must fix the origin, since it is the center of mass of \(P_n\), and it must send each vertex to a vertex because the vertices are the points on \(P_n\) at largest distance from the origin. Thus \(\alpha(V_1) = V_{j+1}\) for some \(0 \leqslant j \leqslant n-1\), and since \(r^j(V_1) = V_{j+1}\), the element \(r^{-j}\alpha\) fixes \(V_1\) and the origin. The only elements of \(D_n\) that fix \(V_1\) are the identity and \(s\). Hence either \(r^{-j}\alpha = \mathrm{id}\) or \(r^{-j}\alpha = s\). We conclude that \(\alpha = r^j\) or \(\alpha = r^js\).

Notice that we have shown that \(D_n\) has exactly \(2n\) elements, and that there are \(2n\) distinct expressions of the form \(r^j\) or \(r^js\) for \(0 \leqslant j \leqslant n-1\). Thus each element of \(D_n\) can be written in this form in a unique way.

Remark 1.3.20

The elements \(s, rs, \dots, r^{n-1}s\) are all reflections since they reverse orientation. Alternatively, we can check these are all reflections by checking they have order \(2\). As we noted before, the elements \(\mathrm{id}, r, \dots, r^{n-1}\) are rotations, and preserve orientation.

Example 1.3.21

The group \(D_4\) has eight elements: The rotations are \(\mathrm{id}, r, r^2, r^3\) and the reflections are \(s, rs, r^2s, r^3s\).
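Theorem 1.3.19 can be verified by brute force for \(n = 4\). A Python sketch, again with our own vertex labels \(0, \dots, 3\): listing the eight words \(r^j\) and \(r^j s\) produces eight distinct permutations.

```python
# Enumerate D_4 as the eight permutations r^j and r^j s, for j = 0, 1, 2, 3.
n = 4

def compose(a, b):
    """Composition of permutations in one-line notation: (a . b)(v) = a(b(v))."""
    return tuple(a[b[v]] for v in range(n))

r = tuple((v + 1) % n for v in range(n))  # rotation by 2*pi/4
s = tuple((-v) % n for v in range(n))     # a reflection

elements = []
x = tuple(range(n))          # x runs through id, r, r^2, r^3
for j in range(n):
    elements.append(x)               # the rotation r^j
    elements.append(compose(x, s))   # the reflection r^j s
    x = compose(x, r)

assert len(set(elements)) == 2 * n   # all eight elements are distinct
```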

Let us now give a presentation for \(D_n\).

Theorem 1.3.22

Let \(r:\mathbb{R}^2\to\mathbb{R}^2\) denote counterclockwise rotation around the origin by \(\frac{2\pi}{n}\) radians and let \(s:\mathbb{R}^2\to\mathbb{R}^2\) denote reflection about the \(x\)-axis. Set

\[ X_{2n}=\langle r,s \mid r^n = 1, s^2 = 1, srs^{-1} = r^{-1} \rangle. \]

Then \(D_n=X_{2n}\), that is,

\[ D_n = \langle r,s \mid r^n = 1, s^2 = 1, srs^{-1} = r^{-1} \rangle. \]

Proof of Theorem 1.3.22

Theorem 1.3.19 shows that \(\{r,s\}\) is a set of generators for \(D_n\). Moreover, we also know that the relations listed above, \(r^n = 1\), \(s^2 = 1\), and \(srs^{-1} = r^{-1}\), hold in \(D_n\): the first two are easy to check, and the last one is Lemma 1.3.16. The only remaining concern is that we may not have discovered all the relations of \(D_n\); or rather, we need to check that we have found enough relations so that any other valid relation follows as a consequence of the ones listed.

Let

\[ X_{2n}=\langle r,s \mid r^n = 1, s^2 = 1, srs^{-1} = r^{-1} \rangle. \]

Assume that \(D_n\) has more relations than \(X_{2n}\) does. Then \(D_n\) would be a group of cardinality strictly smaller than \(X_{2n}\), meaning that \(|D_n|<|X_{2n}|\). This will become more clear once we properly define presentations. We will show below that in fact \(|X_{2n}|\leqslant 2n=|D_n|\), thus obtaining a contradiction.

Now we show that \(X_{2n}\) has at most \(2n\) elements using just the information contained in the presentation. Since \(r\) and \(s\) generate \(X_{2n}\), every element \(x\in X_{2n}\) can be written as

\[ x = r^{m_1} s^{n_1} r^{m_2} s^{n_2} \cdots r^{m_j} s^{n_j} \]

for some \(j\) and (possibly negative) integers \(m_1, \dots, m_j, n_1, \dots, n_j\). Note that \(m_1\) could be \(0\), so that expressions beginning with a power of \(s\) are included in this list. As a consequence of the last relation, we have

\[ sr = r^{-1}s, \]

and it is not hard to see that this implies

\[ sr^m = r^{-m} s \]

for all \(m\). Thus, we can slide an \(s\) past a power of \(r\), at the cost of changing the sign of the power. Doing this repeatedly gives that we can rewrite \(x\) as

\[ x = r^M s^N. \]

The first relation gives \(r^n = 1\), from which it follows that \(r^a = r^b\) whenever \(a\) and \(b\) are congruent modulo \(n\). Thus we may assume \(0 \leqslant M \leqslant n-1.\) Likewise, we may assume \(0 \leqslant N \leqslant 1\). This gives a total of at most \(2n\) elements, and we conclude that \(X_{2n}\) must in fact be \(D_n.\)

Note that we have not shown that

\[ X_{2n}=\langle r,s \mid r^n = 1, s^2 = 1, srs^{-1} = r^{-1} \rangle \]

has at least \(2n\) elements using just the presentation. But for this particular example, since we know the group presented is the same as \(D_n\), we know from Theorem 1.3.19 that it has exactly \(2n\) elements.
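The rewriting argument in the proof is effectively an algorithm: scan the word left to right, slide each \(s\) past powers of \(r\) while flipping signs of exponents, and reduce modulo \(r^n = 1\) and \(s^2 = 1\). Here is a Python sketch; the encoding of words as (generator, exponent) pairs is our own convention.

```python
def normal_form(word, n):
    """Reduce a word in r, s (a list of ('r', k) / ('s', k) pairs, with k
    possibly negative) to the normal form r^M s^N, 0 <= M < n and N in {0, 1},
    using only the relations r^n = 1, s^2 = 1, and s r = r^{-1} s."""
    M, N = 0, 0  # invariant: the word processed so far equals r^M s^N
    for gen, exp in word:
        if gen == "r":
            # r^M s^N r^exp = r^(M + (-1)^N exp) s^N: slide r^exp past s^N
            M = (M + (exp if N == 0 else -exp)) % n
        else:  # gen == "s"
            N = (N + exp) % 2
    return M, N

# s r s^{-1} reduces to r^{-1} = r^{n-1}, as in Lemma 1.3.16 (here n = 5):
assert normal_form([("s", 1), ("r", 1), ("s", -1)], 5) == (4, 0)
```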

1.4 The quaternions

For our last big example we mention the group of quaternions, written \(Q_8\).

Definition 1.4.1

The quaternion group \(Q_8\) is a group with \(8\) elements

\[ Q_8=\{ 1, -1, i, -i, j, -j, k, -k \} \]

satisfying the following relations: \(1\) is the identity element, and

\[ i^2 = -1, \quad j^2 = -1, \quad k^2 =-1, \quad ij = k, \quad jk = i, \quad ki = j, \]
\[ (-1)i = -i, \quad (-1)j = -j, \quad (-1)k = -k, \quad (-1)(-1) = 1. \]

Verifying directly that this really is a group is rather tedious, since checking the associative property takes forever. Here is a better way: in the group \(\mathrm{GL}_2(\mathbb{C})\), define elements

\[ I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad A = \begin{bmatrix} \sqrt{-1} & 0 \\ 0 & -\sqrt{-1} \end{bmatrix}, \quad B = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & \sqrt{-1} \\ \sqrt{-1} & 0 \end{bmatrix} \]

where \(\sqrt{-1}\) denotes a complex number whose square is \(-1\), to avoid confusion with the symbol \(i \in Q_8\). Let \(-I, -A, -B, -C\) be the negatives of these matrices.

Then we can define an injective map \(f:Q_8\to \mathrm{GL}_2(\mathbb{C})\) by assigning

\[ \begin{aligned} 1 &\mapsto I, &\quad -1 &\mapsto -I\\ i &\mapsto A, &\quad -i &\mapsto -A \\ j &\mapsto B, &\quad -j &\mapsto -B \\ k &\mapsto C, &\quad -k &\mapsto -C. \end{aligned} \]

It can be checked directly that this map has the nice property (called being a group homomorphism) that

\[ f(xy)=f(x)f(y) \text{ for any elements } x,y\in Q_8. \]
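The matrix identities underlying this can be verified numerically. Below is a minimal Python sketch using built-in complex numbers, writing `1j` for \(\sqrt{-1}\); the helper functions are our own, not part of the notes.

```python
# 2x2 complex matrices as nested tuples; mul is ordinary matrix multiplication.
def mul(X, Y):
    return tuple(
        tuple(sum(X[a][t] * Y[t][b] for t in range(2)) for b in range(2))
        for a in range(2)
    )

def neg(X):
    return tuple(tuple(-x for x in row) for row in X)

I = ((1, 0), (0, 1))
A = ((1j, 0), (0, -1j))
B = ((0, 1), (-1, 0))
C = ((0, 1j), (1j, 0))

# The defining relations of Q_8 hold in the matrix model:
assert mul(A, A) == mul(B, B) == mul(C, C) == neg(I)
assert mul(A, B) == C and mul(B, C) == A and mul(C, A) == B
```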

Let us now prove associativity for \(Q_8\) using this information:

Claim 1.4.2

For any \(x,y,z\in Q_8\), we have \((xy)z=x(yz)\).

Proof of Claim 1.4.2

By using the property \(f(xy)=f(x)f(y)\) as well as associativity of multiplication in \(\mathrm{GL}_2(\mathbb{C})\) (marked by \(*\)) we obtain

\[ f((xy)z)=f(xy)f(z)=\left(f(x)f(y)\right)f(z)\stackrel{*}{=}f(x)\left(f(y)f(z)\right)=f(x)f(yz)=f(x(yz)). \]

Since \(f\) is injective and \(f((xy)z)=f(x(yz))\), we deduce \((xy)z=x(yz)\).

The subset \(\{\pm I, \pm A, \pm B, \pm C\}\) of \(\mathrm{GL}_2(\mathbb{C})\) is a subgroup (a term we define carefully later), meaning that it is closed under multiplication and taking inverses. (For example, \(AB= C\) and \(C^{-1} = -C\).) This proves that \(Q_8\) really is a group, and one can check that this subgroup satisfies a list of identities analogous to the one satisfied by \(Q_8\).
This is an excellent motivation to talk about group homomorphisms.

1.5 Group homomorphisms

A group homomorphism is a function between groups that preserves the group structure.

Definition 1.5.1

Let \((G, \cdot_G)\) and \((H, \cdot_H)\) be groups. A (group) homomorphism from \(G\) to \(H\) is a function \(f: G \to H\) such that

\[ f(x \cdot_G y) = f(x) \cdot_H f(y). \]

Note that a group homomorphism need not be injective or surjective; it can be any function, as long as it preserves the product.

Definition 1.5.2

Let \(G\) and \(H\) be groups. A homomorphism \(f\!: G \to H\) is an isomorphism if there exists a homomorphism \(g: H \to G\) such that

\[ f \circ g = \mathrm{id}_H \textrm{ and } g \circ f = \mathrm{id}_G. \]

If \(f:G\to H\) is an isomorphism, \(G\) and \(H\) are called isomorphic, and we denote this by writing \(G\cong H\). An isomorphism \(G \longrightarrow G\) is called an automorphism of \(G\). We denote the set of all automorphisms of \(G\) by \(\mathrm{Aut}(G)\).

Remark 1.5.3

Two groups \(G\) and \(H\) are isomorphic if we can obtain \(H\) from \(G\) by renaming all the elements, without changing the group structure. One should think of an isomorphism \(f\!: G \xrightarrow{\ \cong\ } H\) of groups as saying that the multiplication tables of \(G\) and \(H\) are the same up to renaming the elements.

The multiplication rule \(\cdot_G\) for \(G\) can be visualized as a table with both rows and columns labeled by elements of \(G\), and with \(x \cdot_G y\) placed in row \(x\) and column \(y\). The isomorphism \(f\) sends \(x\) to \(f(x)\), \(y\) to \(f(y)\), and the table entry \(x \cdot_G y\) to the table entry \(f(x) \cdot_H f(y)\). The inverse map \(f^{-1}\) does the opposite.

Remark 1.5.4

Suppose that \(f\!: G \to H\) is an isomorphism. As a function, \(f\) has an inverse, and thus it must necessarily be a bijective function. Our definition, however, requires more: the inverse must in fact also be a group homomorphism.

Note that many books define group isomorphism by simply requiring it to be a homomorphism that is bijective; we will soon show that this is in fact equivalent to the definition we gave. There are, however, good reasons to define it as we did: in many contexts, such as sets, groups, rings, fields, or topological spaces, the correct meaning of the word “isomorphism” is “a morphism that has a two-sided inverse”. This explains our choice of definition.

Exercise 1.5.5

Let \(G\) be a group. Show that \(\mathrm{Aut}(G)\) is a group under composition.

Example 1.5.6


  1. For any group \(G\), the identity map \(\mathrm{id}_G\!: G \to G\) is a group isomorphism.
  2. For all groups \(G\) and \(H\), the constant map \(f\!: G \to H\) with \(f(g) = e_H\) for all \(g \in G\) is a homomorphism, which we sometimes refer to as the trivial homomorphism.
  3. The exponential map and the logarithm map
    \[ \exp\!: (\mathbb{R}, +) \to (\mathbb{R}_{>0}, \cdot), \quad x \mapsto e^x \] \[ \ln\!: (\mathbb{R}_{>0}, \cdot) \to (\mathbb{R}, +), \quad y \mapsto \ln y \]
    are both isomorphisms, so \((\mathbb{R}, +)\cong (\mathbb{R}_{>0}, \cdot)\). In fact, these maps are inverse to each other.
  4. The function \(f\!: \mathbb{Z} \to \mathbb{Z}\) given by \(f(x) = 2x\) is a group homomorphism that is injective but not surjective.
  5. For any positive integer \(n\) and any field \(F\), the determinant map
    \[ \det\!: \mathrm{GL}_n(F) \to (F \setminus \{0\}, \cdot), \quad A \mapsto \det(A) \]
    is a group homomorphism. For \(n \geqslant 2\), the determinant map is not injective (you should check this!) and so it cannot be an isomorphism. It is however surjective: for each \(c \in F \setminus \{ 0 \}\), the diagonal matrix
    \[ \begin{pmatrix} c & & & \\ & 1 && \\ && \ddots & \\ &&& 1 \end{pmatrix} \]
    has determinant \(c\).
  6. Fix an integer \(n > 1\), and consider the function \(f\!: (\mathbb{Z},+) \to (\mathbb{C}^*,\cdot)\) given by \(f(m) = e^{\frac{2 \pi i m}{n}}\). This is a group homomorphism, but it is neither surjective nor injective. It is not surjective because the image only contains complex numbers \(x\) with \(|x| = 1\), and it is not injective because \(f(0) = f(n)\).
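Item 6 lends itself to a quick numerical sanity check; a sketch follows, in which the choice \(n = 5\) and the sample pairs are arbitrary.

```python
import cmath

n = 5
def f(m):
    """f(m) = e^(2 pi i m / n), a homomorphism (Z, +) -> (C^*, *)."""
    return cmath.exp(2j * cmath.pi * m / n)

# Homomorphism property f(a + b) = f(a) f(b), up to floating-point error:
for a, b in [(2, 3), (-1, 7), (4, 4)]:
    assert cmath.isclose(f(a + b), f(a) * f(b))

# Not injective: f(0) and f(n) are both 1.
assert cmath.isclose(f(0), f(n))

# Every value has absolute value 1, so f is not surjective onto C^*:
assert all(abs(abs(f(m)) - 1) < 1e-9 for m in range(-10, 10))
```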

Lemma 1.5.7 (Properties of homomorphisms)

If \(f: G \to H\) is a homomorphism of groups, then

\[f(e_G) = e_H.\]

Moreover, for any \(x \in G\) we have

\[f(x^{-1}) = f(x)^{-1}.\]

Proof of Lemma 1.5.7

By definition, \[ f(e_G)f(e_G) = f(e_Ge_G) = f(e_G). \] Multiplying both sides by \(f(e_G)^{-1}\), we get \[ f(e_G) = e_H. \]

Now given any \(x \in G\), we have \[ f(x^{-1}) f(x) = f(x^{-1}x) = f(e_G) = e_H, \] and thus \(f(x^{-1}) = f(x)^{-1}\).

Remark 1.5.8

Let \(G\) be a cyclic group generated by the element \(g\). Then any homomorphism \(f\!: G \to H\) is completely determined by \(f(g)\), since any other element \(h \in G\) can be written as \(h = g^n\) for some integer \(n\), and \[ f(g^n) = f(g)^n. \]

More generally, given a group \(G\) and a set \(S\) of generators for \(G\), any homomorphism \(f\!: G \longrightarrow H\) is completely determined by the images of the generators in \(S\): the element \(g = s_1 \cdots s_m\), where \(s_i\) is either in \(S\) or the inverse of an element of \(S\), has image \[ f(g) = f(s_1 \cdots s_m) = f(s_1) \cdots f(s_m). \]

Note, however, that not all choices of images for the generators might actually give rise to a homomorphism; we need to check that the map determined by the given images of the generators is well-defined.
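The well-definedness issue can be illustrated concretely. For instance (our example, not one from the notes), a homomorphism \(\mathbb{Z}/4 \to \mathbb{Z}/6\) is determined by the image \(a\) of the generator \(1\), but the assignment \(m \mapsto am \bmod 6\) is well defined on \(\mathbb{Z}/4\) only when \(4a \equiv 0 \pmod 6\):

```python
# Which choices of f(1) = a give a well-defined homomorphism Z/4 -> Z/6?
# Since 4 = 0 in Z/4, we need f(4) = f(0), i.e. 4a = 0 in Z/6.
valid = [a for a in range(6) if (4 * a) % 6 == 0]
assert valid == [0, 3]  # only the trivial map and m -> 3m survive
```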

Definition 1.5.9

The image of a group homomorphism \(f\!: G \longrightarrow H\) is

\[\mathrm{im}(f) := \{f(g) \mid g \in G \}.\]

Notice that \(f\!: G \to H\) is surjective if and only if \(\mathrm{im}(f) = H\).

Definition 1.5.10

The kernel of a group homomorphism \(f\!: G \longrightarrow H\) is

\[\ker(f) := \{g \in G \mid f(g) = e_H\}.\]

Remark 1.5.11

Given any group homomorphism \(f\!: G \longrightarrow H\), we must have \(e_G \in \ker f\) by Lemma 1.5.7.

When the kernel of \(f\) is as small as possible, meaning \(\ker(f) = \{ e \}\), we say that the kernel of \(f\) is trivial. A homomorphism is injective if and only if it has a trivial kernel.

Lemma 1.5.12

A group homomorphism \(f: G \to H\) is injective if and only if \(\ker(f) = \{e_G\}\).

Proof of Lemma 1.5.12

First, note that \(e_G \in \ker f\) by Lemma 1.5.7. If \(f\) is injective, then \(e_G\) must be the only element that \(f\) sends to \(e_H\), and thus \(\ker(f) = \{ e_G \}\).

Now suppose \(\ker(f) = \{e_G\}\). If \(f(g) = f(h)\) for some \(g,h \in G\), then \[ f(h^{-1}g) = f(h^{-1})f(g) = f(h)^{-1}f(g) = e_H. \] But then \(h^{-1}g \in \ker(f)\), so we conclude that \(h^{-1}g = e_G\), and thus \(g = h\).
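Lemma 1.5.12 is easy to watch in action in a computational example (ours, not from the notes): in \((\mathbb{Z}/12, +)\), each map \(m \mapsto am\) is a homomorphism, and its kernel is trivial exactly when the map is injective.

```python
# Kernels of the multiplication-by-a homomorphisms on (Z/12, +).
n = 12

def kernel(a):
    return [m for m in range(n) if (a * m) % n == 0]

def is_injective(a):
    return len({(a * m) % n for m in range(n)}) == n

assert kernel(3) == [0, 4, 8]   # nontrivial kernel ...
assert not is_injective(3)      # ... so not injective
assert kernel(5) == [0]         # trivial kernel ...
assert is_injective(5)          # ... so injective, as Lemma 1.5.12 predicts
```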

Example 1.5.13

First, number the vertices of \(P_n\) from \(1\) to \(n\) in any manner you like. Now define a function \(f\!: D_{n} \to S_n\) as follows: given any symmetry \(\alpha \in D_n\), set \(f(\alpha)\) to be the permutation of \([n]\) that records how \(\alpha\) permutes the vertices of \(P_n\) according to your labelling.

That is, \(f(\alpha) = \sigma\), where \(\sigma\) is the permutation defined by \(\sigma(i) = j\) whenever \(\alpha\) sends the \(i\)th vertex to the \(j\)th vertex in your list. This map \(f\) is a group homomorphism.

Now suppose \(f(\alpha) = \mathrm{id}_{S_n}\). Then \(\alpha\) must fix all the vertices of \(P_n\), and thus \(\alpha\) must be the identity element of \(D_n\). We have thus shown that the kernel of \(f\) is trivial. By Lemma 1.5.12, this proves \(f\) is injective.

Lemma 1.5.14

Suppose \(f\!: G \to H\) is a group homomorphism. Then \(f\) is an isomorphism if and only if \(f\) is bijective.

Proof of Lemma 1.5.14

(\(\Rightarrow\)) A function \(f: X \to Y\) between two sets is bijective if and only if it has an inverse, meaning that there is a function \(g: Y \to X\) such that \(f \circ g = \mathrm{id}_Y\) and \(g \circ f = \mathrm{id}_X\). Our definition of group isomorphism implies that this must hold for any isomorphism (and more!), as we noted in Remark 1.5.4.

(\(\Leftarrow\)) If \(f\) is a bijective homomorphism, then as a function it has a set-theoretic two-sided inverse \(g\). But we need to show that this inverse \(g\) is actually a homomorphism. For any \(x,y \in H\), we have

\[ \begin{aligned} f(g(xy)) & = xy \quad && \textrm{since } f \circ g=\mathrm{id}_H \\ & = f(g(x))f(g(y)) \quad && \textrm{since } f \circ g=\mathrm{id}_H\\ & = f(g(x)g(y)) \quad && \textrm{since $f$ is a group homomorphism} . \end{aligned} \]

Since \(f\) is injective, we must have \(g(xy) = g(x)g(y)\). Thus \(g\) is a homomorphism, and \(f\) is an isomorphism.

Exercise 1.5.15

Let \(f\!: G \to H\) be an isomorphism. Show that for all \(x \in G\), we have \(|f(x)| = |x|\).

In other words, isomorphisms preserve the order of an element. This is an example of an isomorphism invariant.

Definition 1.5.16

An isomorphism invariant (of a group) is a property \(P\) (of groups) such that whenever \(G\) and \(H\) are isomorphic groups and \(G\) has the property \(P\), then \(H\) also has the property \(P\).

Theorem 1.5.17

The following are isomorphism invariants:

  1. the order of the group,
  2. the set of all the orders of elements in the group,
  3. the property of being abelian,
  4. the order of the center of the group,
  5. being finitely generated.

Recall that by definition two sets have the same cardinality if and only if they are in bijection with each other.

Proof of Theorem 1.5.17

Let \(f\!:G\to H\) be any group isomorphism.

  1. Since \(f\) is a bijection by Remark 1.5.4, we conclude that \(|G|=|H|\).
  2. We wish to show that \(\{|x| \ | \ x\in G\}= \{|y| \ | \ y\in H\}\). (\(\subseteq\)) follows from Exercise 1.5.15: given any \(x\in G\), we have \(|x| = |f(x)|\), which is the order of an element in \(H\). (\(\supseteq\)) follows from the previous statement applied to the group isomorphism \(f^{-1}\): given any \(y\in H\), we have \(f^{-1}(y)\in G\) and \(|y| = |f^{-1}(y)|\) is the order of an element of \(G\).
  3. For any \(y_1,y_2\in H\) there exist some \(x_1, x_2\in G\) such that \(f(x_i)=y_i\). Then we have \[ y_1y_2=f(x_1)f(x_2)=f(x_1x_2)\stackrel{*}{=}f(x_2x_1)=f(x_2)f(x_1)=y_2y_1, \] where \(*\) indicates the place where we used that \(G\) is abelian.
  4. Exercise. The idea is to show \(f\) induces an isomorphism \(\mathcal{Z}(G)\cong \mathcal{Z}(H)\).
  5. Exercise. Show that if \(S\) generates \(G\) then \(f(S)=\{f(s) \ | \ s\in S\}\) generates \(H\).

The easiest way to show that two groups are not isomorphic is to find an isomorphism invariant that they do not share.

Remark 1.5.18

Let \(G\) and \(H\) be two groups. If \(P\) is an isomorphism invariant, and \(G\) has \(P\) while \(H\) does not have \(P\), then \(G\) is not isomorphic to \(H\).

Example 1.5.19

  1. We have \(S_n\cong S_m\) if and only if \(n=m\), since \(|S_n| = n!\) and \(|S_m| = m!\) and the order of a group is an isomorphism invariant.
  2. Since \(\mathbb{Z}/6\) is abelian and \(S_3\) is not abelian, we conclude that \(\mathbb{Z}/6\ncong S_3\).
  3. You will show in Problem Set 2 that \(|Z(D_{24})|=2\), while \(S_4\) has trivial center. We conclude that \(D_{24}\ncong S_4\).
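Example 1.5.19(2) distinguished \(\mathbb{Z}/6\) from \(S_3\) by commutativity; the invariant in Theorem 1.5.17(2) works just as well, since \(\mathbb{Z}/6\) has an element of order \(6\) while \(S_3\) does not. A Python sketch computing all element orders (the helper functions are our own):

```python
from itertools import permutations

def order_mod(g, n):
    """Order of g in the additive group Z/n."""
    k, x = 1, g % n
    while x != 0:
        x = (x + g) % n
        k += 1
    return k

def order_perm(p):
    """Order of a permutation written in one-line notation."""
    ident = tuple(range(len(p)))
    k, q = 1, p
    while q != ident:
        q = tuple(p[i] for i in q)  # compose with p once more
        k += 1
    return k

orders_Z6 = sorted(order_mod(g, 6) for g in range(6))
orders_S3 = sorted(order_perm(p) for p in permutations(range(3)))

assert orders_Z6 == [1, 2, 3, 3, 6, 6]
assert orders_S3 == [1, 2, 2, 2, 3, 3]
assert 6 in orders_Z6 and 6 not in orders_S3  # hence Z/6 is not isomorphic to S_3
```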

We come to one of the central concepts in group theory: the action of a group on a set. Some would say this is the main reason to study groups at all, so we want to introduce it early, both as motivation for studying group theory and because the language of group actions will be very helpful to us.