Semigroups and Monoids

In functional programming there are two concepts that are mentioned a lot that may sound intimidating: the semigroup and the monoid (not monad). These concepts originate from category theory, which is a branch of math that aims to reason about the entirety of math through graphs called “categories”. While monoids and semigroups sound complex, they are a simple but powerful abstraction that allow you to generically combine data types.

Semigroups

Before we get into monoids, we first need to define a Semigroup, since the monoid is a strict subset of the semigroup. A semigroup is basically a set with an associative multiplication operation defined on it. In Haskell, we don’t have a multiplication operation, but rather, the <> operator. Both of these things allude to the same thing: a method of coalescing elements across a set in a associative manner. Note that all I mean by “associative operation” is that a <> (b <> c) = (a <> b) <> c, so order shouldn’t matter when chaining these operations together (just like with multiplication).

So how do we go about implementing a semigroup in Haskell? You only need to implement one function: the binary <> operator. There’s more to it, but the other functions have default implementations that you probably won’t need to worry about.

An Example

So how would we define a semigroup on some arbitrary data type?

First, let’s start out by defining a new data type: a 2D point which consists of an x and y value:

data Point2D x y = Point2D Int Int

Now how are we going to combine Point2D values? It seems natural that we would add them together by adding the x and y values to create a new point. This is easy to implement, and it has the bonus of being already associative.

instance Semigroup (Point2D a b) where
    (<>) (Point2D x1 y1) (Point2D x2 y2) = Point2D newX newY
      where
        newX = x1 + x2
        newY = y1 + y2

It really is that easy. We have implemented a Semigroup instance on a data type by implementing one function.

Monoids

Monoids are a strict subset of semigroups. They have the same properties as semigroups save for one thing: the identity element. Monoids need to have some identity element (ident) such that a <> ident = a and ident <> a = a. For addition, the identity element is 0, because a + 0 = a and 0 + a = a. For multiplication, the identity element is 1 (proof is left as an exercise to the reader).

We can see from the following declaration of the Monoid class that you only have to implement one function, the identity element. The rest is derived from your semigroup implementation by default.

class Semigroup m => Monoid m where
  mempty :: m

  -- defining mappend is unnecessary, it copies from Semigroup
  mappend :: m -> m -> m
  mappend = (<>)

  -- defining mconcat is optional, since it has the following default:
  mconcat :: [m] -> m
  mconcat = foldr mappend mempty

An Example

Implementing the identity element for Point2D is trivial, because we already know the identity element in addition.

instance Monoid (Point2D a b) where
    mempty = Point2D 0 0

We can play around with using the methods provided by the monoid class. For example, let’s define a few points:

p1 = Point2D 1 2
p2 = Point2D 2 3

We can combine these points using mconcat, <>, or mappend:

mconcat [p1, p2]

> Point2D 3 5

--

p1 <> p2

> Point2D 3 5

You can combine any number of these points in any arbitrary manner using semigroups and monoids, which provide a powerful abstraction over elements of sets, or more concretely, elements of a data/type class.