Photo by Graham Holtshausen on Unsplash
The Maybe monad in C#
Say goodbye to null-reference exceptions.
Introduction
Audience
This blog post is targeted at intermediate-level C# programmers.
Prerequisites
This blog post assumes that the reader has a solid understanding of the following C# features. In case you don't, I suggest you look them up before proceeding to avoid frustration and confusion.
Feature | Description |
Exception Handling | The mechanism C# provides to handle runtime errors gracefully. We use this to implement the smart constructor pattern. |
Operator Overloading | The mechanism that allows custom classes to integrate with built-in language constructs such as operators. |
Functions | The mechanism the language provides to abstract over computations. |
Objectives
This blog post's main aim is to introduce the Maybe monad which is a functional approach to handling missing values. The article shows how we can use this monad in C# code to avoid null reference exceptions and write composable computations.
By the end of this post, you should be able to answer the following questions.
What are monads, and what is the Maybe monad?
What is the smart constructor pattern?
How do we design meaningful function types?
Tools used
If you want to test the code in this blog post, you will need to install the following NuGet packages in your project.
Library | Description |
OneOf | We shall be using algebraic data types in this article. Since C#'s support for sum types is so lacking, we shall mimic it using the OneOf library. For more information, look here |
LanguageExt.Core | To use monads without having to implement them ourselves, we will use this library. It enhances C# with lots of nice features from Haskell that facilitate coding in a functional way. For more information, look here |
Algebraic data types
Algebraic data types are a staple of languages whose type systems are based on Hindley-Milner. Some of the famous languages in this family are Haskell, OCaml and F#. In these languages, data and logic are kept separate, unlike in object-oriented languages.
An algebraic type system is a type system where compound types are composed from simpler types using AND and OR operations.
Scott Wlaschin
Sum types
To understand sum types, we are going to follow a series of examples. Let us start simple. Imagine that you are working on a project that needs to model people's genders. Let us say we want to write a function that takes in the gender and generates a title for a given person. Assuming that we represent the gender as a string, here is our first implementation.
public string GenerateTitle(string gender)
{
    if (gender == "Male")
        return "Mr";
    else if (gender == "Female")
        return "Mrs";
    else
        return "Mr/Mrs";
}
This function has several problems. Let us go through them.
Invalid arguments. The gender is modeled as a string, so we can pass any string to this function, including garbage inputs like "kl98".
Missing abstraction. The function fails to capture a domain concept, which is a sign of poor domain modeling. Gender is a concept in the domain model, but there is no explicit mention of it in the code. This means the type system can't help us eliminate such errors.
Enumerations
Seasoned C# programmers will notice that this is a good use case for enumerations. Since the Gender type is finite and contains only three values, we can represent it using an enumeration. We can then match on the enumeration value.
public enum Gender { Male, Female, Other }

public string GenerateTitle(Gender gender)
{
    if (gender == Gender.Male)
        return "Mr";
    else if (gender == Gender.Female)
        return "Mrs";
    else
        return "Mr/Mrs";
}
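The same mapping can also be written with a switch expression, which is closer to pattern matching than an if/else chain. The following is a minimal sketch, assuming C# 8 or later and .NET 5 or later; the static class name Titles is hypothetical and only there to make the snippet compile on its own.

```csharp
using System;

// Print the title for every value of the enumeration.
foreach (Gender gender in Enum.GetValues<Gender>())
    Console.WriteLine($"{gender}: {Titles.GenerateTitle(gender)}");

public enum Gender { Male, Female, Other }

// Hypothetical holder class, not part of the original post.
public static class Titles
{
    // Same behaviour as the if/else version: the discard arm plays the role of the final else.
    public static string GenerateTitle(Gender gender) => gender switch
    {
        Gender.Male => "Mr",
        Gender.Female => "Mrs",
        _ => "Mr/Mrs"
    };
}
```

The discard arm keeps the behaviour identical to the original else branch, so any value other than Male or Female still yields "Mr/Mrs".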
This implementation has advantages over the previous one, as explained below.
We now know that our function takes in only valid values, since the Gender enumeration limits the values that can be passed in; it no longer accepts just any string.
Our code now speaks the language of the domain, also known as the ubiquitous language.
Note that this function can also be coded as a dictionary, since we can explicitly enumerate its input arguments. Dictionaries are a good representation for functions over a small, finite set of inputs.
// The whole function expressed as data: one entry per Gender value.
private static readonly Dictionary<Gender, string> title = new Dictionary<Gender, string>
{
    [Gender.Male] = "Mr",
    [Gender.Female] = "Mrs",
    [Gender.Other] = "Mr/Mrs"
};

// Generating a title is now a plain lookup.
public string GenerateTitle(Gender gender) => title[gender];
Product types
Product types are so common that you already know them by different names depending on which languages you have used in the past.
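In C#, the familiar names are presumably classes, structs, records and tuples: each one combines several values with AND. As a rough sketch, here is a product type built from a name AND a gender. The Person record is a hypothetical example that does not appear elsewhere in this post, and the snippet assumes C# 9 or later.

```csharp
using System;

// A Person is a Name AND a Gender: both values are always present.
var alice = new Person("Alice", Gender.Female);
var twin = new Person("Alice", Gender.Female);

Console.WriteLine(alice.Name);    // Alice
Console.WriteLine(alice == twin); // True, records compare field by field

// A tuple is an anonymous product type with the same two components.
(string, Gender) pair = ("Bob", Gender.Male);
Console.WriteLine(pair.Item1);    // Bob

public enum Gender { Male, Female, Other }

// Hypothetical record used only for illustration.
public record Person(string Name, Gender Gender);
```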
Domain modeling using algebraic data types
The Maybe monad
In this section, we shall work through a series of refactorings of a simple code snippet until we discover the Maybe monad. So let us get started. Consider a function that takes in a test mark and returns the student's grade.
Less informative function types
public enum Grade { Poor, Good, Excellent };

public Grade GetGrade(int mark)
{
    if (mark >= 0 && mark < 10)
        return Grade.Poor;
    else if (mark >= 10 && mark < 15)
        return Grade.Good;
    else if (mark >= 15 && mark <= 20)
        return Grade.Excellent;
    else
        throw new ArgumentException("Invalid mark");
}
This works but it is far from ideal. Let us try to analyse the problems with this code.
An int can store values from roughly -2.1 billion to 2.1 billion, which means we can pass any number in that range to this function, and most of them are meaningless as marks. You should avoid exposing primitive types in public APIs. They are a sign of missing abstractions and poor design.
Functions should do one thing. A close look at this function shows the opposite: it both validates the mark and computes the grade. If you think about it, every place in our codebase that takes a mark will have to duplicate this validation logic.
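To see the first problem concretely, the sketch below calls the function with one sensible mark and two meaningless ones. All three calls compile, and the bad ones only fail at run time. The static class name Grading is hypothetical and only there so that the snippet is self-contained.

```csharp
using System;

foreach (int mark in new[] { 12, 25, -3 })
{
    try
    {
        Console.WriteLine($"{mark} -> {Grading.GetGrade(mark)}");
    }
    catch (ArgumentException e)
    {
        // The signature gave no hint that this could happen.
        Console.WriteLine($"{mark} -> {e.Message}");
    }
}
// Prints:
// 12 -> Good
// 25 -> Invalid mark
// -3 -> Invalid mark

public enum Grade { Poor, Good, Excellent }

// Hypothetical holder class wrapping the function from the post.
public static class Grading
{
    public static Grade GetGrade(int mark)
    {
        if (mark >= 0 && mark < 10)
            return Grade.Poor;
        else if (mark >= 10 && mark < 15)
            return Grade.Good;
        else if (mark >= 15 && mark <= 20)
            return Grade.Excellent;
        else
            throw new ArgumentException("Invalid mark");
    }
}
```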
More informative function types
The function type in the previous section is not informative enough. It is dishonest and misleading. It says "give me an int and I will give you a grade", but this is not true: it throws an exception for some ints. We can do better. Let us start by constraining the values that can be fed to this function.
Value Objects
The problem with the previous code snippet is its primitive type argument. Primitive obsession is a disease that most programmers suffer from. Primitive types are too general to be useful on their own. Always remember that programming is modeling, and a model is only as good as its constituents. It is therefore better to model even the smallest domain elements. We need to replace this int with a more representative type.
In DDD, we model such values as value objects. Value objects are domain elements that have no identity. If our current domain is academia, one value object would be the mark that a student scored on a given test.
Different businesses have different business rules. These rules are represented as invariants in the model. In our case, a valid mark is between 0 and 20 inclusive. The system must be designed in such a way that invariants are never violated at any point in time. Let us create a value object to represent our Mark concept.
public record Mark
{
    public int Value { get; }

    // Private constructor: callers must go through FromInt.
    private Mark(int mark) => Value = mark;

    // Factory method that validates before constructing.
    public static Mark FromInt(int value)
    {
        if (IsValid(value))
            return new Mark(value);
        else
            throw new ArgumentException("Invalid mark");
    }

    // A valid mark lies between 0 and 20 inclusive.
    private static bool IsValid(int value)
        => value >= 0 && value <= 20;
}
Let us say something about this new abstraction.
Note that we are using a record and not a class. This is because value objects should be immutable and must be compared using structural equality. Records provide most of this infrastructure for us.
The constructor is hidden. This is because it is recommended practice to make invalid states unrepresentable. With the constructor hidden, callers cannot build a Mark from an arbitrary int; only valid integer values get through.
We use a static method to create instances of this abstraction. Such methods are ubiquitous in DDD and are known as factory methods. They let us validate before creating an object, so once it is created, we know the object is in a valid state.
Lastly, notice that the validation now lives in this record. This means we won't have to duplicate it at every point in the code that needs a mark.
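The points above can be checked with a short usage sketch. It repeats the Mark record so that it runs on its own and assumes C# 9 or later; the variable names are only illustrative.

```csharp
using System;

var first = Mark.FromInt(15);
var second = Mark.FromInt(15);

Console.WriteLine(first.Value);     // 15
Console.WriteLine(first == second); // True, records use structural equality

try
{
    Mark.FromInt(42); // outside 0..20, so the factory method rejects it
}
catch (ArgumentException e)
{
    Console.WriteLine(e.Message);   // Invalid mark
}

// var invalid = new Mark(42); // does not compile: the constructor is private

public record Mark
{
    public int Value { get; }

    private Mark(int mark) => Value = mark;

    public static Mark FromInt(int value)
    {
        if (IsValid(value))
            return new Mark(value);
        else
            throw new ArgumentException("Invalid mark");
    }

    private static bool IsValid(int value)
        => value >= 0 && value <= 20;
}
```

Because the only way to obtain a Mark is through FromInt, any code that receives a Mark can rely on its value being between 0 and 20.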
With this abstraction in place, we can make changes to our function and make it more informative. The following snippet shows the changes.
public Grade GetGrade(Mark mark)
{
    if (mark.Value >= 0 && mark.Value < 10)
        return Grade.Poor;
    else if (mark.Value >= 10 && mark.Value < 15)
        return Grade.Good;
    else
        return Grade.Excellent;
}
Oh, this is a big improvement over the previous code. Let us try to analyse this code.