This article was translated from Chinese by ChatGPT and may contain some errors.
❖Type Inference
❖What is Type Inference
Programmers are lazy creatures. Often, we don’t want to verbosely specify types:
let t = 1 + 1
let f x = x + 1
The types of the three variables t, f, and x here are very clear. Of course, more often than not, it is better to specify the type explicitly.
There are also times when it is not easy to specify the type, for example:
let s f g x = f x (g x)
let k x y = x
let i = s k k
let ϕ = s (k (s i)) k
What is the type of ϕ?
The type of ϕ can be worked out by hand, but it is rather tedious. The natural idea is to find an algorithm that computes it for us instead of relying on the human brain.
In fact, if you send this code to a Haskell REPL, it will tell you that the type of ϕ is p -> (p -> q) -> q. (As we will see later, F# actually rejects this code because of the value restriction.) The algorithm used is the Hindley-Milner type inference algorithm.
❖Type Inference Algorithm
In a nutshell, the type inference algorithm solves equations. Equations are established as follows:
- Each function parameter and the return value of each function call is represented by a type variable xᵢ.
- An equation is established at every function call. For a call f x, if t_f and t_x are the types (or type variables) of f and x, and xᵣ is the fresh type variable representing the result of the call, we add the equation t_f = t_x → xᵣ.
For example, such a function:
let f a b = a + b
can establish such equations. Treat a + b as the call (+) a b with (+) : int → int → int, let x₀ and x₁ be the type variables of the parameters a and b, and let x₂ and x₃ be the type variables of the call results (+) a and ((+) a) b:
int → (int → int) = x₀ → x₂
x₂ = x₁ → x₃
By solving these two equations, we can determine the type of each variable. The question now is: how do we solve such equations?
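To make this concrete, here is one possible way to represent such type terms and equations in F#. The names Term, arrow, intT and the variable names are purely illustrative, and the same representation is reused in the sketches below:
// A type is either a type variable or a constructor applied to arguments;
// the arrow xᵢ → xⱼ is the constructor "->" with two arguments, and int is
// the nullary constructor "int".
type Term =
    | Var of string
    | App of string * Term list

let arrow a b = App ("->", [a; b])
let intT = App ("int", [])

// The two equations generated for  let f a b = a + b :
//   int → (int → int) = x0 → x2
//   x2 = x1 → x3
let equations =
    [ (arrow intT (arrow intT intT), arrow (Var "x0") (Var "x2"))
      (Var "x2", arrow (Var "x1") (Var "x3")) ]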
❖Unification
Unification is an algorithm from automated theorem proving that can be used to solve the above equations. It is backed by a body of theory about substitutions; due to space limitations, we will not discuss that theory here, but will instead present the most natural algorithm [1] very briefly.
First of all, the above equations involve at most two forms of terms: one is the type variable xᵢ, and the other is the arrow xᵢ → xⱼ. In fact, the arrow xᵢ → xⱼ can be regarded as f(xᵢ, xⱼ), so we have two forms:
- A type variable.
- A term of the form f(v₀, v₁, v₂, ...). The int above can be regarded as a nullary term int().
The algorithm is as follows. Traverse each equation:
- Perform an occurs check: if one side is a type variable x and x also appears inside the term on the other side (for example x = x → y), the algorithm fails.
- If the equation has the form x = t or t = x, where x is a type variable, then no matter what t is, replace x with t in the equations that have not been traversed yet.
- Otherwise the equation has the form f(v₀, ..., vₙ₋₁) = g(w₀, ..., wₘ₋₁). Then:
  - If f ≠ g, the algorithm fails.
  - If f = g, add n new equations v₀ = w₀, ..., vₙ₋₁ = wₙ₋₁ (n is the number of parameters of f) to the equations to be traversed.
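Below is a sketch of this unification algorithm in F#, reusing the Term type introduced above. The helper names subst, occurs and unify are illustrative, and the solution is returned as an association list of bindings:
// substitute the type variable v by the term t inside the term s
let rec subst v t s =
    match s with
    | Var x when x = v -> t
    | Var _ -> s
    | App (f, args) -> App (f, List.map (subst v t) args)

// occurs check: does the type variable v occur anywhere in the term t?
let rec occurs v t =
    match t with
    | Var x -> x = v
    | App (_, args) -> List.exists (occurs v) args

// unify a list of equations, producing a substitution (an association list)
let rec unify eqs =
    match eqs with
    | [] -> []
    | (s, t) :: rest when s = t -> unify rest
    | (Var x, t) :: rest | (t, Var x) :: rest ->
        if occurs x t then failwith "occurs check failed"   // the self-loop case
        else
            // replace x with t in the equations not yet traversed
            let rest' = List.map (fun (a, b) -> subst x t a, subst x t b) rest
            let sub = unify rest'
            // record x := t, with the later bindings applied to t as well
            (x, List.fold (fun acc (v, u) -> subst v u acc) t sub) :: sub
    | (App (f, fargs), App (g, gargs)) :: rest ->
        if f <> g || List.length fargs <> List.length gargs then
            failwith "constructor clash"                    // f ≠ g: the algorithm fails
        else
            // f = g: add the pairwise equations between the arguments
            unify (List.zip fargs gargs @ rest)

// For the equations of  let f a b = a + b  above, unify equations yields the
// bindings x0 := int, x1 := int, x2 := int → int, x3 := int.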
❖Let Polymorphism
Now we have a type inference algorithm. Consider such a function:
let f x = x
The above inference algorithm will infer f : x₀ → x₀, where x₀ is a free type variable. Using the type notation of System F, we can generalize such a function:
f : ∀x₀. x₀ → x₀
In this way, the type of f becomes a polymorphic type, which can be instantiated by substituting concrete types for the quantified variable:
∀x₀. x₀ → x₀  ⇒  int → int,  bool → bool,  ...
A troublesome question is: when should generalization and instantiation be performed?
The natural idea is to generalize the type of a lambda expression like fun x -> ... as soon as it is formed, and to instantiate whenever a generalized variable is referenced. For example:
// The following are all well-typed terms
(λx. x) 10
(λx. x) true
The problem with this is that only λ expressions themselves can have polymorphic types; variables bound inside a lambda can still only have a single, monomorphic type.
(λf. (λt. f true) (f 0)) (λx. x)
Taking the above expression as an example, even if λx.x has a polymorphic type, f cannot be inferred to be polymorphic. In fact, f is inferred as int → x₀ at f 0 and as bool → x₀ at f true, which causes a type error.
Recall that in F# or OCaml, the way to make f genuinely generic is:
let f x = x in
 let t = f 0 in f true
In the untyped or simply typed λ calculus, the above two pieces of code are completely equivalent: let is just syntactic sugar. But in ML languages, let is not syntactic sugar. In fact, in ML languages, the moment to generalize is exactly at the let binding!
That is to say:
- In the expression let var = e₁ in e₂, the type of e₁ is generalized. Generalization works as follows:
  - Let FV(Γ) be the free type variables occurring in the current environment Γ.
  - Let FV(τ) be the free type variables of τ, the inferred type of e₁.
  - For each free type variable α in FV(τ) that is not in FV(Γ), τ becomes ∀α. τ.
- Whenever a generalized (generic) variable is referenced from the current environment, instantiate it by replacing each quantified variable with a fresh type variable, until its type no longer contains ∀.
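Continuing the same F# sketch (and reusing Term and subst from above), generalization and instantiation could look roughly like this; Scheme, generalize, instantiate and the fresh-variable counter are illustrative names:
// a type scheme: a list of quantified variables together with a type
type Scheme = { quantified : string list; body : Term }

// free type variables of a term
let rec freeVars t =
    match t with
    | Var x -> Set.singleton x
    | App (_, args) -> List.fold (fun acc a -> Set.union acc (freeVars a)) Set.empty args

// generalize: quantify every variable of t that is not free in the environment
let generalize (envFreeVars : Set<string>) (t : Term) =
    { quantified = Set.toList (freeVars t - envFreeVars); body = t }

// instantiate: replace each quantified variable with a fresh type variable
let mutable counter = 0
let freshVar () =
    counter <- counter + 1
    Var (sprintf "t%d" counter)

let instantiate (s : Scheme) =
    List.fold (fun body v -> subst v (freshVar ()) body) s.body s.quantified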
And this is the basic idea of the Hindley-Milner algorithm. For concrete implementations there are Algorithm J and Algorithm W; please refer to Wikipedia or Milner's original paper. My implementation uses Algorithm J.
❖Value restriction
If you really tried the previous code:
let s f g x = f x (g x)
let k x y = x
let i = s k k
let ϕ = s (k (s i)) k
You will find that Haskell accepts this code without complaint, but F# reports an error.
Why should it report an error? The explanation starts from a trickier program. F# has mutable variables:
let mutable x = 10
x <- 20
printfn "%A" x
If given such a program:
let mutable x = []
x <- [3]
List.map (fun x -> x = true) x
Then the first line will report an error. But suppose for a moment that it did not: would the second line report an error? We need to look at this carefully.
First, the type of <- is hidden in F#, but F# has the (now deprecated) operators ! and := together with the ref function, whose types are:
ref : 'a -> 'a ref
(!) : 'a ref -> 'a
(:=) : 'a ref -> 'a -> unit
The above program is written as:
let x = ref []
x := [3]
List.map (fun x -> x = true) !x
When x is referenced in the second line, the type of x is 't list ref, that is, ∀t. ref[list[t]]; it is instantiated as ref[list[b]] for a fresh type variable b. Since (:=) : ∀a. ref[a] → a → unit is instantiated with a fresh variable a, the call x := [3] yields two equations:
ref[a] = ref[list[b]]
a = list[int]
Solving these equations gives b = int, so the second line does not report an error. The bigger problem is that after type checking the second line, b has indeed become int, but b belongs only to this one instantiation of x; the x in the environment still has type 't list ref, that is, ∀t. ref[list[t]]. In other words, the third line will not report an error either!
This is very bad because it introduces unsoundness into the type system. The third line will undoubtedly result in a runtime error.
To solve this problem, Andrew K. Wright [2] introduced the value restriction. Its meaning is simple: generalize e₁ in let var = e₁ in e₂ only when e₁ is a syntactic value. If e₁ is not a syntactic value and its type still contains free type variables that do not occur in the environment, then report an error.
Syntactic values in ML languages are constants, variables, λ expressions, and applications of type constructors. Of course, an expression that creates a value of type ref[a] (a call to ref) is not a syntactic value. In this way the problem is obviously solved, because the error above is essentially caused by side effects, and evaluating a syntactic value in ML cannot cause side effects.
This solution is a conservative over-approximation, and many correct programs (such as the one shown earlier) are rejected as well. Haskell does not have this problem because it has no side effects. This is also a major reason why Haskell code can make heavy use of function compositions like f . g, while ML cannot.
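A common way to work around the value restriction in F# is to eta-expand the offending binding so that the right-hand side becomes a syntactic value (a function), which may then be generalized; a small sketch:
let s f g x = f x (g x)
let k x y = x
// let i = s k k                  // rejected by the value restriction
let i x = s k k x                 // accepted: the right-hand side is now a function, i : 'a -> 'a
let phi x y = s (k (s i)) k x y   // same trick for ϕ; phi : 'p -> ('p -> 'q) -> 'q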
❖Position of quantifiers?
In System F, quantifiers can appear anywhere in a type, for example in the domain of an arrow: (∀α. α → α) → int. In fact, although System F cannot construct the Y combinator (that would contradict strong normalization), it can encode structures reminiscent of the untyped lambda calculus, such as self-application:
λ(x : ∀α. α → α). x [∀α. α → α] x
The type of this λ expression is (∀α. α → α) → (∀α. α → α). The type of its first argument determines that it can only act on the id function Λα. λ(x : α). x. Applying it to id, written as an untyped λ expression, is:
(λx. x x)(λx. x)
Without a doubt, it β-reduces to λx. x.
From this it can be seen that in System F, the parameters of a λ expression can themselves have polymorphic types. So why couldn't our earlier type inference algorithm infer such a polymorphic type? Because the earlier inference algorithm is essentially the type inference algorithm of the simply typed λ calculus, and the HM type system is an extension of the simply typed λ calculus.
At this point a natural question arises: can System F perform similar inference? In other words, if we just write the self-application without annotations:
λx. x x
is there an algorithm that can recover the type annotations and type applications from this expression?
In 1985, Hans-J. Boehm's paper Partial Polymorphic Type Inference is Undecidable told us that if the fix combinator is added, the above problem is undecidable. In 1995, J. B. Wells's paper Typability and Type Checking in System F Are Equivalent and Undecidable further told us that if the explicit type abstractions and type applications are not given, then even merely checking whether the term has a given type is still undecidable.
Therefore, the HM type system can be seen as a variant of either of two calculi:
- Seen as a variant of the simply typed λ calculus, the HM type system adds the form let var = e₁ in e₂ and allows e₁ to be generalized, implementing parametric polymorphism.
- Seen as a variant of System F, the HM type system restricts the position of quantifiers: they may only appear at the very front of a type. This restriction is what allows the HM type system to perform inference, while System F cannot.
However, this does not mean that a stronger type system cannot perform type inference. In fact, Haskell implements stronger type inference, and languages like Scala, TypeScript, and Rust use a technique called local type inference, which can eliminate most unnecessary type annotations based on the annotations the programmer does provide.