Solving the Maximum Subarray Sum Problem Using the Qin Jiushao Algorithm

WARNING

This article was translated from Chinese by ChatGPT and may contain some errors.

Some people have always looked down on functional programming, feeling that it merely adds unnecessary abstraction at the cost of performance. Readers who hold this view, I'm afraid, have surely never read Richard Bird's *Pearls of Functional Algorithm Design*. Admittedly, the book is rather obscure, and I have not read it cover to cover.

Fortunately, in 1989 Richard Bird wrote a paper only five pages long, *Algebraic identities for program calculation*. It uses a famous problem, the maximum subarray sum (maximum subarray problem), to fully demonstrate how functional-programming thinking looks at algorithmic problems in its own way: the maximum subarray sum problem can be solved with the Qin Jiushao algorithm.

Maximum Subarray Sum Problem

The goal of the maximum subarray sum problem is to find a contiguous subsequence of a one-dimensional sequence whose sum is as large as possible. For example, for the sequence [−2, 1, −3, 4, −1, 2, 1, −5, 4], the contiguous subsequence with the largest sum is [4, −1, 2, 1], whose sum is 6.

Obviously there is a trivial algorithm: enumerate all subarrays and take the maximum sum:

def mss(arr):
    r = 0
    for end in range(len(arr) + 1):
        for start in range(0, end):
            s = 0
            for k in range(start, end):
                s += arr[k]
            r = max(s, r)
    return r
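As a sanity check, here is a self-contained rendering of the same enumeration (the name mss_brute and the use of slicing are mine), run on the example sequence from above:

```python
def mss_brute(arr):
    # enumerate every subarray [start, end) and keep the best sum
    r = 0
    for end in range(len(arr) + 1):
        for start in range(end):
            r = max(r, sum(arr[start:end]))
    return r

print(mss_brute([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6
```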

This program would be more concise if written in Haskell:

import Data.List (tails, inits)

mss = maximum . map sum . segs
segs = concat . map tails . inits

Here tails returns all suffixes and inits returns all prefixes:

tails [1, 2, 3] = [[1, 2, 3], [2, 3], [3], []]
inits [1, 2, 3] = [[], [1], [1, 2], [1, 2, 3]]

It is easy to see that the suffixes of all prefixes (or equally, the prefixes of all suffixes) are exactly the contiguous sublists of a list.
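To make the correspondence concrete, here is a rough Python counterpart of inits, tails, and segs (the names mirror the Haskell functions; list slicing stands in for lazy lists):

```python
def inits(xs):
    # all prefixes, from [] up to xs itself
    return [xs[:i] for i in range(len(xs) + 1)]

def tails(xs):
    # all suffixes, from xs itself down to []
    return [xs[i:] for i in range(len(xs) + 1)]

def segs(xs):
    # all suffixes of all prefixes = all contiguous sublists
    return [s for p in inits(xs) for s in tails(p)]

print(tails([1, 2, 3]))  # [[1, 2, 3], [2, 3], [3], []]
print(max(map(sum, segs([-2, 1, -3, 4, -1, 2, 1, -5, 4]))))  # 6
```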

Now let's return to the algorithm itself. The time complexity of this trivial algorithm is $O(n^3)$: enumerating all subarrays takes $O(n^2)$, and summing each subarray takes $O(n)$, giving $O(n^3)$ overall.

In fact, the best algorithm solves this problem in $O(n)$ time. Bird's paper discusses how to derive that optimal algorithm from the trivial one. First, modify the algorithm slightly:

def mss(arr):
    m = 0
    for end in range(len(arr) + 1):
        m = max(max_tails(arr, end), m)
    return m

def max_tails(arr, end):
    m = 0
    for start in range(0, end):
        s = 0
        for k in range(start, end):
            s += arr[k]
        m = max(m, s)
    return m

This looks like a trivial transformation: I merely split the inner two of the three nested loops into their own function. The max_tails function returns the maximum suffix sum of arr over the interval [0, end). Writing $\uparrow$ for the binary maximum, the functional (or mathematical) version of max_tails can be written as:

$$
\begin{aligned}
\textsf{arr[0:end]} &\equiv [ x_1, x_2, \cdots, x_n ] \\
m &= \left(\sum_{i=1}^n x_i\right) \uparrow \left(\sum_{i=2}^n x_i\right) \uparrow \cdots \uparrow (x_n) \uparrow 0
\end{aligned}
$$

This may look puzzling at first glance, so let's explain it with an example. Consider max_tails([1, 2, 3], 3), i.e. the maximum of all suffix sums of [1, 2, 3].

The calculation of max_tails can be written as:

(1 + 2 + 3) ↑ (2 + 3) ↑ (3) ↑ 0

That is to say, max_tails computes every suffix sum and takes their maximum. When the maximum is written as a binary operator $\uparrow$, it naturally chains the suffix sums together into the final expression.

In the naive algorithm, finding the maximum suffix sum costs $O(n^2)$. Below, we use the Qin Jiushao algorithm to reduce this to $O(n)$.

Qin Jiushao Algorithm

In his book *Mathematical Treatise in Nine Sections* (Shushu Jiuzhang), Qin Jiushao proposed a method for finding approximate roots of high-degree equations. One component of that method is what modern researchers call the Qin Jiushao algorithm (known in the West as Horner's rule).

Consider the nn-degree polynomial

$$
f(x) = \sum_{i=0}^{n} a_i x^{i}
$$

For a given $t$, there are several ways to compute $f(t)$. The most trivial is to evaluate each term separately and add them up, which requires $O(n^2)$ multiplications. The Qin Jiushao algorithm notes that:

$$
\begin{aligned}
& a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0 \\
=\ & ((a_n x^{n - 1}) + (a_{n-1} x^{n - 2}) + \cdots + a_{1}) x + a_0 \\
=\ & \cdots \\
=\ & (((a_n x + a_{n - 1}) x + \cdots) x + a_{1})x + a_0
\end{aligned}
$$

This can be written in an iterative form:

$$
\begin{aligned}
s_0 &= a_n \\
s_{i + 1} &= s_i t + a_{n - i - 1}
\end{aligned}
$$

In this way $s_n = f(t)$. This computation performs only $n$ multiplications and $n$ additions.
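The iteration above can be sketched in a few lines of Python (the function name horner and the coefficient ordering, from $a_n$ down to $a_0$, are my own choices):

```python
def horner(coeffs, t):
    # coeffs = [a_n, a_{n-1}, ..., a_0]
    s = coeffs[0]          # s_0 = a_n
    for a in coeffs[1:]:
        s = s * t + a      # s_{i+1} = s_i * t + a_{n-i-1}
    return s

# 2x^2 + 3x + 1 at x = 4: 2*16 + 3*4 + 1 = 45
print(horner([2, 3, 1], 4))  # 45
```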

At first glance, the Qin Jiushao algorithm seems to be just a numerical calculation algorithm. What does it have to do with the max_tails function? We need to consider the generalized form of the Qin Jiushao algorithm.

First, the input to this algorithm does not necessarily have to be a polynomial. The following expression can still be rewritten using the Qin Jiushao algorithm:

$$
\begin{aligned}
& x_1x_2x_3 + x_2x_3 + x_3 + 1 \\
=\ & (((x_1 + 1) x_2) + 1) x_3 + 1
\end{aligned}
$$

Secondly, the operations considered by the Qin Jiushao algorithm need not be $(+, \cdot)$. When the algorithm extracts common factors, it is really rewriting the formula via the distributive law of multiplication. Consider any pair of operations $(\oplus, \odot)$; as long as they satisfy the following conditions, the Qin Jiushao algorithm applies:

  • $\odot$ distributes over $\oplus$ on the right:
$$
\begin{aligned}
a \cdot c + b \cdot c &= (a + b) \cdot c \\
(a \odot c) \oplus (b \odot c) &= (a \oplus b) \odot c
\end{aligned}
$$
  • There exists a left identity element for $\odot$, i.e. there exists $a$ such that $\forall x,\ a \odot x = x$

Taking the previous expression as an example:

$$
\begin{aligned}
& ((x_1 \odot x_2) \odot x_3) \oplus (x_2 \odot x_3) \oplus (x_3) \oplus (1) \\
=\ & ((((x_1 \oplus 1) \odot x_2) \oplus 1) \odot x_3) \oplus 1
\end{aligned}
$$

where $1$ denotes the left identity element of $\odot$.
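The generalized scheme can be sketched in Python by passing $\oplus$ and $\odot$ in as functions (the name horner_general and its signature are illustrative; `unit` must be a left identity of `odot`, and `odot` must distribute over `oplus` on the right):

```python
def horner_general(oplus, odot, unit, xs):
    # computes (x1 ⊙ x2 ⊙ ... ⊙ xn) ⊕ (x2 ⊙ ... ⊙ xn) ⊕ ... ⊕ xn ⊕ unit
    # via the factored form ((((unit ⊙ x1) ⊕ unit) ⊙ x2) ⊕ unit) ⊙ x3 ...
    s = unit
    for x in xs:
        s = oplus(odot(s, x), unit)
    return s

# x1*x2*x3 + x2*x3 + x3 + 1 with x = [2, 3, 4]: 24 + 12 + 4 + 1 = 41
print(horner_general(lambda a, b: a + b, lambda a, b: a * b, 1, [2, 3, 4]))  # 41
# the same scheme with (↑, +) and unit 0 gives the maximum suffix sum
print(horner_general(max, lambda a, b: a + b, 0, [1, 2, 3]))  # 6
```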

Kadane Algorithm

As mentioned earlier, max_tails can be written as:

$$
\begin{aligned}
\textsf{arr[0:end]} &\equiv [ x_1, x_2, \cdots, x_n ] \\
m &= \left(\sum_{i=1}^n x_i\right) \uparrow \left(\sum_{i=2}^n x_i\right) \uparrow \cdots \uparrow (x_n) \uparrow 0
\end{aligned}
$$

Here $\uparrow$ plays the role of $\oplus$, and $+$ plays the role of $\odot$. Addition has the identity element $0$, and the distributive property is immediate:

$$
(a \uparrow b) + c = (a + c) \uparrow (b + c)
$$

Continuing with the example of [1, 2, 3],

  (1 + 2 + 3) ↑ (2 + 3) ↑ (3) ↑ 0
= (((1 + 2) ↑ 2 ↑ 0) + 3) ↑ 0
= ((((1 ↑ 0) + 2) ↑ 0) + 3) ↑ 0
= (((((0 + 1) ↑ 0) + 2) ↑ 0) + 3) ↑ 0

Based on this property, we rewrite max_tails using the Qin Jiushao algorithm:

def step(s, a):
    # (s + a) ↑ 0
    return max(s + a, 0)

def max_tails(arr, end):
    s = 0
    for i in range(end):
        s = step(s, arr[i])
    return s

Obviously, the time complexity of the rewritten max_tails is O(n). Not only that, we notice:

$$
\begin{aligned}
& \textsf{max\_tails}(\textsf{arr}, \textsf{end} + 1) \\
=\ & \textsf{step}(\textsf{step}(\cdots,\textsf{arr}[\textsf{end} - 1]), \textsf{arr}[\textsf{end}]) \\
=\ & \textsf{step}(\textsf{max\_tails}(\textsf{arr}, \textsf{end}), \textsf{arr}[\textsf{end}])
\end{aligned}
$$

Since mss needs max_tails(arr, end) for every end in [0, n], the total complexity after the rewrite is still $O(n^2)$. However, the observation above tells us that max_tails(arr, i) can be obtained directly from max_tails(arr, i - 1), which immediately yields an $O(n)$ mss function:

def step1(state, a):
    (s, m) = state
    s_next = step(s, a)
    return (s_next, max(m, s_next))

def mss(arr):
    state = (0, 0)
    for a in arr:
        state = step1(state, a)
    return state[1]
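Running this on the example from the introduction should give 6. A condensed, self-contained rendering (step and step1 inlined; the name kadane is mine) for checking:

```python
def kadane(arr):
    # s tracks the best suffix sum so far, m the best overall
    s = m = 0
    for a in arr:
        s = max(s + a, 0)
        m = max(m, s)
    return m

print(kadane([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6
```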

Without a doubt, this is the famous Kadane algorithm.

Original Derivation

Professor Bird summarized his research in a paragraph:

I was interested in the specific task of taking a clear but inefficient functional program, a program that acted as a specification of the problem in hand, and using equational reasoning to calculate a more efficient one.

In other words, Professor Bird liked to start from a simple, clear, but inefficient functional program and optimize it through general rules into a more efficient one. Starting from the most intuitive program, he could even derive famous algorithms such as the KMP algorithm by step-by-step derivation. The original derivation for the maximum subarray sum problem in his paper is:

mss = { by definition }
      max . map sum . segs                                  O(n³)
    = { by definition of segs }
      max . map sum . concat . map tails . inits            O(n³)
    = { map promotion }
      max . concat . map (map sum) . map tails . inits      O(n³)
    = { definition of max and fold promotion }
      max . map max . map (map sum) . map tails . inits     O(n³)
    = { map distributivity }
      max . map (max . map sum . tails) . inits             O(n³)
    = { Horner's Rule }
      max . map (foldl (⊗) 0) . inits                       O(n²)
    = { scan theorem }
      max . scanl (⊗) 0                                     O(n)
    = { fold-scan fusion }
      fst . foldl (⊙) (0, 0)                                O(n)
  where a ⊗ b = (a + b) ↑ 0
        (u, v) ⊙ t = (w ↑ u, w), w = v ⊗ t

With only 8 equations, the final algorithm is obtained, which can only be described as magical. What is more valuable is that both the Qin Jiushao algorithm and the final optimization appear in the derivation as directly applicable theorems. And these theorems are identities about functions commonly used in functional programming (such as map, foldl, concat). For example, the Qin Jiushao algorithm can be expressed directly as:

fold (⊕) b . map (fold (⊗) a) . tails = fold (⊙) a
u ⊙ v = (u ⊗ v) ⊕ a
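The identity can be spot-checked in Python for the instance we care about, $\oplus = \uparrow$ (max), $\otimes = +$ and $a = b = 0$, using left folds (the helper names lhs and rhs are mine):

```python
from functools import reduce

def tails(xs):
    return [xs[i:] for i in range(len(xs) + 1)]

def lhs(xs):
    # fold (⊕) b . map (fold (⊗) a) . tails, with ⊕ = max, ⊗ = +, a = b = 0
    return reduce(max, [reduce(lambda u, v: u + v, t, 0) for t in tails(xs)], 0)

def rhs(xs):
    # fold (⊙) a, with u ⊙ v = (u ⊗ v) ⊕ a, i.e. (u + v) ↑ 0
    return reduce(lambda u, v: max(u + v, 0), xs, 0)

for xs in ([1, 2, 3], [-2, 1, -3, 4, -1, 2, 1, -5, 4]):
    print(lhs(xs) == rhs(xs))  # True each time
```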

So why does functional programming lend itself to this kind of program derivation? Because functional programming makes us attend to the algebraic properties of programs. Professor Bird also co-wrote a book called *The Algebra of Programming*.

The Unlovable Side

If readers try to submit the above code to LeetCode, they may find that it fails some tests. This is not because the algorithm is wrong, but because two different statements of the maximum subarray sum problem are in circulation.

  • (LeetCode): Given an array, give its maximum subarray sum.
  • (Programming Pearls): Given an array, give its maximum subarray sum. If the maximum subarray sum of the array is negative, then return 0.

Our problem is the Programming Pearls version, which will never return a negative maximum subarray sum.

If you want to handle the LeetCode version of the problem, a simple fix is:

mssNeg l
  | r == 0    = maximum l
  | otherwise = r
  where r = mss l
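The same fix, sketched in Python (the names mss and mss_neg mirror the Haskell; when mss returns 0, every element must be non-positive, so the answer is simply the largest single element):

```python
def mss(arr):
    # Programming Pearls version (Kadane's algorithm)
    s = m = 0
    for a in arr:
        s = max(s + a, 0)
        m = max(m, s)
    return m

def mss_neg(arr):
    # LeetCode version: if mss is 0, no element is positive,
    # so fall back to the largest single element
    r = mss(arr)
    return max(arr) if r == 0 else r

print(mss_neg([-3, -1, -2]))  # -1
```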

This approach is a bit clumsy. Can we instead modify Professor Bird's derivation to derive an algorithm that handles negative answers directly? For this problem I have no ideal solution, because Bird's very first equation already fails in that setting.

-- sum [] = 0
mss = maximum . map sum . segs

The result of segs always contains the empty list, and sum of the empty list is 0, so this equation obviously cannot produce a negative maximum sum. Considering that 0 is here precisely the identity element of + in the Qin Jiushao algorithm, I suspect that handling negative answers would require overhauling the derivation completely.

This may be the unlovable side of functional program derivation: it depends on the algebraic properties of a program, and those properties are sometimes fragile. A slight change to the problem can make them vanish. An imperative program, which does not lean on such properties, can often absorb the change trivially; for a program obtained by functional derivation, the modification is anything but trivial.

Can you email Professor Bird for his opinion?

Maybe. But Professor Bird left us forever on April 4, 2022. Is there a post office that can deliver letters to heaven?