AoC 2020 Day 18 involves parsing mathematical expressions. My final solution used the Shunting Yard algorithm adapted for expressions represented as binary trees.

Expressions as Binary Trees

Jump to the next section if you already know about how to represent expressions as binary trees, and how to recursively evaluate an expression that’s represented as a binary tree.

One way we can represent a mathematical expression is as a binary tree. Instead of writing too many words that won’t make sense together, observe the following diagrams.

Expression Trees

In the second tree, the 1 + 2 subtree is evaluated first, and that result is then added to 3. In the third example, the 2 + 3 subtree is evaluated first, followed by the addition of 1. The most important thing to note here conceptually is that the lowest subtrees are evaluated first.

Each of the circles in the trees above is a node. Each node is connected to a left and right node. This is what makes the tree a binary tree - it only has the two children nodes. Once we have all our nodes connected to each other, we can grab the entire tree by just having our topmost root node. This is the standard binary tree representation in JavaScript.

class Node {
    constructor(token) {
        this.token = token
        this.left = null
        this.right = null
    }

    setLeft(node) {
        this.left = node
    }

    setRight(node) {
        this.right = node
    }
}

In textbooks or other articles, token may be called value instead. I call the node content token here because the Shunting Yard algorithm detailed below uses this term. In the end, it’s just a variable name. You can call it nodeValue, thing, or even yum if you want. After you’ve chosen your desired variable name, let’s take a look at the recursive method to evaluate a given expression tree.

function evaluateTree(node) {
    if(node.token != "*" && node.token != "+") {
        return parseInt(node.token)
    } else if(node.token == "*") {
        return evaluateTree(node.left) * evaluateTree(node.right)
    } else if(node.token == "+") {
        return evaluateTree(node.left) + evaluateTree(node.right)
    }
}

If this is the first time you’re seeing recursion, use some of the example trees and step through it with the code provided. The most important thing to notice is that this function calls itself, but on one of its subtrees. Trees are a classic data structure where recursion is used because recursion is the perfect way to break down a tree into smaller trees to evaluate first.

Using the Shunting Yard Algorithm

Compilers use the Shunting Yard algorithm to parse expressions. Take a look at the Shunting Yard pseudocode here. The Shunting Yard algorithm is usually used to convert infix expressions into postfix expressions. What does that mean, exactly? Infix expressions are a fancy term for regular math expressions like 1 + 2 where the operation is in between the operands. The postfix version of 1 + 2 is 1 2 +, where the operator is written after its 2 operands. It’s just a different way of writing a mathematical expression. Postfix expressions turn out to be really useful because it doesn’t need parentheses, just like trees. Both postfix expressions and expression trees tell us which operations need to happen first.

For this problem however, I don’t want my expressions in the postfix form, I want them in expression tree form. We can still use the Shunting Yard algorithm provided in the link above as a reference. I’ve copied the pseudocode from the page and modified it to create expression trees. Note that it also only takes into account the + and * operators for the AoC problem as well.

Read tokens (operators or digits) from left to right:
    If token is an operator
        Pop existing operators from the stack, pop the two most recent nodes from the queue, make a subtree, and put the new operator node on the queue
        Push the current operator onto the stack
    If token is a number, make it a node and add it to queue
    If token is a left parenthesis push it onto the stack
    If token is a right parenthesis
        Pop operators from the stack, making subtrees for them, until we reach the first left parenthesis
        Pop the left parenthesis from the stack and discard it
While there are operators on the stack, pop them to the queue

And here is that pseudocode implemented in JavaScript.

function parseToTree(expression) {
    var operatorStack = []
    var nodeQueue = []

    for(var i = 0; i < expression.length; i++) {
        if(['+','*'].includes(expression.charAt(i))) { //Token is an operator
            while(operatorStack.length > 0 && ['+','*'].includes(operatorStack[operatorStack.length-1])) {
                var operatorNode = new Node(operatorStack.pop())
                var rightNode = nodeQueue.pop()
                var leftNode = nodeQueue.pop()
                operatorNode.setLeft(leftNode)
                operatorNode.setRight(rightNode)
                nodeQueue.push(operatorNode)
            }
            operatorStack.push(expression.charAt(i))
        } else if(/[1-9]{1}/.test(expression.charAt(i))) { //Token is a digit
            var newNode = new Node(expression.charAt(i))
            nodeQueue.push(newNode)
        } else if(expression.charAt(i) == "(") { //Token is a left parenthesis
            operatorStack.push("(")
        } else if(expression.charAt(i) == ")") { //Token is a right parenthesis
            while(operatorStack[operatorStack.length-1] != "(") {
                var operatorNode = new Node(operatorStack.pop())
                var rightNode = nodeQueue.pop()
                var leftNode = nodeQueue.pop()
                operatorNode.setLeft(leftNode)
                operatorNode.setRight(rightNode)
                nodeQueue.push(operatorNode)
            }
            operatorStack.pop() //Pops the left parens off the stack
        }
    }

    while(operatorStack.length > 0) {
        var operatorNode = new Node(operatorStack.pop())
        var rightNode = nodeQueue.pop()
        var leftNode = nodeQueue.pop()
        operatorNode.setLeft(leftNode)
        operatorNode.setRight(rightNode)
        nodeQueue.push(operatorNode)
    }

    return nodeQueue.pop() //Returns the last node on the queue which is the root of the tree
}

Check out my final code solution with all the bells and whistles. Note that the Node class I had used has an extra reference to the parent node because of the issue I ran into described below. The parent node is actually not needed in the Node class.

My Faulty First Attempt

I used this website for my first attempt at the problem. Let’s convert the algorithm detailed there into this problem space: Given an expression, traverse the expression from left to right. When we encounter a token, we follow these rules.

  1. If the current token is a (, add a new node as the left child of the current node, and descend to that left child.
  2. If the current token is + or *, set the value of the current node to the token. Add a new node as the right child of the current node and descend to the right child.
  3. If the current token is a number, set the root value of the current node to the number and return to the parent.
  4. If the current token is a ), go to the parent of the current node.

However, there was a big problem with the algorithm detailed above. It’s difficult to see since the example seemed to work so well! What happens when we try to use the algorithm to convert 8 + (1 * 2 * 3) into a tree?

This problem happens because when we encounter an open parenthesis, we don’t know how many terms are inside the parenthesis. So when we try to create left nodes following rule 1, we don’t know how many left nodes to create. We would need to implement some sort of lookahead, which may be more time inefficient or use more memory than the data structures used by the Shunting Yard algorithm.

Expanding the Problem

AoC is not meant to have “the hardest” code problems out there. Here are some ways you can build upon the code solution detailed in this post.

  1. Be able to handle multiple-digit numbers or decimals, like 20 + 9.15 * 2.
  2. Handle more operations such as subtraction, division, or powers. The Shunting Yard link provided above details the algorithm using order of operations.
  3. Use the Shunting Yard algorithm to create a postfix expression, then evaluate the postfix expression instead of the expression tree.