Balanced Trees, Part 4:
Left Leaning Red-Black Trees

Overview

While searching around for info about balanced trees, I found some new papers by Robert Sedgewick, describing an algorithm called Left Leaning Red-Black (LLRB) trees. Since Sedgewick (along with Guibas back in 1978) helped come up with the abstraction for red-black trees, I assumed this would be a good algorithm to test. However...

There is not yet much information about LLRBs, mostly in the form of a couple PDFs circa 2008 that appear to be based off presentations Sedgewick gave to one of his classes. The snapshots of code presented are mostly complete, but require some massaging to get into workable C++ code. And once tested, this code ties with external 2-3 trees as the worst performance for insertion and deletion — but look-up operations yield the same performance as the fastest algorithms tested.

LLRBs emulate either 2-3 trees or 2-3-4 trees, depending how insertion is performed. The name of the algorithm derives from its main rule: every 3-node must be represented by storing the red node as the left child. This creates a tree that leans to the left.

In addition to the usual rules of a red-black tree, an LLRB adds the following rules:

  • Represent 2-3-4 trees as binary search trees.
  • Use internal red edges for 3-nodes and 4-nodes.
  • Require that 3-nodes be left-leaning.

Because the tree is arranged with all 3-nodes leaning to the left, there are fewer special cases to handle in the code. Sedgewick's assertion is that this makes the code shorter (and therefore faster) than the implementations based off of other common references. However, as implemented here, both LLRBs and basic red-black trees require about 140 lines for insertion, while deletion requires 160 lines for red-black trees versus 220 lines for LLRBs. And performance testing shows that LLRBs are significantly slower.

The assumption that fewer lines of code will execute faster is a reasonable assumption. However, this only holds true when memory speed is close to CPU speed. While this can be true for some types of processors (typically embedded devices), this is not true for normal computers — in other words, any system with an Intel CPU. By their very nature, red-black trees need to touch and rearrange lots of child nodes when performing both insertions and deletions. Given how expensive it to stall the CPU while waiting for cache lines to fill from main memory, red-black trees suffer performance issues on modern computers.

Any desktop or laptop computer is going to have a CPU that is substantially faster than main memory. The more memory that a process needs to touch, the slower it will run. This is the performance cost for red-black trees on modern computers: a binary tree will require two memory blocks to store the data that a 2-3 tree can store in one block. In the general case, these blocks will be in completely unrelated portions of main memory, forcing a cache-line fetch for every block touched. The CPU will stall while these fetches are occurring, losing performance.

The other distinction of LLRBs is that they explicitly allow for 2-3-4 trees. In terms of a red-black tree, this means that a black node may have two red children. Granted, this is allowed by most implementations of red-black trees, but normally as a simplification of the algorithm — LLRBs actually include this rule in the definition.

Much in the way that a strict red-black tree has a 1:1 correspondence with 2-3 trees, an LLRB has a 1:1 correspondence with 2-3-4 trees. For an LLRB to represent a 4-node, the 4-node must be red and the right child, and its sibling must be a red 3-node.

LLRBs can emulate either 2-3 trees or 2-3-4 trees, depending on where the ColorFlip operation is performed during an insertion. The symbol USE_234_TREE is used in this code to control which type of tree is being emulated. Empirically, 2-3 trees yield slightly better performance for insertions and deletions.

According to Sedgewick, an LLRB has the following performance properties:

  • Worst case: log2 (all 2-nodes)
  • Best case: log4 N = 1/2 log2 N (all 4-nodes)
  • Tree height: Between 10 and 20 for 1 million nodes.
  • Tree height: Between 15 and 30 for 1 billion nodes.

Working source code and a test app can be found in balanced_tree.zip.

Data Structures

For evaluation purposes, I define a simple key-value pair, where data is sorted with an unsigned 32-bit integer. The value is a blind void* pointer. This pointer is not used for testing, it is simply the value returned when doing a look-up operation in the resulting binary tree.

struct VoidRef_t
{
    U32   Key;
    void* pContext;
};

Each node in the tree uses the following structure:

struct LLTB_t
{
    VoidRef_t Ref;

    bool      IsRed;

    LLTB_t*   pLeft;
    LLTB_t*   pRight;
};

Insertion

A few support routines are needed for insertion (a few of which are also shared with deletion). These are used to apply rotations to a subtree, and to change the red/black coloring of a subtree.

Consider the following tree:

Rotation Base

One operation used by both insertion and deletion is RotateLeft. This function will rotate the nodes to the left, so that 6 will take on the red/black color that 4 previously had, then change 4 to be a red node:

Rotate Left

Likewise, the RotateRight function will rotate the nodes to the right, so that 2 will take on the red/black color that 4 previously had, then change 4 to be a red node:

Rotate Right

In both rotation cases, since the new root of the subtree takes on the color of the old root node, the code will not introduce any color violations in the parent tree.

static QzLLTB_t* RotateLeft(QzLLTB_t *pNode)
{
    QzLLTB_t *pTemp = pNode->pRight;
    pNode->pRight   = pTemp->pLeft;
    pTemp->pLeft    = pNode;
    pTemp->IsRed    = pNode->IsRed;
    pNode->IsRed    = true;

    return pTemp;
}

static QzLLTB_t* RotateRight(QzLLTB_t *pNode)
{
    QzLLTB_t *pTemp = pNode->pLeft;
    pNode->pLeft    = pTemp->pRight;
    pTemp->pRight   = pNode;
    pTemp->IsRed    = pNode->IsRed;
    pNode->IsRed    = true;

    return pTemp;
}

Another frequent operation is ColorFlip. Given any node, ColorFlip will toggle the red/black color of that node and both of its children.

static void ColorFlip(QzLLTB_t *pNode)
{
    pNode->IsRed = !pNode->IsRed;

    if (NULL != pNode->pLeft) {
        pNode->pLeft->IsRed  = !pNode->pLeft->IsRed;
    }

    if (NULL != pNode->pRight) {
        pNode->pRight->IsRed = !pNode->pRight->IsRed;
    }
}

Note that since ColorFlip does change the color of the nodes without rearranging them, it is possible that this operation will introduce a color violation in the tree arrangement (such as having one red node be the child of another red node, which is prohibited). Any time that ColorFlip is called, the caller will need to apply some kind of fix-up logic to repair any color violations.

The bulk of the insertion logic is handled by InsertRec, which recursively traverses the tree until it finds the location where it should insert the new key.

QzLLTB_t* LeftLeaningRedBlack::InsertRec(QzLLTB_t* pNode, VoidRef_t ref)
{
    // Special case for inserting a leaf.  Just return the pointer;
    // the caller will insert the new node into the parent node.
    if (NULL == pNode) {
        pNode      = NewNode();
        pNode->Ref = ref;
        return pNode;
    }

    // If we perform the color flip here, the tree is assembled as a
    // mapping of a 2-3-4 tree.
#if defined(USE_234_TREE)
    // This color flip will effectively split 4-nodes on the way down
    // the tree (since 4-nodes must be represented by a node with two
    // red children).  By performing the color flip here, the 4-nodes
    // will remain in the tree after the insertion.
    if (IsRed(pNode->pLeft) && IsRed(pNode->pRight)) {
        ColorFlip(pNode);
    }
#endif

    // Check to see if the value is already in the tree.  If so, we
    // simply replace the value of the key, since duplicate keys are
    // not allowed.
    if (ref.Key == pNode->Ref.Key) {
        pNode->Ref = ref;
    }

    // Otherwise recurse left or right depending on key value.
    //
    // Note: pLeft or pRight may be a NULL pointer before recursing.
    // This indicates that pNode is a leaf (or only has one child),
    // so the new node will be inserted using the return value.
    //
    // The other reason for pass-by-value, followed by an assignment,
    // is that the recursive call may perform a rotation, so the
    // pointer that gets passed in may end up not being the root of
    // the subtree once the recursion returns.
    //
    else {
        if (ref.Key < pNode->Ref.Key) {
            pNode->pLeft = InsertRec(pNode->pLeft, ref);
        }
        else {
            pNode->pRight = InsertRec(pNode->pRight, ref);
        }
    }

    // If necessary, apply a rotation to get the correct representation
    // in the parent node as we're backing out of the recursion.  This
    // places the tree in a state where the parent can safely apply a
    // rotation to restore the required black/red balance of the tree.

    // Fix a right-leaning red node: this will assure that a 3-node is
    // the left child.
    if (IsRed(pNode->pRight) &&
        (false == IsRed(pNode->pLeft)))
    {
        pNode = RotateLeft(pNode);
    }

    // Fix two reds in a row: this will rebalance a 4-node.
    if (IsRed(pNode->pLeft) &&
        IsRed(pNode->pLeft->pLeft))
    {
        pNode = RotateRight(pNode);
    }

    // If we perform the color flip here, the tree is assembled as a
    // mapping of a 2-3 tree.
#if !defined(USE_234_TREE)
    // This color flip will effectively split 4-nodes on the way back
    // out of the tree.  By doing this here, there will be no 4-nodes
    // left in the tree after the insertion is complete.
    if (IsRed(pNode->pLeft) && IsRed(pNode->pRight)) {
        ColorFlip(pNode);
    }
#endif

    // Return the new root of the subtree that was just updated,
    // since rotations may have changed the value of this pointer.
    return pNode;
}

Another thing to note about InsertRec is the USE_234_TREE symbol. This will control where the color flip is performed. If done near the beginning of the insertion, this will produce a tree that maps to a 2-3-4 tree. But if the color flip is near the end of the function, the tree will be equivalent to a 2-3 tree.

The main Insert routine has little to do, other than to assure that the root of the tree is always black.

bool LeftLeaningRedBlack::Insert(VoidRef_t ref)
{
    m_pRoot = InsertRec(m_pRoot, ref);

    // The root node of a red-black tree must be black.
    m_pRoot->IsRed = false;

    return true;
}

Deletion

One of the fix-up routines used by deletion is MoveRedLeft. When given a sub-tree that starts with a red node, it applies the following sequence of transforms:

MoveRedLeft

All of this work is done to change the 3-node (the node with the single red child) from being the right child to being the left child. Once it is complete, the new root of the subtree is still a red node, but the 3-node is now the left child, which preserves the left leaning property of the tree.

static QzLLTB_t* MoveRedLeft(QzLLTB_t *pNode)
{
    // If both children are black, we turn these three nodes into a
    // 4-node by applying a color flip.
    ColorFlip(pNode);

    // But we may end up with a case where pRight has a red child.
    // Apply a pair of rotations and a color flip to make pNode a
    // red node, both of its children become black nodes, and pLeft
    // becomes a 3-node.
    if ((NULL != pNode->pRight) &&
        IsRed(pNode->pRight->pLeft))
    {
        pNode->pRight = RotateRight(pNode->pRight);
        pNode         = RotateLeft(pNode);

        ColorFlip(pNode);
    }

    return pNode;
}

The related function to MoveRedLeft is MoveRedRight. This is used whenever pNode has two red children and its left child is a 3-node. This will rearrange the subtree so that the root is a red node, with its left child still being a 3-node.

static QzLLTB_t* MoveRedRight(QzLLTB_t *pNode)
{
    // Applying a color flip may turn pNode into a 4-node,
    // with both of its children being red.
    ColorFlip(pNode);

    // However, this may cause a situation where both of pNode's
    // children are red, along with pNode->pLeft->pLeft.  Applying a
    // rotation and a color flip will fix this special case, since
    // it makes pNode red and pNode's children black.
    if ((NULL != pNode->pLeft) &&
        IsRed(pNode->pLeft->pLeft))
    {
        pNode = RotateRight(pNode);

        ColorFlip(pNode);
    }

    return pNode;
}

Internal nodes are not easy to delete, since this might require rearranging a substantial part of the tree. Instead, it is easier to find the smallest key that is larger than the key being deleted. This can be done by starting at the node's right child, then traversing left until we reach a leaf node. The contents of that leaf node can then be used to replace the value being deleted.

This replacement key is found with FindMin:

static QzLLTB_t* FindMin(QzLLTB_t *pNode)
{
    while (NULL != pNode->pLeft) {
        pNode = pNode->pLeft;
    }

    return pNode;
}

After replacing the deleted value with a different key, the empty leaf node selected by FindMin can be deleted by calling DeleteMin.

QzLLTB_t* LeftLeaningRedBlack::DeleteMin(QzLLTB_t *pNode)
{
    // If this node has no children, we're done.
    // Due to the arrangement of an LLRB tree, the node cannot have a
    // right child.
    if (NULL == pNode->pLeft) {
        Free(pNode);
        return NULL;
    }

    // If these nodes are black, we need to rearrange this subtree to
    // force the left child to be red.
    if ((false == IsRed(pNode->pLeft)) &&
        (false == IsRed(pNode->pLeft->pLeft)))
    {
        pNode = MoveRedLeft(pNode);
    }

    // Continue recursing to locate the node to delete.
    pNode->pLeft = DeleteMin(pNode->pLeft);

    // Fix right-leaning red nodes and eliminate 4-nodes on the way up.
    // Need to avoid allowing search operations to terminate on 4-nodes,
    // or searching may not locate intended key.
    return FixUp(pNode);
}

As we recurse down the tree, the code will leave right-leaning red nodes and unbalanced 4-nodes. These rule violations will be repaired when recursing back out of the tree by the FixUp function:

static QzLLTB_t* FixUp(QzLLTB_t *pNode)
{
    // Fix right-leaning red nodes.
    if (IsRed(pNode->pRight)) {
        pNode = RotateLeft(pNode);
    }

    // Detect if there is a 4-node that traverses down the left.
    // This is fixed by a right rotation, making both of the red
    // nodes the children of pNode.
    if (IsRed(pNode->pLeft) &&
        IsRed(pNode->pLeft->pLeft))
    {
        pNode = RotateRight(pNode);
    }

    // Split 4-nodes.
    if (IsRed(pNode->pLeft) &&
        IsRed(pNode->pRight))
    {
        ColorFlip(pNode);
    }

    return pNode;
}

The main deletion logic is recursive. Comment blocks aside, this code is fairly short, since most of the support logic is found in other functions. As mentioned above, this code will leave right-leaning red nodes and unbalanced 4-nodes as it works its way down the tree. Once the deletion has been completed, the code will back out of the recursion, allowing FixUp to fix all rule violations created while traversing down the tree.

QzLLTB_t* LeftLeaningRedBlack::DeleteRec(QzLLTB_t *pNode, const U32 key)
{
    if (key < pNode->Ref.Key) {
        if (NULL != pNode->pLeft) {
            // If pNode and pNode->pLeft are black, we may need to
            // move pRight to become the left child if a deletion
            // would produce a red node.
            if ((false == IsRed(pNode->pLeft)) &&
                (false == IsRed(pNode->pLeft->pLeft)))
            {
                pNode = MoveRedLeft(pNode);
            }

            pNode->pLeft = DeleteRec(pNode->pLeft, key);
        }
    }
    else {
        // If the left child is red, apply a rotation so we make
        // the right child red.
        if (IsRed(pNode->pLeft)) {
            pNode = RotateRight(pNode);
        }

        // Special case for deletion of a leaf node.
        // The arrangement logic of LLRBs assures that in this case,
        // pNode cannot have a left child.
        if ((key == pNode->Ref.Key) &&
            (NULL == pNode->pRight))
        {
            Free(pNode);
            return NULL;
        }

        // If we get here, we need to traverse down the right node.
        // However, if there is no right node, then the target key is
        // not in the tree, so we can break out of the recursion.
        if (NULL != pNode->pRight) {
            if ((false == IsRed(pNode->pRight)) &&
                (false == IsRed(pNode->pRight->pLeft)))
            {
                pNode = MoveRedRight(pNode);
            }

            // Deletion of an internal node: We cannot delete this node
            // from the tree, so we have to find the node containing
            // the smallest key value that is larger than the key we're
            // deleting.  This other key will replace the value we're
            // deleting, then we can delete the node that previously
            // held the key/value pair we just moved.
            if (key == pNode->Ref.Key) {
                pNode->Ref    = FindMin(pNode->pRight)->Ref;
                pNode->pRight = DeleteMin(pNode->pRight);
            }
            else {
                pNode->pRight = DeleteRec(pNode->pRight, key);
            }
        }
    }

    // Fix right-leaning red nodes and eliminate 4-nodes on the way up.
    // Need to avoid allowing search operations to terminate on 4-nodes,
    // or searching may not locate intended key.
    return FixUp(pNode);
}

Since all of the deletion logic is handled recursively, the only thing that Delete needs to do is assure that the new root node is black.

void LeftLeaningRedBlack::Delete(const U32 key)
{
    if (NULL != m_pRoot) {
        m_pRoot = DeleteRec(m_pRoot, key);

        // Assuming we have not deleted the last node from the tree, we
        // need to force the root to be a black node to conform with the
        // the rules of a red-black tree.
        if (NULL != m_pRoot) {
            m_pRoot->IsRed = false;
        }
    }
}