Pattern Search with the Knuth-Morris-Pratt (KMP) algorithm | Towards ...
November 6, 2025 Ashley Learning

Parsing is a core task in computer science, and the Pratt algorithm is one well-known technique for it. It is particularly useful for parsing mathematical expressions, and Python's readability makes it a convenient language for a compact implementation. In this post, we look at how the Pratt algorithm works, how to implement it in Python, and where it is applied.

Understanding the Pratt Algorithm

The Pratt algorithm, also known as top-down operator precedence parsing, parses expressions written in infix notation. It handles operators with different precedences and associativities, which makes it well suited to expression evaluation. The parser reads the tokens from left to right and recurses whenever the next operator binds more tightly than the current one, so higher-precedence operators are grouped, and therefore evaluated, first.

The implementation in this post is built from two functions:

  • parsePrimary: This function parses the primary expressions, which are the basic building blocks of the expression, such as numbers, variables, and parentheses.
  • parseExpression: This function parses the entire expression by recursively calling itself and the parsePrimary function.

Implementing the Pratt Algorithm in Python

To implement the Pratt algorithm in Python, we need to define the two main functions mentioned above. Let's start with the parsePrimary function, which handles the basic components of the expression.

Here is a simple implementation of the parsePrimary function:

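def parsePrimary(tokens, index):
    # Guard against running off the end, e.g. for the input "3 +".
    if index >= len(tokens):
        raise SyntaxError('Unexpected end of expression')
    token = tokens[index]
    if token.isdigit():
        # Integer literal
        return int(token), index + 1
    elif token == '(':
        # Parenthesized sub-expression: parse the inside, then require ')'.
        expr, index = parseExpression(tokens, index + 1)
        if index < len(tokens) and tokens[index] == ')':
            return expr, index + 1
        else:
            raise SyntaxError('Expected closing parenthesis')
    else:
        raise SyntaxError('Unexpected token: ' + token)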

Next, we need to implement the parseExpression function, which uses the parsePrimary function to build the entire expression. This function also handles operator precedence and associativity.

Here is the implementation of the parseExpression function:

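# Binding power of each binary operator; a higher value binds more tightly.
PRECEDENCE = {'+': 1, '-': 1, '*': 2, '/': 2}

def parseExpression(tokens, index, min_precedence=0):
    left, index = parsePrimary(tokens, index)
    while index < len(tokens) and tokens[index] in PRECEDENCE:
        op = tokens[index]
        precedence = PRECEDENCE[op]
        # Stop when the operator does not bind more tightly than the one
        # to our left. This also makes equal-precedence operators
        # left-associative, so "8 - 3 - 2" is parsed as (8 - 3) - 2.
        if precedence <= min_precedence:
            break
        right, index = parseExpression(tokens, index + 1, precedence)
        if op == '+':
            left = left + right
        elif op == '-':
            left = left - right
        elif op == '*':
            left = left * right
        elif op == '/':
            left = left / right
    return left, index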

To use these functions, we need to tokenize the input expression and then call the parseExpression function. Here is an example of how to do this:

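def evaluateExpression(expression):
    # Tokens must be separated by spaces, including around parentheses.
    tokens = expression.split()
    result, index = parseExpression(tokens, 0)
    # Reject leftover input such as a stray ")".
    if index != len(tokens):
        raise SyntaxError('Unexpected token: ' + tokens[index])
    return result

# Example usage
expression = "3 + 5 * ( 2 - 8 )"
result = evaluateExpression(expression)
print("Result:", result)  # Result: -27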

💡 Note: The above implementation expects space-separated tokens and supports only non-negative integer literals. It does not catch runtime errors such as division by zero, which surfaces as Python's ZeroDivisionError. A more robust implementation needs additional error handling.
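
Because expression.split() only separates tokens at whitespace, an input such as "3+5*(2-8)" would arrive as a single token. The following is a minimal sketch of a regular-expression tokenizer that lifts that restriction. The names TOKEN_PATTERN and tokenize are illustrative and not part of the code above, and the sketch assumes the same token set: non-negative integers, the four arithmetic operators, and parentheses.

```python
import re

# Hypothetical token pattern: an integer, one of + - * /, or a parenthesis.
TOKEN_PATTERN = re.compile(r"\d+|[-+*/()]")

def tokenize(expression):
    """Split an expression into tokens without requiring spaces."""
    tokens = []
    position = 0
    while position < len(expression):
        # Skip whitespace between tokens.
        if expression[position].isspace():
            position += 1
            continue
        match = TOKEN_PATTERN.match(expression, position)
        if match is None:
            raise SyntaxError("Unexpected character: " + expression[position])
        tokens.append(match.group())
        position = match.end()
    return tokens

print(tokenize("3+5*(2 - 8)"))
# ['3', '+', '5', '*', '(', '2', '-', '8', ')']
```

The resulting list has the same shape as the output of expression.split(), so it could be passed to parseExpression in its place.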

Applications of the Pratt Algorithm

The Pratt algorithm has a wide range of applications in various fields, including:

  • Mathematical Expression Evaluation: The primary use of the Pratt algorithm is to evaluate mathematical expressions written in infix notation. It ensures that operators with higher precedence are evaluated first, following the standard rules of arithmetic.
  • Compiler Design: The Pratt algorithm is used in compilers and interpreters to parse expressions in programming languages. The parser usually builds an abstract syntax tree (AST) rather than a value; later stages can evaluate the tree or translate it into another form, such as postfix notation.
  • Symbolic Computation: In symbolic computation, the Pratt algorithm is used to parse and evaluate symbolic expressions. This is particularly useful in fields such as mathematics, physics, and engineering, where complex expressions need to be manipulated and evaluated.
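
To illustrate the compiler-style use, here is a small sketch of a Pratt parser that builds a tree instead of computing a number, followed by a walk that emits postfix notation. It assumes space-separated tokens and the four arithmetic operators. The names BINDING_POWER, parse_primary, parse_expression, and to_postfix are hypothetical, as is the choice of nested tuples to represent tree nodes.

```python
# Assumed binding powers; a higher value binds more tightly.
BINDING_POWER = {'+': 1, '-': 1, '*': 2, '/': 2}

def parse_primary(tokens, index):
    if index >= len(tokens):
        raise SyntaxError('Unexpected end of expression')
    token = tokens[index]
    if token.isdigit():
        return int(token), index + 1
    if token == '(':
        node, index = parse_expression(tokens, index + 1)
        if index >= len(tokens) or tokens[index] != ')':
            raise SyntaxError('Expected closing parenthesis')
        return node, index + 1
    raise SyntaxError('Unexpected token: ' + token)

def parse_expression(tokens, index, min_power=0):
    left, index = parse_primary(tokens, index)
    while index < len(tokens) and tokens[index] in BINDING_POWER:
        op = tokens[index]
        power = BINDING_POWER[op]
        if power <= min_power:
            break
        right, index = parse_expression(tokens, index + 1, power)
        # Build a tree node (operator, left, right) instead of a value.
        left = (op, left, right)
    return left, index

def to_postfix(node):
    """Flatten the tree into a list of postfix tokens."""
    if isinstance(node, int):
        return [str(node)]
    op, left, right = node
    return to_postfix(left) + to_postfix(right) + [op]

tree, _ = parse_expression("3 + 5 * ( 2 - 8 )".split(), 0)
print(tree)                        # ('+', 3, ('*', 5, ('-', 2, 8)))
print(' '.join(to_postfix(tree)))  # 3 5 2 8 - * +
```

Separating parsing from evaluation in this way lets the same tree feed an evaluator, a pretty-printer, or a code generator.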

Handling Operator Precedence and Associativity

One of the key features of the Pratt algorithm is its ability to handle operator precedence and associativity. Operator precedence determines the order in which operators are evaluated, while associativity determines the direction in which operators of the same precedence are evaluated.
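
The difference between the two concepts can be checked directly with Python's own arithmetic, which follows the standard rules. This short sketch only illustrates groupings and does not involve the parser; the variable names are illustrative.

```python
# Precedence: * binds more tightly than +, so 3 + 5 * 2 groups as 3 + (5 * 2).
assert 3 + 5 * 2 == 3 + (5 * 2) == 13
assert (3 + 5) * 2 == 16

# Associativity: subtraction is left-associative, so 8 - 3 - 2 is (8 - 3) - 2.
left_grouping = (8 - 3) - 2
right_grouping = 8 - (3 - 2)
assert 8 - 3 - 2 == left_grouping == 3
assert right_grouping == 7

# Python's ** operator is right-associative: 2 ** 3 ** 2 is 2 ** (3 ** 2).
assert 2 ** 3 ** 2 == 2 ** (3 ** 2) == 512
assert (2 ** 3) ** 2 == 64
```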

In the Pratt algorithm, operator precedence and associativity are handled by defining a precedence table for the operators. This table specifies the precedence and associativity of each operator. The parseExpression function uses this table to determine the order of evaluation.

Here is an example of a precedence table for common arithmetic operators:

Operator    Precedence    Associativity
+           1             Left
-           1             Left
*           2             Left
/           2             Left

In this table, operators with higher precedence values are evaluated first. For example, multiplication and division have higher precedence than addition and subtraction. The associativity column specifies whether the operator is left-associative or right-associative. Most arithmetic operators are left-associative, meaning that they are evaluated from left to right.

To incorporate this precedence table into the Pratt algorithm, we need to modify the parseExpression function to use the table. Here is an example of how to do this:

# Precedence table: a higher value binds more tightly.
PRECEDENCE_TABLE = {
    '+': {'precedence': 1, 'associativity': 'left'},
    '-': {'precedence': 1, 'associativity': 'left'},
    '*': {'precedence': 2, 'associativity': 'left'},
    '/': {'precedence': 2, 'associativity': 'left'}
}

def parseExpression(tokens, index, precedence_table=PRECEDENCE_TABLE, min_precedence=1):
    # The default table lets parsePrimary keep calling
    # parseExpression(tokens, index) for parenthesized sub-expressions.
    left, index = parsePrimary(tokens, index)
    while index < len(tokens) and tokens[index] in precedence_table:
        op = tokens[index]
        precedence = precedence_table[op]['precedence']
        if precedence < min_precedence:
            break
        # A left-associative operator needs a right operand that binds
        # strictly tighter; a right-associative one accepts equal precedence.
        if precedence_table[op]['associativity'] == 'left':
            next_min = precedence + 1
        else:
            next_min = precedence
        right, index = parseExpression(tokens, index + 1, precedence_table, next_min)
        if op == '+':
            left = left + right
        elif op == '-':
            left = left - right
        elif op == '*':
            left = left * right
        elif op == '/':
            left = left / right
    return left, index

# Example usage
expression = "3 + 5 * ( 2 - 8 )"
tokens = expression.split()
result, _ = parseExpression(tokens, 0, PRECEDENCE_TABLE)
print("Result:", result)  # Result: -27

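
The associativity column only matters once an operator is not left-associative. As a self-contained sketch, the table below adds a hypothetical right-associative exponent operator '^', which is not part of the original table, and evaluates with Python's operator module. All names here (TABLE, OPERATIONS, parse_primary, parse_expression, evaluate) are illustrative.

```python
import operator

# Assumed table: the four arithmetic operators plus a hypothetical '^'.
TABLE = {
    '+': {'precedence': 1, 'associativity': 'left'},
    '-': {'precedence': 1, 'associativity': 'left'},
    '*': {'precedence': 2, 'associativity': 'left'},
    '/': {'precedence': 2, 'associativity': 'left'},
    '^': {'precedence': 3, 'associativity': 'right'},
}
OPERATIONS = {'+': operator.add, '-': operator.sub, '*': operator.mul,
              '/': operator.truediv, '^': operator.pow}

def parse_primary(tokens, index):
    if index >= len(tokens):
        raise SyntaxError('Unexpected end of expression')
    token = tokens[index]
    if token.isdigit():
        return int(token), index + 1
    if token == '(':
        value, index = parse_expression(tokens, index + 1)
        if index >= len(tokens) or tokens[index] != ')':
            raise SyntaxError('Expected closing parenthesis')
        return value, index + 1
    raise SyntaxError('Unexpected token: ' + token)

def parse_expression(tokens, index, min_precedence=1):
    left, index = parse_primary(tokens, index)
    while index < len(tokens) and tokens[index] in TABLE:
        op = tokens[index]
        entry = TABLE[op]
        if entry['precedence'] < min_precedence:
            break
        # Left-associative: the right operand must bind strictly tighter.
        # Right-associative: an operator of equal precedence may follow.
        bump = 1 if entry['associativity'] == 'left' else 0
        right, index = parse_expression(tokens, index + 1, entry['precedence'] + bump)
        left = OPERATIONS[op](left, right)
    return left, index

def evaluate(expression):
    tokens = expression.split()
    value, index = parse_expression(tokens, 0)
    if index != len(tokens):
        raise SyntaxError('Unexpected token: ' + tokens[index])
    return value

print(evaluate("2 ^ 3 ^ 2"))  # 512, grouped as 2 ^ (3 ^ 2)
print(evaluate("8 - 3 - 2"))  # 3, grouped as (8 - 3) - 2
```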

Extending the Pratt Algorithm

The Pratt algorithm can be extended to handle more complex expressions and additional operators. For example, we can add support for unary operators, such as negation, and functions, such as trigonometric functions.

To add support for unary operators, we need to modify the parsePrimary function to handle unary expressions. Here is an example of how to do this:

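def parsePrimary(tokens, index):
    # Guard against running off the end, e.g. for the input "3 + -".
    if index >= len(tokens):
        raise SyntaxError('Unexpected end of expression')
    token = tokens[index]
    if token.isdigit():
        return int(token), index + 1
    elif token == '(':
        expr, index = parseExpression(tokens, index + 1)
        if index < len(tokens) and tokens[index] == ')':
            return expr, index + 1
        else:
            raise SyntaxError('Expected closing parenthesis')
    elif token == '-':
        # Unary minus: negate the primary expression that follows.
        expr, index = parsePrimary(tokens, index + 1)
        return -expr, index
    else:
        raise SyntaxError('Unexpected token: ' + token)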
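
Handling negation inside parsePrimary makes unary minus bind more tightly than every binary operator. That is harmless with + - * /, but it would matter if an exponent operator were added: "- 2 ^ 2" would evaluate to 4, whereas Python evaluates -2 ** 2 to -4. One common Pratt-style approach gives each prefix operator its own binding power. The sketch below assumes a hypothetical '^' operator and (left, right) binding-power pairs; none of these names or numbers come from the code above.

```python
import operator

# Hypothetical binding powers as (left, right) pairs; higher binds tighter.
# '^' has a lower right power than left power, making it right-associative.
BINARY_POWER = {'+': (1, 2), '-': (1, 2), '*': (3, 4), '/': (3, 4), '^': (8, 7)}
PREFIX_POWER = {'-': 5}  # tighter than '*', looser than '^'
OPERATIONS = {'+': operator.add, '-': operator.sub, '*': operator.mul,
              '/': operator.truediv, '^': operator.pow}

def parse(tokens, index=0, min_power=0):
    if index >= len(tokens):
        raise SyntaxError('Unexpected end of expression')
    token = tokens[index]
    if token.isdigit():
        left, index = int(token), index + 1
    elif token in PREFIX_POWER:
        # The operand extends over every operator that binds more
        # tightly than the prefix operator itself.
        operand, index = parse(tokens, index + 1, PREFIX_POWER[token])
        left = -operand
    elif token == '(':
        left, index = parse(tokens, index + 1)
        if index >= len(tokens) or tokens[index] != ')':
            raise SyntaxError('Expected closing parenthesis')
        index += 1
    else:
        raise SyntaxError('Unexpected token: ' + token)
    while index < len(tokens) and tokens[index] in BINARY_POWER:
        op = tokens[index]
        left_power, right_power = BINARY_POWER[op]
        if left_power < min_power:
            break
        right, index = parse(tokens, index + 1, right_power)
        left = OPERATIONS[op](left, right)
    return left, index

def evaluate(expression):
    tokens = expression.split()
    value, index = parse(tokens)
    if index != len(tokens):
        raise SyntaxError('Unexpected token: ' + tokens[index])
    return value

print(evaluate("- 2 ^ 2"))  # -4, grouped as -(2 ^ 2)
print(evaluate("- 2 * 3"))  # -6, grouped as (-2) * 3
```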

To add support for functions, we need to modify the parsePrimary function to handle function calls. Here is an example of how to do this:

import math

# Supported functions, looked up by name.
FUNCTIONS = {'sin': math.sin, 'cos': math.cos, 'tan': math.tan}

def parsePrimary(tokens, index):
    if index >= len(tokens):
        raise SyntaxError('Unexpected end of expression')
    token = tokens[index]
    if token.isdigit():
        return int(token), index + 1
    elif token == '(':
        expr, index = parseExpression(tokens, index + 1)
        if index < len(tokens) and tokens[index] == ')':
            return expr, index + 1
        else:
            raise SyntaxError('Expected closing parenthesis')
    elif token == '-':
        expr, index = parsePrimary(tokens, index + 1)
        return -expr, index
    elif token in FUNCTIONS:
        # A function call has the form: name ( argument )
        index += 1
        if index >= len(tokens) or tokens[index] != '(':
            raise SyntaxError('Expected opening parenthesis')
        arg, index = parseExpression(tokens, index + 1)
        if index >= len(tokens) or tokens[index] != ')':
            raise SyntaxError('Expected closing parenthesis')
        return FUNCTIONS[token](arg), index + 1
    else:
        raise SyntaxError('Unexpected token: ' + token)

With these modifications, the Pratt algorithm can handle more complex expressions, including unary operators and functions. This makes it a powerful tool for parsing and evaluating a wide range of mathematical expressions.


In addition to extending the Pratt algorithm to handle more complex expressions, we can consider caching results. A Pratt parser reads each token once, so a single parse already runs in time proportional to the number of tokens, and memoization does not lower that. Caching pays off mainly when the same token list is evaluated repeatedly, because a previously computed result can be returned without parsing again.

To implement memoization, we can use a dictionary to store the results of previously evaluated expressions. Here is an example of how to do this:

# Precedence table: a higher value binds more tightly.
PRECEDENCE_TABLE = {
    '+': {'precedence': 1, 'associativity': 'left'},
    '-': {'precedence': 1, 'associativity': 'left'},
    '*': {'precedence': 2, 'associativity': 'left'},
    '/': {'precedence': 2, 'associativity': 'left'}
}

def parseExpression(tokens, index, precedence_table=PRECEDENCE_TABLE, memo=None, min_precedence=1):
    # memo maps (start index, min_precedence) to (value, next index).
    # It is only valid for one token list, so use a fresh dictionary
    # for each expression. A list cannot be part of a dictionary key.
    if memo is None:
        memo = {}
    key = (index, min_precedence)
    if key in memo:
        return memo[key]
    left, index = parsePrimary(tokens, index)
    while index < len(tokens) and tokens[index] in precedence_table:
        op = tokens[index]
        precedence = precedence_table[op]['precedence']
        if precedence < min_precedence:
            break
        # Left-associative operators need a strictly tighter right operand.
        if precedence_table[op]['associativity'] == 'left':
            next_min = precedence + 1
        else:
            next_min = precedence
        right, index = parseExpression(tokens, index + 1, precedence_table, memo, next_min)
        if op == '+':
            left = left + right
        elif op == '-':
            left = left - right
        elif op == '*':
            left = left * right
        elif op == '/':
            left = left / right
    # Store the result under the starting position, not the final one.
    memo[key] = (left, index)
    return left, index

# Example usage
expression = "3 + 5 * ( 2 - 8 )"
tokens = expression.split()
memo = {}
result, _ = parseExpression(tokens, 0, PRECEDENCE_TABLE, memo)
print("Result:", result)  # Result: -27

With memoization, evaluating the same token list again with the same memo dictionary returns the cached result immediately, which is useful when a small set of expressions is evaluated many times.

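
When the goal is to avoid re-parsing repeated expressions, a simpler alternative is to cache at the level of the whole expression string. The sketch below does this with functools.lru_cache from the standard library. It is self-contained and uses a compact evaluator of its own; the names BINDING_POWER, OPERATIONS, _parse, and evaluate_cached, as well as the cache size of 256, are assumptions made for illustration.

```python
import operator
from functools import lru_cache

# Assumed binding powers; a higher value binds more tightly.
BINDING_POWER = {'+': 1, '-': 1, '*': 2, '/': 2}
OPERATIONS = {'+': operator.add, '-': operator.sub,
              '*': operator.mul, '/': operator.truediv}

def _parse(tokens, index, min_power=0):
    if index >= len(tokens):
        raise SyntaxError('Unexpected end of expression')
    token = tokens[index]
    if token.isdigit():
        left, index = int(token), index + 1
    elif token == '(':
        left, index = _parse(tokens, index + 1)
        if index >= len(tokens) or tokens[index] != ')':
            raise SyntaxError('Expected closing parenthesis')
        index += 1
    else:
        raise SyntaxError('Unexpected token: ' + token)
    while index < len(tokens) and tokens[index] in BINDING_POWER:
        op = tokens[index]
        power = BINDING_POWER[op]
        if power <= min_power:
            break
        right, index = _parse(tokens, index + 1, power)
        left = OPERATIONS[op](left, right)
    return left, index

@lru_cache(maxsize=256)
def evaluate_cached(expression):
    """Evaluate a space-separated expression, caching by its exact text."""
    tokens = expression.split()
    value, index = _parse(tokens, 0)
    if index != len(tokens):
        raise SyntaxError('Unexpected token: ' + tokens[index])
    return value

print(evaluate_cached("3 + 5 * ( 2 - 8 )"))  # -27 (parsed)
print(evaluate_cached("3 + 5 * ( 2 - 8 )"))  # -27 (served from the cache)
print(evaluate_cached.cache_info().hits)     # 1
```

The cache key is the exact string, so "3 + 5" and "3  +  5" are cached separately; normalizing whitespace before the call would avoid that.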

In conclusion, the Pratt algorithm is a compact and effective technique for parsing and evaluating mathematical expressions. Its handling of operator precedence and associativity makes it applicable well beyond simple arithmetic, and a Python implementation needs only a few short functions. Whether you are working on a compiler, a symbolic computation system, or any other application that evaluates expressions, the Pratt algorithm is a valuable tool to have in your toolkit.
