일 | 월 | 화 | 수 | 목 | 금 | 토 |
---|---|---|---|---|---|---|
1 | 2 | 3 | 4 | |||
5 | 6 | 7 | 8 | 9 | 10 | 11 |
12 | 13 | 14 | 15 | 16 | 17 | 18 |
19 | 20 | 21 | 22 | 23 | 24 | 25 |
26 | 27 | 28 | 29 | 30 | 31 |
- shiba
- concurrency
- kaggle
- Python Code
- 43. Multiply Strings
- 109. Convert Sorted List to Binary Search Tree
- 밴픽
- LeetCode
- 30. Substring with Concatenation of All Words
- Class
- Protocol
- Substring with Concatenation of All Words
- 315. Count of Smaller Numbers After Self
- DWG
- attribute
- iterator
- Python Implementation
- 운영체제
- Decorator
- 시바견
- 프로그래머스
- Convert Sorted List to Binary Search Tree
- 715. Range Module
- Generator
- Regular Expression
- t1
- 파이썬
- Python
- data science
- 컴퓨터의 구조
- Today
- Total
Scribbling
Recursive Descent Parsing - Python Implementation 본문
Recursive Descent Parsing - Python Implementation
focalpoint 2022. 11. 12. 05:19
Video I recommend:
https://www.youtube.com/watch?v=SToUyjAsaFk
In this post, I will cover "Recursive Descent Parsing". Details about it are well elaborated in the above video. This post is to summarize core concepts and implement the algorithm with an actual code.
Goal: Implement a parser that parse an arithmetic experssion.
- ( 3 + 5) * -2
Elements:
Numbers
Infix Operators (+, -, *, /)
Prefix Operator (+, -)
Parenthesis ( )
Precedence
3 + 5 * 6
should be 3 + (5 * 6) not (3 + 5) * 6
Associativity
7 - 5 + 1
= (7 - 5) + 1 not 7 - ( 5 + 1)
Grammar
Grammar (Non-Terminal) Symbols: E: Expression, T: Term, F: Factor
Terminal Symbols: *, -, 37, 62, +, /, (, )
T+E, T-E, T --> E
F * T, F / T, F --> F
Integer, (E), -F --> F
Tokens
Integers {digits}
Lexical Analysis
function scanToken(): scans the input and moves nextToken
variable nextToken
(optional) OOP class for data types like integers
Class
We will use a tree structure to represent the expression. Thanks to Python's versatility, we don't need to implement different class for each operator or data types.
class Node(object):
def __init__(self, val="", left=None, right=None):
self.val = val
self.left = left
self.right = right
Print Method
Recursively call to print the expression.
Eval Method
Recursively call to evaluate the expression.
Parsing Algorithm
Functions: each function will return a node (or pointer in other languages) of the tree it builds
parseExpression()
parseTerm()
parseFactor()
Details:
- at any moment, nextToken will contain the next unscanned thing from the input.
- scanToken() to advance.
- return Null if error.
Python Implementation
To be very technical, it is very important to handle errors at each step. However, for simplicity, I won't handle errors as for now. If I have some time in the future, I might update the code.
from collections import deque
class Node(object):
def __init__(self, val=" ", left=None, right=None):
self.val = val
self.left = left
self.right = right
class Parser:
@staticmethod
def expTree(expression):
tokens = deque(list(expression))
return Parser.parse_expression(tokens)
@staticmethod
def parse_expression(tokens):
lhs = Parser.parse_term(tokens)
while tokens and tokens[0] in ['+', '-']:
op = tokens.popleft()
rhs = Parser.parse_term(tokens)
lhs = Node(val=op, left=lhs, right=rhs)
return lhs
@staticmethod
def parse_term(tokens):
lhs = Parser.parse_factor(tokens)
while tokens and tokens[0] in ['*', '/']:
op = tokens.popleft()
rhs = Parser.parse_factor(tokens)
lhs = Node(val=op, left=lhs, right=rhs)
return lhs
@staticmethod
def parse_factor(tokens):
# if integer
if tokens[0].isnumeric():
num = 0
while tokens and tokens[0].isnumeric():
num = num * 10 + int(tokens.popleft())
return Node(val=num)
elif tokens[0] == '(':
# consume '('
tokens.popleft()
# if node is none: raise error
node = Parser.parse_expression(tokens)
# consume ')'
# if token is not ')': raise error
tokens.popleft()
return node
# if negate
elif tokens[0] == '-':
# consume '-'
tokens.popleft()
# if numNode is none: raise error
numNode = Parser.parse_factor(tokens)
return Node(val='-', left=None, right=numNode)
# invalid token: raise error
else:
return None
@staticmethod
def print_tree(root):
def _print_node(node):
# leaf node with number value
if node.left is None and node.right is None:
print(node.val, end='')
return
# negate case: - (3 + 5)
if node.left is None and node.right is not None:
print('(', end='')
# node.val should be '-' (negate)
print(node.val, end='')
_print_node(node.right)
print(')', end='')
return
# operator case with both children
# if node.left is None: raise error
print('(', end='')
_print_node(node.left)
# node.val should be an operator
print(node.val, end='')
_print_node(node.right)
print(')', end='')
_print_node(root)
print()
@staticmethod
def eval_tree(root):
def _eval_node(node):
# leaf node with number value
if node.left is None and node.right is None:
return node.val
# negate case: - (3 + 5)
if node.left is None and node.right is not None:
# node.val should be '-' (negate)
return -_eval_node(node.right)
# operator case with both children
# if node.left is None: raise error
lhs, rhs = _eval_node(node.left), _eval_node(node.right)
if node.val == '+':
return lhs + rhs
elif node.val == '-':
return lhs - rhs
elif node.val == '*':
return lhs * rhs
elif node.val == '/':
return int(lhs/rhs)
# invalid operator: raise error
else:
return None
return _eval_node(root)
'''
Example
'''
root = Parser.expTree('2-3/(5*2)+1')
Parser.print_tree(root)
print(Parser.eval_tree(root))
Relevant LeetCode Problem:
1597. Build Binary Expression Tree From Infix Expression
https://leetcode.com/problems/build-binary-expression-tree-from-infix-expression/description/
We can basically use the same code for the solution !
# Definition for a binary tree node.
class Node(object):
def __init__(self, val=" ", left=None, right=None):
self.val = val
self.left = left
self.right = right
from collections import deque
class Solution:
def expTree(self, s: str) -> 'Node':
tokens = deque(list(s))
return self.parse_expression(tokens)
def parse_expression(self, tokens):
lhs = self.parse_term(tokens)
while tokens and tokens[0] in ['+', '-']:
op = tokens.popleft()
rhs = self.parse_term(tokens)
lhs = Node(val=op, left=lhs, right=rhs)
return lhs
def parse_term(self, tokens):
lhs = self.parse_factor(tokens)
while tokens and tokens[0] in ['*', '/']:
op = tokens.popleft()
rhs = self.parse_factor(tokens)
lhs = Node(val=op, left=lhs, right=rhs)
return lhs
def parse_factor(self, tokens):
# if integer
if tokens[0].isnumeric():
num = 0
while tokens and tokens[0].isnumeric():
num = num * 10 + int(tokens.popleft())
return Node(val=str(num))
elif tokens[0] == '(':
# consume '('
tokens.popleft()
# if node is none: should raise error
node = self.parse_expression(tokens)
# consume ')'
# if token is not ')': should raise error
tokens.popleft()
return node
'Computer Science > Algorithms & Data Structures' 카테고리의 다른 글
Fenwick Tree (or Binary Indexed Tree) Python Implementation, 308. Range Sum Query 2D - Mutable (0) | 2023.03.02 |
---|---|
LeetCode: 1628. Design an Expression Tree With Evaluate Function (0) | 2023.01.26 |
Rabin-Karp Algorithm (2) | 2022.11.09 |
Python Sorted Containers (0) | 2022.10.26 |
LeetCode: 1996. The Number of Weak Characters in the Game (0) | 2022.10.21 |