Kalanand's May 2014 Log
May 5th
Remove redundant parentheses
I recently encountered an interesting practical problem.
I have several mathematical expressions like
(a + (b*c)) * (d * ( f * j) )
in a large file and I want to remove the redundant parentheses.
So, the desired output in this particular instance should be:
(a + b * c) *d * f * j .
How can we do this in an automated way ?
The easiest way I can think of is the commandline sed utility:
echo "(a + (b*c)) * (d * ( f * j) )" | sed 's/[ ]//g' | sed 's/([a-z]\*[a-z])/{&}/g' | sed 's/{(//g; s/)}//g' | sed 's/([a-z]\*[a-z]\*[a-z])/{&}/g' | sed 's/{(//g; s/)}//g'
Output: (a+b*c)*d*f*j
I suspect there is a simpler way, but it took me five successive
invocations of sed command to get the desired output.
The first invocation gets rid of all the blank spaces.
echo "(a + (b*c)) * (d * ( f * j) )" | sed 's/[ ]//g'
(a+(b*c))*(d*(f*j))
The second invocation puts curly brackets around expressions like (?*?).
echo "(a + (b*c)) * (d * ( f * j) )" | sed 's/[ ]//g' | sed 's/([a-z]\*[a-z])/{&}/g'
(a+{(b*c)})*(d*{(f*j)})
The third invocation removes the combination of parentheses and curly
brackets around ?*?.
echo "(a + (b*c)) * (d * ( f * j) )" | sed 's/[ ]//g' | sed 's/([a-z]\*[a-z])/{&}/g' | sed 's/{(//g; s/)}//g'
(a+b*c)*(d*f*j)
We are still left with expressions like ?*?*?. So we need to
repeat the previous two steps, and then we are done.
Clearly, the above solution is not scalable if we also have
curly and angular brackets in the mix. In that case we do need to use a
regular programming language.
A pair of parentheses (or brackets) is necessary if and only if they
enclose an unparenthesized expression of the form X % X % ... % X, where
X are either parenthesized expressions or atoms, and % are binary operators,
and if at least one of the operators % has lower precedence than an
operator attached directly to the parenthesized expression on either
side of it.
So, we can check any pair of matching parentheses directly for
redundancy by the following algorithm:
- Let L be operator immediately left of the left parenthesis, or nil
- Let R be operator immediately right of the right parenthesis, or nil
- If L is nil and R is nil:
Redundant
Else:
- Scan the unparenthesized operators between the parentheses
- Let X be the lowest priority operator
- If X has lower priority than L or R:
Not redundant
Else:
Redundant
Here is a python code to implement this:
#!/usr/bin/python
def compare(operator1, operator2):
list1 = ['*', '/', '%']
list2 = ['+', '-', '(', ')', '{', '}', '[', ']']
priority = 0
if operator1 in list1 and operator2 in list1: priority = 0
if operator1 in list1 and operator2 in list2: priority = 1
if operator1 in list2 and operator2 in list1: priority = -1
if operator1 in list2 and operator2 in list2: priority = 0
return priority
def findLowestPriorityOperator(operators):
lowest = operators[0]
for operator in operators:
if compare(operator, lowest) < 0: lowest = operator
return lowest
def getOperators(string, start, end):
operators = []
symbols = ['+', '-', '*', '/', '%', '(', ')', '{', '}', '[', ']']
for symbol in symbols:
if(symbol in string): operators.extend(symbol)
return operators
def removeRedundantParentheses(str, symbol1, symbol2):
str = str.replace(' ', '') # get rid of white space
if not symbol1 in str or not symbol2 in str:
return str # do nothing
size = len(str)
open = str.rfind(symbol1)
close = str.find(symbol2, open, size)
# First check the operators outside the parentheses
outer1 = '',
if open != 0: outer1 = str[open-1]
outer2 = ''
if close != size-1: outer2 = str[close+1]
# Now get the list of operators inside the parentheses
ref = getOperators(str, open, close)
lowest = findLowestPriorityOperator(ref) # find lowest priority
# Check if redundant
isRedundant = False
if outer1 == '' and outer2 == '': isRedundant = True
elif compare(lowest, outer1)>=0 and compare(lowest, outer2)>=0:
isRedundant = True
# If redundant, we need to check recursively
if not isRedundant: return str
str = str[0:open] + str[open+1: close] + str[close+1: size]
str = removeRedundantParentheses(str, symbol1, symbol2)
return str
def removeRedundant(str):
str = removeRedundantParentheses(str, '[', ']')
str = removeRedundantParentheses(str, '{', '}')
str = removeRedundantParentheses(str, '(', ')')
return str
# Try with a few complicated expressions
print removeRedundant('a * ( { [( b+ c )] } )')
print removeRedundant('(a+(((b-c)))*d)')
print removeRedundant('((a+b))*c')
print removeRedundant('((a+b)/(c+d))')
print removeRedundant('(a+(((b-c)))*d)')
The output is
a*(b+c)
(a+(b-c)*d)
(a+b)*c
((a+b)/(c+d))
(a+(b-c)*d)
Go to April's log
Last modified: Wed May 7 19:00:01 CST 2014