regex - How to use re.sub to replace a match with a repl that contains '\g' in Python
re.sub(pattern, repl, string, count=0, flags=0)
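According to the docs, if \g appears in repl, Python expects the next character to be <. Unfortunately, I need repl to contain a literal \g, and I cannot write a raw string r'repl_string' in the repl position because the replacement is held in a string variable. If I pass re.escape('repl_string') instead, the call works, but the result is not what I want, since it escapes all of the special characters.
What should I do?
Here is the code I have: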
import re
newline = '<p align="center"><img src="https://s0.wp.com/latex.php?latex=%5cdisplaystyle+%7b%5cbf+p%7d%28+%7c%5cfrac%7bs_n+-+n+%5cmu%7d%7b%5csqrt%7bn%7d+%5csigma%7d%7c+%5cgeq+%5clambda+%29+%5c+%5c+%5c+%5c+%5c+%282%29&bg=ffffff&fg=000000&s=0" alt="\\displaystyle {\x08f p}( |\x0crac{s_n - n \\mu}{\\sqrt{n} \\sigma}| \\geq \\lambda ) \\ \\ \\ \\ \\ (2)" title="\\displaystyle {\x08f p}( |\x0crac{s_n - n \\mu}{\\sqrt{n} \\sigma}| \\geq \\lambda ) \\ \\ \\ \\ \\ (2)" class="latex" width="173" height="38" srcset="https://s0.wp.com/latex.php?latex=%5cdisplaystyle+%7b%5cbf+p%7d%28+%7c%5cfrac%7bs_n+-+n+%5cmu%7d%7b%5csqrt%7bn%7d+%5csigma%7d%7c+%5cgeq+%5clambda+%29+%5c+%5c+%5c+%5c+%5c+%282%29&bg=ffffff&fg=000000&s=0&zoom=2 2x" scale="2">'
# This call raises an error: the replacement contains \g (from \geq), which re.sub tries to read as a group reference
re.sub(r'<img.*?>', '\\[ {\\bf p}( |\\frac{s_n - n \\mu}{\\sqrt{n} \\sigma}| \\geq \\lambda ) \\ \\ \\ \\ \\ (2)\\]', newline, count=1)
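You need to make sure that \g is turned into \\g in the replacement string. What is more, you should replace every backslash in the replacement pattern with two backslashes to prevent further issues: re.sub processes backslash escapes in repl, so a doubled backslash comes out as a single literal backslash.
Use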
rpl = rpl.replace('\\', '\\\\')
See the demo:
import re
rpl = r'\geq \1'
# print(re.sub(r'\d+', rpl, 'text 1')) # sre_constants.error: missing group name
# print(re.sub(r'\d+', r'some \1', 'text 1')) # sre_constants.error: invalid group reference
print(re.sub(r'\d+', rpl.replace('\\', '\\\\'), 'text 1')) # => text \geq \1 (as expected)
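Applied to the kind of replacement in the question, the same idea can be wrapped in a small helper. This is only a sketch: the name literal_sub, the shortened HTML string, and the shortened LaTeX string are made up for illustration and are not from the original post. The last lines show an alternative that is not part of the answer above: passing a function as repl, whose return value re.sub inserts without any escape processing.

```python
import re

def literal_sub(pattern, replacement, text, count=0):
    """Hypothetical helper: substitute `replacement` verbatim.

    Every backslash in the replacement is doubled first, so re.sub
    does not read sequences such as \\g or \\1 as group references.
    """
    escaped = replacement.replace('\\', '\\\\')
    return re.sub(pattern, escaped, text, count=count)

# Shortened stand-ins for the strings in the question (assumed for illustration).
html = '<p align="center"><img src="x.png" alt="formula"></p>'
latex = r'\[ {\bf P}( |\frac{S_n}{\sqrt{n}}| \geq \lambda ) \]'

result = literal_sub(r'<img.*?>', latex, html, count=1)
print(result)

# Alternative: a callable repl is used as-is, with no escape processing.
via_callable = re.sub(r'<img.*?>', lambda m: latex, html, count=1)
print(via_callable == result)
```

Both approaches leave the LaTeX, including \geq, intact in the output. The callable form avoids modifying the replacement string at all, which may be convenient when the replacement comes from a variable.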