I have a bunch of Perl regexps in a script, and I would like to know how many capture groups each of them contains. More precisely, I'd like to know how many items would be added to the @- and @+ arrays on a successful match, before actually using the regexps in a real match operation.
An example:
'XXAB(CD)DE\FG\XX' =~ /(?i)x(ab)\(cd\)(?:de)\\(fg\\)x/
and print "'@-', '@+'\n";
In this case the output is:
'1 2 11', '15 4 14'
So after matching I know that the 0th item describes the whole matched part of the string, and that there are two capture groups. Is it possible to know this before the actual match?
I tried to concentrate on the opening brackets. First I removed the '\\' sequences to make it easier to detect escaped brackets. Then I removed the '\(' strings, and then the '(?' strings. Now I can count the remaining opening brackets.
my $re = '(?i)x(ab)\(cd\)(?:de)\\\\(fg\\\\)x'; print "ORIG: '$re'\n";
'XXAB(CD)DE\FG\XX' =~ /$re/ and print "RE: '@-', '@+'\n";
$re =~ s/\\\\//g; print "\\\\: '$re'\n";
$re =~ s/\\\(//g; print "\\(: '$re'\n";
$re =~ s/\(\?//g; print "\\?: '$re'\n";
my $n = ($re =~ s/\(//g); print "n=$n\n";
Output:
ORIG: '(?i)x(ab)\(cd\)(?:de)\\(fg\\)x'
RE: '1 2 11', '15 4 14'
\\: '(?i)x(ab)\(cd\)(?:de)(fg)x'
\(: '(?i)x(ab)cd\)(?:de)(fg)x'
\?: 'i)x(ab)cd\):de)(fg)x'
n=2
So here I know that there are 2 capture groups in this regexp. But maybe there is an easier way, and this approach is definitely not complete (e.g. it treats (?<foo>...) and (?'foo'...) as non-capture groups).
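The named-group gap mentioned above could be narrowed with a slightly different scan. What follows is only a sketch: guess_groups is a hypothetical helper name, and the approach is still a text heuristic, so it will miscount brackets inside character classes such as [^()], inside # comments under /x, inside (?{ ... }) code blocks, and in verbs like (*FAIL).

```perl
use strict;
use warnings;

# Hypothetical helper: a heuristic count of capture groups in a pattern string.
sub guess_groups {
    my ($re) = @_;

    # Drop every escaped character in one pass (covers \\, \( and \)).
    $re =~ s/\\.//gs;

    my $n = 0;
    # Count a '(' when it is not followed by '?', or when it opens a
    # named group: (?<name>...), (?P<name>...) or (?'name'...).
    # Lookbehinds (?<= and (?<! are not counted.
    $n++ while $re =~ /\((?!\?)|\(\?P?<[A-Za-z_]|\(\?'/g;
    return $n;
}

print guess_groups('(?i)x(ab)\(cd\)(?:de)\\\\(fg\\\\)x'), "\n";   # 2
print guess_groups('(?<foo>a)(?\'bar\'b)(?<=c)(d)'), "\n";        # 3
```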
Another way would be to dump the internal data structures of the regcomp function. Maybe the Regexp::Debugger package could solve the issue, but I am not allowed to install packages in my environment.
The regexps are actually keys that map to ARRAY refs, and I'd like to check that each referenced ARRAY contains the right number of values before actually applying the regexps. Of course this check can be done right after the pattern matching, but it would be nicer if I could do it in the loading stage of the script.
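One way to sketch such a loading-stage check is to let Perl compile each pattern and then read the group count from @+. perlvar documents $#+ as the number of subgroups in the last successful match, so prepending an empty alternative that always matches the empty string makes that number available without real input. The names count_groups and %table, and the table contents, are assumptions made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count capture groups by compiling the pattern.
# The empty first alternative always matches, and after a successful
# match $#+ is the number of groups in the whole pattern, whether or
# not they took part in the match.
sub count_groups {
    my ($re) = @_;
    "" =~ /|$re/ or die "the empty alternative should always match";
    return $#+;
}

# Hypothetical table: pattern => array ref with one value per group.
my %table = (
    '(?i)x(ab)\(cd\)(?:de)\\\\(fg\\\\)x' => [ 'first', 'second' ],
    '(?<year>\d{4})-(\d\d)'              => [ 'year' ],   # one value short
);

for my $re (sort keys %table) {
    my $want = count_groups($re);
    my $have = @{ $table{$re} };
    printf "%s: %d group(s), %d value(s)%s\n",
        $re, $want, $have, ($want == $have ? "" : "  <-- mismatch");
}
```

Because the regex engine does the parsing, named groups, character classes and escaped brackets are counted the way a real match would count them. A pattern that does not compile dies at this point, which may itself be useful at load time.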
Thank you for your help and comments in advance!
[^()] – ikegami Jan 19 at 14:51
# () (when /x is used) – ikegami Jan 19 at 14:51
(?{ () }) and similar. – ikegami Jan 19 at 14:52