Robert's Perl Tutorial

http://www.sthomas.net/roberts-perl-tutorial.htm


The Difference Between + and *

You know what * means: match 0 or more of whatever precedes it. If you want to match 1 or more, use + instead. The difference is important.

$_='The number is 2200 and the day is Monday';

($star)=/([0-9]*)/;

($plus)=/([0-9]+)/;

print "Star is '$star' and Plus is '$plus'\n";

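You'll note that $star is empty. The match was successful though. It managed to match zero characters from 0 to 9 at the very start of the string. Because * is satisfied by zero characters, the regex succeeds at the first position it tries and never gets as far as 2200.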
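If you want to check where that empty match happened, the sketch below uses Perl's built-in @- and @+ arrays, which hold the start and end offsets of the last successful match. The variable names ($text, $from, $to) are just for illustration and are not part of the original example.

```perl
use strict;
use warnings;

my $text = 'The number is 2200 and the day is Monday';

# Match against $text explicitly instead of relying on $_
my ($star) = $text =~ /([0-9]*)/;

# $-[0] and $+[0] are the start and end offsets of the whole match
my ($from, $to) = ($-[0], $+[0]);

print "Star is defined\n" if defined $star;
print "Star has length ", length($star), "\n";
print "The match ran from offset $from to offset $to\n";
```

Both offsets come out as 0: the match succeeded at the very first position of the string and consumed nothing.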

The second regex with $plus worked a little better, because we are matching one or more characters from 0 to 9. Therefore, unless at least one character from 0 to 9 is found, the match will fail. Once a 0-9 is found, the match continues for as long as the next character is also 0-9, then it stops.
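Here is a small sketch of both outcomes for + : one string with digits in it and one without. The second sample string and the @results array are made up for this illustration.

```perl
use strict;
use warnings;

my @samples = (
    'The number is 2200 and the day is Monday',
    'No digits here at all',
);

my @results;
for my $text (@samples) {
    # In list context a failed match returns an empty list,
    # so $plus is left undefined when no digit is found.
    my ($plus) = $text =~ /([0-9]+)/;
    push @results, $plus;

    if (defined $plus) {
        print "Plus matched '$plus'\n";
    } else {
        print "Plus failed on '$text'\n";
    }
}
```

With * in place of + the second string would still "match", just with nothing captured, which is rarely what you want.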

Now that we know this, there is another way to extract an email address from within angle brackets:

$_='My email address is <robert@netcat.co.uk> !.';

/<([^>]+)/;

print "Found it ! $1\n";

This regex matches <. Then the capturing parens start. They have no effect on this regex other than to capture the match. After that, there is a character class containing one character, >. As ^ is the first character in the class, it negates the class, so [^>] matches any single character except >. That's why we are using a character class with only one character in it: a class can be negated, whereas a bare > cannot.

So far we have matched < and then anything that is not >. The + ensures we match as many characters that are not >'s as we can, and at least one. On this string it captures the same text as <(.*?)> but is usually more efficient, because the regex engine does not have to pause after every character to test whether the rest of the pattern matches yet. It may also suit your purposes better, as .*? relies on you knowing what you want to match up to, whereas [^>]+ simply continues matching until it finds something that fails its criteria, or until the string runs out. Just make sure you understand the difference because it is a crucial part of regexery.
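To make the difference concrete, here is a sketch that runs both styles against the original string and against a variant with the closing > left off. The $unclosed string, the %found hash and the show() helper are invented for this illustration.

```perl
use strict;
use warnings;

my $closed   = 'My email address is <robert@netcat.co.uk> !.';
my $unclosed = 'My email address is <robert@netcat.co.uk !.';

# Hypothetical helper: quote a captured value, or say there was none
sub show {
    my ($value) = @_;
    return defined $value ? "'$value'" : 'no match';
}

my %found;
for my $text ($closed, $unclosed) {
    my ($negated) = $text =~ /<([^>]+)/;   # stops at > or at end of string
    my ($lazy)    = $text =~ /<(.*?)>/;    # needs a literal > to succeed
    $found{$text} = [ $negated, $lazy ];

    print "String : $text\n";
    print "[^>]+  : ", show($negated), "\n";
    print ".*?    : ", show($lazy), "\n\n";
}
```

On the first string both patterns capture robert@netcat.co.uk. On the second, the .*? version fails outright because there is no > to match up to, while [^>]+ carries on to the end of the string and captures the trailing " !." as well.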