Robert's Perl Tutorial

http://www.sthomas.net/roberts-perl-tutorial.htm


\w

It would be more useful to match against a range such as a-zA-Z instead. Without /i we would need both the lower-case and the upper-case range. As a-zA-Z is such a common construct, Perl provides an easy shorthand that covers it, \w :

s/ us[^\w]/ them/g;

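The \w construct actually means 'word character'. For plain ASCII text it is equivalent to the class [a-zA-Z0-9_] - letters, digits and the underscore - and under Perl's Unicode rules it can match other letters and digits as well. That is what the substitution above uses.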
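As a quick check of that equivalence, here is a minimal sketch that runs a handful of characters through both \w and the spelled-out class. The sample characters and variable names are made up for illustration, and the comparison only holds for plain ASCII input like this.

```perl
use strict;
use warnings;

# Sample characters picked only for illustration (not from the tutorial).
my @samples = ('a', 'Z', '7', '_', ' ', ',', '.');

# Characters accepted by the \w shorthand
my @by_shorthand = grep { /^\w$/ } @samples;

# Characters accepted by the spelled-out class
my @by_class = grep { /^[a-zA-Z0-9_]$/ } @samples;

print "\\w matched:    @by_shorthand\n";
print "class matched: @by_class\n";
```

Both lines should list the same four characters: the two letters, the digit and the underscore.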

To negate a shorthand construct like this, simply capitalise it. \W matches any character that \w does not:

s/ us[\W]/ them/g;

And of course we don't need the negating caret now. In fact, we don't even need the character class, because \W stands for a single character all by itself!

s/ us\W/ them/g;
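To see that the three spellings really are interchangeable, here is a small sketch that applies each one to a copy of the same string. The sample sentence is an assumption, chosen to fit the description in this section, and the variable names are hypothetical.

```perl
use strict;
use warnings;

# Assumed sample sentence; substitute whatever string you are working on.
my $text = 'Us ? The bus usually waits for us, unless the driver forgets us.';

# The three ways of writing "not a word character" shown above
my @forms = ('[^\w]', '[\W]', '\W');

my %result;
for my $form (@forms) {
    # Work on a copy so every form starts from the same input
    (my $copy = $text) =~ s/ us$form/ them/g;
    $result{$form} = $copy;
    print "$form => $copy\n";
}
```

All three lines should show the same sentence, with ' us,' and ' us.' both turned into ' them'.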

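So far, so good. Matching the first 'us' is going to be difficult though, because it has no space in front of it. Fortunately, there is an easy solution. We've seen Perl's definition of a word character - \w . Wherever a \w character sits next to a \W character, or next to the start or end of the string, there is a word boundary. You can match this with \b , which matches the position itself and not a character.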

s/\bus\W/ them/g;

that's \b followed by 'us', not 'bus' :-)
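If you want to convince yourself what \b does on its own, here is a minimal sketch that tries /\bus/ against a few short strings. The strings and the %found hash are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical test strings, chosen only for illustration.
my @words = ('us', 'bus', 'usually', 'for us', 'the bus');

my %found;
for my $w (@words) {
    # \b demands a word boundary immediately before the 'u'
    $found{$w} = ($w =~ /\bus/) ? 'yes' : 'no';
    print "$w: $found{$w}\n";
}
```

'bus' and 'the bus' are rejected because the 'u' follows another word character, so there is no boundary in front of it. 'usually' still gets through, which is why the pattern keeps the \W after 'us'.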

Now we require a word boundary before 'us'. The start of the string counts as a boundary when the first character is a word character, so the first 'Us' can match there. There is a space after it, which satisfies \W , so the match is successful. You might notice an extra space has crept in - that's the space we added earlier. The match doesn't include the leading space any more: \b matches the position just before the word begins and consumes nothing, so the original space stays where it was.
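Here is a small sketch of the whole substitution so you can see the extra space for yourself. It assumes the sample sentence below and assumes /i is in effect, as the mention of matching 'Us' implies. The square brackets in the print are only there to make the leading space visible.

```perl
use strict;
use warnings;

# Assumed sample sentence; the real string comes from earlier in the tutorial.
my $text = 'Us ? The bus usually waits for us, unless the driver forgets us.';

# Copy first, then substitute, so $text is left untouched
(my $changed = $text) =~ s/\bus\W/ them/gi;

print "[$changed]\n";
# prints: [ them? The bus usually waits for  them unless the driver forgets  them]
```

The doubled spaces come from the original space, which \b leaves alone, plus the leading space in ' them'.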

Did you notice that the final period and the comma have disappeared? They are part of the match - it is the \W that matches them - so they are replaced along with 'us'. We can't avoid matching them. We can, however, put that part of the match back.
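To see exactly what gets swallowed, here is a minimal sketch that lists every piece of text the pattern matches, without replacing anything. As before, the sample sentence and the use of /i are assumptions, and @matched is just a name picked for the example.

```perl
use strict;
use warnings;

# Assumed sample sentence, as in the rest of this section.
my $text = 'Us ? The bus usually waits for us, unless the driver forgets us.';

# In list context, a /g match with no parentheses returns every matched string
my @matched = $text =~ /\bus\W/gi;

print "[$_]\n" for @matched;
```

Each item is 'us' plus the one character that \W picked up: a space, the comma and the final period.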