Robert's Perl Tutorial

http://www.sthomas.net/roberts-perl-tutorial.htm


Opening a Process

The problem with backticks is that you have to wait for the entire process to complete, then analyse everything it returned in one go. This is a big problem if the command produces a lot of output or runs slowly. For example, the DOS command tree. If you aren't familiar with this command, run a DOS/command prompt, switch to the root directory (C:\ ) and type tree. Examine the wondrous output.

We can open a process, and pipe data in via a filehandle in exactly the same way you would read a file. The code below is exactly the same as opening a filehandle on a file, with two exceptions:

  1. We use an external command, not a filename. That's the process name, in this case, tree.
  2. A pipe, i.e. |, is appended to the process name.

open TRIN, "tree c:\\ /a |" or die "Can't see the tree :$!";

while (<TRIN>) {
	print "$. $_";
}

Note the | which denotes that data is to be piped from the specified process. You can also pipe data to a process by using | as the first character.
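Piping the other way looks much the same. The following is only a sketch: the sort command is an assumption (any program that reads its standard input would do, and it is assumed to be on your path), and the fruit is made up.

```perl
use strict;
use warnings;

# "sort" is only a stand-in here: any command that reads its standard
# input would do. It is assumed to be on the PATH.
my $cmd = "sort";

# The leading | says: start the command and feed it whatever we print.
open(my $sorter, "| $cmd") or die "Can't start $cmd: $!";

print $sorter "pear\n";
print $sorter "apple\n";
print $sorter "mango\n";

# close signals end-of-input and waits for the command to finish; sort
# then writes the fruit in alphabetical order to its own standard output.
my $closed_ok = close($sorter);
warn "$cmd did not exit cleanly (status $?)\n" unless $closed_ok;
```

In the tree example the pipe sits at the other end of the command, so the data flows towards Perl rather than away from it.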

As usual, $. is the line number. What we can do now is terminate our tree early. Environmentally unsound, but efficient.

open TRIN, "tree c:\\ /a |" or die "Can't see the tree :$!";

while (<TRIN>) {
	printf "%3s $_", $.;
	last if $. == 10;
}

As soon as $. hits 10 we shut the process off by exiting the loop. Easy.

Except, maybe it won't. What if this was a long program, and you forgot about that particular line of code which exits the loop? Suppose that $. somehow went from 9 to 11, or was assigned to? It would never reach 10. So, to be safe

open TRIN, "tree c:\\ /a |" or die "Can't see the tree :$!";

while (<TRIN>) {
	printf "%3s $_", $.;
	last if $. >= 10;
}

exit your loops in a paranoid manner, unless you really mean only to exit when at line ten. For maximum safety, maybe you should create your own counter variable because $. is a global variable. I'm not necessarily advocating doing any of the above, but I am suggesting that these things be considered.
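If you did want your own counter, it might look something like this sketch. To keep it runnable anywhere, perl itself (found via $^X, and assumed to live in a path without spaces) stands in for tree; the names $seen and $limit are invented for the example.

```perl
use strict;
use warnings;

# Assumption: perl itself stands in for a slow, chatty external command.
# $^X is the path of the running perl binary (assumed to contain no spaces).
my $cmd = qq{$^X -le "print for 1..100"};

open(my $proc, "$cmd |") or die "Can't run $cmd: $!";

my $seen  = 0;     # our own counter, independent of the global $.
my $limit = 10;    # hypothetical cut-off
my @kept;

while (my $line = <$proc>) {
    $seen++;
    push @kept, sprintf("%3s %s", $seen, $line);
    last if $seen >= $limit;    # >= still fires if a value gets skipped
}

# Closing the handle ends our side of the pipe. If the command was cut
# short, close may return false and leave a non-zero status in $?.
close($proc);

print @kept;
```

Nothing else in the program can disturb $seen, which is the whole point of not leaning on $. here.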

You might notice the presence of a new function - printf . It works like print , but formats the string before printing. The formatting is controlled by such parameters as %3s , which means "pad out to a width of at least three characters". After the doublequoted format string comes whatever you want to be printed in the format specified. Some examples follow. Just uncomment each line in turn to see what it does. There is a lot of new stuff below, but try and work out what is happening. An explanation follows after the code.

$windir=$ENV{'WINDIR'};		# yes, you can access the environment variables !

$x=0;

opendir WDIR, "$windir" or die "Can't open $windir !!! Panic : $!";

while ($file= readdir WDIR) {
	next if $file=~/^\./;		# try commenting this line to see why it is there

	$age= -M "$windir/$file";	# -M returns the age in days
	$age=~s/(\d*\.\d{3}).*/$1/;	# hmmmmm

	#### %4.4d - must take up 4 columns, and pad with 0s to make up space
	####         and minimum width is also 4
	#### %10s  - must take up 10 columns, pad with spaces
	# printf "%4.4d %10s %45s \n", $x, $age, $file;

	#### %-10s - left justify
	# printf "%4.4d %-10s %-45s \n", $x, $age, $file;

	####  %10.3d - use 10 columns, and pad with 0s if fewer than 3 digits;
	####           note that %d drops the fractional part of $age
	# printf "%4.4d %10.3d %45s \n", $x, $age, $file;

	$x++;

	last if $x==15;			# we don't want to go through all the files :-)
}

There are some intentionally new functions there. When you start hacking Perl (actually, you already started if you have worked through this far) you'll see a lot of example code. Try and understand the above, then read the explanation below.

Firstly, all environment variables can be accessed and set via Perl. They are in the %ENV hash. If you aren't sure what environment variables are, refer to your friendly Microsoft documentation or books. The best known environment variable is path, and you can see its value and that of all other environment variables by simply typing set at your command prompt.
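A small sketch of reading and setting environment variables follows. TUTORIAL_DEMO is a made-up name used purely for illustration, and PATH is assumed to be set on your system.

```perl
use strict;
use warnings;

# Read a variable. PATH is assumed to exist; fall back to '' if it doesn't.
my $path = defined $ENV{PATH} ? $ENV{PATH} : '';
print "PATH is $path\n";

# Set a variable. TUTORIAL_DEMO is a made-up name for this sketch.
# The change is seen by this program and by any process it starts,
# but not by the command prompt that launched it.
$ENV{TUTORIAL_DEMO} = 'hello';
my $readback = $ENV{TUTORIAL_DEMO};
print "TUTORIAL_DEMO is $readback\n";

# Remove it again.
delete $ENV{TUTORIAL_DEMO};

# List everything, much as typing set at the command prompt does.
for my $name (sort keys %ENV) {
    print "$name=$ENV{$name}\n";
}
```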

The regex /^\./ bounces out unwanted entries before we bother doing any processing on them. Good programming practice. What it matches is "anything that begins with '.'". The caret anchors the match to the beginning of the string, and as . is a metacharacter it has to be escaped.

Perl has several tests to apply on files. The -M test returns the age in days. See the documentation for similar tests. Note that the calls to readdir return just the file, not the complete pathname. As you were careful to use a variable for the directory to be opened rather than hardcoding it (horrors) it is no trouble to glue it together by using doublequotes.
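Here is a sketch of a few of those tests in action. It builds its own scratch directory using File::Temp so it can run anywhere; the file name example.txt and its contents are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A scratch directory with one file in it, so the sketch is self-contained.
my $dir  = tempdir(CLEANUP => 1);
my $name = 'example.txt';           # hypothetical file name
my $full = "$dir/$name";            # glue directory and file name together

open(my $out, '>', $full) or die "Can't write $full: $!";
print $out "12345";
close($out) or die "Can't close $full: $!";

opendir(my $dh, $dir) or die "Can't open $dir: $!";
my @entries = grep { !/^\./ } readdir $dh;    # skip . and ..
closedir($dh);

for my $entry (@entries) {
    my $path = "$dir/$entry";       # readdir gave us only the bare name
    # -e exists, -d is a directory, -s size in bytes, -M age in days.
    # -M counts from when the script started, so a file created just now
    # shows an age very close to zero.
    printf "%-12s exists=%s dir=%s size=%d age=%.3f days\n",
        $entry,
        ((-e $path) ? 'yes' : 'no'),
        ((-d $path) ? 'yes' : 'no'),
        (-s $path),
        (-M $path);
}
```

Forget to glue the directory back on and the tests quietly look for the file in the current directory instead.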

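Try commenting out $age=~s/(\d*\.\d{3}).*/$1/ and note the size of $age . It could do with a trim. Just for regex practice, we make it a little smaller. What the regex does is:

  1. ( starts capturing.
  2. \d* matches zero or more digits.
  3. \. matches a literal dot, escaped because . is a metacharacter.
  4. \d{3} matches exactly three digits.
  5. ) stops capturing, so $1 now holds the age to three decimal places.
  6. .* matches whatever is left over.
  7. The whole match is replaced by $1 , which throws the leftovers away.

Easy !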

Mention should also be made of sprintf , which is exactly like printf except that it returns the formatted string instead of printing it. You just use it to format strings, which you can do something with later. For example :

open TRIN, "tree c:\\ /a |" or die "Can't see the tree :$!";

while (<TRIN>) {
	$line= sprintf "%3s $_", $.;
	print $line;
	last if $. == 10;
}
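
Because sprintf hands the string back, it is also a handy way to poke at format specifiers without needing a directory or a process. The values below are made up for illustration; the formats are the ones used in this section, plus %.3f as one possible alternative to the regex trim (note that it rounds where the regex simply chops).

```perl
use strict;
use warnings;

my $padded  = sprintf "%4.4d",  7;           # "0007"
my $right   = sprintf "%10s",   "abc";       # "       abc"
my $left    = sprintf "%-10s|", "abc";       # "abc       |"
my $digits  = sprintf "%10.3d", 7;           # "       007"
my $rounded = sprintf "%.3f",   12.345678;   # "12.346"

# Square brackets make the padding visible.
print "[$_]\n" for $padded, $right, $left, $digits, $rounded;
```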