Web Design and Technologies (4004-737)


9 February 2004

Today's Topic: Dynamic Sites and CGI

This week’s topic is CGI, or the “common gateway interface.” This is the interface that allows the web server to run a program on the server, and then return the results of the program to the user. Typically, the program will take as an input the contents of a form—whether it’s a simple one-box search form (like Google), or a complex, lengthy registration form.

Before starting on the exercises (which will be a separate post), read Chapter 11 of the Webmaster in a Nutshell book, and the Webmonkey article “CGI Scripts for Fun and Profit.”

You should also review the information we covered on forms, since form attributes like “action” and “method” become very important in the context of CGI.
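To see why “method” matters so much, here is a minimal sketch of how a Perl CGI program might find the form data. It relies only on the standard CGI environment variables REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH. The helper names raw_form_data and parse_form are made up for illustration, and the parsing is deliberately simplified (no repeated field names, no file uploads).

```perl
#!/usr/local/bin/perl -w
use strict;

# Return the raw, still-encoded form data for the current request.
# GET puts it in QUERY_STRING; POST sends it on standard input.
sub raw_form_data {
    my $method = $ENV{'REQUEST_METHOD'} || 'GET';
    if ($method eq 'POST') {
        my $data = '';
        read(STDIN, $data, $ENV{'CONTENT_LENGTH'} || 0);
        return $data;
    }
    return defined $ENV{'QUERY_STRING'} ? $ENV{'QUERY_STRING'} : '';
}

# Split "name=value&name2=value2" into a hash, decoding + and %XX.
sub parse_form {
    my ($raw) = @_;
    my %field;
    foreach my $pair (split /&/, $raw) {
        next unless length $pair;
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
        }
        $field{$name} = $value;
    }
    return %field;
}

# Echo the fields back as plain text.
my %form = parse_form(raw_form_data());
print "Content-type: text/plain\n\n";
foreach my $name (sort keys %form) {
    print "$name = $form{$name}\n";
}
```

Setting QUERY_STRING in your environment before running this at the prompt simulates a GET request without needing a browser.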

Dynamic Sites and CGI: Second Exercise

Now that you’ve done a little bit with CGI, I’m going to have you retrieve a CGI script from an archive and install it into your account on Grace. Then you’ll develop a form to access that script.

We’ll be using a script from Matt’s Script Archive called FormMail, which emails form inputs to you. Go to the archive and download the script now. You should also review the ReadMe file for FormMail.

Part 1: Configuring and Installing FormMail

You’ll be installing the formmail script into the same directory where you placed the first.cgi script from the last exercise. The permissions should already be set properly on that directory.

Save/download and open FormMail.pl so that you can edit the necessary variables.

The first line of the file needs to show the location of perl on the server. On Grace, the location is /usr/local/bin/perl (you can find this by running the “which perl” command at the Unix prompt).

Default: #!/usr/bin/perl
Grace: #!/usr/local/bin/perl

After that, there are only three variables in the perl file that you will need to define:

1) The $mailprog variable must give the correct location of your server’s sendmail program. If this is incorrect, form results will not be mailed to you (because the script won’t know WHERE the mail program is on your system). It’s tricky to find the sendmail program on Grace. You’d think it would be in something like /usr/lib or even /usr/bin. It’s in /usr/sbin … and, if you type “which sendmail” at the prompt, you can verify this.

Default: $mailprog = '/usr/lib/sendmail -i -t';
Yours: $mailprog = '/usr/sbin/sendmail -i -t';

2) The next thing that must be changed is called @referers. This controls basic access to your formmail script. You wouldn’t want the entire world pointing to your server, right? Let them get their own script and install it themselves. (Think of how much a spammer would enjoy having free access to your mail scripts to blanket the world with more unwanted mail.) On mine, I changed this value to ('rit.edu','lawley.net') so that I can run the script from any of my RIT accounts or my server.

The default @referers looks like this:
@referers = ('scriptarchive.com','209.196.21.3');

Yours should look like this:
@referers = ('rit.edu');

Now only web pages hosted on rit.edu can call this script.

3) The third one, @recipients, is the most important one… This one will stop spammers or hackers from using your form to pollute the world with unwanted e-mail! We can set this one to hold either domain names or specific e-mail addresses that the form can send mail to. (For an exhaustive description, see the Read Me).

It’s important to realize that you need to add the domain of each e-mail address you want to send to (sub-domains need to be listed separately!).

Default: @recipients = &fill_recipients(@referers);
Yours: @recipients = &fill_recipients('rit.edu','it.rit.edu');

Now you can have the form send e-mail to either your RIT address or your FirstClass address! While you could add something like ‘hotmail.com’, that makes the script less secure. The most secure approach would be to use specific addresses rather than domains.
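To make the “domains versus specific addresses” idea concrete, here is a small sketch of that kind of check. This is a simplified illustration and NOT FormMail’s actual code: recipient_allowed is a hypothetical helper, and the rule it applies (an entry with an “@” is a full address, anything else must match the whole domain after the “@”) is an assumption chosen to mirror the behavior described above, where sub-domains have to be listed separately.

```perl
#!/usr/local/bin/perl -w
use strict;

# Simplified illustration only; this is NOT FormMail's actual code.
# An entry containing "@" is treated as a full address; anything
# else is a domain that must match everything after the "@" exactly.
sub recipient_allowed {
    my ($address, @allowed) = @_;
    my ($user, $domain) = split /\@/, lc($address), 2;
    return 0 unless defined $domain && length $user;
    foreach my $entry (map { lc($_) } @allowed) {
        if ($entry =~ /\@/) {
            return 1 if $entry eq lc($address);
        }
        else {
            return 1 if $entry eq $domain;
        }
    }
    return 0;
}

my @allowed = ('rit.edu', 'it.rit.edu');
foreach my $to ('yourid@rit.edu', 'yourid@it.rit.edu', 'someone@hotmail.com') {
    my $verdict = recipient_allowed($to, @allowed) ? 'allowed' : 'rejected';
    printf "%-22s %s\n", $to, $verdict;
}
```

Listing a full address such as 'yourid@rit.edu' instead of 'rit.edu' narrows the check to that one mailbox, which is the more secure setup mentioned above.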

Finally, because it’s a CGI script, you’ll need to change the name of the script from FormMail.pl to formmail.cgi to get it to work on Grace. You can rename the file in any number of ways, including using the mv command in Unix or renaming the file before you upload it. (Remember: since it is a script, it MUST have execute permissions!)

Test the script by loading it directly in a browser: http://www.rit.edu/~yourid/pathtoyourcgidir/formmail.cgi. You should see a box with the name of the program and a copyright statement. If you get an error, be sure to check (a) permissions on the directory and the script (should be 755), (b) line breaks in the perl file, (c) the perl path in the first line, etc.
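If you’d rather not check those three things by hand, something like the following sketch can do it for you at the Unix prompt. The helper name check_cgi_file is made up for illustration, and the checks are only the ones listed above (mode 755, no carriage returns, a #! line that points at an executable interpreter).

```perl
#!/usr/local/bin/perl -w
use strict;

# Report likely install problems for a CGI script.
# check_cgi_file is a hypothetical helper, not part of FormMail.
sub check_cgi_file {
    my ($path) = @_;
    my @problems;

    # (a) permissions: the low bits of the mode should be 755
    my @info = stat($path) or return ("cannot stat file: $!");
    my $mode = $info[2] & 07777;
    push @problems, sprintf('mode is %o, expected 755', $mode)
        unless $mode == 0755;

    # Read the whole file without translating line endings
    open(my $fh, '<', $path) or return ("cannot open file: $!");
    binmode($fh);
    my $text = do { local $/; <$fh> };
    close($fh);
    $text = '' unless defined $text;

    # (b) line breaks: carriage returns suggest a binary-mode upload
    push @problems, 'carriage returns found (upload in ASCII mode)'
        if $text =~ /\r/;

    # (c) first line: must be a #! line naming an executable interpreter
    my ($first) = split /\n/, $text, 2;
    $first = '' unless defined $first;
    if ($first =~ /^#!\s*(\S+)/) {
        push @problems, "interpreter $1 not found or not executable"
            unless -x $1;
    }
    else {
        push @problems, 'first line is not a #! line';
    }
    return @problems;
}

my $file = shift(@ARGV) || 'formmail.cgi';
my @problems = check_cgi_file($file);
if (@problems) {
    print "$file: $_\n" foreach @problems;
}
else {
    print "$file: no obvious problems\n";
}
```

Run it from your cgi directory with the script name as the argument. It only catches the common mistakes; a clean report doesn’t guarantee the script will run under the web server.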

Part 2: Creating a Form to Use FormMail

Now you need to create a form to call the script. The form should be somewhere in your www directory tree (but not in the cgi directory, ideally). Set method="POST" and action="http://www.rit.edu/~yourid/pathtoyourcgidir/formmail.cgi".

One field on your form should be named “recipient”, and its value should be an email address whose domain is included in the @recipients array of the script (e.g., an @rit.edu or @it.rit.edu address).
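Here is one way to sketch the bare minimum of such a form, written as a small Perl program (in the style of first.cgi) that prints the HTML. The build_form helper, the hidden input, and the “comments” field are all assumptions for illustration; only the method, the action URL pattern, and the “recipient” field name come from the instructions above. You would normally just type the resulting HTML into your own page.

```perl
#!/usr/local/bin/perl -w
use strict;

# Build a minimal mail form. Both arguments are placeholders that
# you would replace with your own script URL and e-mail address.
# The "comments" field is a made-up example field.
sub build_form {
    my ($action, $recipient) = @_;
    return <<"HTML";
<form method="POST" action="$action">
  <input type="hidden" name="recipient" value="$recipient">
  <p>Comments: <input type="text" name="comments"></p>
  <p><input type="submit" value="Send"></p>
</form>
HTML
}

print build_form(
    'http://www.rit.edu/~yourid/pathtoyourcgidir/formmail.cgi',
    'yourid@rit.edu',
);
```

Making “recipient” a hidden field keeps visitors from having to type your address, though anyone can still see it by viewing the page source.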

Use the documentation (the Read Me file on the FormMail page) to determine what other reserved field names are used by the script, and see if you can use them appropriately and successfully in your mail form.

Make sure you upload the HTML page with the form to Grace before you test it; otherwise the script will reject it because it’s not on an approved “referrer” site.
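The idea behind that rejection can be sketched in a few lines. This is a simplified illustration and NOT FormMail’s actual code: referer_allowed is a hypothetical helper that pulls the host name out of the referring page’s URL (which a CGI program receives in the HTTP_REFERER environment variable) and compares it against the allowed domains. What to do when the browser sends no referer at all is left out of this sketch.

```perl
#!/usr/local/bin/perl -w
use strict;

# Simplified illustration only; this is NOT FormMail's actual code.
# Accept a referring URL when its host is one of the listed domains
# or a host inside one of them (www.rit.edu is inside rit.edu).
sub referer_allowed {
    my ($referer_url, @domains) = @_;
    return 0 unless defined $referer_url
        && $referer_url =~ m{^https?://([^/:]+)}i;
    my $host = lc($1);
    foreach my $domain (map { lc($_) } @domains) {
        return 1 if $host =~ /(?:^|\.)\Q$domain\E\z/;
    }
    return 0;
}

my @referers = ('rit.edu');
foreach my $page ('http://www.rit.edu/~yourid/737/form.html',
                  'http://www.example.com/form.html') {
    my $verdict = referer_allowed($page, @referers) ? 'accepted' : 'rejected';
    print "$page => $verdict\n";
}
```

A form opened from your own hard drive has no rit.edu URL to report, which is why the page has to live on Grace before you test it.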

Open your form in a browser, fill it out, and submit it. Check your email. Did you get the information? If not, go back through these steps and keep trying until you get it installed correctly. (Sometimes there’s a delay in receiving the mail.)

Dynamic Sites and CGI: First Exercise

We need to start this exercise by making sure that you are able to install and run simple CGI programs on Grace written in Perl.

Setup: Create a subdirectory called “cgi” in your ‘www/737’ directory on Grace. Remember to set the permissions so that the web server can read and execute both the directory and the program files in it. (Example: chmod 755 first.cgi)
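If the number 755 looks like magic, this short sketch decodes it. Each octal digit covers one group (owner, group, everyone else), and each digit is the sum of read (4), write (2), and execute (1). The helper name mode_string is made up for illustration.

```perl
#!/usr/local/bin/perl -w
use strict;

# Turn a numeric permission mode into the rwx form that "ls -l" shows.
sub mode_string {
    my ($mode) = @_;
    my @bits = ('---', '--x', '-w-', '-wx', 'r--', 'r-x', 'rw-', 'rwx');
    # Shift out the owner, group, and other digits in turn.
    return join '', map { $bits[($mode >> $_) & 7] } (6, 3, 0);
}

printf "%o = %s\n", 0755, mode_string(0755);   # prints "755 = rwxr-xr-x"
```

So 755 lets you read, write, and execute the file, while the web server (as “everyone else”) can read and execute it but not change it.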

Create the Program File: Using a text editor type in the following program, exactly as it appears below:

#!/usr/local/bin/perl -w
#
# Very simple cgi script that produces a web page as output
#
#  Your name here 
#  4002-737 2/04
#

#print the http Response Header before the html document
print "Content-type: text/html\n\n";  #notice blank line

#start the web page
print "<html><head><TITLE>Generic CGI program</TITLE> </head>";
print "<body> <H1>Generic Web Page</H1> \n";
print "<HR>\n";
print "stuff goes here\n";
print "<HR>\n";

#print the end of the web page
print "</body> \n</html>\n";

#end of this program

Install the Program File: If you created it on your PC or Mac, now upload it to your cgi subdirectory. Make sure you transfer the file in “ASCII” or “Text” mode and not “binary.” Then login to grace and view the file using a Unix text editor (like pico or vi). Make sure it looks okay (are the line endings in the right place? are there strange characters? if there are problems, you may need to save it in a different format, or upload it differently…)

What should the output of this program be? (Describe it in your blog entry.)

Make sure you understand what the program is supposed to be doing, so you will recognize correct output when you see it.

Test the Program File: Now, run the program directly on Grace, by typing this command at the prompt:

perl first.cgi

If all goes well, your program should execute. If it doesn’t, track down and fix the syntax errors until it runs.

Capture the output: To more accurately simulate how the web server will execute your program, use redirection of output to capture the output of your program into a text file. Try running your program a slightly different way:

first.cgi > firstout.txt

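Notice we don’t have to use the “perl” command at the command line. The #! line at the top of the file tells the system which interpreter to use, and the execute permission lets it run the file directly. (If the shell reports that it can’t find the command, your current directory is probably not in your path; try ./first.cgi instead.)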

Examine the output to make sure it is formed correctly by viewing the file firstout.txt. If everything looks okay, keep going. If not, see if you can find the problem.
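If you want a second opinion on whether the captured file is “formed correctly,” here is a small sketch that checks the overall shape: some header lines, then a blank line, then the document. The helper name looks_like_cgi_output is made up for illustration, and it only looks for a Content-type header, since that is the one first.cgi prints.

```perl
#!/usr/local/bin/perl -w
use strict;

# True when the text has a header block containing a Content-type
# line, followed by a blank line and then something else.
sub looks_like_cgi_output {
    my ($text) = @_;
    my ($head, $body) = split /\r?\n\r?\n/, $text, 2;
    return 0 unless defined $body;
    return $head =~ /^Content-type:\s*\S+/im ? 1 : 0;
}

my $file = shift(@ARGV) || 'firstout.txt';
if (open(my $fh, '<', $file)) {
    my $text = do { local $/; <$fh> };
    close($fh);
    $text = '' unless defined $text;
    my $ok = looks_like_cgi_output($text);
    print $ok ? "$file: header block found\n" : "$file: no header block\n";
}
else {
    print "cannot open $file: $!\n";
}
```

This is no substitute for reading firstout.txt yourself, but it will catch a missing blank line, which is easy to overlook by eye.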

Question to answer in your blog entry: What precedes the html in your output? Why would that need to be there?

Now you are ready to see if the program can be run as a cgi program.

Note: on some systems the program must end in a .cgi suffix regardless of the language in which it was written, in order to run as a cgi program. Grace is one of them. Some servers also require that all cgi files be placed in a “cgi-bin” directory in your main web directory, but Grace does not require this; you can place .cgi files anywhere in your web directory.

Test your cgi program by loading it in a browser (http://www.rit.edu/~yourid/737/cgi/first.cgi).

Create Another CGI Program: If that works, create a second CGI program and call it second.cgi. This second program will not produce a web page. All that it will do is output an HTTP response header to send the user to a new location. Not surprisingly, this uses a Location: header, and works much like an HTML refresh meta tag. It tells the browser to request another document at another location. Have the program redirect to your 737 personal page (http://www.rit.edu/~yourid/737/).

Here’s an example of a Location header:

Location: http://www.w3.org/pub/WWW/People.html

How would you output this? You use a print statement to output this type of header, of course. The header is for the browser and tells it to look elsewhere for a document. Remember to output a blank line after the header! This program outputs no HTML, which means you do not return content from your program (so what header don’t you need?).

Test the Second Program: Create an index.html file for your cgi directory, and include links to both CGI programs. When a user clicks on the link to second.cgi, s/he should be redirected immediately to your main 737 page.

Question to answer on your blog: Is this a useful thing? Why bother?