Digital Media

Moshell - Spring 2000

Lecture 23 - Internet Security

This lecture discusses two kinds of security: hacking, and encryption-based defenses against hacking. Both topics are enormous, so we can only sample a few issues, including Server Side Includes, firewalls, password cracking, Secure Sockets Layer and HTTPS, and digital cash.
 

We also lead the reader through the actual setup of a secure site. However, future users of these notes will have to adapt them, since the specific commercial server, 'creationfactory.com', may not be available by then.

Here we go...

Server Side Includes

SSI stands for two things (as you find out when you search the web): Supplemental Security Income, and Server Side Includes. The latter is our topic today. Come back and see me when you're 65 to discuss the former meaning.

A good page about SSI can be found at the National Center for Supercomputing Applications (NCSA).

SSI's Purposes

In brief, SSI allows you to run a script or ask the operating system for some information, right in the middle of an HTML document. You usually have to give the document the file extension SHTML instead of HTML to get the server to look for and process the SSIs. The basic cost of SSI is that any document run through this processor must be parsed, looking for the flags. This is a performance hit on your server, and the cost is multiplied by thousands of Web accesses. That is why the SSI pre-processor is only used when the SHTML extension is seen.

As you found out on the midterm exam, SSIs are marked with a special kind of HTML comment:

<!--#   and then the SSI stuff  -->

The sequence <!--# is required, to trigger the SSI process. The little yellow book seems to have omitted the # but it is required.

Here are some of the commands that SSI supports:

<!--#ECHO VAR="DATE_LOCAL" -->    prints the date and time
<!--#INCLUDE FILE="mytext.html" --> inserts the associated file right at this point in output stream
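To make the mechanism concrete, here is a toy sketch in Python of what an SSI pre-processor does with these two commands. It is an illustration only, not how Apache or the NCSA server is actually implemented: the function names, the dictionaries standing in for server variables and files on disk, and the placeholder strings for unknown names are all assumptions made for this example.

```python
import re

# Matches <!--#COMMAND ATTR="value" -->. This sketch handles one
# attribute per directive, which is enough for ECHO and INCLUDE.
DIRECTIVE = re.compile(r'<!--#(\w+)\s+(\w+)="([^"]*)"\s*-->')


def expand_ssi(text, variables, files):
    """Replace ECHO and INCLUDE directives; leave anything else untouched."""
    def substitute(match):
        command = match.group(1).lower()
        attribute = match.group(2).lower()
        value = match.group(3)
        if command == "echo" and attribute == "var":
            # Hypothetical placeholder for a variable we do not know.
            return variables.get(value, "(unknown variable)")
        if command == "include" and attribute == "file":
            # `files` is an in-memory stand-in for files on disk.
            return files.get(value, "(missing file)")
        return match.group(0)
    return DIRECTIVE.sub(substitute, text)


def serve(filename, text, variables, files):
    """Only documents with the .shtml extension pay the parsing cost."""
    if filename.lower().endswith(".shtml"):
        return expand_ssi(text, variables, files)
    return text


page = 'Now: <!--#ECHO VAR="DATE_LOCAL" --> <!--#INCLUDE FILE="mytext.html" -->'
variables = {"DATE_LOCAL": "Monday"}
files = {"mytext.html": "<P>hi</P>"}
expanded = serve("demo.shtml", page, variables, files)
untouched = serve("demo.html", page, variables, files)
```

Here `expanded` has both directives replaced, while `untouched` is returned exactly as written because the file name does not end in .shtml.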

Here's a demo page of shtml commands. The original file looks like this:

<HTML>
<HEAD><TITLE>SSI Demo Page </TITLE></HEAD>
<BODY>
<H1>SSI Demo Page</H1><HR>
<!--#CONFIG TIMEFMT="%c" -->
Date/time: <!--#ECHO VAR="DATE_LOCAL" --> <BR>
<!--#CONFIG TIMEFMT="%x" -->
Local date: <!--#ECHO VAR="DATE_LOCAL" --> <BR>
Document Name: <!--#ECHO VAR="DOCUMENT_NAME" --> <BR>
URL: <!--#ECHO VAR="DOCUMENT_URI" --> <BR>
Last modified: <!--#ECHO VAR="LAST_MODIFIED" --> <BR>
<!--#INCLUDE FILE="insertfile.html" -->
<HR>
<!--#EXEC CMD="finger $REMOTE_USER@$REMOTE_HOST" -->
</BODY></HTML>

When this demo runs, of course, the source code your browser receives (give it a look with the View Source option)  is NOT identical to the demo page's source.

Query 23.1: Why not?

The EXEC command looks like this:

<!--#EXEC cmd="finger $REMOTE_USER@$REMOTE_HOST" -->

In this case, it's executing a simple Unix command (finger). However, you can also have it execute a CGI script. There are security hazards in both options. Consider allowing the system to execute rm *.* for instance: if 'nobody' owns any files, they are gone bye-bye.

There are two basic reasons to use SSI:

1) Sometimes it's simpler to invoke a server function, or include some other HTML file, from inside one's HTML than it is to write a special script (like our Lab #3) that generates special-purpose dynamic HTML;

2) Sometimes you want to invoke a CGI script (or even to execute a regular program) to dynamically generate part of the HTML that you need.

The second capability listed above is the one where unscrupulous people can use SSI to do mischief. If the administrator has for some foolish reason set up the system so that it can execute CGI scripts anywhere on that server, then any other user of your ISP could potentially violate your security. To see why, we gotta discuss ownership and Unix.

Ownership, suEXEC and CGIwrap

Any time you access a web server such as Apache, you start a process. But since web hits come from strangers, this process cannot belong to any particular user. Therefore it is assigned to a pseudo-user, usually called 'nobody'. Now Apache parses the URL and decides to run a Perl script in your directory. Without further help, it could only do that if 'nobody' were allowed to execute the script, for example by owning it. That script, in turn, could only access data files which also belong to 'nobody', or which were set with 666 access. With 666 access, any user of that Unix system could read or write your files! Obviously a bad idea.

CGIwrap and suEXEC are Unix utilities that deal with this problem. They allow your script to execute with your own permissions. Therefore your script can access your files, but other users of your Unix host computer cannot simply open or modify the files (as they could if you set the privileges to 666).
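To make the '666' shorthand concrete, here is a small Python sketch that decodes permission bits with the standard stat module. The function name is invented for this illustration; the point is only to show which users each mode lets in.

```python
import stat


def others_can_read_write(mode):
    """True if users other than the owner and group may read AND write."""
    return bool(mode & stat.S_IROTH) and bool(mode & stat.S_IWOTH)


# Mode 666: owner, group and everyone else may read and write.
wide_open = stat.filemode(stat.S_IFREG | 0o666)
# Mode 600: only the owner may read and write.
owner_only = stat.filemode(stat.S_IFREG | 0o600)
```

With mode 666 the function returns True, which is exactly the exposure described above; with mode 600 it returns False, which is what you can afford once your script runs under your own userid.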

A whole bunch of other security checks are performed before CGIwrap or suEXEC allows your script to run.  I find the suEXEC explanation to be relatively easy to read, and recommend that you go read it to answer the following queries.

Query 23.2: In the above suEXEC writeup, skip down to suEXEC Security Model. Step 4 (of the 20 security checks) asks:

Does the target program contain a leading '/' or have a '..' backreference?

Explain why these elements in a script's pathname are unacceptable.
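The check quoted above can be sketched in a few lines of Python. This is only an illustration of the test itself, not suEXEC's actual code, and the function name is hypothetical. It deliberately says nothing about why the test matters; that part is still your job.

```python
def is_acceptable_target(path):
    """Reject a leading '/' and any '..' component in the pathname."""
    if path.startswith("/"):
        return False
    # Split on '/' so that a name such as 'my..file.cgi' is not rejected;
    # only a whole component equal to '..' counts as a backreference.
    return ".." not in path.split("/")
```

For example, 'moshell/hello.cgi' passes, while '/bin/sh' and '../other/hello.cgi' are both refused.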

Query 23.3: suEXEC uses a Unix feature called setuid. Look this up somewhere and write a brief explanation of how it works. Or (if all else fails) consult the notes you took from my lecture on the subject.

Back to SSI security

So.... web servers that support Server Side Includes usually either turn off the EXEC feature entirely, or run EXEC commands through something like suEXEC or CGIwrap, whether or not the user explicitly requests this protection (as we do in our labs).

Firewalls

The original meaning of firewall was a masonry structure within an otherwise wooden building. It had to run all the way through the roof, so that half the building could burn down and leave the other half standing. You can see firewalls sticking through the roofs

There is an excellent tutorial on firewalls, and I'm gonna query you about it right below.

To understand firewalls, you need to understand ports. Basic services such as FTP are offered through a set of numbers called "well-known ports". By default, Unix offers a large set of such ports, which provide telnet, finger, e-mail, web access and many more capabilities. If a remote user can simply telnet into your system, then nothing is preventing him from doing so except a password (see the Password Cracking section below for how to get around passwords). So, the first idea about a firewall is to simply not pass through most inbound port access requests. A very strict form of firewall is one which only lets e-mail through. A slightly less strict one would allow most kinds of outward access, but very little inbound access.
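The port-filtering idea can be sketched as a simple lookup. The Python below is a toy model, not a real firewall: the function names and the two example policies are assumptions made for illustration, and real products look at far more than a port number. The port numbers themselves are the standard well-known assignments.

```python
# Standard well-known port numbers for a few classic services.
WELL_KNOWN_PORTS = {"ftp": 21, "telnet": 23, "smtp": 25, "finger": 79, "http": 80}


def make_firewall(inbound, outbound=None):
    """Build a port filter from lists of service names.

    `outbound=None` means every outbound port is allowed.
    """
    inbound_ports = {WELL_KNOWN_PORTS[name] for name in inbound}
    if outbound is None:
        outbound_ports = None
    else:
        outbound_ports = {WELL_KNOWN_PORTS[name] for name in outbound}

    def permits(direction, port):
        if direction == "in":
            return port in inbound_ports
        return outbound_ports is None or port in outbound_ports

    return permits


# Very strict: only e-mail (SMTP) passes, in either direction.
strict = make_firewall(inbound=["smtp"], outbound=["smtp"])
# Less strict: anything may go out, but only e-mail may come in.
relaxed = make_firewall(inbound=["smtp"])
```

Under either policy an inbound telnet request (port 23) is refused, so the remote user never even reaches the password prompt.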

Now go read the tutorial if you haven't, to learn lots of other things that firewalls do.

Query 23.4: Modems are the great enemy of firewalls. Why?

Query 23.5: Can a firewall protect you against viruses? Why or why not?

Query 23.6: A network level firewall can be thought of as a security guard standing by the front door of your company. In this analogy, what kind of scrutiny is that guard applying to stuff coming in and out of the building?

Query 23.7: Application level firewalls are computers running proxy servers. What is a proxy server? What level of specificity does a proxy server typically have (i.e. how many kinds of them do you need)?

Query 23.8: Your company has a large amount of data they want external users to be able to FTP, but they are afraid of allowing general FTP access to the intranet. What do you recommend?

Query 23.9: What is a denial of service attack?
 

Password Cracking

A hacker needs to know two things to enter a system: a userid, and the password. Userids are usually trivial to get, because they appear all over the place on websites and are the usual names used as email addresses. Passwords can be figured out in two ways:

1) If you're not inside the system yet, you have to knock on the front door and try passwords. Most systems nowadays are smart enough to take longer and longer between incorrect attempts, and to hang up after the fourth try (if on a modem.)

2) If you're inside the system (with a possibly not-very-powerful userid) and can get at the password file, you will find the encrypted versions of everyone's password. Now you can really go to town, because you can use the Unix encryption algorithm to rapidly try huge lists of words to see if they produce the same encrypted password you see in the list.

"Huge lists of words" include the entire dictionaries of most European languages, and lists of first names and last names. Common substitutions which cracker programs also try include substituting 1 for L or I and 0 (zero) for O, and varying uppercase and lowercase.

Basic password hygiene suggests that you not use any natural language words or personal names (or, obviously, any pure integers!). The best password is a totally random sequence of characters, including letters in both cases and some numbers.
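One way to get such a password is to let the computer pick it. Here is a minimal Python sketch using the standard secrets module; the function name, the default length of 12 and the choice of alphabet are assumptions for this example, not a recommendation from any standard.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def random_password(length=12):
    """Draw random characters until both cases and a digit are present."""
    if length < 3:
        raise ValueError("need at least 3 characters to mix cases and digits")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)):
            return candidate
```

Each call returns a different string, so there is no dictionary word or personal name for a cracker's word list to hit.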

Cracker software is freely downloadable on the Web. Some of the sites are downright scary.
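The core of such a program is short. Here is a sketch in Python of the offline attack described in 2) above: take each word from a list, generate case changes and digit-for-letter swaps, encrypt each candidate, and compare with the stolen entry. Big assumption: SHA-256 from the standard hashlib module stands in for the real Unix password algorithm, and the salt handling, function names and tiny word list are invented for illustration.

```python
import hashlib
from itertools import product


def hash_password(password, salt):
    """Stand-in for the system's password hash (NOT the real Unix crypt)."""
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


# Each letter maps to the characters a cracker might try in its place.
SUBSTITUTIONS = {"l": "l1", "i": "i1", "o": "o0"}


def variants(word):
    """Yield the word with digit-for-letter swaps and simple case changes."""
    seen = set()
    options = [SUBSTITUTIONS.get(ch, ch) for ch in word.lower()]
    for letters in product(*options):
        swapped = "".join(letters)
        for candidate in (swapped, swapped.upper(), swapped.capitalize()):
            if candidate not in seen:
                seen.add(candidate)
                yield candidate


def crack(target_hash, salt, wordlist):
    """Return the first candidate whose hash matches, or None."""
    for word in wordlist:
        for candidate in variants(word):
            if hash_password(candidate, salt) == target_hash:
                return candidate
    return None


stolen_entry = hash_password("P1rate", "ab")
found = crack(stolen_entry, "ab", ["apple", "pirate"])
```

Here `found` is 'P1rate': a capital letter and a 1 in place of the i did nothing to protect a dictionary word.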

Secure Sockets Layer and HTTPS

There is a simplistic tutorial available on this subject at web2010. A much more sophisticated one can be found at Netscape. This one serves as the basis for today's lecture.

Query 23.10: Explain how HTTPS uses public key encryption and yet avoids the heavy computational penalty it would pay if all the traffic on an HTTPS link were encrypted that way.
 

Setting up your own secure site

Well, it is almost too easy. The easiest way is to use your web host's own certificate, which works just as well as having your own. The only disadvantage is that the URL you use will reveal that you don't own your own equipment, but that's no shame - most e-businesses use virtual servers these days.

For creationfactory.com, the URL one uses for secure access is

https://www35.securedweb.net/creationfactory/cgibin/moshell/hello.cgi

In this case, replace 'moshell' with your own subdirectory within creationfactory's cgibin directory, and 'hello.cgi' with whatever script you want the form to run. I determined that we are on server www35 by going to web2010's FAQ page and following its advice ... to wit, look at creationfactory's configuration file which is located at

www.creationfactory.com/setup

It told me, way down at the bottom, that we are running on a machine named www35. Then I followed the advice in the FAQ about how to construct a secure URL, which is the one shown above. The https tells the system to use the Secure Sockets Layer.
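The recipe for the secure URL can be written down as a one-line Python function. The pattern is read off the single example URL in these notes, so treat it as specific to this web host and possibly out of date; the function and parameter names are made up for illustration.

```python
def secure_url(server, site, subdirectory, script):
    """Assemble the secure URL from the four pieces described above."""
    return f"https://{server}.securedweb.net/{site}/cgibin/{subdirectory}/{script}"


example = secure_url("www35", "creationfactory", "moshell", "hello.cgi")
```

To adapt it, swap in your own subdirectory and script name, and the server name reported by your host's configuration page.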

Digital Cash

Digital cash is a great idea that's taking its time becoming real. In fact, when you go see Digicash.com, you discover that they're still operating but not (it seems) making much money. There is a pretty good tutorial article that covers some key ideas, but it doesn't explain 'anonymous e-money', so we need another article.

Here's a fragment of an article that appeared in Scientific American, which is pretty good on this subject.

Here's an article that sets forth some useful ideas, by a guy named J. O. Grabbe. It was published in a Libertarian journal, as this political group is very concerned with government intrusion into privacy.

Here's a much more mathematical  J. O. Grabbe article for those who really want to go into the subject of digital money.

Query 23.11: What is a blind signature, and how does it permit Alice to spend money without either the merchant or the bank being able to trace Alice's spending patterns?

Query 23.12: How does the DigiCash scheme depend on an 'observer' software system that runs on your own smartcard? How does this idea relate to the First Principle of Cryptography (see the last lecture)?
 
