This lecture is based on Wu Chapter 7, and supplementary material about TCP, the Transmission Control Protocol.
The Internet is the descendant of ARPAnet. E-mail and FTP were two of its earliest services. Initially an ancient Network Control Protocol (1971) was used. In 1983 the net converted from NCP to TCP/IP. Domain Name Servers were introduced in 1984 when the number of hosts exceeded 1000. In 1991, Arpanet split into Milnet and Internet. HTML was created and the WWW was released. In 1992 the number of hosts exceeded 1 million. In 1996 the number of hosts exceeded 10 million.
Section 7.1.2 lists a few miscellaneous Internet applications. New trends including Java and thin clients (network PCs, an idea which has already died) are discussed in 7.1.3.
Internet Architecture
-- is shown in an idealized form, in Section 7.2. Routers connect Autonomous Systems (AS) to the Internet Backbone. However this section implies that all Internet traffic flows through the NSF-funded vBNS backbone (very high speed Backbone Network Service). In fact this doesn't happen. vBNS is only accessible to certain organizations which are participating in high speed experiments, including the Internet II experiment. The Backbone actually has several components, explained below.
The best and clearest explanation I've found of the Internet's architecture is located at the University of Central Oklahoma's web site. Seems like an unusual place to get such information, but it makes good sense. Some key elements of the system:
ISP: Internet Service Provider - equivalent to your local grocery store. Provides dial-up Point of Presence (POP) (or leased line) access to end users, via local telephone company (or maybe cable company) and a server computer. IP addresses are only granted to ISPs or organizations (like universities), not to individuals. ISPs provide a great deal of "hand-holding", teaching users how to set up their modems and deal with ordinary problems. Most small ISPs are losing money and consolidation into larger, more efficient, less user-friendly "K mart" operations is underway.
The ISP also provides a local Domain Name Server (discussed below under
Internet Technology). Some ISPs provide web hosting services for private
and business users; but web hosting is also being handled increasingly by
specialized firms sometimes called Commerce Service Providers. For a good
example of a CSP, see www.web2010.com.
RNP: Regional Network Provider - the folks from whom ISPs buy their Internet access. Operates a WAN using Frame Relay or ATM, and must access at least one Network Access Point (NAP). The NAPs are the main crossbar switches of the Internet.
NSP: National (or Network) Service Providers. These are the "Interstate Highways" or perhaps the railroads, in the metaphoric Internet. NSPs must connect to at least three NAPs using at least DS3 rates (45 Mbps). Major NSPs include MCI Worldcom (which incorporates UUnet, ANSnet and MCI); Sprint, AGIS, Cable & Wireless plc, and vBNS (operated by MCI).
NISP: National Internet Service Provider - such as Netcom, provides POPs in many cities, and routes directly into a NAP. Thus it plays the roles of both ISP and RNP.
NAP: Network Access Points. Figure 7.5 correctly shows that the key role of NAP is to provide crossover points between ISPs and the NAP. They don't show how signals get from a local ISP such as Magicnet, into the NSPs. This is typically via a RNP.
There are multiple ways to get from one NAP to the other; the NAP assigns routes depending on traffic volume.
Internet Technology.
We'll talk about Firewalls and Encryption when we do Security, in a later lesson.
Internet Addresses. Actually they're IP addresses. IP addresses are associated with interface cards, not the host computer. But we often speak as though the computer owns the address.
IP addresses are currently expressed as four bytes, either in hex or dotted-decimal, such as 134.24.8.66.
Originally, 134 was the network address and the lower 3 bytes identified a computer. This only worked when there were <=255 networks. (FF means "broadcast.")
Now, the high order bits identify an address class, as
follows:
| Class | High Order Bits | Bytes for Net-ID |
| A | 0 - - - - | 1 |
| B | 1 0 - - - | 2 |
| C | 1 1 0 - - | 3 |
| D | 1 1 1 0 - | (multicasting) |
| E | 1 1 1 1 0 | (reserved for future) |
Class A:
| 1 bit | 7 bits | 24 bits |
| 0 | Network ID | Host ID |
Class B:
| 2 bits | 14 bits | 16 bits |
| 10 | Network ID | Host ID |
Class B addresses can connect up to 16,384 networks, each of which could have 65,536 hosts. Class B can be recognized by a first octet between 128 and 191 (=128 + 64 - 1). Class B is reserved for networks expected to have at least 256 host computers. Class B addresses are almost impossible to get now.
Class C:
| 3 bits | 21 bits | 8 bits |
| 110 | Network ID | Host ID |
So, the Internet can support about 2 million individual Class C networks,
but each will have less than 256 hosts. Class C addresses' first octet
falls between 192 and 223. Most entities that really should be class B
are now being assigned multiple Class C addresses.
Class D and Class E.
E is reserved for future expansion. But it won't happen now that we're going to IPv6.
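The class rules above boil down to looking at the first octet. Here's a minimal sketch of that test, plus the arithmetic behind the "16,384 networks" and "about 2 million networks" figures. The function name and the table of bit counts are just for illustration; they're not from the text.

```python
def address_class(address: str) -> str:
    """Return the classful address class (A-E) of a dotted-decimal IPv4 address."""
    first_octet = int(address.split(".")[0])
    if first_octet < 128:   # high order bit is 0
        return "A"
    if first_octet < 192:   # high order bits are 10
        return "B"
    if first_octet < 224:   # high order bits are 110
        return "C"
    if first_octet < 240:   # high order bits are 1110
        return "D"
    return "E"              # whatever is left (240 and up)


# (net-ID bits after the class prefix, host-ID bits) for the unicast classes.
CLASS_LAYOUT = {"A": (7, 24), "B": (14, 16), "C": (21, 8)}

for cls, (net_bits, host_bits) in CLASS_LAYOUT.items():
    print(cls, 2 ** net_bits, "networks,", 2 ** host_bits, "host values each")

print(address_class("134.24.8.66"))  # B
```

Note that a couple of host values in every network (all zeroes, all ones) are not usable for ordinary hosts, so the real host counts are slightly lower than the powers of two printed here.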
Reflections on Addresses. Two million networks: enough?
Well, they would be small ones. The more likely bottleneck will occur at Class B, because any decent sized business needs a Class B, and it's not at all unlikely that there are more than 16,000 businesses or equivalent governmental entities that will want a network in the next few years.
Large providers will need Class A, and there are only a few of these; so I conjecture that a stop-gap will be that large providers will get several class B's or a whole bunch of C's issued to them.
Internet Network Information Center (InterNIC) assigns unique network ID numbers. Local administrators assign host ID numbers. Often they effectively 'recurse' on the Internet strategy by creating sub-networks.
Multicasting. Regular transmission is "unicasting". Multicasting
refers to a host group. The Internet Group Management Protocol (IGMP)
is used.
Example multicast addresses:
224.0.1.1 - Network Time Protocol
Internet Assigned Number Authority (IANA) assigns some multicast addresses as well-known addresses, designating a permanent host group.
Address Protocols: Address Resolution Protocol and RARP.
ARP translates 32 bit Internet addresses into 48 bit Ethernet (or other "link layer") addresses; RARP does the reverse. ARP uses the link layer's broadcast capability to query the net and identify who's currently linked up. It asks them what they think their IP address is, and caches the information for later use. When an IP datagram arrives, the router translates its address into a link layer address and sends it on along.
Domain Name Servers
Domain names like www.cs.ucf.edu can be mapped to numerical IP addresses somewhat like human names can be mapped to telephone numbers. Every ISP has a DNS and a backup DNS. When it doesn't know the answer to a question, it must have an upstream DNS to ask the question of. Top level domain names are .edu, .com, etc. A full domain name cannot exceed 255 characters, and each dot-separated label within it cannot exceed 63. The Network Information Center, "InterNIC", maintains for each Top Level Domain a directory of all the Domain Names it has issued. (Certain domains like .mil are handled elsewhere.)
Consider www.creat.cas.ucf.edu. If someone out in the world were trying to find our IP address, they'd ask InterNIC. There, ucf.edu would get stripped off and our local DNS' address given. The user would then ask our DNS and it would be given the full IP address to find www.creat.cas within ucf.edu.
Configuring a Host for the Internet
You gotta provide:
Host's IP address
IP address of the router or gateway of the subnet to which the host
is attached. This is where it's gonna send all outbound traffic if ARP
can't get a local Ethernet address which corresponds to the IP address
you asked about.
IP address of the domain name server and usually a spare, too
Subnet mask, so the host can tell which part of an address identifies the (sub)network and which part identifies the host on it.
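To see how these settings work together, here's a rough sketch of the decision a host makes for each outbound packet: is the destination on my own subnet, or do I hand it to the gateway? The addresses and mask below are made up for illustration, and a real IP stack does this inside the operating system, not in Python.

```python
import ipaddress

# Hypothetical configuration values, for illustration only.
HOST = ipaddress.ip_interface("134.24.8.66/255.255.255.0")
GATEWAY = ipaddress.ip_address("134.24.8.1")


def next_hop(destination: str) -> ipaddress.IPv4Address:
    """Return the IP address whose link layer address we need to look up."""
    dest = ipaddress.ip_address(destination)
    if dest in HOST.network:
        return dest      # same subnet: deliver straight to the destination
    return GATEWAY       # somewhere else: let the gateway deal with it


print(HOST.network)              # the subnet implied by address + mask
print(next_hop("134.24.8.99"))   # a neighbor on the same subnet
print(next_hop("192.0.2.7"))     # off-subnet, so the gateway
```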
Data enroute from the application layer to the transport layer is called an application message. It is probably embodied as a parameter value or an array of data being pointed to in a procedure-call.
When it's flowing from TCP into the network layer it's called a TCP segment (because TCP is a byte-stream delivery service) or sometimes a transport message. If using UDP, we'd encounter a UDP datagram.
When it comes out of IP headed for the hardware, it's an IP Packet. We discussed IP packets back in Lecture 11.
To get information to a destination, a given router can either perform Direct Delivery (if it knows that the host is in an immediately connected network) or Indirect Delivery (by consulting its routing table to find out which adjacent router is best able to pass the information onward.)
Each entry in a routing table contains three fields: Network destination, Gateway and Flags. The Network field is the "input" - this is the field which must match the destination address. The Gateway and Flags fields are the "output" - the answer to the question "what do I do next?"
Direct Delivery. If the Flags field indicates a direct connection, then the network will translate the destination IP address into a link layer address (e. g. Ethernet), using the Address Resolution Protocol; encapsulate the data into a data frame and transmit it directly to its destination.
At the end of every journey is a direct delivery. The mailman puts the letter in your box.
Indirect Delivery. If an indirect connection was indicated by the routing table, the link layer translation still has to occur but the destination will be the next direct connection along the pathway - i. e. the address of the next gateway or router in the chain.
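The routing table lookup just described can be sketched like this. The three table entries are invented, the `direct` field stands in for the Flags field, and picking the most specific matching network (longest prefix) is an assumption of this sketch, not something the text spells out.

```python
import ipaddress
from typing import NamedTuple, Optional


class Route(NamedTuple):
    network: ipaddress.IPv4Network            # the "input" field to match
    gateway: Optional[ipaddress.IPv4Address]  # where to send it next, if indirect
    direct: bool                              # stands in for the Flags field


# A made-up routing table: one directly attached net, one remote net, one default.
ROUTES = [
    Route(ipaddress.ip_network("134.24.8.0/24"), None, True),
    Route(ipaddress.ip_network("134.24.0.0/16"), ipaddress.ip_address("134.24.8.1"), False),
    Route(ipaddress.ip_network("0.0.0.0/0"), ipaddress.ip_address("134.24.8.254"), False),
]


def link_layer_target(destination: str) -> ipaddress.IPv4Address:
    """Return the IP address that ARP must translate for this datagram."""
    dest = ipaddress.ip_address(destination)
    matches = [route for route in ROUTES if dest in route.network]
    best = max(matches, key=lambda route: route.network.prefixlen)
    # Direct delivery: resolve the destination itself.
    # Indirect delivery: resolve the next gateway in the chain.
    return dest if best.direct else best.gateway


print(link_layer_target("134.24.8.9"))    # direct delivery
print(link_layer_target("134.24.77.1"))   # indirect, via 134.24.8.1
print(link_layer_target("8.8.8.8"))       # indirect, via the default route
```

Either way the IP destination address in the datagram itself never changes; only the link layer address on the frame differs.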
Query 17.1: Internet addresses are 32 bits long. Does this mean that the Internet can have essentially 2**32 computers on it before we run out of usable address space? If not, why not?
IP delivers data between host computers; TCP & UDP deliver data between applications.
IP is, metaphorically, a mail truck and UDP is a postal worker. IP gets the mail into the right post office; then UDP sorts it out to ports. Applications have to check their boxes to see if any mail arrived.
TCP, being connection-oriented, is more like telephone communication. Each application protocol is assigned its own port for incoming traffic. If this is a well-known port (analogous to a company's main telephone number) the app can switch out to a spare port to free up the WKP for other traffic.
When your application sends a message, its port number is automatically attached so that the other party can reply. So you don't need to know your own port number; but you do need to notice the "caller ID" of the one who called you.
You can request a fixed port number to be used by a server app you're building, so that others will know where to call you.
| Protocol | Port Number |
| Echo Protocol | 7 |
| Daytime Protocol | 13 |
| File Transfer Protocol | 21 |
| Telnet Protocol | 23 |
| Simple Mail Transfer Protocol | 25 |
| Time Protocol | 37 |
| Whois Protocol | 43 |
| Trivial File Transfer Protocol | 69 |
| Finger Protocol | 79 |
Query 17.2. Explain the concept of a Protocol Port.
| bit 0 ..... to bit 15 | bit 16 to bit 31 |
| UDP Source Port | UDP Destination Port |
| UDP Message Length | UDP Checksum |
| UDP data area begins here ....... |
The length field describes the entire datagram including the header. The checksum (which is optional) includes the header and the data.
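The four header fields above are each 16 bits, so the whole UDP header is 8 bytes. Here's a small sketch that packs and unpacks one with Python's struct module. The helper names are made up, and the checksum is simply left at zero (meaning "not computed") rather than calculated.

```python
import struct

# Source port, destination port, message length, checksum: four 16-bit
# fields in network (big-endian) byte order.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    length = UDP_HEADER.size + len(payload)  # length covers header AND data
    checksum = 0                             # 0 = sender did not compute one
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_datagram(datagram: bytes):
    src, dst, length, checksum = UDP_HEADER.unpack_from(datagram)
    data = datagram[UDP_HEADER.size:length]
    return src, dst, length, checksum, data


datagram = build_udp_datagram(5000, 13, b"hi")
print(len(datagram))                 # 8 header bytes + 2 data bytes
print(parse_udp_datagram(datagram))
```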
Acknowledgements are used by TCP. Each time the transmitting end of a connection sends a message, it starts a timer. If the timer expires before ACK comes, the message is automatically re-transmitted. But it would be inefficient to transmit one message, wait for ACK, then transmit another. Instead, TCP uses a sliding window - i. e. a range of message numbers. It might be working on messages 2,3,4,5,6,7,8,9 of a group, all at once. By the time message 9 is transmitted, it's time for ACK 2 to come back.
The actual width of the window is adapted ("negotiated") to net conditions along the particular virtual circuit being used. High congestion leads to smaller windows (because less speed is possible, old messages get stale when only a few have been transmitted.) Actual windows are several thousand bytes wide.
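To get a feel for why the window matters, here's a toy simulation. It is only a sketch under simplifying assumptions: one message can be sent per clock tick, every ACK comes back exactly `round_trip` ticks after its message went out, and nothing is ever lost (so no retransmission timers). Real TCP counts bytes rather than messages.

```python
def ticks_to_send(message_count: int, window: int, round_trip: int) -> int:
    """Return the tick at which the last ACK arrives."""
    ack_due = []    # arrival tick of the ACK for each unacknowledged message
    next_msg = 0
    acked = 0
    tick = 0
    while True:
        # Collect any ACKs that have arrived by now.
        acked += sum(1 for due in ack_due if due <= tick)
        ack_due = [due for due in ack_due if due > tick]
        if acked == message_count:
            return tick
        # Send one more message if the window still has room.
        if next_msg < message_count and len(ack_due) < window:
            ack_due.append(tick + round_trip)
            next_msg += 1
        tick += 1


print(ticks_to_send(8, window=1, round_trip=8))  # send one, wait, send one ...
print(ticks_to_send(8, window=8, round_trip=8))  # all eight in flight at once
```

With a window of one message, each message costs a full round trip. With a window of eight, the ACK for the first message is arriving just as the last one has gone out, which is the situation described above.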
TCP Segments (messages) have a complex header structure.
Use this diagram to visualize the TCP segment's structure.
| bit 0 ..... to bit 15 | bit 16 to bit 31 |
| 16-bit Source Port | 16-bit Destination Port |
| 32 bit Sequence Number | ----- continues ----- |
| 32 bit Acknowledgement Nr. | ----- continues ----- |
| Header Length (4), Reserved (6), Flags (6) | 16-bit Window Size |
| 16-bit TCP Checksum | 16-bit Urgent Pointer |
| Options if any, and padding. | ----- continues ----- |
| Optional Data begins here | ----- continues ----- |
The flag bits are named URG (urgent), ACK (acknowledgement), PSH (please send data now), RST (reset), SYN (synchronize), FIN (finished).
The most obvious thing not itemized here is the LENGTH field. How much data? To be seen later ...
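As a way of visualizing the layout, here's a sketch that unpacks the fixed 20-byte part of a TCP header with the struct module. The function name and the sample port numbers are made up, and options are not decoded. Notice that the header length field counts 32-bit words, and that the data is simply whatever follows the header in the segment.

```python
import struct

# Ports (2 x 16 bits), sequence and ack numbers (2 x 32 bits), then
# header-length/flags, window, checksum, urgent pointer (4 x 16 bits).
TCP_FIXED_HEADER = struct.Struct("!HHIIHHHH")

# Flag names in order from the lowest bit upward.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(segment: bytes) -> dict:
    (src, dst, seq, ack, length_and_flags,
     window, checksum, urgent) = TCP_FIXED_HEADER.unpack_from(segment)
    header_len = (length_and_flags >> 12) * 4  # top 4 bits, in 32-bit words
    flags = [name for bit, name in enumerate(FLAG_NAMES)
             if length_and_flags & (1 << bit)]
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags, "window": window,
        "checksum": checksum, "urgent": urgent,
        "data": segment[header_len:],
    }


# A hand-built SYN segment: 5-word header, SYN bit (0x02) set, no data.
syn = TCP_FIXED_HEADER.pack(1025, 23, 345677891, 0, (5 << 12) | 0x02, 4096, 0, 0)
print(parse_tcp_header(syn))
```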
To start a TCP connection, your program (A) sends a request for TCP connection to your host computer's transport layer - which sends a TCP message with the SYN flag set. This message includes a 32 bit sequence number; receiving TCP module (B) stores this for future use. Then B sends back to A, a segment with ACK flag set and an acknowledgement number. A, in turn, stores this one for later use.
Analogy: A says to B: "I'm gonna send you a bunch of Fed Ex packages with sequentially numbered airbills. The first airbill number will be 345 677 891." B replies: "OK, and I'll send you receipts. Here's the first one. In its Ack Nr. field, we put 345 677 892." These "acknowledgement numbers" tell the sender what is the NEXT Fed Ex airbill number (seq nr) it expects to receive.
B sets the ACK flag to mark the Ack number as the important part of this message. When doing this, B sets the SYNchronization flag also. This flag means "we're starting a new relationship here; if you're gonna send me data, I probably will want to send you some, too."
Acknowledging the Acknowledgement. Now A has requested and received synchronization from B, but A still must say "OK, we're in business" - i. e. officially establishing the connection.
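So, A sends back a segment with its Ack number field set to the Seq. Nr. from B, plus one - that is, 1002 if B's initial Seq. Nr. was 1001. No SYN flags are set this time. Summary:
A to B: SYN, Seq. Nr. = 345 677 891
B to A: SYN + ACK, Seq. Nr. = 1001, Ack Nr. = 345 677 892
A to B: ACK, Ack Nr. = 1002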
Sequence Numbers begin at an arbitrary value (to increase their usefulness as identifiers, compared to 1,2,3.) They then increment by the number of bytes in the data; so that they look like "bookmarks" reporting where, in the total data stream, the first byte of this segment's data is located. (But that's not seq. nr=0 or 1; it's whatever the initial value was.)
With 32 bit numbers, we can exchange around 4 gigabytes before we wrap around, and even then it's no problem. Confusion would result only if 4 GB of data were in the pipe at the same time, and that's unlikely.
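The bookkeeping for the handshake and for sequence numbers can be sketched in a few lines. This is just arithmetic on the numbers, not a working TCP: A's initial sequence number is the one from the analogy, B's initial value of 1001 is an assumption (it's what makes the final Ack number come out to 1002), and the dictionaries are a made-up way to show each segment.

```python
MODULUS = 2 ** 32  # sequence numbers are 32 bits wide and wrap around


def next_seq(seq: int, byte_count: int) -> int:
    """Advance a sequence number by the number of bytes sent."""
    return (seq + byte_count) % MODULUS


def handshake(a_initial: int, b_initial: int) -> list:
    # The SYN itself uses up one sequence number, hence the "plus one".
    syn = {"from": "A", "flags": {"SYN"}, "seq": a_initial}
    syn_ack = {"from": "B", "flags": {"SYN", "ACK"},
               "seq": b_initial, "ack": next_seq(a_initial, 1)}
    ack = {"from": "A", "flags": {"ACK"},
           "seq": next_seq(a_initial, 1), "ack": next_seq(b_initial, 1)}
    return [syn, syn_ack, ack]


for segment in handshake(345677891, 1001):
    print(segment)

# After sending 500 bytes of data, A's next segment starts 500 further along.
print(next_seq(345677892, 500))
# Near the top of the range, the numbers simply wrap around to small values.
print(next_seq(2 ** 32 - 10, 20))
```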
Closing a Connection: Half-Close and Full-Close. You can close one direction of a connection (A no longer sends to B) but B may still send to A. The FINished flag from A to B just means that A is finished sending. Acknowledging that message only means that B "understands" the half close.
The first end to initiate a close is doing an active close. If the plan is for the other end to then close too, it is said to be doing a passive close.
URGent flag: process this data before any other data.
PSH means PUSH: this tells the TCP module to immediately send this data to the destination application, rather than buffering it until some buffer size is reached. This is used by Telnet for instance, to assure character by-character echo on the screen.
RST means RESET: Sent when TCP detects a problem with a connection. Most applications will terminate when they get RST, but you could design a recovery scheme built around this flag.
FINish means that the sender is done sending. This only
closes data flow in one direction.
TCP Checksum is, obviously, appended by TCP sender and checked by the receiver. UDP does not require a checksum but TCP does.
Urgent Pointer is a controversial critter; supposedly it specifies an address in the TCP data area. However, since it is unclear how to use this pointer you should only use it if you are in complete control of both ends of the communication.
Options is inadequately explained in the Jamsa text. We won't study it now.
Fragmentation as seen by TCP. TCP doesn't see fragmentation. Simple, eh? IP reassembles a "very large packet" as though it had never been fragmented, for presentation to TCP. Thus, the data length field in the IP packet reports the information needed by TCP and its application user. So segments are effectively limited to 2**16 bytes in length.
Fragmentation as handled by IP. To perform this magic, IP uses three fields: Identification (16 bits), Flags (3 bits) and Fragment Offset (13 bits). The three flags are "Do not Fragment", "x" and "More Fragments".
Assume that a datagram is 4096 bytes long but the MTU (Maximum Transmission Unit) is 1024 bytes. IP will break the data portion of the packet into smaller pieces called fragments, in multiples of 8 bytes. The first packet is transmitted with its MORE FRAGMENTS flag set. The receiver starts a reassembly timer. If this timer expires before the whole mess arrives, the receiver discards all the pieces and does not process the datagram.
As the receiver reassembles fragments, IP uses the Source Address and Identification fields to put 'em together. When the host receives a fragment with the More Fragments flag turned off, it can calculate the length of the original datagram.
Each fragment is using its Fragment Offset field to describe its starting point measured from the beginning of the original packet. Thus, like rebuilding a jigsaw puzzle, the receiver can tell when it has received all the fragments.
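Here's a sketch of the jigsaw puzzle, using the 4096-byte datagram and 1024-byte MTU from the example. It assumes a plain 20-byte IP header with no options (so 4076 bytes of data), represents each fragment as a dictionary instead of a real packet, and leaves out the Identification field and the reassembly timer.

```python
IP_HEADER_LEN = 20  # assumes no IP options


def fragment(payload: bytes, mtu: int) -> list:
    """Split a datagram's data into fragments that fit the MTU."""
    # Every fragment except the last must carry a multiple of 8 data bytes,
    # because the Fragment Offset field counts in units of 8 bytes.
    max_data = (mtu - IP_HEADER_LEN) // 8 * 8
    fragments = []
    for start in range(0, len(payload), max_data):
        chunk = payload[start:start + max_data]
        fragments.append({
            "offset": start // 8,
            "more_fragments": start + len(chunk) < len(payload),
            "data": chunk,
        })
    return fragments


def reassemble(fragments: list):
    """Return the original data, or None if a piece is still missing."""
    ordered = sorted(fragments, key=lambda f: f["offset"])
    if not ordered or ordered[-1]["more_fragments"]:
        return None                   # the final fragment hasn't shown up yet
    expected = 0
    for frag in ordered:
        if frag["offset"] * 8 != expected:
            return None               # a hole in the middle
        expected += len(frag["data"])
    return b"".join(frag["data"] for frag in ordered)


data = bytes(i % 251 for i in range(4096 - IP_HEADER_LEN))
pieces = fragment(data, mtu=1024)
for piece in pieces:
    print(piece["offset"], piece["more_fragments"], len(piece["data"]))

# Fragments can arrive in any order and still be put back together.
print(reassemble(list(reversed(pieces))) == data)
```

Only the last fragment has More Fragments turned off, and its offset plus its own length gives the length of the original data, which is how the receiver knows when the puzzle is complete.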
Query 17.3. Is the entire three-way handshake really necessary if you are only going to transmit data in one direction? (It may be inherent to TCP; we're just talking conceptually here.) Why or why not?
Query 17.4. What do you expect a "Push" to actually be doing in software? That is, how can a service routine such as TCP (inside a tool such as WinSock) actually get an application program to do something like pick up incoming data?
Parity is a one-bit checksum. Modems may or may not use parity. Add up all the bits in a word (including one "parity bit"); if the total is even, the word had even parity. Equivalently: if there were an even number of 1's, the word had even parity. We set the parity bit to make this true, then check it on receipt.
Odd parity (actually more often used) sets the parity bit so that the total number of on-bits in the word is odd. The advantage of this is that all zeroes (for even parity) may in some systems not even be visible; odd parity puts at least one 1-bit ("set") in each word.
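Both flavors of parity fit in a few lines. This is a sketch with made-up function names; in practice the serial hardware computes and checks the parity bit for you.

```python
def parity_bit(data_bits: int, odd: bool = False) -> int:
    """Return the parity bit to send along with the given data bits."""
    ones = bin(data_bits).count("1")
    even_bit = ones % 2               # makes the total number of 1s even
    return even_bit ^ 1 if odd else even_bit


def parity_ok(data_bits: int, received_parity: int, odd: bool = False) -> bool:
    """Check a received word against its parity bit."""
    total = bin(data_bits).count("1") + received_parity
    return total % 2 == (1 if odd else 0)


word = 0b1011001                      # four 1-bits
print(parity_bit(word))               # even parity: bit stays 0
print(parity_bit(word, odd=True))     # odd parity: bit is 1
print(parity_bit(0, odd=True))        # all zeroes still gets a 1-bit
print(parity_ok(0b1011000, 0))        # one bit flipped in transit: caught
```

A one-bit checksum only catches an odd number of flipped bits; two flipped bits cancel each other out and slip through.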
Start and Stop bits. Asynchronous transmissions mean that every character can come at any time; so a "start bit" was originally used in Teletype machines to get the equipment moving. The idle line rests at 1 ("mark"), and the first 0 ("space") means "here goes". Thus the start bit contains no data except the fact of startup.
Stop bit=1 is less necessary, if the system counted the bits of a byte as they came in; but they are used, since they put the line back in its resting state so the next start bit can be seen. Sometimes they even use two of 'em for ancient and obscure reasons.
8-N-1 is a code meaning "8 data bits, no parity bit, and 1 stop bit". With the start bit, this requires ten bits per byte of transmitted data.
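The framing overhead translates directly into characters per second. A quick sketch of the arithmetic follows (the function names are just for illustration, and it treats the modem speed as a plain bits-per-second figure with no compression):

```python
def bits_per_character(data_bits: int = 8, parity_bits: int = 0,
                       stop_bits: int = 1, start_bits: int = 1) -> int:
    """Total bits on the line for each character sent."""
    return start_bits + data_bits + parity_bits + stop_bits


def characters_per_second(line_bps: int, **framing) -> float:
    return line_bps / bits_per_character(**framing)


print(bits_per_character())                             # 8-N-1
print(bits_per_character(parity_bits=1, stop_bits=2))   # 8 data, parity, 2 stop
print(characters_per_second(2400))
print(characters_per_second(14400))
```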
Baud Rates. Since it takes 10 bits (or 11 or 12) to get 8 useful data bits to the other end, the useful data rate is at most 80% of the raw line rate. Modems with advanced protocols play cunning tricks like providing faster service in one direction than the other, by trying to guess which end is occupied by a human. (Humans need to receive much more than they usually send.)
There are two "virtual wires" in a modem, each consisting of a pair of frequencies for 0 and 1. Modems often "steal" one channel (pair of frequencies) from its intended direction to bolster the bandwidth in the opposite direction. Obviously this requires a negotiation between the modems. No wonder the clatter and whir of modems negotiating, sounds complex. It is.
Using a 2400 baud modem you can only support remote terminal applications. At 14.4k, SLIP and PPP become practical. We'll see why in a moment.
SLIP uses two characters as markers: END (ASCII 192, hex C0) and ESC (ASCII 219, hex DB). END is used to end each packet (and also to begin them as we will see.) But how can you deal with the possibility that an END might occur in the natural run of your data?
SLIP's transmitter looks for END characters in the data and translates each one into a two character sequence - ESC followed by a substitute character, ESC_END (220, hex DC). Its receiver in turn looks for ESC ESC_END and translates it back into 192, then passes the data onward. If it sees a naked END, that's the end of a packet.
Now of course ESC has become "special" and cannot simply be transmitted either. SLIP's transmitter replaces an ESC in the data with ESC followed by ESC_ESC (221, hex DD); its receiver performs the opposite translation.
Most SLIP implementations start a packet with END, which "flushes" any previous noise. In effect it declares any previous noise to be an IP packet, which the receiving system immediately rejects because it is meaningless. If there is no noise, the receiving SLIP will see two back-to-back END characters and know to start a new packet.
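The whole SLIP framing scheme fits in a couple of short functions. This sketch follows the published SLIP description (RFC 1055), in which an END or ESC inside the data is sent as ESC plus a substitute byte (hex DC or DD) so that a bare END never appears mid-packet. The function names are made up, and a real driver would work on a byte stream from the serial port rather than a complete buffer.

```python
END, ESC = 0xC0, 0xDB          # 192 and 219
ESC_END, ESC_ESC = 0xDC, 0xDD  # substitutes sent after ESC


def slip_encode(packet: bytes) -> bytes:
    out = bytearray([END])              # leading END flushes any line noise
    for byte in packet:
        if byte == END:
            out += bytes([ESC, ESC_END])
        elif byte == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(byte)
    out.append(END)                     # a naked END closes the packet
    return bytes(out)


def slip_decode(stream: bytes) -> list:
    packets, current, escaped = [], bytearray(), False
    for byte in stream:
        if escaped:
            # Undo the substitution made by the transmitter.
            current.append({ESC_END: END, ESC_ESC: ESC}.get(byte, byte))
            escaped = False
        elif byte == ESC:
            escaped = True
        elif byte == END:
            if current:                 # back-to-back ENDs give an empty
                packets.append(bytes(current))  # packet, which we drop
                current = bytearray()
        else:
            current.append(byte)
    return packets


first = bytes([0x01, END, ESC, 0x02])
second = b"hello"
wire = slip_encode(first) + slip_encode(second)
print(wire.hex(" "))
print(slip_decode(wire) == [first, second])
```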
Error Detection is up to TCP/IP. SLIP just delivers the packets. Since UDP doesn't do error correction, don't use it with SLIP.
SLIP Deficiencies. Your computer will have an IP address which (if you're dialing in) is usually dynamically assigned whenever you log in (i. e. different each time.) You could purchase a dedicated line and IP address for more money. Unfortunately there is no way for your application to ask SLIP what your IP address is. Some "kluges" (non-standard tricks) are occasionally used to get around this problem but they are not uniform between implementations.
SLIP packets contain no typing information, and so cannot be mixed with any other type packets over the same serial line. Also, SLIP incorporates no data compression. Compressed Slip (CSLIP) improves on this.
CSLIP compresses TCP/IP header information only. This is more useful than it sounds, particularly for interactive operations such as Telnet. With bulk transfers such as FTP, large packets are the norm and header overhead costs are relatively small. But interactive keystroke-by-stroke interactions generate a packet per keystroke - which would be 40 bytes of TCP/IP overhead per one byte of keystroke information!
Line Efficiency is the ratio of data to header-plus-data in a TCP/IP datagram. Thus for one-keystroke transmission, LE=1/41 = .0244.
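The arithmetic is easy to play with. A minimal sketch (the function name is made up, and the 1024-byte and 3-byte figures are just sample inputs):

```python
TCP_IP_HEADER = 40  # bytes of TCP/IP header on every packet, uncompressed


def line_efficiency(data_bytes: int, header_bytes: int = TCP_IP_HEADER) -> float:
    """Ratio of data to header-plus-data."""
    return data_bytes / (header_bytes + data_bytes)


print(round(line_efficiency(1), 4))                  # one keystroke per packet
print(round(line_efficiency(1024), 4))               # a bulk-transfer sized packet
print(round(line_efficiency(1, header_bytes=3), 4))  # same keystroke, tiny header
```

The bulk transfer barely notices the header, while the single keystroke is almost all overhead; shrinking the header is the only way to help the keystroke case.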
Interactive Response refers to how quickly something happens when the user presses a key. People really need interactive response within 100 milliseconds or their personal feedback loops don't work well.
CSLIP header compression. About half the TCP/IP header info remains constant over the life of a connection. CSLIP saves a copy of the last header, and substitutes a small connection identifier which means "pick up header #34 and make these changes to it:", followed by the changes. This key technique is called differential coding. It accounts for most of the efficiency of CSLIP.
CSLIP then uses a variety of ad-hoc tricks to nibble away the changed information in the TCP/IP header to an average of 3 bytes per packet. For instance, CSLIP depends on the link level framing protocol (i. e. SLIP, or (later) PPP) to tell the receiver the length of the received message, and thus can eliminate the Total Length field in the IP header.
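The flavor of differential coding can be shown with a toy example. To be clear about what's assumed here: the header is reduced to a handful of invented fields in a dictionary, and the "compressed" form is just the fields that changed. The real CSLIP encoding is far more compact than this; the sketch only shows the idea of sending changes against a saved copy.

```python
def delta_encode(previous: dict, current: dict) -> dict:
    """Keep only the header fields that differ from the saved header."""
    return {field: value for field, value in current.items()
            if previous.get(field) != value}


def delta_decode(previous: dict, changes: dict) -> dict:
    """Rebuild the full header from the saved copy plus the changes."""
    restored = dict(previous)
    restored.update(changes)
    return restored


# Two consecutive (hypothetical) headers on one Telnet connection.
saved = {"src": "134.24.8.66", "dst": "192.0.2.7", "src_port": 1025,
         "dst_port": 23, "seq": 345677892, "ack": 1002, "window": 4096}
latest = dict(saved, seq=345677893)  # one keystroke later

changes = delta_encode(saved, latest)
print(changes)                       # only the sequence number changed
print(delta_decode(saved, changes) == latest)
```

Both ends have to keep the same saved copy, which is why a lost or damaged packet forces the sender to fall back to a full header.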
PPP (Point-to-Point Protocol) has three main components:
* an encapsulation method that lets the network software use a single serial link for multiple protocols;
* a Link Control Protocol (LCP) which the two ends of a PPP connection use to negotiate the connection; and
* a family of Network Control Protocols (NCPs) that let PPP connections use different network-layer protocols.
PPP will use CSLIP's compression strategy, while wrapping it in
a more robust encapsulation system. To wit, a PPP frame looks like this:
| starter flag byte | 7E |
| Address byte | FF |
| Control byte | 03 |
| PPP Data: up to 1500 bytes | up to 1500 bytes |
| CRC: two bytes | two bytes |
| end flag byte | 7E |
Within the data, hex 7D serves as an escape character to mark data of value hex 7E or 7D. Also, the sixth bit of any escaped data is toggled. This has the objective of making it possible to escape ASCII chars < hex 20 (e. g. 1B) yielding 3B, since "non printing" chars <20 are often used by modems as escape chars of their own.
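PPP's escaping can be sketched the same way as SLIP's. In this sketch the function names are made up, the set of extra control characters to escape is passed in by hand (in real PPP the two ends negotiate it), and the flag bytes and CRC around the data are left out.

```python
FLAG, ESCAPE, TOGGLE = 0x7E, 0x7D, 0x20  # TOGGLE is the "sixth bit"


def ppp_escape(data: bytes, also_escape=frozenset()) -> bytes:
    out = bytearray()
    for byte in data:
        if byte in (FLAG, ESCAPE) or byte in also_escape:
            out += bytes([ESCAPE, byte ^ TOGGLE])  # escape, then toggled byte
        else:
            out.append(byte)
    return bytes(out)


def ppp_unescape(data: bytes) -> bytes:
    out = bytearray()
    toggle_next = False
    for byte in data:
        if toggle_next:
            out.append(byte ^ TOGGLE)              # toggle the bit back
            toggle_next = False
        elif byte == ESCAPE:
            toggle_next = True
        else:
            out.append(byte)
    return bytes(out)


raw = bytes([0x7E, 0x41, 0x7D])
print(ppp_escape(raw).hex(" "))
print(ppp_escape(b"\x1b", also_escape={0x1B}).hex(" "))  # 1B goes out as 7D 3B
print(ppp_unescape(ppp_escape(raw)) == raw)
```

Unlike SLIP, there are no special substitute characters to remember: the same toggle works for any byte that needs hiding.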
The PPP data consists of a 2 byte protocol header and any number
of bytes of enclosed information. The protocol header has three possible
values, with these basic meanings:
| hex 0021 | IP datagram follows |
| hex C021 | Link control data follows |
| hex 8021 | Network control data follows |
Frequently the Link Control Protocol is used to negotiate a Protocol field of one rather than two bytes. They can also negotiate the elimination of the flag, address and control fields. The author says this boils down to an overhead of 3 bytes (one protocol, two CRC) by trusting the IP packet's length information.
As stated above, PPP link control negotiates the use of CSLIP compression on the contents. Other alternatives can be negotiated as technology improves.
Query 17.5. SLIP can transmit any 8 bit character including its own End Flag character hex C0, inside its data packet (IP frame.) Explain how this can be done without confusing plain old data for the end of the SLIP frame.
Query 17.6. Explain the concept of line efficiency and why it is particularly critical for interactive applications such as Telnet.
Query 17.7. Explain how CSLIP's compression technique manages to reduce the size of IP packet headers so drastically. (You don't have to provide all the gory details; the flavor is what we want.)
Query 17.8. Summarize the differences between SLIP and PPP.