(Last Update: March 1st, 2002)
I. Background
The Windows IIS vulnerability exploited by Code Red was discovered on June 18, 2001. The first version of the Code Red worm
emerged on July 12th, 2001, and the truly virulent strain began to spread on July 17th. This new strain
has a correct random number generator. It spawns 100 threads; each thread randomly picks one IP address and tries
to set up a connection to port 80 of that address. If the connection succeeds, the worm sends a copy of itself to
the victim web server and then continues looking for other web servers. If the victim is not a web server or the
connection cannot be set up, the thread randomly generates another IP address to probe.
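For the modeling and simulation discussed later, this behavior can be abstracted as uniform random scanning over the 32-bit IPv4 address space. The following minimal sketch (ours, not the worm's code) illustrates one scan attempt by one thread; the size of the vulnerable-host set is a made-up placeholder:

```python
import random

ADDRESS_SPACE = 2 ** 32                    # all IPv4 addresses
# Hypothetical set of vulnerable web servers scattered over the address space.
vulnerable = set(random.sample(range(ADDRESS_SPACE), 500_000))
infected = {next(iter(vulnerable))}        # start from a single infected host

def thread_scan_step(rng: random.Random) -> None:
    """One scan attempt by one worm thread: pick a uniformly random address;
    if it is a vulnerable web server, it becomes infected; otherwise the
    connection attempt fails or times out and the thread simply moves on.
    A real infected host runs 100 such threads concurrently."""
    target = rng.randrange(ADDRESS_SPACE)
    if target in vulnerable:
        infected.add(target)
```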
The Code Red worm kept spreading on July 19th until 0:00 GMT July 20th (8:00 pm EDT July 19th), after which it
stopped spreading [4]. Dr. Stuart Staniford of Silicon Defense published a theoretical
analysis of this worm's propagation [1]. His differential equation is a typical epidemic
equation, and he claimed that it fitted the observed Code Red spreading quite well. Since then, many
researchers have simply taken this model for granted. During this one-day period the worm spread
continuously, which makes the event valuable for studying worm propagation properties.
II. Questions on Stuart Staniford's report [1]
1. Why use the worm scan number, not the infected host number?
In Dr. Staniford's report, the Ken Eichman data set [14] consists of two types of data: the number of Code
Red worm scans in each hour, and the number of unique source addresses scanning during each hour. Because
each infected host generates 100 scans at the same time (one scan per thread), the worm scan number is much
larger than the number of unique source addresses, which corresponds to the number of infected hosts. Since
we want to model worm propagation and study the dynamics of the infected host population, we should mainly
deal with the infected host data set.
Since Dr. Staniford's differential equation is an epidemic equation for infected hosts, he should also have used
the infected host numbers as his data set. Strangely, however, he used the worm scan data set in his
report and said nothing about the infected host data set.
Another questionable point is that he used only part of the data set that Ken Eichman provided. The
Code Red worm kept spreading until 0:00 GMT July 20th, but Staniford only used data up to 21:00 GMT July
19th. During the three hours from 21:00 to 00:00 GMT the worm scan number kept decreasing, which Staniford's
model cannot explain.
We want to know the dynamics of the infected host population, so we need data on the total number of infected
hosts, which Ken Eichman's data does not provide. We cannot derive this number from his hourly unique source
counts, because many sources overlap between hours and his monitored network address space is only 1/65536
of the whole IP address space.
The data set in report [2] seems more useful to us. The data was collected at two locations, using
two data formats, to maximize coverage of the worm's expansion. The total number of infected hosts was
collected directly, and one /8 network was monitored, covering 1/256 of the whole IP address space.
2. Have most of the vulnerable nodes been infected?
The typical epidemic equation for a finite population is [6]:
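For reference, the classical simple epidemic model for a finite population is usually written as follows, where the notation is our assumption, chosen to match the later discussion: J(t) is the number of infected hosts at time t, N is the total vulnerable population, and β is a constant infection rate:

$$ \frac{dJ(t)}{dt} \;=\; \beta \, J(t)\,\bigl[N - J(t)\bigr] $$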
It is clear that Staniford's model is too simple and does not predict the worm propagation very well. But from
all the reports and references, it seems that people have taken it for granted and nobody has questioned his model.
David Moore from CAIDA gave a detailed analysis of the Code Red worm [2]. He presented some
valuable data collected while the Code Red worm was spreading rampantly on July 19th. The observed infected
host number curve is shown in Fig. 2.
III. A new model of Code Red worm propagation
1. Factors affecting real worm propagation on the current Internet
Real worm propagation on the Internet is a complicated process that involves many arbitrary and unpredictable
factors. In order to model it, we need to capture the few most important factors that clearly affect worm
propagation. Here are some major factors that we can think of:
2. Classical general epidemic model: the Kermack-McKendrick model
The epidemic modeling book [6] presents the Kermack-McKendrick model, which takes into account the
removal of infective hosts. It is assumed that infective hosts are removed from circulation at a rate
proportional to the current number of infective hosts. Using the same notation as [6], let R(t) denote
the number of removed hosts at time t and γ the rate of removal of infective hosts from circulation. Then

$$ \frac{dR(t)}{dt} \;=\; \gamma\, I(t) \qquad\qquad (3) $$

where I(t) is the number of infective hosts at time t.
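To make the model's behavior concrete, the following is a minimal numerical sketch of the full Kermack-McKendrick (SIR) system, of which equation (3) is the removal part; all parameter values are arbitrary assumptions chosen only for illustration:

```python
def kermack_mckendrick(n: int = 360_000, beta: float = 8e-7, gamma: float = 0.05,
                       dt: float = 0.01, steps: int = 200_000):
    """Forward-Euler integration of the Kermack-McKendrick (SIR) model:
        dS/dt = -beta*S*I,   dI/dt = beta*S*I - gamma*I,   dR/dt = gamma*I
    S = susceptible, I = infective, R = removed; S + I + R = n at all times."""
    s, i, r = float(n - 1), 1.0, 0.0
    history = []
    for step in range(steps):
        new_infections = beta * s * i * dt
        removals = gamma * i * dt
        s -= new_infections
        i += new_infections - removals
        r += removals
        history.append((step * dt, s, i, r))
    return history
```

With these toy parameters the infection rises, peaks, and then dies out as hosts are removed; if γ is raised above βN, the outbreak never takes off.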
3. General Internet worm model
We can extend the Kermack-McKendrick model by considering the following two specific processes that affect Internet
worm propagation:
Figure 3: Ken Eichman data. Figure 4: Dave Goldsmith data.
From the worm scanning curves in Fig. 3 and Fig. 4, we can see that between 18:00 and 19:00 UTC both of these /16 networks received their maximum amount of worm scanning. After that, worm scanning activity slowly dropped until it stopped at 00:00 UTC. Comparing this with the infected host curve in Fig. 2, we can see that they are not consistent: after 19:00 UTC the number of infected hosts kept increasing while worm scanning kept dropping. Since the Code Red worm kept the same continuous scanning and spreading pattern throughout that day, this phenomenon can only be explained by the following three possible reasons:
In order to take into account the slowing down of the worm propagation process, the infection rate in the
typical epidemic equation must become a time-varying β(t), not a constant β. From the whole-Internet point of
view, patching, filtering, blocking, or disconnecting means that some hosts are removed from circulation, no
matter whether they are infected or merely vulnerable. So the removal process consists of two parts: removal of
infected hosts and removal of vulnerable hosts. Let us keep R(t) as the number of removed infected hosts at
time t, and use Q(t) as the number of removed vulnerable hosts at time t.
Consider the change in the number of vulnerable hosts from time t to time t + Δt:
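In the notation above, writing S(t) for the number of still-vulnerable (susceptible) hosts and I(t) for the number of infectious hosts (both our notation, kept consistent with the Kermack-McKendrick section), this bookkeeping step can be sketched as:

$$ S(t+\Delta t) - S(t) \;=\; -\,\beta(t)\,S(t)\,I(t)\,\Delta t \;-\; \bigl[\,Q(t+\Delta t) - Q(t)\,\bigr], $$

which, letting Δt → 0, gives dS(t)/dt = -β(t)S(t)I(t) - dQ(t)/dt. Combining this with the removal dynamics of R(t) and Q(t) yields the system the text refers to below as equation (6).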
We temporarily call equation (6) the two-factor worm model.
In order to obtain analytical results from equation (6), we have to know the dynamic properties of β(t), R(t),
and Q(t). β(t) is determined by Internet traffic engineering and by the pattern of the worm code; R(t) and Q(t)
involve people's attitude and awareness, the difficulty of patching and filtering, etc. By simplifying their
dynamics, we may be able to obtain some analytical results; we will do that in the future. For now, let us
first run some simulations and validate the model against the valuable Code Red worm propagation data collected
by others [2].
IV. Simulation of Code Red worm based on two-factor model
In order to verify our two-factor worm model, we ran Code Red worm simulations based on it.
1. Two-factor worm simulation model description
In order to simulate Code Red worm propagation as realistically as possible, we used 490,000 vulnerable hosts in
our simulation. We let each worm randomly pick one host in the host space to infect at the end of each infection
delay time, where the delay represents the time taken by the worm scanning process.
The infection process works like this: an infected host randomly picks one host in the host space, takes a
specific delay time (determined by the worm scanning process) to infect it, and then continues the same
infection cycle. There is no reinfection.
In order to account for the effect of blocking, patching, and filtering on worm propagation, we simply immunize
the same number of hosts as the number of ever-infected ones. We randomly pick non-immunized hosts for
immunization, no matter whether they are infected or still healthy.
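A minimal discrete-time sketch of this simulation procedure (our own simplification: the host space contains only the vulnerable hosts, each infected host makes one attack per step, and the per-infection delay-time equation given below is not reproduced here):

```python
import random

def simulate(num_vulnerable: int = 490_000, steps: int = 50, seed: int = 1):
    """Toy version of the described simulation: each infected, not-yet-immunized
    host attacks one random target per step; there is no reinfection; and as many
    hosts as have ever been infected are immunized, chosen at random regardless of
    whether they are infected or still healthy."""
    rng = random.Random(seed)
    infected = {0}                       # start from a single infected host
    immunized: set[int] = set()
    ever_infected = 1
    history = []
    for t in range(steps):
        attackers = list(infected - immunized)
        for _attacker in attackers:
            target = rng.randrange(num_vulnerable)
            if target not in infected and target not in immunized:
                infected.add(target)
                ever_infected += 1
        # Immunize (patch/block) hosts until the immunized count matches the
        # ever-infected count, picking hosts uniformly at random.
        while len(immunized) < min(ever_infected, num_vulnerable):
            candidate = rng.randrange(num_vulnerable)
            if candidate not in immunized:
                immunized.add(candidate)
        history.append((t, ever_infected, len(immunized)))
    return history
```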
We vary the worm infection delay time to capture the slowing-down effect on worm propagation. In our
simulation we use the following equation to generate the delay time:
2. Preliminary simulation results
We simulated three cases using the same parameters. The first case is the same as Staniford's model [1] and
some other research [7,8,9]; it does not consider the two factors in our model. The second case considers only
the traffic slowing-down effect. The third case is our two-factor model, which considers both factors. The
simulation results are shown in Fig. 5. (The infected node number is the total number of ever-infected hosts,
including those infected hosts that were later patched or blocked.)
3. Statistics of the worm simulation
Because many random events occur in our worm simulations, we want to know the statistical properties of the
simulation. Is the simulation sensitive to those random events?
We repeated the second-case simulation of Fig. 5 100 times. By taking the maximum and minimum values at each
time point, we derive two curves that contain all 100 simulation curves between them. We found that these two
curves are so close to each other that we cannot plot them separately; the maximum relative difference between
them is very small. So worm propagation behaves almost like a deterministic process.
The reason those random events have so little effect on worm propagation is that the target population is huge
and each worm performs its infections independently. From the point of view of the whole process, this huge
number of random events eventually averages out.
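Assuming the `simulate` sketch above, the max/min envelope procedure described here can be computed roughly as follows (a much smaller host population is used only so that the 100 repetitions finish quickly):

```python
def envelope(num_runs: int = 100, steps: int = 50):
    """Repeat the simulation with different seeds, take the pointwise min/max of
    the ever-infected curves, and report the largest relative gap between them."""
    runs = [[row[1] for row in simulate(num_vulnerable=10_000, steps=steps, seed=s)]
            for s in range(num_runs)]
    low = [min(run[t] for run in runs) for t in range(steps)]
    high = [max(run[t] for run in runs) for t in range(steps)]
    max_rel_diff = max((h - l) / h for l, h in zip(low, high))
    return low, high, max_rel_diff
```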
4. Some discussion
In our two-factor model simulations, those two curves slow down in infection speed when only 20% to 40% of the
total vulnerable hosts have been infected. This explains the fact that on July 19th only about 20% to 40% of
the vulnerable hosts were infected by the Code Red worm, a fact that Staniford's model cannot explain, as
discussed in section II.3.
When the Code Red and Nimda worms were spreading rampantly on the Internet, people made some scary predictions
of how fast such worms could infect almost every computer in the world. The "Warhol worm" presented by
Weaver [9] is even claimed to be able to infect nearly every vulnerable computer in the world within 15
minutes. But it turned out that those worms did not infect nearly so many computers on the Internet. Our
two-factor model is a first step toward explaining this fact.
It should be noted that the 21-second connection timeout in Code Red's scanning is set too long. If it had been
set to, say, 2 seconds, the Code Red worm would have spread much faster than it did on July 19th.
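As a rough back-of-envelope illustration of our own (assuming every probe of an unresponsive address waits out the full timeout): a host with 100 scanning threads probes dead addresses at a floor rate of about 100/21 ≈ 4.8 probes per second with a 21-second timeout, versus 100/2 = 50 probes per second with a 2-second timeout, roughly a tenfold increase in the worst-case scan rate.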
Patching, blocking, and filtering play an important role in slowing down and eliminating virus and worm
propagation. In reality, a great many viruses and worms come out almost every day, both in the past and today,
but only a few of them propagate widely on the Internet, and eventually all of them die out. Most viruses and
worms are not as contagious as Code Red and Nimda, so once the patching and filtering speed exceeds the
infection speed, they simply disappear from Internet circulation.
In the near future, we will try to simplify the dynamics of the parameters β(t), R(t), and Q(t) in equation (6)
in order to obtain some analytical results.