Infected System Notifications are Harder Than You Think

Over the past few years, we have seen an explosion of “infected system notification systems.”  Some of the more successful ones target the more controlled networks, where determining the actual infected device is relatively easy.  Those successes led to a call to notify every user on every network when an “infected system” was found.  But networks vary in their ability to identify a specific device, and there is no generally accepted definition of “infected.”  Hope is not lost, or it shouldn’t be: the APWG operates a notification system and continually wants to improve its effectiveness and usefulness.  Or maybe we just need better definitions and expectations.  [I started this without a heavy technical bias. That changed as I added explanations to help my cause/rant.  Sorry.]

Do Notification Systems Work?

In my opinion, one of the better notification systems is run by the REN-ISAC for the US educational community, but they get to cheat a little.  Most colleges and universities require their users to “register” their devices for use on the campus networks.  This registration (by various means) allows for very easy identification of the infected device, whether it is a Windows or Macintosh desktop, laptop, tablet, phone, or even a gaming console.  But in their environment, they can do that.  Some corporate environments have the same regime, although it may be implemented differently.

These contrast widely with the residential cable modem deployments and “open” hotspot environments at the local coffee shop or a park, where a user shows up, connects, and quickly leaves the premises.  Notifying the network provider of Starbucks or Union Square in San Francisco, or even the provider of the Wi-Fi service on an airplane, is a waste of time… they have no idea who was there.  And they couldn’t help the user even if they did.  Figuring out which networks to notify and which to ignore is a laborious task (maybe it needs a government grant).

Home cable providers and others that use NAT extensively fall somewhere between the two previous extremes.  My cable provider knows that I have an address; they don’t know which of the 11 NAT’d devices behind their router is the infected device.  And most users aren’t going to be able to figure it out either.  The goal of the notifications is to get the device repaired or reconfigured, or maybe even destroyed :-), not just to tell someone (maybe the wrong someone) about it.  Oh, and to be able to do statistics so we can gauge how well we’re doing and identify areas that may need more work.

Sending the Notification and Getting a Response

If we detect traffic that matches malicious signatures, or recover identifying information from systems participating in an attack, and decide that the IP Addresses are genuine, we can trace an IP Address back to a network operator relatively easily through the whois system or by decoding network routing data.  So we’ve figured out whom to notify.  What do we tell them?  Depending on the recipient’s place in the network food chain, there may be problems that they couldn’t fix even if we told them about them.  As a reference, my network food chain looks like this, from bottom to top:

  1. An end-user device which may contain password or money-stealing malware.
  2. An end-user device that may have a protocol or design flaw such as a home router with an open DNS or SNMP responder.
  3. A web server that accepts extraneous traffic that causes a local denial of service (DoS). Think of responding to Internet-sourced SNMP requests.
  4. A data center that accepts extraneous traffic for its customers, like DNS replies to a webserver.
  5. Networks that don’t filter their BGP peers or prevent spoofed traffic. [You probably thought I was going to say “networks that don’t filter bad traffic,” but that is really hard in an aggregated network so I won’t say it].

Separating the users into levels helps me describe the response failures better.  If I’m the end-user with the malware (#1), I’m pretty concerned about removing that malware from my system; if I’m the network provider (#5), I’m maybe not so concerned.  If a data center server (#3 or #4) is under a DoS attack, the end-user probably doesn’t care as long as the video keeps playing; the network provider may not care either.  There are lots of plans to build sexy notification systems (some even have government funding), but it’s unclear how to get the right data to the right spot in order to get the device cleaned (the overall goal).  Is it better to have a 20-million-customer network with no infected systems but so many open relays that the network is always used as a DoS platform (i.e., users are happy, rest of Internet not so much)?  Or is it better for the network to source no DoS attacks while its customers, robbed by malware, have no money to pay their Internet bill (i.e., rest of Internet happy; local users annoyed)?

How to Stimulate Parties to Clean Up

There are many efforts to try to convince the infected to get clean.  Some of these include name calling, name-and-shame lists, blacklisting networks with too many infections, or even convincing government to pay people to clean up.  Most of these efforts treat notification as a single problem, but it’s not.  For example, notifying a user that their network provider has open DNS proxies will probably not get the proxies disabled.  Likewise, notifying a network operator that a customer’s end device behind a NAT is infected may not actually reach the infected device’s user.  And throwing money at someone to get them to act never works (you can never stop the money flow).

There has to be enough information in the notification to make infected device identification and subsequent actions really apparent, too.  Notifying a data center operator that an IP Address has an infection becomes humorous when that IP Address is a server with 200 virtual webservers on it.  Any idea which of the 200 has the malicious page?  Neither does the data center operator.
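As a rough illustration of “enough information,” here is a minimal Python sketch that checks whether a report carries the details a receiver would need before it is sent.  The Report fields and the missing_details helper are hypothetical names made up for this example, not part of any existing notification system, and the rule shown (a shared host needs the full URL) is only one assumption about what “enough” means.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Report:
    """Hypothetical report about one suspicious IP address."""
    ip: str
    observed_at: Optional[str] = None  # when the activity was seen (e.g. ISO 8601, UTC)
    url: Optional[str] = None          # full URL of the malicious page, if known


def missing_details(report: Report, shared_host: bool) -> List[str]:
    """Return the names of fields the receiver would still need."""
    missing = []
    if not report.observed_at:
        # Without a time, the receiver cannot match the report to its own logs.
        missing.append("observed_at")
    if shared_host and not report.url:
        # One address may serve hundreds of virtual webservers;
        # only the URL says which site carries the malicious page.
        missing.append("url")
    return missing
```

For a server hosting 200 virtual webservers, a report with only an address and a time would come back with “url” listed as missing, which is the cue to hold the notification until the sender supplies it.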

I think there are at least two types of notifications: (1) a network operator one and (2) an end-device one.  The network operator notification is deployed when their network’s IP space was used in an attack and the attack characteristics lead to a network issue, not a user device issue (for example, open resolvers or proxies).  The end device notification is sent when an actual infected device is discovered or the attack characteristics lead to a local device error.  Although network operators should be notified of infected end devices—they may have in-house assistance available to the user—the goal would be for them to pass the notification on to the appropriately identified user.
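The split described above could be sketched in a few lines of Python.  The issue labels and the notification_type function are assumptions invented for this illustration; a real system would need a much longer and more carefully argued list.

```python
# Issues that point at the network itself rather than at one user's device.
NETWORK_ISSUES = {"open_resolver", "open_proxy"}

# Issues that point at a specific device someone has to repair or reconfigure.
DEVICE_ISSUES = {"malware", "device_misconfiguration"}


def notification_type(issue: str) -> str:
    """Pick which class of notification a reported issue should trigger."""
    if issue in NETWORK_ISSUES:
        return "network_operator"
    if issue in DEVICE_ISSUES:
        return "end_device"
    # Unknown issues are left for a human to sort out rather than guessed at.
    return "unclassified"
```

With these labels, notification_type("open_resolver") selects the network operator notification and notification_type("malware") selects the end-device one.  The end-device notification would still travel through the operator, who is asked to pass it on to the right user.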

So, although notifying network operators or users about their infected devices can be worthwhile, more effort needs to be put into the notification paths and the data to include in the notification message so that the receiver of the notifications can figure out what to do to help remediation.

Normal Notification Method

The normal notification flow starts with the identification of an infected IP Address.  The network provider of that IP Address is then notified of the infection.  The network provider will (hopefully) figure out the customer of that IP Address at that time and pass the notification on to them.  That customer may have to do some digging to figure out what actual device to inspect or which sub-customer to notify.

Some providers don’t want to do the added effort required to forward notifications; some providers can’t identify the end user to notify.  Some thought has gone into trying to outmaneuver these providers.

Alternative Notification Methods

If a group trying to plot a notification strategy sat around a table and tossed out odd ideas, they may end up with some new ways to work the problem.  Maybe.  One or two of those ideas are discussed below and could be pilot-worthy activities.

Self-Identification Site

If we can’t figure out who owns the device—particularly behind NATs—maybe we can get the owner to tell us.  Or at least notify the owner directly through some trusted, but impolite, means that may not involve the network operator.  Remember that the goal is to get the device cleaned or reconfigured, not to sell them more stuff—or to make fun of them.

There is some early work being done on constructing a website where users could be pointed to check their own devices for infection or to see if they are included on an infected system list.  A PSA campaign akin to our “stop.think.connect.” effort would drive users to the site(s).  Maybe “Spot.Tell.Clean.” to keep with the STC theme.  Two issues crop up, though.  One is how to keep false positives very low so the users have confidence in the system.  The other is what to do when you confirm the device is infected: do we send the user to a disinfection site, do we refer them to another vendor, or do we do something else?

Notification via Trusted Party

Another idea that has been piloted in a few places but has not caught on is for a party “trusted” by the user—maybe their bank, town, school, or community group—to convey the infected system information to the user when the user logs in or otherwise interacts with the trusted party.  I expect that parties that use pop-up windows or deliver odd advertising may not be “trusted” enough by the user to be used here, but many users have a handful of websites and other Internet properties from whom they will take advice.  A challenge here is to be creative enough to keep malicious parties from impersonating the trusted party to deliver malware to the—already infected or not—user.

Parting Words

I think the notification efforts help in identifying systems that need attention; the challenge is how to implement the effort in an efficient, useful-to-everyone way.  Many of the operational notification systems are very country-specific; that is, infected systems outside of the operation area are ignored.  The APWG’s approach is not to operate a detection network but to rely on other parties to send us infected system identifications, while we concentrate on the notification part, routing them to the exact party with authority to act.  Such a system will allow us to collect that ignored data and also give our members that have a large Internet presence a way to report infected systems they detect.

One complaint from network operators is that the notifications they receive lack important data.  Dividing the notifications into different classes allows for different data to be sent in each type, hopefully making the process more efficient.  To assist in that efficiency, we are expecting to let network operators designate where to send their notification and to identify portions of their network space where notifications are not useful, such as coffee shops or hot spots.
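Here is a minimal sketch of how such operator preferences might be honored, using Python’s standard ipaddress module.  The OperatorPreferences class, its fields, and the example addresses (taken from the ranges reserved for documentation) are all assumptions for illustration, not a description of the actual APWG system.

```python
import ipaddress
from typing import List, Optional


class OperatorPreferences:
    """Hypothetical record of where an operator wants notifications sent."""

    def __init__(self, contact: str, suppressed: List[str]):
        self.contact = contact
        # Ranges (in CIDR notation) where the operator says notifications
        # are not useful, such as hot spots.
        self.suppressed = [ipaddress.ip_network(n) for n in suppressed]

    def destination(self, ip: str) -> Optional[str]:
        """Return the contact for this address, or None if suppressed."""
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in self.suppressed):
            return None
        return self.contact
```

An operator who registers abuse@example.net as the contact and 198.51.100.0/24 as hot-spot space would get None back from destination("198.51.100.7"), so nothing is sent, while an address outside that range would be routed to the registered contact.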

So we’re trying out lots of new ideas; some of them could be considered silly.  First, we’re not going to look for infected systems; our trusted friends will tell us which systems need to be notified.  Second, we require a good chunk of information from our friends so we can figure out which type of notification to send.  Third, we’ll let network operators identify parts of their networks where we shouldn’t notify them.  Fourth, the network operators can tell us the language to use in the notifications and where to send them.  Fifth, let’s try some unconventional ways for people to find out their device is infected: maybe a website, maybe a tweet.  And last, we’ll do statistics.  We’ll know who isn’t cleaning up their networks and which notifications don’t work and lots of other statistical things.

I’ll close as I opened: notifications are hard.  Maybe some of our ideas will fail miserably (wouldn’t be the first time); maybe some will work.  But it’s more fun to try and screw up than to just screw up.