Tumbling Down the Rabbit Hole: Exploring the Idiosyncrasies of Botmaster Systems in a Multi-Tier Botnet Infrastructure
In this study, we advance the understanding of botmaster-owned systems in an advanced botnet, Waledac, through
the analysis of file-system and network trace data from the upper tiers of its architecture. The functionality and existence of these systems have to date only been postulated, as existing knowledge has generally been limited to behavioral observations from hosts infected by bot binaries. We describe our new findings for this botnet relating to botmaster interaction,
topological nuances, provided services, and malicious output, providing a more complete view of the botnet infrastructure and insight into the motivations and methods of sophisticated botnet deployment. The exposure of these explicit
details of Waledac reveals and clarifies overall trends in the construction of advanced botnets with tiered architectures, both past (such as the Storm botnet, which featured a highly similar architecture) and future. Implications of our findings are discussed, addressing how the botnet's auditing activities, authenticated spam dispersion technique, repacking
method, and tier utilization affect remediation and challenge current notions of botnet configuration and behavior.
Insights from the Inside: A View of Botnet Management from Infiltration
Recent work has leveraged botnet infiltration techniques to track
the activities of bots over time, particularly with regard to
spam campaigns. Building on our previous success in reverse-engineering
C&C protocols, we have conducted a 4-month infiltration
of the MegaD botnet, beginning in October 2009. Our
infiltration provides us with constant feeds on MegaD's complex
and evolving C&C architecture as well as its spam operations,
and provides an opportunity to analyze the botmasters'
operations. In particular, we collect significant evidence on the
MegaD infrastructure being managed by multiple botmasters.
Further, FireEye's attempt to shut down MegaD on Nov. 6, 2009,
which occurred during our infiltration, allows us to gain an inside
view on the takedown and how MegaD not only survived it
but bounced back with significantly greater vigor.
In addition, we present new techniques for mining information
about botnet C&C architecture: "Google hacking" to dig out
MegaD C&C servers and "milking" C&C servers to extract not
only the spectrum of commands sent to bots but also the C&C's overall
structure. The resulting overall picture then gives us insight
into MegaD's management structure, its complex and evolving
C&C architecture, and its ability to withstand takedown.
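As a rough illustration of the "milking" technique mentioned above, the Python sketch below polls a C&C server repeatedly and accumulates the distinct command types it hands out. The poll_fn and parse_fn callables are hypothetical stand-ins for reverse-engineered protocol handlers; MegaD's actual C&C protocol is not reproduced here.

import time

def milk(poll_fn, parse_fn, rounds=1000, interval=60):
    """Repeatedly poll a C&C server and accumulate the distinct commands it hands out.

    poll_fn  -- sends one protocol-conformant check-in and returns the raw reply
    parse_fn -- decodes a reply into (command_type, payload) tuples
    Both are assumed to come from prior reverse engineering of the C&C protocol.
    """
    seen = {}
    for _ in range(rounds):
        reply = poll_fn()
        for ctype, payload in parse_fn(reply):
            # Keep one example payload per command type; the set of types and how
            # it changes over time hints at the C&C's overall structure.
            seen.setdefault(ctype, payload)
        time.sleep(interval)   # low polling rate, so the probe stays unobtrusive
    return seen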
The Nocebo Effect on the Web: An Analysis of Fake Anti-Virus Distribution
We present a study of Fake Anti-Virus attacks on the
web. Fake AV software masquerades as a legitimate security
product with the goal of deceiving victims into
paying registration fees to seemingly remove malware
from their computers. Our analysis of 240 million web
pages collected by Google's malware detection infrastructure
over a 13-month period discovered more than 11,000
domains involved in Fake AV distribution. We show that
the Fake AV threat is rising in prevalence, both absolutely and relative to other forms of web-based malware.
Fake AV currently accounts for 15% of all malware
we detect on the web. Our investigation reveals
several characteristics that distinguish Fake AVs from
other forms of web-based malware and shows how these
characteristics have changed over time. For instance,
Fake AV attacks frequently occur via web sites likely to reach many users, including spam web sites and online ads. These attacks account for 60% of the malware discovered on domains that include trending keywords. As of this writing, Fake AV is responsible for 50% of all malware delivered via ads, which represents a five-fold increase from just a year ago.
Spying the World from Your Laptop: Identifying and Profiling Content Providers and Big Downloaders in BitTorrent
This paper presents a set of exploits an adversary can
use to continuously spy on most BitTorrent users of the
Internet from a single machine and for a long period of
time. Using these exploits over a period of 103 days, we collected 148 million IP addresses downloading 2 billion copies of content.
We identify the IP addresses of the content providers for 70% of the BitTorrent contents we spied on. We show that a few content providers inject most contents into BitTorrent and that those content providers are located in foreign data centers. We also show that an adversary can compromise the privacy of any peer in BitTorrent and identify the big downloaders, which we define as the peers who subscribe to a large number of contents. This
infringement on users' privacy poses a significant impediment
to the legal adoption of BitTorrent.
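The paper's exploits are not reproduced here, but one building block of this kind of large-scale monitoring is decoding the compact peer lists that BitTorrent trackers return for each torrent. The sketch below illustrates only that step; obtaining the tracker response itself (the announce request and bencode decoding) is assumed and elided.

import ipaddress
import struct

def parse_compact_peers(blob: bytes):
    """Decode a 'compact' peer list from a BitTorrent tracker response.

    Each peer is packed as 4 bytes of IPv4 address followed by a 2-byte
    big-endian port. Collecting these lists repeatedly, across many torrents,
    yields the IP-level view of downloaders described above.
    """
    peers = []
    for offset in range(0, len(blob) - len(blob) % 6, 6):
        ip = str(ipaddress.IPv4Address(blob[offset:offset + 4]))
        (port,) = struct.unpack("!H", blob[offset + 4:offset + 6])
        peers.append((ip, port))
    return peers

# Example: two packed peers, 10.0.0.1:6881 and 192.0.2.7:51413
sample = bytes([10, 0, 0, 1, 0x1A, 0xE1, 192, 0, 2, 7, 0xC8, 0xD5])
print(parse_compact_peers(sample))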
WebCop: Locating Neighborhoods of Malware on the Web
In this paper, we propose WebCop to identify malicious
web pages and neighborhoods of malware on the internet.
WebCop takes a bottom-up approach: telemetry data from commercial Anti-Malware (AM) clients running on millions of computers first identify malware distribution sites hosting malicious executables on the web. Next, traversing the hyperlinks of a web graph, constructed by a commercial search engine crawler, in the reverse direction quickly discovers the malware landing pages that link to these distribution sites. In addition, the malicious distribution sites and the web graph are used to identify neighborhoods of malware, to locate additional executables distributed on the internet which may be unknown malware, and to identify false positives in AM signatures.
We compare the malicious URLs generated by
the proposed method with those found by a commercial drive-by download approach and show that the lists are independent;
both methods can be used to identify malware
on the internet and help protect end users.
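The reverse-traversal step can be pictured with the following minimal sketch, which walks a hyperlink graph backwards from known distribution URLs. It is an illustrative simplification, not WebCop's actual implementation; the edge list and seed URLs are assumed to come from the crawler and AM telemetry described above.

from collections import defaultdict, deque

def find_landing_pages(web_graph_edges, distribution_urls, max_hops=2):
    """Walk the hyperlink graph backwards from known malware distribution URLs.

    web_graph_edges   -- iterable of (source_page, linked_page) pairs from a crawler
    distribution_urls -- URLs that AM telemetry flagged as serving malicious executables
    Returns pages that reach a distribution URL within max_hops links, i.e. the
    candidate landing pages and malware neighborhoods.
    """
    # Build the reverse adjacency list: linked_page -> pages that link to it.
    reverse_links = defaultdict(set)
    for src, dst in web_graph_edges:
        reverse_links[dst].add(src)

    found = {}
    queue = deque((url, 0) for url in distribution_urls)
    while queue:
        url, hops = queue.popleft()
        if url in found or hops > max_hops:
            continue
        found[url] = hops
        for linker in reverse_links[url]:
            queue.append((linker, hops + 1))

    # Everything beyond hop 0 links (directly or transitively) to a distribution site.
    return {url for url, hops in found.items() if hops > 0}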
On the Potential of Proactive Domain Blacklisting
In this paper we explore the potential of leveraging properties
inherent to domain registrations and their appearance in
DNS zone files to predict the malicious use of domains proactively,
using only minimal observation of known-bad domains
to drive our inference. Our analysis demonstrates that our inference
procedure derives on average 3.5 to 15 new domains
from a given known-bad domain. 93% of these inferred domains
subsequently appear suspect (based on third-party assessments),
and nearly 73% eventually appear on blacklists
themselves. For the latter, proactively blocking based on our predictions provides a median head start of about 2 days versus using a reactive blacklist, though this gain varies widely for
different domains.
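As a loose illustration of registration-based inference of this kind, the sketch below clusters zone-file entries by shared nameservers and nearby first-seen times and flags domains that fall into a cluster containing a known-bad domain. The clustering keys and the window parameter are illustrative assumptions, not the paper's exact procedure.

from collections import defaultdict

def infer_candidates(zone_records, known_bad, window_seconds=3600):
    """Flag domains that share registration features with a known-bad domain.

    zone_records   -- records like {"domain": "...", "nameservers": {...}, "first_seen": <unix time>}
    known_bad      -- a small set of domains already known to be malicious
    window_seconds -- how close in time two registrations must be to count as one batch
    """
    records = list(zone_records)
    by_ns = defaultdict(list)
    for rec in records:
        for ns in rec["nameservers"]:
            by_ns[ns].append(rec)

    candidates = set()
    for bad in (r for r in records if r["domain"] in known_bad):
        for ns in bad["nameservers"]:
            for rec in by_ns[ns]:
                same_batch = abs(rec["first_seen"] - bad["first_seen"]) <= window_seconds
                if same_batch and rec["domain"] not in known_bad:
                    candidates.add(rec["domain"])
    return candidates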
Detection of Spam Hosts and Spam Bots Using Network Flow Traffic Modeling
In this paper, we present an approach for detecting e-mail spam-originating hosts, spam bots, and their respective controllers based on network flow data and DNS metadata.
Our approach consists of first establishing SMTP traffic models
of legitimate vs. spammer SMTP clients and then classifying
unknown SMTP clients with respect to their current SMTP
traffic distance from these models. An entropy-based traffic
component extraction algorithm is then applied to traffic flows
of hosts identified as e-mail spammers to determine whether
their traffic profiles indicate that they are engaged in other
exploits. Spam hosts that are determined to be compromised are processed further to identify their command-and-control servers using a two-stage approach that involves the calculation of
several flow-based metrics, such as distance to common control
traffic models, periodicity, and recurrent behavior. DNS passive
replication metadata are analyzed to provide additional evidence
of abnormal use of DNS to access suspected controllers. We
illustrate our approach with examples of detected controllers
in large HTTP(S) botnets such as Cutwail, Ozdok and Zeus,
using flow data collected from our backbone network.
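A minimal sketch of the model-distance classification step might look as follows, using a single binned flow feature and the Hellinger distance as the traffic-distance measure. The feature name and distance metric are illustrative assumptions rather than the paper's full multi-metric models.

import math
from collections import Counter

def feature_distribution(flows, feature):
    """Empirical distribution of one flow feature (e.g. binned bytes per SMTP flow)."""
    counts = Counter(f[feature] for f in flows)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    keys = set(p) | set(q)
    return math.sqrt(0.5 * sum((math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2
                               for k in keys))

def classify_smtp_client(client_flows, legit_model, spam_model, feature="bytes_bin"):
    """Label an SMTP client by which reference traffic model it sits closer to.

    legit_model / spam_model are distributions built offline from labeled flow data;
    "bytes_bin" is a placeholder feature name.
    """
    observed = feature_distribution(client_flows, feature)
    d_legit = hellinger(observed, legit_model)
    d_spam = hellinger(observed, spam_model)
    return "spammer" if d_spam < d_legit else "legitimate"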
Extending Black Domain Name List by Using Co-occurrence Relation between DNS Queries
Botnet threats, such as server attacks or the sending of spam email, have been increasing. A method of using a blacklist of domain names has been proposed to find infected hosts. However, not all infected hosts can be found by this method because a blacklist does not cover all black domain names. In this paper, we present a method for finding unknown black domain names and extending the blacklist by using DNS traffic data together with the original blacklist of known black domain names. We use the co-occurrence relation between two different domain names to find unknown black domain names and extend the blacklist. If a domain name frequently co-occurs with a known black domain name, we assume that the domain name is also black. We evaluate the proposed method by cross-validation: about 91% of the domain names in the validation list are ranked within the top 1% of candidates.
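A minimal sketch of the co-occurrence scoring could look as follows, counting how often each unknown domain is queried by the same client shortly after a blacklisted domain. The time window and the scoring scheme are illustrative assumptions.

from collections import defaultdict

def cooccurrence_scores(query_log, blacklist, window=60):
    """Score domains by how often they are queried close in time to blacklisted ones.

    query_log -- iterable of (timestamp, client_ip, domain) DNS query records
    blacklist -- set of known black domain names
    window    -- co-occurrence window in seconds (an assumed parameter)
    Returns {domain: count of co-occurrences with any blacklisted domain}.
    """
    # Group queries per client and sort by time so a sliding window can be applied.
    per_client = defaultdict(list)
    for ts, client, domain in query_log:
        per_client[client].append((ts, domain))

    scores = defaultdict(int)
    for client, queries in per_client.items():
        queries.sort()
        for i, (ts_i, dom_i) in enumerate(queries):
            if dom_i not in blacklist:
                continue
            # Count every non-blacklisted domain queried within the window.
            j = i + 1
            while j < len(queries) and queries[j][0] - ts_i <= window:
                dom_j = queries[j][1]
                if dom_j not in blacklist:
                    scores[dom_j] += 1
                j += 1
    return scores   # high-scoring domains are candidates for blacklist extension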
Are Text-Only Data Formats Safe? Or, Use This LaTeX Class File to Pwn Your Computer
We show that malicious TeX, BibTeX, and METAPOST files can lead to arbitrary code execution, viral infection, denial of service, and data exfiltration through the file I/O capabilities exposed by TeX's Turing-complete macro language. This calls into question the conventional wisdom that text-only data formats that do not access the network are likely safe. We build a TeX virus that spreads between documents on the MiKTeX distribution on Windows XP; we demonstrate data exfiltration attacks on web-based LaTeX previewer services.
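The file I/O and shell-escape primitives at issue can be spotted with a simple scan. The sketch below is a defensive illustration, not the paper's attack, and its primitive list is illustrative rather than exhaustive, since macro expansion can easily hide these tokens.

import re
import sys

# Primitives that give TeX documents file I/O or shell access; their presence in an
# untrusted document signals the kind of capability the paper abuses. The list is
# illustrative, not exhaustive, and several of these are also common in benign files.
RISKY_PRIMITIVES = [r"\\write18", r"\\openout", r"\\openin", r"\\input", r"\\read"]

def flag_risky_tex(path):
    text = open(path, encoding="utf-8", errors="replace").read()
    hits = []
    for pattern in RISKY_PRIMITIVES:
        for match in re.finditer(pattern, text):
            line_no = text.count("\n", 0, match.start()) + 1
            hits.append((line_no, match.group(0)))
    return hits

if __name__ == "__main__":
    for line_no, primitive in flag_risky_tex(sys.argv[1]):
        print(f"line {line_no}: {primitive}")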
DNS Prefetching and Its Privacy Implications: When Good Things Go Bad
A recent trend in optimizing Internet browsing speed is to optimistically pre-resolve (or prefetch) domain names.
While the practical benefits of doing so are still being debated,
this paper attempts to raise awareness that current
practices could lead to privacy threats that are ripe for
abuse. More specifically, although the adoption of several browser optimizations has already raised security concerns, we examine how prefetching amplifies disclosure
attacks to a degree where it is possible to infer the
likely search terms issued by clients using a given DNS
resolver. The success of these inference attacks relies on
the fact that prefetching inserts a significant amount of
context into a resolver's cache, allowing an adversary to
glean far more detailed insights than when this feature is
turned off.
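The cache-inspection side of such an inference attack is commonly carried out with non-recursive "cache snooping" probes. The sketch below, which assumes the dnspython package and a resolver that answers cache-only queries, checks whether a candidate name is already cached; the resolver address and candidate names are placeholders.

import dns.flags
import dns.message
import dns.query

def cached(resolver_ip: str, name: str) -> bool:
    """Return True if the resolver answers the name from cache (non-recursive query)."""
    query = dns.message.make_query(name, "A")
    query.flags &= ~dns.flags.RD          # clear recursion-desired: cache-only answer
    response = dns.query.udp(query, resolver_ip, timeout=2.0)
    return len(response.answer) > 0

# Probing names that a browser would prefetch (e.g. for suggested search results)
# lets an observer guess which terms clients of that resolver recently typed.
for candidate in ["example-term-one.com", "example-term-two.com"]:
    print(candidate, cached("192.0.2.53", candidate))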
Honeybot, Your Man in the Middle for Automated Social Engineering
Automated Social Engineering poses a serious information
security threat to human communications on the Internet
since the attacks can easily scale to a large number of
victims. We present a new attack that instruments human conversations for social engineering or spamming. The attack is rarely detected, as evidenced by link click rates of up to 76.1%. This new attack poses a challenge for detection mechanisms and user education.
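A man-in-the-middle conversation relay of this kind can be pictured with the minimal sketch below, which forwards each chat message between two victims while occasionally rewriting or injecting a link. The URL, probability, and rewrite strategy are illustrative assumptions, not details from the paper.

import random
import re

BAIT_URL = "http://example.test/click"   # placeholder for an attacker-controlled link

def relay_message(text: str, inject_rate: float = 0.05) -> str:
    """Forward one chat message between two victims, occasionally tampering with it.

    Each victim believes they are talking directly to the other; the relay stays
    plausible because most messages pass through unchanged.
    """
    # Rewrite any URL the victims exchange so clicks go through the attacker.
    tampered = re.sub(r"https?://\S+", BAIT_URL, text)
    # Occasionally append a link even when the message contained none.
    if tampered == text and random.random() < inject_rate:
        tampered = text + " " + BAIT_URL
    return tampered

# Example: a message from victim A is tampered with before delivery to victim B.
print(relay_message("check out http://example.org/article"))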