Finally, we evaluate the extent to which malware is replicated across
the different distribution sites. To do so, we use the same metric as in
Equation 1 to calculate the normalized pairwise
intersection of the sets of malware hashes served by each pair of
distribution sites. Our results show that in
of the malware
distribution sites, at least one binary is shared between a pair of
sites. Although malware hashes change frequently as a result of
obfuscation, our results suggest that there is still some
content replication across the different
sites. Figure 13 shows the normalized pairwise
intersection of the malware sets across these distribution
networks. As the graph shows, binaries are shared less frequently
between distribution sites than landing sites are, but taken as a
whole, there is still a non-trivial degree of similarity among these
networks.
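
The pairwise computation described above can be illustrated with a minimal sketch. Equation 1 is not reproduced in this section, so the normalization used here (the size of the intersection divided by the size of the first site's set) is an assumption, and the true metric may normalize differently. The site names, hash values, and function names are hypothetical and serve only as an example.

```python
from itertools import permutations


def normalized_intersection(a, b):
    """Return |a & b| / |a| (assumed normalization), or 0.0 if a is empty."""
    if not a:
        return 0.0
    return len(a & b) / len(a)


def pairwise_overlap(hashes_by_site):
    """Map each ordered pair of distinct sites to its normalized intersection."""
    return {
        (i, j): normalized_intersection(hashes_by_site[i], hashes_by_site[j])
        for i, j in permutations(hashes_by_site, 2)
    }


def fraction_sharing(hashes_by_site):
    """Fraction of sites that share at least one hash with some other site."""
    if not hashes_by_site:
        return 0.0
    overlap = pairwise_overlap(hashes_by_site)
    sharing = {i for (i, _), value in overlap.items() if value > 0}
    return len(sharing) / len(hashes_by_site)


# Hypothetical input: the set of malware hashes served by each site.
sites = {
    "site_a": {"h1", "h2", "h3", "h4"},
    "site_b": {"h3", "h4"},
    "site_c": {"h9"},
}

overlap = pairwise_overlap(sites)
shared_fraction = fraction_sharing(sites)
```

Under this assumed normalization the measure is asymmetric: a small site whose binaries all appear on a larger site scores 1.0 against it, while the larger site scores lower in the other direction. Note also that because obfuscation changes hashes, exact-hash matching of this kind can only undercount replication of the underlying binaries.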