Figure 7 provides a graphical representation of one example cluster, with nodes indicating distinct botnets and edges indicating relationships between different botnets. The label on each edge reflects the pairwise similarity score. It is evident from this graph that botnet relationships can evolve to form rather complex clusters that significantly complicate the task of estimating botnet membership.
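As an illustration of how clusters like the one in Figure 7 could be derived from pairwise scores, the following minimal sketch treats each cluster as a connected component of the similarity graph. The function name `cluster_botnets`, the `threshold` parameter, and the example scores are assumptions made for illustration, not the procedure used to produce the figure.

```python
from collections import defaultdict


def cluster_botnets(edges, threshold=0.0):
    """Group botnets into clusters (connected components of the similarity graph).

    edges: iterable of (botnet_a, botnet_b, score) tuples.
    threshold: hypothetical cutoff; edges scoring below it do not link botnets.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in edges:
        # Look up both endpoints first so weakly linked botnets still
        # appear in the output as singleton clusters.
        root_a, root_b = find(a), find(b)
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    # Largest clusters first; ties broken alphabetically for stable output.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


# Made-up example scores, not measured data.
example_edges = [("A", "B", 0.9), ("B", "C", 0.6), ("D", "E", 0.8), ("E", "F", 0.2)]
example_clusters = cluster_botnets(example_edges, threshold=0.5)
```

Under this sketch, the sizes of the returned clusters are the quantity whose distribution a CDF of botnets per cluster would summarize.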
Figure 8 plots the CDF of the number of botnets affiliated with each botnet cluster we discovered. The graph indicates that botnet clusters can span relatively large collections of botnets. Finally, we note that while code reuse [2] could explain the commonalities across some of the features we chose (e.g., IRC server version), other common features, such as channel names and botmaster IDs, are more likely to indicate intentional botnet relationships. Feature selection and the assignment of proper weights to each feature are subjects of our ongoing work.
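To make the weighting question concrete, the sketch below scores a pair of botnets as the weighted fraction of features on which they agree. The feature keys, the weight values, and the function name `similarity` are all hypothetical; they only illustrate the idea of discounting features that code reuse could explain (IRC server version) relative to features more suggestive of an intentional relationship (channel names, botmaster IDs).

```python
def similarity(botnet_a, botnet_b, weights):
    """Weighted fraction of features on which two botnets agree (0.0 to 1.0).

    botnet_a, botnet_b: dicts mapping feature name to observed value.
    weights: dict mapping feature name to a non-negative weight (assumed values).
    """
    total = sum(weights.values())
    if total == 0:
        return 0.0
    matched = sum(
        weight
        for feature, weight in weights.items()
        # A feature missing from either botnet never counts as a match.
        if botnet_a.get(feature) is not None
        and botnet_a.get(feature) == botnet_b.get(feature)
    )
    return matched / total


# Hypothetical weights: server version is down-weighted because shared
# code alone could explain a match on it.
WEIGHTS = {"irc_server_version": 1.0, "channel_name": 3.0, "botmaster_id": 4.0}

# Made-up observations for two botnets.
bot_x = {"irc_server_version": "v1", "channel_name": "#alpha", "botmaster_id": "m1"}
bot_y = {"irc_server_version": "v1", "channel_name": "#beta", "botmaster_id": "m1"}
score_xy = similarity(bot_x, bot_y, WEIGHTS)
```

With these assumed weights, a match on server version alone contributes far less to the score than a shared botmaster ID, which is the behavior a tuned weighting would be expected to formalize.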