Check out the new USENIX Web site. next up previous
Next: Estimating byte counts Up: Estimators based on flow Previous: Estimators based on flow


Estimating packet counts

The packet counter $ c_s$ in an entry is initialized to $ 1$ when the first packet of the flow gets sampled, and it is incremented for all subsequent packets belonging to the flow. Let $ s$ be the number of packets in the flow at the input of the flow slicing algorithm. gives the formula for our estimator $ \widehat{s}$ for the number of packets in the flow.

$\displaystyle \widehat{s}=1/p-1+c_s$ (1)

Lemma 1   $ \widehat{s}$ as defined in is an unbiased estimator of $ s$.

Proof: By induction on the number of packets $ s$.

Base case: If $ s=1$, the only packet of the flow is sampled with probability $ p$ and in that case it is counted as $ 1/p-1+1=1/p$ packets. With probability $ 1-p$ it is not sampled (and it counts as 0). Thus $ E[\widehat{s}]=p\cdot1/p+0=1=s$.

Inductive step: By induction hypothesis, we know that for a flow with $ s'=s-1$, $ E[\widehat{s'}]=s'=s-1$. Also since the flow slice length $ t$ and the inactivity timeout $ t_{inactive}$ are larger than the bin size, we know that once the flow gets an entry, all its packets within the bin will get counted by $ c_s$. There are two possible cases: the first packet of the flow gets sampled, and we get $ c_s=s$, or it doesn't and then the value of $ c_s$ and $ \widehat{s}$ will be the same as those for a flow with $ s'=s-1$ packets for which the sampling decisions are the same as for the rest of the packets of our flow.

$\displaystyle E[\widehat{s}]$ $\displaystyle =$ $\displaystyle p\cdot(1/p-1+s)+(1-p)E[\widehat{s'}]$  
  $\displaystyle =$ $\displaystyle 1-p+ps+(1-p)(s-1)=s$  

$ \blacksquare$

If we sample packets randomly with probability $ q$ before applying the flow slicing algorithm, we will want to estimate the number of packets $ S$ at the input of the packet sampling stage. Since $ E[s]=qS$, it is easy to show that $ \widehat{S}=1/q\widehat{s}$ is an unbiased estimator for $ S$.


next up previous
Next: Estimating byte counts Up: Estimators based on flow Previous: Estimators based on flow
Ramana Rao Kompella 2005-08-12