We remind the reader that a large number of distinguishing features is
not sufficient for strong security. For example, even if all features are
distinguishing for all users, if all users' feature descriptors are identical
(and the attacker knows this), then an attacker who captures a user's
device can trivially determine the key. Therefore, it is equally
important that users' feature descriptors vary widely--or more
precisely, are drawn from a distribution with high entropy. An
entropy evaluation of users' utterances from phone recordings of users
saying the same passphrase is described in [16,17], and
these studies suggest that the entropy available in user utterances is
substantial even when users say the same passphrase. As already
noted, however, since that study involves only recordings of users
taken over phone lines, and since that study is limited to
features, it is insufficient in several ways. Unfortunately, the data
sets with which we are presently working (see Sections 5.1
and 5.2) include too few users to enable meaningful
measurements of the entropy of users' feature descriptors, and so here
we report results for distinguishing features only.
In order to calculate the average number of distinguishing features
per user, it is of course necessary to define when a feature is
distinguishing. Let $\mu_i$ and $\sigma_i$ denote the mean and
standard deviation of feature $\phi_i$ over the recent history of
successful logins.$^{9}$ Then we say that the $i$-th feature is
distinguishing if $|\mu_i - t| > k\sigma_i$ for some parameter $k$.
Note that if feature $i$ is distinguishing, then either
$\mu_i < t - k\sigma_i$, and so usually $\phi_i < t$ for the user
(see (1)), or $\mu_i > t + k\sigma_i$, and so usually $\phi_i > t$
for the user. Intuitively, the parameter $k$ tunes the ``sensitivity''
of the scheme, in that a small $k$ implies more distinguishing
features, and a large $k$ implies fewer. Obviously $k$ must be tuned to
balance achieving a high number of distinguishing features with
enabling the user to regenerate his key reliably, since a higher
number of distinguishing features is advantageous for security but
also requires increasingly similar utterances to regenerate the key.
The parameter $k$ will play a central role in our evaluation.
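As an illustration of the definition above, the following is a minimal sketch of how distinguishing features might be counted from a history of successful logins. It assumes the rule takes the form $|\mu_i - t| > k\sigma_i$ with a single threshold `t` shared by all features, and it uses the population standard deviation; the function name, the data layout, and the sample values are hypothetical and are not taken from our implementation.

```python
from statistics import mean, pstdev


def distinguishing_features(history, t, k):
    """Return the indices i for which |mu_i - t| > k * sigma_i.

    history: list of feature vectors, one per successful login
             (hypothetical layout: history[login][feature]).
    t:       threshold, assumed here to be shared by all features.
    k:       sensitivity parameter.
    """
    num_features = len(history[0])
    selected = []
    for i in range(num_features):
        values = [login[i] for login in history]
        mu = mean(values)
        # Population standard deviation; whether the sample or the
        # population estimate is appropriate is an assumption here.
        sigma = pstdev(values)
        if abs(mu - t) > k * sigma:
            selected.append(i)
    return selected


# Made-up feature values for three logins and three features.
history = [
    [0.9, 0.1, -1.2],
    [1.1, -0.2, -0.8],
    [1.0, 0.3, -1.0],
]

print(distinguishing_features(history, t=0.0, k=0.2))   # [0, 1, 2]
print(distinguishing_features(history, t=0.0, k=2.0))   # [0, 2]
print(distinguishing_features(history, t=0.0, k=10.0))  # [0]
```

On this toy data the number of distinguishing features shrinks from three to one as `k` grows, which is the trade-off between security and reliable key regeneration described above.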
The features that we use in the balance of this paper are
described in [16, Section 3.2]. Each is defined by
comparing the position of a vector characterizing a segment of the
utterance to a fixed plane. This plane is a parameter of our scheme,
and though we will rarely mention it below, it is important for the
reader to be aware that the results we present are based on a plane
selected, using our data, to optimize our measures in certain ways.
On the one hand, this means that our results show what could be
achieved with a good selection of this plane, and are thus optimistic
in this regard. On the other hand, since this plane is selected by
searching through a small set of candidate planes, (infinitely) many
planes are omitted from this search. Consequently, it is likely that
planes yielding better measures exist. The experimentation we have
conducted thus far does not permit us to conclude how to select this
plane in general, and this continues to be an area of our ongoing
work.
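To make the role of the plane concrete, the sketch below shows one way a search over a small candidate set could look. It is only an illustration under stated assumptions: each feature is taken to be the signed offset of a segment vector from the plane `normal . x = offset`, the measure being optimized is assumed to be the number of distinguishing features (the actual measures are not specified here), and all names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def signed_offset(vector, normal, offset):
    """Position of `vector` relative to the plane normal . x = offset."""
    return sum(n * x for n, x in zip(normal, vector)) - offset


def count_distinguishing(history, normal, offset, k):
    """Count segments whose mean offset lies more than k standard
    deviations away from the plane (i.e., from zero).

    history[login][segment] is a vector (hypothetical layout).
    """
    count = 0
    for seg in range(len(history[0])):
        values = [signed_offset(login[seg], normal, offset) for login in history]
        if abs(mean(values)) > k * pstdev(values):
            count += 1
    return count


def best_plane(history, candidates, k):
    """Pick the candidate (normal, offset) with the most distinguishing
    features; this objective is an assumption made for illustration."""
    return max(candidates,
               key=lambda p: count_distinguishing(history, p[0], p[1], k))


# Made-up data: three logins, two segments, two-dimensional vectors.
history = [
    [(1.0, 0.0), (0.0, 1.0)],
    [(1.2, 0.1), (0.1, 0.9)],
    [(0.8, -0.1), (-0.1, 1.1)],
]
candidates = [((1.0, 0.0), 0.0), ((1.0, -1.0), 0.0)]

print(best_plane(history, candidates, k=2.0))
```

Because only the listed candidates are examined, a plane outside the list may score better, which is the limitation noted above.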