In this section we describe how we approximately compute $\Pr[p \leftarrow \Pi_{\text{scheme}}]$ for any $p \in \mathcal{P}_{\text{scheme}}$, i.e., the probability that the scheme yields the password $p$. This probability is taken with respect to both random choices by the password selection algorithm and user choices. We compute this probability inductively as follows. Suppose $p = (e_1, \ldots, e_k)$. Then

$$\Pr[(e_1,\ldots,e_k) \leftarrow \Pi_{\text{scheme}}] \;\approx\; \frac{\#(e_1,\ldots,e_k)}{\#(e_1,\ldots,e_{k-1})} \cdot \Pr[(e_1,\ldots,e_{k-1}) \leftarrow \Pi_{\text{scheme}}] \qquad (2)$$

where $\#(e_1,\ldots,e_i)$ denotes the number of samples in the dataset whose first $i$ selections are $e_1,\ldots,e_i$.
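To make the computation concrete, the following sketch (a hypothetical Python fragment of ours; the toy dataset and the names `prefix_count` and `password_probability` are illustrative, not from the original analysis) chains the prefix-count ratios of (2) over a small sample set:

```python
# Toy dataset: each sample is a k-tuple of category labels, standing in
# for the image categories selected in a graphical password.
samples = [
    ("A", "B", "B", "C"),
    ("A", "B", "C", "C"),
    ("A", "B", "B", "C"),
    ("B", "A", "A", "C"),
]

def prefix_count(samples, prefix):
    """Number of samples whose first len(prefix) selections equal prefix."""
    n = len(prefix)
    return sum(1 for s in samples if s[:n] == prefix)

def password_probability(samples, password):
    """Estimate Pr[(e_1,...,e_k)] by chaining the ratio in (2) over prefixes."""
    prob = 1.0
    for i in range(1, len(password) + 1):
        numerator = prefix_count(samples, password[:i])
        denominator = prefix_count(samples, password[:i - 1])  # len(samples) when i == 1
        if denominator == 0:
            return 0.0  # a gap in the data: the problem smoothing later addresses
        prob *= numerator / denominator
    return prob

print(password_probability(samples, ("A", "B", "B", "C")))  # 0.5
```

Note that when every conditional is estimated from the same dataset, the product telescopes to the empirical frequency of the full password (here $2/4$), which is why unobserved prefixes are fatal to the direct estimate.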
A necessary condition for the denominator of (2) to be nonzero for every possible $(e_1,\ldots,e_{k-1})$ is that the dataset contain at least $N_{\text{scheme}}^{k-1}$ samples for scheme, where $N_{\text{scheme}}$ denotes the number of image categories for scheme. ($N_{\text{face}} = 12$ in Face, and $N_{\text{story}} = 9$ in Story.) $N_{\text{face}}^{k-1}$ is over 1700 in the Face scheme, for example. And, of course, to use (2) directly to perform a meaningful approximation, significantly more samples would be required. Thus, we introduce a simplifying Markov assumption: a user's next decision is influenced only by her immediately prior decision(s) (e.g., see [16]). In other words, rather than condition on all of the previous choices made in a password ($e_1,\ldots,e_{k-1}$), only the last few choices are taken into account. Let $(\cdot,\ldots,\cdot,e_{i+1},\ldots,e_k) \leftarrow \Pi_{\text{scheme}}$ denote the selection of a $k$-tuple for which the most recent $k-i$ selections are $e_{i+1},\ldots,e_k$.
Assumption 4.1

$$\Pr[(e_1,\ldots,e_k) \leftarrow \Pi_{\text{scheme}} \mid (e_1,\ldots,e_{k-1}) \leftarrow \Pi_{\text{scheme}}] \;=\; \Pr[(\cdot,\ldots,\cdot,e_{k-1},e_k) \leftarrow \Pi_{\text{scheme}} \mid (\cdot,\ldots,\cdot,e_{k-1}) \leftarrow \Pi_{\text{scheme}}] \qquad (3)$$
This assumption does not hold exactly for every user; one user, for instance, described her selection strategy in the Face scheme as follows:

``To start, I chose a face that stood out from the group, and then I picked the closest face that seemed to match.''

While this user's intention may have been to choose a selection similar to the first image she selected, we conjecture that the most recent image she selected, being most freshly on her mind, influenced her next choice at least as much as the first one did. Assumption 4.1 also seems reasonable for the Story scheme on the whole, since users who selected passwords by choosing a story were presumably trying to continue a story based on what they previously selected.
Assumption 4.1 permits us to replace (2) by

$$\Pr[(e_1,\ldots,e_k) \leftarrow \Pi_{\text{scheme}}] \;\approx\; \frac{\#(e_{k-1},e_k)}{\#(e_{k-1})} \cdot \Pr[(e_1,\ldots,e_{k-1}) \leftarrow \Pi_{\text{scheme}}] \qquad (4)$$

where $\#(e_{k-1},e_k)$ denotes the number of times the pair $(e_{k-1},e_k)$ occurs as consecutive selections in the dataset, and $\#(e_{k-1})$ denotes the number of times $e_{k-1}$ occurs with some selection following it.
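To illustrate what the assumption buys, the following sketch (again over a hypothetical toy dataset of ours) estimates the conditional probability in (4) from bigram counts alone, so that on the order of $N_{\text{scheme}}^2$ statistics suffice rather than $N_{\text{scheme}}^{k-1}$:

```python
from collections import Counter

samples = [
    ("A", "B", "B", "C"),
    ("A", "B", "C", "C"),
    ("A", "B", "B", "C"),
    ("B", "A", "A", "C"),
]

bigrams = Counter()   # #(e_{k-1}, e_k): counts of consecutive selection pairs
contexts = Counter()  # #(e_{k-1}): counts of a category with a selection after it
for s in samples:
    for prev, cur in zip(s, s[1:]):
        bigrams[(prev, cur)] += 1
        contexts[prev] += 1

def markov_conditional(prev, cur):
    """Estimate Pr[e_k = cur | e_{k-1} = prev] as #(prev, cur) / #(prev), as in (4)."""
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, cur)] / contexts[prev]

print(markov_conditional("B", "C"))  # 3/6 = 0.5 on this toy data
```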
We further augment the above approach with smoothing in order to compensate for gaps in the data (cf. [16]). Specifically, we replace the ratio in (4) with a smoothed estimate that assigns a small, nonzero probability even to transitions that never occur in the dataset.
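As one standard instantiation of such smoothing (the estimator actually used may differ; add-$\alpha$, i.e., Laplace, smoothing is shown here purely for illustration), the sketch below smooths the bigram ratio of (4):

```python
from collections import Counter

samples = [
    ("A", "B", "B", "C"),
    ("A", "B", "C", "C"),
    ("A", "B", "B", "C"),
    ("B", "A", "A", "C"),
]
categories = {"A", "B", "C"}  # stands in for the N_scheme image categories

bigrams = Counter()
contexts = Counter()
for s in samples:
    for prev, cur in zip(s, s[1:]):
        bigrams[(prev, cur)] += 1
        contexts[prev] += 1

def smoothed_conditional(prev, cur, alpha=1.0):
    """(#(prev, cur) + alpha) / (#(prev) + alpha * N): add-alpha smoothing of (4)."""
    n = len(categories)
    return (bigrams[(prev, cur)] + alpha) / (contexts[prev] + alpha * n)

print(smoothed_conditional("C", "A"))  # unseen transition gets 1/4, not 0
```

As $\alpha \to 0$ this recovers the raw ratio in (4), while larger $\alpha$ pulls the estimate toward the uniform distribution over the image categories.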