Check out the new USENIX Web site. next up previous
Next: Performance Up: Encoding Scheme Previous: Initialization


Key reconstruction

As described in Section 2, the input from the user, in this case an utterance, is used to generate an $m$-bit feature descriptor $b$. The key regeneration program constructs a vector $\underline{t}_b \in \mathbb{Z}_p^m$ by selecting the corresponding elements of the table, i.e., $(\underline{t}_b)^T = (t_{2i+b(i)})_{0 \le i < m}$. It similarly constructs a $m \times (m-1)$ matrix $\underline{U}_b =
(u_{2i+b(i),j})_{0 \le i < m, 0 \le j < m-1}$. Note that solving
\begin{displaymath}
\underline{U}_b \underline{s}' \equiv_p \underline{t}_b
\end{displaymath} (4)

for $\underline{s}'$ would yield $\underline{s}= \underline{s}'$ if $b$ contained no user errors, and the fact that $\underline{s}' = \underline{s}$ could be confirmed by testing whether $y = h'(\underline{s}')$. However, since instantiating and solving (4) for all the different feature descriptors $b'$ within Hamming distance $c_{\rm max}$ from $b$ would require too much time on the target device, our error correction strategy takes a different approach. This faster approach derives from the observation that the equation $\underline{U}_{b'} \underline{s}' \equiv_p \underline{t}_{b'}$ for a feature descriptor $b'$ contains $m$ equations in $m-1$ unknowns, and thus is over-defined. This is intentional, and allows $b'$ to be rejected very quickly if this equation has no solution. Specifically, let $\tilde{\underline{U}}_{b'}$ be the $m \times m$ matrix whose first $m-1$ columns are the same as $\underline{U}_{b'}$ and whose last column is $\underline{t}_{b'}$. Then, a solution exists only if $\det(\tilde{\underline{U}}_{b'}) \equiv_p 0$, and so if $\det(\tilde{\underline{U}}_{b'}) \not\equiv_p 0$ then $b'$ can be discarded from further consideration. Using a recursive algorithm, it is possible to check $\det(\tilde{\underline{U}}_{b'}) \equiv_p 0$ for feature descriptors $b'$ within $c_{\rm max}$ errors of $b$ using only one additional Gaussian elimination step per new $b'$. Due to this feature, our implementation can test over $6 \times 10^6$ feature descriptors $b'$ per second when $m=60$.

Two further observations are worth noting here. First, the determinant of even a randomly chosen matrix is $0$ mod $p$ with probability $p^{-1}$, which is not negligible since $p$ is chosen to be small. Therefore, if $\det(\tilde{\underline{U}}_{b'}) \equiv_p 0$ and so we proceed to solve $\underline{U}_{b'} \underline{s}' \equiv_p \underline{t}_{b'}$ for $\underline{s}'$, it remains necessary to confirm $\underline{s}'$ by checking $y = h'(\underline{s}')$. Second, there is a small chance that (4) is not solvable even if $b$ is consistent with all the user's distinguishing features, because $\underline{U}_b$ might not have full rank. Under the assumption that $\underline{U}_b$ is chosen uniformly at random, it follows that $\underline{U}_b$ has rank $m-1$ with probability $\prod_{j=2}^{m}(1-p^{-j}) = 1-p^{-2}+O(p^{-3})$. This probability is much smaller than the probability that the system cannot be solved because the feature descriptor $b$ induced by the user's utterance contains more than $c_{\rm max}$ errors, and thus can be ignored.


next up previous
Next: Performance Up: Encoding Scheme Previous: Initialization
fabian 2002-08-28