Target shooting
Here is a general problem; suppose that we a set and a set
, and that we would like to estimate what proportion of
elements of are in .
Of course, if we know the values of and then the answer is
just
(32.2)
but oftentimes we don’t, or to compute the actual sets would
be impractical. One idea is to sample elements from and check if they
are in , something like this
-
•
randomly sample from (uniformly and
independently)
-
•
calculate
where if and otherwise.
clearly gives us a rough idea of what (which above we
defined as ) is. But just how good?
Let . Let be the output of our algorithm, and we
seek to derive a bound on
(32.3)
that is, the probability that is more than a factor of
away from the true value of .
Note that our algorithm is equivalent to
(32.4)
and as each , we know that
(32.5)
and therefore we can use a Chernoff bound (as we have a
binomially distributed discrete random variable). It follows that
(32.6)
|
|
|
|
(32.7) |
then we can let which means that and
is slightly more pleasant to work with. This means that
|
|
|
|
(32.9) |
|
|
|
|
(32.10) |
|
|
|
|
(32.11) |
|
|
|
|
(32.12) |
|
|
|
|
(32.13) |
and through our choice of we can reduce the probability of failure
arbitrarily. If we let then we can make
(32.14)
through choosing
(we
essentially want to cancel everything out so that we get
and then the cancels and we are left
with ).
Thus we can make the failure probability arbitrarily small.