32.3 Target shooting
Here is a general problem; suppose that we a set and a set , and that we would like to estimate what proportion of elements of are in .
Of course, if we know the values of and then the answer is just
but oftentimes we don’t, or to compute the actual sets would be impractical. One idea is to sample elements from and check if they are in , something like this
randomly sample from (uniformly and independently)
calculate where if and otherwise.
clearly gives us a rough idea of what (which above we defined as ) is. But just how good?
Let . Let be the output of our algorithm, and we seek to derive a bound on
that is, the probability that is more than a factor of away from the true value of .
Note that our algorithm is equivalent to
and as each , we know that
and therefore we can use a Chernoff bound (as we have a binomially distributed discrete random variable). It follows that
then we can let which means that and is slightly more pleasant to work with. This means that
and through our choice of we can reduce the probability of failure arbitrarily. If we let then we can make
through choosing (we essentially want to cancel everything out so that we get and then the cancels and we are left with ).
Thus we can make the failure probability arbitrarily small.