I took another detour into the world of binary neural networks. In the interests of evaluating dot-product-like functions of two binary vectors, I wanted to assess how the distributions of multiplied bits (equivalent to ANDed bits) relate to the distributions of the individual bits. This post captures the results of this side-trip.
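As a concrete (made-up) illustration of why the AND of bits matters here: the dot product of two binary vectors is just the number of positions where both vectors hold a 1, i.e. the popcount of their elementwise AND.

```python
# Two short binary vectors (made-up values, purely for illustration).
a = [1, 0, 1, 1, 0]
b = [0, 1, 1, 1, 1]

# For bits x, y in {0, 1}, multiplication and AND coincide: x * y == x & y,
# so the dot product is the popcount of the elementwise AND.
dot = sum(x & y for x, y in zip(a, b))
print(dot)  # 2
```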
Products of independent Bernoullis
The simplest case is that of independence, when

$$X \sim \mathrm{Bernoulli}(p)$$

and

$$Y \sim \mathrm{Bernoulli}(q).$$

Let $Z$ be the product of the two bits, equivalent to $X \wedge Y$. As $Z \in \{0, 1\}$, we can model $Z \sim \mathrm{Bernoulli}(r)$ for some $r \in [0, 1]$.
The truth table for the four possible outcomes guides us here:
| $X$ | $Y$ | $Z = XY$ |
|-----|-----|----------|
| 0   | 0   | 0        |
| 0   | 1   | 0        |
| 1   | 0   | 0        |
| 1   | 1   | 1        |
$P(Z = 1) = P(X = 1)\,P(Y = 1) = pq$; a bit of arithmetic also shows that $P(Z = 0) = (1-p)(1-q) + (1-p)q + p(1-q) = 1 - pq$. With $r = pq$, we see that $Z \sim \mathrm{Bernoulli}(pq)$.
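The independent case is easy to check numerically. A quick Monte Carlo sketch (illustrative code; the parameter values are arbitrary):

```python
import random

random.seed(0)

# Arbitrary illustrative parameters for X ~ Bernoulli(p), Y ~ Bernoulli(q).
p, q = 0.7, 0.4
n = 200_000

# Each term samples Z = X * Y for independent X and Y.
z_mean = sum(
    (random.random() < p) and (random.random() < q)
    for _ in range(n)
) / n
# z_mean lands close to p * q = 0.28.
```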
Products of non-independent Bernoullis
Suppose that $X \sim \mathrm{Bernoulli}(p)$. A second Bernoulli random variable $Y$ will be dependent on $X$ only if its single parameter is a function of $X$, so let $Y \sim \mathrm{Bernoulli}(f(X))$ where $f : \{0, 1\} \to [0, 1]$ generates the parameter of $Y$'s distribution.

As there are only two possible outcomes from such a function, we denote them individually as $q_0 = f(0)$ and $q_1 = f(1)$.
We are again interested in the distribution of $Z = XY$ for a given $X$, and again the truth table leads us to a straightforward solution:
| $X$ | $Y$ | $Z = XY$ |
|-----|-----|----------|
| 0   | 0   | 0        |
| 0   | 1   | 0        |
| 1   | 0   | 0        |
| 1   | 1   | 1        |
A bit of arithmetic shows that

$$P(Z = 1) = P(X = 1)\,P(Y = 1 \mid X = 1) = p q_1.$$

With $r = p q_1$, we conclude that $Z \sim \mathrm{Bernoulli}(p q_1)$.
The parameter $q_0$ for the case where $X = 0$ drops out; it is redundant, as $X = 0$ is simply another way for $Z$ to equal zero.

In other words, the distribution of $Z$ depends only on the distribution of $X$ and the distribution of $Y$ conditional on $X = 1$, not on $q_0$. The constraint that all events must result in either one or the other of two outcomes allows the behavior of the combined system to be fully characterized with a constant number of parameters, and the distribution of the product / Boolean AND even of dependent Bernoullis is simply yet another Bernoulli.
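This, too, can be checked numerically: the product's mean tracks $p$ times the probability that $Y = 1$ when $X = 1$, and ignores $Y$'s behavior when $X = 0$ (those two conditional parameters are named `q1` and `q0` in the sketch below; all values are arbitrary):

```python
import random

random.seed(1)

# Arbitrary illustrative parameters: X ~ Bernoulli(p), and Y's
# parameter is q1 when X = 1 and q0 when X = 0.
p, q0, q1 = 0.6, 0.9, 0.3
n = 200_000

def sample_product():
    x = random.random() < p
    y = random.random() < (q1 if x else q0)
    return x and y

z_mean = sum(sample_product() for _ in range(n)) / n
# z_mean lands close to p * q1 = 0.18, no matter what q0 is.
```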