In the previous article the two most important building blocks of ProbPy were introduced. Those were the
Factor classes. The way information gets stored in a factor is extremely important because ProbPy will have to perform operations between the factors.
This part discusses what are the operations made between factors, how those operations are implemented and finally, some remarks about complexity, efficiency and optimization of the algorithms are made.
Operations Between Factors
Factor multiplication is the cornerstone of ProbPy and its one of the main reason for the development of the library. To quickly explain what factor multiplication is, suppose you have the following distributions:
P(X | Y), P(Y).
As it was seen, these two distributions are represented by factors. Just to see some numbers lets define the following:
X = RandVar("X", ["a", "b", "c"]) Y = RandVar("Y", ["T", "F"]) fy = Factor(Y, [0.9, 0.1]) fx_y = Factor([X, Y], [[0.2, 0.3, 0.5], [0.6, 0.2, 0.2]])
P(X | Y) respectively, or in factor notation,
fb(X, Y). Remember that the notion of conditioning variable is not important for factor notation, so there is no “|” symbol used in the factor.
Studying probabilities, it is known that multiplying both distributions yields a new distributions, in this case that new distribution will be the join distributions of
Y. The following operation shows this multiplication. Notice that the multiplication is also called the chain rule in some contexts.
P(X, Y) = P(X | Y) P(Y).
fr(X, Y) = fb(X, Y) fa(Y).
Notice that the resulting factor
fr has the same variables as the operand factors. That is something common on all of these operations. The set of variables of the resulting factor is always the union of the set of variables of the operant factors. Mathematically:
After having that notion, the resulting factor can be constructed using the following rule:
The following table shows the operation applied to the example given before.
X Y | fa(Y) fb(X, Y) | fr(X, Y) ----------------------------- a T | 0.9 0.2 | 0.18 b T | 0.9 0.3 | 0.27 c T | 0.9 0.5 | 0.45 a F | 0.1 0.6 | 0.06 b F | 0.1 0.2 | 0.02 c F | 0.1 0.2 | 0.02
If you have read books like “Artificial Intelligence: A Modern Approach” by Stuart Russell and Peter Norvig, or “Probabilistic Graphical Models: Principles and Techniques” by Daphne Koller and Nir Friedman, you have no doubt seen this very operation being defined and used. The operation is factor multiplication meaning that each element in one factor get multiplied with elements in the other factor.
But what about division, addition or subtraction of factors? The operation is exactly the same except for the operator between the two operand factors. The example shown used multiplication, but the same is valid for any other operator, even the
% operator like it is used in most programming languages. Extending on the idea, one could see how binary operators are not the limit. It is possible to define a factor operation with any function relating both operands. Of course the usability of such operation would depend on the context of such use.
To make things simles. All these operations are called Factor Operation. A factor operation is the relation done between two factors which results in a new one. The operation used to relate the factors can be multiplication, addition, or any other operation defined by the user.
How Factors are Related For Operations
Relating two factors always results in a new factor. It was seen in the first part that the new factor will always have the union between the set of variables of the two operand factors. To illustrate what is going on, suppose the following two operand factors:
fa(X, Y)=[ 2 4 1 3 ] X | 0 | 1 | 0 | 1 | Y | 0 | 1 | fb(Z, Y)=[ 8 6 9 7 ] Z | 0 | 1 | 0 | 1 | Y | 0 | 1 |
The values on the factors are just placeholder values to see the result of the operations.
The first thing to do is to determine the resulting factor’s variables. Using the definition from before, the set of variables of the resulting factor will be
X, Y, Z. Remember that the order of variables in the factor representation is very important and as such, the order of the new factor will also be important. We need a systematic way of performing the union of variables. The way that was chosen is simple. Take the variables of the first factor, and succeed them with the variables of the second factor which are not in the first factor. The pseudo-algorithm for this operation is:
Having the variables in order the resulting factor will be represented as:
fr(X, Y)=[ ] X | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | Y | 0 | 1 | 0 | 1 | Z | 0 | 1 | i | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
The values are still not yet inserted. That is the second thing to do. The last line is just an index
i. The problem now is, given this resulting structure, how will the values be calculated?
The approach used in ProbPy starts by considering the index
i, and for every
i the algorithm will find the corresponding value in the two operand factors and use them in an operation. Because the first variables in the resulting factor are the same variables in the first factors, and because their order is the same, finding the corresponding index for the first factor is simple. The index in the first value is the same as the index in the resulting factor. So
index1 could simple be:
index1 = i
This works for
i between 0 and 3, but fails for greater values. For values greater then 3, the values of the indexing variables
Y are just going to repeat. So
i=4 has the same indexes as
i=5 has the same as
i=1, and so on. It simple to see how
index1 will just be:
index1 = i % fa_len
fa_len is the size of the list of values in factor 1.
The variables of the second factor are in no specific order in relation to the resulting factor. So to get
index2 a more complex method must be used. Recall from the first part of the article that it’s simple to take the indexes of each variable and calculate the
i index. Suppose
f(X=0, Y=1, Z=1), the
i index would be:
i = x + 2*y + 2*2*z = 0 + 2*1 + 2*2*1 = 6
So now a reverse method needs to be applied to get the value of each indexing variable of factor 2. The variables in factor 2 are
Y in that order. For every value of
i, thr value of
Z = int(i / (2*2)) % 2
So we take the value of
i and divide it by 4. Why? Because
Z encapsulates two variables in the resulting factor and each variable has a size of 2. This division scales down the size of the
Z variable. After that, the integer division work similarly to the previous one. The same is done for the
Y = int(i / 2) % 2
Notice that now
i only gets divided by 2 because that variable only encapsulates one other variable. If this was done for the
X variable no division would be needed.
After having both the
Y variable values, calculating the corresponding index of factor 2 is simply done like:
index2 = Z + 2*Y
Which is the method shown before.
This operation is implemented in the
factor.py file. The method is big, but the important part is the following:
for i in range(res_values_size): # Get index 1 index1 = i % len_values # Get index 2 index2 = 0 for j in index2_range: index2 += (int(i / div[j]) % dim[j]) * mult[j] # Calculate value res_values.append(fun(self.values[index1], factor.values[index2])) # Make Factor object and return return Factor(res_rand_vars, res_values)
Line 8 is where the index
i transformed in index two. The
div list contains the values for which
i is going to be divided. In the example in this article the list would be
[2, 4], for
Z respectively. The
dim list contains the sizes of the variables. The
mult list contains the values for which each variable is going to be multiplied. The list will be
With this, we have a method that relates two factors. Whether if the method is a multiplication, addition or anything else depends on the
fun function which is passed by parameter.