Version 1.0.0 of aGrUM is finally here and brings its share of new features. The most important one is the parallelization of the inference methods, which is the subject of this article. In particular, we will present a benchmark comparing the parallelized version with the old one.
But first, let's recall the exact inference algorithms available in aGrUM. There are three of them: VariableElimination, ShaferShenoy and LazyPropagation. For our comparisons, we will restrict ourselves to the last one [Madsen, A. L., & Jensen, F. V. (1999)].
To use it, let us first load the ASIA network which, like all the structures used in the rest of this article, is available here.
import pyAgrum as gum
asia_bn = gum.loadBN("asia.bif")
Now we can run an inference to determine the probability that a patient has lung cancer given that they have dyspnoea. To do this, we need to create an inference engine and call the makeInference() method:
ie = gum.LazyPropagation(asia_bn)
ie.makeInference()
To obtain the desired probability, we now need to specify the evidence and call the posterior() method:
ie.addEvidence('dyspnoea', 'yes')
p = ie.posterior('lung_cancer')
In this case, printing the variable p gives us:
  lung_cancer      |
yes      |no       |
---------|---------|
 0.1028  | 0.8972  |
For more examples, you can consult the notebooks about inference.
I wanna make a supersonic lemon out of you
The makeInference() method carries out the main calculation needed to obtain the probability of interest. We therefore chose its execution time as the metric for comparing the single-threaded and multi-threaded versions. The number of threads used for the inference can be checked with getNumberOfThreads() and changed with setNumberOfThreads().
For our experiments, we successively used n=1 and n=10 threads to run inferences on the different structures found in the previously linked repository. For each structure, we ran 20 iterations, and the following table reports the average time for each setting. These experiments were run on an Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz (48 cores) with 32GB of memory.
| Structure | Nb. of nodes | Nb. of arcs | Mono-thread (ms) | Multi-thread (ms) |
As we can see, the mono-threaded version does as well as the multi-threaded version on small structures, but we really save time on large structures!
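For readers who want to reproduce this kind of measurement, here is a minimal sketch of the timing procedure described above: set the number of threads, time makeInference(), and average over several iterations. It is not the exact script we used. The helper name average_inference_time_ms and its make_engine parameter are hypothetical, and the sketch only assumes that the engine exposes the setNumberOfThreads() and makeInference() methods mentioned in this article. Building a fresh engine at each iteration is also an assumption, made so that no previously computed result is reused.

```python
import time
from statistics import mean


def average_inference_time_ms(make_engine, n_threads, iterations=20):
    """Average wall-clock time of makeInference(), in milliseconds.

    make_engine is a hypothetical zero-argument callable returning a fresh
    inference engine that exposes setNumberOfThreads() and makeInference().
    """
    timings = []
    for _ in range(iterations):
        # Assumption: a new engine per iteration, so nothing cached is reused.
        engine = make_engine()
        engine.setNumberOfThreads(n_threads)

        # Only the call to makeInference() is timed.
        start = time.perf_counter()
        engine.makeInference()
        elapsed = time.perf_counter() - start

        timings.append(elapsed * 1000.0)  # seconds -> milliseconds
    return mean(timings)
```

With pyAgrum, a call could look like average_inference_time_ms(lambda: gum.LazyPropagation(asia_bn), 10) for the multi-threaded case, and the same call with 1 instead of 10 for the mono-threaded one.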