aGrUM 0.13.0
Bayesian Networks are probabilistic graphical models in which nodes are random variables and the joint probability distribution is defined by the product:
\[P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i | \pi(X_i))\]
where \(\pi(X_i)\) is the set of parents of \(X_i\).
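As a rough illustration of this factorization, the sketch below computes a joint probability for a hypothetical two-node network A -> B using only the standard library. The `TinyNetwork` type and both functions are assumptions made for this example and are not part of aGrUM.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical two-node network A -> B with binary variables (not an aGrUM type).
// pA[a] stores P(A = a); pBgivenA[a][b] stores P(B = b | A = a).
struct TinyNetwork {
  std::array<double, 2> pA;
  std::array<std::array<double, 2>, 2> pBgivenA;
};

// Joint probability as the product of each node's probability given its parents:
// P(A = a, B = b) = P(A = a) * P(B = b | A = a).
double jointProbability(const TinyNetwork& net, std::size_t a, std::size_t b) {
  assert(a < 2 && b < 2);
  return net.pA[a] * net.pBgivenA[a][b];
}

// Sums the joint over every configuration; with normalized tables the result is 1
// up to floating-point rounding.
double totalMass(const TinyNetwork& net) {
  double total = 0.0;
  for (std::size_t a = 0; a < 2; ++a)
    for (std::size_t b = 0; b < 2; ++b)
      total += jointProbability(net, a, b);
  return total;
}
```

Calling `jointProbability(net, a, b)` multiplies one entry from each table, which is exactly one term of the product above. `totalMass` is a quick sanity check that the tables are normalized.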
The Bayesian Network module in aGrUM can help you do the following operations:
The Bayesian Networks module lists all the classes for using Bayesian Networks with aGrUM.
We will use the classic Asia network to illustrate how the gum::BayesNetFactory class works.
The following code illustrates how to create the Asia network using the gum::BayesNet class. To create an instance of a Bayesian Network, you simply need to call the gum::BayesNet class constructor.
Use the gum::BayesNet::add( const gum::DiscreteVariable& ) method to add variables to the Bayesian Network. The following variables are available in aGrUM:
You can also use the gum::BayesNet::idFromName( const std::string& ) method to retrieve a variable's id from its name.
Use the gum::BayesNet::addArc( gum::NodeId, gum::NodeId ) method to add arcs between nodes in the Bayesian Network.
Finally, use the gum::BayesNet::cpt( gum::NodeId ) method to access a variable's conditional probability table. See How to use the MultiDim hierarchy to learn how to fill a gum::Potential. Here we use the gum::Potential::fillWith( const std::vector& ) method.
Filling conditional probability tables can be hard, and you should use the commenting trick as above to help you with large tables. It is important to remember that the std::vector is used to fill a multidimensional table where each line should sum to 1, i.e. each line stores \(P(X_i | \pi(X_i))\).
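One way to catch mistakes before filling a table is to check that every line of the flat vector sums to 1. The helper below is a minimal standard-library sketch, not an aGrUM function. It assumes that the child variable varies fastest, so that each consecutive run of `domainSize` values is one line of the table. The name `rowsSumToOne` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true when every consecutive run of `domainSize` values in `flat`
// sums to 1 within `tolerance`. Assumption: the child variable varies fastest,
// so each run holds one distribution P(X_i | parents fixed to some values).
bool rowsSumToOne(const std::vector<double>& flat, std::size_t domainSize,
                  double tolerance = 1e-9) {
  assert(tolerance >= 0.0);
  // An empty vector, or one whose size is not a multiple of the domain size,
  // cannot describe a complete table.
  if (domainSize == 0 || flat.empty() || flat.size() % domainSize != 0) return false;
  for (std::size_t start = 0; start < flat.size(); start += domainSize) {
    double sum = 0.0;
    for (std::size_t k = 0; k < domainSize; ++k) sum += flat[start + k];
    if (std::fabs(sum - 1.0) > tolerance) return false;
  }
  return true;
}
```

A call such as `rowsSumToOne(values, 2)` on the vector intended for a binary variable gives an early warning when a line was mistyped or a value is missing.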
The gum::BayesNetFactory class is useful when writing serializers and deserializers for the gum::BayesNet class. You can also use it to create a gum::BayesNet directly in C++; you may however find it simpler to use the gum::BayesNet class directly.
The gum::BayesNetFactory expects a pointer to a gum::BayesNet. The factory will not release this pointer, so you should be careful to release it yourself.
Most methods follow a start / end pattern. Until the end method is called, there is no guarantee that the element is added or partially added to the gum::BayesNet.
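The start / end pattern and the non-owning pointer can be pictured with the following standard-library sketch. `StagedNameBuilder` is a hypothetical class written for this illustration. It is not the gum::BayesNetFactory API and says nothing about how aGrUM implements the pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical builder illustrating a start/end protocol (NOT an aGrUM class).
// A name is staged between start and end; only the end call commits it.
class StagedNameBuilder {
 public:
  explicit StagedNameBuilder(std::vector<std::string>* target) : target_(target) {
    assert(target != nullptr);
  }

  void startDeclaration() {
    if (open_) throw std::logic_error("a declaration is already open");
    open_ = true;
    pending_.clear();
  }

  void setName(const std::string& name) {
    if (!open_) throw std::logic_error("no declaration is open");
    pending_ = name;
  }

  // Commits the staged name and returns its index in the target list.
  std::size_t endDeclaration() {
    if (!open_) throw std::logic_error("no declaration is open");
    if (pending_.empty()) throw std::logic_error("a name is required");
    target_->push_back(pending_);
    open_ = false;
    return target_->size() - 1;
  }

 private:
  std::vector<std::string>* target_;  // not owned: the caller keeps ownership
  std::string pending_;
  bool open_ = false;
};
```

In this sketch the target list is untouched until `endDeclaration()` succeeds, and the builder never deletes the pointer it was given, so the caller remains responsible for the target's lifetime.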
To add a node, you must use the gum::BayesNetFactory::startVariableDeclaration() and gum::BayesNetFactory::endVariableDeclaration() methods. You must provide several pieces of information to correctly add a node to the gum::BayesNet, otherwise a gum::OperationNotAllowed exception will be raised.
When declaring a variable you must:
Here is a list of legal method calls while declaring a variable:
Here is a code sample where we declare the "Visit To Asia" variable in the Asia Network example:
The gum::BayesNetFactory::endVariableDeclaration() method returns the variable's gum::NodeId in the gum::BayesNet.
To add an arc you must use the gum::BayesNetFactory::startParentsDeclaration( const std::string& ) and gum::BayesNetFactory::endParentsDeclaration() methods.
Here is a list of legal method calls while declaring parents:
Note that you do not have to add all parents in a single declaration, and that calling the start and end methods without adding any parent will not result in an error.
The gum::BayesNetFactory class offers three ways to define conditional probability tables (CPT): raw, factorized and delegated.
From a user perspective, raw definitions are useful to define small CPTs, such as those of root nodes. However, they do not scale well when the CPT has many dimensions, and you should prefer factorized CPT definitions if you need to define large CPTs. On the other hand, raw definitions are very useful when automatically filling CPTs from some source (file, database, another CPT, ...).
Two methods can be used to define raw CPT:
Defining the conditional probability table for the root node "Visit To Asia" in the Asia Network example can be achieved as follows:
Defining the conditional probability table for a node with parents:
Factorized definitions are useful when dealing with sparse CPTs. They can also be used when writing the raw CPT is error-prone. The gum::BayesNetFactory::startFactorizedProbabilityDeclaration(const std::string&) method is used to start a definition and gum::BayesNetFactory::endFactorizedProbabilityDeclaration(const std::string&) to end it.
A factorized definition is made of consecutive factorized entries. Each entry sets the modalities of some parents and defines a distribution given those modalities. If some parents are left undefined, then the distribution will be assigned to each possible outcome of those parents.
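The way an entry with undefined parents spreads over the table can be sketched as follows. This is a hypothetical standard-library illustration, not aGrUM's implementation. The names `Table`, `kAnyValue` and `applyFactorizedEntry`, and the row ordering (first parent varying fastest), are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Marks a parent left undefined in an entry (hypothetical convention).
constexpr int kAnyValue = -1;

// table[row] holds the child's distribution for one parent configuration.
// Assumption: rows are enumerated with the first parent varying fastest.
using Table = std::vector<std::vector<double>>;

// Copies `distribution` into every row whose parent configuration matches
// `fixedParents`; a parent set to kAnyValue matches all of its values.
// Returns the number of rows written.
std::size_t applyFactorizedEntry(Table& table,
                                 const std::vector<std::size_t>& parentSizes,
                                 const std::vector<int>& fixedParents,
                                 const std::vector<double>& distribution) {
  assert(parentSizes.size() == fixedParents.size());
  std::size_t written = 0;
  for (std::size_t row = 0; row < table.size(); ++row) {
    std::size_t rest = row;
    bool matches = true;
    // Decode the row index into one value per parent and compare.
    for (std::size_t p = 0; p < parentSizes.size(); ++p) {
      assert(parentSizes[p] > 0);
      const int value = static_cast<int>(rest % parentSizes[p]);
      rest /= parentSizes[p];
      if (fixedParents[p] != kAnyValue && fixedParents[p] != value) matches = false;
    }
    if (matches) {
      table[row] = distribution;
      ++written;
    }
  }
  return written;
}
```

With two binary parents, an entry that fixes only the first parent writes the same distribution into two rows, one for each value of the undefined second parent.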
To start declaring a factorized entry, call the gum::BayesNetFactory::startFactorizedEntry() method, and to end it, call gum::BayesNetFactory::endFactorizedEntry().
In the following example, we define the CPT for the "Dyspnea" variable in the Asia Network:
While adding values in a factorized definition, two methods are available:
The unchecked version will not check whether the vector matches the variable's domain size. The checked version will raise a gum::OperationNotAllowed exception if it does not.
Delegated definitions let users define the gum::DiscreteVariable and gum::MultiDimAdressable added to the gum::BayesNet themselves. You should only use this method if you are familiar with the multidim hierarchy and require specific multidimensional arrays, such as gum::MultiDimNoisyORCompound, gum::aggregator::Count, etc.
All inference algorithms implement the gum::BayesNetInference class. The main methods for inference are:
More advanced methods are available for special use cases:
Here is a list of exact inference algorithms:
And this is the list of approximate inference algorithms:
Finally, a list of utility algorithms used by some inference algorithms:
Several file formats are currently supported for gum::BayesNet serialization and deserialization. They all implement either gum::BNReader for deserialization or gum::BNWriter for serialization.
The main methods for deserializing an instance of gum::BayesNet are:
The main methods for serializing an instance of gum::BayesNet are:
Be aware that the file will be created if it does not exist. If it does exist, its content will be erased.
The BIF format:
The BIF XML format:
The DSL format:
The CNF format (no reader in this format):
The NET format: