aGrUM
0.20.3
a C++ library for (probabilistic) graphical models
The class used to pack sets of generators.
#include <agrum/tools/database/DBRowGeneratorSet.h>
Public Member Functions | |
Constructors / Destructors | |
DBRowGeneratorSet (const allocator_type &alloc=allocator_type()) | |
default constructor | |
DBRowGeneratorSet (const DBRowGeneratorSet< ALLOC > &from) | |
copy constructor | |
DBRowGeneratorSet (const DBRowGeneratorSet< ALLOC > &from, const allocator_type &alloc) | |
copy constructor with a given allocator | |
DBRowGeneratorSet (DBRowGeneratorSet< ALLOC > &&from) | |
move constructor | |
DBRowGeneratorSet (DBRowGeneratorSet< ALLOC > &&from, const allocator_type &alloc) | |
move constructor with a given allocator | |
virtual DBRowGeneratorSet< ALLOC > * | clone () const |
virtual copy constructor | |
virtual DBRowGeneratorSet< ALLOC > * | clone (const allocator_type &alloc) const |
virtual copy constructor with a given allocator | |
virtual | ~DBRowGeneratorSet () |
destructor | |
Operators | |
DBRowGeneratorSet< ALLOC > & | operator= (const DBRowGeneratorSet< ALLOC > &from) |
copy operator | |
DBRowGeneratorSet< ALLOC > & | operator= (DBRowGeneratorSet< ALLOC > &&from) |
move operator | |
DBRowGenerator< ALLOC > & | operator[] (const std::size_t i) |
returns the ith generator | |
const DBRowGenerator< ALLOC > & | operator[] (const std::size_t i) const |
returns the ith generator | |
Accessors / Modifiers | |
template<template< template< typename > class > class Generator> | |
void | insertGenerator (const Generator< ALLOC > &generator) |
inserts a new generator at the end of the set | |
template<template< template< typename > class > class Generator> | |
void | insertGenerator (const Generator< ALLOC > &generator, const std::size_t i) |
inserts a new generator at the ith position of the set | |
std::size_t | nbGenerators () const noexcept |
returns the number of generators | |
std::size_t | size () const noexcept |
returns the number of generators (alias for nbGenerators) | |
bool | hasRows () |
returns true if there are still rows that can be output by the set of generators | |
bool | setInputRow (const DBRow< DBTranslatedValue, ALLOC > &input_row) |
sets the input row from which the generators will create new rows | |
const DBRow< DBTranslatedValue, ALLOC > & | generate () |
generates a new output row from the input row | |
template<typename GUM_SCALAR > | |
void | setBayesNet (const BayesNet< GUM_SCALAR > &new_bn) |
assigns a new Bayes net to all the generators that depend on a BN | |
void | reset () |
resets all the generators | |
void | clear () |
removes all the generators | |
void | setColumnsOfInterest (const std::vector< std::size_t, ALLOC< std::size_t > > &cols_of_interest) |
sets the columns of interest: the output DBRow needs only contain correct values for these columns | |
void | setColumnsOfInterest (std::vector< std::size_t, ALLOC< std::size_t > > &&cols_of_interest) |
sets the columns of interest: the output DBRow needs only contain correct values for these columns | |
const std::vector< std::size_t, ALLOC< std::size_t > > & | columnsOfInterest () const |
returns the current set of columns of interest | |
allocator_type | getAllocator () const |
returns the allocator used | |
Public Types | |
using | allocator_type = ALLOC< DBTranslatedValue > |
type for the allocators passed in arguments of methods | |
The class used to pack sets of generators.
When learning Bayesian networks, the records of the training dataset are used to construct contingency tables that are exploited either in statistical conditional independence tests or in scores. To achieve this, the values of the DatabaseTable's records must all be observed, i.e., there should be no missing value. When this is not the case, we need to decide what to do with the records (actually the DBRows) that contain missing values. Should we discard them? Should we use an EM algorithm to substitute them by several fully-observed DBRows weighted by their probability of occurrence? Should we use a K-means algorithm to substitute them by only one DBRow of highest probability of occurrence?
DBRowGenerator classes are used to perform these substitutions. From one input DBRow, they can produce from 0 to several output DBRows. DBRowGenerator instances can be used in sequence: a first DBRowGenerator can, e.g., apply an EM algorithm to produce many output DBRows, and these DBRows can then feed another DBRowGenerator that keeps only those whose weight is higher than a given threshold.
The purpose of class DBRowGeneratorSet is to contain this sequence of DBRowGenerator instances. The key idea is that it makes parsing the generated output DBRows easier. For instance, to use by hand a sequence of 2 generators that output respectively 3 and 4 copies of each DBRow they receive as input, we would need one nested while loop per generator: the outer loop feeds each of its output DBRows to the second generator, and the inner loop parses the DBRows that this generator produces.
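The nested loops described above can be sketched as follows. This is a minimal, self-contained illustration rather than aGrUM code: `Row`, `DuplicateGenerator` and `countRowsWithNestedLoops` are hypothetical stand-ins that only mimic the setInputRow() / hasRows() / generate() protocol (here setInputRow() returns void, unlike the documented method).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a DBRow: a plain vector of column values.
using Row = std::vector<int>;

// Hypothetical generator that outputs `copies` duplicates of its input row.
// It is NOT an aGrUM class; it only mimics the generator protocol.
class DuplicateGenerator {
public:
  explicit DuplicateGenerator(std::size_t copies) : copies_(copies) {}

  // stores the input row and rearms the number of rows left to output
  void setInputRow(const Row& row) {
    input_     = &row;
    remaining_ = copies_;
  }

  // true while some copies of the input row have not been output yet
  bool hasRows() const { return remaining_ > 0; }

  // outputs one more copy (precondition: hasRows() is true)
  const Row& generate() {
    --remaining_;
    return *input_;
  }

private:
  std::size_t copies_;
  std::size_t remaining_ = 0;
  const Row*  input_     = nullptr;
};

// Chains two generators by hand: one nested while loop per generator.
// Returns the total number of output rows produced for the database.
std::size_t countRowsWithNestedLoops(const std::vector<Row>& database) {
  DuplicateGenerator generator3(3);
  DuplicateGenerator generator4(4);
  std::size_t        nb_output_rows = 0;

  for (const Row& dbrow : database) {
    generator3.setInputRow(dbrow);
    while (generator3.hasRows()) {
      const Row& output3 = generator3.generate();
      generator4.setInputRow(output3);
      while (generator4.hasRows()) {
        const Row& output4 = generator4.generate();
        (void)output4;  // a learning algorithm would process the row here
        ++nb_output_rows;
      }
    }
  }
  return nb_output_rows;
}
```

Each additional generator in the chain would add one more level of nesting, which is the difficulty the DBRowGeneratorSet is meant to remove.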
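For each input DBRow of the DatabaseTable, these while loops output 3 x 4 = 12 identical DBRows. As can be seen, when several DBRowGenerator instances are used in sequence, the code is not easy to write. The DBRowGeneratorSet simplifies it: the generators are added with insertGenerator(), each input DBRow is passed to setInputRow(), and a single while loop on hasRows() retrieves the output DBRows with generate().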
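The single-loop usage can be sketched as follows. Again, this is only an illustration under stated assumptions: `Row`, `DuplicateGenerator` and `GeneratorSet` are hypothetical stand-ins, not aGrUM classes. In particular, this toy set expands all the output rows eagerly inside setInputRow(), which is a simplification chosen for brevity and says nothing about how DBRowGeneratorSet is actually implemented.

```cpp
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DBRow: a plain vector of column values.
using Row = std::vector<int>;

// Hypothetical generator outputting `copies` duplicates of its input row.
class DuplicateGenerator {
public:
  explicit DuplicateGenerator(std::size_t copies) : copies_(copies) {}
  void setInputRow(const Row& row) { input_ = &row; remaining_ = copies_; }
  bool hasRows() const { return remaining_ > 0; }
  const Row& generate() { --remaining_; return *input_; }

private:
  std::size_t copies_;
  std::size_t remaining_ = 0;
  const Row*  input_     = nullptr;
};

// Hypothetical set packing a sequence of generators behind one interface.
class GeneratorSet {
public:
  // appends a copy of the generator at the end of the sequence
  void insertGenerator(const DuplicateGenerator& gen) {
    generators_.push_back(gen);
  }

  // feeds the input row through the whole sequence (eagerly, for simplicity)
  void setInputRow(const Row& row) {
    pending_.clear();
    expand(row, 0);
  }

  bool hasRows() const { return !pending_.empty(); }

  // returns the next output row (precondition: hasRows() is true)
  const Row& generate() {
    current_ = std::move(pending_.front());
    pending_.pop_front();
    return current_;
  }

private:
  // passes `row` to generator i and recurses on each row it outputs;
  // rows leaving the last generator are queued as output rows
  void expand(const Row& row, std::size_t i) {
    if (i == generators_.size()) {
      pending_.push_back(row);
      return;
    }
    DuplicateGenerator& gen = generators_[i];
    gen.setInputRow(row);
    while (gen.hasRows()) expand(gen.generate(), i + 1);
  }

  std::vector<DuplicateGenerator> generators_;
  std::deque<Row>                 pending_;
  Row                             current_;
};

// Same workload as with hand-nested loops, but with a single while loop.
std::size_t countRowsWithGeneratorSet(const std::vector<Row>& database) {
  GeneratorSet genset;
  genset.insertGenerator(DuplicateGenerator(3));
  genset.insertGenerator(DuplicateGenerator(4));

  std::size_t nb_output_rows = 0;
  for (const Row& dbrow : database) {
    genset.setInputRow(dbrow);
    while (genset.hasRows()) {
      const Row& output_dbrow = genset.generate();
      (void)output_dbrow;  // a learning algorithm would process the row here
      ++nb_output_rows;
    }
  }
  return nb_output_rows;
}
```

Adding a third generator to this sketch only requires one more insertGenerator() call; the parsing loop stays unchanged.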
As can be seen, whatever the number of DBRowGenerator instances packed into the DBRowGeneratorSet, only one while loop is needed to parse all the generated output DBRow instances.
Definition at line 112 of file DBRowGeneratorSet.h.
using gum::learning::DBRowGeneratorSet< ALLOC >::allocator_type = ALLOC< DBTranslatedValue > |
type for the allocators passed in arguments of methods
Definition at line 115 of file DBRowGeneratorSet.h.
gum::learning::DBRowGeneratorSet< ALLOC >::DBRowGeneratorSet | ( | const allocator_type & | alloc = allocator_type() | ) |
default constructor
gum::learning::DBRowGeneratorSet< ALLOC >::DBRowGeneratorSet | ( | const DBRowGeneratorSet< ALLOC > & | from | ) |
copy constructor
gum::learning::DBRowGeneratorSet< ALLOC >::DBRowGeneratorSet | ( | const DBRowGeneratorSet< ALLOC > & | from, |
const allocator_type & | alloc | ||
) |
copy constructor with a given allocator
gum::learning::DBRowGeneratorSet< ALLOC >::DBRowGeneratorSet | ( | DBRowGeneratorSet< ALLOC > && | from | ) |
move constructor
gum::learning::DBRowGeneratorSet< ALLOC >::DBRowGeneratorSet | ( | DBRowGeneratorSet< ALLOC > && | from, |
const allocator_type & | alloc | ||
) |
move constructor with a given allocator
virtual gum::learning::DBRowGeneratorSet< ALLOC >::~DBRowGeneratorSet | ( | ) |
destructor
void gum::learning::DBRowGeneratorSet< ALLOC >::clear | ( | ) |
removes all the generators
virtual DBRowGeneratorSet< ALLOC >* gum::learning::DBRowGeneratorSet< ALLOC >::clone | ( | ) | const |
virtual copy constructor
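virtual DBRowGeneratorSet< ALLOC >* gum::learning::DBRowGeneratorSet< ALLOC >::clone | ( | const allocator_type & | alloc | ) | const |
virtual copy constructor with a given allocator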
const std::vector< std::size_t, ALLOC< std::size_t > >& gum::learning::DBRowGeneratorSet< ALLOC >::columnsOfInterest | ( | ) | const |
returns the current set of columns of interest
const DBRow< DBTranslatedValue, ALLOC >& gum::learning::DBRowGeneratorSet< ALLOC >::generate | ( | ) |
generates a new output row from the input row
allocator_type gum::learning::DBRowGeneratorSet< ALLOC >::getAllocator | ( | ) | const |
returns the allocator used
bool gum::learning::DBRowGeneratorSet< ALLOC >::hasRows | ( | ) |
returns true if there are still rows that can be output by the set of generators
void gum::learning::DBRowGeneratorSet< ALLOC >::insertGenerator | ( | const Generator< ALLOC > & | generator | ) |
inserts a new generator at the end of the set
OperationNotAllowed | is raised if the generator set has already started generating output rows and is currently in a state where the generation is not completed yet (i.e., we still need to call the generate() method to complete it). |
void gum::learning::DBRowGeneratorSet< ALLOC >::insertGenerator | ( | const Generator< ALLOC > & | generator, |
const std::size_t | i | ||
) |
inserts a new generator at the ith position of the set
OperationNotAllowed | is raised if the generator set has already started generating output rows and is currently in a state where the generation is not completed yet (i.e., we still need to call the generate() method to complete it). |
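The rule stated for both insertGenerator() overloads (no insertion while a generation is still in progress) can be illustrated with a toy state guard. Everything below is a hypothetical sketch: `GuardedSet` is not an aGrUM class, `std::logic_error` stands in for OperationNotAllowed, and counting pending rows is merely one plausible way to detect an unfinished generation.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical set that only tracks how many output rows are still pending.
class GuardedSet {
public:
  // refuses to change the sequence while output rows remain to be generated
  void insertGenerator(std::size_t copies) {
    if (pending_ > 0) {
      // stand-in for aGrUM's OperationNotAllowed exception
      throw std::logic_error("generation in progress");
    }
    copies_.push_back(copies);
  }

  // starts a generation: the number of pending rows is the product
  // of the duplication factors of all the generators in the sequence
  void setInputRow() {
    pending_ = 1;
    for (std::size_t c : copies_) pending_ *= c;
  }

  bool hasRows() const { return pending_ > 0; }

  // consumes one pending output row, if any
  void generate() {
    if (pending_ > 0) --pending_;
  }

  std::size_t nbGenerators() const { return copies_.size(); }

private:
  std::vector<std::size_t> copies_;
  std::size_t              pending_ = 0;
};

// Returns true if inserting a generator in the middle of a generation
// is rejected with an exception.
bool insertionIsRejectedMidGeneration() {
  GuardedSet set;
  set.insertGenerator(3);
  set.setInputRow();
  set.generate();  // 2 rows are still pending
  try {
    set.insertGenerator(4);
  } catch (const std::logic_error&) {
    return true;
  }
  return false;
}
```

Once the while loop on hasRows() has consumed every pending row, the guard no longer triggers and generators can be inserted again.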
std::size_t gum::learning::DBRowGeneratorSet< ALLOC >::nbGenerators | ( | ) | const noexcept |
returns the number of generators
DBRowGeneratorSet< ALLOC >& gum::learning::DBRowGeneratorSet< ALLOC >::operator= | ( | const DBRowGeneratorSet< ALLOC > & | from | ) |
copy operator
DBRowGeneratorSet< ALLOC >& gum::learning::DBRowGeneratorSet< ALLOC >::operator= | ( | DBRowGeneratorSet< ALLOC > && | from | ) |
move operator
DBRowGenerator< ALLOC >& gum::learning::DBRowGeneratorSet< ALLOC >::operator[] | ( | const std::size_t | i | ) |
returns the ith generator
const DBRowGenerator< ALLOC >& gum::learning::DBRowGeneratorSet< ALLOC >::operator[] | ( | const std::size_t | i | ) | const |
returns the ith generator
void gum::learning::DBRowGeneratorSet< ALLOC >::reset | ( | ) |
resets all the generators
void gum::learning::DBRowGeneratorSet< ALLOC >::setBayesNet | ( | const BayesNet< GUM_SCALAR > & | new_bn | ) |
assign a new Bayes net to all the generators that depend on a BN
Typically, generators based on EM or K-means depend on a model to compute their outputs correctly. Method setBayesNet() updates the BN model they use.
void gum::learning::DBRowGeneratorSet< ALLOC >::setColumnsOfInterest | ( | const std::vector< std::size_t, ALLOC< std::size_t > > & | cols_of_interest | ) |
sets the columns of interest: the output DBRow needs only contain correct values for these columns
This method is useful, e.g., for EM-like algorithms that need to know which unobserved variables/values need to be filled. In this case, the DBRowGenerator instances contained in the DBRowGeneratorSet still output DBRows with the same columns as the DatabaseTable, but only the columns of these DBRows that correspond to those passed as argument to method setColumnsOfInterest() are meaningful. For instance, if a DatabaseTable contains 10 columns and method setColumnsOfInterest() is applied with vector<> { 0, 3, 4 }, then the DBRowGenerator instances contained in the DBRowGeneratorSet will output DBRows with 10 columns, in which only columns 0, 3 and 4 are guaranteed to have correct values (columns are always indexed starting from 0).
OperationNotAllowed | is raised if the generator set has already started generating output rows and is currently in a state where the generation is not completed yet (i.e., we still need to call the generate() method to complete it). |
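The 10-column example above can be made concrete with a toy sketch. The names `Row`, `kMissing` and `fillColumnsOfInterest` are hypothetical and not part of aGrUM, and the constant fill value merely stands in for whatever an EM-like generator would compute; the point is only that the output row keeps all its columns while just the columns of interest are made valid.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a DBRow: a plain vector of column values.
using Row = std::vector<int>;

// Hypothetical sentinel marking a missing (unobserved) value.
constexpr int kMissing = -1;

// Returns a row with the same number of columns as `row`, in which the
// missing values of the columns of interest are replaced by `fill_value`.
// The other columns are left untouched and may therefore still be missing.
Row fillColumnsOfInterest(Row                             row,
                          const std::vector<std::size_t>& cols_of_interest,
                          int                             fill_value) {
  for (std::size_t col : cols_of_interest) {
    if (col < row.size() && row[col] == kMissing) {
      row[col] = fill_value;
    }
  }
  return row;
}
```

For a 10-column row whose values are all missing, calling `fillColumnsOfInterest(row, {0, 3, 4}, 1)` yields a 10-column row in which only columns 0, 3 and 4 hold usable values.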
void gum::learning::DBRowGeneratorSet< ALLOC >::setColumnsOfInterest | ( | std::vector< std::size_t, ALLOC< std::size_t > > && | cols_of_interest | ) |
sets the columns of interest: the output DBRow needs only contain correct values for these columns
This method is useful, e.g., for EM-like algorithms that need to know which unobserved variables/values need to be filled. In this case, the DBRowGenerator instances contained in the DBRowGeneratorSet still output DBRows with the same columns as the DatabaseTable, but only the columns of these DBRows that correspond to those passed as argument to method setColumnsOfInterest() are meaningful. For instance, if a DatabaseTable contains 10 columns and method setColumnsOfInterest() is applied with vector<> { 0, 3, 4 }, then the DBRowGenerator instances contained in the DBRowGeneratorSet will output DBRows with 10 columns, in which only columns 0, 3 and 4 are guaranteed to have correct values (columns are always indexed starting from 0).
OperationNotAllowed | is raised if the generator set has already started generating output rows and is currently in a state where the generation is not completed yet (i.e., we still need to call the generate() method to complete it). |
bool gum::learning::DBRowGeneratorSet< ALLOC >::setInputRow | ( | const DBRow< DBTranslatedValue, ALLOC > & | input_row | ) |
sets the input row from which the generators will create new rows
std::size_t gum::learning::DBRowGeneratorSet< ALLOC >::size | ( | ) | const noexcept |
returns the number of generators (alias for nbGenerators)