aGrUM  0.13.2
gum::learning::Counter< IdSetAlloc, CountAlloc > Class Template Reference

The counting class for all the scores used for learning (BIC, BDeu, etc) as well as for all the independence tests.

#include <counter.h>


Public Member Functions

Constructors / Destructors
template<typename RowFilter >
 Counter (const RowFilter &filter, const std::vector< Size > &var_modalities, Size min_range=0, Size max_range=std::numeric_limits< Size >::max())
 default constructor

 Counter (const Counter< IdSetAlloc, CountAlloc > &)
 copy constructor

 Counter (Counter< IdSetAlloc, CountAlloc > &&)
 move constructor

virtual ~Counter ()
 destructor
 
Modifiers for unconditioned variables
Idx addEmptyNodeSet ()
 adds an empty set of variables to count

Idx addNodeSet (Idx var)
 add a new single variable to be counted
 
Modifiers for conditioned variables
Idx addNodeSet (Idx var1, Idx var2)
 add a new target node conditioned by another node to be counted

Idx addNodeSet (const std::pair< Idx, Idx > &vars)
 add a new target node conditioned by another node to be counted

Idx addNodeSet (Idx var, const std::vector< Idx > &conditioning_ids)
 add a new target variable plus some conditioning vars

Idx addNodeSet (Idx var, std::vector< Idx > &&conditioning_ids)
 add a new target variable plus some conditioning vars

Idx addNodeSet (Idx var1, Idx var2, const std::vector< Idx > &conditioning_ids)
 add a target conditioned by other variables to be counted

Idx addNodeSet (Idx var1, Idx var2, std::vector< Idx > &&conditioning_ids)
 add a target conditioned by other variables to be counted

Idx addNodeSet (const std::pair< Idx, Idx > &vars, const std::vector< Idx > &conditioning_ids)
 add a target conditioned by other variables to be counted

Idx addNodeSet (const std::pair< Idx, Idx > &vars, std::vector< Idx > &&conditioning_ids)
 add a target conditioned by other variables to be counted
 
Accessors / General modifiers
void clear ()
 clears all the data structures from memory

const std::vector< Size > & modalities () const noexcept
 returns the modalities of the variables

void setMaxNbThreads (Size nb) noexcept
 sets the maximum number of threads used to perform countings

void setRange (Size min_range, Size max_range)
 sets the range of records taken into account by the counter
 

Protected Attributes

const double _1log2 {M_LOG2E}
 1 / log(2)

const std::vector< Size > & _modalities
 the modalities of the variables

bool _counts_computed {false}
 indicates whether we have already computed the countings of the nodesets

RecordCounter< IdSetAlloc, CountAlloc > _record_counter
 the recordCounter that will parse the database

std::vector< std::pair< std::vector< Idx, IdSetAlloc >, Idx > * > _target_nodesets
 the target id sets to count and their indices in the record counter

std::vector< std::pair< std::vector< Idx, IdSetAlloc >, Idx > * > _conditioning_nodesets
 the conditioning id sets to count and their indices in the record counter
 

Protected Member Functions

void _count ()
 perform the computation of the countings

const std::vector< double, CountAlloc > & _getAllCounts (Idx index)
 returns the counting vector for a given (conditioned) target set

const std::vector< double, CountAlloc > & _getConditioningCounts (Idx index)
 returns the counting vector for a conditioning set

std::vector< std::vector< double, CountAlloc > > & _getCounts () noexcept
 returns all the countings performed (both targets and conditioned)

const std::vector< Idx, IdSetAlloc > & _getAllNodes (Idx index) const noexcept
 returns the set of target + conditioning nodes

const std::vector< std::pair< std::vector< Idx, IdSetAlloc >, Idx > * > & _getAllNodes () const noexcept
 returns all the sets of target + cond nodes, and their counting indices

const std::vector< Idx, IdSetAlloc > * _getConditioningNodes (Idx index) const noexcept
 returns the conditioning nodes (nullptr if there are no such nodes)

const std::vector< std::pair< std::vector< Idx, IdSetAlloc >, Idx > * > & _getConditioningNodes () const noexcept
 returns all the sets of conditioning nodes

Counter< IdSetAlloc, CountAlloc > & operator= (const Counter< IdSetAlloc, CountAlloc > &)=delete
 prevent copy operator
 

Detailed Description

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
class gum::learning::Counter< IdSetAlloc, CountAlloc >

The counting class for all the scores used for learning (BIC, BDeu, etc) as well as for all the independence tests.

This class contains all the methods needed to add (possibly conditioned) nodesets that are subsequently counted to produce a score or the result of an independence test. The class handles both symmetric and asymmetric scores or tests. A symmetric test involves two variables X and Y and, possibly, a conditioning set of variables Z. The test usually relies on an equation involving the quantities #XYZ, #XZ, #YZ and #Z, where "#" refers to observation counts or frequencies. For instance, the Chi2 independence test sums, over the values of X, Y and Z, the following term: (#XYZ - (#XZ * #YZ) / #Z)^2 / ((#XZ * #YZ) / #Z). An asymmetric score, such as BIC, involves only one variable X and a conditioning set Z, so it relies only on #XZ and #Z. The class therefore offers different methods to compute all these quantities. Note that counting vectors are actually multidimensional arrays, and the order of the variables in the dimensions of the arrays is always the same: the conditioning nodes (in the order in which they are specified) always come first and the target variable is always the last one.
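
As a sketch of how the four quantities combine, the function below computes the Chi2 statistic from a joint count table over (Z, Y, X), deriving #XZ, #YZ and #Z by marginalization. The function name chi2FromCounts, the flat memory layout (Z slowest-varying, X fastest) and the choice to skip cells whose expected count is zero are assumptions made for this illustration; they are not taken from aGrUM.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Chi2 statistic from a joint count table over (Z, Y, X).
// ASSUMPTION (illustration only): counts are stored flat with Z as the
// slowest-varying dimension and X as the fastest one, that is
// index = (z * dy + y) * dx + x. aGrUM's actual layout may differ.
double chi2FromCounts(const std::vector<double>& n_zyx,
                      std::size_t dz, std::size_t dy, std::size_t dx) {
  assert(n_zyx.size() == dz * dy * dx);
  double chi2 = 0.0;
  for (std::size_t z = 0; z < dz; ++z) {
    // marginal counts #XZ, #YZ and #Z for the current value of Z
    std::vector<double> n_xz(dx, 0.0), n_yz(dy, 0.0);
    double n_z = 0.0;
    for (std::size_t y = 0; y < dy; ++y) {
      for (std::size_t x = 0; x < dx; ++x) {
        const double c = n_zyx[(z * dy + y) * dx + x];
        n_xz[x] += c;
        n_yz[y] += c;
        n_z += c;
      }
    }
    if (n_z == 0.0) continue;  // no observation for this value of Z

    for (std::size_t y = 0; y < dy; ++y) {
      for (std::size_t x = 0; x < dx; ++x) {
        const double expected = n_xz[x] * n_yz[y] / n_z;  // (#XZ * #YZ) / #Z
        if (expected == 0.0) continue;  // convention chosen for this sketch
        const double diff = n_zyx[(z * dy + y) * dx + x] - expected;
        chi2 += diff * diff / expected;
      }
    }
  }
  return chi2;
}
```

With a single value of Z and the 2x2 table {10, 20, 30, 40}, the expected counts are 12, 18, 28 and 42, so the statistic is 4/12 + 4/18 + 4/28 + 4/42 = 50/63.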

The class should be used as follows. First, to speed up computations, consider computing all the scores or tests you need in one pass, because this allows the database to be parsed only once to fill many counting vectors. To do so, register every nodeset you need with the appropriate addNodeSet methods. The addNodeSet methods in which you do not specify a set of conditioning nodes assume that this set is empty. Then use methods _getAllCounts and _getConditioningCounts to retrieve the countings; if they have not been computed yet, they are computed before these methods return. Note that this class is not intended to be used as is, but rather as a basis for classes Score and IndependenceTest.
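
The register-then-retrieve workflow can be sketched with a small stand-in class. ToyCounter below is hypothetical and is not aGrUM code: it only mimics the documented behavior (addNodeSet returns an index, counts are computed lazily in a single pass over the data, conditioning variables come before the target). The in-memory row format and the flat offset computation are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy stand-in for the register-then-count workflow (NOT aGrUM code).
class ToyCounter {
 public:
  ToyCounter(const std::vector<std::vector<std::size_t>>& rows,
             const std::vector<std::size_t>& modalities)
      : rows_(rows), modalities_(modalities) {}

  // Registers the set (conditioning..., target) and returns its index.
  std::size_t addNodeSet(std::size_t target,
                         const std::vector<std::size_t>& conditioning = {}) {
    std::vector<std::size_t> ids(conditioning);
    ids.push_back(target);  // the target always comes last
    nodesets_.push_back(std::move(ids));
    computed_ = false;
    return nodesets_.size() - 1;
  }

  // Returns the counts of a registered set, counting first if needed.
  const std::vector<double>& getAllCounts(std::size_t index) {
    if (!computed_) count();
    return counts_.at(index);
  }

  // Number of passes performed over the rows so far.
  std::size_t nbPasses() const { return passes_; }

 private:
  void count() {
    counts_.clear();
    for (const auto& ids : nodesets_) {
      std::size_t size = 1;
      for (auto id : ids) size *= modalities_[id];
      counts_.emplace_back(size, 0.0);
    }
    // a single pass over the rows fills every registered counting vector
    for (const auto& row : rows_) {
      for (std::size_t s = 0; s < nodesets_.size(); ++s) {
        std::size_t offset = 0;  // first id is the slowest-varying dimension
        for (auto id : nodesets_[s]) offset = offset * modalities_[id] + row[id];
        counts_[s][offset] += 1.0;
      }
    }
    computed_ = true;
    ++passes_;
  }

  std::vector<std::vector<std::size_t>> rows_;
  std::vector<std::size_t> modalities_;
  std::vector<std::vector<std::size_t>> nodesets_;
  std::vector<std::vector<double>> counts_;
  bool computed_ = false;
  std::size_t passes_ = 0;
};
```

Registering two nodesets and then reading both counting vectors triggers only one pass over the rows, which is the point of registering everything before retrieving anything.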

Definition at line 108 of file counter.h.

Constructor & Destructor Documentation

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
template<typename RowFilter >
gum::learning::Counter< IdSetAlloc, CountAlloc >::Counter ( const RowFilter &  filter,
const std::vector< Size > &  var_modalities,
Size  min_range = 0,
Size  max_range = std::numeric_limits< Size >::max() 
)

default constructor

Parameters
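filter: the row filter that will be used to read the database
var_modalities: the domain sizes of the variables in the database
min_range: the number of the first record taken into account (see setRange)
max_range: the number of the record after the last one taken into account (see setRange)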
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
gum::learning::Counter< IdSetAlloc, CountAlloc >::Counter ( const Counter< IdSetAlloc, CountAlloc > &  )

copy constructor

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
gum::learning::Counter< IdSetAlloc, CountAlloc >::Counter ( Counter< IdSetAlloc, CountAlloc > &&  )

move constructor

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
virtual gum::learning::Counter< IdSetAlloc, CountAlloc >::~Counter ( )
virtual

destructor

Member Function Documentation

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::Counter< IdSetAlloc, CountAlloc >::_count ( )
protected

perform the computation of the countings

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< double, CountAlloc >& gum::learning::Counter< IdSetAlloc, CountAlloc >::_getAllCounts ( Idx  index)
protected

returns the counting vector for a given (conditioned) target set

This method returns the observation countings for the set of variables whose index was returned by one of the addNodeSet methods. If the set was conditioned, the countings correspond to the target variables and the conditioning variables. If you wish to get only the countings for the conditioning variables, prefer using method _getConditioningCounts.

Warning
the dimensions of the vector are as follows: first come the nodes of the conditioning set (in the order in which they were specified when calling addNodeSet), and then the target nodes.
whenever you call this function, if the counts have not been computed yet, they are computed before the function returns.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< Idx, IdSetAlloc >& gum::learning::Counter< IdSetAlloc, CountAlloc >::_getAllNodes ( Idx  index) const
protected noexcept

returns the set of target + conditioning nodes

conditioning nodes are always the first ones in the vector and targets are the last ones

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< std::pair< std::vector< Idx, IdSetAlloc >, Idx >* >& gum::learning::Counter< IdSetAlloc, CountAlloc >::_getAllNodes ( ) const
protected noexcept

returns all the sets of target + cond nodes, and their counting indices

conditioning nodes are always the first ones in the vector and targets are the last ones

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< double, CountAlloc >& gum::learning::Counter< IdSetAlloc, CountAlloc >::_getConditioningCounts ( Idx  index)
protected

returns the counting vector for a conditioning set

Warning
whenever you call this function, if the counts have not been computed yet, they are computed before the function returns.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< Idx, IdSetAlloc >* gum::learning::Counter< IdSetAlloc, CountAlloc >::_getConditioningNodes ( Idx  index) const
protected noexcept

returns the conditioning nodes (nullptr if there are no such nodes)

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< std::pair< std::vector< Idx, IdSetAlloc >, Idx >* >& gum::learning::Counter< IdSetAlloc, CountAlloc >::_getConditioningNodes ( ) const
protected noexcept

returns all the sets of conditioning nodes

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
std::vector< std::vector< double, CountAlloc > >& gum::learning::Counter< IdSetAlloc, CountAlloc >::_getCounts ( )
protected noexcept

returns all the countings performed (both targets and conditioned)

This method returns the countings of the record counter. It should be used in conjunction with methods _getConditioningNodes () and _getAllNodes (), which indicate, for each nodeset, the index of the corresponding counting in the vector returned by _getCounts ().

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::Counter< IdSetAlloc, CountAlloc >::addEmptyNodeSet ( )

adds an empty set of variables to count

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::Counter< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var)

add a new single variable to be counted

Parameters
var: the index, in the filtered rows produced by the database cell filters, of the variable whose observations shall be counted
Returns
the index of the produced counting vector: the user should use class Counter to compute in one pass several scores or independence tests. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the observed countings of "var" in this vector. The user shall pass this index as argument to method _getAllCounts to get the corresponding counting vector.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::Counter< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var1,
Idx  var2 
)

add a new target node conditioned by another node to be counted

Parameters
var1: represents the index of the target variable in the filtered rows produced by the database cell filters
var2: represents the index of the conditioning variable in the filtered rows produced by the database cell filters
Returns
the index of the produced counting vector: the user should use class Counter to compute in one pass several scores or independence tests. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the counts in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the observed countings of (var2,var1) [in this order] and var2 respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::Counter< IdSetAlloc, CountAlloc >::addNodeSet ( const std::pair< Idx, Idx > &  vars)

add a new target node conditioned by another node to be counted

Parameters
vars: contains the index of the target variable (first) in the filtered rows produced by the database cell filters, and the index of the conditioning variable (second).
Returns
the index of the produced counting vector: the user should use class Counter to compute in one pass several scores. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the counts in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the observed countings of (vars.second, vars.first) [in this order] and vars.second respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::Counter< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var,
const std::vector< Idx > &  conditioning_ids 
)

add a new target variable plus some conditioning vars

Parameters
var: represents the index of the target variable in the filtered rows produced by the database cell filters
conditioning_ids: the indices of the variables of the conditioning set in the filtered rows
Returns
the index of the produced counting vector: the user should use class Counter to compute in one pass several scores or independence tests. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the countings of (var | conditioning_ids) in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the counting vectors of (conditioning_ids, var) [in this order] and conditioning_ids respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::Counter< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var,
std::vector< Idx > &&  conditioning_ids 
)

add a new target variable plus some conditioning vars

Parameters
var: represents the index of the target variable in the filtered rows produced by the database cell filters
conditioning_ids: the indices of the variables of the conditioning set in the filtered rows
Returns
the index of the produced counting vector: the user should use class Counter to compute in one pass several scores or independence tests. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the countings of (var | conditioning_ids) in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the counting vectors of (conditioning_ids, var) [in this order] and conditioning_ids respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::Counter< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var1,
Idx  var2,
const std::vector< Idx > &  conditioning_ids 
)

add a target conditioned by other variables to be counted

Parameters
var1: represents the index of the target variable in the filtered rows produced by the database cell filters
var2: represents the index of the last conditioning variable in the filtered rows produced by the database cell filters
conditioning_ids: the indices of the variables of the conditioning set in the filtered rows (minus var2, which is subsequently appended to it).
Returns
the index of the produced counting vector: the user should use class Counter to compute in one pass several scores. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the counts in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the countings of (conditioning_ids, var2, var1) [in this order] and (conditioning_ids, var2) [in this order] respectively.
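
The ordering described above can be sketched as follows. The helper buildNodeSets and the Idx alias are hypothetical, introduced only to show which id sequences the two retrieval methods are documented to correspond to; the sketch does not reproduce aGrUM's internal code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Idx = std::size_t;  // stand-in for gum::Idx (assumption for this sketch)

// Returns {all nodes, conditioning nodes} for a call of the form
// addNodeSet(var1, var2, conditioning_ids):
//   conditioning nodes = (conditioning_ids, var2)
//   all nodes          = (conditioning_ids, var2, var1)
std::pair<std::vector<Idx>, std::vector<Idx>>
buildNodeSets(Idx var1, Idx var2, std::vector<Idx> conditioning_ids) {
  conditioning_ids.push_back(var2);  // var2 becomes the last conditioning variable
  std::vector<Idx> all_nodes(conditioning_ids);
  all_nodes.push_back(var1);         // the target always comes last
  return std::make_pair(std::move(all_nodes), std::move(conditioning_ids));
}
```

For var1 = 3, var2 = 2 and conditioning_ids = (0, 1), the full set is (0, 1, 2, 3) and the conditioning set is (0, 1, 2).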
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::Counter< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var1,
Idx  var2,
std::vector< Idx > &&  conditioning_ids 
)

add a target conditioned by other variables to be counted

Parameters
var1: represents the index of the target variable in the filtered rows produced by the database cell filters
var2: represents the index of the last conditioning variable in the filtered rows produced by the database cell filters
conditioning_ids: the indices of the variables of the conditioning set in the filtered rows (minus var2, which is subsequently appended to it).
Returns
the index of the produced counting vector: the user should use class Counter to compute in one pass several scores. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the counts in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the countings of (conditioning_ids, var2, var1) [in this order] and (conditioning_ids, var2) [in this order] respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::Counter< IdSetAlloc, CountAlloc >::addNodeSet ( const std::pair< Idx, Idx > &  vars,
const std::vector< Idx > &  conditioning_ids 
)

add a target conditioned by other variables to be counted

Parameters
vars: represents the index of the target variable (first) in the filtered rows produced by the database cell filters, and the index of the last conditioning variable (second)
conditioning_ids: the indices of the variables of the conditioning set in the filtered rows (minus vars.second, which is appended to it)
Returns
the index of the produced counting vector: the user should use class Counter to compute in one pass several scores. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the counts in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the observed countings of (conditioning_ids, vars.second, vars.first) [in this order] and (conditioning_ids, vars.second) [in this order] respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::Counter< IdSetAlloc, CountAlloc >::addNodeSet ( const std::pair< Idx, Idx > &  vars,
std::vector< Idx > &&  conditioning_ids 
)

add a target conditioned by other variables to be counted

Parameters
vars: represents the index of the target variable (first) in the filtered rows produced by the database cell filters, and the index of the last conditioning variable (second)
conditioning_ids: the indices of the variables of the conditioning set in the filtered rows (minus vars.second, which is appended to it)
Returns
the index of the produced counting vector: the user should use class Counter to compute in one pass several scores. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the counts in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the observed countings of (conditioning_ids, vars.second, vars.first) [in this order] and (conditioning_ids, vars.second) [in this order] respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::Counter< IdSetAlloc, CountAlloc >::clear ( )

clears all the data structures from memory

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< Size >& gum::learning::Counter< IdSetAlloc, CountAlloc >::modalities ( ) const
noexcept

returns the modalities of the variables

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Counter< IdSetAlloc, CountAlloc >& gum::learning::Counter< IdSetAlloc, CountAlloc >::operator= ( const Counter< IdSetAlloc, CountAlloc > &  )
protected delete

prevent copy operator

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::Counter< IdSetAlloc, CountAlloc >::setMaxNbThreads ( Size  nb)
noexcept

sets the maximum number of threads used to perform countings

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::Counter< IdSetAlloc, CountAlloc >::setRange ( Size  min_range,
Size  max_range 
)

sets the range of records taken into account by the counter

Parameters
min_range: the number of the first record to be taken into account during learning
max_range: the number of the record after the last one taken into account
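
The parameter descriptions imply a half-open interval [min_range, max_range). A minimal sketch of that convention is given below; countInRange is a hypothetical helper, not part of aGrUM, and clamping max_range to the number of records is an assumption suggested by the constructor's default value of std::numeric_limits< Size >::max().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Counts the occurrences of each value over records [min_range, max_range).
// Hypothetical helper: max_range designates the record AFTER the last one
// counted, and is clamped to the number of records (assumption).
std::vector<double> countInRange(
    const std::vector<std::size_t>& column, std::size_t modality,
    std::size_t min_range = 0,
    std::size_t max_range = std::numeric_limits<std::size_t>::max()) {
  std::vector<double> counts(modality, 0.0);
  const std::size_t end = std::min<std::size_t>(max_range, column.size());
  for (std::size_t r = min_range; r < end; ++r) {
    assert(column[r] < modality);
    counts[column[r]] += 1.0;
  }
  return counts;
}
```

Under this convention, a range of (1, 4) covers records 1, 2 and 3, and a range whose two bounds are equal covers no record at all.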

Member Data Documentation

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const double gum::learning::Counter< IdSetAlloc, CountAlloc >::_1log2 {M_LOG2E}
protected

1 / log(2)

Definition at line 342 of file counter.h.
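
Here "log" denotes the natural logarithm: M_LOG2E is log2(e), which equals 1 / ln(2). The sketch below checks this identity and shows the conversion such a constant makes possible. The helper log2FromLn is hypothetical, and the idea that derived classes use the constant to convert natural logarithms to base 2 is an assumption, not something stated on this page. The fallback definition is only there for toolchains whose <cmath> does not expose M_LOG2E.

```cpp
#include <cassert>
#include <cmath>

#ifndef M_LOG2E
#define M_LOG2E 1.4426950408889634074  // log2(e), fallback value
#endif

// Converts a natural logarithm to base 2 by multiplying by 1 / ln(2).
double log2FromLn(double x) {
  assert(x > 0.0);
  return std::log(x) * M_LOG2E;
}
```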

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
std::vector< std::pair< std::vector< Idx, IdSetAlloc >, Idx >* > gum::learning::Counter< IdSetAlloc, CountAlloc >::_conditioning_nodesets
protected

the conditioning id sets to count and their indices in the record counter

Definition at line 361 of file counter.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
bool gum::learning::Counter< IdSetAlloc, CountAlloc >::_counts_computed {false}
protected

indicates whether we have already computed the countings of the nodesets

Definition at line 349 of file counter.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< Size >& gum::learning::Counter< IdSetAlloc, CountAlloc >::_modalities
protected

the modalities of the variables

Definition at line 345 of file counter.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
RecordCounter< IdSetAlloc, CountAlloc > gum::learning::Counter< IdSetAlloc, CountAlloc >::_record_counter
protected

the recordCounter that will parse the database

Definition at line 352 of file counter.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
std::vector< std::pair< std::vector< Idx, IdSetAlloc >, Idx >* > gum::learning::Counter< IdSetAlloc, CountAlloc >::_target_nodesets
protected

the target id sets to count and their indices in the record counter

Definition at line 356 of file counter.h.


The documentation for this class was generated from the following file: