aGrUM  0.13.2
gum::learning::IndependenceTest< IdSetAlloc, CountAlloc > Class Template Reference abstract

The abstract class for all the independence tests. The class should be used as follows: first, to speed up computations, you should consider computing all the independence tests you need in one pass.

#include <independenceTest.h>


Public Member Functions

Constructors / Destructors
template<typename RowFilter >
 IndependenceTest (const RowFilter &filter, const std::vector< Size > &var_modalities)
 default constructor
 
virtual ~IndependenceTest ()
 destructor
 
Accessors / Modifiers
Idx addNodeSet (Idx var1, Idx var2)
 add a new target node conditioned by another node to be counted
 
Idx addNodeSet (const std::pair< Idx, Idx > &vars)
 add a new target node conditioned by another node to be counted
 
Idx addNodeSet (Idx var1, Idx var2, const std::vector< Idx > &conditioning_ids)
 add a target conditioned by other variables to be counted
 
Idx addNodeSet (Idx var1, Idx var2, std::vector< Idx > &&conditioning_ids)
 add a target conditioned by other variables to be counted
 
Idx addNodeSet (const std::pair< Idx, Idx > &vars, const std::vector< Idx > &conditioning_ids)
 add a target conditioned by other variables to be counted
 
Idx addNodeSet (const std::pair< Idx, Idx > &vars, std::vector< Idx > &&conditioning_ids)
 add a target conditioned by other variables to be counted
 
void clear ()
 clears all the data structures from memory
 
void clearCache ()
 clears the current cache (clears nodesets as well)
 
void useCache (bool on_off) noexcept
 turns on/off the use of a cache of the previously computed scores
 
virtual double score (Idx nodeset_index)=0
 returns the score corresponding to a given nodeset
 

Protected Attributes

const double _1log2 {M_LOG2E}
 1 / log(2)
 

Protected Member Functions

bool _isInCache (Idx nodeset_index) const noexcept
 indicates whether a score belongs to the cache
 
void _insertIntoCache (Idx nodeset_index, double score)
 inserts a new score into the cache
 
double _cachedScore (Idx nodeset_index) const noexcept
 returns a cached score
 
bool _isUsingCache () const noexcept
 indicates whether we use the cache or not
 

Detailed Description

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
class gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >

the abstract class for all the independence tests

The class should be used as follows: first, to speed up computations, you should consider computing all the independence tests you need in one pass.

To do so, use the appropriate addNodeSet methods. These will compute everything you need. The addNodeSet methods where you do not specify a set of conditioning nodes assume that this set is empty. Once the computations have been performed, use methods _getAllCounts and _getConditioningCounts to get the observed countings if you are developing a new independence test class, or use method score to get the computed score of the test if you are an end user.
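
The register-then-count workflow described above can be sketched with a small self-contained toy. This is not aGrUM code: ToyBatchCounter, Row, count and countOf are hypothetical names introduced only for illustration, and the database is assumed to be a plain vector of records. The one point it tries to show is that all nodesets are registered first and then counted in a single pass over the records.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Row = std::vector<std::size_t>;  // one database record: a value per variable

// Toy stand-in (not aGrUM code): register nodesets first, then count them all
// in a single pass over the database.
class ToyBatchCounter {
public:
  // Registers var1 and var2 given conditioning_ids; returns the nodeset index.
  std::size_t addNodeSet(std::size_t var1, std::size_t var2,
                         std::vector<std::size_t> conditioning_ids = {}) {
    // Stored order follows the documentation: conditioning ids, var2, var1.
    conditioning_ids.push_back(var2);
    conditioning_ids.push_back(var1);
    nodesets_.push_back(std::move(conditioning_ids));
    counts_.emplace_back();
    return nodesets_.size() - 1;
  }

  // Reads every record once and updates the counts of all registered nodesets.
  void count(const std::vector<Row>& database) {
    for (const Row& row : database) {
      for (std::size_t i = 0; i < nodesets_.size(); ++i) {
        Row key;
        for (std::size_t id : nodesets_[i]) key.push_back(row.at(id));
        counts_[i][key] += 1.0;
      }
    }
  }

  // Number of records in which the nodeset took the given values.
  double countOf(std::size_t index, const Row& values) const {
    assert(index < counts_.size());
    const auto it = counts_[index].find(values);
    return it == counts_[index].end() ? 0.0 : it->second;
  }

private:
  std::vector<std::vector<std::size_t>> nodesets_;  // variable ids per nodeset
  std::vector<std::map<Row, double>> counts_;       // observed counts per nodeset
};
```

In this toy, calling addNodeSet(0, 1) and then addNodeSet(0, 1, {2}) before a single call to count lets both tests share one scan of the data, which is the speed-up the description refers to.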

Definition at line 70 of file independenceTest.h.

Constructor & Destructor Documentation

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
template<typename RowFilter >
gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::IndependenceTest ( const RowFilter &  filter,
const std::vector< Size > &  var_modalities 
)

default constructor

Parameters
filter  the row filter that will be used to read the database
var_modalities  the domain sizes of the variables in the database
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
virtual gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::~IndependenceTest ( )
virtual

destructor

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::IndependenceTest ( const IndependenceTest< IdSetAlloc, CountAlloc > &  )
private delete

prevent copy constructor

Member Function Documentation

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
double gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::_cachedScore ( Idx  nodeset_index) const
protected noexcept

returns a cached score

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::Counter< IdSetAlloc, CountAlloc >::_count ( )
protected inherited

perform the computation of the countings

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< double, CountAlloc >& gum::learning::Counter< IdSetAlloc, CountAlloc >::_getAllCounts ( Idx  index)
protected inherited

returns the counting vector for a given (conditioned) target set

This method returns the observation countings for the set of variables whose index was returned by method addNodeSet. If the set was conditioned, the countings correspond to the target variables and the conditioning variables. If you wish to get only the countings for the conditioning variables, prefer using method _getConditioningCounts.

Warning
the dimensions of the vector are as follows: first come the nodes of the conditioning set (in the order in which they were specified when calling addNodeSet), and then the target nodes.
whenever you call this function, if the counts have not been computed yet, they are computed before the function returns.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< Idx, IdSetAlloc >& gum::learning::Counter< IdSetAlloc, CountAlloc >::_getAllNodes ( Idx  index) const
protected noexcept inherited

returns the set of target + conditioning nodes

conditioning nodes are always the first ones in the vector and targets are the last ones

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< std::pair< std::vector< Idx, IdSetAlloc >, Idx >* >& gum::learning::Counter< IdSetAlloc, CountAlloc >::_getAllNodes ( ) const
protected noexcept inherited

returns all the sets of target + cond nodes, and their counting indices

conditioning nodes are always the first ones in the vector and targets are the last ones

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< double, CountAlloc >& gum::learning::Counter< IdSetAlloc, CountAlloc >::_getConditioningCounts ( Idx  index)
protected inherited

returns the counting vector for a conditioning set

Warning
whenever you call this function, if the counts have not been computed yet, they are computed before the function returns.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< Idx, IdSetAlloc >* gum::learning::Counter< IdSetAlloc, CountAlloc >::_getConditioningNodes ( Idx  index) const
protected noexcept inherited

returns the conditioning nodes (nullptr if there are no such nodes)

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< std::pair< std::vector< Idx, IdSetAlloc >, Idx >* >& gum::learning::Counter< IdSetAlloc, CountAlloc >::_getConditioningNodes ( ) const
protected noexcept inherited

returns all the sets of conditioning nodes

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
std::vector< std::vector< double, CountAlloc > >& gum::learning::Counter< IdSetAlloc, CountAlloc >::_getCounts ( )
protected noexcept inherited

returns all the countings performed (both targets and conditioned)

this method returns the countings of the record counter. It should be used in conjunction with methods _getConditioningNodes () and _getTargetNodes () that indicate, for each nodeset, the index of the corresponding counting in the vector returned by _getCounts ().

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::_insertIntoCache ( Idx  nodeset_index,
double  score 
)
protected

inserts a new score into the cache

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
bool gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::_isInCache ( Idx  nodeset_index) const
protected noexcept

indicates whether a score belongs to the cache

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
bool gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::_isUsingCache ( ) const
protected noexcept

indicates whether we use the cache or not
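
How a derived test might combine the four cache helpers documented above can be sketched as follows. This is a toy under stated assumptions, not the library's implementation: ToyCachedTest, expensiveScore and evaluations are hypothetical, and the per-nodeset vectors only mimic the roles of the cached-score and is-cached members listed in this page.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in (not aGrUM code) for a test whose score() consults a cache.
class ToyCachedTest {
public:
  // Registers a nodeset and returns its index; nothing is cached yet.
  std::size_t addNodeSet() {
    cached_score_.push_back(0.0);
    is_cached_.push_back(false);
    return cached_score_.size() - 1;
  }

  // Mirrors useCache(bool): switches cache lookups and insertions on or off.
  void useCache(bool on_off) noexcept { use_cache_ = on_off; }

  // Returns the cached score when allowed and available, else recomputes it.
  double score(std::size_t nodeset_index) {
    assert(nodeset_index < cached_score_.size());
    if (use_cache_ && is_cached_[nodeset_index]) {
      return cached_score_[nodeset_index];
    }
    const double value = expensiveScore(nodeset_index);
    if (use_cache_) {
      cached_score_[nodeset_index] = value;
      is_cached_[nodeset_index] = true;
    }
    return value;
  }

  // How many times the expensive computation actually ran.
  int evaluations() const noexcept { return evaluations_; }

private:
  // Placeholder for a real statistic computed from database counts.
  double expensiveScore(std::size_t nodeset_index) {
    ++evaluations_;
    return 2.0 * static_cast<double>(nodeset_index);
  }

  std::vector<double> cached_score_;
  std::vector<bool> is_cached_;
  bool use_cache_{true};
  int evaluations_{0};
};
```

Scoring the same nodeset twice with the cache enabled runs the expensive computation once; after useCache(false) every call recomputes.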

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::Counter< IdSetAlloc, CountAlloc >::addEmptyNodeSet ( )
inherited

adds an empty set of variables to count

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var1,
Idx  var2 
)

add a new target node conditioned by another node to be counted

Parameters
var1  represents the index of the target variable in the filtered rows produced by the database cell filters
var2  represents the index of the conditioning variable in the filtered rows produced by the database cell filters
Returns
the index of the produced counting vector: the user should use class IndependenceTest to compute in one pass several independence tests. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the counts in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the observed countings of (var2,var1) [in this order] and var2 respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::addNodeSet ( const std::pair< Idx, Idx > &  vars)

add a new target node conditioned by another node to be counted

Parameters
vars  contains the index of the target variable (first) in the filtered rows produced by the database cell filters, and the index of the conditioning variable (second).
Returns
the index of the produced counting vector: the user should use class IndependenceTest to compute in one pass several independence tests. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the counts in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the observed countings of (vars.second, vars.first) [in this order] and vars.second respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var1,
Idx  var2,
const std::vector< Idx > &  conditioning_ids 
)

add a target conditioned by other variables to be counted

Parameters
var1  represents the index of the target variable in the filtered rows produced by the database cell filters
var2  represents the index of the last conditioning variable in the filtered rows produced by the database cell filters
conditioning_ids  the indices of the variables of the conditioning set in the filtered rows (minus var2, which is subsequently appended to it).
Returns
the index of the produced counting vector: the user should use class IndependenceTest to compute in one pass several independence tests. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the counts in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the countings of (conditioning_ids, var2, var1) [in this order] and (conditioning_ids, var2) [in this order] respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::Counter< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var)
inherited

add a new single variable to be counted

Parameters
var  represents the index of the variable in the filtered rows produced by the database cell filters whose observations shall be counted
Returns
the index of the produced counting vector: the user should use class Counter to compute in one pass several scores or independence tests. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the observed countings of "var" in this vector. The user shall pass this index as argument to methods _getAllCounts to get the corresponding counting vector.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var1,
Idx  var2,
std::vector< Idx > &&  conditioning_ids 
)

add a target conditioned by other variables to be counted

Parameters
var1  represents the index of the target variable in the filtered rows produced by the database cell filters
var2  represents the index of the last conditioning variable in the filtered rows produced by the database cell filters
conditioning_ids  the indices of the variables of the conditioning set in the filtered rows (minus var2, which is subsequently appended to it).
Returns
the index of the produced counting vector: the user should use class IndependenceTest to compute in one pass several independence tests. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the counts in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the countings of (conditioning_ids, var2, var1) [in this order] and (conditioning_ids, var2) [in this order] respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::addNodeSet ( const std::pair< Idx, Idx > &  vars,
const std::vector< Idx > &  conditioning_ids 
)

add a target conditioned by other variables to be counted

Parameters
vars  represents the index of the target variable (first) in the filtered rows produced by the database cell filters, and the index of the last conditioning variable (second)
conditioning_ids  the indices of the variables of the conditioning set in the filtered rows (minus vars.second, which is appended to it)
Returns
the index of the produced counting vector: the user should use class IndependenceTest to compute in one pass several independence tests. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the counts in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the observed countings of (conditioning_ids, vars.second, vars.first) [in this order] and (conditioning_ids, vars.second) [in this order] respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::addNodeSet ( const std::pair< Idx, Idx > &  vars,
std::vector< Idx > &&  conditioning_ids 
)

add a target conditioned by other variables to be counted

Parameters
vars  represents the index of the target variable (first) in the filtered rows produced by the database cell filters, and the index of the last conditioning variable (second)
conditioning_ids  the indices of the variables of the conditioning set in the filtered rows (minus vars.second, which is appended to it)
Returns
the index of the produced counting vector: the user should use class IndependenceTest to compute in one pass several independence tests. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the counts in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the observed countings of (conditioning_ids, vars.second, vars.first) [in this order] and (conditioning_ids, vars.second) [in this order] respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::Counter< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var,
const std::vector< Idx > &  conditioning_ids 
)
inherited

add a new target variable plus some conditioning vars

Parameters
var  represents the index of the target variable in the filtered rows produced by the database cell filters
conditioning_ids  the indices of the variables of the conditioning set in the filtered rows
Returns
the index of the produced counting vector: the user should use class Counter to compute in one pass several scores or independence tests. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the countings of (var | conditioning_ids) in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the counting vectors of (conditioning_ids, var) [in this order] and conditioning_ids respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::Counter< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var,
std::vector< Idx > &&  conditioning_ids 
)
inherited

add a new target variable plus some conditioning vars

Parameters
var  represents the index of the target variable in the filtered rows produced by the database cell filters
conditioning_ids  the indices of the variables of the conditioning set in the filtered rows
Returns
the index of the produced counting vector: the user should use class Counter to compute in one pass several scores or independence tests. These and their corresponding countings in the database are stored into a vector and the value returned by method addNodeSet is the index of the countings of (var | conditioning_ids) in this vector. The user shall pass this index as argument to methods _getAllCounts and _getConditioningCounts to get the counting vectors of (conditioning_ids, var) [in this order] and conditioning_ids respectively.
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::clear ( )

clears all the data structures from memory

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::clearCache ( )

clears the current cache (clears nodesets as well)

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< Size >& gum::learning::Counter< IdSetAlloc, CountAlloc >::modalities ( ) const
noexcept inherited

returns the modalities of the variables

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
IndependenceTest& gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::operator= ( const IndependenceTest< IdSetAlloc, CountAlloc > &  )
private delete

prevent copy assignment

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
virtual double gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::score ( Idx  nodeset_index)
pure virtual

returns the score corresponding to a given nodeset

Scores are computed by counting formulas (for instance, for Chi2, this formula corresponds to: sum_X sum_Y sum_Z ( #XYZ - (#XZ * #YZ) / #Z )^2 / (( #XZ * #YZ) / #Z ), where #XYZ, #XZ, #YZ, #Z correspond to the number of occurrences of (X,Y,Z), (X,Z), (Y,Z) and Z respectively in the database); then the critical value alpha of the test is computed. Finally, method score returns ( #sum - alpha ) / alpha, where #sum corresponds to the summations mentioned above. Therefore, any positive result should reflect a dependence whereas negative results should reflect independence.
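
The Chi2 formula and the final normalisation above can be illustrated with a minimal sketch. It is not the library's implementation: chi2Statistic and normalizedScore are hypothetical helper names, the counts[z][x][y] layout is an assumption made for readability, and the critical value alpha is passed in by the caller instead of being computed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// counts[z][x][y] holds #XYZ (hypothetical layout used by this sketch only).
using CountTable = std::vector<std::vector<std::vector<double>>>;

// Sum over X, Y, Z of (#XYZ - #XZ * #YZ / #Z)^2 / (#XZ * #YZ / #Z).
double chi2Statistic(const CountTable& counts) {
  double sum = 0.0;
  for (const auto& slice : counts) {  // one slice per value of Z
    const std::size_t nx = slice.size();
    const std::size_t ny = nx == 0 ? 0 : slice[0].size();
    std::vector<double> n_xz(nx, 0.0);  // #XZ for this z
    std::vector<double> n_yz(ny, 0.0);  // #YZ for this z
    double n_z = 0.0;                   // #Z for this z
    for (std::size_t x = 0; x < nx; ++x) {
      for (std::size_t y = 0; y < ny; ++y) {
        n_xz[x] += slice[x][y];
        n_yz[y] += slice[x][y];
        n_z += slice[x][y];
      }
    }
    if (n_z == 0.0) continue;  // value of Z never observed: no contribution
    for (std::size_t x = 0; x < nx; ++x) {
      for (std::size_t y = 0; y < ny; ++y) {
        const double expected = n_xz[x] * n_yz[y] / n_z;
        if (expected > 0.0) {  // skip cells whose expected count is zero
          const double diff = slice[x][y] - expected;
          sum += diff * diff / expected;
        }
      }
    }
  }
  return sum;
}

// (#sum - alpha) / alpha: positive suggests dependence, negative independence.
double normalizedScore(double statistic, double alpha) {
  assert(alpha > 0.0);
  return (statistic - alpha) / alpha;
}
```

For a single value of Z and the 2x2 table {{10, 0}, {0, 10}}, every expected count is 5, so the statistic is 4 * 25 / 5 = 20; with a critical value of about 3.84 (one degree of freedom, 5% level) the normalised score is positive, which reads as a dependence.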

Implemented in gum::learning::IndepTestChi2< IdSetAlloc, CountAlloc >, gum::learning::IndepTestG2< IdSetAlloc, CountAlloc >, and gum::learning::KNML< IdSetAlloc, CountAlloc >.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::Counter< IdSetAlloc, CountAlloc >::setMaxNbThreads ( Size  nb)
noexcept inherited

sets the maximum number of threads used to perform countings

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::Counter< IdSetAlloc, CountAlloc >::setRange ( Size  min_range,
Size  max_range 
)
inherited

sets the range of records taken into account by the counter

Parameters
min_range  the number of the first record to be taken into account during learning
max_range  the number of the record after the last one taken into account
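
The two parameters describe a half-open interval: min_range is included and max_range is excluded. A minimal sketch of that convention, using a hypothetical countInRange helper over a plain vector of records rather than an aGrUM database:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many records in [min_range, max_range) are equal to value.
std::size_t countInRange(const std::vector<int>& records,
                         std::size_t min_range,
                         std::size_t max_range,
                         int value) {
  assert(min_range <= max_range && max_range <= records.size());
  std::size_t matches = 0;
  // The record at max_range itself is never read.
  for (std::size_t i = min_range; i < max_range; ++i) {
    if (records[i] == value) ++matches;
  }
  return matches;
}
```

With records {1, 0, 1, 1, 0}, the range (1, 4) covers the records at positions 1, 2 and 3 only, so exactly max_range - min_range records are read.
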
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::useCache ( bool  on_off)
noexcept

turn on/off the use of a cache of the previously computed score

Member Data Documentation

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const double gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::_1log2 {M_LOG2E}
protected

1 / log(2)
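
The constant equals log2(e), that is 1 / ln(2), so multiplying a natural logarithm by it presumably serves to convert the result to base 2. A minimal sketch, with the hypothetical names kInvLog2 and toBase2, and with the value written out so that the example does not depend on M_LOG2E being available:

```cpp
#include <cassert>
#include <cmath>

// log2(e) == 1 / ln(2); spelled out because M_LOG2E is a POSIX extension
// and not guaranteed by standard C++.
constexpr double kInvLog2 = 1.4426950408889634;

// Converts a natural logarithm into a base-2 logarithm.
double toBase2(double natural_log_value) {
  return natural_log_value * kInvLog2;
}
```

For example, toBase2(std::log(8.0)) is 3 up to rounding error.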

Definition at line 232 of file independenceTest.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Cache4IndepTest gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::__cache
private

a cache for the previously computed scores

Definition at line 276 of file independenceTest.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
std::vector< double > gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::__cached_score
private

the vector of scores for the current nodesets

Definition at line 285 of file independenceTest.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< Idx > gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::__empty_conditioning_set
private

an empty conditioning set

Definition at line 288 of file independenceTest.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
std::vector< bool > gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::__is_cached_score
private

indicates whether the ith nodeset's score is in the cache or not

Definition at line 282 of file independenceTest.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
bool gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::__use_cache {true}
private

a Boolean indicating whether we wish to use the cache

Definition at line 279 of file independenceTest.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
std::vector< std::pair< std::vector< Idx, IdSetAlloc >, Idx >* > gum::learning::Counter< IdSetAlloc, CountAlloc >::_conditioning_nodesets
protected inherited

the conditioning id sets to count and their indices in the record counter

Definition at line 361 of file counter.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
bool gum::learning::Counter< IdSetAlloc, CountAlloc >::_counts_computed {false}
protected inherited

indicates whether we have already computed the countings of the nodesets

Definition at line 349 of file counter.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< Size >& gum::learning::Counter< IdSetAlloc, CountAlloc >::_modalities
protected inherited

the modalities of the variables

Definition at line 345 of file counter.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
RecordCounter< IdSetAlloc, CountAlloc > gum::learning::Counter< IdSetAlloc, CountAlloc >::_record_counter
protected inherited

the recordCounter that will parse the database

Definition at line 352 of file counter.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
std::vector< std::pair< std::vector< Idx, IdSetAlloc >, Idx >* > gum::learning::Counter< IdSetAlloc, CountAlloc >::_target_nodesets
protected inherited

the target id sets to count and their indices in the record counter

Definition at line 356 of file counter.h.


The documentation for this class was generated from the following file: