aGrUM  0.13.2
gum::learning::IndepTestG2< IdSetAlloc, CountAlloc > Class Template Reference

The class for computing G2 independence test scores.

To speed up computations, consider computing all the independence tests you need in one pass.

#include <agrum/learning/scores_and_tests/indepTestG2.h>


Public Member Functions

Constructors / Destructors
template<typename RowFilter >
 IndepTestG2 (const RowFilter &filter, const std::vector< Size > &var_modalities)
 default constructor

 ~IndepTestG2 ()
 destructor

Accessors / Modifiers
double score (Idx nodeset_index)
 returns the score corresponding to a given nodeset

Accessors / Modifiers
Idx addNodeSet (Idx var1, Idx var2)
 add a new target node conditioned by another node to be counted

Idx addNodeSet (const std::pair< Idx, Idx > &vars)
 add a new target node conditioned by another node to be counted

Idx addNodeSet (Idx var1, Idx var2, const std::vector< Idx > &conditioning_ids)
 add a target conditioned by other variables to be counted

Idx addNodeSet (Idx var1, Idx var2, std::vector< Idx > &&conditioning_ids)
 add a target conditioned by other variables to be counted

Idx addNodeSet (const std::pair< Idx, Idx > &vars, const std::vector< Idx > &conditioning_ids)
 add a target conditioned by other variables to be counted

Idx addNodeSet (const std::pair< Idx, Idx > &vars, std::vector< Idx > &&conditioning_ids)
 add a target conditioned by other variables to be counted

void clear ()
 clears all the data structures from memory

void clearCache ()
 clears the current cache (clears the nodesets as well)

void useCache (bool on_off) noexcept
 turn on/off the use of a cache of the previously computed scores
 

Protected Attributes

const double _1log2 {M_LOG2E}
 1 / log(2)


Protected Member Functions

bool _isInCache (Idx nodeset_index) const noexcept
 indicates whether a score belongs to the cache

void _insertIntoCache (Idx nodeset_index, double score)
 inserts a new score into the cache

double _cachedScore (Idx nodeset_index) const noexcept
 returns a cached score

bool _isUsingCache () const noexcept
 indicates whether we use the cache or not
 

Detailed Description

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
class gum::learning::IndepTestG2< IdSetAlloc, CountAlloc >

The class for computing G2 independence test scores.

To speed up computations, consider computing all the independence tests you need in one pass. To do so, call the appropriate addNodeSet method for each test; each call returns the index of the corresponding nodeset. Then pass that index to method score to retrieve the result of the test. See the IndependenceTest class for details.
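The calling pattern described above (register every test first, then read the scores) might look like the following sketch. It does not use aGrUM itself: StubIndepTest, TestRequest and scoreAll are hypothetical names introduced for illustration, Idx is a stand-in for the library's index type, and the stub returns a dummy value instead of a real G2 score. Only the two member calls, addNodeSet(var1, var2, conditioning_ids) and score(nodeset_index), mirror the signatures listed on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Idx = std::size_t;  // assumption: stand-in for the library's Idx type

// Hypothetical stand-in exposing only the two documented calls.
// It is NOT the aGrUM class: score() returns a dummy value so that the
// calling pattern can be compiled without the library.
struct StubIndepTest {
  std::vector<std::vector<Idx>> nodesets;

  Idx addNodeSet(Idx var1, Idx var2, const std::vector<Idx>& conditioning_ids) {
    std::vector<Idx> ids(conditioning_ids);
    ids.push_back(var2);
    ids.push_back(var1);
    nodesets.push_back(std::move(ids));
    return nodesets.size() - 1;  // index of the nodeset just registered
  }

  double score(Idx nodeset_index) {
    assert(nodeset_index < nodesets.size());
    // Dummy value: minus the number of variables in the nodeset.
    return -static_cast<double>(nodesets[nodeset_index].size());
  }
};

// One independence test to register (hypothetical helper type).
struct TestRequest {
  Idx var1;
  Idx var2;
  std::vector<Idx> conditioning_ids;
};

// Registers every request first, then retrieves all the scores.
template <typename Test>
std::vector<double> scoreAll(Test& test, const std::vector<TestRequest>& requests) {
  std::vector<Idx> indices;
  indices.reserve(requests.size());
  for (const auto& r : requests)
    indices.push_back(test.addNodeSet(r.var1, r.var2, r.conditioning_ids));

  std::vector<double> scores;
  scores.reserve(indices.size());
  for (Idx i : indices) scores.push_back(test.score(i));
  return scores;
}
```

Because scoreAll is a template, the same two-phase loop should work with any object offering these two calls, provided the object is constructed as described in the constructor documentation.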

Definition at line 64 of file indepTestG2.h.

Constructor & Destructor Documentation

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
template<typename RowFilter >
gum::learning::IndepTestG2< IdSetAlloc, CountAlloc >::IndepTestG2 ( const RowFilter &  filter,
const std::vector< Size > &  var_modalities 
)

default constructor

Parameters
filter: the row filter that will be used to read the database
var_modalities: the domain sizes of the variables in the database

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
gum::learning::IndepTestG2< IdSetAlloc, CountAlloc >::~IndepTestG2 ( )

destructor

Member Function Documentation

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
double gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::_cachedScore ( Idx  nodeset_index) const
protected noexcept inherited

returns a cached score

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::_insertIntoCache ( Idx  nodeset_index,
double  score 
)
protected inherited

inserts a new score into the cache

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
bool gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::_isInCache ( Idx  nodeset_index) const
protected noexcept inherited

indicates whether a score belongs to the cache

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
bool gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::_isUsingCache ( ) const
protected noexcept inherited

indicates whether we use the cache or not

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var1,
Idx  var2 
)
inherited

add a new target node conditioned by another node to be counted

Parameters
var1: the index of the target variable in the filtered rows produced by the database cell filters
var2: the index of the conditioning variable in the filtered rows produced by the database cell filters
Returns
the index of the produced counting vector. Class IndependenceTest is meant to compute several independence tests in one pass: the tests and their corresponding counts in the database are stored in a vector, and the value returned by addNodeSet is the index of the counts in this vector. Pass this index to methods _getAllCounts and _getConditioningCounts to get the observed counts of (var2, var1) [in this order] and var2 respectively.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::addNodeSet ( const std::pair< Idx, Idx > &  vars)
inherited

add a new target node conditioned by another node to be counted

Parameters
vars: contains the index of the target variable (first) in the filtered rows produced by the database cell filters, and the index of the conditioning variable (second)
Returns
the index of the produced counting vector. Class IndependenceTest is meant to compute several independence tests in one pass: the tests and their corresponding counts in the database are stored in a vector, and the value returned by addNodeSet is the index of the counts in this vector. Pass this index to methods _getAllCounts and _getConditioningCounts to get the observed counts of (vars.second, vars.first) [in this order] and vars.second respectively.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var1,
Idx  var2,
const std::vector< Idx > &  conditioning_ids 
)
inherited

add a target conditioned by other variables to be counted

Parameters
var1: the index of the target variable in the filtered rows produced by the database cell filters
var2: the index of the last conditioning variable in the filtered rows produced by the database cell filters
conditioning_ids: the indices of the variables of the conditioning set in the filtered rows (minus var2, which is subsequently appended to it)
Returns
the index of the produced counting vector. Class IndependenceTest is meant to compute several independence tests in one pass: the tests and their corresponding counts in the database are stored in a vector, and the value returned by addNodeSet is the index of the counts in this vector. Pass this index to methods _getAllCounts and _getConditioningCounts to get the counts of (conditioning_ids, var2, var1) [in this order] and (conditioning_ids, var2) [in this order] respectively.

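The variable orderings mentioned in the return description can be made concrete with a small sketch. It only rebuilds the two orderings stated above, (conditioning_ids, var2, var1) and (conditioning_ids, var2); CountingOrder and countingOrder are hypothetical names, Idx is a stand-in type, and nothing here reflects how the library stores its counts internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Idx = std::size_t;  // assumption: stand-in for the library's Idx type

// Hypothetical helper type holding the two orderings described in the text.
struct CountingOrder {
  std::vector<Idx> all;           // (conditioning_ids, var2, var1)
  std::vector<Idx> conditioning;  // (conditioning_ids, var2)
};

// Builds both orderings: var2 is appended to the conditioning set,
// then var1 is appended to obtain the full variable sequence.
CountingOrder countingOrder(Idx var1, Idx var2,
                            const std::vector<Idx>& conditioning_ids) {
  assert(var1 != var2);  // assumption: a variable is not tested against itself
  CountingOrder order;
  order.conditioning = conditioning_ids;
  order.conditioning.push_back(var2);
  order.all = order.conditioning;
  order.all.push_back(var1);
  return order;
}
```

For instance, with var1 = 0, var2 = 1 and conditioning_ids = {2, 3}, the full ordering is (2, 3, 1, 0) and the conditioning ordering is (2, 3, 1).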
template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::addNodeSet ( Idx  var1,
Idx  var2,
std::vector< Idx > &&  conditioning_ids 
)
inherited

add a target conditioned by other variables to be counted

Parameters
var1: the index of the target variable in the filtered rows produced by the database cell filters
var2: the index of the last conditioning variable in the filtered rows produced by the database cell filters
conditioning_ids: the indices of the variables of the conditioning set in the filtered rows (minus var2, which is subsequently appended to it)
Returns
the index of the produced counting vector. Class IndependenceTest is meant to compute several independence tests in one pass: the tests and their corresponding counts in the database are stored in a vector, and the value returned by addNodeSet is the index of the counts in this vector. Pass this index to methods _getAllCounts and _getConditioningCounts to get the counts of (conditioning_ids, var2, var1) [in this order] and (conditioning_ids, var2) [in this order] respectively.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::addNodeSet ( const std::pair< Idx, Idx > &  vars,
const std::vector< Idx > &  conditioning_ids 
)
inherited

add a target conditioned by other variables to be counted

Parameters
vars: contains the index of the target variable (first) in the filtered rows produced by the database cell filters, and the index of the last conditioning variable (second)
conditioning_ids: the indices of the variables of the conditioning set in the filtered rows (minus vars.second, which is appended to it)
Returns
the index of the produced counting vector. Class IndependenceTest is meant to compute several independence tests in one pass: the tests and their corresponding counts in the database are stored in a vector, and the value returned by addNodeSet is the index of the counts in this vector. Pass this index to methods _getAllCounts and _getConditioningCounts to get the observed counts of (conditioning_ids, vars.second, vars.first) [in this order] and (conditioning_ids, vars.second) [in this order] respectively.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Idx gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::addNodeSet ( const std::pair< Idx, Idx > &  vars,
std::vector< Idx > &&  conditioning_ids 
)
inherited

add a target conditioned by other variables to be counted

Parameters
vars: contains the index of the target variable (first) in the filtered rows produced by the database cell filters, and the index of the last conditioning variable (second)
conditioning_ids: the indices of the variables of the conditioning set in the filtered rows (minus vars.second, which is appended to it)
Returns
the index of the produced counting vector. Class IndependenceTest is meant to compute several independence tests in one pass: the tests and their corresponding counts in the database are stored in a vector, and the value returned by addNodeSet is the index of the counts in this vector. Pass this index to methods _getAllCounts and _getConditioningCounts to get the observed counts of (conditioning_ids, vars.second, vars.first) [in this order] and (conditioning_ids, vars.second) [in this order] respectively.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::clear ( )
inherited

clears all the data structures from memory

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::clearCache ( )
inherited

clears the current cache (clears the nodesets as well)

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
double gum::learning::IndepTestG2< IdSetAlloc, CountAlloc >::score ( Idx  nodeset_index)
virtual

returns the score corresponding to a given nodeset

This method computes sum_X sum_Y sum_Z #XYZ * log ( ( #XYZ * #Z ) / ( #XZ * #YZ ) ), where #XYZ, #XZ, #YZ and #Z are the numbers of occurrences of (X,Y,Z), (X,Z), (Y,Z) and Z respectively in the database. It then computes the critical value alpha for the chi2 test and returns ( #sum - alpha ) / alpha, where #sum denotes the summation above. Therefore, a positive result should reflect a dependence, whereas a negative result should reflect an independence.
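The summation and the final normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not aGrUM's implementation: Counts3D, g2Sum and normalizedScore are hypothetical names, the counts are assumed to be stored as counts[x][y][z], the natural logarithm is assumed, cells with a zero count are assumed to contribute nothing to the sum, and the critical value alpha is taken as a parameter rather than derived from a chi2 distribution.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumption: counts[x][y][z] holds #XYZ for a rectangular table.
using Counts3D = std::vector<std::vector<std::vector<double>>>;

// Computes sum_X sum_Y sum_Z #XYZ * log((#XYZ * #Z) / (#XZ * #YZ)).
double g2Sum(const Counts3D& counts) {
  const std::size_t nx = counts.size();
  assert(nx > 0);
  const std::size_t ny = counts[0].size();
  assert(ny > 0);
  const std::size_t nz = counts[0][0].size();

  // Marginal counts #XZ, #YZ and #Z obtained by summing out variables.
  std::vector<std::vector<double>> xz(nx, std::vector<double>(nz, 0.0));
  std::vector<std::vector<double>> yz(ny, std::vector<double>(nz, 0.0));
  std::vector<double> z_only(nz, 0.0);
  for (std::size_t x = 0; x < nx; ++x)
    for (std::size_t y = 0; y < ny; ++y)
      for (std::size_t z = 0; z < nz; ++z) {
        const double c = counts[x][y][z];
        xz[x][z] += c;
        yz[y][z] += c;
        z_only[z] += c;
      }

  double sum = 0.0;
  for (std::size_t x = 0; x < nx; ++x)
    for (std::size_t y = 0; y < ny; ++y)
      for (std::size_t z = 0; z < nz; ++z) {
        const double c = counts[x][y][z];
        if (c > 0.0)  // assumption: empty cells are skipped
          sum += c * std::log((c * z_only[z]) / (xz[x][z] * yz[y][z]));
      }
  return sum;
}

// Returns (sum - alpha) / alpha for a critical value alpha supplied by the caller.
double normalizedScore(double sum, double alpha) {
  assert(alpha > 0.0);
  return (sum - alpha) / alpha;
}
```

With a table in which X and Y are independent given Z, every logarithm is log(1) and the sum is zero, so the normalized score is -1 for any positive alpha, which matches the reading that negative results reflect independence.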

Implements gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
void gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::useCache ( bool  on_off)
noexcept inherited

turn on/off the use of a cache of the previously computed scores

Member Data Documentation

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const double gum::learning::IndependenceTest< IdSetAlloc, CountAlloc >::_1log2 {M_LOG2E}
protected inherited

1 / log(2)

Definition at line 232 of file independenceTest.h.
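The constant above equals log2(e), that is 1 / ln(2), which is the factor that turns a natural logarithm into a base-2 logarithm. The sketch below shows that conversion; kOneOverLog2 and log2FromNatural are hypothetical names, and the constant is recomputed with std::log rather than taken from the M_LOG2E macro, which is not guaranteed to be defined on every platform. How the library actually uses _1log2 is not stated on this page.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the inherited _1log2 member:
// log2(e) = 1 / ln(2), the value that M_LOG2E denotes.
const double kOneOverLog2 = 1.0 / std::log(2.0);

// Base-2 logarithm obtained from the natural logarithm.
double log2FromNatural(double x) {
  assert(x > 0.0);
  return std::log(x) * kOneOverLog2;
}
```

Multiplying by a precomputed constant avoids a division for every logarithm, which is a plausible reason for storing the value once as a member.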

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
Chi2 gum::learning::IndepTestG2< IdSetAlloc, CountAlloc >::__chi2
private

a chi2 distribution for computing critical values

Definition at line 104 of file indepTestG2.h.

template<typename IdSetAlloc = std::allocator< Idx >, typename CountAlloc = std::allocator< double >>
const std::vector< Idx, IdSetAlloc > gum::learning::IndepTestG2< IdSetAlloc, CountAlloc >::__empty_set
private

an empty vector of ids

Definition at line 107 of file indepTestG2.h.


The documentation for this class was generated from the following file: