aGrUM
0.20.3
a C++ library for (probabilistic) graphical models
The base class for all DBRow generators.
#include <agrum/tools/database/DBRowGenerator.h>
Public Member Functions
Constructors / Destructors
DBRowGenerator (const std::vector< DBTranslatedValueType, ALLOC< DBTranslatedValueType > > column_types, const DBRowGeneratorGoal goal, const allocator_type &alloc=allocator_type())
default constructor
DBRowGenerator (const DBRowGenerator< ALLOC > &from)
copy constructor
DBRowGenerator (const DBRowGenerator< ALLOC > &from, const allocator_type &alloc)
copy constructor with a given allocator
DBRowGenerator (DBRowGenerator< ALLOC > &&from)
move constructor
DBRowGenerator (DBRowGenerator< ALLOC > &&from, const allocator_type &alloc)
move constructor with a given allocator
virtual DBRowGenerator< ALLOC > * clone () const =0
virtual copy constructor
virtual DBRowGenerator< ALLOC > * clone (const allocator_type &alloc) const =0
virtual copy constructor with a given allocator
virtual ~DBRowGenerator ()
destructor
Accessors / Modifiers
bool hasRows ()
returns true if there are still rows that can be output by the DBRowGenerator
bool setInputRow (const DBRow< DBTranslatedValue, ALLOC > &row)
sets the input row from which the generator will create its output rows
virtual const DBRow< DBTranslatedValue, ALLOC > & generate ()=0
generate new rows from the input row
void decreaseRemainingRows ()
decrease the number of remaining output rows
virtual void reset ()
resets the generator: there are then no more output rows to generate
virtual void setColumnsOfInterest (const std::vector< std::size_t, ALLOC< std::size_t > > &cols_of_interest)
sets the columns of interest: the output DBRow only needs to contain correct values for these columns
virtual void setColumnsOfInterest (std::vector< std::size_t, ALLOC< std::size_t > > &&cols_of_interest)
sets the columns of interest: the output DBRow only needs to contain correct values for these columns
const std::vector< std::size_t, ALLOC< std::size_t > > & columnsOfInterest () const
returns the current set of columns of interest
allocator_type getAllocator () const
returns the allocator used
DBRowGeneratorGoal goal () const
returns the goal of the DBRowGenerator
Public Types
using allocator_type = ALLOC< DBTranslatedValue >
type for the allocators passed in arguments of methods
Protected Attributes
std::size_t nb_remaining_output_rows_ {std::size_t(0)}
the number of output rows still to retrieve through the generate method
std::vector< DBTranslatedValueType, ALLOC< DBTranslatedValueType > > column_types_
the types of the columns in the DatabaseTable
std::vector< std::size_t, ALLOC< std::size_t > > columns_of_interest_
the set of columns of interest
DBRowGeneratorGoal goal_ {DBRowGeneratorGoal::OTHER_THINGS_THAN_REMOVE_MISSING_VALUES}
the goal of the DBRowGenerator (just remove missing values or not)
Protected Member Functions
DBRowGenerator< ALLOC > & operator= (const DBRowGenerator< ALLOC > &)
copy operator
DBRowGenerator< ALLOC > & operator= (DBRowGenerator< ALLOC > &&)
move operator
virtual std::size_t computeRows_ (const DBRow< DBTranslatedValue, ALLOC > &row)=0
the method that computes the set of DBRow instances to output after method setInputRow has been called
The base class for all DBRow generators.
A DBRowGenerator instance takes as input a DBRow containing DBTranslatedValue instances, provided either directly by a DatabaseTable or by another DBRowGenerator. From it, it produces zero, one or several DBRows of DBTranslatedValue. This is mainly useful for dealing with missing values: during learning, when a DBRow contains some missing values, what should we do with it? Should we discard it? Should we use an EM algorithm to produce several DBRows weighted by their probability of occurrence? Should we use a K-means algorithm to produce only one DBRow of highest probability of occurrence? Using the appropriate DBRowGenerator, you can apply any of these rules when your learning algorithm parses the DatabaseTable. You just need to indicate which DBRowGenerator to use; no line of code needs to be changed in your high-level learning algorithm.
As an example of how a DBRowGenerator works, an "Identity" DBRowGenerator takes as input a DBRow and returns it without any further processing, so it "produces" only one output DBRow. An EM DBRowGenerator takes as input a DBRow in which some cells may be missing. In this case, it produces all the possible combinations of values that these missing cells may take and assigns to each combination a weight proportional to its probability of occurrence according to a given model. As such, it often produces several output DBRows.
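To make the EM-style expansion more concrete, here is a minimal, self-contained sketch. It does not use the aGrUM API: `Cell`, `WeightedRow` and `expandMissing` are hypothetical names, and the "model" is assumed to be a set of independent per-column marginals, which is a deliberate simplification of whatever model a real EM generator would rely on.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical simplified types, NOT the aGrUM API.
using Cell = std::optional<std::size_t>;  // empty optional = missing value

struct WeightedRow {
  std::vector<std::size_t> values;  // one fully observed combination
  double weight;                    // weight assigned to this combination
};

// marginals[c][v] is the assumed probability that column c takes value v.
// Assumption: columns are independent, so the weight of a combination is the
// product of the marginals of the values filled in for the missing cells.
std::vector<WeightedRow> expandMissing(
    const std::vector<Cell>& row,
    const std::vector<std::vector<double>>& marginals) {
  assert(row.size() == marginals.size());
  std::vector<WeightedRow> out;
  out.push_back(WeightedRow{{}, 1.0});
  for (std::size_t c = 0; c < row.size(); ++c) {
    std::vector<WeightedRow> next;
    for (const WeightedRow& partial : out) {
      if (row[c].has_value()) {
        // observed cell: copy the value, the weight is unchanged
        WeightedRow r = partial;
        r.values.push_back(*row[c]);
        next.push_back(std::move(r));
      } else {
        // missing cell: branch on every value the column may take
        for (std::size_t v = 0; v < marginals[c].size(); ++v) {
          WeightedRow r = partial;
          r.values.push_back(v);
          r.weight *= marginals[c][v];
          next.push_back(std::move(r));
        }
      }
    }
    out = std::move(next);
  }
  return out;
}
```

A row with no missing cell yields a single output row of weight 1, which matches the behavior described for the "Identity" case.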
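The standard usage of a DBRowGenerator is the following: for each DBRow of the database, call setInputRow, then call generate repeatedly for as long as hasRows returns true.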
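The calling pattern can be sketched with stand-in types. The snippet below does not use aGrUM: `Row`, `IdentityGeneratorSketch` and `collectOutputRows` are hypothetical, and only the method names `setInputRow`, `hasRows` and `generate` mirror the interface listed on this page. The meaning given to the Boolean returned by `setInputRow` is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<int>;  // hypothetical stand-in for a DBRow

// Stand-in for an "Identity" generator: one output row per input row.
class IdentityGeneratorSketch {
 public:
  // Stores the input row. Assumption: the returned Boolean tells whether at
  // least one output row can be generated from this input.
  bool setInputRow(const Row& row) {
    input_row_ = &row;
    nb_remaining_output_rows_ = 1;
    return hasRows();
  }

  bool hasRows() const { return nb_remaining_output_rows_ != 0; }

  // Returns the next output row and consumes it.
  const Row& generate() {
    assert(hasRows());
    --nb_remaining_output_rows_;
    return *input_row_;
  }

 private:
  const Row* input_row_ = nullptr;
  std::size_t nb_remaining_output_rows_ = 0;
};

// Parses a "database" and collects every row produced by the generator.
std::vector<Row> collectOutputRows(const std::vector<Row>& database) {
  IdentityGeneratorSketch generator;
  std::vector<Row> outputs;
  for (const Row& row : database) {
    generator.setInputRow(row);
    while (generator.hasRows()) {
      outputs.push_back(generator.generate());
    }
  }
  return outputs;
}
```

Swapping the generator for another one changes what ends up in `outputs` while the loop itself stays the same.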
All DBRowGenerator classes should derive from this class. It takes care of the interaction with the RecordCounter / Score classes. A user who wishes to create a new DBRowGenerator, say one that outputs the input row k times, just has to define a derived class that implements the pure virtual methods clone, generate and computeRows_. Not all the constructors/destructors are required in such a class; the important part is the one corresponding to the "Accessors / Modifiers" section.
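As an illustration of such a "k times" generator, here is a hedged sketch built on a simplified stand-in for the base class rather than on aGrUM itself. `GeneratorBaseSketch`, `DuplicateGeneratorSketch` and `drain` are hypothetical names; the allocator template parameter, the goal, the column types and the clone methods are omitted, and the way `setInputRow` stores the result of `computeRows_` is an assumption consistent with the member descriptions on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<int>;  // hypothetical stand-in for DBRow< DBTranslatedValue >

// Simplified stand-in for the base class: it only manages the counter of
// remaining output rows.
class GeneratorBaseSketch {
 public:
  virtual ~GeneratorBaseSketch() = default;

  bool hasRows() const { return nb_remaining_output_rows_ != 0; }

  // Assumption: the number of output rows is whatever computeRows_ returns.
  bool setInputRow(const Row& row) {
    nb_remaining_output_rows_ = computeRows_(row);
    return hasRows();
  }

  virtual const Row& generate() = 0;

  void decreaseRemainingRows() {
    assert(nb_remaining_output_rows_ > 0);
    --nb_remaining_output_rows_;
  }

 protected:
  virtual std::size_t computeRows_(const Row& row) = 0;
  std::size_t nb_remaining_output_rows_ = 0;
};

// Generator that outputs each input row k times.
class DuplicateGeneratorSketch : public GeneratorBaseSketch {
 public:
  explicit DuplicateGeneratorSketch(std::size_t k) : nb_duplicates_(k) {}

  const Row& generate() override {
    decreaseRemainingRows();
    return *input_row_;
  }

 protected:
  std::size_t computeRows_(const Row& row) override {
    input_row_ = &row;      // remember the row to output
    return nb_duplicates_;  // it will be output k times
  }

 private:
  const Row* input_row_ = nullptr;
  std::size_t nb_duplicates_;
};

// Feeds one input row to a generator and collects all its output rows.
std::vector<Row> drain(GeneratorBaseSketch& gen, const Row& input) {
  std::vector<Row> out;
  gen.setInputRow(input);
  while (gen.hasRows()) out.push_back(gen.generate());
  return out;
}
```

With `k = 0` the generator behaves like a filter that discards every row, since `setInputRow` then reports that nothing can be generated.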
Definition at line 235 of file DBRowGenerator.h.
using gum::learning::DBRowGenerator< ALLOC >::allocator_type = ALLOC< DBTranslatedValue >
type for the allocators passed in arguments of methods
Definition at line 238 of file DBRowGenerator.h.
gum::learning::DBRowGenerator< ALLOC >::DBRowGenerator (const std::vector< DBTranslatedValueType, ALLOC< DBTranslatedValueType > > column_types, const DBRowGeneratorGoal goal, const allocator_type &alloc=allocator_type())
default constructor
column_types | indicates for each column whether this is a continuous or a discrete one |
goal | indicates whether the generator just removes missing values or not |
alloc | the allocator used by all the methods |
gum::learning::DBRowGenerator< ALLOC >::DBRowGenerator (const DBRowGenerator< ALLOC > &from)
copy constructor
gum::learning::DBRowGenerator< ALLOC >::DBRowGenerator (const DBRowGenerator< ALLOC > &from, const allocator_type &alloc)
copy constructor with a given allocator
gum::learning::DBRowGenerator< ALLOC >::DBRowGenerator (DBRowGenerator< ALLOC > &&from)
move constructor
gum::learning::DBRowGenerator< ALLOC >::DBRowGenerator (DBRowGenerator< ALLOC > &&from, const allocator_type &alloc)
move constructor with a given allocator
gum::learning::DBRowGenerator< ALLOC >::~DBRowGenerator ()
virtual
destructor
DBRowGenerator< ALLOC > * gum::learning::DBRowGenerator< ALLOC >::clone () const
pure virtual
virtual copy constructor
Implemented in gum::learning::DBRowGeneratorEM< GUM_SCALAR, ALLOC >, gum::learning::DBRowGenerator4CompleteRows< ALLOC >, and gum::learning::DBRowGeneratorIdentity< ALLOC >.
DBRowGenerator< ALLOC > * gum::learning::DBRowGenerator< ALLOC >::clone (const allocator_type &alloc) const
pure virtual
virtual copy constructor with a given allocator
Implemented in gum::learning::DBRowGeneratorEM< GUM_SCALAR, ALLOC >, gum::learning::DBRowGenerator4CompleteRows< ALLOC >, and gum::learning::DBRowGeneratorIdentity< ALLOC >.
const std::vector< std::size_t, ALLOC< std::size_t > >& gum::learning::DBRowGenerator< ALLOC >::columnsOfInterest () const
returns the current set of columns of interest
std::size_t gum::learning::DBRowGenerator< ALLOC >::computeRows_ (const DBRow< DBTranslatedValue, ALLOC > &row)
protected, pure virtual
the method that computes the set of DBRow instances to output after method setInputRow has been called
Implemented in gum::learning::DBRowGeneratorEM< GUM_SCALAR, ALLOC >, gum::learning::DBRowGenerator4CompleteRows< ALLOC >, and gum::learning::DBRowGeneratorIdentity< ALLOC >.
void gum::learning::DBRowGenerator< ALLOC >::decreaseRemainingRows ()
decrease the number of remaining output rows
When method setInputRow is called, the DBRowGenerator knows how many output rows it will be able to generate. Each time method decreaseRemainingRows is called, this number is decremented. When it reaches 0, there remains no new output row to generate.
const DBRow< DBTranslatedValue, ALLOC > & gum::learning::DBRowGenerator< ALLOC >::generate ()
pure virtual
generate new rows from the input row
Implemented in gum::learning::DBRowGenerator4CompleteRows< ALLOC >, gum::learning::DBRowGeneratorEM< GUM_SCALAR, ALLOC >, and gum::learning::DBRowGeneratorIdentity< ALLOC >.
allocator_type gum::learning::DBRowGenerator< ALLOC >::getAllocator () const
returns the allocator used
DBRowGeneratorGoal gum::learning::DBRowGenerator< ALLOC >::goal () const
returns the goal of the DBRowGenerator
bool gum::learning::DBRowGenerator< ALLOC >::hasRows ()
returns true if there are still rows that can be output by the DBRowGenerator
DBRowGenerator< ALLOC > & gum::learning::DBRowGenerator< ALLOC >::operator= (const DBRowGenerator< ALLOC > &)
protected
copy operator
DBRowGenerator< ALLOC > & gum::learning::DBRowGenerator< ALLOC >::operator= (DBRowGenerator< ALLOC > &&)
protected
move operator
void gum::learning::DBRowGenerator< ALLOC >::reset ()
virtual
resets the generator: there are then no more output rows to generate
void gum::learning::DBRowGenerator< ALLOC >::setColumnsOfInterest (const std::vector< std::size_t, ALLOC< std::size_t > > &cols_of_interest)
virtual
sets the columns of interest: the output DBRow only needs to contain correct values for these columns
This method is useful, e.g., for EM-like algorithms that need to know which unobserved variables/values need to be filled. In this case, the DBRowGenerator still outputs DBRows with the same columns as the DatabaseTable, but only the columns passed in argument to method setColumnsOfInterest are meaningful. For instance, if a DatabaseTable contains 10 columns and method setColumnsOfInterest() is applied with vector<> { 0, 3, 4 }, then the DBRowGenerator will output DBRows with 10 columns, in which only columns 0, 3 and 4 are guaranteed to have correct values (columns are always indexed starting from 0).
void gum::learning::DBRowGenerator< ALLOC >::setColumnsOfInterest (std::vector< std::size_t, ALLOC< std::size_t > > &&cols_of_interest)
virtual
sets the columns of interest: the output DBRow only needs to contain correct values for these columns
This method is useful, e.g., for EM-like algorithms that need to know which unobserved variables/values need to be filled. In this case, the DBRowGenerator still outputs DBRows with the same columns as the DatabaseTable, but only the columns passed in argument to method setColumnsOfInterest are meaningful. For instance, if a DatabaseTable contains 10 columns and method setColumnsOfInterest() is applied with vector<> { 0, 3, 4 }, then the DBRowGenerator will output DBRows with 10 columns, in which only columns 0, 3 and 4 are guaranteed to have correct values (columns are always indexed starting from 0).
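One way a generator might exploit the columns of interest is to inspect only those columns when deciding whether an input row is usable. The helper below is a hypothetical sketch, not an aGrUM function: `Cell`, `Row` and `isCompleteOnColumns` are names introduced for illustration, and missing values are modeled with an empty `std::optional`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

using Cell = std::optional<int>;  // hypothetical: empty optional = missing value
using Row = std::vector<Cell>;

// Returns true if every column of interest holds an observed value.
// Columns outside cols_of_interest are ignored, so they may be missing.
bool isCompleteOnColumns(const Row& row,
                         const std::vector<std::size_t>& cols_of_interest) {
  for (std::size_t col : cols_of_interest) {
    assert(col < row.size());
    if (!row[col].has_value()) return false;
  }
  return true;
}
```

With the 10-column example above and columns { 0, 3, 4 }, a row whose only missing value is in column 1 is still considered complete, whereas a missing value in column 3 makes it incomplete.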
bool gum::learning::DBRowGenerator< ALLOC >::setInputRow (const DBRow< DBTranslatedValue, ALLOC > &row)
sets the input row from which the generator will create its output rows
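std::vector< DBTranslatedValueType, ALLOC< DBTranslatedValueType > > gum::learning::DBRowGenerator< ALLOC >::column_types_
protected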
the types of the columns in the DatabaseTable
This is useful to determine whether we need to use the .discr_val field or the .cont_val field in DBTranslatedValue instances.
Definition at line 361 of file DBRowGenerator.h.
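The role of the column types can be illustrated with a small stand-in. The field names `discr_val` and `cont_val` come from the description above; everything else in this sketch (the union layout, the `ValueTypeSketch` enumerators, the `readCell` helper) is an assumption made for illustration and is not taken from the aGrUM headers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for DBTranslatedValueType and DBTranslatedValue.
enum class ValueTypeSketch { DISCRETE, CONTINUOUS };

union TranslatedValueSketch {
  std::size_t discr_val;  // used when the column is discrete
  float cont_val;         // used when the column is continuous
};

// Reads cell `col` of a row, using the column type to pick the union field
// that is actually meaningful for that column.
double readCell(const std::vector<TranslatedValueSketch>& row,
                const std::vector<ValueTypeSketch>& column_types,
                std::size_t col) {
  assert(col < row.size() && col < column_types.size());
  return column_types[col] == ValueTypeSketch::DISCRETE
             ? static_cast<double>(row[col].discr_val)
             : static_cast<double>(row[col].cont_val);
}
```

Since a union does not record which field was last written, the vector of column types is the only way for the reader to know which field to access.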
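std::vector< std::size_t, ALLOC< std::size_t > > gum::learning::DBRowGenerator< ALLOC >::columns_of_interest_
protected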
the set of columns of interest
Definition at line 364 of file DBRowGenerator.h.
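DBRowGeneratorGoal gum::learning::DBRowGenerator< ALLOC >::goal_ {DBRowGeneratorGoal::OTHER_THINGS_THAN_REMOVE_MISSING_VALUES}
protected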
the goal of the DBRowGenerator (just remove missing values or not)
Definition at line 367 of file DBRowGenerator.h.
std::size_t gum::learning::DBRowGenerator< ALLOC >::nb_remaining_output_rows_ {std::size_t(0)}
protected
the number of output rows still to retrieve through the generate method
Definition at line 356 of file DBRowGenerator.h.