aGrUM
0.20.3
a C++ library for (probabilistic) graphical models
The base class for all the tabular database cell translators.
#include <agrum/tools/database/DBTranslator.h>
Public Member Functions | |
Constructors / Destructors | |
template<template< typename > class XALLOC> | |
DBTranslator (DBTranslatedValueType val_type, const std::vector< std::string, XALLOC< std::string > > &missing_symbols, const bool editable_dictionary=true, std::size_t max_dico_entries=std::numeric_limits< std::size_t >::max(), const allocator_type &alloc=allocator_type()) | |
default constructor | |
DBTranslator (DBTranslatedValueType val_type, const bool editable_dictionary=true, std::size_t max_dico_entries=std::numeric_limits< std::size_t >::max(), const allocator_type &alloc=allocator_type()) | |
default constructor without missing symbols | |
DBTranslator (const DBTranslator< ALLOC > &from) | |
copy constructor | |
DBTranslator (const DBTranslator< ALLOC > &from, const allocator_type &alloc) | |
copy constructor with a given allocator | |
DBTranslator (DBTranslator< ALLOC > &&from) | |
move constructor | |
DBTranslator (DBTranslator< ALLOC > &&from, const allocator_type &alloc) | |
move constructor with a given allocator | |
virtual DBTranslator< ALLOC > * | clone () const =0 |
virtual copy constructor | |
virtual DBTranslator< ALLOC > * | clone (const allocator_type &alloc) const =0 |
virtual copy constructor with a given allocator | |
virtual | ~DBTranslator () |
destructor | |
Operators | |
DBTranslatedValue | operator<< (const std::string &str) |
alias for method translate | |
std::string | operator>> (const DBTranslatedValue translated_val) |
alias for method translateBack | |
Accessors / Modifiers | |
virtual DBTranslatedValue | translate (const std::string &str)=0 |
returns the translation of a string | |
virtual std::string | translateBack (const DBTranslatedValue translated_val) const =0 |
returns the original value for a given translation | |
virtual std::size_t | domainSize () const =0 |
returns the domain size of a variable corresponding to the translations | |
virtual bool | hasEditableDictionary () const |
indicates whether the translator has an editable dictionary or not | |
virtual void | setEditableDictionaryMode (bool new_mode) |
sets/unsets the editable dictionary mode | |
virtual bool | needsReordering () const =0 |
indicates whether a reordering is needed to make the translations sorted | |
virtual HashTable< std::size_t, std::size_t, ALLOC< std::pair< std::size_t, std::size_t > > > | reorder ()=0 |
performs a reordering of the dictionary and returns a mapping from the old translated values to the new ones. | |
const Set< std::string, ALLOC< std::string > > & | missingSymbols () const |
returns the set of missing symbols taken into account by the translator | |
bool | isMissingSymbol (const std::string &str) const |
indicates whether a string corresponds to a missing symbol | |
virtual const Variable * | variable () const =0 |
returns the variable stored into the translator | |
void | setVariableName (const std::string &str) const |
sets the name of the variable stored into the translator | |
void | setVariableDescription (const std::string &str) const |
sets the description of the variable stored into the translator | |
DBTranslatedValueType | getValType () const |
returns the type of values handled by the translator | |
allocator_type | getAllocator () const |
returns the allocator used by the translator | |
bool | isMissingValue (const DBTranslatedValue &val) const |
indicates whether a translated value corresponds to a missing value | |
virtual DBTranslatedValue | missingValue () const =0 |
returns the translation of a missing value | |
Public Types | |
using | allocator_type = ALLOC< DBTranslatedValue > |
type for the allocators passed in arguments of methods | |
Protected Attributes | |
bool | is_dictionary_dynamic_ |
indicates whether the dictionary can be updated or not | |
std::size_t | max_dico_entries_ |
the maximum number of entries that the dictionary is allowed to contain | |
Set< std::string, ALLOC< std::string > > | missing_symbols_ |
the set of missing symbols | |
Bijection< std::size_t, std::string, ALLOC< std::pair< float, std::string > > > | back_dico_ |
the bijection relating back translated values and their original strings. | |
DBTranslatedValueType | val_type_ |
the type of the values translated by the translator | |
Protected Member Functions | |
Protected Operators | |
DBTranslator< ALLOC > & | operator= (const DBTranslator< ALLOC > &from) |
copy operator | |
DBTranslator< ALLOC > & | operator= (DBTranslator< ALLOC > &&from) |
move operator | |
The base class for all the tabular database cell translators.
Translators are used by DatabaseTable instances to transform datasets' strings into DBTranslatedValue instances. Strings are not adequate for fast learning: they need to be preprocessed into a type that can be analyzed quickly (the so-called DBTranslatedValue type). The DBTranslator class is the abstract base class for all the translators used in aGrUM.
Here is an example of how to use it, illustrated with the DBTranslator4ContinuousVariable class:
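The snippet below is only a self-contained sketch of that workflow. `ToyValue` and `ToyContinuousTranslator` are hypothetical stand-ins written for illustration, not the aGrUM classes. Using `std::invalid_argument` where aGrUM would raise a TypeError, and encoding missing values as the largest float, are assumptions as well.

```cpp
#include <cstddef>
#include <exception>
#include <limits>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a translated continuous value (not an aGrUM type).
struct ToyValue {
  float cont_val;
};

// Hypothetical stand-in for a translator of continuous variables.
class ToyContinuousTranslator {
 public:
  explicit ToyContinuousTranslator(std::set<std::string> missing)
      : missing_(std::move(missing)) {}

  // Missing symbols map to a sentinel value; any other string must be
  // entirely parsable as a float, otherwise an exception is thrown.
  ToyValue translate(const std::string& str) const {
    if (missing_.count(str) != 0) return missingValue();
    std::size_t pos = 0;
    float x = 0.0f;
    try {
      x = std::stof(str, &pos);
    } catch (const std::exception&) {
      throw std::invalid_argument("not a number: " + str);
    }
    if (pos != str.size()) throw std::invalid_argument("not a number: " + str);
    return ToyValue{x};
  }

  // Assumption: missing values are encoded as the largest float.
  ToyValue missingValue() const {
    return ToyValue{std::numeric_limits<float>::max()};
  }

  bool isMissingValue(const ToyValue& v) const {
    return v.cont_val == std::numeric_limits<float>::max();
  }

 private:
  std::set<std::string> missing_;
};
```

With missing symbols "N/A" and "???", translating "5" yields a value holding 5, translating "N/A" yields the missing value, and translating "422x" throws.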
Definition at line 116 of file DBTranslator.h.
using gum::learning::DBTranslator< ALLOC >::allocator_type = ALLOC< DBTranslatedValue > |
type for the allocators passed in arguments of methods
Definition at line 119 of file DBTranslator.h.
gum::learning::DBTranslator< ALLOC >::DBTranslator | ( | DBTranslatedValueType | val_type, |
const std::vector< std::string, XALLOC< std::string > > & | missing_symbols, |
const bool | editable_dictionary = true, |
std::size_t | max_dico_entries = std::numeric_limits< std::size_t >::max(), |
const allocator_type & | alloc = allocator_type() |
) |
default constructor
val_type | indicates whether the DBTranslator deals with discrete or continuous variables |
editable_dictionary | indicates whether the dictionary used for translations can be updated dynamically when observing new strings or whether it should remain constant. To see how this parameter is handled, see the child classes inheriting from DBTranslator |
missing_symbols | the set of symbols in the database representing missing values |
max_dico_entries | the maximum number of entries that the dictionary can contain. Trying to add a new entry once this limit is reached is considered an error and raises a SizeError exception |
alloc | The allocator used to allocate memory for all the fields of the DBTranslator |
gum::learning::DBTranslator< ALLOC >::DBTranslator | ( | DBTranslatedValueType | val_type, |
const bool | editable_dictionary = true, |
std::size_t | max_dico_entries = std::numeric_limits< std::size_t >::max(), |
const allocator_type & | alloc = allocator_type() |
) |
default constructor without missing symbols
val_type | indicates whether the DBTranslator deals with discrete or continuous variables |
editable_dictionary | indicates whether the dictionary used for translations can be updated dynamically when observing new strings or whether it should remain constant. To see how this parameter is handled, see the child classes inheriting from DBTranslator |
max_dico_entries | the maximum number of entries that the dictionary can contain. Trying to add a new entry once this limit is reached is considered an error and raises a SizeError exception |
alloc | The allocator used to allocate memory for all the fields of the DBTranslator |
gum::learning::DBTranslator< ALLOC >::DBTranslator | ( | const DBTranslator< ALLOC > & | from | ) |
copy constructor
gum::learning::DBTranslator< ALLOC >::DBTranslator | ( | const DBTranslator< ALLOC > & | from, |
const allocator_type & | alloc | ||
) |
copy constructor with a given allocator
gum::learning::DBTranslator< ALLOC >::DBTranslator | ( | DBTranslator< ALLOC > && | from | ) |
move constructor
gum::learning::DBTranslator< ALLOC >::DBTranslator | ( | DBTranslator< ALLOC > && | from, |
const allocator_type & | alloc | ||
) |
move constructor with a given allocator
virtual gum::learning::DBTranslator< ALLOC >::~DBTranslator | ( | ) |
virtual |
destructor
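virtual DBTranslator< ALLOC >* gum::learning::DBTranslator< ALLOC >::clone | ( | ) | const |
pure virtual |
virtual copy constructor
virtual DBTranslator< ALLOC >* gum::learning::DBTranslator< ALLOC >::clone | ( | const allocator_type & | alloc | ) | const |
pure virtual |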
virtual copy constructor with a given allocator
Implemented in gum::learning::DBTranslator4ContinuousVariable< ALLOC >, gum::learning::DBTranslator4RangeVariable< ALLOC >, gum::learning::DBTranslator4LabelizedVariable< ALLOC >, and gum::learning::DBTranslator4DiscretizedVariable< ALLOC >.
virtual std::size_t gum::learning::DBTranslator< ALLOC >::domainSize | ( | ) | const |
pure virtual |
returns the domain size of a variable corresponding to the translations
Assume that the translator has been fed with the observed values of a random variable. It has then produced a set of translated values, which define the domain of the variable. When the variable is discrete, values are assumed to span from 0 to n-1, in which case the domain size of the variable is n. When the variable is continuous, the domain size should be infinite, which is represented by returning std::numeric_limits<std::size_t>::max(). Note that missing values are encoded as std::numeric_limits<>::max() and are not taken into account in the domain sizes.
Implemented in gum::learning::DBTranslator4ContinuousVariable< ALLOC >, gum::learning::DBTranslator4RangeVariable< ALLOC >, gum::learning::DBTranslator4LabelizedVariable< ALLOC >, and gum::learning::DBTranslator4DiscretizedVariable< ALLOC >.
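As a rough illustration of the convention described above, the following sketch computes a domain size from observed strings. Both helper functions are hypothetical and not part of aGrUM. The first counts the distinct non-missing strings of a discrete variable, and the second returns the value that stands for an infinite domain.

```cpp
#include <cstddef>
#include <limits>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not an aGrUM function): domain size of a discrete
// variable, i.e., the number of distinct observed strings, with missing
// symbols excluded.
std::size_t toyDiscreteDomainSize(const std::vector<std::string>& observed,
                                  const std::set<std::string>& missing) {
  std::set<std::string> labels;
  for (const auto& str : observed) {
    if (missing.count(str) == 0) labels.insert(str);
  }
  return labels.size();
}

// Hypothetical helper: a continuous variable reports the largest
// std::size_t as a stand-in for an infinite domain.
std::size_t toyContinuousDomainSize() {
  return std::numeric_limits<std::size_t>::max();
}
```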
allocator_type gum::learning::DBTranslator< ALLOC >::getAllocator | ( | ) | const |
returns the allocator used by the translator
DBTranslatedValueType gum::learning::DBTranslator< ALLOC >::getValType | ( | ) | const |
returns the type of values handled by the translator
virtual bool gum::learning::DBTranslator< ALLOC >::hasEditableDictionary | ( | ) | const |
virtual |
indicates whether the translator has an editable dictionary or not
Reimplemented in gum::learning::DBTranslator4DiscretizedVariable< ALLOC >.
bool gum::learning::DBTranslator< ALLOC >::isMissingSymbol | ( | const std::string & | str | ) | const |
indicates whether a string corresponds to a missing symbol
bool gum::learning::DBTranslator< ALLOC >::isMissingValue | ( | const DBTranslatedValue & | val | ) | const |
indicates whether a translated value corresponds to a missing value
const Set< std::string, ALLOC< std::string > >& gum::learning::DBTranslator< ALLOC >::missingSymbols | ( | ) | const |
returns the set of missing symbols taken into account by the translator
virtual DBTranslatedValue gum::learning::DBTranslator< ALLOC >::missingValue | ( | ) | const |
pure virtual |
returns the translation of a missing value
Implemented in gum::learning::DBTranslator4LabelizedVariable< ALLOC >, gum::learning::DBTranslator4RangeVariable< ALLOC >, gum::learning::DBTranslator4ContinuousVariable< ALLOC >, and gum::learning::DBTranslator4DiscretizedVariable< ALLOC >.
virtual bool gum::learning::DBTranslator< ALLOC >::needsReordering | ( | ) | const |
pure virtual |
indicates whether a reordering is needed to make the translations sorted
If the strings represented by the translations are only numbers, translations are considered to be sorted if and only if they are sorted by increasing number. If the strings do not only represent numbers, then translations are considered to be sorted if and only if they are sorted lexicographically.
When constructing its dictionary dynamically, the translator may assign DBTranslatedValue values in the order in which the strings are encountered rather than in sorted order. For instance, a translator reading sequentially integer strings 4, 1, 3, may map 4 into DBTranslatedValue{std::size_t(0)}, 1 into DBTranslatedValue{std::size_t(1)} and 3 into DBTranslatedValue{std::size_t(2)}, resulting in random variables having domain {4,1,3}. The user may prefer having domain {1,3,4}, i.e., a domain specified with increasing values. This requires a reordering. Method needsReordering() returns a Boolean indicating whether such a reordering should be performed or whether the current order is OK.
Implemented in gum::learning::DBTranslator4ContinuousVariable< ALLOC >, gum::learning::DBTranslator4LabelizedVariable< ALLOC >, gum::learning::DBTranslator4RangeVariable< ALLOC >, and gum::learning::DBTranslator4DiscretizedVariable< ALLOC >.
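The sortedness rule described above can be sketched as follows. `parseNumber` and `toyNeedsReordering` are hypothetical helpers, not aGrUM functions. The sketch assumes that `labels[i]` is the string currently translated into index i.

```cpp
#include <algorithm>
#include <cstddef>
#include <exception>
#include <string>
#include <vector>

// Returns true and stores the value in `out` if `s` is entirely a number.
bool parseNumber(const std::string& s, double& out) {
  try {
    std::size_t pos = 0;
    out = std::stod(s, &pos);
    return pos == s.size();
  } catch (const std::exception&) {
    return false;
  }
}

// Hypothetical helper: labels[i] is the string translated into index i.
// If every label is a number, the labels must be sorted by increasing
// number; otherwise they must be sorted lexicographically.
bool toyNeedsReordering(const std::vector<std::string>& labels) {
  std::vector<double> numbers;
  for (const auto& label : labels) {
    double x = 0.0;
    if (!parseNumber(label, x)) {
      // At least one non-numeric label: compare the strings themselves.
      return !std::is_sorted(labels.begin(), labels.end());
    }
    numbers.push_back(x);
  }
  return !std::is_sorted(numbers.begin(), numbers.end());
}
```

With the labels 4, 1, 3 read in that order, the function reports that a reordering is needed. With 9 followed by 10 it does not, because numeric labels are compared as numbers and not as strings.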
DBTranslatedValue gum::learning::DBTranslator< ALLOC >::operator<< | ( | const std::string & | str | ) |
alias for method translate
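DBTranslator< ALLOC >& gum::learning::DBTranslator< ALLOC >::operator= | ( | const DBTranslator< ALLOC > & | from | ) |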
protected |
copy operator
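DBTranslator< ALLOC >& gum::learning::DBTranslator< ALLOC >::operator= | ( | DBTranslator< ALLOC > && | from | ) |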
protected |
move operator
std::string gum::learning::DBTranslator< ALLOC >::operator>> | ( | const DBTranslatedValue | translated_val | ) |
alias for method translateBack
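virtual HashTable< std::size_t, std::size_t, ALLOC< std::pair< std::size_t, std::size_t > > > gum::learning::DBTranslator< ALLOC >::reorder | ( | ) |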
pure virtual |
performs a reordering of the dictionary and returns a mapping from the old translated values to the new ones.
When a reordering is needed, i.e., string values must be translated differently, method reorder() computes how the translations should be changed. It updates the dictionary accordingly and returns the mapping that enables changing the old dictionary values into the new ones. Note that the hash table returned is expressed in terms of std::size_t because only the translations for discrete random variables need to be reordered; those for continuous random variables are identity mappings.
Implemented in gum::learning::DBTranslator4LabelizedVariable< ALLOC >, gum::learning::DBTranslator4RangeVariable< ALLOC >, gum::learning::DBTranslator4ContinuousVariable< ALLOC >, and gum::learning::DBTranslator4DiscretizedVariable< ALLOC >.
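A minimal sketch of such a reordering is given below, under two assumptions: all labels are integer strings, and `std::unordered_map` stands in for aGrUM's HashTable. The function `toyReorder` is hypothetical. It sorts the labels in place and returns the mapping from old indices to new ones.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: labels[i] is the integer string currently translated
// into index i. Sorts the labels by increasing value and returns the
// mapping old index -> new index.
std::unordered_map<std::size_t, std::size_t> toyReorder(
    std::vector<std::string>& labels) {
  // order[k] will be the old index of the label that ends up at position k.
  std::vector<std::size_t> order(labels.size());
  std::iota(order.begin(), order.end(), std::size_t(0));
  std::sort(order.begin(), order.end(),
            [&labels](std::size_t a, std::size_t b) {
              return std::stol(labels[a]) < std::stol(labels[b]);
            });

  std::unordered_map<std::size_t, std::size_t> mapping;
  std::vector<std::string> sorted(labels.size());
  for (std::size_t k = 0; k < order.size(); ++k) {
    mapping[order[k]] = k;
    sorted[k] = labels[order[k]];
  }
  labels = std::move(sorted);
  return mapping;
}
```

For the labels 4, 1, 3 (translated into 0, 1, 2), the returned mapping is 0 to 2, 1 to 0 and 2 to 1, and the labels become 1, 3, 4. A database that already stores the old indices can then be updated by applying this mapping to each stored value.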
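virtual void gum::learning::DBTranslator< ALLOC >::setEditableDictionaryMode | ( | bool | new_mode | ) |
virtual |
sets/unsets the editable dictionary mode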
Reimplemented in gum::learning::DBTranslator4DiscretizedVariable< ALLOC >.
void gum::learning::DBTranslator< ALLOC >::setVariableDescription | ( | const std::string & | str | ) | const |
sets the description of the variable stored into the translator
void gum::learning::DBTranslator< ALLOC >::setVariableName | ( | const std::string & | str | ) | const |
sets the name of the variable stored into the translator
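virtual DBTranslatedValue gum::learning::DBTranslator< ALLOC >::translate | ( | const std::string & | str | ) |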
pure virtual |
returns the translation of a string
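This method tries to translate a given string into the DBTranslatedValue that should be stored into a DatabaseTable. If the translator cannot find the translation in its current dictionary, then two situations can arise. First, if the translator is not in an editable dictionary mode, an UnknownLabelInDatabase exception is raised. Second, if the translator is in an editable dictionary mode, it tries to insert the string into its dictionary; if this insertion fails, a SizeError, an OperationNotAllowed or a TypeError exception is raised, under the conditions listed below.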
str | the string that the DBTranslator will try to translate |
UnknownLabelInDatabase | is raised if the translation cannot be found and the translator is not in an editable dictionary mode. |
SizeError | is raised if the number of entries in the dictionary has already reached its maximum. |
OperationNotAllowed | is raised if the translation cannot be found and the insertion of the string into the translator's dictionary fails because it would induce incoherent behavior (e.g., a DBTranslator4ContinuousVariable that contains a variable whose domain is [x,y] as well as a missing value symbol z \(\in\) [x,y]). |
TypeError | is raised if the translation cannot be found and the insertion of the string into the translator's dictionary fails because str cannot be converted into an appropriate type. |
Implemented in gum::learning::DBTranslator4ContinuousVariable< ALLOC >, gum::learning::DBTranslator4RangeVariable< ALLOC >, gum::learning::DBTranslator4LabelizedVariable< ALLOC >, and gum::learning::DBTranslator4DiscretizedVariable< ALLOC >.
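The interplay between the editable dictionary mode and the maximum number of entries can be sketched with a toy translator for labels. `ToyLabelTranslator` is a hypothetical class, not an aGrUM one. As an assumption made for this sketch, `std::out_of_range` and `std::length_error` stand in for UnknownLabelInDatabase and SizeError respectively.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a translator of labels (not an aGrUM class).
class ToyLabelTranslator {
 public:
  ToyLabelTranslator(bool editable, std::size_t max_entries)
      : editable_(editable), max_entries_(max_entries) {}

  std::size_t translate(const std::string& str) {
    auto it = dico_.find(str);
    if (it != dico_.end()) return it->second;  // already in the dictionary

    // Unknown string and fixed dictionary: stand-in for UnknownLabelInDatabase.
    if (!editable_) throw std::out_of_range("unknown label: " + str);

    // Dictionary already full: stand-in for SizeError.
    if (dico_.size() >= max_entries_) {
      throw std::length_error("dictionary is full");
    }

    // Insert the string with the next free index.
    const std::size_t index = dico_.size();
    dico_.emplace(str, index);
    return index;
  }

  void setEditableDictionaryMode(bool new_mode) { editable_ = new_mode; }

 private:
  bool editable_;
  std::size_t max_entries_;
  std::unordered_map<std::string, std::size_t> dico_;
};
```

Strings already in the dictionary are always translated, whatever the mode. Only unknown strings trigger one of the two error paths.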
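virtual std::string gum::learning::DBTranslator< ALLOC >::translateBack | ( | const DBTranslatedValue | translated_val | ) | const |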
pure virtual |
returns the original value for a given translation
translated_val | a value that should result from a translation, for which we look up the corresponding label (a string) of the DBTranslator's variable |
UnknownLabelInDatabase | is raised if this original value cannot be found |
Implemented in gum::learning::DBTranslator4ContinuousVariable< ALLOC >, gum::learning::DBTranslator4RangeVariable< ALLOC >, gum::learning::DBTranslator4LabelizedVariable< ALLOC >, and gum::learning::DBTranslator4DiscretizedVariable< ALLOC >.
virtual const Variable* gum::learning::DBTranslator< ALLOC >::variable | ( | ) | const |
pure virtual |
returns the variable stored into the translator
Implemented in gum::learning::DBTranslator4LabelizedVariable< ALLOC >, gum::learning::DBTranslator4RangeVariable< ALLOC >, gum::learning::DBTranslator4ContinuousVariable< ALLOC >, and gum::learning::DBTranslator4DiscretizedVariable< ALLOC >.
Bijection< std::size_t, std::string, ALLOC< std::pair< float, std::string > > > gum::learning::DBTranslator< ALLOC >::back_dico_ |
mutable protected |
the bijection relating back translated values and their original strings.
Note that the translated values considered here are of type std::size_t because only the values for discrete variables need to be stored; those for continuous variables are actually identity mappings.
Definition at line 388 of file DBTranslator.h.
bool gum::learning::DBTranslator< ALLOC >::is_dictionary_dynamic_ |
protected |
indicates whether the dictionary can be updated or not
Definition at line 373 of file DBTranslator.h.
std::size_t gum::learning::DBTranslator< ALLOC >::max_dico_entries_ |
protected |
the maximum number of entries that the dictionary is allowed to contain
Definition at line 376 of file DBTranslator.h.
Set< std::string, ALLOC< std::string > > gum::learning::DBTranslator< ALLOC >::missing_symbols_ |
protected |
the set of missing symbols
Definition at line 379 of file DBTranslator.h.
DBTranslatedValueType gum::learning::DBTranslator< ALLOC >::val_type_ |
protected |
the type of the values translated by the translator
Definition at line 391 of file DBTranslator.h.