aGrUM  0.13.2
gum::learning::DBTranslator4RangeVariable< ALLOC > Class Template Reference

The databases' cell translators for range variables.

#include <agrum/learning/database/DBTranslator4RangeVariable.h>

Public Member Functions

Constructors / Destructors
template<template< typename > class XALLOC>
 DBTranslator4RangeVariable (const std::vector< std::string, XALLOC< std::string > > &missing_symbols, std::size_t max_dico_entries=std::numeric_limits< std::size_t >::max(), const allocator_type &alloc=allocator_type())
 default constructor without any initial variable
 
 DBTranslator4RangeVariable (std::size_t max_dico_entries=std::numeric_limits< std::size_t >::max(), const allocator_type &alloc=allocator_type())
 default constructor without any initial variable nor missing symbols
 
template<template< typename > class XALLOC>
 DBTranslator4RangeVariable (const RangeVariable &var, const std::vector< std::string, XALLOC< std::string > > &missing_symbols, const bool editable_dictionary=false, std::size_t max_dico_entries=std::numeric_limits< std::size_t >::max(), const allocator_type &alloc=allocator_type())
 default constructor with a range variable as translator
 
 DBTranslator4RangeVariable (const RangeVariable &var, const bool editable_dictionary=false, std::size_t max_dico_entries=std::numeric_limits< std::size_t >::max(), const allocator_type &alloc=allocator_type())
 default constructor with a range variable as translator but without missing symbols
 
 DBTranslator4RangeVariable (const DBTranslator4RangeVariable< ALLOC > &from)
 copy constructor
 
 DBTranslator4RangeVariable (const DBTranslator4RangeVariable< ALLOC > &from, const allocator_type &alloc)
 copy constructor with a given allocator
 
 DBTranslator4RangeVariable (DBTranslator4RangeVariable< ALLOC > &&from)
 move constructor
 
 DBTranslator4RangeVariable (DBTranslator4RangeVariable< ALLOC > &&from, const allocator_type &alloc)
 move constructor with a given allocator
 
virtual DBTranslator4RangeVariable< ALLOC > * clone () const
 virtual copy constructor
 
virtual DBTranslator4RangeVariable< ALLOC > * clone (const allocator_type &alloc) const
 virtual copy constructor with a given allocator
 
virtual ~DBTranslator4RangeVariable ()
 destructor
 
Operators
DBTranslator4RangeVariable< ALLOC > & operator= (const DBTranslator4RangeVariable< ALLOC > &from)
 copy operator
 
DBTranslator4RangeVariable< ALLOC > & operator= (DBTranslator4RangeVariable< ALLOC > &&from)
 move operator
 
Accessors / Modifiers
virtual DBTranslatedValue translate (const std::string &str) final
 returns the translation of a string
 
virtual std::string translateBack (const DBTranslatedValue translated_val) const final
 returns the original value for a given translation
 
virtual std::size_t domainSize () const final
 returns the domain size of a variable corresponding to the translations
 
virtual bool needsReordering () const final
 indicates whether a reordering is needed to make the translations sorted by increasing numbers
 
virtual HashTable< std::size_t, std::size_t, ALLOC< std::pair< std::size_t, std::size_t > > > reorder () final
 performs a reordering of the dictionary and returns a mapping from the old translated values to the new ones.
 
virtual const RangeVariable * variable () const final
 returns the variable stored into the translator
 
Operators
DBTranslatedValue operator<< (const std::string &str)
 alias for method translate
 
std::string operator>> (const DBTranslatedValue translated_val)
 alias for method translateBack
 
Accessors / Modifiers
virtual bool hasEditableDictionary () const
 indicates whether the translator has an editable dictionary or not
 
virtual void setEditableDictionaryMode (bool new_mode)
 sets/unsets the editable dictionary mode
 
const Set< std::string, ALLOC< std::string > > & missingSymbols () const
 returns the set of missing symbols taken into account by the translator
 
bool isMissingSymbol (const std::string &str) const
 indicates whether a string corresponds to a missing symbol
 
void setVariableName (const std::string &str) const
 sets the name of the variable stored into the translator
 
void setVariableDescription (const std::string &str) const
 sets the description of the variable stored into the translator
 
DBTranslatedValueType getValType () const
 returns the type of values handled by the translator
 
allocator_type getAllocator () const
 returns the allocator used by the translator
 
bool isMissingValue (const DBTranslatedValue &val) const
 indicates whether a translated value corresponds to a missing value
 

Public Types

using allocator_type = typename DBTranslator< ALLOC >::allocator_type
 type for the allocators passed in arguments of methods
 

Protected Attributes

bool _is_dictionary_dynamic
 indicates whether the dictionary can be updated or not
 
std::size_t _max_dico_entries
 the maximum number of entries that the dictionary is allowed to contain
 
Set< std::string, ALLOC< std::string > > _missing_symbols
 the set of missing symbols
 
Bijection< std::size_t, std::string, ALLOC< std::pair< float, std::string > > > _back_dico
 the bijection relating back translated values and their original strings.
 
DBTranslatedValueType _val_type
 the type of the values translated by the translator
 

Detailed Description

template<template< typename > class ALLOC = std::allocator>
class gum::learning::DBTranslator4RangeVariable< ALLOC >

The databases' cell translators for range variables.

Translators are used by DatabaseTable instances to transform datasets' strings into DBTranslatedValue instances. The point is that strings are not adequate for fast learning: they need to be preprocessed into a type that can be analyzed quickly (the so-called DBTranslatedValue type).

A DBTranslator4RangeVariable is a translator that contains and exploits a RangeVariable for translations. Each time a string needs to be translated, we ask the RangeVariable whether its domain contains the integer value represented in the string. If this is the case, then the discr_val field of the DBTranslatedValue corresponding to the translation of the string contains the index assigned to this integer value by the translator.

Here is an example of how to use this class:
// create the translator, with possible missing symbols: "N/A" and "???"
// i.e., each time the translator reads a "N/A" or a "???" string, it
// won't translate it into a number but into a missing value.
std::vector<std::string> missing { "N/A", "???" };
gum::learning::DBTranslator4RangeVariable<> translator ( missing );
// gets the DBTranslatedValue corresponding to some strings
auto val1 = translator.translate("5");
auto val2 = translator.translate("4");
// at this point, val1 and val2 are equal to
// gum::learning::DBTranslatedValue { std::size_t(0) } and
// gum::learning::DBTranslatedValue { std::size_t(1) } respectively.
// In addition, the RangeVariable stored into the translator has
// a domain equal to {4,5}.
auto val3 = translator << "7";
// val3 is encoded as gum::learning::DBTranslatedValue { std::size_t(3) }
// because string "6" is implicitly encoded as
// gum::learning::DBTranslatedValue { std::size_t(2) }.
// In addition, the domain of the range variable is expanded to {4,5,6,7}.
// add the numbers assigned to val1, val2, val3
std::size_t sum = val1.discr_val + val2.discr_val + val3.discr_val;
// translate missing values: val4 and val5 will be equal to:
// DBTranslatedValue { std::numeric_limits<std::size_t>::max () }
auto val4 = translator << "N/A";
auto val5 = translator.translate ( "???" );
// the following instructions raise TypeError exceptions because the
// strings cannot be translated into integers
auto val6 = translator << "422x";
auto val7 = translator.translate ( "xxx" );
// given a DBTranslatedValue that is supposed to contain an integer in
// the range of the RangeVariable, get the corresponding string.
std::string str;
str = translator.translateBack ( val1 ); // str = "5"
str = translator >> val2; // str = "4"
str = translator >> gum::learning::DBTranslatedValue {std::size_t(2)};
// str = "6"
// translate back missing values: the string will correspond to one of
// the missing symbols known to the translator
str = translator >> val4; // str = "N/A" or "???"
str = translator >> val5; // str = "N/A" or "???"
// get the variable stored within the translator
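// (named stored_var so that it does not clash with the variable var
// declared further down)
const gum::RangeVariable* stored_var =
dynamic_cast<const gum::RangeVariable*> ( translator.variable () );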
// it is possible to create a translator for an already known variable.
// In this case, by default, the translator is not in editable mode, but
// this behavior can be changed passing the right arguments to the
// constructor of the translator, or using the setEditableDictionaryMode
// method. Here, we create a range variable whose domain is {-2,...,10}
gum::RangeVariable var ( "X", "", -2, 10 );
gum::learning::DBTranslator4RangeVariable<> translator2 ( var, missing );
auto xval1 = translator2.translate ( "-1" ).discr_val; // xval1 = 1
auto xval2 = translator2.translate ( "7" ).discr_val; // xval2 = 9
auto xval3 = translator2.translate ( "N/A" ).discr_val;
// here xval3 corresponds to a missing value, hence it is equal to
// std::numeric_limits<size_t>::max ()
// trying to translate a string which is outside the domain of var will
// raise Exception NotFound
translator2.translate ( "20" ); // NotFound

Definition at line 130 of file DBTranslator4RangeVariable.h.

Member Typedef Documentation

template<template< typename > class ALLOC = std::allocator>
using gum::learning::DBTranslator4RangeVariable< ALLOC >::allocator_type = typename DBTranslator< ALLOC >::allocator_type

type for the allocators passed in arguments of methods

Definition at line 133 of file DBTranslator4RangeVariable.h.

Constructor & Destructor Documentation

template<template< typename > class ALLOC = std::allocator>
template<template< typename > class XALLOC>
gum::learning::DBTranslator4RangeVariable< ALLOC >::DBTranslator4RangeVariable ( const std::vector< std::string, XALLOC< std::string > > &  missing_symbols,
std::size_t  max_dico_entries = std::numeric_limits< std::size_t >::max(),
const allocator_type &  alloc = allocator_type()
)

default constructor without any initial variable

When using this constructor, it is implicitly assumed that the dictionary contained in the translator is editable. So, when reading the database, if we observe a value that has not been encountered before, we update the range of the dictionary of the translator (hence that of the variable contained in the translator).

Parameters
missing_symbols  the set of symbols in the dataset representing missing values
max_dico_entries  the maximum number of entries that the dictionary can contain. Trying to add an entry once this limit has been reached is considered an error and raises a SizeError exception
alloc  the allocator used to allocate memory for all the fields of the DBTranslator4RangeVariable
template<template< typename > class ALLOC = std::allocator>
gum::learning::DBTranslator4RangeVariable< ALLOC >::DBTranslator4RangeVariable ( std::size_t  max_dico_entries = std::numeric_limits< std::size_t >::max(),
const allocator_type &  alloc = allocator_type()
)

default constructor without any initial variable nor missing symbols

When using this constructor, it is implicitly assumed that the dictionary contained in the translator is editable. So, when reading the database, if we observe a value that has not been encountered before, we update the range of the dictionary of the translator (hence that of the variable contained in the translator).

Parameters
max_dico_entries  the maximum number of entries that the dictionary can contain. Trying to add an entry once this limit has been reached is considered an error and raises a SizeError exception
alloc  the allocator used to allocate memory for all the fields of the DBTranslator4RangeVariable
template<template< typename > class ALLOC = std::allocator>
template<template< typename > class XALLOC>
gum::learning::DBTranslator4RangeVariable< ALLOC >::DBTranslator4RangeVariable ( const RangeVariable &  var,
const std::vector< std::string, XALLOC< std::string > > &  missing_symbols,
const bool  editable_dictionary = false,
std::size_t  max_dico_entries = std::numeric_limits< std::size_t >::max(),
const allocator_type &  alloc = allocator_type()
)

default constructor with a range variable as translator

Parameters
var  a range variable which will be used for translations. The translator keeps a copy of this variable
missing_symbols  the set of symbols in the dataset representing missing values
editable_dictionary  the mode in which the translator will perform translations: when false (the default), the translation of a string that does not correspond to an integer within the range of var will raise a NotFound exception; when true, the translator will try to expand the domain of the RangeVariable so that the number represented in the string belongs to this domain (and therefore to the dictionary)
max_dico_entries  the maximum number of entries that the dictionary can contain. Trying to add an entry once this limit has been reached is considered an error and raises a SizeError exception
alloc  the allocator used to allocate memory for all the fields of the DBTranslator4RangeVariable
Warning
If the variable contained into the translator has a value in the range that is equal to a missing value symbol, the range value will be taken into account in the translations, not the missing value.
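The precedence stated in this warning can be illustrated with a small sketch that does not depend on aGrUM. The names Outcome, parseInteger and classify are hypothetical and only model the order in which the two checks are assumed to be made: the range of the variable first, the missing symbols second.

```cpp
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the result of a translation (not an aGrUM type).
enum class Outcome { RangeValue, Missing, Unknown };

// Parses the whole string as a decimal integer; returns false on failure
// (including trailing characters such as in "422x").
inline bool parseInteger(const std::string& str, long& value) {
  try {
    std::size_t pos = 0;
    value = std::stol(str, &pos);
    return pos == str.size();
  } catch (const std::exception&) {
    return false;
  }
}

// The range [lo, hi] is tested before the missing symbols, so a string that
// is both a value of the range and a missing symbol is treated as a value.
inline Outcome classify(const std::string&           str,
                        long                         lo,
                        long                         hi,
                        const std::set<std::string>& missing_symbols) {
  long value = 0;
  if (parseInteger(str, value) && value >= lo && value <= hi)
    return Outcome::RangeValue;
  if (missing_symbols.count(str) != 0) return Outcome::Missing;
  return Outcome::Unknown;
}
```

With a range of {-2,...,10} and missing symbols "0" and "N/A", the string "0" is classified as a range value whereas "N/A" is classified as missing.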
template<template< typename > class ALLOC = std::allocator>
gum::learning::DBTranslator4RangeVariable< ALLOC >::DBTranslator4RangeVariable ( const RangeVariable &  var,
const bool  editable_dictionary = false,
std::size_t  max_dico_entries = std::numeric_limits< std::size_t >::max(),
const allocator_type &  alloc = allocator_type()
)

default constructor with a range variable as translator but without missing symbols

Parameters
var  a range variable which will be used for translations. The translator keeps a copy of this variable
editable_dictionary  the mode in which the translator will perform translations: when false (the default), the translation of a string that does not correspond to an integer within the range of var will raise a NotFound exception; when true, the translator will try to expand the domain of the RangeVariable so that the number represented in the string belongs to this domain (and therefore to the dictionary)
max_dico_entries  the maximum number of entries that the dictionary can contain. Trying to add an entry once this limit has been reached is considered an error and raises a SizeError exception
alloc  the allocator used to allocate memory for all the fields of the DBTranslator4RangeVariable
Warning
If the variable contained into the translator has a value in the range that is equal to a missing value symbol, the range value will be taken into account in the translations, not the missing value.
template<template< typename > class ALLOC = std::allocator>
gum::learning::DBTranslator4RangeVariable< ALLOC >::DBTranslator4RangeVariable ( const DBTranslator4RangeVariable< ALLOC > &  from)

copy constructor

template<template< typename > class ALLOC = std::allocator>
gum::learning::DBTranslator4RangeVariable< ALLOC >::DBTranslator4RangeVariable ( const DBTranslator4RangeVariable< ALLOC > &  from,
const allocator_type &  alloc
)

copy constructor with a given allocator

template<template< typename > class ALLOC = std::allocator>
gum::learning::DBTranslator4RangeVariable< ALLOC >::DBTranslator4RangeVariable ( DBTranslator4RangeVariable< ALLOC > &&  from)

move constructor

template<template< typename > class ALLOC = std::allocator>
gum::learning::DBTranslator4RangeVariable< ALLOC >::DBTranslator4RangeVariable ( DBTranslator4RangeVariable< ALLOC > &&  from,
const allocator_type &  alloc
)

move constructor with a given allocator

template<template< typename > class ALLOC = std::allocator>
virtual gum::learning::DBTranslator4RangeVariable< ALLOC >::~DBTranslator4RangeVariable ( )
virtual

destructor

Member Function Documentation

template<template< typename > class ALLOC = std::allocator>
virtual DBTranslator4RangeVariable< ALLOC >* gum::learning::DBTranslator4RangeVariable< ALLOC >::clone ( ) const
virtual

virtual copy constructor

Implements gum::learning::DBTranslator< ALLOC >.

template<template< typename > class ALLOC = std::allocator>
virtual DBTranslator4RangeVariable< ALLOC >* gum::learning::DBTranslator4RangeVariable< ALLOC >::clone ( const allocator_type &  alloc) const
virtual

virtual copy constructor with a given allocator

Implements gum::learning::DBTranslator< ALLOC >.

template<template< typename > class ALLOC = std::allocator>
virtual std::size_t gum::learning::DBTranslator4RangeVariable< ALLOC >::domainSize ( ) const
final virtual

returns the domain size of a variable corresponding to the translations

Returns the size of the range of the variable.

Implements gum::learning::DBTranslator< ALLOC >.

template<template< typename > class ALLOC = std::allocator>
allocator_type gum::learning::DBTranslator< ALLOC >::getAllocator ( ) const
inherited

returns the allocator used by the translator

template<template< typename > class ALLOC = std::allocator>
DBTranslatedValueType gum::learning::DBTranslator< ALLOC >::getValType ( ) const
inherited

returns the type of values handled by the translator

Returns
either DBTranslatedValueType::DISCRETE if the translator includes a discrete variable or DBTranslatedValueType::CONTINUOUS if it contains a continuous variable. This is convenient to know how to interpret the DBTranslatedValue instances produced by the DBTranslator: either using their discr_val field or their cont_val field.
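As an illustration of how the returned type selects the field to read, here is a minimal sketch. The types ValueType and TranslatedValue are hypothetical stand-ins that only borrow the field names discr_val and cont_val from this page, under the assumption that a translated value stores either an index or a real number; they are not the aGrUM definitions.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-ins, not the aGrUM types.
enum class ValueType { DISCRETE, CONTINUOUS };

union TranslatedValue {
  std::size_t discr_val;  // meaningful for discrete variables
  float       cont_val;   // meaningful for continuous variables
};

// Reads the field selected by the value type and renders it as text.
inline std::string describe(const TranslatedValue& val, ValueType type) {
  if (type == ValueType::DISCRETE)
    return "index " + std::to_string(val.discr_val);
  return "real " + std::to_string(val.cont_val);
}
```

A translator for range variables handles discrete values, so in this model only the discr_val branch would be relevant for it.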
template<template< typename > class ALLOC = std::allocator>
virtual bool gum::learning::DBTranslator< ALLOC >::hasEditableDictionary ( ) const
virtual inherited

indicates whether the translator has an editable dictionary or not

Reimplemented in gum::learning::DBTranslator4DiscretizedVariable< ALLOC >.

template<template< typename > class ALLOC = std::allocator>
bool gum::learning::DBTranslator< ALLOC >::isMissingSymbol ( const std::string &  str) const
inherited

indicates whether a string corresponds to a missing symbol

template<template< typename > class ALLOC = std::allocator>
bool gum::learning::DBTranslator< ALLOC >::isMissingValue ( const DBTranslatedValue &  val) const
inherited

indicates whether a translated value corresponds to a missing value

template<template< typename > class ALLOC = std::allocator>
const Set< std::string, ALLOC< std::string > >& gum::learning::DBTranslator< ALLOC >::missingSymbols ( ) const
inherited

returns the set of missing symbols taken into account by the translator

template<template< typename > class ALLOC = std::allocator>
virtual bool gum::learning::DBTranslator4RangeVariable< ALLOC >::needsReordering ( ) const
final virtual

indicates whether a reordering is needed to make the translations sorted by increasing numbers

When dynamically constructing its dictionary, the translator may assign DBTranslatedValue values to strings in an unsorted order. For instance, a translator sequentially reading integer strings 2, 1, 3 may map 2 into DBTranslatedValue{std::size_t(0)}, 1 into DBTranslatedValue{std::size_t(1)} and 3 into DBTranslatedValue{std::size_t(2)}, resulting in random variables having domain {2,1,3}. The user may prefer having domain {1,2,3}, i.e., a domain specified with increasing values. This requires a reordering. Method needsReordering() returns a Boolean indicating whether such a reordering should be performed or whether the current order is OK.
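The check described here can be sketched without aGrUM. In this minimal illustration, the function name needsReorderingSketch and the vector-based dictionary are hypothetical: element i of the vector is the integer whose translated value is i, so the dictionary is in order exactly when the vector is increasing.

```cpp
#include <cstddef>
#include <vector>

// values_by_index[i] is the integer that received translated value i,
// e.g. {2, 1, 3} when the strings "2", "1", "3" were read in that order.
// Returns true when some value is followed by a smaller one.
inline bool needsReorderingSketch(const std::vector<long>& values_by_index) {
  for (std::size_t i = 1; i < values_by_index.size(); ++i) {
    if (values_by_index[i - 1] > values_by_index[i]) return true;
  }
  return false;
}
```

For the dictionary {2,1,3} of the example the function returns true; for {1,2,3} it returns false.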

Implements gum::learning::DBTranslator< ALLOC >.

template<template< typename > class ALLOC = std::allocator>
DBTranslatedValue gum::learning::DBTranslator< ALLOC >::operator<< ( const std::string &  str)
inherited

alias for method translate

template<template< typename > class ALLOC = std::allocator>
DBTranslator4RangeVariable< ALLOC >& gum::learning::DBTranslator4RangeVariable< ALLOC >::operator= ( const DBTranslator4RangeVariable< ALLOC > &  from)

copy operator

template<template< typename > class ALLOC = std::allocator>
DBTranslator4RangeVariable< ALLOC >& gum::learning::DBTranslator4RangeVariable< ALLOC >::operator= ( DBTranslator4RangeVariable< ALLOC > &&  from)

move operator

template<template< typename > class ALLOC = std::allocator>
std::string gum::learning::DBTranslator< ALLOC >::operator>> ( const DBTranslatedValue  translated_val)
inherited

alias for method translateBack

template<template< typename > class ALLOC = std::allocator>
virtual HashTable< std::size_t, std::size_t, ALLOC< std::pair< std::size_t, std::size_t > > > gum::learning::DBTranslator4RangeVariable< ALLOC >::reorder ( )
final virtual

performs a reordering of the dictionary and returns a mapping from the old translated values to the new ones.

When a reordering is needed, i.e., string values must be translated differently, method reorder() computes how the translations should be changed. It updates the dictionary accordingly and returns the mapping that converts the old dictionary values into the new ones.
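A possible way to compute such a mapping is sketched below, independently of aGrUM. The function reorderSketch and its vector-based dictionary are hypothetical; the sketch assumes that all the values of the dictionary are distinct and, unlike the real method, it uses a std::unordered_map rather than a HashTable. Whether the library also reports entries whose translation does not change is not stated on this page; the sketch reports every entry.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_map>
#include <vector>

// values_by_index[i] is the integer whose translated value is i.
// Sorts the dictionary by increasing value and returns, for each old
// translated value, the new translated value of the same integer.
inline std::unordered_map<std::size_t, std::size_t>
reorderSketch(std::vector<long>& values_by_index) {
  std::vector<long> sorted(values_by_index);
  std::sort(sorted.begin(), sorted.end());

  std::unordered_map<std::size_t, std::size_t> old_to_new;
  for (std::size_t old_index = 0; old_index < values_by_index.size();
       ++old_index) {
    // position of the integer in the sorted dictionary = its new translation
    const auto pos = std::lower_bound(sorted.begin(), sorted.end(),
                                      values_by_index[old_index]);
    old_to_new[old_index] = static_cast<std::size_t>(pos - sorted.begin());
  }

  values_by_index.swap(sorted);  // the dictionary itself is updated
  return old_to_new;
}
```

Applied to the dictionary {2,1,3}, the sketch returns the mapping 0 to 1, 1 to 0 and 2 to 2, and leaves the dictionary equal to {1,2,3}. A database already filled with old translated values would then be rewritten by looking each cell up in this mapping.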

Implements gum::learning::DBTranslator< ALLOC >.

template<template< typename > class ALLOC = std::allocator>
virtual void gum::learning::DBTranslator< ALLOC >::setEditableDictionaryMode ( bool  new_mode)
virtual inherited

sets/unsets the editable dictionary mode

Reimplemented in gum::learning::DBTranslator4DiscretizedVariable< ALLOC >.

template<template< typename > class ALLOC = std::allocator>
void gum::learning::DBTranslator< ALLOC >::setVariableDescription ( const std::string &  str) const
inherited

sets the description of the variable stored into the translator

template<template< typename > class ALLOC = std::allocator>
void gum::learning::DBTranslator< ALLOC >::setVariableName ( const std::string &  str) const
inherited

sets the name of the variable stored into the translator

template<template< typename > class ALLOC = std::allocator>
virtual DBTranslatedValue gum::learning::DBTranslator4RangeVariable< ALLOC >::translate ( const std::string &  str)
final virtual

returns the translation of a string

This method tries to translate a given string into the DBTranslatedValue that should be stored into a DatabaseTable. If the translator cannot find the translation in its current dictionary, then two situations can arise:

  1. if the translator is not in an editable dictionary mode, then the translator raises a NotFound exception.
  2. if the translator is in an editable dictionary mode, i.e., it is allowed to update its dictionary, then it tries to update the range of its dictionary to include the new value. Upon success, it returns the translated value, otherwise, it raises either:
    • a TypeError exception if the string cannot be converted into a value that can be inserted into the dictionary
    • an OperationNotAllowed exception if the translation would induce incoherent behavior (e.g., a translator that contains a variable whose domain is [x,y] as well as a missing value symbol z ∈ [x,y]).
    • a SizeError exception if the number of entries in the dictionary, i.e., the domain size of the RangeVariable, has already reached its maximum.
Warning
Note that missing values (i.e., strings encoded as missing symbols) are translated as std::numeric_limits<std::size_t>::max ().
If the variable contained in the translator has a value in its range equal to a missing value symbol, then this value will be taken into account in the translation, not the missing value.
Returns
the translated value of the string to be stored into a DatabaseTable
Exceptions
UnknownLabelInDatabase  is raised if the translation cannot be found and the translator is not in an editable dictionary mode.
SizeError  is raised if the number of entries (the range) in the dictionary has already reached its maximum.
TypeError  is raised if the translation cannot be found and the translator is in an editable dictionary mode and the string does not correspond to an integer.
OperationNotAllowed  is raised if the translation cannot be found and the insertion of the string into the translator's dictionary fails because it would induce incoherent behavior (e.g., a translator that contains a variable whose domain is {x,y,z,t} as well as a missing value symbol z).
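The decision flow described above can be modelled by the following sketch, which does not use aGrUM. The struct RangeTranslatorSketch is hypothetical, and standard exceptions stand in for the aGrUM ones. The order in which newly covered integers receive their translations (increasing order) is an assumption chosen to match the example of the Detailed Description ("5", "4", "7" translated as 0, 1, 3).

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Parses the whole string as a decimal integer; returns false on failure.
inline bool parseInteger(const std::string& str, long& value) {
  try {
    std::size_t pos = 0;
    value = std::stol(str, &pos);
    return pos == str.size();
  } catch (const std::exception&) {
    return false;
  }
}

// Hypothetical model, not the aGrUM class. Stand-in exceptions:
//   std::out_of_range     : unknown value, dictionary not editable
//   std::invalid_argument : TypeError
//   std::length_error     : SizeError
//   std::domain_error     : OperationNotAllowed
struct RangeTranslatorSketch {
  std::vector<long>     values_by_index;  // values_by_index[i] translates to i
  std::set<std::string> missing_symbols;
  bool                  editable;
  std::size_t           max_entries;

  std::size_t translate(const std::string& str) {
    long       value = 0;
    const bool is_integer = parseInteger(str, value);

    // 1. a value already in the dictionary wins over a missing symbol
    if (is_integer) {
      const auto it =
          std::find(values_by_index.begin(), values_by_index.end(), value);
      if (it != values_by_index.end())
        return static_cast<std::size_t>(it - values_by_index.begin());
    }
    // 2. missing symbols are translated as the largest std::size_t
    if (missing_symbols.count(str) != 0)
      return std::numeric_limits<std::size_t>::max();
    // 3. unknown string: only an editable dictionary may grow
    if (!editable) throw std::out_of_range("unknown value: " + str);
    if (!is_integer) throw std::invalid_argument("not an integer: " + str);

    // bounds of the range once expanded to contain the new value
    long lo = value, hi = value;
    if (!values_by_index.empty()) {
      lo = std::min(lo, *std::min_element(values_by_index.begin(),
                                          values_by_index.end()));
      hi = std::max(hi, *std::max_element(values_by_index.begin(),
                                          values_by_index.end()));
    }
    if (static_cast<std::size_t>(hi - lo) + 1 > max_entries)
      throw std::length_error("too many dictionary entries");

    // integers newly covered by the range, none of which may be a missing symbol
    std::vector<long> added;
    for (long v = lo; v <= hi; ++v) {
      if (std::find(values_by_index.begin(), values_by_index.end(), v)
          != values_by_index.end())
        continue;
      if (missing_symbols.count(std::to_string(v)) != 0)
        throw std::domain_error("missing symbol inside the expanded range");
      added.push_back(v);
    }
    values_by_index.insert(values_by_index.end(), added.begin(), added.end());

    const auto it =
        std::find(values_by_index.begin(), values_by_index.end(), value);
    return static_cast<std::size_t>(it - values_by_index.begin());
  }
};
```

Starting from an empty editable dictionary, translating "5", "4" and "7" in that order yields 0, 1 and 3 in this model, and "6" is then translated as 2. Once the dictionary is made non-editable, any string outside {4,...,7} that is not a missing symbol triggers the first stand-in exception.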

Implements gum::learning::DBTranslator< ALLOC >.

template<template< typename > class ALLOC = std::allocator>
virtual std::string gum::learning::DBTranslator4RangeVariable< ALLOC >::translateBack ( const DBTranslatedValue  translated_val) const
final virtual

returns the original value for a given translation

Returns
the string that was translated into a given DBTranslatedValue.
Exceptions
UnknownLabelInDatabase  is raised if this original value cannot be found
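The reverse lookup can be sketched as follows, without aGrUM. The function translateBackSketch and its vector-based dictionary are hypothetical, and std::out_of_range stands in for UnknownLabelInDatabase. The example of the Detailed Description only states that a missing value is translated back into one of the missing symbols; returning the first symbol of the set is an arbitrary choice of this sketch.

```cpp
#include <cstddef>
#include <limits>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// values_by_index[i] is the integer whose translated value is i.
inline std::string translateBackSketch(
    std::size_t                  translated_val,
    const std::vector<long>&     values_by_index,
    const std::set<std::string>& missing_symbols) {
  // the largest std::size_t encodes a missing value
  if (translated_val == std::numeric_limits<std::size_t>::max()) {
    if (missing_symbols.empty())
      throw std::out_of_range("no missing symbol is known");
    return *missing_symbols.begin();  // arbitrary choice among the symbols
  }
  if (translated_val >= values_by_index.size())
    throw std::out_of_range("no original value for this translation");
  return std::to_string(values_by_index[translated_val]);
}
```

With the dictionary {5,4,6,7} built in the example of the Detailed Description, the translated values 0, 1 and 2 are mapped back to "5", "4" and "6".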

Implements gum::learning::DBTranslator< ALLOC >.

template<template< typename > class ALLOC = std::allocator>
virtual const RangeVariable* gum::learning::DBTranslator4RangeVariable< ALLOC >::variable ( ) const
final virtual

returns the variable stored into the translator

Implements gum::learning::DBTranslator< ALLOC >.

Member Data Documentation

template<template< typename > class ALLOC = std::allocator>
Bijection< std::size_t, std::string, ALLOC< std::pair< float, std::string > > > gum::learning::DBTranslator< ALLOC >::_back_dico
mutable protected inherited

the bijection relating back translated values and their original strings.

Note that the translated values considered here are of type std::size_t because only the values for discrete variables need to be stored; those for continuous variables are actually identity mappings.

Warning
only the values of the random variable are stored into this bijection. Missing values are not considered here.

Definition at line 390 of file DBTranslator.h.

template<template< typename > class ALLOC = std::allocator>
bool gum::learning::DBTranslator< ALLOC >::_is_dictionary_dynamic
protected inherited

indicates whether the dictionary can be updated or not

Definition at line 373 of file DBTranslator.h.

template<template< typename > class ALLOC = std::allocator>
std::size_t gum::learning::DBTranslator< ALLOC >::_max_dico_entries
protected inherited

the maximum number of entries that the dictionary is allowed to contain

Definition at line 376 of file DBTranslator.h.

template<template< typename > class ALLOC = std::allocator>
Set< std::string, ALLOC< std::string > > gum::learning::DBTranslator< ALLOC >::_missing_symbols
protected inherited

the set of missing symbols

Definition at line 379 of file DBTranslator.h.

template<template< typename > class ALLOC = std::allocator>
DBTranslatedValueType gum::learning::DBTranslator< ALLOC >::_val_type
protected inherited

the type of the values translated by the translator

Definition at line 393 of file DBTranslator.h.


The documentation for this class was generated from the following file: