aGrUM  0.16.0
gum::learning::genericBNLearner::Database Class Reference

a helper to easily read databases

#include <genericBNLearner.h>

Public Member Functions

template<typename GUM_SCALAR >
 Database (const std::string &filename, const BayesNet< GUM_SCALAR > &bn, const std::vector< std::string > &missing_symbols)
 
Constructors / Destructors
 Database (const std::string &file, const std::vector< std::string > &missing_symbols)
 constructor reading the data from a CSV file

 Database (const DatabaseTable<> &db)
 constructor from an already initialized database table

 Database (const std::string &filename, Database &score_database, const std::vector< std::string > &missing_symbols)
 constructor for the aprioris

template<typename GUM_SCALAR >
 Database (const std::string &filename, const gum::BayesNet< GUM_SCALAR > &bn, const std::vector< std::string > &missing_symbols)
 constructor with a BN providing the variables of interest

 Database (const Database &from)
 copy constructor

 Database (Database &&from)
 move constructor

 ~Database ()
 destructor

Operators
Database & operator= (const Database &from)
 copy operator

Database & operator= (Database &&from)
 move operator

Accessors / Modifiers
DBRowGeneratorParser & parser ()
 returns the parser for the database

const std::vector< std::size_t > & domainSizes () const
 returns the domain sizes of the variables

const std::vector< std::string > & names () const
 returns the names of the variables in the database

NodeId idFromName (const std::string &var_name) const
 returns the node id corresponding to a variable name

const std::string & nameFromId (NodeId id) const
 returns the variable name corresponding to a given node id

const DatabaseTable & databaseTable () const
 returns the internal database table

void setDatabaseWeight (const double new_weight)
 assign a weight to all the rows of the database so that the sum of their weights is equal to new_weight

const Bijection< NodeId, std::size_t > & nodeId2Columns () const
 returns the mapping between node ids and their columns in the database

const std::vector< std::string > & missingSymbols () const
 returns the set of missing symbols taken into account

std::size_t nbRows () const
 returns the number of records in the database

std::size_t size () const
 returns the number of records in the database

void setWeight (const std::size_t i, const double weight)
 sets the weight of the ith record

double weight (const std::size_t i) const
 returns the weight of the ith record

double weight () const
 returns the weight of the whole database
 

Protected Attributes

DatabaseTable __database
 the database itself

DBRowGeneratorParser * __parser {nullptr}
 the parser used for reading the database

std::vector< std::size_t > __domain_sizes
 the domain sizes of the variables (useful to speed-up computations)

Bijection< NodeId, std::size_t > __nodeId2cols
 a bijection assigning to each NodeId its column in the database

Size __max_threads_number {1}
 the max number of threads authorized

Size __min_nb_rows_per_thread {100}
 the minimal number of rows to parse (on average) by thread
 

Detailed Description

a helper to easily read databases

Definition at line 135 of file genericBNLearner.h.

Constructor & Destructor Documentation

◆ Database() [1/7]

gum::learning::genericBNLearner::Database::Database ( const std::string &  file,
const std::vector< std::string > &  missing_symbols 
)
explicit

constructor reading the data from a CSV file

Parameters
file             the name of the CSV file containing the data
missing_symbols  the set of symbols in the CSV file that correspond to missing data

Definition at line 68 of file genericBNLearner.cpp.

70  :
71  Database(genericBNLearner::__readFile(filename, missing_symbols)) {}
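
As a rough illustration of what missing_symbols is for, the sketch below flags a raw CSV cell as missing when it equals one of the declared symbols. The helper isMissingCell is hypothetical and uses only the standard library; it is not an aGrUM function, and the actual parsing is delegated to __readFile.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of aGrUM): returns true when a raw CSV cell
// matches one of the symbols declared as "missing data".
bool isMissingCell(const std::string& cell,
                   const std::vector<std::string>& missing_symbols) {
  return std::find(missing_symbols.begin(), missing_symbols.end(), cell)
         != missing_symbols.end();
}
```

With missing_symbols set to {"?", "N/A"}, a cell containing "?" would be treated as missing while a cell containing "0" would not.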

◆ Database() [2/7]

gum::learning::genericBNLearner::Database::Database ( const DatabaseTable<> &  db)
explicit

constructor from an already initialized database table

Parameters
db  an already initialized database table that is used to fill the Database

Definition at line 51 of file genericBNLearner.cpp.

References __database, __domain_sizes, __nodeId2cols, __parser, gum::learning::DatabaseTable< ALLOC >::domainSizes(), gum::learning::IDatabaseTable< T_DATA, ALLOC >::handler(), gum::BijectionImplementation< T1, T2, Alloc, Gen >::insert(), and gum::learning::IDatabaseTable< T_DATA, ALLOC >::variableNames().

51  :
52  __database(db) {
53  // get the variables names
54  const auto& var_names = __database.variableNames();
55  const std::size_t nb_vars = var_names.size();
56  for (auto dom : __database.domainSizes())
57  __domain_sizes.push_back(dom);
58  for (std::size_t i = 0; i < nb_vars; ++i) {
59  __nodeId2cols.insert(NodeId(i), i);
60  }
61 
62  // create the parser
63  __parser =
64  new DBRowGeneratorParser<>(__database.handler(), DBRowGeneratorSet<>());
65  }

◆ Database() [3/7]

gum::learning::genericBNLearner::Database::Database ( const std::string &  filename,
Database &  score_database,
const std::vector< std::string > &  missing_symbols 
)

constructor for the aprioris

We must ensure that the variables of the Database are identical to those of the score database (otherwise the counts used by the scores might be erroneous). However, the variables may be ordered differently in the two databases: variables with the same name in both databases are assumed to be the same.

Parameters
filename         the name of the CSV file containing the data
score_database   the main database used for the learning
missing_symbols  the set of symbols in the CSV file that correspond to missing data

Definition at line 74 of file genericBNLearner.cpp.

References gum::learning::genericBNLearner::__checkFileName(), __database, __domain_sizes, __nodeId2cols, __parser, databaseTable(), gum::learning::DatabaseTable< ALLOC >::domainSizes(), gum::learning::IDBInitializer< ALLOC >::fillDatabase(), GUM_ERROR, gum::learning::IDatabaseTable< T_DATA, ALLOC >::handler(), gum::HashTable< Key, Val, Alloc >::insert(), gum::learning::DatabaseTable< ALLOC >::insertTranslator(), gum::learning::IDatabaseTable< T_DATA, ALLOC >::nbVariables(), nodeId2Columns(), gum::learning::DatabaseTable< ALLOC >::variable(), gum::learning::IDBInitializer< ALLOC >::variableNames(), and gum::learning::IDatabaseTable< T_DATA, ALLOC >::variableNames().

77  {
78  // assign to each column name in the CSV file its column
79  genericBNLearner::__checkFileName(CSV_filename);
80  DBInitializerFromCSV<> initializer(CSV_filename);
81  const auto& apriori_names = initializer.variableNames();
82  std::size_t apriori_nb_vars = apriori_names.size();
83  HashTable< std::string, std::size_t > apriori_names2col(apriori_nb_vars);
84  for (std::size_t i = std::size_t(0); i < apriori_nb_vars; ++i)
85  apriori_names2col.insert(apriori_names[i], i);
86 
87  // check that there are at least as many variables in the a priori
88  // database as those in the score_database
89  if (apriori_nb_vars < score_database.__database.nbVariables()) {
90  GUM_ERROR(InvalidArgument,
91  "the a apriori database has fewer variables "
92  "than the observed database");
93  }
94 
95  // get the mapping from the columns of score_database to those of
96  // the CSV file
97  const std::vector< std::string >& score_names =
98  score_database.databaseTable().variableNames();
99  const std::size_t score_nb_vars = score_names.size();
100  HashTable< std::size_t, std::size_t > mapping(score_nb_vars);
101  for (std::size_t i = std::size_t(0); i < score_nb_vars; ++i) {
102  try {
103  mapping.insert(i, apriori_names2col[score_names[i]]);
104  } catch (Exception&) {
105  GUM_ERROR(MissingVariableInDatabase,
106  "Variable "
107  << score_names[i]
108  << " of the observed database does not belong to the "
109  << "apriori database");
110  }
111  }
112 
113  // create the translators for CSV database
114  for (std::size_t i = std::size_t(0); i < score_nb_vars; ++i) {
115  const Variable& var = score_database.databaseTable().variable(i);
116  __database.insertTranslator(var, mapping[i], missing_symbols);
117  }
118 
119  // fill the database
120  initializer.fillDatabase(__database);
121 
122  // get the domain sizes of the variables
123  for (auto dom : __database.domainSizes())
124  __domain_sizes.push_back(dom);
125 
126  // compute the mapping from node ids to column indices
127  __nodeId2cols = score_database.nodeId2Columns();
128 
129  // create the parser
130  __parser =
131  new DBRowGeneratorParser<>(__database.handler(), DBRowGeneratorSet<>());
132  }
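
The name-based matching performed by this constructor can be sketched with standard containers only. In the sketch below, mapScoreColumnsToApriori is a hypothetical free function, std::unordered_map stands in for gum::HashTable, and std::invalid_argument / std::out_of_range stand in for the InvalidArgument and MissingVariableInDatabase errors raised through GUM_ERROR. It is an illustration of the idea, not the library's code.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified stand-in for the column matching done by the constructor.
// Returns mapping, where mapping[i] is the column of the a priori file that
// holds the ith variable of the score database.
std::vector<std::size_t> mapScoreColumnsToApriori(
    const std::vector<std::string>& score_names,
    const std::vector<std::string>& apriori_names) {
  // column name -> column index in the a priori CSV file
  std::unordered_map<std::string, std::size_t> apriori_names2col;
  for (std::size_t i = 0; i < apriori_names.size(); ++i)
    apriori_names2col.emplace(apriori_names[i], i);

  // the a priori file must contain at least as many variables
  if (apriori_names.size() < score_names.size())
    throw std::invalid_argument(
        "the a priori database has fewer variables than the observed database");

  std::vector<std::size_t> mapping;
  mapping.reserve(score_names.size());
  for (const auto& name : score_names) {
    const auto it = apriori_names2col.find(name);
    if (it == apriori_names2col.end())
      throw std::out_of_range("Variable " + name +
                              " of the observed database does not belong to "
                              "the a priori database");
    mapping.push_back(it->second);
  }
  assert(mapping.size() == score_names.size());
  return mapping;
}
```

For a score database with columns (A, B) and an a priori file with columns (B, C, A), the mapping is (2, 0): the extra column C is simply ignored, which is why the two files may order their variables differently.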

◆ Database() [4/7]

template<typename GUM_SCALAR >
gum::learning::genericBNLearner::Database::Database ( const std::string &  filename,
const gum::BayesNet< GUM_SCALAR > &  bn,
const std::vector< std::string > &  missing_symbols 
)

constructor with a BN providing the variables of interest

Parameters
filename         the name of the CSV file containing the data
bn               a Bayesian network indicating which variables of the CSV file are used for learning
missing_symbols  the set of symbols in the CSV file that correspond to missing data

◆ Database() [5/7]

gum::learning::genericBNLearner::Database::Database ( const Database &  from)

copy constructor

Definition at line 135 of file genericBNLearner.cpp.

References __database, __parser, and gum::learning::IDatabaseTable< T_DATA, ALLOC >::handler().

135  :
136  __database(from.__database), __domain_sizes(from.__domain_sizes),
137  __nodeId2cols(from.__nodeId2cols) {
138  // create the parser
139  __parser =
140  new DBRowGeneratorParser<>(__database.handler(), DBRowGeneratorSet<>());
141  }

◆ Database() [6/7]

gum::learning::genericBNLearner::Database::Database ( Database &&  from)

move constructor

Definition at line 144 of file genericBNLearner.cpp.

References __database, __parser, and gum::learning::IDatabaseTable< T_DATA, ALLOC >::handler().

144  :
145  __database(std::move(from.__database)),
146  __domain_sizes(std::move(from.__domain_sizes)),
147  __nodeId2cols(std::move(from.__nodeId2cols)) {
148  // create the parser
149  __parser =
150  new DBRowGeneratorParser<>(__database.handler(), DBRowGeneratorSet<>());
151  }

◆ ~Database()

gum::learning::genericBNLearner::Database::~Database ( )

destructor

Definition at line 154 of file genericBNLearner.cpp.

References __parser, and operator=().

154 { delete __parser; }

◆ Database() [7/7]

template<typename GUM_SCALAR >
gum::learning::genericBNLearner::Database::Database ( const std::string &  filename,
const BayesNet< GUM_SCALAR > &  bn,
const std::vector< std::string > &  missing_symbols 
)

Definition at line 31 of file genericBNLearner_tpl.h.

References gum::learning::genericBNLearner::__checkFileName(), __database, __domain_sizes, __nodeId2cols, __parser, gum::DAGmodel::dag(), gum::learning::DatabaseTable< ALLOC >::domainSizes(), gum::learning::IDBInitializer< ALLOC >::fillDatabase(), GUM_ERROR, gum::learning::IDatabaseTable< T_DATA, ALLOC >::handler(), gum::BijectionImplementation< T1, T2, Alloc, Gen >::insert(), gum::HashTable< Key, Val, Alloc >::insert(), gum::learning::DatabaseTable< ALLOC >::insertTranslator(), gum::Variable::name(), gum::BayesNet< GUM_SCALAR >::variable(), and gum::learning::IDBInitializer< ALLOC >::variableNames().

34  {
35  // assign to each column name in the database its position
36  genericBNLearner::__checkFileName(filename);
37  DBInitializerFromCSV<> initializer(filename);
38  const auto& xvar_names = initializer.variableNames();
39  std::size_t nb_vars = xvar_names.size();
40  HashTable< std::string, std::size_t > var_names(nb_vars);
41  for (std::size_t i = std::size_t(0); i < nb_vars; ++i)
42  var_names.insert(xvar_names[i], i);
43 
44  // we use the bn to insert the translators into the database table
45  std::vector< NodeId > nodes;
46  nodes.reserve(bn.dag().sizeNodes());
47  for (const auto node : bn.dag())
48  nodes.push_back(node);
49  std::sort(nodes.begin(), nodes.end());
50  std::size_t i = std::size_t(0);
51  for (auto node : nodes) {
52  const Variable& var = bn.variable(node);
53  try {
54  __database.insertTranslator(var, var_names[var.name()], missing_symbols);
55  } catch (NotFound&) {
56  GUM_ERROR(MissingVariableInDatabase,
57  "Variable '" << var.name() << "' is missing");
58  }
59  __nodeId2cols.insert(NodeId(node), i++);
60  }
61 
62  // fill the database
63  initializer.fillDatabase(__database);
64 
65  // get the domain sizes of the variables
66  for (auto dom : __database.domainSizes())
67  __domain_sizes.push_back(dom);
68 
69  // create the parser
70  __parser =
71  new DBRowGeneratorParser<>(__database.handler(), DBRowGeneratorSet<>());
72  }
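
The way this constructor lays out the columns (nodes visited in ascending id order, each one looked up by name in the CSV header and given the next free column) can be sketched as follows. ColumnLayout and layoutFromNodes are hypothetical names, the (id, name) pairs stand in for the variables of the Bayesian network, and std::runtime_error stands in for MissingVariableInDatabase; this is an assumption-laden illustration rather than the library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct ColumnLayout {
  std::map<unsigned, std::size_t> node_to_col;  // node id -> column of the learning table
  std::vector<std::size_t>        csv_cols;     // learning column -> column in the CSV file
};

// nodes: (node id, variable name) pairs in arbitrary order
ColumnLayout layoutFromNodes(std::vector<std::pair<unsigned, std::string>> nodes,
                             const std::vector<std::string>& csv_header) {
  // variable name -> position in the CSV header
  std::unordered_map<std::string, std::size_t> header_pos;
  for (std::size_t i = 0; i < csv_header.size(); ++i)
    header_pos.emplace(csv_header[i], i);

  // visit the nodes in ascending id order (pairs compare on the id first)
  std::sort(nodes.begin(), nodes.end());

  ColumnLayout layout;
  std::size_t next_col = 0;
  for (const auto& node : nodes) {
    const auto it = header_pos.find(node.second);
    if (it == header_pos.end())
      throw std::runtime_error("Variable '" + node.second + "' is missing");
    layout.csv_cols.push_back(it->second);
    layout.node_to_col[node.first] = next_col++;
  }
  assert(layout.csv_cols.size() == nodes.size());
  return layout;
}
```

Sorting first makes the resulting column order depend only on the node ids, not on the order in which the graph happens to enumerate its nodes.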

Member Function Documentation

◆ __BNVars()

template<typename GUM_SCALAR >
BayesNet< GUM_SCALAR > gum::learning::genericBNLearner::Database::__BNVars ( ) const
private

Definition at line 76 of file genericBNLearner_tpl.h.

References __database, gum::BayesNet< GUM_SCALAR >::add(), gum::learning::IDatabaseTable< T_DATA, ALLOC >::nbVariables(), and gum::learning::DatabaseTable< ALLOC >::variable().

76  {
77  BayesNet< GUM_SCALAR > bn;
78  const std::size_t nb_vars = __database.nbVariables();
79  for (std::size_t i = 0; i < nb_vars; ++i) {
80  const DiscreteVariable& var =
81  dynamic_cast< const DiscreteVariable& >(__database.variable(i));
82  bn.add(var);
83  }
84  return bn;
85  }

◆ databaseTable()

INLINE const DatabaseTable & gum::learning::genericBNLearner::Database::databaseTable ( ) const

◆ domainSizes()

INLINE const std::vector< std::size_t > & gum::learning::genericBNLearner::Database::domainSizes ( ) const

returns the domain sizes of the variables

Definition at line 47 of file genericBNLearner_inl.h.

References __domain_sizes.

Referenced by gum::learning::genericBNLearner::domainSizes(), and gum::learning::genericBNLearner::nbCols().

47  {
48  return __domain_sizes;
49  }

◆ idFromName()

INLINE NodeId gum::learning::genericBNLearner::Database::idFromName ( const std::string &  var_name) const

returns the node id corresponding to a variable name

Definition at line 67 of file genericBNLearner_inl.h.

References __database, __nodeId2cols, gum::learning::IDatabaseTable< T_DATA, ALLOC >::columnsFromVariableName(), gum::BijectionImplementation< T1, T2, Alloc, Gen >::first(), and GUM_ERROR.

Referenced by gum::learning::genericBNLearner::addForbiddenArc(), gum::learning::genericBNLearner::addMandatoryArc(), gum::learning::genericBNLearner::addPossibleEdge(), gum::learning::genericBNLearner::eraseForbiddenArc(), gum::learning::genericBNLearner::eraseMandatoryArc(), gum::learning::genericBNLearner::erasePossibleEdge(), gum::learning::genericBNLearner::idFromName(), and gum::learning::genericBNLearner::setSliceOrder().

67  {
68  try {
69  const auto cols = __database.columnsFromVariableName(var_name);
70  return __nodeId2cols.first(cols[0]);
71  } catch (...) {
72  GUM_ERROR(MissingVariableInDatabase,
73  "Variable " << var_name
74  << " could not be found in the database");
75  }
76  }
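
The two-step lookup used here (variable name to column, then column to node id through the bijection) can be illustrated with a small stand-in. NodeColumnBijection below is a hypothetical, minimal replacement for gum::Bijection< NodeId, std::size_t > that borrows its first()/second() naming; the free functions mirror idFromName() and nameFromId() but use std exceptions instead of MissingVariableInDatabase.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Minimal stand-in for a bijection between node ids and column indices.
class NodeColumnBijection {
 public:
  void insert(unsigned node, std::size_t col) {
    if (node2col_.count(node) != 0 || col2node_.count(col) != 0)
      throw std::invalid_argument("node or column already mapped");
    node2col_[node] = col;
    col2node_[col] = node;
  }
  // node id associated with a column (throws std::out_of_range if unknown)
  unsigned first(std::size_t col) const { return col2node_.at(col); }
  // column associated with a node id (throws std::out_of_range if unknown)
  std::size_t second(unsigned node) const { return node2col_.at(node); }

 private:
  std::map<unsigned, std::size_t> node2col_;
  std::map<std::size_t, unsigned> col2node_;
};

// name -> column -> node id
unsigned idFromName(const std::vector<std::string>& names,
                    const NodeColumnBijection& node2cols,
                    const std::string& var_name) {
  const auto it = std::find(names.begin(), names.end(), var_name);
  if (it == names.end())
    throw std::runtime_error("Variable " + var_name +
                             " could not be found in the database");
  return node2cols.first(static_cast<std::size_t>(it - names.begin()));
}

// node id -> column -> name
const std::string& nameFromId(const std::vector<std::string>& names,
                              const NodeColumnBijection& node2cols,
                              unsigned id) {
  return names.at(node2cols.second(id));
}
```

Keeping both directions in one structure is what lets the class answer name-to-id and id-to-name queries without scanning the whole mapping.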

◆ missingSymbols()

INLINE const std::vector< std::string > & gum::learning::genericBNLearner::Database::missingSymbols ( ) const

returns the set of missing symbols taken into account

Definition at line 101 of file genericBNLearner_inl.h.

References __database, and gum::learning::IDatabaseTable< T_DATA, ALLOC >::missingSymbols().

Referenced by gum::learning::genericBNLearner::__createApriori().

101  {
102  return __database.missingSymbols();
103  }

◆ nameFromId()

INLINE const std::string & gum::learning::genericBNLearner::Database::nameFromId ( NodeId  id) const

returns the variable name corresponding to a given node id

Definition at line 81 of file genericBNLearner_inl.h.

References __database, __nodeId2cols, GUM_ERROR, gum::BijectionImplementation< T1, T2, Alloc, Gen >::second(), and gum::learning::IDatabaseTable< T_DATA, ALLOC >::variableName().

Referenced by gum::learning::genericBNLearner::nameFromId().

81  {
82  try {
83  return __database.variableName(__nodeId2cols.second(id));
84  } catch (...) {
85  GUM_ERROR(MissingVariableInDatabase,
86  "Variable of Id " << id
87  << " could not be found in the database");
88  }
89  }

◆ names()

INLINE const std::vector< std::string > & gum::learning::genericBNLearner::Database::names ( ) const

returns the names of the variables in the database

Definition at line 53 of file genericBNLearner_inl.h.

References __database, and gum::learning::IDatabaseTable< T_DATA, ALLOC >::variableNames().

Referenced by gum::learning::genericBNLearner::names().

53  {
54  return __database.variableNames();
55  }

◆ nbRows()

INLINE std::size_t gum::learning::genericBNLearner::Database::nbRows ( ) const

returns the number of records in the database

Definition at line 114 of file genericBNLearner_inl.h.

References __database, and gum::learning::IDatabaseTable< T_DATA, ALLOC >::nbRows().

114  {
115  return __database.nbRows();
116  }

◆ nodeId2Columns()

INLINE const Bijection< NodeId, std::size_t > & gum::learning::genericBNLearner::Database::nodeId2Columns ( ) const

returns the mapping between node ids and their columns in the database

Definition at line 108 of file genericBNLearner_inl.h.

References __nodeId2cols.

Referenced by gum::learning::genericBNLearner::__createApriori(), gum::learning::genericBNLearner::__createCorrectedMutualInformation(), gum::learning::genericBNLearner::__createParamEstimator(), gum::learning::genericBNLearner::__createScore(), and Database().

108  {
109  return __nodeId2cols;
110  }

◆ operator=() [1/2]

genericBNLearner::Database & gum::learning::genericBNLearner::Database::operator= ( const Database &  from)

copy operator

Definition at line 157 of file genericBNLearner.cpp.

References __database, __domain_sizes, __nodeId2cols, __parser, and gum::learning::IDatabaseTable< T_DATA, ALLOC >::handler().

Referenced by ~Database().

157  {
158  if (this != &from) {
159  delete __parser;
160  __database = from.__database;
161  __domain_sizes = from.__domain_sizes;
162  __nodeId2cols = from.__nodeId2cols;
163 
164  // create the parser
165  __parser =
166  new DBRowGeneratorParser<>(__database.handler(), DBRowGeneratorSet<>());
167  }
168 
169  return *this;
170  }
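
The pattern followed by the copy operations (copy the data, then build a fresh parser over the object's own table instead of copying the pointer) can be sketched with standard types. RowParser and OwningTable are hypothetical stand-ins for DBRowGeneratorParser and Database. Resetting the pointer to nullptr after delete is an addition of this sketch, made so that the destructor stays safe if a later step throws.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Stand-in for the parser: it only remembers which table it reads from.
struct RowParser {
  explicit RowParser(const std::vector<int>& table) : rows(&table) {}
  const std::vector<int>* rows;
};

// Stand-in for Database: owns its rows and a heap-allocated parser over them.
class OwningTable {
 public:
  explicit OwningTable(std::vector<int> rows)
      : rows_(std::move(rows)), parser_(new RowParser(rows_)) {}

  OwningTable(const OwningTable& from)
      : rows_(from.rows_), parser_(new RowParser(rows_)) {}

  OwningTable& operator=(const OwningTable& from) {
    if (this != &from) {
      delete parser_;
      parser_ = nullptr;               // keeps the destructor safe on failure
      rows_ = from.rows_;
      parser_ = new RowParser(rows_);  // bound to our own rows, not from's
    }
    return *this;
  }

  ~OwningTable() { delete parser_; }

  const std::vector<int>& rows() const { return rows_; }
  const RowParser& parser() const {
    assert(parser_ != nullptr);
    return *parser_;
  }

 private:
  std::vector<int> rows_;    // declared first: parser_ is built from it
  RowParser*       parser_;
};
```

Copying the raw pointer instead would leave two objects deleting the same parser, and the copy's parser would still read the source object's table.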

◆ operator=() [2/2]

genericBNLearner::Database & gum::learning::genericBNLearner::Database::operator= ( Database &&  from)

move operator

Definition at line 173 of file genericBNLearner.cpp.

References __database, __domain_sizes, __nodeId2cols, __parser, and gum::learning::IDatabaseTable< T_DATA, ALLOC >::handler().

173  {
174  if (this != &from) {
175  delete __parser;
176  __database = std::move(from.__database);
177  __domain_sizes = std::move(from.__domain_sizes);
178  __nodeId2cols = std::move(from.__nodeId2cols);
179 
180  // create the parser
181  __parser =
182  new DBRowGeneratorParser<>(__database.handler(), DBRowGeneratorSet<>());
183  }
184 
185  return *this;
186  }

◆ parser()

INLINE DBRowGeneratorParser & gum::learning::genericBNLearner::Database::parser ( )

returns the parser for the database

Definition at line 41 of file genericBNLearner_inl.h.

References __parser.

Referenced by gum::learning::genericBNLearner::__createApriori(), gum::learning::genericBNLearner::__createCorrectedMutualInformation(), gum::learning::genericBNLearner::__createScore(), gum::learning::genericBNLearner::chi2(), gum::learning::genericBNLearner::G2(), gum::learning::genericBNLearner::logLikelihood(), and gum::learning::genericBNLearner::useDatabaseRanges().

41  {
42  return *__parser;
43  }

◆ setDatabaseWeight()

INLINE void gum::learning::genericBNLearner::Database::setDatabaseWeight ( const double  new_weight)

assign a weight to all the rows of the database so that the sum of their weights is equal to new_weight

assign new weight to the rows of the learning database

Definition at line 59 of file genericBNLearner_inl.h.

References __database, gum::learning::IDatabaseTable< T_DATA, ALLOC >::nbRows(), gum::learning::IDatabaseTable< T_DATA, ALLOC >::setAllRowsWeight(), and weight().

Referenced by gum::learning::genericBNLearner::setDatabaseWeight().

59  {
60  if (__database.nbRows() == std::size_t(0)) return;
61  const double weight = new_weight / double(__database.nbRows());
62  __database.setAllRowsWeight(weight);
63  }
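
The arithmetic is simple: every row receives new_weight divided by the number of rows, so the weights sum to new_weight, and an empty database is left untouched. A minimal sketch over a plain vector of weights, where setTotalWeight is a hypothetical name and the vector stands in for the rows of the DatabaseTable:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Gives every row the same weight so that the weights sum to new_weight.
// Does nothing when there are no rows (avoids a division by zero).
void setTotalWeight(std::vector<double>& row_weights, double new_weight) {
  if (row_weights.empty()) return;
  const double weight = new_weight / static_cast<double>(row_weights.size());
  std::fill(row_weights.begin(), row_weights.end(), weight);
}
```

For example, with 4 rows and new_weight = 2.0, each row ends up with a weight of 0.5.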

◆ setWeight()

INLINE void gum::learning::genericBNLearner::Database::setWeight ( const std::size_t  i,
const double  weight 
)

sets the weight of the ith record

Exceptions
OutOfBounds  if i is outside the set of indices of the records or if the weight is negative

Definition at line 126 of file genericBNLearner_inl.h.

References __database, and gum::learning::IDatabaseTable< T_DATA, ALLOC >::setWeight().

Referenced by gum::learning::genericBNLearner::setRecordWeight().

127  {
128  __database.setWeight(i, weight);
129  }
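
The documented contract (reject an index outside the records and reject a negative weight) can be sketched as below. setRowWeight is a hypothetical helper over a plain vector, and std::out_of_range stands in for aGrUM's OutOfBounds exception; in the library the checks are performed by the underlying database table, not by this wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Sets the weight of the ith row, mirroring the documented error conditions.
void setRowWeight(std::vector<double>& row_weights, std::size_t i, double weight) {
  if (i >= row_weights.size())
    throw std::out_of_range("row index outside the set of records");
  if (weight < 0.0)
    throw std::out_of_range("the weight of a record cannot be negative");
  row_weights[i] = weight;
}
```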

◆ size()

INLINE std::size_t gum::learning::genericBNLearner::Database::size ( ) const

returns the number of records in the database

Definition at line 120 of file genericBNLearner_inl.h.

References __database, and gum::learning::IDatabaseTable< T_DATA, ALLOC >::size().

120  {
121  return __database.size();
122  }

◆ weight() [1/2]

INLINE double gum::learning::genericBNLearner::Database::weight ( const std::size_t  i) const

returns the weight of the ith record

Exceptions
OutOfBounds  if i is outside the set of indices of the records

Definition at line 133 of file genericBNLearner_inl.h.

References __database, and gum::learning::IDatabaseTable< T_DATA, ALLOC >::weight().

Referenced by gum::learning::genericBNLearner::databaseWeight(), and gum::learning::genericBNLearner::recordWeight().

133  {
134  return __database.weight(i);
135  }

◆ weight() [2/2]

INLINE double gum::learning::genericBNLearner::Database::weight ( ) const

returns the weight of the whole database

Definition at line 139 of file genericBNLearner_inl.h.

References __database, and gum::learning::IDatabaseTable< T_DATA, ALLOC >::weight().

Referenced by gum::learning::genericBNLearner::__setAprioriWeight(), and setDatabaseWeight().

139  {
140  return __database.weight();
141  }

Member Data Documentation

◆ __database

DatabaseTable gum::learning::genericBNLearner::Database::__database
protected

◆ __domain_sizes

std::vector< std::size_t > gum::learning::genericBNLearner::Database::__domain_sizes
protected

the domain sizes of the variables (useful to speed-up computations)

Definition at line 269 of file genericBNLearner.h.

Referenced by Database(), domainSizes(), and operator=().

◆ __max_threads_number

Size gum::learning::genericBNLearner::Database::__max_threads_number {1}
protected

the max number of threads authorized

Definition at line 278 of file genericBNLearner.h.

◆ __min_nb_rows_per_thread

Size gum::learning::genericBNLearner::Database::__min_nb_rows_per_thread {100}
protected

the minimal number of rows to parse (on average) by thread

Definition at line 282 of file genericBNLearner.h.

◆ __nodeId2cols

Bijection< NodeId, std::size_t > gum::learning::genericBNLearner::Database::__nodeId2cols
protected

a bijection assigning to each NodeId its column in the database

Definition at line 272 of file genericBNLearner.h.

Referenced by Database(), idFromName(), nameFromId(), nodeId2Columns(), and operator=().

◆ __parser

DBRowGeneratorParser* gum::learning::genericBNLearner::Database::__parser {nullptr}
protected

the parser used for reading the database

Definition at line 266 of file genericBNLearner.h.

Referenced by Database(), operator=(), parser(), and ~Database().


The documentation for this class was generated from the following files: