aGrUM  0.20.3
a C++ library for (probabilistic) graphical models
gum::learning::genericBNLearner::Database Class Reference

a helper to easily read databases

#include <genericBNLearner.h>


Public Member Functions

template<typename GUM_SCALAR >
 Database (const std::string &filename, const BayesNet< GUM_SCALAR > &bn, const std::vector< std::string > &missing_symbols)
 
Constructors / Destructors
 Database (const std::string &file, const std::vector< std::string > &missing_symbols)
 default constructor
 
 Database (const DatabaseTable<> &db)
 default constructor
 
 Database (const std::string &filename, Database &score_database, const std::vector< std::string > &missing_symbols)
 constructor for the aprioris
 
template<typename GUM_SCALAR >
 Database (const std::string &filename, const gum::BayesNet< GUM_SCALAR > &bn, const std::vector< std::string > &missing_symbols)
 constructor with a BN providing the variables of interest
 
 Database (const Database &from)
 copy constructor
 
 Database (Database &&from)
 move constructor
 
 ~Database ()
 destructor
 
Operators
Database & operator= (const Database &from)
 copy operator
 
Database & operator= (Database &&from)
 move operator
 
Accessors / Modifiers
DBRowGeneratorParser & parser ()
 returns the parser for the database
 
const std::vector< std::size_t > & domainSizes () const
 returns the domain sizes of the variables
 
const std::vector< std::string > & names () const
 returns the names of the variables in the database
 
NodeId idFromName (const std::string &var_name) const
 returns the node id corresponding to a variable name
 
const std::string & nameFromId (NodeId id) const
 returns the variable name corresponding to a given node id
 
const DatabaseTable & databaseTable () const
 returns the internal database table
 
void setDatabaseWeight (const double new_weight)
 assign a weight to all the rows of the database so that the sum of their weights is equal to new_weight
 
const Bijection< NodeId, std::size_t > & nodeId2Columns () const
 returns the mapping between node ids and their columns in the database
 
const std::vector< std::string > & missingSymbols () const
 returns the set of missing symbols taken into account
 
std::size_t nbRows () const
 returns the number of records in the database
 
std::size_t size () const
 returns the number of records in the database
 
void setWeight (const std::size_t i, const double weight)
 sets the weight of the ith record
 
double weight (const std::size_t i) const
 returns the weight of the ith record
 
double weight () const
 returns the weight of the whole database
 

Protected Attributes

DatabaseTable _database_
 the database itself
 
DBRowGeneratorParser * _parser_ {nullptr}
 the parser used for reading the database
 
std::vector< std::size_t > _domain_sizes_
 the domain sizes of the variables (useful to speed-up computations)
 
Bijection< NodeId, std::size_t > _nodeId2cols_
 a bijection assigning to each NodeId its column in the database
 
Size _max_threads_number_ {1}
 the max number of threads authorized
 
Size _min_nb_rows_per_thread_ {100}
 the minimal number of rows to parse (on average) by thread
 

Detailed Description

a helper to easily read databases

Definition at line 145 of file genericBNLearner.h.

Constructor & Destructor Documentation

◆ Database() [1/7]

gum::learning::genericBNLearner::Database::Database ( const std::string &  file,
const std::vector< std::string > &  missing_symbols 
)
explicit

default constructor

Parameters
file             the name of the CSV file containing the data
missing_symbols  the set of symbols in the CSV file that correspond to missing data

Definition at line 67 of file genericBNLearner.cpp.

    genericBNLearner::Database::Database(const std::string&                filename,
                                         const std::vector< std::string >& missing_symbols) :
        Database(genericBNLearner::readFile_(filename, missing_symbols)) {}

◆ Database() [2/7]

gum::learning::genericBNLearner::Database::Database ( const DatabaseTable<> &  db)
explicit

default constructor

Parameters
db  an already initialized database table that is used to fill the Database

Definition at line 52 of file genericBNLearner.cpp.

    genericBNLearner::Database::Database(const DatabaseTable<>& db) : _database_(db) {
      // get the variables names
      const auto& var_names = _database_.variableNames();
      const std::size_t nb_vars = var_names.size();
      for (auto dom: _database_.domainSizes())
        _domain_sizes_.push_back(dom);
      // the ith column is assigned the node id i
      for (std::size_t i = 0; i < nb_vars; ++i) {
        _nodeId2cols_.insert(NodeId(i), i);
      }

      // create the parser
      _parser_ = new DBRowGeneratorParser<>(_database_.handler(), DBRowGeneratorSet<>());
    }

◆ Database() [3/7]

gum::learning::genericBNLearner::Database::Database ( const std::string &  filename,
Database &  score_database,
const std::vector< std::string > &  missing_symbols 
)

constructor for the aprioris

The variables of this Database must be identical to those of the score database; otherwise the counts used by the scores might be erroneous. The two databases may, however, order their variables differently: variables with the same name in both databases are assumed to be the same.

Parameters
filename         the name of the CSV file containing the data
score_database   the main database used for the learning
missing_symbols  the set of symbols in the CSV file that correspond to missing data

Definition at line 72 of file genericBNLearner.cpp.

    genericBNLearner::Database::Database(const std::string&                CSV_filename,
                                         Database&                         score_database,
                                         const std::vector< std::string >& missing_symbols) {
      // assign to each column name in the CSV file its column
      genericBNLearner::checkFileName_(CSV_filename);
      DBInitializerFromCSV<> initializer(CSV_filename);
      const auto& apriori_names = initializer.variableNames();
      std::size_t apriori_nb_vars = apriori_names.size();
      HashTable< std::string, std::size_t > apriori_names2col(apriori_nb_vars);
      for (std::size_t i = std::size_t(0); i < apriori_nb_vars; ++i)
        apriori_names2col.insert(apriori_names[i], i);

      // check that there are at least as many variables in the a priori
      // database as those in the score_database
      if (apriori_nb_vars < score_database._database_.nbVariables()) {
        GUM_ERROR(InvalidArgument,
                  "the a apriori database has fewer variables "
                  "than the observed database");
      }

      // get the mapping from the columns of score_database to those of
      // the CSV file
      const std::vector< std::string >& score_names
         = score_database.databaseTable().variableNames();
      const std::size_t score_nb_vars = score_names.size();
      HashTable< std::size_t, std::size_t > mapping(score_nb_vars);
      for (std::size_t i = std::size_t(0); i < score_nb_vars; ++i) {
        try {
          mapping.insert(i, apriori_names2col[score_names[i]]);
        } catch (Exception&) {
          GUM_ERROR(MissingVariableInDatabase,
                    "Variable " << score_names[i]
                                << " of the observed database does not belong to the "
                                << "apriori database");
        }
      }

      // create the translators for CSV database
      for (std::size_t i = std::size_t(0); i < score_nb_vars; ++i) {
        const Variable& var = score_database.databaseTable().variable(i);
        _database_.insertTranslator(var, mapping[i], missing_symbols);
      }

      // fill the database
      initializer.fillDatabase(_database_);

      // get the domain sizes of the variables
      for (auto dom: _database_.domainSizes())
        _domain_sizes_.push_back(dom);

      // compute the mapping from node ids to column indices
      _nodeId2cols_ = score_database.nodeId2Columns();

      // create the parser
      _parser_ = new DBRowGeneratorParser<>(_database_.handler(), DBRowGeneratorSet<>());
    }
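
The name-based column matching described above can be illustrated outside aGrUM. What follows is a minimal standalone sketch using only the standard library. The function `mapScoreColumnsToCsv` and the standard exceptions it throws are hypothetical stand-ins chosen for illustration; they are not part of the aGrUM API, which uses `HashTable`, `GUM_ERROR`, `InvalidArgument` and `MissingVariableInDatabase` instead.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not aGrUM API): for each variable of the score
// database, return the index of the CSV column that has the same name.
std::vector<std::size_t>
mapScoreColumnsToCsv(const std::vector<std::string>& score_names,
                     const std::vector<std::string>& apriori_names) {
  // the apriori file must contain at least as many variables
  if (apriori_names.size() < score_names.size())
    throw std::invalid_argument(
       "the apriori database has fewer variables than the observed database");

  // index the CSV column names by their position
  std::unordered_map<std::string, std::size_t> names2col;
  for (std::size_t i = 0; i < apriori_names.size(); ++i)
    names2col.emplace(apriori_names[i], i);

  // look up every score variable by name; the column order may differ
  std::vector<std::size_t> mapping;
  mapping.reserve(score_names.size());
  for (const auto& name : score_names) {
    const auto it = names2col.find(name);
    if (it == names2col.end())
      throw std::out_of_range("Variable " + name
                              + " does not belong to the apriori database");
    mapping.push_back(it->second);
  }
  return mapping;
}
```

In this sketch, a score database with variables `A, B` and a CSV file with columns `B, C, A` yield a mapping that sends score column 0 to CSV column 2 and score column 1 to CSV column 0; the extra column `C` is simply ignored.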

◆ Database() [4/7]

template<typename GUM_SCALAR >
gum::learning::genericBNLearner::Database::Database ( const std::string &  filename,
const gum::BayesNet< GUM_SCALAR > &  bn,
const std::vector< std::string > &  missing_symbols 
)

constructor with a BN providing the variables of interest

Parameters
filename         the name of the CSV file containing the data
bn               a Bayesian network indicating which variables of the CSV file are used for learning
missing_symbols  the set of symbols in the CSV file that correspond to missing data

◆ Database() [5/7]

gum::learning::genericBNLearner::Database::Database ( const Database &  from)

copy constructor

Definition at line 130 of file genericBNLearner.cpp.

    genericBNLearner::Database::Database(const Database& from) :
        _database_(from._database_), _domain_sizes_(from._domain_sizes_),
        _nodeId2cols_(from._nodeId2cols_) {
      // create the parser
      _parser_ = new DBRowGeneratorParser<>(_database_.handler(), DBRowGeneratorSet<>());
    }

◆ Database() [6/7]

gum::learning::genericBNLearner::Database::Database ( Database &&  from)

move constructor

Definition at line 138 of file genericBNLearner.cpp.

    genericBNLearner::Database::Database(Database&& from) :
        _database_(std::move(from._database_)), _domain_sizes_(std::move(from._domain_sizes_)),
        _nodeId2cols_(std::move(from._nodeId2cols_)) {
      // create the parser
      _parser_ = new DBRowGeneratorParser<>(_database_.handler(), DBRowGeneratorSet<>());
    }

◆ ~Database()

gum::learning::genericBNLearner::Database::~Database ( )

destructor

Definition at line 146 of file genericBNLearner.cpp.

    genericBNLearner::Database::~Database() { delete _parser_; }

◆ Database() [7/7]

template<typename GUM_SCALAR >
gum::learning::genericBNLearner::Database::Database ( const std::string &  filename,
const BayesNet< GUM_SCALAR > &  bn,
const std::vector< std::string > &  missing_symbols 
)

Definition at line 31 of file genericBNLearner_tpl.h.

References Database().

Referenced by _BNVars_(), gum::learning::Miic::_existsDirectedPath_(), gum::learning::Miic::_existsNonTrivialDirectedPath_(), gum::learning::Miic::_isNotLatentCouple_(), gum::learning::Miic::_orientingVstructureMiic_(), gum::learning::Miic::_propagatingOrientationMiic_(), gum::learning::genericBNLearner::_setAprioriWeight_(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::_varOrderFromCSV_(), gum::learning::Miic::addConstraints(), gum::learning::genericBNLearner::addForbiddenArc(), gum::learning::genericBNLearner::addMandatoryArc(), gum::learning::genericBNLearner::addPossibleEdge(), gum::learning::BNLearnerListener::BNLearnerListener(), gum::learning::genericBNLearner::clearDatabaseRanges(), gum::learning::DAG2BNLearner< ALLOC >::clone(), gum::learning::DAG2BNLearner< ALLOC >::createBN(), gum::learning::DAG2BNLearner< ALLOC >::DAG2BNLearner(), Database(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::database(), gum::learning::genericBNLearner::database(), gum::learning::genericBNLearner::databaseRanges(), databaseTable(), gum::learning::genericBNLearner::databaseWeight(), gum::learning::genericBNLearner::domainSize(), domainSizes(), gum::learning::genericBNLearner::domainSizes(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::drawSamples(), gum::learning::genericBNLearner::eraseForbiddenArc(), gum::learning::genericBNLearner::eraseMandatoryArc(), gum::learning::genericBNLearner::erasePossibleEdge(), gum::learning::Miic::findBestContributor_(), gum::learning::genericBNLearner::genericBNLearner(), gum::learning::genericBNLearner::getAprioriType_(), gum::learning::GreedyHillClimbing::GreedyHillClimbing(), gum::learning::genericBNLearner::hasMissingValues(), idFromName(), gum::learning::genericBNLearner::idFromName(), gum::learning::Miic::initiation_(), gum::learning::Miic::isForbidenArc_(), gum::learning::Miic::isOrientable_(), gum::learning::Miic::iteration_(), gum::learning::Miic::latentVariables(), gum::learning::genericBNLearner::latentVariables(), 
gum::learning::K2::learnBN(), gum::learning::GreedyHillClimbing::learnBN(), gum::learning::LocalSearchWithTabuList::learnBN(), gum::learning::Miic::learnBN(), gum::learning::Miic::learnMixedStructure(), gum::learning::K2::learnStructure(), gum::learning::GreedyHillClimbing::learnStructure(), gum::learning::LocalSearchWithTabuList::learnStructure(), gum::learning::Miic::learnStructure(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::log2likelihood(), gum::learning::Miic::Miic(), missingSymbols(), nameFromId(), gum::learning::genericBNLearner::nameFromId(), names(), gum::learning::genericBNLearner::names(), gum::learning::genericBNLearner::nbCols(), nbRows(), gum::learning::genericBNLearner::nbRows(), nodeId2Columns(), gum::learning::GreaterPairOn2nd::operator()(), gum::learning::GreaterAbsPairOn2nd::operator()(), gum::learning::GreaterTupleOnLast::operator()(), gum::learning::StructuralConstraintMandatoryArcs::operator=(), gum::learning::StructuralConstraintPossibleEdges::operator=(), gum::learning::StructuralConstraintForbiddenArcs::operator=(), gum::learning::BNLearnerListener::operator=(), gum::learning::StructuralConstraintIndegree::operator=(), gum::learning::StructuralConstraintUndiGraph::operator=(), gum::learning::StructuralConstraintDAG::operator=(), gum::learning::StructuralConstraintDiGraph::operator=(), gum::learning::GreedyHillClimbing::operator=(), gum::learning::DAG2BNLearner< ALLOC >::operator=(), gum::learning::StructuralConstraintTabuList::operator=(), gum::learning::StructuralConstraintSliceOrder::operator=(), gum::learning::Miic::operator=(), gum::learning::Miic::orientation3off2_(), gum::learning::Miic::orientationLatents_(), gum::learning::Miic::orientationMiic_(), parser(), gum::learning::Miic::propagatesOrientationInChainOfRemainingEdges_(), gum::learning::Miic::propagatesRemainingOrientableEdges_(), gum::learning::genericBNLearner::recordWeight(), gum::learning::Miic::set3of2Behaviour(), gum::learning::BNDatabaseGenerator< GUM_SCALAR 
>::setAntiTopologicalVarOrder(), setDatabaseWeight(), gum::learning::genericBNLearner::setDatabaseWeight(), gum::learning::genericBNLearner::setForbiddenArcs(), gum::learning::genericBNLearner::setInitialDAG(), gum::learning::genericBNLearner::setMandatoryArcs(), gum::learning::genericBNLearner::setMaxIndegree(), gum::learning::Miic::setMiicBehaviour(), gum::learning::genericBNLearner::setPossibleEdges(), gum::learning::genericBNLearner::setPossibleSkeleton(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::setRandomVarOrder(), gum::learning::genericBNLearner::setRecordWeight(), gum::learning::genericBNLearner::setSliceOrder(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::setTopologicalVarOrder(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::setVarOrder(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::setVarOrderFromCSV(), setWeight(), size(), gum::learning::StructuralConstraintDAG::StructuralConstraintDAG(), gum::learning::StructuralConstraintDiGraph::StructuralConstraintDiGraph(), gum::learning::StructuralConstraintForbiddenArcs::StructuralConstraintForbiddenArcs(), gum::learning::StructuralConstraintIndegree::StructuralConstraintIndegree(), gum::learning::StructuralConstraintMandatoryArcs::StructuralConstraintMandatoryArcs(), gum::learning::StructuralConstraintPossibleEdges::StructuralConstraintPossibleEdges(), gum::learning::StructuralConstraintSliceOrder::StructuralConstraintSliceOrder(), gum::learning::StructuralConstraintTabuList::StructuralConstraintTabuList(), gum::learning::StructuralConstraintUndiGraph::StructuralConstraintUndiGraph(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::toCSV(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::toDatabaseTable(), gum::learning::GraphChange::toString(), gum::learning::ArcAddition::toString(), gum::learning::ArcDeletion::toString(), gum::learning::ArcReversal::toString(), gum::learning::EdgeAddition::toString(), gum::learning::EdgeDeletion::toString(), 
gum::learning::Miic::unshieldedTriples_(), gum::learning::Miic::unshieldedTriplesMiic_(), gum::learning::Miic::updateProbaTriples_(), gum::learning::genericBNLearner::use3off2(), gum::learning::genericBNLearner::useAprioriBDeu(), gum::learning::genericBNLearner::useAprioriDirichlet(), gum::learning::genericBNLearner::useAprioriSmoothing(), gum::learning::genericBNLearner::useDatabaseRanges(), gum::learning::genericBNLearner::useEM(), gum::learning::genericBNLearner::useGreedyHillClimbing(), gum::learning::genericBNLearner::useK2(), gum::learning::genericBNLearner::useLocalSearchWithTabuList(), gum::learning::genericBNLearner::useMDLCorrection(), gum::learning::genericBNLearner::useMIIC(), gum::learning::genericBNLearner::useNMLCorrection(), gum::learning::genericBNLearner::useNoApriori(), gum::learning::genericBNLearner::useNoCorrection(), gum::learning::genericBNLearner::useScoreAIC(), gum::learning::genericBNLearner::useScoreBD(), gum::learning::genericBNLearner::useScoreBDeu(), gum::learning::genericBNLearner::useScoreBIC(), gum::learning::genericBNLearner::useScoreK2(), gum::learning::genericBNLearner::useScoreLog2Likelihood(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::varOrder(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::varOrderNames(), weight(), gum::learning::BNLearnerListener::whenProgress(), gum::learning::BNLearnerListener::whenStop(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::~BNDatabaseGenerator(), gum::learning::BNLearnerListener::~BNLearnerListener(), gum::learning::DAG2BNLearner< ALLOC >::~DAG2BNLearner(), gum::learning::GreedyHillClimbing::~GreedyHillClimbing(), gum::learning::Miic::~Miic(), gum::learning::StructuralConstraintDAG::~StructuralConstraintDAG(), gum::learning::StructuralConstraintDiGraph::~StructuralConstraintDiGraph(), gum::learning::StructuralConstraintForbiddenArcs::~StructuralConstraintForbiddenArcs(), gum::learning::StructuralConstraintIndegree::~StructuralConstraintIndegree(), 
gum::learning::StructuralConstraintMandatoryArcs::~StructuralConstraintMandatoryArcs(), gum::learning::StructuralConstraintPossibleEdges::~StructuralConstraintPossibleEdges(), gum::learning::StructuralConstraintSliceOrder::~StructuralConstraintSliceOrder(), gum::learning::StructuralConstraintTabuList::~StructuralConstraintTabuList(), and gum::learning::StructuralConstraintUndiGraph::~StructuralConstraintUndiGraph().

    template < typename GUM_SCALAR >
    genericBNLearner::Database::Database(const std::string&                filename,
                                         const BayesNet< GUM_SCALAR >&     bn,
                                         const std::vector< std::string >& missing_symbols) {
      // assign to each column name in the database its position
      genericBNLearner::checkFileName_(filename);
      DBInitializerFromCSV<> initializer(filename);
      const auto& xvar_names = initializer.variableNames();
      std::size_t nb_vars = xvar_names.size();
      HashTable< std::string, std::size_t > var_names(nb_vars);
      for (std::size_t i = std::size_t(0); i < nb_vars; ++i)
        var_names.insert(xvar_names[i], i);

      // we use the bn to insert the translators into the database table
      std::vector< NodeId > nodes;
      nodes.reserve(bn.dag().sizeNodes());
      for (const auto node: bn.dag())
        nodes.push_back(node);
      std::sort(nodes.begin(), nodes.end());
      std::size_t i = std::size_t(0);
      for (auto node: nodes) {
        const Variable& var = bn.variable(node);
        try {
          _database_.insertTranslator(var, var_names[var.name()], missing_symbols);
        } catch (NotFound&) {
          GUM_ERROR(MissingVariableInDatabase, "Variable '" << var.name() << "' is missing")
        }
        _nodeId2cols_.insert(NodeId(node), i++);
      }

      // fill the database
      initializer.fillDatabase(_database_);

      // get the domain sizes of the variables
      for (auto dom: _database_.domainSizes())
        _domain_sizes_.push_back(dom);

      // create the parser
      _parser_ = new DBRowGeneratorParser<>(_database_.handler(), DBRowGeneratorSet<>());
    }
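
The constructor above sorts the node ids of the Bayesian network and assigns them consecutive database columns in that order. The following standalone sketch illustrates only that step. The function `columnsFromNodeIds` is a hypothetical name, and `std::map` is used here as a simple stand-in for the `Bijection` that aGrUM actually stores.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper (not aGrUM API): assign to each node id the rank it
// has once the ids are sorted in increasing order.
std::map<std::size_t, std::size_t> columnsFromNodeIds(std::vector<std::size_t> nodes) {
  // sorting makes the column order independent of the iteration order
  std::sort(nodes.begin(), nodes.end());

  std::map<std::size_t, std::size_t> node2col;
  std::size_t col = 0;
  for (const auto node : nodes)
    node2col.emplace(node, col++);
  return node2col;
}
```

With node ids `5, 2, 9`, this sketch assigns column 0 to node 2, column 1 to node 5 and column 2 to node 9, whatever order the ids were supplied in.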

Member Function Documentation

◆ _BNVars_()

template<typename GUM_SCALAR >
BayesNet< GUM_SCALAR > gum::learning::genericBNLearner::Database::_BNVars_ ( ) const
private

Definition at line 73 of file genericBNLearner_tpl.h.

References Database().

    template < typename GUM_SCALAR >
    BayesNet< GUM_SCALAR > genericBNLearner::Database::_BNVars_() const {
      BayesNet< GUM_SCALAR > bn;
      const std::size_t nb_vars = _database_.nbVariables();
      for (std::size_t i = 0; i < nb_vars; ++i) {
        const DiscreteVariable& var
           = dynamic_cast< const DiscreteVariable& >(_database_.variable(i));
        bn.add(var);
      }
      return bn;
    }

◆ databaseTable()

INLINE const DatabaseTable & gum::learning::genericBNLearner::Database::databaseTable ( ) const

returns the internal database table

Definition at line 83 of file genericBNLearner_inl.h.

References Database().

    INLINE const DatabaseTable<>& genericBNLearner::Database::databaseTable() const {
      return _database_;
    }

◆ domainSizes()

INLINE const std::vector< std::size_t > & gum::learning::genericBNLearner::Database::domainSizes ( ) const

returns the domain sizes of the variables

Definition at line 43 of file genericBNLearner_inl.h.

References Database().

    INLINE const std::vector< std::size_t >& genericBNLearner::Database::domainSizes() const {
      return _domain_sizes_;
    }

◆ idFromName()

INLINE NodeId gum::learning::genericBNLearner::Database::idFromName ( const std::string &  var_name) const

returns the node id corresponding to a variable name

Definition at line 60 of file genericBNLearner_inl.h.

References Database().

    INLINE NodeId genericBNLearner::Database::idFromName(const std::string& var_name) const {
      try {
        const auto cols = _database_.columnsFromVariableName(var_name);
        return _nodeId2cols_.first(cols[0]);
      } catch (...) {
        GUM_ERROR(MissingVariableInDatabase,
                  "Variable " << var_name << " could not be found in the database");
      }
    }

◆ missingSymbols()

INLINE const std::vector< std::string > & gum::learning::genericBNLearner::Database::missingSymbols ( ) const

returns the set of missing symbols taken into account

Definition at line 89 of file genericBNLearner_inl.h.

References Database().

    INLINE const std::vector< std::string >& genericBNLearner::Database::missingSymbols() const {
      return _database_.missingSymbols();
    }

◆ nameFromId()

INLINE const std::string & gum::learning::genericBNLearner::Database::nameFromId ( NodeId  id) const

returns the variable name corresponding to a given node id

Definition at line 72 of file genericBNLearner_inl.h.

References Database().

    INLINE const std::string& genericBNLearner::Database::nameFromId(NodeId id) const {
      try {
        return _database_.variableName(_nodeId2cols_.second(id));
      } catch (...) {
        GUM_ERROR(MissingVariableInDatabase,
                  "Variable of Id " << id << " could not be found in the database");
      }
    }
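
Both idFromName() and nameFromId() go through the node id / column bijection: one looks up the column by name and then the id by column, the other looks up the column by id and then the name by column. The sketch below reproduces that two-step lookup with standard containers. The class `NameIndex` and its members are hypothetical and only mimic the behaviour described here; they are not the aGrUM `Bijection` or `DatabaseTable` types, and `std::out_of_range` stands in for `MissingVariableInDatabase`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical miniature of the lookup logic (not aGrUM API).
class NameIndex {
  public:
  using NodeId = std::size_t;

  // associate a node id with a column and give that column a name
  void insert(NodeId id, std::size_t col, const std::string& name) {
    id2col_.emplace(id, col);
    col2id_.emplace(col, id);
    if (col >= names_.size()) names_.resize(col + 1);
    names_[col] = name;
  }

  // name -> column -> node id
  NodeId idFromName(const std::string& name) const {
    for (std::size_t col = 0; col < names_.size(); ++col) {
      if (names_[col] != name) continue;
      const auto it = col2id_.find(col);
      if (it != col2id_.end()) return it->second;
    }
    throw std::out_of_range("Variable " + name + " could not be found");
  }

  // node id -> column -> name
  const std::string& nameFromId(NodeId id) const {
    const auto it = id2col_.find(id);
    if (it == id2col_.end())
      throw std::out_of_range("Variable of Id " + std::to_string(id)
                              + " could not be found");
    return names_[it->second];
  }

  private:
  std::unordered_map<NodeId, std::size_t> id2col_;   // node id -> column
  std::unordered_map<std::size_t, NodeId> col2id_;   // column -> node id
  std::vector<std::string>                names_;    // column -> name
};
```

Keeping both directions of the mapping is what makes each lookup cheap; a single one-way map would force a linear scan for one of the two accessors.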

◆ names()

INLINE const std::vector< std::string > & gum::learning::genericBNLearner::Database::names ( ) const

returns the names of the variables in the database

Definition at line 48 of file genericBNLearner_inl.h.

References Database().

    INLINE const std::vector< std::string >& genericBNLearner::Database::names() const {
      return _database_.variableNames();
    }

◆ nbRows()

INLINE std::size_t gum::learning::genericBNLearner::Database::nbRows ( ) const

returns the number of records in the database

Definition at line 102 of file genericBNLearner_inl.h.

References Database().

    INLINE std::size_t genericBNLearner::Database::nbRows() const { return _database_.nbRows(); }

◆ nodeId2Columns()

INLINE const Bijection< NodeId, std::size_t > & gum::learning::genericBNLearner::Database::nodeId2Columns ( ) const

returns the mapping between node ids and their columns in the database

Definition at line 96 of file genericBNLearner_inl.h.

References Database().

    INLINE const Bijection< NodeId, std::size_t >& genericBNLearner::Database::nodeId2Columns() const {
      return _nodeId2cols_;
    }

◆ operator=() [1/2]

genericBNLearner::Database & gum::learning::genericBNLearner::Database::operator= ( const Database &  from)

copy operator

Definition at line 148 of file genericBNLearner.cpp.

    genericBNLearner::Database& genericBNLearner::Database::operator=(const Database& from) {
      if (this != &from) {
        delete _parser_;
        _database_ = from._database_;
        _domain_sizes_ = from._domain_sizes_;
        _nodeId2cols_ = from._nodeId2cols_;

        // create the parser
        _parser_ = new DBRowGeneratorParser<>(_database_.handler(), DBRowGeneratorSet<>());
      }

      return *this;
    }

◆ operator=() [2/2]

genericBNLearner::Database & gum::learning::genericBNLearner::Database::operator= ( Database &&  from)

move operator

Definition at line 162 of file genericBNLearner.cpp.

    genericBNLearner::Database& genericBNLearner::Database::operator=(Database&& from) {
      if (this != &from) {
        delete _parser_;
        _database_ = std::move(from._database_);
        _domain_sizes_ = std::move(from._domain_sizes_);
        _nodeId2cols_ = std::move(from._nodeId2cols_);

        // create the parser
        _parser_ = new DBRowGeneratorParser<>(_database_.handler(), DBRowGeneratorSet<>());
      }

      return *this;
    }
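
The copy and move members never copy the parser pointer: each one builds a fresh parser on a handler of the object's own table. The standalone sketch below illustrates why, under the assumption that the parser must always refer to the table of the object that owns it. `Table` and `Cursor` are hypothetical stand-ins for `Database` and `DBRowGeneratorParser`, not aGrUM types.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical reader bound to one specific table.
struct Cursor {
  const std::vector<int>* table;
  explicit Cursor(const std::vector<int>& t) : table(&t) {}
};

class Table {
  public:
  explicit Table(std::vector<int> rows) :
      rows_(std::move(rows)), cursor_(new Cursor(rows_)) {}
  // copy and move take the data but always create a cursor on *this* table
  Table(const Table& from) : rows_(from.rows_), cursor_(new Cursor(rows_)) {}
  Table(Table&& from) : rows_(std::move(from.rows_)), cursor_(new Cursor(rows_)) {}
  ~Table() { delete cursor_; }

  Table& operator=(const Table& from) {
    if (this != &from) {
      delete cursor_;
      cursor_ = nullptr;   // stay destructible if an allocation below throws
      rows_   = from.rows_;
      cursor_ = new Cursor(rows_);
    }
    return *this;
  }

  Table& operator=(Table&& from) {
    if (this != &from) {
      delete cursor_;
      cursor_ = nullptr;
      rows_   = std::move(from.rows_);
      cursor_ = new Cursor(rows_);
    }
    return *this;
  }

  const std::vector<int>& rows() const { return rows_; }
  // true when the cursor refers to this object's own rows
  bool cursorIsBoundToSelf() const { return cursor_ != nullptr && cursor_->table == &rows_; }

  private:
  std::vector<int> rows_;              // declared first, so initialized first
  Cursor*          cursor_{nullptr};
};
```

Copying the raw pointer instead would leave two objects sharing one cursor, which would then be deleted twice and would point at the wrong table. Resetting the pointer to `nullptr` right after `delete` is an addition made in this sketch only; the listings above do not do it.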

◆ parser()

INLINE DBRowGeneratorParser & gum::learning::genericBNLearner::Database::parser ( )

returns the parser for the database

Definition at line 40 of file genericBNLearner_inl.h.

References Database().

    INLINE DBRowGeneratorParser<>& genericBNLearner::Database::parser() { return *_parser_; }

◆ setDatabaseWeight()

INLINE void gum::learning::genericBNLearner::Database::setDatabaseWeight ( const double  new_weight)

assign a weight to all the rows of the database so that the sum of their weights is equal to new_weight

assign new weight to the rows of the learning database

Definition at line 53 of file genericBNLearner_inl.h.

References Database().

    INLINE void genericBNLearner::Database::setDatabaseWeight(const double new_weight) {
      if (_database_.nbRows() == std::size_t(0)) return;
      const double weight = new_weight / double(_database_.nbRows());
      _database_.setAllRowsWeight(weight);
    }
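
setDatabaseWeight() gives every row the same weight, chosen so that the weights sum to new_weight, and does nothing on an empty database. The arithmetic can be checked with the small standalone sketch below. The functions `setUniformTotalWeight` and `totalWeight`, and the use of a plain `std::vector<double>` for the row weights, are assumptions made for illustration and are not aGrUM API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the row weights of a database table:
// give each row the weight new_weight / number_of_rows.
void setUniformTotalWeight(std::vector<double>& row_weights, double new_weight) {
  if (row_weights.empty()) return;   // empty database: nothing to assign
  const double w = new_weight / static_cast<double>(row_weights.size());
  for (auto& rw : row_weights)
    rw = w;
}

// Sum of all row weights, i.e. the weight of the whole database.
double totalWeight(const std::vector<double>& row_weights) {
  double sum = 0.0;
  for (const double w : row_weights)
    sum += w;
  return sum;
}
```

For example, four rows and a requested total of 2.0 give each row a weight of 0.5.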

◆ setWeight()

INLINE void gum::learning::genericBNLearner::Database::setWeight ( const std::size_t  i,
const double  weight 
)

sets the weight of the ith record

Exceptions
OutOfBounds  if i is outside the set of indices of the records or if the weight is negative

Definition at line 110 of file genericBNLearner_inl.h.

References Database().

    INLINE void genericBNLearner::Database::setWeight(const std::size_t i, const double weight) {
      _database_.setWeight(i, weight);
    }

◆ size()

INLINE std::size_t gum::learning::genericBNLearner::Database::size ( ) const

returns the number of records in the database

Definition at line 106 of file genericBNLearner_inl.h.

References Database().

    INLINE std::size_t genericBNLearner::Database::size() const { return _database_.size(); }

◆ weight() [1/2]

INLINE double gum::learning::genericBNLearner::Database::weight ( const std::size_t  i) const

returns the weight of the ith record

Exceptions
OutOfBounds  if i is outside the set of indices of the records

Definition at line 116 of file genericBNLearner_inl.h.

References Database().

    INLINE double genericBNLearner::Database::weight(const std::size_t i) const {
      return _database_.weight(i);
    }

◆ weight() [2/2]

INLINE double gum::learning::genericBNLearner::Database::weight ( ) const

returns the weight of the whole database

Definition at line 122 of file genericBNLearner_inl.h.

References Database().

    INLINE double genericBNLearner::Database::weight() const { return _database_.weight(); }

Member Data Documentation

◆ _database_

DatabaseTable gum::learning::genericBNLearner::Database::_database_
protected

the database itself

Definition at line 273 of file genericBNLearner.h.

◆ _domain_sizes_

std::vector< std::size_t > gum::learning::genericBNLearner::Database::_domain_sizes_
protected

the domain sizes of the variables (useful to speed-up computations)

Definition at line 279 of file genericBNLearner.h.

◆ _max_threads_number_

Size gum::learning::genericBNLearner::Database::_max_threads_number_ {1}
protected

the max number of threads authorized

Definition at line 288 of file genericBNLearner.h.

◆ _min_nb_rows_per_thread_

Size gum::learning::genericBNLearner::Database::_min_nb_rows_per_thread_ {100}
protected

the minimal number of rows to parse (on average) by thread

Definition at line 292 of file genericBNLearner.h.

◆ _nodeId2cols_

Bijection< NodeId, std::size_t > gum::learning::genericBNLearner::Database::_nodeId2cols_
protected

a bijection assigning to each NodeId its column in the database

Definition at line 282 of file genericBNLearner.h.

◆ _parser_

DBRowGeneratorParser* gum::learning::genericBNLearner::Database::_parser_ {nullptr}
protected

the parser used for reading the database

Definition at line 276 of file genericBNLearner.h.


The documentation for this class was generated from the following files: