aGrUM  0.20.2
a C++ library for (probabilistic) graphical models
gum::learning::genericBNLearner::Database Class Reference

a helper to easily read databases

#include <genericBNLearner.h>


Public Member Functions

template<typename GUM_SCALAR >
 Database (const std::string &filename, const BayesNet< GUM_SCALAR > &bn, const std::vector< std::string > &missing_symbols)
 
Constructors / Destructors
 Database (const std::string &file, const std::vector< std::string > &missing_symbols)
 constructor reading the database from a CSV file
 
 Database (const DatabaseTable<> &db)
 constructor filling the Database from an already initialized database table
 
 Database (const std::string &filename, Database &score_database, const std::vector< std::string > &missing_symbols)
 constructor for the aprioris
 
template<typename GUM_SCALAR >
 Database (const std::string &filename, const gum::BayesNet< GUM_SCALAR > &bn, const std::vector< std::string > &missing_symbols)
 constructor with a BN providing the variables of interest
 
 Database (const Database &from)
 copy constructor
 
 Database (Database &&from)
 move constructor
 
 ~Database ()
 destructor
 
Operators
Database & operator= (const Database &from)
 copy operator
 
Database & operator= (Database &&from)
 move operator
 
Accessors / Modifiers
DBRowGeneratorParser & parser ()
 returns the parser for the database
 
const std::vector< std::size_t > & domainSizes () const
 returns the domain sizes of the variables
 
const std::vector< std::string > & names () const
 returns the names of the variables in the database
 
NodeId idFromName (const std::string &var_name) const
 returns the node id corresponding to a variable name
 
const std::string & nameFromId (NodeId id) const
 returns the variable name corresponding to a given node id
 
const DatabaseTable & databaseTable () const
 returns the internal database table
 
void setDatabaseWeight (const double new_weight)
 assign a weight to all the rows of the database so that the sum of their weights is equal to new_weight
 
const Bijection< NodeId, std::size_t > & nodeId2Columns () const
 returns the mapping between node ids and their columns in the database
 
const std::vector< std::string > & missingSymbols () const
 returns the set of missing symbols taken into account
 
std::size_t nbRows () const
 returns the number of records in the database
 
std::size_t size () const
 returns the number of records in the database
 
void setWeight (const std::size_t i, const double weight)
 sets the weight of the ith record
 
double weight (const std::size_t i) const
 returns the weight of the ith record
 
double weight () const
 returns the weight of the whole database
 

Protected Attributes

DatabaseTable database__
 the database itself
 
DBRowGeneratorParser * parser__ {nullptr}
 the parser used for reading the database
 
std::vector< std::size_t > domain_sizes__
 the domain sizes of the variables (useful to speed up computations)
 
Bijection< NodeId, std::size_t > nodeId2cols__
 a bijection mapping each NodeId to its column in the database
 
Size max_threads_number__ {1}
 the maximum number of threads allowed
 
Size min_nb_rows_per_thread__ {100}
 the minimal number of rows to parse (on average) per thread
 

Detailed Description

a helper to easily read databases

Definition at line 145 of file genericBNLearner.h.

Constructor & Destructor Documentation

◆ Database() [1/7]

gum::learning::genericBNLearner::Database::Database ( const std::string &  file,
const std::vector< std::string > &  missing_symbols 
)
explicit

constructor reading the database from a CSV file

Parameters
file             the name of the CSV file containing the data
missing_symbols  the set of symbols in the CSV file that correspond to missing data

Definition at line 69 of file genericBNLearner.cpp.

71  :
72  Database(genericBNLearner::readFile__(filename, missing_symbols)) {}

◆ Database() [2/7]

gum::learning::genericBNLearner::Database::Database ( const DatabaseTable<> &  db)
explicit

constructor filling the Database from an already initialized database table

Parameters
db  an already initialized database table that is used to fill the Database

Definition at line 52 of file genericBNLearner.cpp.

52  :
53  database__(db) {
54  // get the variables names
55  const auto& var_names = database__.variableNames();
56  const std::size_t nb_vars = var_names.size();
57  for (auto dom: database__.domainSizes())
58  domain_sizes__.push_back(dom);
59  for (std::size_t i = 0; i < nb_vars; ++i) {
60  nodeId2cols__.insert(NodeId(i), i);
61  }
62 
63  // create the parser
64  parser__
65  = new DBRowGeneratorParser<>(database__.handler(), DBRowGeneratorSet<>());
66  }

◆ Database() [3/7]

gum::learning::genericBNLearner::Database::Database ( const std::string &  filename,
Database &  score_database,
const std::vector< std::string > &  missing_symbols 
)

constructor for the aprioris

We must ensure that the variables of the Database are identical to those of the score database (otherwise the counts used by the scores might be erroneous). However, the variables may be ordered differently in the two databases: variables with the same name in both databases are assumed to be the same.

Parameters
filename         the name of the CSV file containing the data
score_database   the main database used for the learning
missing_symbols  the set of symbols in the CSV file that correspond to missing data

Definition at line 75 of file genericBNLearner.cpp.

78  {
79  // assign to each column name in the CSV file its column
80  genericBNLearner::checkFileName__(CSV_filename);
81  DBInitializerFromCSV<> initializer(CSV_filename);
82  const auto& apriori_names = initializer.variableNames();
83  std::size_t apriori_nb_vars = apriori_names.size();
84  HashTable< std::string, std::size_t > apriori_names2col(apriori_nb_vars);
85  for (std::size_t i = std::size_t(0); i < apriori_nb_vars; ++i)
86  apriori_names2col.insert(apriori_names[i], i);
87 
88  // check that there are at least as many variables in the a priori
89  // database as those in the score_database
90  if (apriori_nb_vars < score_database.database__.nbVariables()) {
91  GUM_ERROR(InvalidArgument,
92  "the a apriori database has fewer variables "
93  "than the observed database");
94  }
95 
96  // get the mapping from the columns of score_database to those of
97  // the CSV file
98  const std::vector< std::string >& score_names
99  = score_database.databaseTable().variableNames();
100  const std::size_t score_nb_vars = score_names.size();
101  HashTable< std::size_t, std::size_t > mapping(score_nb_vars);
102  for (std::size_t i = std::size_t(0); i < score_nb_vars; ++i) {
103  try {
104  mapping.insert(i, apriori_names2col[score_names[i]]);
105  } catch (Exception&) {
106  GUM_ERROR(MissingVariableInDatabase,
107  "Variable "
108  << score_names[i]
109  << " of the observed database does not belong to the "
110  << "apriori database");
111  }
112  }
113 
114  // create the translators for CSV database
115  for (std::size_t i = std::size_t(0); i < score_nb_vars; ++i) {
116  const Variable& var = score_database.databaseTable().variable(i);
117  database__.insertTranslator(var, mapping[i], missing_symbols);
118  }
119 
120  // fill the database
121  initializer.fillDatabase(database__);
122 
123  // get the domain sizes of the variables
124  for (auto dom: database__.domainSizes())
125  domain_sizes__.push_back(dom);
126 
127  // compute the mapping from node ids to column indices
128  nodeId2cols__ = score_database.nodeId2Columns();
129 
130  // create the parser
131  parser__
132  = new DBRowGeneratorParser<>(database__.handler(), DBRowGeneratorSet<>());
133  }
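
The name-based matching performed by this constructor can be illustrated outside aGrUM. The following is a minimal sketch, assuming plain standard containers in place of HashTable and standard exceptions in place of InvalidArgument and MissingVariableInDatabase; mapColumnsByName is a hypothetical helper, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not an aGrUM function): for each column of the score
// database, return the index of the column with the same name in the
// a priori CSV file. Column order may differ between the two.
std::vector<std::size_t> mapColumnsByName(
    const std::vector<std::string>& score_names,
    const std::vector<std::string>& apriori_names) {
  // the a priori file must provide at least as many variables
  if (apriori_names.size() < score_names.size())
    throw std::invalid_argument(
        "the a priori database has fewer variables than the observed database");

  // index the a priori columns by name
  std::unordered_map<std::string, std::size_t> name2col;
  for (std::size_t i = 0; i < apriori_names.size(); ++i)
    name2col.emplace(apriori_names[i], i);

  // look up each score variable by name
  std::vector<std::size_t> mapping;
  mapping.reserve(score_names.size());
  for (const auto& name : score_names) {
    const auto it = name2col.find(name);
    if (it == name2col.end())
      throw std::out_of_range("Variable " + name
                              + " does not belong to the a priori database");
    mapping.push_back(it->second);
  }
  return mapping;
}
```

Extra columns in the a priori file are simply ignored, which matches the check in the listing that only requires the a priori database to have at least as many variables as the score database.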

◆ Database() [4/7]

template<typename GUM_SCALAR >
gum::learning::genericBNLearner::Database::Database ( const std::string &  filename,
const gum::BayesNet< GUM_SCALAR > &  bn,
const std::vector< std::string > &  missing_symbols 
)

constructor with a BN providing the variables of interest

Parameters
filename         the name of the CSV file containing the data
bn               a Bayesian network indicating which variables of the CSV file are used for learning
missing_symbols  the set of symbols in the CSV file that correspond to missing data

◆ Database() [5/7]

gum::learning::genericBNLearner::Database::Database ( const Database &  from)

copy constructor

Definition at line 136 of file genericBNLearner.cpp.

136  :
137  database__(from.database__), domain_sizes__(from.domain_sizes__),
138  nodeId2cols__(from.nodeId2cols__) {
139  // create the parser
140  parser__
141  = new DBRowGeneratorParser<>(database__.handler(), DBRowGeneratorSet<>());
142  }

◆ Database() [6/7]

gum::learning::genericBNLearner::Database::Database ( Database &&  from)

move constructor

Definition at line 145 of file genericBNLearner.cpp.

145  :
146  database__(std::move(from.database__)),
147  domain_sizes__(std::move(from.domain_sizes__)),
148  nodeId2cols__(std::move(from.nodeId2cols__)) {
149  // create the parser
150  parser__
151  = new DBRowGeneratorParser<>(database__.handler(), DBRowGeneratorSet<>());
152  }

◆ ~Database()

gum::learning::genericBNLearner::Database::~Database ( )

destructor

Definition at line 155 of file genericBNLearner.cpp.

155 { delete parser__; }

◆ Database() [7/7]

template<typename GUM_SCALAR >
gum::learning::genericBNLearner::Database::Database ( const std::string &  filename,
const BayesNet< GUM_SCALAR > &  bn,
const std::vector< std::string > &  missing_symbols 
)

Definition at line 31 of file genericBNLearner_tpl.h.

References Database().

Referenced by gum::learning::Miic::addConstraints(), gum::learning::genericBNLearner::addForbiddenArc(), gum::learning::genericBNLearner::addMandatoryArc(), gum::learning::genericBNLearner::addPossibleEdge(), gum::learning::BNLearnerListener::BNLearnerListener(), BNVars__(), gum::learning::genericBNLearner::clearDatabaseRanges(), gum::learning::DAG2BNLearner< ALLOC >::clone(), gum::learning::DAG2BNLearner< ALLOC >::createBN(), gum::learning::DAG2BNLearner< ALLOC >::DAG2BNLearner(), Database(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::database(), gum::learning::genericBNLearner::database(), gum::learning::genericBNLearner::databaseRanges(), databaseTable(), gum::learning::genericBNLearner::databaseWeight(), gum::learning::genericBNLearner::domainSize(), domainSizes(), gum::learning::genericBNLearner::domainSizes(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::drawSamples(), gum::learning::genericBNLearner::eraseForbiddenArc(), gum::learning::genericBNLearner::eraseMandatoryArc(), gum::learning::genericBNLearner::erasePossibleEdge(), gum::learning::Miic::existsDirectedPath__(), gum::learning::Miic::findBestContributor_(), gum::learning::genericBNLearner::genericBNLearner(), gum::learning::genericBNLearner::getAprioriType__(), gum::learning::Miic::getUnshieldedTriples_(), gum::learning::Miic::getUnshieldedTriplesMIIC_(), gum::learning::GreedyHillClimbing::GreedyHillClimbing(), gum::learning::genericBNLearner::hasMissingValues(), idFromName(), gum::learning::genericBNLearner::idFromName(), gum::learning::Miic::initiation_(), gum::learning::Miic::iteration_(), gum::learning::Miic::latentVariables(), gum::learning::genericBNLearner::latentVariables(), gum::learning::K2::learnBN(), gum::learning::GreedyHillClimbing::learnBN(), gum::learning::LocalSearchWithTabuList::learnBN(), gum::learning::Miic::learnBN(), gum::learning::Miic::learnMixedStructure(), gum::learning::K2::learnStructure(), gum::learning::GreedyHillClimbing::learnStructure(), 
gum::learning::LocalSearchWithTabuList::learnStructure(), gum::learning::Miic::learnStructure(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::log2likelihood(), gum::learning::Miic::Miic(), missingSymbols(), nameFromId(), gum::learning::genericBNLearner::nameFromId(), names(), gum::learning::genericBNLearner::names(), gum::learning::genericBNLearner::nbCols(), nbRows(), gum::learning::genericBNLearner::nbRows(), nodeId2Columns(), gum::learning::GreaterPairOn2nd::operator()(), gum::learning::GreaterAbsPairOn2nd::operator()(), gum::learning::GreaterTupleOnLast::operator()(), gum::learning::StructuralConstraintMandatoryArcs::operator=(), gum::learning::StructuralConstraintPossibleEdges::operator=(), gum::learning::BNLearnerListener::operator=(), gum::learning::StructuralConstraintForbiddenArcs::operator=(), gum::learning::StructuralConstraintUndiGraph::operator=(), gum::learning::StructuralConstraintDAG::operator=(), gum::learning::StructuralConstraintIndegree::operator=(), gum::learning::StructuralConstraintDiGraph::operator=(), gum::learning::GreedyHillClimbing::operator=(), gum::learning::DAG2BNLearner< ALLOC >::operator=(), gum::learning::StructuralConstraintTabuList::operator=(), gum::learning::StructuralConstraintSliceOrder::operator=(), gum::learning::Miic::operator=(), gum::learning::Miic::orientation_3off2_(), gum::learning::Miic::orientation_latents_(), gum::learning::Miic::orientation_miic_(), parser(), gum::learning::Miic::propagatesHead_(), gum::learning::genericBNLearner::recordWeight(), gum::learning::Miic::set3off2Behaviour(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::setAntiTopologicalVarOrder(), gum::learning::genericBNLearner::setAprioriWeight__(), setDatabaseWeight(), gum::learning::genericBNLearner::setDatabaseWeight(), gum::learning::genericBNLearner::setForbiddenArcs(), gum::learning::genericBNLearner::setInitialDAG(), gum::learning::genericBNLearner::setMandatoryArcs(), gum::learning::genericBNLearner::setMaxIndegree(), 
gum::learning::Miic::setMiicBehaviour(), gum::learning::genericBNLearner::setPossibleEdges(), gum::learning::genericBNLearner::setPossibleSkeleton(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::setRandomVarOrder(), gum::learning::genericBNLearner::setRecordWeight(), gum::learning::genericBNLearner::setSliceOrder(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::setTopologicalVarOrder(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::setVarOrder(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::setVarOrderFromCSV(), setWeight(), size(), gum::learning::StructuralConstraintDAG::StructuralConstraintDAG(), gum::learning::StructuralConstraintDiGraph::StructuralConstraintDiGraph(), gum::learning::StructuralConstraintForbiddenArcs::StructuralConstraintForbiddenArcs(), gum::learning::StructuralConstraintIndegree::StructuralConstraintIndegree(), gum::learning::StructuralConstraintMandatoryArcs::StructuralConstraintMandatoryArcs(), gum::learning::StructuralConstraintPossibleEdges::StructuralConstraintPossibleEdges(), gum::learning::StructuralConstraintSliceOrder::StructuralConstraintSliceOrder(), gum::learning::StructuralConstraintTabuList::StructuralConstraintTabuList(), gum::learning::StructuralConstraintUndiGraph::StructuralConstraintUndiGraph(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::toCSV(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::toDatabaseTable(), gum::learning::GraphChange::toString(), gum::learning::ArcAddition::toString(), gum::learning::ArcDeletion::toString(), gum::learning::ArcReversal::toString(), gum::learning::EdgeAddition::toString(), gum::learning::EdgeDeletion::toString(), gum::learning::Miic::updateProbaTriples_(), gum::learning::genericBNLearner::use3off2(), gum::learning::genericBNLearner::useAprioriBDeu(), gum::learning::genericBNLearner::useAprioriDirichlet(), gum::learning::genericBNLearner::useAprioriSmoothing(), gum::learning::genericBNLearner::useDatabaseRanges(), gum::learning::genericBNLearner::useEM(), 
gum::learning::genericBNLearner::useGreedyHillClimbing(), gum::learning::genericBNLearner::useK2(), gum::learning::genericBNLearner::useLocalSearchWithTabuList(), gum::learning::genericBNLearner::useMDL(), gum::learning::genericBNLearner::useMIIC(), gum::learning::genericBNLearner::useNML(), gum::learning::genericBNLearner::useNoApriori(), gum::learning::genericBNLearner::useNoCorr(), gum::learning::genericBNLearner::useScoreAIC(), gum::learning::genericBNLearner::useScoreBD(), gum::learning::genericBNLearner::useScoreBDeu(), gum::learning::genericBNLearner::useScoreBIC(), gum::learning::genericBNLearner::useScoreK2(), gum::learning::genericBNLearner::useScoreLog2Likelihood(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::varOrder(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::varOrderFromCSV__(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::varOrderNames(), weight(), gum::learning::BNLearnerListener::whenProgress(), gum::learning::BNLearnerListener::whenStop(), gum::learning::BNDatabaseGenerator< GUM_SCALAR >::~BNDatabaseGenerator(), gum::learning::BNLearnerListener::~BNLearnerListener(), gum::learning::DAG2BNLearner< ALLOC >::~DAG2BNLearner(), gum::learning::GreedyHillClimbing::~GreedyHillClimbing(), gum::learning::Miic::~Miic(), gum::learning::StructuralConstraintDAG::~StructuralConstraintDAG(), gum::learning::StructuralConstraintDiGraph::~StructuralConstraintDiGraph(), gum::learning::StructuralConstraintForbiddenArcs::~StructuralConstraintForbiddenArcs(), gum::learning::StructuralConstraintIndegree::~StructuralConstraintIndegree(), gum::learning::StructuralConstraintMandatoryArcs::~StructuralConstraintMandatoryArcs(), gum::learning::StructuralConstraintPossibleEdges::~StructuralConstraintPossibleEdges(), gum::learning::StructuralConstraintSliceOrder::~StructuralConstraintSliceOrder(), gum::learning::StructuralConstraintTabuList::~StructuralConstraintTabuList(), and gum::learning::StructuralConstraintUndiGraph::~StructuralConstraintUndiGraph().

34  {
35  // assign to each column name in the database its position
36  genericBNLearner::checkFileName__(filename);
37  DBInitializerFromCSV<> initializer(filename);
38  const auto& xvar_names = initializer.variableNames();
39  std::size_t nb_vars = xvar_names.size();
40  HashTable< std::string, std::size_t > var_names(nb_vars);
41  for (std::size_t i = std::size_t(0); i < nb_vars; ++i)
42  var_names.insert(xvar_names[i], i);
43 
44  // we use the bn to insert the translators into the database table
45  std::vector< NodeId > nodes;
46  nodes.reserve(bn.dag().sizeNodes());
47  for (const auto node: bn.dag())
48  nodes.push_back(node);
49  std::sort(nodes.begin(), nodes.end());
50  std::size_t i = std::size_t(0);
51  for (auto node: nodes) {
52  const Variable& var = bn.variable(node);
53  try {
54  database__.insertTranslator(var, var_names[var.name()], missing_symbols);
55  } catch (NotFound&) {
56  GUM_ERROR(MissingVariableInDatabase,
57  "Variable '" << var.name() << "' is missing");
58  }
59  nodeId2cols__.insert(NodeId(node), i++);
60  }
61 
62  // fill the database
63  initializer.fillDatabase(database__);
64 
65  // get the domain sizes of the variables
66  for (auto dom: database__.domainSizes())
67  domain_sizes__.push_back(dom);
68 
69  // create the parser
70  parser__
71  = new DBRowGeneratorParser<>(database__.handler(), DBRowGeneratorSet<>());
72  }
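
In the listing above, the node ids of the BN are sorted before the translators are inserted, so that columns follow increasing node id. A minimal standalone sketch of that assignment, assuming plain std::size_t ids in place of gum::NodeId and std::map in place of gum::Bijection; assignColumns is a hypothetical name used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper (not an aGrUM function): assign database columns
// 0, 1, 2, ... to node ids taken in increasing order.
std::map<std::size_t, std::size_t> assignColumns(std::vector<std::size_t> nodes) {
  // graph iteration order is unspecified, so sort to get a stable layout
  std::sort(nodes.begin(), nodes.end());

  std::map<std::size_t, std::size_t> node2col;
  std::size_t col = 0;
  for (const auto node : nodes)
    node2col.emplace(node, col++);
  return node2col;
}
```

Sorting first makes the column layout independent of the order in which the graph happens to enumerate its nodes.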

Member Function Documentation

◆ BNVars__()

template<typename GUM_SCALAR >
BayesNet< GUM_SCALAR > gum::learning::genericBNLearner::Database::BNVars__ ( ) const
private

Definition at line 76 of file genericBNLearner_tpl.h.

References Database().

76  {
77  BayesNet< GUM_SCALAR > bn;
78  const std::size_t nb_vars = database__.nbVariables();
79  for (std::size_t i = 0; i < nb_vars; ++i) {
80  const DiscreteVariable& var
81  = dynamic_cast< const DiscreteVariable& >(database__.variable(i));
82  bn.add(var);
83  }
84  return bn;
85  }

◆ databaseTable()

INLINE const DatabaseTable & gum::learning::genericBNLearner::Database::databaseTable ( ) const

returns the internal database table

Definition at line 93 of file genericBNLearner_inl.h.

References Database().

93  {
94  return database__;
95  }

◆ domainSizes()

INLINE const std::vector< std::size_t > & gum::learning::genericBNLearner::Database::domainSizes ( ) const

returns the domain sizes of the variables

Definition at line 46 of file genericBNLearner_inl.h.

References Database().

46  {
47  return domain_sizes__;
48  }

◆ idFromName()

INLINE NodeId gum::learning::genericBNLearner::Database::idFromName ( const std::string &  var_name) const

returns the node id corresponding to a variable name

Definition at line 66 of file genericBNLearner_inl.h.

References Database().

66  {
67  try {
68  const auto cols = database__.columnsFromVariableName(var_name);
69  return nodeId2cols__.first(cols[0]);
70  } catch (...) {
71  GUM_ERROR(MissingVariableInDatabase,
72  "Variable " << var_name
73  << " could not be found in the database");
74  }
75  }

◆ missingSymbols()

INLINE const std::vector< std::string > & gum::learning::genericBNLearner::Database::missingSymbols ( ) const

returns the set of missing symbols taken into account

Definition at line 100 of file genericBNLearner_inl.h.

References Database().

100  {
101  return database__.missingSymbols();
102  }

◆ nameFromId()

INLINE const std::string & gum::learning::genericBNLearner::Database::nameFromId ( NodeId  id) const

returns the variable name corresponding to a given node id

Definition at line 80 of file genericBNLearner_inl.h.

References Database().

80  {
81  try {
82  return database__.variableName(nodeId2cols__.second(id));
83  } catch (...) {
84  GUM_ERROR(MissingVariableInDatabase,
85  "Variable of Id " << id
86  << " could not be found in the database");
87  }
88  }
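
idFromName() and nameFromId() are two directions of the same lookup: name to column to node id, and node id to column to name. The sketch below illustrates that round trip under simplifying assumptions: two std::unordered_map objects stand in for gum::Bijection, std::out_of_range stands in for MissingVariableInDatabase, and TinyDatabase is a hypothetical class, not an aGrUM type.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical miniature of the name <-> node id lookups.
class TinyDatabase {
 public:
  // ids[col] is the node id attached to column col
  TinyDatabase(std::vector<std::string> names,
               const std::vector<std::size_t>& ids)
      : names_(std::move(names)) {
    assert(names_.size() == ids.size());
    for (std::size_t col = 0; col < ids.size(); ++col) {
      id2col_.emplace(ids[col], col);
      col2id_.emplace(col, ids[col]);
    }
  }

  // name -> column -> node id
  std::size_t idFromName(const std::string& name) const {
    for (std::size_t col = 0; col < names_.size(); ++col)
      if (names_[col] == name) return col2id_.at(col);
    throw std::out_of_range("Variable " + name
                            + " could not be found in the database");
  }

  // node id -> column -> name
  const std::string& nameFromId(std::size_t id) const {
    const auto it = id2col_.find(id);
    if (it == id2col_.end())
      throw std::out_of_range("Variable of Id " + std::to_string(id)
                              + " could not be found in the database");
    return names_[it->second];
  }

 private:
  std::vector<std::string> names_;
  std::unordered_map<std::size_t, std::size_t> id2col_;
  std::unordered_map<std::size_t, std::size_t> col2id_;
};
```

As in the listings, both directions report an unknown key by throwing rather than returning a sentinel value.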

◆ names()

INLINE const std::vector< std::string > & gum::learning::genericBNLearner::Database::names ( ) const

returns the names of the variables in the database

Definition at line 52 of file genericBNLearner_inl.h.

References Database().

52  {
53  return database__.variableNames();
54  }

◆ nbRows()

INLINE std::size_t gum::learning::genericBNLearner::Database::nbRows ( ) const

returns the number of records in the database

Definition at line 113 of file genericBNLearner_inl.h.

References Database().

113  {
114  return database__.nbRows();
115  }

◆ nodeId2Columns()

INLINE const Bijection< NodeId, std::size_t > & gum::learning::genericBNLearner::Database::nodeId2Columns ( ) const

returns the mapping between node ids and their columns in the database

Definition at line 107 of file genericBNLearner_inl.h.

References Database().

107  {
108  return nodeId2cols__;
109  }

◆ operator=() [1/2]

genericBNLearner::Database & gum::learning::genericBNLearner::Database::operator= ( const Database &  from)

copy operator

Definition at line 158 of file genericBNLearner.cpp.

158  {
159  if (this != &from) {
160  delete parser__;
161  database__ = from.database__;
162  domain_sizes__ = from.domain_sizes__;
163  nodeId2cols__ = from.nodeId2cols__;
164 
165  // create the parser
166  parser__ = new DBRowGeneratorParser<>(database__.handler(),
167  DBRowGeneratorSet<>());
168  }
169 
170  return *this;
171  }
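
The copy and move members shown in the listings all follow the same pattern: the data members are copied or moved, and the parser is rebuilt so that it refers to the new object's own table rather than to the source's. A minimal sketch of that ownership pattern, assuming a hypothetical Parser that only stores a pointer to the table it reads; neither Parser nor Owner is an aGrUM type.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parser that reads a table it does not own.
struct Parser {
  explicit Parser(const std::vector<int>& table) : table_(&table) {}
  const std::vector<int>* table_;
};

class Owner {
 public:
  explicit Owner(std::vector<int> rows)
      : rows_(std::move(rows)), parser_(new Parser(rows_)) {}

  // copy/move the data, then build a fresh parser bound to *this* table
  Owner(const Owner& from) : rows_(from.rows_), parser_(new Parser(rows_)) {}
  Owner(Owner&& from)
      : rows_(std::move(from.rows_)), parser_(new Parser(rows_)) {}

  ~Owner() { delete parser_; }

  Owner& operator=(const Owner& from) {
    if (this != &from) {
      delete parser_;
      parser_ = nullptr;  // stay valid if the copy below throws
      rows_   = from.rows_;
      parser_ = new Parser(rows_);
    }
    return *this;
  }

  Owner& operator=(Owner&& from) {
    if (this != &from) {
      delete parser_;
      parser_ = nullptr;
      rows_   = std::move(from.rows_);
      parser_ = new Parser(rows_);
    }
    return *this;
  }

  const Parser&           parser() const { return *parser_; }
  const std::vector<int>& rows() const { return rows_; }

 private:
  std::vector<int> rows_;             // declared first: initialized first
  Parser*          parser_{nullptr};  // always points into rows_
};
```

Resetting the pointer to nullptr right after delete is an addition of this sketch; the listings above delete and reassign without that intermediate step.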

◆ operator=() [2/2]

genericBNLearner::Database & gum::learning::genericBNLearner::Database::operator= ( Database &&  from)

move operator

Definition at line 174 of file genericBNLearner.cpp.

174  {
175  if (this != &from) {
176  delete parser__;
177  database__ = std::move(from.database__);
178  domain_sizes__ = std::move(from.domain_sizes__);
179  nodeId2cols__ = std::move(from.nodeId2cols__);
180 
181  // create the parser
182  parser__ = new DBRowGeneratorParser<>(database__.handler(),
183  DBRowGeneratorSet<>());
184  }
185 
186  return *this;
187  }

◆ parser()

INLINE DBRowGeneratorParser & gum::learning::genericBNLearner::Database::parser ( )

returns the parser for the database

Definition at line 40 of file genericBNLearner_inl.h.

References Database().

40  {
41  return *parser__;
42  }

◆ setDatabaseWeight()

INLINE void gum::learning::genericBNLearner::Database::setDatabaseWeight ( const double  new_weight)

assign a weight to all the rows of the database so that the sum of their weights is equal to new_weight

assign new weight to the rows of the learning database

Definition at line 58 of file genericBNLearner_inl.h.

References Database().

58  {
59  if (database__.nbRows() == std::size_t(0)) return;
60  const double weight = new_weight / double(database__.nbRows());
61  database__.setAllRowsWeight(weight);
62  }
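
The effect of setDatabaseWeight() is that every row receives the same weight, new_weight divided by the number of rows, so that the weights sum to new_weight. A small standalone sketch of that arithmetic, assuming a plain vector of row weights in place of the DatabaseTable; setUniformTotalWeight is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an aGrUM function): give every row the same
// weight so that the weights sum to new_weight.
void setUniformTotalWeight(std::vector<double>& row_weights,
                           const double new_weight) {
  // an empty database is left untouched (also avoids dividing by zero)
  if (row_weights.empty()) return;

  const double weight = new_weight / double(row_weights.size());
  for (auto& w : row_weights)
    w = weight;
}
```

For example, four rows and new_weight = 2.0 give each row a weight of 0.5.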

◆ setWeight()

INLINE void gum::learning::genericBNLearner::Database::setWeight ( const std::size_t  i,
const double  weight 
)

sets the weight of the ith record

Exceptions
OutOfBounds  if i is outside the set of indices of the records or if the weight is negative

Definition at line 125 of file genericBNLearner_inl.h.

References Database().

126  {
127  database__.setWeight(i, weight);
128  }
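
The documented contract of setWeight() (reject an out-of-range index or a negative weight) can be sketched independently of the library. The example below assumes a plain vector of row weights and uses std::out_of_range as a stand-in for gum's OutOfBounds; setRowWeight and totalWeight are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: set the weight of the ith record, rejecting invalid
// indices and negative weights as described in the Exceptions section.
void setRowWeight(std::vector<double>& row_weights,
                  const std::size_t    i,
                  const double         weight) {
  if (i >= row_weights.size())
    throw std::out_of_range("row index is out of bounds");
  if (weight < 0.0)
    throw std::out_of_range("a row weight cannot be negative");
  row_weights[i] = weight;
}

// Hypothetical helper: the weight of the whole database is the sum of the
// weights of its records.
double totalWeight(const std::vector<double>& row_weights) {
  double sum = 0.0;
  for (const double w : row_weights)
    sum += w;
  return sum;
}
```

Validating before writing means a failed call leaves the existing weights unchanged.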

◆ size()

INLINE std::size_t gum::learning::genericBNLearner::Database::size ( ) const

returns the number of records in the database

Definition at line 119 of file genericBNLearner_inl.h.

References Database().

119  {
120  return database__.size();
121  }

◆ weight() [1/2]

INLINE double gum::learning::genericBNLearner::Database::weight ( const std::size_t  i) const

returns the weight of the ith record

Exceptions
OutOfBounds  if i is outside the set of indices of the records

Definition at line 132 of file genericBNLearner_inl.h.

References Database().

132  {
133  return database__.weight(i);
134  }

◆ weight() [2/2]

INLINE double gum::learning::genericBNLearner::Database::weight ( ) const

returns the weight of the whole database

Definition at line 138 of file genericBNLearner_inl.h.

References Database().

138  {
139  return database__.weight();
140  }

Member Data Documentation

◆ database__

DatabaseTable gum::learning::genericBNLearner::Database::database__
protected

the database itself

Definition at line 273 of file genericBNLearner.h.

◆ domain_sizes__

std::vector< std::size_t > gum::learning::genericBNLearner::Database::domain_sizes__
protected

the domain sizes of the variables (useful to speed up computations)

Definition at line 279 of file genericBNLearner.h.

◆ max_threads_number__

Size gum::learning::genericBNLearner::Database::max_threads_number__ {1}
protected

the maximum number of threads allowed

Definition at line 288 of file genericBNLearner.h.

◆ min_nb_rows_per_thread__

Size gum::learning::genericBNLearner::Database::min_nb_rows_per_thread__ {100}
protected

the minimal number of rows to parse (on average) per thread

Definition at line 292 of file genericBNLearner.h.

◆ nodeId2cols__

Bijection< NodeId, std::size_t > gum::learning::genericBNLearner::Database::nodeId2cols__
protected

a bijection mapping each NodeId to its column in the database

Definition at line 282 of file genericBNLearner.h.

◆ parser__

DBRowGeneratorParser* gum::learning::genericBNLearner::Database::parser__ {nullptr}
protected

the parser used for reading the database

Definition at line 276 of file genericBNLearner.h.


The documentation for this class was generated from the following files: