aGrUM  0.13.2
gum::learning::DBInitializerFromCSV< ALLOC > Class Template Reference

The class for initializing DatabaseTable and RawDatabaseTable instances from CSV files.

#include <agrum/learning/database/DBInitializerFromCSV.h>


Public Member Functions

Constructors / Destructors
 DBInitializerFromCSV (const std::string filename, bool fileContainsNames=true, const std::string delimiter=",", const char commentmarker= '#', const char quoteMarker= '"', const allocator_type& alloc = allocator_type())
 default constructor
 
 DBInitializerFromCSV (const DBInitializerFromCSV< ALLOC > &from)
 copy constructor
 
 DBInitializerFromCSV (const DBInitializerFromCSV< ALLOC > &from, const allocator_type &alloc)
 copy constructor with a given allocator
 
 DBInitializerFromCSV (DBInitializerFromCSV< ALLOC > &&from)
 move constructor
 
 DBInitializerFromCSV (DBInitializerFromCSV< ALLOC > &&from, const allocator_type &alloc)
 move constructor with a given allocator
 
virtual DBInitializerFromCSV< ALLOC > * clone () const
 virtual copy constructor
 
virtual DBInitializerFromCSV< ALLOC > * clone (const allocator_type &alloc) const
 virtual copy constructor with a given allocator
 
virtual ~DBInitializerFromCSV ()
 destructor
 
Operators
DBInitializerFromCSV< ALLOC > & operator= (const DBInitializerFromCSV< ALLOC > &from)
 copy operator
 
DBInitializerFromCSV< ALLOC > & operator= (DBInitializerFromCSV< ALLOC > &&from)
 move operator
 
Accessors / Modifiers
const std::vector< std::string, ALLOC< std::string > > & variableNames ()
 returns the names of the variables in the input dataset
 
template<template< template< typename > class > class DATABASE>
void fillDatabase (DATABASE< ALLOC > &database, const bool retry_insertion=false)
 fills the rows of the database table
 
std::size_t throwingColumn () const
 This method indicates which column filling raised an exception, if any, during the execution of fillDatabase.
 
allocator_type getAllocator () const
 returns the allocator used
 

Public Types

using allocator_type = ALLOC< std::string >
 type for the allocators passed in arguments of methods
 
enum  InputType : char { InputType::STRING, InputType::DBCELL }
 the enumeration indicating the type of the data the IDBInitializer expects as input data
 

Protected Member Functions

virtual std::vector< std::string, ALLOC< std::string > > _variableNames () final
 returns the names of the variables
 
virtual const std::vector< std::string, ALLOC< std::string > > & _currentStringRow () final
 returns the content of the current row using strings
 
virtual bool _nextRow () final
 indicates whether there is a next row to read (and, if so, points to it)
 
virtual const DBRow< DBCell, ALLOC > & _currentDBCellRow ()
 asks the child class for the content of the current row using dbcells
 

Detailed Description

template<template< typename > class ALLOC = std::allocator>
class gum::learning::DBInitializerFromCSV< ALLOC >

The class for initializing DatabaseTable and RawDatabaseTable instances from CSV files.

In aGrUM, the usual way to create DatabaseTable instances used by learning algorithms is to use the 4-step process below:

  1. Create an IDBInitializer instance (either a DBInitializerFromCSV or a DBInitializerFromSQL). This makes it possible to get the variables corresponding to the columns of the DatabaseTable.
  2. Knowing these variables, create a DBTranslatorSet for encoding the lines of the CSV file or those of the SQL result into the appropriate values for the learning algorithms.
  3. Create the DatabaseTable, passing it the DBTranslatorSet created in the preceding step. Use the IDBInitializer to provide the variables' names to the DatabaseTable.
  4. Use the IDBInitializer to add the lines of the CSV file or those of the SQL result into the DatabaseTable.
The following code shows the details of this process:
// 1/ use the initializer to parse all the columns/rows of a CSV file
gum::learning::DBInitializerFromCSV<> initializer ( "asia.csv" );
const auto& var_names = initializer.variableNames ();
const std::size_t nb_vars = var_names.size ();
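// we create as many translators as there are variables
// (assumed here: one labelized-variable translator reused for every column)
gum::learning::DBTranslator4LabelizedVariable<> translator;
gum::learning::DBTranslatorSet<> translator_set;
for ( std::size_t i = 0; i < nb_vars; ++i )
  translator_set.insertTranslator ( translator, i );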
// create a DatabaseTable with these translators. For the moment, the
// DatabaseTable will be empty, i.e., it will contain no row
gum::learning::DatabaseTable<> database ( translator_set );
database.setVariableNames( initializer.variableNames () );
// use the DBInitializerFromCSV to fill the rows:
initializer.fillDatabase ( database );
// now, the database contains all the content of the CSV file
// 2/ use an IDBInitializer to initialize a DatabaseTable, but ignore
// some columns.
gum::learning::DBInitializerFromCSV<> initializer2 ( "asia.csv" );
gum::learning::DatabaseTable<> database2; // empty database
// indicate which columns of the CSV file should be read
database2.insertTranslator ( translator, 1 );
database2.insertTranslator ( translator, 3 );
database2.insertTranslator ( translator, 4 );
// sets the names of the columns correctly
database2.setVariableNames( initializer2.variableNames () );
// fill the rows:
initializer2.fillDatabase ( database2 );
// now all the rows of the CSV file have been transferred into database2,
// but only columns 1, 3 and 4 of the CSV file have been kept.
// 3/ another possibility to initialize a DatabaseTable, ignoring
// some columns:
gum::learning::DBInitializerFromCSV<> initializer3 ( "asia.csv" );
gum::learning::DatabaseTable<> database3 ( translator_set );
// here, database3 is an empty database but it contains already
// translators for all the columns of the CSV file. We shall now remove
// the columns/translators that are not wanted anymore
database3.ignoreColumn ( 0 );
database3.ignoreColumn ( 2 );
database3.ignoreColumn ( 5 );
database3.ignoreColumn ( 6 );
database3.ignoreColumn ( 7 );
// asia contains 8 columns. The above ignoreColumns keep only columns
// 1, 3 and 4.
// sets the names of the columns correctly
database3.setVariableNames( initializer3.variableNames () );
// fill the rows:
initializer3.fillDatabase ( database3 );
// now all the rows of the CSV file have been transferred into database3,
// but only columns 1, 3 and 4 of the CSV file have been kept.
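
The effect of ignoring columns in examples 2 and 3 can be illustrated without aGrUM. The sketch below is a standalone toy, not library code: dropColumns and the Row alias are hypothetical names introduced only for this illustration, and the real DatabaseTable works through translators rather than by copying strings.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Hypothetical helper (not an aGrUM function): copies `rows`, dropping every
// column whose index appears in `ignored`.
std::vector<Row> dropColumns(const std::vector<Row>& rows,
                             const std::set<std::size_t>& ignored) {
  std::vector<Row> result;
  result.reserve(rows.size());
  for (const Row& row : rows) {
    Row kept;
    for (std::size_t col = 0; col < row.size(); ++col) {
      // keep the cell only if its column index is not in the ignored set
      if (ignored.count(col) == 0) kept.push_back(row[col]);
    }
    result.push_back(kept);
  }
  return result;
}
```

With an 8-column row and the ignored set {0, 2, 5, 6, 7}, only the cells of columns 1, 3 and 4 survive, which mirrors the ignoreColumn calls of example 3.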

Definition at line 129 of file DBInitializerFromCSV.h.

Member Typedef Documentation

template<template< typename > class ALLOC = std::allocator>
using gum::learning::DBInitializerFromCSV< ALLOC >::allocator_type = ALLOC< std::string >

type for the allocators passed in arguments of methods

Definition at line 132 of file DBInitializerFromCSV.h.

Member Enumeration Documentation

template<template< typename > class ALLOC>
enum gum::learning::IDBInitializer::InputType : char
inherited

the enumeration indicating the type of the data the IDBInitializer expects as input data

Enumerator
STRING 
DBCELL 

Definition at line 118 of file IDBInitializer.h.

enum class InputType : char { STRING, DBCELL };

Constructor & Destructor Documentation

template<template< typename > class ALLOC = std::allocator>
gum::learning::DBInitializerFromCSV< ALLOC >::DBInitializerFromCSV ( const std::string  filename,
bool  fileContainsNames = true,
const std::string  delimiter = ",",
const char  commentmarker = '#',
const char  quoteMarker = '"',
const allocator_type &  alloc = allocator_type() 
)

default constructor

Parameters
filename           the name of the CSV file
fileContainsNames  a Boolean indicating whether the first line of the CSV file contains the names of the columns
delimiter          the character that acts as the column separator in the CSV file
commentmarker      the character that marks the beginning of a comment
quoteMarker        the character that is used to quote strings in the CSV file
alloc              the allocator used by all the methods
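
To make the roles of delimiter, commentmarker and quoteMarker concrete, here is a minimal standalone sketch of splitting a single CSV line. It is not aGrUM's parser: the function name splitCsvLine and its simplified rules (no escaped quotes, quote characters dropped from the output, comment markers ignored inside quotes) are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of aGrUM): splits one line into fields.
// - delimiter:     string separating two columns
// - commentMarker: everything from this character to the end of line is dropped
// - quoteMarker:   delimiters and comment markers inside quotes are kept verbatim
std::vector<std::string> splitCsvLine(const std::string& line,
                                      const std::string& delimiter = ",",
                                      char commentMarker = '#',
                                      char quoteMarker = '"') {
  assert(!delimiter.empty());
  std::vector<std::string> fields;
  std::string current;
  bool inQuotes = false;
  std::size_t i = 0;
  while (i < line.size()) {
    const char c = line[i];
    if (c == quoteMarker) {
      inQuotes = !inQuotes;  // toggle quoting; the quote itself is not stored
      ++i;
    } else if (!inQuotes && c == commentMarker) {
      break;  // the rest of the line is a comment
    } else if (!inQuotes &&
               line.compare(i, delimiter.size(), delimiter) == 0) {
      fields.push_back(current);  // a delimiter closes the current field
      current.clear();
      i += delimiter.size();
    } else {
      current += c;
      ++i;
    }
  }
  fields.push_back(current);  // the last field has no trailing delimiter
  return fields;
}
```

Under these assumptions, a quoted field may contain the delimiter, and a trailing comment is discarded before the last field is stored.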
template<template< typename > class ALLOC = std::allocator>
gum::learning::DBInitializerFromCSV< ALLOC >::DBInitializerFromCSV ( const DBInitializerFromCSV< ALLOC > &  from)

copy constructor

the new initializer points to the same file as from, but it reparses it from scratch.

template<template< typename > class ALLOC = std::allocator>
gum::learning::DBInitializerFromCSV< ALLOC >::DBInitializerFromCSV ( const DBInitializerFromCSV< ALLOC > &  from,
const allocator_type &  alloc 
)

copy constructor with a given allocator

the new initializer points to the same file as from, but it reparses it from scratch.
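
The "same file, reparsed from scratch" copy semantics can be sketched with a toy reader. This is only an illustration under stated assumptions: ToyCsvReader and nextLine are hypothetical names, and nothing here reflects how DBInitializerFromCSV is actually implemented.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Toy stand-in (not aGrUM code): a line reader bound to a file name.
class ToyCsvReader {
 public:
  explicit ToyCsvReader(const std::string& filename)
      : filename_(filename), stream_(filename) {}

  // Copy: same file name, but a fresh stream positioned at the first line,
  // whatever the reading position of `from` was.
  ToyCsvReader(const ToyCsvReader& from)
      : filename_(from.filename_), stream_(from.filename_) {}

  // Reads the next line into `line`; returns false at end of file.
  bool nextLine(std::string& line) {
    return static_cast<bool>(std::getline(stream_, line));
  }

 private:
  std::string filename_;
  std::ifstream stream_;
};
```

A copy made after the original has consumed some lines still starts at the first line, and reading from the copy does not move the original's position.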

template<template< typename > class ALLOC = std::allocator>
gum::learning::DBInitializerFromCSV< ALLOC >::DBInitializerFromCSV ( DBInitializerFromCSV< ALLOC > &&  from)

move constructor

template<template< typename > class ALLOC = std::allocator>
gum::learning::DBInitializerFromCSV< ALLOC >::DBInitializerFromCSV ( DBInitializerFromCSV< ALLOC > &&  from,
const allocator_type &  alloc 
)

move constructor with a given allocator

template<template< typename > class ALLOC = std::allocator>
virtual gum::learning::DBInitializerFromCSV< ALLOC >::~DBInitializerFromCSV ( )
virtual

destructor

Member Function Documentation

template<template< typename > class ALLOC>
virtual const DBRow< DBCell, ALLOC >& gum::learning::IDBInitializer< ALLOC >::_currentDBCellRow ( )
protected virtual inherited

asks the child class for the content of the current row using dbcells

If the child class parses DBRows, this method should be overridden.

template<template< typename > class ALLOC = std::allocator>
virtual const std::vector< std::string, ALLOC< std::string > >& gum::learning::DBInitializerFromCSV< ALLOC >::_currentStringRow ( )
final protected virtual

returns the content of the current row using strings

Reimplemented from gum::learning::IDBInitializer< ALLOC >.

template<template< typename > class ALLOC = std::allocator>
virtual bool gum::learning::DBInitializerFromCSV< ALLOC >::_nextRow ( )
final protected virtual

indicates whether there is a next row to read (and, if so, points to it)

Implements gum::learning::IDBInitializer< ALLOC >.

template<template< typename > class ALLOC = std::allocator>
virtual std::vector< std::string, ALLOC< std::string > > gum::learning::DBInitializerFromCSV< ALLOC >::_variableNames ( )
final protected virtual

returns the names of the variables

Implements gum::learning::IDBInitializer< ALLOC >.

template<template< typename > class ALLOC = std::allocator>
virtual DBInitializerFromCSV< ALLOC >* gum::learning::DBInitializerFromCSV< ALLOC >::clone ( ) const
virtual

virtual copy constructor

Implements gum::learning::IDBInitializer< ALLOC >.

template<template< typename > class ALLOC = std::allocator>
virtual DBInitializerFromCSV< ALLOC >* gum::learning::DBInitializerFromCSV< ALLOC >::clone ( const allocator_type &  alloc) const
virtual

virtual copy constructor with a given allocator

Implements gum::learning::IDBInitializer< ALLOC >.
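
The "virtual copy constructor" wording refers to the usual C++ clone idiom. The sketch below shows that idiom with hypothetical stand-in classes (ToyInitializer, ToyCsvInitializer, duplicate); it assumes, as the raw-pointer return type suggests, that the caller owns the returned object, which this page does not state explicitly.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an abstract initializer interface.
class ToyInitializer {
 public:
  virtual ~ToyInitializer() = default;
  virtual ToyInitializer* clone() const = 0;  // "virtual copy constructor"
  virtual std::string source() const = 0;
};

// Hypothetical stand-in for a CSV-based initializer.
class ToyCsvInitializer : public ToyInitializer {
 public:
  explicit ToyCsvInitializer(std::string filename)
      : filename_(std::move(filename)) {}

  // Covariant return type: the override returns the derived pointer type.
  ToyCsvInitializer* clone() const override {
    return new ToyCsvInitializer(*this);
  }

  std::string source() const override { return filename_; }

 private:
  std::string filename_;
};

// Copies an initializer through a base-class reference and wraps the raw
// pointer in a unique_ptr so that ownership is explicit.
std::unique_ptr<ToyInitializer> duplicate(const ToyInitializer& original) {
  ToyInitializer* copy = original.clone();
  assert(copy != nullptr);
  return std::unique_ptr<ToyInitializer>(copy);
}
```

The point of the idiom is that duplicate never names the concrete class, yet the copy has the same dynamic type as the original.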

template<template< typename > class ALLOC>
template<template< template< typename > class > class DATABASE>
void gum::learning::IDBInitializer< ALLOC >::fillDatabase ( DATABASE< ALLOC > &  database,
const bool  retry_insertion = false 
)
inherited

fills the rows of the database table

This method may raise exceptions when trying to insert new rows into the database table. See the insertRow() method of the database table.

Referenced by gum::learning::genericBNLearner::__readFile(), gum::learning::genericBNLearner::Database::Database(), and gum::learning::readFile().


template<template< typename > class ALLOC>
allocator_type gum::learning::IDBInitializer< ALLOC >::getAllocator ( ) const
inherited

returns the allocator used

template<template< typename > class ALLOC = std::allocator>
DBInitializerFromCSV< ALLOC >& gum::learning::DBInitializerFromCSV< ALLOC >::operator= ( const DBInitializerFromCSV< ALLOC > &  from)

copy operator

the initializer points to the same file as from, but it reparses it from scratch.

template<template< typename > class ALLOC = std::allocator>
DBInitializerFromCSV< ALLOC >& gum::learning::DBInitializerFromCSV< ALLOC >::operator= ( DBInitializerFromCSV< ALLOC > &&  from)

move operator

the initializer points to the same file as from, but it reparses it from scratch.

template<template< typename > class ALLOC>
std::size_t gum::learning::IDBInitializer< ALLOC >::throwingColumn ( ) const
inherited

This method indicates which column filling raised an exception, if any, during the execution of fillDatabase.
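
The interplay between an exception raised while filling and a throwingColumn-style accessor can be sketched with a toy class. Everything below is hypothetical (ToyFiller, its fill method, the use of std::stoi as the failing conversion) and only illustrates the general pattern of recording the offending column before propagating the exception; it is not aGrUM's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in (not aGrUM code): converts rows of strings to ints and
// remembers which column made a conversion fail.
class ToyFiller {
 public:
  void fill(const std::vector<std::vector<std::string>>& rows,
            std::vector<std::vector<int>>& table) {
    for (const auto& row : rows) {
      std::vector<int> converted;
      for (std::size_t col = 0; col < row.size(); ++col) {
        try {
          converted.push_back(std::stoi(row[col]));
        } catch (const std::exception&) {
          throwing_col_ = col;  // remember the culprit, then propagate
          throw;
        }
      }
      table.push_back(converted);  // only fully converted rows are stored
    }
  }

  // Index of the column whose conversion raised the last exception.
  std::size_t throwingColumn() const { return throwing_col_; }

 private:
  std::size_t throwing_col_ = 0;
};
```

A caller would wrap fill in a try block and, in the handler, query throwingColumn() to decide how to repair or skip the offending column.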

template<template< typename > class ALLOC>
const std::vector< std::string, ALLOC< std::string > >& gum::learning::IDBInitializer< ALLOC >::variableNames ( )
inherited

returns the names of the variables in the input dataset

Referenced by gum::learning::genericBNLearner::__readFile(), gum::learning::genericBNLearner::Database::Database(), and gum::learning::readFile().


