aGrUM  0.20.3
a C++ library for (probabilistic) graphical models
gum::SDYNA Class Reference

The general SDyna architecture abstract class. More...

#include <agrum/FMDP/SDyna/sdyna.h>


Public Member Functions

std::string toString ()
 Returns a string describing the learned FMDP and the associated optimal policy, both in DOT format.
 
std::string optimalPolicy2String ()
 Returns a string describing the optimal policy in DOT format.
 
Problem specification methods
void addAction (const Idx actionId, const std::string &actionName)
 Inserts a new action in the SDyna instance.
 
void addVariable (const DiscreteVariable *var)
 Inserts a new variable in the SDyna instance.
 
Initialization
void initialize ()
 Initializes the SDyna instance.
 
void initialize (const Instantiation &initialState)
 Initializes the SDyna instance at the given state.
 
Incremental methods
void setCurrentState (const Instantiation &currentState)
 Sets the last visited state to the given state.
 
Idx takeAction (const Instantiation &curState)
 Returns the id of the action to perform in the given state.
 
Idx takeAction ()
 Returns the id of the action to perform in the last known state.
 
void feedback (const Instantiation &originalState, const Instantiation &reachedState, Idx performedAction, double obtainedReward)
 Performs a feedback on the last transition.
 
void feedback (const Instantiation &reachedState, double obtainedReward)
 Performs a feedback on the last transition.
 
void makePlanning (Idx nbStep)
 Starts a new planning phase.
 
Size methods

Methods giving the size of the different data structures, for performance evaluation purposes only.

Size learnerSize ()
 Returns the current size of the learner's data structures.
 
Size modelSize ()
 Returns the current size of the learned FMDP.
 
Size valueFunctionSize ()
 Returns the current size of the value function.
 
Size optimalPolicySize ()
 Returns the current size of the optimal policy.
 

Static Public Member Functions

static SDYNA * spitiInstance (double attributeSelectionThreshold=0.99, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDyna instance implementing SPITI: a tree-based FMDP learner (chi2 tests), structured value iteration, and epsilon-greedy action selection.
 
static SDYNA * spimddiInstance (double attributeSelectionThreshold=0.99, double similarityThreshold=0.3, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDyna instance implementing SPIMDDI: an MDD-based FMDP learner (IMDDI), SPUMDD planning, and epsilon-greedy action selection.
 
static SDYNA * RMaxMDDInstance (double attributeSelectionThreshold=0.99, double similarityThreshold=0.3, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDyna instance combining an MDD-based learner with the Adaptive R-Max planner and decider (reduced and ordered instance).
 
static SDYNA * RMaxTreeInstance (double attributeSelectionThreshold=0.99, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDyna instance combining a tree-based learner with the Adaptive R-Max planner and decider (tree instance).
 
static SDYNA * RandomMDDInstance (double attributeSelectionThreshold=0.99, double similarityThreshold=0.3, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDyna instance with an MDD-based learner, SPUMDD planning, and random action selection.
 
static SDYNA * RandomTreeInstance (double attributeSelectionThreshold=0.99, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDyna instance with a tree-based learner, structured value iteration, and random action selection.
 

Protected Attributes

FMDP< double > * fmdp_
 The learned factored Markov Decision Process.
 
Instantiation lastState_
 The state in which the system is before we perform a new action.
 

Constructor & destructor.

 SDYNA (ILearningStrategy *learner, IPlanningStrategy< double > *planer, IDecisionStrategy *decider, Idx observationPhaseLenght, Idx nbValueIterationStep, bool actionReward, bool verbose=true)
 Constructor.
 
 ~SDYNA ()
 Destructor.
 

Detailed Description

The general SDyna architecture abstract class.

Implementations of the SDyna architecture should inherit from this class.

Definition at line 66 of file sdyna.h.
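
A minimal usage sketch (not taken from the library documentation): it assumes a hypothetical Environment class exposing its state variables, its action set, its current state as a gum::Instantiation, and a step() method returning the obtained reward. Only the gum::SDYNA members documented on this page are used.

#include <iostream>
#include <agrum/FMDP/SDyna/sdyna.h>

// 'Environment' is a hypothetical simulator, not part of aGrUM.
void runEpisode(Environment& env, gum::Idx nbSteps) {
  // SPITI instantiation: tree-based learner, structured value iteration,
  // epsilon-greedy action selection (see spitiInstance() below).
  gum::SDYNA* agent = gum::SDYNA::spitiInstance();

  // Problem specification: variables and actions are registered on the FMDP
  // but only take effect once initialize() is called.
  for (const auto var : env.variables())     // hypothetical accessor
    agent->addVariable(var);
  for (const auto& act : env.actions())      // hypothetical accessor
    agent->addAction(act.first, act.second);

  agent->initialize(env.state());            // start from the environment's current state

  for (gum::Idx i = 0; i < nbSteps; ++i) {
    gum::Idx action = agent->takeAction();   // action chosen for the cached state
    double   reward = env.step(action);      // hypothetical transition + reward
    agent->feedback(env.state(), reward);    // learn from the observed transition
  }

  std::cout << agent->toString() << std::endl;   // learned FMDP + policy in DOT format
  delete agent;
}

Planning is triggered from feedback() every observationPhaseLenght observations (see feedback() below), so this loop needs no explicit call to makePlanning().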

Constructor & Destructor Documentation

◆ SDYNA()

gum::SDYNA::SDYNA ( ILearningStrategy *  learner,
IPlanningStrategy< double > *  planer,
IDecisionStrategy *  decider,
Idx  observationPhaseLenght,
Idx  nbValueIterationStep,
bool  actionReward,
bool  verbose = true 
)
private

Constructor.

Returns
an instance of the SDyna architecture

Definition at line 57 of file sdyna.cpp.


gum::SDYNA::SDYNA(ILearningStrategy*           learner,
                  IPlanningStrategy< double >* planer,
                  IDecisionStrategy*           decider,
                  Idx                          observationPhaseLenght,
                  Idx                          nbValueIterationStep,
                  bool                         actionReward,
                  bool                         verbose) :
    _learner_(learner), _planer_(planer), _decider_(decider),
    _observationPhaseLenght_(observationPhaseLenght),
    _nbValueIterationStep_(nbValueIterationStep), _actionReward_(actionReward),
    verbose_(verbose) {
  GUM_CONSTRUCTOR(SDYNA);

  fmdp_ = new FMDP< double >();

  _nbObservation_ = 1;
}

◆ ~SDYNA()

gum::SDYNA::~SDYNA ( )

Destructor.

Definition at line 78 of file sdyna.cpp.


gum::SDYNA::~SDYNA() {
  delete _decider_;

  delete _learner_;

  delete _planer_;

  for (auto obsIter = _bin_.beginSafe(); obsIter != _bin_.endSafe(); ++obsIter)
    delete *obsIter;

  delete fmdp_;

  GUM_DESTRUCTOR(SDYNA);
}

Member Function Documentation

◆ addAction()

void gum::SDYNA::addAction ( const Idx  actionId,
const std::string &  actionName 
)
inline

Inserts a new action in the SDyna instance.

Warning
Has no effect until the initialize() method is called.
Parameters
actionId: an id identifying the action
actionName: its human-readable name

Definition at line 238 of file sdyna.h.

{
  fmdp_->addAction(actionId, actionName);
}

◆ addVariable()

void gum::SDYNA::addVariable ( const DiscreteVariable *  var)
inline

Inserts a new variable in the SDyna instance.

Warning
Has no effect until the initialize() method is called.
Parameters
var: the variable to add. The variable need not have all its modalities specified; missing modalities will be discovered by the SDyna architecture during the process.

Definition at line 252 of file sdyna.h.

{ fmdp_->addVariable(var); }
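
A short problem-specification sketch. The traffic-light variable, the action ids and the include path are illustrative assumptions, not taken from this page; only addVariable(), addAction() and initialize() are the documented SDyna calls.

#include <agrum/tools/variables/labelizedVariable.h>   // include path may differ across aGrUM versions
#include <agrum/FMDP/SDyna/sdyna.h>

void describeProblem(gum::SDYNA& agent) {
  // A state variable with two modalities. The pointer passed to addVariable()
  // is kept by the underlying FMDP, so the variable must outlive the agent;
  // it is made static here only for the sake of the sketch.
  static gum::LabelizedVariable light("light", "traffic light colour", 0);
  light.addLabel("red");
  light.addLabel("green");

  agent.addVariable(&light);    // no effect until initialize() is called
  agent.addAction(1, "stop");   // action ids are arbitrary but must be distinct
  agent.addAction(2, "go");

  agent.initialize();           // makes the declared variables and actions effective
}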

◆ feedback() [1/2]

void gum::SDYNA::feedback ( const Instantiation &  originalState,
const Instantiation &  reachedState,
Idx  performedAction,
double  obtainedReward 
)

Performs a feedback on the last transition.

Incremental methods.

That is, learn from the transition.

Parameters
originalState: the state we were in before the transition
reachedState: the state we reached after
performedAction: the action we performed
obtainedReward: the reward we obtained

Definition at line 129 of file sdyna.cpp.


{
  _lastAction_ = lastAction;
  lastState_   = prevState;
  feedback(curState, reward);
}

◆ feedback() [2/2]

void gum::SDYNA::feedback ( const Instantiation &  reachedState,
double  obtainedReward 
)

Performs a feedback on the last transition.

That is, learn from the transition.

Parameters
reachedState: the state reached after the transition
obtainedReward: the reward obtained during the transition
Warning
Uses the originalState and performedAction stored in cache. To specify the original state and the performed action explicitly, use the four-argument overload.

Definition at line 149 of file sdyna.cpp.


{
  Observation* obs = new Observation();

  for (auto varIter = lastState_.variablesSequence().beginSafe();
       varIter != lastState_.variablesSequence().endSafe();
       ++varIter)
    obs->setModality(*varIter, lastState_.val(**varIter));

  for (auto varIter = newState.variablesSequence().beginSafe();
       varIter != newState.variablesSequence().endSafe();
       ++varIter) {
    obs->setModality(fmdp_->main2prime(*varIter), newState.val(**varIter));

    if (this->_actionReward_)
      obs->setRModality(*varIter, lastState_.val(**varIter));
    else
      obs->setRModality(*varIter, newState.val(**varIter));
  }

  obs->setReward(reward);

  // (the call passing the observation to the learner is elided in this extract)
  _bin_.insert(obs);

  setCurrentState(newState);
  // (the lines checking the state with the decider and triggering makePlanning
  //  every _observationPhaseLenght_ observations are elided in this extract)

  _nbObservation_++;
}
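
A brief sketch contrasting the two overloads (the replayTransition helper is illustrative): the four-argument feedback() is the variant to use when the transition was not produced by the agent's own takeAction() call, for instance when replaying logged experience, while the two-argument variant relies on the cached state and action.

#include <agrum/FMDP/SDyna/sdyna.h>

// Illustrative helper: feed one externally observed transition to the agent.
void replayTransition(gum::SDYNA&               agent,
                      const gum::Instantiation& stateBefore,
                      const gum::Instantiation& stateAfter,
                      gum::Idx                  performedAction,
                      double                    obtainedReward) {
  // Caches stateBefore and performedAction, then defers to
  // feedback(stateAfter, obtainedReward), which records the observation and
  // periodically triggers a new planning phase.
  agent.feedback(stateBefore, stateAfter, performedAction, obtainedReward);
}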

◆ initialize() [1/2]

void gum::SDYNA::initialize ( )

Initializes the SDyna instance.

Definition at line 97 of file sdyna.cpp.


{
  // (body elided in this extract: the learner, planer and decider are
  //  initialised with the fmdp_ member)
}

◆ initialize() [2/2]

void gum::SDYNA::initialize ( const Instantiation &  initialState)

Initializes the SDyna instance at the given state.

Parameters
initialState: the state of the studied system from which we will begin the explore, learn and exploit process

Definition at line 110 of file sdyna.cpp.


{
  initialize();
  setCurrentState(initialState);
}

◆ learnerSize()

Size gum::SDYNA::learnerSize ( )
inline

learnerSize

Returns
the current size of the learner's data structures
Definition at line 379 of file sdyna.h.

{ return _learner_->size(); }

◆ makePlanning()

void gum::SDYNA::makePlanning ( Idx  nbStep)

Starts a new planning.

Parameters
nbStep: the maximal number of value iteration steps performed during this planning phase

Definition at line 188 of file sdyna.cpp.


{
  if (verbose_) std::cout << "Updating decision trees ..." << std::endl;
  // (the call updating the FMDP through the learner is elided in this extract)
  // std::cout << "Done" << std::endl;

  if (verbose_) std::cout << "Planning ..." << std::endl;
  _planer_->makePlanning(nbValueIterationStep);
  // std::cout << "Done" << std::endl;

  // (the call handing the resulting optimal policy to the decider is elided in this extract)
}
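
Planning normally happens from feedback() on the observation schedule, but a phase can also be forced by hand; a small sketch (the value 20 and the refreshPolicy wrapper are arbitrary illustrations):

#include <iostream>
#include <agrum/FMDP/SDyna/sdyna.h>

// Illustrative: force an extra planning phase outside the automatic schedule.
void refreshPolicy(gum::SDYNA& agent) {
  agent.makePlanning(20);   // at most 20 value iteration steps
  std::cout << "optimal policy size: " << agent.optimalPolicySize() << std::endl
            << "value function size: " << agent.valueFunctionSize() << std::endl;
}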

◆ modelSize()

Size gum::SDYNA::modelSize ( )
inline

modelSize

Returns
the current size of the learned FMDP
Definition at line 387 of file sdyna.h.

{ return fmdp_->size(); }

◆ optimalPolicy2String()

std::string gum::SDYNA::optimalPolicy2String ( )
inline

Returns a string describing the optimal policy in DOT format.

Definition at line 363 of file sdyna.h.

{ return _planer_->optimalPolicy2String(); }

◆ optimalPolicySize()

Size gum::SDYNA::optimalPolicySize ( )
inline

optimalPolicySize

Returns
the current size of the optimal policy computed so far
Definition at line 403 of file sdyna.h.

{ return _planer_->optimalPolicySize(); }

◆ RandomMDDInstance()

static SDYNA* gum::SDYNA::RandomMDDInstance ( double  attributeSelectionThreshold = 0.99,
double  similarityThreshold = 0.3,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDyna instance with an MDD-based learner (IMDDI), SPUMDD planning, and random action selection.

Definition at line 157 of file sdyna.h.

{
  bool actionReward = true;
  ILearningStrategy* ls
     = new FMDPLearner< GTEST, GTEST, IMDDILEARNER >(attributeSelectionThreshold,
                                                     actionReward,
                                                     similarityThreshold);
  IPlanningStrategy< double >* ps   // (declaration line elided in the Doxygen extract)
     = StructuredPlaner< double >::spumddInstance(discountFactor, epsilon);
  IDecisionStrategy* ds = new RandomDecider();
  return new SDYNA(ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}

◆ RandomTreeInstance()

static SDYNA* gum::SDYNA::RandomTreeInstance ( double  attributeSelectionThreshold = 0.99,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDyna instance with a tree-based learner (ITI, chi2 tests), structured value iteration, and random action selection.

Definition at line 177 of file sdyna.h.

{
  bool actionReward = true;
  ILearningStrategy* ls
     = new FMDPLearner< CHI2TEST, CHI2TEST, ITILEARNER >(attributeSelectionThreshold,
                                                         actionReward);
  IPlanningStrategy< double >* ps   // (declaration line elided in the Doxygen extract)
     = StructuredPlaner< double >::sviInstance(discountFactor, epsilon);
  IDecisionStrategy* ds = new RandomDecider();
  return new SDYNA(ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}

◆ RMaxMDDInstance()

static SDYNA* gum::SDYNA::RMaxMDDInstance ( double  attributeSelectionThreshold = 0.99,
double  similarityThreshold = 0.3,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDyna instance combining an MDD-based learner (IMDDI) with the Adaptive R-Max planner and decider (reduced and ordered instance).

Definition at line 119 of file sdyna.h.

{
  bool actionReward = true;
  ILearningStrategy* ls
     = new FMDPLearner< GTEST, GTEST, IMDDILEARNER >(attributeSelectionThreshold,
                                                     actionReward,
                                                     similarityThreshold);
  AdaptiveRMaxPlaner* rm
     = AdaptiveRMaxPlaner::ReducedAndOrderedInstance(ls, discountFactor, epsilon);
  IPlanningStrategy< double >* ps = rm;   // (line elided in the Doxygen extract; rm also serves as the planner)
  IDecisionStrategy* ds = rm;
  return new SDYNA(ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}

◆ RMaxTreeInstance()

static SDYNA* gum::SDYNA::RMaxTreeInstance ( double  attributeSelectionThreshold = 0.99,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDyna instance combining a tree-based learner (ITI) with the Adaptive R-Max planner and decider (tree instance).

Definition at line 140 of file sdyna.h.

{
  bool actionReward = true;
  ILearningStrategy* ls
     = new FMDPLearner< GTEST, GTEST, ITILEARNER >(attributeSelectionThreshold, actionReward);
  AdaptiveRMaxPlaner* rm = AdaptiveRMaxPlaner::TreeInstance(ls, discountFactor, epsilon);
  IPlanningStrategy< double >* ps = rm;   // (line elided in the Doxygen extract; rm also serves as the planner)
  IDecisionStrategy* ds = rm;
  return new SDYNA(ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}

◆ setCurrentState()

void gum::SDYNA::setCurrentState ( const Instantiation &  currentState)
inline

Sets last state visited to the given state.

During the learning process, the system is considered to have been in this state before the transition.

Parameters
currentState: the state

Definition at line 294 of file sdyna.h.

{ lastState_ = currentState; }

◆ spimddiInstance()

static SDYNA* gum::SDYNA::spimddiInstance ( double  attributeSelectionThreshold = 0.99,
double  similarityThreshold = 0.3,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDyna instance implementing SPIMDDI: an MDD-based learner (IMDDI), SPUMDD planning, and epsilon-greedy action selection.

Definition at line 93 of file sdyna.h.

{
  bool actionReward = false;
  ILearningStrategy* ls
     = new FMDPLearner< GTEST, GTEST, IMDDILEARNER >(attributeSelectionThreshold,
                                                     actionReward,
                                                     similarityThreshold);
  IPlanningStrategy< double >* ps   // (declaration line elided in the Doxygen extract)
     = StructuredPlaner< double >::spumddInstance(discountFactor, epsilon, false);
  IDecisionStrategy* ds = new E_GreedyDecider();
  return new SDYNA(ls,
                   ps,
                   ds,
                   observationPhaseLenght,
                   nbValueIterationStep,
                   actionReward,
                   false);
}

◆ spitiInstance()

static SDYNA* gum::SDYNA::spitiInstance ( double  attributeSelectionThreshold = 0.99,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDyna instance implementing SPITI: a tree-based learner (ITI, chi2 tests), structured value iteration, and epsilon-greedy action selection.

Definition at line 75 of file sdyna.h.

{
  bool actionReward = false;
  ILearningStrategy* ls
     = new FMDPLearner< CHI2TEST, CHI2TEST, ITILEARNER >(attributeSelectionThreshold,
                                                         actionReward);
  IPlanningStrategy< double >* ps   // (declaration line elided in the Doxygen extract)
     = StructuredPlaner< double >::sviInstance(discountFactor, epsilon);
  IDecisionStrategy* ds = new E_GreedyDecider();
  return new SDYNA(ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}

◆ takeAction() [1/2]

Idx gum::SDYNA::takeAction ( const Instantiation &  curState)

Returns
the id of the action the SDyna instance wishes to perform

Parameters
curState: the state we are currently in

Definition at line 206 of file sdyna.cpp.


{
  lastState_ = curState;
  return takeAction();
}

◆ takeAction() [2/2]

Idx gum::SDYNA::takeAction ( )
Returns
the id of the action the SDyna instance wishes to perform

Definition at line 216 of file sdyna.cpp.


{
  ActionSet actionSet = _decider_->stateOptimalPolicy(lastState_);
  if (actionSet.size() == 1) {
    _lastAction_ = actionSet[0];
  } else {
    Idx randy    = (Idx)((double)std::rand() / (double)RAND_MAX * actionSet.size());
    _lastAction_ = actionSet[randy == actionSet.size() ? 0 : randy];
  }
  return _lastAction_;
}

◆ toString()

std::string gum::SDYNA::toString ( )

Returns a description of the learned FMDP and the associated optimal policy.

Returns
a string describing the learned FMDP, and the associated optimal policy. Both in DOT language.

Definition at line 230 of file sdyna.cpp.


{
  std::stringstream description;

  description << fmdp_->toString() << std::endl;
  description << _planer_->optimalPolicy2String() << std::endl;

  return description.str();
}

◆ valueFunctionSize()

Size gum::SDYNA::valueFunctionSize ( )
inline

valueFunctionSize

Returns
the current size of the value function computed so far
Definition at line 395 of file sdyna.h.

{ return _planer_->vFunctionSize(); }

Member Data Documentation

◆ _actionReward_

bool gum::SDYNA::_actionReward_
private

Definition at line 441 of file sdyna.h.

◆ _bin_

Set< Observation* > gum::SDYNA::_bin_
private

Since SDYNA created these observations, it has to delete them when it is destroyed.

Definition at line 439 of file sdyna.h.

◆ _decider_

IDecisionStrategy* gum::SDYNA::_decider_
private

The decider.

Definition at line 423 of file sdyna.h.

◆ _lastAction_

Idx gum::SDYNA::_lastAction_
private

The last performed action.

Definition at line 436 of file sdyna.h.

◆ _learner_

ILearningStrategy* gum::SDYNA::_learner_
private

The learner used to learn the FMDP.

Definition at line 417 of file sdyna.h.

◆ _nbObservation_

Idx gum::SDYNA::_nbObservation_
private

The total number of observations made so far.

Definition at line 430 of file sdyna.h.

◆ _nbValueIterationStep_

Idx gum::SDYNA::_nbValueIterationStep_
private

The number of value iteration steps performed.

Definition at line 433 of file sdyna.h.

◆ _observationPhaseLenght_

Idx gum::SDYNA::_observationPhaseLenght_
private

The number of observations made before the planner is used again.

Definition at line 427 of file sdyna.h.

◆ _planer_

IPlanningStrategy< double >* gum::SDYNA::_planer_
private

The planner used to compute an optimal strategy.

Definition at line 420 of file sdyna.h.

◆ fmdp_

FMDP< double >* gum::SDYNA::fmdp_
protected

The learned factored Markov Decision Process.

Definition at line 410 of file sdyna.h.

◆ lastState_

Instantiation gum::SDYNA::lastState_
protected

The state in which the system is before we perform a new action.

Definition at line 413 of file sdyna.h.

◆ verbose_

bool gum::SDYNA::verbose_
private

Definition at line 443 of file sdyna.h.


The documentation for this class was generated from the following files: