aGrUM  0.14.2
gum::SDYNA Class Reference

The general SDyna architecture abstract class. More...

#include <agrum/FMDP/SDyna/sdyna.h>


Public Member Functions

std::string toString ()
 Returns a string describing the learned FMDP and the associated optimal policy, both in DOT format. More...
 
std::string optimalPolicy2String ()
 Returns a string describing the optimal policy in DOT format. More...
 
Problem specification methods
void addAction (const Idx actionId, const std::string &actionName)
 Inserts a new action in the SDyna instance. More...
 
void addVariable (const DiscreteVariable *var)
 Inserts a new variable in the SDyna instance. More...
 
Initialization
void initialize ()
 Initializes the SDyna instance. More...
 
void initialize (const Instantiation &initialState)
 Initializes the SDyna instance at the given state. More...
 
Incremental methods
void setCurrentState (const Instantiation &currentState)
 Sets last state visited to the given state. More...
 
Idx takeAction (const Instantiation &curState)
 Returns the id of the action the SDyna instance wishes to perform in the given state. More...
 
Idx takeAction ()
 Returns the id of the action the SDyna instance wishes to perform. More...
 
void feedback (const Instantiation &originalState, const Instantiation &reachedState, Idx performedAction, double obtainedReward)
 Performs a feedback on the last transition. More...
 
void feedback (const Instantiation &reachedState, double obtainedReward)
 Performs a feedback on the last transition. More...
 
void makePlanning (Idx nbStep)
 Starts a new planning. More...
 
Size methods

These methods report the sizes of the different data structures, for performance-evaluation purposes only.

Size learnerSize ()
 Returns the current size of the learner's data structures. More...
 
Size modelSize ()
 Returns the current size of the learned FMDP. More...
 
Size valueFunctionSize ()
 Returns the current size of the value function. More...
 
Size optimalPolicySize ()
 Returns the current size of the optimal policy. More...
 

Static Public Member Functions

static SDYNA * spitiInstance (double attributeSelectionThreshold=0.99, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Constructs a SDyna instance with a tree-based (ITI) learner, a structured value iteration planner and an epsilon-greedy decider (SPITI). More...
 
static SDYNA * spimddiInstance (double attributeSelectionThreshold=0.99, double similarityThreshold=0.3, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Constructs a SDyna instance with an MDD-based (IMDDI) learner, an SPUMDD structured planner and an epsilon-greedy decider (SPIMDDI). More...
 
static SDYNA * RMaxMDDInstance (double attributeSelectionThreshold=0.99, double similarityThreshold=0.3, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Constructs a SDyna instance with an MDD-based (IMDDI) learner and an adaptive RMax planner that also acts as the decider. More...
 
static SDYNA * RMaxTreeInstance (double attributeSelectionThreshold=0.99, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Constructs a SDyna instance with a tree-based (ITI) learner and an adaptive RMax planner that also acts as the decider. More...
 
static SDYNA * RandomMDDInstance (double attributeSelectionThreshold=0.99, double similarityThreshold=0.3, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Constructs a SDyna instance with an MDD-based (IMDDI) learner, an SPUMDD structured planner and a random decider. More...
 
static SDYNA * RandomTreeInstance (double attributeSelectionThreshold=0.99, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Constructs a SDyna instance with a tree-based (ITI) learner, a structured value iteration planner and a random decider. More...
 

Protected Attributes

FMDP< double > * _fmdp
 The learned Markov Decision Process (FMDP). More...
 
Instantiation _lastState
 The state in which the system is before we perform a new action. More...
 

Constructor & destructor.

 SDYNA (ILearningStrategy *learner, IPlanningStrategy< double > *planer, IDecisionStrategy *decider, Idx observationPhaseLenght, Idx nbValueIterationStep, bool actionReward, bool verbose=true)
 Constructor. More...
 
 ~SDYNA ()
 Destructor. More...
 

Detailed Description

The general SDyna architecture abstract class.

Any concrete instantiation of the SDyna architecture should inherit from this class.

Definition at line 63 of file sdyna.h.
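A typical use of the class follows an explore, learn and exploit loop: build an instance with one of the static factories, declare the variables and actions of the problem, initialize, then alternate takeAction() and feedback(). The following is a minimal sketch, not part of the generated documentation; it assumes gum::LabelizedVariable as a concrete DiscreteVariable, illustrative variable and action names, and placeholder transitions where a simulator or real system would normally provide the reached state and the reward (include paths may differ between aGrUM versions).

  #include <agrum/FMDP/SDyna/sdyna.h>
  #include <agrum/variables/labelizedVariable.h>   // path assumed for aGrUM 0.14.x

  void runSDyna() {
    // Build a SDyna instance with one of the provided factories (SPITI here).
    gum::SDYNA* sdyna = gum::SDYNA::spitiInstance();

    // Problem specification: variables and actions, effective once initialize() is called.
    gum::LabelizedVariable light("light", "is the light on", 2);
    sdyna->addVariable(&light);
    sdyna->addAction(1, "toggleSwitch");
    sdyna->addAction(2, "wait");

    // Initialization at a given starting state.
    gum::Instantiation initialState;
    initialState.add(light);
    initialState.chgVal(light, 0);
    sdyna->initialize(initialState);

    // Explore, learn and exploit.
    for (gum::Idx step = 0; step < 1000; ++step) {
      gum::Idx action = sdyna->takeAction();
      // The reached state and the reward would come from a simulator or the real
      // system; both are placeholders here.
      gum::Instantiation reachedState = initialState;
      double reward = 0.0;
      sdyna->feedback(reachedState, reward);
    }

    delete sdyna;
  }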

Constructor & Destructor Documentation

◆ SDYNA()

gum::SDYNA::SDYNA ( ILearningStrategy *  learner,
IPlanningStrategy< double > *  planer,
IDecisionStrategy *  decider,
Idx  observationPhaseLenght,
Idx  nbValueIterationStep,
bool  actionReward,
bool  verbose = true 
)
private

Constructor.

Returns
an instance of the SDyna architecture

Definition at line 54 of file sdyna.cpp.

References __nbObservation, and _fmdp.

Referenced by RandomMDDInstance(), RandomTreeInstance(), RMaxMDDInstance(), RMaxTreeInstance(), spimddiInstance(), and spitiInstance().

   :
   __learner(learner),
   __planer(planer), __decider(decider),
   __observationPhaseLenght(observationPhaseLenght),
   __nbValueIterationStep(nbValueIterationStep), __actionReward(actionReward),
   _verbose(verbose) {
  GUM_CONSTRUCTOR(SDYNA);

  _fmdp = new FMDP< double >();

  __nbObservation = 1;
}

◆ ~SDYNA()

gum::SDYNA::~SDYNA ( )

Destructor.

Definition at line 76 of file sdyna.cpp.

References __bin, __decider, __learner, __planer, and _fmdp.

Referenced by RandomTreeInstance().

{
  delete __decider;

  delete __learner;

  delete __planer;

  for (auto obsIter = __bin.beginSafe(); obsIter != __bin.endSafe(); ++obsIter)
    delete *obsIter;

  delete _fmdp;

  GUM_DESTRUCTOR(SDYNA);
}

Member Function Documentation

◆ addAction()

void gum::SDYNA::addAction ( const Idx  actionId,
const std::string &  actionName 
)
inline

Inserts a new action in the SDyna instance.

Warning
Has no effect until initialize() is called
Parameters
actionId: an id identifying the action
actionName: its human-readable name

Definition at line 233 of file sdyna.h.

References _fmdp, and gum::FMDP< GUM_SCALAR >::addAction().

{
  _fmdp->addAction(actionId, actionName);
}

◆ addVariable()

void gum::SDYNA::addVariable ( const DiscreteVariable *  var)
inline

Inserts a new variable in the SDyna instance.

Warning
Has no effect until initialize() is called
Parameters
var: the variable to be added. The variable may or may not have all its modalities specified; any missing modalities will be discovered by the SDyna architecture during the process

Definition at line 247 of file sdyna.h.

References _fmdp, gum::FMDP< GUM_SCALAR >::addVariable(), and initialize().

{ _fmdp->addVariable(var); }
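As a short, hypothetical sketch (variable and action names are illustrative), the problem is specified first and only takes effect once initialize() is called:

  gum::LabelizedVariable light("light", "is the light on", 2);   // assumed concrete DiscreteVariable
  sdyna->addVariable(&light);            // registered, but ...
  sdyna->addAction(1, "toggleSwitch");   // ... has no effect yet
  sdyna->initialize();                   // the FMDP now knows the variable and the action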

◆ feedback() [1/2]

void gum::SDYNA::feedback ( const Instantiation &  originalState,
const Instantiation &  reachedState,
Idx  performedAction,
double  obtainedReward 
)

Performs a feedback on the last transition.

Incremental methods.

In other words, learn from this transition.

Parameters
originalState: the state we were in before the transition
reachedState: the state we reached after
performedAction: the action we performed
obtainedReward: the reward we obtained

Definition at line 127 of file sdyna.cpp.

References __lastAction, and _lastState.

Referenced by setCurrentState().

{
  __lastAction = lastAction;
  _lastState = prevState;
  feedback(curState, reward);
}

◆ feedback() [2/2]

void gum::SDYNA::feedback ( const Instantiation &  reachedState,
double  obtainedReward 
)

Performs a feedback on the last transition.

In other words, learn from this transition.

Parameters
reachedState: the state reached after the transition
obtainedReward: the reward obtained during the transition
Warning
Uses the last visited state and the last performed action stored internally. To specify the original state and the performed action explicitly, use the other overload of feedback()

Definition at line 147 of file sdyna.cpp.

References __actionReward, __bin, __decider, __lastAction, __learner, __nbObservation, __nbValueIterationStep, __observationPhaseLenght, _fmdp, _lastState, gum::ILearningStrategy::addObservation(), gum::IDecisionStrategy::checkState(), gum::FMDP< GUM_SCALAR >::main2prime(), makePlanning(), setCurrentState(), gum::Observation::setModality(), gum::Observation::setReward(), gum::Observation::setRModality(), gum::Instantiation::val(), and gum::Instantiation::variablesSequence().

{
  Observation* obs = new Observation();

  for (auto varIter = _lastState.variablesSequence().beginSafe();
       varIter != _lastState.variablesSequence().endSafe();
       ++varIter)
    obs->setModality(*varIter, _lastState.val(**varIter));

  for (auto varIter = newState.variablesSequence().beginSafe();
       varIter != newState.variablesSequence().endSafe();
       ++varIter) {
    obs->setModality(_fmdp->main2prime(*varIter), newState.val(**varIter));

    if (this->__actionReward)
      obs->setRModality(*varIter, _lastState.val(**varIter));
    else
      obs->setRModality(*varIter, newState.val(**varIter));
  }

  obs->setReward(reward);

  // (call handing the observation to the learner elided in this generated listing)
  __bin.insert(obs);

  setCurrentState(newState);
  // (calls to the decider and to makePlanning elided in this generated listing)

  __nbObservation++;
}
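In practice this overload is called right after takeAction(), so that the cached state and action match the transition being reported. A minimal, hypothetical sketch (the simulator object is assumed):

  gum::Idx action = sdyna->takeAction();                 // caches the chosen action
  gum::Instantiation reached = simulator.step(action);   // hypothetical environment call
  double reward = simulator.reward();                    // hypothetical
  sdyna->feedback(reached, reward);                      // learn from the observed transition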

◆ initialize() [1/2]

void gum::SDYNA::initialize ( )

Initializes the Sdyna instance.

Definition at line 95 of file sdyna.cpp.

References __decider, __learner, __planer, _fmdp, gum::IDecisionStrategy::initialize(), gum::IPlanningStrategy< GUM_SCALAR >::initialize(), and gum::ILearningStrategy::initialize().

Referenced by addVariable(), and initialize().

{
  // (initialization of the learner, the planner and the decider elided in this generated listing)
}

◆ initialize() [2/2]

void gum::SDYNA::initialize ( const Instantiation &  initialState)

Initializes the Sdyna instance at given state.

Parameters
initialState: the state of the studied system from which the explore, learn and exploit process will begin

Definition at line 108 of file sdyna.cpp.

References initialize(), and setCurrentState().

{
  initialize();
  setCurrentState(initialState);
}
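A sketch of building the initial state, assuming the variables were previously registered with addVariable() (the variable name is illustrative):

  gum::Instantiation initialState;
  initialState.add(light);          // same DiscreteVariable object passed to addVariable()
  initialState.chgVal(light, 0);    // the light is off in the starting state
  sdyna->initialize(initialState);  // equivalent to initialize() followed by setCurrentState(initialState)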

◆ learnerSize()

Size gum::SDYNA::learnerSize ( )
inline

learnerSize

Returns
the current size of the learner's data structures
Definition at line 376 of file sdyna.h.

References __learner, and gum::ILearningStrategy::size().

{ return __learner->size(); }

◆ makePlanning()

void gum::SDYNA::makePlanning ( Idx  nbStep)

Starts a new planning.

Parameters
nbStep: the maximal number of value iteration steps performed during this planning

Definition at line 187 of file sdyna.cpp.

References __decider, __learner, __planer, _verbose, gum::IPlanningStrategy< GUM_SCALAR >::makePlanning(), gum::IPlanningStrategy< GUM_SCALAR >::optimalPolicy(), gum::IDecisionStrategy::setOptimalStrategy(), and gum::ILearningStrategy::updateFMDP().

Referenced by feedback(), and setCurrentState().

{
  if (_verbose) std::cout << "Updating decision trees ..." << std::endl;
  // (call to the learner's updateFMDP() elided in this generated listing)
  // std::cout << << "Done" << std::endl;

  if (_verbose) std::cout << "Planning ..." << std::endl;
  __planer->makePlanning(nbValueIterationStep);
  // std::cout << << "Done" << std::endl;

  // (call passing the planner's optimal policy to the decider elided in this generated listing)
}
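makePlanning() is also invoked from feedback() (see the references above); it can be called directly as well, for instance to force a replanning phase with a chosen number of value iteration steps (the value below is illustrative):

  sdyna->makePlanning(10);   // at most 10 value iteration steps in this planning phase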

◆ modelSize()

Size gum::SDYNA::modelSize ( )
inline

modelSize

Returns
the current size of the learned FMDP
Definition at line 384 of file sdyna.h.

References _fmdp, and gum::FMDP< GUM_SCALAR >::size().

{ return _fmdp->size(); }

◆ optimalPolicy2String()

std::string gum::SDYNA::optimalPolicy2String ( )
inline

Returns a string describing the optimal policy in DOT format.

Definition at line 360 of file sdyna.h.

References __planer, and gum::IPlanningStrategy< GUM_SCALAR >::optimalPolicy2String().

{ return __planer->optimalPolicy2String(); }

◆ optimalPolicySize()

Size gum::SDYNA::optimalPolicySize ( )
inline

optimalPolicySize

Returns
the current size of the optimal policy computed so far
Definition at line 400 of file sdyna.h.

References __planer, and gum::IPlanningStrategy< GUM_SCALAR >::optimalPolicySize().

{ return __planer->optimalPolicySize(); }

◆ RandomMDDInstance()

static SDYNA* gum::SDYNA::RandomMDDInstance ( double  attributeSelectionThreshold = 0.99,
double  similarityThreshold = 0.3,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Constructs a SDyna instance with an MDD-based (IMDDI) learner, an SPUMDD structured planner and a random decider.

Definition at line 153 of file sdyna.h.

References SDYNA(), and gum::StructuredPlaner< GUM_SCALAR >::spumddInstance().

{
  bool actionReward = true;
  ILearningStrategy* ls = new FMDPLearner< GTEST, GTEST, IMDDILEARNER >(
     attributeSelectionThreshold, actionReward, similarityThreshold);
  IPlanningStrategy< double >* ps =   // (this declaration is elided in the generated listing)
     StructuredPlaner< double >::spumddInstance(discountFactor, epsilon);
  IDecisionStrategy* ds = new RandomDecider();
  return new SDYNA(
     ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}

◆ RandomTreeInstance()

static SDYNA* gum::SDYNA::RandomTreeInstance ( double  attributeSelectionThreshold = 0.99,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Constructs a SDyna instance with a tree-based (ITI) learner, a structured value iteration planner and a random decider.

Definition at line 172 of file sdyna.h.

References SDYNA(), gum::StructuredPlaner< GUM_SCALAR >::sviInstance(), and ~SDYNA().

{
  bool actionReward = true;
  ILearningStrategy* ls = new FMDPLearner< CHI2TEST, CHI2TEST, ITILEARNER >(
     attributeSelectionThreshold, actionReward);
  IPlanningStrategy< double >* ps =   // (this declaration is elided in the generated listing)
     StructuredPlaner< double >::sviInstance(discountFactor, epsilon);
  IDecisionStrategy* ds = new RandomDecider();
  return new SDYNA(
     ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}

◆ RMaxMDDInstance()

static SDYNA* gum::SDYNA::RMaxMDDInstance ( double  attributeSelectionThreshold = 0.99,
double  similarityThreshold = 0.3,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Constructs a SDyna instance with an MDD-based (IMDDI) learner and an adaptive RMax planner (reduced and ordered MDD variant) that also acts as the decider.

Definition at line 114 of file sdyna.h.

References gum::AdaptiveRMaxPlaner::ReducedAndOrderedInstance(), and SDYNA().

{
  bool actionReward = true;
  ILearningStrategy* ls = new FMDPLearner< GTEST, GTEST, IMDDILEARNER >(
     attributeSelectionThreshold, actionReward, similarityThreshold);
  AdaptiveRMaxPlaner* rm = AdaptiveRMaxPlaner::ReducedAndOrderedInstance(
     ls, discountFactor, epsilon);
  IPlanningStrategy< double >* ps = rm;   // (elided in the generated listing)
  IDecisionStrategy* ds = rm;
  return new SDYNA(
     ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}

◆ RMaxTreeInstance()

static SDYNA* gum::SDYNA::RMaxTreeInstance ( double  attributeSelectionThreshold = 0.99,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Constructs a SDyna instance with a tree-based (ITI) learner and an adaptive RMax planner (tree variant) that also acts as the decider.

Definition at line 134 of file sdyna.h.

References SDYNA(), and gum::AdaptiveRMaxPlaner::TreeInstance().

{
  bool actionReward = true;
  ILearningStrategy* ls = new FMDPLearner< GTEST, GTEST, ITILEARNER >(
     attributeSelectionThreshold, actionReward);
  AdaptiveRMaxPlaner* rm =
     AdaptiveRMaxPlaner::TreeInstance(ls, discountFactor, epsilon);
  IPlanningStrategy< double >* ps = rm;   // (elided in the generated listing)
  IDecisionStrategy* ds = rm;
  return new SDYNA(
     ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}

◆ setCurrentState()

void gum::SDYNA::setCurrentState ( const Instantiation currentState)
inline

Sets last state visited to the given state.

During the learning process, we will consider that we were in this state before the transition.

Parameters
currentState: the state

Definition at line 289 of file sdyna.h.

References _lastState, feedback(), makePlanning(), takeAction(), and toString().

Referenced by feedback(), and initialize().

{
  _lastState = currentState;
}

◆ spimddiInstance()

static SDYNA* gum::SDYNA::spimddiInstance ( double  attributeSelectionThreshold = 0.99,
double  similarityThreshold = 0.3,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Constructs a SDyna instance with an MDD-based (IMDDI) learner, an SPUMDD structured planner and an epsilon-greedy decider (the SPIMDDI algorithm).

Definition at line 90 of file sdyna.h.

References SDYNA(), and gum::StructuredPlaner< GUM_SCALAR >::spumddInstance().

{
  bool actionReward = false;
  ILearningStrategy* ls = new FMDPLearner< GTEST, GTEST, IMDDILEARNER >(
     attributeSelectionThreshold, actionReward, similarityThreshold);
  IPlanningStrategy< double >* ps = StructuredPlaner< double >::spumddInstance(   // (declaration elided in the generated listing)
     discountFactor, epsilon, false);
  IDecisionStrategy* ds = new E_GreedyDecider();
  return new SDYNA(ls,
                   ps,
                   ds,
                   observationPhaseLenght,
                   nbValueIterationStep,
                   actionReward,
                   false);
}

◆ spitiInstance()

static SDYNA* gum::SDYNA::spitiInstance ( double  attributeSelectionThreshold = 0.99,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Constructs a SDyna instance with a tree-based (ITI) learner, a structured value iteration planner and an epsilon-greedy decider (the SPITI algorithm).

Definition at line 72 of file sdyna.h.

References SDYNA(), and gum::StructuredPlaner< GUM_SCALAR >::sviInstance().

{
  bool actionReward = false;
  ILearningStrategy* ls = new FMDPLearner< CHI2TEST, CHI2TEST, ITILEARNER >(
     attributeSelectionThreshold, actionReward);
  IPlanningStrategy< double >* ps =   // (this declaration is elided in the generated listing)
     StructuredPlaner< double >::sviInstance(discountFactor, epsilon);
  IDecisionStrategy* ds = new E_GreedyDecider();
  return new SDYNA(
     ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}
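All factories follow the same pattern; for instance, a SPITI instance with a lower discount factor and a longer observation phase (the values below are illustrative):

  gum::SDYNA* sdyna = gum::SDYNA::spitiInstance(
      0.95,   // attributeSelectionThreshold
      0.85,   // discountFactor
      1.0,    // epsilon
      200,    // observationPhaseLenght
      25);    // nbValueIterationStep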

◆ takeAction() [1/2]

Idx gum::SDYNA::takeAction ( const Instantiation &  curState)

Returns
the id of the action the SDyna instance wishes to be performed
Parameters
curState: the state in which we currently are

Definition at line 205 of file sdyna.cpp.

References _lastState, and takeAction().

{
  _lastState = curState;
  return takeAction();
}

◆ takeAction() [2/2]

Idx gum::SDYNA::takeAction ( )

Returns
the id of the action the SDyna instance wishes to be performed

Definition at line 215 of file sdyna.cpp.

References __decider, __lastAction, _lastState, gum::ActionSet::size(), and gum::IDecisionStrategy::stateOptimalPolicy().

Referenced by setCurrentState(), and takeAction().

{
  ActionSet actionSet = __decider->stateOptimalPolicy(_lastState);
  if (actionSet.size() == 1) {
    __lastAction = actionSet[0];
  } else {
    Idx randy = (Idx)((double)std::rand() / (double)RAND_MAX * actionSet.size());
    __lastAction = actionSet[randy == actionSet.size() ? 0 : randy];
  }
  return __lastAction;
}
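When the caller tracks states itself, the four-argument feedback() overload can be used instead of relying on the cached state; a hypothetical sketch (the simulator object is assumed):

  gum::Idx action = sdyna->takeAction(currentState);       // also updates the cached state
  gum::Instantiation nextState = simulator.step(action);   // hypothetical environment call
  sdyna->feedback(currentState, nextState, action, simulator.reward());
  currentState = nextState;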

◆ toString()

std::string gum::SDYNA::toString ( )

Returns a description of the learned FMDP and the associated optimal policy.

Returns
a string describing the learned FMDP and the associated optimal policy, both in DOT format.

Definition at line 229 of file sdyna.cpp.

References __planer, _fmdp, gum::IPlanningStrategy< GUM_SCALAR >::optimalPolicy2String(), and gum::FMDP< GUM_SCALAR >::toString().

Referenced by setCurrentState().

{
  std::stringstream description;

  description << _fmdp->toString() << std::endl;
  description << __planer->optimalPolicy2String() << std::endl;

  return description.str();
}
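The returned description is in DOT format and can be written to a file for rendering with Graphviz; a minimal sketch:

  #include <fstream>

  std::ofstream out("sdyna.dot");
  out << sdyna->toString();   // learned FMDP followed by the optimal policy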

◆ valueFunctionSize()

Size gum::SDYNA::valueFunctionSize ( )
inline

valueFunctionSize

Returns
the current size of the value function computed so far
Definition at line 392 of file sdyna.h.

References __planer, and gum::IPlanningStrategy< GUM_SCALAR >::vFunctionSize().

{ return __planer->vFunctionSize(); }
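The four size accessors are convenient for lightweight performance monitoring, for instance after each planning phase; a minimal sketch:

  #include <iostream>

  std::cout << "learner: "          << sdyna->learnerSize()
            << "  model: "          << sdyna->modelSize()
            << "  value function: " << sdyna->valueFunctionSize()
            << "  policy: "         << sdyna->optimalPolicySize() << std::endl;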

Member Data Documentation

◆ __actionReward

bool gum::SDYNA::__actionReward
private

Definition at line 438 of file sdyna.h.

Referenced by feedback().

◆ __bin

Set< Observation* > gum::SDYNA::__bin
private

Since SDYNA created these observations, it must delete them on destruction.

Definition at line 436 of file sdyna.h.

Referenced by feedback(), and ~SDYNA().

◆ __decider

IDecisionStrategy* gum::SDYNA::__decider
private

The decider.

Definition at line 420 of file sdyna.h.

Referenced by feedback(), initialize(), makePlanning(), takeAction(), and ~SDYNA().

◆ __lastAction

Idx gum::SDYNA::__lastAction
private

The last performed action.

Definition at line 433 of file sdyna.h.

Referenced by feedback(), and takeAction().

◆ __learner

ILearningStrategy* gum::SDYNA::__learner
private

The learner used to learn the FMDP.

Definition at line 414 of file sdyna.h.

Referenced by feedback(), initialize(), learnerSize(), makePlanning(), and ~SDYNA().

◆ __nbObservation

Idx gum::SDYNA::__nbObservation
private

The total number of observations made so far.

Definition at line 427 of file sdyna.h.

Referenced by feedback(), and SDYNA().

◆ __nbValueIterationStep

Idx gum::SDYNA::__nbValueIterationStep
private

The number of value iteration steps we perform.

Definition at line 430 of file sdyna.h.

Referenced by feedback().

◆ __observationPhaseLenght

Idx gum::SDYNA::__observationPhaseLenght
private

The number of observations made before the planner is used again.

Definition at line 424 of file sdyna.h.

Referenced by feedback().

◆ __planer

IPlanningStrategy< double >* gum::SDYNA::__planer
private

The planner used to compute an optimal strategy.

Definition at line 417 of file sdyna.h.

Referenced by initialize(), makePlanning(), optimalPolicy2String(), optimalPolicySize(), toString(), valueFunctionSize(), and ~SDYNA().

◆ _fmdp

FMDP< double >* gum::SDYNA::_fmdp
protected

The learned Markov Decision Process (FMDP).

Definition at line 407 of file sdyna.h.

Referenced by addAction(), addVariable(), feedback(), initialize(), modelSize(), SDYNA(), toString(), and ~SDYNA().

◆ _lastState

Instantiation gum::SDYNA::_lastState
protected

The state in which the system is before we perform a new action.

Definition at line 410 of file sdyna.h.

Referenced by feedback(), setCurrentState(), and takeAction().

◆ _verbose

bool gum::SDYNA::_verbose
private

Definition at line 440 of file sdyna.h.

Referenced by makePlanning().


The documentation for this class was generated from the following files:

agrum/FMDP/SDyna/sdyna.h
sdyna.cpp