aGrUM  0.16.0
gum::SDYNA Class Reference

The general SDyna architecture abstract class. More...

#include <agrum/FMDP/SDyna/sdyna.h>


Public Member Functions

std::string toString ()
 Returns a string describing the learned FMDP and the associated optimal policy, in DOT format. More...
 
std::string optimalPolicy2String ()
 
Problem specification methods
void addAction (const Idx actionId, const std::string &actionName)
 Inserts a new action in the SDyna instance. More...
 
void addVariable (const DiscreteVariable *var)
 Inserts a new variable in the SDyna instance. More...
 
Initialization
void initialize ()
 Initializes the Sdyna instance. More...
 
void initialize (const Instantiation &initialState)
 Initializes the Sdyna instance at given state. More...
 
Incremental methods
void setCurrentState (const Instantiation &currentState)
 Sets last state visited to the given state. More...
 
Idx takeAction (const Instantiation &curState)
 
Idx takeAction ()
 
void feedback (const Instantiation &originalState, const Instantiation &reachedState, Idx performedAction, double obtainedReward)
 Performs a feedback on the last transition. More...
 
void feedback (const Instantiation &reachedState, double obtainedReward)
 Performs a feedback on the last transition. More...
 
void makePlanning (Idx nbStep)
 Starts a new planning. More...
 
Size methods

Methods reporting the size of the different data structures, for performance evaluation purposes only.

Size learnerSize ()
 Returns the size of the learner's data structure. More...
 
Size modelSize ()
 Returns the size of the learnt FMDP. More...
 
Size valueFunctionSize ()
 Returns the size of the value function computed so far. More...
 
Size optimalPolicySize ()
 Returns the size of the optimal policy computed so far. More...
 

Static Public Member Functions

static SDYNA * spitiInstance (double attributeSelectionThreshold=0.99, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDYNA instance running the SPITI algorithm. More...
 
static SDYNA * spimddiInstance (double attributeSelectionThreshold=0.99, double similarityThreshold=0.3, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDYNA instance running the SPIMDDI algorithm. More...
 
static SDYNA * RMaxMDDInstance (double attributeSelectionThreshold=0.99, double similarityThreshold=0.3, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDYNA instance running R-max with MDD-based representations. More...
 
static SDYNA * RMaxTreeInstance (double attributeSelectionThreshold=0.99, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDYNA instance running R-max with tree-based representations. More...
 
static SDYNA * RandomMDDInstance (double attributeSelectionThreshold=0.99, double similarityThreshold=0.3, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDYNA instance that selects actions at random, using MDD-based representations. More...
 
static SDYNA * RandomTreeInstance (double attributeSelectionThreshold=0.99, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDYNA instance that selects actions at random, using tree-based representations. More...
 

Protected Attributes

FMDP< double > * _fmdp
 The learnt Markovian Decision Process. More...
 
Instantiation _lastState
 The state in which the system is before we perform a new action. More...
 

Constructor & destructor.

 SDYNA (ILearningStrategy *learner, IPlanningStrategy< double > *planer, IDecisionStrategy *decider, Idx observationPhaseLenght, Idx nbValueIterationStep, bool actionReward, bool verbose=true)
 Constructor. More...
 
 ~SDYNA ()
 Destructor. More...
 

Detailed Description

The general SDyna architecture abstract class.

Instances of the SDyna architecture should inherit from this class.

Definition at line 66 of file sdyna.h.
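
A minimal usage sketch of the explore/learn/exploit loop. Assumptions: Environment is a hypothetical simulator (not part of aGrUM) that exposes its state variables, returns the current state as a gum::Instantiation, and applies an action for a reward; action ids and names are illustrative.

#include <agrum/FMDP/SDyna/sdyna.h>

// Sketch only: Environment is a hypothetical simulator, not part of aGrUM.
void runAgent(Environment& env) {
  gum::SDYNA* agent = gum::SDYNA::spitiInstance();   // one of the factory setups below

  // Problem specification (see addVariable()/addAction()): declare every state
  // variable and every action before initialize() is called.
  for (const gum::DiscreteVariable* var : env.stateVariables())   // hypothetical accessor
    agent->addVariable(var);
  agent->addAction(1, "wait");   // illustrative ids and names
  agent->addAction(2, "go");

  agent->initialize(env.currentState());   // currentState() returns a gum::Instantiation

  for (int step = 0; step < 1000; ++step) {
    gum::Idx action = agent->takeAction();          // the decider picks an action
    double   reward = env.apply(action);            // hypothetical environment call
    agent->feedback(env.currentState(), reward);    // learn; may trigger planning
  }

  delete agent;
}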

Constructor & Destructor Documentation

◆ SDYNA()

gum::SDYNA::SDYNA ( ILearningStrategy *  learner,
IPlanningStrategy< double > *  planer,
IDecisionStrategy *  decider,
Idx  observationPhaseLenght,
Idx  nbValueIterationStep,
bool  actionReward,
bool  verbose = true 
)
private

Constructor.

Returns
an instance of SDyna architecture

Definition at line 57 of file sdyna.cpp.

References __nbObservation, and _fmdp.

Referenced by RandomMDDInstance(), RandomTreeInstance(), RMaxMDDInstance(), RMaxTreeInstance(), spimddiInstance(), and spitiInstance().

SDYNA::SDYNA(ILearningStrategy*           learner,
             IPlanningStrategy< double >* planer,
             IDecisionStrategy*           decider,
             Idx                          observationPhaseLenght,
             Idx                          nbValueIterationStep,
             bool                         actionReward,
             bool                         verbose) :
    __learner(learner),
    __planer(planer), __decider(decider),
    __observationPhaseLenght(observationPhaseLenght),
    __nbValueIterationStep(nbValueIterationStep), __actionReward(actionReward),
    _verbose(verbose) {
  GUM_CONSTRUCTOR(SDYNA);

  _fmdp = new FMDP< double >();

  __nbObservation = 1;
}

◆ ~SDYNA()

gum::SDYNA::~SDYNA ( )

Destructor.

Definition at line 79 of file sdyna.cpp.

References __bin, __decider, __learner, __planer, and _fmdp.

Referenced by RandomTreeInstance().

SDYNA::~SDYNA() {
  delete __decider;

  delete __learner;

  delete __planer;

  for (auto obsIter = __bin.beginSafe(); obsIter != __bin.endSafe(); ++obsIter)
    delete *obsIter;

  delete _fmdp;

  GUM_DESTRUCTOR(SDYNA);
}

Member Function Documentation

◆ addAction()

void gum::SDYNA::addAction ( const Idx  actionId,
const std::string &  actionName 
)
inline

Inserts a new action in the SDyna instance.

Warning
Has no effect until the initialize() method is called.
Parameters
actionId: an id to identify the action
actionName: its human name

Definition at line 236 of file sdyna.h.

References _fmdp, and gum::FMDP< GUM_SCALAR >::addAction().

{
  _fmdp->addAction(actionId, actionName);
}

◆ addVariable()

void gum::SDYNA::addVariable ( const DiscreteVariable *  var)
inline

Inserts a new variable in the SDyna instance.

Warning
Has no effect until the initialize() method is called.
Parameters
var: the variable to be added. Note that the variable may or may not have all its modalities specified; those that are missing will be discovered by the SDyna architecture during the process.

Definition at line 250 of file sdyna.h.

References _fmdp, gum::FMDP< GUM_SCALAR >::addVariable(), and initialize().

{ _fmdp->addVariable(var); }
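
A sketch of the problem-specification phase, assuming agent is a gum::SDYNA* obtained from one of the factory methods; the variable, its labels, the action ids and the LabelizedVariable include path are all illustrative assumptions.

#include <agrum/variables/labelizedVariable.h>   // assumed include path

// These calls have no effect until initialize() is called.
gum::LabelizedVariable light("light", "traffic light", 0);
light.addLabel("red");
light.addLabel("green");

agent->addVariable(&light);
agent->addAction(1, "wait");
agent->addAction(2, "go");

agent->initialize();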

◆ feedback() [1/2]

void gum::SDYNA::feedback ( const Instantiation &  originalState,
const Instantiation &  reachedState,
Idx  performedAction,
double  obtainedReward 
)

Performs a feedback on the last transition.

Incremental methods.

In extenso, learn from the transition.

Parameters
originalState: the state we were in before the transition
reachedState: the state we reached after
performedAction: the action we performed
obtainedReward: the reward we obtained

Definition at line 130 of file sdyna.cpp.

References __lastAction, and _lastState.

Referenced by setCurrentState().

void SDYNA::feedback(const Instantiation& prevState,
                     const Instantiation& curState,
                     Idx                  lastAction,
                     double               reward) {
  __lastAction = lastAction;
  _lastState   = prevState;
  feedback(curState, reward);
}

◆ feedback() [2/2]

void gum::SDYNA::feedback ( const Instantiation &  reachedState,
double  obtainedReward 
)

Performs a feedback on the last transition.

In extenso, learn from the transition.

Parameters
reachedState: the state reached after the transition
obtainedReward: the reward obtained during the transition
Warning
Uses the original state and performed action stored in cache. To specify them explicitly, use the four-argument overload of feedback().

Definition at line 150 of file sdyna.cpp.

References __actionReward, __bin, __decider, __lastAction, __learner, __nbObservation, __nbValueIterationStep, __observationPhaseLenght, _fmdp, _lastState, gum::ILearningStrategy::addObservation(), gum::IDecisionStrategy::checkState(), gum::FMDP< GUM_SCALAR >::main2prime(), makePlanning(), setCurrentState(), gum::Observation::setModality(), gum::Observation::setReward(), gum::Observation::setRModality(), gum::Instantiation::val(), and gum::Instantiation::variablesSequence().

void SDYNA::feedback(const Instantiation& newState, double reward) {
  Observation* obs = new Observation();

  for (auto varIter = _lastState.variablesSequence().beginSafe();
       varIter != _lastState.variablesSequence().endSafe();
       ++varIter)
    obs->setModality(*varIter, _lastState.val(**varIter));

  for (auto varIter = newState.variablesSequence().beginSafe();
       varIter != newState.variablesSequence().endSafe();
       ++varIter) {
    obs->setModality(_fmdp->main2prime(*varIter), newState.val(**varIter));

    if (this->__actionReward)
      obs->setRModality(*varIter, _lastState.val(**varIter));
    else
      obs->setRModality(*varIter, newState.val(**varIter));
  }

  obs->setReward(reward);

  // Pass the observation to the learner, then keep it for later deletion.
  __learner->addObservation(__lastAction, obs);
  __bin.insert(obs);

  setCurrentState(newState);
  __decider->checkState(newState, __lastAction);

  // Replan once every __observationPhaseLenght observations.
  if (__nbObservation % __observationPhaseLenght == 0)
    makePlanning(__nbValueIterationStep);

  __nbObservation++;
}
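
Both overloads report the same kind of transition. A short sketch, with agent as in the examples above; the states are gum::Instantiation objects built by the caller and the action id is illustrative.

// Compact form: uses the state and action cached by takeAction()/setCurrentState().
agent->feedback(reachedState, reward);

// Explicit form: spell out the whole transition.
agent->feedback(previousState, reachedState, /* performedAction = */ 2, reward);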

◆ initialize() [1/2]

void gum::SDYNA::initialize ( )

Initializes the Sdyna instance.

Definition at line 98 of file sdyna.cpp.

References __decider, __learner, __planer, _fmdp, gum::IDecisionStrategy::initialize(), gum::IPlanningStrategy< GUM_SCALAR >::initialize(), and gum::ILearningStrategy::initialize().

Referenced by addVariable(), and initialize().

void SDYNA::initialize() {
  __learner->initialize(_fmdp);
  __planer->initialize(_fmdp);
  __decider->initialize(_fmdp);
}

◆ initialize() [2/2]

void gum::SDYNA::initialize ( const Instantiation &  initialState)

Initializes the Sdyna instance at given state.

Parameters
initialState: the state of the studied system from which we will begin the explore, learn and exploit process

Definition at line 111 of file sdyna.cpp.

References initialize(), and setCurrentState().

void SDYNA::initialize(const Instantiation& initialState) {
  initialize();
  setCurrentState(initialState);
}

◆ learnerSize()

Size gum::SDYNA::learnerSize ( )
inline

learnerSize

Returns
the size of the data structure used by the learner.

Definition at line 379 of file sdyna.h.

References __learner, and gum::ILearningStrategy::size().

{ return __learner->size(); }
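
For instance, the four size methods can be logged together after a run for performance evaluation (a sketch, with agent as in the examples above):

#include <iostream>

std::cout << "learner size:        " << agent->learnerSize()       << std::endl;
std::cout << "model (FMDP) size:   " << agent->modelSize()         << std::endl;
std::cout << "value function size: " << agent->valueFunctionSize() << std::endl;
std::cout << "optimal policy size: " << agent->optimalPolicySize() << std::endl;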

◆ makePlanning()

void gum::SDYNA::makePlanning ( Idx  nbStep)

Starts a new planning.

Parameters
nbStep: the maximal number of value iteration performed in this planning

Definition at line 190 of file sdyna.cpp.

References __decider, __learner, __planer, _verbose, gum::IPlanningStrategy< GUM_SCALAR >::makePlanning(), gum::IPlanningStrategy< GUM_SCALAR >::optimalPolicy(), gum::IDecisionStrategy::setOptimalStrategy(), and gum::ILearningStrategy::updateFMDP().

Referenced by feedback(), and setCurrentState().

void SDYNA::makePlanning(Idx nbValueIterationStep) {
  if (_verbose) std::cout << "Updating decision trees ..." << std::endl;
  __learner->updateFMDP();
  // std::cout << << "Done" << std::endl;

  if (_verbose) std::cout << "Planning ..." << std::endl;
  __planer->makePlanning(nbValueIterationStep);
  // std::cout << << "Done" << std::endl;

  __decider->setOptimalStrategy(__planer->optimalPolicy());
}
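
Planning is normally triggered from feedback() after each observation phase, but it can also be forced explicitly (a sketch, with agent as in the examples above):

agent->makePlanning(10);                                  // at most 10 value-iteration steps
std::string policyDot = agent->optimalPolicy2String();    // resulting policy, DOT format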

◆ modelSize()

Size gum::SDYNA::modelSize ( )
inline

modelSize

Returns
the size of the learnt FMDP.

Definition at line 387 of file sdyna.h.

References _fmdp, and gum::FMDP< GUM_SCALAR >::size().

{ return _fmdp->size(); }

◆ optimalPolicy2String()

std::string gum::SDYNA::optimalPolicy2String ( )
inline

Definition at line 363 of file sdyna.h.

References __planer, and gum::IPlanningStrategy< GUM_SCALAR >::optimalPolicy2String().

{ return __planer->optimalPolicy2String(); }

◆ optimalPolicySize()

Size gum::SDYNA::optimalPolicySize ( )
inline

optimalPolicySize

Returns
the size of the optimal policy computed so far.

Definition at line 403 of file sdyna.h.

References __planer, and gum::IPlanningStrategy< GUM_SCALAR >::optimalPolicySize().

{ return __planer->optimalPolicySize(); }

◆ RandomMDDInstance()

static SDYNA* gum::SDYNA::RandomMDDInstance ( double  attributeSelectionThreshold = 0.99,
double  similarityThreshold = 0.3,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDYNA instance that selects actions at random, using MDD-based representations.

Definition at line 156 of file sdyna.h.

References SDYNA(), and gum::StructuredPlaner< GUM_SCALAR >::spumddInstance().

{
  bool actionReward = true;
  ILearningStrategy* ls = new FMDPLearner< GTEST, GTEST, IMDDILEARNER >(
      attributeSelectionThreshold, actionReward, similarityThreshold);
  IPlanningStrategy< double >* ps =
      StructuredPlaner< double >::spumddInstance(discountFactor, epsilon);
  IDecisionStrategy* ds = new RandomDecider();
  return new SDYNA(
      ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}

◆ RandomTreeInstance()

static SDYNA* gum::SDYNA::RandomTreeInstance ( double  attributeSelectionThreshold = 0.99,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDYNA instance that selects actions at random, using tree-based representations.

Definition at line 175 of file sdyna.h.

References SDYNA(), gum::StructuredPlaner< GUM_SCALAR >::sviInstance(), and ~SDYNA().

{
  bool actionReward = true;
  ILearningStrategy* ls = new FMDPLearner< CHI2TEST, CHI2TEST, ITILEARNER >(
      attributeSelectionThreshold, actionReward);
  IPlanningStrategy< double >* ps =
      StructuredPlaner< double >::sviInstance(discountFactor, epsilon);
  IDecisionStrategy* ds = new RandomDecider();
  return new SDYNA(
      ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}

◆ RMaxMDDInstance()

static SDYNA* gum::SDYNA::RMaxMDDInstance ( double  attributeSelectionThreshold = 0.99,
double  similarityThreshold = 0.3,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDYNA instance running R-max with MDD-based representations.

Definition at line 117 of file sdyna.h.

References gum::AdaptiveRMaxPlaner::ReducedAndOrderedInstance(), and SDYNA().

{
  bool actionReward = true;
  ILearningStrategy* ls = new FMDPLearner< GTEST, GTEST, IMDDILEARNER >(
      attributeSelectionThreshold, actionReward, similarityThreshold);
  AdaptiveRMaxPlaner* rm = AdaptiveRMaxPlaner::ReducedAndOrderedInstance(
      ls, discountFactor, epsilon);
  IPlanningStrategy< double >* ps = rm;
  IDecisionStrategy*           ds = rm;
  return new SDYNA(
      ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}

◆ RMaxTreeInstance()

static SDYNA* gum::SDYNA::RMaxTreeInstance ( double  attributeSelectionThreshold = 0.99,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDYNA instance running R-max with tree-based representations.

Definition at line 137 of file sdyna.h.

References SDYNA(), and gum::AdaptiveRMaxPlaner::TreeInstance().

{
  bool actionReward = true;
  ILearningStrategy* ls = new FMDPLearner< GTEST, GTEST, ITILEARNER >(
      attributeSelectionThreshold, actionReward);
  AdaptiveRMaxPlaner* rm =
      AdaptiveRMaxPlaner::TreeInstance(ls, discountFactor, epsilon);
  IPlanningStrategy< double >* ps = rm;
  IDecisionStrategy*           ds = rm;
  return new SDYNA(
      ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}

◆ setCurrentState()

void gum::SDYNA::setCurrentState ( const Instantiation &  currentState)
inline

Sets last state visited to the given state.

During the learning process, we will consider that we were in this state before the transition.

Parameters
currentState: the state

Definition at line 292 of file sdyna.h.

References _lastState, feedback(), makePlanning(), takeAction(), and toString().

Referenced by feedback(), and initialize().

{
  _lastState = currentState;
}

◆ spimddiInstance()

static SDYNA* gum::SDYNA::spimddiInstance ( double  attributeSelectionThreshold = 0.99,
double  similarityThreshold = 0.3,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDYNA instance running the SPIMDDI algorithm.

Definition at line 93 of file sdyna.h.

References SDYNA(), and gum::StructuredPlaner< GUM_SCALAR >::spumddInstance().

{
  bool actionReward = false;
  ILearningStrategy* ls = new FMDPLearner< GTEST, GTEST, IMDDILEARNER >(
      attributeSelectionThreshold, actionReward, similarityThreshold);
  IPlanningStrategy< double >* ps = StructuredPlaner< double >::spumddInstance(
      discountFactor, epsilon, false);
  IDecisionStrategy* ds = new E_GreedyDecider();
  return new SDYNA(ls,
                   ps,
                   ds,
                   observationPhaseLenght,
                   nbValueIterationStep,
                   actionReward,
                   false);
}

◆ spitiInstance()

static SDYNA* gum::SDYNA::spitiInstance ( double  attributeSelectionThreshold = 0.99,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDYNA instance running the SPITI algorithm.

Definition at line 75 of file sdyna.h.

References SDYNA(), and gum::StructuredPlaner< GUM_SCALAR >::sviInstance().

{
  bool actionReward = false;
  ILearningStrategy* ls = new FMDPLearner< CHI2TEST, CHI2TEST, ITILEARNER >(
      attributeSelectionThreshold, actionReward);
  IPlanningStrategy< double >* ps =
      StructuredPlaner< double >::sviInstance(discountFactor, epsilon);
  IDecisionStrategy* ds = new E_GreedyDecider();
  return new SDYNA(
      ls, ps, ds, observationPhaseLenght, nbValueIterationStep, actionReward);
}

◆ takeAction() [1/2]

Idx gum::SDYNA::takeAction ( const Instantiation &  curState)

Returns
the id of the action the SDyna instance wishes to perform
Parameters
curState: the state in which we currently are

Definition at line 208 of file sdyna.cpp.

References _lastState, and takeAction().

Idx SDYNA::takeAction(const Instantiation& curState) {
  _lastState = curState;
  return takeAction();
}

◆ takeAction() [2/2]

Idx gum::SDYNA::takeAction ( )
Returns
the id of the action the SDyna instance wishes to perform

Definition at line 218 of file sdyna.cpp.

References __decider, __lastAction, _lastState, gum::ActionSet::size(), and gum::IDecisionStrategy::stateOptimalPolicy().

Referenced by setCurrentState(), and takeAction().

Idx SDYNA::takeAction() {
  ActionSet actionSet = __decider->stateOptimalPolicy(_lastState);
  if (actionSet.size() == 1) {
    __lastAction = actionSet[0];
  } else {
    Idx randy = (Idx)((double)std::rand() / (double)RAND_MAX * actionSet.size());
    __lastAction = actionSet[randy == actionSet.size() ? 0 : randy];
  }
  return __lastAction;
}

◆ toString()

std::string gum::SDYNA::toString ( )

Describes the learned FMDP and the associated optimal policy.

Returns
a string describing the learned FMDP and the associated optimal policy, both in DOT format.

Definition at line 232 of file sdyna.cpp.

References __planer, _fmdp, gum::IPlanningStrategy< GUM_SCALAR >::optimalPolicy2String(), and gum::FMDP< GUM_SCALAR >::toString().

Referenced by setCurrentState().

std::string SDYNA::toString() {
  std::stringstream description;

  description << _fmdp->toString() << std::endl;
  description << __planer->optimalPolicy2String() << std::endl;

  return description.str();
}
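
Since the description is in DOT format, it can be written to a file and rendered with Graphviz (a sketch, with agent as in the examples above; the file name is illustrative):

#include <fstream>

std::ofstream dotFile("sdyna_model.dot");
dotFile << agent->toString();   // learned FMDP + optimal policy, both in DOT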

◆ valueFunctionSize()

Size gum::SDYNA::valueFunctionSize ( )
inline

valueFunctionSize

Returns
the size of the value function computed so far.

Definition at line 395 of file sdyna.h.

References __planer, and gum::IPlanningStrategy< GUM_SCALAR >::vFunctionSize().

{ return __planer->vFunctionSize(); }

Member Data Documentation

◆ __actionReward

bool gum::SDYNA::__actionReward
private

Definition at line 441 of file sdyna.h.

Referenced by feedback().

◆ __bin

Set< Observation* > gum::SDYNA::__bin
private

Since SDYNA made these observations, it has to delete them upon destruction.

Definition at line 439 of file sdyna.h.

Referenced by feedback(), and ~SDYNA().

◆ __decider

IDecisionStrategy* gum::SDYNA::__decider
private

The decider.

Definition at line 423 of file sdyna.h.

Referenced by feedback(), initialize(), makePlanning(), takeAction(), and ~SDYNA().

◆ __lastAction

Idx gum::SDYNA::__lastAction
private

The last performed action.

Definition at line 436 of file sdyna.h.

Referenced by feedback(), and takeAction().

◆ __learner

ILearningStrategy* gum::SDYNA::__learner
private

The learner used to learn the FMDP.

Definition at line 417 of file sdyna.h.

Referenced by feedback(), initialize(), learnerSize(), makePlanning(), and ~SDYNA().

◆ __nbObservation

Idx gum::SDYNA::__nbObservation
private

The total number of observations made so far.

Definition at line 430 of file sdyna.h.

Referenced by feedback(), and SDYNA().

◆ __nbValueIterationStep

Idx gum::SDYNA::__nbValueIterationStep
private

The number of Value Iteration steps we perform.

Definition at line 433 of file sdyna.h.

Referenced by feedback().

◆ __observationPhaseLenght

Idx gum::SDYNA::__observationPhaseLenght
private

The number of observations made before the planner is used again.

Definition at line 427 of file sdyna.h.

Referenced by feedback().

◆ __planer

IPlanningStrategy< double >* gum::SDYNA::__planer
private

The planner used to plan an optimal strategy.

Definition at line 420 of file sdyna.h.

Referenced by initialize(), makePlanning(), optimalPolicy2String(), optimalPolicySize(), toString(), valueFunctionSize(), and ~SDYNA().

◆ _fmdp

FMDP< double >* gum::SDYNA::_fmdp
protected

The learnt Markovian Decision Process.

Definition at line 410 of file sdyna.h.

Referenced by addAction(), addVariable(), feedback(), initialize(), modelSize(), SDYNA(), toString(), and ~SDYNA().

◆ _lastState

Instantiation gum::SDYNA::_lastState
protected

The state in which the system is before we perform a new action.

Definition at line 413 of file sdyna.h.

Referenced by feedback(), setCurrentState(), and takeAction().

◆ _verbose

bool gum::SDYNA::_verbose
private

Definition at line 443 of file sdyna.h.

Referenced by makePlanning().


The documentation for this class was generated from the following files:

agrum/FMDP/SDyna/sdyna.h
sdyna.cpp