aGrUM  0.20.2
a C++ library for (probabilistic) graphical models
gum::SDYNA Class Reference

The general SDyna architecture abstract class. More...

#include <agrum/FMDP/SDyna/sdyna.h>


Public Member Functions

std::string toString ()
 Returns a string describing the learned FMDP and the associated optimal policy, both in DOT format. More...
 
std::string optimalPolicy2String ()
 Returns a string describing the optimal policy in DOT format. More...
 
Problem specification methods
void addAction (const Idx actionId, const std::string &actionName)
 Inserts a new action in the SDyna instance. More...
 
void addVariable (const DiscreteVariable *var)
 Inserts a new variable in the SDyna instance. More...
 
Initialization
void initialize ()
 Initializes the Sdyna instance. More...
 
void initialize (const Instantiation &initialState)
 Initializes the Sdyna instance at given state. More...
 
Incremental methods
void setCurrentState (const Instantiation &currentState)
 Sets last state visited to the given state. More...
 
Idx takeAction (const Instantiation &curState)
 Returns the id of the action the SDyna instance wants to perform in the given state. More...
 
Idx takeAction ()
 Returns the id of the action the SDyna instance wants to perform. More...
 
void feedback (const Instantiation &originalState, const Instantiation &reachedState, Idx performedAction, double obtainedReward)
 Performs a feedback on the last transition. More...
 
void feedback (const Instantiation &reachedState, double obtainedReward)
 Performs a feedback on the last transition. More...
 
void makePlanning (Idx nbStep)
 Starts a new planning. More...
 
Size methods

Methods to get the size of the different data structures, for performance evaluation purposes only.

Size learnerSize ()
 Returns the size of the data structure used by the learner. More...
 
Size modelSize ()
 Returns the size of the learnt FMDP. More...
 
Size valueFunctionSize ()
 Returns the size of the value function computed so far. More...
 
Size optimalPolicySize ()
 Returns the size of the optimal policy computed so far. More...
 

Static Public Member Functions

static SDYNA * spitiInstance (double attributeSelectionThreshold=0.99, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDyna instance combining an ITI learner, structured value iteration and an epsilon-greedy decision strategy. More...
 
static SDYNA * spimddiInstance (double attributeSelectionThreshold=0.99, double similarityThreshold=0.3, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDyna instance combining an IMDDI learner, SPUMDD structured planning and an epsilon-greedy decision strategy. More...
 
static SDYNA * RMaxMDDInstance (double attributeSelectionThreshold=0.99, double similarityThreshold=0.3, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDyna instance combining an IMDDI learner with an adaptive RMax planner over MDDs. More...
 
static SDYNA * RMaxTreeInstance (double attributeSelectionThreshold=0.99, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDyna instance combining an ITI learner with a tree-based adaptive RMax planner. More...
 
static SDYNA * RandomMDDInstance (double attributeSelectionThreshold=0.99, double similarityThreshold=0.3, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDyna instance combining an IMDDI learner, SPUMDD structured planning and a random decision strategy. More...
 
static SDYNA * RandomTreeInstance (double attributeSelectionThreshold=0.99, double discountFactor=0.9, double epsilon=1, Idx observationPhaseLenght=100, Idx nbValueIterationStep=10)
 Builds an SDyna instance combining an ITI learner, structured value iteration and a random decision strategy. More...
 

Protected Attributes

FMDP< double > * fmdp_
 The learnt Markovian Decision Process. More...
 
Instantiation lastState_
 The state in which the system is before we perform a new action. More...
 

Constructor & destructor.

 SDYNA (ILearningStrategy *learner, IPlanningStrategy< double > *planer, IDecisionStrategy *decider, Idx observationPhaseLenght, Idx nbValueIterationStep, bool actionReward, bool verbose=true)
 Constructor. More...
 
 ~SDYNA ()
 Destructor. More...
 

Detailed Description

The general SDyna architecture abstract class.

Instantiations of the SDyna architecture should inherit from this class.

Definition at line 66 of file sdyna.h.
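
The snippet below is a minimal usage sketch, not part of the generated documentation: the variable, the action ids and the toy transition/reward rule are made up for illustration, and the header path for LabelizedVariable may differ between aGrUM versions. It only illustrates the intended call order: build an instance through a factory, specify the problem, initialize, then alternate takeAction() and feedback().

  // Hypothetical example: a single boolean state variable and two actions.
  #include <iostream>
  #include <agrum/FMDP/SDyna/sdyna.h>
  #include <agrum/tools/variables/labelizedVariable.h>   // path may vary with the aGrUM version

  int main() {
    // Build the agent with the SPITI configuration (ITI learner + structured value iteration).
    gum::SDYNA* agent = gum::SDYNA::spitiInstance();

    // Problem specification (no effect until initialize() is called).
    gum::LabelizedVariable light("light", "is the light on ?", 2);
    agent->addVariable(&light);
    agent->addAction(1, "toggle");
    agent->addAction(2, "wait");

    // Initial state of the studied system.
    gum::Instantiation state;
    state.add(light);
    state.chgVal(light, 0);
    agent->initialize(state);

    // Explore / learn / exploit loop against a toy deterministic environment.
    for (gum::Idx step = 0; step < 200; ++step) {
      gum::Idx action = agent->takeAction();
      if (action == 1) state.chgVal(light, 1 - state.val(light));   // "toggle" flips the light
      double reward = (state.val(light) == 1) ? 1.0 : 0.0;          // reward when the light is on
      agent->feedback(state, reward);
    }

    std::cout << agent->optimalPolicy2String() << std::endl;
    delete agent;
    return 0;
  }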

Constructor & Destructor Documentation

◆ SDYNA()

gum::SDYNA::SDYNA ( ILearningStrategy *  learner,
IPlanningStrategy< double > *  planer,
IDecisionStrategy *  decider,
Idx  observationPhaseLenght,
Idx  nbValueIterationStep,
bool  actionReward,
bool  verbose = true 
)
private

Constructor.

Returns
an instance of SDyna architecture

Definition at line 57 of file sdyna.cpp.

References gum::Set< Key, Alloc >::emplace().

63  :
64  learner__(learner),
65  planer__(planer), decider__(decider),
66  observationPhaseLenght__(observationPhaseLenght),
67  nbValueIterationStep__(nbValueIterationStep), actionReward__(actionReward),
68  verbose_(verbose) {
69  GUM_CONSTRUCTOR(SDYNA);
70 
71  fmdp_ = new FMDP< double >();
72 
73  nbObservation__ = 1;
74  }

◆ ~SDYNA()

gum::SDYNA::~SDYNA ( )

Destructor.

Definition at line 79 of file sdyna.cpp.

References gum::Set< Key, Alloc >::emplace().

79  {
80  delete decider__;
81 
82  delete learner__;
83 
84  delete planer__;
85 
86  for (auto obsIter = bin__.beginSafe(); obsIter != bin__.endSafe(); ++obsIter)
87  delete *obsIter;
88 
89  delete fmdp_;
90 
91  GUM_DESTRUCTOR(SDYNA);
92  }

Member Function Documentation

◆ addAction()

void gum::SDYNA::addAction ( const Idx  actionId,
const std::string &  actionName 
)
inline

Inserts a new action in the SDyna instance.

Warning
Without effect until method initialize is called
Parameters
actionId: an id to identify the action
actionName: its human name

Definition at line 269 of file sdyna.h.

269  {
270  fmdp_->addAction(actionId, actionName);
271  }

◆ addVariable()

void gum::SDYNA::addVariable ( const DiscreteVariable *  var)
inline

Inserts a new variable in the SDyna instance.

Warning
Without effect until method initialize is called
Parameters
var: the variable to be added. Note that the variable may or may not have all its modalities given; if not, they will be discovered by the SDyna architecture during the process.

Definition at line 283 of file sdyna.h.

283 { fmdp_->addVariable(var); }
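
A short sketch of the expected call order (the variable name, labels and action id are made up for the example): as the warning above states, addVariable() and addAction() only take effect once initialize() is called.

  gum::SDYNA* agent = gum::SDYNA::spimddiInstance();

  // Declare the problem first: these calls only take effect once initialize() is called.
  gum::LabelizedVariable status("status", "machine status", 0);   // modalities may be incomplete;
  status.addLabel("idle");                                        // missing ones will be discovered
  agent->addVariable(&status);                                    // by the SDyna architecture
  agent->addAction(1, "repair");

  agent->initialize();   // the variables and actions declared above are now actually inserted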

◆ feedback() [1/2]

void gum::SDYNA::feedback ( const Instantiation &  originalState,
const Instantiation &  reachedState,
Idx  performedAction,
double  obtainedReward 
)

Performs a feedback on the last transition.

Incremental methods.

In other words, learn from the transition.

Parameters
originalState: the state we were in before the transition
reachedState: the state we reached after
performedAction: the action we performed
obtainedReward: the reward we obtained

Definition at line 130 of file sdyna.cpp.

References gum::Set< Key, Alloc >::emplace().

133  {
134  lastAction__ = lastAction;
135  lastState_ = prevState;
136  feedback(curState, reward);
137  }

◆ feedback() [2/2]

void gum::SDYNA::feedback ( const Instantiation &  reachedState,
double  obtainedReward 
)

Performs a feedback on the last transition.

In other words, learn from the transition.

Parameters
reachedState: the state reached after the transition
obtainedReward: the reward obtained during the transition
Warning
Uses the original state and the performed action stored in cache. If you want to specify the original state and the performed action explicitly, see the other overload above.

Definition at line 150 of file sdyna.cpp.

References gum::Set< Key, Alloc >::emplace().

150  {
151  Observation* obs = new Observation();
152 
153  for (auto varIter = lastState_.variablesSequence().beginSafe();
154  varIter != lastState_.variablesSequence().endSafe();
155  ++varIter)
156  obs->setModality(*varIter, lastState_.val(**varIter));
157 
158  for (auto varIter = newState.variablesSequence().beginSafe();
159  varIter != newState.variablesSequence().endSafe();
160  ++varIter) {
161  obs->setModality(fmdp_->main2prime(*varIter), newState.val(**varIter));
162 
163  if (this->actionReward__)
164  obs->setRModality(*varIter, lastState_.val(**varIter));
165  else
166  obs->setRModality(*varIter, newState.val(**varIter));
167  }
168 
169  obs->setReward(reward);
170 
171  learner__->addObservation(lastAction__, obs);
172  bin__.insert(obs);
173 
174  setCurrentState(newState);
175  decider__->checkState(newState, lastAction__);
176 
177  if (nbObservation__ % observationPhaseLenght__ == 0)
178  makePlanning(nbValueIterationStep__);
179 
180  nbObservation__++;
181  }
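
A small sketch of how the two overloads relate (agent, stateA, stateB and the reward value 1.0 are placeholders): the two-argument version relies on the state and action cached by the previous takeAction() / feedback() calls, while the four-argument version lets the caller provide them explicitly.

  // Usual online loop: the agent remembers where it was and which action it chose.
  gum::Idx a = agent->takeAction();         // caches the chosen action and the current state
  agent->feedback(stateB, 1.0);             // learn from (cached state, a) -> stateB

  // Replaying a transition recorded elsewhere: provide everything explicitly.
  agent->feedback(stateA, stateB, a, 1.0);  // (originalState, reachedState, performedAction, obtainedReward)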

◆ initialize() [1/2]

void gum::SDYNA::initialize ( )

Initializes the Sdyna instance.

Definition at line 98 of file sdyna.cpp.

References gum::Set< Key, Alloc >::emplace().

98  {
99  learner__->initialize(fmdp_);
100  planer__->initialize(fmdp_);
101  decider__->initialize(fmdp_);
102  }

◆ initialize() [2/2]

void gum::SDYNA::initialize ( const Instantiation &  initialState)

Initializes the Sdyna instance at given state.

Parameters
initialState: the state of the studied system from which we will begin the explore, learn and exploit process

Definition at line 111 of file sdyna.cpp.

References gum::Set< Key, Alloc >::emplace().

111  {
112  initialize();
113  setCurrentState(initialState);
114  }
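
As the code above shows, initialize(initialState) is equivalent to calling initialize() followed by setCurrentState(initialState). A short sketch (var is a placeholder for a DiscreteVariable already given to addVariable()):

  gum::Instantiation initialState;
  initialState.add(var);            // add the state variables to the instantiation
  initialState.chgVal(var, 0);      // and give each of them its initial value

  agent->initialize(initialState);  // same as: agent->initialize(); agent->setCurrentState(initialState);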

◆ learnerSize()

Size gum::SDYNA::learnerSize ( )
inline

learnerSize

Returns
the size of the data structure used by the learner.

Definition at line 412 of file sdyna.h.

412 { return learner__->size(); }

◆ makePlanning()

void gum::SDYNA::makePlanning ( Idx  nbStep)

Starts a new planning.

Parameters
nbStep: the maximal number of value iteration steps performed during this planning

Definition at line 190 of file sdyna.cpp.

References gum::Set< Key, Alloc >::emplace().

190  {
191  if (verbose_) std::cout << "Updating decision trees ..." << std::endl;
192  learner__->updateFMDP();
193  // std::cout << << "Done" << std::endl;
194 
195  if (verbose_) std::cout << "Planning ..." << std::endl;
196  planer__->makePlanning(nbValueIterationStep);
197  // std::cout << << "Done" << std::endl;
198 
199  decider__->setOptimalStrategy(planer__->optimalPolicy());
200  }
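
A short usage sketch (agent is a placeholder for an initialized SDYNA instance that has already received some feedback): planning is normally interleaved with learning by the SDyna loop, but makePlanning() can also be called explicitly to force a bounded planning phase.

  agent->makePlanning(20);   // update the learnt model, then run at most 20 value iteration steps

  // The resulting policy can then be inspected.
  std::cout << "policy size: " << agent->optimalPolicySize() << std::endl;
  std::cout << agent->optimalPolicy2String() << std::endl;   // DOT description of the policy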

◆ modelSize()

Size gum::SDYNA::modelSize ( )
inline

modelSize

Returns
the size of the learnt FMDP.

Definition at line 420 of file sdyna.h.

420 { return fmdp_->size(); }

◆ optimalPolicy2String()

std::string gum::SDYNA::optimalPolicy2String ( )
inline

Definition at line 396 of file sdyna.h.

396 { return planer__->optimalPolicy2String(); }

◆ optimalPolicySize()

Size gum::SDYNA::optimalPolicySize ( )
inline

optimalPolicySize

Returns
the size of the optimal policy computed so far.

Definition at line 436 of file sdyna.h.

436 { return planer__->optimalPolicySize(); }

◆ RandomMDDInstance()

static SDYNA* gum::SDYNA::RandomMDDInstance ( double  attributeSelectionThreshold = 0.99,
double  similarityThreshold = 0.3,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDyna instance combining an IMDDI learner, SPUMDD structured planning and a random decision strategy.

Definition at line 178 of file sdyna.h.

183  {
184  bool actionReward = true;
185  ILearningStrategy* ls = new FMDPLearner< GTEST, GTEST, IMDDILEARNER >(
186  attributeSelectionThreshold,
187  actionReward,
188  similarityThreshold);
189  IPlanningStrategy< double >* ps
190  = StructuredPlaner< double >::spumddInstance(discountFactor, epsilon);
191  IDecisionStrategy* ds = new RandomDecider();
192  return new SDYNA(ls,
193  ps,
194  ds,
195  observationPhaseLenght,
196  nbValueIterationStep,
197  actionReward);
198  }

◆ RandomTreeInstance()

static SDYNA* gum::SDYNA::RandomTreeInstance ( double  attributeSelectionThreshold = 0.99,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDyna instance combining an ITI learner, structured value iteration and a random decision strategy.

Definition at line 203 of file sdyna.h.

207  {
208  bool actionReward = true;
209  ILearningStrategy* ls = new FMDPLearner< CHI2TEST, CHI2TEST, ITILEARNER >(
210  attributeSelectionThreshold,
211  actionReward);
212  IPlanningStrategy< double >* ps
213  = StructuredPlaner< double >::sviInstance(discountFactor, epsilon);
214  IDecisionStrategy* ds = new RandomDecider();
215  return new SDYNA(ls,
216  ps,
217  ds,
218  observationPhaseLenght,
219  nbValueIterationStep,
220  actionReward);
221  }

◆ RMaxMDDInstance()

static SDYNA* gum::SDYNA::RMaxMDDInstance ( double  attributeSelectionThreshold = 0.99,
double  similarityThreshold = 0.3,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDyna instance combining an IMDDI learner with an adaptive RMax planner over MDDs.

Definition at line 126 of file sdyna.h.

131  {
132  bool actionReward = true;
133  ILearningStrategy* ls = new FMDPLearner< GTEST, GTEST, IMDDILEARNER >(
134  attributeSelectionThreshold,
135  actionReward,
136  similarityThreshold);
137  AdaptiveRMaxPlaner* rm
138  = AdaptiveRMaxPlaner::ReducedAndOrderedInstance(ls,
139  discountFactor,
140  epsilon);
141  IPlanningStrategy< double >* ps = rm;
142  IDecisionStrategy* ds = rm;
143  return new SDYNA(ls,
144  ps,
145  ds,
146  observationPhaseLenght,
147  nbValueIterationStep,
148  actionReward);
149  }

◆ RMaxTreeInstance()

static SDYNA* gum::SDYNA::RMaxTreeInstance ( double  attributeSelectionThreshold = 0.99,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDyna instance combining an ITI learner with a tree-based adaptive RMax planner.

Definition at line 154 of file sdyna.h.

158  {
159  bool actionReward = true;
160  ILearningStrategy* ls
161  = new FMDPLearner< GTEST, GTEST, ITILEARNER >(attributeSelectionThreshold,
162  actionReward);
163  AdaptiveRMaxPlaner* rm
164  = AdaptiveRMaxPlaner::TreeInstance(ls, discountFactor, epsilon);
165  IPlanningStrategy< double >* ps = rm;
166  IDecisionStrategy* ds = rm;
167  return new SDYNA(ls,
168  ps,
169  ds,
170  observationPhaseLenght,
171  nbValueIterationStep,
172  actionReward);
173  }

◆ setCurrentState()

void gum::SDYNA::setCurrentState ( const Instantiation currentState)
inline

Sets last state visited to the given state.

During the learning process, we will consider that we were in this state before the transition.

Parameters
currentState: the state

Definition at line 325 of file sdyna.h.

325  {
326  lastState_ = currentState;
327  }

◆ spimddiInstance()

static SDYNA* gum::SDYNA::spimddiInstance ( double  attributeSelectionThreshold = 0.99,
double  similarityThreshold = 0.3,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDyna instance combining an IMDDI learner, SPUMDD structured planning and an epsilon-greedy decision strategy.

Definition at line 98 of file sdyna.h.

103  {
104  bool actionReward = false;
105  ILearningStrategy* ls = new FMDPLearner< GTEST, GTEST, IMDDILEARNER >(
106  attributeSelectionThreshold,
107  actionReward,
108  similarityThreshold);
109  IPlanningStrategy< double >* ps
110  = StructuredPlaner< double >::spumddInstance(discountFactor,
111  epsilon,
112  false);
113  IDecisionStrategy* ds = new E_GreedyDecider();
114  return new SDYNA(ls,
115  ps,
116  ds,
117  observationPhaseLenght,
118  nbValueIterationStep,
119  actionReward,
120  false);
121  }

◆ spitiInstance()

static SDYNA* gum::SDYNA::spitiInstance ( double  attributeSelectionThreshold = 0.99,
double  discountFactor = 0.9,
double  epsilon = 1,
Idx  observationPhaseLenght = 100,
Idx  nbValueIterationStep = 10 
)
inlinestatic

Builds an SDyna instance combining an ITI learner, structured value iteration and an epsilon-greedy decision strategy.

Definition at line 75 of file sdyna.h.

79  {
80  bool actionReward = false;
81  ILearningStrategy* ls = new FMDPLearner< CHI2TEST, CHI2TEST, ITILEARNER >(
82  attributeSelectionThreshold,
83  actionReward);
84  IPlanningStrategy< double >* ps
85  = StructuredPlaner< double >::sviInstance(discountFactor, epsilon);
86  IDecisionStrategy* ds = new E_GreedyDecider();
87  return new SDYNA(ls,
88  ps,
89  ds,
90  observationPhaseLenght,
91  nbValueIterationStep,
92  actionReward);
93  }
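
All the factory methods above return a fully wired SDYNA instance (learner, planner and decision strategy already chosen); only the parameters passed on to these components differ. A small sketch with arbitrary, non-default values (the numbers carry no particular meaning):

  // SPITI agent: attributeSelectionThreshold = 0.999, discountFactor = 0.9, epsilon = 0.5,
  // 200 observations per phase, 20 value iteration steps per planning.
  gum::SDYNA* spiti = gum::SDYNA::spitiInstance(0.999, 0.9, 0.5, 200, 20);

  // RMax agent over MDDs: default parameters except similarityThreshold = 0.5 instead of 0.3.
  gum::SDYNA* rmax = gum::SDYNA::RMaxMDDInstance(0.99, 0.5);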

◆ takeAction() [1/2]

Idx gum::SDYNA::takeAction ( const Instantiation &  curState)

Returns
the id of the action the SDyna instance wants to perform
Parameters
curState: the state in which we currently are

Definition at line 208 of file sdyna.cpp.

References gum::Set< Key, Alloc >::emplace().

208  {
209  lastState_ = curState;
210  return takeAction();
211  }

◆ takeAction() [2/2]

Idx gum::SDYNA::takeAction ( )

Returns
the id of the action the SDyna instance wants to perform

Definition at line 218 of file sdyna.cpp.

References gum::Set< Key, Alloc >::emplace().

218  {
219  ActionSet actionSet = decider__->stateOptimalPolicy(lastState_);
220  if (actionSet.size() == 1) {
221  lastAction__ = actionSet[0];
222  } else {
223  Idx randy = (Idx)((double)std::rand() / (double)RAND_MAX * actionSet.size());
224  lastAction__ = actionSet[randy == actionSet.size() ? 0 : randy];
225  }
226  return lastAction__;
227  }

◆ toString()

std::string gum::SDYNA::toString ( )

Returns a description of the learned FMDP and of the associated optimal policy.

Returns
a string describing the learned FMDP and the associated optimal policy, both in DOT format.

Definition at line 232 of file sdyna.cpp.

References gum::Set< Key, Alloc >::emplace().

232  {
233  std::stringstream description;
234 
235  description << fmdp_->toString() << std::endl;
236  description << planer__->optimalPolicy2String() << std::endl;
237 
238  return description.str();
239  }

◆ valueFunctionSize()

Size gum::SDYNA::valueFunctionSize ( )
inline

valueFunctionSize

Returns
the size of the value function computed so far.

Definition at line 428 of file sdyna.h.

428 { return planer__->vFunctionSize(); }
IPlanningStrategy< double > * planer__
The planer used to plan an optimal strategy.
Definition: sdyna.h:453
virtual Size vFunctionSize()=0
Returns vFunction computed so far current size.

Member Data Documentation

◆ actionReward__

bool gum::SDYNA::actionReward__
private

Definition at line 474 of file sdyna.h.

◆ bin__

Set< Observation* > gum::SDYNA::bin__
private

Since SDYNA made these observations, it has to delete them when it is destroyed.

Definition at line 472 of file sdyna.h.

◆ decider__

IDecisionStrategy* gum::SDYNA::decider__
private

The decider.

Definition at line 456 of file sdyna.h.

◆ fmdp_

FMDP< double >* gum::SDYNA::fmdp_
protected

The learnt Markovian Decision Process.

Definition at line 443 of file sdyna.h.

◆ lastAction__

Idx gum::SDYNA::lastAction__
private

The last performed action.

Definition at line 469 of file sdyna.h.

◆ lastState_

Instantiation gum::SDYNA::lastState_
protected

The state in which the system is before we perform a new action.

Definition at line 446 of file sdyna.h.

◆ learner__

ILearningStrategy* gum::SDYNA::learner__
private

The learner used to learn the FMDP.

Definition at line 450 of file sdyna.h.

◆ nbObservation__

Idx gum::SDYNA::nbObservation__
private

The total number of observations made so far.

Definition at line 463 of file sdyna.h.

◆ nbValueIterationStep__

Idx gum::SDYNA::nbValueIterationStep__
private

The number of value iteration steps we perform.

Definition at line 466 of file sdyna.h.

◆ observationPhaseLenght__

Idx gum::SDYNA::observationPhaseLenght__
private

The number of observations to make before using the planner again.

Definition at line 460 of file sdyna.h.

◆ planer__

IPlanningStrategy< double >* gum::SDYNA::planer__
private

The planner used to compute an optimal strategy.

Definition at line 453 of file sdyna.h.

◆ verbose_

bool gum::SDYNA::verbose_
private

Definition at line 476 of file sdyna.h.


The documentation for this class was generated from the following files:

agrum/FMDP/SDyna/sdyna.h
sdyna.cpp