aGrUM  0.20.3
a C++ library for (probabilistic) graphical models
gum::E_GreedyDecider Class Reference

<agrum/FMDP/decision/E_GreedyDecider.h>

#include <E_GreedyDecider.h>


Public Member Functions

Constructor & destructor.
 E_GreedyDecider ()
 Constructor.
 
 ~E_GreedyDecider ()
 Destructor.
 
Initialization
void initialize (const FMDP< double > *fmdp)
 Initializes the learner.
 
Incremental methods
void checkState (const Instantiation &newState, Idx actionId)
 
ActionSet stateOptimalPolicy (const Instantiation &curState)
 

Incremental methods

void setOptimalStrategy (const MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy > *optPol)
 
const MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy > * optPol_
 
ActionSet allActions_
 

Detailed Description

<agrum/FMDP/decision/E_GreedyDecider.h>

Class that makes decisions following an epsilon-greedy trade-off between exploration and exploitation.

Definition at line 56 of file E_GreedyDecider.h.

Constructor & Destructor Documentation

◆ E_GreedyDecider()

gum::E_GreedyDecider::E_GreedyDecider ( )

Constructor.

Definition at line 48 of file E_GreedyDecider.cpp.

References gum::Set< Key, Alloc >::emplace().

48  {
49  GUM_CONSTRUCTOR(E_GreedyDecider);
50 
51  _sss_ = 1.0;
52  }

◆ ~E_GreedyDecider()

gum::E_GreedyDecider::~E_GreedyDecider ( )

Destructor.

Definition at line 60 of file E_GreedyDecider.cpp.

References gum::Set< Key, Alloc >::emplace().

60  {
61  GUM_DESTRUCTOR(E_GreedyDecider);
62  ;
63  }

Member Function Documentation

◆ checkState()

void gum::E_GreedyDecider::checkState ( const Instantiation &  newState,
Idx  actionId
)
virtual

Implements gum::IDecisionStrategy.

Definition at line 93 of file E_GreedyDecider.cpp.

References gum::Set< Key, Alloc >::emplace().

93  {
94  if (_statecpt_.nbVisitedStates() == 0)
95  _statecpt_.reset(reachedState);
96  else if (!_statecpt_.checkState(reachedState))
97  _statecpt_.addState(reachedState);
98  }
bool checkState(const Instantiation &state)
Definition: statesChecker.h:73
void addState(const Instantiation &)
StatesChecker _statecpt_
void reset(const Instantiation &)
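
The listing above records each reached state at most once, so that nbVisitedStates() counts distinct states. The following is a minimal sketch of that bookkeeping using only the standard library. State and VisitedStates are hypothetical stand-ins introduced for illustration; they are not the aGrUM Instantiation or StatesChecker types, and std::set is an assumption about storage, not a description of how StatesChecker is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a state: one value index per variable.
using State = std::vector<std::size_t>;

// Hypothetical stand-in for StatesChecker, backed by std::set (an assumption).
class VisitedStates {
 public:
  // Records the state if it has not been seen yet.
  // Returns true when the state was newly added, false when already known.
  bool visit(const State& s) {
    const bool added = seen_.insert(s).second;
    assert(!seen_.empty());  // after an insert attempt, at least one state is stored
    return added;
  }

  // Number of distinct states recorded so far.
  std::size_t nbVisitedStates() const { return seen_.size(); }

 private:
  std::set<State> seen_;
};
```

Visiting the same state twice leaves the count unchanged, which is the property the exploration threshold in stateOptimalPolicy() relies on.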

◆ initialize()

void gum::E_GreedyDecider::initialize ( const FMDP< double > *  fmdp)
virtual

Initializes the learner.

Reimplemented from gum::IDecisionStrategy.

Definition at line 75 of file E_GreedyDecider.cpp.

References gum::Set< Key, Alloc >::emplace().

75  {
77  for (auto varIter = fmdp->beginVariables(); varIter != fmdp->endVariables(); ++varIter)
78  _sss_ *= (double)(*varIter)->domainSize();
79  }
SequenceIteratorSafe< const DiscreteVariable *> beginVariables() const
Returns an iterator reference to the beginning of the list of variables.
Definition: fmdp.h:94
SequenceIteratorSafe< const DiscreteVariable *> endVariables() const
Returns an iterator reference to the end of the list of variables.
Definition: fmdp.h:101
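
The loop in the listing above multiplies _sss_ by the domain size of every variable, which yields the number of joint states when _sss_ starts at 1.0 (the value set in the constructor). A minimal standalone sketch of that computation follows. The function name stateSpaceSize and the use of a plain vector of domain sizes are assumptions made for illustration; they are not part of the aGrUM API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: product of the domain sizes of all variables,
// starting from 1.0 as the constructor does for _sss_.
double stateSpaceSize(const std::vector<std::size_t>& domainSizes) {
  double size = 1.0;
  for (std::size_t d : domainSizes) {
    assert(d > 0);  // a discrete variable has at least one value
    size *= static_cast<double>(d);
  }
  return size;
}
```

For example, three variables with 2, 3 and 4 values give a state space of 24 joint states. A double is used, as in the original member, because the product grows quickly with the number of variables.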

◆ setOptimalStrategy()

void gum::IDecisionStrategy::setOptimalStrategy ( const MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy > *  optPol)
inline, inherited

Definition at line 89 of file IDecisionStrategy.h.

89  {
90  optPol_ = const_cast< MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy >* >(optPol);
91  }
const MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy > * optPol_

◆ stateOptimalPolicy()

ActionSet gum::E_GreedyDecider::stateOptimalPolicy ( const Instantiation &  curState)
virtual

Reimplemented from gum::IDecisionStrategy.

Definition at line 106 of file E_GreedyDecider.cpp.

References gum::Set< Key, Alloc >::emplace().

106  {
107  double explo = (double)std::rand() / (double)RAND_MAX;
108  double temp = std::pow((_sss_ - (double)_statecpt_.nbVisitedStates()) / _sss_, 3.0);
109  double exploThreshold = temp < 0.1 ? 0.1 : temp;
110 
111  // std::cout << exploThreshold << std::endl;
112 
113  ActionSet optimalSet = IDecisionStrategy::stateOptimalPolicy(curState);
114  if (explo > exploThreshold) {
115  // std::cout << "Exploit : " << optimalSet << std::endl;
116  return optimalSet;
117  }
118 
119  if (allActions_.size() > optimalSet.size()) {
120  ActionSet ret(allActions_);
121  ret -= optimalSet;
122  // std::cout << "Explore : " << ret << std::endl;
123  return ret;
124  }
125 
126  // std::cout << "Explore : " << allActions_ << std::endl;
127  return allActions_;
128  }
Size size() const
Gives the size.
Definition: actionSet.h:205
virtual ActionSet stateOptimalPolicy(const Instantiation &curState)
StatesChecker _statecpt_
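
The listing above combines two steps: a threshold that decays cubically with the fraction of unvisited states and is floored at 0.1, and a choice between the optimal action set and the remaining actions. The sketch below restates both steps with the standard library only. The names explorationThreshold, chooseActions and Actions are hypothetical and introduced for illustration; Actions is a plain std::set<int>, not the aGrUM ActionSet, and the random draw is passed in as a parameter so that the behaviour is deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <iterator>
#include <set>

// Cubic decay of the exploration threshold, floored at 0.1,
// mirroring lines 108-109 of the listing above.
double explorationThreshold(double stateSpaceSize, double visitedStates) {
  assert(stateSpaceSize > 0.0);
  const double unvisitedRatio = (stateSpaceSize - visitedStates) / stateSpaceSize;
  return std::max(0.1, std::pow(unvisitedRatio, 3.0));
}

// Hypothetical stand-in for ActionSet: a set of action ids.
using Actions = std::set<int>;

// Chooses which actions to return for a draw `explo` in [0, 1].
Actions chooseActions(double explo, double threshold,
                      const Actions& optimal, const Actions& all) {
  if (explo > threshold) return optimal;  // exploit: keep the optimal actions
  if (all.size() > optimal.size()) {      // explore: return the non-optimal actions
    Actions rest;
    std::set_difference(all.begin(), all.end(), optimal.begin(), optimal.end(),
                        std::inserter(rest, rest.begin()));
    return rest;
  }
  return all;  // every action is optimal, so there is nothing else to explore
}
```

With no state visited the threshold is 1.0, so a draw in [0, 1] never exceeds it and the decider always explores. Once the floor of 0.1 is reached, roughly nine draws out of ten exceed the threshold and the optimal set is returned.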

Member Data Documentation

◆ _sss_

double gum::E_GreedyDecider::_sss_
private

Definition at line 100 of file E_GreedyDecider.h.

◆ _statecpt_

StatesChecker gum::E_GreedyDecider::_statecpt_
private

Definition at line 99 of file E_GreedyDecider.h.

◆ allActions_

ActionSet gum::IDecisionStrategy::allActions_
protected, inherited

Definition at line 102 of file IDecisionStrategy.h.

◆ optPol_

const MultiDimFunctionGraph< ActionSet, SetTerminalNodePolicy >* gum::IDecisionStrategy::optPol_
protected, inherited

Definition at line 99 of file IDecisionStrategy.h.


The documentation for this class was generated from the following files: