Ornstein-Uhlenbeck Action Noise

An OrnsteinUhlenbeckActionNoise object has the following numeric value properties.

| Property | Description | Default Value |
| --- | --- | --- |
| InitialAction | Initial value of action for noise model | 0 |
| Mean | Noise model mean | 0 |
| MeanAttractionConstant | Constant specifying how quickly the noise model output is attracted to the mean | 0.15 |
| StandardDeviationDecayRate | Decay rate of the standard deviation | 0 |
| StandardDeviation | Noise model standard deviation | 0.3 |
| StandardDeviationMin | Minimum standard deviation | 0 |

At each sample time step k, the noise value v(k) is updated using the following formula, where Ts is the agent sample time, and the initial value v(1) is defined by the InitialAction parameter.

v(k+1) = v(k) + MeanAttractionConstant.*(Mean - v(k)).*Ts ...
         + StandardDeviation(k).*randn(size(Mean)).*sqrt(Ts)
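To make the update rule concrete, the following is a minimal sketch that simulates it for a scalar action in plain MATLAB. The sample time, the number of steps, and the variable names `numSteps` and `v` are assumptions chosen for illustration; the remaining values are the defaults from the property table. The standard deviation is held constant here, which corresponds to a StandardDeviationDecayRate of 0.

```matlab
% Illustrative sketch: simulate the noise update for a scalar action.
rng(0);                          % fix the seed so that runs are repeatable
Ts = 0.1;                        % agent sample time (assumed value)
numSteps = 200;                  % number of steps to simulate (assumed value)

InitialAction = 0;               % defaults from the property table
Mean = 0;
MeanAttractionConstant = 0.15;
StandardDeviation = 0.3;         % constant, i.e. decay rate of 0

v = zeros(numSteps, 1);
v(1) = InitialAction;            % v(1) is defined by InitialAction
for k = 1:numSteps-1
    % Pull toward the mean, then add scaled Gaussian noise.
    v(k+1) = v(k) + MeanAttractionConstant.*(Mean - v(k)).*Ts ...
        + StandardDeviation.*randn(size(Mean)).*sqrt(Ts);
end
```

Plotting `v` against the step index shows a correlated random walk that keeps drifting back toward Mean; a larger MeanAttractionConstant pulls it back faster.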

At each sample time step, the standard deviation decays as shown in the following code.

decayedStandardDeviation = StandardDeviation(k).*(1 - StandardDeviationDecayRate);

StandardDeviation(k+1) = max(decayedStandardDeviation,StandardDeviationMin);
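As a rough illustration of the decay rule, the sketch below applies it over a fixed number of steps and stores the result. The decay rate, the minimum, the step count, and the variable name `sigma` are assumed values for the example, not defaults.

```matlab
% Illustrative sketch: decay of the standard deviation with a lower bound.
StandardDeviationDecayRate = 0.01;   % assumed value for illustration
StandardDeviationMin = 0.05;         % assumed value for illustration
numSteps = 500;                      % assumed number of steps

sigma = zeros(numSteps, 1);
sigma(1) = 0.3;                      % default StandardDeviation
for k = 1:numSteps-1
    decayedStandardDeviation = sigma(k).*(1 - StandardDeviationDecayRate);
    % Never let the standard deviation drop below the minimum.
    sigma(k+1) = max(decayedStandardDeviation, StandardDeviationMin);
end
```

With these values the standard deviation shrinks geometrically and then stays at StandardDeviationMin once the decayed value would fall below it.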

You can calculate how many samples it will take for the standard deviation to be halved using this simple formula.

halflife = log(0.5)/log(1-StandardDeviationDecayRate);
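As a worked example, the sketch below evaluates the half-life for an assumed decay rate of 1e-5 and also inverts the formula to find the decay rate that gives a desired half-life. The names `targetHalflife`, `requiredRate`, and `checkHalflife` are hypothetical, and the result only applies while the standard deviation is still above StandardDeviationMin.

```matlab
% Half-life for an assumed decay rate (about 69,000 samples for 1e-5).
StandardDeviationDecayRate = 1e-5;
halflife = log(0.5)/log(1 - StandardDeviationDecayRate);

% Inverse problem: decay rate that halves the standard deviation
% after a chosen number of samples (assumed target).
targetHalflife = 10000;
requiredRate = 1 - 0.5^(1/targetHalflife);

% Sanity check: plugging the rate back in recovers the target.
checkHalflife = log(0.5)/log(1 - requiredRate);
```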

For continuous action signals, it is important to set the noise standard deviation appropriately to encourage exploration. It is common to set StandardDeviation*sqrt(Ts) to a value between 1% and 10% of your action range.
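The guideline above can be turned into a quick calculation. This is a minimal sketch, assuming an action bounded between -2 and 2, a sample time of 0.1, and a target of 5% of the action range; the names `actionLow`, `actionHigh`, and `fraction` are hypothetical.

```matlab
% Pick StandardDeviation so that StandardDeviation*sqrt(Ts) is a chosen
% fraction of the action range.
actionLow = -2;                  % assumed lower action limit
actionHigh = 2;                  % assumed upper action limit
Ts = 0.1;                        % assumed agent sample time
fraction = 0.05;                 % 5%, within the 1% to 10% guideline

actionRange = actionHigh - actionLow;
StandardDeviation = fraction*actionRange/sqrt(Ts);
```

Because the noise term is scaled by sqrt(Ts), a smaller sample time calls for a larger StandardDeviation to keep the same per-step noise magnitude.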

If your agent converges on local optima too quickly, promote agent exploration by increasing the amount of noise; that is, by increasing the standard deviation. Also, to increase exploration, you can reduce the StandardDeviationDecayRate.
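For instance, with a DDPG agent these properties are typically adjusted through the NoiseOptions property of the agent options object. The following is a sketch only, assuming a Reinforcement Learning Toolbox release in which rlDDPGAgentOptions exposes the property names listed above; the numeric values are illustrative, not recommendations.

```matlab
% Requires Reinforcement Learning Toolbox (assumed to be installed).
Ts = 0.1;                                    % assumed agent sample time
agentOpts = rlDDPGAgentOptions('SampleTime', Ts);

% More noise and slower decay both increase exploration.
agentOpts.NoiseOptions.StandardDeviation = 0.6;           % illustrative
agentOpts.NoiseOptions.StandardDeviationDecayRate = 1e-5; % illustrative
agentOpts.NoiseOptions.StandardDeviationMin = 0.05;       % illustrative
```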
