AdaptiveGradientDescent

Implements the L2^2 and L1 updates from Duchi et al. (2010), "Adaptive Subgradient Methods for Online Learning and Stochastic Optimization".

Basically, we use "forward regularization" and an adaptive per-coordinate step size based on the accumulated previous gradients.

Supertypes: class Object, trait Matchable, class Any

Type members

Classlikes

class L1Regularization[T](val lambda: Double, delta: Double, eta: Double, maxIter: Int)(implicit space: MutableFiniteCoordinateField[T, _, Double], rand: RandBasis) extends StochasticGradientDescent[T]

Implements the L1 regularization update.

Each step is:

x_{t+1,i} = \mathrm{sign}\big(x_{t,i} - (\eta / s_{t,i}) g_{t,i}\big) \cdot \big( |x_{t,i} - (\eta / s_{t,i}) g_{t,i}| - \lambda \eta / s_{t,i} \big)_+

where g_{t,i} is the gradient and s_{t,i} = \sqrt{\sum_{t'=1}^{t} g_{t',i}^2}
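As a concrete illustration, the soft-thresholding step above can be sketched per coordinate on plain `Array[Double]` vectors. This is a minimal sketch, not the Breeze implementation (which works on any `MutableFiniteCoordinateField`); the object name `AdaGradL1Sketch` is hypothetical, and adding `delta` to s_{t,i} as a smoothing term is an assumption to avoid division by zero.

```scala
// Hypothetical sketch of the per-coordinate AdaGrad L1 update; not Breeze's API.
object AdaGradL1Sketch {
  /** One step: adaptive gradient step followed by soft-thresholding.
    * `sumSqGrad` is mutated to accumulate the squared gradients. */
  def step(x: Array[Double], grad: Array[Double], sumSqGrad: Array[Double],
           eta: Double, lambda: Double, delta: Double): Array[Double] =
    Array.tabulate(x.length) { i =>
      sumSqGrad(i) += grad(i) * grad(i)
      val s = math.sqrt(sumSqGrad(i)) + delta   // s_{t,i}, smoothed by delta (assumption)
      val shrunk = x(i) - (eta / s) * grad(i)   // x_{t,i} - (eta / s_{t,i}) * g_{t,i}
      // (|shrunk| - lambda * eta / s)_+ , keeping the sign of `shrunk`
      math.signum(shrunk) * math.max(0.0, math.abs(shrunk) - lambda * eta / s)
    }
}
```

Coordinates whose adaptive step lands within `lambda * eta / s` of zero are truncated exactly to zero, which is how the update produces sparse solutions.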

class L2Regularization[T](val regularizationConstant: Double, stepSize: Double, maxIter: Int, tolerance: Double, minImprovementWindow: Int)(implicit vspace: MutableFiniteCoordinateField[T, _, Double], rand: RandBasis) extends StochasticGradientDescent[T]

Implements the L2 regularization update.

Each step is:

x_{t+1,i} = (s_{t,i} x_{t,i} - \eta g_{t,i}) / (\eta \lambda + \delta + s_{t,i})

where g_{t,i} is the gradient, \lambda is the regularization constant, and s_{t,i} = \sqrt{\sum_{t'=1}^{t} g_{t',i}^2}
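The L2 update has a closed form per coordinate, sketched below on `Array[Double]` vectors. This is a minimal sketch, not the Breeze implementation; the object name `AdaGradL2Sketch` is hypothetical, and `lambda` stands in for the class's `regularizationConstant`.

```scala
// Hypothetical sketch of the per-coordinate AdaGrad L2 update; not Breeze's API.
object AdaGradL2Sketch {
  /** One step of the closed-form L2-regularized adaptive update.
    * `sumSqGrad` is mutated to accumulate the squared gradients. */
  def step(x: Array[Double], grad: Array[Double], sumSqGrad: Array[Double],
           eta: Double, lambda: Double, delta: Double): Array[Double] =
    Array.tabulate(x.length) { i =>
      sumSqGrad(i) += grad(i) * grad(i)
      val s = math.sqrt(sumSqGrad(i))                          // s_{t,i}
      (s * x(i) - eta * grad(i)) / (eta * lambda + delta + s)  // closed-form step
    }
}
```

Note that as s_{t,i} grows, the denominator grows with it, so frequently updated coordinates take smaller effective steps while the regularization term shrinks the iterate toward zero.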