breeze.optimize

Type members

Classlikes

class AdaDeltaGradientDescent[T](rho: Double, maxIter: Int, tolerance: Double, improvementTolerance: Double, minImprovementWindow: Int)(implicit vspace: MutableFiniteCoordinateField[T, _, Double], rand: RandBasis) extends StochasticGradientDescent[T]

Created by jda on 3/17/15.

Implements the L2^2 and L1 updates from Duchi et al 2010 Adaptive Subgradient Methods for Online Learning and Stochastic Optimization.

Basically, we use "forward regularization" and an adaptive step size based on the previous gradients.
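
The adaptive step size can be sketched in plain Scala as follows (illustrative names and Array-based arithmetic, not Breeze's internal implementation): each coordinate accumulates its squared gradients and takes steps scaled by their inverse square root.

```scala
// AdaGrad-style update sketch: per-coordinate step sizes shrink with the
// accumulated squared gradients seen so far.
object AdaGradSketch {
  def minimize(grad: Array[Double] => Array[Double],
               x0: Array[Double],
               eta: Double = 1.0,
               eps: Double = 1e-8,
               iters: Int = 500): Array[Double] = {
    val x = x0.clone()
    val g2 = Array.fill(x.length)(0.0) // running sum of squared gradients
    for (_ <- 0 until iters) {
      val g = grad(x)
      for (i <- x.indices) {
        g2(i) += g(i) * g(i)
        x(i) -= eta / (math.sqrt(g2(i)) + eps) * g(i) // per-coordinate step
      }
    }
    x
  }
}
```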

class ApproximateGradientFunction[K, T](f: T => Double, epsilon: Double)(implicit zeros: CanCreateZerosLike[T, T], view: T <:< Tensor[K, Double], copy: CanCopy[T]) extends DiffFunction[T]

Approximates a gradient by finite differences.
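
The finite-difference idea can be sketched in plain Scala (a stand-alone illustrative function, not Breeze's implementation):

```scala
// Forward-difference gradient: perturb one coordinate at a time by epsilon
// and measure the change in f.
def approxGradient(f: Array[Double] => Double,
                   x: Array[Double],
                   epsilon: Double = 1e-5): Array[Double] = {
  val fx = f(x)
  Array.tabulate(x.length) { i =>
    val xp = x.clone(); xp(i) += epsilon // perturb coordinate i
    (f(xp) - fx) / epsilon               // forward difference
  }
}
```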

A line search optimizes a function of one variable without analytic gradient information. It's often used approximately (e.g. in backtracking line search), where there is no intrinsic termination criterion, only an extrinsic one.

class BacktrackingLineSearch(initfval: Double, maxIterations: Int, shrinkStep: Double, growStep: Double, cArmijo: Double, cWolfe: Double, minAlpha: Double, maxAlpha: Double, enforceWolfeConditions: Boolean, enforceStrongWolfeConditions: Boolean) extends ApproximateLineSearch

Implements the Backtracking Linesearch like that in LBFGS-C (which is (c) 2007-2010 Naoaki Okazaki under BSD).

The basic idea is to find an alpha for which f(alpha) is sufficiently smaller than f(0), possibly also requiring that the slope of f decrease by the right amount (the Wolfe conditions).
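
The sufficient-decrease (Armijo) part of that test can be sketched as follows; the parameter names mirror the constructor above, but the function is an illustrative simplification:

```scala
// Backtrack until the Armijo condition holds:
//   phi(alpha) <= phi(0) + cArmijo * alpha * phi'(0)
// where phi is the objective restricted to the search direction.
def backtrack(phi: Double => Double,   // f along the search direction
              phi0: Double,            // phi(0)
              dphi0: Double,           // directional derivative at 0 (< 0)
              cArmijo: Double = 1e-4,
              shrinkStep: Double = 0.5,
              maxIterations: Int = 30): Double = {
  var alpha = 1.0
  var it = 0
  while (phi(alpha) > phi0 + cArmijo * alpha * dphi0 && it < maxIterations) {
    alpha *= shrinkStep // failed the sufficient-decrease test; shrink the step
    it += 1
  }
  alpha
}
```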

trait BatchDiffFunction[T] extends DiffFunction[T] with (T, IndexedSeq[Int]) => Double

A diff function that supports subsets of the data. By default it evaluates on all the data.

Companion:
object
Companion:
class
case class BatchSize(size: Int) extends OptimizationOption
class CachedBatchDiffFunction[T](obj: BatchDiffFunction[T])(implicit evidence$2: CanCopy[T]) extends BatchDiffFunction[T]
class CachedDiffFunction[T](obj: DiffFunction[T])(implicit evidence$1: CanCopy[T]) extends DiffFunction[T]

Represents a differentiable function whose output is guaranteed to be consistent across invocations.

Companion:
object
class EmpiricalHessian[T](df: DiffFunction[T], x: T, grad: T, eps: Double)(implicit vs: VectorSpace[T, Double])

The empirical Hessian evaluates the derivative for multiplication:

H * d = lim_{e -> 0} (f'(x + e * d) - f'(x)) / e

Value parameters:
eps

a small value

grad

the gradient at x

x

the point we compute the hessian for
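
The formula above translates directly into code. A plain-Scala sketch (illustrative; `grad` stands in for the differentiable function's gradient evaluation):

```scala
// Finite-difference Hessian-vector product:
//   H * d ≈ (f'(x + eps * d) - f'(x)) / eps
def hessianTimes(grad: Array[Double] => Array[Double],
                 x: Array[Double],
                 gradX: Array[Double], // gradient at x, already computed
                 d: Array[Double],
                 eps: Double = 1e-5): Array[Double] = {
  val xp = Array.tabulate(x.length)(i => x(i) + eps * d(i)) // x + eps * d
  val gp = grad(xp)
  Array.tabulate(x.length)(i => (gp(i) - gradX(i)) / eps)
}
```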

Companion:
object
Companion:
class
sealed class FirstOrderException(msg: String) extends RuntimeException
abstract class FirstOrderMinimizer[T, DF <: StochasticDiffFunction[T]](val convergenceCheck: ConvergenceCheck[T])(implicit space: NormedModule[T, Double]) extends Minimizer[T, DF] with SerializableLogging
Companion:
object
Companion:
class
class FisherDiffFunction[T](df: BatchDiffFunction[T], gradientsToKeep: Int)(implicit vs: MutableInnerProductVectorSpace[T, Double]) extends SecondOrderFunction[T, FisherMatrix[T]]
class FisherMatrix[T](grads: IndexedSeq[T])(implicit vs: MutableInnerProductVectorSpace[T, Double])

The Fisher matrix approximates the Hessian by E[grad grad']. We further approximate this with a Monte Carlo approximation to the expectation.
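
The implied matrix-vector product never forms the matrix; a plain-Scala sketch (illustrative, not Breeze's FisherMatrix):

```scala
// Fisher-matrix-vector product from a sample of stored gradients:
//   F * d = (1/n) * sum_i g_i * (g_i . d)
def fisherTimes(grads: Seq[Array[Double]], d: Array[Double]): Array[Double] = {
  val out = Array.fill(d.length)(0.0)
  for (g <- grads) {
    val gd = g.zip(d).map { case (a, b) => a * b }.sum // inner product g_i . d
    for (i <- out.indices) out(i) += g(i) * gd / grads.length
  }
  out
}
```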

Companion:
object
Companion:
class

Class that compares the computed gradient with an empirical gradient based on finite differences. Essential for debugging dynamic programs.

trait IterableOptimizationPackage[Function, Vector, State] extends OptimizationPackage[Function, Vector]
case class L1Regularization(value: Double) extends OptimizationOption
case class L2Regularization(value: Double) extends OptimizationOption
class LBFGS[T](convergenceCheck: ConvergenceCheck[T], m: Int)(implicit space: MutableInnerProductModule[T, Double]) extends FirstOrderMinimizer[T, DiffFunction[T]] with SerializableLogging

Port of LBFGS to Scala.

Special note for LBFGS: If you use it in published work, you must cite one of:

  • J. Nocedal. Updating Quasi-Newton Matrices with Limited Storage (1980), Mathematics of Computation 35, pp. 773-782.
  • D.C. Liu and J. Nocedal. On the Limited Memory BFGS Method for Large Scale Optimization (1989), Mathematical Programming B, 45, 3, pp. 503-528.
Value parameters:
m

The memory of the search. 3 to 7 is usually sufficient.
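
Assuming Breeze is on the classpath, a minimal usage sketch (the quadratic objective is illustrative; the convenience constructor taking maxIter and m is assumed here in place of the convergenceCheck one shown above):

```scala
import breeze.linalg._
import breeze.optimize._

// Minimize f(x) = ||x - 3||^2 with LBFGS.
val f = new DiffFunction[DenseVector[Double]] {
  def calculate(x: DenseVector[Double]) = {
    val d = x - 3.0
    (d.dot(d), d * 2.0) // (value, gradient)
  }
}
val lbfgs = new LBFGS[DenseVector[Double]](maxIter = 100, m = 4)
val xmin = lbfgs.minimize(f, DenseVector.zeros[Double](5)) // all coordinates ≈ 3.0
```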

Companion:
object
object LBFGS
Companion:
class
class LBFGSB(lowerBounds: DenseVector[Double], upperBounds: DenseVector[Double], maxIter: Int, m: Int, tolerance: Double, maxZoomIter: Int, maxLineSearchIter: Int) extends FirstOrderMinimizer[DenseVector[Double], DiffFunction[DenseVector[Double]]] with SerializableLogging

This algorithm follows the paper "A Limited Memory Algorithm for Bound Constrained Optimization" by Richard H. Byrd, Peihuang Lu, Jorge Nocedal, and Ciyou Zhu. Created by fanming.chen on 2015/3/7. If the StrongWolfeLineSearch(maxZoomIter, maxLineSearchIter) parameters are too small, wolfeRuleSearch.minimize may throw a FirstOrderException; increase the two parameters to appropriate values.
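
Assuming Breeze is on the classpath, a minimal bound-constrained sketch (the objective and bounds are illustrative, and the optional constructor parameters are left at their defaults):

```scala
import breeze.linalg._
import breeze.optimize._

// Minimize (x - 3)^2 coordinate-wise subject to 0 <= x <= 2; the
// unconstrained minimum at 3 is clipped to the upper bound.
val solver = new LBFGSB(DenseVector(0.0, 0.0), DenseVector(2.0, 2.0))
val f = new DiffFunction[DenseVector[Double]] {
  def calculate(x: DenseVector[Double]) = {
    val d = x - 3.0
    (d.dot(d), d * 2.0)
  }
}
val xopt = solver.minimize(f, DenseVector(1.0, 1.0)) // expect ≈ DenseVector(2.0, 2.0)
```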

Companion:
object
object LBFGSB
Companion:
class

A line search optimizes a function of one variable without analytic gradient information. Implementations differ only in whether or not they try to find an exact minimizer.

Companion:
object
object LineSearch
Companion:
class
class LineSearchFailed(gradNorm: Double, dirNorm: Double) extends FirstOrderException
case class MaxIterations(num: Int) extends OptimizationOption
trait Minimizer[T, -F]

Anything that can minimize a function.

class OWLQN[K, T](convergenceCheck: ConvergenceCheck[T], m: Int, l1reg: K => Double)(implicit space: MutableEnumeratedCoordinateField[T, K, Double]) extends LBFGS[T] with SerializableLogging

Implements the Orthant-wise Limited Memory QuasiNewton method, which is a variant of LBFGS that handles L1 regularization.

The reference is Andrew and Gao (2007), Scalable Training of L1-Regularized Log-Linear Models.
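
Assuming Breeze is on the classpath, a minimal usage sketch (the objective is illustrative, and the convenience constructor taking a scalar l1reg is assumed here in place of the per-coordinate l1reg function shown above):

```scala
import breeze.linalg._
import breeze.optimize._

// The L1 penalty is handled by the optimizer itself, so the objective
// only needs to supply the smooth part of the loss.
val smooth = new DiffFunction[DenseVector[Double]] {
  def calculate(x: DenseVector[Double]) = {
    val d = x - DenseVector(1.0, 0.01)
    (d.dot(d), d * 2.0)
  }
}
val owlqn = new OWLQN[Int, DenseVector[Double]](maxIter = 100, m = 4, l1reg = 0.5)
val sparse = owlqn.minimize(smooth, DenseVector.zeros[Double](2))
// The coordinate with a tiny signal is driven exactly to 0 by the L1 penalty.
```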

sealed trait OptimizationOption extends OptParams => OptParams
Companion:
object
Companion:
class
trait OptimizationPackage[Function, Vector]
Companion:
object
case object PreferBatch extends OptimizationOption
case object PreferOnline extends OptimizationOption
class ProjectedQuasiNewton(convergenceCheck: ConvergenceCheck[DenseVector[Double]], val innerOptimizer: SpectralProjectedGradient[DenseVector[Double]], val m: Int, val initFeas: Boolean, val testOpt: Boolean, val maxSrchIt: Int, val gamma: Double, val projection: DenseVector[Double] => DenseVector[Double])(implicit space: MutableInnerProductModule[DenseVector[Double], Double]) extends FirstOrderMinimizer[DenseVector[Double], DiffFunction[DenseVector[Double]]] with Projecting[DenseVector[Double]] with SerializableLogging
Companion:
object
trait Projecting[T]

Root finding algorithms

trait SecondOrderFunction[T, H] extends DiffFunction[T]

Represents a function for which we can easily compute the Hessian.

For conjugate gradient methods, you can play tricks with the hessian, returning an object that only supports multiplication.

Companion:
object
Companion:
class
class SpectralProjectedGradient[T](val projection: T => T, tolerance: Double, suffDec: Double, fvalMemory: Int, alphaMax: Double, alphaMin: Double, bbMemory: Int, maxIter: Int, val initFeas: Boolean, val curvilinear: Boolean, val bbType: Int, val maxSrcht: Int)(implicit space: MutableVectorField[T, Double]) extends FirstOrderMinimizer[T, DiffFunction[T]] with Projecting[T] with SerializableLogging

SPG is a Spectral Projected Gradient minimizer; it minimizes a differentiable function subject to the optimum being in some set, given by the projection operator projection

Type parameters:
T

vector type

Value parameters:
alphaMax

longest step

alphaMin

shortest step

bbMemory

number of history entries for linesearch

curvilinear

if true, do the projection inside the line search instead of in chooseDescentDirection

initFeas

is the initial guess feasible, or should it be projected?

maxIter

maximum number of iterations

maxSrcht

maximum number of iterations inside line search

projection

the projection operation

suffDec

sufficient decrease parameter

tolerance

termination criterion: tolerance for norm of projected gradient

case class StepSizeScale(alpha: Double) extends OptimizationOption
class StochasticAveragedGradient[T](maxIter: Int, initialStepSize: Double, tuneStepFrequency: Int, l2Regularization: Double)(implicit vs: MutableInnerProductModule[T, Double]) extends FirstOrderMinimizer[T, BatchDiffFunction[T]]

A differentiable function whose output is not guaranteed to be the same across consecutive invocations.

abstract class StochasticGradientDescent[T](val defaultStepSize: Double, val maxIter: Int, tolerance: Double, fvalMemory: Int)(implicit val vspace: NormedModule[T, Double]) extends FirstOrderMinimizer[T, StochasticDiffFunction[T]] with SerializableLogging

Minimizes a function using stochastic gradient descent.

Companion:
object
class StrongWolfeLineSearch(maxZoomIter: Int, maxLineSearchIter: Int) extends CubicLineSearch
case class Tolerance(fvalTolerance: Double, gvalTolerance: Double) extends OptimizationOption
class TruncatedNewtonMinimizer[T, H](maxIterations: Int, tolerance: Double, l2Regularization: Double, m: Int)(implicit space: MutableVectorField[T, Double], mult: Impl2[H, T, T]) extends Minimizer[T, SecondOrderFunction[T, H]] with SerializableLogging

Implements a truncated Newton trust region method (like TRON). Also implements "Hessian-free learning". We have a few extra tricks though... :)

Value members

Concrete methods

def iterations[Objective, Vector, State](fn: Objective, init: Vector, options: OptimizationOption*)(implicit optimization: IterableOptimizationPackage[Objective, Vector, State]): Iterator[State]

Returns a sequence of states representing the iterates of a solver, given a breeze.optimize.IterableOptimizationPackage that knows how to minimize. The actual state class varies with the kind of function passed in. Typically, states have a .x value of type Vector that is the current point being evaluated, and a .value that is the current objective value.
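
Assuming Breeze is on the classpath, a usage sketch (the objective is illustrative):

```scala
import breeze.linalg._
import breeze.optimize._

// Watch the objective value at each iterate of the solver.
val f = new DiffFunction[DenseVector[Double]] {
  def calculate(x: DenseVector[Double]) = (x.dot(x), x * 2.0)
}
val states = iterations(f, DenseVector.ones[Double](3), MaxIterations(50))
states.foreach(s => println(s.value)) // decreasing objective values
```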

def minimize[Objective, Vector](fn: Objective, init: Vector, options: OptimizationOption*)(implicit optimization: OptimizationPackage[Objective, Vector]): Vector

Minimizes a function, given a breeze.optimize.OptimizationPackage that knows how to minimize.
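
Assuming Breeze is on the classpath, a usage sketch (the objective is illustrative); the trailing arguments are the OptimizationOption case classes listed above:

```scala
import breeze.linalg._
import breeze.optimize._

val f = new DiffFunction[DenseVector[Double]] {
  def calculate(x: DenseVector[Double]) = (x.dot(x), x * 2.0)
}
// Options are folded into OptParams before the solver is chosen.
val xmin = minimize(f, DenseVector.ones[Double](3), L2Regularization(0.1), MaxIterations(100))
```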