breeze.optimize
Type members
Classlikes
Created by jda on 3/17/15.
Implements the L2^2 and L1 updates from Duchi et al. (2010), "Adaptive Subgradient Methods for Online Learning and Stochastic Optimization".
Basically, we use "forward regularization" and an adaptive step size based on the previous gradients.
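As a rough illustration of the adaptive step size only, the sketch below applies a plain AdaGrad update in standalone Scala. It is not Breeze's implementation: the regularization step is omitted, and the object name, the quadratic objective, and the parameters `eta` and `delta` are assumptions made for the example.

```scala
object AdaGradSketch {
  // One AdaGrad step without regularization: each coordinate gets its own step size,
  // eta / sqrt(sum of squared past gradients for that coordinate).
  def step(x: Array[Double], grad: Array[Double], sumSq: Array[Double],
           eta: Double = 0.5, delta: Double = 1e-8): Unit = {
    for (i <- x.indices) {
      sumSq(i) += grad(i) * grad(i)
      x(i) -= eta * grad(i) / (math.sqrt(sumSq(i)) + delta)
    }
  }

  // Hypothetical objective: f(x) = sum_i (x_i - 3)^2, with gradient 2 * (x_i - 3).
  def run(iters: Int): Array[Double] = {
    val x = Array(0.0, 10.0)
    val sumSq = Array.fill(x.length)(0.0)
    for (_ <- 0 until iters) step(x, x.map(xi => 2 * (xi - 3)), sumSq)
    x
  }
}
```

Coordinates that have seen large gradients take smaller steps, which is the "adaptive" part of the method.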
Approximates a gradient by finite differences.
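A minimal standalone sketch of the idea, assuming forward differences and a hypothetical step `eps`; the class documented here may use a different scheme or signature.

```scala
object FiniteDifferenceSketch {
  // Forward-difference approximation: df/dx_i ~ (f(x + eps * e_i) - f(x)) / eps.
  def approxGradient(f: Array[Double] => Double, x: Array[Double],
                     eps: Double = 1e-6): Array[Double] = {
    val fx = f(x)
    Array.tabulate(x.length) { i =>
      val shifted = x.clone()   // perturb one coordinate at a time
      shifted(i) += eps
      (f(shifted) - fx) / eps
    }
  }
}
```

Each gradient evaluation costs one extra function call per coordinate, so this is mainly useful for testing or for low-dimensional problems.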
A line search optimizes a function of one variable without analytic gradient information. It's often used approximately (e.g. in backtracking line search), where there is no intrinsic termination criterion, only extrinsic
Implements a backtracking line search like the one in LBFGS-C (which is (c) 2007-2010 Naoaki Okazaki, under the BSD license).
The basic idea is to find a step size alpha such that f(alpha) is sufficiently smaller than f(0), optionally also requiring that the slope of f decrease by the right amount (the Wolfe conditions).
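The sufficient-decrease part can be sketched as follows. This is a standalone illustration, not the Breeze class: it checks only the Armijo condition (no curvature check), and the names `c1`, `shrink`, and `maxIter` are assumptions.

```scala
object BacktrackingSketch {
  // phi(alpha) = f(x + alpha * d) along a descent direction d; dphi0 = phi'(0) < 0.
  // Shrinks alpha until phi(alpha) <= phi(0) + c1 * alpha * dphi0 (sufficient decrease).
  def search(phi: Double => Double, dphi0: Double, init: Double = 1.0,
             c1: Double = 1e-4, shrink: Double = 0.5, maxIter: Int = 50): Double = {
    require(dphi0 < 0, "direction must be a descent direction")
    val phi0 = phi(0.0)
    var alpha = init
    var iter = 0
    while (phi(alpha) > phi0 + c1 * alpha * dphi0 && iter < maxIter) {
      alpha *= shrink
      iter += 1
    }
    alpha
  }
}
```

For example, with phi(alpha) = (1 - 4 * alpha)^2 and phi'(0) = -8, the initial step 1.0 is halved twice before the condition holds.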
A diff function that supports subsets of the data. By default it evaluates on all the data.
- Companion:
- object
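The "subset with a full-data default" idea might look like the standalone sketch below. The data, the object name, and the `calculate` signature are hypothetical and do not reproduce the Breeze trait.

```scala
object BatchSketch {
  // Hypothetical data for least squares on a scalar weight w: minimize sum_i (w * a_i - b_i)^2.
  val a = Array(1.0, 2.0, 3.0, 4.0)
  val b = Array(2.0, 4.0, 6.0, 8.0)

  // Returns (value, gradient) over the given subset of example indices;
  // with no explicit batch it uses the full data range.
  def calculate(w: Double, batch: IndexedSeq[Int] = a.indices): (Double, Double) = {
    var value = 0.0
    var grad = 0.0
    for (i <- batch) {
      val r = w * a(i) - b(i)   // residual for example i
      value += r * r
      grad += 2 * r * a(i)
    }
    (value, grad)
  }
}
```

Calling `BatchSketch.calculate(1.0)` uses all four examples, while `BatchSketch.calculate(1.0, Vector(0, 1))` uses only the first two.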
Represents a differentiable function whose output is guaranteed to be consistent
- Companion:
- object
The empirical Hessian evaluates the derivative for multiplication.
H d = \lim_{e \to 0} \frac{f'(x + e d) - f'(x)}{e}
- Value parameters:
- eps
a small value
- grad
the gradient at x
- x
the point we compute the hessian for
- Companion:
- object
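The limit in the empirical Hessian definition can be approximated with a small fixed step. The sketch below is a standalone illustration under that assumption; `times` and its parameters are hypothetical names, with `eps` playing the role of the small value.

```scala
object EmpiricalHessianSketch {
  // Approximates H * d by (grad(x + eps * d) - grad(x)) / eps, without ever forming H.
  def times(grad: Array[Double] => Array[Double], x: Array[Double],
            d: Array[Double], eps: Double = 1e-5): Array[Double] = {
    val g0 = grad(x)
    val shifted = Array.tabulate(x.length)(i => x(i) + eps * d(i))
    val g1 = grad(shifted)
    Array.tabulate(x.length)(i => (g1(i) - g0(i)) / eps)
  }
}
```

Each product costs one extra gradient evaluation, which is why this form suits methods that only ever need Hessian-vector products.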
- Companion:
- object
The Fisher matrix approximates the Hessian by E[grad grad']. We further approximate this with a Monte Carlo approximation to the expectation.
- Companion:
- object
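A standalone sketch of the Monte Carlo estimate, written as a matrix-vector product so the full matrix is never stored. It assumes the per-sample gradients have already been computed; the names are hypothetical.

```scala
object FisherSketch {
  // Monte Carlo estimate of E[g g^T] * d: the mean over sampled gradients g of g * (g . d).
  def times(sampleGrads: Seq[Array[Double]], d: Array[Double]): Array[Double] = {
    val out = Array.fill(d.length)(0.0)
    for (g <- sampleGrads) {
      val gDotD = g.indices.map(i => g(i) * d(i)).sum
      for (i <- out.indices) out(i) += g(i) * gDotD / sampleGrads.size
    }
    out
  }
}
```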
Class that compares the computed gradient with an empirical gradient based on finite differences. Essential for debugging dynamic programs.
Port of LBFGS to Scala.
Special note for LBFGS: If you use it in published work, you must cite one of:
- J. Nocedal. Updating Quasi-Newton Matrices with Limited Storage (1980), Mathematics of Computation 35, pp. 773-782.
- D.C. Liu and J. Nocedal. On the Limited Memory BFGS Method for Large Scale Optimization (1989), Mathematical Programming B, 45, 3, pp. 503-528.
- Value parameters:
- m
The memory of the search. 3 to 7 is usually sufficient.
- Companion:
- object
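To show what the memory parameter m controls, here is a standalone sketch of the standard L-BFGS two-loop recursion over the stored history. It is an illustration of the textbook procedure, not the code of this port; the object name and the history layout are assumptions.

```scala
object TwoLoopSketch {
  private def dot(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(i => a(i) * b(i)).sum

  // history holds at most m pairs (s, y), newest last, where
  // s = x_{k+1} - x_k and y = grad_{k+1} - grad_k.
  // Returns an approximation of (inverse Hessian) * grad.
  def inverseHessianTimes(grad: Array[Double],
                          history: Seq[(Array[Double], Array[Double])]): Array[Double] = {
    val q = grad.clone()
    val rhos = history.map { case (s, y) => 1.0 / dot(y, s) }
    val alphas = new Array[Double](history.length)
    for (i <- history.indices.reverse) {   // first loop: newest to oldest
      val (s, y) = history(i)
      alphas(i) = rhos(i) * dot(s, q)
      for (j <- q.indices) q(j) -= alphas(i) * y(j)
    }
    // Initial Hessian guess: scale by (s . y) / (y . y) from the newest pair.
    val gamma = history.lastOption.map { case (s, y) => dot(s, y) / dot(y, y) }.getOrElse(1.0)
    for (j <- q.indices) q(j) *= gamma
    for (i <- history.indices) {           // second loop: oldest to newest
      val (s, y) = history(i)
      val beta = rhos(i) * dot(y, q)
      for (j <- q.indices) q(j) += s(j) * (alphas(i) - beta)
    }
    q
  }
}
```

The search direction is the negation of the returned vector. Storage and work grow linearly with the number of stored pairs, which is why a small m keeps the method cheap.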
This algorithm follows the paper "A Limited Memory Algorithm for Bound Constrained Optimization" by Richard H. Byrd, Peihuang Lu, Jorge Nocedal, and Ciyou Zhu. Created by fanming.chen on 2015/3/7. If the StrongWolfeLineSearch parameters (maxZoomIter, maxLineSearchIter) are too small, wolfeRuleSearch.minimize may throw a FirstOrderException; in that case, increase the two values.
- Companion:
- object
A line search optimizes a function of one variable without analytic gradient information. Differs only in whether or not it tries to find an exact minimizer
- Companion:
- object
Implements the Orthant-wise Limited-memory Quasi-Newton method, which is a variant of LBFGS that handles L1 regularization.
The paper is Andrew and Gao (2007), "Scalable Training of L1-Regularized Log-Linear Models".
- Companion:
- object
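One ingredient of the orthant-wise method is the pseudo-gradient, which replaces the gradient where the L1 term is not differentiable. The standalone sketch below follows the definition in the Andrew and Gao paper; the object and parameter names are assumptions, and the rest of the method (orthant projection, quasi-Newton direction) is not shown.

```scala
object PseudoGradientSketch {
  // Pseudo-gradient of f(x) + lambda * |x|_1, where grad is the gradient of the smooth part f.
  def pseudoGradient(x: Array[Double], grad: Array[Double], lambda: Double): Array[Double] =
    Array.tabulate(x.length) { i =>
      if (x(i) > 0) grad(i) + lambda
      else if (x(i) < 0) grad(i) - lambda
      else if (grad(i) + lambda < 0) grad(i) + lambda   // right derivative negative: moving right helps
      else if (grad(i) - lambda > 0) grad(i) - lambda   // left derivative positive: moving left helps
      else 0.0                                          // zero is locally optimal in this coordinate
    }
}
```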
Represents a function for which we can easily compute the Hessian.
For conjugate gradient methods, you can play tricks with the Hessian, returning an object that only supports multiplication.
- Companion:
- object
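The standalone sketch below shows why multiplication is enough: a linear conjugate gradient solver that touches the Hessian only through a function computing H * p. The names are hypothetical and the code is not taken from Breeze.

```scala
object ConjugateGradientSketch {
  private def dot(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(i => a(i) * b(i)).sum

  // Solves H * x = b for symmetric positive definite H, using H only via hessTimes.
  def solve(hessTimes: Array[Double] => Array[Double], b: Array[Double],
            maxIter: Int = 100, tol: Double = 1e-10): Array[Double] = {
    val x = Array.fill(b.length)(0.0)
    val r = b.clone()          // residual b - H * x, with x = 0 initially
    val p = r.clone()          // search direction
    var rr = dot(r, r)
    var iter = 0
    while (iter < maxIter && math.sqrt(rr) > tol) {
      val hp = hessTimes(p)
      val alpha = rr / dot(p, hp)
      for (i <- x.indices) { x(i) += alpha * p(i); r(i) -= alpha * hp(i) }
      val rrNew = dot(r, r)
      for (i <- p.indices) p(i) = r(i) + (rrNew / rr) * p(i)
      rr = rrNew
      iter += 1
    }
    x
  }
}
```

Any object that can compute H * p, including a finite-difference approximation, can be plugged in as `hessTimes`.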
SPG is a Spectral Projected Gradient minimizer; it minimizes a differentiable function subject to the optimum being in some set, which is given by the projection operator `projection`.
- Type parameters:
- T
vector type
- Value parameters:
- alphaMax
longest step
- alphaMin
shortest step
- bbMemory
number of history entries for linesearch
- curvilinear
if true, do the projection inside the line search instead of in chooseDescentDirection
- initFeas
is the initial guess feasible, or should it be projected?
- maxIter
maximum number of iterations
- maxSrcht
maximum number of iterations inside line search
- projection
projection operations
- suffDec
sufficient decrease parameter
- tolerance
termination criterion: tolerance for norm of projected gradient
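A heavily simplified standalone sketch of the projected gradient loop with a Barzilai-Borwein (spectral) step. It reuses the parameter names `projection`, `alphaMin`, `alphaMax`, `tolerance`, and `maxIter` from the list above for readability, but it omits the line search entirely, so it is not the documented algorithm and carries no convergence guarantee in general.

```scala
object ProjectedGradientSketch {
  def minimize(grad: Array[Double] => Array[Double],
               projection: Array[Double] => Array[Double],
               init: Array[Double], alphaMin: Double = 1e-10, alphaMax: Double = 1e10,
               tolerance: Double = 1e-6, maxIter: Int = 100): Array[Double] = {
    var x = projection(init)
    var g = grad(x)
    var alpha = 1.0
    var iter = 0
    // Norm of the projected gradient P(x - g) - x; zero at a constrained stationary point.
    def projGradNorm: Double = {
      val p = projection(Array.tabulate(x.length)(i => x(i) - g(i)))
      math.sqrt(x.indices.map(i => (p(i) - x(i)) * (p(i) - x(i))).sum)
    }
    while (iter < maxIter && projGradNorm > tolerance) {
      val xNew = projection(Array.tabulate(x.length)(i => x(i) - alpha * g(i)))
      val gNew = grad(xNew)
      val ss = x.indices.map(i => (xNew(i) - x(i)) * (xNew(i) - x(i))).sum
      val sy = x.indices.map(i => (xNew(i) - x(i)) * (gNew(i) - g(i))).sum
      // Barzilai-Borwein step (s . s) / (s . y), clipped to [alphaMin, alphaMax].
      alpha = if (sy <= 0) alphaMax else math.min(alphaMax, math.max(alphaMin, ss / sy))
      x = xNew
      g = gNew
      iter += 1
    }
    x
  }
}
```

For a box constraint, `projection` is a coordinate-wise clamp, for example `v => v.map(t => math.min(2.0, math.max(0.0, t)))`.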
A differentiable function whose output is not guaranteed to be the same across consecutive invocations.
Minimizes a function using stochastic gradient descent
- Companion:
- object
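For orientation, a minimal standalone sketch of stochastic gradient descent on a toy least-squares problem. The data, the fixed step size, and the per-epoch shuffling are assumptions made for the example and say nothing about how the Breeze class schedules its steps.

```scala
object SgdSketch {
  // Hypothetical data for one-dimensional least squares: minimize sum_i (w * a_i - b_i)^2.
  val a = Array(1.0, 2.0, 3.0, 4.0)
  val b = Array(2.0, 4.0, 6.0, 8.0)

  def minimize(init: Double, stepSize: Double = 0.01, epochs: Int = 200,
               seed: Long = 0L): Double = {
    val rng = new scala.util.Random(seed)
    var w = init
    // Visit the examples in a fresh random order each epoch, one example per update.
    for (_ <- 0 until epochs; i <- rng.shuffle(a.indices.toList)) {
      val grad = 2 * (w * a(i) - b(i)) * a(i)   // gradient of a single example's loss
      w -= stepSize * grad
    }
    w
  }
}
```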
Implements a TruncatedNewton Trust region method (like Tron). Also implements "Hessian Free learning". We have a few extra tricks though... :)
Value members
Concrete methods
Returns a sequence of states representing the iterates of a solver, given a breeze.optimize.IterableOptimizationPackage that knows how to minimize the function. The actual state class varies with the kind of function passed in. Typically, a state has a .x value of type Vector that is the current point being evaluated, and a .value that is the current objective value.
Minimizes a function, given a breeze.optimize.OptimizationPackage that knows how to minimize it.