bask.Optimizer
- class bask.Optimizer(dimensions, n_points=500, n_initial_points=10, init_strategy='sb', gp_kernel=None, gp_kwargs=None, gp_priors=None, acq_func='pvrs', acq_func_kwargs=None, random_state=None, **kwargs)
Execute a stepwise Bayesian optimization.
- Parameters:
- dimensions : list, shape (n_dims,)
List of search space dimensions. Each search dimension can be defined either as
a (lower_bound, upper_bound) tuple (for Real or Integer dimensions),
a (lower_bound, upper_bound, “prior”) tuple (for Real dimensions),
as a list of categories (for Categorical dimensions), or
an instance of a Dimension object (Real, Integer or Categorical).
- n_points : int, default=500
Number of random points to evaluate the acquisition function on.
- n_initial_points : int, default=10
Number of initial points to sample before fitting the GP.
- init_strategy : string or None, default="sb"
Type of initialization strategy to use for the initial n_initial_points. Should be one of:
"sb": The Steinberger low-discrepancy sequence
"r2": The R2 sequence (works well for up to two parameters)
"random" or None: Uniform random sampling
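To make the "r2" option more concrete, here is a minimal sketch of the commonly published R2 construction: an additive recurrence whose step sizes are derived from the positive root of x^(d+1) = x + 1. The function names, the `seed` offset of 0.5, and the indexing are assumptions for illustration; bask's own implementation may differ in these details.

```python
def r2_sequence(n_points, n_dims, seed=0.5):
    """Return n_points quasi-random points in the unit hypercube [0, 1)^n_dims."""
    # phi is the unique positive root of x**(n_dims + 1) = x + 1,
    # approximated here by fixed-point iteration.
    phi = 2.0
    for _ in range(50):
        phi = (1.0 + phi) ** (1.0 / (n_dims + 1))
    # One step size per dimension: alpha_j = phi**-(j + 1).
    alphas = [phi ** -(j + 1) for j in range(n_dims)]
    return [
        [(seed + alpha * (i + 1)) % 1.0 for alpha in alphas]
        for i in range(n_points)
    ]


def scale_to_bounds(unit_point, bounds):
    """Map a unit-hypercube point to per-dimension (lower, upper) bounds."""
    return [lo + u * (hi - lo) for u, (lo, hi) in zip(unit_point, bounds)]


# Five initial points for a hypothetical two-parameter search space.
points = r2_sequence(5, 2)
scaled = [scale_to_bounds(p, [(-2.0, 2.0), (0.0, 10.0)]) for p in points]
```

Unlike uniform random sampling, consecutive points of such a sequence avoid clustering, which is why low-discrepancy sequences are attractive when only a handful of initial evaluations are available.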
- gp_kernel : kernel object
The kernel specifying the covariance function of the GP. If None is passed, a suitable default kernel is constructed. Note that the kernel's hyperparameters are estimated using MCMC during fitting.
- gp_kwargs : dict, optional
Dict of arguments passed to BayesGPR. For example, {'normalize_y': True} would allow the GP to normalize the output values before fitting.
- gp_priors : list of callables, optional
List of prior distributions for the kernel hyperparameters of the GP. Each callable returns the logpdf of the prior distribution. Remember that a WhiteKernel is added to the gp_kernel, which is why you need to include a prior distribution for that as well. If None, the optimizer will try to guess suitable prior distributions.
- acq_func : string or Acquisition object, default="pvrs"
Acquisition function to use as a criterion to select new points to test. By default we use "pvrs", which is a very robust criterion with fast convergence. Should be one of:
'pvrs': Predictive variance reduction search
'mes': Max-value entropy search
'ei': Expected improvement
'ttei': Top-two expected improvement
'lcb': Lower confidence bound
'mean': Expected value of the GP
'ts': Thompson sampling
'vr': Global variance reduction
Can also be a custom Acquisition object.
- acq_func_kwargs : dict, optional
Dict of arguments passed to Acquisition.
- random_state : int or RandomState or None, optional, default=None
Pseudo random number generator state used for random uniform sampling from lists of possible values instead of scipy.stats distributions.
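Two of the acq_func options listed above have short closed forms that can be written down directly. The sketch below uses the textbook definitions of the lower confidence bound and expected improvement for a minimization problem, given a GP's predictive mean and standard deviation at one point. The function names and the `kappa` parameter are assumptions for illustration; this is not bask's code, and its sign conventions or defaults may differ.

```python
import math


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lower_confidence_bound(mean, std, kappa=1.96):
    """Optimistic estimate of the objective; smaller values are more promising."""
    return mean - kappa * std


def expected_improvement(mean, std, best_y):
    """Expected amount by which a point improves on best_y (minimization)."""
    if std <= 0.0:
        # Without predictive uncertainty the improvement is deterministic.
        return max(best_y - mean, 0.0)
    z = (best_y - mean) / std
    return (best_y - mean) * normal_cdf(z) + std * normal_pdf(z)
```

Both criteria trade off a low predicted mean against a high predictive uncertainty; `kappa` controls that balance explicitly for the lower confidence bound.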
- Attributes:
- Xi : list
Points at which objective has been evaluated.
- yi : scalar
Values of objective at corresponding points in Xi.
- space : Space
An instance of skopt.space.Space. Stores parameter search space used to sample points, bounds, and type of parameters.
- gp : BayesGPR object
The current underlying GP model, which is used to calculate the acquisition function.
- gp_priors : list of callables
List of prior distributions for the kernel hyperparameters of the GP. Each callable returns the logpdf of the prior distribution.
- n_initial_points_ : int
Number of initial points to sample.
- noisei : list of floats
Additional pointwise noise which is added to the diagonal of the kernel matrix.
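The role of noisei can be illustrated with a small, library-independent sketch: each observation's noise variance is added to the corresponding diagonal entry of the kernel matrix, so noisier observations constrain the GP less. The RBF kernel and both function names below are assumptions chosen for the example, not bask internals.

```python
import math


def rbf_kernel_matrix(xs, length_scale=1.0):
    """Kernel matrix of a one-dimensional RBF kernel evaluated at points xs."""
    return [
        [math.exp(-0.5 * ((a - b) / length_scale) ** 2) for b in xs]
        for a in xs
    ]


def add_pointwise_noise(kernel_matrix, noise_variances):
    """Return a copy of kernel_matrix with noise_variances[i] added to entry (i, i)."""
    noisy = [row[:] for row in kernel_matrix]
    for i, variance in enumerate(noise_variances):
        noisy[i][i] += variance
    return noisy


K = rbf_kernel_matrix([0.0, 1.0])
K_noisy = add_pointwise_noise(K, [0.1, 0.4])  # second observation is noisier
```

Only the diagonal changes; the off-diagonal covariances between different points are left untouched.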
Methods
- ask([n_points]): Ask the optimizer for the next point to evaluate.
- expected_optimality_gap([max_tries, ...]): Estimate the expected optimality gap by repeatedly sampling functions consistent with the data.
- optimum_intervals([hdi_prob, multimodal, ...]): Estimate highest density intervals for the optimum.
- probability_of_optimality(threshold[, ...]): Compute the probability that the current expected optimum cannot be improved by more than threshold points.
- run(func[, n_iter, replace, n_samples, ...]): Execute the ask/tell-loop on a given objective function.
- tell(x, y[, noise_vector, fit, replace, ...]): Inform the optimizer about the objective function at discrete points.
- __init__(dimensions, n_points=500, n_initial_points=10, init_strategy='sb', gp_kernel=None, gp_kwargs=None, gp_priors=None, acq_func='pvrs', acq_func_kwargs=None, random_state=None, **kwargs)
- ask(n_points=1)
Ask the optimizer for the next point to evaluate.
If the optimizer is still in its initialization phase, it will return a point as specified by the init_strategy. If the Gaussian process has been fit, a previously computed point, as suggested by the acquisition function, is returned.
- Parameters:
- n_points : int
Number of points to return. Only n_points=1 is currently supported; any other value raises a NotImplementedError.
- Returns:
- list
A list with the same dimensionality as the optimization space.
- Raises:
- NotImplementedError
If n_points != 1, since returning several points at once is not implemented yet.
- expected_optimality_gap(max_tries=3, n_probabilities=50, n_space_samples=500, n_gp_samples=200, n_random_starts=100, tol=0.01, use_mean_gp=True, normalized_scores=True, random_state=None)
Estimate the expected optimality gap by repeatedly sampling functions consistent with the data.
- Parameters:
- max_tries : int, default=3
Maximum number of attempts to compute the current global optimum. A ValueError is raised if all attempts fail.
- n_probabilities : int, default=50
Number of probabilities to calculate in order to estimate the cumulative distribution function for the optimality gap.
- n_space_samples : int, default=500
Number of random samples used to cover the optimization space.
- n_gp_samples : int, default=200
Number of functions to sample from the Gaussian process.
- n_random_starts : int, default=100
Number of random positions to start the optimizer from in order to determine the global optimum.
- tol : float, default=0.01
Tolerance with which to determine the upper bound for the optimality gap.
- use_mean_gp : bool, default=True
If True, random functions will be sampled from the consensus GP, which is usually faster, but could underestimate the variability. If False, the posterior distribution over hyperparameters is used to sample different GPs and then sample functions.
- normalized_scores : bool, optional (default: True)
If True, normalize the optimality gaps by the function-specific standard deviation. This makes the optimality gaps more comparable, especially if use_mean_gp is False.
- random_state : int, RandomState instance, or None (default)
Set random state to something other than None for reproducible results.
- Returns:
- expected_gap : float
The expected optimality gap of the current global optimum with respect to randomly sampled, consistent optima.
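The n_probabilities and tol parameters suggest that the expectation is obtained from an estimated cumulative distribution function (CDF) of the gap. One standard way to do that, shown here as a sketch and not as bask's confirmed procedure, uses the identity E[G] = ∫ (1 − F(t)) dt for a nonnegative gap G with CDF F, truncated at an upper bound beyond which F is close to 1. The function name and the trapezoid rule are assumptions for illustration.

```python
import math


def expected_gap_from_cdf(cdf, upper_bound, n_probabilities=50):
    """Trapezoid-rule estimate of the integral of 1 - cdf(t) over [0, upper_bound]."""
    step = upper_bound / (n_probabilities - 1)
    thresholds = [i * step for i in range(n_probabilities)]
    # Probability that the gap exceeds each threshold.
    survival = [1.0 - cdf(t) for t in thresholds]
    return sum(
        0.5 * (survival[i] + survival[i + 1]) * step
        for i in range(n_probabilities - 1)
    )


def exponential_cdf(t):
    """CDF of an exponential distribution with mean 1, used as a toy gap distribution."""
    return 1.0 - math.exp(-t)


# The true expectation of this toy distribution is 1.
estimate = expected_gap_from_cdf(exponential_cdf, upper_bound=10.0, n_probabilities=2001)
```

In the optimizer's setting, each CDF value would correspond to one probability-of-optimality evaluation at a given threshold, which is why more probabilities give a finer but more expensive estimate.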
- optimum_intervals(hdi_prob=0.95, multimodal=True, opt_samples=200, space_samples=500, only_mean=True, random_state=None)
Estimate highest density intervals for the optimum.
Employs Thompson sampling to obtain samples from the optimum distribution. For each dimension separately, it will then estimate highest density intervals.
- Parameters:
- hdi_prob : float, default=0.95
The total probability each interval should cover.
- multimodal : bool, default=True
If True, more than one interval can be returned for one parameter.
- opt_samples : int, default=200
Number of samples to generate from the optimum distribution.
- space_samples : int, default=500
Number of samples to cover the optimization space with.
- only_mean : bool, default=True
If True, it will only sample optima from the mean Gaussian process. This is usually faster, but can underestimate the uncertainty. If False, it will also sample the hyperposterior of the kernel parameters.
- random_state : int, RandomState instance or None, optional (default: None)
Controls the random sampling. If int, random_state is the seed used by the random number generator. If RandomState instance, random_state is the random number generator. If None, the random number generator is the RandomState instance used by np.random.
- Returns:
- intervals : list of ndarray
One array of shape (n_modes, 2) for each dimension in the optimization space.
- Raises:
- NotImplementedError
If the user calls the function on an optimizer containing at least one categorical parameter.
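For readers unfamiliar with highest density intervals, the sketch below computes the simplest variant: the shortest single interval that contains a fraction hdi_prob of a set of one-dimensional samples. It corresponds to the unimodal case only; the multimodal case, where several disjoint intervals can be returned, needs a density estimate and is not covered. The function name is an assumption, and the routine bask actually uses is not stated in this section.

```python
import math


def highest_density_interval(samples, hdi_prob=0.95):
    """Return the shortest (lower, upper) interval covering hdi_prob of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    # Number of samples that must fall inside the interval.
    n_inside = max(1, math.ceil(hdi_prob * n))
    # Slide a window of n_inside consecutive samples and keep the narrowest one.
    best_start = min(
        range(n - n_inside + 1),
        key=lambda i: ordered[i + n_inside - 1] - ordered[i],
    )
    return ordered[best_start], ordered[best_start + n_inside - 1]


# Toy samples of the optimum location in one dimension.
optimum_samples = [0, 10, 11, 12, 13, 14, 50, 60, 70, 80]
interval = highest_density_interval(optimum_samples, hdi_prob=0.5)
```

Applied per dimension to samples of the optimum's location, a narrow interval indicates that the optimizer has localized the optimum well along that parameter.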
- probability_of_optimality(threshold, n_space_samples=500, n_gp_samples=200, n_random_starts=100, use_mean_gp=True, normalized_scores=True, random_state=None)
Compute the probability that the current expected optimum cannot be improved by more than threshold points.
- Parameters:
- threshold : float or list of floats
Other points have to be better than the current optimum by at least a margin of size threshold. If a list is passed, this will return a list of probabilities.
- n_space_samples : int, default=500
Number of random samples used to cover the optimization space.
- n_gp_samples : int, default=200
Number of functions to sample from the Gaussian process.
- n_random_starts : int, default=100
Number of random positions to start the optimizer from in order to determine the global optimum.
- use_mean_gp : bool, default=True
If True, random functions will be sampled from the consensus GP, which is usually faster, but could underestimate the variability. If False, the posterior distribution over hyperparameters is used to sample different GPs and then sample functions.
- normalized_scores : bool, optional (default: True)
If True, normalize the optimality gaps by the function-specific standard deviation. This makes the optimality gaps more comparable, especially if use_mean_gp is False.
- random_state : int, RandomState instance, or None (default)
Set random state to something other than None for reproducible results.
- Returns:
- probabilities : float or list of floats
Probabilities that the current optimum is optimal with respect to the given thresholds.
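The idea behind this estimate can be sketched without a GP library. Assume that a number of functions have already been sampled and evaluated on a shared set of candidate points; the probability is then the fraction of sampled functions whose own minimum beats the current optimum by no more than the threshold. The function name, the input layout, and the exact normalization are assumptions for illustration and may not match bask's implementation.

```python
def estimate_probability_of_optimality(function_samples, optimum_index,
                                       threshold, normalize=False):
    """Fraction of sampled functions whose best value improves on the
    current optimum by at most `threshold`.

    function_samples: list of lists; each inner list holds the values of one
        sampled function at the same candidate points.
    optimum_index: position of the current expected optimum among the candidates.
    """
    n_within = 0
    for values in function_samples:
        gap = values[optimum_index] - min(values)
        if normalize:
            # Divide by the standard deviation of this particular function.
            mean = sum(values) / len(values)
            std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
            gap = gap / std if std > 0 else 0.0
        if gap <= threshold:
            n_within += 1
    return n_within / len(function_samples)


# Four toy "sampled functions" evaluated at three candidate points.
samples = [
    [1.0, 0.5, 2.0],
    [0.2, 0.3, 0.9],
    [0.0, 1.0, -1.0],
    [0.4, 0.35, 0.8],
]
p = estimate_probability_of_optimality(samples, optimum_index=0, threshold=0.1)
```

Passing a list of thresholds, as the method allows, would amount to repeating this count once per threshold.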
- run(func, n_iter=1, replace=False, n_samples=5, gp_samples=100, gp_burnin=10)
Execute the ask/tell-loop on a given objective function.
- Parameters:
- func : function
The objective function to minimize. Should either return a scalar value, or a tuple (value, noise) where the noise should be a variance.
- n_iter : int, optional (default: 1)
Number of iterations to perform.
- replace : bool, optional (default: False)
If True, the existing data points will be replaced with the ones collected from now on. The existing model will be used as initialization.
- n_samples : int, optional (default: 5)
Number of hyperposterior samples over which to average the acquisition function.
- gp_samples : int, optional (default: 100)
Number of hyperposterior samples to collect during inference. More samples result in a more accurate representation of the hyperposterior, but increase the running time. Has to be a multiple of 100.
- gp_burnin : int, optional (default: 10)
Number of inference iterations to discard before collecting hyperposterior samples. Only needs to be increased if the hyperposterior after burn-in has not settled on the typical set. Drastically increases running time.
- Returns:
- scipy.optimize.OptimizeResult object
Contains the points, the values of the objective function, the search space, the random state and the list of models.
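The ask/tell loop that run executes can be sketched as a plain function over any object exposing ask() and tell(). So that the sketch runs without bask installed, it uses a random-search stand-in class that is purely hypothetical and not part of bask; a bask.Optimizer instance could presumably be passed in its place. How run forwards the noise internally is an assumption based on the (value, noise) convention described for func.

```python
import random


class RandomSearchStandIn:
    """Hypothetical stand-in with ask()/tell(); NOT part of bask."""

    def __init__(self, bounds, seed=0):
        self.bounds = bounds
        self.rng = random.Random(seed)
        self.Xi, self.yi, self.noisei = [], [], []

    def ask(self):
        # Propose a uniformly random point inside the bounds.
        return [self.rng.uniform(lo, hi) for lo, hi in self.bounds]

    def tell(self, x, y, noise_vector=None):
        # Record the observation and its noise variance (0.0 if not given).
        self.Xi.append(x)
        self.yi.append(y)
        self.noisei.append(0.0 if noise_vector is None else noise_vector)


def run_ask_tell(optimizer, func, n_iter=1):
    """Repeat ask -> evaluate -> tell for n_iter iterations."""
    for _ in range(n_iter):
        x = optimizer.ask()
        result = func(x)
        if isinstance(result, tuple):
            # Objective returned (value, noise variance).
            y, noise = result
            optimizer.tell(x, y, noise_vector=noise)
        else:
            optimizer.tell(x, result)
    return optimizer


def objective(x):
    """Toy objective returning (value, noise variance)."""
    return (x[0] - 1.0) ** 2, 0.01


opt = run_ask_tell(RandomSearchStandIn([(-2.0, 2.0)]), objective, n_iter=20)
best_y = min(opt.yi)
```

Writing the loop by hand instead of calling run is useful when evaluations happen outside the Python process, for example when each point is measured in a separate experiment.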
- tell(x, y, noise_vector=None, fit=True, replace=False, n_samples=0, gp_samples=100, gp_burnin=10, progress=False)
Inform the optimizer about the objective function at discrete points.
Provide values of the objective function at points suggested by ask() or other points. By default a new model will be fit to all observations. The new model is used to suggest the next point at which to evaluate the objective. This point can be retrieved by calling ask(). To add observations without fitting a new model, set fit to False. To add multiple observations in a batch, pass a list of lists for x and a list of scalars for y.
- Parameters:
- x : list or list of lists
Point(s) at which the objective function was evaluated.
- y : scalar or list
Value(s) of the objective function at x.
- noise_vector : list, default=None
Variance(s) of the objective function at x.
- fit : bool, optional (default: True)
If True, a model will be fitted to the points once n_initial_points points have been evaluated.
- replace : bool, optional (default: False)
If True, the existing data points will be replaced with the ones given in x and y.
- n_samples : int, optional (default: 0)
Number of hyperposterior samples over which to average the acquisition function. More samples make the acquisition function more robust, but increase the running time. Can be set to 0 for pvrs and vr.
- gp_samples : int, optional (default: 100)
Number of hyperposterior samples to collect during inference. More samples result in a more accurate representation of the hyperposterior, but increase the running time. Has to be a multiple of 100.
- gp_burnin : int, optional (default: 10)
Number of inference iterations to discard before collecting hyperposterior samples. Only needs to be increased if the hyperposterior after burn-in has not settled on the typical set. Drastically increases running time.
- progress : bool, optional (default: False)
If True, show a progress bar during the inference phase.
- Returns:
- scipy.optimize.OptimizeResult object
Contains the points, the values of the objective function, the search space, the random state and the list of models.