bask.Optimizer

class bask.Optimizer(dimensions, n_points=500, n_initial_points=10, init_strategy='sb', gp_kernel=None, gp_kwargs=None, gp_priors=None, acq_func='pvrs', acq_func_kwargs=None, random_state=None, **kwargs)

Execute a stepwise Bayesian optimization.

Parameters:
dimensions : list, shape (n_dims,)

List of search space dimensions. Each search dimension can be defined either as

  • a (lower_bound, upper_bound) tuple (for Real or Integer dimensions),

  • a (lower_bound, upper_bound, “prior”) tuple (for Real dimensions),

  • as a list of categories (for Categorical dimensions), or

  • an instance of a Dimension object (Real, Integer or Categorical).
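As a rough illustration of the plain-Python forms listed above, the sketch below builds a dimensions list and labels each entry. The helper describe_dimension is hypothetical and not part of bask. The rule that a pair of ints means an Integer dimension and anything else a Real dimension follows the scikit-optimize convention and is an assumption here, as is the "log-uniform" prior name.

```python
def describe_dimension(spec):
    """Label a dimension spec (hypothetical helper, for illustration only)."""
    if isinstance(spec, tuple) and len(spec) == 3:
        return "real with prior"
    if isinstance(spec, tuple) and len(spec) == 2:
        # Assumption: two int bounds denote an Integer dimension, otherwise Real.
        if all(isinstance(bound, int) for bound in spec):
            return "integer"
        return "real"
    if isinstance(spec, list):
        return "categorical"
    return "dimension object"


dimensions = [
    (0.0, 1.0),                   # (lower_bound, upper_bound)
    (1, 8),                       # (lower_bound, upper_bound) with int bounds
    (1e-4, 1e-1, "log-uniform"),  # (lower_bound, upper_bound, prior)
    ["relu", "tanh", "elu"],      # list of categories
]

kinds = [describe_dimension(spec) for spec in dimensions]
```

A list like this would be passed as the dimensions argument. Note that optimum_intervals does not support categorical parameters.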

n_points : int, default=500

Number of random points to evaluate the acquisition function on.

n_initial_points : int, default=10

Number of initial points to sample before fitting the GP.

init_strategy : string or None, default="sb"

Type of initialization strategy to use for the initial n_initial_points. Should be one of

  • “sb”: The Steinerberger low-discrepancy sequence

  • “r2”: The R2 sequence (works well for up to two parameters)

  • “random” or None: Uniform random sampling
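To give a feel for what a low-discrepancy initialization looks like, here is a minimal sketch of the published R2 construction, which is based on a generalized golden ratio. It is not taken from bask; the function names and the seed offset of 0.5 are assumptions made for illustration.

```python
def generalized_golden_ratio(n_dims, n_iter=60):
    """Positive root of x**(n_dims + 1) = x + 1, found by fixed-point iteration."""
    x = 2.0
    for _ in range(n_iter):
        x = (1.0 + x) ** (1.0 / (n_dims + 1))
    return x


def r2_sequence(n_points, n_dims, seed=0.5):
    """Return n_points quasi-random points in the unit hypercube [0, 1)^n_dims."""
    phi = generalized_golden_ratio(n_dims)
    # One irrational step size per dimension: 1/phi, 1/phi**2, ...
    alphas = [(1.0 / phi) ** (j + 1) for j in range(n_dims)]
    return [
        [(seed + (i + 1) * alpha) % 1.0 for alpha in alphas]
        for i in range(n_points)
    ]


points = r2_sequence(n_points=10, n_dims=2)
```

The points lie in the unit hypercube, so they would still need to be mapped onto the bounds of the actual search space.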

gp_kernel : kernel object or None, default=None

The kernel specifying the covariance function of the GP. If None is passed, a suitable default kernel is constructed. Note that the kernel’s hyperparameters are estimated using MCMC during fitting.

gp_kwargs : dict, optional

Dict of arguments passed to BayesGPR. For example, {'normalize_y': True} would allow the GP to normalize the output values before fitting.

gp_priors : list of callables, optional

List of prior distributions for the kernel hyperparameters of the GP. Each callable returns the logpdf of the prior distribution. Remember that a WhiteKernel is added to the gp_kernel, which is why you need to include a prior distribution for that as well. If None, will try to guess suitable prior distributions.
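A list of log-density callables could be assembled as in the sketch below. The kernel layout (one signal variance, one length scale, plus the WhiteKernel noise level), the ordering, and the normal priors with these particular means and scales are all assumptions for illustration. The parameterization in which the hyperparameters are passed to the callables is not specified here and should be checked before use.

```python
import math


def normal_logpdf(mean, std):
    """Return a callable that evaluates the log-density of N(mean, std**2)."""
    def logpdf(x):
        z = (x - mean) / std
        return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)
    return logpdf


# Hypothetical layout: one prior per kernel hyperparameter, noise last.
gp_priors = [
    normal_logpdf(0.0, 2.0),   # signal variance (assumed)
    normal_logpdf(0.0, 1.0),   # length scale (assumed)
    normal_logpdf(-4.0, 2.0),  # noise level of the added WhiteKernel (assumed)
]
```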

acq_func : string or Acquisition object, default="pvrs"

Acquisition function to use as a criterion to select new points to test. By default we use “pvrs”, which is a very robust criterion with fast convergence. Should be one of

  • ‘pvrs’ Predictive variance reduction search

  • ‘mes’ Max-value entropy search

  • ‘ei’ Expected improvement

  • ‘ttei’ Top-two expected improvement

  • ‘lcb’ Lower confidence bound

  • ‘mean’ Expected value of the GP

  • ‘ts’ Thompson sampling

  • ‘vr’ Global variance reduction

Can also be a custom Acquisition object.
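For orientation, two of the simpler criteria can be written down from their textbook definitions for a minimization problem. This sketch is not bask's implementation: the function names, the kappa parameter and its default are assumptions, and the inputs are the GP posterior mean and standard deviation at a single candidate point.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lower_confidence_bound(mean, std, kappa=1.96):
    """Optimistic estimate of the objective; smaller values are more promising."""
    return mean - kappa * std


def expected_improvement(mean, std, y_best):
    """Expected amount by which a point improves on the best value y_best."""
    if std <= 0.0:
        return max(y_best - mean, 0.0)
    z = (y_best - mean) / std
    return (y_best - mean) * normal_cdf(z) + std * normal_pdf(z)


ei = expected_improvement(mean=0.0, std=1.0, y_best=0.0)
lcb = lower_confidence_bound(mean=1.0, std=0.5, kappa=2.0)
```

Both criteria trade off a low predicted mean against high predictive uncertainty, which is what drives exploration.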

acq_func_kwargs : dict, optional

Dict of arguments passed to Acquisition.

random_state : int or RandomState or None, default=None

Pseudo random number generator state used for random uniform sampling from lists of possible values instead of scipy.stats distributions.

Attributes:
Xi : list

Points at which objective has been evaluated.

yi : list

Values of objective at corresponding points in Xi.

space : Space

An instance of skopt.space.Space. Stores parameter search space used to sample points, bounds, and type of parameters.

gp : BayesGPR object

The current underlying GP model, which is used to calculate the acquisition function.

gp_priors : list of callables

List of prior distributions for the kernel hyperparameters of the GP. Each callable returns the logpdf of the prior distribution.

n_initial_points_ : int

Number of initial points to sample.

noisei : list of floats

Additional pointwise noise which is added to the diagonal of the kernel matrix.

Methods

ask([n_points])

Ask the optimizer for the next point to evaluate.

expected_optimality_gap([max_tries, ...])

Estimate the expected optimality gap by repeatedly sampling functions consistent with the data.

optimum_intervals([hdi_prob, multimodal, ...])

Estimate highest density intervals for the optimum.

probability_of_optimality(threshold[, ...])

Compute the probability that the current expected optimum cannot be improved by more than threshold points.

run(func[, n_iter, replace, n_samples, ...])

Execute the ask/tell-loop on a given objective function.

tell(x, y[, noise_vector, fit, replace, ...])

Inform the optimizer about the objective function at discrete points.

__init__(dimensions, n_points=500, n_initial_points=10, init_strategy='sb', gp_kernel=None, gp_kwargs=None, gp_priors=None, acq_func='pvrs', acq_func_kwargs=None, random_state=None, **kwargs)

ask(n_points=1)

Ask the optimizer for the next point to evaluate.

If the optimizer is still in its initialization phase, it will return a point as specified by the init_strategy. If the Gaussian process has been fit, a previously computed point is returned, namely the one selected by the acquisition function when the model was last fit.

Parameters:
n_points : int, default=1

Number of points to return. Only n_points=1 is currently supported; any other value raises a NotImplementedError.

Returns:
list

A list with the same dimensionality as the optimization space.

Raises:
NotImplementedError

If n_points != 1, which is not implemented yet.

expected_optimality_gap(max_tries=3, n_probabilities=50, n_space_samples=500, n_gp_samples=200, n_random_starts=100, tol=0.01, use_mean_gp=True, normalized_scores=True, random_state=None)

Estimate the expected optimality gap by repeatedly sampling functions consistent with the data.

Parameters:
max_tries : int, default=3

Maximum number of attempts to compute the current global optimum. Raises a ValueError if all attempts fail.

n_probabilities : int, default=50

Number of probabilities to calculate in order to estimate the cumulative distribution function for the optimality gap.

n_space_samples : int, default=500

Number of random samples used to cover the optimization space.

n_gp_samples : int, default=200

Number of functions to sample from the Gaussian process.

n_random_starts : int, default=100

Number of random positions to start the optimizer from in order to determine the global optimum.

tol : float, default=0.01

Tolerance with which to determine the upper bound for the optimality gap.

use_mean_gp : bool, default=True

If True, random functions will be sampled from the consensus GP, which is usually faster, but could underestimate the variability. If False, the posterior distribution over hyperparameters is used to sample different GPs and then sample functions.

normalized_scores : bool, optional (default: True)

If True, normalize the optimality gaps by the function specific standard deviation. This makes the optimality gaps more comparable, especially if use_mean_gp is False.

random_state : int, RandomState instance, or None (default)

Set random state to something other than None for reproducible results.

Returns:
expected_gap : float

The expected optimality gap of the current global optimum with respect to randomly sampled, consistent optima.
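The idea of an optimality gap can be sketched with a plain Monte Carlo average. In this simplified illustration each sampled function is represented by its values on a shared set of candidate points, and the gap is how much worse the current optimum is than that sample's own minimum. It does not reproduce the estimate controlled by n_probabilities and tol, nor the normalization. The function name and the toy data are assumptions.

```python
def mean_optimality_gap(sampled_functions, current_index):
    """Average gap between the current optimum and each sample's own minimum.

    sampled_functions: list of lists, one list of function values per sample,
    all evaluated on the same candidate points.
    current_index: position of the current optimum among the candidates.
    """
    gaps = [values[current_index] - min(values) for values in sampled_functions]
    return sum(gaps) / len(gaps)


# Three toy samples evaluated at three candidate points (made-up numbers).
sampled_functions = [
    [0.2, 0.0, 0.5],
    [0.1, 0.3, 0.4],
    [0.0, 0.1, 0.1],
]
gap = mean_optimality_gap(sampled_functions, current_index=0)
```

A gap close to zero suggests that few of the functions consistent with the data have a noticeably better optimum elsewhere.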

optimum_intervals(hdi_prob=0.95, multimodal=True, opt_samples=200, space_samples=500, only_mean=True, random_state=None)

Estimate highest density intervals for the optimum.

Employs Thompson sampling to obtain samples from the optimum distribution. For each dimension separately, it will then estimate highest density intervals.

Parameters:
hdi_prob : float, default=0.95

The total probability each interval should cover.

multimodal : bool, default=True

If True, more than one interval can be returned for one parameter.

opt_samples : int, default=200

Number of samples to generate from the optimum distribution.

space_samples : int, default=500

Number of samples to cover the optimization space with.

only_mean : bool, default=True

If True, it will only sample optima from the mean Gaussian process. This is usually faster, but can underestimate the uncertainty. If False, it will also sample the hyperposterior of the kernel parameters.

random_state : int, RandomState instance or None, optional (default: None)

Controls the random number generation used for sampling. If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.

Returns:
intervals : list of ndarray

One array of shape (n_modes, 2) for each dimension in the optimization space.

Raises:
NotImplementedError

If the user calls the function on an optimizer containing at least one categorical parameter.
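For a single dimension and a single mode, a highest density interval can be approximated from samples as the narrowest window that contains the requested fraction of them. The sketch below shows that idea only. It is not bask's implementation, does not handle the multimodal case, and the function name is an assumption.

```python
import math


def highest_density_interval(samples, hdi_prob=0.95):
    """Return the narrowest (low, high) interval covering hdi_prob of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    n = len(ordered)
    n_inside = max(1, math.ceil(hdi_prob * n))
    best = (ordered[0], ordered[n_inside - 1])
    # Slide a window of n_inside consecutive samples and keep the narrowest.
    for start in range(1, n - n_inside + 1):
        candidate = (ordered[start], ordered[start + n_inside - 1])
        if candidate[1] - candidate[0] < best[1] - best[0]:
            best = candidate
    return best


samples = [0, 1, 2, 3, 4, 5, 6, 7, 8, 100]
interval = highest_density_interval(samples, hdi_prob=0.9)
```

In this toy data set the outlier at 100 is excluded, because the window from 0 to 8 already covers nine of the ten samples.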

probability_of_optimality(threshold, n_space_samples=500, n_gp_samples=200, n_random_starts=100, use_mean_gp=True, normalized_scores=True, random_state=None)

Compute the probability that the current expected optimum cannot be improved by more than threshold points.

Parameters:
threshold : float or list-of-floats

Other points have to be better than the current optimum by at least a margin of size threshold. If a list is passed, this will return a list of probabilities.

n_space_samples : int, default=500

Number of random samples used to cover the optimization space.

n_gp_samples : int, default=200

Number of functions to sample from the Gaussian process.

n_random_starts : int, default=100

Number of random positions to start the optimizer from in order to determine the global optimum.

use_mean_gp : bool, default=True

If True, random functions will be sampled from the consensus GP, which is usually faster, but could underestimate the variability. If False, the posterior distribution over hyperparameters is used to sample different GPs and then sample functions.

normalized_scores : bool, optional (default: True)

If True, normalize the optimality gaps by the function specific standard deviation. This makes the optimality gaps more comparable, especially if use_mean_gp is False.

random_state : int, RandomState instance, or None (default)

Set random state to something other than None for reproducible results.

Returns:
probabilities : float or list-of-floats

Probabilities that the current optimum is optimal with respect to the given thresholds.
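One way to read the returned probabilities is as the fraction of sampled functions in which no other point beats the current optimum by more than the threshold. The sketch below computes that fraction from a list of per-sample gaps (how much better the sample's own minimum is than the current optimum). The function name and the toy gaps are assumptions, not bask internals.

```python
def fraction_within_threshold(gaps, thresholds):
    """For each threshold, return the share of gaps that do not exceed it."""
    return [
        sum(1 for gap in gaps if gap <= threshold) / len(gaps)
        for threshold in thresholds
    ]


# Made-up gaps from four sampled functions.
gaps = [0.0, 0.0, 0.2, 0.5]
probabilities = fraction_within_threshold(gaps, [0.1, 0.3, 1.0])
```

Larger thresholds are easier to satisfy, so the probabilities never decrease as the threshold grows.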

run(func, n_iter=1, replace=False, n_samples=5, gp_samples=100, gp_burnin=10)

Execute the ask/tell-loop on a given objective function.

Parameters:
func : function

The objective function to minimize. Should either return a scalar value, or a tuple (value, noise) where the noise should be a variance.

n_iter : int, optional (default: 1)

Number of iterations to perform.

replace : bool, optional (default: False)

If True, the existing data points will be replaced with the ones collected from now on. The existing model will be used as initialization.

n_samples : int, optional (default: 5)

Number of hyperposterior samples over which to average the acquisition function.

gp_samples : int, optional (default: 100)

Number of hyperposterior samples to collect during inference. More samples result in a more accurate representation of the hyperposterior, but increase the running time. Has to be a multiple of 100.

gp_burnin : int, optional (default: 10)

Number of inference iterations to discard before collecting hyperposterior samples. Only needs to be increased if the hyperposterior has not settled on the typical set after burn-in. Increasing it drastically increases running time.

Returns:
scipy.optimize.OptimizeResult object

Contains the points, the values of the objective function, the search space, the random state and the list of models.
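An objective that returns a (value, noise) tuple might look like the sketch below, which averages repeated noisy measurements and reports the variance of that average. The quadratic test function, the noise level, the number of repeats and the factory name are all assumptions for illustration.

```python
import random
import statistics


def make_noisy_objective(n_repeats=20, seed=0):
    """Build an objective that returns (mean value, variance of the mean)."""
    rng = random.Random(seed)

    def objective(x):
        true_value = sum((xi - 0.3) ** 2 for xi in x)
        measurements = [
            true_value + rng.gauss(0.0, 0.1) for _ in range(n_repeats)
        ]
        value = statistics.mean(measurements)
        # The noise is a variance: sample variance divided by the sample size.
        noise = statistics.variance(measurements) / n_repeats
        return value, noise

    return objective


objective = make_noisy_objective()
value, noise = objective([0.5, 0.1])
```

Such a function could be passed as func. Returning only the scalar value is the alternative when no noise estimate is available.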

tell(x, y, noise_vector=None, fit=True, replace=False, n_samples=0, gp_samples=100, gp_burnin=10, progress=False)

Inform the optimizer about the objective function at discrete points.

Provide values of the objective function at points suggested by ask() or other points. By default a new model will be fit to all observations. The new model is used to suggest the next point at which to evaluate the objective. This point can be retrieved by calling ask(). To add observations without fitting a new model set fit to False. To add multiple observations in a batch pass a list-of-lists for x and a list of scalars for y.

Parameters:
x : list or list of lists

Point(s) at which the objective function was evaluated.

y : scalar or list

Value(s) of the objective function at x.

noise_vector : list, default=None

Variance(s) of the objective function at x.

fit : bool, optional (default: True)

If True, a model will be fitted to the points once n_initial_points points have been evaluated.

replace : bool, optional (default: False)

If True, the existing data points will be replaced with the ones given in x and y.

n_samples : int, optional (default: 0)

Number of hyperposterior samples over which to average the acquisition function. More samples make the acquisition function more robust, but increase the running time. Can be set to 0 for pvrs and vr.

gp_samples : int, optional (default: 100)

Number of hyperposterior samples to collect during inference. More samples result in a more accurate representation of the hyperposterior, but increase the running time. Has to be a multiple of 100.

gp_burnin : int, optional (default: 10)

Number of inference iterations to discard before collecting hyperposterior samples. Only needs to be increased if the hyperposterior has not settled on the typical set after burn-in. Increasing it drastically increases running time.

progress : bool, optional (default: False)

If True, show a progress bar during the inference phase.

Returns:
scipy.optimize.OptimizeResult object

Contains the points, the values of the objective function, the search space, the random state and the list of models.
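The ask/tell protocol described above can be driven by a loop like the one sketched here. The loop only relies on ask(), tell(x, y) and the Xi and yi attributes. So that the snippet runs on its own, it uses RandomStandIn, a hypothetical stand-in that samples uniformly and is not bask.Optimizer.

```python
import random


def ask_tell_loop(opt, objective, n_iter):
    """Alternate between asking for a point and reporting its objective value."""
    for _ in range(n_iter):
        x = opt.ask()
        y = objective(x)
        opt.tell(x, y)
    return opt.Xi, opt.yi


class RandomStandIn:
    """Hypothetical stand-in with the same ask/tell shape (not bask.Optimizer)."""

    def __init__(self, bounds, seed=0):
        self.bounds = bounds
        self.rng = random.Random(seed)
        self.Xi = []
        self.yi = []

    def ask(self):
        return [self.rng.uniform(low, high) for low, high in self.bounds]

    def tell(self, x, y):
        self.Xi.append(x)
        self.yi.append(y)


opt = RandomStandIn(bounds=[(0.0, 1.0), (0.0, 1.0)])
Xi, yi = ask_tell_loop(opt, lambda x: sum(v * v for v in x), n_iter=5)
```

With a real optimizer, each tell call with fit left at True may refit the model, so the time per iteration is dominated by inference rather than by the loop itself.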