Bayesian optimization, grid search, and random search are three common ways to tune hyperparameters. They all answer the same question: which parameter values produce the best result? The difference is how they choose what to try next.

Short answer: use grid search when the search space is tiny and you need a simple baseline, random search when only a few parameters matter and you want a stronger baseline, and Bayesian optimization when each trial is expensive and you want the search to learn from previous results.

Quick comparison

Method	How it searches	Best for	Main weakness
Grid search	Tests every point in a fixed grid	Tiny spaces, reproducible baselines, simple demos	Wastes trials on unimportant dimensions; trial count grows exponentially with the number of parameters
Random search	Samples points randomly from the space	Broad early exploration and cheap-to-medium trials	Does not intentionally focus on promising regions
Bayesian optimization	Builds a model of the objective and chooses informative next trials	Expensive training, backtests, simulations, and black-box optimization	More moving parts than a simple baseline

The visual idea

Imagine you are tuning two parameters: learning_rate and max_depth for a model, or lookback_window and risk_multiplier for a trading strategy. In the diagrams below, each x is one trial.

Grid search

Grid search places trials on a fixed lattice:

max_depth
  ^
  |  x     x     x     x
  |
  |  x     x     x     x
  |
  |  x     x     x     x
  |
  |  x     x     x     x
  +------------------------> learning_rate

This is easy to understand and easy to reproduce. If you define four values for parameter A and four values for parameter B, grid search runs 16 trials.

The problem is scale. With k values for each of d parameters, a full grid has k^d combinations, so the trial count grows exponentially as parameters are added:

Parameters	Values per parameter	Grid trials
2	10	100
4	10	10,000
6	10	1,000,000

Grid search also crosses every value of each parameter with every value of the others, so a parameter that barely affects the metric still multiplies the trial count.
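The growth shown in the table can be checked with a few lines of Python. This is a minimal sketch: the helper name grid_size and the candidate values for learning_rate and max_depth are assumptions made for illustration, not part of any particular tool.

```python
import itertools
import math


def grid_size(values_per_parameter):
    """Number of trials in a full grid: the product of the per-parameter counts."""
    return math.prod(values_per_parameter)


# Reproduce the table: 2, 4, and 6 parameters with 10 values each.
for n_params in (2, 4, 6):
    print(n_params, grid_size([10] * n_params))

# For a small grid, itertools.product enumerates every combination explicitly.
# These candidate values are illustrative only.
learning_rates = [0.001, 0.01, 0.1, 1.0]
max_depths = [2, 4, 6, 8]
grid = list(itertools.product(learning_rates, max_depths))
print(len(grid))  # 16 trials, matching the 4 x 4 example
```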

Random search

Random search samples from the space without following a fixed grid:

max_depth
  ^
  |      x        x
  |  x
  |            x       x
  |      x
  | x              x
  |          x
  +------------------------> learning_rate

Random search often beats grid search when only a subset of parameters really matters. Instead of reusing a fixed set of values per dimension, every trial draws a fresh value for every parameter. With the same budget, that gives it more chances to find useful values for the important parameters.

For example, if learning_rate matters much more than max_depth, a grid can waste many trials repeating the same few learning_rate values. Random search can cover more distinct learning_rate values with the same budget.
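The coverage argument above can be made concrete by counting distinct learning_rate values under an equal budget. The sketch below is illustrative only: the value lists, the log-uniform range for learning_rate, and the fixed seed are all assumptions.

```python
import itertools
import random

BUDGET = 16  # same trial budget for both methods

# Grid: 4 values per parameter gives 16 trials, but only 4 distinct learning rates.
grid_lr = [0.001, 0.01, 0.1, 1.0]
grid_depth = [2, 4, 6, 8]
grid_trials = list(itertools.product(grid_lr, grid_depth))

# Random: every trial draws a fresh learning rate (log-uniform) and depth.
rng = random.Random(0)  # fixed seed so the run is reproducible
random_trials = [
    (10 ** rng.uniform(-3, 0), rng.randint(2, 8)) for _ in range(BUDGET)
]

distinct_grid_lr = len({lr for lr, _ in grid_trials})
distinct_random_lr = len({lr for lr, _ in random_trials})
print(distinct_grid_lr, distinct_random_lr)
```

The grid always reports 4 distinct learning rates here. The random sample draws continuous values, so repeats are extremely unlikely and nearly every trial contributes a new one.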

Bayesian optimization

Bayesian optimization starts with exploration, then concentrates trials where the results look promising:

max_depth
  ^
  |  x       x
  |
  |             x  x
  |           x  *  x
  |             x  x
  |  x
  +------------------------> learning_rate

The * represents a promising region. Bayesian optimization does not know the best point in advance. It builds a probabilistic model from completed trials, estimates where good results may be, and chooses the next trial by balancing:

Exploration: try uncertain regions because they might contain better results.
Exploitation: try near known strong regions because they are likely to perform well.

That learning loop is why Bayesian optimization is useful for expensive black-box optimization: model training runs, trading backtests, simulations, batch jobs, and other workloads where every trial costs time or money.
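To make the loop tangible, here is a minimal one-dimensional sketch in plain Python. It assumes one common design: a Gaussian-process surrogate with a squared-exponential kernel and an upper-confidence-bound rule for picking the next trial. Other surrogates and acquisition rules exist, and nothing here describes how any specific product implements it. The objective function, the kernel length scale, and the kappa weight are hypothetical choices for illustration.

```python
import math
import random


def rbf(a, b, length_scale=0.15):
    """Squared-exponential kernel: nearby inputs get similar predictions."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def posterior(xs, ys, x_new, noise=1e-6):
    """Gaussian-process posterior mean and standard deviation at x_new."""
    k_matrix = [[rbf(a, b) + (noise if i == j else 0.0)
                 for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_vec = [rbf(a, x_new) for a in xs]
    mean_y = sum(ys) / len(ys)  # constant prior mean set to the observed average
    alpha = solve(k_matrix, [y - mean_y for y in ys])
    weights = solve(k_matrix, k_vec)
    mean = mean_y + sum(k * a for k, a in zip(k_vec, alpha))
    var = max(1.0 - sum(k * w for k, w in zip(k_vec, weights)), 0.0)
    return mean, math.sqrt(var)


def objective(x):
    """Hypothetical expensive trial; the optimizer only sees its output."""
    return -(x - 0.7) ** 2


def bayes_opt(n_initial=3, n_iterations=12, kappa=2.0, seed=0):
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_initial)]  # initial exploration
    ys = [objective(x) for x in xs]
    candidates = [i / 100 for i in range(101)]
    for _ in range(n_iterations):
        def ucb(x):
            mean, std = posterior(xs, ys, x)
            return mean + kappa * std  # exploitation term + exploration term
        x_next = max(candidates, key=ucb)
        xs.append(x_next)
        ys.append(objective(x_next))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]


best_x, best_y = bayes_opt()
print(best_x, best_y)
```

The sketch refits the surrogate for every candidate, which is wasteful but keeps the code short. A larger kappa pushes the search toward uncertain regions; a smaller one keeps it near the best results seen so far.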

How each method chooses the next trial

Question	Grid search	Random search	Bayesian optimization
Does it use previous results?	No	No	Yes
Can it stop early with useful information?	Partly; an unfinished grid covers the space unevenly	Yes; any prefix of the samples is still a valid random search	Yes; completed trials already steer later ones
Does it handle continuous ranges naturally?	Only after discretizing	Yes	Yes
Is it easy to parallelize?	Yes	Yes	Yes, with batch selection
Is it sample-efficient?	Low	Medium	High

When to use grid search

Use grid search when the search space is small enough that exhaustive testing is practical.

Good fit:

You have two or three parameters with a handful of values each.
You need a deterministic baseline.
You want to validate that your objective metric and trial runner work.
You are tuning categorical choices with few combinations.

Avoid grid search when the number of parameters grows, when ranges are continuous, or when each trial is expensive. In those cases, the number of trials is dictated by the size of the grid rather than by what you can afford to run.

When to use random search

Use random search when you want a simple, strong baseline across a large space.

Good fit:

You do not yet know which parameters matter.
Trials are cheap enough to run many samples.
You want broad coverage before using a smarter optimizer.
Your parameter space includes continuous ranges.

Random search is especially useful as a sanity check. If a more complex optimizer cannot beat a well-run random search baseline, the issue may be the objective, the search space, the trial budget, or noise in the measurements.

When to use Bayesian optimization

Use Bayesian optimization when you want better results from a limited trial budget.

Good fit:

Each trial is slow, expensive, or operationally limited.
You are optimizing a black-box function: you can run a trial and observe a metric, but you do not have gradients.
You care about finding strong configurations with fewer wasted runs.
You want the optimizer to adapt as results come in.

This is the typical case for hyperparameter optimization as a service. With HyperOptimizer, each trial runs your workload in a Docker container. You define the search space and objective metric; the platform chooses parameter sets with optimization algorithms such as Bayesian optimization, collects metrics from stdout, and shows ranked results in the dashboard.

Practical example: tuning a trading strategy

Suppose a trading strategy has these parameters:

Parameter	Range
`lookback_window`	10 to 200
`atr_multiplier`	0.5 to 5.0
`stop_loss_bps`	5 to 50
`max_position_size`	0.01 to 0.10

A grid with 20 values per parameter would require 160,000 backtests. If one backtest takes two minutes, that is more than 222 days of serial compute.
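The arithmetic behind those figures, as a short check. The variable names are chosen for this illustration.

```python
values_per_parameter = 20
n_parameters = 4
minutes_per_backtest = 2

grid_trials = values_per_parameter ** n_parameters
total_minutes = grid_trials * minutes_per_backtest
total_days = total_minutes / (60 * 24)

print(grid_trials)           # 160000
print(round(total_days, 1))  # 222.2
```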

Random search can sample 200 or 1,000 combinations without enumerating everything. Bayesian optimization can go further by learning from early backtests and spending later trials in regions that score well on the chosen objective, for example lower drawdown at an acceptable return, or a higher Sharpe ratio under realistic cost assumptions.
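As a sketch of the random-search option, the snippet below draws 200 parameter sets from the ranges in the table. Treating lookback_window and stop_loss_bps as integers, sampling uniformly, and the SEARCH_SPACE layout itself are assumptions; the table only gives numeric ranges.

```python
import random

# Ranges from the parameter table. The "int"/"float" kinds are assumptions.
SEARCH_SPACE = {
    "lookback_window": ("int", 10, 200),
    "atr_multiplier": ("float", 0.5, 5.0),
    "stop_loss_bps": ("int", 5, 50),
    "max_position_size": ("float", 0.01, 0.10),
}


def sample_params(rng):
    """Draw one parameter set uniformly from SEARCH_SPACE."""
    params = {}
    for name, (kind, low, high) in SEARCH_SPACE.items():
        if kind == "int":
            params[name] = rng.randint(low, high)  # inclusive on both ends
        else:
            params[name] = rng.uniform(low, high)
    return params


rng = random.Random(42)  # fixed seed so the sample is reproducible
trials = [sample_params(rng) for _ in range(200)]
print(len(trials), trials[0])
```

Each dictionary in trials would then be passed to one backtest, and the results ranked by the objective metric.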

Choosing the right method

Use this rule of thumb:

Your situation	Recommended method
Tiny, discrete search space	Grid search
Cheap trials and unknown parameter importance	Random search
Expensive trials and limited budget	Bayesian optimization
Need a baseline before a smarter optimizer	Random search
Need every listed combination tested exactly	Grid search
Need adaptive, sample-efficient search	Bayesian optimization

Common mistakes

Mistake 1: making the grid too fine. A fine grid feels precise, but it can spend thousands of trials on values that do not matter.

Mistake 2: comparing methods with different budgets. If grid search gets 10,000 trials and Bayesian optimization gets 100, the comparison measures the budget, not the algorithm.

Mistake 3: optimizing the wrong metric. Better search cannot fix an objective that rewards overfitting, ignores costs, or fails to penalize risk.

Mistake 4: skipping holdout validation. The best parameter set on one dataset may not generalize. Always validate strong configurations on data or scenarios not used for selection.
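A toy simulation shows why the holdout step matters. Everything in this sketch is hypothetical: noisy_score stands in for a real metric whose true quality peaks at 0.5, and the noise level is arbitrary.

```python
import random

rng = random.Random(7)


def noisy_score(param, rng):
    """Hypothetical metric: true quality peaks at param = 0.5, plus noise."""
    return -(param - 0.5) ** 2 + rng.gauss(0, 0.05)


candidates = [i / 20 for i in range(21)]

# Selection: score every candidate on the data used for tuning.
selection_scores = {p: noisy_score(p, rng) for p in candidates}
best = max(selection_scores, key=selection_scores.get)

# Holdout: re-score only the winner on data not used for selection.
holdout_score = noisy_score(best, rng)
print(best, selection_scores[best], holdout_score)
```

Because the winner is picked for having the highest noisy score, its selection score tends to overstate its true quality. The holdout score is a fresh measurement and is not biased by the selection step.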

FAQ

Is Bayesian optimization always better than random search?

No. Bayesian optimization is usually more sample-efficient when trials are expensive, but random search is simpler and can be very competitive when trials are cheap, noisy, or highly parallel.

Is grid search obsolete?

No. Grid search is still useful for small search spaces, deterministic baselines, and testing that an optimization workflow is wired correctly.

Why does random search often beat grid search?

With the same budget, random search tests more distinct values of each important parameter. Grid search reuses the same few values per parameter in every combination, which wastes trials when only a few parameters strongly affect the objective.

What is the best method for hyperparameter optimization?

For most expensive black-box workloads, Bayesian optimization is a strong default once a random-search baseline is in place. For tiny spaces, grid search is often enough.

Can these methods run in parallel?

Yes. Grid search and random search are naturally parallel. Bayesian optimization can also run parallel trials by selecting batches of promising and informative parameter sets.
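For the independent case, a random search can be spread across workers with the standard library. This is a sketch under assumptions: run_trial is a hypothetical stand-in for a real trial, and the worker count is arbitrary.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_trial(params):
    """Hypothetical stand-in for a real trial; returns (params, score)."""
    x, y = params
    return params, -((x - 0.3) ** 2 + (y - 0.6) ** 2)


rng = random.Random(1)
param_sets = [(rng.random(), rng.random()) for _ in range(32)]

# Trials do not depend on each other, so they can run in any order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_trial, param_sets))

best_params, best_score = max(results, key=lambda r: r[1])
print(best_params, best_score)
```

Threads suit trials that mostly wait on external work such as subprocesses or remote jobs. For CPU-bound Python trials, ProcessPoolExecutor from the same module is usually the better fit.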

Next steps

If your workload can accept parameters from the command line and print objective metrics, it can usually be optimized. Start with a small random-search or grid-search baseline, confirm your metric behaves correctly, then move to Bayesian optimization when trial budget matters.
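A trial script that fits this description might look like the sketch below. The flag names, the run_backtest placeholder, and the JSON line printed to stdout are all assumptions for illustration; check your optimizer's documentation for the metric format it actually expects.

```python
import argparse
import json


def run_backtest(lookback_window, atr_multiplier):
    """Hypothetical placeholder for the real workload; returns one score."""
    return -abs(lookback_window - 50) / 100 - abs(atr_multiplier - 2.0)


def main(argv=None):
    parser = argparse.ArgumentParser(description="Run one trial.")
    parser.add_argument("--lookback-window", type=int, required=True)
    parser.add_argument("--atr-multiplier", type=float, required=True)
    args = parser.parse_args(argv)  # argv=None falls back to sys.argv

    score = run_backtest(args.lookback_window, args.atr_multiplier)
    # One machine-readable line on stdout; the JSON shape is an assumption.
    line = json.dumps({"objective": score})
    print(line)
    return line


# Demonstration call with explicit arguments. A real entrypoint would call
# main() with no arguments so the values come from the command line.
output = main(["--lookback-window", "60", "--atr-multiplier", "2.5"])
```

Keeping the workload behind a single main function with explicit flags makes the same script usable for a manual run, a baseline sweep, or a managed optimizer.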

HyperOptimizer is built for that workflow: package your model, backtest, simulation, or data pipeline as a Docker container, define the search space, and let managed optimization run the trials. Read the getting started guide or join the beta when you are ready to optimize without managing the infrastructure yourself.