๐Ÿ Python & library/Etc.

[Optuna] Optimizing Deep Learning Hyperparameters

복만 · 2023. 12. 10. 14:17

 

 

Optuna is a Python-based hyperparameter optimization framework that provides a simple and flexible API. This post briefly introduces Optuna's main features and how to use them.

 

 

๊ณต์‹ Docs: https://optuna.readthedocs.io/en/stable/index.html

 


 

Basic concepts

Optuna defines a study and a trial as follows.

 

  • Study: a single optimization project based on an objective function
  • Trial: a single execution of optimization within a study

 

Hyperparameter optimization์„ ์ˆ˜ํ–‰ํ•˜๊ธฐ ์œ„ํ•ด objective์™€ study๋ฅผ ์ •์˜ํ•˜๊ณ , n_trials ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์กฐ์ •ํ•˜์—ฌ ๋ช‡ ํšŒ์˜ trial์„ ์ˆ˜ํ–‰ํ• ์ง€ ์„ค์ •ํ•  ์ˆ˜ ์žˆ๋‹ค.

 

๋‹ค์Œ๊ณผ ๊ฐ™์ด study๋ฅผ ์ •์˜ํ•  ์ˆ˜ ์žˆ๋‹ค. objective๋Š” ๋งค trial์„ input์œผ๋กœ ๋ฐ›๋Š” ํ•จ์ˆ˜์ด๋‹ค.

import optuna

def objective(trial):
    ...

    model.fit(train_x, train_y)
    error = get_error(model, valid_x, valid_y)
    return error

study = optuna.create_study()
study.optimize(objective, n_trials=100)

 

 

Search spaces and sampling algorithms

์‚ฌ์šฉ์ž๊ฐ€ ํƒ์ƒ‰ํ•  hyperparameter์˜ search space๋ฅผ ์ •์˜ํ•ด์ฃผ๋ฉด, optuna๋Š” ๊ทธ ์•ˆ์—์„œ hyperparmeter์„ samplingํ•˜์—ฌ ์ตœ์ ํ™”๋ฅผ ์ง„ํ–‰ํ•œ๋‹ค.

 

The search space is defined inside the objective function. The following shows several ways to define a search space.

import optuna
import torch.nn as nn  # assumed import: the example builds PyTorch layers


def objective(trial):
    in_size = 28 * 28  # assumed input dimension for this sketch
    # Categorical parameter
    optimizer = trial.suggest_categorical("optimizer", ["MomentumSGD", "Adam"])

    # Integer parameter
    n_layers = trial.suggest_int("n_layers", 1, 3)

    # Loops
    layers = []
    for i in range(n_layers):
        n_units = trial.suggest_int("n_units_l{}".format(i), 4, 128, log=True)
        layers.append(nn.Linear(in_size, n_units))
        layers.append(nn.ReLU())
        in_size = n_units
    layers.append(nn.Linear(in_size, 10))

    # Integer parameter (discretized)
    num_units = trial.suggest_int("num_units", 10, 100, step=5)

    # Floating point parameter
    dropout_rate = trial.suggest_float("dropout_rate", 0.0, 1.0)

    # Floating point parameter (log)
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-2, log=True)

    # Floating point parameter (discretized)
    drop_path_rate = trial.suggest_float("drop_path_rate", 0.0, 1.0, step=0.1)

 

Hyperparameters of various types such as categorical, int, and float can be specified. More suggest_* functions are listed in the official API reference.

 

search space์—์„œ hyperparameter์„ samplingํ•˜๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜ ์—ญ์‹œ ์‚ฌ์šฉ์ž๊ฐ€ ์ •์˜ํ•  ์ˆ˜ ์žˆ๋Š”๋ฐ, create_study๋ฅผ ํ•  ๋•Œ sampler ์ธ์ˆ˜์— ๋„˜๊ฒจ์ฃผ๋ฉด ๋œ๋‹ค.

study = optuna.create_study(sampler=optuna.samplers.RandomSampler())
print(f"Sampler is {study.sampler.__class__.__name__}")  # Sampler is RandomSampler

 

Optuna์—์„œ ์‚ฌ์šฉ๊ฐ€๋Šฅํ•œ sampler์˜ ์ข…๋ฅ˜๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

  • GridSampler
  • RandomSampler
  • TPESampler (default)
  • CmaEsSampler
  • PartialFixedSampler
  • NSGAIISampler
  • QMCSampler

 

๊ฐ sampler์— ๋Œ€ํ•œ ์ž์„ธํ•œ ์„ค๋ช…์€ ์—ฌ๊ธฐ์—์„œ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋‹ค.

 

 

์–ด๋–ค sampler์„ ์‚ฌ์šฉํ•˜๋ฉด ์ข‹์„์ง€์— ๋Œ€ํ•œ ํžŒํŠธ๋„ ์ฐพ์•„๋ณผ ์ˆ˜ ์žˆ๋‹ค.

 

 

 

Pruning algorithms

Pruning์€ ํ•™์Šต ์ดˆ๊ธฐ ๋‹จ๊ณ„์—์„œ ๊ฐ€๋Šฅ์„ฑ์ด ๋‚ฎ์•„๋ณด์ด๋Š” trial์„ ์ž๋™์œผ๋กœ ์ค‘๋‹จํ•˜๋Š” ๊ธฐ๋Šฅ์ด๋‹ค. "automated early-stopping"์ด๋ผ๊ณ  ๋ณผ ์ˆ˜ ์žˆ๋‹ค.

 

Pruner์˜ ์ข…๋ฅ˜๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

  • MedianPruner
  • NopPruner
  • PatientPruner
  • PercentilePruner
  • SuccessiveHalvingPruner
  • HyperbandPruner
  • ThresholdPruner

 

The full list of pruners with detailed descriptions can be found in the official documentation.

 

Pruner์˜ ์‚ฌ์šฉ๋ฒ•์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

  • training์˜ each step ์งํ›„์— report() ์™€ should_prune() ํ•จ์ˆ˜๋ฅผ ํ˜ธ์ถœํ•œ๋‹ค.
    • report(): ์ค‘๊ฐ„ objective value๋ฅผ ์ฃผ๊ธฐ์ ์œผ๋กœ ๋ชจ๋‹ˆํ„ฐ๋งํ•œ๋‹ค.
    • should_prune(): ์‚ฌ์ „์— ์ •์˜๋œ ์กฐ๊ฑด์„ ์ถฉ์กฑํ•˜์ง€ ์•Š๋Š” trial์˜ ์กฐ๊ธฐ ์ข…๋ฃŒ๋ฅผ ๊ฒฐ์ •ํ•œ๋‹ค.

 

import sys
import logging

import optuna


def objective(trial):
    ...

    for step in range(100):
        model.fit(train_x, train_y, classes=classes)

        # Report intermediate objective value.
        intermediate_error = get_error(valid_x, valid_y)
        trial.report(intermediate_error, step)

        # Handle pruning based on the intermediate value.
        if trial.should_prune():
            raise optuna.TrialPruned()

    return get_error(valid_x, valid_y)

# Add stream handler of stdout to show the messages
optuna.logging.get_logger("optuna").addHandler(logging.StreamHandler(sys.stdout))
study = optuna.create_study(pruner=optuna.pruners.MedianPruner())
study.optimize(objective, n_trials=20)

Output:
A new study created in memory with name: no-name-e9380357-f153-4409-b874-c302ee358494
Trial 0 finished with value: 0.2894736842105263 and parameters: {'alpha': 0.07567537350404895}. Best is trial 0 with value: 0.2894736842105263.
Trial 1 finished with value: 0.02631578947368418 and parameters: {'alpha': 1.0132167782206652e-05}. Best is trial 1 with value: 0.02631578947368418.
Trial 2 finished with value: 0.02631578947368418 and parameters: {'alpha': 0.011064776558365616}. Best is trial 1 with value: 0.02631578947368418.
Trial 3 finished with value: 0.3157894736842105 and parameters: {'alpha': 3.096403335234504e-05}. Best is trial 1 with value: 0.02631578947368418.
Trial 4 finished with value: 0.07894736842105265 and parameters: {'alpha': 0.027787238399605656}. Best is trial 1 with value: 0.02631578947368418.
Trial 5 pruned.
Trial 6 pruned.
Trial 7 pruned.
Trial 8 pruned.
Trial 9 pruned.
Trial 10 pruned.
Trial 11 pruned.
Trial 12 pruned.
Trial 13 finished with value: 0.02631578947368418 and parameters: {'alpha': 0.0005226670470560228}. Best is trial 1 with value: 0.02631578947368418.
Trial 14 pruned.
Trial 15 pruned.
Trial 16 pruned.
Trial 17 pruned.
Trial 18 pruned.
Trial 19 pruned.

 

Docs์—์„œ๋Š” ๋‹ค์Œ์˜ sampler-pruner ์กฐํ•ฉ์„ ์ถ”์ฒœํ•˜๊ณ  ์žˆ๋‹ค.

  • With RandomSampler, use MedianPruner
  • With TPESampler, use HyperbandPruner

 

 

 

Visualization

optuna์˜ ์ตœ์ ํ™” ๊ฒฐ๊ณผ์— ๋Œ€ํ•œ ์‹œ๊ฐํ™”๋ฅผ ๋„์™€์ฃผ๋Š” optuna-dashboard๋ผ๋Š” ํˆด์ด ์žˆ๋‹ค.

 

GitHub: optuna/optuna-dashboard (Real-time Web Dashboard for Optuna)

 

์‚ฌ์šฉ๋ฒ•์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

import optuna

if __name__ == "__main__":
    study_name = "quadratic-simple"
    study = optuna.create_study(
        storage=f"sqlite:///{study_name}.db",  # Specify the storage URL here.
        study_name=study_name,
    )
    study.optimize(objective, n_trials=100)
    print(f"Best value: {study.best_value} (params: {study.best_params})")

Then install and launch the dashboard from the command line:

pip install optuna-dashboard
optuna-dashboard sqlite:///quadratic-simple.db

 

 

Logging

ํŒŒ์ผ์— trial์˜ ๊ธฐ๋ก์„ ๋‚จ๊ธฐ๋ ค๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™์ด logging ์˜ต์…˜์„ ์„ค์ •ํ•˜์—ฌ ํ•  ์ˆ˜ ์žˆ๋‹ค.

 

optuna.logging.enable_propagation — Optuna 3.4.0 documentation

import optuna
import logging

logger = logging.getLogger()

logger.setLevel(logging.INFO)  # Setup the root logger.
logger.addHandler(logging.FileHandler("foo.log", mode="w"))

optuna.logging.enable_propagation()  # Propagate logs to the root logger.
optuna.logging.disable_default_handler()  # Stop showing logs in sys.stderr.

study = optuna.create_study()

logger.info("Start optimization.")
study.optimize(objective, n_trials=10)

with open("foo.log") as f:
    assert f.readline().startswith("A new study created")
    assert f.readline() == "Start optimization.\n"

 

 

Examples

Optuna integrates flexibly with a variety of deep learning frameworks; examples can be found in the optuna-examples repository on GitHub.

 

 

FAQ

๊ณต์‹ docs์˜ FAQ ์ค‘ ์œ ์šฉํ•œ ๋ช‡๊ฐ€์ง€๋ฅผ ์†Œ๊ฐœํ•œ๋‹ค.

 

FAQ — Optuna 3.4.0 documentation

  • How to define objective functions that have own arguments?
  • How to avoid OOM when optimizing studies?

 

๋ฐ˜์‘ํ˜•