BigDL-Nano Hyperparameter Tuning (TensorFlow Sequential/Functional API) Quickstart

This notebook demonstrates how to use Nano HPO to tune hyperparameters in TensorFlow training. The model is built with either the Keras Sequential API or the Keras Functional API.

Step 0: Prepare Environment

You can install the latest pre-release version with Nano support using the commands below.

We recommend running these commands, especially source bigdl-nano-init, before the Jupyter kernel is started; otherwise some of the optimizations may not take effect.

[ ]:

# Install latest pre-release version of bigdl-nano
!pip install --pre bigdl-nano[tensorflow]
!pip install setuptools==58.0.4
!pip install protobuf==3.20.1
!source bigdl-nano-init

[ ]:

# Install other dependencies for Nano HPO
!pip install ConfigSpace
!pip install optuna

Step 1: Init Nano AutoML

Nano HPO must be enabled before it can be used for TensorFlow training.

[ ]:

import bigdl.nano.automl as automl
automl.hpo_config.enable_hpo_tf()

Step 2: Prepare data

We use the MNIST dataset for demonstration.

[ ]:

from tensorflow import keras

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

CLASSES = 10
img_x, img_y = x_train.shape[1], x_train.shape[2]
input_shape = (img_x, img_y, 1)
x_train = x_train.reshape(-1, img_x, img_y, 1).astype("float32") / 255
x_test = x_test.reshape(-1, img_x, img_y, 1).astype("float32") / 255

Step 3: Build model and specify search spaces

We now create our model.

Change the imports from tensorflow.keras to bigdl.nano as shown below, and you will be able to specify search spaces while you define the model. For details on how to specify a search space, refer to the user doc.

[ ]:

from bigdl.nano.automl.tf.keras import Sequential
from bigdl.nano.tf.keras.layers import Dense, Flatten, Conv2D
from bigdl.nano.tf.keras import Input
from bigdl.nano.automl.tf.keras import Model
import bigdl.nano.automl.hpo.space as space

The two cells below show how to define the model with search spaces using the Sequential API and the Functional API, respectively. Run whichever one you prefer.

[ ]:

model = Sequential()
model.add(Conv2D(
    filters=space.Categorical(32, 64),
    kernel_size=space.Categorical(3, 5),
    strides=space.Categorical(1, 2),
    activation=space.Categorical("relu", "linear"),
    input_shape=input_shape))
model.add(Flatten())
model.add(Dense(CLASSES, activation="softmax"))

[ ]:

inputs = Input(shape=input_shape)
x = Conv2D(
    filters=space.Categorical(32, 64),
    kernel_size=space.Categorical(3, 5),
    strides=space.Categorical(1, 2),
    activation=space.Categorical("relu", "linear"),
    input_shape=input_shape)(inputs)
x = Flatten()(x)
outputs = Dense(CLASSES, activation="softmax")(x)
model = Model(inputs=inputs, outputs=outputs, name="mnist_model")
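
The search space above is small enough to enumerate by hand. The sketch below is plain Python and does not call Nano HPO. It lists every combination of the four space.Categorical choices and, assuming Keras's default padding="valid" for Conv2D, works out the feature-map size and trainable parameter count each combination would produce. The helper name describe is hypothetical.

```python
from itertools import product

# Choices copied from the space.Categorical(...) arguments in the model definition.
FILTERS = (32, 64)
KERNEL_SIZES = (3, 5)
STRIDES = (1, 2)
ACTIVATIONS = ("relu", "linear")

IMG_SIDE = 28  # MNIST images are 28x28 with a single channel
CLASSES = 10


def describe(filters, kernel_size, stride):
    """Return (output side, trainable params) for Conv2D -> Flatten -> Dense.

    Assumes Conv2D's default padding="valid" and one input channel.
    """
    out_side = (IMG_SIDE - kernel_size) // stride + 1
    conv_params = kernel_size * kernel_size * 1 * filters + filters
    flat = out_side * out_side * filters
    dense_params = flat * CLASSES + CLASSES
    return out_side, conv_params + dense_params


configs = list(product(FILTERS, KERNEL_SIZES, STRIDES, ACTIVATIONS))
print("combinations:", len(configs))

# The activation does not affect shapes or parameter counts, so print one row
# per (filters, kernel_size, stride) triple.
for filters, kernel_size, stride in product(FILTERS, KERNEL_SIZES, STRIDES):
    out_side, params = describe(filters, kernel_size, stride)
    print(f"filters={filters} kernel={kernel_size} stride={stride} "
          f"-> {out_side}x{out_side} feature map, {params} params")
```

Because the space holds 16 combinations, the 8 trials requested in Step 5 can cover at most half of it.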

Step 4: Compile model

We now compile our model with a loss function, an optimizer, and metrics. If you want to tune the learning rate and batch size, refer to the user guide.

[ ]:

from tensorflow.keras.optimizers import RMSprop
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer=RMSprop(learning_rate=0.001),
    metrics=["accuracy"]
)
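
The sparse_categorical_crossentropy loss fits here because y_train holds integer class IDs (0 to 9) rather than one-hot vectors. As a rough illustration in plain Python, not the Keras implementation, the per-sample loss is the negative log of the probability the model assigns to the true class. The function name sparse_cce is hypothetical.

```python
import math


def sparse_cce(probs, label):
    """Per-sample loss: -log(probability assigned to the true class)."""
    return -math.log(probs[label])


# An untrained 10-class model that predicts a uniform distribution.
uniform = [0.1] * 10
print(round(sparse_cce(uniform, label=3), 4))  # equals ln(10)

# A confident, correct prediction yields a much smaller loss.
confident = [0.01] * 10
confident[3] = 0.91
print(round(sparse_cce(confident, label=3), 4))
```

A loss close to ln(10), roughly 2.30, therefore indicates a model that is doing no better than guessing among the 10 classes.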

Step 5: Run hyperparameter search

Run the hyperparameter search by calling model.search. Set n_trials to the number of trials you want to run, and set target_metric and direction so that HPO optimizes target_metric in the specified direction. Each trial uses a different set of hyperparameters drawn from the search space. After the search completes, you can call search_summary to retrieve the search results for analysis. For more details, refer to the user doc.

[ ]:

%%time
from bigdl.nano.automl.hpo.backend import PrunerType
model.search(
    n_trials=8,
    target_metric='val_accuracy',
    direction="maximize",
    pruner=PrunerType.HyperBand,
    pruner_kwargs={'min_resource':1, 'max_resource':100, 'reduction_factor':3},
    x=x_train,
    y=y_train,
    batch_size=128,
    epochs=5,
    validation_split=0.2,
    verbose=False
)
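
To get a feel for the cost of this search, the arithmetic below works out how much data each trial sees. It assumes the standard MNIST training set of 60,000 images and Keras's behavior of holding out the validation_split fraction of the training samples for validation. It is plain Python and does not run any training.

```python
import math

TRAIN_SAMPLES = 60_000  # size of the MNIST training set
VALIDATION_SPLIT = 0.2
BATCH_SIZE = 128
EPOCHS = 5
N_TRIALS = 8

val_samples = round(TRAIN_SAMPLES * VALIDATION_SPLIT)
fit_samples = TRAIN_SAMPLES - val_samples
steps_per_epoch = math.ceil(fit_samples / BATCH_SIZE)

# Upper bound: a pruned trial stops early and runs fewer epochs than this.
max_epochs_total = N_TRIALS * EPOCHS
max_steps_total = max_epochs_total * steps_per_epoch

print("samples used for fitting:", fit_samples)
print("samples used for val_accuracy:", val_samples)
print("steps per epoch:", steps_per_epoch)
print("upper bound on training steps for the search:", max_steps_total)
```

The target metric val_accuracy is computed on the held-out samples only, so the trials are compared on data they were not fitted on.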

[ ]:

print(model.search_summary())
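
The pruner_kwargs passed to the search look like the arguments of Optuna's HyperbandPruner. Assuming Nano HPO forwards them to that pruner unchanged (an assumption; this notebook does not say so), Optuna documents the number of brackets as floor(log_eta(max_resource / min_resource)) + 1, where eta is reduction_factor. The sketch below evaluates that formula with integer arithmetic and lists the geometric resource levels. The function name resource_levels is hypothetical, and nothing here calls Optuna or Nano.

```python
def resource_levels(min_resource, max_resource, reduction_factor):
    """Levels min_resource * reduction_factor**k that do not exceed max_resource."""
    levels = []
    resource = min_resource
    while resource <= max_resource:
        levels.append(resource)
        resource *= reduction_factor
    return levels


levels = resource_levels(min_resource=1, max_resource=100, reduction_factor=3)
n_brackets = len(levels)  # same value as floor(log_3(100 / 1)) + 1

print("resource levels:", levels)
print("number of brackets:", n_brackets)

# With epochs=5 in model.search, only these levels fall inside one trial's run.
EPOCHS = 5
print("levels within the epoch budget:", [r for r in levels if r <= EPOCHS])
```

Optuna's documentation suggests that max_resource should match the maximum number of steps a trial can run, so with 5 epochs per trial a value nearer the epoch count than 100 may be worth trying.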

Step 6: (Optional) Resume training from memory

After a search completes, you can resume it by calling model.search again with resume=True. Refer to the user doc for more details.

[ ]:

%%time
model.search(
    n_trials=4,
    target_metric='val_accuracy',
    direction="maximize",
    pruner=PrunerType.HyperBand,
    pruner_kwargs={'min_resource':1, 'max_resource':100, 'reduction_factor':3},
    x=x_train,
    y=y_train,
    batch_size=128,
    epochs=5,
    validation_split=0.2,
    verbose=False,
    resume=True
)

[ ]:

print(model.search_summary())

Step 7: Fit with the best hyperparameters

After the search, model.fit automatically uses the best hyperparameters found during the search to fit the model.

[ ]:

history = model.fit(x_train, y_train,
                    batch_size=128, epochs=5, validation_split=0.2)

test_scores = model.evaluate(x_test, y_test, verbose=2)
print("Test loss:", test_scores[0])
print("Test accuracy:", test_scores[1])

Step 8: HPO Result Analysis and Visualization

Check out the summary of the model. The model has already been built with the best hyperparameters found by Nano HPO.

[ ]:

# summary() prints the table itself and returns None, so no print() is needed
model.summary()
study = model.search_summary()

[ ]:

study.trials_dataframe(attrs=("number", "value", "params", "state"))

[ ]:

from bigdl.nano.automl.hpo.visualization import plot_optimization_history
plot_optimization_history(study)

[ ]:

from bigdl.nano.automl.hpo.visualization import plot_parallel_coordinate
plot_parallel_coordinate(study)

[ ]:

from bigdl.nano.automl.hpo.visualization import plot_intermediate_values
plot_intermediate_values(study)

[ ]:

from bigdl.nano.automl.hpo.visualization import plot_contour
plot_contour(study)

[ ]:

from bigdl.nano.automl.hpo.visualization import plot_param_importances
plot_param_importances(study)

[ ]:

plot_param_importances(study, target=lambda t: t.duration.total_seconds(), target_name="duration")

You can find the output of a complete run here, or run the notebook yourself in Google Colab.