BigDL-Nano Hyperparameter Tuning (TensorFlow Sequential/Functional API) Quickstart

This notebook demonstrates how to use Nano HPO to tune hyperparameters in TensorFlow training. The model is built with either the tf.keras Sequential API or the Functional API.

Step 0: Prepare Environment

You can install the latest pre-release version with Nano support using the commands below.

We recommend running the commands below, especially source bigdl-nano-init, before the Jupyter kernel is started; otherwise some of the optimizations may not take effect.

[ ]:
# Install latest pre-release version of bigdl-nano
!pip install --pre bigdl-nano[tensorflow]
!pip install setuptools==58.0.4
!pip install protobuf==3.20.1
!source bigdl-nano-init
[ ]:
# Install other dependencies for Nano HPO
!pip install ConfigSpace
!pip install optuna

Step 1: Init Nano AutoML

Nano HPO must be enabled before it can be used for TensorFlow training.

[ ]:
import bigdl.nano.automl as automl
automl.hpo_config.enable_hpo_tf()

Step 2: Prepare data

We use the MNIST dataset for demonstration.

[ ]:
from tensorflow import keras

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

CLASSES = 10
img_x, img_y = x_train.shape[1], x_train.shape[2]
input_shape = (img_x, img_y, 1)
x_train = x_train.reshape(-1, img_x, img_y,1).astype("float32") / 255
x_test = x_test.reshape(-1, img_x, img_y,1).astype("float32") / 255

Step 3: Build model and specify search spaces

We now create our model.

Change the imports from tensorflow.keras to bigdl.nano as shown below, and you will be able to specify search spaces while defining the model. For details on how to specify search spaces, refer to the user doc.

[ ]:
from bigdl.nano.automl.tf.keras import Sequential
from bigdl.nano.tf.keras.layers import Dense, Flatten, Conv2D
from bigdl.nano.tf.keras import Input
from bigdl.nano.automl.tf.keras import Model
import bigdl.nano.automl.hpo.space as space

The two cells below show how to define the model with search spaces using the Sequential API and the Functional API, respectively. Run only one of them.

[ ]:
model = Sequential()
model.add(Conv2D(
    filters=space.Categorical(32, 64),
    kernel_size=space.Categorical(3, 5),
    strides=space.Categorical(1, 2),
    activation=space.Categorical("relu", "linear"),
    input_shape=input_shape))
model.add(Flatten())
model.add(Dense(CLASSES, activation="softmax"))
[ ]:
inputs = Input(shape=(28,28,1))
x = Conv2D(
    filters=space.Categorical(32, 64),
    kernel_size=space.Categorical(3, 5),
    strides=space.Categorical(1, 2),
    activation=space.Categorical("relu", "linear"),
    input_shape=input_shape)(inputs)
x = Flatten()(x)
outputs = Dense(CLASSES, activation="softmax")(x)
model = Model(inputs=inputs, outputs=outputs, name="mnist_model")
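
To get a feel for the size of the search space defined above, the following sketch enumerates every combination of the four Categorical choices in plain Python and computes the resulting Conv2D output size. It does not use Nano at all. The helper conv_output_size is a hypothetical name introduced for illustration, and the formula assumes the Keras default padding="valid" on the 28x28 MNIST input.

```python
from itertools import product

# Candidate values copied from the space.Categorical(...) calls above.
search_space = {
    "filters": [32, 64],
    "kernel_size": [3, 5],
    "strides": [1, 2],
    "activation": ["relu", "linear"],
}


def conv_output_size(input_size, kernel_size, stride):
    """Spatial output size of a convolution with 'valid' padding (hypothetical helper)."""
    return (input_size - kernel_size) // stride + 1


# Cartesian product of all choices: one dict per candidate configuration.
configs = [dict(zip(search_space, values))
           for values in product(*search_space.values())]
print(len(configs), "candidate configurations")

for cfg in configs:
    side = conv_output_size(28, cfg["kernel_size"], cfg["strides"])
    # Number of inputs the Dense layer receives after Flatten.
    flat = side * side * cfg["filters"]
    print(cfg, "-> conv output", (side, side, cfg["filters"]), "flattened", flat)
```

The space contains 16 combinations, so a search with only a handful of trials explores part of it rather than all of it.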

Step 4: Compile model

We now compile the model with a loss function, an optimizer and metrics. If you want to tune the learning rate and batch size, refer to the user guide.

[ ]:
from tensorflow.keras.optimizers import RMSprop
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer=RMSprop(learning_rate=0.001),
    metrics=["accuracy"]
)
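
The sparse_categorical_crossentropy loss takes integer class labels, which is why y_train is used as loaded, without one-hot encoding. The following is a minimal plain-Python sketch of the per-sample computation. It is not the Keras implementation, which also handles batching and numerical stability, and the probabilities are made up for illustration.

```python
import math


def sparse_categorical_crossentropy(label, probs):
    """Loss for one sample: negative log of the probability given to the true class."""
    return -math.log(probs[label])


# Made-up softmax output over 3 classes, for illustration only.
probs = [0.1, 0.7, 0.2]

loss_confident = sparse_categorical_crossentropy(1, probs)  # true class got 0.7
loss_wrong = sparse_categorical_crossentropy(0, probs)      # true class got 0.1

# The less probability the model assigns to the true class, the higher the loss.
print(loss_confident, loss_wrong)
```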

Step 6: (Optional) Resume training from memory

After a search completes, you can resume it and run additional trials by calling model.search again with resume=True. Refer to the user doc for more details.

[ ]:
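%%time
# PrunerType is required for the pruner argument below
from bigdl.nano.automl.hpo.backend import PrunerType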
model.search(
    n_trials=4,
    target_metric='val_accuracy',
    direction="maximize",
    pruner=PrunerType.HyperBand,
    pruner_kwargs={'min_resource':1, 'max_resource':100, 'reduction_factor':3},
    x=x_train,
    y=y_train,
    batch_size=128,
    epochs=5,
    validation_split=0.2,
    verbose=False,
    resume=True
)
[ ]:
print(model.search_summary())
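
The pruner_kwargs above configure a HyperBand-style pruner. As a rough illustration of how min_resource, max_resource and reduction_factor interact, the sketch below lists the checkpoints of plain successive halving, assuming they fall at min_resource * reduction_factor**k. This is a simplification and not the pruner's actual implementation (HyperBand runs several such schedules with different starting points). The helper name successive_halving_rungs is hypothetical, and the sketch assumes one resource unit corresponds to one training epoch.

```python
def successive_halving_rungs(min_resource, max_resource, reduction_factor):
    """Steps at which trials are compared under plain successive halving (hypothetical helper)."""
    rungs = []
    step = min_resource
    while step <= max_resource:
        rungs.append(step)
        step *= reduction_factor
    return rungs


# Values taken from the pruner_kwargs in the search call above.
rungs = successive_halving_rungs(min_resource=1, max_resource=100, reduction_factor=3)
print("rungs:", rungs)

# With epochs=5, only the rungs at or below 5 can be reached during a trial.
epochs = 5
reachable = [r for r in rungs if r <= epochs]
print("reachable with", epochs, "epochs:", reachable)
```

Under these assumptions, most checkpoints lie beyond the 5 epochs used here, so a smaller max_resource may match a short training run more closely.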

Step 7: Fit with the best hyperparameters

After the search, model.fit will automatically use the best hyperparameters found during the search to fit the model.

[ ]:
history = model.fit(x_train, y_train,
                    batch_size=128, epochs=5, validation_split=0.2)

test_scores = model.evaluate(x_test, y_test, verbose=2)
print("Test loss:", test_scores[0])
print("Test accuracy:", test_scores[1])

Step 8: HPO Result Analysis and Visualization

Check out the summary of the model. The model has already been built with the best hyperparameters found by Nano HPO.

[ ]:
print(model.summary())
study = model.search_summary()
[ ]:
study.trials_dataframe(attrs=("number", "value", "params", "state"))
[ ]:
from bigdl.nano.automl.hpo.visualization import plot_optimization_history
plot_optimization_history(study)
[ ]:
from bigdl.nano.automl.hpo.visualization import plot_parallel_coordinate
plot_parallel_coordinate(study)
[ ]:
from bigdl.nano.automl.hpo.visualization import plot_intermediate_values
plot_intermediate_values(study)
[ ]:
from bigdl.nano.automl.hpo.visualization import plot_contour
plot_contour(study)
[ ]:
from bigdl.nano.automl.hpo.visualization import plot_param_importances
plot_param_importances(study)
[ ]:
plot_param_importances(study, target=lambda t: t.duration.total_seconds(), target_name="duration")

You can find the running output here, or run the notebook yourself in Google Colab.