Nano PyTorch API

bigdl.nano.pytorch.Trainer

class bigdl.nano.pytorch.Trainer(*args: Any, **kwargs: Any)

Trainer for BigDL-Nano PyTorch.

This Trainer extends the PyTorch Lightning Trainer with options that accelerate PyTorch training.

A PyTorch Lightning trainer that uses BigDL-Nano optimizations.

Parameters

num_processes – Number of processes in distributed training. Default: 4.
use_ipex – Whether to use ipex as the accelerator for the trainer. Default: False.
cpu_for_each_process – A list of length num_processes, where each element is a list of the CPU indices that the corresponding process will use. Default: None, in which case the CPUs are distributed automatically and evenly among the processes.
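
As an illustration of the shape that cpu_for_each_process takes, the following minimal sketch splits a list of CPU indices into num_processes contiguous groups. The helper name split_cpus_evenly and the way leftover CPUs are assigned are assumptions made for this example; they are not taken from the BigDL-Nano source, which may distribute CPUs differently.

```python
from typing import List


def split_cpus_evenly(cpu_ids: List[int], num_processes: int) -> List[List[int]]:
    """Split cpu_ids into num_processes contiguous groups of near-equal size.

    Hypothetical helper for illustration only; not part of BigDL-Nano.
    """
    if num_processes <= 0:
        raise ValueError("num_processes must be a positive integer")
    per_process, extra = divmod(len(cpu_ids), num_processes)
    groups = []
    start = 0
    for rank in range(num_processes):
        # Assumption: the first `extra` processes each take one leftover CPU.
        size = per_process + (1 if rank < extra else 0)
        groups.append(cpu_ids[start:start + size])
        start += size
    return groups


# One inner list per process, so the outer list has length num_processes.
cpu_for_each_process = split_cpus_evenly(list(range(8)), 4)
print(cpu_for_each_process)  # [[0, 1], [2, 3], [4, 5], [6, 7]]
```

A list built this way has the documented shape and could be passed as cpu_for_each_process when constructing the Trainer.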

static compile(model: torch.nn.modules.module.Module, loss: Optional[torch.nn.modules.loss._Loss] = None, optimizer: Optional[torch.optim.optimizer.Optimizer] = None, scheduler: Optional[torch.optim.lr_scheduler._LRScheduler] = None, metrics: Optional[List[torchmetrics.metric.Metric]] = None)

Construct a PyTorch Lightning model.

If model is already a PyTorch Lightning model, return it unchanged. If model is a plain PyTorch model, construct a new PyTorch Lightning module from model, loss and optimizer.

Parameters

model – A model instance.
loss – Loss used to construct the PyTorch Lightning model. Should be None if model is an instance of pl.LightningModule.
optimizer – Optimizer used to construct the PyTorch Lightning model. Should be None if model is an instance of pl.LightningModule.
metrics – A list of torchmetrics to validate/test performance.

Returns

A LightningModule object.

search(model, resume: bool = False, target_metric=None, n_parallels=1, acceleration=False, input_sample=None, **kwargs)

Run hyperparameter optimization (HPO) search. It will be called in Trainer.search().

Parameters

model – The model to be searched. It should be an auto model.
resume – Whether to resume the previous search or start a new one. Defaults to False.
target_metric – The objective metric to optimize. Defaults to None.
n_parallels – The number of parallel processes for running trials.
acceleration – Whether to automatically consider the model after inference acceleration in the search process. It only takes effect if target_metric contains “latency”. Defaults to False.
input_sample – A set of inputs for tracing. Defaults to None, which is acceptable if you have traced the model before or if model is a LightningModule with a dataloader attached.

Returns

The model with study meta information attached.

search_summary()

Retrieve a summary of trials.

Returns: A summary of all the trials. Currently the entire study is returned to allow more flexibility for further analysis and visualization.

static quantize(model, precision: str = 'int8', accelerator=None, use_ipex=False, calib_dataloader: Optional[torch.utils.data.dataloader.DataLoader] = None, metric: Optional[torchmetrics.metric.Metric] = None, accuracy_criterion: Optional[dict] = None, approach: str = 'static', method: Optional[str] = None, conf: Optional[str] = None, tuning_strategy: Optional[str] = None, timeout: Optional[int] = None, max_trials: Optional[int] = None, input_sample=None, onnxruntime_session_options=None, **export_kwargs)

Calibrate a PyTorch Lightning model for post-training quantization.

Parameters

model – A model to be quantized. Model type should be an instance of nn.Module.
precision – Global precision of quantized model, supported type: ‘int8’, ‘bf16’, ‘fp16’, defaults to ‘int8’.
accelerator – The accelerator to use: None, ‘onnxruntime’ or ‘openvino’. Defaults to None, which means staying in PyTorch.
calib_dataloader – A torch.utils.data.dataloader.DataLoader object for calibration. Required for static quantization. It’s also used as validation dataloader.
metric – A torchmetrics.metric.Metric object for evaluation.
accuracy_criterion – Tolerable accuracy drop. Defaults to None, meaning no accuracy control. For example, accuracy_criterion = {‘relative’: 0.1, ‘higher_is_better’: True} allows a relative accuracy loss of 10%, and accuracy_criterion = {‘absolute’: 0.99, ‘higher_is_better’: False} means the accuracy loss must be smaller than 0.99.
approach – ‘static’ or ‘dynamic’. ‘static’: post_training_static_quant, ‘dynamic’: post_training_dynamic_quant. Default: ‘static’. OpenVINO supports static mode only.
method – Quantization method. When accelerator=None, the supported methods are ‘fx’, ‘eager’ and ‘ipex’, defaulting to ‘fx’. If you do not use ipex, ‘fx’ is suggested because it performs automatic optimizations such as fusion. For more information, refer to https://pytorch.org/docs/stable/quantization.html#eager-mode-quantization. When accelerator=‘onnxruntime’, the supported methods are ‘qlinear’ and ‘integer’, defaulting to ‘qlinear’; ‘qlinear’ is suggested for a lower accuracy drop with static quantization. More details are at https://onnxruntime.ai/docs/performance/quantization.html. This argument has no effect for OpenVINO, so leave it unchanged when accelerator=‘openvino’.
conf – A path to conf yaml file for quantization. Default: None, using default config.
tuning_strategy – ‘bayesian’, ‘basic’, ‘mse’, ‘sigopt’. Default: ‘bayesian’.
timeout – Tuning timeout (seconds). Default: None, which means early stop. Combine with max_trials field to decide when to exit.
max_trials – Max tune times. Default: None, which means no tuning. Combine with timeout field to decide when to exit. “timeout=0, max_trials=1” means it will try quantization only once and return satisfying best model.
input_sample – An input example to convert pytorch model into ONNX/OpenVINO.
onnxruntime_session_options – The session option for onnxruntime, only valid when accelerator=’onnxruntime’, otherwise will be ignored.
**export_kwargs – Passed to the torch.onnx.export function.

Returns

An accelerated PyTorch Lightning model if quantization is successful.
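
The accuracy_criterion dictionary can be read as a tolerance on how much worse the quantized model may score than the FP32 baseline. The sketch below shows one plausible interpretation of the ‘relative’, ‘absolute’ and ‘higher_is_better’ keys. The function name meets_accuracy_criterion and the exact comparison rule are assumptions for illustration, not the logic used inside the library.

```python
def meets_accuracy_criterion(baseline: float, quantized: float, criterion: dict) -> bool:
    """Check a quantized metric against a tolerance dictionary.

    Hypothetical helper for illustration only; not part of BigDL-Nano.
    """
    higher_is_better = criterion.get("higher_is_better", True)
    # The drop is positive when the quantized model is worse than the baseline.
    drop = baseline - quantized if higher_is_better else quantized - baseline
    if "relative" in criterion:
        # Assumption: 'relative' is a fraction of the baseline metric.
        return drop <= criterion["relative"] * abs(baseline)
    if "absolute" in criterion:
        # Assumption: 'absolute' is expressed in the metric's own units.
        return drop <= criterion["absolute"]
    raise ValueError("criterion needs a 'relative' or 'absolute' key")


relative = {"relative": 0.1, "higher_is_better": True}
print(meets_accuracy_criterion(0.80, 0.75, relative))  # True: drop 0.05 is within 0.1 * 0.80
print(meets_accuracy_criterion(0.80, 0.70, relative))  # False: drop 0.10 exceeds 0.08
```

With ‘higher_is_better’ set to False, the same helper treats an increase in the metric (for example a loss value) as the drop.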

static trace(model: torch.nn.modules.module.Module, input_sample=None, accelerator=None, use_ipex=False, onnxruntime_session_options=None, **export_kwargs)

Trace a pytorch model and convert it into an accelerated module for inference.

For example, this function returns a PytorchOpenVINOModel when accelerator==’openvino’.

Parameters

model – A torch.nn.Module model, including pl.LightningModule.
input_sample – A set of inputs for tracing. Defaults to None, which is acceptable if you have traced the model before or if model is a LightningModule with a dataloader attached.
accelerator – The accelerator to use. Defaults to None, meaning staying in the PyTorch backend. ‘openvino’, ‘onnxruntime’ and ‘jit’ are supported for now.
use_ipex – Whether to use ipex as the accelerator for inference. Default: False.
onnxruntime_session_options – The session options for onnxruntime. Only valid when accelerator=‘onnxruntime’; otherwise ignored.
**kwargs – Other advanced settings, including:
1. Arguments passed to the torch.onnx.export function. Only valid when accelerator=‘onnxruntime’/‘openvino’; otherwise ignored.
2. channels_last: if it is set and use_ipex=True, the data is transformed to channels-last format according to the setting. By default, channels_last is set to True if use_ipex=True.

Returns

The model converted for the requested acceleration.

static save(model: pytorch_lightning.LightningModule, path)

Save the model to local file.

Parameters

model – Any torch.nn.Module model, including all models accelerated by Trainer.trace/Trainer.quantize.
path – Path to saved model. Path should be a directory.

static load(path, model: Optional[pytorch_lightning.LightningModule] = None)

Load a model from local storage.

Parameters

path – Path to model to be loaded. Path should be a directory.
model – The original FP32 model, required to load a PyTorch model. It is needed if you accelerated the model with accelerator=None in Trainer.trace/Trainer.quantize. Set model to None if you chose accelerator=‘onnxruntime’/‘openvino’/‘jit’.

Returns

Model with different acceleration (None/OpenVINO/ONNX Runtime/JIT) or precision (FP32/FP16/BF16/INT8).
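
The rule for the model argument of Trainer.load can be summarized in a small helper. This is only a sketch of the rule as documented in the parameter description; the function name fp32_model_required is hypothetical and is not part of the BigDL-Nano API.

```python
from typing import Optional


def fp32_model_required(accelerator: Optional[str]) -> bool:
    """Return True if the original FP32 model must be passed when loading.

    Hypothetical helper for illustration only; not part of BigDL-Nano.
    """
    if accelerator is None:
        # The model stayed in PyTorch, so loading needs the FP32 model.
        return True
    if accelerator in ("onnxruntime", "openvino", "jit"):
        # For these accelerators the documentation says model should be None.
        return False
    raise ValueError(f"unsupported accelerator: {accelerator!r}")


print(fp32_model_required(None))        # True
print(fp32_model_required("openvino"))  # False
```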

save_checkpoint(filepath, weights_only: bool = False, storage_options: Optional[Any] = None) → None

Save a checkpoint after one training epoch.

bigdl.nano.pytorch.TorchNano

class bigdl.nano.pytorch.TorchNano(*args: Any, **kwargs: Any)

TorchNano for BigDL-Nano PyTorch.

It can be used to accelerate custom PyTorch training loops with very few code changes.

Create a TorchNano instance with Nano acceleration.

Parameters

num_processes – Number of processes in distributed training. Defaults to 1.
use_ipex – Whether to use ipex acceleration. Defaults to False.
enable_bf16 – Whether to use bf16 acceleration. Defaults to False.
strategy – Which backend to use in distributed mode. Defaults to ‘subprocess’; the available strategies are ‘spawn’, ‘subprocess’ and ‘ray’.

setup(model: torch.nn.modules.module.Module, optimizer: Union[torch.optim.optimizer.Optimizer, List[torch.optim.optimizer.Optimizer]], *dataloaders: torch.utils.data.dataloader.DataLoader, move_to_device: bool = True)

Set up the model, optimizers and dataloaders for accelerated training.

Parameters

model – The model to set up.
optimizer – The optimizer(s) to set up.
*dataloaders – The dataloader(s) to set up.
move_to_device – If True (default), moves the model to the correct device. Alternatively, set this to False and call to_device() manually.

Returns

A tuple of the wrapped model, optimizer(s) and dataloaders, in the same order they were passed in.

abstract train(*args: Any, **kwargs: Any) → Any

All the code inside this train method gets accelerated by TorchNano.

You can pass arbitrary arguments to this function when overriding it.