## What's Changed
- This release includes breaking changes in the Target Platform Capabilities (TPC) module. If you use a custom TPC, be sure to review the Breaking changes section.
### General changes

- Quantization enhancements:
  - Improved the Hessian information computation runtime, speeding up GPTQ, HMSE, and mixed precision with a Hessian-based loss.
  - The `get_keras_gptq_config` and `get_pytorch_gptq_config` functions now accept a `hessian_batch_size` argument to control the batch size used in the Hessian computation for GPTQ.
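For example, the new argument can be passed when building the GPTQ configuration. This is a sketch of the API described above; the values are illustrative assumptions, not recommended defaults:

```python
import model_compression_toolkit as mct

# hessian_batch_size (new in this release) controls the batch size used
# when computing Hessian information for the GPTQ loss.
gptq_config = mct.gptq.get_pytorch_gptq_config(
    n_epochs=50,            # assumption: illustrative value
    hessian_batch_size=16,  # assumption: illustrative value
)
```

`get_keras_gptq_config` accepts the argument analogously on the Keras side.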
- Data Generation upgrade: improved speed, performance, and coverage.
  - Added `SmoothAugmentationImagePipeline`, an image pipeline implementation that includes Gaussian smoothing, random cropping, and clipping.
  - Improved performance with float16 support in PyTorch.
  - Introduced the `ReduceLROnPlateauWithReset` scheduler, a learning rate scheduler that reduces the learning rate when a metric has stopped improving and can reset it to its initial value after a specified number of bad epochs.
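The reset behavior can be illustrated with a minimal framework-agnostic sketch. This is not the MCT implementation, only the idea: plateau detection plus a reset of the learning rate after a given number of bad epochs (all names and thresholds here are assumptions):

```python
class PlateauWithResetSketch:
    """Sketch: reduce the LR when a metric stops improving, and reset it
    to the initial value after `reset_patience` consecutive bad epochs."""

    def __init__(self, lr, factor=0.5, patience=2, reset_patience=6):
        self.initial_lr = lr
        self.lr = lr
        self.factor = factor                  # multiplicative LR decay
        self.patience = patience              # bad epochs before a reduction
        self.reset_patience = reset_patience  # bad epochs before a reset
        self.best = float("inf")
        self.num_bad_epochs = 0

    def step(self, metric):
        if metric < self.best:                # improvement: reset the counter
            self.best = metric
            self.num_bad_epochs = 0
        else:
            self.num_bad_epochs += 1
        if self.num_bad_epochs >= self.reset_patience:
            self.lr = self.initial_lr         # back to the initial value
            self.num_bad_epochs = 0
        elif self.num_bad_epochs >= self.patience and self.num_bad_epochs % self.patience == 0:
            self.lr *= self.factor            # plateau: reduce the LR
        return self.lr
```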
- Shift negative correction for activations:
  - Updated shift negative correction for the GELU activation operator.
  - Shift negative correction is now enabled by default in the `QuantizationConfig` in `CoreConfig`.
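If the new default is not desired, it can be overridden in the quantization config. This is a sketch; the parameter names (`shift_negative_activation_correction`, `quantization_config`) are taken from MCT's `QuantizationConfig`/`CoreConfig` API and should be checked against the installed version:

```python
import model_compression_toolkit as mct

# Shift negative correction is now enabled by default; pass False to opt out.
core_config = mct.core.CoreConfig(
    quantization_config=mct.core.QuantizationConfig(
        shift_negative_activation_correction=False
    )
)
```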
- Introduced a new Explainable Quantization (Xquant) tool (experimental).
- Introduced TPC IMX500.v3 (experimental):
  - Supports constants quantization: the constants of Add, Sub, Mul, and Div operators are quantized to 8 bits using per-axis power-of-two quantization. The axis is chosen per constant according to the minimum quantization error.
  - The IMX500 TPC now supports 16-bit activation quantization for the following operators: Add, Sub, Mul, Concat, and Stack.
  - Supports assigning the allowed input precision options to each operator, that is, the precision representation of the operator's input activation tensor.
  - The default TPC remains IMX500.v1.
  - To select IMX500.v3 in Keras:

    ```python
    tpc_v3 = mct.get_target_platform_capabilities("tensorflow", 'imx500', target_platform_version="v3")
    mct.ptq.keras_post_training_quantization(model, representative_data_gen, target_platform_capabilities=tpc_v3)
    ```
  - To select IMX500.v3 in PyTorch:

    ```python
    tpc_v3 = mct.get_target_platform_capabilities("pytorch", 'imx500', target_platform_version="v3")
    mct.ptq.pytorch_post_training_quantization(model, representative_data_gen, target_platform_capabilities=tpc_v3)
    ```
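The per-axis power-of-two scheme described for constants quantization can be sketched in plain Python. This is illustrative only, not the TPC implementation; the helper names and the symmetric-threshold details are assumptions:

```python
import math

def po2_threshold(max_abs):
    """Smallest power-of-two threshold covering max_abs."""
    return 2.0 ** math.ceil(math.log2(max_abs)) if max_abs > 0 else 1.0

def quantize_channel(values, n_bits=8):
    """Symmetric power-of-two quantization of one channel."""
    t = po2_threshold(max(abs(v) for v in values))
    scale = t / (2 ** (n_bits - 1))  # LSB size for a signed n-bit grid
    qmin, qmax = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    return [min(max(round(v / scale), qmin), qmax) * scale for v in values]

def quantize_per_axis(matrix, n_bits=8):
    """Quantize a 2-D constant per-axis, choosing the axis (rows or
    columns) that yields the minimum squared quantization error."""
    def mse(a, b):
        return sum((x - y) ** 2 for row_a, row_b in zip(a, b)
                   for x, y in zip(row_a, row_b))

    # Candidate 1: one power-of-two threshold per row.
    by_rows = [quantize_channel(row, n_bits) for row in matrix]
    # Candidate 2: one power-of-two threshold per column.
    by_cols_t = [quantize_channel(list(c), n_bits) for c in zip(*matrix)]
    by_cols = [list(r) for r in zip(*by_cols_t)]

    return min((by_rows, by_cols), key=lambda q: mse(matrix, q))
```

With rows of very different magnitudes, the per-row choice wins because each row gets a threshold matched to its own range.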
- Introduced the BitWidthConfig API:
  - Allows manual adjustment of the activation bit-width of specific model layers through a new class under `CoreConfig`.
  - A usage example of manually selecting 16-bit activations is available in the PyTorch object detection YOLOv8n tutorial.
- Tutorials:
  - MCT tutorial notebook updates:
    - Added new tutorials for IMX500:
      - Instance segmentation YOLOv8n and pose estimation YOLOv8n quantization in PyTorch, including an optional gradient-based PTQ step for optimized performance.
      - Quantization of a torchvision model for IMX500.
    - Added new classification models to MCT's IMX500-Notebooks.
    - Added new MCT feature tutorials: an Xquant tutorial in PyTorch and Keras, and a new tutorial for GPTQ in PyTorch.
    - Updated the PyTorch object detection YOLOv8n tutorial with manual 16-bit configuration.
### Breaking changes

- Additional arguments have been added to configure `OpQuantizationConfig` in the TPC:
  - `Signedness` specifies the signedness of the quantization method (signed or unsigned quantization).
  - `supported_input_activation_n_bits` sets the number of bits the operator accepts as input.
### Bug fixes

- Fixed a bug in the PyTorch model reader for the reshape operator (#1086).
- Fixed a bug in GPTQ with bias learning for cases where a convolutional layer has `None` as its bias (#1109).
- Fixed an issue in mixed precision when running weights-only or activation-only compression: if layers with multiple candidates of the other kind (activation or weights) existed, the search could fail or return incorrect results. A new filtering procedure now runs before the mixed precision search to filter out unnecessary candidates (#1162).
### New Contributors

Welcome @DaniAffCH, @irenaby, and @yardeny-sony for their first contributions! (PR #1094, PR #1118, PR #1163)
**Full Changelog**: v2.1.0...v2.2.0