TensorFlow Quantization Tour

https://connpass.com/event/136350/

2019/07/10 @KSuzukiii

Quantization!?
Target: Mobile & IoT
float/double -> Integer (int4, int8, int16…)

Source: http://bit.ly/2FJJtw1
TensorFlow Model Optimization Toolkit — Post-Training Integer Quantization

Quantization Flow
Post-training Quantization
Quantization-aware training

Post-training Quantization
Quantizes the Weight/Bias of a trained NN

Quantizing weights
https://www.tensorflow.org/lite/performance/post_training_quantization
import tensorflow as tf
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
tflite_quant_model = converter.convert()

tf.lite.Optimize
DEFAULT(Improved Size and Latency)
OPTIMIZE_FOR_SIZE
OPTIMIZE_FOR_LATENCY

NN
Input: CIFAR10 (32x32x3)
CONV2D -> ReLU -> CONV2D -> BatchNorm -> ReLU -> Pooling
-> CONV2D -> ReLU -> CONV2D -> BatchNorm -> ReLU -> Pooling
-> flatten -> Dense -> SoftMax
Output: Label:10
_________________________________________________________
Layer (type) Output Shape Param #
=========================================================
conv2d (Conv2D) (None, 32, 32, 64) 1792
_________________________________________________________
conv2d_1 (Conv2D) (None, 32, 32, 64) 36928
_________________________________________________________
batch_normalization (BatchNo (None, 32, 32, 64) 256
_________________________________________________________
activation (Activation) (None, 32, 32, 64) 0
_________________________________________________________
max_pooling2d (MaxPooling2D) (None, 16, 16, 64) 0
_________________________________________________________
conv2d_2 (Conv2D) (None, 16, 16, 128) 73856
_________________________________________________________
conv2d_3 (Conv2D) (None, 16, 16, 128) 147584
_________________________________________________________
batch_normalization_1 (Batch (None, 16, 16, 128) 512
_________________________________________________________
activation_1 (Activation) (None, 16, 16, 128) 0
_________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 8, 8, 128) 0
_________________________________________________________
flatten (Flatten) (None, 8192) 0
_________________________________________________________
dense (Dense) (None, 10) 81930
=========================================================
Total params: 342,858
Trainable params: 342,474
Non-trainable params: 384
_________________________________________________________
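
The parameter counts in the summary can be checked by hand. The sketch below is not from the talk; it assumes 3x3 kernels with a bias term (consistent with the weight shape [64, 3, 3, 3] shown later) and four parameters per BatchNorm channel, two of them non-trainable. The helper names are made up for illustration.

```python
# Recompute the "Param #" column of the model summary.
# Assumption: every Conv2D uses a 3x3 kernel with bias.

def conv2d_params(kernel, in_ch, out_ch):
    """kernel*kernel*in_ch*out_ch weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch


def batchnorm_params(channels):
    """gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * channels


def dense_params(in_features, out_features):
    """Weight matrix plus one bias per output unit."""
    return in_features * out_features + out_features


layers = {
    "conv2d": conv2d_params(3, 3, 64),
    "conv2d_1": conv2d_params(3, 64, 64),
    "batch_normalization": batchnorm_params(64),
    "conv2d_2": conv2d_params(3, 64, 128),
    "conv2d_3": conv2d_params(3, 128, 128),
    "batch_normalization_1": batchnorm_params(128),
    "dense": dense_params(8 * 8 * 128, 10),
}

total = sum(layers.values())
non_trainable = 2 * 64 + 2 * 128  # BatchNorm moving mean and variance
trainable = total - non_trainable

for name, count in layers.items():
    print(f"{name:24s} {count}")
print("total", total, "trainable", trainable, "non-trainable", non_trainable)
```

The totals match the summary, which confirms that nearly all parameters sit in the Conv2D and Dense weights, the tensors that weight quantization targets.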

Model size (TensorFlowLite)
NORMAL   4.18MiB
DEFAULT  1.37MiB (70% smaller than NORMAL)

{'dtype': numpy.float32,
 'index': 4,
 'name': 'sequential/conv2d/Conv2D/ReadVariableOp',
 'quantization': (0.0, 0),
 'shape': array([64, 3, 3, 3], dtype=int32)},
{...
 'index': 5,
 'name': 'sequential/conv2d/Conv2D_bias',
 ...
 'shape': array([64], dtype=int32)},

Model size (TensorFlowLite)
NORMAL       4.18MiB
DEFAULT      1.37MiB
FOR_SIZE     346KiB (92% smaller than NORMAL, 75% smaller than DEFAULT)
FOR_LATENCY  346KiB

{'dtype': numpy.int8,
 'index': 4,
 'quantization': (0.0017586048925295472, 0),
 ...},
{'index': 5,
 ...},

TFLite 8-bit
quantization spec
real_value = (int8_value − zero_point) × scale
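
As a minimal sketch of the formula above (not code from the talk), the mapping and its inverse can be written in a few lines. The `quantize` helper, the clamping range and the parameter values are illustrative assumptions; only `real_value = (int8_value − zero_point) × scale` comes from the slide.

```python
def dequantize(int8_value, scale, zero_point):
    """Map a stored int8 value back to a real value, per the formula above."""
    return (int8_value - zero_point) * scale


def quantize(real_value, scale, zero_point, qmin=-128, qmax=127):
    """Inverse mapping: round to the nearest step, then clamp to the int8 range."""
    q = round(real_value / scale) + zero_point
    return max(qmin, min(qmax, q))


# Illustrative parameters only (not taken from the model in the slides).
scale, zero_point = 0.5, -3

real = dequantize(7, scale, zero_point)           # (7 - (-3)) * 0.5
restored = quantize(real, scale, zero_point)      # back to the stored value
saturated = quantize(1000.0, scale, zero_point)   # out of range, clamps to qmax
print(real, restored, saturated)
```

Values outside the representable range saturate at the int8 limits, which is why the choice of scale matters for accuracy.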

Weight tensor ('index': 4):
'quantization': (0.0017586048925295472, 0), i.e. scale = 0.0017586048925295472, zero_point = 0
Example: int8 weights -127, -8, 2, 127
Real values: -0.2233428213512525, -0.014068839140236378, 0.0035172097850590944, 0.2233428213512525

Representable range: -0.2233428213512525 to 0.2233428213512525
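
The numbers on this slide can be reproduced directly. The sketch below applies the spec formula to the four int8 weights, then checks one assumption that is not stated in the slides: that the scale was chosen symmetrically as the largest absolute weight divided by 127.

```python
import math

scale, zero_point = 0.0017586048925295472, 0   # from the tensor details
weights_int8 = [-127, -8, 2, 127]

# real_value = (int8_value - zero_point) * scale
real_values = [(q - zero_point) * scale for q in weights_int8]

# Assumption: symmetric quantization, scale = max(|weight|) / 127.
max_abs_weight = 0.2233428213512525
derived_scale = max_abs_weight / 127

print(real_values)
print(derived_scale, math.isclose(derived_scale, scale, rel_tol=1e-9))
```

The derived scale agrees with the stored one, which is consistent with a symmetric weight range of [-127, 127] and a zero_point of 0.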

CONV2D (quantized weights)
input (float32) -> × (float32) -> + (float32) -> ReLU (float32)
Weight: float32 -> quantize -> int8
Bias: float32

Reference: http://bit.ly/2jF7hck (P5)
Energy: int8 vs. float32

Full integer quantization of weights and activations
https://www.tensorflow.org/lite/performance/post_training_quantization
import tensorflow as tf
def representative_dataset_gen():
    for _ in range(num_calibration_steps):
        # Get sample input data as a numpy array in a method of your choosing.
        yield [input]
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
tflite_quant_model = converter.convert()
Calibration with about 100 (to 1000) samples
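
To illustrate what calibration needs from the representative dataset, here is a hedged, pure-Python sketch. It assumes that calibration amounts to observing the minimum and maximum value seen over the samples; the random data generator and `observe_range` are hypothetical stand-ins, not the converter's actual implementation.

```python
import random


def representative_dataset_gen(num_calibration_steps=100, seed=0):
    """Hypothetical stand-in: yields one fake 32x32x3 image (flat list in [0, 1)) per step."""
    rng = random.Random(seed)
    for _ in range(num_calibration_steps):
        yield [[rng.random() for _ in range(32 * 32 * 3)]]


def observe_range(dataset):
    """Track the running min and max over all calibration samples."""
    lo, hi = float("inf"), float("-inf")
    for (sample,) in dataset:
        lo = min(lo, min(sample))
        hi = max(hi, max(sample))
    return lo, hi


rmin, rmax = observe_range(representative_dataset_gen())
print(rmin, rmax)
```

In a real conversion the generator would yield actual input samples as numpy arrays, and the observed ranges of each activation tensor would determine its scale and zero_point.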

Model size (TensorFlowLite)
NORMAL                     4.18MiB
DEFAULT                    1.37MiB (70% smaller than NORMAL)
FOR_SIZE                   346KiB (92% smaller than NORMAL, 75% smaller than DEFAULT)
FOR_LATENCY                346KiB
Full integer quantization  356KiB

{'index': 4,
 ...},
{'dtype': numpy.int32,
 'index': 5,
 ...},

CONV2D (full integer quantization)
input: float32 -> quantize -> int8
Weight: int8
Bias: int32
× -> int32? -> + -> int32? -> ReLU -> int8?
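
The question marks in the diagram can be explored with a small integer-only multiply-accumulate. This is a sketch under stated assumptions, not the TFLite kernel: all scales and values are illustrative, the bias scale is taken as input_scale × weight_scale with zero_point 0, and the final rescale uses a float multiplier where real kernels use a fixed-point one.

```python
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Round to the nearest quantization step and clamp to [qmin, qmax]."""
    return max(qmin, min(qmax, round(x / scale) + zero_point))


# Illustrative quantization parameters (assumptions, not from the model).
in_scale, in_zp = 1.0 / 255.0, -128    # input activations in [0, 1]
w_scale = 0.002                        # symmetric weights, zero_point 0
out_scale, out_zp = 0.004, -128        # ReLU output, non-negative range

inputs = [0.2, 0.5, 0.9]
weights = [0.1, -0.05, 0.2]
bias = 0.03

q_in = [quantize(x, in_scale, in_zp) for x in inputs]          # int8
q_w = [quantize(w, w_scale, 0, qmin=-127) for w in weights]    # int8
bias_scale = in_scale * w_scale
q_bias = round(bias / bias_scale)                              # int32

# Multiply-accumulate in a wide integer accumulator (int32 in practice).
acc = q_bias + sum((qi - in_zp) * qw for qi, qw in zip(q_in, q_w))

# Rescale to the output scale; ReLU is fused by clamping at the zero point.
q_out = out_zp + round(acc * (bias_scale / out_scale))
q_out = max(out_zp, min(127, q_out))                           # int8

real_out = (q_out - out_zp) * out_scale
float_ref = max(0.0, sum(x * w for x, w in zip(inputs, weights)) + bias)
print(acc, q_out, real_out, float_ref)
```

Under these assumptions the products and the bias addition live in the wide accumulator, and only the final activation is stored as int8, which matches the int32 bias dtype shown in the tensor details.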

{'dtype': numpy.int8,
'index': 6,
'name': 'sequential/conv2d/Relu',
'quantization': (0.0032105057034641504, -128),
'shape': array([ 1, 32, 32, 64], dtype=int32)},
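
The zero_point of -128 fits a ReLU output, whose real range starts at 0. A minimal sketch of how such parameters could be derived from an observed range follows; `choose_quant_params` is a hypothetical helper using a plain min/max rule, which is an assumption about the calibration step rather than the converter's documented algorithm.

```python
import math


def choose_quant_params(rmin, rmax, qmin=-128, qmax=127):
    """Asymmetric int8 parameters for the real range [rmin, rmax]."""
    # The range must contain 0.0 so that zero is exactly representable.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    if rmax == rmin:
        return 1.0, 0  # degenerate all-zero tensor; any scale works
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = round(qmin - rmin / scale)
    return scale, max(qmin, min(qmax, zero_point))


# Range implied by the ReLU tensor above: int8 -128..127 maps to 0..255*scale.
relu_scale = 0.0032105057034641504
implied_max = (127 - (-128)) * relu_scale

scale, zero_point = choose_quant_params(0.0, implied_max)
print(implied_max, scale, zero_point)
```

With a range that starts at 0, the whole int8 span is spent on non-negative values, so the stored value -128 corresponds to a real 0.0.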

Model size (TensorFlowLite)
NORMAL                     4.18MiB
DEFAULT                    1.37MiB (70% smaller than NORMAL)
FOR_SIZE                   346KiB (92% smaller than NORMAL, 75% smaller than DEFAULT)
FOR_LATENCY                346KiB
Full integer quantization  356KiB
Edge TPU                   451KiB
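
The percentages on the size slides can be recomputed from the listed sizes. The sketch below assumes they are size reductions, 92% relative to NORMAL and 75% relative to DEFAULT; that reading is an inference from the numbers, not something the slides state.

```python
KIB_PER_MIB = 1024

sizes_kib = {
    "NORMAL": 4.18 * KIB_PER_MIB,
    "DEFAULT": 1.37 * KIB_PER_MIB,
    "FOR_SIZE": 346,
    "FOR_LATENCY": 346,
    "Full integer quantization": 356,
    "Edge TPU": 451,
}


def reduction_percent(baseline_kib, size_kib):
    """Size reduction relative to a baseline, in percent."""
    return 100.0 * (1.0 - size_kib / baseline_kib)


for name, size in sizes_kib.items():
    vs_normal = reduction_percent(sizes_kib["NORMAL"], size)
    vs_default = reduction_percent(sizes_kib["DEFAULT"], size)
    print(f"{name:26s} {size:8.1f} KiB  "
          f"vs NORMAL: {vs_normal:5.1f}%  vs DEFAULT: {vs_default:5.1f}%")
```

Under that reading the 92% and 75% figures are reproduced, while NORMAL to DEFAULT works out closer to 67% than to the rounded 70% on the slide.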

edgetpu_compiler --show_operations
Number of operations that will run on Edge TPU: 8
Number of operations that will run on CPU: 2

Operator         Count  Status
MAX_POOL_2D      2      Mapped to Edge TPU
QUANTIZE         1      Operation is otherwise supported, but not mapped due to some unspecified limitation
CONV_2D          4      Mapped to Edge TPU
DEQUANTIZE       1      Operation is working on an unsupported data type
SOFTMAX          1      Mapped to Edge TPU
FULLY_CONNECTED  1      Mapped to Edge TPU

Edge TPU: CONV2D, Pooling, Dense, SoftMax
CPU: Quantize, Dequantize

input (float32)
-> Quantize -> int8
-> CONV2D (Weight: int8, Bias: int32) -> int8
-> CONV2D (Weight: int8, Bias: int32) -> int8
-> Dense (Weight: int8, Bias: int32) -> int8
-> SoftMax -> int8
-> Dequantize -> float32
-> output

SoftMax output restriction: (scale, zero_point) = (1.0 / 256.0, -128)
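
With these fixed parameters the int8 output covers the probability range in steps of 1/256. A short sketch follows; the example probabilities are made up for illustration.

```python
SCALE, ZERO_POINT = 1.0 / 256.0, -128   # fixed parameters from the restriction


def dequantize(q):
    """real_value = (int8_value - zero_point) * scale"""
    return (q - ZERO_POINT) * SCALE


def quantize(p):
    """Round a probability to the nearest 1/256 step and clamp to int8."""
    return max(-128, min(127, round(p / SCALE) + ZERO_POINT))


lowest = dequantize(-128)    # smallest representable probability
highest = dequantize(127)    # largest representable probability, 255/256

probabilities = [0.7, 0.2, 0.1]          # illustrative softmax output
q_probs = [quantize(p) for p in probabilities]
print(lowest, highest, q_probs)
```

A probability of exactly 1.0 is not representable and saturates at 255/256, a small price for fixing the parameters ahead of time.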

output int8

converter.target_spec.supported_ops = [tf.lite.OpSet.TFLITE_BUILTINS_INT8]
int8

Quantization
TFLiteConverter
Quantization

Optimization Toolkit - Pruning API

TensorFlow Lite 8-bit quantization specification
https://www.tensorflow.org/lite/performance/quantization_spec
Performance best practices
https://www.tensorflow.org/lite/performance/best_practices
