qkeras.estimate
Definition of quantization package.
Functions
- analyze_accumulator: Analyzes the distribution of weights to specify size of accumulators.
- analyze_accumulator_from_sample: Extracts range of inputs of quantized layers from samples.
- create_activation_cache: Creates an activation cache for the tensors of a model.
- extract_model_operations: Determines types of operations for convolutions.
- get_operation_type: Checks quantizers around layer and weights to get operation type.
- get_quant_mode: Returns the quantizer mode, number of bits and if it is a signed number.
- print_qstats: Prints quantization statistics for the model.
- qkeras.estimate.analyze_accumulator(in_model, x, verbose=False)[source]
Analyzes the distribution of weights to specify size of accumulators.
Computes the maximum number of bits for the accumulator assuming the inputs have a distribution given by the dictionary x.
For each output channel i:
    max_positive_value[i] = sum(w[i]) + bias[i]   for the positive weights
    max_negative_value[i] = sum(w[i]) + bias[i]   for the negative weights

max_value = max(
    max_positive_value[i] * positive(x) + max_negative_value[i] * negative(x),
    -(max_negative_value[i] * positive(x) + max_positive_value[i] * negative(x))
)

accumulator_size = ceil(log2(max_value))
Currently, x is a dictionary of the form:
{ layer_name: (min_value, max_value) }
In the future, we want to provide a sample and compute this automatically.
- Parameters:
in_model – keras model object, model to be evaluated
x – input distribution, a dictionary of the form { layer_name: (min_value, max_value) }
verbose – boolean, if true, print statistics messages
- Returns:
dictionary containing { layer_name: accumulator_size }
- Return type:
dict
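The formula above can be sketched in plain Python for a single layer. This is only an illustration that transcribes the documented formula literally, not the library's implementation: the function name `accumulator_bits`, the list-of-lists weight layout, and the reading of positive(x) and negative(x) as the positive and negative ends of the (min_value, max_value) range are all assumptions. How the bias enters is also taken at face value from the formula.

```python
import math


def accumulator_bits(weights, bias, x_range):
    """Estimate accumulator bits for one layer (hypothetical helper).

    weights: one list of weights per output channel (assumed layout).
    bias:    one bias value per output channel.
    x_range: (min_value, max_value) of the layer input.
    """
    x_min, x_max = x_range
    pos_x = max(x_max, 0.0)  # assumed meaning of positive(x)
    neg_x = min(x_min, 0.0)  # assumed meaning of negative(x)

    max_value = 0.0
    for w_i, b_i in zip(weights, bias):
        # Sums over the positive and negative weights of channel i.
        max_pos = sum(w for w in w_i if w > 0) + b_i
        max_neg = sum(w for w in w_i if w < 0) + b_i
        candidate = max(
            max_pos * pos_x + max_neg * neg_x,
            -(max_neg * pos_x + max_pos * neg_x),
        )
        max_value = max(max_value, candidate)

    if max_value <= 0:
        return 0  # log2 is undefined here; no magnitude bits needed
    return math.ceil(math.log2(max_value))


# Two output channels, zero bias, inputs in [-1, 2].
bits = accumulator_bits([[1, -2, 3], [0.5, 0.5, -1]], [0, 0], (-1, 2))
```

In this example the first channel dominates with a worst-case magnitude of 10, so the sketch reports 4 bits. Whether a sign bit should be added on top is not stated in the formula and is left out here.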
- qkeras.estimate.analyze_accumulator_from_sample(in_model, x_sample, mode='conservative', verbose=False)[source]
Extracts range of inputs of quantized layers from samples.
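The entry does not describe what `mode` does or how layers are traversed, so the following is only a minimal sketch of the general idea: turning recorded sample values into a { layer_name: (min_value, max_value) } dictionary of the kind analyze_accumulator takes as x. The helper name `input_ranges_from_samples` and the layout of `recorded` are hypothetical and not part of the library.

```python
def input_ranges_from_samples(recorded):
    """Build {layer_name: (min_value, max_value)} from sampled inputs.

    recorded: {layer_name: iterable of float input values} (assumed layout).
    """
    ranges = {}
    for layer_name, values in recorded.items():
        values = list(values)
        if not values:
            continue  # no samples recorded for this layer
        ranges[layer_name] = (min(values), max(values))
    return ranges


# Hypothetical recorded inputs for two layers.
recorded = {"conv1": [-0.5, 0.25, 1.0], "dense1": [0.0, 3.0, 2.0]}
x = input_ranges_from_samples(recorded)
```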
- qkeras.estimate.create_activation_cache(model)[source]
Creates an activation cache for the tensors of a model.
- qkeras.estimate.extract_model_operations(in_model)[source]
Determines types of operations for convolutions.
- qkeras.estimate.get_operation_type(layer, output_cache)[source]
Checks quantizers around layer and weights to get operation type.
Determines operator strength according to the following table, with rows indexed by w and columns indexed by x.

                                      x
                 qb(n)     +/-,exp   t(-1,0,+1)  b(-1,+1)  b(0,1)  float
    qb(n)        *         << >>,-   ?,-         ?,-       ?       *
    +/-,exp      << >>,-   +         ?,-         ^         ?,-     *
  w t(-1,0,+1)   ?,-       ?,-       ?,^         ?,^       ^       *
    b(-1,+1)     ?,-       ^         ?,^         ^         ^       *
    b(0,1)       ?         ?,-       ^           ^         ^       *
    float        *         *         *           *         *       *
- Parameters:
layer – layer in Keras to determine the operation strength.
output_cache – cache of input tensor bit sizes.
- Returns:
One of “mult”, “fmult”, “adder”, “barrel”, “mux”, “xor”. Note: “mult” represents a quantized bit multiplier, and “fmult” represents a floating point multiplier.
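The operator-strength table can be read as a two-key lookup. The sketch below encodes the table cells as given and maps each cell to one of the returned names. The mapping itself is an assumption made for illustration (first symbol in the cell decides: `*` is a multiplier, `<< >>` a barrel shifter, `+` an adder, `?` a mux, `^` an xor, and anything involving float is “fmult”); the names `TABLE`, `COLUMNS`, and `operation_name` are hypothetical and this is not the library's code.

```python
# Column order of the table (x quantizer types).
COLUMNS = ["qb(n)", "+/-,exp", "t(-1,0,+1)", "b(-1,+1)", "b(0,1)", "float"]

# Rows of the table, keyed by the w quantizer type.
TABLE = {
    "qb(n)":      ["*", "<< >>,-", "?,-", "?,-", "?", "*"],
    "+/-,exp":    ["<< >>,-", "+", "?,-", "^", "?,-", "*"],
    "t(-1,0,+1)": ["?,-", "?,-", "?,^", "?,^", "^", "*"],
    "b(-1,+1)":   ["?,-", "^", "?,^", "^", "^", "*"],
    "b(0,1)":     ["?", "?,-", "^", "^", "^", "*"],
    "float":      ["*", "*", "*", "*", "*", "*"],
}

# Assumed mapping from the leading table symbol to a returned name.
SYMBOL_NAMES = {"*": "mult", "<< >>": "barrel", "+": "adder",
                "?": "mux", "^": "xor"}


def operation_name(w_type, x_type):
    """Look up the table cell for (w, x) and name its leading symbol."""
    if "float" in (w_type, x_type):
        return "fmult"  # assumption: any float operand needs a float multiplier
    cell = TABLE[w_type][COLUMNS.index(x_type)]
    primary = cell.split(",")[0]  # ignore the secondary symbol after the comma
    return SYMBOL_NAMES[primary]


op = operation_name("qb(n)", "+/-,exp")
```

For example, a qb(n) weight combined with a +/-,exp input falls in the `<< >>,-` cell, which this sketch names "barrel".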