qkeras.estimate

Definition of quantization package.

Functions

`analyze_accumulator`(in_model, x[, verbose])	Analyzes the distribution of weights to specify size of accumulators.
`analyze_accumulator_from_sample`(in_model, ...)	Extracts range of inputs of quantized layers from samples.
`create_activation_cache`(model)	Creates an activation cache for the tensors of a model.
`extract_model_operations`(in_model)	Determines types of operations for convolutions.
`get_operation_type`(layer, output_cache)	Checks quantizers around layer and weights to get operation type.
`get_quant_mode`(quant)	Returns the quantizer mode, number of bits and if it is a signed number.
`print_qstats`(model)	Prints quantization statistics for the model.

qkeras.estimate.analyze_accumulator(in_model, x, verbose=False)

Analyzes the distribution of weights to specify size of accumulators.

Computes the maximum number of bits for the accumulator assuming the inputs have a distribution given by the dictionary x.

for each output channel i:

    max_positive_value[i] = sum(w[i]) + bias[i]   for the positive weights
    max_negative_value[i] = sum(w[i]) + bias[i]   for the negative weights

max_value = max(
    max_positive_value[i] * positive(x) + max_negative_value[i] * negative(x),
    -(max_negative_value[i] * positive(x) + max_positive_value[i] * negative(x))
)

accumulator_size = ceil( log2( max_value ) )

Currently, x is a dictionary of the form:

{ layer_name: (min_value, max_value) }

In the future, these ranges are meant to be computed automatically from a sample.

Parameters:

in_model – keras model object, model to be evaluated
x – dictionary of the form: { layer_name: (min_value, max_value) } input distribution
verbose – boolean, if true, print statistics messages

Returns:

dictionary containing { layer_name: accumulator_size }
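
The formula above can be traced by hand with a small, dependency-free sketch. This is not the QKeras implementation: the helper name `accumulator_bits` is hypothetical, `positive(x)` and `negative(x)` are assumed to mean max(max_value, 0) and min(min_value, 0) of the input range, the bias is assumed to join the positive sum when it is positive and the negative sum otherwise, and returning 0 bits for a non-positive max_value is likewise an assumption.

```python
import math


def accumulator_bits(channel_weights, biases, x_min, x_max):
    """Apply the documented accumulator formula to plain Python lists.

    channel_weights: one list of weights per output channel (hypothetical layout).
    biases: one bias value per output channel.
    x_min, x_max: assumed input range, as in { layer_name: (min_value, max_value) }.
    """
    # ASSUMPTION: positive(x) / negative(x) are the clipped ends of the input range.
    pos_x = max(x_max, 0.0)
    neg_x = min(x_min, 0.0)

    sizes = []
    for w, b in zip(channel_weights, biases):
        # ASSUMPTION: the bias is added to whichever sum matches its sign.
        max_pos = sum(v for v in w if v > 0) + max(b, 0.0)
        max_neg = sum(v for v in w if v < 0) + min(b, 0.0)

        # Largest value the channel can reach, and magnitude of the most
        # negative value it can reach.
        largest = max_pos * pos_x + max_neg * neg_x
        most_negative = max_neg * pos_x + max_pos * neg_x
        max_value = max(largest, -most_negative)

        # ASSUMPTION: report 0 bits when the channel can only produce zero.
        sizes.append(math.ceil(math.log2(max_value)) if max_value > 0 else 0)
    return sizes


# Two output channels, inputs assumed to lie in [-1, 1].
print(accumulator_bits([[1.0, 3.0], [-2.0, 4.0]], [0.0, 0.0], -1.0, 1.0))  # [2, 3]
```

In the first channel the largest magnitude is 4, giving ceil(log2(4)) = 2; in the second it is 6, giving ceil(log2(6)) = 3.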

qkeras.estimate.analyze_accumulator_from_sample(in_model, x_sample, mode='conservative', verbose=False)

Extracts range of inputs of quantized layers from samples.

qkeras.estimate.create_activation_cache(model)

Creates an activation cache for the tensors of a model.

qkeras.estimate.extract_model_operations(in_model)

Determines types of operations for convolutions.

qkeras.estimate.get_operation_type(layer, output_cache)

Checks quantizers around layer and weights to get operation type.

Determines operator strength according to the following table.

                                       x
w \ x         qb(n)     +/-,exp   t(-1,0,+1)  b(-1,+1)  b(0,1)  float

qb(n)         *         << >>,-   ?,-         ?,-       ?       *
+/-,exp       << >>,-   +         ?,-         ^         ?,-     *
t(-1,0,+1)    ?,-       ?,-       ?,^         ?,^       ^       *
b(-1,+1)      ?,-       ^         ?,^         ^         ^       *
b(0,1)        ?         ?,-       ^           ^         ^       *
float         *         *         *           *         *       *

Parameters:

layer – layer in Keras to determine the operation strength.
output_cache – cache of input tensor bit sizes.

Returns:

One of “mult”, “fmult”, “adder”, “barrel”, “mux”, “xor”. Note: “mult” represents quantized bit multiplier, “fmult” represents floating point multiplier.
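
As a reading aid, the table above can be encoded as a plain lookup keyed by the weight quantizer (row) and the input quantizer (column). The sketch below is illustrative only and does not call QKeras. The names `QUANTIZERS`, `TABLE`, `SYMBOL_NAMES` and `lookup_operation` are hypothetical, and the legend mapping table symbols to return values (`*` to “mult”, `<< >>` to “barrel”, `+` to “adder”, `?` to “mux”, `^` to “xor”, and `*` involving float to “fmult”) is an inferred reading, not something the docstring states. Only the first symbol of each cell is used; how the second symbol is resolved is not specified here.

```python
# Row and column order, copied from the table header.
QUANTIZERS = ["qb(n)", "+/-,exp", "t(-1,0,+1)", "b(-1,+1)", "b(0,1)", "float"]

# Rows are weight quantizers (w); columns are input quantizers (x).
TABLE = [
    ["*",       "<< >>,-", "?,-", "?,-", "?",   "*"],  # qb(n)
    ["<< >>,-", "+",       "?,-", "^",   "?,-", "*"],  # +/-,exp
    ["?,-",     "?,-",     "?,^", "?,^", "^",   "*"],  # t(-1,0,+1)
    ["?,-",     "^",       "?,^", "^",   "^",   "*"],  # b(-1,+1)
    ["?",       "?,-",     "^",   "^",   "^",   "*"],  # b(0,1)
    ["*",       "*",       "*",   "*",   "*",   "*"],  # float
]

# ASSUMPTION: legend inferred from the symbols and the documented return values.
SYMBOL_NAMES = {"*": "mult", "<< >>": "barrel", "+": "adder", "?": "mux", "^": "xor"}


def lookup_operation(w_mode, x_mode):
    """Return the operation name for a (weight, input) quantizer pair."""
    cell = TABLE[QUANTIZERS.index(w_mode)][QUANTIZERS.index(x_mode)]
    # Keep only the first symbol of the cell (e.g. "?" from "?,-").
    primary = cell.split(",")[0]
    # ASSUMPTION: a multiplier with a float operand is the floating point one.
    if primary == "*" and "float" in (w_mode, x_mode):
        return "fmult"
    return SYMBOL_NAMES[primary]


print(lookup_operation("qb(n)", "+/-,exp"))  # barrel
print(lookup_operation("float", "b(0,1)"))   # fmult
```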

qkeras.estimate.get_quant_mode(quant)

Returns the quantizer mode, number of bits and if it is a signed number.

qkeras.estimate.print_qstats(model)

Prints quantization statistics for the model.