qkeras.estimate

Definition of quantization package.

Functions

analyze_accumulator(in_model, x[, verbose])

Analyzes the distribution of weights to specify size of accumulators.

analyze_accumulator_from_sample(in_model, ...)

Extracts range of inputs of quantized layers from samples.

create_activation_cache(model)

Creates an activation cache for the tensors of a model.

extract_model_operations(in_model)

Determines types of operations for convolutions.

get_operation_type(layer, output_cache)

Checks quantizers around layer and weights to get operation type.

get_quant_mode(quant)

Returns the quantizer mode, number of bits and if it is a signed number.

print_qstats(model)

Prints quantization statistics for the model.

qkeras.estimate.analyze_accumulator(in_model, x, verbose=False)

Analyzes the distribution of weights to specify size of accumulators.

Computes the maximum number of bits for the accumulator assuming the inputs have a distribution given by the dictionary x.

for each output channel i:

    max_positive_value[i] = sum(w[i]) + bias[i] for the positive weights
    max_negative_value[i] = sum(w[i]) + bias[i] for the negative weights

max_value = max(
    max_positive_value[i] * positive(x) + max_negative_value[i] * negative(x),
    - (max_negative_value[i] * positive(x) + max_positive_value[i] * negative(x))
)

accumulator_size = ceil( log2( max_value ) )

x is currently a dictionary of the form:

{ layer_name: (min_value, max_value) }

In the future, we want to accept a sample and compute these ranges automatically.

Parameters:
  • in_model – keras model object, model to be evaluated

  • x – dictionary of the form { layer_name: (min_value, max_value) } describing the input distribution

  • verbose – boolean, if true, print statistics messages

Returns:

dictionary containing { layer_name: accumulator_size }
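The accumulator-size formula above can be illustrated with a small standalone sketch. This is not the library's implementation: the helper name accumulator_bits, the layer name "dense_1", and the weight values are hypothetical, and it assumes that positive(x) means max(max_value, 0), that negative(x) means min(min_value, 0), and that the bias is added to the positive sum when it is positive and to the negative sum otherwise.

```python
import math


def accumulator_bits(weights, bias, x_range):
    """Apply the documented formula to one layer.

    weights: list of per-output-channel weight lists (w[i] in the formula).
    bias: list with one bias value per output channel.
    x_range: (min_value, max_value) of the layer input.
    """
    x_min, x_max = x_range
    pos_x = max(x_max, 0.0)  # ASSUMPTION: reading of positive(x)
    neg_x = min(x_min, 0.0)  # ASSUMPTION: reading of negative(x)

    max_value = 0.0
    for w_i, b_i in zip(weights, bias):
        # Sum of positive weights (plus bias if positive), and likewise
        # for the negative weights, as written in the formula.
        max_pos = sum(w for w in w_i if w > 0) + max(b_i, 0.0)
        max_neg = sum(w for w in w_i if w < 0) + min(b_i, 0.0)

        largest = max_pos * pos_x + max_neg * neg_x
        smallest = -(max_neg * pos_x + max_pos * neg_x)
        max_value = max(max_value, largest, smallest)

    return math.ceil(math.log2(max_value)) if max_value > 0 else 0


# Hypothetical layer with two output channels of two weights each.
layer_params = {"dense_1": ([[1.0, 3.0], [-2.0, 0.5]], [0.5, -1.0])}
x = {"dense_1": (-1.0, 2.0)}  # same form as the x argument

acc_sizes = {
    name: accumulator_bits(w, b, x[name])
    for name, (w, b) in layer_params.items()
}
print(acc_sizes)  # {'dense_1': 4}
```

Here the largest magnitude is 9.0 (channel 0: 4.5 * 2.0), so ceil(log2(9.0)) gives 4 bits.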

qkeras.estimate.analyze_accumulator_from_sample(in_model, x_sample, mode='conservative', verbose=False)

Extracts range of inputs of quantized layers from samples.

qkeras.estimate.create_activation_cache(model)

Creates an activation cache for the tensors of a model.

qkeras.estimate.extract_model_operations(in_model)

Determines types of operations for convolutions.

qkeras.estimate.get_operation_type(layer, output_cache)

Checks quantizers around layer and weights to get operation type.

Determines operator strength according to the following table.

                                   x
               qb(n)    +/-,exp  t(-1,0,+1)  b(-1,+1)  b(0,1)  float
  qb(n)        *        << >>,-  ?,-         ?,-       ?       *
  +/-,exp      << >>,-  +        ?,-         ^         ?,-     *
w t(-1,0,+1)   ?,-      ?,-      ?,^         ?,^       ^       *
  b(-1,+1)     ?,-      ^        ?,^         ^         ^       *
  b(0,1)       ?        ?,-      ^           ^         ^       *
  float        *        *        *           *         *       *

Parameters:
  • layer – layer in Keras to determine the operation strength.

  • output_cache – cache of input tensor bit sizes.

Returns:

One of “mult”, “fmult”, “adder”, “barrel”, “mux”, “xor”. Note: “mult” represents a quantized-bit multiplier; “fmult” represents a floating-point multiplier.
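The table lookup can be sketched as a plain two-dimensional lookup keyed by the weight (w) and input (x) quantizer modes. This is only an illustration, not the library's code: the names MODES, TABLE, SYMBOL_TO_NAME, table_entry, and operation_name are hypothetical, and the mapping from table symbols to the returned strings is an assumed reading (“*” as multiplier, “<< >>” as barrel shifter, “+” as adder, “?” as mux, “^” as xor, and any float operand as “fmult”).

```python
# Quantizer modes in the order used by the table's rows and columns.
MODES = ["qb(n)", "+/-,exp", "t(-1,0,+1)", "b(-1,+1)", "b(0,1)", "float"]

# Rows are the weight (w) mode, columns are the input (x) mode.
# Entries are copied from the documented table.
TABLE = [
    ["*",       "<< >>,-", "?,-", "?,-", "?",   "*"],
    ["<< >>,-", "+",       "?,-", "^",   "?,-", "*"],
    ["?,-",     "?,-",     "?,^", "?,^", "^",   "*"],
    ["?,-",     "^",       "?,^", "^",   "^",   "*"],
    ["?",       "?,-",     "^",   "^",   "^",   "*"],
    ["*",       "*",       "*",   "*",   "*",   "*"],
]

# ASSUMPTION: guessed correspondence between the leading table symbol
# and the operation names listed under "Returns".
SYMBOL_TO_NAME = {
    "*": "mult",
    "<< >>": "barrel",
    "+": "adder",
    "?": "mux",
    "^": "xor",
}


def table_entry(w_mode, x_mode):
    """Return the raw table entry for a weight mode and an input mode."""
    return TABLE[MODES.index(w_mode)][MODES.index(x_mode)]


def operation_name(w_mode, x_mode):
    """Map a table entry to an operation name (assumed mapping)."""
    if "float" in (w_mode, x_mode):
        return "fmult"  # ASSUMPTION: float operands use the float multiplier
    primary = table_entry(w_mode, x_mode).split(",")[0]
    return SYMBOL_TO_NAME[primary]


print(table_entry("qb(n)", "+/-,exp"))     # << >>,-
print(operation_name("qb(n)", "+/-,exp"))  # barrel
```

The table is symmetric, so swapping the weight and input modes yields the same entry.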

qkeras.estimate.get_quant_mode(quant)

Returns the quantizer mode, number of bits and if it is a signed number.

qkeras.estimate.print_qstats(model)

Prints quantization statistics for the model.