qkeras.quantizers
Functions
binary_sigmoid | Computes binary_sigmoid.
binary_tanh | Computes binary_tanh function that outputs -1 and 1.
get_quantized_initializer | Gets the initializer and scales it by the range.
get_quantizer | Gets the quantizer.
get_weight_scale | Gets the scales of weights for (stochastic_)binary and ternary quantizers.
hard_sigmoid | Computes hard_sigmoid function that saturates between 0 and 1.
hard_tanh | Computes hard_tanh function that saturates between -1 and 1.
set_internal_sigmoid | Sets _sigmoid to either real, hard or smooth.
smooth_sigmoid | Implements a linear approximation of a sigmoid function.
smooth_tanh | Computes smooth_tanh function that saturates between -1 and 1.
stochastic_round | Performs stochastic rounding to the first decimal point.
stochastic_round_po2 | Performs stochastic rounding for the power of two.
Classes
bernoulli | Computes a Bernoulli sample with probability sigmoid(x).
binary | Computes the sign(x) returning a value between -alpha and alpha.
quantized_bits | Legacy quantizer: Quantizes the number to a number of bits.
quantized_hswish | Computes a quantized hard swish to a number of bits.
quantized_linear | Linear quantization with fixed number of bits.
quantized_po2 | Quantizes to the closest power of 2.
quantized_relu | Computes a quantized relu to a number of bits.
quantized_relu_po2 | Quantizes x to the closest power of 2 when x > 0.
quantized_sigmoid | Computes a quantized sigmoid to a number of bits.
quantized_tanh | Computes a quantized tanh to a number of bits.
quantized_ulaw | Computes a u-law quantization.
stochastic_binary | Computes a stochastic activation function returning -alpha or +alpha.
stochastic_ternary | Computes a stochastic activation function returning -alpha, 0 or +alpha.
ternary | Computes an activation function returning -alpha, 0 or +alpha.
- class qkeras.quantizers.bernoulli(alpha=None, temperature=6.0, use_real_sigmoid=True)
Bases: BaseQuantizer
Computes a Bernoulli sample with probability sigmoid(x).
This computation uses the straight-through (ST) approximation.
To do that, we compute p = sigmoid(x) and a random sample z ~ U[0,1]. As p in [0,1] and z in [0,1], p - z in [-1,1]. However, -1 never appears in practice, because reaching it would require sigmoid(-inf) = 0 together with z = 1. As a result, the range is, in practical terms, [0,1].
The noise introduced by z can be seen as a regularizer to the weights W of y = Wx, as y = Wx + Wz for some noise z with mean mu(z) and variance var(z). As a result, W**2 var(z) is added to the variance of y, which has the same effect as an L2 regularizer with lambda = var(z), as presented in Hinton's Coursera Lecture 9c.
Remember that E[dL/dy] = E[dL/dx] once we add stochastic sampling.
- alpha
multiplicative factor applied to the generated numbers; it may also be set to “auto” or “auto_po2”.
- temperature
amplifier factor for the sigmoid function, making the output less stochastic as x moves away from 0.
- use_real_sigmoid
use real sigmoid for probability.
- Returns:
Computation of round with stochastic sampling with straight through gradient.
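The sampling step described above can be sketched in plain Python. This is only an illustration, not the library's code: it assumes that temperature multiplies x inside the sigmoid, it works on a single scalar, and the helper names `sigmoid`, `bernoulli_sample` and `mean_sample` are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bernoulli_sample(x, temperature=6.0, rng=random):
    """Return 1.0 with probability sigmoid(temperature * x), else 0.0."""
    p = sigmoid(temperature * x)   # assumption: temperature scales x
    z = rng.random()               # z ~ U[0, 1)
    return 1.0 if p - z > 0 else 0.0


def mean_sample(x, n=20000, seed=0):
    """Empirical mean of n samples; approaches sigmoid(temperature * x)."""
    rng = random.Random(seed)
    return sum(bernoulli_sample(x, rng=rng) for _ in range(n)) / n
```

Far from 0 the sample is almost deterministic, which is the effect the temperature description refers to; near 0 the two outcomes are roughly equally likely.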
- class qkeras.quantizers.binary(use_01=False, alpha=None, training=False, use_stochastic_rounding=False, scale_axis=None, elements_per_scale=None, min_po2_exponent=None, max_po2_exponent=None)
Bases: BaseQuantizer
Computes the sign(x) returning a value between -alpha and alpha.
Although we cannot guarantee E[dL/dy] = E[dL/dx] if we do not use the stochastic sampling, we still use the ST approximation.
Modified from original binary to match QNN implementation.
The binary quantizer supports multiple scales per tensor, where:
- alpha: It can be set to “auto” or “auto_po2” to enable auto-scaling. “auto” allows arbitrary scale while “auto_po2” allows power-of-two scales only. It can also be set to a fixed value or None (i.e., no scaling).
- scale_axis: It determines the axis/axes to calculate the auto-scale at.
- elements_per_scale: It enables fine-grained scaling where it determines the number of elements across scale axis/axes that should be grouped into one scale.
Examples:
Input shape = [1, 8, 8, 16], alpha="auto", scale_axis=None, elements_per_scale=None --> Number of separate scales = 16
Input shape = [1, 8, 8, 16], alpha="auto", scale_axis=1, elements_per_scale=None --> Number of separate scales = 8
Input shape = [1, 8, 8, 16], alpha="auto", scale_axis=1, elements_per_scale=2 --> Number of separate scales = 4
Input shape = [1, 8, 8, 16], alpha="auto", scale_axis=[2, 3], elements_per_scale=2 --> Number of separate scales = 4*8 = 32
Input shape = [1, 8, 8, 16], alpha="auto", scale_axis=[2, 3], elements_per_scale=[2, 4] --> Number of separate scales = 4*4 = 16
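The arithmetic behind these five examples can be reproduced with a small helper. `count_scales` is a hypothetical name, not part of QKeras, and two assumptions are inferred from the examples rather than stated by the library: scale_axis=None is treated as the last axis, and a missing elements_per_scale means one element per scale.

```python
from math import prod


def count_scales(shape, scale_axis=None, elements_per_scale=None):
    """Count the separate scales implied by the examples above."""
    if scale_axis is None:
        scale_axis = [len(shape) - 1]          # assumption: last axis
    elif isinstance(scale_axis, int):
        scale_axis = [scale_axis]
    if elements_per_scale is None:
        elements_per_scale = [1] * len(scale_axis)
    elif isinstance(elements_per_scale, int):
        elements_per_scale = [elements_per_scale] * len(scale_axis)
    # Each scale axis contributes (axis size / group size) scales.
    return prod(shape[a] // e for a, e in zip(scale_axis, elements_per_scale))
```

For instance, `count_scales([1, 8, 8, 16], [2, 3], [2, 4])` evaluates (8 // 2) * (16 // 4), matching the last example.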
- x
tensor to perform sign_through.
- bits
number of bits to perform quantization.
- use_01
if True, return {0,1} instead of {-1,+1}.
- alpha
binary is -alpha or +alpha, or “auto”, “auto_po2” to compute automatically.
- use_stochastic_rounding
if true, we perform stochastic rounding.
- elements_per_scale
if set to an int or List[int], we create multiple scales per axis across scale_axis, where ‘elements_per_scale’ represents the number of elements/values associated with every separate scale value.
- scale_axis
int or List[int] which axis/axes to calculate scale from.
- min_po2_exponent
if set while using “auto_po2”, it represents the minimum allowed power of two exponent.
- max_po2_exponent
if set while using “auto_po2”, it represents the maximum allowed power of two exponent.
- Returns:
Computation of sign operation with straight through gradient.
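As a rough illustration of the sign operation with a straight-through gradient, the sketch below handles one scalar with a fixed float alpha. The function names are hypothetical, mapping zero to +1 is an assumption, and auto-scaling and stochastic rounding are left out.

```python
def binary_forward(x, alpha=1.0, use_01=False):
    """Forward pass of a sign-style binary quantizer for one scalar."""
    sign = 1.0 if x >= 0 else -1.0   # assumption: zero maps to +1
    if use_01:
        sign = (sign + 1.0) / 2.0    # {-1, +1} -> {0, 1}
    return alpha * sign


def binary_backward(upstream_grad):
    """Straight-through estimator: pass the gradient through unchanged."""
    return upstream_grad
```

The backward function ignores the zero derivative of sign(x), which is what allows training through the quantizer.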
- qkeras.quantizers.get_quantized_initializer(w_initializer, w_range)
Gets the initializer and scales it by the range.
- qkeras.quantizers.get_quantizer(identifier)
Gets the quantizer.
- Parameters:
identifier – A quantizer identifier, which can be a dict, a string, or a callable.
- Returns:
- A quantizer class or quantization function from this file. For example:
Quantizer classes: quantized_bits, quantized_po2, quantized_relu_po2, binary, stochastic_binary, ternary, stochastic_ternary, etc.
Quantization functions: binary_sigmoid, hard_sigmoid, smooth_sigmoid, etc.
- Raises:
ValueError – Raised when the quantizer cannot be interpreted.
- qkeras.quantizers.get_weight_scale(quantizer, x=None)
Gets the scales of weights for (stochastic_)binary and ternary quantizers.
- Parameters:
quantizer – A binary or ternary quantizer class.
x – A weight tensor. We keep it here for now for backward compatibility.
- Returns:
Weight scale per channel for binary and ternary quantizers with auto or auto_po2 alpha/threshold.
- qkeras.quantizers.hard_sigmoid(x)
Computes hard_sigmoid function that saturates between 0 and 1.
- qkeras.quantizers.hard_tanh(x)
Computes hard_tanh function that saturates between -1 and 1.
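One common piecewise-linear form of these two functions is sketched below. The slope and offset of 0.5 are assumptions, since the descriptions above only state the saturation limits, so this should be read as an illustration rather than the library's definition.

```python
def hard_sigmoid(x):
    """Piecewise-linear sigmoid: 0.5 * x + 0.5, clipped to [0, 1]."""
    return min(1.0, max(0.0, 0.5 * x + 0.5))


def hard_tanh(x):
    """Rescale hard_sigmoid so that it saturates at -1 and 1."""
    return 2.0 * hard_sigmoid(x) - 1.0
```

Both functions avoid exponentials, which is the usual motivation for using them on constrained hardware.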
- class qkeras.quantizers.quantized_bits(bits=8, integer=0, symmetric=0, keep_negative=True, alpha=None, use_stochastic_rounding=False, scale_axis=None, qnoise_factor=1.0, var_name=None, use_ste=True, use_variables=False, elements_per_scale=None, min_po2_exponent=None, max_po2_exponent=None, post_training_scale=None)
Bases: BaseQuantizer
Legacy quantizer: Quantizes the number to a number of bits.
In general, we want to use a quantization function like:
a = (pow(2,bits) - 1 - 0) / (max(x) - min(x))
b = -min(x) * a
in the equation:
xq = a x + b
This requires multiplication, which is undesirable. So, we enforce weights to be between -1 and 1 (max(x) = 1 and min(x) = -1), and separate the sign from the rest of the number as we make this function symmetric, which results in the following approximation.
max(x) = +1, min(x) = -1
max(x) = -min(x)
a = pow(2,bits-1)
b = 0
Finally, just remember that to represent the number with sign, the representable range is -pow(2,bits-1) to pow(2,bits-1) - 1.
Symmetric and keep_negative allow us to generate numbers that are symmetric (same number of negative and positive representations), and numbers that are positive.
Note
The behavior of quantized_bits is different from Catapult HLS ac_fixed or Vivado HLS ap_fixed. For ac_fixed<word_length, integer_length, signed>, when signed = true, it is equivalent to quantized_bits(word_length, integer_length-1, keep_negative=True).
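To make the fixed-point arithmetic concrete, here is a minimal scalar sketch. It assumes keep_negative=True (one bit spent on the sign), no alpha scaling and round-to-nearest; `quantize_fixed` and `ac_fixed_to_quantized_bits` are hypothetical helper names, not QKeras functions.

```python
import math


def quantize_fixed(x, bits=8, integer=0, symmetric=False):
    """Signed fixed-point rounding with `integer` bits left of the point."""
    step = 2.0 ** (integer - bits + 1)      # smallest representable increment
    hi = 2.0 ** integer - step              # largest representable value
    lo = -hi if symmetric else -(2.0 ** integer)
    q = math.floor(x / step + 0.5) * step   # round to the nearest step
    return min(hi, max(lo, q))


def ac_fixed_to_quantized_bits(word_length, integer_length):
    """Map signed ac_fixed<W, I, true> to (bits, integer) as in the note."""
    return word_length, integer_length - 1
```

With bits=4 and integer=0 the step is 0.125, so 0.3 rounds to 0.25 and any input above the range saturates at 0.875.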
- bits
number of bits to perform quantization.
- integer
number of bits to the left of the decimal point.
- symmetric
if true, we will have the same number of values for positive and negative numbers.
- alpha
a tensor or None, the scaling factor per channel. If None, the scaling factor is 1 for all channels.
- keep_negative
if true, we do not clip negative numbers.
- use_stochastic_rounding
if true, we perform stochastic rounding.
- scale_axis
int or List[int] which axis/axes to calculate scale from.
- qnoise_factor
float. a scalar from 0 to 1 that represents the level of quantization noise to add. This controls the amount of the quantization noise to add to the outputs by changing the weighted sum of (1 - qnoise_factor)*unquantized_x + qnoise_factor*quantized_x.
- var_name
String or None. A variable name shared between the Variables created in the build function. If None, it is generated automatically.
- use_ste
Bool. Whether to use “straight-through estimator” (STE) method or not.
- use_variables
Bool. Whether to make the quantizer variables to be dynamic Variables or not.
- elements_per_scale
if set to an int or List[int], we create multiple scales per axis across scale_axis, where ‘elements_per_scale’ represents the number of elements/values associated with every separate scale value. It is only supported when using “auto_po2”.
- min_po2_exponent
if set while using “auto_po2”, it represents the minimum allowed power of two exponent.
- max_po2_exponent
if set while using “auto_po2”, it represents the maximum allowed power of two exponent.
- post_training_scale
if set, it represents the scale value to be used for quantization.
- Returns:
Function that computes fixed-point quantization with bits.
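The weighted sum controlled by qnoise_factor, as described in the parameter list above, amounts to a linear blend between the unquantized and quantized values. The helper name below is hypothetical and the range check is an added safeguard, not something the description promises.

```python
def blend_quantization_noise(x, xq, qnoise_factor=1.0):
    """Weighted sum of the unquantized value x and the quantized value xq."""
    if not 0.0 <= qnoise_factor <= 1.0:
        raise ValueError("qnoise_factor must lie in [0, 1]")
    return (1.0 - qnoise_factor) * x + qnoise_factor * xq
```

A factor of 1.0 returns the fully quantized value and 0.0 returns the input untouched; intermediate values let quantization noise be introduced gradually during training.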
- class qkeras.quantizers.quantized_hswish(bits=8, integer=0, symmetric=0, alpha=None, use_stochastic_rounding=False, scale_axis=None, qnoise_factor=1.0, var_name=None, use_variables=False, relu_shift=3, relu_upper_bound=6)
Bases: quantized_bits
Computes a quantized hard swish to a number of bits.
# TODO(mschoenb97): Update to inherit from quantized_linear.
Equation of the h-swish function in MobileNet v3:
hswish(x) = x * ReluY(x + relu_shift) / Y
where Y is relu_upper_bound.
- Parameters:
relu_shift (int)
relu_upper_bound (int)
- bits
number of bits to perform quantization, also known as word length.
- integer
number of integer bits.
- symmetric
if True, the quantization is in symmetric mode, which puts restricted range for the quantizer. Otherwise, it is in asymmetric mode, which uses the full range.
- alpha
a tensor or None, the scaling factor per channel. If None, the scaling factor is 1 for all channels.
- use_stochastic_rounding
if true, we perform stochastic rounding. This parameter is passed on to the underlying quantizer quantized_bits which is used to quantize h_swish.
- scale_axis
which axis to calculate scale from
- qnoise_factor
float. a scalar from 0 to 1 that represents the level of quantization noise to add. This controls the amount of the quantization noise to add to the outputs by changing the weighted sum of (1 - qnoise_factor)*unquantized_x + qnoise_factor*quantized_x.
- var_name
String or None. A variable name shared between the Variables created in the build function. If None, it is generated automatically.
- use_ste
Bool. Whether to use “straight-through estimator” (STE) method or not.
- use_variables
Bool. Whether to make the quantizer variables to be dynamic Variables or not.
- relu_shift
integer type, representing the shift amount of the unquantized relu.
- relu_upper_bound
integer type, representing an upper bound of the unquantized relu. If None, we apply relu without the upper bound when “is_quantized_clip” is set to false (true by default). Note: The quantized relu uses the quantization parameters (bits and integer) to upper bound. So it is important to set relu_upper_bound appropriately to the quantization parameters. “is_quantized_clip” has precedence over “relu_upper_bound” for backward compatibility.
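The unquantized h-swish equation given for this class can be written out directly. The sketch below covers only that equation for a scalar input, using the default relu_shift and relu_upper_bound from the signature; the quantization step applied afterwards is not modeled.

```python
def hswish(x, relu_shift=3, relu_upper_bound=6):
    """Unquantized hard swish: x * ReluY(x + relu_shift) / Y."""
    y = float(relu_upper_bound)
    relu_y = min(max(x + relu_shift, 0.0), y)   # relu clipped at Y
    return x * relu_y / y
```

For x >= Y - relu_shift the function is the identity, and for x <= -relu_shift it is zero.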
- class qkeras.quantizers.quantized_linear(bits=8, integer=0, symmetric=1, keep_negative=True, alpha=None, use_stochastic_rounding=False, scale_axis=None, qnoise_factor=1.0, var_name=None, use_variables=False)
Bases: BaseQuantizer
Linear quantization with fixed number of bits.
This quantizer maps inputs to the nearest value of a fixed number of outputs that are evenly spaced, with possible scaling and stochastic rounding. This is an updated version of the legacy quantized_bits.
The core computation is:
1. Divide the tensor by a quantization scale
2. Clip the tensor to a specified range
3. Round to the nearest integer
4. Multiply the rounded result by the quantization scale
This clip range is determined by:
- The number of bits we have to represent the number
- Whether we want to have a symmetric range or not
- Whether we want to keep negative numbers or not
The quantization scale is defined by either the quantizer parameters or the data passed to the __call__ method. See documentation for the alpha parameter to find out more.
For backprop purposes, the quantizer uses the straight-through estimator for the rounding step (https://arxiv.org/pdf/1903.05662.pdf). Thus the gradient of the __call__ method is 1 on the interval [quantization_scale * clip_min, quantization_scale * clip_max] and 0 elsewhere.
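The four-step core computation and the straight-through gradient can be sketched for a single scalar as follows. This is an illustration under assumptions: the integer clip range reserves one bit for the sign when keep_negative is true, halves are rounded to the nearest even integer (Python's built-in round), the special bits=1 sign handling is not covered, and all three function names are hypothetical.

```python
def clip_range(bits, symmetric=True, keep_negative=True):
    """Integer clip range; assumes one bit is reserved for the sign."""
    unsigned_bits = bits - (1 if keep_negative else 0)
    clip_max = 2 ** unsigned_bits - 1
    if not keep_negative:
        return 0, clip_max
    clip_min = -(2 ** unsigned_bits) + (1 if symmetric else 0)
    return clip_min, clip_max


def quantize_linear(x, quantization_scale, bits, symmetric=True, keep_negative=True):
    """Divide by the scale, clip, round, then multiply by the scale."""
    clip_min, clip_max = clip_range(bits, symmetric, keep_negative)
    scaled = x / quantization_scale
    clipped = min(max(scaled, clip_min), clip_max)
    rounded = round(clipped)   # halves go to the nearest even integer
    return rounded * quantization_scale


def ste_gradient(x, quantization_scale, bits, symmetric=True, keep_negative=True):
    """Straight-through gradient: 1 inside the clip interval, 0 outside."""
    clip_min, clip_max = clip_range(bits, symmetric, keep_negative)
    lo, hi = quantization_scale * clip_min, quantization_scale * clip_max
    return 1.0 if lo <= x <= hi else 0.0
```

With quantization_scale=2 and bits=2, the inputs 0, 0.5, 1, 1.5 and 2 map to 0, 0, 0, 2 and 2, the same values as the 2-bit “auto” example in this entry.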
The quantizer also supports a number of other optional features:
- Stochastic rounding (see the use_stochastic_rounding parameter)
- Quantization noise (see the qnoise_factor parameter)
Notes on the various “scales” in quantized_linear:
The quantization scale is the scale used in the core computation (see above). You can access it via the quantization_scale attribute.
The data type scale is the scale determined by the type of data stored on hardware on a small device running a true quantized model. It is the quantization scale needed to represent bits bits, integer of which are integer bits, with one bit reserved for the sign if keep_negative is True. It can be calculated as 2 ** (integer - bits + keep_negative). You can access it via the data_type_scale attribute.
The scale attribute stores the quotient of the quantization scale and the data type scale. This is also the scale that can be directly specified by the user, via the alpha parameter.
These three quantities are related by the equation scale = quantization_scale / data_type_scale.
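The relation between the three scales can be checked numerically. The two helpers below are hypothetical stand-ins for the data_type_scale and scale attributes and simply evaluate the formulas quoted above.

```python
def data_type_scale(bits, integer, keep_negative=True):
    """2 ** (integer - bits + keep_negative), as stated above."""
    return 2.0 ** (integer - bits + int(keep_negative))


def user_scale(quantization_scale, bits, integer, keep_negative=True):
    """scale = quantization_scale / data_type_scale."""
    return quantization_scale / data_type_scale(bits, integer, keep_negative)
```

For bits=2, integer=0 and keep_negative=True the data type scale is 0.5, so a quantization scale of 2 corresponds to a scale of 4, consistent with the values printed in the 2-bit “auto” example in this entry.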
See the diagram below of scale usage in a quantized conv layer.
data_type_scale ---------------> stored_weights
(determines decimal point)            |
                                      V
                                   conv op
                                      |
                                      V
                                 accumulator
                                      |
determines quantization               V
range and precision ----------> quantization_scale
(per channel)                         |
                                      V
                                  activation
# TODO: The only fundamentally necessary scale is the quantization scale. We should consider removing the data type scale and scale attributes, but know that this will require rewriting much of how qtools and HLS4ML use these scale attributes.
- Note on binary quantization (bits=1):
The core computation is modified here when keep_negative is True to perform a scaled sign function. This is needed because the core computation as defined above requires that 0 be mapped to 0, which does not allow us to keep both positive and negative outputs for binary quantization. Special shifting operations are used to achieve this.
Example usage:
# 8-bit quantization with 3 integer bits
>>> q = quantized_linear(8, 3)
>>> x = keras.ops.array([0.0, 0.5, 1.0, 1.5, 2.0])
>>> keras.ops.convert_to_numpy(q(x))
array([0., 0., 1., 2., 2.], dtype=float32)

# 2-bit quantization with "auto" and tensor alphas
>>> q_auto = quantized_linear(2, alpha="auto")
>>> x = keras.ops.array([0.0, 0.5, 1.0, 1.5, 2.0])
>>> keras.ops.convert_to_numpy(q_auto(x))
array([0., 0., 0., 2., 2.], dtype=float32)
>>> keras.ops.convert_to_numpy(q_auto.scale)
array([4.], dtype=float32)
>>> keras.ops.convert_to_numpy(q_auto.quantization_scale)
array([2.], dtype=float32)
>>> q_fixed = quantized_linear(2, alpha=q_auto.scale)
>>> q_fixed(x)
array([0., 0., 0., 2., 2.], dtype=float32)
- Args:
bits (int): Number of bits to represent the number. Defaults to 8.
integer (int): Number of bits to the left of the decimal point, used for data_type_scale. Defaults to 0.
symmetric (bool): If true, we will have the same number of values for positive and negative numbers. Defaults to True.
alpha (str, Tensor, None): Instructions for determining the quantization scale. Defaults to None.
- If None: the quantization scale is the data type scale, determined by integer, bits, and keep_negative.
- If “auto”: the quantization scale is calculated as the minimum floating point scale per-channel that does not clip the max of x.
- If “auto_po2”: the quantization scale is chosen as the power of two per-channel that minimizes squared error between the quantized x and the original x.
- If Tensor: the quantization scale is the Tensor passed in multiplied by the data type scale.
keep_negative (bool): If false, we clip negative numbers. Defaults to True.
use_stochastic_rounding (bool): If true, we perform stochastic rounding.
scale_axis (int, None): Which axis to calculate scale from. If None, we perform per-channel scaling based off of the image data format. Note that each entry of a rank-1 tensor is considered its own channel by default. See _get_scaling_axis for more details. Defaults to None.
qnoise_factor (float): A scalar from 0 to 1 that represents the level of quantization noise to add. This controls the amount of the quantization noise to add to the outputs by changing the weighted sum of (1 - qnoise_factor) * unquantized_x + qnoise_factor * quantized_x. Defaults to 1.0, which means that the result is fully quantized.
use_variables (bool): If true, we use Variables to store certain parameters. See the BaseQuantizer implementation for more details. Defaults to False. If set to True, be sure to use the special attribute update methods detailed in the BaseQuantizer.
var_name (str or None): A variable name shared between the Variables created on initialization, if use_variables is true. If None, the variable names are generated automatically based on the parameter names along with a uid. Defaults to None.
- Returns:
Function that computes linear quantization.
- Return type:
function
- Raises:
ValueError –
- If bits is not positive, or is too small to represent integer.
- If integer is negative.
- If alpha is a string but not one of (“auto”, “auto_po2”).
- ALPHA_STRING_OPTIONS = ('auto', 'auto_po2')
- property auto_alpha
Returns true if using a data-dependent alpha
- property bits
- property data_type_scale
Quantization scale for the data type
- property default_quantization_scale
Calculate and set quantization_scale default
- property integer
- property keep_negative
- property scale
- property scale_axis
- property use_sign_function
Return true if using sign function for quantization
- property use_stochastic_rounding
- property use_variables
- class qkeras.quantizers.quantized_po2(bits=8, max_value=None, use_stochastic_rounding=False, quadratic_approximation=False, log2_rounding='rnd', qnoise_factor=1.0, var_name=None, use_ste=True, use_variables=False)
Bases: BaseQuantizer
Quantizes to the closest power of 2.
- bits
An integer, the bits allocated for the exponent, its sign and the sign of x.
- max_value
A float or None. If None, no max_value is specified. Otherwise, the maximum value of quantized_po2 <= max_value.
- use_stochastic_rounding
A boolean, default is False. If True, it uses stochastic rounding and forces the mean of x to be x statistically.
- quadratic_approximation
A boolean, default is False. If True, it forces the exponent to be the even number closest to x.
- log2_rounding
A string, log2 rounding mode. “rnd” and “floor” currently supported, corresponding to keras.ops.round and keras.ops.floor respectively.
- qnoise_factor
float. a scalar from 0 to 1 that represents the level of quantization noise to add. This controls the amount of the quantization noise to add to the outputs by changing the weighted sum of (1 - qnoise_factor)*unquantized_x + qnoise_factor*quantized_x.
- var_name
String or None. A variable name shared between the Variables created in the build function. If None, it is generated automatically.
- use_ste
Bool. Whether to use “straight-through estimator” (STE) method or not.
- use_variables
Bool. Whether to make the quantizer variables to be dynamic Variables or not.
- get_config()
Gets configuration of the quantizer.
- Returns:
A dict mapping quantization configuration, including:
- bits: bitwidth for exponents.
- max_value: the maximum value this quantized_po2 can represent.
- use_stochastic_rounding: if True, stochastic rounding is used.
- quadratic_approximation: if True, the exponent is enforced to be an even number, which is the closest one to x.
- log2_rounding: a string, the log2 rounding mode.
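A simplified scalar sketch of power-of-two quantization is given below. It rounds log2|x| using the two modes named in log2_rounding and applies max_value, but it ignores the exponent range implied by bits, stochastic rounding and quadratic_approximation; `quantize_po2` is a hypothetical helper name.

```python
import math


def quantize_po2(x, max_value=None, log2_rounding="rnd"):
    """Snap x to a signed power of two (sketch; ignores the bits limit)."""
    if x == 0:
        return 0.0
    log2 = math.log2(abs(x))
    if log2_rounding == "rnd":
        exponent = round(log2)
    elif log2_rounding == "floor":
        exponent = math.floor(log2)
    else:
        raise ValueError("log2_rounding must be 'rnd' or 'floor'")
    magnitude = 2.0 ** exponent
    if max_value is not None:
        magnitude = min(magnitude, max_value)
    return math.copysign(magnitude, x)
```

Because the rounding happens in the log domain, 3.0 maps to 4.0 under "rnd" and to 2.0 under "floor".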
- class qkeras.quantizers.quantized_relu(bits=8, integer=0, use_sigmoid=0, negative_slope=0.0, use_stochastic_rounding=False, relu_upper_bound=None, is_quantized_clip=True, qnoise_factor=1.0, var_name=None, use_ste=True, use_variables=False)
Bases: BaseQuantizer
Computes a quantized relu to a number of bits.
Modified from:
[https://github.com/BertMoons/QuantizedNeuralNetworks-Keras-Tensorflow]
Assume h(x) = +1 with p = sigmoid(x), -1 otherwise, the expected value of h(x) is:
E[h(x)] = +1 P(p <= sigmoid(x)) - 1 P(p > sigmoid(x))
        = +1 P(p <= sigmoid(x)) - 1 (1 - P(p <= sigmoid(x)))
        = 2 P(p <= sigmoid(x)) - 1
        = 2 sigmoid(x) - 1, if p is sampled from a uniform distribution U[0,1]
If use_sigmoid is 0, we just keep the positive numbers up to 2**integer * (1 - 2**(-bits)) instead of normalizing them, which is easier to implement in hardware.
- bits
number of bits to perform quantization.
- integer
number of bits to the left of the decimal point.
- use_sigmoid
if true, we apply sigmoid to input to normalize it.
- negative_slope
slope when activation < 0, needs to be power of 2.
- use_stochastic_rounding
if true, we perform stochastic rounding.
- relu_upper_bound
A float representing an upper bound of the unquantized relu. If None, we apply relu without the upper bound when “is_quantized_clip” is set to false (true by default). Note: The quantized relu uses the quantization parameters (bits and integer) to upper bound. So it is important to set relu_upper_bound appropriately to the quantization parameters. “is_quantized_clip” has precedence over “relu_upper_bound” for backward compatibility.
- is_quantized_clip
A boolean representing whether the inputs are clipped to the maximum value represented by the quantization parameters. This parameter is deprecated, and the default is set to True for backwards compatibility. Users are encouraged to use “relu_upper_bound” instead.
- qnoise_factor
float. a scalar from 0 to 1 that represents the level of quantization noise to add. This controls the amount of the quantization noise to add to the outputs by changing the weighted sum of (1 - qnoise_factor)*unquantized_x + qnoise_factor*quantized_x.
- var_name
String or None. A variable name shared between the Variables created in the build function. If None, it is generated automatically.
- use_ste
Bool. Whether to use “straight-through estimator” (STE) method or not.
- use_variables
Bool. Whether to make the quantizer variables to be dynamic Variables or not.
- Returns:
Function that performs relu + quantization to bits >= 0.
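For the use_sigmoid=0 case, the behavior described above (keep positive values up to 2**integer * (1 - 2**(-bits))) can be sketched for a scalar as follows. The sketch assumes a zero negative_slope, round-to-nearest, and that all bits hold magnitude; it is not the library's implementation and ignores relu_upper_bound.

```python
import math


def quantized_relu(x, bits=8, integer=0):
    """Relu followed by unsigned fixed-point rounding (use_sigmoid=0 case)."""
    step = 2.0 ** (integer - bits)                   # all bits hold magnitude
    upper = 2.0 ** integer * (1.0 - 2.0 ** (-bits))  # bound quoted above
    q = math.floor(max(x, 0.0) / step + 0.5) * step  # round to nearest step
    return min(q, upper)
```

With bits=4 and integer=0 the step is 0.0625 and the largest output is 0.9375, so negative inputs give 0 and large inputs saturate.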
- class qkeras.quantizers.quantized_relu_po2(bits=8, max_value=None, negative_slope=0, use_stochastic_rounding=False, quadratic_approximation=False, log2_rounding='rnd', qnoise_factor=1.0, var_name=None, use_ste=True, use_variables=False)
Bases: BaseQuantizer
Quantizes x to the closest power of 2 when x > 0.
- bits
An integer, the bits allocated for the exponent and its sign.
- max_value
default is None, or a non-negative value to put a constraint for the max value.
- negative_slope
slope when activation < 0, needs to be power of 2.
- use_stochastic_rounding
A boolean, default is False. If True, it uses stochastic rounding and forces the mean of x to be x statistically.
- quadratic_approximation
A boolean, default is False. If True, it forces the exponent to be the even number that is closest to x.
- log2_rounding
A string, log2 rounding mode. “rnd” and “floor” currently supported, corresponding to keras.ops.round and keras.ops.floor respectively.
- qnoise_factor
float. a scalar from 0 to 1 that represents the level of quantization noise to add. This controls the amount of the quantization noise to add to the outputs by changing the weighted sum of (1 - qnoise_factor)*unquantized_x + qnoise_factor*quantized_x.
- var_name
String or None. A variable name shared between the Variables created in the build function. If None, it is generated automatically.
- use_ste
Bool. Whether to use “straight-through estimator” (STE) method or not.
- use_variables
Bool. Whether to make the quantizer variables to be dynamic Variables or not.
- get_config()
Gets configuration of the quantizer.
- Returns:
A dict mapping quantization configuration, including:
- bits: bitwidth for exponents.
- max_value: the maximum value this quantized_relu_po2 can represent.
- use_stochastic_rounding: if True, stochastic rounding is used.
- quadratic_approximation: if True, the exponent is enforced to be an even number, which is the closest one to x.
- log2_rounding: a string, the log2 rounding mode.
- class qkeras.quantizers.quantized_sigmoid(bits=8, symmetric=False, use_real_sigmoid=False, use_stochastic_rounding=False)
Bases: BaseQuantizer
Computes a quantized sigmoid to a number of bits.
- bits
number of bits to perform quantization.
- symmetric
if true, we will have the same number of values for positive and negative numbers.
- use_real_sigmoid
if true, will use the sigmoid from Keras backend
- use_stochastic_rounding
if true, we perform stochastic rounding.
- Returns:
Function that performs sigmoid + quantization to bits in the range 0.0 to 1.0.
- class qkeras.quantizers.quantized_tanh(bits=8, use_stochastic_rounding=False, symmetric=False, use_real_tanh=False)
Bases: BaseQuantizer
Computes a quantized tanh to a number of bits.
Modified from:
[https://github.com/BertMoons/QuantizedNeuralNetworks-Keras-Tensorflow]
- bits
number of bits to perform quantization.
- use_stochastic_rounding
if true, we perform stochastic rounding.
- symmetric
if true, we will have the same number of values for positive and negative numbers.
- use_real_tanh
if true, use the tanh function from Keras backend, if false, use tanh that is defined as 2 * sigmoid(x) - 1
- Returns:
Function that performs tanh + quantization to bits in the range -1.0 to 1.0.
- class qkeras.quantizers.quantized_ulaw(bits=8, integer=0, symmetric=0, u=255.0)
Bases: BaseQuantizer
Computes a u-law quantization.
- bits
number of bits to perform quantization.
- integer
number of bits to the left of the decimal point.
- symmetric
if true, we will have the same number of values for positive and negative numbers.
- u
parameter of u-law
- Returns:
Function that performs ulaw + quantization to bits in the range -1.0 to 1.0.
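For orientation, the textbook u-law (mu-law) companding curve is sketched below for a scalar in [-1, 1]. This is the standard formula sign(x) * ln(1 + u|x|) / ln(1 + u), shown as background only; how the class combines it with rounding to bits is not described here and is not modeled, and `mu_law_compress` is a hypothetical name.

```python
import math


def mu_law_compress(x, u=255.0):
    """Textbook mu-law companding for x in [-1, 1]."""
    x = min(1.0, max(-1.0, x))   # keep the input inside the valid range
    return math.copysign(math.log1p(u * abs(x)) / math.log1p(u), x)
```

The curve expands small magnitudes and compresses large ones, so a uniform quantizer applied afterwards spends more levels near zero.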
- qkeras.quantizers.smooth_sigmoid(x)
Implements a linear approximation of a sigmoid function.
- qkeras.quantizers.smooth_tanh(x)
Computes smooth_tanh function that saturates between -1 and 1.
- class qkeras.quantizers.stochastic_binary(alpha=None, temperature=6.0, use_real_sigmoid=True)
Bases: binary
Computes a stochastic activation function returning -alpha or +alpha.
Computes straight-through approximation using random sampling to make E[dL/dy] = E[dL/dx], and computing the sign function. See explanation above.
- x
tensor to perform sign operation with stochastic sampling.
- alpha
binary is -alpha or +alpha, or “auto” or “auto_po2”.
- bits
number of bits to perform quantization.
- temperature
amplifier factor for sigmoid function, making stochastic behavior less stochastic as it moves away from 0.
- use_real_sigmoid
use real sigmoid from tensorflow for probability.
- Returns:
Computation of sign with stochastic sampling with straight through gradient.
- qkeras.quantizers.stochastic_round(x, precision=0.5)
Performs stochastic rounding to the first decimal point.
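The general idea of stochastic rounding is sketched below on an integer grid for simplicity. How the precision argument enters the library's computation is not described here, so the sketch leaves it out and uses a hypothetical rng argument instead; it should be read as an illustration of the technique only.

```python
import math
import random


def stochastic_round(x, rng=random):
    """Round x down or up so that the expected result equals x."""
    lower = math.floor(x)
    frac = x - lower                     # probability of rounding up
    return lower + (1 if rng.random() < frac else 0)
```

Averaged over many calls the result approaches x, which is why stochastic rounding is unbiased where round-to-nearest is not.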
- qkeras.quantizers.stochastic_round_po2(x)
Performs stochastic rounding for the power of two.
- class qkeras.quantizers.stochastic_ternary(alpha=None, threshold=None, temperature=8.0, use_real_sigmoid=True, number_of_unrolls=5)
Bases: ternary
Computes a stochastic activation function returning -alpha, 0 or +alpha.
Computes straight-through approximation using random sampling to make E[dL/dy] = E[dL/dx], and computing the sign function. See explanation above.
- x
tensor to perform sign operation with stochastic sampling.
- bits
number of bits to perform quantization.
- alpha
ternary is -alpha or +alpha, or “auto” or “auto_po2”.
- threshold
(1-threshold) specifies the spread of the +1 and -1 values.
- temperature
amplifier factor for the sigmoid function, making the output less stochastic as x moves away from 0.
- use_real_sigmoid
use real sigmoid for probability.
- number_of_unrolls
number of times we iterate between scale and threshold.
- Returns:
Computation of sign with stochastic sampling with straight through gradient.
- class qkeras.quantizers.ternary(alpha=None, threshold=None, use_stochastic_rounding=False, number_of_unrolls=5)
Bases: BaseQuantizer
Computes an activation function returning -alpha, 0 or +alpha.
Right now we assume two types of behavior. For parameters, we should have alpha, threshold and stochastic rounding on. For activations, alpha and threshold should be floating point numbers, and stochastic rounding should be off.
- x
tensor to perform sign operation with stochastic sampling.
- bits
number of bits to perform quantization.
- alpha
ternary is -alpha or +alpha. Alpha can be “auto” or “auto_po2”.
- threshold
threshold to apply “dropout” or dead band (0 value). If “auto” is specified, we will compute it per output layer.
- use_stochastic_rounding
if true, we perform stochastic rounding.
- Returns:
Computation of sign within the threshold.
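The dead-band behavior of a ternary quantizer can be illustrated for a scalar with fixed floating-point alpha and threshold, which corresponds to the activation case described above. The function name and the default threshold value are hypothetical, and the "auto" modes and stochastic rounding are not modeled.

```python
def ternary_forward(x, alpha=1.0, threshold=0.33):
    """Return -alpha, 0 or +alpha; |x| <= threshold falls in the dead band."""
    if x > threshold:
        return alpha
    if x < -threshold:
        return -alpha
    return 0.0
```

Widening the threshold produces more zeros, which trades accuracy for sparsity.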