qkeras.codebook
Clustering-based quantizers
Functions
activation_compression | Applies clustering-based non-uniform quantization inspired by https://arxiv.org/pdf/1911.02079.pdf
create_in_out_table | Creates the [in, out] tables needed to map compressed activations to codebook values.
two_tier_embedding_compression | Creates tables that map embedding values to their codebook values.
weight_compression | Creates an [in, out] table that maps weight values to their codebook values.
- qkeras.codebook.activation_compression(model, compile_config, activation_indexes, bits, X_train, y_train, X_test, y_test, sample_size=1.0)[source]
This function applies clustering-based non-uniform quantization inspired by https://arxiv.org/pdf/1911.02079.pdf
- Parameters:
model – Keras model
compile_config – Dictionary of arguments to be passed to model.compile() for all submodels
activation_indexes – Index list of layers to be quantized. This is used to split the model and create submodels
bits – Number of bits to compress activations to. This results in 2**bits codebook values
X_train, y_train – training data used to fit the clustering algorithm
X_test, y_test – validation data
sample_size – fraction of training data activations to be used when computing codebook values
- Returns:
cb_tables: [in, out] tables. See create_in_out_table docs
models: list of Keras submodels
km_models: list of fitted KMeans models
- qkeras.codebook.create_in_out_table(km, quantizer)[source]
Create [in, out] table needed to map compressed activations to codebook values. Given v: in_table[out_table[v]] => codebook value of v
- Parameters:
km – KMeans model
quantizer – quantizer function to apply to out_table
- Returns:
in_table: conversion of compressed table indexes to n-bit numbers
out_table: conversion of n-bit output activations to compressed table indexes
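The relationship in_table[out_table[v]] can be illustrated with a small NumPy sketch. This is not the QKeras implementation: the helper name build_in_out_tables, the hand-picked cluster centers, and the nearest-center assignment are all assumptions standing in for the fitted KMeans model and the quantizer that the real function takes.

```python
import numpy as np

def build_in_out_tables(centers, n_bits):
    """Hypothetical helper (not part of qkeras.codebook).

    Builds lookup tables for unsigned n_bits-wide activation values,
    given an already-known set of codebook centers.
    """
    centers = np.sort(np.asarray(centers, dtype=np.float64))
    values = np.arange(2 ** n_bits, dtype=np.float64)
    # out_table: every possible n-bit value -> index of its nearest center
    out_table = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
    # in_table: compressed index -> codebook value
    in_table = centers.copy()
    return in_table, out_table

# Assumed example: 4-bit activations compressed to a 2-bit codebook.
in_table, out_table = build_in_out_tables([1.0, 6.0, 12.0, 14.0], n_bits=4)
v = 5
decoded = in_table[out_table[v]]  # codebook value that stands in for v
```

Here out_table has one entry per possible activation value, while in_table has one entry per codebook value, so only the short compressed index needs to be carried between the two lookups.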
- qkeras.codebook.two_tier_embedding_compression(embeddings, bits, quantizer=None)[source]
Creates tables that map embedding values to their codebook values. Based on the idea presented by https://arxiv.org/pdf/1911.02079.pdf
- Parameters:
embeddings – Numpy array
bits – Number of bits to compress weights to. This results in 2**bits codebook values
quantizer – quantizer function that will be applied to codebook values
- Returns:
array of indices that maps to codebook values cluster_index_table: array that maps each row to the codebook table
index
codebook_table: array of codebook values quantized_embeddings: Numpy array MxN of quantized weights
- Return type:
index_table
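One plausible way the returned tables fit together is a two-step lookup: the row selects a codebook, and the per-element index selects an entry within it. The sketch below uses hand-written tables to show that decoding step only. The 2-D layout of codebook_table (one row per row-cluster) is an assumption that the descriptions above do not confirm, and the values are made up for illustration.

```python
import numpy as np

# Assumed layout: one codebook row per row-cluster (illustrative values only).
codebook_table = np.array([[-1.0, 0.0, 0.5, 1.0],
                           [-4.0, -2.0, 2.0, 4.0]])

# Which codebook row each embedding row uses.
cluster_index_table = np.array([0, 1, 0])

# Per-element index into the codebook row chosen for that embedding row.
index_table = np.array([[0, 3, 1],
                        [2, 2, 0],
                        [1, 1, 2]])

# Two-tier lookup: pick the row's codebook, then the element's entry in it.
quantized_embeddings = codebook_table[cluster_index_table[:, None], index_table]
```

Under this reading, storage per element drops to the width of an index, plus one cluster index per row and the small codebook table.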
- qkeras.codebook.weight_compression(weights, bits, axis=0, quantizer=None)[source]
Creates an in, out table that maps weight values to their codebook values. Based on the idea presented by https://arxiv.org/pdf/1911.02079.pdf
- Parameters:
weights – Numpy array
bits – Number of bits to compress weights to. This results in 2**bits codebook values
axis – axis to apply quantization by
quantizer – quantizer function that will be applied to codebook values
- Returns:
- Returns:
index_table: array of indices that map to codebook values for all weights
codebook_table: array of codebook values
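The general idea of replacing weights with indices into a 2**bits codebook can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not the QKeras code: codebook_compress is a hypothetical helper, it runs a simple 1-D Lloyd (k-means style) loop over all weights at once, and it ignores the axis and quantizer parameters.

```python
import numpy as np

def codebook_compress(weights, bits, n_iter=20):
    """Hypothetical helper (not qkeras.codebook.weight_compression).

    Clusters all weight values into 2**bits codebook values and returns
    (index_table, codebook_table).
    """
    flat = np.asarray(weights, dtype=np.float64).ravel()
    k = 2 ** bits
    # Start the centers at evenly spaced quantiles of the weight values.
    centers = np.quantile(flat, np.linspace(0.0, 1.0, k))
    for _ in range(n_iter):
        # Assign each weight to its nearest center.
        idx = np.abs(flat[:, None] - centers[None, :]).argmin(axis=1)
        # Move each non-empty center to the mean of its members.
        for j in range(k):
            members = flat[idx == j]
            if members.size:
                centers[j] = members.mean()
    # Final assignment against the final centers.
    idx = np.abs(flat[:, None] - centers[None, :]).argmin(axis=1)
    return idx.reshape(np.shape(weights)), centers

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 4))
index_table, codebook_table = codebook_compress(w, bits=2)
w_hat = codebook_table[index_table]  # weights rebuilt from the codebook
```

With bits=2, every weight is stored as a 2-bit index and the four codebook values are stored once, which is where the compression comes from.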