qkeras.codebook

Clustering-based quantizers

Functions

`activation_compression`(model, ...[, sample_size])	This function applies clustering-based non-uniform quantization inspired by https://arxiv.org/pdf/1911.02079.pdf
`create_in_out_table`(km, quantizer)	Create [in, out] table needed to map compressed activations to codebook values.
`two_tier_embedding_compression`(embeddings, bits)	Creates tables that map embedding values to their codebook values.
`weight_compression`(weights, bits[, axis, ...])	Creates an in, out table that maps weight values to their codebook values.

qkeras.codebook.activation_compression(model, compile_config, activation_indexes, bits, X_train, y_train, X_test, y_test, sample_size=1.0)[source]

This function applies clustering-based non-uniform quantization inspired by https://arxiv.org/pdf/1911.02079.pdf

Parameters:

model – Keras model
compile_config – Dictionary of arguments to be passed to model.compile() for all submodels
activation_indexes – Index list of layers to be quantized. This is used to split the model and create submodels
bits – Number of bits to compress activations to. This results in 2**bits codebook values
X_train, y_train – Training data used to fit the clustering algorithm
X_test, y_test – Validation data
sample_size – Fraction of training data activations to be used when computing codebook values

Returns:

cb_tables – [in, out] tables. See create_in_out_table docs
models – list of Keras submodels
km_models – list of fitted KMeans models
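
The relationship between bits, sample_size and the codebook can be illustrated with a small NumPy-only sketch. It does not call qkeras: `kmeans_1d` is a hypothetical helper implementing plain Lloyd's iterations, and the random array merely stands in for the activations of one layer. The actual function may differ in how it fits and applies the clustering.

```python
import numpy as np

def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's algorithm on scalars (hypothetical helper, not qkeras)."""
    values = np.asarray(values, dtype=np.float64).ravel()
    # Start from evenly spaced quantiles so the result is deterministic.
    centers = np.quantile(values, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        # Assign each value to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            members = values[labels == j]
            if members.size:  # an empty cluster keeps its previous center
                centers[j] = members.mean()
    return np.sort(centers)

rng = np.random.default_rng(0)
activations = rng.normal(size=10_000)  # stand-in for one layer's activations

bits = 3           # gives 2**bits = 8 codebook values
sample_size = 0.5  # fraction of activations used to fit the codebook
n_keep = int(sample_size * activations.size)
sample = rng.choice(activations, size=n_keep, replace=False)

codebook = kmeans_1d(sample, 2 ** bits)

# Compress: replace every activation by the index of its nearest codebook value.
compressed = np.argmin(np.abs(activations[:, None] - codebook[None, :]), axis=1)
# Decompress: look the index up in the codebook again.
reconstructed = codebook[compressed]
```

Each activation is then stored as a bits-wide index, and the codebook holds the 2**bits values those indexes stand for.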

qkeras.codebook.create_in_out_table(km, quantizer)[source]

Create [in, out] table needed to map compressed activations to codebook values. Given v: in_table[out_table[v]] => codebook value of v

Parameters:

km – KMeans model
quantizer – quantizer function to apply to out_table

Returns:

in_table – conversion of compressed table indexes to n-bit numbers
out_table – conversion of n-bit output activations to compressed table indexes
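
The lookup in_table[out_table[v]] can be sketched with NumPy alone. The values below are hypothetical: `qrange` stands in for the numbers an n-bit quantizer can represent, `centers` stands in for fitted cluster centers, and v is assumed to be the index of an n-bit activation value. This illustrates the two-table idea and is not the library's code.

```python
import numpy as np

bits = 4
# Hypothetical stand-in for the 2**bits values an n-bit quantizer can represent.
qrange = np.linspace(-1.0, 1.0, 2 ** bits)
# Hypothetical stand-in for the fitted cluster centers (the codebook).
centers = np.array([-0.8, -0.2, 0.3, 0.8])

# in_table: compressed index -> codebook value.
in_table = centers
# out_table: index of an n-bit value -> index of its nearest cluster center.
out_table = np.argmin(np.abs(qrange[:, None] - centers[None, :]), axis=1)

v = 10  # index of an n-bit activation value (qrange[10] is about 0.333)
codebook_value = in_table[out_table[v]]
```

With these numbers, the value at index 10 is closest to the center 0.3, so the chained lookup returns 0.3.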

qkeras.codebook.two_tier_embedding_compression(embeddings, bits, quantizer=None)[source]

Creates tables that map embedding values to their codebook values. Based on the idea presented by https://arxiv.org/pdf/1911.02079.pdf

Parameters:

embeddings – Numpy array
bits – Number of bits to compress embeddings to. This results in 2**bits codebook values
quantizer – quantizer function that will be applied to codebook values

Returns:

index_table – array of indices that map to codebook values
cluster_index_table – array that maps each row to the codebook table index
codebook_table – array of codebook values
quantized_embeddings – Numpy array MxN of quantized weights
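
As a rough illustration of how the returned tables fit together, the sketch below rebuilds quantized embeddings from hand-written tables. It assumes that codebook_table holds one row of 2**bits values per cluster and that cluster_index_table selects that row for each embedding row. The layout is inferred from the descriptions above, and the numbers are made up.

```python
import numpy as np

bits = 2  # 2**bits = 4 codebook values per row of codebook_table

# Hypothetical tables for 3 embedding rows of width 4.
codebook_table = np.array([
    [-1.0, -0.5, 0.5, 1.0],
    [-0.2, -0.1, 0.1, 0.2],
    [0.0, 1.0, 2.0, 3.0],
    [-3.0, -2.0, -1.0, 0.0],
])
# Tier 1: which codebook row each embedding row uses.
cluster_index_table = np.array([1, 0, 2])
# Tier 2: which entry of that codebook row each element uses.
index_table = np.array([
    [0, 3, 1, 2],
    [2, 2, 0, 1],
    [3, 0, 0, 1],
])

# Broadcast the per-row codebook choice against the per-element index.
quantized_embeddings = codebook_table[cluster_index_table[:, None], index_table]
```

Under this layout each element costs bits of storage plus one cluster index per row, instead of a full floating-point value.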

qkeras.codebook.weight_compression(weights, bits, axis=0, quantizer=None)[source]

Creates an in, out table that maps weight values to their codebook values. Based on the idea presented by https://arxiv.org/pdf/1911.02079.pdf

Parameters:

weights – Numpy array
bits – Number of bits to compress weights to. This results in 2**bits codebook values
axis – axis to apply quantization by
quantizer – quantizer function that will be applied to codebook values

Returns:

index_table – array of indices that map to codebook values for all weights
codebook_table – array of codebook values
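
The sketch below shows one way an index table and a codebook table could relate to the weights. It assumes one codebook row per slice along axis, which is an interpretation of "axis to apply quantization by" rather than something stated above. `assign_to_codebooks` is a hypothetical helper that covers only the nearest-value assignment step, using hand-written codebooks in place of fitted ones.

```python
import numpy as np

def assign_to_codebooks(weights, codebook_table, axis=0):
    """Map each weight to the nearest value in its slice's codebook row.

    Hypothetical helper: assumes codebook_table[i] belongs to slice i along axis.
    """
    weights = np.asarray(weights)
    # Move the chosen axis to the front so slice i pairs with codebook_table[i].
    moved = np.moveaxis(weights, axis, 0)
    index_moved = np.empty(moved.shape, dtype=np.uint8)
    for i, w in enumerate(moved):
        dist = np.abs(w[..., None] - codebook_table[i])
        index_moved[i] = np.argmin(dist, axis=-1)
    # Restore the original axis order.
    return np.moveaxis(index_moved, 0, axis)

weights = np.array([[0.9, -0.4, 0.1],
                    [2.2, 0.3, 1.1]])
# Made-up codebooks with 2**bits = 4 values each (bits = 2), one per row of weights.
codebook_table = np.array([[-0.5, 0.0, 0.5, 1.0],
                           [0.0, 1.0, 2.0, 3.0]])

index_table = assign_to_codebooks(weights, codebook_table, axis=0)

# Reconstruction for axis=0: row i of index_table indexes codebook_table[i].
rows = np.arange(weights.shape[0])[:, None]
reconstructed = codebook_table[rows, index_table]
```

The index table has the same shape as the weights, so only the small codebook table needs to be kept in full precision.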