qkeras.codebook

Clustering-based quantizers

Functions

activation_compression(model, ...[, sample_size])

Applies clustering-based non-uniform quantization to activations, inspired by https://arxiv.org/pdf/1911.02079.pdf

create_in_out_table(km, quantizer)

Create [in, out] table needed to map compressed activations to codebook values.

two_tier_embedding_compression(embeddings, bits)

Creates tables that map embedding values to their codebook values.

weight_compression(weights, bits[, axis, ...])

Creates an in, out table that maps weight values to their codebook values.

qkeras.codebook.activation_compression(model, compile_config, activation_indexes, bits, X_train, y_train, X_test, y_test, sample_size=1.0)[source]

Applies clustering-based non-uniform quantization to activations, inspired by https://arxiv.org/pdf/1911.02079.pdf

Parameters:
  • model – Keras model

  • compile_config – Dictionary of arguments to be passed to model.compile() for all submodels

  • activation_indexes – Index list of layers to be quantized. This is used to split the model and create submodels

  • bits – Number of bits to compress activations to. This results in 2**bits codebook values

  • X_train, y_train – Training data used to fit the clustering algorithm

  • X_test, y_test – Validation data

  • sample_size – Fraction of training data activations to be used when computing codebook values

Returns:
  • cb_tables – [in, out] tables. See create_in_out_table docs

  • models – list of Keras submodels

  • km_models – list of fitted KMeans models

qkeras.codebook.create_in_out_table(km, quantizer)[source]

Create [in, out] table needed to map compressed activations to codebook values. Given v: in_table[out_table[v]] => codebook value of v

Parameters:
  • km – KMeans model

  • quantizer – quantizer function to apply to out_table

Returns:
  • in_table – conversion of compressed table indexes to n-bit numbers

  • out_table – conversion of n-bit output activations to compressed table indexes
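The lookup in_table[out_table[v]] can be illustrated with a small, self-contained sketch. It is not the QKeras implementation: build_tables is a hypothetical helper, the codebook values are made up, and activations are assumed to be unsigned integers in the range 0..n_values-1.

```python
def build_tables(centers, n_values):
    """Hypothetical helper (not part of QKeras): build [in, out] lookup tables.

    in_table[i]  -> codebook value stored at compressed index i
    out_table[v] -> compressed index of the codebook value nearest to v
    """
    in_table = sorted(centers)
    out_table = [
        min(range(len(in_table)), key=lambda i: abs(in_table[i] - v))
        for v in range(n_values)
    ]
    return in_table, out_table


# Assumed example: a 2-bit codebook (4 values) for 4-bit activations (16 levels).
centers = [1.0, 6.0, 12.0, 14.0]
in_table, out_table = build_tables(centers, 16)

v = 5
codebook_value = in_table[out_table[v]]  # nearest codebook value to 5
```

Here out_table needs one entry per possible activation value, while in_table needs only one entry per codebook value, so the compressed index fits in fewer bits than the original activation.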

qkeras.codebook.two_tier_embedding_compression(embeddings, bits, quantizer=None)[source]

Creates tables that map embedding values to their codebook values. Based on the idea presented by https://arxiv.org/pdf/1911.02079.pdf

Parameters:
  • embeddings – Numpy array

  • bits – Number of bits to compress weights to. This results in 2**bits codebook values

  • quantizer – quantizer function that will be applied to codebook values

Returns:
  • index_table – array of indices that map to codebook values

  • cluster_index_table – array that maps each row to the codebook table index

  • codebook_table – array of codebook values

  • quantized_embeddings – Numpy array MxN of quantized weights
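One plausible reading of how the returned tables fit together is sketched below. It assumes that codebook_table holds one codebook per row cluster and that each row of index_table indexes into the codebook selected by cluster_index_table. That layout is an assumption drawn from the descriptions above, not confirmed behavior, and all table contents are made up.

```python
# Hypothetical tables for a 4x3 embedding matrix with 1-bit codes
# (2 values per codebook). The layout is an assumption, not QKeras output.
codebook_table = [
    [-1.0, 1.0],  # codebook 0
    [-0.1, 0.1],  # codebook 1
]
cluster_index_table = [0, 1, 1, 0]  # row -> which codebook that row uses
index_table = [
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
]

# Two-tier lookup: pick the row's codebook first, then the value inside it.
quantized_embeddings = [
    [codebook_table[cluster_index_table[r]][i] for i in row]
    for r, row in enumerate(index_table)
]
```

Under this reading, rows with similar value ranges share a codebook, so each codebook can stay small while still matching the scale of its rows.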

qkeras.codebook.weight_compression(weights, bits, axis=0, quantizer=None)[source]

Creates an in, out table that maps weight values to their codebook values. Based on the idea presented by https://arxiv.org/pdf/1911.02079.pdf

Parameters:
  • weights – Numpy array

  • bits – Number of bits to compress weights to. This results in 2**bits codebook values

  • axis – axis to apply quantization by

  • quantizer – quantizer function that will be applied to codebook values

Returns:
  • index_table – array of indices that map to codebook values for all weights

  • codebook_table – array of codebook values
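The general idea of mapping weights to 2**bits codebook values can be sketched with a plain one-dimensional k-means in NumPy. This is a minimal illustration under stated assumptions, not the QKeras implementation: codebook_compress is a hypothetical function, it clusters all weights together (ignoring the axis argument), and it applies no quantizer to the codebook.

```python
import numpy as np


def codebook_compress(weights, bits, n_iter=20):
    """Hypothetical sketch (not QKeras): cluster weights into 2**bits values."""
    flat = weights.reshape(-1).astype(np.float64)
    k = 2 ** bits
    # Deterministic initialization: spread the centers over the value range.
    codebook = np.linspace(flat.min(), flat.max(), k)
    for _ in range(n_iter):
        # Assign every weight to its nearest codebook value.
        idx = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
        # Move each codebook value to the mean of its members.
        for j in range(k):
            members = flat[idx == j]
            if members.size:  # empty clusters keep their previous value
                codebook[j] = members.mean()
    idx = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
    return idx.reshape(weights.shape), codebook


weights = np.array([[0.10, 0.12, -0.50], [0.90, -0.48, 0.88]])
index_table, codebook_table = codebook_compress(weights, bits=2)

# Reconstruct the compressed weights by indexing into the codebook.
reconstructed = codebook_table[index_table]
```

Storing index_table at the chosen bit width plus the small codebook_table takes less space than storing the full-precision weights, at the cost of the reconstruction error between reconstructed and weights.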