
Intel AI’s DISTILLER
The Intel AI team has built Distiller with the following features and tools, keeping both DL researchers and engineers in mind:

  • A framework for integrating pruning, regularization, and quantization algorithms
  • A set of tools for analyzing and evaluating compression performance
  • Example implementations of state-of-the-art compression algorithms


Pruning and regularization are two methods that can be used to induce sparsity in a DNN’s parameter tensors. Sparsity is a measure of how many elements in a tensor are exact zeros, relative to the tensor’s size. Sparse tensors can be stored more compactly in memory and can reduce the compute and energy budgets required to carry out DNN operations. Quantization is a method to reduce the precision of the data types used in a DNN, again leading to reduced memory, energy, and compute requirements. Distiller provides a growing set of state-of-the-art methods and algorithms for quantization, pruning (structured and fine-grained), and sparsity-inducing regularization, leading the way to faster, smaller, and more energy-efficient models.
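For example, the sparsity measure described above can be computed in a few lines of PyTorch. This is a standalone sketch illustrating the definition, not Distiller code:

```python
import torch

def sparsity(t: torch.Tensor) -> float:
    """Fraction of elements that are exact zeros (0.0 = fully dense, 1.0 = all zeros)."""
    return (t == 0).sum().item() / t.numel()

w = torch.tensor([[0.0, 1.5, 0.0],
                  [0.0, -2.0, 0.0]])
print(sparsity(w))  # 4 of 6 elements are exact zeros -> 0.666...
```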

To help you concentrate on your research, the team has tried to provide the generic functionality, both high- and low-level, that most people will need for compression research. Some examples:

  • Certain compression methods dynamically remove filters and channels from convolutional layers while a DNN is being trained. Distiller performs the required changes in the configuration of the targeted layers and in their parameter tensors. In addition, it analyzes the data dependencies in the model and modifies dependent layers as needed (see the thinning sketch after this list)
  • Distiller will automatically transform a model for quantization by replacing specific layer types with their quantized counterparts. This saves you the hassle of manually converting each floating-point model you are using to its lower-precision form, and allows you to focus on developing the quantization method and on scaling and testing your algorithm across many models (see the module-replacement sketch below)
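To make these two services concrete, here are two minimal PyTorch sketches of the patterns Distiller automates. The names used here (remove_filters, QuantLinear, replace_linear) are hypothetical illustrations, not Distiller’s actual API.

First, filter removal (“thinning”): pruning output filters from one convolutional layer forces a matching change along the input-channel dimension of the layer that consumes its output. The sketch assumes the two layers are directly connected, with nothing in between:

```python
import torch.nn as nn

def remove_filters(conv: nn.Conv2d, next_conv: nn.Conv2d, keep: list):
    """Keep only the output filters listed in `keep`, shrinking both layers."""
    # Rebuild the pruned layer with fewer output channels.
    new_conv = nn.Conv2d(conv.in_channels, len(keep),
                         kernel_size=conv.kernel_size, stride=conv.stride,
                         padding=conv.padding, bias=conv.bias is not None)
    new_conv.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        new_conv.bias.data = conv.bias.data[keep].clone()

    # The following layer consumed the removed channels as inputs, so its
    # weights must shrink along the input-channel dimension as well.
    new_next = nn.Conv2d(len(keep), next_conv.out_channels,
                         kernel_size=next_conv.kernel_size, stride=next_conv.stride,
                         padding=next_conv.padding, bias=next_conv.bias is not None)
    new_next.weight.data = next_conv.weight.data[:, keep].clone()
    if next_conv.bias is not None:
        new_next.bias.data = next_conv.bias.data.clone()
    return new_conv, new_next
```

Second, automatic transformation for quantization as recursive module replacement, with a naive weight fake-quantizer standing in for a real quantization method:

```python
import torch.nn as nn
import torch.nn.functional as F

class QuantLinear(nn.Linear):
    """Drop-in replacement that fake-quantizes weights to a num_bits linear grid."""
    def __init__(self, src: nn.Linear, num_bits: int = 8):
        super().__init__(src.in_features, src.out_features, bias=src.bias is not None)
        self.load_state_dict(src.state_dict())  # copy the float weights
        self.num_bits = num_bits

    def forward(self, x):
        qmax = 2 ** (self.num_bits - 1) - 1
        scale = self.weight.abs().max() / qmax
        w_q = (self.weight / scale).round().clamp(-qmax - 1, qmax) * scale
        return F.linear(x, w_q, self.bias)

def replace_linear(module: nn.Module, num_bits: int = 8):
    """Recursively swap every nn.Linear in the model for its quantized counterpart."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            setattr(module, name, QuantLinear(child, num_bits))
        else:
            replace_linear(child, num_bits)
```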


Distiller statistics are exported as Pandas DataFrames, which are amenable to data selection (indexing, slicing, etc.) and visualization. Distiller comes with sample applications that employ some of these methods to compress image-classification DNNs and language models. The team has implemented a few compression research papers that can be used as templates for starting your own work. These are based on a couple of PyTorch’s example projects and show the simplicity of adding compression to pre-existing training applications.
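To illustrate what DataFrame-based statistics enable, the sketch below builds a per-layer sparsity table for an arbitrary PyTorch model. The helper name and columns are illustrative assumptions, not Distiller’s actual export format:

```python
import pandas as pd
import torch.nn as nn

def sparsity_summary(model: nn.Module) -> pd.DataFrame:
    """One row per parameter tensor: name, shape, element count, sparsity."""
    rows = [{"name": name,
             "shape": tuple(p.shape),
             "numel": p.numel(),
             "sparsity": (p == 0).sum().item() / p.numel()}
            for name, p in model.named_parameters()]
    return pd.DataFrame(rows)

model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))
df = sparsity_summary(model)
print(df.sort_values("sparsity", ascending=False))  # ordinary DataFrame indexing/slicing
```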

Distiller is a research library for the community at large and is part of Intel AI Lab’s effort to help scientists and engineers train and deploy DL solutions, publish research, and reproduce the latest innovative algorithms from the AI community. The team is currently working on adding more algorithms, more features, and more application domains.

If you are actively researching or implementing DNN compression, we hope that you will find Distiller useful and fork it to implement your own research. For more information about Distiller, you can refer to the documentation and code at https://github.com/NervanaSystems/distiller.


