
@wassafr
Created August 28, 2017 10:19

Why choose MXNet for deep learning

This article is not a full comparison of deep learning frameworks, but rather a list of the reasons that made us choose MXNet over other frameworks. A full comparison between deep learning frameworks would not be accurate for several reasons. First, we haven't tested all existing frameworks. Second, they evolve really fast, so the reasons we chose MXNet one year ago may not be decisive now. Frameworks compete with one another, and the good ideas from one framework are most of the time copied or adapted by the others. If you want to add information or correct some of it, feel free to add comments.

MXNet

On the official website, we can find a really short description of MXNet's characteristics.

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

This overview gives some of MXNet's main technical properties, but it is far from comprehensive, and it may not be enough for you either. These properties cover aspects from beginner to advanced usage; if you already have experience in deep learning, some of them may not be important to you.

Installation

One of the first obstacles you will have to overcome before using a deep learning framework is installation. Some frameworks are easier to install than others, and some have sparse or missing documentation. A year ago, the installation process was awful, but since then most frameworks have simplified it. For example, frameworks with a Python API now support installation via pip, the Python package manager. Here is a short list of frameworks that have installation documentation.

  • [MXNet][3]: pip package or compilation from source with make.
  • [Tensorflow][4]: pip package or compilation from source with Bazel. Bazel works fine, but it is not a user-friendly solution and is not very well documented. In our experiments we tried to write C++ code using both TensorFlow and OpenCV: importing OpenCV into the Bazel build, or importing TensorFlow into a CMake build, was horrible and a waste of time. There is now (unofficial, not recommended) CMake support, which may make integrating TensorFlow with other libraries easier.
  • [CNTK][5]: pip package or compilation from source (we did not test building from source).
  • [Caffe][9]: the build-from-source instructions were really unclear about what to do and about dependencies; we got lots of errors before reaching a working build. The installation page has since been rewritten and now seems cleaner.
  • [Chainer][6]: pip package or compilation from source with the Python setup script.
  • [Keras][7]: pip package. As Keras is only a front end, you also have to install a backend: TensorFlow, CNTK, Theano or MXNet.

Tutorial

Tutorials are one of the starting points to test and learn a new deep learning framework. Currently, the most common tutorial found in every deep learning framework is handwritten digit recognition on the MNIST database. This is the minimum required tutorial: it gives a first glance at the notation and API usage, and since every framework has it, it can be used to compare them. Other frameworks have more complete tutorials. Since the release of its new imperative API, named Gluon, MXNet has worked on [tutorials][21] that explain most deep learning concepts and go up to advanced usage. Moreover, with the Python API it is possible to write [Python notebooks][13]. That's a really powerful tool to explain and illustrate code.

Examples

Advanced users usually don't need light tutorials; most of the time they look for complete, usable examples. Examples are useful to understand complex framework functionality in real applications. Thanks to the community, it is possible to find implementations of new and breakthrough architectures like ResNet, Faster R-CNN, SSD or LSTM. The most popular frameworks have their own implementations, and others can import networks from Caffe or TensorFlow. I work in computer vision, and the possibility of having working implementations of Faster R-CNN, SSD, MTCNN or even MobileNet is wonderful.

There is a downside: while it is easy to find lots of implementations, some of them are no longer maintained by their authors and can be broken by the latest framework release. To avoid this, MXNet, like some other frameworks, maintains some of the most popular code, such as SSD or Faster R-CNN, in its main repository. TensorFlow uses a separate repository ([Models][18]) that was not well referenced by Google :) Keeping this code up to date and compatible with each new framework version is effectively a lot of work.

Zoo

If you want to train a network on new data, it is common practice to start from a pre-trained network. This reduces training time, globally reduces overfitting and increases performance. These networks are trained on very large databases like ImageNet, and the training time and computing power needed to obtain them are not accessible to everyone. To give their users access to such pre-trained networks, frameworks either train these models themselves or develop converters from other frameworks. These repositories of pre-trained models are called zoos.

Easy to use and understand

This criterion is really personal, but an easy-to-use and easy-to-understand API reduces the learning curve and helps your team, or a new member, work efficiently with it. There are low-level APIs and high-level ones: TensorFlow is relatively low-level, Keras high-level, and MXNet a mix of the two. MXNet currently has several APIs: the Symbol API for define-and-run networks, and the NDArray (low-level) and Gluon (high-level) APIs for define-by-run networks. One advantage of having this range of APIs in a single framework is that when we have a question, we don't have to search or ask multiple times on different platforms.

The TensorFlow API is not easy to use, which is why the community created lots of higher-level ones like [TF Learn][15], [TF Slim][16], [Sonnet][17], [Keras][7], ... These higher-level APIs are easier to use but are not always as well documented and maintained as core TensorFlow. I hope this will change with the official integration of Keras into TensorFlow.

Keras is a really good API. It is an API on top of a backend, and it can be used with TensorFlow, Theano, CNTK and even MXNet. Keras allows really fast prototyping: it is easy to create complex neural architectures. However, since the backend is separate, low-level operations, such as multi-GPU management, can be tricky. In those cases, we need to use the backend (TensorFlow/Theano/CNTK/MXNet) directly.
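
As a taste of that prototyping speed, the same small MNIST-style classifier can be sketched in Keras 2-style code; the code is identical whichever backend is configured (this assumes the standalone `keras` package is installed):

```python
from keras.models import Sequential
from keras.layers import Dense

# A small MNIST-style classifier; backend-agnostic by design
model = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax'),
])
model.compile(optimizer='sgd', loss='categorical_crossentropy')
print(model.output_shape)  # (None, 10): batch dimension is left open
```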

Activity, Evolution and Community reactivity

Given the pace of deep learning research, new architectures are released every month and a new breakthrough network roughly every six months, so deep learning frameworks need to be really active. A recent example is the release of [MobileNet][19], developed by Google. This network uses a non-standard layer called a "depthwise separable convolution", so to run it with full training performance, we either have to implement this layer ourselves or wait for the maintainers to implement it. Unfortunately, these state-of-the-art algorithms cause users to flow from one framework to another.

Multi GPU/Multi Computer

MXNet is known for its GPU management, which is easy and performant in comparison to TensorFlow or Keras. Since Keras can use MXNet as a backend, Keras in this particular configuration can easily scale to multiple GPUs ([Keras with MXNet backend][20]).

Embedded devices

If you work on smart devices, embedding a network can be a problem: not every framework can be used on mobile or low-performance platforms. MXNet solves this with an amalgamation process: a script concatenates MXNet's functionality into a small C++ API. This interface supports prediction only, but has almost no dependencies (only OpenBLAS). It is wrapped in many languages (Java, JavaScript, Scala, ...) and can be embedded on many platforms (iOS, Android, browsers).

Speed

There are many benchmarks available, but unfortunately, as frameworks are really active, benchmarks are not always up to date. See this actively maintained deep learning [benchmark][14].

Do you want to know more about Wassa?

Wassa is an innovative digital agency expert in indoor location and computer vision. Whether you are looking to help your customers find their way in a building, enhance the user experience of your products, collect data about your customers, or analyze human traffic and behavior in a location, our Innovation Lab brings scientific expertise to design the solution best suited to your goals.
