Deep Learning in Australia
This document has moved to its own page: antipode.ai and is no longer being updated on GitHub Gists.
Written in December 2018.
The practical development of deep learning and its associated infrastructure has initiated a broad re-examination of the practice of computer programming. In this document we briefly survey how this discussion has evolved over the past few years, and then describe our point of view on the underlying mathematics.
We begin with some appeals to authority, in the form of the following references:
The Melbourne Deep Learning Group (MDLG) is in the first place a research group, but given the broader importance of these technologies and the urgency of Australia adopting them, we also take on a responsibility for helping to educate students at the University of Melbourne, and the broader Australian community. There are many free or low-cost introductory courses on deep learning, e.g. deeplearning.ai and fast.ai, and there is no point in us reproducing some slight variation on this content. However, while there is plenty of content available, that doesn't mean it is trivial to learn in a vacuum (that's what classes are for!).
We therefore focus our attention on facilitating a thriving local community and running short events that help to motivate members of this community to deepen their understanding of these technologies and their applications, and to meet collaborators (e.g. we
It is still unclear what the long-term impacts of this technology will be. Large changes in productivity have occurred throughout history, and the potential of deep learning is comparable to the other general purpose technologies (steam, electricity, chemical manufacturing, etc.) responsible for those changes. While there are many real-world applications of today's deep learning in computer vision, natural language, and perhaps soon in robotics, these impacts would have to increase by several orders of magnitude to be reasonably compared with the general purpose technologies which drove previous industrial revolutions. However, as anybody familiar with the history of the industrial revolutions knows, once it is obvious to everybody that things are working you may not have time to catch up.
It is therefore worth noting that rich governments (US, China) and corporations (Google, Facebook, Amazon, Microsoft, Baidu, Alibaba) are investing heavily in these technologies.
In early 2019 I decided to try to understand the University of Melbourne a little better. I have recorded some observations here in case they are useful for other academics. For updates in early 2020 see further down the page. The notes are taken from various University of Melbourne (UoM) official documents, primarily
To a first approximation, if you want to understand the University I think you should read the report, ignore the glossy bits, and pay close attention to the statistics on p.13 and the financial data reported beginning on p.124. All references in this section are to the report, unless specified otherwise.
According to the history of logic in the Encyclopaedia Britannica, logic emerged from the study of philosophical arguments, and the realisation that there were general patterns by which one could distinguish valid and invalid forms of argumentation. The systematic study of logic was begun by Aristotle, who established a system of formal rules and strategy for reasoning. The use of the word strategy is intentional:
The practice of such techniques in Aristotle’s day was actually competitive, and Aristotle was especially interested in strategies that could be used to “win” such “games.” Naturally, the ability to predict the “answer” that a certain line of questioning would yield represented an important advantage in such competitions. Aristotle noticed that in some cases the answer is completely predictable—viz., when it is (in modern terminology) a logical consequence of earlier answers. Thus, he was led from the study of interrogative techniques to
The optimisation algorithm used in most of DeepMind's deep RL papers is RMSProp (e.g. in the Mnih et al Atari paper, in the IMPALA paper, in the RL experiments of the PBT paper, in the Zambaldi et al paper). I have seen speculation online that this is because RMSProp may be well-suited to deep learning on non-stationary distributions. In this note I try to examine the RMSProp algorithm and specifically the significance of the
epsilon hyperparameter. The references are
Often in the literature RMSProp is presented as a variation of AdaGrad (e.g. in the deep learning textbook and in Karpathy's class). However, I think this is misleading, and that the explanation in Hinton's lecture is (not surprisingly) better.
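To make the contrast concrete, here is a minimal NumPy sketch of a single update step for each algorithm. The hyperparameter values are illustrative defaults only, not those used in the DeepMind papers, and note that implementations differ on whether epsilon is added inside or outside the square root:

```python
import numpy as np

def rmsprop_update(w, grad, v, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSProp step.

    v is an exponential moving average of squared gradients, so old
    gradient information is forgotten at rate `decay` -- this is what
    lets the effective step size adapt on non-stationary problems.
    Some implementations instead use np.sqrt(v + eps).
    """
    v = decay * v + (1 - decay) * grad ** 2
    w = w - lr * grad / (np.sqrt(v) + eps)
    return w, v

def adagrad_update(w, grad, g2, lr=0.01, eps=1e-8):
    """One AdaGrad step.

    g2 accumulates *all* past squared gradients without decay, so the
    effective step size is monotonically shrinking -- the key difference
    from RMSProp.
    """
    g2 = g2 + grad ** 2
    w = w - lr * grad / (np.sqrt(g2) + eps)
    return w, g2

# Toy usage: minimise f(w) = w^2, whose gradient is 2w.
w, v = 1.0, 0.0
for _ in range(500):
    w, v = rmsprop_update(w, 2 * w, v)
```

The point of writing both side by side is that RMSProp is not simply "AdaGrad with a tweak": replacing the running sum by a decaying average changes the qualitative behaviour of the step sizes, which is consistent with the speculation above about non-stationary distributions.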
I am making publicly available my hand-written working notes for the paper "Constructing A-infinity categories of matrix factorisations", in the same spirit that I made available the other notes on my webpage, The Rising Sea. Obviously you should not expect these notes to be as coherent, or readable, as the final paper, but those marked on the first page as (checked) are indeed checked, to the same level of rigour that I apply to any of my published papers. And they often contain more details than the paper. I hope you find them useful!
The main references, written in the same notation and from the same outlook as the final paper, are given below. You should probably start with (ainfmf28). Some of these PDF files are large, you have been warned.