Weekend Reading Round-Up

by Ian Hellström (13 October 2017)

In this month’s reading round-up I look at businesses in Africa, category theory and Scala, fancy copy-pasting of code, neuromorphic microchips, machine learning, philosophy, supercomputers, topology, and of course data.

An Introduction to Topological Data Analysis: Fundamental and Practical Aspects for Data Scientists

An arXiv survey discusses topological data analysis (TDA), a nascent field of data science in which topological and geometric tools are used to infer structural features of data. The basic assumption is that a data set consists of points generated from an unknown distribution, together with some notion of similarity among these points. By connecting points based on that similarity, we can build simplicial complexes, which generalize graphs to higher dimensions, and use them to identify topological features of the data itself. Near the end of the article, the authors briefly show how to use GUDHI, a Python TDA library, in case you don’t care too much about the technical details.
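To get a feel for the construction (a toy sketch of my own, not GUDHI; the function name rips_complex is made up), here is a miniature Vietoris–Rips-style complex in Python: a simplex is included whenever all of its vertices lie pairwise within a distance threshold.

```python
from itertools import combinations
from math import dist  # Python 3.8+

def rips_complex(points, epsilon, max_dim=2):
    """Build a toy Vietoris-Rips complex: a simplex is included
    whenever all of its vertices lie pairwise within epsilon."""
    n = len(points)
    close = lambda i, j: dist(points[i], points[j]) <= epsilon
    simplices = [(i,) for i in range(n)]   # 0-simplices: the points themselves
    for k in range(2, max_dim + 2):        # edges, triangles, ...
        for combo in combinations(range(n), k):
            if all(close(i, j) for i, j in combinations(combo, 2)):
                simplices.append(combo)
    return simplices

# Three points: two close together, one far away.
pts = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
print(rips_complex(pts, epsilon=1.5))
# The two nearby points form an edge; no triangle appears.
```

GUDHI does essentially this (far more efficiently) and then computes persistent homology on top, i.e. it tracks which features survive as the threshold grows.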

Gödel’s Incompleteness Theorem And Its Implications For Artificial Intelligence

On Deep Ideas there is a nice overview of various philosophical perspectives on the implications of Gödel’s incompleteness theorem for artificial intelligence or even the recreation of human intelligence by a machine. Lucas, Rogers, Norvig, Russell, Benacerraf, Hofstadter, Penrose, McCullough, and Chalmers all come on stage and pick up the mic.

Learning Graphical Models from a Distributed Stream

Researchers propose approximate algorithms for learning graphical models, in particular Bayesian networks, from distributed streams. The challenge is to update the model as new information arrives with minimal overhead and communication among the nodes. The authors phrase their solution in terms of a continuous distributed stream monitoring model, one in which there is a coordinator (master) node and several worker (slave) nodes that each receive a chunk of the data stream, and they use maximum likelihood estimation to obtain the conditional probability distributions (CPDs). A distributed counter is used to, well, count events without triggering a message for each event.
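As a much-simplified illustration of that counting trick (my own toy code, not the paper's algorithm), the sketch below has each worker report to the coordinator only when its local count has doubled, so the coordinator's estimate stays within a factor of two of each worker's true count while using exponentially fewer messages than events.

```python
class Coordinator:
    def __init__(self):
        self.reported = {}                    # last value reported per worker

    def receive(self, worker_id, count):
        self.reported[worker_id] = count      # one message covers many events

    def estimate(self):
        return sum(self.reported.values())

class Worker:
    def __init__(self, worker_id, coordinator):
        self.id, self.coord = worker_id, coordinator
        self.count, self.threshold = 0, 1

    def observe(self):
        self.count += 1
        if self.count >= self.threshold:      # report only on doubling,
            self.coord.receive(self.id, self.count)
            self.threshold = 2 * self.count   # not on every single event

coord = Coordinator()
workers = [Worker(i, coord) for i in range(3)]
for w in workers:
    for _ in range(100):
        w.observe()
print(coord.estimate())  # within a factor of two of the true count, 300
```

Each worker sends only seven messages for its hundred events, which is the whole point: communication grows with the logarithm of the count, not with the count itself.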

Supercomputer Redesign of Aeroplane Wing Mirrors Bird Anatomy

Nature reports that a supercomputer has now achieved what nature has done over the course of billions of years: design an aerofoil that’s both light and stiff. Unfortunately, no one is capable of manufacturing such a wing at the moment because it would require a truly massive 3D printer.

Teaching Personal Initiative Beats Traditional Training in Boosting Small Business in West Africa

A paper published in Science describes how basic psychological training intended to boost personal initiative provides a better foundation for running a small business than traditional business education. Instead of focusing on a small sample of large companies, the authors cast their net wider and studied 1,500 people in Togo. The sample was divided equally into a control group, a group that received business training, and a group that was taught about personal initiative. The researchers followed the owners’ micro-enterprises for two years and reported a 30% increase in profits for the people from the third group. It seems that entrepreneurship can be taught, but business schools may not provide the best return on investment. At least in West Africa.

Basic Category Theory for Scala Programmers

In the first part of a series of articles on category theory, the author looks at function (i.e. morphism) composition in Scala. If you’re interested in category theory and functional programming but you find that most write-ups on the topic are too mathematical or too abstract, this may turn out to be worth your time.
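The Python analogue of the composition discussed there (in Scala, Function1 offers compose and andThen for this) might look as follows; the compose helper is my own:

```python
def compose(g, f):
    """Return g . f, i.e. the function x -> g(f(x))."""
    return lambda x: g(f(x))

double = lambda x: 2 * x
inc = lambda x: x + 1

h = compose(double, inc)   # apply inc first, then double
print(h(3))                # double(inc(3)) = double(4) = 8

# Composition is associative, one of the category laws:
# (double . inc) . inc  ==  double . (inc . inc)
assert compose(compose(double, inc), inc)(3) == \
       compose(double, compose(inc, inc))(3)
```

That associativity check is exactly the sort of law the category-theoretic view makes explicit, which is where the series is headed.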

Estimating the Intrinsic Dimension of Datasets by a Minimal Neighbourhood Information

A paper published in Nature looks at the intrinsic dimension of data sets, i.e. the minimum number of dimensions needed to describe a data set accurately. The proposed estimator (TWO-NN) takes into account only the distances to the first two nearest neighbours of each point. The only assumption needed is that the data set is locally uniform in density, that is, up to the scale of the second neighbour.
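A minimal sketch of the estimator as I read it from the paper, where d = N / sum_i log(r2_i / r1_i) is the maximum-likelihood estimate under the local-uniformity assumption; the helper names and the brute-force neighbour search are mine, purely for illustration.

```python
import math, random

def two_nn_dimension(points):
    """Estimate intrinsic dimension from the ratio mu = r2/r1 of each
    point's distances to its two nearest neighbours: d = N / sum(log mu)."""
    log_mu_sum = 0.0
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q)       # math.dist: Python 3.8+
                       for j, q in enumerate(points) if j != i)
        r1, r2 = dists[0], dists[1]
        log_mu_sum += math.log(r2 / r1)
    return len(points) / log_mu_sum

random.seed(0)
# Points drawn uniformly from a 2-D square embedded in 3-D space:
# the estimate should land near the intrinsic dimension 2, not 3.
pts = [(random.random(), random.random(), 0.0) for _ in range(1000)]
print(round(two_nn_dimension(pts), 1))
```

The appeal is clear from the code: no density estimation, no parameter tuning, just two neighbours per point.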

A Brain Built from Atomic Switches Can Learn

For all the research (and hype) about neural networks and especially deep learning, the brain still beats any computer in terms of efficiency and complexity. That is about to change – although admittedly very, very slowly – according to an article in Quanta Magazine, in which research is presented on a new device that can learn without software but purely based on hardware. The 2x2 mm nanowire mesh boasts around 40 million artificial synapses.

It must be the time of neuromorphic engineering because Intel announced the Loihi chip with 130 million artificial synapses. Research on photonic microchips made the headlines too.

Coding the History of Deep Learning

If you’re interested in code and the history of machine learning, you may find the following blog post of interest. It looks at the origins (linear regression, perceptrons, neural networks, and deep neural nets) and shows code samples in Python for each.
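Not the post's own code, but in the same spirit: a bare-bones perceptron (one of the origins the post covers) learning the AND function with the classic error-correction rule.

```python
def train_perceptron(samples, epochs=10, lr=0.1):
    """Classic perceptron rule: nudge the weights whenever the
    thresholded prediction disagrees with the label."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = target - pred
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
print([predict(x1, x2) for (x1, x2), _ in AND])  # [0, 0, 0, 1]
```

The famous limitation follows immediately: swap AND for XOR and no choice of w and b will ever separate the classes, which is what pushed the field towards multi-layer networks.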

People who have enjoyed this article might also like Deep Learning from Scratch: Theory and Implementation.

Deep Reinforcement Learning that Matters

In an arXiv preprint the authors discuss the problem of reproducibility in deep reinforcement learning research. Perhaps unsurprisingly, hyperparameters are key to replicating results, in addition to implementation details and the experimental setup. Many researchers implement their own versions of baseline algorithms, which need not be (and rarely are) identical in terms of model performance. Moreover, intrinsic sources of non-determinism (e.g. random seeds and environment properties) also degrade reproducibility. This suggests that significance testing ought to be done properly.
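On the random-seed point, the cheapest habit is to fix and report every seed you use. A minimal sketch with only Python's standard library (a real experiment would also need to seed its ML framework of choice):

```python
import random

SEED = 42  # report this alongside your results

def run_experiment(seed):
    """A stand-in for a training run whose outcome depends on randomness."""
    rng = random.Random(seed)           # local RNG: no hidden global state
    return sum(rng.gauss(0, 1) for _ in range(1000))

# Same seed, same result: the run is exactly reproducible.
assert run_experiment(SEED) == run_experiment(SEED)

# Different seeds give different results, which is one of the intrinsic
# sources of variance the authors warn about: report results over
# several seeds, not a single lucky one.
print(run_experiment(SEED), run_experiment(SEED + 1))
```

Seeding alone does not guarantee reproducibility across machines or library versions, which is precisely why the paper argues for proper significance testing over multiple runs.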

Learning to Optimize with Reinforcement Learning

A nice blog post describes some of the ideas behind two papers on optimization algorithms, one from June 2016 entitled ‘Learning to Optimize’ and another from March 2017 with the title ‘Learning to Optimize Neural Nets’. The authors propose to replace hand-engineered update rules in optimization algorithms with neural nets, so that the optimizers can learn how to optimize. The idea is simple: if you have seen a particular pattern in an objective function before, there is no need to go through the numerical computations just to come to the same conclusion; you might as well jump ahead to the solution.
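To make the interface concrete (a hand-written stand-in of my own, not the papers' method): the update below is produced by a function of the gradient history rather than by a fixed formula. In the papers that function is a neural net trained with reinforcement learning; here it is just a momentum-like rule, which is enough to show the shape of the idea.

```python
def learned_step(x, grad, memory, lr=0.2, decay=0.5):
    """Stand-in for a learned optimizer: the update depends on the whole
    gradient history, not only on the current gradient."""
    memory.append(grad)
    # Exponentially weighted sum of past gradients (newest weighted 1).
    velocity = sum(decay ** (len(memory) - i) * g
                   for i, g in enumerate(memory, 1))
    return x - lr * velocity

# Minimise f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x, memory = 0.0, []
for _ in range(50):
    x = learned_step(x, 2 * (x - 3), memory)
print(round(x, 2))  # converges near the minimum at 3
```

A trained optimizer would replace that weighted sum with whatever mapping from gradient history to update it has found to work well on the family of objectives it was trained on.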


CodeCarbonCopy

Wouldn’t it be great if you could just reuse code without having to think about all the details, such as renaming variables? A paper presented at an ACM conference does just that with a system called CodeCarbonCopy. The method was used to ‘transplant’ code between six different open-source image-processing applications. Seven out of eight transplants were successful, meaning that the recipient program executed the donor’s instructions properly. It even removes irrelevant code. CodeCarbonCopy does not yet support data structures such as hash tables, lists, or trees, though.

TFX: A TensorFlow-Based Production-Scale Machine Learning Platform

Data scientists love working with notebooks. But notebooks are terrible when it comes to deploying and running machine learning models in production. At KDD 2017 Google presented TFX, its internal, fully integrated, scalable machine learning platform. It includes everything from data ingestion, analysis, transformation, and validation, through training (with a parameter tuner), model evaluation and validation, and a serving layer (i.e. the ‘scoring engine’), to a job orchestration framework and, of course, the always important logging component. Among many cool and useful things, the ability to serve models based on criteria such as ‘safe to serve’ and ‘prediction quality’ really makes TFX stand out.
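Schematically (a toy sketch of my own, nothing to do with TFX's actual API), such a pipeline is a chain of components, each feeding the next, with validation gates along the way:

```python
def ingest():
    return [(0.0, 0), (1.0, 1), (2.0, 1), (-1.0, 0)]   # (feature, label) pairs

def validate(data):
    assert all(isinstance(y, int) for _, y in data), 'bad labels'
    return data

def transform(data):
    xs = [x for x, _ in data]
    lo, hi = min(xs), max(xs)
    return [((x - lo) / (hi - lo), y) for x, y in data]  # scale to [0, 1]

def train(data):
    # 'Model': predict 1 whenever the feature exceeds a threshold chosen
    # as the midpoint between the two class means.
    mean = lambda vals: sum(vals) / len(vals)
    pos = mean([x for x, y in data if y == 1])
    neg = mean([x for x, y in data if y == 0])
    threshold = (pos + neg) / 2
    return lambda x: 1 if x > threshold else 0

def evaluate(model, data):
    accuracy = sum(model(x) == y for x, y in data) / len(data)
    return model, accuracy

data = transform(validate(ingest()))
model, accuracy = evaluate(train(data), data)
print(accuracy)  # only a model deemed 'safe to serve' would be deployed
```

The value of a platform like TFX is that each of these stages, plus serving, orchestration, and logging, comes as a hardened, reusable component instead of ad hoc notebook cells.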