Thinking About Languages for Deep Learning

Right now Python feels like the language of deep learning, and of data science in general. That’s not really surprising. With the right libraries (probably NumPy, pandas and Matplotlib) it can come close to matching MATLAB for power and expressiveness in numerical and data computing. It also makes it easy to wrap the fast C and C++ code used under the hood by libraries like TensorFlow and PyTorch, taking it beyond MATLAB’s capabilities when it comes to deep learning.

The usefulness of Python is difficult to overstate, the size of its ecosystem is monumental, and the learning resources for it are phenomenal. Whatever your question is, you’re almost certain to find the answer on Stack Overflow. It also has quite possibly the most complete and useful standard library I’ve ever encountered. It’s amazing what you can get done without ever even having to type import.

Having buttered it up, it probably feels like I’m about to say something negative about it, so before I go any further: I do really like Python. I use it for the majority of my day job and it’s an incredible language for getting things done. If you were to ask me what one programming language you should learn, I would probably say Python. My number one piece of advice for people interviewing at tech companies is “Use Python in your interview. If you need to learn it, do.”


That said, Python is not perfect. A lot of its popularity (in my opinion) comes from being “good enough” for many, if not most, problems. But it’s rarely the best solution for any of them. It’s fast to write, but slow to run. Which is why you need that high-performance C code under the hood. Which comes with a big downside: your code is no longer “pure” Python, which makes it harder to debug and reason about. You write in Python, but what you’ve written is just the glue. The wood is the C hidden underneath.
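
To make the “slow to run” point concrete, here is a rough sketch that only uses the standard library. It assumes CPython, where the built-in sum does its looping in C. The python_sum function and the input size are my own choices for illustration, and the timings will vary from machine to machine.

```python
import timeit


def python_sum(values):
    """Add the numbers up with an interpreted, pure-Python loop."""
    total = 0
    for v in values:
        total += v
    return total


data = list(range(1_000_000))

# Both approaches compute the same result...
assert python_sum(data) == sum(data)

# ...but on CPython the built-in runs its loop in C rather than in the interpreter.
loop_time = timeit.timeit(lambda: python_sum(data), number=5)
builtin_time = timeit.timeit(lambda: sum(data), number=5)
print(f"pure Python loop: {loop_time:.3f}s, built-in sum: {builtin_time:.3f}s")
```

The two calls look almost identical from the Python side, which is exactly the glue-versus-wood situation: the fast version is fast because the real work happens somewhere you can’t easily step into.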

Python’s approach to typing (or its lack thereof) and its dynamism can also make refactoring painful. It can be hard to know exactly what relies on the code you’re trying to change. Plus, it’s not just slow to run on your computer. It’s slow to run in your brain. Come back to it a month later and unless you had really good coding hygiene or wrote a lot of documentation (and kept it up to date!) it can be hard to figure out what the hell it’s supposed to do.
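
Here is a small, made-up example of the refactoring problem. Batch and count_labels are hypothetical names, not taken from any library. Imagine an attribute called labels was renamed to targets and one caller was missed: with nothing but the interpreter, the mistake only surfaces when the stale line actually executes.

```python
from dataclasses import dataclass


@dataclass
class Batch:
    inputs: list
    targets: list  # hypothetically renamed from `labels` during a refactor


def count_labels(batch):
    # Stale reference to the old attribute name. Nothing complains
    # about it until this line actually runs.
    return len(batch.labels)


# Defining the class and function, and building a batch, all succeed.
batch = Batch(inputs=[0.1, 0.2], targets=[0, 1])

try:
    count_labels(batch)
except AttributeError as err:
    print(f"Only found at runtime: {err}")
```

Optional type hints plus an external checker can catch some of this, but that’s tooling layered on top rather than something the language itself insists on.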

So: What are the alternatives? Two appear to stand out: Swift and Julia.

But first: If Python is “good enough”, why even consider alternatives? Just because Python is the deep learning lingua franca now doesn’t mean that will always be the case. It’s always good to look ahead at what might be coming next, especially in such a fast-moving field. There are also things Python can’t really do. Its lack of a real type system or a compiler means that it can’t make the sort of guarantees which are needed for language-level automatic differentiation, for example.
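
For contrast, here is roughly what automatic differentiation looks like when it’s done at the library level in Python. This is a toy sketch of forward-mode differentiation using dual numbers and operator overloading. It is not how any particular framework implements it, and Dual, _lift and derivative are names I made up. The point is that the derivative is tracked at runtime by objects riding along with the values, rather than worked out ahead of time by a compiler that understands the function.

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """A value paired with its derivative with respect to the input."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def derivative(f, x):
    """Evaluate f at x with the derivative seeded to 1, then read it back."""
    return f(Dual(float(x), 1.0)).deriv


def f(x):
    return 3 * x * x + 2 * x + 1


print(derivative(f, 2.0))  # f'(x) = 6x + 2, so this prints 14.0
```

It works, but only for operations the Dual class has been taught about, and any mistake shows up when the code runs rather than before.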

Jeremy Howard seems to agree with me to some extent. So much so that the last two lessons of part 2 of fast.ai’s 2019 course use Swift. This despite the fact that Howard freely admits that Swift deep learning is "not ready yet".

I used Swift a little when my job was iOS development and really liked it. In particular, I like its type system and protocol-oriented approach. It has a tight binding to the LLVM compiler system, which made it a candidate to become a first-class TensorFlow language in the Swift for TensorFlow project. That project is being led by Chris Lattner, who created both LLVM and Swift itself, so the depth of knowledge there is huge. The path to trying it out seems pretty straightforward: the last two lessons of the 2019 course use it (as noted above), so it’s already part of my deep learning road map.

Julia I know less about, but what I do know I find intriguing. It was essentially the runner-up language for the project which became Swift for TensorFlow, and was designed from the ground up for numerical and data computing. My plan to try it is pretty simple: after I’m done with part 2 of the 2019 course I’ll recreate some element of the course using one of the Julia deep learning frameworks (probably Flux). I’m inclined to try something related to tabular data, as that will also give me the chance to play with JuliaDB, which looks really interesting.