Anaconda is Slow on Tiger Lake – How to Get MKL-accelerated Python working on Latest Intel CPUs

While doing some benchmarking of my AutoTS package I quickly discovered a troubling trend: my newest, fanciest, most expensive Intel i7 CPUs, an i7-10700 and an i7-1165G7, were performing vastly slower than their older counterparts. I wasn't terribly surprised to see the 1165G7 having issues: Tiger Lake is a brand-new CPU architecture, and AVX-512 in an ultrabook!? I was expecting things not to be fully optimized for it yet. But with the 10700 being almost a year old, I definitely was surprised to see it facing issues on the latest Anaconda Python.

Discussion:

Over the years, I have come to rely on Anaconda to provide the fastest and most convenient downloads of data science packages. The biggest reason is that Anaconda ships packages compiled in an optimized way. In particular, numpy on conda is built against MKL, Intel's optimized BLAS/LAPACK library.

For those of you who aren't aware, a BLAS/LAPACK library is a low-level software package that handles linear algebra (and, in MKL's case, Fourier transforms and a few other things) in a highly optimized (read: really fast) way. It doesn't matter whether you are using MATLAB, R, Python, or some other math-doing software; they likely all call one of these libraries to get the computations done behind the scenes.

Intel MKL is the biggest and fastest of them, and, unsurprisingly, it is proprietary. There used to be a hack to get it running at full speed on AMD CPUs, but from what I have seen that is now gone; it only performs well on Intel CPUs. The primary competitor to MKL these days is the open-source OpenBLAS, which in my limited experience works quite well: noticeably slower than MKL, but still much faster than having no optimized BLAS at all.

How much speed are we talking here? On my Intel i7-1165G7, numpy with no optimized BLAS has a base performance of 1x, while OpenBLAS offers roughly a 4x speedup and Intel MKL roughly a 6x speedup. That can be the difference between something running half a day and something running most of the week…
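
If you want a rough comparison of your own environments, timing a large matrix multiplication in each one is enough to see the gap. This is only a minimal sketch; the matrix size and repeat count below are arbitrary choices of mine, not the settings from my AutoTS benchmarks:

# rough_blas_benchmark.py -- run in each environment you want to compare
import time
import numpy as np

def time_matmul(n=2000, repeats=5):
    # time an n x n matrix multiplication, keeping the best of several runs
    rng = np.random.default_rng(42)
    a = rng.standard_normal((n, n))
    b = rng.standard_normal((n, n))
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        _ = a @ b
        best = min(best, time.perf_counter() - start)
    return best

if __name__ == "__main__":
    np.show_config()  # confirm which BLAS this environment is actually using
    print(f"best matmul time: {time_matmul():.3f} seconds")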

Since numpy is the foundation of most Python data science, this is where you usually go to check whether you have an optimized BLAS configured: import numpy as np and then np.show_config(). If nothing is configured, you will see a result with a long line of “Not Available” repeated. If you install numpy over pip you will get numpy built with OpenBLAS (so use pip or conda-forge if you have an AMD or ARM CPU), and if you install numpy from the default conda channel you will get numpy built with MKL. Check out https://numpy.org/install/ for the latest details. As long as you see some techno jargon next to the MKL or OpenBLAS sections you should be good to go; there may still be “Not Available” next to other libraries like ATLAS, and that won't matter.

What is important is that you routinely check np.show_config(), because I have found that installing additional packages into the environment, presumably packages that attempt to change the numpy version, can break numpy's connection to MKL or OpenBLAS.
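
Since I ended up re-running that check constantly, here is a small helper along the lines of what I mean; it is only a sketch that captures the show_config() output and searches it for the usual library names (the exact strings printed vary by numpy version):

# check_blas.py -- quick programmatic check of which BLAS numpy is linked against
import contextlib
import io
import numpy as np

def blas_report():
    # capture the text np.show_config() prints and look for known library names
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    config = buffer.getvalue().lower()
    if "mkl" in config:
        return "MKL detected"
    if "openblas" in config:
        return "OpenBLAS detected"
    return "no optimized BLAS detected -- expect slow linear algebra"

print(blas_report())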

The Solution:

Nothing I have written so far deals with the fact that my basic Anaconda install is not properly working with MKL on my newer i7 CPUs. The cause is likely that the version of MKL packaged with Anaconda is a mid-2020 release rather than the end-of-year release. Presumably the mid-2021 release of Anaconda will fix these problems for my 10700 and 1165G7, but it will then likely not work for what will by then be the latest and greatest, the 11700 Rocket Lake and eventually Alder Lake Intel CPUs.

Luckily there is a simple, immediate solution: use Intel's own Python package channel on Anaconda to get MKL.
Or use OpenBLAS, which works pretty well too.

Using Intel’s Anaconda Channel

# create the environment. The Intel distribution does not work with the latest Python yet
conda create -n intelpy -c intel python=3.7 intelpython3_full
conda activate intelpy

# install additional packages as desired
conda install -c intel spyder
conda install -c intel statsmodels

# also check out daal4py: https://intelpython.github.io/daal4py/sklearn.html
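
Once the environment is built, it is worth confirming that MKL is actually active before launching anything long-running. A minimal check along these lines should do; I am assuming the mkl-service package (which provides the mkl module) is present in the Intel distribution, and if it is not, np.show_config() alone still tells you what you need:

# verify_intelpy.py -- run inside the intelpy environment
import numpy as np

np.show_config()  # the MKL sections should list real libraries, not "Not Available"

try:
    import mkl  # provided by the mkl-service package (assumed to be installed here)
    print("MKL max threads:", mkl.get_max_threads())
except ImportError:
    print("mkl-service not installed; rely on the np.show_config() output above")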

Using OpenBLAS with Pip

# create the new environment with conda or venv
conda create -n openpy python=3.9
conda activate openpy

# install packages...
pip install numpy==1.19.3
pip install pandas
pip install statsmodels
pip install scikit-learn

# make sure to double check np.show_config() after installing packages
# for example, I found installing spyder broke the OpenBLAS connection here
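
One extra knob worth knowing about with OpenBLAS is the thread count, since it will otherwise grab every core. Setting the standard OpenBLAS environment variable before numpy is imported is the usual way to control it; the thread count below is only an example value:

# openblas_check.py -- run inside the openpy environment
import os

# must be set before numpy is imported to take effect (4 threads is just an example)
os.environ["OPENBLAS_NUM_THREADS"] = "4"

import numpy as np

np.show_config()  # the openblas sections should be populated, not "Not Available"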

Finally, I should note that you can compile numpy from source yourself against the BLAS/LAPACK of your choice. However, that is significantly more complicated than the two options above, so I have never bothered to try. Check out the numpy documentation for more details.

Interestingly, Raspberry Pis use piwheels by default, a custom package index of pre-compiled packages that are surprisingly fast, so if you are on a Raspberry Pi you can be fairly confident that the default pip install is already about the best available for your machine.

(Image: xkcd #1987, source: https://xkcd.com/1987/)
