We have lately had to read many large OpenEXR image files in Python at work. Unfortunately, the Python package OpenEXR loads these files quite slowly. A single one of our files can easily take more than a minute to open. Other software, such as the image viewer tev, loads the same files in a few seconds.

I therefore started looking into alternative ways to open these files. There are a few other Python packages out there, but those either failed to open the files, loaded them just as slowly, or required manually building and linking a C++ library to work. And after doing said building and linking, I of course started hitting segfaults.

I then noticed that the exr crate in Rust loaded the same files really fast. A file that took over a minute to load in OpenEXR takes only a few seconds to load using this crate. The speed seems to be due to exr using multithreading to load the files and otherwise being well-written and performant.
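If you want to check a comparison like this on your own files, a small timing helper is enough. The sketch below is not taken from any of the packages mentioned; `time_loader` and its parameters are hypothetical names, and it accepts any callable that takes a file path, so it assumes nothing about a particular library's API.

```python
import time


def time_loader(loader, path, repeats=3):
    """Return the fastest wall-clock time in seconds of loader(path).

    `loader` is any callable taking a path, e.g. a function wrapping
    whichever EXR library you want to measure (hypothetical helper).
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        loader(path)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best
```

Passing the load function of each package you want to compare gives directly comparable numbers. Taking the minimum of a few runs reduces noise, although the first run may be slower if the file is not yet in the operating system's cache.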

So I decided to look into how you can expose Rust code to Python these days. It turns out to be incredibly easy thanks to the PyO3 project, which provides such bindings. The PyO3 developers also maintain the numpy Rust crate, which makes it possible to pass NumPy arrays to and from Rust.

On top of it all, there is a great little tool called maturin that lets you develop, build and even publish your Rust-based Python packages to PyPI, the index that pip fetches packages from. It even supports cross-compilation from Linux to Windows, as long as you have added a target such as x86_64-pc-windows-msvc with rustup.

Publishing a cross-compiled package to PyPI is as easy as running something along the lines of:

maturin publish --target x86_64-pc-windows-msvc --interpreter python3.10

In just a few hours, I was able to wrap the functionality we needed from the exr crate and make a Python package based on it that is now on PyPI. Naming is as always the hardest part, but I landed on “pyroexr” which is some kind of play on Python, Rust and OpenEXR.

You can install it using pip:

python -m pip install pyroexr

And open an EXR file and list its channels using a script such as this:

import pyroexr
import matplotlib.pyplot as plt

image = pyroexr.load("Ocean.exr")
print("Channels", image.channels())
# Prints: Channels ['B', 'G', 'R']

And view one of the channels in a plot with imshow:

plt.imshow(image.channel("B"))
plt.show()

[Figure: the B channel of Ocean.exr shown with imshow]

Note that pyroexr is minimal and only supports the functionality we currently need ourselves. For instance, the package assumes you want to load the entire file into memory and that there is only one layer in the file. I have no current plans to extend it further, but contributions are of course welcome.
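Since the whole file ends up in memory, it can help to estimate the size of the result before loading. The sketch below assumes each channel is handed to Python as a 2D array of 32-bit floats, which is what the implementation described below returns. The function `estimate_channel_bytes` and the example resolution are made up for illustration.

```python
def estimate_channel_bytes(width, height, channels=1, bytes_per_sample=4):
    """Estimate memory for `channels` arrays of width x height float32 samples."""
    return width * height * channels * bytes_per_sample


# Hypothetical example: a 4096 x 2160 image with three channels (B, G, R).
size = estimate_channel_bytes(4096, 2160, channels=3)
print(f"{size / 1024**2:.1f} MiB")
```

This only counts the NumPy arrays handed to Python. The decoded image held by the wrapper on the Rust side takes additional memory.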

Implementation

Below, I have listed the main steps I had to perform to expose the functionality we needed from Rust to Python. You can find the full source code for the package on GitHub.

First of all, we expose the module using the #[pymodule] attribute:

#[pymodule]
fn pyroexr(_py: Python, m: &PyModule) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(load, m)?)?;
    Ok(())
}

The load function mentioned above is annotated using #[pyfunction] and is used to load a file and return an image with everything present:

#[pyfunction]
fn load(filename: &str) -> PyResult<ImageWrapper> {
    let image = match exr::prelude::read::read()
        .no_deep_data()
        .largest_resolution_level()
        .all_channels()
        .all_layers()
        .all_attributes()
        .from_file(filename)
    {
        Ok(img) => img,
        Err(err) => {
            return Err(PyRuntimeError::new_err(format!(
                "Could not load file '{filename}' due to error: '{err}'"
            )));
        }
    };

    Ok(ImageWrapper { image })
}
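On the Python side, `PyRuntimeError` surfaces as the built-in `RuntimeError`, so a failed load can be handled with ordinary exception handling. The sketch below is a hypothetical helper and not part of pyroexr: `load_first_readable` takes the loader as an argument, so it can be tried out without the package installed.

```python
def load_first_readable(loader, paths):
    """Return (path, image) for the first path that `loader` can open.

    `loader` is expected to raise RuntimeError on failure, like the
    `load` function above. Returns (None, None) if nothing could be loaded.
    """
    for path in paths:
        try:
            return path, loader(path)
        except RuntimeError as err:
            # The message is whatever the loader put into the exception.
            print(f"Skipping {path}: {err}")
    return None, None
```

With the package installed, this would be called as `load_first_readable(pyroexr.load, ["a.exr", "b.exr"])`, where the file names are placeholders.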

The ImageWrapper is a simple class that wraps the underlying exr::Image:

#[pyclass]
struct ImageWrapper {
    image: Image<SmallVec<[Layer<AnyChannels<FlatSamples>>; 2]>>,
}

Thanks to the numpy crate, we can then expose a method that reads out the data from a specific channel as a 2D NumPy array of 32-bit floats:

#[pymethods]
impl ImageWrapper {
    // [...]

    fn channel<'a>(&self, py: Python<'a>, name: &str) -> PyResult<&'a PyArray2<f32>> {
        let layer = match self.image.layer_data.first() {
            Some(l) => l,
            None => {
                return Err(PyRuntimeError::new_err("Image contains no layers".to_string()));
            }
        };
        let channel = match layer
            .channel_data
            .list
            .iter()
            .find(|channel| channel.name.eq(name))
        {
            Some(c) => c,
            None => {
                return Err(PyKeyError::new_err(format!(
                    "Channel '{name}' not found in image"
                )));
            }
        };
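        // NumPy images are indexed as (row, column), so the shape is (height, width).
        let size = [layer.size.1, layer.size.0];
        // Copy the samples into a flat array, then reshape it to 2D.
        // reshape already returns a PyResult, so it can be returned directly.
        PyArray::from_iter(py, channel.sample_data.values_as_f32()).reshape(size)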
    }
}
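The reshape step is why the shape is given as (height, width) rather than (width, height). It assumes the samples arrive as one flat sequence stored line after line, as the reshape in the code above implies. The pure-Python sketch below mimics that step on a tiny made-up image; `rows_from_flat` is a hypothetical helper for illustration and not part of the package.

```python
def rows_from_flat(samples, width, height):
    """Split a flat, row-major sequence into `height` rows of `width` values."""
    if len(samples) != width * height:
        raise ValueError("sample count does not match width * height")
    return [list(samples[y * width:(y + 1) * width]) for y in range(height)]


# A made-up 3 x 2 image (width 3, height 2), stored line after line.
flat = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
image = rows_from_flat(flat, width=3, height=2)
print(image[1][0])  # row 1, column 0 -> 1.0
```

The resulting array is therefore indexed as `array[y, x]` on the NumPy side, which matches the (rows, columns) layout that `imshow` expects.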

Then the code is compiled using maturin develop and can be tested in Python as shown above.

I am really impressed with how straightforward all of this was and how quickly I could get our files to load faster in Python. Kudos to the PyO3 and exr developers!