Execution providers (EPs) enable ONNX Runtime to execute ONNX graphs with hardware acceleration. If you have specialized hardware like a GPU or NPU, execution providers can provide a massive performance boost to your ort applications. For more information on the intricacies of execution providers, see the ONNX Runtime docs.

ONNX Runtime must be compiled with support for each execution provider. pyke provides precompiled binaries for some of the most common EPs, so you won’t need to compile ONNX Runtime from source. Below is a table showing available EPs, their support in ort, and their binary availability status.

| EP | Supported | Binaries | Static linking |
|:---|:---------:|:--------:|:--------------:|
| NVIDIA CUDA | 🟒 | 🟒 | ❌ |
| NVIDIA TensorRT | 🟒 | 🟒 | ❌ |
| Microsoft DirectML | 🟒 | 🟒 | 🟒 |
| Apple CoreML | 🟒 | 🟒 | 🟒 |
| AMD ROCm | 🟒 | 🟒 | ❌ |
| Intel OpenVINO | 🟒 | ❌ | ❓ |
| Intel oneDNN | 🟒 | ❌ | ❓ |
| XNNPACK | 🟒 | 🟒 | 🟒 |
| Qualcomm QNN | 🟒 | ❌ | ❓ |
| Huawei CANN | 🟒 | ❌ | ❓ |
| Android NNAPI | 🟒 | ❌ | ❓ |
| Apache TVM | 🟒 | ❌ | ❓ |
| Arm ACL | 🟒 | ❌ | ❓ |
| ArmNN | 🟒 | ❌ | ❓ |
| AMD MIGraphX | ❌ | ❌ | ❓ |
| AMD Vitis AI | ❌ | ❌ | ❓ |
| Microsoft Azure | ❌ | ❌ | ❓ |
| Rockchip RKNPU | ❌ | ❌ | ❓ |

Some EPs supported by ONNX Runtime are not supported by ort due to a lack of hardware for testing. If your preferred EP is missing support and you’ve got the hardware, please open an issue!

Registering execution providers

To use an execution provider with ort, you’ll need to enable its respective Cargo feature, e.g. the cuda feature to use CUDA, or the coreml feature to use CoreML.

Cargo.toml
[dependencies]
ort = { version = "2.0", features = [ "cuda" ] }

See Cargo features for the full list of features.
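Multiple EP features can be enabled at once if you want a single build to target several backends. For example, to enable every EP used in the examples below (assuming these feature names match the Cargo features list for your version):

Cargo.toml
[dependencies]
ort = { version = "2.0", features = [ "cuda", "tensorrt", "directml", "coreml" ] }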

To configure sessions to use certain execution providers, you must register them when creating the environment or session. You can do this via the SessionBuilder::with_execution_providers method. For example, to register the CUDA execution provider for a session:

use ort::{CUDAExecutionProvider, Session};

fn main() -> anyhow::Result<()> {
    let session = Session::builder()?
        .with_execution_providers([CUDAExecutionProvider::default().build()])?
        .commit_from_file("model.onnx")?;

    Ok(())
}

You can, of course, specify multiple execution providers. ort will register all EPs specified, in order. If an EP does not support a certain operator in a graph, it will fall back to the next successfully registered EP, or to the CPU if all else fails.

use ort::{CoreMLExecutionProvider, CUDAExecutionProvider, DirectMLExecutionProvider, TensorRTExecutionProvider, Session};

fn main() -> anyhow::Result<()> {
    let session = Session::builder()?
        .with_execution_providers([
            // Prefer TensorRT over CUDA.
            TensorRTExecutionProvider::default().build(),
            CUDAExecutionProvider::default().build(),
            // Use DirectML on Windows if NVIDIA EPs are not available
            DirectMLExecutionProvider::default().build(),
            // Or use ANE on Apple platforms
            CoreMLExecutionProvider::default().build()
        ])?
        .commit_from_file("model.onnx")?;

    Ok(())
}

Configuring EPs

EPs have configuration options to control behavior or increase performance. Each XXXExecutionProvider struct acts as a builder, exposing with_* methods for each of its options; call build() to finalize it for registration. See the API reference for the EP structs for more information on which options are supported and what they do.

use ort::{CoreMLExecutionProvider, Session};

fn main() -> anyhow::Result<()> {
    let session = Session::builder()?
        .with_execution_providers([
            CoreMLExecutionProvider::default()
                // this model uses control flow operators, so enable CoreML on subgraphs too
                .with_subgraphs()
                // only use the ANE as the CoreML CPU implementation is super slow for this model
                .with_ane_only()
                .build()
        ])?
        .commit_from_file("model.onnx")?;

    Ok(())
}

Fallback behavior

If all execution providers fail to register, ort will silently fall back to executing on the CPU. In many cases, though, you’ll want to show the user an error message when an EP fails to register, or abort the process outright.

To receive these registration errors, instead use ExecutionProvider::register to register an execution provider:

use ort::{CUDAExecutionProvider, ExecutionProvider, Session};

fn main() -> anyhow::Result<()> {
    let builder = Session::builder()?;

    let cuda = CUDAExecutionProvider::default();
    if let Err(e) = cuda.register(&builder) {
        eprintln!("Failed to register CUDA: {e}");
        std::process::exit(1);
    }

    let session = builder.commit_from_file("model.onnx")?;

    Ok(())
}

You can also check whether ONNX Runtime was even compiled with support for an execution provider using the is_available method.

use ort::{CoreMLExecutionProvider, ExecutionProvider, Session};

fn main() -> anyhow::Result<()> {
    let builder = Session::builder()?;

    let coreml = CoreMLExecutionProvider::default();
    if !coreml.is_available() {
        eprintln!("Please compile ONNX Runtime with CoreML!");
        std::process::exit(1);
    }

    // Note that even though ONNX Runtime was compiled with CoreML, registration could still fail!
    coreml.register(&builder)?;

    let session = builder.commit_from_file("model.onnx")?;

    Ok(())
}

Global defaults

You can configure ort to attempt to register a list of execution providers for all sessions created in an environment.

use ort::{CUDAExecutionProvider, Session};

fn main() -> anyhow::Result<()> {
    ort::init()
        .with_execution_providers([CUDAExecutionProvider::default().build()])
        .commit()?;

    let session = Session::builder()?.commit_from_file("model.onnx")?;
    // The session will attempt to register the CUDA EP
    // since we configured the environment default.

    Ok(())
}

ort::init must come before you create any sessions, otherwise the configuration will not take effect!

Sessions configured with their own execution providers will extend the execution provider defaults, rather than overriding them.
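For example, under the following setup the session would attempt to register TensorRT in addition to the environment's CUDA default, not instead of it (a minimal sketch reusing the EPs from earlier examples):

use ort::{CUDAExecutionProvider, TensorRTExecutionProvider, Session};

fn main() -> anyhow::Result<()> {
    // Environment default: CUDA.
    ort::init()
        .with_execution_providers([CUDAExecutionProvider::default().build()])
        .commit()?;

    // This session registers TensorRT *in addition to* the CUDA default,
    // rather than replacing it.
    let session = Session::builder()?
        .with_execution_providers([TensorRTExecutionProvider::default().build()])?
        .commit_from_file("model.onnx")?;

    Ok(())
}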

Troubleshooting

If it seems like the execution provider is not registering properly, or you are not getting acceptable performance, see the Troubleshooting: Performance page for more information on how to debug any EP issues.

Notes

CoreML

Statically linking to CoreML (the default behavior when using downloaded binaries + the coreml Cargo feature) requires an additional Rust flag to link properly: you’ll need to pass -C link-arg=-fapple-link-rtlib to rustc. You can do this via an entry in .cargo/config.toml, in a build script (see the sketch after the config below), or through the RUSTFLAGS environment variable.

See Configuration: Hierarchical structure for more information on where the configuration file can be placed.

.cargo/config.toml
[target.aarch64-apple-darwin]
rustflags = ["-Clink-arg=-fapple-link-rtlib"]

[target.x86_64-apple-darwin]
rustflags = ["-Clink-arg=-fapple-link-rtlib"]
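If you’d rather not commit a config file, a build script can emit the same link argument. Here’s a minimal sketch; the cargo:rustc-link-arg directive requires a reasonably recent Cargo, and checking the TARGET environment variable is just one way to limit the flag to Apple targets:

build.rs
fn main() {
    // Build scripts run on the host, so inspect the TARGET environment
    // variable rather than cfg! to detect the compilation target.
    let target = std::env::var("TARGET").unwrap_or_default();
    if target.contains("apple") {
        // Emit the extra link argument needed for static CoreML linking.
        println!("cargo:rustc-link-arg=-fapple-link-rtlib");
    }
}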