Migrating from v1.x to v2

`Environment` no more

The Environment struct has been removed. Only one Environment was allowed per process, so it didn’t really make sense to have an environment as a struct.

To configure an Environment, you instead use the ort::init function, which returns the same EnvironmentBuilder as v1.x. Use commit() to then commit the environment.

ort::init()
    .with_execution_providers([CUDAExecutionProvider::default().build()])
    .commit()?;

commit() must be called before any sessions are created to take effect. Otherwise, a default environment will be created. The global environment can be updated afterward by calling commit() on another EnvironmentBuilder, however you’ll need to recreate sessions after comitting the new environment in order for them to use it.

Value specialization

The Value struct has been refactored into multiple strongly-typed structs: Tensor<T>, Map<K, V>, and Sequence<T>, and their type-erased variants: DynTensor, DynMap, and DynSequence.

Values returned by session inference are now DynValues, which behave exactly the same as Value in previous versions.

Tensors created from Rust, like via the new Tensor::new function, can be directly and infallibly extracted into its underlying data via extract_tensor (no try_):

let allocator = Allocator::new(&session, MemoryInfo::new(AllocationDevice::CUDA_PINNED, 0, AllocatorType::Device, MemoryType::CPUInput)?)?;
let tensor = Tensor::<f32>::new(&allocator, [1, 128, 128, 3])?;
 
let array = tensor.extract_array();
// no need to specify type or handle errors - Tensor<f32> can only extract into an f32 ArrayView

You can still extract tensors, maps, or sequence values normally from a DynValue using try_extract_*:

let generated_tokens: ArrayViewD<f32> = outputs["output1"].try_extract_tensor()?;

DynValue can be downcast()ed to the more specialized types, like DynMap or Tensor<T>:

let tensor: Tensor<f32> = value.downcast()?;
let map: DynMap = value.downcast()?;

Similarly, a strongly-typed value like Tensor<T> can be upcast back into a DynValue or DynTensor.

let dyn_tensor: DynTensor = tensor.upcast();
let dyn_value: DynValue = tensor.into_dyn();

Tensor extraction directly returns an `ArrayView`

The new extract_tensor and try_extract_tensor functions return an ndarray::ArrayView directly, instead of putting it behind the old ort::value::Tensor<T> type (not to be confused with the new specialized value type). This means you don’t have to .view() on the result:

-let generated_tokens: Tensor<f32> = outputs["output1"].try_extract()?;
-let generated_tokens = generated_tokens.view();
+let generated_tokens: ArrayViewD<f32> = outputs["output1"].try_extract_tensor()?;

Full support for sequence & map values

You can now construct and extract Sequence/Map values.

Value views

You can now obtain a view of any Value via the new view() and view_mut() functions, which operate similar to ndarray’s own view system. These views can also now be passed into session inputs.

Mutable tensor extraction

You can extract a mutable ArrayViewMut or &mut [T] from a mutable reference to a tensor.

let (raw_shape, raw_data) = tensor.extract_raw_tensor_mut();

Device-allocated tensors

You can now create a tensor on device memory with Tensor::new & an allocator:

let allocator = Allocator::new(&session, MemoryInfo::new(AllocationDevice::CUDA_PINNED, 0, AllocatorType::Device, MemoryType::CPUInput)?)?;
let tensor = Tensor::<f32>::new(&allocator, [1, 128, 128, 3])?;

The data will be allocated by the device specified by the allocator. You can then use the new mutable tensor extraction to modify the tensor’s data.

Session creation

SessionBuilder::new(&environment) has been soft-replaced with Session::builder():

-// v1.x
-let session = SessionBuilder::new(&environment)?.with_model_from_file("model.onnx")?;
+// v2
+let session = Session::builder()?.commit_from_file("model.onnx")?;

`SessionBuilder::with_model_` -> `SessionBuilder::commit_`

The final SessionBuilder methods have been renamed for clarity.

SessionBuilder::with_model_from_file -> SessionBuilder::commit_from_file
SessionBuilder::with_model_from_memory -> SessionBuilder::commit_from_memory
SessionBuilder::with_model_from_memory_directly -> SessionBuilder::commit_from_memory_directly
SessionBuilder::with_model_downloaded -> SessionBuilder::commit_from_url

Session inputs

`CowArray`/`IxDyn`/`ndarray` no longer required

One of the biggest usability changes is that the usual pattern of CowArray::from(array.into_dyn()) is no longer required to create tensors. Now, tensors can be created from:

Owned Arrays of any dimensionality
ArrayViews of any dimensionality
Shared references to CowArrays of any dimensionality (i.e. &CowArray<'_, f32, Ix3>)
Mutable references to ArcArrays of any dimensionality (i.e. &mut ArcArray<f32, Ix3>)
A raw shape definition & data array, of type (Vec<i64>, Arc<Box<[T]>>)

-// v1.x
-let mut tokens = CowArray::from(Array1::from_iter(tokens.iter().cloned()).into_dyn());
+// v2
+let mut tokens = Array1::from_iter(tokens.iter().cloned());

It should be noted that there are some cases in which an array is cloned when converting into a tensor which may lead to a surprising performance hit. ONNX Runtime does not expose an API to specify the strides of a tensor, so if an array is reshaped before being converted into a tensor, it must be cloned in order to make the data contiguous. Specifically:

&CowArray, ArrayView will always be cloned (due to the fact that we cannot guarantee the lifetime of the array).
Array, &mut ArcArray will only be cloned if the memory layout is not contiguous, i.e. if it has been reshaped.
Raw data will never be cloned as it is assumed to already have a contiguous memory layout.

`ort::inputs!` macro

v2.0 makes the transition to the new input/output system easier by providing an inputs! macro. This new macro allows you to specify inputs either by position as they appear in the graph (like previous versions), or by name.

The ort::inputs! macro will painlessly convert compatible data types (see above) into the new inputs system.

-// v1.x
-let chunk_embeddings = text_encoder.run(&[CowArray::from(text_input_chunk.into_dyn())])?;
+// v2
+let chunk_embeddings = text_encoder.run(ort::inputs![text_input_chunk]?)?;

Note the ? after the macro call - ort::inputs! returns an ort::Result<SessionInputs>, so you’ll need to handle any errors accordingly.

As mentioned, you can now also specify inputs by name using a map-like syntax. This is especially useful for graphs with optional inputs.

let noise_pred = unet.run(ort::inputs![
    "latents" => latents,
    "timestep" => Array1::from_iter([t]),
    "encoder_hidden_states" => text_embeddings.view()
]?)?;

Tensor creation no longer requires the session’s allocator

In previous versions, Value::from_array took an allocator parameter. The allocator was only used because the string data contained in string tensors had to be cloned into ONNX Runtime-managed memory. However, 99% of users only ever use primitive tensors, so the extra parameter served little purpose. The new Tensor::from_array function now takes only an array, and the logic for converting string arrays has been moved to a new function, DynTensor::from_string_array.

-// v1.x
-let val = Value::from_array(session.allocator(), &array)?;
+// v2
+let val = Tensor::from_array(&array)?;

Separate string tensor creation

As previously mentioned, the logic for creating string tensors has been moved from Value::from_array to DynTensor::from_string_array.

To use string tensors with ort::inputs!, you must create a DynTensor using DynTensor::from_string_array.

let array = ndarray::Array::from_shape_vec((1,), vec![document]).unwrap();
let outputs = session.run(ort::inputs![
    "input" => DynTensor::from_string_array(session.allocator(), array)?
]?)?;

Session outputs

New: Retrieve outputs by name

Just like how inputs can now be specified by name, you can now retrieve session outputs by name.

let l = outputs["latents"].try_extract_tensor::<f32>()?;

Execution providers

Execution provider structs with public fields have been replaced with builder pattern structs. See the API reference and the execution providers reference for more information.

-// v1.x
-builder = builder.with_execution_providers(ExecutionProvider::DirectML(DirectMLExecutionProvider {
-    device_id: 1
-}))?;
+// v2
+builder = builder.with_execution_providers([
+    DirectMLExecutionProvider::default()
+        .with_device_id(1)
+        .build()
+])?;

Updated dependencies & features

`ndarray` is now optional

The dependency on ndarray is now declared optional. If you use ort with default-features = false, you’ll need to add the ndarray feature.

Model Zoo structs have been removed

ONNX pushed a new Model Zoo structure that adds hundreds of different models. This is impractical to maintain, so the built-in structs have been removed.

You can still use Session::commit_from_url, it just now takes a URL string instead of a struct.

Changes to logging

Environment-level logging configuration (i.e. EnvironmentBuilder::with_log_level) has been removed because it could cause unnecessary confusion with our tracing integration.

Renamed types

The following types have been renamed with no other changes.

NdArrayExtensions -> ArrayExtensions
OrtResult, OrtError -> ort::Result, ort::Error
TensorDataToType -> ExtractTensorData
TensorElementDataType, IntoTensorElementDataType -> TensorElementType, IntoTensorElementType

Last updated on November 13, 2024

Performance Version mapping

Environment no more