Environment
no more
The Environment
struct has been removed. Only one Environment
was allowed per process, so it didn’t really make sense to have an environment as a struct.
To configure an Environment
, you instead use the ort::init
function, which returns the same EnvironmentBuilder
as v1.x. Use commit()
to then commit the environment.
ort::init()
.with_execution_providers([CUDAExecutionProvider::default().build()])
.commit()?;
commit()
must be called before any sessions are created to take effect. Otherwise, a default environment will be created. The global environment can be updated afterward by calling commit()
on another EnvironmentBuilder
, however you’ll need to recreate sessions after comitting the new environment in order for them to use it.
Value specialization
The Value
struct has been refactored into multiple strongly-typed structs: Tensor<T>
, Map<K, V>
, and Sequence<T>
, and their type-erased variants: DynTensor
, DynMap
, and DynSequence
.
Values returned by session inference are now DynValue
s, which behave exactly the same as Value
in previous versions.
Tensors created from Rust, like via the new Tensor::new
function, can be directly and infallibly extracted into its underlying data via extract_tensor
(no try_
):
let allocator = Allocator::new(&session, MemoryInfo::new(AllocationDevice::CUDA_PINNED, 0, AllocatorType::Device, MemoryType::CPUInput)?)?;
let tensor = Tensor::<f32>::new(&allocator, [1, 128, 128, 3])?;
let array = tensor.extract_array();
// no need to specify type or handle errors - Tensor<f32> can only extract into an f32 ArrayView
You can still extract tensors, maps, or sequence values normally from a DynValue
using try_extract_*
:
let generated_tokens: ArrayViewD<f32> = outputs["output1"].try_extract_tensor()?;
DynValue
can be downcast()
ed to the more specialized types, like DynMap
or Tensor<T>
:
let tensor: Tensor<f32> = value.downcast()?;
let map: DynMap = value.downcast()?;
Similarly, a strongly-typed value like Tensor<T>
can be upcast back into a DynValue
or DynTensor
.
let dyn_tensor: DynTensor = tensor.upcast();
let dyn_value: DynValue = tensor.into_dyn();
Tensor extraction directly returns an ArrayView
The new extract_tensor
and try_extract_tensor
functions return an ndarray::ArrayView
directly, instead of putting it behind the old ort::value::Tensor<T>
type (not to be confused with the new specialized value type). This means you don’t have to .view()
on the result:
-let generated_tokens: Tensor<f32> = outputs["output1"].try_extract()?;
-let generated_tokens = generated_tokens.view();
+let generated_tokens: ArrayViewD<f32> = outputs["output1"].try_extract_tensor()?;
Full support for sequence & map values
You can now construct and extract Sequence
/Map
values.
Value views
You can now obtain a view of any Value
via the new view()
and view_mut()
functions, which operate similar to ndarray
’s own view system. These views can also now be passed into session inputs.
Mutable tensor extraction
You can extract a mutable ArrayViewMut
or &mut [T]
from a mutable reference to a tensor.
let (raw_shape, raw_data) = tensor.extract_raw_tensor_mut();
Device-allocated tensors
You can now create a tensor on device memory with Tensor::new
& an allocator:
let allocator = Allocator::new(&session, MemoryInfo::new(AllocationDevice::CUDA_PINNED, 0, AllocatorType::Device, MemoryType::CPUInput)?)?;
let tensor = Tensor::<f32>::new(&allocator, [1, 128, 128, 3])?;
The data will be allocated by the device specified by the allocator. You can then use the new mutable tensor extraction to modify the tensor’s data.
Session creation
SessionBuilder::new(&environment)
has been soft-replaced with Session::builder()
:
-// v1.x
-let session = SessionBuilder::new(&environment)?.with_model_from_file("model.onnx")?;
+// v2
+let session = Session::builder()?.commit_from_file("model.onnx")?;
SessionBuilder::with_model_*
-> SessionBuilder::commit_*
The final SessionBuilder
methods have been renamed for clarity.
SessionBuilder::with_model_from_file
->SessionBuilder::commit_from_file
SessionBuilder::with_model_from_memory
->SessionBuilder::commit_from_memory
SessionBuilder::with_model_from_memory_directly
->SessionBuilder::commit_from_memory_directly
SessionBuilder::with_model_downloaded
->SessionBuilder::commit_from_url
Session inputs
CowArray
/IxDyn
/ndarray
no longer required
One of the biggest usability changes is that the usual pattern of CowArray::from(array.into_dyn())
is no longer required to create tensors. Now, tensors can be created from:
- Owned
Array
s of any dimensionality ArrayView
s of any dimensionality- Shared references to
CowArray
s of any dimensionality (i.e.&CowArray<'_, f32, Ix3>
) - Mutable references to
ArcArray
s of any dimensionality (i.e.&mut ArcArray<f32, Ix3>
) - A raw shape definition & data array, of type
(Vec<i64>, Arc<Box<[T]>>)
-// v1.x
-let mut tokens = CowArray::from(Array1::from_iter(tokens.iter().cloned()).into_dyn());
+// v2
+let mut tokens = Array1::from_iter(tokens.iter().cloned());
It should be noted that there are some cases in which an array is cloned when converting into a tensor which may lead to a surprising performance hit. ONNX Runtime does not expose an API to specify the strides of a tensor, so if an array is reshaped before being converted into a tensor, it must be cloned in order to make the data contiguous. Specifically:
&CowArray
,ArrayView
will always be cloned (due to the fact that we cannot guarantee the lifetime of the array).Array
,&mut ArcArray
will only be cloned if the memory layout is not contiguous, i.e. if it has been reshaped.- Raw data will never be cloned as it is assumed to already have a contiguous memory layout.
ort::inputs!
macro
v2.0 makes the transition to the new input/output system easier by providing an inputs!
macro. This new macro allows you to specify inputs either by position as they appear in the graph (like previous versions), or by name.
The ort::inputs!
macro will painlessly convert compatible data types (see above) into the new inputs system.
-// v1.x
-let chunk_embeddings = text_encoder.run(&[CowArray::from(text_input_chunk.into_dyn())])?;
+// v2
+let chunk_embeddings = text_encoder.run(ort::inputs![text_input_chunk]?)?;
Note the ?
after the macro call - ort::inputs!
returns an ort::Result<SessionInputs>
, so you’ll need to handle any errors accordingly.
As mentioned, you can now also specify inputs by name using a map-like syntax. This is especially useful for graphs with optional inputs.
let noise_pred = unet.run(ort::inputs![
"latents" => latents,
"timestep" => Array1::from_iter([t]),
"encoder_hidden_states" => text_embeddings.view()
]?)?;
Tensor creation no longer requires the session’s allocator
In previous versions, Value::from_array
took an allocator parameter. The allocator was only used because the string data contained in string tensors had to be cloned into ONNX Runtime-managed memory. However, 99% of users only ever use primitive tensors, so the extra parameter served little purpose. The new Tensor::from_array
function now takes only an array, and the logic for converting string arrays has been moved to a new function, DynTensor::from_string_array
.
-// v1.x
-let val = Value::from_array(session.allocator(), &array)?;
+// v2
+let val = Tensor::from_array(&array)?;
Separate string tensor creation
As previously mentioned, the logic for creating string tensors has been moved from Value::from_array
to DynTensor::from_string_array
.
To use string tensors with ort::inputs!
, you must create a DynTensor
using DynTensor::from_string_array
.
let array = ndarray::Array::from_shape_vec((1,), vec![document]).unwrap();
let outputs = session.run(ort::inputs![
"input" => DynTensor::from_string_array(session.allocator(), array)?
]?)?;
Session outputs
New: Retrieve outputs by name
Just like how inputs can now be specified by name, you can now retrieve session outputs by name.
let l = outputs["latents"].try_extract_tensor::<f32>()?;
Execution providers
Execution provider structs with public fields have been replaced with builder pattern structs. See the API reference and the execution providers reference for more information.
-// v1.x
-builder = builder.with_execution_providers(ExecutionProvider::DirectML(DirectMLExecutionProvider {
- device_id: 1
-}))?;
+// v2
+builder = builder.with_execution_providers([
+ DirectMLExecutionProvider::default()
+ .with_device_id(1)
+ .build()
+])?;
Updated dependencies & features
ndarray
is now optional
The dependency on ndarray
is now declared optional. If you use ort
with default-features = false
, you’ll need to add the ndarray
feature.
Model Zoo structs have been removed
ONNX pushed a new Model Zoo structure that adds hundreds of different models. This is impractical to maintain, so the built-in structs have been removed.
You can still use Session::commit_from_url
, it just now takes a URL string instead of a struct.
Changes to logging
Environment-level logging configuration (i.e. EnvironmentBuilder::with_log_level
) has been removed because it could cause unnecessary confusion with our tracing
integration.
Renamed types
The following types have been renamed with no other changes.
NdArrayExtensions
->ArrayExtensions
OrtResult
,OrtError
->ort::Result
,ort::Error
TensorDataToType
->ExtractTensorData
TensorElementDataType
,IntoTensorElementDataType
->TensorElementType
,IntoTensorElementType