Merged
6 changes: 0 additions & 6 deletions .azure_pipelines/olive-ci.yaml
@@ -92,9 +92,6 @@ jobs:
       resnet_ptq_cpu:
         exampleFolder: resnet
         exampleName: resnet_ptq_cpu
-      whisper:
-        exampleFolder: whisper
-        exampleName: whisper
       mobilenet_qnn_ep:
         exampleFolder: mobilenet/qnn
         exampleName: mobilenet_qnn_ep
@@ -112,9 +109,6 @@ jobs:
       resnet_ptq_cpu:
         exampleFolder: resnet
         exampleName: resnet_ptq_cpu
-      whisper:
-        exampleFolder: whisper
-        exampleName: whisper
       mobilenet_qnn_ep:
         exampleFolder: mobilenet/qnn
         exampleName: mobilenet_qnn_ep
6 changes: 0 additions & 6 deletions .azure_pipelines/olive-ort-nightly.yaml
@@ -61,9 +61,6 @@ jobs:
       resnet_qat:
         exampleFolder: resnet
         exampleName: resnet_qat
-      whisper:
-        exampleFolder: whisper
-        exampleName: whisper
       mobilenet_qnn_ep:
         exampleFolder: mobilenet/qnn
         exampleName: mobilenet_qnn_ep
@@ -82,9 +79,6 @@ jobs:
       resnet_ptq_cpu:
         exampleFolder: resnet
         exampleName: resnet_ptq_cpu
-      whisper:
-        exampleFolder: whisper
-        exampleName: whisper
       mobilenet_qnn_ep:
         exampleFolder: mobilenet/qnn
         exampleName: mobilenet_qnn_ep
1 change: 0 additions & 1 deletion docs/source/examples.md
@@ -16,7 +16,6 @@
 ||deberta|[Link](https://github.com/microsoft/Olive/tree/main/examples/deberta)|`GPU`: Optimize Azureml Registry Model with ONNX Runtime optimizations and quantization
 ||gptj|[Link](https://github.com/microsoft/Olive/tree/main/examples/gptj)|`CPU`: with Intel® Neural Compressor static/dynamic quantization for INT8 ONNX model
 ||bge|[Link](https://github.com/microsoft/Olive/tree/main/examples/bge)|`NPU`: with ONNX Runtime optimizations for QNN EP
-|Audio|whisper|[Link](https://github.com/microsoft/Olive/tree/main/examples/whisper)|`CPU`: with ONNX Runtime optimizations for all-in-one ONNX model in FP32<br>`CPU`: with ONNX Runtime optimizations for all-in-one ONNX model in INT8<br>`CPU`: with ONNX Runtime optimizations and Intel® Neural Compressor Dynamic Quantization for all-in-one ONNX model in INT8<br>`GPU`: with ONNX Runtime optimizations for all-in-one ONNX model in FP32<br>`GPU`: with ONNX Runtime optimizations for all-in-one ONNX model in FP16<br>`GPU`: with ONNX Runtime optimizations for all-in-one ONNX model in INT8
 ||audio spectrogram<br>transformer|[Link](https://github.com/microsoft/Olive/tree/main/examples/ast)|`CPU`: with ONNX Runtime optimizations and quantization for optimized INT8 ONNX model
 |Vision|stable diffusion|[Link](https://github.com/microsoft/Olive/tree/main/examples/stable_diffusion)|`GPU`: with ONNX Runtime optimization for DirectML EP<br>`GPU`: with ONNX Runtime optimization for CUDA EP<br>`Intel CPU`: with OpenVINO toolkit<br>`QDQ`: with ONNX Runtime static Quantization for ONNX INT8 model with QDQ format
 ||stable diffusion XL|[Link](https://github.com/microsoft/Olive/tree/main/examples/directml/stable_diffusion_xl)|`GPU`: with ONNX Runtime optimizations with DirectML EP<br>`GPU`: with ONNX Runtime optimization for CUDA EP
23 changes: 0 additions & 23 deletions docs/source/features/onnx-transformations.md
@@ -1017,16 +1017,6 @@ graph {
 }
 ```
 
-```json
-{
-    "type": "AppendPrePostProcessingOps",
-    "tool_command": "whisper",
-    "tool_command_args": {
-        "use_audio_decoder": true
-    }
-}
-```
-
 `AppendPrePostProcessingOps` also supports pre/post processing ops by leveraging the [onnxruntime-extension steps](https://github.com/microsoft/onnxruntime-extensions/tree/main/onnxruntime_extensions/tools/pre_post_processing/steps) and `PrePostProcessor`.
 You can refer to [here](https://github.com/microsoft/onnxruntime-extensions/blob/main/onnxruntime_extensions/tools/Example%20usage%20of%20the%20PrePostProcessor.md) to see how to leverage `PrePostProcessor` to customize pre and post processing ops.
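As supplementary context for the pass touched in this hunk (not part of the diff itself): with the whisper `tool_command` removed, `AppendPrePostProcessingOps` can still be configured through explicit `pre`/`post` step lists. A rough sketch follows; the step names (`ConvertImageToBGR`, `Softmax`) and the `target_opset` field are assumptions to verify against the Olive pass reference and the onnxruntime-extensions step catalog.

```json
{
    "type": "AppendPrePostProcessingOps",
    "pre": [
        { "ConvertImageToBGR": {} }
    ],
    "post": [
        { "Softmax": {} }
    ],
    "target_opset": 16
}
```

Since JSON has no comments, the hedge above stands in for one: treat every key here as illustrative, not authoritative.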

@@ -1130,19 +1120,6 @@ Here are some examples to describe the pre/post processing which is exactly same
 }
 ```
 
-## Insert Beam Search Op
-
-`InsertBeamSearch` chains two model components (for example, encoder and decoder) together by inserting beam search op in between them.
-
-### Example Configuration
-
-```json
-{
-    "type": "InsertBeamSearch",
-    "no_repeat_ngram_size": 4
-}
-```
-
 ## ORT Performance Tuning
 
 ONNX Runtime provides high performance across a range of hardware options through its Execution Providers interface for different execution
1 change: 0 additions & 1 deletion docs/source/reference/options.md
@@ -405,7 +405,6 @@ Please also find the detailed options from following table for each pass:
 | [IncQuantization](../../reference/pass.rst#_inc_quantization) | Quantize ONNX model with Intel® Neural Compressor where we can search for best parameters for static/dynamic quantization at same time. |
 | [VitisAIQuantization](../../reference/pass.rst#_vitis_ai_quantization) | AMD-Xilinx Vitis-AI Quantization Pass. |
 | [AppendPrePostProcessingOps](../../reference/pass.rst#_append_pre_post_processing) | Add Pre/Post nodes to the input model. |
-| [InsertBeamSearch](../../reference/pass.rst#_insert_beam_search) | Insert Beam Search Op. Only used for whisper models. Uses WhisperBeamSearch contrib op if ORT version >= 1.17.1, else uses BeamSearch contrib op. |
 | [ExtractAdapters](../../reference/pass.rst#_extract_adapters) | Extract adapters from ONNX model |
 | [CaptureSplitInfo](../../reference/pass.rst#_capture_split_info) | Capture the split information of the model layers. Only splits the transformer layers. |
 | [SplitModel](../../reference/pass.rst#_split_model) | Split an ONNX model into multiple smaller sub-models based on predefined assignments. |
6 changes: 0 additions & 6 deletions docs/source/reference/pass.rst
@@ -152,12 +152,6 @@ AppendPrePostProcessingOps
 --------------------------
 .. autoconfigclass:: olive.passes.AppendPrePostProcessingOps
 
-.. _insert_beam_search:
-
-InsertBeamSearch
-----------------
-.. autoconfigclass:: olive.passes.InsertBeamSearch
-
 .. _extract_adapters:
 
 ExtractAdapters
53 changes: 0 additions & 53 deletions examples/test/local/test_whisper.py

This file was deleted.

2 changes: 0 additions & 2 deletions examples/whisper/.gitignore

This file was deleted.

169 changes: 0 additions & 169 deletions examples/whisper/README.md

This file was deleted.
