[Web] How to free webgpu gpu mem in onnxruntime web #21574

@soyoapp

Describe the issue

I use onnxruntime-web with the following code:

/**
 * Run one inference with a short-lived session.
 * @param model Path to the model. The session is created inside this function instead of being passed in, with the intent that it is released as soon as inference finishes so GPU memory does not overflow
 * @param inputTensor Input tensor, fed to the model's first input
 */
export async function infer2(model: string, inputTensor: Tensor) {
  const session = await newSession(model)
  const feeds: Record<string, Tensor> = {};
  feeds[session.inputNames[0]] = inputTensor;
  const results = await session.run(feeds);
  const tensor = results[session.outputNames[0]]
  await session.release() // free gpu mem
  return tensor;
}
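
One gap in `infer2` above is that `session.release()` is skipped if `session.run()` throws. A minimal sketch of a `try`/`finally` wrapper that closes that gap is shown below. `withSession` and `Releasable` are hypothetical names introduced for illustration; the only thing assumed about the session is an async `release()` method, which is what the code above already calls. This only guarantees that the call happens, and says nothing about whether the browser then returns the memory to the driver.

```typescript
// Minimal shape the helper relies on (assumption: the session object
// exposes an async release(), as used in infer2 above).
interface Releasable {
  release(): Promise<void>;
}

// Hypothetical helper: create a session, hand it to `use`, and always
// release it afterwards, even when `use` throws.
async function withSession<S extends Releasable, R>(
  create: () => Promise<S>,
  use: (session: S) => Promise<R>,
): Promise<R> {
  const session = await create();
  try {
    return await use(session);
  } finally {
    await session.release();
  }
}
```

With this helper, the body of `infer2` could be written as `withSession(() => newSession(model), (s) => s.run(feeds))`.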

/**
 * Load the ONNX model and perform inference
 * @param model Path to the model. The session is created inside infer2 instead of being passed in, with the intent that it is released as soon as inference finishes so GPU memory does not overflow
 * @param input Input data as an Ndarray
 * @returns Output data and shape as an Ndarray
 */
export const infer = async (model: string, input: Ndarray) => {
  const inputTensor = ndarrayToTensor(input)
  const outTensor = await infer2(model, inputTensor);
  // Copy the output into a plain array before disposing the tensors
  const na = new Ndarray(Array.from(outTensor.data as Float32Array), outTensor.dims as number[])
  inputTensor.dispose()
  outTensor.dispose()
  return na
};
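
The copy-then-dispose step in `infer` can be sketched as a small helper that disposes the tensor even if the copy fails. `detach`, `DisposableTensor` and `PlainResult` are hypothetical names, and the sketch assumes a CPU-side float tensor exposing `data`, `dims` and `dispose()`, which are the members the code above uses. It copies into a `Float32Array` instead of a `number[]`, which keeps the copy compact for image-sized outputs.

```typescript
// Minimal tensor shape assumed for this sketch (CPU-side float data).
interface DisposableTensor {
  readonly data: ArrayLike<number>;
  readonly dims: readonly number[];
  dispose(): void;
}

interface PlainResult {
  data: Float32Array;
  dims: number[];
}

// Hypothetical helper: copy the values out, then dispose the tensor.
// The finally block runs whether or not the copy succeeds.
function detach(tensor: DisposableTensor): PlainResult {
  try {
    return { data: Float32Array.from(tensor.data), dims: [...tensor.dims] };
  } finally {
    tensor.dispose();
  }
}
```

The returned object owns its own buffer, so it stays valid after the tensor is disposed.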

The following is my test code:

  let input = await imgToNdarray(t);
  let out = await infer(model, input)
  let imgDataUrl = outToImgDataUrl(out)
  testReact(<img src={imgDataUrl}/>)

After inference finishes, nvidia-smi shows that the GPU memory is still in use. Only refreshing or closing the browser tab frees it.

To reproduce

Run the code above.

Urgency

No response

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.17.3

Execution Provider

'webgpu' (WebGPU)

Env

Microsoft Edge 127.0.2651.74 (Official build) (64-bit)
Revision dbf5b0aa014c4e70e3d5e2d73248e21264f82957
Chromium version 127.0.6533.73
Operating system Linux
JavaScript V8 12.7.18.6
User agent Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36 Edg/127.0.0.0
Command-line /usr/bin/microsoft-edge --disable-web-security --password-store=basic --user-data-dir=/home/roroco/.config/JetBrains/WebStorm2023.2/edge-user-data --remote-debugging-port=39765 --no-default-browser-check --flag-switches-begin --enable-unsafe-webgpu --enable-features=Vulkan --flag-switches-end about:blank

Labels

api:Javascript (issues related to the Javascript API)
ep:WebGPU (ort-web webgpu provider)
platform:web (issues related to ONNX Runtime web; typically submitted using template)
stale (issues that have not been addressed in a while; categorized by a bot)