Description
If the input layer of an object detector is not square, libnvinfer_plugin does not produce correct bounding boxes: the GridAnchor kernel derives the anchor stride and offset from param.H alone for both axes, and the prior-box widths ignore the input aspect ratio. An incomplete fix by @rajeevsrao was merged to master in #679, but parts of it are no longer present in master. The issue was also discussed in #807.
Environment
- NVIDIA Jetson TX2
- Jetpack 4.6 [L4T 32.6.1]
- NV Power Mode: MAXP_CORE_ARM - Type: 3
- jetson_stats.service: active
- Libraries:
- CUDA: 10.2.300
- cuDNN: 8.2.1.32
- TensorRT: 8.0.1.6
- Visionworks: 1.6.0.501
- OpenCV: 4.1.1 (compiled with CUDA: NO)
- VPI: libnvvpi1 1.1.12
- Vulkan: 1.2.70
Relevant Files and Fix
The following changes fix the issue.
modified plugin/common/kernels/gridAnchorLayer.cu
@@ -34,8 +34,10 @@ __launch_bounds__(nthdsPerCTA) __global__ void gridAnchorKernel(const GridAnchor
* the image
* Every coordinate will go back to the pixel coordinates in the input image if being multiplied by
* image_input_size
* Here we implicitly assumes the image input and feature map are square
*/
- float anchorStride = (1.0 / param.H);
- float anchorOffset = 0.5 * anchorStride;
+ float anchorStrideH = (1.0 / param.H);
+ float anchorOffsetH = 0.5 * anchorStrideH;
+ float anchorStrideW = (1.0 / param.W);
+ float anchorOffsetW = 0.5 * anchorStrideW;
int tid = blockIdx.x * blockDim.x + threadIdx.x;
if (tid >= dim)
@@ -47,8 +49,8 @@ __launch_bounds__(nthdsPerCTA) __global__ void gridAnchorKernel(const GridAnchor
const int h = currIndex / param.W;
// Center coordinates
- float yC = h * anchorStride + anchorOffset;
- float xC = w * anchorStride + anchorOffset;
+ float yC = h * anchorStrideH + anchorOffsetH;
+ float xC = w * anchorStrideW + anchorOffsetW;
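For intuition, here is a minimal standalone host-side sketch (not part of the patch; the 40x23 feature-map size is hypothetical, borrowed from the first two featureMapShapes values below). With the old square assumption the x centers are spaced 1/H apart even though there are W columns, so they run well past the right edge of the image:

#include <cstdio>

int main()
{
    const int W = 40, H = 23; // hypothetical rectangular feature map

    // Old kernel: a single stride derived from H is used for both axes.
    const float strideOld = 1.0f / H;
    const float xOld = (W - 1) * strideOld + 0.5f * strideOld;

    // Patched kernel: a per-axis stride keeps the x centers inside [0, 1].
    const float strideW = 1.0f / W;
    const float xNew = (W - 1) * strideW + 0.5f * strideW;

    std::printf("rightmost xC: old = %.3f (outside image), fixed = %.3f\n", xOld, xNew);
    return 0;
}

This prints roughly old = 1.717 and fixed = 0.988, i.e. the unpatched kernel places the rightmost anchor centers far outside the normalized image.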
modified plugin/gridAnchorPlugin/gridAnchorPlugin.cpp
@@ -109,11 +109,13 @@ GridAnchorGenerator::GridAnchorGenerator(const GridAnchorParameters* paramIn, in
std::vector<float> tmpWidths;
std::vector<float> tmpHeights;
+ float featMapAspectRatio = (float) (mParam[0].H) / (float) (mParam[0].W);
+ // TODO: calculate the ratio with the input layer height and width instead.
// Calculate the width and height of the prior boxes
for (int i = 0; i < mNumPriors[id]; i++)
{
float sqrt_AR = sqrt(aspect_ratios[i]);
- tmpWidths.push_back(scales[i] * sqrt_AR);
+ tmpWidths.push_back(scales[i] * sqrt_AR * featMapAspectRatio);
tmpHeights.push_back(scales[i] / sqrt_AR);
}
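A quick sanity check of the width correction (a standalone sketch, not the plugin code; the 400x300 input comes from the repro steps below, the aspect ratio and scale are illustrative values from the plugin example). It uses the input-layer ratio directly, which is what the TODO above suggests, while the patch approximates it with the feature-map ratio mParam[0].H / mParam[0].W:

#include <cmath>
#include <cstdio>

int main()
{
    const float inputW = 400.0f, inputH = 300.0f; // rectangular input layer
    const float aspectRatio = 2.0f;               // one entry of aspectRatios
    const float scale = 0.2f;                     // minSize, illustrative

    const float sqrtAR = std::sqrt(aspectRatio);
    const float heightN = scale / sqrtAR;  // normalized prior height
    const float widthOld = scale * sqrtAR; // normalized width, square assumption
    const float widthNew = widthOld * (inputH / inputW); // corrected width

    // Aspect ratio measured in input pixels:
    std::printf("old = %.2f, fixed = %.2f, intended = %.2f\n",
        (widthOld * inputW) / (heightN * inputH),
        (widthNew * inputW) / (heightN * inputH), aspectRatio);
    return 0;
}

This prints old = 2.67, fixed = 2.00, intended = 2.00: without the correction, a normalized aspect ratio of 2.0 on a 400x300 input becomes 2.67 in pixel space.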
Steps To Reproduce
- Use an SSD object detection model with a rectangular input layer, for example 400x300.
- Convert it to TensorRT as in https://github.com/NVIDIA/TensorRT/tree/main/samples/python/uff_ssd.
- Run inference on an image with the TensorRT engine.
- The x coordinates of the output bounding boxes are incorrect.
Example of how the plugin is used with graphsurgeon:

import graphsurgeon as gs

gs.create_plugin_node(
    name="MultipleGridAnchorGenerator",
    op="GridAnchorRect_TRT",
    minSize=0.2,
    maxSize=0.95,
    aspectRatios=[1.0, 2.0, 0.5, 3.0, 0.33],
    variance=[0.1, 0.1, 0.2, 0.2],
    featureMapShapes=[40, 23, 20, 12, 10, 6, 5, 3, 3, 2, 2, 1],
    numLayers=6,
)
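Note that featureMapShapes has 12 values for numLayers=6; for the rectangular GridAnchorRect_TRT variant these appear to be consumed pairwise, one rectangular shape per layer, matching the per-axis handling in the kernel patch above.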