# TorchSparse

TorchSparse is a high-performance neural network library for point cloud processing.

## Installation

TorchSparse depends on the [Google Sparse Hash](https://github.com/sparsehash/sparsehash) library.

* On Ubuntu, it can be installed by

  ```bash
  sudo apt-get install libsparsehash-dev
  ```

* On macOS, it can be installed by

  ```bash
  brew install google-sparsehash
  ```

* If you do not have sudo permission, you can also compile the library locally and add its path to the environment variable `CPLUS_INCLUDE_PATH`, as sketched below.
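
  For reference, a local build might look like the following (a sketch, assuming an autotools build installed into `$HOME/.local`; adjust the prefix to taste):

  ```bash
  git clone https://github.com/sparsehash/sparsehash.git
  cd sparsehash
  ./configure --prefix=$HOME/.local
  make && make install
  export CPLUS_INCLUDE_PATH=$HOME/.local/include:$CPLUS_INCLUDE_PATH
  ```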

The latest released TorchSparse (v1.4.0) can then be installed by

```bash
pip install --upgrade git+https://github.com/mit-han-lab/[email protected]
```

If you use TorchSparse in your code, please remember to pin the exact version in your dependencies.
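
For example, with pip's direct-reference syntax, the pinned dependency in a `requirements.txt` could read:

```
torchsparse @ git+https://github.com/mit-han-lab/[email protected]
```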

## Benchmark

We compare TorchSparse with [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine), with latency measured on an NVIDIA GTX 1080Ti:

| Network                  | MinkowskiEngine v0.4.3 | TorchSparse v1.0.0 |
| :----------------------- | :--------------------: | :----------------: |
| MinkUNet18C (MACs / 10)  |        224.7 ms        |      124.3 ms      |
| MinkUNet18C (MACs / 4)   |        244.3 ms        |      160.9 ms      |
| MinkUNet18C (MACs / 2.5) |        269.6 ms        |      214.3 ms      |
| MinkUNet18C              |        323.5 ms        |      294.0 ms      |
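
If you would like to benchmark a model yourself, one common way to time a GPU forward pass in PyTorch is sketched below (using CUDA events and explicit synchronization, since GPU kernels launch asynchronously; `model` and `tensor` are placeholders for your own network and input):

```python
import torch

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

with torch.no_grad():
    for _ in range(10):   # warm-up iterations
        model(tensor)
    start.record()
    for _ in range(100):
        model(tensor)
    end.record()

torch.cuda.synchronize()  # wait for all queued kernels to finish
print(f'latency: {start.elapsed_time(end) / 100:.1f} ms')
```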

## Getting Started

### Sparse Tensor

The sparse tensor (`SparseTensor`) is the main data structure for point clouds. It has two data fields (see the toy example after this list):
* Coordinates (`coords`): a 2D integer tensor with a shape of N x 4, where the first three dimensions correspond to quantized x, y, z coordinates, and the last dimension denotes the batch index.
* Features (`feats`): a 2D tensor with a shape of N x C, where C is the number of feature channels.
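
As a toy sketch (with made-up coordinates and random features), a `SparseTensor` holding three voxels from batch 0 could be built by hand:

```python
import torch
from torchsparse import SparseTensor

# Quantized (x, y, z) coordinates, with the batch index in the last column.
coords = torch.tensor([[0, 0, 0, 0],
                       [1, 0, 2, 0],
                       [2, 1, 0, 0]], dtype=torch.int)
feats = torch.randn(3, 4)  # N = 3 points, C = 4 feature channels
tensor = SparseTensor(coords=coords, feats=feats)
```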

In practice, most existing datasets provide raw point cloud data with float coordinates. We can use `sparse_quantize` (provided in `torchsparse.utils.quantize`) to voxelize the x, y, z coordinates and remove duplicates:

```python
import numpy as np
import torch
from torchsparse import SparseTensor
from torchsparse.utils.quantize import sparse_quantize

coords -= np.min(coords, axis=0, keepdims=True)  # shift coordinates to start from 0
coords, indices = sparse_quantize(coords, voxel_size, return_index=True)
coords = torch.tensor(coords, dtype=torch.int)
feats = torch.tensor(feats[indices], dtype=torch.float)
tensor = SparseTensor(coords=coords, feats=feats)
```

We can then use `sparse_collate_fn` (provided in `torchsparse.utils.collate`) to assemble a batch of `SparseTensor`s (and add the batch dimension to `coords`), as shown below. Please refer to [this example](https://github.com/mit-han-lab/torchsparse/blob/dev/pre-commit/examples/example.py) for more details.
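
For instance, assuming each dataset sample is a dict containing `SparseTensor`s (as in the linked example) and given a hypothetical `dataset` object, `sparse_collate_fn` plugs directly into a standard `DataLoader`:

```python
from torch.utils.data import DataLoader
from torchsparse.utils.collate import sparse_collate_fn

# sparse_collate_fn batches the per-sample SparseTensors and
# appends the batch index to their coordinates.
dataloader = DataLoader(dataset, batch_size=4, collate_fn=sparse_collate_fn)
```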

### Sparse Neural Network

The neural network interface in TorchSparse is very similar to that of PyTorch:

```python
from torch import nn
from torchsparse import nn as spnn

in_channels, out_channels, kernel_size = 4, 32, 3  # example hyperparameters

model = nn.Sequential(
    spnn.Conv3d(in_channels, out_channels, kernel_size),
    spnn.BatchNorm(out_channels),
    spnn.ReLU(True),
)
```
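
A minimal sketch of a forward pass (assuming a CUDA device, since the sparse convolution kernels run on the GPU, and reusing the `tensor` built above):

```python
# Assumption: SparseTensor provides a Tensor-like .to() for device transfer.
model = model.cuda()
output = model(tensor.to('cuda'))  # the output is again a SparseTensor
print(output.feats.shape)          # per-voxel features: N x out_channels
```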

## Citation

If you use TorchSparse in your research, please use the following BibTeX entry:

```bibtex
@inproceedings{tang2020searching,
  title     = {{Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution}},
  author    = {Tang, Haotian and Liu, Zhijian and Zhao, Shengyu and Lin, Yujun and Lin, Ji and Wang, Hanrui and Han, Song},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year      = {2020}
}
```

## Acknowledgements

TorchSparse is inspired by many existing open-source libraries, including (but not limited to) [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine), [SECOND](https://github.com/traveller59/second.pytorch) and [SparseConvNet](https://github.com/facebookresearch/SparseConvNet).