English | 中文
Official repository for the paper [Robust High-Resolution Video Matting with Temporal Guidance](https://peterl1n.github.io/RobustVideoMatting/). RVM is specifically designed for robust human video matting. Unlike existing neural models that process frames as independent images, RVM uses a recurrent neural network to process videos with temporal memory. RVM can perform matting in real-time on any videos without additional inputs. It achieves **4K 76FPS** and **HD 104FPS** on an Nvidia GTX 1080 Ti GPU. The project was developed at [ByteDance Inc.](https://www.bytedance.com/)| Framework | Download | Notes |
| PyTorch |
rvm_mobilenetv3.pth rvm_resnet50.pth |
Official weights for PyTorch. Doc |
| TorchHub | Nothing to Download. | Easiest way to use our model in your PyTorch project. Doc |
| TorchScript |
rvm_mobilenetv3_fp32.torchscript rvm_mobilenetv3_fp16.torchscript rvm_resnet50_fp32.torchscript rvm_resnet50_fp16.torchscript |
If inference on mobile, consider export int8 quantized models yourself. Doc |
| ONNX |
rvm_mobilenetv3_fp32.onnx rvm_mobilenetv3_fp16.onnx rvm_resnet50_fp32.onnx rvm_resnet50_fp16.onnx |
Tested on ONNX Runtime with CPU and CUDA backends. Provided models use opset 12. Doc, Exporter. |
| TensorFlow |
rvm_mobilenetv3_tf.zip rvm_resnet50_tf.zip |
TensorFlow 2 SavedModel. Doc |
| TensorFlow.js |
rvm_mobilenetv3_tfjs_int8.zip |
Run the model on the web. Demo, Starter Code |
| CoreML |
rvm_mobilenetv3_1280x720_s0.375_fp16.mlmodel rvm_mobilenetv3_1280x720_s0.375_int8.mlmodel rvm_mobilenetv3_1920x1080_s0.25_fp16.mlmodel rvm_mobilenetv3_1920x1080_s0.25_int8.mlmodel |
CoreML does not support dynamic resolution. Other resolutions can be exported yourself. Models require iOS 13+. s denotes downsample_ratio. Doc, Exporter
|
English | 中文
论文 [Robust High-Resolution Video Matting with Temporal Guidance](https://peterl1n.github.io/RobustVideoMatting/) 的官方 GitHub 库。RVM 专为稳定人物视频抠像设计。不同于现有神经网络将每一帧作为单独图片处理,RVM 使用循环神经网络,在处理视频流时有时间记忆。RVM 可在任意视频上做实时高清抠像。在 Nvidia GTX 1080Ti 上实现 **4K 76FPS** 和 **HD 104FPS**。此研究项目来自[字节跳动](https://www.bytedance.com/)。| 框架 | 下载 | 备注 |
| PyTorch |
rvm_mobilenetv3.pth rvm_resnet50.pth |
官方 PyTorch 模型权值。文档 |
| TorchHub | 无需手动下载。 | 更方便地在你的 PyTorch 项目里使用此模型。文档 |
| TorchScript |
rvm_mobilenetv3_fp32.torchscript rvm_mobilenetv3_fp16.torchscript rvm_resnet50_fp32.torchscript rvm_resnet50_fp16.torchscript |
若需在移动端推断,可以考虑自行导出 int8 量化的模型。文档 |
| ONNX |
rvm_mobilenetv3_fp32.onnx rvm_mobilenetv3_fp16.onnx rvm_resnet50_fp32.onnx rvm_resnet50_fp16.onnx |
在 ONNX Runtime 的 CPU 和 CUDA backend 上测试过。提供的模型用 opset 12。文档,导出 |
| TensorFlow |
rvm_mobilenetv3_tf.zip rvm_resnet50_tf.zip |
TensorFlow 2 SavedModel 格式。文档 |
| TensorFlow.js |
rvm_mobilenetv3_tfjs_int8.zip |
在网页上跑模型。展示,示范代码 |
| CoreML |
rvm_mobilenetv3_1280x720_s0.375_fp16.mlmodel rvm_mobilenetv3_1280x720_s0.375_int8.mlmodel rvm_mobilenetv3_1920x1080_s0.25_fp16.mlmodel rvm_mobilenetv3_1920x1080_s0.25_int8.mlmodel |
CoreML 只能导出固定分辨率,其他分辨率可自行导出。支持 iOS 13+。s 代表下采样比。文档,导出
|
English | 中文
## Content * [Concepts](#concepts) * [Downsample Ratio](#downsample-ratio) * [Recurrent States](#recurrent-states) * [PyTorch](#pytorch) * [TorchHub](#torchhub) * [TorchScript](#torchscript) * [ONNX](#onnx) * [TensorFlow](#tensorflow) * [TensorFlow.js](#tensorflowjs) * [CoreML](#coreml)English | 中文
## 目录 * [概念](#概念) * [下采样比](#下采样比) * [循环记忆](#循环记忆) * [PyTorch](#pytorch) * [TorchHub](#torchhub) * [TorchScript](#torchscript) * [ONNX](#onnx) * [TensorFlow](#tensorflow) * [TensorFlow.js](#tensorflowjs) * [CoreML](#coreml)