publications

publications by category in reverse chronological order. generated by jekyll-scholar.

2026

CVPRW
Two-Pass Zero-Shot Temporal-Spatial Grounding of Rare Traffic Events in Surveillance Video

Jiantang Huang

In CVPR Autopilot Workshop (NA Track, Poster), 2026

Accepted as Poster, CVPR Autopilot Workshop (NA Track)

Grounding traffic accidents in real CCTV footage is a rare-event problem where training on labeled accident video is often prohibited, yet accurate joint localization in time, space, and collision type is required. We present a no-fine-tuning pipeline that elicits this joint output from frozen vision-language models through two ideas. The first is a coarse-to-fine two-pass decomposition with two deterministic confidence gates that revert to the coarse estimate on boundary hedges. The second is a specialist role assignment: Qwen3-VL-Plus handles grounding, and Gemini 3.1 Flash-Lite handles typing on a centered video clip. On the ACCIDENT@CVPR 2026 benchmark (2,027 real CCTV videos) we reach ACC^S = 0.539 (95% CI [0.525, 0.553]), which is +0.127 over the benchmark paper’s best-of-baselines oracle, at roughly $20 total inference cost.
@inproceedings{huang2026twopass,
  title = {Two-Pass Zero-Shot Temporal-Spatial Grounding of Rare Traffic Events in Surveillance Video},
  author = {Huang, Jiantang},
  booktitle = {CVPR Autopilot Workshop (NA Track, Poster)},
  year = {2026},
  archiveprefix = {arXiv},
}

2025

arXiv
Slow-Motion Video Synthesis for Basketball Using Frame Interpolation

Jiantang Huang

arXiv preprint arXiv:2511.11644, 2025

We investigate frame interpolation methods for generating high-quality slow-motion basketball footage, with an emphasis on preserving fast motion and ball trajectory under occlusion.
@article{huang2025slowmotion,
  title = {Slow-Motion Video Synthesis for Basketball Using Frame Interpolation},
  author = {Huang, Jiantang},
  journal = {arXiv preprint arXiv:2511.11644},
  year = {2025},
}