Projects

My research aims to build an efficient mobile vision system with edge-assisted live video analytics. I am interested in developing multi-disciplinary solutions that combine computer vision, edge computing, multimedia, and machine learning to improve the system's effectiveness and efficiency.

Currently, I am particularly interested in the following problems:

  • Making mobile vision more efficient with more video compression
  • Enhancing DNN deployment efficiency in diverse use scenarios
  • Building emerging drone applications that involve multiple computer vision tasks

Building Emerging Drone Applications That Involve Multiple Computer Vision Tasks

SSS: Towards Autonomous Drone Delivery to Your Door Over House-Aware Semantics [Release]

We present our attempt to tackle the last-hundred-feet problem for autonomous drone delivery. We take a semantic segmentation-based approach that progressively lands the drone toward a drop-off point that remains convenient and safe at all times. We leverage the structure of a single-family house to streamline and enhance semantic segmentation in the drop-to-door problem context. We will release the code soon.
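
As a rough illustration of how a segmentation map can drive drop-off point selection (this is not the SSS code; the class IDs, window size, and scoring rule are assumptions for the sketch), the snippet below scores each candidate landing footprint by its fraction of "safe" pixels and returns the best center.

```python
# A minimal sketch (not the released SSS code) of choosing a drop-off point
# from a per-pixel semantic map. Class IDs and window size are hypothetical.
import numpy as np

SAFE_CLASSES = {1, 2}          # e.g., walkway, porch (hypothetical IDs)
WINDOW = 31                    # candidate landing footprint, in pixels

def pick_dropoff(seg: np.ndarray) -> tuple[int, int]:
    """Return (row, col) of the window center with the highest safe-pixel fraction."""
    safe = np.isin(seg, list(SAFE_CLASSES)).astype(np.float32)
    # Box-filter the safe mask with an integral image so each position scores
    # the fraction of safe pixels inside a WINDOW x WINDOW footprint.
    ii = np.pad(safe, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    k = WINDOW
    sums = ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]
    scores = sums / (k * k)
    r, c = np.unravel_index(np.argmax(scores), scores.shape)
    return r + k // 2, c + k // 2   # center of the best window

if __name__ == "__main__":
    seg = np.random.randint(0, 5, size=(240, 320))   # stand-in segmentation output
    print(pick_dropoff(seg))
```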

Making Mobile Vision More Efficient with More Video Compression

Given the rapid progress in computer vision, we take an orthogonal approach and push for more aggressive compression tailored to specific vision inference tasks. Our approach adapts to the input context and significantly reduces the volume of video data without sacrificing inference accuracy.

Towards Drone-Sourced Live Video Analytics via Adaptive-yet-Compatible Compression [paper]

DCC utilizes drone-specific context and intermediate information obtained from object detection to jointly adjust the resolution, QP, and frame rate at runtime. To demonstrate its effectiveness, we use vehicle detection from a drone as a showcase application.
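
To give a flavor of this kind of runtime knob adjustment (a hypothetical sketch, not DCC's actual controller; the thresholds, resolution ladder, and detection-feedback signals are assumptions), the snippet below tightens compression when the detector is comfortable and relaxes it when detections degrade.

```python
# Hypothetical sketch of jointly adjusting resolution, QP, and frame rate from
# detection feedback; values and rules are illustrative only.
from dataclasses import dataclass

@dataclass
class EncodeConfig:
    resolution: tuple[int, int] = (1280, 720)
    qp: int = 30                 # encoder quantization parameter
    fps: int = 30

RES_LADDER = [(640, 360), (960, 540), (1280, 720)]
QP_RANGE = (24, 40)

def adapt(cfg: EncodeConfig, mean_conf: float, num_objects: int) -> EncodeConfig:
    """Compress harder when detections are easy; spend more bits when they degrade."""
    if mean_conf > 0.8:                        # detector is comfortable
        cfg.qp = min(cfg.qp + 2, QP_RANGE[1])
        cfg.fps = max(cfg.fps - 5, 10)
    elif mean_conf < 0.5:                      # detections degrading
        cfg.qp = max(cfg.qp - 2, QP_RANGE[0])
        cfg.fps = min(cfg.fps + 5, 30)
    # Many small objects (e.g., distant vehicles) favor higher resolution.
    idx = 2 if num_objects > 20 else 1 if num_objects > 5 else 0
    cfg.resolution = RES_LADDER[idx]
    return cfg
```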

VPPlus: Exploring the Potentials of Video Processing for Live Video Analytics at the Edge [paper]

VPPlus enlarges the configuration space that can be optimized during on-device processing to achieve greater compression for general object detection tasks. It automatically generates feedback to guide the joint tuning of more than eight parameters (e.g., brightness and saturation).
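
The sketch below shows the general shape of feedback-guided tuning over on-device processing knobs; it is only a stand-in for the idea, and the knob list, ranges, search strategy, and `feedback_score` are placeholders rather than VPPlus's actual procedure.

```python
# Rough sketch of feedback-guided knob tuning; all names and values are placeholders.
import random

KNOBS = {                       # hypothetical subset of the 8+ parameters
    "brightness": (0.5, 1.5),
    "saturation": (0.5, 1.5),
    "sharpness":  (0.5, 1.5),
}

def feedback_score(config: dict) -> float:
    """Placeholder objective: in the real system this would combine inference
    accuracy feedback with the bandwidth cost of the processed frames."""
    return -sum(abs(v - 1.0) for v in config.values())

def random_search(iterations: int = 200) -> dict:
    best_cfg, best_score = {}, float("-inf")
    for _ in range(iterations):
        cfg = {k: random.uniform(lo, hi) for k, (lo, hi) in KNOBS.items()}
        score = feedback_score(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg

if __name__ == "__main__":
    print(random_search())
```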

Enhancing DNN Deployment Efficiency in Diverse Use Scenarios

When diverse mobile devices perform distinct application tasks in varying scenarios, we tailor the DNN to each scenario using Once-For-All DNN training: we train one super network and search for different sub-networks (subnets) that fit each specific use case.
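
A minimal sketch of this subnet search is shown below, assuming a per-stage choice of depth, width, and kernel size and a device latency budget; the configuration space and the `estimate_latency` / `evaluate_accuracy` helpers are placeholders, not our actual search pipeline.

```python
# Sketch of once-for-all style subnet search under a latency budget.
# Configuration space and evaluation helpers are hypothetical.
import random

DEPTHS, WIDTHS, KERNELS = [2, 3, 4], [3, 4, 6], [3, 5, 7]
NUM_STAGES = 5

def sample_subnet() -> list[dict]:
    return [{"depth": random.choice(DEPTHS),
             "width": random.choice(WIDTHS),
             "kernel": random.choice(KERNELS)} for _ in range(NUM_STAGES)]

def estimate_latency(subnet) -> float:        # would query a device latency table
    return sum(s["depth"] * s["width"] * s["kernel"] for s in subnet)

def evaluate_accuracy(subnet) -> float:       # would run the subnet on a val set
    return random.random()

def search(budget: float, trials: int = 500):
    best, best_acc = None, 0.0
    for _ in range(trials):
        net = sample_subnet()
        if estimate_latency(net) > budget:
            continue
        acc = evaluate_accuracy(net)
        if acc > best_acc:
            best, best_acc = net, acc
    return best
```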

OPA: One-Predict-All for Efficient Deployment [paper]


Instead of training a specialized DNN for each deployment scenario, we develop a novel approach that uses a shallow subnet to test the water. We validate that a shallow subnet can effectively accelerate the search for a deep subnet on image classification, our showcase application.
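
The two-stage idea can be sketched as follows; this is only an illustration of "screen cheaply with the shallow subnet, then evaluate the survivors at full depth", and the candidate space and accuracy helpers are placeholders rather than OPA's actual predictor.

```python
# Hypothetical sketch of shallow-subnet screening before full-depth evaluation.
import random

def shallow_accuracy(cfg) -> float:   # cheap proxy: evaluate the shallow variant
    return random.random()

def deep_accuracy(cfg) -> float:      # expensive: evaluate the full-depth subnet
    return random.random()

def two_stage_search(candidates, keep: int = 5):
    # Step 1: cheap screening with the shallow subnet.
    shortlist = sorted(candidates, key=shallow_accuracy, reverse=True)[:keep]
    # Step 2: expensive evaluation only on the survivors.
    return max(shortlist, key=deep_accuracy)

if __name__ == "__main__":
    cands = [{"width": w, "kernel": k} for w in (3, 4, 6) for k in (3, 5, 7)]
    print(two_stage_search(cands))
```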