CPS: Medium: Collaborative Research: AI Inference in Vehicular Edge Networks for Advanced Driver-Assistance Systems

List of personnel

This is a multi-university project between Rowan University, Temple University, California Polytechnic State University, and Stony Brook University.
Principal Investigators: Jie Wu (Temple University); Ning Wang (University of Arizona); Yunsheng Wang (Kettering University, and then California Polytechnic State University); Haibin Ling (Stony Brook University).

Vision

This project explores distributed DNN processing scheduling, task assignment, and DNN model parallelism optimization, taking into account the complex architecture of DNNs and the network environment. Application-wise, the PIs will design novel DNN solutions for ADAS tasks in coordination with the cooperative AI inference paradigm. Testbed-wise, a vehicular edge-computing platform with V2X communication and edge computing capability will be developed at the Kettering University GM Mobility Research Center. The cooperative AI inference system will be implemented, and the research findings will be thoroughly validated in realistic vehicular edge computing environments.

Major Activities

We study challenges in enabling distributed and cooperative AI inference, such as where, how, and when images are processed to make full use of the available wireless bandwidth and edge computing resources. We develop new techniques to provide faster and/or more accurate results. The major activities are summarized below.

The indoor experimental testbed, HydraOne, purchased in Project Year 1, was delivered in September 2022. Code modifications and reconfigurations were performed to ensure that the testbed can be used for the tasks proposed in Thrusts 1 and 2. Additional software interfaces were developed to control and monitor the testbed and to ensure ease of use during algorithm testing. The software interface was tested on simpler hardware.

A project meeting was held on July 3, 2023 at Temple University and attended by all team members to discuss project progress and plan Year 3 collaborations. Additional meetings were held between PIs after the project meeting to coordinate collaborations between the institutions. Students from Rowan will visit Kettering University to better understand the outdoor testbed.

The Stony Brook and Rowan teams continued collaborating on parallelized visual tracking algorithms, with new results submitted to a workshop. In addition, the Stony Brook team worked on general vision algorithms, including tracking of transparent objects, camera pose tracking/estimation, and security in visual recognition algorithms.

As for Broadening Participation in Computing (BPC), we recruited multiple undergraduate and female students and worked to disseminate project information and outcomes to broader communities.

At Rowan University, the PI recruited multiple undergraduate students (Jack Campanella (JC), Stephen Piccolo (SP), Matthew McBurney (MM), Sarah Ely (SE), Lauren Eckert (LE), Richard Brown (RB)) to work on the HydraOne vehicle and testbed. JC, MM, SE, and LE were involved in understanding and implementing ROS on a HydraOne vehicle. SP was involved in testbed interface implementation. RB was involved in HydraOne hardware modification. Asmika Boosarapu was an MSCS student from 10/1/2021 to 5/31/2023. She successfully defended her MS thesis titled “Computation Offloading Design for Deep Neural Network Inference on IoT Devices.” JC starts his robotics MSE degree program at UPenn in Fall 2023. SP presented his work on testbed interface implementation at the Summer Undergraduate Research Program (SURP) research poster presentation on 7/26/2023.

At Temple University, a BPC effort in June reached 10 NSF REU students. Four lectures on ML and its applications were given. In addition, we assigned these students a homework project on ML for spectrum sensing.

The PI at California Polytechnic State University, Pomona recruited multiple undergraduate and graduate students to work on this project. The results were presented at the College of Science Research Symposium on April 28, 2023. The PI also supervised one REU student from UMass Amherst in Summer 2023.

The PI at Stony Brook University (SBU) continued to supervise high school students conducting research through the CSIRE (Computer Science and Informatics Summer Research Experience) program at SBU.

The Kettering team completed an edge computing platform for a cooperative AI algorithm. Several onboard computing units (2 Nvidia Jetson Nano units, 2 Nvidia Jetson Orin units, and 3 Alienware laptops) were purchased and configured successfully. A DNN-based obstacle detection system for collision avoidance was successfully deployed on those onboard computing units. The configured onboard computing units will be used to evaluate the developed cooperative AI algorithm in the GM Mobility Research Center (GMMRC) at Kettering University.

An Nvidia DRIVE AGX Orin unit was purchased and configured as an edge server. The Kettering team established the connection between the edge server and the onboard computing units using a client-server protocol.

The Kettering team completed the C-V2X communication setup. Cohda Wireless MK6C units were purchased for V2X communication, including two Road Side Units (RSUs) and two On Board Units (OBUs). Vehicles can use their onboard communication tools (OBUs) to share real-time information about surrounding traffic. Outdoor testing of C-V2X communication was conducted by broadcasting traffic information (the locations of obstacles detected by the DNN algorithm) from one RSU to several OBUs.

Specific Objectives

Coordination between the IoT device and the edge server is vital to the performance of distributed AI inference. Without careful optimization, distributed AI inference can suffer from performance degradation and prolonged latency. Thus, our study focuses on the following three thrust areas from the algorithm, application, and system perspectives.

Thrust 1: Study of optimal neural network model split in edge computing to accelerate inference. In this thrust, we will study the optimal neural network model split under different configurations and propose a scheduler that adjusts cooperative AI inference strategies in response to bandwidth fluctuation in vehicular edge networks. In particular, this thrust will address non-trivial challenges in task assignment and processing scheduling.
Thrust 2: Integrating cooperative AI inference in computer vision applications. In this thrust, we will study how to apply the proposed cooperative AI inference strategies to vision applications. We will draw on the research findings in cooperative AI inference theory to develop new visual inference algorithms that reduce dependency between DNN modules, making them friendlier to and more efficient on IoT devices, and that work more effectively with other offloading techniques without compromising performance (e.g., accuracy and latency).
Thrust 3: System validation, integration, and testbed implementation. In this thrust, we will verify the effectiveness of the research findings via rigorous theoretical analysis and prototype demonstration. A comprehensive experimental study of cooperative AI inference on a wide range of ADAS applications will be implemented and tested on the proof-of-concept platform, i.e., HydraOne. The cooperative AI inference system will be evaluated in real-world scenarios in the GMMRC at Kettering University. The Kettering team will implement one of the ADAS applications, a collision avoidance system, by integrating C-V2X communication with a cooperative AI algorithm that is processed jointly by an edge server and an onboard vehicular computing unit.
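
To make the model-split idea in Thrust 1 concrete, the following is a minimal sketch that picks a split layer for a chain-structured DNN under a simple additive latency model (device time, plus transfer time, plus server time). The function name best_split, the per-layer timings, and the tensor sizes are illustrative assumptions, not measurements or code from the project, and the cost of returning the result to the device is ignored.

```python
from typing import Sequence, Tuple


def best_split(device_ms: Sequence[float],
               server_ms: Sequence[float],
               out_kb: Sequence[float],
               input_kb: float,
               bandwidth_kb_per_ms: float) -> Tuple[int, float]:
    """Return (k, latency_ms): layers [0, k) run on the device, the rest on the server.

    k == 0 offloads everything (the raw input is sent);
    k == n keeps everything on board (nothing is sent).
    """
    n = len(device_ms)
    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        local = sum(device_ms[:k])      # time spent on the device
        remote = sum(server_ms[k:])     # time spent on the edge server
        if k == n:
            transfer = 0.0              # fully on board: no upload
        else:
            payload = input_kb if k == 0 else out_kb[k - 1]
            transfer = payload / bandwidth_kb_per_ms
        total = local + transfer + remote
        if total < best_latency:
            best_k, best_latency = k, total
    return best_k, best_latency


# Hypothetical 4-layer model (all numbers are made up for illustration).
device_ms = [10.0, 20.0, 30.0, 40.0]   # per-layer time on the IoT device
server_ms = [1.0, 2.0, 3.0, 4.0]       # per-layer time on the edge server
out_kb = [400.0, 100.0, 20.0, 1.0]     # size of each layer's output
input_kb = 600.0                       # size of the raw input

for bw in (100.0, 1.0, 0.01):          # link bandwidth in KB per ms
    print(bw, best_split(device_ms, server_ms, out_kb, input_kb, bw))
```

With these made-up numbers, the chosen split moves from full offloading (k = 0) at high bandwidth, to a late split (k = 3) at moderate bandwidth, to fully on-board execution (k = 4) when the link is very poor, which is the kind of adjustment a bandwidth-aware scheduler would need to make.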

Significant Results

The significant results are summarized below:

We have tested the cooperative AI inference idea on two widely used IoT devices (i.e., Raspberry Pi 3B and Nvidia Jetson Nano 4GB) using state-of-the-art DNN models such as MobileNet and YOLOv4. The results show that the proposed cooperative AI inference strategy achieved better performance than on-board computation without computation offloading, especially when the wireless link quality is poor.
We are in the process of developing an efficient scheduling algorithm to fully utilize cooperative AI inference by taking advantage of intrinsic DNN computation characteristics to enable computation parallelism and have discussed the trade-off in computation parallelism. Our initial simulation shows promising results of the scheme, and we are in the process of conducting more comprehensive experiments on varying scenarios.
We have extended our previous parallel tracking and verifying (PTAV) framework to the edge computing environment and proposed a distributed tracking and verifying (DTAV) framework that achieves real-time operation and high accuracy on IoT devices at the same time. Our solution runs two object tracking algorithms simultaneously in a distributed manner: a fast but less accurate one on the IoT side and a highly accurate one on the edge server side. While the client performs fast object tracking, the server’s tracking algorithm verifies the output and corrects the client whenever required to maintain accuracy. We validated the proposed DTAV approach, which obtained an accuracy of 86.90% and showed an improvement of 7.78%.
For the optimal partition and scheduling of multiple linear DNNs, our observations show that the local computation time on a mobile device increases with the number of layers computed, while the communication workload for offloading usually decreases as more DNN layers are computed. Based on this, we first relax our problem to the continuous domain and show that partitioning all line-structure DNNs at the same layer is sufficient for makespan optimization. Then, for the discrete domain, we show that two types of partitions are sufficient when the time difference between two adjacent partition layers is not drastic, subject to a given condition. We present a binary-search-based algorithm that efficiently finds the optimal partition layers. We also extend our approach to general-structure DNNs and offer a heuristic solution. Experiments have been conducted to evaluate the performance of different partition and scheduling methods on sample DNNs. The experimental results validate our theoretical findings.
Deep Neural Networks (DNNs) have been widely deployed in mobile applications. DNN inference latency is a critical metric of the service quality of those applications. Collaborative inference is a promising approach to latency optimization, in which part of the inference workload is offloaded from mobile devices to cloud servers. Model partition problems for collaborative inference have been well studied. However, little attention has been paid to optimizing the offloading pipeline for multiple DNN inference jobs, even though in practice mobile devices usually need to process multiple DNN inference jobs simultaneously. We made breakthroughs on the above challenges as part of our effort in Thrust 1.
Continuing the collaboration from Year 1, the SBU and Rowan teams further improved the proposed distributed tracking and verifying (DTAV) framework, resulting in a joint workshop submission. On related topics, first, a neural network approach with a graph Transformer backbone, named GTCaR, was developed to address the multi-view camera re-localization problem. GTCaR was evaluated on various public benchmarks and outperforms state-of-the-art approaches. Details can be found in our ECCV 2022 paper. Second, a novel defense method against backdoor attacks was developed; it removes the need for the labeled training data required by previous methods, yet performs on par with state-of-the-art defense methods trained using labels. More details can be found in our CVPR 2023 paper.
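
The monotonicity observation in the partition result above (local computation time grows with the cut layer, while communication time shrinks) is what makes a binary search possible. The sketch below illustrates only that argument, not the algorithm from our paper: it assumes a single chain-structured DNN, a non-decreasing computation time comp(k), a non-increasing communication time comm(k), and a pipelined setting where the slower of the two stages is the bottleneck. The function name crossing_layer and the sample numbers are hypothetical.

```python
from typing import Callable, Tuple


def crossing_layer(comp: Callable[[int], float],
                   comm: Callable[[int], float],
                   n_layers: int) -> Tuple[int, float]:
    """Find the cut k in [0, n_layers] that minimizes max(comp(k), comm(k)).

    Assumes comp is non-decreasing and comm is non-increasing in k.
    """
    lo, hi = 0, n_layers
    # Binary search for the smallest k with comp(k) >= comm(k).
    # If no such k exists below n_layers, lo ends at n_layers.
    while lo < hi:
        mid = (lo + hi) // 2
        if comp(mid) >= comm(mid):
            hi = mid
        else:
            lo = mid + 1
    # Left of the crossing the bottleneck is comm (shrinking with k);
    # right of it the bottleneck is comp (growing with k).
    # So the optimum is the crossing layer or the one just before it.
    candidates = [k for k in (lo - 1, lo) if 0 <= k <= n_layers]
    best = min(candidates, key=lambda k: max(comp(k), comm(k)))
    return best, max(comp(best), comm(best))


# Hypothetical cumulative times (ms) for cuts k = 0..4 of a 4-layer model.
comp_ms = [0.0, 10.0, 30.0, 60.0, 100.0]   # local time if k layers run on device
comm_ms = [600.0, 400.0, 100.0, 20.0, 0.0]  # upload time for the tensor at cut k

print(crossing_layer(lambda k: comp_ms[k], lambda k: comm_ms[k], 4))
```

Only two adjacent cut layers ever need to be compared once the crossing is located, so the search uses O(log n) evaluations instead of scanning every layer.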

Other Achievements

We propose to jointly optimize DNN partitioning and pipeline scheduling for multiple inference jobs. We theoretically analyze the optimal scheduling conditions for homogeneous chain-structure DNNs. Based on the analysis, we propose near-optimal partitioning and scheduling methods for chain-structure DNNs. We also extend those methods to general-structure DNNs. In addition, we extend our problem scenario to handle heterogeneous DNN inference jobs, for which a layer-level scheduling algorithm is proposed. Theoretical analysis shows that the proposed method is optimal when the computation graphs are tree-structured. Our joint optimization methods are evaluated in a real-world testbed. Experimental results show that our methods can significantly reduce the overall inference latency of multiple inference jobs compared to partition-only or schedule-only approaches.
The developed robust tracking and backdoor defense algorithms have achieved state-of-the-art performance on publicly available benchmarks, and their source code has been released online to facilitate research in related areas. Details can be found in the relevant publications.
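
To illustrate why the order of offloaded jobs matters once each job has been partitioned, the sketch below models the pipeline as a classical two-stage flow shop: the device computes a job's local layers, then the wireless link uploads its intermediate tensor, and the link can upload one job while the device computes the next. It orders jobs with Johnson's rule, a textbook result for two-stage flow shops that is used here as a stand-in and is not claimed to be the scheduling method from our publications. Server-side computation is ignored, and the job list and function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Job = Tuple[float, float]  # (local compute ms, upload ms) after partitioning


def makespan(jobs: Sequence[Job]) -> float:
    """Finish time of the last upload when jobs run in the given order."""
    cpu_free = 0.0    # time at which the device finishes its current job
    link_free = 0.0   # time at which the link finishes its current upload
    for compute, upload in jobs:
        cpu_free += compute
        # An upload starts once the job is computed and the link is idle.
        link_free = max(link_free, cpu_free) + upload
    return link_free


def johnson_order(jobs: Sequence[Job]) -> List[Job]:
    """Order jobs by Johnson's rule for a two-stage flow shop."""
    # Jobs that are compute-light go first, shortest compute first.
    front = sorted((j for j in jobs if j[0] <= j[1]), key=lambda j: j[0])
    # Jobs that are upload-light go last, longest upload first.
    back = sorted((j for j in jobs if j[0] > j[1]),
                  key=lambda j: j[1], reverse=True)
    return front + back


# Hypothetical jobs: (local compute ms, upload ms).
jobs = [(30.0, 5.0), (5.0, 30.0), (20.0, 20.0), (10.0, 15.0)]

print("arrival order:", makespan(jobs))
print("Johnson order:", makespan(johnson_order(jobs)))
```

For this made-up job list the arrival order finishes at 100 ms and the reordered pipeline at 75 ms, because starting with a compute-light job keeps the link busy early. A joint method would additionally choose each job's partition point, which changes the (compute, upload) pair itself.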

Publications

Journal

Y. Duan and J. Wu, "Optimizing Job Offloading Schedule for Collaborative DNN Inference," accepted to appear in IEEE Transactions on Mobile Computing (TMC), 2023.

Conference

Y. Xiao, F. Qin, X. Sun, and Y. Wang, "A Distributed Architecture for Cooperative Deep Learning System in Intelligent Vehicle Systems", Proc. of the 16th International Conference on Networking, Architecture, and Storage (NAS 2022), Oct. 3-4, 2022.
K. Garigapati, E. Blasch, J. Wei, and H. Ling, "Transparent Object Tracking with Enhanced Fusion Module", Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023.
L. Pang, T. Sun, H. Ling, and C. Chen, "Backdoor Cleansing with Unlabeled Data", Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
X. Li and H. Ling, "GTCaR: Graph Transformer for Camera Re-localization," Proc. of European Conference on Computer Vision (ECCV), 2022.

Workshop

N. Bovee, S. Piccolo, N. Wang, and S. Ho, "Experimental Testbed for Computation Offloading for Cooperative Inference on Edge Devices", EdgeComm: The Fourth Workshop on Edge Computing and Communications at ACM/IEEE Symposium on Edge Computing, 2023. (submitted)
P. Makarand Mhasakar, K. Doshi, N. Wang, S. Shyang Ho, and H. Ling, "Distributed Tracking and Verifying: A Real-Time and High-Accuracy Visual Tracking Edge Computing Framework for Internet of Things", EdgeComm: The Fourth Workshop on Edge Computing and Communications at ACM/IEEE Symposium on Edge Computing, 2023. (submitted)

Poster

S. Piccolo and N. Bovee, "Experimental Testbed for Computation Offloading for Cooperative Inference on Edge Devices", Proc. of the Rowan University Summer Undergraduate Research Program Symposium, July 2023.
B. Chao, C. Lee, L. Saikali, A. Marin, L. Ruiz, and Y. Wang, "Self-driving Car Simulator", Proc. of the College of Science Research Symposium, California Polytechnic State University, Pomona, April 28, 2023.

Thesis/Dissertations

Yubin Duan was a PhD student at Temple University from 9/1/2017 to 8/31/2022. He successfully defended his PhD dissertation titled “Accelerating DNN Inference and Training in Distributed Systems”. He joined Facebook as a software engineer.

Asmika Boosarapu was an MSCS student from 10/1/2021 to 5/31/2023. She successfully defended her MS Thesis titled “Computation Offloading Design for Deep Neural Network Inference on IoT Devices”.