Speaker: Prof. Wei Gao, University of Pittsburgh
Location: Tencent Meeting, ID 813-805-126
Time: 9:30 a.m. on July 22 (Beijing time); 9:30 p.m. on July 21 (U.S. Eastern Time)
Title: Real-Time Neural Network Inference on Extremely Weak Devices: Agile Offloading with Explainable AI
Host: Associate Professor Zhang Deyu
Speaker bio:
Wei Gao is currently an Associate Professor in the Department of Electrical and Computer Engineering at the University of Pittsburgh. His research interests broadly include mobile and embedded computing, edge computing, on-device AI, mobile health, and the Internet of Things. He is the recipient of the NSF CAREER award, the best paper award of ACM CoNEXT, and the spotlight paper award of IEEE Transactions on Mobile Computing. He currently serves on the Editorial Board of IEEE Transactions on Mobile Computing, and also serves on the organizing committees and technical program committees of many other premier conference venues, including MobiCom, SenSys, and INFOCOM.
Abstract:
With the wide adoption of AI applications, there is a pressing need to enable real-time neural network (NN) inference on small embedded devices at the edge, but deploying NNs on these devices is challenging because of their extremely weak capabilities. Although NN partitioning and offloading can help with such deployment, they cannot minimize the local costs at embedded devices. In this talk, I will introduce our recent research progress on enabling NN inference on weak edge devices (e.g., microcontrollers) via agile NN offloading, which migrates the required NN computations from online inference to offline learning. More specifically, our approach leverages explainable AI techniques to explicitly enforce feature sparsity when training the NN model offline. This sparsity then allows online inference to offload the majority of NN features for remote computation with much higher compressibility. Our experimental results show that our agile offloading framework can reduce the inference latency on weak MCUs to <20 ms, ensuring that sensory data on embedded devices is consumed in a timely manner. It also reduces local resource consumption by 8x without impairing inference accuracy.