DDD20 End-to-End Event Camera Driving Dataset - Fusing Frames and Events with Deep Learning for Improved Steering Prediction
Neuromorphic event cameras are useful for dynamic vision problems under difficult lighting conditions. To enable studies of using event cameras in automobile driving applications, this paper reports a new end-to-end driving dataset called DDD20. The dataset was captured with a DAVIS camera that concurrently streams both dynamic vision sensor (DVS) brightness change events and active pixel sensor (APS) intensity frames. DDD20 is the longest event camera end-to-end driving dataset to date, with 51 h of DAVIS event+frame camera and vehicle human control data collected from 4000 km of highway and urban driving under a variety of lighting conditions. Using DDD20, we report the first study of fusing brightness change events and intensity frame data using a deep learning approach to predict the instantaneous human steering wheel angle. Over all day and night conditions, the explained variance for human steering prediction from a ResNet-32 is significantly better with the fused DVS+APS frames (0.88) than with either DVS (0.67) or APS (0.77) data alone.
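For readers unfamiliar with the metric, explained variance is conventionally defined as 1 - Var(y - y_pred) / Var(y). The sketch below assumes that standard definition, since the abstract does not spell out the formula. The function name and the steering angles are hypothetical and are not taken from DDD20.

```python
from statistics import pvariance


def explained_variance(y_true, y_pred):
    """Return 1 - Var(y_true - y_pred) / Var(y_true)."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    var_true = pvariance(y_true)
    if var_true == 0:
        raise ValueError("y_true has zero variance")
    return 1.0 - pvariance(residuals) / var_true


# Hypothetical steering wheel angles in degrees (illustrative only).
human = [0.0, 5.0, 10.0, -5.0, -10.0]
predicted = [1.0, 4.0, 9.0, -4.0, -9.0]
score = explained_variance(human, predicted)
```

A score of 1.0 means the prediction accounts for all of the variance in the human steering signal. Lower scores mean the residual varies more relative to the signal.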
This project presents an empirical study of how reduced-precision training methods affect the iCaRL incremental learning algorithm. Incremental network accuracies on the CIFAR-100 image dataset show that weights can be quantized to 1 bit with a 2.39% drop in accuracy, but quantizing activations to 1 bit causes a much larger drop (12.75%). Quantizing gradients from 32 to 8 bits changes the accuracy of the trained network by less than 1%. These results are encouraging for hardware accelerators that support incremental learning algorithms.
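To make the bit widths above concrete, here is a generic sketch of 1-bit and 8-bit quantization in plain Python. The abstract does not state which quantization scheme the study used, so both functions are assumptions: sign binarization scaled by the mean magnitude, and symmetric uniform rounding. All names and values are hypothetical.

```python
def binarize(weights):
    """1-bit quantization: keep only the sign, scaled by the mean magnitude."""
    scale = sum(abs(w) for w in weights) / len(weights)
    return [scale if w >= 0 else -scale for w in weights]


def quantize_uniform(values, bits):
    """Symmetric uniform quantization onto signed integer levels (bits >= 2)."""
    if bits < 2:
        raise ValueError("use binarize() for 1-bit quantization")
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return list(values)
    levels = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits
    step = max_abs / levels
    # Round each value to the nearest multiple of the step size.
    return [round(v / step) * step for v in values]


# Hypothetical tensor values (illustrative only).
weights = [0.5, -0.25, 0.75, -1.0]
binary_weights = binarize(weights)
gradients_8bit = quantize_uniform(weights, bits=8)
```

With 8 bits, each value moves by at most half a step (here 1/254 of the largest magnitude). Binarization discards all magnitude information apart from one shared scale.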
Slasher is the first open 1/10 scale autonomous driving platform for exploring the use of neuromorphic event cameras for fast driving in unstructured indoor and outdoor environments. Slasher features a DAVIS event-based camera and ROS computer for perception and control. The modular design of Slasher can easily integrate additional features and sensors. In this paper, we show its application in a reflexive Convolutional Neural Network (CNN) steering controller trained by end-to-end learning. We present preliminary experiments in closed-loop indoor and outdoor trail driving.
PyAER offers a set of low-level APIs for accessing neuromorphic devices, such as Dynamic Vision Sensors and DYNAP-se, that are produced by iniLabs, GmbH. The library builds a Python binding of libcaer via SWIG. This package aims to bridge the gap between these fantastic sensors and processors and beginners.
We investigated how well deep learning algorithms can navigate a partially observable (PO) grid world. We show that a Value Iteration Network (VIN) performs strongly on this task, whilst other RL algorithms fail to generalize. The performance of VIN is compared with the ground truth computed by `A*` in a fully observable environment. This project was supervised by Nikolay Savinov at the Computer Vision and Geometry Group, ETH Zürich.
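As an illustration of how an `A*` ground truth can be obtained on a fully observable grid, the sketch below computes the shortest path length. It assumes a 4-connected grid with unit step costs and a Manhattan-distance heuristic. The project description does not specify these details, and the grid layout and function name are hypothetical.

```python
import heapq


def astar(grid, start, goal):
    """Return the length of the shortest 4-connected path, or None.

    grid is a list of strings; '#' marks a wall, anything else is free.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best_cost = {start: 0}
    frontier = [(heuristic(start), 0, start)]
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best_cost.get(cell, float("inf")):
            continue  # stale queue entry, a cheaper route was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    priority = new_cost + heuristic((nr, nc))
                    heapq.heappush(frontier, (priority, new_cost, (nr, nc)))
    return None


# Hypothetical 4x4 grid world (illustrative only).
grid = [
    "....",
    ".##.",
    "...#",
    "#...",
]
path_length = astar(grid, (0, 0), (3, 3))
```

A learned policy can then be scored by comparing the number of steps it takes against this optimal length.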
Character-level Neural Machine Translation (NMT) models have recently achieved impressive results on many language pairs. They mainly do well for Indo-European language pairs, where the languages share the same writing system. However, for translating between Chinese and English, the gap between the two different writing systems poses a major challenge because of a lack of systematic correspondence between the individual linguistic units. In this paper, we enable character-level NMT for Chinese by breaking down Chinese characters into linguistic units similar to those of Indo-European languages. We use the Wubi encoding scheme, which preserves the original shape and semantic information of the characters while also being reversible. We show promising results from training Wubi-based models on the character and subword levels with recurrent as well as convolutional models.
We replaced average pooling with max pooling in the ANN-SNN conversion pipeline and reported the best SNN results on MNIST and CIFAR-10 to date.
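The difference between the two pooling operations can be seen on a small feature map. This is a sketch on ordinary analog activations, assuming non-overlapping 2x2 windows. It does not show how max pooling is realized with spiking neurons, which the summary above does not describe. The helper names and values are hypothetical.

```python
def pool2x2(feature_map, reduce):
    """Apply non-overlapping 2x2 pooling with the given reduction function."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            reduce([
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            ])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def average(values):
    return sum(values) / len(values)


# Hypothetical 2x4 activation map (illustrative only).
activations = [
    [1.0, 3.0, 0.0, 0.0],
    [5.0, 7.0, 0.0, 8.0],
]
avg_pooled = pool2x2(activations, average)
max_pooled = pool2x2(activations, max)
```

Average pooling dilutes the single strong response in the right-hand window, while max pooling passes it through unchanged.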
Benchmarks have played a vital role in the advancement of visual object recognition and other fields of computer vision. We report a new benchmark dataset in which we converted established visual video benchmarks for object tracking, action recognition and object recognition into spiking neuromorphic datasets, recorded with the DVS output of a DAVIS camera.