
Traffic sign classification with deep convolutional neural networks

Jacopo Credi
Göteborg : Chalmers tekniska högskola, 2016. Diploma work - Department of Applied Mechanics, Chalmers University of Technology, Göteborg, Sweden, ISSN 1652-8557; 2016:25, 2016.
[Master's thesis (degree project, advanced level)]

In this work, a new library for training deep neural networks for image classification was implemented from the ground up, with the purpose of supporting GPU acceleration through OpenCL™, an open framework for heterogeneous parallel computing. The library introduced here is the first attempt at creating a C# deep learning toolbox, and can thus be more easily integrated with other projects under the .NET framework. The availability of cross-platform tools covering as many development environments as possible can accelerate the deployment of deep learning algorithms into a wide range of industrial applications, including advanced driver assistance systems and autonomous vehicles. The library was tested on the German Traffic Sign Recognition Benchmark (GTSRB) data set, which contains 51,839 labelled images of real-world traffic signs. The performance of a classic deep convolutional architecture (LeNet) was compared to that of a deeper one (VGGNet) when trained with different regularisation methods. Dropout was observed to be particularly effective in counteracting overfitting for both models. Interestingly, the VGGNet model was observed to be more prone to overfitting, despite having significantly fewer parameters (462k) than the LeNet model (827k). This suggests that architectural depth plays a crucial role in determining the capacity of a model, in accordance with some recent theoretical findings. The best classification accuracy on the GTSRB test data (96.9%) was obtained using an ensemble of four deep convolutional neural networks, including both architectures and trained on both greyscale-converted images and the original raw RGB images.
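
The abstract does not say how the predictions of the four networks were combined. One common choice, assumed here purely for illustration, is to average the per-class probabilities of the ensemble members and pick the class with the highest mean. The sketch below is in Python rather than the library's C#, and the function name `ensemble_predict` and the toy probabilities are hypothetical (three classes are used for brevity; GTSRB has more).

```python
from typing import Sequence


def ensemble_predict(member_probs: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest mean probability.

    Assumption: the ensemble is combined by simple averaging of class
    probabilities. The thesis abstract does not confirm this rule.
    """
    if not member_probs:
        raise ValueError("need at least one ensemble member")
    n_classes = len(member_probs[0])
    if any(len(p) != n_classes for p in member_probs):
        raise ValueError("all members must output the same number of classes")

    # Mean probability of each class across all ensemble members.
    mean = [
        sum(p[c] for p in member_probs) / len(member_probs)
        for c in range(n_classes)
    ]
    # Index of the largest mean probability.
    return max(range(n_classes), key=mean.__getitem__)


# Hypothetical outputs of four networks for one image (three classes).
members = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
    [0.4, 0.5, 0.1],
]
predicted = ensemble_predict(members)
```

In this toy case the first network alone would choose class 0, but the averaged vote selects class 1, which is the kind of disagreement an ensemble is meant to smooth out.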

Keywords: deep learning, convolutional neural networks, computer vision, machine learning, GPGPU, OpenCL.



This publication was registered on 2016-07-04. It was last modified on 2016-07-06.

CPL ID: 238914

This is a service from Chalmers Library