Ultra Fast CNN Based Hardware Computing Platform Concepts for ADAS Visual Sensors and Evolutionary Mobile Robots
Alireza Fasih
4.3 Contribution to an image processing platform for ADAS
ADAS systems require many different filters and image processing components,
which are complex and time-consuming to run. A uniform platform is therefore needed
as an image processing framework. The main characteristics of an ideal framework are
flexibility in design, reconfigurable modules, the ability to redesign the system
architecture, and a very short processing time combined with robustness. In this
thesis, a hardware architecture implementing a CNN processor matrix for performing
different image processing filters and algorithms is provided. Implementing a CNN on a
digital platform requires an accurate discrete-time approximation of the CNN
equation [95-97]. In this thesis, CNN implementation architectures based on GPU and
FPGA are proposed. Figure 4-3 shows the abstract model of the proposed GPU-based
system.
Figure 4-3: Architecture of the system for processing images based on CNN
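To make the idea of a discrete approximation concrete, the following minimal sketch advances a CNN by one forward-Euler time step. It assumes the standard Chua-Yang state equation with 3x3 templates and a zero boundary; the discretization actually used in [95-97] may differ. The names `euler_step`, `output`, `bias` and the step size `h` are illustrative assumptions, not taken from the thesis.

```python
# Sketch only: forward-Euler step of the standard Chua-Yang CNN state equation.
# The 3x3 template layout, zero boundary and step size h are assumptions.

def output(x):
    """Piecewise-linear CNN output: y = 0.5 * (|x + 1| - |x - 1|)."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def euler_step(x, u, A, B, bias, h):
    """Advance every cell state by one time step of size h.

    x, u : 2-D lists (cell states and input image) of the same shape
    A, B : 3x3 feedback and control templates
    bias : scalar template bias I
    """
    rows, cols = len(x), len(x[0])
    y = [[output(v) for v in row] for row in x]
    new_x = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = bias
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    k, l = i + di, j + dj
                    if 0 <= k < rows and 0 <= l < cols:  # zero boundary
                        acc += A[di + 1][dj + 1] * y[k][l]
                        acc += B[di + 1][dj + 1] * u[k][l]
            # Continuous model: dx/dt = -x + sum(A*y) + sum(B*u) + I
            new_x[i][j] = x[i][j] + h * (-x[i][j] + acc)
    return new_x
```

Calling `euler_step` repeatedly with a small `h` traces the transient of the network. A smaller step improves accuracy at the cost of more iterations, which is one reason a parallel implementation becomes attractive as the image grows.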
A software-based implementation of the CNN is a good option for gaining flexibility in
design and accuracy in the results. The only drawback is that performance degrades
sharply as the CNN size increases. We therefore propose a parallel implementation of
the CNN on a GPU. Instead of programming at the pixel level through the vertex and
fragment engines, we propose an implementation on the OpenCL platform. OpenCL is a
heterogeneous platform for high-performance computing on GPU and CPU devices; it
provides a set of APIs for executing kernels on computing devices and for
communication between them. Kernels are launched over one-, two- or three-dimensional
index spaces and follow a hierarchical abstraction model. A GPU device offers local,
global and constant memory, and each compute unit has its own local memory. OpenCL
makes it straightforward to manage communication among these memories and between
different kernels. Figure 4-4 gives an overview of the CNN GPU design; this part is
described in detail in Chapter 8.
[Figure 4-4 block labels: CNN Templates Bank/Memory; CNN on GPU; CPU; Global Memory]
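The hierarchical index space described above can be illustrated with a small sequential emulation. The sketch below is not OpenCL code and does not reproduce the design from Chapter 8; it only shows how a two-dimensional index space is split into work-groups and work-items so that each work-item handles one CNN cell. The functions `launch_2d` and `cell_kernel` are hypothetical helpers, and the threshold inside `cell_kernel` is a stand-in for the real cell update.

```python
# Sketch only: emulates how a 2-D OpenCL index space is split into work-groups
# and work-items. Function names here are hypothetical, not OpenCL API calls.

def cell_kernel(gid_row, gid_col, src, dst):
    """Per-work-item body: one work-item handles one CNN cell (pixel).

    As a stand-in for the real cell update, it applies a hard threshold.
    """
    dst[gid_row][gid_col] = 1.0 if src[gid_row][gid_col] > 0.0 else -1.0


def launch_2d(kernel, global_size, local_size, *args):
    """Run `kernel` once per point of a 2-D index space, group by group.

    The global size must be a multiple of the local size in each
    dimension, as OpenCL 1.x requires.
    """
    (g_rows, g_cols), (l_rows, l_cols) = global_size, local_size
    if g_rows % l_rows or g_cols % l_cols:
        raise ValueError("global size must be a multiple of local size")
    visited = []
    for grp_r in range(g_rows // l_rows):        # work-group index
        for grp_c in range(g_cols // l_cols):
            for loc_r in range(l_rows):          # work-item index in group
                for loc_c in range(l_cols):
                    # global id = group id * local size + local id
                    gid = (grp_r * l_rows + loc_r, grp_c * l_cols + loc_c)
                    kernel(gid[0], gid[1], *args)
                    visited.append(gid)
    return visited


# Usage: a 4x4 image processed by 2x2 work-groups.
image = [[0.3, -0.2, 0.0, 0.9],
         [-0.5, 0.1, -0.1, 0.4],
         [0.7, -0.8, 0.2, -0.3],
         [-0.6, 0.5, -0.4, 0.6]]
result = [[0.0] * 4 for _ in range(4)]
ids = launch_2d(cell_kernel, (4, 4), (2, 2), image, result)
```

On a real GPU the work-items of a group run concurrently and can share the local memory of their compute unit, while the image itself would typically reside in global memory; the nested loops here reproduce only the indexing, not the parallel execution.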