The idea of the CNN was taken from the architectures of artificial neural networks and
cellular automata. In contrast to ordinary neural networks, a CNN has the property of local
connectivity: each cell is connected only to the cells in its neighbourhood. The weights of
these connections are set by parameters called the template, and the template determines
the functionality of the CNN. Hence, with a single common computing model, the desired
functionality can be achieved by choosing the appropriate template. The CNN has been used
successfully in various high-speed parallel signal processing applications such as image
processing, visual computing, pattern recognition and computer vision [91]. This motivated
us to implement it in hardware to meet the need for HPC in real-time image processing. In
addition, the parallel processing capability of the CNN makes a parallel hardware platform
a natural target for its efficient realization.
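
To make the role of the template concrete, the following is a minimal Python sketch of one DT-CNN update. It is not the OpenCL kernel developed in this work. It assumes the commonly used DT-CNN form with a 3x3 feedback template A, a 3x3 control template B, a bias and a hard-limiter output. The function name `dtcnn_step`, the fixed boundary value and the edge-extraction template values are illustrative assumptions, with black encoded as +1 and white as -1.

```python
def dtcnn_step(y, u, A, B, bias, boundary=-1.0):
    """One synchronous DT-CNN update over the whole grid (illustrative sketch).

    y: current cell outputs, u: input image, both as lists of rows.
    A, B: 3x3 feedback and control templates, bias: scalar offset.
    boundary: value assumed for cells outside the grid (an assumption).
    """
    rows, cols = len(u), len(u[0])

    def cell(grid, r, c):
        # Cells outside the grid take the fixed boundary value.
        if 0 <= r < rows and 0 <= c < cols:
            return grid[r][c]
        return boundary

    nxt = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            x = bias
            # Local connectivity: only the 3x3 neighbourhood contributes.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    x += A[dr + 1][dc + 1] * cell(y, r + dr, c + dc)
                    x += B[dr + 1][dc + 1] * cell(u, r + dr, c + dc)
            # Hard-limiter output: +1 (black) or -1 (white).
            nxt[r][c] = 1.0 if x >= 0 else -1.0
    return nxt


# Illustrative edge-extraction template (assumed values, not from this chapter).
A_EDGE = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
B_EDGE = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]
BIAS_EDGE = -1.0

# 5x5 test image: a 3x3 black square on a white background.
image = [[1.0 if 1 <= r <= 3 and 1 <= c <= 3 else -1.0 for c in range(5)]
         for r in range(5)]

edges = dtcnn_step(image, image, A_EDGE, B_EDGE, BIAS_EDGE)
```

With this template, the border of the square stays black while its interior pixel and the background become white. Changing only A, B and the bias changes the operation performed, and the update loop stays the same. Because each cell reads only its 3x3 neighbourhood, all cells can be updated independently, which is what makes the model well suited to a GPU.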
In this research, a DT-CNN model is developed on graphics processing units (GPUs) with the
OpenCL framework. The DT-CNN is implemented entirely in the kernel, which makes it
executable on every platform that supports OpenCL. Note, however, that the GPU is a
coprocessor that supports the CPU of the system. Hence, the CPU still executes several
tasks, such as transferring the data to the on-board memory of the graphics card and
retrieving the results. Finally, a GPU-based Universal Machine CNN (UM-CNN) was implemented
using the OpenCL framework on an NVIDIA GPU. A benchmark compares the GPU-based CNN model
with a CPU implementation for image processing.
The chapter is structured as follows: Section II describes the theory of parallel
computing. Section III introduces the concepts of CNN, the system diagram and its
functionality, and the system design methodology, which is based on OpenCL. Section IV
concludes the chapter and outlines future work.