Figure 7-4: Convolution and stream processing diagram

Date: 16.05.2024
Alireza Fasih

These two processes run concurrently. The first process reads the data and splits it into
three separate rows, and the second process applies a convolution to these data. In each
clock cycle, the columns module reads a pixel value from the live video stream source and
writes it to the output stream channels. The convolution process then uses these pixel
values.
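The two-process arrangement described above can be sketched in software with two chained generators. This is only an illustrative model, not the thesis implementation: the names `columns` and `convolve`, the zero-filled line buffers, and the `width` parameter are assumptions introduced here, and the kernel is applied without flipping (which gives the same result as a convolution for symmetric kernels).

```python
from collections import deque


def columns(pixels, width):
    """Yield one (row1, row2, row3) pixel triple per incoming pixel.

    Two line buffers delay the stream by one and two image rows, so the
    three outputs are vertically aligned pixels from consecutive rows.
    Buffers start zero-filled (an assumption made for this sketch).
    """
    line1 = deque([0] * width, maxlen=width)  # delays the stream by one row
    line2 = deque([0] * width, maxlen=width)  # delays the stream by two rows
    for p in pixels:
        r2 = line1[0]      # same column, one row above p
        r1 = line2[0]      # same column, two rows above p
        line2.append(r2)
        line1.append(p)
        yield r1, r2, p


def convolve(triples, kernel):
    """Slide a 3x3 window over the three row streams, one result per pixel."""
    window = [deque([0] * 3, maxlen=3) for _ in range(3)]
    for triple in triples:
        # Shift the newest column of three pixels into the window.
        for row, value in zip(window, triple):
            row.append(value)
        yield sum(kernel[i][j] * window[i][j]
                  for i in range(3) for j in range(3))
```

With an identity kernel (a single 1 in the center), the output of `convolve(columns(pixels, width), kernel)` is the input stream delayed by `width + 1` pixels, which makes the latency of the line buffers visible. Warm-up and border outputs are not masked in this sketch.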
7.7 Modeling cellular neural network by DDA
A Cellular Neural Network (CNN) is a parallel processing paradigm similar to a neural
network, with the difference that the connectivity between cells is local rather than
global. The main parts of a CNN module are the convolution module and the integrator [50].
Designing an integrator module requires a memory that holds both the data and the cell
values. According to Equation 7-1, which describes the CNN cell dynamics, each cell needs
three templates: a feedback template TA, a control template TB, and a bias template I.
$$\dot{x}_{ij} = -x_{ij} + T_A \ast y_{ij} + T_B \ast u_{ij} + I_{ij} \tag{7-1}$$
where $T_A$ denotes the feedback template (3x3), $T_B$ denotes the control template (3x3),
and ‘u’, ‘x’, ‘y’ and ‘I’ are the input, the state, the output and the bias term, respectively.
The output signal is given by Equation 7-2.
[Figure 7-4 block labels: Stream IN, Columns, Row 1, Row 2, Row 3, Convolution, Stream OUT]
$$y_{ij}(t) = f(x_{ij}(t)) \tag{7-2}$$
In Equation 7-2, the function $f(x)$ is a nonlinear activation function defined in
Equation 7-3.
$$f(x) = \frac{1}{2}\left(|x + 1| - |x - 1|\right) \tag{7-3}$$
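As a worked check of Equation 7-3, expanding the two absolute values case by case shows that the activation is a unit-slope ramp that saturates at -1 and +1. The piecewise form below is derived here for illustration and is not numbered in the source:

$$
f(x) =
\begin{cases}
-1, & x < -1 \\
x, & -1 \le x \le 1 \\
1, & x > 1
\end{cases}
$$

For example, for $x > 1$ both arguments are positive, so $f(x) = \frac{1}{2}\big((x + 1) - (x - 1)\big) = 1$.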
We can model the solution of Equation 7-1 on an FPGA by implementing a simple
approximation technique such as the well-known digital differential analyzer (DDA). A DDA,
sometimes also called a “digital integrating computer”, is a digital implementation of the
differential analyzer. The integrators in a DDA are implemented as accumulators, and the
numeric results are converted back to a pulse rate by the overflow of the accumulator.
The main advantage of the digital integrator over an analog integrator is its scalable
precision. In addition, a DDA-based digital integrator has no drift errors or noise caused
by imperfect electronic components. By accumulating values in a register over time, we can
calculate the integral of the input signal. The basic digital integrator is expressed by
Equation 7-4.
$$X_{n+1} = X_n + K \cdot S \tag{7-4}$$
In Equation 7-4, $X_{n+1}$ denotes the next state of the accumulator used for calculating
the integral. The coefficient $K$ is a constant factor less than 1 that is used for time
scaling, and $S$ denotes the input signal to be integrated. This technique maps onto an
FPGA very easily as behavioral code: after each rising clock edge, the equation updates
the integral value. In this integrator, rounding or truncation errors are due only to the
limited register width, so increasing the register size reduces the error. The error is
cumulative; with low-precision registers, a loss of accuracy will therefore be observed
after a long run time. The only way to overcome this problem is to choose proper register
sizes.
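The effect of register width on the accumulated error can be illustrated with a small fixed-point model of Equation 7-4. This is a sketch under stated assumptions, not the thesis code: `K` is restricted to a power of two (`2**-k_shift`) so that the multiplication becomes a truncating right shift, and the helper names `to_fixed`, `to_float` and `dda_integrate` are hypothetical.

```python
def to_fixed(value, frac_bits):
    """Convert a real value to an integer scaled by 2**frac_bits."""
    return int(round(value * (1 << frac_bits)))


def to_float(value, frac_bits):
    """Convert a scaled integer back to a real value."""
    return value / (1 << frac_bits)


def dda_integrate(samples, k_shift, x0=0):
    """Accumulate X[n+1] = X[n] + K*S[n] with K = 2**-k_shift.

    All values are fixed-point integers. The right shift discards the low
    bits of K*S, which models the truncation caused by a limited register
    width; the discarded part is lost on every step, so the error grows.
    """
    x = x0
    for s in samples:
        x += s >> k_shift
    return x


def integrate_constant(value, steps, k_shift, frac_bits):
    """Integrate a constant input and return the result as a real value."""
    s = to_fixed(value, frac_bits)
    return to_float(dda_integrate([s] * steps, k_shift), frac_bits)
```

Integrating the constant 0.3 for 1000 steps with `k_shift=4` should ideally give 18.75. With 8 fractional bits the model returns 15.625, while with 24 fractional bits it returns about 18.74995, which shows how widening the register controls the cumulative error.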
In this way, we can directly compute the solution of differential equations. This
simulation of analog computing is a fully parallel method for solving differential systems
such as nonlinear equations, and it also realizes the integrator module within the CNN.
The integrator modules need access to memory for storing the state values and the cell
outputs. The only critical term in the CNN equation is the integrator, which we realize
through the DDA model. After approximating the basic CNN cell, we must cascade the cells
together. All these steps are implemented in CoDeveloper using a fixed-point method. The
result for each three rows is stored in memory. CoDeveloper can handle access to the
external memory through the multi-port memory controller (MPMC). This is a full-featured
memory controller that is compatible with standard DDR2 memory devices. It must be
configured with at least one read port and one write port. For the many high-end video
processing applications, there is no implicit limit on the number of read or write ports
in the MPMC.
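To make the combination of Equations 7-1 to 7-4 concrete, the following sketch updates a small grid of CNN cells with the DDA rule. It is a software illustration only and differs from the implementation described above in several assumed ways: it uses floating point instead of fixed point, a single scalar `bias` for all cells, zero values outside the grid, and hypothetical function names.

```python
def f(x):
    """Piecewise-linear output function of Equation 7-3."""
    return 0.5 * (abs(x + 1) - abs(x - 1))


def conv3x3(template, grid, i, j):
    """Weighted sum over the 3x3 neighbourhood of cell (i, j).

    Cells outside the grid contribute 0 (an assumed boundary condition).
    """
    rows, cols = len(grid), len(grid[0])
    total = 0.0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            r, c = i + di, j + dj
            if 0 <= r < rows and 0 <= c < cols:
                total += template[di + 1][dj + 1] * grid[r][c]
    return total


def cnn_step(x, u, t_a, t_b, bias, k):
    """One DDA update X[n+1] = X[n] + K*S, where S is the right-hand side
    of Equation 7-1 evaluated for every cell."""
    y = [[f(v) for v in row] for row in x]
    return [
        [
            x[i][j] + k * (-x[i][j]
                           + conv3x3(t_a, y, i, j)
                           + conv3x3(t_b, u, i, j)
                           + bias)
            for j in range(len(x[0]))
        ]
        for i in range(len(x))
    ]


def run_cnn(u, t_a, t_b, bias, k=0.0625, steps=200):
    """Iterate the grid from a zero state and return the cell outputs."""
    x = [[0.0] * len(u[0]) for _ in u]
    for _ in range(steps):
        x = cnn_step(x, u, t_a, t_b, bias, k)
    return [[f(v) for v in row] for row in x]
```

As a simple check, with a feedback template whose only non-zero entry is a center value of 2, a control template whose only non-zero entry is a center value of 1, and zero bias, the cells are decoupled and each output settles at the sign of its input (+1 or -1).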
