



# Symmetry Incorporated Cost-Effective Architectures for Two-Dimensional Digital Filters

Lan-Da Van, *Senior Member, IEEE*, I-Hung Khoo, *Member, IEEE*, Pei-Yu Chen, Haranatha (Hari) C. Reddy, *Life Fellow, IEEE* 

## Abstract

Professor Fettweis as far back as 1977 published a paper generalizing McClellan transformation to obtain circular symmetry in 2-D and spherical, hyper-spherical symmetries in multidimensional digital filters [1]. This survey paper presents state-of-the-art two-dimensional (2-D) VLSI digital filter architectures possessing various symmetries in the filter magnitude response. Preceding the symmetry structures, a generalized formulation is given that allows the derivation of various new 2-D VLSI filter structures of any order without global broadcast. Following this, two types (namely, Type 1 [20] and Type 3 [21], [25], [26]) of cost-effective 2-D magnitude symmetry filter architectures possessing diagonal, four-fold rotational, quadrantal, and octagonal symmetries with reduced number of multipliers are given. By combining the identities of the Type-1 and Type-3 symmetry filter structures, multimode 2-D symmetry filters which enable the above four symmetry modes are discussed. The Type-1 and Type-3 multimode filters can result in a 65.3% cost reduction in terms of number of multipliers compared with the sum of the multipliers of the four individual Type-1 symmetry filter structures studied in this paper. Furthermore, Type-3 has shorter critical path than Type-1 multimode filter. The paper is concluded with the presentation of a 2-D filter design example and a corresponding structure.

Digital Object Identifier 10.1109/MCAS.2018.2872665 Date of publication: 11 February 2019

#### I. Introduction

Since the 1970s, the theory and design of two-dimensional (2-D) digital filters [1]-[15] have attracted much attention in the digital signal processing field. Although 2-D digital filters can be implemented on a general-purpose processor for various DSP applications, high-throughput and low-power requirements call for dedicated computing architectures. Several conventional VLSI architectures for 2-D filters have been studied in [11]-[13], and an existing application specific integrated circuit (ASIC) approach has been applied to the design of beam filters [14], [15], and 2-D symmetry filters [16]-[26]. In his paper [1], Fettweis discussed the need for circular symmetry in the magnitude response of 2-D filters. Since circular symmetry cannot be achieved exactly by a rational 2-D transfer function, researchers have focused on achieving symmetries that can approximate the circular symmetry. Therefore, the quadrantal, diagonal, four-fold rotational and octagonal symmetries [8], [9] that help achieve circular symmetry were extensively studied. These symmetries are needed not only to approximate circular symmetry but also to design Fan, Cone and other filter specifications. The most notable summary of the research on 2-D symmetry up to the mid-1980s was given by Swamy and Rajan [8]. Recently the authors have presented efficient 2-D digital filter architectures incorporating these magnitude symmetries. From the BIBO stability point of view, incorporating quadrantal, four-fold rotational and octagonal symmetries requires denominators that are separable in the variables. This is considered in the architectures studied by the authors. The significant feature of the filters studied in [18], [19] is that they exhibit the denominator separability as a filter structural property. This means that the separability of the denominator is maintained independent of the choice of multiplier values. This important property is essential for the design of the multimode symmetry filter discussed in this review paper.

It is well-known that the presence of symmetry in the frequency responses of 2-D filters can be used to reduce the number of multipliers [8]–[10]. Consequently, several potential 2-D filter architectures that make use of filter symmetries have been explored [16]–[26]. In this paper, six cost-effective symmetry filter architectures (i.e., four Type-1 and two Type-3) are reviewed and discussed. They possess diagonal, four-fold rotational, quadrantal, and octagonal symmetries, and require fewer multipliers compared to structures that do not use symmetry. Further, to integrate the support of multiple symmetry functions, two cost-effective Type-1 and Type-3 multimode 2-D filter designs with the four symmetries mentioned above are presented. Type 2 architectures [20] are not reviewed due to space limitation. This paper is organized as follows: Section II describes the various symmetries and their constraints on the coefficients of the 2-D polynomial. Section III discusses the general formulation of 2-D filter architectures. Section IV presents different VLSI-suitable filter architectures with symmetry that require fewer multipliers. In Section V, two cost-effective multimode filter architectures with four symmetries are discussed. The error analyses of the filter structures are discussed in Section VI. The cost in terms of number of multipliers and adders of the reviewed symmetry filter structures and the multimode filter architecture is profiled and evaluated in Section VII. A filter design example is given in Section VIII. The summary of the results is given in Section IX.

#### II. Various Symmetries and Constraints on the Coefficients of 2-D Polynomials

A real general 2-D z-domain IIR transfer function can be represented as in (1), where  $a_{ij}$  and  $b_{ij}$  are real coefficients and  $b_{00} = 0$ ,  $N_1 \times N_2$  is the order of the filter, and X and Y are respectively the input and output of the filter. The equation can also represent an FIR transfer function if we set  $b_{ij} = 0$  for all i and j.

$$H(z_{1}, z_{2}) = \frac{Y(z_{1}, z_{2})}{X(z_{1}, z_{2})} = \frac{\sum_{i=0}^{N_{1}} \sum_{j=0}^{N_{2}} a_{ij} z_{1}^{-i} z_{2}^{-j}}{1 - \sum_{i=0}^{N_{1}} \sum_{j=0}^{N_{2}} b_{ij} z_{1}^{-i} z_{2}^{-j}} = \frac{P(z_{1}, z_{2})}{Q(z_{1}, z_{2})}$$
(1)

The usefulness of symmetry relations in the design of 2-D filters has been studied extensively [1], [5]–[10]. The symmetry present in the frequency response induces a relation among the filter coefficients and multipliers in the filter structures. This reduces the number of design parameters in an optimization scheme, as well as the number of multipliers in an implementation architecture. There are many possible types of symmetries in the magnitude response such as quadrantal, diagonal, four-fold rotational, and octagonal symmetries.

L. D. Van, H.C. Reddy, P. Y. Chen are with the Dept. of Computer Science, National Chiao Tung University, Hsinchu, 300, Taiwan, R.O.C. I. H. Khoo\* and H. C. Reddy\* are with Dept. of Electrical Engineering, California State University Long Beach, Long Beach, CA, U.S.A. This work was supported in part by the MOST 106-2221-E-009-028-MY3, MOST 106-2218-E-009-029 and MOST 107-2634-F-009-010.

The frequency response is evaluated on the distinguished boundary of the unit bi-disk,  $z_i = e^{j\theta_i}$ , i = 1, 2, as shown in Fig. 1. If  $P(z_1, z_2)$  is a 2-D z-domain real polynomial, its frequency response is given by  $P(e^{j\theta_1}, e^{j\theta_2})$ . The magnitude squared function of the frequency response is given by:

$$\begin{aligned}
F(\theta_{1}, \theta_{2}) &= |P(e^{j\theta_{1}}, e^{j\theta_{2}})|^{2} \\
&= P(e^{j\theta_{1}}, e^{j\theta_{2}}) \cdot P(e^{-j\theta_{1}}, e^{-j\theta_{2}}) \\
&= P(z_{1}, z_{2}) \cdot P(z_{1}^{-1}, z_{2}^{-1})\big|_{z_{i}=e^{j\theta_{i}},\, i=1,2}
\end{aligned} \tag{2}$$

It can be seen from (2) that the Centro-Symmetric property, i.e.,  $F(\theta_1, \theta_2) = F(-\theta_1, -\theta_2)$  is always satisfied. The existence of symmetry in  $F(\theta_1, \theta_2)$  implies that the value of the function at  $(\theta_1, \theta_2)$  on the distinguished boundary is related to the value of the function at  $(\theta_{1T}, \theta_{2T})$  where  $(\theta_{1T}, \theta_{2T})$  is obtained by some operation on  $(\theta_1, \theta_2)$  as shown in Figure 2. Also, for our discussion, we assume that the value of the magnitude function is unchanged, i.e.,  $F(\theta_1, \theta_2) = F(\theta_{1T}, \theta_{2T})$ . A more detailed discussion of this can be found in [8]–[10]. Now consider the following symmetries:

#### Quadrantal Symmetry

If the magnitude squared function possesses quadrantal symmetry, then

$$F(\theta_1, \theta_2) = F(-\theta_1, \theta_2) = F(\theta_1, -\theta_2) = F(-\theta_1, -\theta_2), \forall (\theta_1, \theta_2)$$
(3)

Expressing (3) in terms of the polynomial yields:

$$P(z_1, z_2) \cdot P(z_1^{-1}, z_2^{-1}) \cdot z_1^{-N_1} \cdot z_2^{-N_2} = P(z_1^{-1}, z_2) \cdot P(z_1, z_2^{-1}) \cdot z_1^{-N_1} \cdot z_2^{-N_2}$$
(4)

Note that the multiplication by  $z_1^{-N_1} \cdot z_2^{-N_2}$  is needed so that both sides of the equation remain a polynomial in negative powers of *z*. Applying the unique factorization property of 2-variable polynomials [8] to (4), it can be seen that the factors of  $P(z_1, z_2)$  should satisfy one of the following two conditions:

- i)  $P(z_1, z_2) = k_1 \cdot P(z_1^{-1}, z_2) \cdot z_1^{-N_1}$  where  $k_1$  is a real constant.
- ii)  $P(z_1, z_2) = k_2 \cdot P(z_1, z_2^{-1}) \cdot z_2^{-N_2}$  where  $k_2$  is a real constant.

Each of the above conditions will provide a constraint on the polynomial for it to possess quadrantal symmetry in its magnitude response.

Substituting  $P(z_1, z_2) = \sum_{i=0}^{N_1} \sum_{j=0}^{N_2} a_{ij} \cdot z_1^{-i} \cdot z_2^{-j}$  into condition (i) above and assuming  $k_1 = 1$ , we get:

$$\sum_{i=0}^{N_1} \sum_{j=0}^{N_2} a_{ij} \cdot z_1^{-i} \cdot z_2^{-j} = \sum_{i=0}^{N_1} \sum_{j=0}^{N_2} a_{ij} \cdot z_1^{i-N_1} \cdot z_2^{-j}$$
(5)

Applying a change of variable  $i' = N_1 - i$  to (5), we obtain:

$$\sum_{i=0}^{N_1} \sum_{j=0}^{N_2} a_{ij} \cdot z_1^{-i} \cdot z_2^{-j} = \sum_{i'=0}^{N_1} \sum_{j=0}^{N_2} a_{N_1 - i', j} \cdot z_1^{-i'} \cdot z_2^{-j}$$
(6)

So, the coefficient constraint  $a_{ij} = a_{N_1-i,j}$  will ensure that the polynomial  $P(z_1, z_2)$  possesses quadrantal symmetry in its magnitude response. The same steps can be applied to condition (ii) above to obtain another coefficient constraint:

$$a_{ij} = a_{i, N_2 - j} \tag{7}$$

These coefficient constraints can be applied to the transfer function of an FIR filter to ensure quadrantal symmetry. For an IIR filter, the constraint can be applied to the numerator polynomial. To satisfy the coefficient condition  $a_{ij} = a_{N_1-i,j}$  with the requirement of BIBO stability, the denominator  $Q(z_1, z_2)$  must be chosen as a variable separable one [8], i.e., as  $Q(z_1, z_2) = Q_A(z_1) \cdot Q_B(z_2)$ . It is easy to see that  $Q_A(z_1)$  satisfies  $a_{ij} = a_{i,N_2-j}$  and  $Q_B(z_2)$  satisfies  $a_{ij} = a_{N_1-i,j}$ , so their product possesses quadrantal symmetry. In addition, because the denominator is separable, it is easy to check the stability of the filter structure.
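The effect of the constraint  $a_{ij} = a_{N_1-i,j}$  can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `magnitude_squared`, the random coefficients, and the test frequencies are illustrative choices and are not taken from the original work. It evaluates  $F(\theta_1, \theta_2)$  as defined in (2) and compares the four sign combinations in (3).

```python
import numpy as np

def magnitude_squared(a, theta1, theta2):
    """F(theta1, theta2) = |P(e^{j theta1}, e^{j theta2})|^2
    for P(z1, z2) = sum_{i,j} a[i, j] z1^-i z2^-j."""
    i = np.arange(a.shape[0])
    j = np.arange(a.shape[1])
    e1 = np.exp(-1j * theta1 * i)   # z1^{-i} evaluated on the unit circle
    e2 = np.exp(-1j * theta2 * j)   # z2^{-j} evaluated on the unit circle
    return abs(e1 @ a @ e2) ** 2

rng = np.random.default_rng(0)
N1, N2 = 3, 4                                  # illustrative orders
a = rng.standard_normal((N1 + 1, N2 + 1))      # arbitrary real coefficients
a_sym = 0.5 * (a + a[::-1, :])                 # enforce a[i, j] == a[N1 - i, j]

t1, t2 = 0.7, -1.3                             # arbitrary test frequencies
F_ref = magnitude_squared(a_sym, t1, t2)
quadrantal_ok = all(
    np.isclose(F_ref, magnitude_squared(a_sym, s1 * t1, s2 * t2))
    for s1 in (1, -1) for s2 in (1, -1)
)
print("quadrantal symmetry holds at the test point:", quadrantal_ok)
```

The check at a single frequency pair is only a spot test; a grid over  $(\theta_1, \theta_2)$  could be used in the same way.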

Following the above, the coefficient conditions for diagonal and four-fold rotational symmetries can be obtained using the appropriate conditions on the magnitude squared function.





#### **Diagonal Symmetry**

The constraint on the magnitude response for the diagonal symmetry is:

$$F(\theta_1, \theta_2) = F(\theta_2, \theta_1) = F(-\theta_1, -\theta_2) = F(-\theta_2, -\theta_1) \quad (8)$$

The coefficient conditions resulting from the above by following the steps described for quadrantal symmetry are:

$$N_1 = N_2 = N \quad \text{and} \quad a_{ij} = a_{ji} \quad \text{for } 0 \le i, j \le N \tag{9}$$

or

$$a_{ij} = a_{N-j,N-i} \quad \text{for } 0 \le i, j \le N \tag{10}$$

Similar conditions should be satisfied by the denominator  $Q(z_1, z_2)$  in addition to satisfying the BIBO stability conditions.

#### Four-Fold Rotational Symmetry

The magnitude response condition for this symmetry is:

$$F(\theta_1, \theta_2) = F(-\theta_2, \theta_1) = F(-\theta_1, -\theta_2) = F(\theta_2, -\theta_1)$$
(11)

The coefficient conditions resulting from the above will be:

$$N_1 = N_2 = N \quad \text{and} \quad a_{ij} = a_{N-j,i} \quad \text{for } 0 \le i, j \le N \tag{12}$$

or

$$a_{ij} = a_{j,N-i} \quad \text{for } 0 \le i, j \le N \tag{13}$$

The denominator  $Q(z_1, z_2)$  of a BIBO stable filter with four-fold rotational symmetry should be of the form  $Q(z_1, z_2) = Q_A(z_1) \cdot Q_A(z_2)$ .
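Because the index mapping in (12) and (13) is less obvious than in the quadrantal case, a small numerical check may help. The sketch below assumes NumPy; averaging a random coefficient matrix over its four 90-degree rotations is simply one convenient way to build coefficients obeying the constraints and is not a design method from the original work. The helper name `magnitude_squared` is hypothetical.

```python
import numpy as np

def magnitude_squared(a, theta1, theta2):
    """|P(e^{j theta1}, e^{j theta2})|^2 for P = sum a[i, j] z1^-i z2^-j (square a)."""
    n = np.arange(a.shape[0])
    return abs(np.exp(-1j * theta1 * n) @ a @ np.exp(-1j * theta2 * n)) ** 2

rng = np.random.default_rng(1)
N = 3
a = rng.standard_normal((N + 1, N + 1))
# Average over the four rotations so the matrix is invariant under a 90-degree turn.
a_rot = sum(np.rot90(a, k) for k in range(4)) / 4.0

idx = [(i, j) for i in range(N + 1) for j in range(N + 1)]
constraint_12 = all(np.isclose(a_rot[i, j], a_rot[N - j, i]) for i, j in idx)
constraint_13 = all(np.isclose(a_rot[i, j], a_rot[j, N - i]) for i, j in idx)

t1, t2 = 0.9, 0.4                              # arbitrary test frequencies
points = [(t1, t2), (-t2, t1), (-t1, -t2), (t2, -t1)]   # the four points in (11)
values = [magnitude_squared(a_rot, p, q) for p, q in points]
fourfold_ok = bool(np.allclose(values, values[0]))
print(constraint_12, constraint_13, fourfold_ok)
```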



If any two of the above three symmetries are satisfied, the resulting symmetry will be an octagonal symmetry.
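At the level of the coefficient constraints, this statement can be illustrated by composing the index maps. The sketch below uses only plain Python; the function names are illustrative. It takes the diagonal map from (9), the quadrantal map  $a_{ij} = a_{N-i,j}$ , and the four-fold map from (12), and checks that composing any two of them yields the third.

```python
N = 3  # illustrative order; the check does not depend on this value

def diagonal(i, j):
    return (j, i)            # a_ij = a_ji, constraint (9)

def quadrantal(i, j):
    return (N - i, j)        # a_ij = a_{N-i,j}

def fourfold(i, j):
    return (N - j, i)        # a_ij = a_{N-j,i}, constraint (12)

indices = [(i, j) for i in range(N + 1) for j in range(N + 1)]

# diagonal followed by quadrantal gives the four-fold map
diag_quad_gives_fourfold = all(quadrantal(*diagonal(i, j)) == fourfold(i, j) for i, j in indices)
# diagonal followed by four-fold gives the quadrantal map
diag_four_gives_quad = all(fourfold(*diagonal(i, j)) == quadrantal(i, j) for i, j in indices)
# four-fold followed by quadrantal gives the diagonal map
four_quad_gives_diag = all(quadrantal(*fourfold(i, j)) == diagonal(i, j) for i, j in indices)

print(diag_quad_gives_fourfold, diag_four_gives_quad, four_quad_gives_diag)
```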

The above coefficient conditions on 2-D filter transfer function form the basis for deriving various symmetry incorporated architectures with reduced number of multipliers.

The sample sectors for various symmetries are illustrated in  $(\theta_1, \theta_2)$ -plane in Fig. 2. These figures show the shaded region where  $F(\theta_1, \theta_2) = F(\theta_{1T}, \theta_{2T})$ .

#### III. Generalized Formulation of 2-D Filter Architectures without Global Broadcast

$$H(z_1, z_2) = \frac{\sum_{i=0}^{N_1} F_i(z_2^{-1}) \cdot z_1^{-i}}{1 - \sum_{i=0}^{N_1} G_i(z_2^{-1}) \cdot z_1^{-i}}$$
(14)



Figure 4. Sub-block #2 (1-input-2-outputs, direct-form).



where  $F_i(z_2^{-1}) = \sum_{j=0}^{N_2} a_{ij} z_2^{-j}$  and  $G_i(z_2^{-1}) = \sum_{j=0}^{N_2} b_{ij} z_2^{-j}$  with  $b_{00} = 0$  are 1-D FIR functions in  $z_2$  variable only. These 1-D functions can be realized by the sub-blocks in Figs. 3 to 5. These sub-blocks are then used in the filter frameworks to realize the overall 2-D transfer function in (14).

In our discussion, we assume that the filter is used to process a zero padded image of size  $M_1 \times M_2$  and the pixel values in the image are fed to the filter in raster-scan mode, i.e. the input sequence is x(0, 0), x(0, 1), ...,  $x(0, M_2 - 1), x(1, 0), x(1, 1), ...$  etc. We can then replace  $z_2^{-1}$  by a single delay register  $z^{-1}$  (usually implemented by a D flip-flop).  $z_1^{-1}$  can be replaced by a shift register (SR) of length  $M_2$ , i.e.  $z^{-M_2}$  (using  $M_2$ -size D flip-flops), provided  $M_2 > N_2$ . Without loss of generality, we will assume  $N_1 = N_2 = N$  in discussing the filters.
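The raster-scan mapping can be sketched in software for the FIR case. The following is a minimal illustration, assuming NumPy; the function names `fir2d_direct` and `fir2d_raster` and the image size are hypothetical. It replaces  $z_1^{-i} z_2^{-j}$  by a delay of  $i \cdot M_2 + j$  samples on the 1-D raster stream and compares the result with a direct 2-D evaluation. The comparison assumes the image is zero padded with at least  $N_2$  zero columns at the end of every row, so that samples wrapping in from the previous row contribute nothing.

```python
import numpy as np

def fir2d_direct(x, a):
    """y(m, n) = sum_{i,j} a[i, j] x(m - i, n - j); samples outside the image are zero."""
    M1, M2 = x.shape
    y = np.zeros((M1, M2))
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            y[i:, j:] += a[i, j] * x[:M1 - i, :M2 - j]
    return y

def fir2d_raster(x, a):
    """Same filter on the raster-scan stream: z1^-i z2^-j becomes z^-(i*M2 + j)."""
    M1, M2 = x.shape
    s = x.reshape(-1)                  # x(0,0), x(0,1), ..., x(0,M2-1), x(1,0), ...
    y = np.zeros(s.size)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            d = i * M2 + j             # total delay in samples
            y[d:] += a[i, j] * s[:s.size - d]
    return y.reshape(M1, M2)

rng = np.random.default_rng(0)
N1, N2 = 3, 3
a = rng.standard_normal((N1 + 1, N2 + 1))
image = rng.standard_normal((5, 6))
x = np.pad(image, ((0, 0), (0, N2)))   # append N2 zero columns to every row

match = bool(np.allclose(fir2d_direct(x, a), fir2d_raster(x, a)))
print("raster-scan and direct 2-D outputs agree:", match)
```

Without the zero columns the two outputs would generally differ near the left edge of each row, which is one way to see why the zero padding matters in this scheme.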

#### Filter Sub-blocks

The filter sub-blocks are formulated as general digital two-pair networks to realize 1-D FIR functions in  $z_2$ . Here, we assume  $z^{-1} = z_2^{-1}$ . Sub-block #1, shown in Fig. 3, has 2 inputs and 1 output, where the coefficient inside the cloud symbol  $\bigcirc$  denotes a multiplier. It is direct form, i.e. the multiplier values are the same as the polynomial coefficients. It realizes the following two FIR functions.

$$F_{i}(z^{-1}) = \frac{Y_{i}}{X_{i}}\Big|_{W_{i}=0} = \sum_{j=0}^{N} a_{ij} z^{-j} ,$$
  

$$G_{i}(z^{-1}) = \frac{Y_{i}}{W_{i}}\Big|_{X_{i}=0} = \sum_{j=0}^{N} b_{ij} z^{-j}$$
(15)



Note that the special arrangement of the delays is to eliminate global broadcast of the signals,  $X_i$  and  $W_i$ , and to control the critical period. The critical period is the time required for the signal to propagate through the slowest (critical) path of the structure, and it determines the highest possible clock speed of the structure. A different filter sub-block (sub-block #2), shown in Fig. 4, can be obtained by taking the transpose of filter sub-block #1, where  $E_i(z^{-1})$  and  $D_i(z^{-1})$  are defined similarly to (15). Sub-block #3 in Fig. 5 has a single input and a single output (SISO) and realizes the FIR function:

$$C_{\rho i}(z^{-1}) = \frac{Y_i}{X_i} = \sum_{j=0}^{N} \rho_{ij} z^{-j}$$

Note that  $\rho_{ij}$  can represent either the numerator or denominator coefficient  $a_{ij}$  or  $b_{ij}$ .

#### Filter Frameworks

The sub-blocks are used in the filter frameworks to realize the general 2-D z-domain transfer function in (1). Filter framework A shown in Fig. 6 uses the sub-block #1, where SR denotes the  $(M_2 - 1)$ -size D flip-flop shift register. Notice that the shift registers are of length  $M_2 - 1$  due to the additional delays added at the input and output branches to eliminate the global broadcast. It can be verified using Mason's gain formula that the structure with  $z^{-1} = z_2^{-1}$  and SR  $= z_1^{-1}z_2$  realizes the transfer function in (14). By taking the transpose of Framework A, a different filter framework can be obtained which utilizes sub-block #2 [23].

#### Structure Induced Separable Denominator Frameworks

By mixing the sub-blocks in specific ways, filter frameworks realizing transfer functions with separable denominator of the form in (16) can be obtained. The idea is to form two non-touching loops in different variables.

$$H(z_1, z_2) = \frac{Y(z_1, z_2)}{X(z_1, z_2)} = \frac{\sum_{i=0}^{N} \sum_{j=0}^{N} a_{ij} z_1^{-i} z_2^{-j}}{\left(1 - \sum_{i=1}^{N} b_{i0} z_1^{-i}\right) \cdot \left(1 - \sum_{j=1}^{N} b_{0j} z_2^{-j}\right)} \tag{16}$$

Separable filter framework A1 is shown in Fig. 7. It is based on framework A. It uses sub-block #2 at the bottom while the rest are sub-block #1. It realizes the transfer function in (17), with  $G_i$ 's being constants.

$$\frac{Y}{X} = \frac{E_0(z_2) + \sum_{i=1}^{N} F_i(z_2) \cdot z_1^{-i}}{(1 - D_0(z_2)) \cdot \left(1 - \sum_{i=1}^{N} G_i(z_2) \cdot z_1^{-i}\right)}$$
(17)

By taking the transpose of filter frameworks A and A1, another set of frameworks can be obtained. Also, a different set of frameworks can be obtained using the sub-block #3. Due to lack of space, these frameworks are not shown here. A detailed discussion of sub-blocks, general frameworks, and separable denominator frameworks can be found in [23].

#### **Explicit 2-D Filter Architectures**

We will now use the sub-blocks in Figs. 3–5 and general frameworks to derive explicit architectures without global broadcast for separable denominator transfer functions. These are then used to incorporate symmetry. The separability is necessary to ensure the BIBO stability and at the same time to achieve quadrantal, four-fold rotational and octagonal symmetries in the filter magnitude response (note that a separable denominator is not needed for the diagonal symmetry). In this section, we will focus on deriving structures for realizing (16). Using the sub-blocks in Figs. 3 and 4, the Type-1 separable denominator architecture in Fig. 8 can be obtained (for N = 3) [20]. The transfer function of the 2-D filter can be expressed as:

$$H(z_1, z_2) = \frac{Y_1(z_1, z_2)}{X(z_1, z_2)} \cdot \frac{Y(z_1, z_2)}{Y_1(z_1, z_2)}$$
(18)

where

$$Y_1 = X + \sum_{i=1}^{N} b_{i0} z_1^{-i} Y_1 \tag{19}$$

Thus,  $Y(z_1, z_2)/Y_1(z_1, z_2)$  can be generally represented as:

$$Y = \sum_{i=0}^{N} \sum_{j=0}^{N} a_{ij} z_1^{-i} z_2^{-j} Y_1 + \sum_{j=1}^{N} b_{0j} z_2^{-j} Y$$
(20)

It can be verified that the structures of Block 1 and Block 2 in Fig. 8 satisfy (19) and (20) respectively.



**Figure 8.** Type-1 separable denominator filter architecture (N = 3).

The Type-3 separable denominator architecture can be obtained (Fig. 9) using the sub-blocks in Figs. 3 and 5. Its transfer function can be rewritten as:

$$H(z_1, z_2) = \frac{Y(z_1, z_2)}{Y_3(z_1, z_2)} \cdot \frac{Y_3(z_1, z_2)}{X(z_1, z_2)}$$
(21)

where

$$Y = Y_3 + \sum_{j=1}^{N} b_{0j} z_2^{-j} Y \tag{22}$$

Therefore,  $Y_3(z_1, z_2)/X(z_1, z_2)$  can be expressed as:

$$Y_3 = \sum_{i=0}^{N} \sum_{j=0}^{N} a_{ij} z_1^{-i} z_2^{-j} X + \sum_{i=1}^{N} b_{i0} z_1^{-i} Y_3$$
(23)
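The two factorizations can be compared behaviourally. The following is a minimal sketch, assuming NumPy, zero initial conditions, and a finite 2-D array processed directly rather than as a raster stream; the function names and the coefficient values are illustrative only and do not model the hardware structures in Figs. 8 and 9. The Type-1 ordering applies (19) and then (20), while the Type-3 ordering applies (23) and then (22).

```python
import numpy as np

def fir(x, a):
    """sum_{i,j} a[i, j] z1^-i z2^-j applied to x (zero outside the array)."""
    M1, M2 = x.shape
    y = np.zeros_like(x)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            y[i:, j:] += a[i, j] * x[:M1 - i, :M2 - j]
    return y

def all_pole_z1(x, b):
    """y = x + sum_{i>=1} b[i-1] z1^-i y, a recursion along the first index."""
    y = np.zeros_like(x)
    for m in range(x.shape[0]):
        y[m] = x[m]
        for i, bi in enumerate(b, start=1):
            if m - i >= 0:
                y[m] += bi * y[m - i]
    return y

def all_pole_z2(x, b):
    """y = x + sum_{j>=1} b[j-1] z2^-j y, a recursion along the second index."""
    return all_pole_z1(x.T, b).T

def type1(x, a, b_i0, b_0j):
    y1 = all_pole_z1(x, b_i0)                  # (19)
    return all_pole_z2(fir(y1, a), b_0j)       # (20)

def type3(x, a, b_i0, b_0j):
    y3 = all_pole_z1(fir(x, a), b_i0)          # (23)
    return all_pole_z2(y3, b_0j)               # (22)

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 4))                # N = 3, hypothetical numerator
b_i0 = [0.3, -0.2, 0.1]                        # hypothetical denominator coefficients
b_0j = [0.25, 0.1, -0.05]
x = rng.standard_normal((6, 7))

same_output = bool(np.allclose(type1(x, a, b_i0, b_0j), type3(x, a, b_i0, b_0j)))
print("Type-1 and Type-3 orderings give the same output:", same_output)
```

In exact arithmetic the two orderings realize the same transfer function (16); they differ in hardware properties such as the critical path and in how round-off noise propagates, which is what the later sections examine.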

Using the tree method mentioned in [13] to arrange the adders, the critical periods for the Type 1 and Type 3 architectures are  $T_m + 3T_a$  and  $T_m + 2T_a$ , respectively, where  $T_m$  and  $T_a$  denote the operation time required by one multiplier and one adder respectively. In the next section, we will focus on the Type-1 and Type-3 filter architectures with symmetry.

#### IV. Cost Effective 2-D Filter Architectures Incorporating Different Symmetries

The presence of symmetry in the 2-D frequency response induces certain relationships among the filter coefficients. This translates into a reduced number of multipliers when implementing a 2-D digital filter architecture. In this section, we present six symmetry filter architectures with diagonal, four-fold rotational, quadrantal, and octagonal symmetries with separable denominators.

#### **Diagonal Symmetry Filter Architectures**

Applying the diagonal symmetry coefficient constraint (9) to the separable denominator transfer function in (16) implies that  $a_{ij} = a_{ji}$  and  $b_{k0} = b_{0k}$ . Similarly, the constraint in (10) can be used to get different structures. Thus, for the Type-1 filter architecture, with  $Y_1$  given in (19), the expression for the output Y in (20) can be recast as:

$$Y = \sum_{j=1}^{N} b_{0j} z_2^{-j} Y + \sum_{i=0}^{N} a_{ii} z_1^{-i} z_2^{-i} Y_1 + \sum_{i=0}^{N-1} \sum_{j=i+1}^{N} a_{ij} (z_1^{-i} z_2^{-j} + z_1^{-j} z_2^{-i}) Y_1$$
(24)

Implementing (24) results in the Type-1 diagonal symmetry filter architecture of Fig. 10 [20]. Note that for diagonal symmetry the denominator need not be separable, but a separable denominator is used here to ease the implementation of the multimode filter to be discussed in Section V.

In a similar way, one can obtain the Type 3 diagonal symmetry architecture shown in Fig. 11 [21]. Using the tree method to arrange the adders, the critical paths are shown in Figs. 10 & 11 for the two architectures. The critical periods are calculated as  $T_m + 3T_a$  and  $T_m + 2T_a$  respectively. Note that  $T_m$  and  $T_a$  denote the operation time required by the multiplier and adder respectively.

#### Four-fold Rotational Symmetry Filter Architectures

When the 2-D magnitude response of a filter possesses four-fold rotational symmetry, as per (13), the filter coefficients in (16) will satisfy the constraints:  $a_{ij} = a_{j(N-i)}$  and  $b_{k0} = b_{0k}$  for all *i*, *j*, *k*. So, for the Type-1 filter, the output *Y* in (20) for this symmetry can be expressed as:



$$Y = \sum_{j=1}^{N} b_{0j} z_2^{-j} Y + v a_{uu} z_1^{-u} z_2^{-u} Y_1 + \sum_{i=0}^{u-v} \sum_{j=i}^{N-i-1} a_{ij} (z_1^{-i} z_2^{-j} + z_1^{-j} z_2^{-(N-i)} + z_1^{-(N-i)} z_2^{-(N-j)} + z_1^{-(N-j)} z_2^{-i}) Y_1$$
(25)

where  $u = \lfloor N/2 \rfloor$ ,  $v = (N+1) \mod 2$ , and  $\lfloor \cdot \rfloor$  denotes the largest integer that is smaller than or equal to  $\cdot$ . Figure 12 shows the Type-1 four-fold rotational symmetry filter architecture [20]. Following the above, the Type-3 four-fold rotational symmetry separable denominator filter architecture can be obtained [21]. This structure is not shown here. Performing the critical path analysis on Fig. 12 yields the delay of  $T_m + 3T_a$ . For the Type 3 structure, it will be  $T_m + 2T_a$ .

#### **Quadrantal Symmetry Filter Architectures**

When the 2-D magnitude response of a filter possesses quadrantal symmetry, as per (7) the filter coefficients in (16) will satisfy the constraints:  $a_{ij} = a_{(N-i)j}$  and  $b_{k0} = b_{0k}$  for all i, j, k. So, the output Y in (20) for the Type-1 filter becomes:

$$Y = \sum_{j=1}^{N} b_{0j} z_2^{-j} Y + v \cdot \sum_{j=0}^{N} a_{uj} (z_1^{-u} z_2^{-j}) Y_1 + \sum_{i=0}^{u-v} \sum_{j=0}^{N} a_{ij} (z_1^{-i} z_2^{-j} + z_1^{-(N-i)} z_2^{-j}) Y_1$$
(26)

The Type-1 quadrantal symmetry filter architecture is given in Fig. 13 [20]. The Type 3 structure is shown in Fig. 14 [26].

Performing the critical path analysis using the tree method on Figs. 13 & 14 yields the delays of  $T_m + 3T_a$  and  $T_m + 2T_a$  respectively. The critical paths are indicated in the figures.

#### **Octagonal Symmetry Filter Architectures**

Octagonal symmetry is a combination of diagonal, four-fold rotational and quadrantal symmetries. Presence of any two of the three symmetries will guarantee the presence of octagonal symmetry in the 2-D magnitude response of the filter [8]–[10]. This results in the coefficient constraints,  $a_{ij} = a_{ji} = a_{(N-i)j}$  and  $b_{k0} = b_{0k}$  for all i, j, k. So, for octagonal symmetry, with eqn. (19) unchanged, the output *Y* in (20) can be expressed as:

$$\begin{aligned}
Y ={}& \sum_{j=1}^{N} b_{0j} z_{2}^{-j} Y + v a_{uu} z_{1}^{-u} z_{2}^{-u} Y_{1} \\
&+ v \cdot \sum_{i=0}^{u-v} a_{iu} (z_{1}^{-i} z_{2}^{-u} + z_{1}^{-u} z_{2}^{-(N-i)} + z_{1}^{-(N-i)} z_{2}^{-(N-u)} + z_{1}^{-(N-u)} z_{2}^{-i}) Y_{1} \\
&+ \sum_{i=0}^{u-v} a_{ii} (z_{1}^{-i} z_{2}^{-i} + z_{1}^{-i} z_{2}^{-(N-i)} + z_{1}^{-(N-i)} z_{2}^{-(N-i)} + z_{1}^{-(N-i)} z_{2}^{-i}) Y_{1} \\
&+ \sum_{i=0}^{u-v-1} \sum_{j=i+1}^{u-v} a_{ij} \big(z_{1}^{-i} z_{2}^{-j} + z_{1}^{-i} z_{2}^{-(N-j)} + z_{1}^{-(N-i)} z_{2}^{-j} + z_{1}^{-(N-i)} z_{2}^{-(N-j)} \\
&\qquad + z_{1}^{-j} z_{2}^{-i} + z_{1}^{-j} z_{2}^{-(N-i)} + z_{1}^{-(N-j)} z_{2}^{-i} + z_{1}^{-(N-j)} z_{2}^{-(N-i)}\big) Y_{1}
\end{aligned} \tag{27}$$

Implementing the above, one can get the Type-1 octagonal symmetry filter architecture shown in Fig. 15. The Type-3 octagonal symmetry filter architecture is not shown here and can be found in [26]. The critical path analysis yields the delays of  $T_m + 3T_a$  and  $T_m + 2T_a$  respectively.

Due to symmetry, all six cost-effective filter architectures require fewer multipliers. The savings come from realizing the numerator of (16). The direct form implementation of the numerator with no symmetry (for N = 3) requires 16 multipliers. In the case of diagonal, quadrantal, four-fold rotational, and octagonal symmetries, the numbers of multipliers needed are 10, 8, 4 and 3, respectively.



The 2-D octagonal symmetry structure in Fig. 15 has the lowest number of multipliers.
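The numerator multiplier counts quoted above can be reproduced by counting how many groups of coefficient indices are tied together by each constraint. The sketch below is plain Python; the union-find helper `count_independent` and the lambda names are illustrative and are not part of the original work. Each symmetry is described by its index map from Section II.

```python
def count_independent(N, maps):
    """Number of groups of indices (i, j), 0 <= i, j <= N, that the index maps tie together."""
    parent = {(i, j): (i, j) for i in range(N + 1) for j in range(N + 1)}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]   # path halving
            p = parent[p]
        return p

    for p in list(parent):
        for f in maps:
            ra, rb = find(p), find(f(p[0], p[1], N))
            if ra != rb:
                parent[ra] = rb             # merge the two groups
    return len({find(p) for p in parent})

diag = lambda i, j, N: (j, i)        # a_ij = a_ji
quad = lambda i, j, N: (N - i, j)    # a_ij = a_{N-i,j}
rot = lambda i, j, N: (N - j, i)     # a_ij = a_{N-j,i}

N = 3
counts = {
    "none": count_independent(N, []),
    "diagonal": count_independent(N, [diag]),
    "quadrantal": count_independent(N, [quad]),
    "four-fold": count_independent(N, [rot]),
    "octagonal": count_independent(N, [diag, quad]),
}
print(counts)
```

For N = 3 this gives 16, 10, 8, 4 and 3 independent numerator coefficients, in agreement with the figures stated in the text.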

#### V. Cost-Effective Multimode 2-D Filter Architecture Incorporating Four Symmetries

To reduce the cost of filter area and to increase hardware flexibility, two multimode filter architectures are developed that each supports four different symmetry modes: diagonal symmetry mode (DSM), four-fold rotational symmetry mode (FRSM), quadrantal symmetry mode (QSM), and octagonal symmetry mode (OSM). These two cost-effective multimode 2-D IIR filter architectures are shown in Fig. 16 and Fig. 17 for N = 3.

The cost-effective multimode 2-D symmetry filter architecture in Fig. 16 [20] can be derived based on three observations. First, the signal paths are added before the  $a_{ij}$  independent coefficient multiplier for the Type-1 symmetry filters in Figs. 10, 12, 13, and 15. Second, it can be seen from (18) that the Type-1 symmetry filter consists of two transfer functions:  $Y_1/X$  and  $Y/Y_1$ , with the  $Y_1/X$  transfer function being the same for the four individual symmetry filters as shown in (19). The block diagram of  $Y_1/X$  is depicted as Block 1 on the left-hand side of Fig. 16(a). Block 1 requires 3 multipliers and 3 adders. Next, we consider the  $Y/Y_1$  transfer function which is different for each of the four individual symmetry filters. To construct Block 2 of the multimode filter, we need 3 multipliers for the denominator  $\{b_{01}, b_{02}, b_{03}\}$ , and 11 multipliers for the numerator  $\{a_{00}, a_{01}, a_{02}, a_{03}, a_{10}, a_{11}, a_{12}, a_{13}, a_{22}, a_{23}, a_{33}\}.$  Therefore, 11 + 3 = 14 coefficient multipliers are required in Block 2 of Fig. 16(a) to achieve the operations for four different transfer functions of  $Y/Y_1$ . In terms of the number of adders from the architecture viewpoint, for  $Y/Y_1$  of the multimode 2-D symmetry filter, 13 adders and 11 adders are needed on the left-hand side and right-hand side of Block 2 in Fig. 16(a), respectively.

In summary, the Type-1 multimode 2-D symmetry filter requires altogether 17 coefficient multipliers and 27 adders. Also, in Figs. 10, 12, 13, and 15, the interconnection control is only needed for the four  $Y/Y_1$  transfer functions. The multiplication connections and internal connections are controlled by interconnection boxes (IBs) to accomplish the four-mode operations, where each IB either connects or disconnects a signal path. According to the connections of the four individual symmetry filter architectures, 12 IBs are needed for the internal connections in Block 2. Therefore, based on the three observations mentioned above, the Type-1 multimode 2-D IIR filter with four symmetry modes is obtained as shown in Fig. 16(a). The interconnection differences among the four configurations of the multimode filter architecture are highlighted in Fig. 16(b). The multimode filter architecture has a critical path of  $T_m + 3T_a$  as shown in Fig. 16(a) by using the tree method.

Similarly, to support multiple symmetry functions, the Type-3 multimode 2-D IIR filter architecture providing the DSM, FRSM, QSM and OSM modes of operation has been obtained and is given in Fig. 17 [25]. The details of this multimode structure can be found in [25]. The Type 3 multimode filter architecture shown in Fig. 17(a) has a critical path of  $T_m + 2T_a$ .

The multimode architecture presented can support four different symmetry modes with just a slight area overhead. It achieves a multiplier reduction of 65.3% for N = 3 compared with the sum of the multipliers of the four individual symmetry filter structures, thus making the multimode hardware architectures quite cost effective.
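The 65.3% figure can be reproduced with a short calculation. The sketch below is plain Python and rests on an assumption that is not stated explicitly in this section: each individual Type-1 symmetry filter is taken to use 2N denominator multipliers (the  $b_{i0}$  in Block 1 and the  $b_{0j}$  in Block 2) in addition to the numerator counts quoted in Section IV. The variable names are illustrative.

```python
N = 3

# Numerator multiplier counts for N = 3, as quoted in Section IV.
numerator = {"diagonal": 10, "quadrantal": 8, "four-fold": 4, "octagonal": 3}

# Assumption: every individual filter carries N multipliers for b_i0 and N for b_0j.
denominator = 2 * N

individual_total = sum(n + denominator for n in numerator.values())
multimode = 17                       # Type-1 multimode filter, as stated in Section V

reduction = 1 - multimode / individual_total
print(f"sum of individual filters: {individual_total} multipliers")
print(f"multimode reduction: {100 * reduction:.1f}%")
```

Under this assumption the four individual filters need 49 multipliers in total, and 17 multipliers correspond to a reduction of about 65.3%, consistent with the stated value.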

#### VI. Error Analysis for 2-D Symmetry Filter Architectures

The product quantization errors propagating through the filter architecture in fixed-point implementation have been studied in [3], [27]. Using the same approach, the round-off noise errors for the Type-1 and Type-3 filter architectures are analyzed in [25], [26]. In the analysis, the round-off noise sources are assumed to be uncorrelated, wide-sense stationary and uniformly distributed, which allows linear decomposition to be applied. Furthermore, a noise source and its delayed version are regarded as independent.

For the Type-1 diagonal symmetry filter architecture in Fig. 10, the linear error signals  $e_1$  and  $e_2$  are given by:

$$e_1 = \sum_{j=1}^{N} e_{b0j}$$
(28)

$$e_{2} = \sum_{j=1}^{N} e_{b0j} + \sum_{i=0}^{N} e_{aii} + \sum_{i=0}^{N-1} \sum_{j=i+1}^{N} e_{aij}$$
(29)

Since the  $e_1$  error/noise source passes through the whole filter architecture and  $e_2$  passes through  $b_{0j}$  at the right-hand side in Fig. 10, the total variance of the quantization error of the Type-1 diagonal symmetry filter architecture can be derived as:

$$\sigma_{\text{Type1\_Dia}}^{2} = N \sigma_{e}^{2} \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} |h[m,n]|^{2} + \left(2N + 1 + \frac{N(N+1)}{2}\right) \sigma_{e}^{2} \sum_{n=-\infty}^{\infty} |h_{b2}[n]|^{2} \tag{30}$$

where  $\sigma_e^2 = 2^{-2B}/12$ , *B* is the fractional bit width after quantization, and  $h_{b2}[n]$  and  $h_{b12}[m, n]$  are defined as:

$$h_{b2}[n] \xleftarrow{z} \frac{1}{1 - \sum_{i=1}^{N} b_{0i} z_2^{-i}}$$
(31)

$$h_{b12}[m,n] \xleftarrow{Z} \frac{1}{\left(1 - \sum_{i=1}^{N} b_{0i} z_{1}^{-i}\right) \cdot \left(1 - \sum_{j=1}^{N} b_{0j} z_{2}^{-j}\right)} \quad (32)$$
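As an illustration of how the terms in (30) can be evaluated, the following sketch computes  $\sigma_e^2$  and the second term of (30) for a given set of  $b_{0j}$  values. It assumes NumPy, a stable all-pole filter, and a truncated impulse response in place of the infinite sum; the function names, the truncation length, and the example coefficient are hypothetical.

```python
import numpy as np

def impulse_energy_all_pole(b, n_samples=2000):
    """Approximate sum_n |h_b2[n]|^2 for 1 / (1 - sum_i b[i-1] z^-i), as in (31),
    by truncating the impulse response to n_samples terms."""
    h = np.zeros(n_samples)
    for n in range(n_samples):
        h[n] = 1.0 if n == 0 else 0.0          # unit impulse input
        for i, bi in enumerate(b, start=1):
            if n - i >= 0:
                h[n] += bi * h[n - i]          # feedback through b_{0i}
    return float(np.sum(h ** 2))

def type1_diagonal_second_term(b0, B):
    """Second term of (30): (2N + 1 + N(N+1)/2) * sigma_e^2 * sum |h_b2[n]|^2."""
    N = len(b0)
    sigma_e2 = 2.0 ** (-2 * B) / 12.0          # variance of one round-off noise source
    n_sources = 2 * N + 1 + N * (N + 1) // 2   # number of sources feeding e_2
    return n_sources * sigma_e2 * impulse_energy_all_pole(b0)

energy = impulse_energy_all_pole([0.5])        # h[n] = 0.5^n, so the energy is 1 / (1 - 0.25)
term = type1_diagonal_second_term([0.5], B=8)
print(energy, term)
```

The first term of (30) would require the 2-D impulse response  $h[m,n]$  of the whole filter and is not covered by this sketch.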





For comparison, the error analyses of Type-1 four-fold, quadrantal symmetry and octagonal symmetry filter architectures are listed below.

$$\sigma_{\text{Type1\_FF}}^{2} = N \sigma_{e}^{2} \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} |h[m,n]|^{2} + [N+v+(u-v+1)(N-u+v)] \sigma_{e}^{2} \sum_{n=-\infty}^{\infty} |h_{b2}[n]|^{2} \tag{33}$$

$$\sigma_{\text{Type1\_Qua}}^{2} = N \sigma_{e}^{2} \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} |h[m,n]|^{2} + [N + v(N+1) + (u - v + 1)(N+1)] \sigma_{e}^{2} \sum_{n=-\infty}^{\infty} |h_{b2}[n]|^{2} \tag{34}$$

$$\sigma_{\text{Type1\_Oct}}^{2} = N \sigma_{e}^{2} \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} |h[m,n]|^{2} + \left[N+v+v(u-v+1)+(u-v+1) + \frac{(u-v)(u-v+1)}{2}\right] \sigma_{e}^{2} \sum_{n=-\infty}^{\infty} |h_{b2}[n]|^{2} \tag{35}$$

Similarly, the total variance of quantization error of Type-3 symmetry filter architectures is derived as

$$\begin{aligned} \sigma_{\text{Type3\_Dia}}^2 &= \sigma_{\text{Type3\_FF}}^2 = \sigma_{\text{Type3\_Qua}}^2 = \sigma_{\text{Type3\_Oct}}^2 \\ &= N\sigma_e^2 \sum_{n=-\infty}^{\infty} |h_{b2}[n]|^2 + \left[N + (N+1)^2\right]\sigma_e^2 \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} |h_{b12}[m,n]|^2 \end{aligned} \quad (36)$$

where  $h_{b2}[n]$  and  $h_{b12}[m, n]$  are defined in (31) and (32) respectively.
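The variance expressions (30) and (36) can be evaluated directly once the impulse-response energies are known. The following is a minimal sketch under that assumption: the energies are passed in as plain numbers, the function names are hypothetical, and the source counts are written out term by term so that the coefficient $2N + 1 + N(N+1)/2$ in (30) can be traced back to the three sums in (29).

```python
def type1_diagonal_variance(N, sigma_e2, energy_h, energy_hb2):
    """Eq. (30): the N sources of e_1 see h[m, n]; the sources of e_2 see h_b2[n]."""
    n_e1 = N                                  # number of terms in (28)
    n_e2 = N + (N + 1) + N * (N + 1) // 2     # b_0j, a_ii and a_ij (i < j) terms in (29)
    return n_e1 * sigma_e2 * energy_h + n_e2 * sigma_e2 * energy_hb2


def type3_variance(N, sigma_e2, energy_hb2, energy_hb12):
    """Eq. (36), shared by all four Type-3 symmetry architectures."""
    return (N * sigma_e2 * energy_hb2
            + (N + (N + 1) ** 2) * sigma_e2 * energy_hb12)


# Illustrative inputs only: unit noise variance and unit energies expose the
# bare source counts for N = 3.
type1_count = type1_diagonal_variance(3, 1.0, 1.0, 1.0)   # 3 + 13
type3_count = type3_variance(3, 1.0, 1.0, 1.0)            # 3 + 19
```

Which architecture has the lower noise in practice depends on the relative sizes of the three energies, so the unit-energy counts above should not be read as a ranking.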

#### VII. Implementation and Comparison of Results

As a proof of concept, the chip layout of the Type-1 multimode 2-D IIR filter architecture of Fig. 16 is shown in Fig. 18. The circuit measures 718.95 $\mu$m × 711.05 $\mu$m and has an average power consumption of 29.34 mW. As indicated in [20], the area saving relative to the sum of the areas of the four individual symmetry filters can be up to 63.25%. The architectures are compared in terms of the number of multipliers, number of adders, and critical path in Table 1.

The Type-3 symmetry filter architectures also have a shorter critical path delay than the Type-1 symmetry filter architectures. An adder occupies much less area than a multiplier, so for a fair comparison an *n*-bit adder is counted as $1/n$ of an $n \times n$-bit multiplier, following the array multiplier approach [28]. In line with the hardware implementation in [20], $16 \times 16$-bit multipliers are assumed for this design. The Type-3 multimode 2-D filter architecture has both a shorter critical path delay and fewer adders than the Type-1 multimode 2-D filter architecture.
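The multiplier counts and adder equivalents of Table 1 can be reproduced with a few lines of exact arithmetic. The sketch below is hedged in two places. First, $\nu$ is treated as a plain input, and $\nu = 0$ is used only because it reproduces the $N = 3$ column. Second, the multimode formula is taken with a minus sign on its $\frac{1}{4}(N+1+\nu)$ term, an assumption chosen because it yields the tabulated count of 17. The function names are hypothetical.

```python
from fractions import Fraction


def multiplier_counts(N, nu):
    """Multiplier-count formulas as listed in Table 1 (nu is the table's parameter)."""
    M = N + 1
    return {
        "general [13]": Fraction(2 * M * M - 1),
        "separable": Fraction(M * M + 2 * N),
        "diagonal": Fraction(M * M, 2) + Fraction(5 * N, 2) + Fraction(1, 2),
        "four-fold": Fraction(M * M, 4) + Fraction(3 * nu, 4) + 2 * N,
        "quadrantal": Fraction(M * M, 2) + Fraction(nu * M, 2) + 2 * N,
        "octagonal": Fraction((M + nu) ** 2, 8) + Fraction(M + nu, 4) + 2 * N,
        # Assumption: minus sign on the (N+1+nu)/4 term (matches the count 17).
        "multimode": (Fraction(M * M, 2) + Fraction(M, 2)
                      + Fraction((M + nu) ** 2, 8) - Fraction(M + nu, 4) + 2 * N),
    }


def adder_equivalent(n_adders, word_bits=16):
    """Count each n-bit adder as 1/n of an n x n-bit array multiplier."""
    return Fraction(n_adders, word_bits)


counts = multiplier_counts(N=3, nu=0)
individual_sum = sum(counts[k] for k in
                     ("diagonal", "four-fold", "quadrantal", "octagonal"))
saving = 1 - counts["multimode"] / individual_sum   # fraction of multipliers saved
```

For $N = 3$ the four individual Type-1 symmetry filters need $16 + 10 + 14 + 9 = 49$ multipliers against 17 for the multimode filter, which is the 65.3% reduction quoted in the abstract.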



#### VIII. Filter Design Example

Consider a design example for a narrowband Fan filter with diagonal symmetry. The filter magnitude specification is shown in Fig. 19. The filter passband has an angle $\phi_1 = 15$ deg and the transition band has an angle $\phi_2 = 10$ deg. (Note that the specification only shows $\theta_2 > 0$ for the vertical axis.) A 2-D Fan filter can be used in the design of 3-D cone filters, which find applications in high-selectivity beamformers [15].

Optimization is used to obtain a transfer function that satisfies the Fan filter specification. Because the given specification exhibits diagonal symmetry, the chosen form of the transfer function (with unknown coefficients) satisfies diagonal symmetry. The objective is to minimize the sum of the squared errors between the filter magnitude response and the desired magnitude response, evaluated on a uniform raster of frequency points covering both the passband and the stopband of the 2-D frequency plane. The objective (error) function is shown in (37).

$$\operatorname{Error} = \sum_{k} \sum_{l} \left[ F(\theta_{1k}, \theta_{2l}) - F_d(\theta_{1k}, \theta_{2l}) \right]^2$$
(37)

where *F* is the transfer function magnitude squared response,  $F_d$  is the desired response, and  $\theta_{1k}$ ,  $\theta_{2l}$  are the sample frequency points where the desired response is specified.
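To make (37) concrete, the sketch below evaluates the error sum on a uniform raster for a 2-D transfer function with a separable denominator. The transfer-function form used here, a numerator $\sum_{i}\sum_{j} a_{ij} z_1^{-i} z_2^{-j}$ over the product of two identical 1-D denominators as in (32), is an assumption made for illustration. So are the function names, the grid sizes, and the toy coefficients. The optimizer itself is not shown.

```python
import cmath
import math


def magnitude_squared(a, b, theta1, theta2):
    """|H|^2 at (theta1, theta2) for an assumed separable-denominator form:

    H = sum_ij a[i][j] z1^-i z2^-j / (D(z1) D(z2)),  D(z) = 1 - sum_i b[i-1] z^-i.
    """
    z1_inv = cmath.exp(-1j * theta1)
    z2_inv = cmath.exp(-1j * theta2)
    num = sum(a_ij * z1_inv ** i * z2_inv ** j
              for i, row in enumerate(a) for j, a_ij in enumerate(row))
    d1 = 1 - sum(b_i * z1_inv ** i for i, b_i in enumerate(b, start=1))
    d2 = 1 - sum(b_i * z2_inv ** i for i, b_i in enumerate(b, start=1))
    return abs(num / (d1 * d2)) ** 2


def design_error(a, b, desired, grid1, grid2):
    """Eq. (37): squared error summed over the raster grid1 x grid2."""
    return sum((magnitude_squared(a, b, t1, t2) - desired(t1, t2)) ** 2
               for t1 in grid1 for t2 in grid2)


def uniform_raster(start, stop, count):
    """count equally spaced frequency samples from start to stop inclusive."""
    step = (stop - start) / (count - 1)
    return [start + k * step for k in range(count)]


# Toy check: H = 1 (a_00 = 1, no feedback) against an all-pass target.
grid1 = uniform_raster(-math.pi, math.pi, 9)
grid2 = uniform_raster(0.0, math.pi, 5)
toy_error = design_error([[1.0]], [], lambda t1, t2: 1.0, grid1, grid2)
```

In an actual design, `desired` would encode the Fan specification of Fig. 19, and a numerical optimizer would adjust the entries of `a` and `b` to reduce `design_error`.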

The design uses a (variable) separable-denominator transfer function as described in Section III. The optimization results for different filter orders are shown in Table 2. The 3-D surface plot for order $5 \times 5$ is shown in Fig. 20, and the contour plots for orders $5 \times 5$ and $3 \times 3$ are given in Fig. 21. As expected, a higher filter order reduces the design error, at the cost of a greater number of multipliers in the final filter structure.

Table 1. Comparison of different 2-D IIR filter architectures with the order N.

| Works | Type | # of multipliers | # of multipliers for $N = 3$ (#) | # of multipliers for $N = 3$ (%) | # of two-input adders for $N = 3$ | Equivalent # of 16×16-bit multipliers | Critical path |
|---|---|---|---|---|---|---|---|
| Van [13] | | $2(N+1)^2-1$ | 31 for Gen | 100% | 30 | 1.875 | $T_m + 3T_a$ |
| Separable denominator | Type-1 | $(N+1)^2 + 2N$ | 22 | 70.97% | 21 | 1.3125 | $T_m + 3T_a$ |
| | Type-3 | | | | | | $T_m + 2T_a$ |
| Diagonal | Type-1 | $\frac{1}{2}(N+1)^2 + \frac{5}{2}N + \frac{1}{2}$ | 16 | 51.61% | 21 | 1.3125 | $T_m + 3T_a$ |
| | Type-3 | | | | | | $T_m + 2T_a$ |
| Four-fold rotational | Type-1 | $\frac{1}{4}(N+1)^2 + \frac{3}{4}\nu + 2N$ | 10 | 32.26% | 21 | 1.3125 | $T_m + 3T_a$ |
| | Type-3 | | | | | | $T_m + 2T_a$ |
| Quadrantal | Type-1 | $\frac{1}{2}(N+1)^2 + \frac{\nu}{2}(N+1) + 2N$ | 14 | 45.16% | 21 | 1.3125 | $T_m + 3T_a$ |
| | Type-3 | | | | | | $T_m + 2T_a$ |
| Octagonal | Type-1 | $\frac{1}{8}(N+1+\nu)^2 + \frac{1}{4}(N+1+\nu) + 2N$ | 9 | 29.03% | 21 | 1.3125 | $T_m + 3T_a$ |
| | Type-3 | | | | | | $T_m + 2T_a$ |
| Multimode | Type-1 | $\frac{1}{2}(N+1)^2 + \frac{1}{2}(N+1) + \frac{1}{8}(N+1+\nu)^2 - \frac{1}{4}(N+1+\nu) + 2N$ | 17 | 54.84% | 29 | 1.8125 | $T_m + 3T_a$ |
| | Type-3 | | | | 21 | 1.3125 | $T_m + 2T_a$ |

Table 2. Separable denominator design.

| Filter order | Design error | # of multipliers |
|---|---|---|
| $5 \times 5$ | 22.5 | 31 |
| $4 \times 4$ | 29.1 | 23 |
| $3 \times 3$ | 44.6 | 16 |

As previously mentioned, for diagonal symmetry the denominator of the 2-D filter transfer function need not be separable. For completeness, an order $2 \times 2$ non-separable denominator filter structure possessing diagonal symmetry is shown in Fig. 22. The critical path analysis yields a delay of $T_m + 2T_a$. Compared with the non-symmetric direct-form implementation [13], six multipliers are saved. This order $2 \times 2$ structure can be extended to a filter of any order. In general, the number of multipliers required in a non-separable denominator filter structure with diagonal symmetry is $N^2 + 3N + 1$. Without any symmetry, the number of multipliers required is $2N^2 + 4N + 1$.
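As a quick check of these counts, setting $N = 2$ in both expressions reproduces the saving of six multipliers quoted for the $2 \times 2$ structure, and the non-symmetric count agrees with the $2(N+1)^2 - 1$ entry listed for [13] in Table 1:

$$\begin{aligned} \left.\left(N^2 + 3N + 1\right)\right|_{N=2} &= 11, \\ \left.\left(2N^2 + 4N + 1\right)\right|_{N=2} &= 17 = \left.\left(2(N+1)^2 - 1\right)\right|_{N=2}, \\ \left(2N^2 + 4N + 1\right) - \left(N^2 + 3N + 1\right) &= N^2 + N = 6 \quad \text{for } N = 2. \end{aligned}$$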

#### IX. Summary

This review article, written as a dedication to the memory of Professor Alfred Fettweis, gives the most recent update of the research results on 2-D digital filter structures possessing symmetry. The symmetries incorporated in these VLSI-implementable architectures are quadrantal, diagonal, four-fold rotational, and octagonal. Cost-effective multimode symmetry architectures combining these symmetry modes of operation are presented. The error analysis of the structures and the implementation aspects are also discussed. A Fan filter design with diagonal symmetry is presented at the end, along with a filter structure for a 2-D transfer function with diagonal symmetry. By utilizing the generalized design procedure and using sub-networks and frameworks, the individual symmetry and multimode filter architectures of any order can be obtained.

#### Acknowledgment

The authors would like to thank Professor P. K. Rajan of Tennessee Tech University for his thorough review and suggestions to improve the quality of presentation.



**Lan-Da Van** (S'98-M'02-SM'16) received the Ph.D. degree in electrical engineering from National Taiwan University (NTU), Taipei, Taiwan, in 2001. From 2001 to 2006, he was with the National Chip Implementation Center (CIC), Hsinchu, Taiwan. In 2006, he joined the faculty of the Department of Computer Science, National Chiao Tung University (NCTU), Taiwan, and is currently an Associate Professor. He is now the Deputy Director of the NCTU M2M/IoT R&D Center. His research interests are in digital signal processing and learning computation algorithms, architectures, chips, systems and applications. He received the Best Paper Award at IEEE iThings2014. He also received the teaching award of the Computer Science College, NCTU, in 2014. Dr. Van served as Chairman of the IEEE NTU Student Branch in 2000, for which he received the IEEE Award for outstanding leadership and service. From 2009 to 2010, he served as an Officer of the IEEE Taipei Section. In 2014, he was a Track Co-Chair of the 22nd IFIP/IEEE VLSI-SoC. He was also a Track Co-Chair of the 2018 IEEE MWSCAS and a Special Session Co-Chair of the 2018 IEEE International Conference on DSP. Professor Van is an Associate Editor for the IEEE Transactions on Computers (2014–2018) and an Associate Editor for IEEE Access (2018–present).



**I-Hung Khoo** (M'03) received his Ph.D. in Electrical and Computer Engineering from the University of California Irvine in 2002. He is currently a professor in the Department of Electrical Engineering and the Department of Biomedical Engineering at California State University Long Beach. Previously, he was a Senior I. C. Design Engineer at Agilent/Avago Technologies. Dr. Khoo's research interests include high speed circuit design, analog & digital signal processing, and biomedical circuits and devices.



**Pei-Yu Chen** (S'10) was born in Taipei, Taiwan. She received the B.S. and M.S. degrees in engineering and system science from National Tsing Hua University (NTHU), Hsinchu, Taiwan, in 2004 and 2007, respectively. She received the Ph.D. degree from National Chiao Tung University (NCTU), Hsinchu, Taiwan, in 2018. Her research interests include multidimensional signal processing and VLSI design.



**Haranatha (Hari) C. Reddy** (M'77-SM'82-F'92-LF'10) received his Ph.D. in Electronics and Communications Engineering from Osmania University, Hyderabad, India, in 1974 and then did two years of post-doctoral research at Concordia University, Montreal, Canada. He is now a Professor (emeritus) at the California State University, Long Beach, California, and an adjunct professor at the University of Victoria, Canada. He has been a visiting chair professor at NCTU, Taiwan, since 2007. Earlier, he held visiting appointments at ETH Zurich, Switzerland; Concordia University, Montreal, Canada; and the University of California, Irvine, CA. His research interests and publications for the past 40 years have been in the area of multidimensional circuits, filters and signal processing. He is now a Life Fellow of the IEEE. Professor Reddy has over the years served the IEEE Circuits and Systems Society in many capacities, including serving as the President of the CAS Society in 2001. He is the recipient of the 2003 Meritorious Service Award from the IEEE CAS Society.

#### References

[1] A. Fettweis, "Symmetry requirements for multidimensional digital filters," *Int. J. Circuit Theory Appl.*, vol. 5, no. 4, pp. 343–353, 1977.

[2] D. E. Dudgeon and R. M. Mersereau, *Multidimensional Digital Signal Processing* (Prentice-Hall Signal Processing Series). Englewood Cliffs, NJ, USA: Prentice-Hall, 1984.

[3] W.-S. Lu and A. Antoniou, *Two-Dimensional Digital Filters* (Electrical Engineering and Electronics, No. 80). New York, NY, USA: M. Dekker, 1992, ch. 11, pp. xii, 398.

[4] E. I. Jury, V. Kolavennu, and B. Anderson, "Stabilization of certain two-dimensional recursive digital filters," *Proc. IEEE*, vol. 65, no. 6, pp. 887–892, 1977.

[5] P. Karivaratharajan and M. Swamy, "Quadrantal symmetry associated with two-dimensional digital transfer functions," *IEEE Trans. Circuits Syst.*, vol. 25, no. 6, pp. 340–343, 1978.

[6] S. Aly and M. Fahmy, "Symmetry exploitation in the design and implementation of recursive 2-D rectangularly sampled digital filters," *IEEE Trans. Acoust., Speech, Signal Process.*, vol. 29, no. 5, pp. 973–982, 1981.

[7] B. George and A. Venetsanopoulos, "Design of two-dimensional digital filters on the basis of quadrantal and octagonal symmetry," *Circuits Syst. Signal Process.*, vol. 3, no. 1, pp. 59–78, 1984.

[8] M. N. S. Swamy and P. K. Rajan, "Symmetry in 2-D filters and its application," in *Multidimensional Systems: Techniques and Applications*, S. G. Tzafestas, Ed. New York, NY, USA: Marcel Dekker, 1986, ch. 9.

[9] H. C. Reddy, I.-H. Khoo, and P. K. Rajan, "2-D symmetry: Theory and filter design applications," *IEEE Circuits Syst. Mag.*, vol. 3, no. 3, pp. 4–33, 2003.

[10] H. C. Reddy, I. Khoo, and P. K. Rajan, "Application of symmetry: 2-D polynomials, Fourier transform, and filter design," in *The Circuits and Filters Handbook*, 2009.

[11] M. A. Sid-Ahmed, "A systolic realization for 2-D digital filters," *IEEE Trans. Acoust., Speech, Signal Process.*, vol. 37, no. 4, pp. 560–565, 1989.

[12] N. R. Shanbhag, "An improved systolic architecture for 2-D digital filters," *IEEE Trans. Signal Process.*, vol. 39, no. 5, pp. 1195–1202, 1991.

[13] L.-D. Van, "A new 2-D systolic digital filter architecture without global broadcast," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 10, no. 4, pp. 477–486, 2002.

[14] R. M. Joshi, A. Madanayake, J. Adikari, and L. T. Bruton, "Synthesis and array processor realization of a 2-D IIR beam filter for wireless applications," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 20, no. 12, pp. 2241–2254, 2012.

[15] A. Madanayake, C. Wijenayake, D. G. Dansereau, T. K. Gunaratne, L. T. Bruton, and S. B. Williams, "Multidimensional (MD) circuits and systems for emerging applications including cognitive radio, radio astronomy, robot vision and imaging," *IEEE Circuits Syst. Mag.*, vol. 13, no. 1, pp. 10–43, 2013.

[16] P.-Y. Chen, L.-D. Van, H. C. Reddy, and C.-T. Lin, "A new VLSI 2-D diagonal-symmetry filter architecture design," in *Proc. IEEE Asia Pacific Conf. Circuits and Systems*, 2008, pp. 320–323.

[17] P.-Y. Chen, L.-D. Van, H. C. Reddy, and C.-T. Lin, "A new VLSI 2-D fourfold-rotational-symmetry filter architecture design," in *Proc. IEEE Int. Symp. Circuits and Systems*, 2009, pp. 93–96.

[18] I.-H. Khoo, H. C. Reddy, L.-D. Van, and C.-T. Lin, "2-D digital filter architectures without global broadcast and some symmetry applications," in *Proc. IEEE Int. Symp. Circuits and Systems*, 2009, pp. 952–955.

[19] I.-H. Khoo, H. C. Reddy, L.-D. Van, and C.-T. Lin, "Generalized formulation of 2-D filter structures without global broadcast for VLSI implementation," in *Proc. 53rd IEEE Int. Midwest Symp. Circuits and Systems*, 2010, pp. 426–429.

[20] P.-Y. Chen, L.-D. Van, I.-H. Khoo, H. C. Reddy, and C.-T. Lin, "Power-efficient and cost-effective 2-D symmetry filter architectures," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 58, no. 1, pp. 112–125, 2011.

[21] P.-Y. Chen, L.-D. Van, H. C. Reddy, and I.-H. Khoo, "Area-efficient 2-D digital filter architectures possessing diagonal and four-fold rotational symmetries," in *Proc. 9th Int. Conf. Information Communications and Signal Processing*, 2013, pp. 1–5.

[22] I.-H. Khoo, H. C. Reddy, L.-D. Van, and C.-T. Lin, "Design of 2-D digital filters with almost quadrantal symmetric magnitude response without 1-D separable denominator factor constraint," in *Proc. IEEE 56th Int. Midwest Symp. Circuits and Systems*, 2013, pp. 999–1002.

[23] I.-H. Khoo, H. C. Reddy, L.-D. Van, and C.-T. Lin, "General formulation of shift and delta operator based 2-D VLSI filter structures without global broadcast and incorporation of the symmetry," *Multidim. Syst. Signal Process.*, vol. 25, no. 4, pp. 795–828, 2014.

[24] P.-Y. Chen, L.-D. Van, I. Khoo, and H. C. Reddy, "New 2-D quadrantal- and diagonal-symmetry filter architectures using delta operator," in *Proc. IEEE 12th Int. Conf. ASIC (ASICON)*, 2017, pp. 1133–1136.

[25] P.-Y. Chen, L.-D. Van, H. C. Reddy, and I. Khoo, "Type-3 2-D multimode IIR filter architecture and the corresponding symmetry filter's error analysis," in *Proc. IEEE 12th Int. Conf. ASIC (ASICON)*, 2017, pp. 726–729.

[26] P. Y. Chen, L. D. Van, H. C. Reddy, and I. H. Khoo, "New 2-D filter architectures with quadrantal symmetry and octagonal symmetry and their error analysis," in *Proc. IEEE 60th Int. Midwest Symp. Circuits and Systems*, 2017, pp. 265–268.

[27] A. V. Oppenheim, R. W. Schafer, and J. R. Buck, *Discrete-Time Signal Processing*, 2nd ed. Upper Saddle River, NJ, USA: Prentice Hall, 1999, ch. 6, pp. xxvi, 870.

[28] N. H. E. Weste and D. M. Harris, *CMOS VLSI Design: A Circuits and Systems Perspective*, 3rd ed. Boston, MA, USA: Pearson, 2005, ch. 10, pp. xxiv, 967.