A Back-Propagation Neural Network for Recognizing Objects

—This paper proposes a neural network approach to recognizing two-dimensional objects. First, image preprocessing is applied to the object image to obtain its main features, and feature extraction is used to represent the object. The proposed method then combines fuzzy K-nearest neighbor clustering with a back-propagation neural network: the extracted features are first classified by the fuzzy K-nearest neighbor method to improve efficiency, and the clustered features are then identified by the back-propagation neural network. The experimental results show that recognition can be improved by using pre-classified features and that the method is effective for object recognition.


I. INTRODUCTION
In recent years, the application of machine vision in industry has developed rapidly, and its scope of use has become wider. For example, it is used in product quality control and in the automatic control of production processes; product identification and inspection, inspection of internal defects of parts, weld quality of workpieces, and similar tasks can all be combined with image processing. In the past, because technology and computers were not as developed as they are today, identification and inspection work was usually performed manually. In current production processes, however, the emphasis is on improving production efficiency and reducing production costs, and manual identification and inspection can no longer meet these requirements. Moreover, errors often occur in manual processing: the longer the working time or the poorer the working environment, the higher the probability of errors. Therefore, if production is automated and the complicated identification and inspection work is handed over to machines, the error rate will decrease, many unnecessary labor costs will be saved, and the degree of industrial automation will improve.
Machine vision is an integrated technology that combines image capture devices, computers, and artificial intelligence to simulate human behavior. A machine vision system with intelligent capabilities can instantly understand its environment, process work, and solve problems. At present, it is widely used in automation fields such as graphic recognition, visual inspection, and automated assembly. Through CCD cameras, machines can be connected to computers much like human eyes, serving as the bridge between the system and the external environment. In automatic identification, the most common industrial applications are the automatic identification of workpieces and parts; in addition, facial recognition, fingerprint recognition, speech recognition, and handwriting recognition are currently very popular. Automation offers many advantages, such as reducing personnel requirements, keeping inspectors out of hazardous work environments, speeding up inspection for a smoother production process, and improving inspection accuracy.
An important task in manufacturing is understanding the conditions of the environment, and machine vision systems meet this requirement for many manufacturing companies. The application of machine vision to the automatic identification and measurement of processed workpieces is therefore an important topic. Since the thickness of many mechanical parts is insignificant relative to their length and width, methods for identifying two-dimensional objects are important and the most commonly used [12].
Cozar et al. [5] used a generalized Hough transform to detect planar objects; their method uses a multi-resolution design to speed up the Hough transform computation, reducing both the computation time and the memory requirements. Lindsey and Stromberg [10] used the frequencies of simple features as a classifier; they claim that their p-gram technique provides a simple and straightforward image encoding that handles translation and scale variation.
In object recognition applications, there are generally two main problems to overcome: object representation and pattern matching. For the representation problem, Marshall [11] surveyed a number of shape-encoding methods. Neural networks and fuzzy classifiers are two powerful tools for pattern recognition. Abe and Lan [1] proposed a fuzzy classifier based on nearest neighbors. Bermejo and Cabestany [2] modified the k-nearest neighbor classifier and proposed a learning algorithm to reduce the number of data points to store. Cha and Srihari [3] used feature frequencies to reduce processing time; they proposed a method that can quickly eliminate most templates from the set of possible neighbors. Zhong et al. [4] used Hopfield neural networks to find polygonal approximations of shapes, and hierarchical neural networks for identification and classification. Hsieh and Chen [6] proposed a new cognitive neural network model that uses both supervised and unsupervised learning to recognize objects. Kim and Huntsberger [8] developed a self-organizing map neural network that combines the fuzzy c-means method for pattern recognition and classification.
Kim et al. [9] determined the best partition by a cluster index; the optimal number of clusters for a fuzzy partition can then be found with the fuzzy c-means algorithm. A good fuzzy partition should have low overlap and large separation between clusters. Remze et al. [13] proposed simplified linguistic fuzzy sets of labeled cubes, in which the optimal subset of fuzzy features can be found using traditional search techniques and the original set is represented instead of a fuzzy space. Sherif and Samee [14] used fuzzy c-means clustering rules to speed up the learning process of neural networks. Shi and He [15] adopted a Hopfield neural network with learning ability to recognize unknown objects. Stoeva and Nikov [16] proposed a fuzzy backpropagation algorithm and proved necessary and sufficient conditions for its convergence in a single-output network. Sun et al. [17] used a genetic algorithm to search the principal component analysis space for subsets of features. Wang [18] surveyed networks for pattern recognition, such as back-propagation neural networks and adaptive inference neural networks. Wu and Lin [19] used a two-stage method for recognizing partially occluded objects, in which a back-propagation neural network (BPNN) classifies the features. Zhang and Sun [22] used tabu search to solve the feature selection problem.
The ability to achieve object recognition quickly and instantly is very important as machine vision is increasingly adopted by industry. For example, in factory automation, in order to improve production efficiency and cooperate with automated production equipment, the use of machine vision for automatic identification has become a current trend. Research on object recognition can be divided into two categories: one compares and identifies the processed images directly; the other further analyzes the processed images to select the required features and then performs identification once the feature values are set. The second type of method is more flexible and more generally accepted.
A neural network imitates the information-processing mode of the human brain's nervous system, solving a task through the interaction of a large number of signals and processing units. Unlike an ordinary digital computer, it does not follow explicit rules but can self-correct and adjust, which is the ingenuity of the neural network. In general, neural networks have the following characteristics: (1) parallel processing, (2) error tolerance, (3) associative memory, and (4) computation without algorithmic programs.
Most traditional image recognition methods rely on sequential comparison, which requires a large amount of memory and a long comparison time. This paper proposes an object recognition method that uses a fuzzy classifier and a back-propagation neural network. In the first stage, the object features are classified into clusters by the fuzzy classifier to reduce the feature dimension. In the second stage, the clustered features are identified by the back-propagation neural network. The fuzzy classifier and the back-propagation neural network are introduced in the next section. Section III presents experimental results that show the ability of the proposed method, and some concluding remarks are given in the final section.

II. METHOD OF RECOGNIZING TWO-DIMENSIONAL OBJECTS
In the process of factory automation, in order to improve production efficiency and cooperate with automated production equipment, the use of machine vision for automatic identification has become a current trend. Fig. 1 shows a flowchart of the approach for 2D object recognition, which combines a fuzzy classifier and a back-propagation neural network. First, a CCD camera captures the image of the objects. Next, three image preprocessing steps are applied: image denoising, image thresholding, and edge detection. After preprocessing, we obtain the outline of the object, from which the dominant points are detected and features are extracted. The extracted features are then classified into several clusters by the fuzzy K-nearest neighbor clustering method. Finally, object recognition is performed by the back-propagation neural network. The following subsections describe the feature extraction, feature clustering, and object recognition techniques.

A. Feature Extraction Technique
The points with curvature extrema on a curve are considered the dominant points; they describe the curve appropriately for visual perception and identification. Since the dominant points on the object outline can represent the object, we use the information at the dominant points to find the features. A curvature-based polygon approximation method is used to detect dominant points on object boundaries [20].
In this paper, relative distance, length, angle, and the reciprocal of compactness are used as the features for object recognition (see Fig. 2). Suppose that Vi is the i-th dominant point and C is the center. The four features are introduced as follows [21].
1. Relative distance (RD): the relative distance between a vertex and the center. The i-th relative distance is di = ‖Vi − C‖ (see Fig. 2a).
2. Length (LE): the edge lengths of the polygon. Let li be the length of the segment ViVi+1 (see Fig. 2b).
3. Angle (AN): the interior angles of the polygon. Let Θi be the angle ∠Vi-1ViVi+1 (see Fig. 2b).
4. Reciprocal of compactness (RC): the i-th reciprocal of compactness, ri (see Fig. 2c), is defined in (1):

ri = ai / pi²   (1)

where pi and ai are the perimeter and the area of the triangle Vi-1ViVi+1, respectively.
It is important that the algorithm be robust: the features should be translation-, rotation-, and scale-invariant (TRS-invariant); otherwise, the proposed method will not achieve a good recognition rate. Therefore, to make the features TRS-invariant, each feature is normalized by dividing it by its maximum value.
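As a concrete illustration, the four contour features and their max-normalization can be sketched as follows. This is a minimal sketch, not the paper's code: the function name, the use of NumPy, and the choice of the vertex mean as the center C are our assumptions.

```python
import numpy as np

def polygon_features(points):
    """Compute the four contour features (RD, LE, AN, RC) from an
    ordered array of dominant points of shape (n, 2), normalized by
    their maxima so they are TRS-invariant. Illustrative sketch."""
    pts = np.asarray(points, dtype=float)
    center = pts.mean(axis=0)                      # polygon center C (assumed: vertex mean)

    # RD: distance from each vertex Vi to the center C
    rd = np.linalg.norm(pts - center, axis=1)

    # LE: length of each edge Vi -> Vi+1 (wrapping around)
    nxt = np.roll(pts, -1, axis=0)
    le = np.linalg.norm(nxt - pts, axis=1)

    # AN: interior angle at each vertex, angle(V(i-1), Vi, V(i+1))
    prv = np.roll(pts, 1, axis=0)
    a, b = prv - pts, nxt - pts
    cosang = (a * b).sum(axis=1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
    an = np.arccos(np.clip(cosang, -1.0, 1.0))

    # RC: reciprocal of compactness a_i / p_i^2 of triangle (V(i-1), Vi, V(i+1))
    p = np.linalg.norm(a, axis=1) + np.linalg.norm(b, axis=1) + np.linalg.norm(nxt - prv, axis=1)
    area = 0.5 * np.abs(a[:, 0] * b[:, 1] - a[:, 1] * b[:, 0])
    rc = area / p**2

    # divide each feature by its maximum to normalize it
    return {k: v / v.max() for k, v in
            {"RD": rd, "LE": le, "AN": an, "RC": rc}.items()}
```

Because every feature is a ratio of distances (or an angle), translating, rotating, or uniformly scaling the dominant points leaves the normalized feature values unchanged.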

B. Feature Clustering Technique
Many classification methods have been proposed in the past. The fuzzy c-means method classifies features by dividing a set of n samples into c clusters according to the m-dimensional feature vector space and the membership function so that the objective function is minimized. After classification, the Euclidean distance between samples in the same cluster is the shortest and the distance between the centers of different clusters is the largest, which determines the cluster to which each sample belongs. In this paper, the features are classified by the fuzzy K-nearest neighbor clustering method. The algorithm is similar to fuzzy c-means, except that the number of neighbors K must be set first: when judging whether a sample's Euclidean distance is the shortest, we first find the sample's K nearest neighbors, compute weights from these neighboring samples, and then compute the membership function to determine whether the sample belongs to the cluster.
The extracted feature values can first be classified into clusters to reduce the dimension of the input vectors. The fuzzy K-nearest neighbor clustering scheme proposed by Keller et al. [7] is used for feature clustering in this paper. The fuzzy K-nearest neighbor scheme is a simple and powerful classification method, in which samples are assigned to the cluster with the maximum membership. After classifying the features, the center of each cluster is used to represent the cluster. Fig. 3 shows an example of feature clustering: 15 features are classified into three clusters with 3, 5, and 7 features, respectively, and the dimension of the features is thus reduced from 15 to 3.
In order to illustrate the fuzzy K-nearest neighbors clustering algorithm, the following symbols should be defined.
n: number of vectors.
c: number of clusters.
K: number of nearest neighbors, 1 ≤ K ≤ n.
m: weight exponent in the membership function.
Si: set of the i-th cluster.
fj: the j-th feature vector.
dij: distance between the i-th and the j-th vectors.
KNNj: set of the K nearest neighbors of the j-th vector.
nij: number of vectors of the i-th cluster in KNNj.
uij: membership of the j-th vector in the i-th cluster.
The fuzzy K-nearest neighbor clustering method proceeds as follows:
Step 1. Set the values of m and K, and start with c = 1.
Step 2. Consider the next feature vector fj.
Step 3. Compute the distance between fj and each of the other feature vectors, defined as the norm of the difference of the two vectors:

dij = ‖fi − fj‖   (2)

Step 4. Identify the K smallest values of dij; the corresponding feature vectors form the set KNNj.
Step 5. Compute the membership uij of fj in each existing cluster, defined in (3):

uij = Σ(fk∈KNNj) wjk uik / Σ(fk∈KNNj) wjk   (3)

where the weights in the membership computation are the inverse-distance weights shown in (4):

wjk = 1 / djk^(2/(m−1))   (4)

Step 6. If the maximum membership is large enough, classify fj to the corresponding cluster; otherwise, increase c by 1 and classify fj to the new cluster.
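The membership computation in Steps 4 and 5 can be sketched as follows, in the style of Keller et al. [7]. This is an illustrative sketch under assumptions: the function name is ours, the neighbor labels are treated as crisp (uik ∈ {0, 1}), and the weights are the inverse-distance weights of (4).

```python
import numpy as np

def fuzzy_knn_memberships(features, labels, query, K=5, m=2.0):
    """Fuzzy K-nearest-neighbor membership of `query` in each cluster.

    features : (n, d) array of labeled feature vectors
    labels   : (n,) cluster index of each vector, 0..c-1
    query    : (d,) vector to classify
    Returns an array u of length c with the memberships u_i."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)

    d = np.linalg.norm(features - query, axis=1)   # distances d_ij, eq. (2)
    knn = np.argsort(d)[:K]                        # K nearest neighbors, KNN_j

    # inverse-distance weights w = 1 / d^(2/(m-1)), eq. (4)
    w = 1.0 / np.maximum(d[knn], 1e-12) ** (2.0 / (m - 1.0))

    c = labels.max() + 1
    u = np.zeros(c)
    for i in range(c):
        in_cluster = labels[knn] == i              # neighbors belonging to cluster i
        u[i] = w[in_cluster].sum() / w.sum()       # eq. (3) with crisp neighbor labels
    return u
```

The query is then assigned to the cluster with the maximum membership, `np.argmax(u)`, as described in the text.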

C. Back-Propagation Neural Network Technique
The back-propagation neural network is the most representative and widely used neural network model at present. Its basic principle is to use the steepest gradient descent method to minimize an error function, which makes it suitable for diagnosis, prediction, classification, and identification applications [18]. This paper uses a back-propagation neural network with one hidden layer for object recognition. Fig. 4 shows a typical model of a back-propagation neural network. The model used in this paper is a supervised, three-layer architecture consisting of an input layer, a hidden layer, and an output layer. For example, the input, hidden, and output layers in Fig. 4 have 4, 5, and 3 nodes, respectively.
To find the optimal solution, the back-propagation neural network uses the training data to adjust the weights between the input, hidden, and output layers during training. Training may consist of several learning cycles; in each cycle, all training data are used to adjust the weights one by one. To evaluate the convergence of the network, an error function is used to represent the learning effect. Usually, it is defined as the sum of squared differences between the output vectors and the expected output vectors over the training data.
Further, an error tolerance is set to determine whether the training process should stop: training stops when the error is smaller than the tolerance; otherwise, a new learning cycle is started.
The above training process is repeated until the error is smaller than the error tolerance. Suppose there are M training samples and the output layer has N nodes, and let Tki and Oki be the expected value and the output value of the k-th node for the i-th training sample. The error function is defined in (5):

E = Σ(i=1..M) Σ(k=1..N) (Tki − Oki)²   (5)
The parameter η is the learning rate, ranging between 0 and 1. In addition, a dynamic learning rate is used, defined in (6):

η(t) = η(1) / t   (6)

where t is the number of learning cycles and η(1) is the initial learning rate.
The learning process is performed one training example at a time until all training examples have been presented, which constitutes one learning cycle. Because the inferred output values should be as close as possible to the target output values, that is, the error function should fall below a reasonable range, learning is repeated for several cycles until the network converges. To test for convergence, the error function between the target values and the inferred values of the output units represents the learning quality of the network. As the number of learning cycles increases, the learning rate is decreased to avoid large changes in the error function. The momentum α lies between -1 and 1 and is used to shorten the training time. The settings of the learning rate and the momentum influence the number of cycles in the training stage, and they are evaluated in the experiments.
The object recognition problem is to find which objects are present in an image. Therefore, we use the features of isolated objects as training data during the training process, and the dimension of the output vector equals the number of object types. The expected output vector Tk = (Tk1, Tk2, …, TkN) for the k-th training sample is defined in (7):

Tki = 1 if i = k, and Tki = 0 otherwise,   (7)

for k = 1, 2, …, N.
After training, we have the optimal weight matrices, and the back-propagation neural network is used to identify objects. The extracted features are represented first, and the network produces an output vector O = (O1, O2, …, ON), where Oi is the probability that the i-th object appears in the image; we can therefore find objects in an image from the output vector. The following steps summarize the proposed back-propagation neural network with a fuzzy classifier for two-dimensional object recognition.
Step 1. Find the features of the template objects. Find the best weights by training these features using the BPNN algorithm.
Step 2. Find features from the dominant points of objects.
Step 3. Classify features by using fuzzy k-nearest neighbors.
Step 4. Recognize the objects by using these clustered features and the BPNN algorithm.

III. ANALYSIS OF EXPERIMENTS
In order to evaluate the proposed method, nine hand tools were used in the experiment (see Fig. 5). A recognition algorithm should be robust to translation, rotation, and scale, so an experiment was conducted to test the robustness of the proposed method. There are four size levels (100%, 75%, 50%, and 25%) and eight orientation levels (0, 45, 90, 135, 180, 225, 270, and 315 degrees) for each object in the experiment. Therefore, each object has 32 (=4x8) test images, and the number of training data is 288 (=32x9). Image preprocessing and dominant point detection are performed on each image to find its contour and features. The four features extracted from the contour are the relative distance (RD), length (LE), angle (AN), and reciprocal of compactness (RC). To evaluate the effects of these four features, 15 different feature settings, namely RD, LE, AN, RC, RD&LE, RD&AN, RD&RC, LE&AN, LE&RC, AN&RC, RD&LE&AN, RD&LE&RC, RD&AN&RC, LE&AN&RC, and RD&LE&AN&RC, are used as features for object recognition. During the fuzzy K-nearest neighbor clustering process, K is set to n/2 and m=2. The nine single objects are used as training data during the training process, and the error tolerance is set to ε=0.001. The fuzzy K-nearest neighbor clustering algorithm classifies the 288 extracted feature sets into several clusters, and the cluster centers, which represent the respective clusters, are used as inputs to the back-propagation neural network. The weights obtained during training are then used for object recognition.
In addition, to evaluate the robustness of the proposed method, both the clustered and the un-clustered features were tested in the recognition process. For the un-clustered feature approach, to use the features directly as input vectors to the BPNN, the number of dominant points is constrained to be at most 100, so the number of input nodes is set to 100. Since there are 9 tools, the number of output units is set to 9.
Therefore, the numbers of input, hidden, and output units are 100, 30, and 9, respectively, for the non-clustered feature method. The fuzzy clustering process reduces the number of features used in the BPNN and hence the dimensionality of the input layer: the number of clusters per training object is less than 40, so for the clustered feature method the numbers of input, hidden, and output units in the BPNN are 40, 20, and 9, respectively. In addition, the clustered features are sorted in ascending order before being used as input vectors to the BPNN; this sorting resolves the ordering variation of dominant point detection. The smaller numbers of input and hidden units in the clustered feature method mean that it requires less memory.
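The preparation of a fixed-length BPNN input vector from the cluster centers, as described above, can be sketched as follows. The function name and the zero-padding of unused input units are our assumptions.

```python
import numpy as np

def make_input_vector(cluster_centers, n_inputs=40):
    """Build a fixed-length BPNN input vector from cluster centers:
    sort ascending (to stabilize against variation in dominant point
    detection) and zero-pad up to the input-layer size. Sketch only."""
    v = np.sort(np.asarray(cluster_centers, dtype=float))
    if len(v) > n_inputs:
        raise ValueError("more cluster centers than input units")
    out = np.zeros(n_inputs)
    out[:len(v)] = v        # unused input units stay at zero (assumed)
    return out
```

Sorting makes the input vector independent of the order in which clusters were formed, so the same object always presents the same pattern to the network.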
In the first experiment, the feature set RD&LE&AN was used for recognition, the learning rate η ranged from 0.05 to 0.95, and the momentum α ranged from 0.1 to 0.9. The clustered feature approach was used to find the smallest number of learning cycles in the training stage. Fig. 6 shows the learning cycles under different values of η and α: the larger the learning rate, the smaller the number of learning cycles. From this test, we can find the best combination of learning rate and momentum. The number of learning cycles has its smallest value, 122, when η=0.95 and α=0.70, for which the BPNN has the best convergence rate. The BPNN with the un-clustered feature approach was trained in a similar way to find its best parameter set.
After training the back-propagation neural network, in the second experiment, 100 test images in arbitrary orientations and positions for each tool shown in Fig. 5 were used to evaluate the recognition rates; therefore, there were 900 (=100x9) test images. For convenience, the learning rate was set to 0.95 and the momentum to 0.7, the values found to give the smallest number of learning cycles in the first experiment. Fig. 7 shows the recognition rates and learning cycles for the different feature sets and the two feature handling methods. The clustered feature method gives the best result when using the features RD&LE&AN, while the non-clustered feature method gives the best recognition rate when using the features LE&AN&RC. The results show that the recognition rate can be improved by using two or three features instead of a single feature. Furthermore, the proposed clustered feature method outperforms the non-clustered feature method in Fig. 7(a): most of the clustered feature settings have a better recognition rate than the non-clustered ones, indicating that the fuzzy classifier improves the recognition rate. The best recognition rates for the clustered and non-clustered feature methods are about 95% and 89%, respectively.
For the learning cycles shown in Fig. 7(b), it can also be seen that the clustered feature method converges faster than the non-clustered feature method.
The small number of learning cycles obtained by the proposed method is due to the features having already been clustered, so fewer inputs are presented to the BPNN. Although the clustered feature method requires the features to be clustered before they can be used as input vectors, the proposed method clearly has lower memory requirements than the non-clustered feature method. Table I presents the recognition rate of each tool using clustered and non-clustered features. The method using clustered features as the input vector of the back-propagation neural network achieves higher recognition rates than the method using non-clustered features. In addition, the results show that the proposed method needs fewer learning cycles. That is, using fuzzy K-nearest neighbor clustering to reduce the dimensionality of the features improves the performance of object recognition.
Furthermore, the misclassifications are mainly due to the similarity between tools T7 and T9, and the proposed method shows the best improvement in recognizing these two hand tools. It is clear from Table I that the proposed object recognition method can identify objects effectively.
In addition, in practical applications, it is often necessary to identify one or more objects from among many. Because of the image acquisition, objects in front and behind often partially cover each other, and some features change from image to image, so the recognition of partially overlapping objects is also a very important topic.
To verify that the method can recognize partially occluded objects effectively, an experiment was conducted. Fig. 8 shows an example of a partially occluded object. 100 images were acquired for each occluded tool, so there are 900 (=100x9) test images in the experiment.
From the experimental results shown in Fig. 9, it can be seen that partially occluded objects can be identified effectively by the proposed method. Most of the clustered feature settings again have a better recognition rate than the non-clustered ones; Fig. 9 indicates that the fuzzy classifier improves the recognition rate. The clustered feature method performs best when using the features RD&AN&RC, while the non-clustered feature method has its best recognition rate when using the features RD&LE&AN. The best recognition rates for the clustered and non-clustered feature methods are about 90% and 82%, respectively.

IV. CONCLUSIONS
Most traditional image recognition methods rely on sequential comparison, which requires a large amount of memory and a long comparison time. This paper uses fuzzy K-nearest neighbor clustering and a back-propagation neural network to solve the problem of identifying single-object images and overlapping-object images, proposing a back-propagation neural network with a fuzzy classifier for 2D object recognition. Using the fuzzy K-nearest neighbor cluster analysis method, the extracted features are classified to reduce the amount of data to be processed during identification, and redundant and repeated data are removed so that repeated features do not cause incorrect identification results. The classified features reduce the dimensionality of the input vector.
In the second stage, the clustered features are used as the input vector of the back-propagation neural network to identify objects. Four features are used for identification to find the best combination. The template objects are used for training to find the optimal weights of the back-propagation neural network. The components of the network's output vector represent the probabilities of the corresponding objects appearing in the image, so the object is identified as the component with the maximum value in the output vector. The best settings for the learning rate and momentum of the back-propagation neural network must also be found.
The experimental results show that clustering the features improves the recognition rate and shortens the training period. Furthermore, the proposed algorithm has been applied to identify partially occluded objects, and the results show that it is effective.