key: cord-0060981-boheism0 authors: Xin, Qi; Hu, Shaohai; Liu, Shuaiqi; Lv, Hui; Cong, Shuai; Wang, Qiancheng title: Fruit Image Recognition Based on Census Transform and Deep Belief Network date: 2020-06-13 journal: Multimedia Technology and Enhanced Learning DOI: 10.1007/978-3-030-51103-6_39 sha: a784e947cc235b1a9eec4ca49c1a143d656dce50 doc_id: 60981 cord_uid: boheism0
Fruit image recognition plays an important role in the fields of smart agriculture and digital medical treatment. A deep belief network (DBN) ignores the local structure of an image and therefore has difficulty learning its local features, and fruit images are also affected by changes in illumination. To address both problems, we propose a new fruit image recognition algorithm based on the Census transform and the DBN. First, the texture features of fruit images are extracted by the Census transform. Second, the DBN is trained on the Census features of the fruit images. Finally, the DBN is used for fruit image recognition. The experimental results show that the proposed algorithm has a strong feature learning ability and that its recognition performance is better than that of the traditional recognition algorithm.
The sharp increase in the amount of image data means that general image recognition tasks involve more and more images, which makes it difficult for traditional methods to meet practical needs. As a new subject in machine learning, deep learning has produced many achievements in various fields. Compared with hand-crafted feature extraction, the features acquired through a deep learning model better represent the rich internal information of big data, and good features are learned automatically without manual feature extraction. Deep learning is therefore expected to receive more attention in big data analytics [1] [2] [3].
Fruit image recognition plays an important role in the fields of smart agriculture and digital medical treatment [4, 5]. With the rapid development of both fields in recent years, it has attracted growing research attention. To meet the need for large-scale, efficient fruit recognition and classification, researchers have applied a variety of algorithms to fruit images. For example, the authors of [6] proposed a fruit classification method based on a six-layer convolutional neural network, whose classification accuracy is higher than that of traditional single-feature methods. In [7], the authors presented an automatic fruit recognition system that classifies fruit types and recognizes the fruit name with a high degree of accuracy. In [8], the authors proposed a method for recognizing kiwifruit at night based on fruit calyx images, with a recognition rate of 94.3%. In [9], the authors proposed a fast and accurate object recognition method aimed especially at fruit recognition in mobile environments. They combined color, shape, texture and intensity into their associated code fields to generate an object code that can be used as a search key for the feature database. In [10], the authors gave a fruit recognition method via image conversion optimized through evolution strategy principal component analysis, which achieved an average recognition rate of over 92% through the preprocessing, training and recognition of fruit images. In general, fruit image recognition systems focus mainly on preprocessing and feature extraction. In such systems, fruit images are mostly acquired by placing the collected fruit against a strictly defined background, so that the system suffers less outside interference and its recognition accuracy improves.
However, images captured in real environments are easily affected by factors such as illumination change, fruit reflection and occlusion, which reduce the recognition accuracy of fruit images to varying degrees. In a fruit recognition system, the features of fruit mainly include odor, color, shape and texture. During growth, different environments lead to differences in shape, size and color. In addition, the natural light intensity and shadows differ when fruit images are collected, which also affects the accuracy of image recognition. Moreover, the complexity of the color and texture features of fruit images makes recognition more difficult. Better recognition algorithms are therefore needed.
As a representative deep learning method, the DBN differs considerably from earlier algorithms in its training method and structure. By adopting the idea of layered training, the training speed of the DBN is greatly improved [11, 12]. The layered structure also increases the system's ability to express complex functions. A DBN usually takes pixel-level images as input and extracts the abstract features of the input images from bottom to top, from simple to complex, automatically mining the useful information in the data. However, pixel-level images are easily affected by illumination and other factors, which hampers the extraction of the essential features of the input samples in the DBN. To improve the fruit image recognition performance of the DBN under different illumination, we propose a new method that combines the Census transform with the DBN: the texture features of the images are extracted through the Census transform to eliminate the influence of non-uniform lighting on the recognition results.
The Census transform [13] is a local, non-parametric transform that is mainly used to characterize the local structure of images and can reliably detect their edge and corner features. It uses the relationship between the gray values of the neighborhood pixels and that of the center pixel as the basis of similarity, which eliminates the influence of non-uniform lighting. The non-parametric transform is also simple to implement and runs in real time, so it has been widely applied in engineering practice. The algorithm flow is shown in Fig. 1. In Fig. 1, a 3 × 3 pixel matrix is taken as the mask; its central gray value is 127 and the gray values of the neighborhood pixels are as shown in the figure. The gray value of each neighborhood pixel is compared with that of the center pixel, and the eight resulting binary digits are combined in sequence (from top to bottom, from left to right) into an 8-bit binary string. Converting this string into a decimal number gives the gray value of the center pixel after the Census transform, which lies in the range [0, 255]. Because the Census transform replaces the original gray value of every pixel with its Census transform value, these values constrain and correlate with one another [14]. Through these correlations, local edge information is passed between neighboring values, so the Census transform values implicitly carry global feature information. In addition, the transformed values store information in a fixed order, which preserves the texture structure between the local neighborhoods of the image, so that the global and local features of the image are not damaged and the transformed features are easy to distinguish [15]. We therefore extract fruit features with this algorithm.
The deep belief network [16] is a deep learning model with an efficient learning algorithm, proposed by Hinton. It combines unsupervised and supervised machine learning models and has since become a main framework for deep learning algorithms.
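The 3 × 3 Census transform described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. Several details are assumptions, because the text does not fix them: a bit is set to 1 when the neighbor is strictly darker than the center, the neighbors are read row by row from the top-left, and image borders are handled by edge replication. The neighborhood values in `patch` are hypothetical and are not those of Fig. 1; only the central value 127 comes from the text.

```python
import numpy as np


def census_transform(gray: np.ndarray) -> np.ndarray:
    """3x3 Census transform of a 2-D grayscale image (returns uint8 codes)."""
    img = gray.astype(np.int32)
    rows, cols = img.shape
    # Assumption: replicate the border so every pixel has eight neighbours.
    padded = np.pad(img, 1, mode="edge")
    codes = np.zeros((rows, cols), dtype=np.int32)
    # Visit the eight neighbours row by row, left to right, skipping the centre.
    offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
               if (dy, dx) != (0, 0)]
    for dy, dx in offsets:
        neighbour = padded[1 + dy:1 + dy + rows, 1 + dx:1 + dx + cols]
        # Assumed convention: bit = 1 when the neighbour is darker than the centre.
        bit = (neighbour < img).astype(np.int32)
        codes = (codes << 1) | bit
    return codes.astype(np.uint8)


# Hypothetical 3x3 mask with central gray value 127.
patch = np.array([[130, 90, 200],
                  [127, 127, 40],
                  [60, 180, 126]], dtype=np.uint8)
print(census_transform(patch)[1, 1])  # bits 01001101 -> 77
```

With the opposite comparison convention the bit string is complemented wherever the neighbor differs from the center, so the codes change but the local ordering information they encode stays the same.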
A deep belief network is built by stacking several restricted Boltzmann machines (RBMs). The RBM [17] is derived from the Boltzmann machine, a generative stochastic neural network based on the principles of statistical mechanics. An RBM consists of two layers, a visible layer and a hidden layer. Neurons within the same layer are not connected, while neurons in different layers are fully connected. As shown in Fig. 2, the entire RBM network is a bipartite graph, where $v = (v_1, v_2, \ldots, v_{n_v})^T$ and $h = (h_1, h_2, \ldots, h_{n_h})^T$ are the state vectors of the visible layer and the hidden layer, $v_i$ is the state of the $i$-th neuron in the visible layer, $h_j$ is the state of the $j$-th neuron in the hidden layer, and $n_v$ and $n_h$ are the numbers of neurons in the visible layer and the hidden layer, respectively. Let $a = (a_1, a_2, \ldots, a_{n_v})^T \in \mathbb{R}^{n_v}$ and $b = (b_1, b_2, \ldots, b_{n_h})^T \in \mathbb{R}^{n_h}$ be the bias vectors of the visible layer and the hidden layer, where $a_i$ is the bias of the $i$-th neuron in the visible layer and $b_j$ is the bias of the $j$-th neuron in the hidden layer. $W = (w_{i,j}) \in \mathbb{R}^{n_h \times n_v}$ is the weight matrix between the hidden layer and the visible layer, where $w_{i,j}$ is the connection weight between the $i$-th neuron in the hidden layer and the $j$-th neuron in the visible layer. Even when the model parameters $w_{ij}$, $a_i$ and $b_j$ have been obtained through training, the distribution determined by these parameters still cannot be computed efficiently. However, the special structure of the RBM makes the neurons within one layer conditionally independent given the state of the other layer.
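For reference, the standard energy-based formulation of a binary RBM shows why this distribution is hard to compute. The source does not spell it out, so it is assumed here, and the symbols $E$ and $Z$ are introduced only for illustration. The weights are indexed hidden-first, matching $W \in \mathbb{R}^{n_h \times n_v}$:

```latex
E(v, h) = -\sum_{i=1}^{n_v} a_i v_i - \sum_{j=1}^{n_h} b_j h_j
          - \sum_{j=1}^{n_h} \sum_{i=1}^{n_v} h_j \, w_{j,i} \, v_i

P(v, h) = \frac{e^{-E(v, h)}}{Z}, \qquad
Z = \sum_{v} \sum_{h} e^{-E(v, h)}
```

The normalizing constant $Z$ sums over all $2^{n_v + n_h}$ joint states, which is what makes the exact distribution intractable. Because no term of $E$ couples two neurons of the same layer, conditioning on one layer factorizes the distribution over the other.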
When the states of all neurons in the visible layer are given, the probability that the $j$-th neuron in the hidden layer is activated (that is, its value is set to 1) is
$$P(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_{i=1}^{n_v} w_{j,i} v_i\Big) \tag{1}$$
where $w_{j,i}$ is the weight between the $j$-th hidden neuron and the $i$-th visible neuron, and $\sigma(\cdot)$ is the Sigmoid activation function, defined as
$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{2}$$
Similarly, we can get
$$P(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_{j=1}^{n_h} w_{j,i} h_j\Big) \tag{3}$$
The relationship between the values of all nodes in a layer and the value of a single node is:
$$P(h \mid v) = \prod_{j=1}^{n_h} P(h_j \mid v), \qquad P(v \mid h) = \prod_{i=1}^{n_v} P(v_i \mid h) \tag{4}$$
The main task of RBM learning is to find the optimal parameters $\theta = \{a_i, b_j, w_{ij}\}$ that fit the distribution of the given data samples. Assuming the training data sample to be $S = \{v^{(1)}, v^{(2)}, \ldots, v^{(t)}\}$, we usually obtain the RBM parameters $\theta$ by maximizing the RBM log-likelihood $L(\theta)$ on the training data:
$$L(\theta) = \sum_{i=1}^{t} \log P(v^{(i)} \mid \theta) \tag{5}$$
To obtain the optimal parameters $\theta$, we use the stochastic gradient ascent method to find the maximum of $\sum_{i=1}^{t} \log P(v^{(i)} \mid \theta)$.
The parameter settings adopted in this paper are as follows. The fruit images in the training and test data are 32 × 32 pixels, the image pixels are taken directly as the input of the DBN, and the input layer of the DBN is defined as 1024 units. To improve computational efficiency and reduce the sampling error of the gradient estimate, it is necessary to use batch learning and to divide the data set in advance into small batches containing dozens or hundreds of samples. Because the data samples are limited, each batch in this paper is set to 10 samples. The fine-tuning learning rate of the RBM is set to 0.01 and the number of iterations is set to 100. The activation function adopted by the neural network is the Sigmoid function. Because the DBN itself performs feature learning rather than classification, we connect a softmax regression classifier to the last layer to classify the abstract feature attributes (the optimal network weights of the hidden layer) that the DBN eventually learns.
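The RBM conditionals and the gradient-ascent training described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The text says only that stochastic gradient ascent is used, so the update rule here is one-step contrastive divergence (CD-1), a common approximation to the log-likelihood gradient that is swapped in for illustration. The class name `RBM`, its method names, the weight initialization and the hidden size of 500 are also assumptions. The input size (1024), batch size (10) and learning rate (0.01) follow the settings given in the text.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Binary RBM with W of shape (n_hidden, n_visible), as in the text."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_hidden, n_visible))
        self.a = np.zeros(n_visible)  # visible biases
        self.b = np.zeros(n_hidden)   # hidden biases

    def p_h_given_v(self, v):
        # P(h_j = 1 | v) = sigmoid(b_j + sum_i w_ji v_i), one row per sample
        return sigmoid(self.b + v @ self.W.T)

    def p_v_given_h(self, h):
        # P(v_i = 1 | h) = sigmoid(a_i + sum_j w_ji h_j)
        return sigmoid(self.a + h @ self.W)

    def cd1_step(self, v0, lr=0.01):
        """One CD-1 update on a mini-batch; returns the reconstruction error."""
        ph0 = self.p_h_given_v(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)  # sample hidden states
        pv1 = self.p_v_given_h(h0)                             # reconstruction
        ph1 = self.p_h_given_v(pv1)
        n = v0.shape[0]
        # Ascend the approximate log-likelihood gradient.
        self.W += lr * (ph0.T @ v0 - ph1.T @ pv1) / n
        self.a += lr * (v0 - pv1).mean(axis=0)
        self.b += lr * (ph0 - ph1).mean(axis=0)
        return float(np.mean((v0 - pv1) ** 2))


rng = np.random.default_rng(0)
rbm = RBM(n_visible=1024, n_hidden=500, rng=rng)  # 32 x 32 inputs
batch = rng.random((10, 1024))                    # stand-in for 10 scaled images
for _ in range(5):
    err = rbm.cd1_step(batch, lr=0.01)
print(rbm.p_h_given_v(batch).shape)  # (10, 500)
```

In a DBN, the hidden activations of one trained RBM would serve as the visible input of the next, with the softmax classifier attached to the final layer.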
The experiments show that the identification accuracy of the whole model is highest when the number of DBN layers is 2 and the number of hidden units is 500.
Based on the advantages of the Census transform and the deep belief network, we propose a fruit image recognition algorithm that combines the two. The flow chart of the algorithm is shown in Fig. 3. The detailed steps of the proposed algorithm are as follows. First, the texture features of the fruit images are extracted through the Census transform to eliminate the influence of lighting changes on feature extraction. The Census transform effectively retains the local features of the image, which lets the deep belief network learn those local features and reduces its learning of unfavorable feature descriptions. Second, the Census features of the fruit images are used to train the deep belief network and obtain the relevant network parameters. Finally, the deep belief network is used for fruit image recognition.
Because the fruit images in our dataset were collected under three different lighting environments, and the fruit reflects light differently under each lighting condition, the recognition results of the DBN are affected by non-uniform lighting. The Census transform is used to eliminate this influence. Figure 4 shows an example, taken from our fruit data, of the illumination invariance of the Census transform. Although the illumination differs between Fig. 4(a) and Fig. 4(b), the Census-transformed images in Fig. 4(c) and Fig. 4(d) are almost the same. To better illustrate the effectiveness of the proposed algorithm, named Census+DBN, different recognition algorithms are also applied to fruit image recognition: the CNN proposed in [6], the DBN proposed in [17] and the Census+Softmax proposed in [20]. The experimental results are shown in Table 1.
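The illumination invariance illustrated by Fig. 4 can be checked on synthetic data. The sketch below is an idealized illustration, not a reproduction of the paper's experiment. It assumes a global brightness change modeled as a positive gain plus an offset, which preserves the ordering of gray values exactly. Real lighting changes are only approximately of this form, which is consistent with the transformed images in Fig. 4 being almost, rather than exactly, the same. The comparison convention and border handling in `census_transform` are also assumptions.

```python
import numpy as np


def census_transform(gray: np.ndarray) -> np.ndarray:
    """3x3 Census transform; bit = 1 where a neighbour is darker than the centre."""
    img = gray.astype(np.int32)
    rows, cols = img.shape
    padded = np.pad(img, 1, mode="edge")  # assumed border handling
    codes = np.zeros((rows, cols), dtype=np.int32)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dy, dx) == (0, 0):
                continue
            neighbour = padded[1 + dy:1 + dy + rows, 1 + dx:1 + dx + cols]
            codes = (codes << 1) | (neighbour < img).astype(np.int32)
    return codes.astype(np.uint8)


rng = np.random.default_rng(0)
# Synthetic 32 x 32 "image" with gray values in [0, 99].
base = rng.integers(0, 100, size=(32, 32), dtype=np.uint8)
# Simulated lighting change: gain 2, offset 30 (stays within [30, 228]).
brighter = (2 * base.astype(np.int32) + 30).astype(np.uint8)

same = np.array_equal(census_transform(base), census_transform(brighter))
print(same)  # True: the pixel ordering, and hence every Census code, is unchanged
```

The pixel values of `base` and `brighter` differ everywhere, yet their Census codes are identical, so a DBN trained on the codes sees the same input under both lighting conditions.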
It can be seen from Table 1 that CNN and Census+DBN both achieve good results in fruit image recognition, with recognition accuracies of 0.94 and 0.98 and false recognition rates of 0.84% and 0.17%, respectively, which indicates that Census+DBN is as robust to lighting as CNN. Comparison with the results of the Census+Softmax method shows that the deep learning method has a strong feature learning ability and that its recognition performance is better than that of the traditional recognition algorithm. Of the two deep learning methods used in this paper, CNN achieves a recognition accuracy of 0.94 and a false recognition rate of 0.84%, but its training time is too long: one CNN iteration takes 160 s without GPU acceleration in our experiment. Although CNN converges within a few iterations, it cannot meet the requirements of practical application. DBN takes only 2.57 s per iteration, but illumination variation affects its feature extraction and therefore the recognition performance of the algorithm. In this paper, we effectively eliminate the influence of non-uniform lighting on the image by using the Census transform. The resulting recognition performance is better than that of CNN, and the iteration time is 1.75 s, since the Census transform is a fast, parameter-free transform. The proposed fruit image recognition method, which combines the Census transform and DBN, can therefore satisfy the requirements of real-time applications.
In this paper, we propose a new fruit recognition algorithm that combines the Census transform and DBN. The Census transform is adopted to extract texture features from fruit images, and the extracted features are taken as the input of the DBN. The feasibility and effectiveness of the method are demonstrated through experiments. However, our method does not consider other features of fruit images such as color, shape, texture and intensity.
In the future, we can add these features to improve the performance of our algorithm.
[1] Multi-focus image fusion based on residual network in nonsubsampled shearlet domain
[2] Convolutional neural network and guided filtering for SAR image denoising
[3] Fast density peak clustering for large scale data based on kNN. Knowledge-Based System
[4] Date fruit dataset for intelligent harvesting
[5] Nutrients balance for hydrogen potential upgrading from fruit and vegetable peels via fermentation process
[6] Fruit classification based on six layer convolutional neural network
[7] Automatic fruit classification using random forest algorithm
[8] Kiwifruit recognition method at night based on fruit calyx image
[9] A code based fruit recognition method via image convertion using multiple features
[10] A fruit recognition method via image conversion optimized through evolution strategy
[11] Reducing the dimensionality of data with neural networks
[12] Machine learning & artificial intelligence in the quantum domain: a review of recent progress
[13] Non-parametric local transforms for computing visual correspondence
[14] A novel non-parametric transform stereo matching method based on mutual relationship
[15] An efficient implementation of a census-based stereo matching and its applications in medical imaging
[16] Fruit recognition from images using deep learning
[17] Deep belief network modeling for automatic liver segmentation
[18] Representational power of restricted Boltzmann machines and deep belief networks
[19] Exploring strategies for training deep neural networks
[20] Automatic fruit and vegetable recognition based on CENTRIST and color representation