key: cord-0060961-k3f7os3p
authors: Juxiao, Zhang; Xiaoqin, Zeng; Zhaosong, Zhu
title: Multi-touch Gesture Recognition of Braille Input Based on RBF Net
date: 2020-06-13
journal: Multimedia Technology and Enhanced Learning
DOI: 10.1007/978-3-030-51103-6_35
sha: 9ac73aeadd0e03159a05e6a7e31ea5ee7fcdfccb
doc_id: 60961
cord_uid: k3f7os3p

One challenging task for the blind is to input Braille on touch screens, where they have no way of sensing location information. Existing Braille input methods suffer from inaccurate positioning and a lack of interactive prompts. In this paper, touch gestures are recognized by a trained RBF network and combined gestures are modelled; on this basis, Braille input via multi-touch gesture recognition is implemented. The experimental results show that the method is effective and that blind users can input Braille comfortably with near real-time interaction.

In recent years, although many improvements have been made in the accessibility of smartphones and other electronic devices for the visually impaired, there is still a huge gap between blind people and touch-screen devices in human-computer interaction. At present, the output stage mostly uses Text-to-Speech (TTS) technology or external point-input devices [1], while the main input methods on touch-screen mobile phones include Pinyin [2], stroke [3], speech recognition [4] and online handwriting recognition [5]. Examples include the multi-touch input method for the blind [1], the single-finger touch gesture input method [6] and the touch-action recognition input method [7]. Other common Braille input methods on touch screens include Perkinput [8], TypeInBraille [9] and BrailleTouch [10]. Touch-screen input for the blind has two major shortcomings: inaccurate localization and the lack of interactive prompts.
The biggest difficulty for blind people using a touch screen is that they cannot obtain location information on the screen, nor can they perform operations that rely on visual focus [11]. The keys of the virtual keyboards used by existing input methods are also too close to each other, which further increases the input error rate [12]. The interactive-prompt barrier means that the timing and focus of fingers sliding on the touch screen cannot be fed back to blind users.

To address these problems, a Braille input method based on multi-touch gesture recognition with an RBF network is designed. The method integrates Braille phonetic notation, which makes it possible to complete Braille input and output through voice input and Text-to-Speech (TTS) conversion. In addition, when Braille gestures are entered on the touch screen, the method reduces orientation interference and preserves the strong inherent logic of Braille, making it easy for blind users to learn and remember. It is therefore well suited for blind users entering Braille on touch-screen devices, and it provides a new approach to human-computer interaction between blind people and touch screens.

The main algorithm modules include the multi-touch area loaded on the electronic device, the module that builds the multi-touch-area graphic information library from this area, the recognition module for that library, and the Braille input module. The basic graphic features of the multi-touch area are established from the positions of the four upper dots of the Braille cell together with the graphic information of the multi-touch area. The graphic information library of the multi-touch area is then constructed by adding one- or two-finger touches to the basic graphic features, and a correspondence table between the graphic information of the multi-touch area and the Braille dots is generated.
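The correspondence table between multi-touch-region patterns and Braille cells described above can be sketched as follows. The concrete encoding, the function names, and the use of boolean flags for the extra touches that select dots 3 and 6 are illustrative assumptions, since the paper does not reproduce its actual table; the standard Braille dot numbering (dots 1–3 in the left column, 4–6 in the right) and the Unicode Braille Patterns block are standard facts.

```python
# Hypothetical sketch of the multi-touch-region -> Braille correspondence.
# The four touch regions mirror Braille dots 1, 2, 4 and 5; dots 3 and 6
# are selected by adding extra finger touches, as described in the text.

BASE_REGIONS = {1, 2, 4, 5}  # regions available for direct touch

def regions_to_dots(regions, extra_left=False, extra_right=False):
    """Map touched regions (a subset of {1, 2, 4, 5}) plus optional
    extra touches to a full six-dot Braille cell (a frozenset of dots)."""
    dots = set(regions) & BASE_REGIONS
    if extra_left:           # an added touch selects dot 3
        dots.add(3)
    if extra_right:          # an added touch selects dot 6
        dots.add(6)
    return frozenset(dots)

def dots_to_unicode(dots):
    """Render a dot set as a character from the Unicode Braille
    Patterns block (U+2800-U+28FF); each dot sets one bit."""
    offsets = {1: 0x01, 2: 0x02, 3: 0x04, 4: 0x08, 5: 0x10, 6: 0x20}
    return chr(0x2800 + sum(offsets[d] for d in dots))
```

For example, touching region 1 alone yields the cell for dot 1 (U+2801), while touching regions 1 and 2 with an extra left-hand touch yields dots 1, 2 and 3.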
Based on the inherent logic of Braille dots, the dot positions are built from dots 1, 2, 4 and 5, with dots 3 and 6 kept empty. The shapes formed by these four dots are taken into account when establishing the dot positions, and the multi-touch-area graphic information library is built from the regions composed of these four dots. To reduce the interference of noisy graphic information caused by touch, and to improve the distinguishability of graphic information in the multi-touch area, finger movement in the touch area must follow an agreed ordering principle, defined over a finite state set X:

Rule 1: the principle of sequence uniqueness.
Rule 2: touch sliding proceeds from top to bottom and from left to right.

Continuous sliding: when the user touches the screen with both hands, pressing and holding or sliding over the multi-touch area to form graphic information, the module collects the input points and areas to form a graphic structure and thereby extracts the features of the multi-touch-area graphic information. It then compares these features with those in the multi-touch-area graphic information library and looks up the Braille correspondence table to extract the Braille cell.

3 Multi-touch Gesture Recognition Algorithm

A touch gesture element action is the description of a single-point touch, which can be described by a quadruple data structure [13]:

Metagesture = (ID, Touchstate, Coordinate, Time)

where ID ∈ {1, 2, 3, 4, 5} identifies the touchpoints of the different fingers of one hand. Each touchpoint has a unique ID at any moment, so that the trajectory and direction of the same touchpoint can be recorded. Touchstate ∈ {0, 1} represents the direct contact state between the finger and the touch screen: Touchstate = 1 denotes the touch state, and Touchstate = 0 the not-touch state.
Coordinate = P(x, y) represents the coordinates of the touchpoint. When Touchstate = 1, the touch screen records a series of coordinate data per ID through touchpoint detection and touchpoint tracking; when Touchstate = 0, the data are treated as noise and the coordinate information is meaningless. Time is the timestamp at which the gesture is recorded, obtained by reading the system time; it is the time parameter for any change in the state or position of the touchpoint.

Multi-touch gestures consist of touch gesture element actions and can be described by a quintuple data structure [14]:

Gesture = (Number, ID, Touchstate, Coordinate, Time)

where Number ∈ {1, 2, …} is the number of touchpoints tracked by touchpoint detection (usually 0 ≤ Number ≤ 5). The other parameters have the same meanings as in Metagesture.

Touch gestures are recorded on the touch screen in the form of data, but the durations and areas of different actions can never be fully consistent, and noise data will also be present, so the data must be preprocessed. Preprocessing includes noise elimination, absorption of gesture-shape deformation and compression of redundant information. Network training cannot be carried out if the data volumes of different touch gestures are inconsistent, so the touch gesture data must also be normalized. Normalization extracts the same number of points from every touch gesture according to its time span. After repeated experiments, 15 sampling points are taken from the start point A to the end point B, equally spaced in time. The information from the 15 sampled points is stored as 14 two-dimensional vectors and normalized to obtain the normalized feature vector of the multi-touch gesture, which is used as the input to the RBF network. A multi-touch gesture action is composed of several gesture element actions arranged and combined in chronological order.
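The normalization step above, together with the RBF classification that consumes its output, can be sketched as follows. The Gaussian basis and the fixed width σ = d/√(2N) follow the standard scheme the paper adopts; the linear interpolation between raw samples, the use of displacement vectors between consecutive samples, the choice of every training sample as a centre, and the least-squares output weights are assumptions made for illustration.

```python
import numpy as np

def gesture_feature(points, times, n_samples=15):
    """Resample one touch trajectory at n_samples equally time-spaced
    points, store it as (n_samples - 1) displacement vectors, and
    normalize, yielding a fixed-length feature vector for the RBF net.
    Linear interpolation between raw samples is an assumption."""
    points = np.asarray(points, float)
    times = np.asarray(times, float)
    t_new = np.linspace(times[0], times[-1], n_samples)
    x = np.interp(t_new, times, points[:, 0])
    y = np.interp(t_new, times, points[:, 1])
    deltas = np.diff(np.stack([x, y], axis=1), axis=0)  # (14, 2) vectors
    norm = np.linalg.norm(deltas)
    if norm > 0:                 # guard against a pure tap (no movement)
        deltas /= norm
    return deltas.ravel()        # 28-dimensional feature vector

class RBFClassifier:
    """Minimal Gaussian RBF net: phi_i(x) = exp(-||x - c_i||^2 / (2 sigma^2)),
    with the width fixed as sigma = d / sqrt(2 N), where d is the maximum
    distance between samples and N the number of samples."""

    def fit(self, X, y):
        self.centers = np.asarray(X, float)   # every sample a centre (assumed)
        n = len(self.centers)
        d = max(np.linalg.norm(a - b)
                for a in self.centers for b in self.centers)
        self.sigma = d / np.sqrt(2 * n)       # fixed-width rule
        Phi = self._hidden(self.centers)
        y = np.asarray(y)
        self.classes = np.unique(y)
        T = (y[:, None] == self.classes[None, :]).astype(float)  # one-hot
        self.W, *_ = np.linalg.lstsq(Phi, T, rcond=None)
        return self

    def _hidden(self, X):
        # Gaussian hidden-layer activations for each (sample, centre) pair.
        dists = np.linalg.norm(X[:, None, :] - self.centers[None, :, :], axis=2)
        return np.exp(-dists**2 / (2 * self.sigma**2))

    def predict(self, X):
        # The gesture whose standard (one-hot) output is closest to the
        # network output is taken as the recognized gesture.
        out = self._hidden(np.asarray(X, float)) @ self.W
        return self.classes[np.argmax(out, axis=1)]
```

Trajectories that differ in direction or shape produce clearly different 28-dimensional features, so even this small net separates them once trained on labelled examples.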
The feature matrix of a touch gesture with m touchpoints is formed from the feature vectors of its touchpoints. The collected multi-touch gesture information is classified and merged according to these features so as to extract the graphics with the corresponding feature information. We first determine the number of touchpoints m, then find the matching gestures in the database through the gesture features. We then distinguish a double-click from two single clicks, and a long press from a slide, using given threshold functions:

Sliding displacement threshold: D = sqrt((x_B − x_A)^2 + (y_B − y_A)^2)
Time threshold: ΔT = T_B − T_A

where A and B are the start and end points of the touch.

If the Gaussian function is used as the radial basis function (RBF), the output of the i-th hidden node is

φ_i(x) = exp(−‖x − c_i‖^2 / (2σ^2))

where c_i is the centre of the i-th hidden node. The width of the RBF network is determined by σ. Let d be the maximum distance between samples and N the number of samples; then the width is fixed as σ = d / sqrt(2N).

Following the preprocessing steps above, the module feeds the input vectors of touch gestures into the trained RBF network for classification. If the output vector of the RBF network is closest to the standard output of a gesture in the gesture library, the input is identified as that gesture [15].

Eight students were organized to form a user test group.

(1) Availability: Braille input on touch-screen electronic devices is realized, addressing the problem of text information interaction between blind users and touch screens.
(2) Simplicity: dots 1, 2, 4 and 5 form a graph that is highly similar to the Braille cell. There are few basic graphics, and the expanded graphics retain the inherent logic of the Braille dot layout, which makes them easy for blind users to learn and remember.
(3) Efficiency: multiple touches are operated together and one Braille cell is input per operation. Dots 3 and 6 are entered purely by adding finger touches, which closely mirrors the internal logic of the Braille cell. This reduces the time spent on transformation and thinking during input, thus improving the efficiency of Braille entry.

References

[1] Design and implementation of multi-touch input method for blind usage
[2] Intelligent Chinese input method based on Android
[3] The design and implementation of cross-platform stroke input method engine
[4] Research and implementation of a voice control audio system based on Android speech recognition
[5] Handwritten Chinese character recognition system based on neural network convolution depth
[6] A Braille input method based on gesture recognition
[7] A method and device for output and input of Braille characters on touch screen
[8] No-look flick: single-handed and eyes-free Japanese text input system on touch screens of mobile devices. Human Computer Interaction with Mobile Devices and Services
[9] TypeInBraille: a Braille-based typing application for touchscreen devices
[10] BrailleTouch: mobile texting for the visually impaired
[11] No-look flick: single-handed and eyes-free Japanese text input system on touch screens of mobile devices
[12] Proficient blind users and mobile text-entry
[13] Human-computer interaction research based on multi-touch technology
[14] Research on multi-touch gesture analysis and recognition algorithm
[15] Interaction gesture analysis based on touch screen