INTERNATIONAL JOURNAL OF COMMUNICATION SYSTEMS
Int. J. Commun. Syst. (2014)
Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/dac.2795

A wavelet-based expert system for digital subscriber line topology identification

V. D. Lima1,*,†, A. Klautau2, J. Costa1, K. Ericson3, A. Fertner3 and C. Sales1

1Applied Electromagnetism Laboratory, Federal University of Pará (UFPA), 66073-900 Belém-PA, Brazil
2Signal Processing Laboratory, Federal University of Pará (UFPA), 66073-900 Belém-PA, Brazil
3Ericsson AB, 164 80 Stockholm, Sweden

*Correspondence to: V. D. Lima, Applied Electromagnetism Laboratory at the Federal University of Pará (UFPA), 66073-900 Belém-PA, Brazil.
†E-mail: vinicius@ufpa.br

SUMMARY

This work proposes a new method for automatically identifying the topology of lines with one or more sections in a telephone network. The method examines both the impulse response and the time-domain reflectometry trace of a line under test. Both signals are analyzed with a wavelet-transform-based method that identifies and extracts features containing information about the line topology. Those features are interpreted by an expert system composed of three sequential modules that estimate, respectively, the type of line makeup (serial or bridge tap), the lengths of the line sections, and the corresponding cable types, which are the parameters that completely identify the topology according to the assumed model. A thorough comparison with two state-of-the-art methods is also presented using several twisted-pair copper cables. The results show that the proposed method provides good accuracy in topology identification at low computational cost. Copyright © 2014 John Wiley & Sons, Ltd.

Received 7 February 2013; Revised 13 January 2014; Accepted 21 March 2014

KEY WORDS: DSL; line topology identification; continuous wavelet transform; expert system; double-ended line testing; single-ended line testing

1. INTRODUCTION

Wireline broadband access currently represents a large share of the telecommunication industry. At the end of the first quarter of 2012, there were about 612.6 million fixed lines in the world, representing 11.5% growth over the same period of the previous year. The digital subscriber line (DSL) continues to be the predominant technology [1].

DSL is attractive because service deployment is relatively fast and cheap: the service is provided over an already existing infrastructure, the copper access network. However, the quality of service depends strongly on the physical characteristics of the network, which was originally designed for plain old telephone service (POTS). Features such as bridge taps, which might not affect POTS, are impairments for DSL. In addition, knowledge about the physical structure of the network, and consequently about its performance, is often incomplete or nonexistent [2].

The topology of a telephone line can be made up of one or more sections with different characteristics. These sections can be identified through the analysis of line measurements, which can be categorized as follows: single-ended line testing (SELT), which requires measurement equipment at only one end of the line, and double-ended line testing (DELT), which requires equipment at both ends [3]. It should be highlighted that, even if DELT is not available, there are techniques to estimate the frequency response from SELT measurements [4, 5].

Having a reasonable line topology estimate, it is possible to evaluate the characteristics of the connection that may impact the quality of service.
This gives the service provider the opportunity to improve the network structure. Further, once the line topology estimate is obtained, line monitoring and fault detection are simplified by comparing new tests with the results obtained previously.

In [6–9], time domain reflectometry (TDR) is discussed and an iterative method based on maximum likelihood estimation, called here SELT-tdr, is proposed. The twisted-pair line sections are identified by comparing a time-domain signal, within selected time intervals, with curves generated using a line model. The accurate estimation of these intervals is essential. However, a method to detect the intervals is not clearly defined. In [10–13], the one-port scattering parameter $S_{11}(f)$ is measured in the frequency domain. After measurement, calibration, and pre-processing, $S_{11}(f)$ is analyzed in the time domain, where the important extracted features are interpreted by a system based on Bayesian networks to estimate the line topology. In [14], a combined use of complementary-code-based correlation time domain reflectometry (CTDR) and frequency domain reflectometry (FDR) is proposed. First, CTDR measurements are used to estimate the line discontinuities as an initial guess; then, an FDR-based optimization method is used to refine the estimate. In [15], the limitations of SELT are analyzed theoretically, and it is concluded that the structure of a network cannot always be identified without ambiguity using only SELT. In [16], a method based on a genetic algorithm (GA), called TIMEC, analyzes the frequency response $H(f)$ and $S_{11}(f)$ with a multi-objective criterion, using a line model as reference for line topology identification. The method allows a priori knowledge about the line to be used. On the other hand, the GA implies a relatively high computational cost when compared with other methods.
In [17], an algorithm is presented that compares each DELT measurement with the calculated impulse responses of a "canonical" set of line topologies.

In this paper, we propose an approach that combines SELT and DELT information for topology identification of lines with one or more sections. It is assumed that a line topology can be completely identified by determining three essential parameters of each of its sections: the length; the type of cable, which can be defined by its physical properties; and the type of section, which can be serial or bridge tap. In order to estimate these parameters, the impulse response and the TDR trace are estimated from DELT and SELT measurements, respectively. Afterwards, they are processed using a wavelet transform to identify and extract features that contain information about the line topology. It must be highlighted that this phase is crucial to the accuracy of the method. The extracted features are then applied to an expert system that aims at correctly identifying the topology parameters.

This paper is organized as follows. An overview of how the signals of interest can be interpreted is presented in Section 2. In Section 3, the wavelet-based method for characterizing and extracting features from the signals measured in a line is described. Section 4 describes the expert system proposed for automatically interpreting the signal features and obtaining an estimate of the line topology. Identification results, analysis, and a comparison with state-of-the-art methodologies are presented in Section 5. The final remarks are made in Section 6.

2. CHARACTERISTICS OF THE MEASURED SIGNALS

Identifying the topology of a given telephone line means identifying the connected line sections that link the central office (CO) to the customer-premises equipment (CPE). These sections can be made of different cable types connected in cascade (serial sections) or in parallel (bridge taps).
This way, the topology of a telephone line can be represented as a set

$$\Theta = \{\theta_1, \theta_2, \ldots, \theta_{n_s}\}, \tag{1}$$

where the subset $\theta_k$, with $k = 1, 2, \ldots, n_s$, represents the $k$-th line section from CO to CPE or, equivalently, from port 1 to port 2, considering the line as a two-port network, and $n_s$ is the number of sections that compose the line. Each subset $\theta_k$ is represented by the parameters

$$\theta_k = \{\tau_k, l_k, \phi_k\}, \tag{2}$$

where $\tau_k \in \{\text{bridge tap}, \text{serial}\}$, $l_k$, and $\phi_k$ are, respectively, the type of section, its length, and the type of cable that constitutes the $k$-th line section. In the majority of previous works on line topology identification [8–10, 12, 16], finding $\phi_k$ corresponds to estimating the American wire gauge number that best fits the measured data for the $k$-th section. However, a variety of parameters is required to properly describe a cable type [18], and ideally $\phi_k$ should be the set of nominal physical properties that identify a specific cable.

The following discussion suggests how the signals of interest can be interpreted in order to estimate $\Theta$. The VUB0 model [18] is used to simulate these signals because it is a causal model and is defined as a function of the physical properties of the line. The configuration of values for the parameters used in the model is based on [19], and $\phi_k$ is assumed here to be represented by the conductor diameter.

In this work, the signals of interest are the impulse response and the TDR trace. The impulse response can be obtained from the inverse Fourier transform of the frequency response.
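As a rough illustration of that last step, the sketch below recovers a discrete impulse response from a sampled one-sided frequency response with an inverse real FFT. The idealized channel (a pure delay with flat attenuation), the sampling rate, and the function name `impulse_response` are assumptions made for this example only; they do not reproduce the VUB0 model.

```python
import numpy as np

def impulse_response(H_onesided, n_samples):
    """Inverse real FFT of a one-sided frequency response H(f_k), k = 0..n/2."""
    return np.fft.irfft(H_onesided, n=n_samples)

# Hypothetical setup (not a real cable model): 1024 samples at 10 MHz and an
# ideal line that only delays the signal by 200 samples and scales it by 0.5.
fs = 10e6
n = 1024
delay_samples = 200
f = np.fft.rfftfreq(n, d=1.0 / fs)                      # one-sided frequency grid
H = 0.5 * np.exp(-2j * np.pi * f * delay_samples / fs)  # delay as a linear phase

h = impulse_response(H, n)
peak_index = int(np.argmax(h))  # sample index of the single pulse in h
```

For this idealized channel the impulse response is a single positive pulse located at the delay, which mirrors the interpretation of arrival times used in the rest of this section.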
The TDR trace can be obtained from the input impedance $Z_{in}(f)$ through

$$R(f) = \frac{Z_{in}(f)}{Z_{in}(f) + Z_s} V_s(f), \tag{3}$$

where $Z_s$ is the output impedance of the signal source, $V_s(f)$ is the signal generated by the source, in general a rectangular pulse, and $R(f)$ is the resulting voltage at the connection between the source and the line. The TDR trace $r(t)$ can be obtained by the inverse Fourier transform of $R(f)$.

2.1. Interpreting impulse responses

The voltage reflection coefficient $\rho$ is defined as the ratio of the reflected voltage $V^-$ to the incident voltage $V^+$ and can be written as

$$\rho = \frac{V^-}{V^+} = \frac{Z_A - Z_B}{Z_A + Z_B}, \tag{4}$$

where $Z_A$ and $Z_B$ are the characteristic impedances of the line sections after and before the discontinuity, respectively. The voltage transmission coefficient $T$ is expressed by

$$T = 1 + \rho. \tag{5}$$

The characteristic impedance of a metallic line at high frequencies is approximately a pure resistance [20]. This implies that the reflection coefficient is real-valued and $1 > \rho > -1$, whereas $T$ is always positive. Hence, the impulse response is composed of a series of positive (or upward) pulses delayed in time. Each successive pulse results from a different 'path' taken by the input signal through the line.

Let us consider the topology illustrated in Figure 1(a) and assume that the test signal is injected by a source connected to port 1 and measured at port 2. Figure 1(b) depicts four possible paths that parts of the input signal can follow from port 1 to port 2, namely $\bar{A}$, $\bar{B}$, $\bar{C}$, and $\bar{D}$. Figure 1(c) shows the resulting simulated impulse response $h(t)$. The arrival times of the pulses resulting from each path are represented by $t_{\bar{A}}$, $t_{\bar{B}}$, $t_{\bar{C}}$, and $t_{\bar{D}}$, respectively. The length $L_i$ of the $i$-th path can be calculated from the time of arrival $t_i$ of the $i$-th pulse by $L_i = t_i v_p$, where $v_p$ is the velocity of propagation of the transmitted signal, which is assumed constant here.
The first pulse to reach port 2, at $t_{\bar{A}}$, results from path $\bar{A}$, the straight path from port 1 to port 2 of length $L_T$, which is an important parameter in the topology characterization. Each bridge tap originates a new path and consequently a new pulse in $h(t)$. In Figure 1(b), the paths $\bar{B}$ and $\bar{C}$ include bridge taps. The lengths of the bridge taps, in the third and fifth sections, can be calculated by

$$l_3 = \frac{t_{\bar{C}} - t_{\bar{A}}}{2} v_p \quad \text{and} \quad l_5 = \frac{t_{\bar{B}} - t_{\bar{A}}}{2} v_p. \tag{6}$$

Unfortunately, it is not possible to determine the position of bridge taps using only the impulse response. If the position of a bridge tap is changed, the path length traveled by the signal remains the same, and the signal arrives at the receiver with the same shape.

Figure 1. (a) A topology of six sections with bridge taps in the third and fifth sections. (b) Four paths defined by the impedance mismatches. (c) The simulated impulse response for the topology in (a).

Positive pulses can also be generated by multiple reflections. The path $\bar{D}$, traveled by the pulse that reaches the receiver at $t_{\bar{D}}$, is an example of a pulse generated by multiple reflections. In addition, connections between serial sections with different cable types typically do not generate perceptible modifications in the resulting signal.

2.2. Interpreting TDR traces

For interpreting TDR traces, it is assumed here that port 2 is an open circuit. The essential features in a TDR trace are defined by three main types of line connections: serial to serial, serial or bridge tap to open termination, and serial to bridge tap. Each one is discussed in the sequel.

2.2.1. Serial to serial. The connection between two serial sections of different types of cable is characterized by a subtle change in the signal decay pattern.
Figure 2 shows TDR traces simulated from topologies with two serial sections. As shown in [6], when the preceding impedance is smaller than the subsequent one, $\rho$ is positive. The opposite happens when a larger impedance is connected to a smaller one.

The arrival times $t_A$ and $t_B$ of the signal features generated by the impedance mismatches A and B, respectively, are represented in Figure 2. The time of arrival $t_A$ can be used to estimate the location of point A, a serial to serial connection, through

$$l_1 = \frac{t_A}{2} v_p. \tag{7}$$

Figure 2. Comparison among TDR traces simulated from topologies of two sections. The input signal is a rectangular pulse with a duration of 1 µs and an amplitude of 10 V. The output impedance of the source is 50 ohm.

Figure 3. TDR trace simulated for a line with one bridge tap. The reference TDR trace has $l_1 = 2000$ m and $\phi = 0.5$ mm.

2.2.2. Serial or bridge tap to open termination. For an open-circuited termination ($\rho = 1$), if the input pulse is positive, the reflected pulse is also positive. In the example depicted in Figure 2, the pulse related to termination B reaches port 1 at $t_B$.

2.2.3. Serial to bridge tap. A bridge tap can be considered in parallel with the equivalent impedance of the following sections, which means that $\rho < 0$. Consequently, if the input pulse is positive, a reflected pulse generated by a serial to bridge tap connection is negative [6]. Figure 3 shows a TDR trace simulated for a topology with one bridge tap. The position of the bridge tap can be obtained from the arrival time $t_A$ of the negative pulse via $l_1 = t_A v_p / 2$. The open-circuited termination of the bridge tap originates a positive pulse.
The interval between this positive pulse and the negative pulse generated by the beginning of the bridge tap can be used to estimate the bridge tap length using

$$l_2 = \frac{t_B - t_A}{2} v_p. \tag{8}$$

The pulse with arrival time $t_C$ is related to the line termination C. However, in this case, it is not possible to differentiate the positive pulses without a priori information. Additionally, spurious pulses can arise because of multiple reflections of the input signal between the impedance mismatches.

Pulses and decay variations are very important signal events generated by topology mismatches. If properly identified, these events can be used to infer the topology parameters of a given line. In several signal processing applications, the boundaries of events and the transitions between states that indicate important features can be found through the detection of edges [21–28]. Because $r(t)$ and $h(t)$ are typically continuous and otherwise free of irregularities, edges can be detected as isolated singularities, which are points at which a given function is not defined or fails to be well-behaved in some particular way, such as differentiability or continuity.

This section discussed how the signals can be interpreted, but this requires detecting the events of interest via the associated singularities. Wavelets are used for estimating these singularities, as discussed in the next section.

3. WAVELET-BASED FEATURE EXTRACTION

This work proposes a method for extracting features of $r(t)$ and $h(t)$ based on wavelet transforms, which provide a powerful multiscale representation of signals. The method consists of two steps: (i) detecting the edges in the signals and (ii) classifying these edges as pulses (positive or negative) or decay variations.

3.1. Singularity detection with wavelets

The continuous wavelet transform (CWT) of a given continuous and square-integrable function $f(t)$ can be computed through the convolution of $f(t)$ with a set of dilated wavelets.
A mother wavelet, or simply wavelet, $\psi$ is a function that is continuous in both the time and frequency domains, has zero average, is normalized as $\|\psi\| = 1$, and is centered at $t = 0$. A family of wavelets is obtained by scaling $\psi$ by $s$ and translating it by $u$:

$$\psi_{u,s}(t) = \frac{1}{\sqrt{s}}\, \bar{\psi}\!\left(\frac{t-u}{s}\right), \tag{9}$$

where $\bar{\psi}$ represents the complex conjugate of $\psi$. The CWT of $f(t)$ at time $u$ and scale $s$, computed with respect to the wavelet $\psi(t)$, is given by

$$Wf(u,s) = \int_{-\infty}^{+\infty} f(t)\, \psi_{u,s}(t)\, dt = f(u) \star \psi_s(u) \tag{10}$$

with

$$\psi_s(t) = \frac{1}{\sqrt{s}}\, \bar{\psi}\!\left(\frac{-t}{s}\right). \tag{11}$$

One of the most crucial features of a wavelet in the evaluation of the local regularity of a signal is its number of vanishing moments. If the wavelet has $N$ vanishing moments, defined by

$$\int_{-\infty}^{+\infty} t^k \psi(t)\, dt = 0 \quad \text{for } 0 \le k < N, \tag{12}$$

and a compact support, then $\psi$ can be written as the $N$-th order derivative of a function $\beta$, also of compact support, such that [29]

$$\psi(t) = (-1)^N \frac{d^N \beta(t)}{dt^N} \quad \text{with} \quad \int_{-\infty}^{+\infty} \beta(t)\, dt \neq 0. \tag{13}$$

Thus, by using $\beta$, the resulting CWT is a multiscale differential operator of order $N$, and (10) can be rewritten as

$$Wf(u,s) = s^N \frac{d^N}{du^N} \left[ f(u) \star \beta_s(u) \right] \tag{14}$$

with $\beta_s(t) = s^{-1/2}\, \bar{\beta}(-t/s)$.

If $\psi$ has order $N = 1$, the wavelet modulus maxima are the maxima of the first-order derivative of $f$ smoothed at the scale $s$ by $\beta_s(u)$. The term modulus maximum describes any point $(u_0, s_0)$ such that $|Wf(u, s_0)|$ is a local maximum at $u = u_0$. A maxima line is any connected curve $s(u)$ in the plane $(u, s)$ along which all points are modulus maxima. The singularities are detected by following the wavelet transform local maxima from coarser to finer scales and finding the abscissae where the wavelet modulus maxima converge in the form of maxima lines [29]. The proposed edge detection method is summarized in the following steps.

3.1.1. Computation of the CWT.
For an arbitrary wavelet $\psi$, it is not guaranteed that a modulus maximum located at $(u_0, s_0)$ belongs to a maxima line that propagates to the finest scale. In [29], it is proved that if $\beta$ is Gaussian, the modulus maxima of the CWT of any continuous and square-integrable function $f(t)$ lie on maxima lines that are not interrupted when the scale decreases. Therefore, the wavelet used here to compute the CWT of the signals under analysis is the normalized first derivative of the Gaussian function. Derivatives of other orders can also be used to detect the singularities; however, the first-order derivative is simpler to use than higher orders.

3.1.2. Local modulus maxima and thresholding. In the computation of local modulus maxima, spurious maxima can be generated by numerical errors, mainly in regions where the CWT is close to zero. The strategy adopted here to remove the spurious maxima and, at the same time, to control the detection sensitivity is based on the definition of a threshold at each scale of the CWT. A single threshold should not be used for all CWT scales because each scale contains a different energy content of the signal under analysis. The threshold needs to be proportional to the average amplitude in the scale, so that the algorithm eliminates small variations. The arithmetic mean and the variance of each scale can be used to ensure that the significant maxima are retained by the algorithm. Thus, the proposed threshold for each scale of the CWT is given by

$$\lambda_i = k_s \left[ \frac{\sum_{j=1}^{n} \left| Wf(u_j, s_i) \right|}{n} + \sigma^2_{u,s_i} \right], \tag{15}$$

where $s_i$ is the $i$-th scale, $n$ is the number of elements in the $i$-th scale, which is equal to the number of samples of the signal $f$, and $\sigma^2_{u,s_i}$ is the variance of the coefficients at scale $s_i$. The sensitivity constant $k_s$ provides control over the sensitivity of the search for local extrema. For instance, setting $k_s = 0.5$ doubles the sensitivity, and $k_s = 2$ halves the default sensitivity.
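The CWT with a first-derivative-of-Gaussian wavelet and the per-scale threshold in (15) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the discretization of the wavelet, the kernel half-width, the scale values, and all function names are hypothetical choices for this example, and the coefficients below the threshold are simply set to zero.

```python
import numpy as np

def dog1_wavelet(scale, half_width=5.0):
    """Sampled first derivative of a Gaussian dilated by `scale` (unit L2 norm).

    The sign is chosen so that convolving with a rising edge gives a positive
    coefficient (a peak) and a falling edge gives a negative one (a valley).
    """
    n = int(np.ceil(half_width * scale))
    t = np.arange(-n, n + 1) / scale
    psi = -t * np.exp(-0.5 * t**2)
    return psi / np.linalg.norm(psi)

def cwt_dog1(signal, scales):
    """Rows W[i, :] hold the CWT at scales[i], computed by direct convolution."""
    return np.array([np.convolve(signal, dog1_wavelet(s), mode="same")
                     for s in scales])

def scale_thresholds(W, ks=1.0):
    """Threshold per scale: ks * (mean of |W| + variance of W), as in (15)."""
    return ks * (np.mean(np.abs(W), axis=1) + np.var(W, axis=1))

def hard_threshold(W, ks=1.0):
    """Zero every coefficient whose magnitude is below its scale threshold."""
    lam = scale_thresholds(W, ks)
    return np.where(np.abs(W) < lam[:, None], 0.0, W)

# Hypothetical test signal: a rectangular pulse with edges at samples 100 and 150.
f = np.zeros(400)
f[100:150] = 1.0
scales = [2.0, 4.0, 8.0]
W = cwt_dog1(f, scales)
Wt = hard_threshold(W, ks=1.0)

rise = int(np.argmax(Wt[0]))  # peak at the finest scale, near the rising edge
fall = int(np.argmin(Wt[0]))  # valley at the finest scale, near the falling edge
```

Lowering `ks` keeps more coefficients (higher sensitivity), whereas raising it discards more of them. Tracking the maxima lines across scales is not covered by this sketch.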
Hard thresholding is applied to each scale of the CWT, that is, all coefficients whose absolute value is lower than the threshold of that scale are set to zero.

3.1.3. Singularity detection. The maxima lines are found from the thresholded CWT. The singularities are detected by finding the abscissae at which these lines converge at the finest scales. Figure 4(a) shows a rectangular pulse whose singularities are highlighted by circles. Figure 4(b) shows the corresponding CWT, with the local maxima converging to the abscissae that correspond to the singularities.

Figure 4. (a) A pulse containing two singularities, highlighted by circles. (b) The corresponding CWT, with the local maxima converging to the abscissae that correspond to the singularities.

The local modulus maxima are extrema that can be either peaks or valleys. Peaks are related to variations from lower amplitudes to higher ones and, conversely, valleys are related to variations from higher to lower amplitudes. Using this information, rise edges can be identified by peaks at the finest scale of the CWT and fall edges by valleys.

3.2. Classification of estimated singularities

Let $P = \{p_1, \ldots, p_m\}$ be the set of $m$ singularities detected in $f(t)$, where $p_i = (t_i, f(t_i))$ with $i = 1, \ldots, m$, and $t_i$ is the time of detection of the $i$-th singularity. It is now necessary to determine whether the detected points are related to pulses, decay variations, or spurious detections caused by numerical error or noise. Because $r(t)$ and $h(t)$ have different characteristics, the proposed feature extraction method defines different algorithms to classify the features of the two signals.

As discussed in Section 2, impulse responses are characterized by positive, distinct, and sharp pulses. The algorithm verifies, for each singularity $p_i$, whether the successive singularities $p_i$ and $p_{i+1}$ are, respectively, a rise edge and a fall edge.
Each pair so identified is associated with a positive pulse.

On the other hand, TDR traces are characterized by positive and negative pulses and decay variations. Besides, pulses are smoother than in the impulse response and can overlap, generating a mixed event. The first issue is to determine whether the detected point $p_i$ characterizes a pulse or a decay deviation. It is assumed that the slope of the tangent line at a $p_i$ related to a pulse tends to be steeper than at a point related to a decay variation. Thus, the proposed algorithm approximates the first-order derivative at $p_i$ through the central difference formula in order to find the slope of the tangent line at that point. A slope threshold is defined. If the derivative is lower than this threshold, $p_i$ represents a decay variation. Otherwise, it is necessary to verify whether $p_i$ belongs to a pulse.

To identify pulses, the following basic assumptions are adopted: the input signal is a well-known wideband rectangular pulse, and the propagation channel is approximately a linear time-invariant system. Thus, each echo, like the input signal, is composed of a start edge and an end edge, and the time interval $\Delta t_{ip}$ between them is approximately equal to the width of the input pulse. This way, for $p_i$ to be an edge of a pulse, the following conditions must be true.

(1) There is a $p_k$, with $i < k \le m$, that represents a type of edge different from $p_i$ within the interval $\Delta t_{ip} \pm \epsilon$, where $\epsilon$ is a margin used to take into account numerical errors, dispersion, and noise. Its value is defined here through simulations as 20% of the value of $\Delta t_{ip}$. If more than one singularity satisfies this condition, the $p_k$ whose interval to $p_i$ yields the smallest squared error with respect to $\Delta t_{ip}$ is selected.
(2) There is only one local maximum or minimum in the interval $[p_i, p_k]$.
(3) $p_i$ is not an end edge of a previous pulse, and the local extremum in the interval $[p_i, p_k]$ is not closer to zero than the other points in the interval.

If all the aforementioned conditions are true, the algorithm verifies whether $p_i$ is a rise edge. If so, $p_i$ and $p_k$ correspond to a positive pulse. Otherwise, $p_i$ and $p_k$ correspond to a negative pulse. If one of the conditions is false, $p_i$ is considered spurious.

4. KNOWLEDGE-BASED LINE TOPOLOGY IDENTIFICATION

The proposed expert system uses an explicit knowledge model to interpret the features extracted from $r(t)$ and $h(t)$. This knowledge model was developed by systematizing, as a set of rules, the relations between the features of a signal transmitted across a multi-section metallic line and its impedance mismatches. These rules were validated through systematic observations of several different topologies simulated using causal line models. The test set is composed of measurements on real cables, and the corresponding results are described in Section 5.

The method was structured into three algorithms that estimate, sequentially and respectively, the following sets, defined from the model established in Section 2 for the topology of a transmission line:

$$\mathcal{T} = \{\tau_1, \tau_2, \ldots, \tau_{n_s}\}, \quad \mathcal{L} = \{l_1, l_2, \ldots, l_{n_s}\}, \quad \text{and} \quad \Phi = \{\phi_1, \phi_2, \ldots, \phi_{n_s}\}. \tag{16}$$

These three algorithms are explained as follows.

4.1. Identifying the set $\mathcal{T}$

The identification of $\mathcal{T}$ defines how many sections the line under test has and whether these sections are serial sections or bridge taps.

In $r(t)$, the beginning of each line section is represented by the arrival time of a negative pulse or a decay variation inside an interval delimited by $t = 0$ and a reference time $t_{ref}$, which corresponds to $L_T$. The method estimates $L_T$ from the arrival time of the first pulse in $h(t)$, $t^h_{1+}$. This way, $t_{ref}$ is calculated by using $t_{ref} = 2 t^h_{1+}$.

Next, all the events detected inside $[0, t_{ref}]$ in $r(t)$ are sequentially analyzed.
Assuming that the first section is a serial section, the algorithm includes a new serial section for each detected decay deviation and a new bridge tap for each negative pulse. In the interval between occurrences of two consecutive bridge taps, if any decay deviation is found, one serial section is defined. Thus, an estimate for $\mathcal{T}$ is obtained.

In general, pulses are detected in $h(t)$. However, the algorithm has rules for exceptional cases in which pulses are not detected in $h(t)$ (which usually occurs when the line is very long). These rules are defined in accordance with the features detected in $r(t)$.

• The first detected pulse is positive. $t_{ref}$ is estimated as the time of arrival of this first positive pulse. In this case, there are no bridge taps, and one or more serial sections can occur.
• The first detected pulse is negative and there are one or more positive pulses. It is not possible to obtain a unique estimate for $t_{ref}$. The algorithm considers the arrival time of each positive pulse as a hypothesis for $t_{ref}$. Consequently, more than one estimate for $\mathcal{T}$ can be obtained.
• No positive pulse was detected. It is not possible to estimate $L_T$ and, consequently, $\mathcal{T}$ cannot be fully estimated. However, a partial estimate can be obtained. Decay deviations and negative echoes, if any, are sequentially analyzed until the last detected feature. However, the algorithm cannot estimate bridge tap lengths.

4.2. Estimating the set $\mathcal{L}$

Given an estimate for $\mathcal{T}$, the proposed method uses the signal features to obtain an estimate for the length of each estimated line section. The algorithm is divided into two parts: one for serial sections and another for bridge taps. The basic idea is the same for both: finding the start time $t_s$ and the end time $t_e$ of the interval in $r(t)$ that represents the section.
This interval is used to estimate the length of the $k$-th section using

$$l_k = \frac{t_e - t_s}{2} v_p. \tag{17}$$

4.2.1. Algorithm for serial sections. $t_s$ and $t_e$ are estimated from the arrival times of the TDR trace features that correspond to the elements before and after the section under analysis, respectively.

If the network element before the section under analysis is:

• Port 1. $t_s = 0$;
• A bridge tap. $t_s = t^r_{B-}$, where $t^r_{B-}$ is the arrival time of the negative pulse that corresponds to the beginning of the bridge tap before the section under analysis;
• A serial section. $t_s = t^r_{Bd}$, where $t^r_{Bd}$ is the arrival time of the decay deviation that corresponds to the beginning of the section under analysis.

If the network element after the section under analysis is:

• Port 2. $t_e = t_{ref}$;
• A bridge tap. $t_e = t^r_{A-}$, where $t^r_{A-}$ is the arrival time of the negative pulse that corresponds to the beginning of the bridge tap after the section under analysis;
• A serial section. $t_e = t^r_{Ad}$, where $t^r_{Ad}$ is the arrival time of the decay variation that corresponds to the beginning of the serial section after the section under analysis.

4.2.2. Algorithm for bridge taps. Pulses detected in $h(t)$ after the first one can be related to bridge taps. They can provide length estimates for these sections. The set of 'candidate lengths' obtained from $h(t)$ can be calculated by using

$$l^h_{+}(j-1) = \frac{t^h_{j+} - t^h_{1+}}{2} v_p, \tag{18}$$

where $l^h_{+}(j-1)$ is the $(j-1)$-th candidate length, and $t^h_{j+}$ is the time of arrival of the $j$-th positive pulse detected in $h(t)$, considering $j > 1$.

Possible lengths can also be obtained in $r(t)$ from the interval between the negative pulse corresponding to the bridge tap under analysis and all positive pulses detected after this negative echo.
These 'TDR candidates' are defined by

$$l^r_{+}(m) = \frac{t^r_{m+} - t^r_{n-}}{2} v_p, \tag{19}$$

where $l^r_{+}(m)$ is the $m$-th TDR candidate length, and $t^r_{m+}$ is the time of arrival of the $m$-th positive pulse detected in $r(t)$ after $t^r_{n-}$, the time of arrival of the $n$-th negative pulse. The length attributed to the section under analysis is the value of $l^h_{+}$ that yields the smallest absolute error in the comparison with the values of $l^r_{+}$.

After obtaining the lengths of all the estimated sections, the total serial length of the line is obtained through

$$L_T = t^h_{1+} v_p. \tag{20}$$

Because the lengths of the serial sections are calculated from features of $r(t)$ and $L_T$ is estimated from $h(t)$, these values may differ. As the value of $L_T$ obtained from $h(t)$ tends to be more reliable, the serial sections are adjusted proportionally to the value calculated for $L_T$.

4.3. Classification of the set $\Phi$

Assuming that a telecommunication network is composed of a finite set of cable types, for a given line topology, the set $\Phi$ is composed of a combination of elements from this 'set of cable types'.

Once estimates for $\mathcal{T}$ and $\mathcal{L}$ have been obtained, a set of hypotheses for $\Phi$ can be defined from a database of cable types. Using a line model, SELT and DELT signals can be generated for each topology hypothesis. This work proposes to compare these simulated signals with line measurements to obtain an estimate for $\Phi$. The idea is to select the hypothesis that generates the smallest mean squared error in relation to the measured curves.

The cable type database is generated from network samples. For each type of cable, the values of the physical and geometric parameters of the line model are adjusted in agreement with the cable configurations. The parameters can be obtained from a priori information or estimated from line measurements.
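The selection rule described above (keep the hypothesis whose simulated curve has the smallest mean squared error with respect to the measurement) can be sketched as below. The exponential curves merely stand in for line-model output, and the function names and hypothesis labels are hypothetical; a real implementation would generate the curves with a line model such as the one in [18] and might combine the errors of the SELT and DELT signals.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two equally sampled curves."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))

def select_hypothesis(measured, simulated):
    """Return the key of `simulated` whose curve is closest to `measured`.

    `simulated` maps a hypothesis label (e.g. a tuple of cable types)
    to the curve generated for that hypothesis.
    """
    return min(simulated, key=lambda hyp: mse(measured, simulated[hyp]))

# Placeholder curves (assumption): two decays standing in for simulated traces.
t = np.linspace(0.0, 1.0, 200)
simulated = {
    ("A",): np.exp(-2.0 * t),
    ("B",): np.exp(-3.0 * t),
}
# Synthetic "measurement": close to hypothesis ("B",) plus a small ripple.
measured = np.exp(-2.9 * t) + 0.01 * np.sin(40.0 * t)

best = select_hypothesis(measured, simulated)
```

Here `best` is the label of the closest placeholder curve; the cost of this exhaustive comparison grows with the number of hypotheses.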
In [30], a method is proposed for identifying the parameters of twisted pairs from input impedance measurements by combining an analytical approach with an optimization process.

However, the larger the database, the larger the number of operations needed to obtain the solution. For example, consider a database composed of four types of cable. If a topology is estimated with four sections, where the third section is a bridge tap and the others are serial sections, 192 hypotheses for $\Phi$ would be defined ($4 \times 3 \times 4 \times 4 = 192$, since two directly connected serial sections cannot be made up of the same type of cable). On the other hand, if six sections are estimated, with bridge taps in the third and fifth sections, 3072 hypotheses would be generated ($4 \times 3 \times 4 \times 4 \times 4 \times 4 = 3072$).

One way to accelerate the classification process is to compare the first serial sections through the TDR trace. This is possible because the serial sections before the first branch (bridge tap), if any, produce consecutive signal segments that can be compared separately. Thus, for the second example presented in the previous paragraph, classifying the first serial section requires simulating a TDR trace of a single line with infinite (or very long) length for each type of cable in the database. Each simulated curve is compared with the measured signal in the interval between $t = 0$ and the time of arrival of the decay deviation that delimits the end of the interval related to the section under test. In this case, only four possibilities are tested. For the second section, which is also serial, the same procedure is applied, excluding the cable type already selected for the previous section. The third section is a bridge tap and, from this point on, the possible combinations for the remaining sections are compared jointly, taking into account the two sections already classified.
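The hypothesis counts and the minimum mean squared error selection described above can be sketched as follows. This is a hedged illustration only: `admissible_sequences`, `best_hypothesis`, and `sequential_tests` are hypothetical names, the section types are encoded as the strings "S" and "BT", and `simulate` stands for any user-supplied line model that maps a cable-type sequence to a simulated curve (no particular model is implied).

```python
from itertools import product


def admissible_sequences(kinds, cables):
    """Yield cable-type sequences; directly connected serial sections must differ."""
    for seq in product(cables, repeat=len(kinds)):
        clash = any(
            kinds[i] == "S" and kinds[i - 1] == "S" and seq[i] == seq[i - 1]
            for i in range(1, len(kinds))
        )
        if not clash:
            yield seq


def mse(a, b):
    """Mean squared error between two equally sampled curves."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def best_hypothesis(kinds, cables, measured, simulate):
    """Exhaustive search: return the sequence whose simulated curve has the smallest MSE."""
    return min(admissible_sequences(kinds, cables),
               key=lambda seq: mse(simulate(seq), measured))


def sequential_tests(kinds, cables):
    """Number of comparisons when the leading serial sections are classified one by one."""
    n = len(cables)
    first_bt = kinds.index("BT") if "BT" in kinds else len(kinds)
    # First serial section: n candidates; each following one excludes the previous type.
    leading = sum(n if i == 0 else n - 1 for i in range(first_bt))
    rest = kinds[first_bt:]
    joint = sum(1 for _ in admissible_sequences(rest, cables)) if rest else 0
    return leading + joint


four = ["S", "S", "BT", "S"]
six = ["S", "S", "BT", "S", "BT", "S"]
print(sum(1 for _ in admissible_sequences(four, "ABCD")))  # exhaustive, four sections
print(sum(1 for _ in admissible_sequences(six, "ABCD")))   # exhaustive, six sections
print(sequential_tests(six, "ABCD"))                       # accelerated, six sections
```

The exhaustive counts come from explicit enumeration, so the sketch also serves as a check on the products quoted in the text.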
As a consequence, only 263 combinations would be tested ($4 + 3 + 4 \times 4 \times 4 \times 4 = 263$).

It is important to highlight that the proposed classification methodology (as well as most topology identification methods) assumes that the physical parameters of the line are stationary in time and space, or at least that the variation of their mean is sufficiently slow. An abrupt change could have a direct impact on the signal shape and, consequently, on the features identified by the method described in Section 3. However, abrupt changes can be considered faults, which are outside the scope of this work.

Another consideration is the importance of the accuracy of the feature extraction method. The identification of the sets $T$, $L$, and $\Phi$ depends directly on the correct identification of the signal characteristics. In the next section, the robustness of the method is tested using measurements on real cables and compared with state-of-the-art techniques.

5. RESULTS AND ANALYSIS

5.1. General conditions of evaluation

In order to evaluate the proposed wavelet-based expert system (WES) for line topology identification, a test set composed of measurements on twisted-pair copper cables was created. Twelve line topologies were used, divided into three subsets: topologies with only one section (lines 1 to 6), lines made up of more than one serial section (lines 7 to 10), and lines with one bridge tap (lines 11 and 12). All topologies are composed of two cable types: Ericsson TEL 481 02, with 0.4-mm diameter wire, and Ericsson TEL 313 000, with 0.5-mm diameter wire, named here A and B, respectively. These test lines are described in Table I.
The cable type database used by the algorithms is composed of the two types mentioned earlier and two other cable types: Furukawa CTP-APL 65, with 0.65-mm diameter wire, and a theoretical cable with 0.9-mm diameter wire, whose parameters are defined in [19]. These cable types are named here C and D, respectively. The physical parameters of cables A, B, and C were obtained using the method defined in [30].

The measurements were performed in the laboratory. For each of the test lines, measurements of the frequency response $H(f)$, the scattering parameter $S_{11}$, and the open-circuit input impedance $Z_{in}$ were carried out. The network analyzer Agilent 4395A was used for $H(f)$ and $S_{11}$, and the impedance analyzer 4394A was used for $Z_{in}$. The frequency band extends from 4.3125 kHz to 2.208 MHz, and each measurement has 512 points. Each quantity was measured five times for each line, and the signal used in the tests is the average of these measurements, in order to average out the influence of noise. Baluns were used in all measurements to properly connect the equipment to the cables. The WES inputs, $h(t)$ and $r(t)$, were obtained as described in Section 2.

All the thresholds and constants present in the method were selected based on the best results of a previous training stage, not on the test set just described. The training and validation sets were composed of signals simulated using the VUB0 model [18]. The sensitivity constant $k_s$ used in the analysis of $h(t)$ and $r(t)$ is set to 1. The threshold for the inclination of the tangent at the detected singularities in $r(t)$ is set to 0.02. The velocity of propagation is set to 68.7% of the speed of light in vacuum.

All the tests were processed using a computer with an Intel Core 2 Duo E7400 CPU, 2.80 GHz, with 3 GB of RAM.

5.2.
Baseline comparison

This work compares the WES results with those obtained by two state-of-the-art methodologies presented in Section 1: TIMEC, described in [16], and SELT-tdr, described in [8, 9]. These two methodologies were first compared in [16] using simulated signals. Both presented good results for different topologies; however, TIMEC obtained better results for the simulated data. In the current work, the comparison uses measurements obtained from real cables.

The original implementation of TIMEC presented in [16], kindly provided by the authors, was used. The SELT-tdr method was also implemented in [16] following the process described in [8, 9], except for the use of derivatives to detect the intervals of analysis, since this process is not described in the original papers. This implementation is used here.

The inputs to TIMEC are $S_{11}(f)$ and $H(f)$. Hence, like the proposed method, TIMEC relies on both single-ended and double-ended measurements. SELT-tdr uses only the TDR trace (SELT) as input, which was obtained here from the input impedance as described in Section 2.

5.3. Results and system evaluation

Table I summarizes the results obtained using WES and the baseline methods for the topologies in the test set. The best estimated results for each parameter are highlighted by bold characters.

Table I. Parameters of the test lines and results estimated by WES and state-of-the-art methods. Here, the letter 'S' represents the 'serial section' and 'BT' represents the 'bridge tap'. The best estimated results for each parameter are highlighted by bold characters.

| Line | Section | Actual type | Actual $l_k$ (m) | Actual cable | WES type | WES $l_k$ (m) | WES cable | SELT-tdr type | SELT-tdr $l_k$ (m) | SELT-tdr cable | TIMEC type | TIMEC $l_k$ (m) | TIMEC cable |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | S | 400 | A | S | 409.6 | A | S | 390.6 | B | S | 374.9 | A |
| 2 | 1 | S | 500 | A | S | 500.7 | A | S | 477.4 | B | S | 478.0 | A |
| 3 | 1 | S | 1000 | A | S | 1001.4 | A | S | 976.4 | B | S | 1016.6 | A |
| 4 | 1 | S | 400 | B | S | 409.6 | C | S | 368.9 | A | S | 101.1 | A |
|  | 2 | - | - | - | - | - | - | - | - | - | S | 312.9 | C |
| 5 | 1 | S | 500 | B | S | 500.7 | B | S | 455.7 | A | S | 501.9 | A |
| 6 | 1 | S | 1000 | B | S | 1001.4 | C | S | 1084.9 | A | S | 718.6 | C |
|  | 2 | - | - | - | - | - | - | - | - | - | S | 291.8 | A |
| 7 | 1 | S | 500 | A | S | 542.4 | A | S | 976.4 | B | S | 485.6 | A |
|  | 2 | S | 500 | B | S | 459.0 | B | - | - | - | S | 543.9 | C |
| 8 | 1 | S | 500 | A | S | 558.8 | A | S | 1171.7 | B | S | 559.8 | A |
|  | 2 | S | 1000 | B | S | 988.7 | B | S | 303.8 | D | S | 144.4 | B |
|  | 3 | - | - | - | - | - | - | - | - | - | S | 853.7 | C |
| 9 | 1 | S | 1000 | A | S | 1053.0 | A | S | 1367.0 | B | S | 1210.3 | A |
|  | 2 | S | 400 | B | S | 358.0 | B | - | - | - | BT | 686.0 | B |
|  | 3 | - | - | - | - | - | - | - | - | - | S | 3467.2 | D |
|  | 4 | - | - | - | - | - | - | - | - | - | S | 1069.4 | C |
|  | 5 | - | - | - | - | - | - | - | - | - | S | 513.1 | B |
| 10 | 1 | S | 200 | A | S | 363.0 | B | S | 195.3 | A | S | 571.9 | A |
|  | 2 | S | 1000 | B | S | 1366.6 | A | S | 86.8 | C | S | 184.5 | B |
|  | 3 | S | 500 | A | - | - | - | - | - | - | S | 1013.7 | C |
| 11 | 1 | S | 500 | A | S | 546.2 | A | S | 477.4 | B | S | 506.0 | A |
|  | 2 | BT | 200 | A | BT | 204.8 | A | - | - | - | BT | 213.7 | B |
|  | 3 | S | 1000 | B | S | 1001.4 | B | - | - | - | S | 1052.0 | C |
| 12 | 1 | S | 200 | A | S | 733.1 | B | S | 672.6 | A | S | 697.9 | A |
|  | 2 | S | 500 | B | - | - | - | - | - | - | - | - | - |
|  | 3 | BT | 500 | A | BT | 523.4 | A | BT | 499.1 | D | BT | 499.8 | A |
|  | 4 | S | 500 | B | S | 495.9 | A | S | 520.8 | D | S | 503.4 | D |

WES, wavelet-based expert system; SELT-tdr, single-ended line testing-time domain reflectometry.
The error in the topology identification, $e(T)$, is defined as the number of lines for which $T$ was not perfectly estimated divided by the total number of topologies in the test set. WES correctly estimated 10 of the 12 topologies, resulting in $e(T) = 16.67\%$. The two errors correspond to serial sections missed in lines 10 and 12. The parameter $e(n_{SS})$ is the error in the estimation of the number of serial sections $n_{SS}$ in the line under test. For instance, line 12 has $n_{SS} = 3$, and its number of bridge taps is $n_{BT} = 1$. WES correctly identified the number of bridge taps in all cases, resulting in $e(n_{BT}) = 0\%$. WES performed better than TIMEC and SELT-tdr in all cases for the identification of $T$. TIMEC and SELT-tdr presented similar results for these parameters, although their errors arose on different lines and in different ways: TIMEC tended to introduce spurious sections, whereas SELT-tdr tended to miss sections. The worst case for TIMEC was line 9, where the method inserted three wrong sections. SELT-tdr missed two sections in line 11.

Table II. Error obtained in the estimation of $T$ for each evaluated technique.

| Parameter | WES | SELT-tdr | TIMEC |
|---|---|---|---|
| $e(T)$ | 16.7% | 41.7% | 41.7% |
| $e(n_{SS})$ | 16.7% | 41.7% | 41.7% |
| $e(n_{BT})$ | 0.0% | 8.3% | 8.3% |
| $P_T$ | 0.2 s | 3.2 s | 1823.2 s |

WES, wavelet-based expert system; SELT-tdr, single-ended line testing-time-domain reflectometry.

Figure 5. Comparison among the estimates for $L_T$ of each test line for WES and the two baseline methods.

These results are summarized in Table II, which also presents the average processing time $P_T$, measured in seconds, that each technique needed to process the measurements. These figures are only approximate because the software code was not optimized. However, they indicate the relative complexity of the compared methods. The results showed that TIMEC was the most time-consuming technique.
WES and SELT-tdr were less time-consuming, with WES being an order of magnitude faster than SELT-tdr.

Figure 5 presents the results of the estimation of $L_T$ for each test line in terms of the percentage error $e_p(L_T)$, given by

$$ e_p(L_T) = \frac{\left|\hat{L}_T - L_T\right|}{L_T} \times 100, \tag{21} $$

where $\hat{L}_T$ is the estimated value of $L_T$. Lines 1 to 3 and lines 4 to 6 are the same except for the cable types. For WES, with respect to the identification of $T$ and the estimation of $L$, the difference between the two types of cable was irrelevant: the results were exactly the same for lines 1 and 4, lines 2 and 5, and lines 3 and 6.

With the exception of lines 1, 8, and 12, WES presented the best results for the estimation of $L_T$. On average, $e_p(L_T)$ for WES was 1.40% (15.79 m), whereas for TIMEC, excluding the result of line 9, it was 2.92% (27.89 m), and for SELT-tdr, excluding the results of lines 10 and 11, it was 4.13% (30.36 m).

Table III. Error obtained in the bridge tap identification for each evaluated technique. The best estimated results for each parameter are highlighted by bold characters.

| Line | Parameter | WES (%) | SELT-tdr (%) | TIMEC (%) |
|---|---|---|---|---|
| 11 | $e(l_{BT})$ | 2.40 | - | 6.85 |
| 11 | $e(p_{BT})$ | 9.24 | - | 1.20 |
| 12 | $e(l_{BT})$ | 4.68 | 0.18 | 0.04 |
| 12 | $e(p_{BT})$ | 4.73 | 3.91 | 0.30 |

WES, wavelet-based expert system; SELT-tdr, single-ended line testing-time-domain reflectometry.

The bridge tap identification is evaluated using the absolute percentage error in the estimation of the length and the position of the bridge tap, given respectively by

$$ e(\hat{l}_{BT}) = \frac{\left|\hat{l}_{BT} - l_{BT}\right|}{l_{BT}} \times 100 \quad \text{and} \quad e(\hat{p}_{BT}) = \frac{\left|\hat{p}_{BT} - p_{BT}\right|}{p_{BT}} \times 100, \tag{22} $$

where $\hat{l}_{BT}$ and $l_{BT}$ are, respectively, the estimated and the real bridge tap length, and $\hat{p}_{BT}$ and $p_{BT}$ are, respectively, the estimated and the real bridge tap position. Table III summarizes the results obtained using WES and the baseline methods for the bridge tap identification.
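Equations (21) and (22) share the same form, so a single helper is enough to check the reported figures. The sketch below is only a consistency check: the helper name is hypothetical, the lengths are taken from the actual and WES columns of Table I, and the estimated total length of each line is assumed here to equal the sum of its estimated serial sections.

```python
def percent_error(estimate, truth):
    """Absolute error relative to the true value, in percent (Eqs. 21 and 22)."""
    return abs(estimate - truth) / truth * 100.0


# Line 11, WES: 200 m bridge tap placed after 500 m of serial cable (Table I).
e_l_bt = percent_error(204.8, 200.0)   # bridge tap length error
e_p_bt = percent_error(546.2, 500.0)   # bridge tap position error

# True and WES-estimated total serial lengths for lines 1 to 12, in meters.
# Assumption: each estimate is the sum of the serial sections listed in Table I.
true_lt = [400, 500, 1000, 400, 500, 1000, 1000, 1500, 1400, 1700, 1500, 1200]
wes_lt = [409.6, 500.7, 1001.4, 409.6, 500.7, 1001.4,
          1001.4, 1547.5, 1411.0, 1729.6, 1547.6, 1229.0]

errors = [percent_error(est, ref) for est, ref in zip(wes_lt, true_lt)]
mean_ep = sum(errors) / len(errors)
mean_abs_m = sum(abs(est - ref) for est, ref in zip(wes_lt, true_lt)) / len(true_lt)

print(f"e(l_BT) = {e_l_bt:.2f}%, e(p_BT) = {e_p_bt:.2f}%")
print(f"mean e_p(L_T) = {mean_ep:.2f}%, mean absolute error = {mean_abs_m:.2f} m")
```

Under these assumptions, the computed values agree with the WES entries for line 11 in Table III (2.40% and 9.24%) and with the 1.40% (15.79 m) quoted for the total length.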
The best estimated results for each parameter are highlighted by bold characters. WES and TIMEC were able to successfully identify the bridge taps in lines 11 and 12, whereas SELT-tdr missed the bridge tap in line 11. However, TIMEC found a nonexistent bridge tap in line 9. Except for the bridge tap length of line 11, TIMEC presented the best results for bridge tap identification. Overall, the errors for the three methods were smaller than 10%.

In relation to the classification of $\Phi$, WES estimated the correct cable type sequence in eight cases, whereas TIMEC succeeded in three cases. SELT-tdr missed the sequence for all lines. However, considering the 22 line sections present in the 12 test lines, TIMEC classified 13 line sections with the right type of cable, whereas WES correctly classified 14. SELT-tdr correctly classified two sections.

6. CONCLUSIONS

An expert system for line topology identification based on wavelet analysis was proposed in this paper. The method is based on the examination of the impulse response (estimated via $H(f)$) and the TDR trace (which can be obtained directly or from $Z_{in}$) of a line under test. The signals are segmented by an edge detection method based on wavelets. The detected edges are analyzed, and the main features are extracted from the signals using two algorithms specifically designed for $r(t)$ and $h(t)$, respectively. The signal features are interpreted by an expert system composed of three sequential modules that estimate, respectively, the type (serial or bridge tap), the length, and the cable type of the line sections.

A comparison between WES and two state-of-the-art methods was presented. These results show advantages of the proposed technique, such as the simplicity of its implementation and its low computational cost. Besides the lower cost, the majority of the results achieved by WES were considerably more accurate than those of the other two methods.
It is important to take into account that TIMEC and WES use SELT and DELT measurements, whereas SELT-tdr uses only SELT measurements. For the data used, the WES processing time varied between 0.17 and 0.22 s.

These results demonstrate that WES is able to evaluate a large number of lines of a network in a relatively short time with good accuracy. This may be important to speed up the user feedback after a query regarding the line under test. Furthermore, because of the simplicity of its implementation and its low computational cost, WES can be considered for embedding in distribution points, which would be very useful for VDSL2 and G.fast, where the DSL transmission equipment is not in the CO. Although this method has been tested here on twisted-pair copper cables used for DSL communication, its application is not restricted to this type of transmission line.

Finally, WES is composed of three separate modules and can easily be complemented by other techniques. For instance, once the sets $T$ and $L$ have been identified, any other methodology can be used to classify the cable types.

ACKNOWLEDGEMENTS

This work received financial support from the Research and Development Centre, Ericsson Telecomunicações S.A., Brazil, and CAPES, Brazilian Ministry of Education.

REFERENCES

1. World broadband statistics: a short report from global broadband statistics - Q1 2012. Technical Report, Point Topic Ltd: London, UK, 2012.
2. Golden P, Dedieu H, Jacobsen K. Fundamentals of DSL Technology. Auerbach Publications: Boca Raton, FL, USA, 2006.
3. T'Joens Y, Bostoen T, Bosch SVd, Vandaele P. Managing home and access domains in modern broadband networks. Bell Labs Technical Journal 2008; 13(1):247–262. DOI: 10.1002/bltj.20293.
4. Bostoen T, Boets P, Zekri M, van Biesen L, Pollet T, Rabijns D.
Estimation of the transfer function of a subscriber loop by means of a one-port scattering parameter measurement at the central office. IEEE Journal on Selected Areas in Communications 2002; 20(5):936–948. DOI: 10.1109/JSAC.2002.1007376.
5. Rodrigues R, Sales C, Klautau A, Ericson K, Costa J. Transfer function estimation of telephone lines from input impedance measurements. IEEE Transactions on Instrumentation and Measurement 2012; 61(1):43–54. DOI: 10.1109/TIM.2011.2157431.
6. Galli S, Waring DL. Loop makeup identification via single ended testing: beyond mere loop qualification. IEEE Journal on Selected Areas in Communications 2002; 20(5):923–935. DOI: 10.1109/JSAC.2002.1007375.
7. Galli S, Kerpez KJ. Signal processing for single-ended loop make-up identification. Proceedings of the 6th IEEE Workshop on Signal Processing Advances in Wireless Communications, New York City, New York, USA, 2005; 368–374. DOI: 10.1109/SPAWC.2005.1506049.
8. Galli S, Kerpez KJ. Single-ended loop make-up identification-part I: a method of analyzing tdr measurements. IEEE Transactions on Instrumentation and Measurement 2006; 55(2):528–537. DOI: 10.1109/TIM.2006.870134.
9. Kerpez KJ, Galli S. Single-ended loop-makeup identification - part II: improved algorithms and performance results. IEEE Transactions on Instrumentation and Measurement 2006; 55(2):538–549. DOI: 10.1109/TIM.2006.870136.
10. Vermeiren T, Bostoen T, Boets P, Chebab XO, Louage F. Subscriber loop topology classification by means of time-domain reflectometry. Proceedings of the IEEE International Conference on Communications, ICC'03, Vol. 3, Anchorage, Alaska, USA, 2003; 1998–2002. DOI: 10.1109/ICC.2003.1203949.
11. Boets P, Bostoen T, van Biesen L, Pollet T. Measurement, calibration and pre-processing of signals for single-ended subscriber line identification. Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference, IMTC '03, Vol. 1, Vail, Colorado, USA, 2003; 338–343.
12. Boets P, Bostoen T, van Biesen L, Gardan D. Single-ended line testing: a white box approach. Proceedings of the 4th IASTED International Multi-Conference on Wireless and Optical Communications, Banff, AB, Canada, 2004; 393–398.
13. Boets P, Bostoen T, van Biesen L, Pollet T. Preprocessing of signals for single-ended subscriber line testing. IEEE Transactions on Instrumentation and Measurement 2006; 55(5):1509–1518. DOI: 10.1109/TIM.2006.880290.
14. Bharathi M, Ravishankar S. Single ended loop topology estimation using FDR and correlation TDR in a DSL modem. Cyber Journals: Multidisciplinary Journals in Science and Technology, Journal of Selected Areas in Telecommunications (JSAT) 2012; 2(6):40–48.
15. Neus C. Reflectometric analysis of transmission line networks. PhD Thesis, Vrije Universiteit Brussel, 2011.
16. Sales C, Rodrigues RM, Lindqvist F, Costa J, Klautau A, Ericson K, i Riu JR, Börjesson PO. Line topology identification using multi-objective evolutionary computation. IEEE Transactions on Instrumentation & Measurement 2010; 59(3):715–729.
17. Kerpez KJ. Automated loop identification on DSL lines. International Journal of Communication Systems 2009; 22(12):1479–1493. DOI: 10.1002/dac.1029.
18. Boets P, Zekri M, van Biesen L, Bostoen T, Pollet T. On the identification of cables for metallic access networks. Proceedings of the 18th IEEE Instrumentation and Measurement Technology Conference, IMTC'01, Vol. 2, Budapest, Hungary, 2001; 1348–1353. DOI: 10.1109/IMTC.2001.928292.
19. Test procedures for digital subscriber line (DSL) transceivers. International Telecommunication Union (ITU) recommendation G.996.1, 2001.
20. Time Domain Reflectometry Theory. Application Note 1304-2. Agilent Technologies, Inc.: USA, 2006.
21. Neus C, Boets P, van Biesen L. Feature extraction of one port scattering parameters for single ended line testing. XVIII IMEKO World Congress, Rio de Janeiro, Brazil, 2006; 6 pages.
22. Ruinskiy D, Lavner Y. An effective algorithm for automatic detection and exact demarcation of breath sounds in speech and song signals. IEEE Transactions on Audio, Speech, and Language Processing 2007; 15(3):838–850.
23. Ng B, Ab-Rahman MS, Premadi A. Development of monitoring system for FTTH-PON using combined ACS and SANTAD. International Journal of Communication Systems 2010; 23:429–446. DOI: 10.1002/dac.1078.
24. Nes PG. Fast multi-scale edge-detection in medical ultrasound signals. Signal Processing 2012; 92(10):2394–2408. DOI: 10.1016/j.sigpro.2012.02.021.
25. Luo GY. On-line wavelet filtering of narrowband noise in signal detection of spread spectrum system for location tracking. International Journal of Communication Systems 2012; 25(5):598–615. DOI: 10.1002/dac.1278.
26. Lv S, Liu J. A novel signal separation algorithm based on compressed sensing for wideband spectrum sensing in cognitive radio networks. International Journal of Communication Systems 2013. DOI: 10.1002/dac.2495.
27. Bhuyan MK, Chakraborty BK. Motion adaptive video coding scheme for time-varying network. International Journal of Communication Systems 2013. DOI: 10.1002/dac.2712.
28. Huang H, Huang S, Chen J, Wang R, Xiong J. An image information hiding algorithm based on grey system theory. International Journal of Communication Systems 2013. DOI: 10.1002/dac.2551.
29. Mallat S. A Wavelet Tour of Signal Processing (2nd edn). Academic Press: San Diego, CA, USA, 1999.
30. Borges G, Rodrigues RM, Sales C, Ericson K, Costa J. Cable parameters identification for DSL systems. Proceedings of IEEE International Conference on Computer as a Tool, EUROCON'11, Lisbon, Portugal, 2011; 1–4.