id: cord-311839-61djk4bs
author: Wei, Dan
title: A novel hierarchical clustering algorithm for gene sequences
date: 2012-07-23
pages:
extension: .txt
mime: text/plain
words: 8033
sentences: 496
flesch: 61
summary: We propose a new alignment-free algorithm, mBKM, based on a new distance measure, DMk, for clustering gene sequences. DMk shows better performance than the k-tuple distance in our experiments, and mBKM outperforms SL, CL, AL, BKM and KM when tested on public gene sequence datasets. In this paper we propose a new alignment-free similarity measure, DMk, based on which we developed mBKM to cluster gene sequences. To evaluate the proposed similarity measure, we test DMk on gene sequence data sets and compare it with the k-tuple distance. Moreover, we use our method, mBKM with similarity measure DMk, in phylogenetic analysis to show how well the genes are grouped together and how well the resulting trees agree with existing phylogenies. In order to illustrate the efficiency of mBKM in gene sequence clustering, we ran mBKM with the k-tuple distance and DMk on real data sets listed in Table 1.
cache: ./cache/cord-311839-61djk4bs.txt
txt: ./txt/cord-311839-61djk4bs.txt