id: work_x6fbaqa72nhnhazv3jeeoqz4ci
author: Wenqian Shang
title: A novel feature selection algorithm for text categorization
date: 2007
pages: 5
extension: .pdf
mime: application/pdf
words: 3500
sentences: 459
flesch: 67
summary: A new measure function of Gini index is constructed and made to fit text categorization. show that our improvements of Gini index behave better than other methods of feature selection. Keywords: Text feature selection; Text categorization; Gini index; kNN classifier; Text preprocessing. information gain, expected cross entropy, the weight of evidence of text, odds ratio, term frequency, mutual information, CHI (Yang & Pedersen, 1997; Mladenic & Grobelnik, index for text feature selection and weight-adjustment. Through deeply analyzing the principles of Gini index and text feature, we construct a new measure function of Gini index and use it to select features. The performance of five feature selection measure functions on top 10 classes. The performance of five feature selection measure functions on training set B. feature selection methods in text categorization.
cache: ./cache/work_x6fbaqa72nhnhazv3jeeoqz4ci.pdf
txt: ./txt/work_x6fbaqa72nhnhazv3jeeoqz4ci.txt