A significant increase in the ability to collect and store diverse information over the past decade has led to an outright data explosion, providing larger and richer datasets than ever before. This growth in dataset sizes is accompanied by the challenge of successfully mining the data to discover patterns of interest. Extreme dataset sizes place unprecedented demands on high-performance computing infrastructure, and a gap has developed between the available real-world datasets and our ability to process them. Dataset sizes are quickly approaching the terabyte and petabyte scales. This rate of growth also challenges the subsampling paradigm, as even a subsample of the data can run into gigabytes. Our goal is to exploit recent advances in multi-threaded processor technology for scalable data mining. In this work, we explore one such architecture, the Cray MTA-2, and conjecture that its architectural design is well suited to the application of machine learning to massive datasets. To that end, we present a thorough complexity analysis and experimental evaluation of five popular learning algorithms, using a diverse body of datasets whose sizes vary in both dimensions (number of instances and number of attributes). Our results support an analysis of whether the Cray MTA-2's architectural design makes it an appropriate platform for massively parallel, highly scalable implementations of learning algorithms.