Ann Chen
Yahoo
Senior Research Engineer
Different situations call for different tools when running machine learning algorithms on big data. In this talk, we introduce two machine learning tools: Mahout, which is used for batch training, and SAMOA, which is used for online streaming computation. We also look at Spark, a next-generation MapReduce-style platform that offers large performance gains over Hadoop. Mahout and other machine learning tools are moving toward Spark. Although this work is not yet mature, we will take a look at the progress made so far.