Full course description
The course covers data mining and machine learning algorithms for analyzing very large amounts of data. It introduces big data infrastructure, the distributed computing paradigm, and distributed data analytics algorithms for common real-world applications. The emphasis is on MapReduce and Spark as tools for creating parallel algorithms that can process data at this scale.