pi-cluster

Background / Purpose / Getting Started / Cluster Guide

Background

We live in a digital world full of data. Whether that is un-, semi- or structured data, there is a lot of things we can do with it if we make use of Machine Learning (ML). More specifically, supervised, unsupervised or reinforcement learning algorithms are used to build models that learn (train) and discover patterns from historical data to predict an outcome. There are numerous applications of ML such as weather forecast (prediction), storm tracking models, among others.

Yes. It all sounds the same, let the following image sink in…

The intention of the cluster is to provide to you with the option of being able to Extract, Transform and Load (ETL) data with Spark while using Hadoop’s File System (HDFS) for storaging needs. Data Analysis and Visualization is out this scope and up to your choice.