Set Up a Hadoop Cluster From Scratch

A sample Hadoop cluster augmented with ZKFC and YARN, for MapReduce tasks.

The official Docker image of Apache Hadoop is based on CentOS 7, which reached end of life (EOL) on June 30, 2024, so the base system will receive no further updates. Time to deploy a Hadoop cluster in Docker on our own!

Overview

In this article, we will set up a Hadoop cluster in Docker with 2 NameNodes and 3 DataNodes, along with ZKFC (ZooKeeper Failover Controller) and YARN (Yet Another Resource Negotiator).
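
To make the planned layout concrete, here is a minimal Python sketch that writes the topology down as data and derives one hostname per container. The node counts come from the overview above. The hostname prefixes and the pairing of a NodeManager with each DataNode are assumptions made for illustration, and the placement of the ResourceManager and the ZooKeeper ensemble is deliberately left out because it has not been specified yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str       # hypothetical hostname prefix, not taken from the article
    count: int      # number of containers planned for this role
    daemons: tuple  # Hadoop daemons expected to run in each container


# Counts follow the overview: 2 NameNodes and 3 DataNodes.
# ZKFC runs alongside each NameNode; co-locating a NodeManager
# with every DataNode is an assumption for this sketch.
TOPOLOGY = (
    Role("namenode", 2, ("NameNode", "ZKFC")),
    Role("datanode", 3, ("DataNode", "NodeManager")),
)


def hostnames(topology=TOPOLOGY):
    """Return one hostname per planned container, numbered from 1."""
    return [
        f"{role.name}{index}"
        for role in topology
        for index in range(1, role.count + 1)
    ]


if __name__ == "__main__":
    for host in hostnames():
        print(host)
```

A list like this is handy later as a checklist when naming containers, so that every node referenced in the Hadoop configuration has a matching container.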