- Modern Big Data Processing with Hadoop
- V. Naresh Kumar Prashant Shindgikar
- 2025-04-04 17:12:20
Important architecture points
The following are some important points to remember about HDFS HA using the Quorum Journal Manager (QJM) architecture:
- The cluster has two separate machines configured as NameNodes: one in the active state and one in the standby state.
- At any point in time, exactly one of the NameNodes is in an active state, and the other is in a standby state.
- The active NameNode handles all client operations in the cluster, while the standby acts as a slave, maintaining enough state to provide a fast failover.
- All the DataNodes are configured in such a way that they send their block reports and heartbeats to both active and standby NameNodes.
- Both NameNodes, active and standby, remain synchronized with each other by communicating with a group of separate daemons called JournalNodes (JNs).
- When a client makes any change to the filesystem namespace, the active NameNode durably logs a record of the modification to a majority of these JNs.
- The standby NameNode continuously watches the JNs for new edits and applies them to its own namespace as it reads them.
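- If the active NameNode becomes unavailable, the standby NameNode first makes sure that it has read all the outstanding changes (edits) from the JNs, and only then promotes itself to active. This guarantees that its namespace is fully synchronized before the failover.
- To prevent both NameNodes from being active at the same time (a split-brain scenario), the JNs only ever allow a single NameNode to be a writer at a time. During a failover, the NameNode that is becoming active takes over the writer role, so the other NameNode can no longer log edits, and the new active NameNode can safely proceed with the failover.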
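
The interplay of majority logging, standby catch-up, and single-writer enforcement can be illustrated with a small simulation. This is only a toy sketch, not Hadoop code: the `JournalNode` and `NameNode` classes and all of their methods are hypothetical names invented for illustration, and the sketch assumes that writers are distinguished by an increasing epoch number. It also ignores network failures and the recovery of partially written edits.

```python
class JournalNode:
    """Toy stand-in for a JN: stores edits and remembers the newest writer epoch."""

    def __init__(self):
        self.promised_epoch = 0
        self.edits = []  # list of (txid, operation) pairs

    def new_epoch(self, epoch):
        # Accept a new writer only if its epoch is strictly newer.
        if epoch <= self.promised_epoch:
            return False
        self.promised_epoch = epoch
        return True

    def journal(self, epoch, txid, op):
        # Reject edits from a writer that has been superseded (fencing).
        if epoch < self.promised_epoch:
            return False
        self.edits.append((txid, op))
        return True


class NameNode:
    """Toy stand-in for a NameNode that is either active or standby."""

    def __init__(self, name, journal_nodes):
        self.name = name
        self.jns = journal_nodes
        self.epoch = 0
        self.active = False
        self.namespace = []  # applied operations, in order
        self.last_applied_txid = 0

    def _is_majority(self, acks):
        return acks > len(self.jns) // 2

    def catch_up(self):
        # Standby behaviour: read edits from the JNs and apply the unseen ones.
        seen = {}
        for jn in self.jns:
            for txid, op in jn.edits:
                seen[txid] = op
        for txid in sorted(seen):
            if txid > self.last_applied_txid:
                self.namespace.append(seen[txid])
                self.last_applied_txid = txid

    def become_active(self, epoch):
        # Claim the writer role on a majority of JNs, then absorb all edits.
        acks = sum(jn.new_epoch(epoch) for jn in self.jns)
        if not self._is_majority(acks):
            return False
        self.epoch = epoch
        self.catch_up()
        self.active = True
        return True

    def log_edit(self, op):
        if not self.active:
            raise RuntimeError(f"{self.name} is not active")
        txid = self.last_applied_txid + 1
        acks = sum(jn.journal(self.epoch, txid, op) for jn in self.jns)
        if not self._is_majority(acks):
            self.active = False  # lost the writer role: step down
            return False
        self.namespace.append(op)
        self.last_applied_txid = txid
        return True


# Usage: nn1 writes, nn2 takes over, and nn1's late write is rejected.
jns = [JournalNode() for _ in range(3)]
nn1, nn2 = NameNode("nn1", jns), NameNode("nn2", jns)
nn1.become_active(epoch=1)
nn1.log_edit("mkdir /a")
nn2.become_active(epoch=2)            # failover: nn2 catches up, then goes active
stale_ok = nn1.log_edit("mkdir /b")   # rejected by the JNs, so nn1 steps down
```

Because an edit counts as committed only once a majority of JNs acknowledge it, and the JNs refuse writes carrying an older epoch, the old active NameNode in this sketch cannot modify the shared log after the standby has taken over.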