2.3 NameServer Route Registration and Fault Removal

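NameServer's main job is to give message producers and consumers routing information for topics. To do that, it must store the underlying route data and manage Broker nodes, which includes route registration and route removal.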

2.3.1 Route Metadata

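The route implementation class in NameServer is org.apache.rocketmq.namesrv.routeinfo.RouteInfoManager. Before looking at route registration, let us first see what information NameServer actually stores, as shown in Listing 2-6.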

Listing 2-6 RouteInfoManager route metadata

private final HashMap<String/* topic */, List<QueueData>> topicQueueTable;
private final HashMap<String/* brokerName */, BrokerData> brokerAddrTable;
private final HashMap<String/* clusterName */, Set<String/* brokerName */>>
    clusterAddrTable;
private final HashMap<String/* brokerAddr */, BrokerLiveInfo> brokerLiveTable;
private final HashMap<String/* brokerAddr */, List<String>/* Filter Server */>
    filterServerTable;

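1) topicQueueTable: routing information for each topic's message queues. Producers load-balance across this table when sending messages.

2) brokerAddrTable: basic Broker information, including brokerName, the name of the owning cluster, and the addresses of the master and slave Brokers.

3) clusterAddrTable: Broker cluster information, storing the names of all Brokers in each cluster.

4) brokerLiveTable: Broker liveness information. NameServer replaces the entry each time it receives a heartbeat packet.

5) filterServerTable: the list of FilterServers on each Broker, used for class-mode message filtering. Class-mode filtering is deprecated in version 4.4 and later.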

The class diagram for QueueData, BrokerData, and BrokerLiveInfo is shown in Figure 2-3.

Figure 2-3 Route metadata class diagram

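Note

RocketMQ is built on a publish-subscribe model. A topic has multiple message queues, and by default a Broker creates 4 read queues and 4 write queues for each topic. Multiple Brokers form a cluster, and Brokers that share the same brokerName form a master-slave group: brokerId=0 denotes the master, and brokerId>0 denotes a slave. The lastUpdateTimestamp field in BrokerLiveInfo stores the time at which the last heartbeat packet from the Broker was received.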

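The data structure of a RocketMQ deployment with 2 masters and 2 slaves is shown in Figure 2-4.

Figure 2-4 Data structure of a RocketMQ 2-master, 2-slave deployment

The corresponding in-memory structures at runtime are shown in Figures 2-5 and 2-6.

Figure 2-5 Runtime memory structure of topicQueueTable and brokerAddrTable

Figure 2-6 Runtime memory structure of brokerLiveTable and clusterAddrTable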
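As a rough illustration of what such a 2-master, 2-slave layout looks like in memory, the sketch below fills three of the tables with plain maps. The cluster name c1, the broker names broker-a and broker-b, the addresses, and topic1 are all made-up values, and QueueData and BrokerData are flattened into strings and nested maps, so this is not the RouteInfoManager code itself.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RouteTablesSketch {
    // brokerName -> (brokerId -> address); brokerId 0 is the master.
    static Map<String, Map<Long, String>> brokerAddrTable() {
        Map<String, Map<Long, String>> table = new HashMap<>();
        table.put("broker-a", addrs("192.168.56.1:10911", "192.168.56.2:10911"));
        table.put("broker-b", addrs("192.168.56.3:10911", "192.168.56.4:10911"));
        return table;
    }

    // Hypothetical helper: one master (id 0) and one slave (id 1).
    private static Map<Long, String> addrs(String master, String slave) {
        Map<Long, String> m = new HashMap<>();
        m.put(0L, master);
        m.put(1L, slave);
        return m;
    }

    // clusterName -> names of the brokers in that cluster.
    static Map<String, Set<String>> clusterAddrTable() {
        Map<String, Set<String>> table = new HashMap<>();
        table.put("c1", new HashSet<>(Arrays.asList("broker-a", "broker-b")));
        return table;
    }

    // topic -> one route entry per broker name, written here as
    // "brokerName/readQueueNums/writeQueueNums" instead of a QueueData object.
    static Map<String, List<String>> topicQueueTable() {
        Map<String, List<String>> table = new HashMap<>();
        table.put("topic1", Arrays.asList("broker-a/4/4", "broker-b/4/4"));
        return table;
    }

    public static void main(String[] args) {
        System.out.println(brokerAddrTable());
        System.out.println(clusterAddrTable());
        System.out.println(topicQueueTable());
    }
}
```

Note that topicQueueTable is keyed by broker name rather than by address, so a master and its slave share one route entry, whereas brokerLiveTable would hold one entry for each of the four addresses.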

2.3.2 Route Registration

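RocketMQ implements route registration through the heartbeat between Broker and NameServer. On startup, a Broker sends a heartbeat packet to every NameServer in the cluster, and afterwards it sends one to every NameServer every 30s. When NameServer receives a heartbeat packet, it first updates the lastUpdateTimestamp of the corresponding BrokerLiveInfo in the brokerLiveTable cache. NameServer also scans brokerLiveTable every 10s; if no heartbeat has been received from a Broker for 120s, NameServer removes that Broker's routing information and closes the socket connection.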
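The timing rules can be made concrete with a small simulation. The sketch below keeps only a last-heartbeat timestamp per Broker address and takes the current time as a parameter so that it is deterministic; the class and method names are hypothetical, and the real BrokerLiveInfo also carries the channel, data version, and HA address.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class BrokerLivenessSketch {
    // Expiry window from the text: 120s without a heartbeat.
    static final long EXPIRE_MS = 120_000L;

    // brokerAddr -> time the last heartbeat was received (ms).
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    // Called for every heartbeat: simply overwrite the timestamp.
    void onHeartbeat(String brokerAddr, long nowMs) {
        lastHeartbeat.put(brokerAddr, nowMs);
    }

    // Called by the periodic scan: drop every broker whose last heartbeat
    // is more than EXPIRE_MS old and report which ones were dropped.
    List<String> scan(long nowMs) {
        List<String> removed = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastHeartbeat.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() + EXPIRE_MS < nowMs) {
                removed.add(entry.getKey());
                it.remove();
            }
        }
        return removed;
    }

    boolean isLive(String brokerAddr) {
        return lastHeartbeat.containsKey(brokerAddr);
    }

    public static void main(String[] args) {
        BrokerLivenessSketch table = new BrokerLivenessSketch();
        table.onHeartbeat("10.0.0.1:10911", 0L);
        System.out.println(table.scan(130_000L));
    }
}
```

With a 30s heartbeat interval, a Broker has to miss about four consecutive heartbeats before a scan removes it, which tolerates short network hiccups at the cost of slower failure detection.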

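1. Broker sends heartbeat packets

The core code with which the Broker sends heartbeat packets is shown in Listings 2-7 and 2-8.

Listing 2-7 Broker sends heartbeat packets (BrokerController#start)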

// First run after 10s, then once every 30s.
this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
    @Override
    public void run() {
        try {
            BrokerController.this.registerBrokerAll(true, false);
        } catch (Throwable e) {
            log.error("registerBrokerAll Exception", e);
        }
    }
}, 1000 * 10, 1000 * 30, TimeUnit.MILLISECONDS);

Listing 2-8 Broker sends heartbeat packets (BrokerOuterAPI#registerBrokerAll)

List<String> nameServerAddressList =
this.remotingClient.getNameServerAddressList();
if (nameServerAddressList != null) {
    for (String namesrvAddr : nameServerAddressList) { // Iterate over all NameServers
        try {
            // Register with this NameServer
            RegisterBrokerResult result = this.registerBroker(namesrvAddr,
                clusterName, brokerAddr, brokerName, brokerId, haServerAddr,
                topicConfigWrapper, filterServerList, oneway, timeoutMills);
            if (result != null) {
                registerBrokerResult = result;
            }
            log.info("register broker to name server {} OK", namesrvAddr);
        } catch (Exception e) {
            log.warn("registerBroker Exception, {}", namesrvAddr, e);
        }
    }
}

This method iterates over the NameServer list, and the Broker sends a heartbeat packet to each NameServer in turn. The sending code is shown in Listing 2-9.
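One detail of the loop in Listing 2-8 is worth isolating: an exception from one NameServer is caught inside the loop, so the remaining NameServers are still contacted. The sketch below reproduces just that control flow. The send function is a hypothetical stand-in for the network call and results are plain strings, so this is an illustration rather than the BrokerOuterAPI code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class RegisterAllSketch {
    // Contacts every NameServer in order and returns the last non-null result.
    // `send` is a hypothetical stand-in for the real registration call; it may
    // return null or throw for an unreachable server.
    static String registerAll(List<String> nameServers, Function<String, String> send) {
        String lastResult = null;
        if (nameServers != null) {
            for (String addr : nameServers) {
                try {
                    String result = send.apply(addr);
                    if (result != null) {
                        lastResult = result;
                    }
                } catch (RuntimeException e) {
                    // A failure on one NameServer must not stop the loop.
                    System.err.println("register failed for " + addr + ": " + e.getMessage());
                }
            }
        }
        return lastResult;
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("ns1:9876", "ns2:9876");
        System.out.println(registerAll(servers, addr -> "ok from " + addr));
    }
}
```

Because each NameServer is registered independently, NameServers do not need to synchronize route data with one another; each builds its own view from the heartbeats it receives.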

Listing 2-9 BrokerOuterAPI#registerBroker (network sending code)

RegisterBrokerRequestHeader requestHeader = new RegisterBrokerRequestHeader();
requestHeader.setBrokerAddr(brokerAddr);
requestHeader.setBrokerId(brokerId);
requestHeader.setBrokerName(brokerName);
requestHeader.setClusterName(clusterName);
requestHeader.setHaServerAddr(haServerAddr);
RemotingCommand request = RemotingCommand.createRequestCommand(
              RequestCode.REGISTER_BROKER, requestHeader);
RegisterBrokerBody requestBody = new RegisterBrokerBody();
requestBody.setTopicConfigSerializeWrapper(topicConfigWrapper);
requestBody.setFilterServerList(filterServerList);
request.setBody(requestBody.encode());
if (oneway) {
    try {
        this.remotingClient.invokeOneway(namesrvAddr, request, timeoutMills);
    } catch (RemotingTooMuchRequestException e) {
        // Ignore
    }
    return null;
}
RemotingCommand response = this.remotingClient.invokeSync(namesrvAddr, request,
        timeoutMills);

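We now look at the heartbeat logic in detail. The first step is to build the request header, followed by the request body.

1) brokerAddr: the Broker address.

2) brokerId: brokerId=0 denotes the master, and brokerId>0 denotes a slave.

3) brokerName: the Broker name.

4) clusterName: the cluster name.

5) haServerAddr: the master's HA address. It is empty on the first request; a slave receives it in the response after registering with NameServer.

6) requestBody:

  • topicConfigWrapper: topic configuration. It wraps the topicConfigTable held by TopicConfigManager, which stores the topics the Broker creates by default at startup, such as MixAll.SELF_TEST_TOPIC, MixAll.DEFAULT_TOPIC (when autoCreateTopicEnable=true), MixAll.BENCHMARK_TOPIC, MixAll.OFFSET_MOVED_EVENT, BrokerConfig#brokerClusterName, and BrokerConfig#brokerName. By default the Broker persists its topics in ${Rocket_Home}/store/config/topics.json.
  • filterServerList: the list of message filter servers.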

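Note

RocketMQ's network transport is built on Netty. This book does not dissect the network implementation in detail, but here is a way to trace network calls. RocketMQ defines a RequestCode for every request, and the server side has a corresponding network processor (in the processor package). Searching the whole codebase for the RequestCode leads directly to the handling logic. Readers interested in Netty can refer to the author's article (https://blog.csdn.net/prestigeding/article/details/53977445).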
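The tracing tip works because a processor is essentially one entry point that branches on the request code. The sketch below shows that shape with a hypothetical enum and string payloads; RocketMQ's own RequestCode uses int constants and the processor works on RemotingCommand objects, so the names here are illustrative only.

```java
class RequestDispatchSketch {
    // Hypothetical request codes, used only for this illustration.
    enum Code { REGISTER_BROKER, UNREGISTER_BROKER, GET_ROUTE_BY_TOPIC, UNKNOWN }

    // A single entry point receives every request and branches on its code.
    static String processRequest(Code code, String body) {
        switch (code) {
            case REGISTER_BROKER:
                return registerBroker(body);
            case UNREGISTER_BROKER:
                return unregisterBroker(body);
            case GET_ROUTE_BY_TOPIC:
                return getRouteByTopic(body);
            default:
                return "unsupported request: " + code;
        }
    }

    // Hypothetical handlers; each would delegate to the route manager.
    private static String registerBroker(String brokerAddr) {
        return "registered " + brokerAddr;
    }

    private static String unregisterBroker(String brokerAddr) {
        return "unregistered " + brokerAddr;
    }

    private static String getRouteByTopic(String topic) {
        return "route for " + topic;
    }

    public static void main(String[] args) {
        System.out.println(processRequest(Code.REGISTER_BROKER, "10.0.0.1:10911"));
    }
}
```

Searching for a code such as REGISTER_BROKER therefore finds both the place where the client builds the request and the branch where the server handles it.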

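2. NameServer handles heartbeat packets

org.apache.rocketmq.namesrv.processor.DefaultRequestProcessor is the network processor that resolves the request type. If the request type is RequestCode.REGISTER_BROKER, the request is ultimately forwarded to RouteInfoManager#registerBroker, as shown in Listing 2-10.

Listing 2-10 RouteInfoManager#registerBroker: maintaining clusterAddrTable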

this.lock.writeLock().lockInterruptibly();
Set<String> brokerNames = this.clusterAddrTable.get(clusterName);
if (null == brokerNames) {
    brokerNames = new HashSet<String>();
    this.clusterAddrTable.put(clusterName, brokerNames);
}
brokerNames.add(brokerName);

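Step 1: Route registration takes the write lock to prevent concurrent modification of the route tables in RouteInfoManager. It first checks whether the Broker's cluster exists, creates it if it does not, and then adds the Broker name to the cluster's set of Brokers. The next step is shown in Listing 2-11.

Listing 2-11 RouteInfoManager#registerBroker: maintaining brokerAddrTable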

BrokerData brokerData = this.brokerAddrTable.get(brokerName);
if (null == brokerData) {
    registerFirst = true;
    brokerData = new BrokerData(clusterName, brokerName,
        new HashMap<Long, String>());
    this.brokerAddrTable.put(brokerName, brokerData);
}
String oldAddr = brokerData.getBrokerAddrs().put(brokerId, brokerAddr);
registerFirst = registerFirst || (null == oldAddr);

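Step 2: Maintain the BrokerData information. The method first looks up the Broker in brokerAddrTable by its Broker name. If no entry exists, it creates a new BrokerData, puts it into brokerAddrTable, and sets registerFirst to true. If an entry exists, the address stored under this brokerId is replaced; registerFirst stays false only when an address was already stored under that brokerId, which means this is not the first registration. Topic route maintenance follows in Listing 2-12.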
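The registerFirst rule is easy to misread, so here is a minimal sketch of steps 1 and 2 that returns the flag. BrokerData is reduced to a map from brokerId to address and locking is left out; the class name and the addresses in the example are made up.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class RegisterFirstSketch {
    // clusterName -> broker names
    private final Map<String, Set<String>> clusterAddrTable = new HashMap<>();
    // brokerName -> (brokerId -> address), standing in for BrokerData
    private final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>();

    // Returns true when this (brokerName, brokerId) pair had no address before.
    boolean register(String clusterName, String brokerName, long brokerId, String brokerAddr) {
        // Step 1: create the cluster on demand and record the broker name.
        clusterAddrTable.computeIfAbsent(clusterName, k -> new HashSet<>()).add(brokerName);

        // Step 2: create the broker entry on demand, then store the address.
        boolean registerFirst = false;
        Map<Long, String> addrs = brokerAddrTable.get(brokerName);
        if (addrs == null) {
            registerFirst = true;
            addrs = new HashMap<>();
            brokerAddrTable.put(brokerName, addrs);
        }
        String oldAddr = addrs.put(brokerId, brokerAddr);
        return registerFirst || oldAddr == null;
    }

    public static void main(String[] args) {
        RegisterFirstSketch sketch = new RegisterFirstSketch();
        System.out.println(sketch.register("c1", "broker-a", 0L, "10.0.0.1:10911"));
        System.out.println(sketch.register("c1", "broker-a", 0L, "10.0.0.1:10911"));
    }
}
```

A slave that joins an already known broker name still counts as a first registration, because no address was stored under its brokerId before.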

Listing 2-12 RouteInfoManager#registerBroker: maintaining topicQueueTable

if (null != topicConfigWrapper && MixAll.MASTER_ID == brokerId) {
    if (this.isBrokerTopicConfigChanged(brokerAddr,
                topicConfigWrapper.getDataVersion()) || registerFirst) {
        ConcurrentMap<String, TopicConfig> tcTable =
                    topicConfigWrapper.getTopicConfigTable();
        if (tcTable != null) {
            for (Map.Entry<String, TopicConfig> entry : tcTable.entrySet()) {
                this.createAndUpdateQueueData(brokerName, entry.getValue());
            }
        }
    }
}

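Step 3: If the Broker is a master and either its topic configuration has changed or this is its first registration, the topic route metadata must be created or updated and topicQueueTable populated. In effect, this automatically registers routing information for the default topics, including MixAll.DEFAULT_TOPIC. When a producer sends a message to a topic that has not been created and autoCreateTopicEnable in BrokerConfig is true, the routing information of MixAll.DEFAULT_TOPIC is returned. The method that does this work is shown in Listing 2-13.

Listing 2-13 RouteInfoManager#createAndUpdateQueueData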

private void createAndUpdateQueueData(final String brokerName, final TopicConfig
topicConfig) {
    QueueData queueData = new QueueData();
    queueData.setBrokerName(brokerName);
    queueData.setWriteQueueNums(topicConfig.getWriteQueueNums());
    queueData.setReadQueueNums(topicConfig.getReadQueueNums());
    queueData.setPerm(topicConfig.getPerm());
    queueData.setTopicSynFlag(topicConfig.getTopicSysFlag());
    List<QueueData> queueDataList =
                    this.topicQueueTable.get(topicConfig.getTopicName());
    if (null == queueDataList) {
        queueDataList = new LinkedList<QueueData>();
        queueDataList.add(queueData);
        this.topicQueueTable.put(topicConfig.getTopicName(),
            queueDataList);
        log.info("new topic registerd, {} {}", topicConfig.getTopicName(),
            queueData);
    } else {
        boolean addNewOne = true;
        Iterator<QueueData> it = queueDataList.iterator();
        while (it.hasNext()) {
            QueueData qd = it.next();
            if (qd.getBrokerName().equals(brokerName)) {
                if (qd.equals(queueData)) {
                    addNewOne = false;
                } else {
                    log.info("topic changed, {} OLD: {} NEW: {}",
                        topicConfig.getTopicName(), qd, queueData);
                    it.remove();
                }
            }
        }
        if (addNewOne) {
            queueDataList.add(queueData);
        }
    }
}

The method builds a QueueData structure from topicConfig and then updates topicQueueTable. Registration continues in Listing 2-14.
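The update rule in Listing 2-13 amounts to "at most one entry per Broker name, replaced only when it differs". The sketch below expresses the same rule with removeIf and contains on a simplified QueueData that keeps only the Broker name and queue counts; it is an equivalent formulation for illustration, not the original method.

```java
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class QueueDataSketch {
    // Simplified stand-in for QueueData (perm and sys flag omitted).
    static final class QueueData {
        final String brokerName;
        final int readQueueNums;
        final int writeQueueNums;

        QueueData(String brokerName, int readQueueNums, int writeQueueNums) {
            this.brokerName = brokerName;
            this.readQueueNums = readQueueNums;
            this.writeQueueNums = writeQueueNums;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof QueueData)) {
                return false;
            }
            QueueData other = (QueueData) o;
            return brokerName.equals(other.brokerName)
                && readQueueNums == other.readQueueNums
                && writeQueueNums == other.writeQueueNums;
        }

        @Override
        public int hashCode() {
            return Objects.hash(brokerName, readQueueNums, writeQueueNums);
        }
    }

    private final Map<String, List<QueueData>> topicQueueTable = new HashMap<>();

    void createOrUpdate(String topic, QueueData incoming) {
        List<QueueData> list = topicQueueTable.computeIfAbsent(topic, k -> new LinkedList<>());
        // Drop a stale entry for the same broker name when its settings changed.
        list.removeIf(qd -> qd.brokerName.equals(incoming.brokerName) && !qd.equals(incoming));
        // Add the entry unless an identical one is already present.
        if (!list.contains(incoming)) {
            list.add(incoming);
        }
    }

    List<QueueData> route(String topic) {
        return topicQueueTable.get(topic);
    }

    public static void main(String[] args) {
        QueueDataSketch sketch = new QueueDataSketch();
        sketch.createOrUpdate("topic1", new QueueData("broker-a", 4, 4));
        System.out.println(sketch.route("topic1").size());
    }
}
```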

Listing 2-14 RouteInfoManager#registerBroker

BrokerLiveInfo prevBrokerLiveInfo = this.brokerLiveTable.put(brokerAddr,
    new BrokerLiveInfo(System.currentTimeMillis(),
        topicConfigWrapper.getDataVersion(),
        channel,
        haServerAddr));
if (null == prevBrokerLiveInfo) {
    log.info("new broker registerd, {} HAServer: {}", brokerAddr, haServerAddr);
}

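Step 4: Update BrokerLiveInfo in the table of Brokers that are currently healthy. BrokerLiveInfo is the key input for route removal. The final part of the method is shown in Listing 2-15.

Listing 2-15 RouteInfoManager#registerBroker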

if (filterServerList != null) {
    if (filterServerList.isEmpty()) {
        this.filterServerTable.remove(brokerAddr);
    } else {
            this.filterServerTable.put(brokerAddr, filterServerList);
    }
}
if (MixAll.MASTER_ID != brokerId) {
    String masterAddr = brokerData.getBrokerAddrs().get(MixAll.MASTER_ID);
    if (masterAddr != null) {
        BrokerLiveInfo brokerLiveInfo = this.brokerLiveTable.get(masterAddr);
        if (brokerLiveInfo != null) {
            result.setHaServerAddr(brokerLiveInfo.getHaServerAddr());
            result.setMasterAddr(masterAddr);
        }
    }
}

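Step 5: Register the Broker's list of filter server addresses. A Broker can be associated with multiple FilterServer message filter servers; this is covered in detail in Chapter 6. If the Broker is a slave, the method looks up its master's information and sets the masterAddr and haServerAddr properties of the result accordingly.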

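Design highlight: NameServer keeps a long-lived connection with each Broker, and Broker state is stored in brokerLiveTable. Each time NameServer receives a heartbeat packet, it updates the Broker's state in brokerLiveTable along with the route tables (topicQueueTable, brokerAddrTable, brokerLiveTable, filterServerTable). Updates to these tables (HashMap instances) are guarded by a read-write lock, which is finer-grained than an exclusive lock: multiple message senders can read concurrently, which preserves high concurrency on the send path, while NameServer processes only one Broker heartbeat at a time and handles multiple heartbeat requests serially. This is a classic use case for a read-write lock. For more on read-write locks, see the author's blog post: http://blog.csdn.net/prestigeding/article/details/53286756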
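The locking pattern can be sketched with java.util.concurrent.locks.ReentrantReadWriteLock around a plain HashMap. The class below is a simplified stand-in with a single hypothetical table, and it uses lock() instead of the lockInterruptibly() call seen in the listings in order to avoid checked exceptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class RouteLockSketch {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    // topic -> broker names; HashMap is not thread-safe, so the lock guards it.
    private final Map<String, List<String>> topicToBrokers = new HashMap<>();

    // Write path (heartbeat handling): exclusive, one writer at a time.
    void register(String topic, String brokerName) {
        lock.writeLock().lock();
        try {
            List<String> brokers = topicToBrokers.computeIfAbsent(topic, k -> new ArrayList<>());
            if (!brokers.contains(brokerName)) {
                brokers.add(brokerName);
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Read path (route lookup): many readers may hold the read lock at once.
    List<String> lookup(String topic) {
        lock.readLock().lock();
        try {
            List<String> brokers = topicToBrokers.get(topic);
            // Return a copy so callers never touch the guarded list directly.
            return brokers == null
                ? Collections.<String>emptyList()
                : new ArrayList<>(brokers);
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        RouteLockSketch routes = new RouteLockSketch();
        routes.register("topic1", "broker-a");
        System.out.println(routes.lookup("topic1"));
    }
}
```

Releasing the lock in a finally block matters here: if an exception escaped while the write lock was held, every later heartbeat and lookup would block.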

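2.3.3 Route Removal

As described above, a Broker sends a heartbeat packet to NameServer every 30s, containing the BrokerId, Broker address, Broker name, and the name of the Broker's cluster. If a Broker goes down, NameServer stops receiving its heartbeats. How does NameServer then remove the failed Broker?

NameServer scans the brokerLiveTable state table every 10s. If the lastUpdateTimestamp of a BrokerLiveInfo is more than 120s older than the current time, the Broker is considered failed: NameServer removes it, closes the connection to it, and updates topicQueueTable, brokerAddrTable, brokerLiveTable, and filterServerTable.

RocketMQ has two triggers for route removal.

1) NameServer periodically scans brokerLiveTable and compares the time of the last heartbeat with the current system time. If the difference exceeds 120s, the Broker's information must be removed.

2) When a Broker shuts down normally, it executes the unregisterBroker command.

Whichever trigger fires, the removal logic is the same: delete everything related to the Broker from topicQueueTable, brokerAddrTable, brokerLiveTable, and filterServerTable. RocketMQ therefore factors the shared code out of the two paths. This section analyzes the first trigger, starting with Listing 2-16.

Listing 2-16 RouteInfoManager#scanNotActiveBroker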

public void scanNotActiveBroker() {
    Iterator<Entry<String, BrokerLiveInfo>> it =
        this.brokerLiveTable.entrySet().iterator();
    while (it.hasNext()) {
        Entry<String, BrokerLiveInfo> next = it.next();
        long last = next.getValue().getLastUpdateTimestamp();
        if ((last + BROKER_CHANNEL_EXPIRED_TIME) < System.currentTimeMillis()) {
            RemotingUtil.closeChannel(next.getValue().getChannel());
            it.remove();
            log.warn("The broker channel expired, {} {}ms", next.getKey(),
                BROKER_CHANNEL_EXPIRED_TIME);
            this.onChannelDestroy(next.getKey(), next.getValue().getChannel());
        }
    }
}

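As noted, scanNotActiveBroker runs every 10s in NameServer. The logic is simple: it iterates over brokerLiveTable (a HashMap) and checks each BrokerLiveInfo's lastUpdateTimestamp, the time at which the last heartbeat was received. If that time is more than 120s in the past, the Broker is considered unavailable; its entry is removed, its connection is closed, and finally the routing information related to it is deleted. Maintaining the route tables requires the write lock, as Listing 2-17 shows.

Listing 2-17 RouteInfoManager#onChannelDestroy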

this.lock.writeLock().lockInterruptibly();
this.brokerLiveTable.remove(brokerAddrFound);
this.filterServerTable.remove(brokerAddrFound);

Step 1: Acquire the write lock, then remove the Broker's entries from brokerLiveTable and filterServerTable by Broker address. The next step is shown in Listing 2-18.

Listing 2-18 RouteInfoManager#onChannelDestroy

String brokerNameFound = null;
boolean removeBrokerName = false;
Iterator<Entry<String, BrokerData>> itBrokerAddrTable =
    this.brokerAddrTable.entrySet().iterator();
while (itBrokerAddrTable.hasNext() && (null == brokerNameFound)) {
    BrokerData brokerData = itBrokerAddrTable.next().getValue();
    Iterator<Entry<Long, String>> it =
        brokerData.getBrokerAddrs().entrySet().iterator();
    while (it.hasNext()) {
        Entry<Long, String> entry = it.next();
        Long brokerId =  entry.getKey();
        String brokerAddr = entry.getValue();
        if (brokerAddr.equals(brokerAddrFound)) {
            brokerNameFound = brokerData.getBrokerName();
            it.remove();
            log.info("remove brokerAddr[{}, {}] from brokerAddrTable, "
                + "because channel destroyed", brokerId, brokerAddr);
            break;
        }
    }
    if (brokerData.getBrokerAddrs().isEmpty()) {
        removeBrokerName = true;
        itBrokerAddrTable.remove();
        log.info("remove brokerName[{}] from brokerAddrTable, "
            + "because channel destroyed", brokerData.getBrokerName());
    }
}

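Step 2: Maintain brokerAddrTable. The method iterates over HashMap<String/* brokerName */, BrokerData> brokerAddrTable, finds the Broker in each BrokerData's HashMap<Long/* brokerId */, String/* broker address */> brokerAddrs, and removes it from that BrokerData. If the BrokerData contains no other Broker after the removal, the entry for that brokerName is removed from brokerAddrTable. The next step is shown in Listing 2-19.

Listing 2-19 RouteInfoManager#onChannelDestroy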

if (brokerNameFound != null && removeBrokerName) {
    Iterator<Entry<String, Set<String>>> it =
              this.clusterAddrTable.entrySet().iterator();
    while (it.hasNext()) {
        Entry<String, Set<String>> entry = it.next();
        String clusterName = entry.getKey();
        Set<String> brokerNames = entry.getValue();
        boolean removed = brokerNames.remove(brokerNameFound);
        if (removed) {
            log.info("remove brokerName[{}], clusterName[{}] from "
                + "clusterAddrTable, because channel destroyed",
                brokerNameFound, clusterName);

            if (brokerNames.isEmpty()) {
                log.info("remove the clusterName[{}] from clusterAddrTable, because "
                    + "channel destroyed and no broker in this cluster",
                    clusterName);
                it.remove();
            }
            break;
        }
    }
}

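Step 3: Using the brokerName, find the Broker in clusterAddrTable and remove it from its cluster. If the cluster contains no Broker after the removal, the cluster itself is removed from clusterAddrTable. The next step is shown in Listing 2-20.

Listing 2-20 RouteInfoManager#onChannelDestroy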

if (removeBrokerName) {
    Iterator<Entry<String, List<QueueData>>> itTopicQueueTable =
            this.topicQueueTable.entrySet().iterator();
    while (itTopicQueueTable.hasNext()) {
        Entry<String, List<QueueData>> entry = itTopicQueueTable.next();
        String topic = entry.getKey();
        List<QueueData> queueDataList = entry.getValue();
        Iterator<QueueData> itQueueData = queueDataList.iterator();
        while (itQueueData.hasNext()) {
            QueueData queueData = itQueueData.next();
            if (queueData.getBrokerName().equals(brokerNameFound)) {
                itQueueData.remove();
                log.info("remove topic[{} {}], from topicQueueTable, because "
                    + "channel destroyed", topic, queueData);
            }
        }

        if (queueDataList.isEmpty()) {
            itTopicQueueTable.remove();
            log.info("remove topic[{}] all queue, from topicQueueTable, because "
                + "channel destroyed", topic);
        }
    }
}

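Step 4: Using the brokerName, iterate over the queues of every topic and remove any queue that belongs to the Broker being removed. If a topic has only queues on that Broker, the topic is deleted from the route table. The last step is shown in Listing 2-21.

Listing 2-21 RouteInfoManager#onChannelDestroy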

finally {
    this.lock.writeLock().unlock();
}

Step 5: Release the lock. Route removal is complete.
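The five steps can be condensed into one routine. The sketch below uses flattened tables (Broker names instead of QueueData, a timestamp instead of BrokerLiveInfo), omits filterServerTable, and adds a hypothetical register helper so that the example can populate itself. It follows the same order of operations but is a simplification, not the onChannelDestroy code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class RouteRemovalSketch {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    final Map<String, Long> brokerLiveTable = new HashMap<>();              // addr -> last heartbeat
    final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>(); // name -> id -> addr
    final Map<String, Set<String>> clusterAddrTable = new HashMap<>();      // cluster -> names
    final Map<String, List<String>> topicQueueTable = new HashMap<>();      // topic -> names

    // Hypothetical helper that fills all four tables for one broker address.
    void register(String cluster, String name, long id, String addr, String topic) {
        brokerLiveTable.put(addr, System.currentTimeMillis());
        brokerAddrTable.computeIfAbsent(name, k -> new HashMap<>()).put(id, addr);
        clusterAddrTable.computeIfAbsent(cluster, k -> new HashSet<>()).add(name);
        List<String> names = topicQueueTable.computeIfAbsent(topic, k -> new ArrayList<>());
        if (!names.contains(name)) {
            names.add(name);
        }
    }

    void removeBroker(String addr) {
        lock.writeLock().lock();                      // Step 1: write lock, liveness entry
        try {
            brokerLiveTable.remove(addr);
            String nameFound = null;                  // Step 2: brokerAddrTable
            boolean nameEmpty = false;
            Iterator<Map.Entry<String, Map<Long, String>>> it =
                brokerAddrTable.entrySet().iterator();
            while (it.hasNext() && nameFound == null) {
                Map.Entry<String, Map<Long, String>> entry = it.next();
                if (entry.getValue().values().remove(addr)) {
                    nameFound = entry.getKey();
                    if (entry.getValue().isEmpty()) {
                        nameEmpty = true;
                        it.remove();
                    }
                }
            }
            if (!nameEmpty) {
                return; // Other brokers still serve this name; keep its routes.
            }
            for (Set<String> names : clusterAddrTable.values()) {
                names.remove(nameFound);              // Step 3: clusterAddrTable
            }
            clusterAddrTable.values().removeIf(Set::isEmpty);
            for (List<String> names : topicQueueTable.values()) {
                names.remove(nameFound);              // Step 4: topicQueueTable
            }
            topicQueueTable.values().removeIf(List::isEmpty);
        } finally {
            lock.writeLock().unlock();                // Step 5: release the lock
        }
    }

    public static void main(String[] args) {
        RouteRemovalSketch routes = new RouteRemovalSketch();
        routes.register("c1", "broker-a", 0L, "10.0.0.1:10911", "topic1");
        routes.removeBroker("10.0.0.1:10911");
        System.out.println(routes.topicQueueTable);
    }
}
```

Both triggers described earlier, the periodic scan and an explicit unregister request, could call a routine of this shape, which is why the cleanup only needs to be written once.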

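2.3.4 Route Discovery

Route discovery in RocketMQ is not real-time. When a topic's route changes, NameServer does not push the change to clients; instead, clients periodically pull the latest route for a topic. The command code for pulling routing information by topic name is GET_ROUTEINTO_BY_TOPIC. The route result is shown in Figure 2-7.

Figure 2-7 RocketMQ route result entity

  • orderTopicConf: the ordered-message configuration, taken from kvConfig.
  • List<QueueData> queueDatas: queue metadata for the topic.
  • List<BrokerData> brokerDatas: metadata of the Brokers on which the topic is distributed.
  • HashMap<String, List<String>> filterServerTable: the list of filter server addresses on each Broker.

The route discovery implementation in NameServer is DefaultRequestProcessor#getRouteInfoByTopic, as shown in Listing 2-22.

Listing 2-22 DefaultRequestProcessor#getRouteInfoByTopic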

public RemotingCommand getRouteInfoByTopic(ChannelHandlerContext ctx,
            RemotingCommand request) throws RemotingCommandException {
    final RemotingCommand response = RemotingCommand.createResponseCommand(null);
    final GetRouteInfoRequestHeader requestHeader =(GetRouteInfoRequestHeader)
        request.decodeCommandCustomHeader(GetRouteInfoRequestHeader.class);
    TopicRouteData topicRouteData = this.namesrvController.
        getRouteInfoManager().pickupTopicRouteData(requestHeader.getTopic());
    if (topicRouteData != null) {
        if (this.namesrvController.getNamesrvConfig().isOrderMessageEnable()) {
            String orderTopicConf = this.namesrvController.getKvConfigManager()
                .getKVConfig(NamesrvUtil.NAMESPACE_ORDER_TOPIC_CONFIG,
                    requestHeader.getTopic());
            topicRouteData.setOrderTopicConf(orderTopicConf);
        }
        byte[] content = topicRouteData.encode();
        response.setBody(content);
        response.setCode(ResponseCode.SUCCESS);
        response.setRemark(null);
        return response;
    }
    response.setCode(ResponseCode.TOPIC_NOT_EXIST);
    response.setRemark("No topic route info in name server for the topic: "
                + requestHeader.getTopic()
                + FAQUrl.suggestTodo(FAQUrl.APPLY_TOPIC_URL));
    return response;
}

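Step 1: Call the RouteInfoManager method pickupTopicRouteData, which fills the List<QueueData>, List<BrokerData>, and filter server address table of TopicRouteData from the route tables topicQueueTable, brokerAddrTable, and filterServerTable respectively.

Step 2: If routing information is found for the topic and ordered messages are enabled on the NameServer (isOrderMessageEnable), the ordered-message configuration is read from the NameServer KVConfig and added to the routing information. If no routing information is found, the response code is TOPIC_NOT_EXIST, indicating that no route exists for the topic.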
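A minimal sketch of the lookup side follows, assuming flattened tables in which a topic maps directly to Broker names. The method names pickup and handle, and the string response codes, are illustrative; the real method assembles a TopicRouteData object and the processor encodes it into the response body.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RouteLookupSketch {
    final Map<String, List<String>> topicQueueTable = new HashMap<>();      // topic -> broker names
    final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>(); // name -> id -> addr

    // Returns brokerName -> addresses for the topic, or null when the topic
    // has no queue data or none of its brokers is known.
    Map<String, Map<Long, String>> pickup(String topic) {
        List<String> names = topicQueueTable.get(topic);
        if (names == null || names.isEmpty()) {
            return null;
        }
        Map<String, Map<Long, String>> route = new LinkedHashMap<>();
        for (String name : names) {
            Map<Long, String> addrs = brokerAddrTable.get(name);
            if (addrs != null) {
                // Copy so the caller cannot modify the route tables.
                route.put(name, new HashMap<>(addrs));
            }
        }
        return route.isEmpty() ? null : route;
    }

    // Maps the lookup result to a response code, as the processor does.
    String handle(String topic) {
        return pickup(topic) != null ? "SUCCESS" : "TOPIC_NOT_EXIST";
    }

    public static void main(String[] args) {
        RouteLookupSketch routes = new RouteLookupSketch();
        System.out.println(routes.handle("unknown-topic"));
    }
}
```

Since clients pull routes on a timer, a client may keep using a removed Broker until its next pull; the lookup itself only reflects the state of the tables at the moment of the call.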