During troubleshooting, we observed in the APM Status Console that Collector A had reached its historical metrics threshold. We also noticed that the MOM's loadbalancing.xml file restricted these agents to Collector A only.
Typically, when the Enterprise Manager (EM) historical metric clamp is hit, agents that are already connected to the EM stay connected, and the EM continues to accept data for existing metrics. The EM only stops registering new metrics from the existing agents and rejects connection requests from new agents.
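The clamp behavior described above can be illustrated with a small toy model. This is only a sketch for reasoning about the behavior, not the EM's actual implementation: the `Collector` class, its method names, and the idea of counting metrics as (agent, metric) pairs are all assumptions made for illustration.

```python
class Collector:
    """Toy model of an EM with a historical metric clamp (illustrative only)."""

    def __init__(self, name, historical_metric_limit):
        self.name = name
        self.limit = historical_metric_limit  # hypothetical clamp value
        self.metrics = set()                  # known (agent, metric) pairs
        self.connected_agents = set()

    def clamped(self):
        # The clamp is considered hit once the known metric count reaches the limit.
        return len(self.metrics) >= self.limit

    def connect(self, agent):
        if agent in self.connected_agents:
            return True   # already-connected agents stay connected
        if self.clamped():
            return False  # connection requests from new agents are refused
        self.connected_agents.add(agent)
        return True

    def report(self, agent, metric):
        if agent not in self.connected_agents:
            return False
        key = (agent, metric)
        if key in self.metrics:
            return True   # data for existing metrics is still accepted
        if self.clamped():
            return False  # new metrics are no longer registered
        self.metrics.add(key)
        return True
```

In this model, an agent connected before the clamp is hit can keep reporting its existing metrics, while a new metric or a new agent is turned away once `clamped()` returns true.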
In this case, the historical metric limit had been breached before the cluster restart, while the existing agents were still connected to Collector A. After the restart, the agents had disconnected and were trying to reconnect to the Collectors through the MOM's load balancing. The MOM determined that Collector A had hit the historical metric clamp and that there was no other Collector the agents were allowed to connect to, so it denied the agents' connection requests.
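The outcome can be sketched as a simple decision function. This is a simplified model under stated assumptions, not the MOM's real algorithm: the function name `assign_collector`, the input shapes, and the "first eligible Collector wins" rule are hypothetical, and the actual MOM also takes Collector load into account.

```python
def assign_collector(allowed_collectors, clamp_state):
    """Pick a Collector for a reconnecting agent, or None if the request is denied.

    allowed_collectors: Collector names the agent is restricted to
                        (as loadbalancing.xml would dictate).
    clamp_state:        maps Collector name -> True if its historical
                        metric clamp has been hit.
    """
    for name in allowed_collectors:
        # A Collector that has hit its clamp is not eligible for new connections.
        if not clamp_state.get(name, False):
            return name
    # Every allowed Collector is clamped, so the connection is denied.
    return None
```

With the agents restricted to Collector A and Collector A clamped, `assign_collector(["Collector A"], {"Collector A": True})` returns `None`, which matches the denied connections we saw. If a second, unclamped Collector were in the allowed list, the function would return that Collector instead.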