Data Repository host reboots when a down node is restarted.

Document ID : KB000005317
Last Modified Date : 14/02/2018
Issue:

One of the nodes in a 3-node Data Repository cluster will not start. When the node is restarted, it comes up and begins to initialize, but eventually the Linux server reboots. The following error appears in the vertica.log file:

 EEThread:0x7f07dc130930-a0000000b17641 <PANIC> @v_drdata_node0003: VX001/4064: NewPool::addChunk talloc calloc() error: 'Cannot allocate memory'; size 134217728 
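To confirm that a node is hitting this error, the log can be searched for the panic message. The following is a minimal sketch only: the function name count_alloc_panics is hypothetical, and the log path must be supplied by the caller, because the location of vertica.log depends on where the catalog directory lives in your installation.

```shell
#!/bin/sh
# count_alloc_panics LOGFILE  (hypothetical helper, not part of the product)
# Prints the number of lines in LOGFILE that contain both the <PANIC> tag
# and the NewPool::addChunk allocation failure quoted above.
count_alloc_panics() {
    grep -F '<PANIC>' "$1" | grep -cF 'NewPool::addChunk'
}
```

Called as count_alloc_panics followed by the path to vertica.log on the problem node, a non-zero count suggests the node is failing with the memory allocation panic described here.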

Cause:

This usually occurs when a node has been down for a long time. Recovery appears to try to use more memory than the system can provide.

Resolution:

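To get the node to start, you can set the system to do a full rebuild instead of trying to rebuild from the time the node went down. Passing true to make_ahm_now allows the Ancient History Mark (AHM) to advance even though a node is down, which forces that node to recover from scratch. To do this, go to one of the running nodes and execute the following vsql command: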

/opt/vertica/bin/vsql -U dauser -w dapass -c "select make_ahm_now(true);" 

Then restart the problem node.
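The procedure above can be checked from a running node with a few read-only queries. This is a sketch under stated assumptions, not an official procedure: the helper names (run_query, show_epochs, show_node_states) are hypothetical, the credentials are the ones shown in the command above, and the vsql path can be overridden through the VSQL variable if it differs on your system.

```shell
#!/bin/sh
# Path to vsql as used in the article; set VSQL to override it.
VSQL="${VSQL:-/opt/vertica/bin/vsql}"

# run_query SQL  (hypothetical helper)
# Runs one statement with the credentials shown in the article.
# -A gives unaligned output and -t prints rows only, which keeps
# the result easy to read or parse.
run_query() {
    "$VSQL" -U dauser -w dapass -A -t -c "$1"
}

# Print the Ancient History Mark epoch next to the current epoch.
show_epochs() {
    run_query "select get_ahm_epoch(), get_current_epoch();"
}

# Print every node with its state (for example UP, DOWN, RECOVERING).
show_node_states() {
    run_query "select node_name, node_state from nodes;"
}
```

Running show_epochs before and after make_ahm_now should show the AHM epoch moving up toward the current epoch. After the problem node is restarted, show_node_states can be repeated to watch the node's state until it reports UP.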