CA DR database was down but was later restored

Document ID : KB000005822
Last Modified Date : 14/02/2018
Issue:

The DR database (Vertica) went down. The customer managed to restore the system, but there are no apparent errors that explain the outage.

Environment:
Data Repository 2.6 and above
Cause:

Confirm whether this article addresses the problem by checking the following log files.

Look for the following entry in dbLog:


*** glibc detected *** /opt/vertica/bin/vertica: invalid fastbin entry (free): 0x00007fd7112e4280 *** 

And for the following entries in /var/log/messages:

Jan 29 09:38:00 ca-dr abrt[27582]: Saved core dump of pid 2675 (/opt/vertica/bin/vertica) to /var/spool/abrt/ccpp-2017-01-29-09:37:13-2675 (15629914112 bytes)
Jan 29 09:38:00 ca-dr abrtd: Directory 'ccpp-2017-01-29-09:37:13-2675' creation detected
Jan 29 09:38:00 ca-dr abrtd: Size of '/var/spool/abrt' >= 1000 MB, deleting 'oops-2016-11-14-11:21:50-2103-0'
Jan 29 09:38:00 ca-dr abrt[27582]: /var/spool/abrt is 15631322592 bytes (more than 1279MiB), deleting 'oops-2016-11-14-11:21:50-2103-0'
Jan 29 09:38:00 ca-dr abrt[27582]: Lock file '/var/spool/abrt/oops-2016-11-14-11:21:50-2103-0/.lock' is locked by process 2120
Jan 29 09:38:01 ca-dr abrt[27582]: '/var/spool/abrt/oops-2016-11-14-11:21:50-2103-0' does not exist
Jan 29 09:38:01 ca-dr abrtd: Package 'vertica' isn't signed with proper key
Jan 29 09:38:01 ca-dr abrtd: 'post-create' on '/var/spool/abrt/ccpp-2017-01-29-09:37:13-2675' exited with 1
Jan 29 09:38:01 ca-dr abrtd: Corrupted or bad directory /var/spool/abrt/ccpp-2017-01-29-09:37:13-2675, deleting
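
The two log checks above can be scripted. The following is a minimal sketch, not an official tool: the function names has_fastbin_error and has_vertica_core_dump are made up for illustration, and the log file paths are passed in by the caller because the location of dbLog depends on the installation.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, and the caller supplies the
# log file paths (the dbLog location varies between installations).

# Succeeds if the file contains the glibc fastbin abort raised by Vertica.
has_fastbin_error() {
    # -F matches the text literally, -q suppresses output (exit status only)
    grep -Fq '/opt/vertica/bin/vertica: invalid fastbin entry (free)' "$1"
}

# Succeeds if the file records abrt saving a core dump of the Vertica binary.
has_vertica_core_dump() {
    grep -Eq 'abrt\[[0-9]+\]: Saved core dump of pid [0-9]+ \(/opt/vertica/bin/vertica\)' "$1"
}
```

For example, `has_fastbin_error /path/to/dbLog && has_vertica_core_dump /var/log/messages && echo "matches this article"`, where /path/to/dbLog is a placeholder for the actual dbLog location.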

Review: 

https://my.vertica.com/docs/ReleaseNotes/7.1.x/HP_Vertica_7.1.x_Release_Notes.htm 

Specifically, see VER-34540. The bug description and resolution read as follows:

Linux servers using glibc versions 2.12 and later may encounter an issue where a node goes down and the dbLog contains the error "*** glibc detected *** /opt/vertica/bin/vertica: invalid fastbin entry (free)".

Resolution:

This may be due to a bug in glibc. The Red Hat advisories provide additional details. See:

https://rhn.redhat.com/errata/RHBA-2014-0480.html 

and 

https://rhn.redhat.com/errata/RHSA-2014-1391.html 

If you encounter this error, first determine if you are using an affected version of glibc: 

rpm -qa | grep glibc 

If your glibc version is 2.12 and its release does not have a .149 or later suffix, your server may be affected by this issue. If so, apply the security fix as described by your vendor. For Red Hat systems, see:

https://rhn.redhat.com/errata/RHSA-2014-1391.html
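
The version check described above can be automated. The following is a rough sketch, assuming the package string has the form glibc-2.12-1.<build>.<dist>.<arch>. That layout is an assumption about rpm output, not something stated in the advisories, and the function name glibc_may_be_affected is made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes a package string such as glibc-2.12-1.<build>.el6.x86_64.
# Exit status: 0 = build older than .149 (may be affected),
#              1 = build .149 or later,
#              2 = not a glibc 2.12 package, or format not recognised.
glibc_may_be_affected() {
    pkg="$1"
    case "$pkg" in
        glibc-2.12-*) ;;           # only 2.12 builds are covered by this check
        *) return 2 ;;             # e.g. glibc-common or other versions
    esac
    release="${pkg#glibc-2.12-}"   # 1.<build>.<dist>.<arch>
    release="${release#*.}"        # <build>.<dist>.<arch>
    build="${release%%.*}"         # <build>
    case "$build" in
        ''|*[!0-9]*) return 2 ;;   # build number missing or not numeric
    esac
    [ "$build" -lt 149 ]
}
```

A possible use is `glibc_may_be_affected "$(rpm -q glibc | head -n 1)" && echo "apply the vendor glibc update"`. Treat the result as a hint only, and confirm it against the vendor advisory before acting.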