When a bug occurs in Greenplum, the root cause can often be found by analyzing a core dump. ABRT (Automatic Bug Reporting Tool) is the tool that captures and stores those core dumps.
ABRT is not specific to Greenplum: it collects a core dump whenever a program running on RedHat/CentOS crashes.
The Tanzu Greenplum Install Guide does not officially cover ABRT, so treat this article as reference material only.
ABRT information
- https://zetawiki.com/wiki/%EB%A6%AC%EB%88%85%EC%8A%A4_ABRT
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/ch-abrt
- https://access.redhat.com/solutions/56021
Environment
- RedHat 7.5
- Greenplum 6.12.0
- ABRT 2.1.11
1. Install and start ABRT
yum install abrt
yum install abrt-addon-ccpp
yum install abrt-tui
[root@mdw ~]# yum install abrt
Loaded plugins: product-id, search-disabled-repos, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
RHEL7.5 | 4.3 kB 00:00:00
Resolving Dependencies
--> Running transaction check
---> Package abrt.x86_64 0:2.1.11-50.el7 will be installed
--> Processing Dependency: abrt-libs = 2.1.11-50.el7 for package: abrt-2.1.11-50.el7.x86_64
.
. (omitted)
.
Dependency Installed:
abrt-dbus.x86_64 0:2.1.11-50.el7 abrt-libs.x86_64 0:2.1.11-50.el7 abrt-python.x86_64 0:2.1.11-50.el7 augeas-libs.x86_64 0:1.4.0-5.el7
json-c.x86_64 0:0.11-4.el7_0 libreport.x86_64 0:2.1.11-40.el7 libreport-filesystem.x86_64 0:2.1.11-40.el7 libreport-plugin-rhtsupport.x86_64 0:2.1.11-40.el7
libreport-plugin-ureport.x86_64 0:2.1.11-40.el7 libreport-python.x86_64 0:2.1.11-40.el7 libreport-web.x86_64 0:2.1.11-40.el7 libtar.x86_64 0:1.2.11-29.el7
python-augeas.noarch 0:0.5.0-2.el7 satyr.x86_64 0:0.13-14.el7 sos.noarch 0:3.5-6.el7 xmlrpc-c.x86_64 0:1.32.5-1905.svn2451.el7
xmlrpc-c-client.x86_64 0:1.32.5-1905.svn2451.el7
Complete!
[root@mdw ~]# yum install abrt-addon-ccpp
Loaded plugins: product-id, search-disabled-repos, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
Resolving Dependencies
--> Running transaction check
---> Package abrt-addon-ccpp.x86_64 0:2.1.11-50.el7 will be installed
--> Finished Dependency Resolution
.
. (omitted)
.
Running transaction
Installing : abrt-addon-ccpp-2.1.11-50.el7.x86_64 1/1
Verifying : abrt-addon-ccpp-2.1.11-50.el7.x86_64 1/1
Installed:
abrt-addon-ccpp.x86_64 0:2.1.11-50.el7
Complete!
[root@mdw ~]#
[root@mdw ~]# yum -y install abrt-tui
Loaded plugins: product-id, search-disabled-repos, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
RHEL7.5 | 4.3 kB 00:00:00
Resolving Dependencies
--> Running transaction check
---> Package abrt-tui.x86_64 0:2.1.11-50.el7 will be installed
--> Processing Dependency: libreport-cli >= 2.1.11-36 for package: abrt-tui-2.1.11-50.el7.x86_64
--> Running transaction check
---> Package libreport-cli.x86_64 0:2.1.11-40.el7 will be installed
--> Finished Dependency Resolution
.
. (omitted)
.
Installed:
abrt-tui.x86_64 0:2.1.11-50.el7
Dependency Installed:
libreport-cli.x86_64 0:2.1.11-40.el7
Complete!
[root@mdw ~]#
Once installation is complete, enable and start the services
systemctl enable abrtd
systemctl enable abrt-ccpp
systemctl start abrtd abrt-ccpp
systemctl status abrtd abrt-ccpp
[root@mdw ~]# systemctl enable abrtd
[root@mdw ~]# systemctl enable abrt-ccpp
[root@mdw ~]# systemctl start abrtd abrt-ccpp
[root@mdw ~]# systemctl status abrtd abrt-ccpp
● abrtd.service - ABRT Automated Bug Reporting Tool
Loaded: loaded (/usr/lib/systemd/system/abrtd.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2021-07-18 15:25:45 KST; 4s ago
Main PID: 1483 (abrtd)
Tasks: 1
Memory: 1.6M
CGroup: /system.slice/abrtd.service
└─1483 /usr/sbin/abrtd -d -s
Jul 18 15:25:45 mdw.gphd.local systemd[1]: Started ABRT Automated Bug Reporting Tool.
Jul 18 15:25:45 mdw.gphd.local systemd[1]: Starting ABRT Automated Bug Reporting Tool...
Jul 18 15:25:45 mdw.gphd.local abrtd[1483]: Init complete, entering main loop
● abrt-ccpp.service - Install ABRT coredump hook
Loaded: loaded (/usr/lib/systemd/system/abrt-ccpp.service; enabled; vendor preset: enabled)
Active: active (exited) since Sun 2021-07-18 15:25:45 KST; 4s ago
Process: 1484 ExecStart=/usr/sbin/abrt-install-ccpp-hook install (code=exited, status=0/SUCCESS)
Main PID: 1484 (code=exited, status=0/SUCCESS)
Jul 18 15:25:45 mdw.gphd.local systemd[1]: Starting Install ABRT coredump hook...
Jul 18 15:25:45 mdw.gphd.local systemd[1]: Started Install ABRT coredump hook.
[root@mdw ~]#
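The abrt-ccpp unit above is reported as "active (exited)" because it only installs the core dump hook and then exits. One way to confirm the hook is really in place is to look at the kernel's core_pattern, which should be piped to /usr/libexec/abrt-hook-ccpp. The following is a minimal sketch; the function name and the optional file argument are assumptions added for illustration, not part of ABRT.

```shell
#!/usr/bin/env bash
# Sketch: report whether the kernel core_pattern is piped to ABRT's hook.
# The optional first argument (a file to read instead of /proc) is an
# assumption added here so the check can be tried against a sample file.
core_pattern_uses_abrt() {
    local pattern_file="${1:-/proc/sys/kernel/core_pattern}"
    local pattern
    pattern=$(cat "$pattern_file") || return 2
    case "$pattern" in
        "|"*abrt-hook-ccpp*)
            echo "ABRT hook installed: $pattern"
            return 0
            ;;
        *)
            echo "ABRT hook NOT installed: $pattern"
            return 1
            ;;
    esac
}
```

Calling core_pattern_uses_abrt with no argument on the Greenplum host reads /proc/sys/kernel/core_pattern directly. If it reports that the hook is missing, restarting abrt-ccpp is the first thing to try.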
2. Apply settings
Create the file /etc/security/limits.d/coredumps.conf
[root@mdw ~]# vi /etc/security/limits.d/coredumps.conf
# Core file size set to unlimited
gpadmin - core unlimited
:wq
[root@mdw ~]#
Log in as gpadmin and run ulimit. If both the soft (-S) and hard (-H) core limits report unlimited, the setting has been applied.
[root@mdw limits.d]# su - gpadmin
Last login: Wed Jul 14 15:58:59 KST 2021 on pts/0
[gpadmin@mdw ~]$ ulimit -S -c
unlimited
[gpadmin@mdw ~]$ ulimit -H -c
unlimited
[gpadmin@mdw ~]$
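If ulimit does not report unlimited, it can help to check what the limits file actually says for the user. Each line in a limits.d file has four fields (domain, type, item, value), and a type of "-" sets both the soft and the hard limit. The sketch below prints the configured core value for a user; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the "core" limit configured for a user in a limits.d file.
# Usage: core_limit_for_user <limits-file> <user>
core_limit_for_user() {
    local file="$1" user="$2"
    # Fields: <domain> <type> <item> <value>.
    # Comment lines are skipped; if several lines match, the last one wins.
    awk -v u="$user" '
        $1 !~ /^#/ && $1 == u && $3 == "core" { value = $4 }
        END { if (value != "") print value; else exit 1 }
    ' "$file"
}
```

For example, core_limit_for_user /etc/security/limits.d/coredumps.conf gpadmin should print unlimited with the file shown above. A non-zero exit status means no core entry was found for that user.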
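Edit /etc/abrt/abrt-action-save-package-data.conf. OpenGPGCheck = no lets ABRT handle crashes in packages that are not signed with a trusted GPG key, and ProcessUnpackaged = yes lets it handle executables that do not belong to any installed package.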
[root@mdw ~]# vi /etc/abrt/abrt-action-save-package-data.conf
OpenGPGCheck = no
BlackList = nspluginwrapper, valgrind, strace, mono-core, postgres
ProcessUnpackaged = yes
:wq
[root@mdw ~]#
Restart the ABRT daemons.
systemctl restart abrtd
systemctl restart abrt-ccpp
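After editing the file it is worth reading the values back before relying on them. The sketch below extracts a single "Key = value" setting from an ABRT-style configuration file; the function name is an assumption, and it only handles simple alphanumeric keys such as the ones used in this article.

```shell
#!/usr/bin/env bash
# Sketch: print the value of one "Key = value" setting from a config file.
# Usage: abrt_conf_get <config-file> <key>
abrt_conf_get() {
    local file="$1" key="$2"
    # Drop comment lines, then match "Key = value" and strip the blanks
    # around the value. Only matching lines are printed (-n ... p).
    sed -n \
        -e '/^[[:space:]]*#/d' \
        -e "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p" \
        "$file"
}
```

With the settings shown above, abrt_conf_get /etc/abrt/abrt-action-save-package-data.conf ProcessUnpackaged should print yes, and the same call with OpenGPGCheck should print no.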
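3. Generating a core dump
With installation and configuration complete, verify that a core dump is actually generated and see which files it produces.
List the postgres processes with ps, then terminate one of them with kill -11 (SIGSEGV). Run this only in a development or test environment.

Right after kill -11 is run, abrt-hook-ccpp appears in the process list. In the output below it is handling signal 11 for PID 1728, the metrics collector bgworker.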

[root@mdw abrt]# ps -ef | grep postgres
gpadmin 1706 1 0 15:49 ? 00:00:00 /usr/local/greenplum-db-6.12.0/bin/postgres -D /data/master/gpseg-1 -p 5432 -E
gpadmin 1707 1706 0 15:49 ? 00:00:00 postgres: 5432, master logger process
gpadmin 1710 1706 0 15:49 ? 00:00:00 postgres: 5432, checkpointer process
gpadmin 1711 1706 0 15:49 ? 00:00:00 postgres: 5432, writer process
gpadmin 1712 1706 0 15:49 ? 00:00:00 postgres: 5432, wal writer process
gpadmin 1713 1706 0 15:49 ? 00:00:00 postgres: 5432, stats collector process
gpadmin 1714 1706 0 15:49 ? 00:00:00 postgres: 5432, bgworker: dtx recovery process
gpadmin 1715 1706 0 15:49 ? 00:00:00 postgres: 5432, bgworker: ftsprobe process
gpadmin 1726 1706 0 15:49 ? 00:00:00 postgres: 5432, bgworker: ic proxy process
gpadmin 1728 1706 0 15:49 ? 00:00:00 postgres: 5432, bgworker: metrics collector
root 1792 346 60 15:52 ? 00:00:01 /usr/libexec/abrt-hook-ccpp 11 18446744073709551615 1728 998 1000 1626591161 postgres 1728 1728 mdw.gphd.local
root 1794 1303 0 15:52 pts/0 00:00:00 grep --color=auto postgres
[root@mdw abrt]#
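Before sending SIGSEGV to a real postgres process, the same signal can be rehearsed on a throwaway process. The sketch below is only an illustration of what kill -11 does, not the procedure used above: it starts a sleep, kills it with signal 11, and checks that the shell reports exit status 139 (128 + 11). The function name is hypothetical, and whether ABRT records a dump for this sleep process depends on the ABRT configuration.

```shell
#!/usr/bin/env bash
# Sketch: rehearse the SIGSEGV test on a throwaway process, not postgres.
segv_rehearsal() {
    sleep 60 &
    local pid=$!
    kill -11 "$pid"        # 11 = SIGSEGV, the same signal used in the text
    wait "$pid"
    local status=$?
    echo "pid=$pid exit_status=$status"
    # A shell reports 128 + signal number for a child killed by a signal,
    # so SIGSEGV shows up as 139.
    [ "$status" -eq 139 ]
}
```

Running segv_rehearsal and then looking at /var/spool/abrt shows whether the hook reacts at all, without touching the database.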
4. Checking the core dump
Check the generated core dump (a folder) under /var/spool/abrt.

The folder name has the form ccpp-YYYY-MM-DD-HH:MM:SS-PID.
Alternatively, the abrt-cli command shows the details:
abrt-cli ls
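Because the folder name encodes the crash time and the process ID, it can be split apart in a script, for example to match a dump against the Greenplum logs. The sketch below assumes the ccpp-YYYY-MM-DD-HH:MM:SS-PID naming observed with ABRT 2.1.11 in this article; other ABRT versions may name folders differently, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: split an ABRT dump folder name into date, time and PID.
# Assumes the form ccpp-YYYY-MM-DD-HH:MM:SS-PID seen in this article.
parse_ccpp_dir() {
    local name
    name=$(basename "$1")
    local rest="${name#ccpp-}"
    # If nothing was stripped, the name did not start with "ccpp-".
    [ "$rest" != "$name" ] || return 1
    local pid="${rest##*-}"       # text after the last dash
    local stamp="${rest%-*}"      # YYYY-MM-DD-HH:MM:SS
    local day="${stamp%-*}"       # YYYY-MM-DD
    local clock="${stamp##*-}"    # HH:MM:SS
    echo "$day $clock $pid"
}

parse_ccpp_dir /var/spool/abrt/ccpp-2021-07-18-15:57:24-3077
# prints: 2021-07-18 15:57:24 3077
```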

Each core dump is also logged in /var/log/messages, so monitoring that file shows exactly when a dump was taken.

[root@mdw ccpp-2021-07-18-15:57:24-3077]# tail -f /var/log/messages
Jul 18 15:57:03 mdw systemd: Starting Session 8 of user gpadmin.
Jul 18 15:57:03 mdw systemd-logind: New session 8 of user gpadmin.
Jul 18 15:57:03 mdw systemd-logind: Removed session 8.
Jul 18 15:57:03 mdw systemd: Removed slice User Slice of gpadmin.
Jul 18 15:57:03 mdw systemd: Stopping User Slice of gpadmin.
Jul 18 15:57:24 mdw abrt-hook-ccpp: Process 3077 (postgres) of user 998 killed by SIGSEGV - dumping core
Jul 18 15:58:01 mdw dbus[595]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)
Jul 18 15:58:01 mdw dbus[595]: [system] Successfully activated service 'org.freedesktop.problems'
Jul 18 16:01:01 mdw systemd: Started Session 9 of user root.
Jul 18 16:01:01 mdw systemd: Starting Session 9 of user root.
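Rather than reading the whole log, the core dump events can be filtered out. The sketch below keeps only the abrt-hook-ccpp lines that mention "dumping core", which is the message format seen in the log above; the function name and the optional log path argument are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: list core dump events recorded by abrt-hook-ccpp in a syslog file.
# Usage: core_dump_events [log-file]   (defaults to /var/log/messages)
core_dump_events() {
    local log="${1:-/var/log/messages}"
    # Keep only hook messages that report a dump being written.
    grep 'abrt-hook-ccpp' "$log" | grep 'dumping core'
}
```

For live monitoring, the same filter can be attached to a follow, for example tail -F /var/log/messages | grep --line-buffered 'dumping core'.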
The dump folder (/var/spool/abrt/ccpp-2021-07-18-15:57:24-3077) contains the core dump file itself along with files describing the OS and the crashed process.
[root@mdw ccpp-2021-07-18-15:57:24-3077]# ll
total 169360
-rw-r----- 1 root abrt 6 Jul 18 15:57 abrt_version
-rw-r----- 1 root abrt 4 Jul 18 15:57 analyzer
-rw-r----- 1 root abrt 6 Jul 18 15:57 architecture
-rw-r----- 1 root abrt 237 Jul 18 15:57 cgroup
-rw-r----- 1 root abrt 178 Jul 18 15:57 cmdline
-rw-r----- 1 root abrt 14 Jul 18 15:57 component
-rw-r----- 1 root abrt 5261 Jul 18 15:57 core_backtrace
-rw-r----- 1 root abrt 203440128 Jul 18 15:57 coredump
-rw-r----- 1 root abrt 1 Jul 18 15:58 count
-rw-r----- 1 root abrt 4222 Jul 18 15:58 dso_list
-rw-r----- 1 root abrt 7293 Jul 18 15:57 environ
-rw-r----- 1 root abrt 0 Jul 18 15:58 event_log
-rw-r----- 1 root abrt 43 Jul 18 15:57 executable
-rw-r----- 1 root abrt 4 Jul 18 15:57 global_pid
-rw-r----- 1 root abrt 14 Jul 18 15:57 hostname
-rw-r----- 1 root abrt 21 Jul 18 15:57 kernel
-rw-r----- 1 root abrt 10 Jul 18 15:57 last_occurrence
-rw-r----- 1 root abrt 1323 Jul 18 15:57 limits
-rw-r----- 1 root abrt 135 Jul 18 15:58 machineid
-rw-r----- 1 root abrt 21662 Jul 18 15:57 maps
-rw-r----- 1 root abrt 1266 Jul 18 15:57 open_fds
-rw-r----- 1 root abrt 532 Jul 18 15:57 os_info
-rw-r----- 1 root abrt 51 Jul 18 15:57 os_release
-rw-r----- 1 root abrt 27 Jul 18 15:57 package
-rw-r----- 1 root abrt 4 Jul 18 15:57 pid
-rw-r----- 1 root abrt 6 Jul 18 15:57 pkg_arch
-rw-r----- 1 root abrt 1 Jul 18 15:57 pkg_epoch
-rw-r----- 1 root abrt 14 Jul 18 15:57 pkg_name
-rw-r----- 1 root abrt 5 Jul 18 15:57 pkg_release
-rw-r----- 1 root abrt 6 Jul 18 15:57 pkg_vendor
-rw-r----- 1 root abrt 6 Jul 18 15:57 pkg_version
-rw-r----- 1 root abrt 1172 Jul 18 15:57 proc_pid_status
-rw-r----- 1 root abrt 20 Jul 18 15:57 pwd
-rw-r----- 1 root abrt 26 Jul 18 15:57 reason
-rw-r----- 1 root abrt 4 Jul 18 15:57 runlevel
-rw-r----- 1 root abrt 8289220 Jul 18 15:57 sosreport.tar.xz
-rw-r----- 1 root abrt 10 Jul 18 15:57 time
-rw-r----- 1 root abrt 4 Jul 18 15:57 type
-rw-r----- 1 root abrt 3 Jul 18 15:57 uid
-rw-r----- 1 root abrt 8 Jul 18 15:57 username
-rw-r----- 1 root abrt 40 Jul 18 15:58 uuid
-rw-r----- 1 root abrt 272 Jul 18 15:58 var_log_messages
[root@mdw ccpp-2021-07-18-15:57:24-3077]#
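Most of the files in the listing are small text files, so a quick summary of a dump can be printed without opening the coredump itself. The sketch below reads a few of the files named in the listing (reason, executable, pid, time); the choice of fields and the function name are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print a short summary of one ABRT dump folder.
# Usage: summarize_dump <dump-folder>
summarize_dump() {
    local dir="$1" field
    # These file names come from the directory listing above.
    for field in reason executable pid time; do
        if [ -r "$dir/$field" ]; then
            printf '%-12s %s\n' "$field:" "$(cat "$dir/$field")"
        fi
    done
}
```

The files are readable only by root and the abrt group, so summarize_dump /var/spool/abrt/ccpp-2021-07-18-15:57:24-3077 needs to be run with matching privileges.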
Notes
- Each core dump is large (the coredump file above is about 203 MB), so dumps should be kept only for a limited period and then backed up and deleted.
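One possible way to support that housekeeping is to list dump folders older than a chosen number of days and review them before backing up and deleting. The sketch below only lists candidates and deletes nothing; the 7-day default and the function name are assumptions, not recommendations from Greenplum or Red Hat.

```shell
#!/usr/bin/env bash
# Sketch: list ABRT dump folders older than N days (dry run, no deletion).
# Usage: old_dumps [spool-dir] [days]
old_dumps() {
    local spool="${1:-/var/spool/abrt}" days="${2:-7}"
    # Only direct children named ccpp-* are considered.
    # -mtime +N matches entries last modified more than N full days ago.
    find "$spool" -mindepth 1 -maxdepth 1 -type d -name 'ccpp-*' -mtime +"$days"
}
```

After reviewing the list and copying the folders to backup storage, each one can be removed, for example with abrt-cli rm followed by the folder path.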