


  • Preface
  • Coredump files
  • Debugging a coredump file with gdb
  • Debugging memory leaks with valgrind






root@zhr-workstation:~/test# ulimit -c


root@zhr-workstation:~/test# ulimit -c unlimited


root@zhr-workstation:~/test# ulimit -c 


root@zhr-workstation:~/test# ./search.o 
Segmentation fault (core dumped)
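The shell's `ulimit -c unlimited` adjusts the RLIMIT_CORE resource limit, which a process can also raise for itself. Below is a minimal sketch using the POSIX `getrlimit`/`setrlimit` calls; the function name `raise_core_limit` is made up for illustration, and it only lifts the soft limit up to whatever the hard limit allows.

```cpp
#include <cassert>
#include <sys/resource.h>

// Hypothetical helper: raise the soft core-file size limit to the hard limit.
// If the hard limit is unlimited, this has the same effect as `ulimit -c unlimited`.
// Returns true on success.
bool raise_core_limit() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return false;
    }
    assert(lim.rlim_cur <= lim.rlim_max);  // the soft limit never exceeds the hard limit
    lim.rlim_cur = lim.rlim_max;           // an unprivileged process may go up to the hard limit
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

Calling `raise_core_limit()` at the start of `main()` should let the process dump core even when the shell that launched it had a soft limit of 0, as long as the hard limit is not 0 as well.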

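Running ls -al again shows there is still no coredump file in this directory. What now? The coredump file name includes the pid of the crashed process, so if we know that pid we can search for the file. To get it, add the following to the code that crashes: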

#include <stdio.h>
#include <unistd.h>

printf("pid is %d\n", getpid());


root@zhr-workstation:~/test# find / -name '*.212206.*'
find: ‘/run/user/1000/gvfs’: Permission denied
find: ‘/run/user/126/gvfs’: Permission denied

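We can also find the coredump file's name in the apport log. apport is Ubuntu's crash reporting system, and the coredump is generated through it. Besides the coredump file's name, the log also shows which signal the program received when it dumped core, as follows: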

root@zhr-workstation:~/test# tail  /var/log/apport.log
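Where the kernel puts a core file is controlled by /proc/sys/kernel/core_pattern. When that pattern starts with `|`, the dump is piped to a program instead of being written to the current directory, which would explain why nothing shows up next to the binary when apport is handling crashes. The sketch below reads the pattern and checks for the pipe; the helper names are hypothetical.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: read the kernel's core file pattern.
// Returns an empty string if the file cannot be read.
std::string read_core_pattern(const std::string& path = "/proc/sys/kernel/core_pattern") {
    std::ifstream in(path);
    std::string pattern;
    std::getline(in, pattern);
    return pattern;
}

// A leading '|' means the kernel pipes the dump to a user-space program
// (for example a crash reporter) rather than writing a file itself.
bool is_piped_to_program(const std::string& pattern) {
    return !pattern.empty() && pattern[0] == '|';
}
```

If `is_piped_to_program(read_core_pattern())` is true, the crash reporter's log is the place to look for the final file name.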



gdb binary_file core_file
root@zhr-workstation:~/test# gdb search.o core._root_test_search_o.0.7304495b-f7bd-4c87-a009-f5c63b165ceb.212237.223362044 
GNU gdb (Ubuntu 11.1-0ubuntu2) 11.1
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
Find the GDB manual and other documentation resources online at:

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from search.o...
[New LWP 212237]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/".
Core was generated by `./search.o'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055f27615227b in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=<error reading variable: Cannot access memory at address 0x7fffca3cfffc>) at search.c:16
16      int binary_search(int* data, int key,  int low, int hight){


(gdb) bt
#0  0x000055f27615227b in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=<error reading variable: Cannot access memory at address 0x7fffca3cfffc>) at search.c:16
#1  0x000055f2761522c2 in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=0) at search.c:20
#2  0x000055f2761522c2 in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=0) at search.c:20
#3  0x000055f2761522c2 in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=0) at search.c:20
#4  0x000055f2761522c2 in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=0) at search.c:20
#5  0x000055f2761522c2 in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=0) at search.c:20
#6  0x000055f2761522c2 in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=0) at search.c:20
#7  0x000055f2761522c2 in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=0) at search.c:20
#8  0x000055f2761522c2 in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=0) at search.c:20
#9  0x000055f2761522c2 in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=0) at search.c:20
#10 0x000055f2761522c2 in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=0) at search.c:20
#11 0x000055f2761522c2 in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=0) at search.c:20
#12 0x000055f2761522c2 in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=0) at search.c:20
#13 0x000055f2761522c2 in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=0) at search.c:20

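A backtrace locates the stack frame of the function containing the code that caused the crash (here a segmentation fault), together with the frames of the call chain that led to it. Suppose the crashing code is in c(), and main() calls a(), a() calls b(), and b() calls c(). The backtrace then prints the stack information for the whole chain from main() down to c() (main()->a()->b()->c()).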
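The main()->a()->b()->c() chain can be sketched as follows. This is only an illustration, not code from the program being debugged: the three functions pass a pointer down the chain, and c() dereferences it.

```cpp
// c() is where the crash would happen: dereferencing a null pointer
// typically raises SIGSEGV on Linux.
int c(const int* p) { return *p; }

// b() and a() only forward the pointer, so they show up as
// intermediate frames in the backtrace.
int b(const int* p) { return c(p); }
int a(const int* p) { return b(p); }
```

If main() calls `a(nullptr)` in a build compiled with -g and without optimization, gdb's bt on the resulting core should list c() as frame #0, followed by b(), a(), and main(), that is, innermost frame first.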

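We use frame 0 to switch to the innermost frame, where the fault occurred. info local shows that the local variables look fine; printing the arguments reveals that the problem is an error accessing the hight variable: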

(gdb) frame 0
#0  0x000055f27615227b in binary_search (data=0x7fffcabcd590, key=2, low=0, hight=<error reading variable: Cannot access memory at address 0x7fffca3cfffc>) at search.c:16
16      int binary_search(int* data, int key,  int low, int hight){
(gdb) info local
mid = 0
index = 0
(gdb) p key
$1 = 2
(gdb) p low
$2 = 0
(gdb) p hight
Cannot access memory at address 0x7fffca3cfffc


(gdb) list
11              int result = binary_search(data, 2, 0, sizeof(data)/4);
13              printf("result is %d\n", result);
14      }
16      int binary_search(int* data, int key,  int low, int hight){
17              int mid = (low + hight) / 2;
18              int index;
19              if( data[mid] > key ){
20                      index = binary_search(data, key, low, mid);
21              }else if( data[mid] < key ){
22                      index = binary_search(data, key, mid, hight);
23              }else if( data[mid] == key ){
24                      return mid;
25              }
26              return index;
28      }
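The backtrace shows the same frame repeated with low=0 and hight=0 at search.c:20, which suggests recursion that never ends: with those arguments mid is 0, and if data[0] is greater than key the function calls itself with exactly the same arguments until the stack runs out. The listing has no base case for an empty range. One possible fix is sketched below, assuming the array is sorted in ascending order and the range is half-open, [low, high); the name `binary_search_fixed` is ours.

```cpp
#include <cassert>

// Searches the half-open range [low, high) of an ascending sorted array.
// Returns the index of key, or -1 if key is not present.
int binary_search_fixed(const int* data, int key, int low, int high) {
    assert(data != nullptr);
    if (low >= high) {
        return -1;  // empty range: this base case is what stops the recursion
    }
    int mid = low + (high - low) / 2;
    if (data[mid] > key) {
        return binary_search_fixed(data, key, low, mid);       // search the left half
    }
    if (data[mid] < key) {
        return binary_search_fixed(data, key, mid + 1, high);  // skip mid, search the right half
    }
    return mid;
}
```

Two things differ from the listing: the empty-range check, and `mid + 1` as the lower bound of the right half, so the range shrinks on every call.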



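First, a user process asks the kernel for a range of memory (virtual addresses, of course). The kernel does not hand over physical memory right away. Only when the process actually writes to an address is a page fault triggered to notify the kernel (how virtual addresses are translated to physical ones through the TLB and page walks is another story). When the kernel goes to give the process a page and that page is dirty (it belonged to another process that wrote data destined for disk, and for performance the data was kept in the page rather than written out immediately), the OS flushes the dirty data to disk and then hands the page to the process that needs it. If there is really no page left to allocate, the kernel reports OOM (out of memory) and the process is made to stop, because memory has run out.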
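The "pages are only provided on first write" behaviour can be observed from user space. The sketch below, for Linux, maps some anonymous memory with `mmap`, writes one byte to each page, and uses `getrusage` to see how many minor page faults the writes caused. The function names are hypothetical, and the exact count depends on the kernel and its configuration, so treat the number as indicative only.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

// Minor page faults taken by this process so far.
long minor_faults() {
    struct rusage ru;
    int rc = getrusage(RUSAGE_SELF, &ru);
    assert(rc == 0);
    (void)rc;
    return ru.ru_minflt;
}

// Maps `pages` anonymous pages, writes one byte to each, and returns how many
// minor faults occurred during the writes, or -1 if the mapping fails.
long faults_from_touching(std::size_t pages) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t len = pages * page;
    void* mem = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        return -1;
    }
    const long before = minor_faults();
    volatile char* bytes = static_cast<volatile char*>(mem);
    for (std::size_t i = 0; i < pages; ++i) {
        bytes[i * page] = 1;  // the first write to a page is what triggers the fault
    }
    const long after = minor_faults();
    munmap(mem, len);
    return after - before;
}
```

The `mmap` call itself returns quickly because only virtual addresses are reserved; the faults are counted during the loop that writes to the pages.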

First, write a program that leaks memory:

#include <iostream>

class memory_leak{
public:
    memory_leak(){
        std::cout << "construct memory leak class" << std::endl;
        init();
    }
    ~memory_leak(){
        std::cout << "destruct memory leak class without delete" << std::endl;
    }
private:
    void init();
    int* leak_data;
};

void memory_leak::init(){
    leak_data = new int[10]; // allocated here and never freed: this is the leak
}

int main(){
    auto ml = new memory_leak();
    delete ml; // the destructor runs, but leak_data is not freed
    return 0;
}

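The program above leaks memory when the ml object is destroyed, because the memory allocated on the heap during construction (new int[10]) is never deleted. Here the leak is obvious at a glance, but in a code base of thousands or tens of thousands of lines it becomes very hard to spot, so we turn to the valgrind tool. First, run the program normally: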

root@zhr-workstation:~/test/gdb# g++ memory_leak.cpp -g 
root@zhr-workstation:~/test/gdb# ./a.out 
construct memory leak class
destruct memory leak class without delete


root@zhr-workstation:~/test/gdb# valgrind ./a.out 
==17759== Memcheck, a memory error detector
==17759== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==17759== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==17759== Command: ./a.out
construct memory leak class
destruct memory leak class without delete
==17759== HEAP SUMMARY:
==17759==     in use at exit: 40 bytes in 1 blocks
==17759==   total heap usage: 4 allocs, 3 frees, 73,776 bytes allocated
==17759== LEAK SUMMARY:
==17759==    definitely lost: 40 bytes in 1 blocks
==17759==    indirectly lost: 0 bytes in 0 blocks
==17759==      possibly lost: 0 bytes in 0 blocks
==17759==    still reachable: 0 bytes in 0 blocks
==17759==         suppressed: 0 bytes in 0 blocks
==17759== Rerun with --leak-check=full to see details of leaked memory
==17759== For lists of detected and suppressed errors, rerun with: -s
==17759== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


root@zhr-workstation:~/test/gdb# valgrind --leak-check=full ./a.out 
==17860== Memcheck, a memory error detector
==17860== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==17860== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==17860== Command: ./a.out
construct memory leak class
destruct memory leak class without delete
==17860== HEAP SUMMARY:
==17860==     in use at exit: 40 bytes in 1 blocks
==17860==   total heap usage: 4 allocs, 3 frees, 73,776 bytes allocated
==17860== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
==17860==    at 0x484A2F3: operator new[](unsigned long) (in /usr/libexec/valgrind/
==17860==    by 0x109243: memory_leak::init() (memory_leak.cpp:20)
==17860==    by 0x10937C: memory_leak::memory_leak() (memory_leak.cpp:8)
==17860==    by 0x109274: main (memory_leak.cpp:24)
==17860== LEAK SUMMARY:
==17860==    definitely lost: 40 bytes in 1 blocks
==17860==    indirectly lost: 0 bytes in 0 blocks
==17860==      possibly lost: 0 bytes in 0 blocks
==17860==    still reachable: 0 bytes in 0 blocks
==17860==         suppressed: 0 bytes in 0 blocks
==17860== For lists of detected and suppressed errors, rerun with: -s
==17860== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

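The report shows that the leak comes from the chain main()->memory_leak::memory_leak()->memory_leak::init()->operator new[]. Like gdb's backtrace, valgrind prints the call chain that led to the problem. 17860 is the process id.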

#include <iostream>

class memory_leak{
public:
    memory_leak(){
        std::cout << "construct memory leak class" << std::endl;
        init();
    }
    ~memory_leak(){
        std::cout << "destruct memory leak class without delete" << std::endl;
        delete leak_data; // plain delete on memory that came from new[]
    }
private:
    void init();
    int* leak_data;
};

void memory_leak::init(){
    leak_data = new int[10]; // allocated with new[]
}

int main(){
    auto ml = new memory_leak();
    delete ml; // the destructor now frees leak_data
    return 0;
}


root@zhr-workstation:~/test/gdb# valgrind   ./a.out 
==18226== Memcheck, a memory error detector
==18226== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==18226== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==18226== Command: ./a.out
construct memory leak class
destruct memory leak class without delete
==18226== Mismatched free() / delete / delete []
==18226==    at 0x484BB6F: operator delete(void*, unsigned long) (in /usr/libexec/valgrind/
==18226==    by 0x1093D3: memory_leak::~memory_leak() (memory_leak.cpp:12)
==18226==    by 0x109289: main (memory_leak.cpp:26)
==18226==  Address 0x4ddf110 is 0 bytes inside a block of size 40 alloc'd
==18226==    at 0x484A2F3: operator new[](unsigned long) (in /usr/libexec/valgrind/
==18226==    by 0x109243: memory_leak::init() (memory_leak.cpp:21)
==18226==    by 0x10937C: memory_leak::memory_leak() (memory_leak.cpp:8)
==18226==    by 0x109274: main (memory_leak.cpp:25)
==18226== HEAP SUMMARY:
==18226==     in use at exit: 0 bytes in 0 blocks
==18226==   total heap usage: 4 allocs, 4 frees, 73,776 bytes allocated
==18226== All heap blocks were freed -- no leaks are possible
==18226== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==18226== 1 errors in context 1 of 1:
==18226== Mismatched free() / delete / delete []
==18226==    at 0x484BB6F: operator delete(void*, unsigned long) (in /usr/libexec/valgrind/
==18226==    by 0x1093D3: memory_leak::~memory_leak() (memory_leak.cpp:12)
==18226==    by 0x109289: main (memory_leak.cpp:26)
==18226==  Address 0x4ddf110 is 0 bytes inside a block of size 40 alloc'd
==18226==    at 0x484A2F3: operator new[](unsigned long) (in /usr/libexec/valgrind/
==18226==    by 0x109243: memory_leak::init() (memory_leak.cpp:21)
==18226==    by 0x10937C: memory_leak::memory_leak() (memory_leak.cpp:8)
==18226==    by 0x109274: main (memory_leak.cpp:25)
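The `Mismatched free() / delete / delete []` report says the block was allocated by operator new[] in init() but released with plain operator delete in the destructor. A minimal sketch of two ways to make them match is shown below; the class names are ours, and the second variant assumes C++11 or later for `std::unique_ptr<int[]>`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Variant 1: keep the raw pointer, but release it with delete[] to match new[].
class memory_leak_fixed {
public:
    static constexpr std::size_t count = 10;
    memory_leak_fixed() : leak_data(new int[count]()) {}
    ~memory_leak_fixed() { delete[] leak_data; }
    // Copying would lead to a double delete, so it is disabled.
    memory_leak_fixed(const memory_leak_fixed&) = delete;
    memory_leak_fixed& operator=(const memory_leak_fixed&) = delete;
    int& at(std::size_t i) { assert(i < count); return leak_data[i]; }
private:
    int* leak_data;
};

// Variant 2: let std::unique_ptr<int[]> call delete[] automatically.
class memory_leak_raii {
public:
    static constexpr std::size_t count = 10;
    memory_leak_raii() : leak_data(new int[count]()) {}
    int& at(std::size_t i) { assert(i < count); return leak_data[i]; }
private:
    std::unique_ptr<int[]> leak_data;
};
```

With either variant, rerunning valgrind on a program that creates and destroys one of these objects should no longer report the mismatch or a definitely lost block. The second variant needs no hand-written destructor at all, which removes the chance of picking the wrong delete.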
