【从入门到放弃-MySQL】InnoDB源码分析-基础数据结构和算法

前言

最近在学习mysql数据库InnoDB引擎的相关知识,在学习的过程中做了一些摘抄和笔记,同时也有一些自己的理解,希望能分享给大家,共同学习,欢迎指正

InnoDB源码文件

先来看一些相关文件及其说明

文件名说明
dyn0dyn.*动态数组的实现
fut0fut.*基于文件的工具集,实现了基于磁盘地址而不是内存地址的双链表
fut0lst.*操作
ha0ha.*用于哈希索引系统的哈希表实现
hash0hash.*简单哈希表实现,用于fil、buf和recv等模块中
mem0.内存管理系统,包括内存池的实现
ut0byte.*字节操作实现
ut0dbg.cdebug工具
ut0mem.*内存管理基元,内存块实现
ut0rnd.*随机值和哈希值操作实现
ut0ut.*其它常用操作,包括时间、打印、对数和指数操作
ut0sort.h标准排序算法的宏定义,基于合并排序
ut0lst.h双向线性链表实现

内存管理

InnoDB存储引擎采用内存堆的方式来进行内存对象的管理(而不是直接使用malloc和free),其优点是可以一次性分配大块的内存,而不是按需分配。这样的分配方式可以将多次的内存分配合并为单次进行,之后的内存请求救可以在InnoDB引擎内部进行,从而减少了频繁调用函数malloc和free进行的时间与潜在的性能开销。
此外,InnoDB存储引擎还允许从缓冲池中分配内存来建立内存堆,这样可以更快的请求整个内存页(通常为16K)。这种分配方法称为缓冲池分配

在这,我联想到之前看到过在php的内核中,内存管理也是采取这样的方式,因为不管是php还是mysql都要经常的申请和释放内存,而只有操作系统才能使用malloc和free操作内存,普通的应用程序是无法直接对内存进行访问的,需要向操作系统申请,向操作系统申请会引发系统调用,在系统调用时,cpu要从用户态切换到内核态,而这个切换的开销是比较大的,如果频繁切换则会有很大性能开销。所以,在对内存有频繁操作的应用程序中,通常每次申请一大块内存来自己维护,同样的在这些内存使用完之后也不立刻归还给操作系统,而是自己留着以备下次使用

内存块的数据结构

内存堆相当于一个栈,通过不断的增加内存块对象来增长空间,
InnoDB使用mem_block_t来表示内存堆中每次从操作系统或者缓冲池中分配的内存块。

/innobase/include/mem0mem.h line:43

1
2
3
/* A block of a memory heap consists of the info structure
followed by an area of memory */
typedef struct mem_block_info_t mem_block_t;

mem_block_info_t的数据结构如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
### /innobase/include/mem0mem.h    line:371
/** The info structure stored at the beginning of a heap block */
struct mem_block_info_t {
ulint magic_n;/* magic number for debugging */
#ifdef UNIV_DEBUG,
char file_name[8];/* file name where the mem heap was created */
ulint line; /*!< line number where the mem heap was created */
#endif /* UNIV_DEBUG */
UT_LIST_BASE_NODE_T(mem_block_t) base; /* In the first block in the
the list this is the base node of the list of blocks;
in subsequent blocks this is undefined */
UT_LIST_NODE_T(mem_block_t) list; /* This contains pointers to next
and prev in the list. The first block allocated
to the heap is also the first block in this list,
though it also contains the base node of the list. */
ulint len; /*!< physical length of this block in bytes */
ulint total_size; /*!< physical length in bytes of all blocks
in the heap. This is defined only in the base
node and is set to ULINT_UNDEFINED in others. */
ulint type; /*!< type of heap: MEM_HEAP_DYNAMIC, or
MEM_HEAP_BUF possibly ORed to MEM_HEAP_BTR_SEARCH */
ulint free; /*!< offset in bytes of the first free position for
user data in the block */
ulint start; /*!< the value of the struct field 'free' at the
creation of the block */
#ifndef UNIV_HOTBACKUP
void* free_block;
/* if the MEM_HEAP_BTR_SEARCH bit is set in type,
and this is the heap root, this can contain an
allocated buffer frame, which can be appended as a
free block to the heap, if we need more space;
otherwise, this is NULL */
void* buf_block;
/* if this block has been allocated from the buffer
pool, this contains the buf_block_t handle;
otherwise, this is NULL */
#endif /* !UNIV_HOTBACKUP */
#ifdef MEM_PERIODIC_CHECK
UT_LIST_NODE_T(mem_block_t) mem_block_list;
/* List of all mem blocks allocated; protected
by the mem_comm_pool mutex */
#endif
};

UT_LIST_BASE_NODE_T ### /innobase/include/ut0lst.h line:75

1
2
3
4
5
6
7
8
9
10
11
12
13
14
This macro expands to the unnamed type definition of a struct which acts
as the two-way list base node. The base node contains pointers
to both ends of the list and a count of nodes in the list (excluding
the base node from the count).
@param TYPE the name of the list node data type */
template <typename TYPE>
struct ut_list_base {
typedef TYPE elem_type;
ulint count; /*!< count of nodes in list */
TYPE* start; /*!< pointer to list start, NULL if empty */
TYPE* end; /*!< pointer to list end, NULL if empty */
};

#define UT_LIST_BASE_NODE_T(TYPE) ut_list_base<TYPE>

可以看到,有两个指针指向链表头和链表尾部,count存储节点个数

UT_LIST_NODE_T在 ### /innobase/include/ut0lst.h line:75

1
2
3
4
5
6
7
template <typename TYPE>
struct ut_list_node {
TYPE* prev; /*!< pointer to the previous node,
NULL if start of list */
TYPE* next; /*!< pointer to next node, NULL if end of list */
};
#define UT_LIST_NODE_T(TYPE) ut_list_node<TYPE>

可以看到有两个指针指向前一个和后一个结点

InnoDB 存储引擎定义了三种内存堆类型:

/innobase/include/mem0mem.h line:52

1
2
3
4
5
6
7
8
9
#define MEM_HEAP_DYNAMIC	0	/* the most common type 堆的内存调用通过内存池接口申请*/
#define MEM_HEAP_BUFFER 1 /* 堆的内存调用通过缓冲池申请 */
#define MEM_HEAP_BTR_SEARCH 2 /* this flag can optionally be
ORed to MEM_HEAP_BUFFER, in which
case heap->free_block is used in
some cases for memory allocations,
and if it's NULL, the memory
allocation functions can return
NULL. 是MEM_HEAP_BUFFER的子类型,仅在自适应哈希索引中使用,并且此类型会多维护一个free_block缓冲块 */

内存堆的创建是由函数mem_heap_create_func完成

/innobase/include/mem0mem.ic line:431

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
/*****************************************************************//**
NOTE: Use the corresponding macros instead of this function. Creates a
memory heap. For debugging purposes, takes also the file name and line as
argument.
@return own: memory heap, NULL if did not succeed (only possible for
MEM_HEAP_BTR_SEARCH type heaps) */
UNIV_INLINE
mem_heap_t*
mem_heap_create_func(
/*=================*/
ulint n, /*!< in: desired start block size,
this means that a single user buffer
of size n will fit in the block,
0 creates a default size block */
#ifdef UNIV_DEBUG
const char* file_name, /*!< in: file name where created */
ulint line, /*!< in: line where created */
#endif /* UNIV_DEBUG */
ulint type) /*!< in: heap type */
{
mem_block_t* block;
if (!n) {
n = MEM_BLOCK_START_SIZE;
}
block = mem_heap_create_block(NULL, n, type, file_name, line);
if (block == NULL) {
return(NULL);
}
UT_LIST_INIT(block->base);
/* Add the created block itself as the first block in the list */
UT_LIST_ADD_FIRST(list, block->base, block);
#ifdef UNIV_MEM_DEBUG
mem_hash_insert(block, file_name, line);
#endif
return(block);
}

从上面代码中可以看到mem_heap_create_func主要是调用mem_heap_create_block函数实现内存块的创建的

/innobase/include/mem0mem.ic line:32

1
2
3
4
5
6
7
# define mem_heap_create_block(heap, n, type, file_name, line)		\
mem_heap_create_block_func(heap, n, file_name, line, type)
# define mem_heap_create_at(N, file_name, line) \
mem_heap_create_func(N, file_name, line, MEM_HEAP_DYNAMIC)
#else /* UNIV_DEBUG */
# define mem_heap_create_block(heap, n, type, file_name, line) \
mem_heap_create_block_func(heap, n, type)

再来追mem_heap_create_block_func函数

/innobase/mem/mem0mem.cc line:296

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
/***************************************************************//**
Creates a memory heap block where data can be allocated.
@return own: memory heap block, NULL if did not succeed (only possible
for MEM_HEAP_BTR_SEARCH type heaps) */
UNIV_INTERN
mem_block_t*
mem_heap_create_block_func(
/*=======================*/
mem_heap_t* heap, /*!< in: memory heap or NULL if first block
should be created */
ulint n, /*!< in: number of bytes needed for user data */
#ifdef UNIV_DEBUG
const char* file_name,/*!< in: file name where created */
ulint line, /*!< in: line where created */
#endif /* UNIV_DEBUG */
ulint type) /*!< in: type of heap: MEM_HEAP_DYNAMIC or
MEM_HEAP_BUFFER */
{
#ifndef UNIV_HOTBACKUP
buf_block_t* buf_block = NULL;
#endif /* !UNIV_HOTBACKUP */
mem_block_t* block;
ulint len;

ut_ad((type == MEM_HEAP_DYNAMIC) || (type == MEM_HEAP_BUFFER)
|| (type == MEM_HEAP_BUFFER + MEM_HEAP_BTR_SEARCH));

if (heap && heap->magic_n != MEM_BLOCK_MAGIC_N) {
mem_analyze_corruption(heap);
}

/* In dynamic allocation, calculate the size: block header + data. */
len = MEM_BLOCK_HEADER_SIZE + MEM_SPACE_NEEDED(n);

#ifndef UNIV_HOTBACKUP
if (type == MEM_HEAP_DYNAMIC || len < UNIV_PAGE_SIZE / 2) {

ut_ad(type == MEM_HEAP_DYNAMIC || n <= MEM_MAX_ALLOC_IN_BUF);

block = static_cast<mem_block_t*>(
mem_area_alloc(&len, mem_comm_pool));
} else {
len = UNIV_PAGE_SIZE;

if ((type & MEM_HEAP_BTR_SEARCH) && heap) {
/* We cannot allocate the block from the
buffer pool, but must get the free block from
the heap header free block field */

buf_block = static_cast<buf_block_t*>(heap->free_block);
heap->free_block = NULL;

if (UNIV_UNLIKELY(!buf_block)) {

return(NULL);
}
} else {
buf_block = buf_block_alloc(NULL);
}

block = (mem_block_t*) buf_block->frame;
}

if(!block) {
ib_logf(IB_LOG_LEVEL_FATAL,
" InnoDB: Unable to allocate memory of size %lu.\n",
len);
}
block->buf_block = buf_block;
block->free_block = NULL;
#else /* !UNIV_HOTBACKUP */
len = MEM_BLOCK_HEADER_SIZE + MEM_SPACE_NEEDED(n);
block = ut_malloc(len);
ut_ad(block);
#endif /* !UNIV_HOTBACKUP */

block->magic_n = MEM_BLOCK_MAGIC_N;
ut_d(ut_strlcpy_rev(block->file_name, file_name,
sizeof(block->file_name)));
ut_d(block->line = line);

#ifdef MEM_PERIODIC_CHECK
mutex_enter(&(mem_comm_pool->mutex));

if (!mem_block_list_inited) {
mem_block_list_inited = TRUE;
UT_LIST_INIT(mem_block_list);
}

UT_LIST_ADD_LAST(mem_block_list, mem_block_list, block);

mutex_exit(&(mem_comm_pool->mutex));
#endif
mem_block_set_len(block, len);
mem_block_set_type(block, type);
mem_block_set_free(block, MEM_BLOCK_HEADER_SIZE);
mem_block_set_start(block, MEM_BLOCK_HEADER_SIZE);

if (UNIV_UNLIKELY(heap == NULL)) {
/* This is the first block of the heap. The field
total_size should be initialized here */
block->total_size = len;
} else {
/* Not the first allocation for the heap. This block's
total_length field should be set to undefined. */
ut_d(block->total_size = ULINT_UNDEFINED);
UNIV_MEM_INVALID(&block->total_size,
sizeof block->total_size);

heap->total_size += len;
}

ut_ad((ulint)MEM_BLOCK_HEADER_SIZE < len);

return(block);
}

这个就是最终的创建内存块的函数,主要是对mem_block_t结构的一些赋值操作

到这一步mem_block_t结构的构成、创建和赋值已经算是简单了解完了。可能在学习方法或某些方面会存在不足,希望大家能批评指正。