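The question that started this
When a message queue persists messages to disk, RocketMQ uses mmap. But if the same file is mmapped in several places, does it get mapped into physical memory more than once?
A page is the smallest unit of memory management (usually 4 KiB).
Inspecting a program's virtual addresses
man proc documents /proc/[pid]/maps and explains its fields: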
/proc/[pid]/maps
A file containing the currently mapped memory regions and their access permissions. See mmap(2) for some further information about memory mappings.
Permission to access this file is governed by a ptrace access mode PTRACE_MODE_READ_FSCREDS check; see ptrace(2).
The format of the file is:
address perms offset dev inode pathname
00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/dbus-daemon
00651000-00652000 r--p 00051000 08:02 173521 /usr/bin/dbus-daemon
00652000-00655000 rw-p 00052000 08:02 173521 /usr/bin/dbus-daemon
00e03000-00e24000 rw-p 00000000 00:00 0 [heap]
00e24000-011f7000 rw-p 00000000 00:00 0 [heap]
...
35b1800000-35b1820000 r-xp 00000000 08:02 135522 /usr/lib64/ld-2.15.so
35b1a1f000-35b1a20000 r--p 0001f000 08:02 135522 /usr/lib64/ld-2.15.so
35b1a20000-35b1a21000 rw-p 00020000 08:02 135522 /usr/lib64/ld-2.15.so
35b1a21000-35b1a22000 rw-p 00000000 00:00 0
35b1c00000-35b1dac000 r-xp 00000000 08:02 135870 /usr/lib64/libc-2.15.so
35b1dac000-35b1fac000 ---p 001ac000 08:02 135870 /usr/lib64/libc-2.15.so
35b1fac000-35b1fb0000 r--p 001ac000 08:02 135870 /usr/lib64/libc-2.15.so
35b1fb0000-35b1fb2000 rw-p 001b0000 08:02 135870 /usr/lib64/libc-2.15.so
...
f2c6ff8c000-7f2c7078c000 rw-p 00000000 00:00 0 [stack:986]
...
7fffb2c0d000-7fffb2c2e000 rw-p 00000000 00:00 0 [stack]
7fffb2d48000-7fffb2d49000 r-xp 00000000 00:00 0 [vdso]
The address field is the address space in the process that the mapping occupies. The perms field is a set of permissions:
r = read
w = write
x = execute
s = shared
p = private (copy on write)
With /proc/[pid]/maps it is easy to see which virtual address ranges a program has mapped.
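As a small illustration of the format above, the sketch below splits one maps line into its fields. The Region type and the parseMapsLine function are names made up for this example, not part of any library, and a pathname containing runs of spaces would be normalized to single spaces.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Region holds the fields of one /proc/[pid]/maps line (hypothetical type).
type Region struct {
	Start, End uint64 // virtual address range [Start, End)
	Perms      string // e.g. "r-xp" or "rw-s"
	Offset     uint64 // offset into the backing file
	Dev        string // major:minor device
	Inode      uint64 // 0 for anonymous mappings
	Path       string // empty for anonymous mappings
}

// parseMapsLine parses "address perms offset dev inode pathname".
func parseMapsLine(line string) (Region, error) {
	f := strings.Fields(line)
	if len(f) < 5 {
		return Region{}, fmt.Errorf("malformed maps line: %q", line)
	}
	addr := strings.SplitN(f[0], "-", 2)
	if len(addr) != 2 {
		return Region{}, fmt.Errorf("malformed address range: %q", f[0])
	}
	var r Region
	var err error
	if r.Start, err = strconv.ParseUint(addr[0], 16, 64); err != nil {
		return Region{}, err
	}
	if r.End, err = strconv.ParseUint(addr[1], 16, 64); err != nil {
		return Region{}, err
	}
	r.Perms = f[1]
	if r.Offset, err = strconv.ParseUint(f[2], 16, 64); err != nil {
		return Region{}, err
	}
	r.Dev = f[3]
	if r.Inode, err = strconv.ParseUint(f[4], 10, 64); err != nil {
		return Region{}, err
	}
	// Anything after the inode is the pathname; it is absent for anonymous regions.
	r.Path = strings.Join(f[5:], " ")
	return r, nil
}

func main() {
	r, err := parseMapsLine("00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/dbus-daemon")
	if err != nil {
		panic(err)
	}
	shared := strings.HasSuffix(r.Perms, "s")
	fmt.Printf("%#x-%#x %s shared=%v %s\n", r.Start, r.End, r.Perms, shared, r.Path)
}
```

The last character of the perms field is what distinguishes a private (p) mapping from a shared (s) one, which matters for the mmap tests further down.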
Virtual memory layout
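// cat test.go
// go build -o test test.go
package main
import (
	"fmt"
	"os"
	"os/signal"
	"time"
)
func main() {
	// signal.Notify never blocks when sending, so the channel needs a buffer.
	c := make(chan os.Signal, 1)
	signal.Notify(c)
	// Keep the process alive and printing so its /proc entries can be inspected.
	go func() {
		for {
			fmt.Printf("hello world\n")
			time.Sleep(time.Duration(1) * time.Second)
		}
	}()
	<-c
	fmt.Printf("close")
}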
cat /proc/10415/maps
00400000-0048c000 r-xp 00000000 08:05 523769 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test
0048c000-0054d000 r--p 0008c000 08:05 523769 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test
0054d000-00561000 rw-p 0014d000 08:05 523769 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test
00561000-0057f000 rw-p 00000000 00:00 0
c000000000-c004000000 rw-p 00000000 00:00 0
7f7a637de000-7f7a65a8f000 rw-p 00000000 00:00 0
7fff8850e000-7fff8852f000 rw-p 00000000 00:00 0 [stack]
7fff8852f000-7fff88532000 r--p 00000000 00:00 0 [vvar]
7fff88532000-7fff88534000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
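Interestingly, the same path /media/chainhelen/LENOVO/xlworkspace/code/go/src/test appears in three mappings.
The first has the x permission, so it is presumably .text, the code section.
The second is read-only and the third is read-write; my guess was that these are .bss and .data respectively.
We can check this guess with readelf: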
[ 1] .text PROGBITS 0000000000401000 00001000
000000000008aedb 0000000000000000 AX 0 0 16
[ 9] .data PROGBITS 0000000000559aa0 00159aa0
0000000000006ef0 0000000000000000 WA 0 0 32
[10] .bss NOBITS 00000000005609a0 001609a0
000000000001b9b0 0000000000000000 WA 0 0 32
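Well, that guess was proven wrong right away. The numbers do not quite line up, because there are also a number of less prominent sections.
So let's start from the segment point of view to understand how sections are organized.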
readelf -l test
Elf file type is EXEC (Executable file)
Entry point 0x455050
There are 7 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x0000000000000188 0x0000000000000188 R 0x1000
NOTE 0x0000000000000f9c 0x0000000000400f9c 0x0000000000400f9c
0x0000000000000064 0x0000000000000064 R 0x4
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x000000000008bedb 0x000000000008bedb R E 0x1000
LOAD 0x000000000008c000 0x000000000048c000 0x000000000048c000
0x00000000000c0bd1 0x00000000000c0bd1 R 0x1000
LOAD 0x000000000014d000 0x000000000054d000 0x000000000054d000
0x00000000000139a0 0x0000000000031ab8 RW 0x1000
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x8
LOOS+0x5041580 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 0x8
Section to Segment mapping:
Segment Sections...
00
01 .note.go.buildid
02 .text .note.go.buildid
03 .rodata .typelink .itablink .gosymtab .gopclntab
04 .noptrdata .data .bss .noptrbss
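Note that the three LOAD segments line up exactly with the process's virtual memory mappings; they are entries 02, 03, and 04 of the Section to Segment mapping.
Their addresses also match the three virtual address ranges of the running process.
Each mapping covers the file contents of several sections.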
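The correspondence can be checked by hand. Mappings are made in whole pages, so rounding VirtAddr down and VirtAddr plus the size up to the page size should reproduce the ranges seen in /proc/[pid]/maps. The sketch below does that arithmetic for the LOAD entries above, assuming 4 KiB pages; pageRange is a helper name invented for this example.

```go
package main

import "fmt"

const pageSize = 0x1000 // assumes 4 KiB pages, as on the machine above

// pageRange rounds [vaddr, vaddr+size) outward to page boundaries.
func pageRange(vaddr, size uint64) (start, end uint64) {
	start = vaddr &^ (pageSize - 1)
	end = (vaddr + size + pageSize - 1) &^ (pageSize - 1)
	return start, end
}

func main() {
	// VirtAddr, FileSiz and MemSiz are copied from the readelf -l output above.
	s, e := pageRange(0x400000, 0x8bedb)
	fmt.Printf("02 (R E): %08x-%08x\n", s, e) // 00400000-0048c000
	s, e = pageRange(0x48c000, 0xc0bd1)
	fmt.Printf("03 (R):   %08x-%08x\n", s, e) // 0048c000-0054d000
	s, e = pageRange(0x54d000, 0x139a0) // FileSiz: the file-backed part
	fmt.Printf("04 (RW):  %08x-%08x\n", s, e) // 0054d000-00561000
	s, e = pageRange(0x54d000, 0x31ab8) // MemSiz: includes .bss and .noptrbss
	fmt.Printf("04 (mem): %08x-%08x\n", s, e) // 0054d000-0057f000
}
```

The last two lines also account for the anonymous region 00561000-0057f000 in the maps output: MemSiz is larger than FileSiz for segment 04, and the part beyond the file contents has no file behind it.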
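Physical memory addresses
Everything above is a virtual address. To find the physical address behind a mapping, /proc/[pid]/pagemap exposes the virtual-to-physical relationship.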
cat /proc/10415/pagemap
(binary data, shown by the terminal as a run of unprintable characters)
The file cannot be read directly: the output is a wall of garbage because the contents are binary.
// from man proc
/proc/[pid]/pagemap (since Linux 2.6.25)
This file shows the mapping of each of the process's virtual pages into physical page frames or swap area. It contains one 64-bit value for each virtual page, with the bits set as follows:
63 If set, the page is present in RAM.
62 If set, the page is in swap space
61 (since Linux 3.5)
The page is a file-mapped page or a shared anonymous page.
60–57 (since Linux 3.11)
Zero
56 (since Linux 4.2)
The page is exclusively mapped.
55 (since Linux 3.11)
PTE is soft-dirty (see the kernel source file Documentation/vm/soft-dirty.txt).
54–0 If the page is present in RAM (bit 63), then these bits provide the page frame number, which can be used to index /proc/kpageflags and /proc/kpagecount. If the page is present in
swap (bit 62), then bits 4–0 give the swap type, and bits 54–5 encode the swap offset.
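As the excerpt shows, every virtual page is described by one 64-bit value.
Ignoring swap space and looking only at RAM, bits 0-54 hold the page frame number,
that is, the index of the physical page that backs the virtual page (read those bits as an integer),
while bit 63 says whether the page is actually present in RAM.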
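The bit layout above can be decoded in a few lines. The sketch below is one way to do it, assuming the 8 bytes were read on a little-endian machine such as x86-64; Entry and decodeEntry are names invented here. The sample bytes are one of the entries printed later in this article.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Entry is a decoded /proc/[pid]/pagemap value (hypothetical type).
type Entry struct {
	Present      bool   // bit 63
	Swapped      bool   // bit 62
	FileOrShared bool   // bit 61
	Exclusive    bool   // bit 56
	SoftDirty    bool   // bit 55
	PFN          uint64 // bits 0-54, only meaningful when Present
}

// decodeEntry interprets 8 raw bytes as one pagemap entry.
func decodeEntry(raw []byte) Entry {
	v := binary.LittleEndian.Uint64(raw)
	e := Entry{
		Present:      v&(1<<63) != 0,
		Swapped:      v&(1<<62) != 0,
		FileOrShared: v&(1<<61) != 0,
		Exclusive:    v&(1<<56) != 0,
		SoftDirty:    v&(1<<55) != 0,
	}
	if e.Present {
		e.PFN = v & (1<<55 - 1) // keep bits 0-54
	}
	return e
}

func main() {
	raw := []byte{0x90, 0xc9, 0x1f, 0x0, 0x0, 0x0, 0x80, 0xa1}
	e := decodeEntry(raw)
	fmt.Printf("present=%v swapped=%v file/shared=%v pfn=%#x\n",
		e.Present, e.Swapped, e.FileOrShared, e.PFN)
	// present=true swapped=false file/shared=true pfn=0x1fc990
}
```

Reading the whole value at once also makes the flag bits visible, for instance that this page is file-mapped or shared (bit 61), which is what one would expect for a MAP_SHARED file mapping.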
Translating a virtual address
./vir_trans_phy [pid] [viraddr]
package main
import (
	"fmt"
	"io"
	"os"
	"strconv"
)
func main() {
	var (
		err          error
		pid          int
		viraddr      int64
		file         *os.File
		pagemappath  string
		pagesize     int64
		info         []byte
		pageframenum int64
		phyaddr      int64
		offset       int64
	)
	if len(os.Args) != 3 {
		panic("be like ./vir_trans_phy pid virtual_address")
	}
	if pid, err = strconv.Atoi(os.Args[1]); err != nil {
		panic("be like ./vir_trans_phy pid virtual_address " + err.Error())
	}
	// The virtual address is given in hex without a 0x prefix.
	if viraddr, err = strconv.ParseInt(os.Args[2], 16, 64); err != nil {
		panic("be like ./vir_trans_phy pid virtual_address " + err.Error())
	}
	pagemappath = fmt.Sprintf("/proc/%d/pagemap", pid)
	if file, err = os.Open(pagemappath); err != nil {
		panic(fmt.Sprintf("can't open %s, err=%s", pagemappath, err.Error()))
	}
	defer file.Close()
	pagesize = int64(os.Getpagesize())
	// pagemap holds one 8-byte entry per virtual page.
	offset = (viraddr / pagesize) * 8
	fmt.Printf("moviing offset %d\n", offset)
	if _, err = file.Seek(offset, 0); err != nil {
		panic("file seek offset" + err.Error())
	}
	// io.ReadFull guarantees that all 8 bytes of the entry are read.
	info = make([]byte, 8)
	if _, err = io.ReadFull(file, info); err != nil {
		panic("reader info " + err.Error())
	}
	fmt.Printf("reader info %#v\n", info)
	// Bit 63 is the top bit of the last byte (little-endian): page present in RAM.
	if 0 == (info[7] & (1 << uint(7))) {
		// Bit 62: page is in swap space.
		if 0 == (info[7] & (1 << uint(6))) {
			fmt.Printf("pid:%d virtaddr:0x%x not present, not in swapped\n", pid, viraddr)
		} else {
			fmt.Printf("pid:%d virtaddr:0x%x not present, in swapped\n", pid, viraddr)
		}
	} else {
		// Bits 0-54 hold the page frame number; accumulate them bit by bit.
		tmp := int64(1)
		for i := 0; i <= 54; i++ {
			res := info[i/8] & (1 << uint(i%8))
			if 0 != res {
				pageframenum += tmp
			}
			tmp = tmp * 2
		}
		// The offset inside the page is the same for virtual and physical addresses.
		phyaddr = pageframenum*pagesize + viraddr%pagesize
		fmt.Printf("pid:%d virtaddr:0x%x phyaddr:0x%x\n", pid, viraddr, phyaddr)
	}
}
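That is essentially the whole program. A few points to note:
- 64 bits are 8 bytes, which is why an 8-byte slice is allocated; the fields are then decoded bit by bit rather than byte by byte.
- Once the page frame number is known, the physical address follows directly: mappings are made page by page, so the offset from the start of the page is the same in the virtual and the physical address.
- The present bit has to be checked first.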
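The two pieces of arithmetic in the program, the seek offset into pagemap and the final physical address, can be checked in isolation. A small sketch, assuming 4 KiB pages; pagemapOffset and physAddr are helper names made up for this example, and the inputs are taken from runs shown later in the article.

```go
package main

import "fmt"

const pageSize = 4096 // assumes 4 KiB pages

// pagemapOffset returns the byte offset of the 8-byte pagemap entry
// that describes the page containing vaddr.
func pagemapOffset(vaddr uint64) uint64 {
	return vaddr / pageSize * 8
}

// physAddr combines a page frame number with the in-page offset of vaddr.
func physAddr(pfn, vaddr uint64) uint64 {
	return pfn*pageSize + vaddr%pageSize
}

func main() {
	fmt.Println(pagemapOffset(0x7fbe16a95000))              // 274325001384
	fmt.Printf("%#x\n", physAddr(0x1fc990, 0x7fbe16a95000)) // 0x1fc990000
	fmt.Printf("%#x\n", physAddr(0x1fc990, 0x7fbe16a95123)) // 0x1fc990123
}
```

The third call shows the second note in practice: an address 0x123 bytes into the virtual page lands 0x123 bytes into the physical page.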
Tests
The tests below use mmap.
1 Mmap, then Munmap immediately
// cat test_mmap_1.go
package main
import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)
func main() {
	// signal.Notify never blocks when sending, so the channel needs a buffer.
	c := make(chan os.Signal, 1)
	signal.Notify(c)
	go func() {
		var err error
		file, err := os.OpenFile("mmap.test", os.O_RDWR|os.O_CREATE, 0644)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%#v\n", file.Fd())
		defer file.Close()
		// Map the first 256 bytes of the file as a shared, writable mapping.
		mmapdata, err := syscall.Mmap(int(file.Fd()), 0, 1<<8, syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
		if err != nil {
			panic(err)
		}
		// Unmap right away.
		if err = syscall.Munmap(mmapdata); nil != err {
			panic(err)
		}
	}()
	<-c
}
// cat /proc/10994/maps
00400000-0048d000 r-xp 00000000 08:05 523772 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_1
0048d000-0054f000 r--p 0008d000 08:05 523772 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_1
0054f000-00563000 rw-p 0014f000 08:05 523772 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_1
00563000-00581000 rw-p 00000000 00:00 0
c000000000-c004000000 rw-p 00000000 00:00 0
7fbbe34f1000-7fbbe57e2000 rw-p 00000000 00:00 0
7fff8a25e000-7fff8a27f000 rw-p 00000000 00:00 0 [stack]
7fff8a36c000-7fff8a36f000 r--p 00000000 00:00 0 [vvar]
7fff8a36f000-7fff8a371000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
The mapping leaves no trace in maps, so the next test also writes to the file.
2 Mmap, write to the file, then Munmap
...
		mmapdata, err := syscall.Mmap(int(file.Fd()), 0, 1<<8, syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
		if err != nil {
			panic(err)
		}
		for i, v := range []byte("hello mmap_1") {
			mmapdata[i] = v
		}
		if err = syscall.Munmap(mmapdata); nil != err {
			panic(err)
		}
...
// cat /proc/11909/maps
00400000-0048d000 r-xp 00000000 08:05 523772 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_2
0048d000-0054f000 r--p 0008d000 08:05 523772 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_2
0054f000-00563000 rw-p 0014f000 08:05 523772 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_2
00563000-00581000 rw-p 00000000 00:00 0
c000000000-c004000000 rw-p 00000000 00:00 0
7febe4951000-7febe6c02000 rw-p 00000000 00:00 0
7ffc17dda000-7ffc17dfb000 rw-p 00000000 00:00 0 [stack]
7ffc17dfb000-7ffc17dfe000 r--p 00000000 00:00 0 [vvar]
7ffc17dfe000-7ffc17e00000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
maps still shows nothing, although a mmap.test file has now been created. Next, remove the Munmap call and look again.
3 Mmap, write to mmap.test, and do not Munmap
		mmapdata, err := syscall.Mmap(int(file.Fd()), 0, 1<<8, syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
		if err != nil {
			panic(err)
		}
		for i, v := range []byte("hello mmap_3") {
			mmapdata[i] = v
		}
		// if err = syscall.Munmap(mmapdata); nil != err {
		// 	panic(err)
		// }
00400000-0048d000 r-xp 00000000 08:05 523750 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_3
0048d000-0054f000 r--p 0008d000 08:05 523750 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_3
0054f000-00563000 rw-p 0014f000 08:05 523750 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_3
00563000-00581000 rw-p 00000000 00:00 0
c000000000-c004000000 rw-p 00000000 00:00 0
7f851fb44000-7f851fb45000 rw-s 00000000 08:05 522712 /media/chainhelen/LENOVO/xlworkspace/code/go/src/mmap.test
7f851fb45000-7f8521e36000 rw-p 00000000 00:00 0
7ffc38027000-7ffc38048000 rw-p 00000000 00:00 0 [stack]
7ffc381c2000-7ffc381c5000 r--p 00000000 00:00 0 [vvar]
7ffc381c5000-7ffc381c7000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
However, actually translating the virtual address gives:
./vir_trans_phy 15570 7fbe16a95000
moviing offset 274325001384
reader info []byte{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0xa1}
pid:15570 virtaddr:0x7fbe16a95000 phyaddr:0x0
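A physical address of 0x0 is obviously wrong. What is going on?
The problem turned out to be permissions: without root, the page frame number bits read back as zero while the flag bits are still visible. (According to the kernel's pagemap documentation, since Linux 4.2 the PFN field is zeroed for readers that lack CAP_SYS_ADMIN.)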
sudo ./vir_trans_phy 15570 7fbe16a95000
moviing offset 274325001384
reader info []byte{0x90, 0xc9, 0x1f, 0x0, 0x0, 0x0, 0x80, 0xa1}
pid:15570 virtaddr:0x7fbe16a95000 phyaddr:0x1fc990000
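To avoid being misled by a silent 0x0, a tool like this could flag entries that are marked present but carry a zero page frame number, which is what a reader without CAP_SYS_ADMIN sees on Linux 4.2 and later. A sketch under that assumption, for a little-endian machine; pfnHidden is a name invented for this example, and the two byte slices are the entries printed above.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// pfnHidden reports whether a pagemap entry is present in RAM yet has a
// zero page frame number, which usually means the reader lacks the
// privilege needed to see PFNs.
func pfnHidden(raw []byte) bool {
	v := binary.LittleEndian.Uint64(raw)
	present := v&(1<<63) != 0
	pfn := v & (1<<55 - 1) // bits 0-54
	return present && pfn == 0
}

func main() {
	withoutSudo := []byte{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0xa1}
	withSudo := []byte{0x90, 0xc9, 0x1f, 0x0, 0x0, 0x0, 0x80, 0xa1}
	fmt.Println(pfnHidden(withoutSudo), pfnHidden(withSudo)) // true false
}
```

With such a check the program could print a hint to rerun with sudo instead of reporting phyaddr:0x0.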
4 What happens when several goroutines mmap the same file? Modify the code above:
package main
import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)
func main() {
	// signal.Notify never blocks when sending, so the channel needs a buffer.
	c := make(chan os.Signal, 1)
	signal.Notify(c)
	// First goroutine: map the first page of mmap.test and write to it.
	go func() {
		var err error
		file, err := os.OpenFile("mmap.test", os.O_RDWR|os.O_CREATE, 0644)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%#v\n", file.Fd())
		defer file.Close()
		mmapdata, err := syscall.Mmap(int(file.Fd()), 0, 1<<8, syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
		if err != nil {
			panic(err)
		}
		for i, v := range []byte("hello mmap_3") {
			mmapdata[i] = v
		}
	}()
	// Second goroutine: map the second page (file offset 1<<12) of the same file.
	go func() {
		var err error
		file, err := os.OpenFile("mmap.test", os.O_RDWR|os.O_CREATE, 0644)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%#v\n", file.Fd())
		defer file.Close()
		mmapdata, err := syscall.Mmap(int(file.Fd()), 1<<12, 1<<8, syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
		if err != nil {
			panic(err)
		}
		for i, v := range []byte("hello mmap_4") {
			mmapdata[i] = v
		}
	}()
	<-c
}
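a. One detail to note: the second argument of Mmap, offset, must be a multiple of 4k (1<<12, i.e. the page size). Otherwise the call fails with invalid argument, because a mapping has to start on a page boundary.
b. When actually run, the code above kept failing with: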
0x3
unexpected fault address 0x7f8edf46a000
fatal error: fault
0x4
[signal SIGBUS: bus error code=0x2 addr=0x7f8edf46a000 pc=0x48c65d]
Accessing a mapped address that lies beyond the end of the file raises SIGBUS. The file has to be extended with Ftruncate first:
package main
import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)
func main() {
	// signal.Notify never blocks when sending, so the channel needs a buffer.
	c := make(chan os.Signal, 1)
	signal.Notify(c)
	// First goroutine: map the first page of mmap.test and write to it.
	go func() {
		var err error
		file, err := os.OpenFile("mmap.test", os.O_RDWR|os.O_CREATE, 0644)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%#v\n", file.Fd())
		defer file.Close()
		mmapdata, err := syscall.Mmap(int(file.Fd()), 0, 1<<8, syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
		if err != nil {
			panic(err)
		}
		for i, v := range []byte("hello mmap_3") {
			mmapdata[i] = v
		}
		// if err = syscall.Munmap(mmapdata); nil != err {
		// 	panic(err)
		// }
	}()
	// Second goroutine: map the second page (file offset 1<<12) of the same file.
	go func() {
		var err error
		file, err := os.OpenFile("mmap.test", os.O_RDWR|os.O_CREATE, 0644)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%#v\n", file.Fd())
		defer file.Close()
		mmapdata, err := syscall.Mmap(int(file.Fd()), 1<<12, 1<<8, syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
		if err != nil {
			panic(err)
		}
		// Extend the file to 32 KiB so the mapped page lies inside the file;
		// otherwise the write below raises SIGBUS.
		err = syscall.Ftruncate(int(file.Fd()), int64(1<<15))
		if err != nil {
			panic(err)
		}
		for i, v := range []byte("hello mmap_4") {
			mmapdata[i+10] = v
		}
		// if err = syscall.Munmap(mmapdata); nil != err {
		// 	panic(err)
		// }
	}()
	<-c
}
Now look at the physical addresses behind the two mappings:
cat /proc/26655/maps
00400000-0048d000 r-xp 00000000 08:05 523751 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_4
0048d000-0054f000 r--p 0008d000 08:05 523751 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_4
0054f000-00563000 rw-p 0014f000 08:05 523751 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_4
00563000-00581000 rw-p 00000000 00:00 0
c000000000-c004000000 rw-p 00000000 00:00 0
7f15a7910000-7f15a7911000 rw-s 00000000 08:05 522712 /media/chainhelen/LENOVO/xlworkspace/code/go/src/mmap.test
7f15a7911000-7f15a7912000 rw-s 00001000 08:05 522712 /media/chainhelen/LENOVO/xlworkspace/code/go/src/mmap.test
sudo ./vir_trans_phy 26655 7f15a7910000
moviing offset 272912074880
reader info []byte{0x75, 0xc4, 0x16, 0x0, 0x0, 0x0, 0x80, 0xa1}
pid:26655 virtaddr:0x7f15a7910000 phyaddr:0x16c475000
sudo ./vir_trans_phy 26655 7f15a7911000
moviing offset 272912074888
reader info []byte{0x5f, 0x21, 0x1c, 0x0, 0x0, 0x0, 0x80, 0xa1}
pid:26655 virtaddr:0x7f15a7911000 phyaddr:0x1c215f000
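Both pitfalls in this section, the page-aligned offset and the file having to cover the mapped range, can be dealt with before calling Mmap. The sketch below shows only the arithmetic; alignWindow is a hypothetical helper, not part of the syscall package.

```go
package main

import (
	"fmt"
	"os"
)

// alignWindow turns an arbitrary (offset, length) request into a
// page-aligned one. It returns the aligned file offset to pass to Mmap,
// the number of bytes to map, and the index inside the mapping at which
// the requested bytes start.
func alignWindow(offset int64, length int, pageSize int64) (mapOffset int64, mapLen int, delta int) {
	mapOffset = offset - offset%pageSize // round down to a page boundary
	delta = int(offset - mapOffset)
	mapLen = length + delta
	return mapOffset, mapLen, delta
}

func main() {
	page := int64(os.Getpagesize())
	// Request 256 bytes starting 10 bytes into the second page.
	off, n, d := alignWindow(page+10, 256, page)
	// Before the mapped bytes are touched, the file should be at least
	// off+n bytes long (for example via Ftruncate) to avoid SIGBUS.
	fmt.Println("mmap offset:", off, "length:", n, "data starts at index:", d)
	fmt.Println("minimum file size:", off+int64(n))
}
```

This mirrors what the fixed program does by hand: it maps at offset 1<<12, writes starting at index 10, and extends the file with Ftruncate before touching the mapping.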
5 Finally, check whether several goroutines that mmap the same page of the same file end up at the same physical address
package main
import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)
func main() {
	// signal.Notify never blocks when sending, so the channel needs a buffer.
	c := make(chan os.Signal, 1)
	signal.Notify(c)
	// First goroutine: map the first page of mmap.test and write to it.
	go func() {
		var err error
		file, err := os.OpenFile("mmap.test", os.O_RDWR|os.O_CREATE, 0644)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%#v\n", file.Fd())
		defer file.Close()
		mmapdata, err := syscall.Mmap(int(file.Fd()), 0, 1<<8, syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
		if err != nil {
			panic(err)
		}
		for i, v := range []byte("hello mmap_3") {
			mmapdata[i] = v
		}
		// if err = syscall.Munmap(mmapdata); nil != err {
		// 	panic(err)
		// }
	}()
	// Second goroutine: map the same first page (offset 0) a second time.
	go func() {
		var err error
		file, err := os.OpenFile("mmap.test", os.O_RDWR|os.O_CREATE, 0644)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%#v\n", file.Fd())
		defer file.Close()
		mmapdata, err := syscall.Mmap(int(file.Fd()), 0, 1<<8, syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
		if err != nil {
			panic(err)
		}
		err = syscall.Ftruncate(int(file.Fd()), int64(1<<15))
		if err != nil {
			panic(err)
		}
		for i, v := range []byte("hello mmap_4") {
			mmapdata[i+10] = v
		}
		// if err = syscall.Munmap(mmapdata); nil != err {
		// 	panic(err)
		// }
	}()
	<-c
}
cat /proc/27729/maps
00400000-0048d000 r-xp 00000000 08:05 523764 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_5
0048d000-0054f000 r--p 0008d000 08:05 523764 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_5
0054f000-00563000 rw-p 0014f000 08:05 523764 /media/chainhelen/LENOVO/xlworkspace/code/go/src/test_mmap_5
00563000-00581000 rw-p 00000000 00:00 0
c000000000-c004000000 rw-p 00000000 00:00 0
7f96452c6000-7f96452c7000 rw-s 00000000 08:05 522712 /media/chainhelen/LENOVO/xlworkspace/code/go/src/mmap.test
7f96452c7000-7f96452c8000 rw-s 00000000 08:05 522712 /media/chainhelen/LENOVO/xlworkspace/code/go/src/mmap.test
sudo ./vir_trans_phy 27729 7f96452c6000
moviing offset 273990981168
reader info []byte{0x75, 0xc4, 0x16, 0x0, 0x0, 0x0, 0x80, 0xa0}
pid:27729 virtaddr:0x7f96452c6000 phyaddr:0x16c475000
sudo ./vir_trans_phy 27729 7f96452c7000
moviing offset 273990981176
reader info []byte{0x75, 0xc4, 0x16, 0x0, 0x0, 0x0, 0x80, 0xa0}
pid:27729 virtaddr:0x7f96452c7000 phyaddr:0x16c475000
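They are the same: both virtual pages resolve to physical address 0x16c475000, so mapping the same file page twice does not create a second copy in physical memory.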
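The same behaviour can be observed without reading pagemap: when two MAP_SHARED mappings of one file page are backed by the same physical page, a write through one is visible through the other immediately. A minimal sketch, assuming Linux (or another Unix where syscall.Mmap is available); writeSeenThroughSecondMapping and the temporary file are made up for this example.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// writeSeenThroughSecondMapping maps the first page of a temporary file
// twice, writes through the first mapping and reads through the second.
func writeSeenThroughSecondMapping() (bool, error) {
	f, err := os.CreateTemp("", "mmap-shared-*")
	if err != nil {
		return false, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	size := os.Getpagesize()
	// Extend the file first so touching the mapping cannot raise SIGBUS.
	if err := f.Truncate(int64(size)); err != nil {
		return false, err
	}
	prot := syscall.PROT_READ | syscall.PROT_WRITE
	a, err := syscall.Mmap(int(f.Fd()), 0, size, prot, syscall.MAP_SHARED)
	if err != nil {
		return false, err
	}
	defer syscall.Munmap(a)
	b, err := syscall.Mmap(int(f.Fd()), 0, size, prot, syscall.MAP_SHARED)
	if err != nil {
		return false, err
	}
	defer syscall.Munmap(b)

	msg := "hello mmap"
	copy(a, msg) // write through the first mapping
	return string(b[:len(msg)]) == msg, nil // read through the second
}

func main() {
	same, err := writeSeenThroughSecondMapping()
	if err != nil {
		panic(err)
	}
	fmt.Println("write visible through second mapping:", same)
}
```

This only demonstrates the visible effect; the pagemap lookups above remain the direct evidence that both virtual pages point at one page frame.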