
840 PRO: poor performance and stability in high-concurrency database tests


晴过雨天

Posted 2020-04-11 14:44:08

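I recently planned to move our PostgreSQL data onto SSDs, mainly because SSD performance really is that much better. But after repeated high-concurrency, high-I/O tests, I have to say the result is far from safe. The SSDs are two Samsung 840 PRO 128G drives.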

1. The OS hangs. Since I disabled the drive's cache this has basically stopped happening, but I can't say whether it will come back.

Hardware
CPU: I7 3770K
Memory: Kingston DDR III 1600 8G*2
Motherboard: Gigabyte Z77
Disks: 2 HDDs (Seagate 512G SATA2) + 2 SSDs (Samsung 840 PRO SSD 128G)
OS: CentOS 6.4, kernels 2.6.32 and 3.2.41


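I want to run a database service on the SSDs. I had never used SSDs in production, so before going live I ran a strict battery of tests, and the OS hung several times.
At first I suspected the SSD, so I put in an HDD and ran the same concurrency test continuously; the HDD came through fine. I then switched the SSD's I/O scheduler to cfq, the same as the HDD (I had originally set it to deadline), but errors still showed up at random times.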
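For anyone repeating the test, the scheduler in use can be read from sysfs, where the active one is the entry in square brackets. A minimal sketch, assuming the SSD is sda (as in the hdparm commands later in this post); the helper name active_scheduler is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read a line such as "noop deadline [cfq]" on stdin
# (the format of /sys/block/<dev>/queue/scheduler) and print the bracketed,
# i.e. active, scheduler name.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Usage on a live system (device name sda is an assumption):
#   active_scheduler < /sys/block/sda/queue/scheduler
#   echo cfq > /sys/block/sda/queue/scheduler    # switch scheduler, needs root
```

Recording the active scheduler next to each test result makes it easier to tell the cfq and deadline runs apart afterwards.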
Later, one of the failures only printed error messages without hanging the OS, so I was able to read the log, shown below:

kernel: swap_free: Bad swap offset entry 400000000000
Apr 8 14:20:51 pgsqldb-master kernel: BUG: Bad page map in process postgres pte:80000000000000 pmd:2ecafd067
Apr 8 14:20:51 pgsqldb-master kernel: addr:0000003d85671000 vm_flags:08000075 anon_vma: (null) mapping:ffff880403d343f0 index:71
Apr 8 14:20:51 pgsqldb-master kernel: vma->vm_ops->fault: filemap_fault+0x0/0x4b0
Apr 8 14:20:51 pgsqldb-master kernel: vma->vm_file->f_op->mmap: ext4_file_mmap+0x0/0x60 [ext4]

Apr 8 14:20:51 pgsqldb-master kernel: Pid: 3115, comm: postgres Not tainted 3.2.41 #1
Apr 8 14:20:51 pgsqldb-master kernel: Call Trace:
Apr 8 14:20:51 pgsqldb-master kernel: [] print_bad_pte+0x1dc/0x250
Apr 8 14:20:51 pgsqldb-master kernel: [] zap_pte_range+0x1bd/0x4a0
Apr 8 14:20:51 pgsqldb-master kernel: [] ? __mem_cgroup_commit_charge+0x6c/0xc0
Apr 8 14:20:51 pgsqldb-master kernel: [] unmap_page_range+0x1b6/0x310
Apr 8 14:20:51 pgsqldb-master kernel: [] ? __activate_page+0x171/0x190
Apr 8 14:20:51 pgsqldb-master kernel: [] unmap_vmas+0xca/0x1b0
Apr 8 14:20:51 pgsqldb-master kernel: [] exit_mmap+0x96/0x140
Apr 8 14:20:51 pgsqldb-master kernel: [] mmput+0x73/0x110
Apr 8 14:20:51 pgsqldb-master kernel: [] exit_mm+0x10d/0x140
Apr 8 14:20:51 pgsqldb-master kernel: [] ? acct_collect+0xaa/0x1b0
Apr 8 14:20:51 pgsqldb-master kernel: [] do_exit+0x173/0x450
Apr 8 14:20:51 pgsqldb-master kernel: [] ? vfs_write+0x125/0x190
Apr 8 14:20:51 pgsqldb-master kernel: [] do_group_exit+0x51/0xc0
Apr 8 14:20:51 pgsqldb-master kernel: [] sys_exit_group+0x17/0x20
Apr 8 14:20:51 pgsqldb-master kernel: [] system_call_fastpath+0x16/0x1b
Apr 8 14:20:51 pgsqldb-master kernel: Disabling lock debugging due to kernel taint
Apr 8 14:20:51 pgsqldb-master kernel: swap_free: Bad swap offset entry 400000000000
Apr 8 14:20:51 pgsqldb-master kernel: BUG: Bad page map in process postgres pte:80000000000000 pmd:2ecafd067
Apr 8 14:20:51 pgsqldb-master kernel: addr:0000003d85679000 vm_flags:08000075 anon_vma: (null) mapping:ffff880403d343f0 index:79
Apr 8 14:20:51 pgsqldb-master kernel: vma->vm_ops->fault: filemap_fault+0x0/0x4b0
Apr 8 14:20:51 pgsqldb-master kernel: vma->vm_file->f_op->mmap: ext4_file_mmap+0x0/0x60 [ext4]
Apr 8 14:20:51 pgsqldb-master kernel: Pid: 3115, comm: postgres Tainted: G B 3.2.41 #1
Apr 8 14:20:51 pgsqldb-master kernel: Call Trace:
Apr 8 14:20:51 pgsqldb-master kernel: [] print_bad_pte+0x1dc/0x250
Apr 8 14:20:51 pgsqldb-master kernel: [] zap_pte_range+0x1bd/0x4a0
Apr 8 14:20:51 pgsqldb-master kernel: [] ? __mem_cgroup_commit_charge+0x6c/0xc0
Apr 8 14:20:51 pgsqldb-master kernel: [] unmap_page_range+0x1b6/0x310
Apr 8 14:20:51 pgsqldb-master kernel: [] ? __activate_page+0x171/0x190
Apr 8 14:20:51 pgsqldb-master kernel: [] unmap_vmas+0xca/0x1b0
Apr 8 14:20:51 pgsqldb-master kernel: [] exit_mmap+0x96/0x140
Apr 8 14:20:51 pgsqldb-master kernel: [] mmput+0x73/0x110
Apr 8 14:20:51 pgsqldb-master kernel: [] exit_mm+0x10d/0x140
Apr 8 14:20:51 pgsqldb-master kernel: [] ? acct_collect+0xaa/0x1b0
Apr 8 14:20:51 pgsqldb-master kernel: [] do_exit+0x173/0x450
Apr 8 14:20:51 pgsqldb-master kernel: [] ? vfs_write+0x125/0x190
Apr 8 14:20:51 pgsqldb-master kernel: [] do_group_exit+0x51/0xc0
Apr 8 14:20:51 pgsqldb-master kernel: [] sys_exit_group+0x17/0x20
Apr 8 14:20:51 pgsqldb-master kernel: [] system_call_fastpath+0x16/0x1b
Apr 8 14:20:51 pgsqldb-master kernel: general protection fault: 0000 [#1] SMP 
Apr 8 14:20:51 pgsqldb-master kernel: CPU 5 
Apr 8 14:20:51 pgsqldb-master kernel: Modules linked in: autofs4 sbs sbshc coretemp sunrpc 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log dm_mod uinput mxm_wmi 3c59x mii wmi sg microcode pcspkr i2c_i801 iTCO_wdt iTCO_vendor_support snd_hda_codec_hdmi snd_hda_codec_via snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc xhci_hcd ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci libahci i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
Apr 8 14:20:51 pgsqldb-master kernel: 
Apr 8 14:20:51 pgsqldb-master kernel: Pid: 3129, comm: postgres Tainted: G B 3.2.41 #1 Gigabyte Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H
Apr 8 14:20:51 pgsqldb-master kernel: RIP: 0010:[] [] __wake_up_common+0x31/0x90
Apr 8 14:20:51 pgsqldb-master kernel: RSP: 0018:ffff8802ec95bcb8 EFLAGS: 00010006
Apr 8 14:20:51 pgsqldb-master kernel: RAX: ff7f8802ec9444f0 RBX: ffff8802ec944500 RCX: 0000000000000000
Apr 8 14:20:51 pgsqldb-master kernel: RDX: ff7f8802ec944508 RSI: 0000000000000001 RDI: ffff8802ec944500
Apr 8 14:20:51 pgsqldb-master kernel: RBP: ffff8802ec95bcf8 R08: 0000000000000000 R09: 0000000000000000
Apr 8 14:20:51 pgsqldb-master kernel: R10: ffff8804041b0ad0 R11: 00000000000164a8 R12: 0000000000000282
Apr 8 14:20:51 pgsqldb-master kernel: R13: ffff8802ec944508 R14: 0000000000000000 R15: 0000000000000000
Apr 8 14:20:51 pgsqldb-master kernel: FS: 0000000000000000(0000) GS:ffff88041f340000(0000) knlGS:0000000000000000
Apr 8 14:20:51 pgsqldb-master kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 8 14:20:51 pgsqldb-master kernel: CR2: 00007ff5d392b0a0 CR3: 0000000001a05000 CR4: 00000000001406e0
Apr 8 14:20:51 pgsqldb-master kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 8 14:20:51 pgsqldb-master kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 8 14:20:51 pgsqldb-master kernel: Process postgres (pid: 3129, threadinfo ffff8802ec95a000, task ffff8803fea3c0c0)
Apr 8 14:20:51 pgsqldb-master kernel: Stack:
Apr 8 14:20:51 pgsqldb-master kernel: ffff8802ec95bcc8 0000000100000000 ffff8802ec95bce8 ffff8802ec944500
Apr 8 14:20:51 pgsqldb-master kernel: 0000000000000282 0000000000000001 0000000000000000 0000000000000000
Apr 8 14:20:51 pgsqldb-master kernel: ffff8802ec95bd38 ffffffff81050208 ffff880310092210 ffff8802ec944200
Apr 8 14:20:51 pgsqldb-master kernel: Call Trace:
Apr 8 14:20:51 pgsqldb-master kernel: [] __wake_up+0x48/0x70
Apr 8 14:20:51 pgsqldb-master kernel: [] unix_release_sock+0xd3/0x240
Apr 8 14:20:51 pgsqldb-master kernel: [] unix_release+0x26/0x30
Apr 8 14:20:51 pgsqldb-master kernel: [] sock_release+0x29/0x90
Apr 8 14:20:51 pgsqldb-master kernel: [] sock_close+0x17/0x30
Apr 8 14:20:51 pgsqldb-master kernel: [] __fput+0xbe/0x240
Apr 8 14:20:51 pgsqldb-master kernel: [] fput+0x25/0x30
Apr 8 14:20:51 pgsqldb-master kernel: [] filp_close+0x63/0x90
Apr 8 14:20:51 pgsqldb-master kernel: [] put_files_struct+0x7f/0xf0
Apr 8 14:20:51 pgsqldb-master kernel: [] exit_files+0x4b/0x60
Apr 8 14:20:51 pgsqldb-master kernel: [] do_exit+0x1a2/0x450
Apr 8 14:20:51 pgsqldb-master kernel: [] ? vfs_write+0x125/0x190
Apr 8 14:20:51 pgsqldb-master kernel: [] do_group_exit+0x51/0xc0
Apr 8 14:20:51 pgsqldb-master kernel: [] sys_exit_group+0x17/0x20
Apr 8 14:20:51 pgsqldb-master kernel: [] system_call_fastpath+0x16/0x1b
Apr 8 14:20:51 pgsqldb-master kernel: Code: 41 56 41 55 41 54 53 48 83 ec 18 0f 1f 44 00 00 89 75 cc 89 55 c8 4c 8d 6f 08 48 8b 57 08 41 89 cf 4d 89 c6 48 8d 42 e8 49 39 d5 <48> 8b 58 18 74 3f 48 83 eb 18 eb 0a 0f 1f 00 48 89 d8 48 8d 5a 
Apr 8 14:20:51 pgsqldb-master kernel: RIP [] __wake_up_common+0x31/0x90
Apr 8 14:20:51 pgsqldb-master kernel: RSP
Apr 8 14:20:51 pgsqldb-master kernel: ---[ end trace 733adaf8d47b415e ]---
Apr 8 14:20:51 pgsqldb-master kernel: Fixing recursive fault but reboot is needed!

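After discussing it with people in a QQ group, the feeling is that the SSD firmware is at fault; judging from the log above, my guess is that the cache and the on-disk data got out of sync. I then searched for known Samsung 840 PRO problems and found that earlier firmware versions did have issues, but my firmware version is 4BQ, which is already the latest. Never mind; the first step was to disable the SSD's cache. To disable it: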


[root@pgsqldb-master ~]# hdparm -W 0 /dev/sda

/dev/sda:
setting drive write-caching to 0 (off)
write-caching = 0 (off)
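Whether this setting survives a reboot or power cycle may depend on the drive, so it seems worth re-checking before each test run. A small sketch that extracts the state from hdparm output of the form shown above; the helper name write_cache_state is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read hdparm output on stdin and print "on" or "off",
# taken from the "write-caching = N (on|off)" status line.
write_cache_state() {
    sed -n 's/.*write-caching *= *[01] *(\(o[nf]*\)).*/\1/p' | tail -n 1
}

# Usage on a live system (needs root; /dev/sda as in the post):
#   hdparm -W /dev/sda | write_cache_state
```

The pattern only matches the status line with the equals sign, so the "setting drive write-caching to 0 (off)" line printed when changing the value is ignored.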

Also, from the following command:

[root@pgsqldb-master ~]# hdparm -I /dev/sda | grep -i cache
cache/buffer size = unknown
Write cache
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
[root@pgsqldb-master ~]#

It shows cache/buffer size as unknown, of all things. I don't know whether the OS can't detect it or whether it's a problem with the SSD.

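I then ran the pgbench high-concurrency test again, and it's basically stable now. However, with the cache disabled, performance on large tables drops far too much; I'll write a separate post comparing the numbers. For now it looks like the Samsung 840 PRO 128G shouldn't have its cache enabled under high concurrency and heavy I/O pressure. I don't know whether other SSDs behave the same way; if anyone has the means to test, please do and report back.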

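2. Data loss under high concurrency, which has happened twice. The first time I had not disabled the cache, but tonight's occurrence was with the cache turned off.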

The errors are below; they appeared toward the end of the pgbench run:

LOG: server process (PID 5015) was terminated by signal 11: Segmentation fault
DETAIL: Failed process was running: SELECT * FROM user_login($1::varchar,$2::varchar,'8.8.8.8');
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

.....

LOG: all server processes terminated; reinitializing
LOG: database system was interrupted; last known up at 2013-04-09 21:40:48 CST
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at D0/B00497F8
LOG: incorrect resource manager data checksum in record at D1/DA8FC78
LOG: redo done at D1/DA8ECA0
LOG: last completed transaction was at log time 2013-04-09 21:40:53.667749+08
LOG: database system is ready to accept connections
LOG: autovacuum launcher started

Then the database restarted.

Next I logged in and ran:

pgbench=# \q
[postgres@pgsqldb-master bin]$ ./psql -p 5433 -d pgbench
psql (9.2.3)
Type "help" for help.

pgbench=# select count(1) from users;
count
---------
9999533
(1 row)

Damn, 467 rows are missing; last time it was 205.
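The 467 figure is just the expected row count minus what count(1) returned. A tiny sketch of that arithmetic, assuming the users table was loaded with 10,000,000 rows (the post does not state this, but it is the value consistent with 467 missing rows); the helper name missing_rows is made up.

```shell
#!/bin/sh
# Hypothetical helper: rows lost = rows expected - rows actually counted.
missing_rows() {
    echo $(( $1 - $2 ))
}

expected=10000000   # assumption: rows loaded before the test (not stated in the post)
actual=9999533      # count(1) reported by psql above
missing_rows "$expected" "$actual"

# On a live system the count could be fetched non-interactively, e.g.:
#   actual=$(./psql -p 5433 -d pgbench -At -c 'select count(1) from users')
```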

pgbench=# vacuum ANALYZE ;
ERROR: xlog flush request D1/19BDD1D0 is not satisfied --- flushed only to D1/E5BC1E0
CONTEXT: writing block 230 of relation base/174223590/174223593

The same error again.
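As far as I understand this message, it means a data page carries a WAL location that lies beyond the point up to which WAL was actually flushed. To get a feel for how far apart the two positions in the error are, here is a rough sketch that subtracts them. It only handles the simple case where both locations share the same high part (D1 here), and the helper name lsn_gap is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: byte distance between two WAL locations written as
# HI/LO in hex. Only the case where both share the same HI part is handled.
lsn_gap() {
    hi_a=${1%/*}; lo_a=${1#*/}
    hi_b=${2%/*}; lo_b=${2#*/}
    if [ "$hi_a" != "$hi_b" ]; then
        echo "different high parts, not handled in this sketch" >&2
        return 1
    fi
    # Subtract the low parts, interpreted as hexadecimal constants.
    echo $(( 0x$lo_a - 0x$lo_b ))
}

# Requested flush position vs. position actually flushed, from the error above.
lsn_gap D1/19BDD1D0 D1/E5BC1E0
```

For the two values in the error this comes to roughly 182 MiB, so the page points well past the end of the WAL that made it to disk.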


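This time I'm really speechless. At this point I honestly don't dare use them; at most I'd use them for indexes, temporary tables and the like. Consumer-grade drives apparently can't cope with this level of concurrency. You get what you pay for.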
