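HugePages is a feature integrated into the Linux 2.6 kernel. Enabling HugePages lets the operating system support memory pages larger than the default 4 KB. Using very large page sizes can improve system performance by reducing the amount of system resources needed to access page table entries. HugePages works on both 32-bit and 64-bit systems. HugePage sizes range from 2 MB to 256 MB, depending on the kernel version and the hardware architecture. For Oracle Database, using HugePages reduces the work the operating system does to maintain page state and increases the Translation Lookaside Buffer (TLB) hit ratio.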
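1. Using HugePages to optimize the SGA
Without HugePages, the operating system keeps each memory page at 4 KB. When pages are allocated for the SGA, the kernel must continually update the lifecycle state (dirty, free, mapped to a process, and so on) of every 4 KB page allocated to the SGA.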
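With HugePages, the operating system page table (the mapping from virtual to physical memory) is smaller, because each page table entry points to a page of 2 MB to 256 MB. The kernel also has fewer pages whose lifecycle it must monitor. For example, if 64-bit hardware uses HugePages and you want to map 256 MB of memory, you may need only one page table entry (PTE). Without HugePages, mapping 256 MB of memory requires 256 * 1024 KB / 4 KB = 65536 PTEs.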
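The arithmetic above can be checked with a small helper. This is only an illustrative sketch: the function name pte_count is hypothetical, and it simply divides the mapped size by the page size, rounding up to whole pages.

```bash
# pte_count MAP_MB PAGE_KB
# Print how many page table entries are needed to map MAP_MB megabytes
# using pages of PAGE_KB kilobytes (rounded up to whole pages).
pte_count() {
    local map_kb=$(( $1 * 1024 ))
    local page_kb=$2
    echo $(( (map_kb + page_kb - 1) / page_kb ))
}

pte_count 256 4        # 4 KB pages: prints 65536
pte_count 256 2048     # 2 MB huge pages: prints 128
pte_count 256 262144   # 256 MB huge pages: prints 1
```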
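HugePages provides the following advantages:
Better performance through an increased TLB hit ratio
Pages are locked in memory and are never swapped out, which keeps shared memory structures such as the SGA in RAM
Contiguous pages are preallocated and cannot be used for anything other than system shared memory such as the SGA
Less kernel overhead for that part of virtual memory, because the page size is larger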
2. Configuring HugePages on Linux
Run the following commands to determine whether the kernel supports HugePages:
[root@jyrac1 ~]# uname -r
2.6.18-164.el5
[root@jyrac1 ~]# grep Huge /proc/meminfo
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
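Some Linux systems do not support HugePages by default. For such systems, build the Linux kernel with the CONFIG_HUGETLBFS and CONFIG_HUGETLB_PAGE configuration options. CONFIG_HUGETLBFS is located under File Systems, and CONFIG_HUGETLB_PAGE must be selected together with CONFIG_HUGETLBFS.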
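Edit the /etc/security/limits.conf file to set memlock. The memlock value is in KB. When HugePages memory is enabled, the maximum locked memory limit should be set to at least 90 percent of the current RAM; when HugePages memory is not enabled, it should be set to at least 3145728 KB (3 GB). For example, with 2 GB of RAM, add the following entries to raise the maximum locked memory address space: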
[root@jyrac1 ~]# vi /etc/security/limits.conf
grid soft memlock 2097152
grid hard memlock 2097152
oracle soft memlock 2097152
oracle hard memlock 2097152
Alternatively, set memlock to a value larger than the SGA size.
Log in as the grid user and run ulimit -l to verify that the new memlock setting is in effect:
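As a rough sketch of the 90 percent guideline, the helpers below compute a memlock value from a RAM size in KB and print the matching limits.conf entries. The function names memlock_kb and print_memlock_entries are hypothetical; the calculation rounds up so that the result does not fall below 90 percent.

```bash
# memlock_kb MEM_TOTAL_KB
# Print 90 percent of MEM_TOTAL_KB, rounded up to a whole KB.
memlock_kb() {
    echo $(( ($1 * 90 + 99) / 100 ))
}

# print_memlock_entries USER LIMIT_KB
# Print the soft and hard memlock lines for one user in limits.conf format.
print_memlock_entries() {
    printf '%s soft memlock %s\n' "$1" "$2"
    printf '%s hard memlock %s\n' "$1" "$2"
}

# Example for a host with 2 GB (2097152 KB) of RAM.
limit=$(memlock_kb 2097152)
for user in grid oracle; do
    print_memlock_entries "$user" "$limit"
done
```

On a real host the RAM size could be taken from the MemTotal line of /proc/meminfo, for example with memlock_kb "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)". The walkthrough here uses the full 2097152 KB instead, which also satisfies the guideline.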
[grid@jyrac1 ~]$ ulimit -l
2097152
Log in as the oracle user and run ulimit -l to verify that the new memlock setting is in effect:
[oracle@jyrac1 ~]$ ulimit -l
2097152
Run the following command to display the Hugepagesize value:
[oracle@jyrac1 ~]$ grep Hugepagesize /proc/meminfo
Hugepagesize: 2048 kB
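Complete the following steps to create a script that computes recommended hugepages settings for the current shared memory segments. Create a hugepages_settings.sh script with the following content: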
[root@jyrac1 /]# vi hugepages_settings.sh
#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
# on Oracle Linux
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support
# http://support.oracle.com
# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments on Oracle Linux. Before proceeding with the execution please note following:
* For ASM instance, it needs to configure ASMM instead of AMM.
* The 'pga_aggregate_target' is outside the SGA and
you should accommodate this while calculating SGA size.
* In case you changes the DB SGA size,
as the new SGA will not fit in the previous HugePages configuration,
it had better disable the whole HugePages,
start the DB with new SGA size and run the script again.
And make sure that:
* Oracle Database instance(s) are up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not setup
(See Doc ID 749851.1)
* The shared memory segments can be listed by command:
# ipcs -m
Press Enter to proceed..."
read
# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`
# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
if [ -z "$HPG_SZ" ];then
echo "The hugepages may not be supported in the system where the script is being executed."
exit 1
fi
# Initialize the counter
NUM_PG=0
# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*"`
do
MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
if [ $MIN_PG -gt 0 ]; then
NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
fi
done
RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`
# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
echo "***********"
echo "** ERROR **"
echo "***********"
echo "Sorry! There are not enough total of shared memory segments allocated for
HugePages configuration. HugePages can only be used for shared memory segments
that you can list by command:
# ipcs -m
of a size that can match an Oracle Database SGA. Please make sure that:
* Oracle Database instance is up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not configured"
exit 1
fi
# Finish with results
case $KERN in
'2.2') echo "Kernel version $KERN is not supported. Exiting." ;;
'2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
'2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
'3.8') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
'3.10') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
'4.1') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
esac
# End
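The core of the script is the per-segment loop: each shared memory segment is divided by the huge page size, and every segment that is at least one huge page in size contributes its whole-page count plus one extra page. The sketch below isolates that calculation in a function so it can be tried without running ipcs. The function name pages_for_segments and the segment sizes in the example are hypothetical.

```bash
# pages_for_segments HPG_SZ_KB SEG_BYTES...
# Apply the same rule as the loop in hugepages_settings.sh:
# segments smaller than one huge page are ignored, every other
# segment counts as (SEG_BYTES / page size) + 1 pages.
pages_for_segments() {
    local hpg_bytes=$(( $1 * 1024 ))
    shift
    local num_pg=0 seg min_pg
    for seg in "$@"; do
        min_pg=$(( seg / hpg_bytes ))
        if [ "$min_pg" -gt 0 ]; then
            num_pg=$(( num_pg + min_pg + 1 ))
        fi
    done
    echo "$num_pg"
}

# Hypothetical example: one 640 MB segment and one 1 MB segment
# with a 2048 KB huge page size. Prints 321.
pages_for_segments 2048 671088640 1048576
```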
Run the following command to change the permissions of the hugepages_settings.sh script:
[root@jyrac1 /]# chmod +x hugepages_settings.sh
Run the hugepages_settings.sh script to compute the hugepages configuration values:
[root@jyrac1 /]# ./hugepages_settings.sh
This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments on Oracle Linux. Before proceeding with the execution please note following:
* For ASM instance, it needs to configure ASMM instead of AMM.
* The 'pga_aggregate_target' is outside the SGA and
you should accommodate this while calculating SGA size.
* In case you changes the DB SGA size,
as the new SGA will not fit in the previous HugePages configuration,
it had better disable the whole HugePages,
start the DB with new SGA size and run the script again.
And make sure that:
* Oracle Database instance(s) are up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not setup
(See Doc ID 749851.1)
* The shared memory segments can be listed by command:
# ipcs -m
Press Enter to proceed...
***********
** ERROR **
***********
Sorry! There are not enough total of shared memory segments allocated for
HugePages configuration. HugePages can only be used for shared memory segments
that you can list by command:
# ipcs -m
of a size that can match an Oracle Database SGA. Please make sure that:
* Oracle Database instance is up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not configured
The output above indicates that you need to confirm that the Oracle instances are running and, for Oracle 11g, that AMM is not in use.
[root@jyrac1 ~]# ps -ef | grep pmon
grid 4116 1 0 Apr18 ? 00:00:03 asm_pmon_+ASM1
oracle 4944 1 0 Apr18 ? 00:00:03 ora_pmon_jyrac1
root 18184 29273 0 15:15 pts/1 00:00:00 grep pmon
The output above shows that the Oracle instances are running.
[grid@jyrac1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Wed Apr 20 15:20:23 2016
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> set long 900
SQL> set linesize 900
SQL> show parameter instance_name
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
instance_name string +ASM1
SQL> show parameter memory
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
memory_max_target big integer 1076M
memory_target big integer 1076M
[oracle@jyrac1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Wed Apr 20 15:21:04 2016
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> set long 900
SQL> set linesize 900
SQL> show parameter instance_name
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
instance_name string jyrac1
SQL> show parameter memory
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
hi_shared_memory_address integer 0
memory_max_target big integer 2G
memory_target big integer 2G
shared_memory_address integer 0
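Both the ASM instance and the database instance do have AMM enabled. AMM must be disabled, but ASMM can be used instead.
Modify the ASM instance to disable AMM and use ASMM. In a RAC environment, every node must be changed: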
SQL> alter system set sga_max_size=640M scope=spfile sid='*';
System altered.
SQL> alter system set sga_target=640M scope=spfile sid='*';
System altered.
SQL> alter system set pga_aggregate_target=320M scope=spfile sid='*';
System altered.
SQL> alter system set memory_target=0 scope=spfile sid='*';
System altered.
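For the ASM instance, memory_target must be set to 0 as shown above rather than reset (memory_max_target can still be reset, as in the last command below). Resetting memory_target causes the following errors at startup: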
SQL> startup
ORA-01078: failure in processing system parameters
ORA-00843: Parameter not taking MEMORY_MAX_TARGET into account
ORA-00849: SGA_TARGET 671088640 cannot be set to more than MEMORY_MAX_TARGET 0.
SQL> alter system reset memory_max_target scope=spfile sid='*';
System altered.
Modify the database instance to disable AMM and use ASMM. In a RAC environment, every node must be changed:
SQL> alter system set sga_max_size=640M scope=spfile sid='*';
System altered.
SQL> alter system set sga_target=640M scope=spfile sid='*';
System altered.
SQL> alter system set pga_aggregate_target=320M scope=spfile sid='*';
System altered.
SQL> alter system reset memory_max_target scope=spfile sid='*';
System altered.
SQL> alter system reset memory_target scope=spfile sid='*';
System altered.
Restart the ASM and database instances. In a RAC environment, all nodes must be restarted. First stop the ASM and database instances:
[grid@jyrac1 ~]$ srvctl stop asm -n jyrac1 -f
[grid@jyrac1 ~]$ srvctl stop asm -n jyrac2 -f
[grid@jyrac1 ~]$ srvctl stop database -d jyrac
[grid@jyrac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRSDG.dg
OFFLINE OFFLINE jyrac1
OFFLINE OFFLINE jyrac2
ora.DATADG.dg
OFFLINE OFFLINE jyrac1
OFFLINE OFFLINE jyrac2
ora.LISTENER.lsnr
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
ora.asm
OFFLINE OFFLINE jyrac1 Instance Shutdown
OFFLINE OFFLINE jyrac2 Instance Shutdown
ora.gsd
ONLINE OFFLINE jyrac1
ONLINE OFFLINE jyrac2
ora.net1.network
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
ora.ons
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
ora.registry.acfs
OFFLINE OFFLINE jyrac1
OFFLINE OFFLINE jyrac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE jyrac2
ora.cvu
1 ONLINE ONLINE jyrac2
ora.jyrac.db
1 OFFLINE OFFLINE Instance Shutdown
2 OFFLINE OFFLINE Instance Shutdown
ora.jyrac1.vip
1 ONLINE ONLINE jyrac1
ora.jyrac2.vip
1 ONLINE ONLINE jyrac2
ora.oc4j
1 ONLINE ONLINE jyrac2
ora.scan1.vip
1 ONLINE ONLINE jyrac2
Start the ASM and database instances:
[grid@jyrac1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Wed Apr 20 17:48:32 2016
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ASM instance started
Total System Global Area 669581312 bytes
Fixed Size 1366724 bytes
Variable Size 643048764 bytes
ASM Cache 25165824 bytes
ASM diskgroups mounted
ASM diskgroups volume enabled
SQL> show parameter instance_name
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
instance_name string +ASM2
SQL> show parameter memory
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
memory_max_target big integer 0
memory_target big integer 0
SQL> show parameter sga
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
lock_sga boolean FALSE
sga_max_size big integer 640M
sga_target big integer 640M
[grid@jyrac2 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Wed Apr 20 17:48:32 2016
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ASM instance started
Total System Global Area 669581312 bytes
Fixed Size 1366724 bytes
Variable Size 643048764 bytes
ASM Cache 25165824 bytes
ASM diskgroups mounted
ASM diskgroups volume enabled
SQL> show parameter instance_name
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
instance_name string +ASM2
SQL> show parameter memory
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
memory_max_target big integer 0
memory_target big integer 0
SQL> show parameter sga
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
lock_sga boolean FALSE
sga_max_size big integer 640M
sga_target big integer 640M
[grid@jyrac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRSDG.dg
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
ora.DATADG.dg
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
ora.LISTENER.lsnr
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
ora.asm
ONLINE ONLINE jyrac1 Started
ONLINE ONLINE jyrac2 Started
ora.gsd
ONLINE OFFLINE jyrac1
ONLINE OFFLINE jyrac2
ora.net1.network
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
ora.ons
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
ora.registry.acfs
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE jyrac1
ora.cvu
1 ONLINE ONLINE jyrac1
ora.jyrac.db
1 OFFLINE OFFLINE Instance Shutdown
2 OFFLINE OFFLINE Instance Shutdown
ora.jyrac1.vip
1 ONLINE ONLINE jyrac1
ora.jyrac2.vip
1 ONLINE ONLINE jyrac2
ora.oc4j
1 ONLINE ONLINE jyrac1
ora.scan1.vip
1 ONLINE ONLINE jyrac1
The output above shows that the ASM instances have started with AMM disabled.
[grid@jyrac1 ~]$ srvctl start database -d jyrac
[grid@jyrac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRSDG.dg
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
ora.DATADG.dg
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
ora.LISTENER.lsnr
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
ora.asm
ONLINE ONLINE jyrac1 Started
ONLINE ONLINE jyrac2 Started
ora.gsd
ONLINE OFFLINE jyrac1
ONLINE OFFLINE jyrac2
ora.net1.network
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
ora.ons
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
ora.registry.acfs
ONLINE ONLINE jyrac1
ONLINE ONLINE jyrac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE jyrac1
ora.cvu
1 ONLINE ONLINE jyrac1
ora.jyrac.db
1 ONLINE ONLINE jyrac1 Open
2 ONLINE ONLINE jyrac2 Open
ora.jyrac1.vip
1 ONLINE ONLINE jyrac1
ora.jyrac2.vip
1 ONLINE ONLINE jyrac2
ora.oc4j
1 ONLINE ONLINE jyrac1
ora.scan1.vip
1 ONLINE ONLINE jyrac1
SQL> show parameter instance_name
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
instance_name string jyrac1
SQL> show parameter memory
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
hi_shared_memory_address integer 0
memory_max_target big integer 0
memory_target big integer 0
shared_memory_address integer 0
SQL> show parameter sga
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
lock_sga boolean FALSE
pre_page_sga boolean FALSE
sga_max_size big integer 640M
sga_target big integer 640M
SQL> show parameter instance_name
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
instance_name string jyrac2
SQL> show parameter memory
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
hi_shared_memory_address integer 0
memory_max_target big integer 0
memory_target big integer 0
shared_memory_address integer 0
SQL> show parameter sga
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
lock_sga boolean FALSE
pre_page_sga boolean FALSE
sga_max_size big integer 640M
sga_target big integer 640M
The database has also started successfully with AMM disabled.
Run the hugepages_settings.sh script again to compute the number of HugePages:
[root@jyrac1 /]# ./hugepages_settings.sh
Recommended setting: vm.nr_hugepages = 649
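Edit the /etc/sysctl.conf file to add the parameter vm.nr_hugepages = 649, then run sysctl -p to apply the change immediately. The Oracle instances are not using HugePages yet, which can be seen from HugePages_Total being equal to HugePages_Free: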
[root@jyrac1 /]# vi /etc/sysctl.conf
vm.nr_hugepages = 649
[root@jyrac1 /]# sysctl -p
[root@jyrac1 /]# grep Huge /proc/meminfo
HugePages_Total: 649
HugePages_Free: 649
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
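The "Total equals Free" check can be scripted. The sketch below reads text in /proc/meminfo format on standard input and succeeds only when huge pages are configured and none are in use. The function name hugepages_idle is hypothetical, and the sample input is copied from the output above.

```bash
# hugepages_idle
# Read /proc/meminfo-style text on stdin. Exit status 0 means huge pages
# are configured (HugePages_Total > 0) and HugePages_Free equals
# HugePages_Total, so no process is using them.
hugepages_idle() {
    awk '
        /^HugePages_Total:/ { total_pg = $2 }
        /^HugePages_Free:/  { free_pg = $2 }
        END { exit !(total_pg > 0 && total_pg == free_pg) }
    '
}

if hugepages_idle <<'EOF'
HugePages_Total: 649
HugePages_Free: 649
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
EOF
then
    echo "HugePages are configured but no process is using them"
fi
```

On a live system the same function can be fed directly with hugepages_idle < /proc/meminfo.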
Restart the instances:
SQL> startup
ASM instance started
Total System Global Area 669581312 bytes
Fixed Size 1366724 bytes
Variable Size 643048764 bytes
ASM Cache 25165824 bytes
ASM diskgroups mounted
ASM diskgroups volume enabled
The ASM instance's alert_+ASM1.log shows the following:
Starting ORACLE instance (normal)
************************ Large Pages Information *******************
Per process system memlock (soft) limit = 2048 MB
Total Shared Global Region in Large Pages = 642 MB (100%)
Large Pages used by this instance: 321 (642 MB)
Large Pages unused system wide = 328 (656 MB)
Large Pages configured system wide = 649 (1298 MB)
Large Page size = 2048 KB
SQL> startup
ORACLE instance started.
Total System Global Area 669581312 bytes
Fixed Size 1366724 bytes
Variable Size 243270972 bytes
Database Buffers 419430400 bytes
Redo Buffers 5513216 bytes
Database mounted.
Database opened.
The alert_jyrac1.log of instance jyrac1 shows the following:
Starting ORACLE instance (normal)
************************ Large Pages Information *******************
Per process system memlock (soft) limit = 2048 MB
Total Shared Global Region in Large Pages = 642 MB (100%)
Large Pages used by this instance: 321 (642 MB)
Large Pages unused system wide = 7 (14 MB)
Large Pages configured system wide = 649 (1298 MB)
Large Page size = 2048 KB
[root@jyrac1 /]# grep Huge /proc/meminfo
HugePages_Total: 649
HugePages_Free: 239
HugePages_Rsvd: 232
Hugepagesize: 2048 kB
The output above shows that HugePages is now in use.
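To relate the /proc/meminfo counters to the alert log figures, note that HugePages_Rsvd counts pages that have been reserved for a process but not yet touched, and those pages are still included in HugePages_Free. The number of pages committed to running instances is therefore Total - Free + Rsvd. The following sketch computes that value; the function name hugepages_committed is hypothetical, and the sample input is the output shown above.

```bash
# hugepages_committed
# Read /proc/meminfo-style text on stdin and print the number of huge
# pages committed to processes: pages already allocated (Total - Free)
# plus pages reserved but not yet allocated (Rsvd).
hugepages_committed() {
    awk '
        /^HugePages_Total:/ { total_pg = $2 }
        /^HugePages_Free:/  { free_pg = $2 }
        /^HugePages_Rsvd:/  { rsvd_pg = $2 }
        END { print total_pg - free_pg + rsvd_pg }
    '
}

# Prints 642, that is 649 - 239 + 232.
hugepages_committed <<'EOF'
HugePages_Total: 649
HugePages_Free: 239
HugePages_Rsvd: 232
Hugepagesize: 2048 kB
EOF
```

The result of 642 pages matches the two alert logs, which each report 321 large pages used by the ASM instance and by the database instance.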
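3. HugePages restrictions
HugePages has the following restrictions:
a. For database instances on Oracle 11g and later, the memory_target and memory_max_target parameters must be unset with the alter system reset command. For the ASM instance, however, memory_target can only be set to 0.
b. AMM and HugePages are incompatible. With AMM, the entire SGA is allocated by creating files under /dev/shm, and when the SGA is allocated this way HugePages are not reserved.
c. If VLM is used on a 32-bit system, HugePages cannot be used for the database buffer cache. HugePages can still be used for other SGA components such as shared_pool, large_pool, and so on. Memory for VLM (the buffer cache) is allocated through shared memory file systems (ramfs/tmpfs/shmfs).
d. HugePages are not allocated or released after system startup, unless a system administrator changes the HugePages configuration by modifying the number of available pages or the pool size. If the required memory is not reserved during system startup, HugePages allocation fails.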
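e. Make sure HugePages is sized properly. Memory reserved for HugePages cannot be used for anything else, so a pool that applications do not fully use can leave the system short of memory.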
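f. If there are not enough HugePages when an instance starts and the use_large_pages parameter is set to only, the Oracle database fails to start and records the relevant information in alert.log.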