当前位置:  数据库>oracle

hash partition 平衡分布数据的测试

    来源: 互联网  发布时间:2017-05-05

    本文导语: Thedestination of a row is determined by the internal hash function applied to therow by the database. The hashing algorithm is designed to evenly distributesrows across devices so that each partition contains about the same number ofrows. 官方文档说明hashpartition 能够平衡每个分区上...

Thedestination of a row is determined by the internal hash function applied to therow by the database. The hashing algorithm is designed to evenly distributesrows across devices so that each partition contains about the same number ofrows.

官方文档说明hashpartition 能够平衡每个分区上面的行的个数,做了个小测试,记录下来,供大家参考。

1. 创建一个hash分区表 , 一共四个分区,分区名称由系统自动生成。
--也可以替换为 create table hash_t (id number ) partition by hash(id) (partition&part_name1,partition &part_name2,partition &part_name3,partition&part_name4) ;

dexter@ORCL>create tablehash_t (id number ) partition by hash(id) partitions 4 ;

Tablecreated.

--查看一下表的统计信息,可以看到blocks为空但是segment已经创建
dexter@ORCL> select TABLE_NAME, PARTITION_NAME, TABLESPACE_NAME ,NUM_ROWS  , BLOCKS , segment_created fromuser_tab_partitions ;

TABLE_NAMEPARTITION_NAME                TABLESPACE_NAME                 NUM_ROWS     BLOCKSSEG
---------- ------------------------------ ---------------------------------------- ---------- ---
HASH_T     SYS_P29                        USERS                                               YES
HASH_T     SYS_P28                        USERS                                               YES
HASH_T     SYS_P30                        USERS                                               YES
HASH_T     SYS_P31                        USERS                                               YES





--已经分配了extent
dexter@ORCL> select * from user_extents where segment_name='HASH_T' ;

SEGMENT_NAPARTITION_NAME                SEGMENT_TYPE      TABLESPACE_NAME                EXTENT_ID      BYTES     BLOCKS
---------- ------------------------------ ------------------------------------------------ ---------- --------------------
HASH_T     SYS_P28                        TABLE PARTITION    USERS                                   0      65536          8
HASH_T     SYS_P29                        TABLE PARTITION    USERS                                   0      65536          8
HASH_T     SYS_P30                        TABLE PARTITION    USERS                                   0      65536          8
HASH_T     SYS_P31                        TABLE PARTITION    USERS                                   0      65536          8




--查看一下参数
dexter@ORCL> show parameter deferred_segment_creation

NAME                                 TYPE       VALUE
------------------------------------ -----------------------------------------
deferred_segment_creation           boolean     TRUE

--在11.2.0.1版本的Oracle数据库中分区表不受deferred_segment_creation的影响,不会延迟分配segment
--但是在11.2.0.2版本的oracle数据库中加入了这个功能,官方文档中有提及,最下面有测试结果
/*
参数说明
DEFERRED_SEGMENT_CREATION specifies the semantics of deferred segment creation.
If set to true, then segments for tables and their dependent objects (LOBs,indexes) will not be created until the first row is inserted into thetable.
Before creating a set of tables, if it is known that a significant number ofthem will not be populated, then consider setting this parameter to true.
This saves disk space and minimizes install time.
*/







2. 初始化一些数据
dexter@ORCL> insert into hash_t select level from dual connectby level commit ;

Commitcomplete.

3. 查看一下分区统计信息
--我们来收集一下统计信息,使用dbms_stats收集分区表的统计信息更加准确
dexter@ORCL> execdbms_stats.gather_table_stats('dexter','hash_t',cascade=>true);

PL/SQLprocedure successfully completed.

--可以看到分布的比较均匀
dexter@ORCL> select TABLE_NAME, PARTITION_NAME, TABLESPACE_NAME ,NUM_ROWS  , BLOCKS fromuser_tab_partitions ;

TABLE_NAMEPARTITION_NAME                TABLESPACE_NAME                 NUM_ROWS    BLOCKS
---------- ------------------------------ ---------------------------------------- ----------
HASH_T     SYS_P29                        USERS                               24956         43
HASH_T     SYS_P28                        USERS                               24945         43
HASH_T     SYS_P30                        USERS                               25209         43
HASH_T     SYS_P31                        USERS                               24890         43




4. 插入 partition key 为7887 的 数据到分区表中

--在做一下测试看一下是否真的均匀分配
dexter@ORCL> select * from hash_t partition (sys_p29) where rownum insertinto hash_t select 7887 from dual connect by level commit ;

Commitcomplete.
--再来分析一下
dexter@ORCL> execdbms_stats.gather_table_stats('dexter','hash_t',cascade=>true);

PL/SQLprocedure successfully completed.

--可以看到数据只分布到了sys_p29这个分区里面,也就是说当partition key 经过hash function 运算后放在了某一分区后,那么相同的partition key 的row就会在同一分区里面 , partition key 不会跨分区
--所以说hash partition 在每个partition 之间均匀分布数据也不是绝对的 , 因此最好不要使用 基数较低的 column 作为partition column
dexter@ORCL> select TABLE_NAME, PARTITION_NAME, TABLESPACE_NAME ,NUM_ROWS  , BLOCKS fromuser_tab_partitions ;

TABLE_NAMEPARTITION_NAME                TABLESPACE_NAME                 NUM_ROWS    BLOCKS
---------- ------------------------------ ---------------------------------------- ----------
HASH_T     SYS_P29                        USERS                              124956        244
HASH_T     SYS_P28                        USERS                               24945         43
HASH_T     SYS_P30                        USERS                               25209         43
HASH_T     SYS_P31                        USERS                               24890         43




5. 再次验证测试结果
--再来测试一下其他情况的数据是否均匀分配
--清空数据
dexter@ORCL> truncate table hash_t ;


Tabletruncated.

dexter@ORCL>exec dbms_stats.gather_table_stats('dexter','hash_t',cascade=>true);

PL/SQLprocedure successfully completed.

dexter@ORCL>select TABLE_NAME, PARTITION_NAME, TABLESPACE_NAME , NUM_ROWS  , BLOCKS from user_tab_partitions ;

TABLE_NAMEPARTITION_NAME                TABLESPACE_NAME                 NUM_ROWS    BLOCKS
---------- ------------------------------ ---------------------------------------- ----------
HASH_T     SYS_P29                        USERS                                   0          0
HASH_T     SYS_P28                        USERS                                   0          0
HASH_T     SYS_P30                        USERS                                   0          0
HASH_T     SYS_P31                        USERS                                   0          0




--初始化数据,id值只有 1,2,3
dexter@ORCL>  insert into hash_t selectmod(level,3) from dual connect by level commit ;

Commitcomplete.

dexter@ORCL>execdbms_stats.gather_table_stats('dexter','hash_t',cascade=>true);

PL/SQLprocedure successfully completed.

--可以看到,因为partitionkey 无法跨 分区 , 所以只有三个分区里面拥有数据
dexter@ORCL> select TABLE_NAME, PARTITION_NAME, TABLESPACE_NAME ,NUM_ROWS  , BLOCKS fromuser_tab_partitions ;

TABLE_NAMEPARTITION_NAME                TABLESPACE_NAME                 NUM_ROWS    BLOCKS
---------- ------------------------------ ---------------------------------------- ----------
HASH_T     SYS_P29                        USERS                               33333         58
HASH_T     SYS_P28                        USERS                                   0          0
HASH_T     SYS_P30                        USERS                               33333         58
HASH_T     SYS_P31                        USERS                               33334         58




6. 附录:11.2.0.3 版本数据库 分区表 延迟分配segment 测试

_sys@FAKE>select version from v$instance ;

VERSION
----------------------------------
11.2.0.3.0

_sys@FAKE>show parameter defe

NAME                                 TYPE                  VALUE
------------------------------------ ----------------------------------------------------
deferred_segment_creation           boolean                TRUE

_dexter@FAKE>create table hash_t (id number )
  2   partition by hash(id)
  3   (partition &part_name1 tablespace users,
  4    partition &part_name2 tablespace example,
  5    partition &part_name3,
  6    partition &part_name4) ;
Enter value for part_name1: p_t1
old   3:  (partition &part_name1 tablespace users,
new   3:  (partition p_t1 tablespace users,
Enter value for part_name2: p_t2
old   4:    partition &part_name2 tablespaceexample,
new   4:    partition p_t2 tablespaceexample,
Enter value for part_name3: p_t3
old   5:    partition &part_name3,
new   5:    partition p_t3,
Enter value for part_name4: p_t4
old   6:    partition &part_name4)
new   6:    partition p_t4)
















Tablecreated.

_dexter@FAKE>select TABLE_NAME, PARTITION_NAME, TABLESPACE_NAME , NUM_ROWS  , BLOCKS , segment_created fromuser_tab_partitions where table_name='HASH_T'

TABLE_NAMEPARTITION_NAME                TABLESPACE   NUM_ROWS     BLOCKS SEGMENT_
---------- ------------------------------ ---------- ---------- ------------------
HASH_T     P_T1                           USERS                            NO
HASH_T     P_T2                           EXAMPLE                          NO
HASH_T     P_T3                           USERS                            NO
HASH_T     P_T4                           USERS                            NO





    
 
 

您可能感兴趣的文章:

 
本站(WWW.)旨在分享和传播互联网科技相关的资讯和技术,将尽最大努力为读者提供更好的信息聚合和浏览方式。
本站(WWW.)站内文章除注明原创外,均为转载、整理或搜集自网络。欢迎任何形式的转载,转载请注明出处。












  • 相关文章推荐
  • PHP中对各种加密算法、Hash算法的速度测试对比代码
  • linux下c/c++使用hash_map方法介绍 iis7站长之家
  • 找不到hash_set hash_map
  • linux下c/c++使用hash_map方法介绍
  • 请问LINUX下有没有跟在windows下一样的hash_map使用?比如VS2005下有微软实现的hash_map.
  • HASH查找的程序实现及性能分析
  • HASH算法是干嘛用的阿?
  • nginx的hash
  • hash密码破解工具 Hashkill
  • 开源GPU HASH密码破解工具 Whitepixel
  • 设计这样一个hash表
  • Google Sparse Hash
  • jQuery密码输入体验 Chroma-Hash
  • 关于hash_map
  • C/C++中使用hash_map要包含哪些头文件啊?
  • C Minimal Perfect Hashing Library
  • Linux的路由表是用hash表实现的呢还是radix树?
  • is hashed问题
  • 哈希计算工具 java-hash
  • 请教一句话:hash、表驱动、cache是程序设计中唯一伟大的三种技术。
  • john "No password hashes loaded"
  • MySQL Hash索引和B-Tree索引的区别


  • 站内导航:


    特别声明:169IT网站部分信息来自互联网,如果侵犯您的权利,请及时告知,本站将立即删除!

    ©2012-2021,