一. Data Blocks and Operating System Blocks
At the physical level, database data is stored in disk files made up of operating system blocks. An operating system block is the minimum unit of data that the operating system can read or write. In contrast, an Oracle block is a logical storage structure whose size and structure are not known to the operating system.
Figure shows that operating system blocks may differ in size from data blocks. The database requests data in multiples of data blocks, not operating system blocks.
When the database requests a data block, the operating system translates this operation into a requests for data in permanent storage. The logical separation of data blocks from operating system blocks has the following implications:
(1)Applications do not need to determine the physical addresses of data on disk.
(2)Database data can be striped or mirrored on multiple physical disks.
Every database has a database block size. The DB_BLOCK_SIZE initialization parameter sets the data block size for a database when it is created. The size is set for the SYSTEM and SYSAUX tablespaces and is the default for all other tablespaces. The database block size cannot be changed except by re-creating the database.
If DB_BLOCK_SIZE is not set, then the default data block size is operating system-specific. The standard data block size for a database is 4 KB or 8 KB. If the size differs for data blocks and operating system blocks, then the data block size must be a multiple of the operating system block size.
三. Tablespace Block Size
You can create individual tablespaces whose block size differs from the DB_BLOCK_SIZE setting. A nonstandard block size can be useful when moving a transportable tablespace to a different platform.
Every data block has a format or internal structure that enables the database to track the data and free space in the block. This format is similar whether the data block contains table, index, or table cluster data. Figure shows the format of an uncompressed data block.
Oracle Database uses the block overhead to manage the block itself. The block overhead is not available to store user data. The block overhead includes the following parts:
This part contains general information about the block, including disk address and segment type. For blocks that are transaction-managed, the block header contains active and historical transaction information.
A transaction entry is required for every transaction that updates the block. Oracle Database initially reserves space in the block header for transaction entries. In data blocks allocated to segments that support transactional changes, free space can also hold transaction entries when the header space is depleted. The space required for transaction entries is operating system dependent. However, transaction entries in most operating systems require approximately 23 bytes.
For a heap-organized table, this directory contains metadata about tables whose rows are stored in this block. Multiple tables can store rows in the same block.
-- metadata,元数据,指表的定义语句。
heap-organized table: A table in which the data rows are stored in no particular order on disk. By default, CREATE TABLE creates a heap-organized table.
5.3 Row directory
For a heap-organized table, this directory describes the location of rows in the data portion of the block.
After space has been allocated in the row directory, the database does not reclaim this space after row deletion. Thus, a block that is currently empty but formerly had up to 50 rows continues to have 100 bytes allocated for the row directory. The database reuses this space only when new rows are inserted in the block.
Some parts of the block overhead are fixed in size, but the total size is variable. On average, the block overhead totals 84 to 107 bytes.
The row data part of the block contains the actual data, such as table rows or index key entries. Just as every data block has an internal format, every row has a row format that enables the database to track the data in the row.
Oracle Database stores rows as variable-length records. A row is contained in one or more row pieces. Each row piece has a row header and column data.
Figure shows the format of a row.
Oracle Database uses the row header to manage the row piece stored in the block. The row header contains information such as the following:
(1)Columns in the row piece
(2)Pieces of the row located in other data blocks
If an entire row can be inserted into a single data block, then Oracle Database stores the row as one row piece. However, if all of the row data cannot be inserted into a single block or an update causes an existing row to outgrow its block, then the database stores the row in multiple row pieces (see "Chained and Migrated Rows"). A data block usually contains only one row piece per row.
(3)Cluster keys for table clusters (see "Overview of Table Clusters")
A row fully contained in one block has at least 3 bytes of row header.
After the row header, the column data section stores the actual data in the row. The row piece usually stores columns in the order listed in the CREATE TABLE statement, but this order is not guaranteed. For example, columns of type LONG are created last.
For each column in a row piece, Oracle Database stores the column length and data separately. The space required depends on the data type. If the data type of a column is variable length, then the space required to hold a value can grow and shrink with updates to the data.
Each row has a slot in the row directory of the data block header. The slot points to the beginning of the row.
See Also: "Table Storage" and "Index Storage"
Oracle Database uses a rowid to uniquely identify a row. Internally, the rowid is a structure that holds information that the database needs to access a row. A rowid is not physically stored in the database, but is inferred from the file and block on which the data is stored.
An extended rowid includes a data object number. This rowid type uses a base 64 encoding of the physical address for each row. The encoding characters are A-Z, a-z, 0-9, +, and /.
Example 12-1 queries the ROWID pseudocolumn to show the extended rowid of the row in the employees table for employee 100.
Example 12-1 ROWID Pseudocolumn
SQL> SELECT ROWID FROM employees WHERE employee_id = 100;
ROWID
------------------
AAAPecAAFAAAABSAAA
the following Figure illustrates the format of an extended rowid.
An extended rowid is displayed in a four-piece format, OOOOOOFFFBBBBBBRRR, with the format divided into the following components:
7.1 OOOOOO
The data object number identifies the segment (data object AAAPec in Example 12-1). A data object number is assigned to every database segment. Schema objects in the same segment, such as a table cluster, have the same data object number.
-- 数据对象编号:每个数据对象(如表或索引)在创建时都分配有此编号,并且此编号在数据库中是唯一的.
7.2 FFF
The tablespace-relative data file number identifies the data file that contains the row (file AAF in Example 12-1).
相关文件编号:此编号对于表空间中的每个文件是唯一的
7.3 BBBBBB
The data block number identifies the block that contains the row (block AAAABS in Example 12-1). Block numbers are relative to their data file, not their tablespace. Thus, two rows with identical block numbers could reside in different data files of the same tablespace.
块编号:表示包含此行的块在文件中的位置
7.4 RRR
The row number identifies the row in the block (row AAA in Example 12-1).
行编号:标识块头中行目录位置的位置
After a rowid is assigned to a row piece, the rowid can change in special circumstances. For example,
if row movement is enabled, then the rowid can change because of partition key updates, Flashback Table operations, shrink table operations, and so on.
If row movement is disabled, then a rowid can change if the row is exported and imported using Oracle Database utilities.
Note:
Internally, the database performs row movement as if the row were physically deleted and reinserted. However, row movement is considered an update, which has implications for triggers.
The database can use table compression to eliminate duplicate values in a data block (see "Table Compression"). This section describes the format of data blocks that use compression.
The format of a data block that uses basic and OLTP table compression is essentially the same as an uncompressed block. The difference is that a symbol table at the beginning of the block stores duplicate values for the rows and columns. The database replaces occurrences of these values with a short reference to the symbol table.
Assume that the rows in Example 12-2 are stored in a data block for the seven-column sales table.
Example 12-2 Rows in sales Table
2190,13770,25-NOV-00,S,9999,23,161
2225,15720,28-NOV-00,S,9999,25,1450
34005,120760,29-NOV-00,P,9999,44,2376
9425,4750,29-NOV-00,I,9999,11,979
1675,46750,29-NOV-00,S,9999,19,1121
When basic or OLTP table compression is applied to this table, the database replaces duplicate values with a symbol reference.
Example 12-3 is a conceptual representation of the compression in which the symbol * replaces 29-NOV-00 and % replaces 9999.
Example 12-3 OLTP Compressed Rows in sales Table
2190,13770,25-NOV-00,S,%,23,161
2225,15720,28-NOV-00,S,%,25,1450
34005,120760,*,P,%,44,2376
9425,4750,*,I,%,11,979
1675,46750,*,S,%,19,1121
Table 12-1 conceptually represents the symbol table that maps symbols to values.
Symbol
Value
Column
Rows
*
29-NOV-00
3
958-960
%
9999
5
956-960
九. Space Management in Data Blocks
As the database fills a data block from the bottom up, the amount of free space between the row data and the block header decreases. This free space can also shrink during updates, as when changing a trailing null to a nonnull value. The database manages free space in the data block to optimize performance and avoid wasted space.
Note: This section assumes the use of automatic segment space management.
9.1 Percentage of Free Space in Data Blocks
The PCTFREE storage parameter is essential to how the database manages free space. This SQL parameter sets the minimum percentage of a data reserved as free space for updates to existing rows. Thus, PCTFREE is important for preventing row migration and avoiding wasted space.
For example, assume that you create a table that will require only occasional updates, most of which will not increase the size of the existing data. You specify the PCTFREE parameter within a CREATE TABLE statement as follows:
CREATE TABLE test_table (n NUMBER) PCTFREE 20;
Figure 12-9 shows how a PCTFREE setting of 20 affects space management. The database adds rows to the block over time, causing the row data to grow upwards toward the block header, which is itself expanding downward toward the row data. The PCTFREE setting ensures that at least 20% of the data block is free. For example, the database prevents an INSERT statement from filling the block so that the row data and header occupy a combined 90% of the total block space, leaving only 10% free.
Note:
This discussion does not apply to LOB data types, which do not use the PCTFREE storage parameter or free lists. See "Overview of LOBs".
9.2 Optimization of Free Space in Data Blocks
While the percentage of free space cannot be less than PCTFREE, the amount of free space can be greater. For example, a PCTFREE setting of 20% prevents the total amount of free space from dropping to 5% of the block, but permits 50% of the block to be free space. The following SQL statements can increase free space:
(1)DELETE statements
(2)UPDATE statements that either update existing values to smaller values or increase existing values and force a row to migrate
(3)INSERT statements on a table that uses OLTP compression
If inserts fill a block with data, then the database invokes block compression, which may result in the block having more free space.
The space released is available for INSERT statements under the following conditions:
(1)If the INSERT statement is in the same transaction and after the statement that frees space, then the statement can use the space.
(2)If the INSERT statement is in a separate transaction from the statement that frees space (perhaps run by another user), then the statement can use the space made available only after the other transaction commits and only if the space is needed.
9.3 Coalescing Fragmented Space
Released space may or may not be contiguous with the main area of free space in a data block, as shown in Figure 12-10. Noncontiguous free space is called fragmented space.
Figure 12-10 Data Block with Fragmented Space
Oracle Database automatically and transparently coalesces the free space of a data block only when the following conditions are true:
(1)An INSERT or UPDATE statement attempts to use a block that contains sufficient free space to contain a new row piece.
(2)The free space is fragmented so that the row piece cannot be inserted in a contiguous section of the block.
After coalescing, the amount of free space is identical to the amount before the operation, but the space is now contiguous. Figure 12-11 shows a data block after space has been coalesced.
Figure 12-11 Data Block After Coalescing Free Space
Oracle Database performs coalescing only in the preceding situations because otherwise performance would decrease because of the continuous coalescing of the free space in data blocks.
The database can reuse space within an index block. For example, if you insert a value into a column and delete it, and if an index exists on this column, then the database can reuse the index slot when a row requires it.
The database can reuse an index block itself. Unlike a table block, an index block only becomes free when it is empty. The database places the empty block on the free list of the index structure and makes it eligible for reuse. However, Oracle Database does not automatically compact the index: an ALTER INDEX REBUILD or COALESCE statement is required.
Figure 12-12 represents an index of the employees.department_id column before the index is coalesced. The first three leaf blocks are only partially full, as indicated by the gray fill lines.
Figure 12-12 Index Before Coalescing
Figure 12-13 shows the index in Figure 12-12 after the index has been coalesced. The first two leaf blocks are now full, as indicated by the gray fill lines, and the third leaf block has been freed.
Figure 12-13 Index After Coalescing
Oracle Database must manage rows that are too large to fit into a single block. The following situations are possible:
(1)The row is too large to fit into one data block when it is first inserted.
In row chaining, Oracle Database stores the data for the row in a chain of one or more data blocks reserved for the segment. Row chaining most often occurs with large rows.
Examples include rows that contain a column of data type LONG or LONG RAW, a VARCHAR2(4000) column in a 2 KB block, or a row with a huge number of columns. Row chaining in these cases is unavoidable.
(2)A row that originally fit into one data block is updated so that the overall row length increases, but insufficient free space exists to hold the updated row.
In row migration, Oracle Database moves the entire row to a new data block, assuming the row can fit in a new block. The original row piece of a migrated row contains a pointer or "forwarding address" to the new block containing the migrated row. The rowid of a migrated row does not change.
(3)A row has more than 255 columns.
Oracle Database can only store 255 columns in a row piece. Thus, if you insert a row into a table that has 1000 columns, then the database creates 4 row pieces, typically chained over multiple blocks.
Figure 12-14 depicts shows the insertion of a large row in a data block. The row is too large for the left block, so the database chains the row by placing the first row piece in the left block and the second row piece in the right block.
Figure 12-15, the left block contains a row that is updated so that the row is now too large for the block. The database moves the entire row to the right block and leaves a pointer to the migrated row in the left block.
When a row is chained or migrated, the I/O needed to retrieve the data increases. This situation results because Oracle Database must scan multiple blocks to retrieve the information for the row.
For example, if the database performs one I/O to read an index and one I/O to read a nonmigrated table row, then an additional I/O is required to obtain the data for a migrated row.
The Segment Advisor, which can be run both manually and automatically, is an Oracle Database component that identifies segments that have space available for reclamation. The advisor can offer advice about objects that have significant free space or too many chained rows.