Archive for category internals

Size, Yes Its Different

Some times, there is an assumption about things. And as with assumption always, its mostly not true. Still, we do believe on those assumptions and spread them as well , not caring that it may happen that the things are not the way you know them or have read them anywhere! I had a similar sort of discussion few days back with some guys and one of the discussions did end up  as a matter of small confusion. This small note is about that confusion’s answer!

I have met many oracle dba’s arguing that the standard block size set at the time of the database creation is the size which is applicable to all the types of the files. So if you have a block size of 8kb, this would be size of your datafiles, control files and also of the log files. This sounds reasonable as well since besides setting the value for the Oracle block size, you don’t have option to mention any other kind of block size as well. So if you are setting a value of it, this should be applicable to all the database files. And this was the topic of discussion as well between me and few delegates! The answer of this doubt is a No actually! Oracle’s standard block size is applicable to only the datafiles. The redo log files and the control files use a different block size. Yes, the block size used by them is not shown in any of the views( things have changed a little from 11.1 onwards) so even if there would be any other value for the blocks, that is not visible. So let’s first check the size used in the datafiles. We shall check the values in 10.2 and 11.1 databases (10201, 11106) running on Windows XP Professional. First data files on 10g,

Connected to:
  Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production 

With the Partitioning, OLAP and Data Mining options


SQL> select file#,block_size from  V$datafile; 

     FILE# BLOCK_SIZE
  ---------- ---------- 

         1       8192 

         2       8192 

         3       8192 

         4       8192 

         5       8192 

         6       8192 

         7       8192

SQL> sho parameter block_size 

NAME                                 TYPE        VALUE
  ------------------------------------ ----------- ----------- 

db_block_size                        integer     8192 

SQL>

 

And now the same for 11.1,

Connected to:
  Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production 

With the Partitioning, OLAP, Data Mining and Real Application Testing options


SQL> select file#, block_size from V$datafile; 

     FILE# BLOCK_SIZE
  ---------- ---------- 

         1       8192 

         2       8192 

         3       8192 

         4       8192 

         5       8192 

         6       8192 

         7       8192 

7 rows selected. 

SQL> sho parameter block_size 

NAME                                 TYPE        VALUE
  ------------------------------------ ----------- ------- 

db_block_size                        integer     8192

 

And this surely matches with what we have set in the parameter DB_BLOCK_SIZE!

Now, the issue is how do we check the block size used by the redo log files? In 10g, there is no provision given by oracle in the external view(s) to see this!

BANNER
  ---------------------------------------------------------------- 

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod 

PL/SQL Release 10.2.0.1.0 - Production 

CORE    10.2.0.1.0      Production 

TNS for 32-bit Windows: Version 10.2.0.1.0 - Production 

NLSRTL Version 10.2.0.1.0 - Production


SQL> desc V$log;
  Name                                      Null?    Type 

----------------------------------------- -------- -------------------- 

GROUP#                                             NUMBER 

THREAD#                                            NUMBER 

SEQUENCE#                                          NUMBER 

BYTES                                              NUMBER 

MEMBERS                                            NUMBER 

ARCHIVED                                           VARCHAR2(3) 

STATUS                                             VARCHAR2(16) 

FIRST_CHANGE#                                      NUMBER 

FIRST_TIME                                         DATE 

SQL> desc V$logfile
  Name                                      Null?    Type 

----------------------------------------- -------- -------------------- 

GROUP#                                             NUMBER 

STATUS                                             VARCHAR2(7) 

TYPE                                               VARCHAR2(7) 

MEMBER                                             VARCHAR2(513) 

IS_RECOVERY_DEST_FILE                              VARCHAR2(3) 

SQL>

But the size is indeed shown in an internal table, X$KCCRT( Kernel Cache Current Redo Thread). I am not going to show all the columns of this table but only that one which would be having the “log block size” .

SQL> select lebsz from X$kccle;


     LEBSZ
  ---------- 

       512 

       512 

       512

Yes, this is the size of the redo log block in which the LGWR writes into the log files. So this is indeed not true that the standard block size is applicable to the redo log files. This size basically is picked by Oracle based on the physical block size given by the media. We have the 512byte of the physical block size available and that’s the same is used by the redo log files. Still, you should check this on your own box! Surely enough, there is no need to change this size or play around with it! In 11g(11.1), this is the same output that you would be getting from oracle as there is no change that’s there in 11.1 for this nomenclature. Here is an output from 11.1 system,

BANNER
  -------------------------------------------------------------------------------- 

Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production 

PL/SQL Release 11.1.0.6.0 - Production 

CORE    11.1.0.6.0      Production 

TNS for 32-bit Windows: Version 11.1.0.6.0 - Production 

NLSRTL Version 11.1.0.6.0 - Production


SQL> select lebsz from x$kccle; 

     LEBSZ
  ---------- 

       512 

       512 

       512 

SQL> desc V$log
  Name                                      Null?    Type 

----------------------------------------- -------- ---------------------------- 

GROUP#                                             NUMBER 

THREAD#                                            NUMBER 

SEQUENCE#                                          NUMBER 

BYTES                                              NUMBER 

MEMBERS                                            NUMBER 

ARCHIVED                                           VARCHAR2(3) 

STATUS                                             VARCHAR2(16) 

FIRST_CHANGE#                                      NUMBER 

FIRST_TIME                                         DATE

SQL> desc V$logfile
  Name                                      Null?    Type 

----------------------------------------- -------- ---------------------------- 

GROUP#                                             NUMBER 

STATUS                                             VARCHAR2(7) 

TYPE                                               VARCHAR2(7) 

MEMBER                                             VARCHAR2(513) 

IS_RECOVERY_DEST_FILE                              VARCHAR2(3)

The output varies a little from 11.2 onwards where to check the redo log block, you won’t need to check any internal table but the same would be shown in the standard V$log. Since I don’t have 11.2 running with me on this machine so here is an excerpt from docs for the same,

http://download.oracle.com/docs/cd/E11882_01/server.112/e10820/dynviews_2030.htm#REFRN30127

You can see a new column sized called BLOCKSIZE which can have two values, 512 or 4096! Hang on a sec, two values? How that’s possible? The answer of that lies in a change that has started to come in the way our hard disks.

The standard physical sector size supported by the hard drives was always 512byte. This was( and still is) the same size of the block that was picked by redo log files as explained above. Since there are more bigger and faster media solutions available now, the hardware vendors have started moving from 512 byte physical block sized disks to 4kb physical block sized disks! Surely enough, this would increase the capability of underlying systems to do a more larger chunk of IOs in both reading and writing. But for this, there remains one hurdle that if the oracle files are still going to be using the 512byte sector sized block, this optimization from the hard disk vendors won’t bear any fruit since the IO would be still limited to the underlying block size of the redo log file which would be still 512byte. To overcome this, from 11.2 onwards, Oracle supports the 4kb sector disk drives and offers two modes of the working, Emulation mode and Native mode for the block size support of the redo log files. You can read about it here,

http://download.oracle.com/docs/cd/E11882_01/server.112/e10881/chapter1.htm#FEATURENO08747

Oracle is capable to detect what’s the size of the physical block offered by the disk and accordingly, the redo log block size is picked up. And this was the right thing to do in the past as well since there was no requirement to do any kind of tuning to this behavior as well as there was no other option available from the physical disk as well. But since now the vendors have started shipping disks with 4kb disks as well, so the support for the same must come both from Oracle and from the operating systems as well. Here is a support note from Microsoft about the same,

http://support.microsoft.com/kb/923332

The same support is offered by Oracle from 11.2 onwards where two new clauses are introduced for the same, sector_size and blocksize. The SECTOR_SIZE clause is added while you are working with the ASM(without ACFS) i.e. when you are creating a diskgroup, you can mention that what would be the size of the block used by that disk group. Oracle docs explain this in a good manner here,

http://download.oracle.com/docs/cd/E11882_01/server.112/e10500/asmdiskgrps.htm#OSTMG10203

The BLOCKSIZE support is offered from 11.2 while creating the log files. Now, you can mention that what should be the size of the redo log block on the underlying disk which supports either 4kb or 512 byte sized sector. If you are going to use 512 byte sized sector on a 4kb sector disk, this would be a non-good approach actually. So if you do know that you have a disk supporting 4kb sized sector, its advisable to use the same while creating the redo log files as well. For the same, 11.2 docs have this section added,

http://download.oracle.com/docs/cd/E11882_01/server.112/e10595/onlineredo002.htm#ADMIN12891

Emulation mode is the mode where the physical block size offered by the vendor is 4kb but the sized used by the redo log files is still 512byte only. This would be considered as a logical size where 8 logical sectors of 512 byte would map to one physical sector of 4096 bytes. As I said above, this would be a non-good approach. The better option would be to use the same size at both disk and within the file. There is another type of the mode offered called Native mode where the logical and physical size of the sector would be the same.

Oracle recommends that the block size used by the redo logs should match with the physical sector and the size of the data block should be either equivalent of multiple of the physical sector size. This optimization does “not” get applied to the control files which won’t experience any chance in their working since they don’t use either the standard block size or the redo log block size. Didn’t I say so already :-) ?

Unlike the redo log and data files, control files use a standard block size of 16kb irrespective of what is offered from the media. Again, this is not shown from the standard control file views so we need to go a step ahead and check the table, X$KCCCF( Kernel Cache Current Control File). Let’s check this table on both 10.2 and 11.1 versions,

SQL> select * from V$version;


BANNER
  ----------------------------------------------------------------

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod

PL/SQL Release 10.2.0.1.0 - Production

CORE    10.2.0.1.0      Production

TNS for 32-bit Windows: Version 10.2.0.1.0 - Production

NLSRTL Version 10.2.0.1.0 - Production 

SQL> select cfbsz from X$kcccf; 

     CFBSZ
  ----------

     16384

     16384

     16384 

SQL> sho parameter control 

NAME                                 TYPE        VALUE
  ------------------------------------ ----------- ------------------------------

control_file_record_keep_time        integer     7

control_files                        string      E:\ORACLE\PRODUCT\10.2.0\ORADA

                                                 TA\ORCL\CONTROL01.CTL, E:\ORAC

                                                 LE\PRODUCT\10.2.0\ORADATA\ORCL

                                                 \CONTROL02.CTL, E:\ORACLE\PROD

                                                 UCT\10.2.0\ORADATA\ORCL\CONTRO

                                                 L03.CTL

What we are seeing is that there are three control files in my db and all are using 16kb as the block size. Let’s check on 11.1 now,

SQL> select * from V$version;


BANNER
  --------------------------------------------------------------------------------

Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production

PL/SQL Release 11.1.0.6.0 - Production

CORE    11.1.0.6.0      Production

TNS for 32-bit Windows: Version 11.1.0.6.0 - Production

NLSRTL Version 11.1.0.6.0 - Production 

SQL> select cfbsz from x$kcccf; 

     CFBSZ
  ----------

     16384

     16384

     16384 

SQL> sho parameter control 

NAME                                 TYPE        VALUE
  ------------------------------------ ----------- ------------------------------

control_file_record_keep_time        integer     7

control_files                        string      E:\APP\ARISTADBA\ORADATA\ORCL1

                                                 11G\CONTROL01.CTL, E:\APP\ARIS

                                                 TADBA\ORADATA\ORCL111G\CONTROL

                                                 02.CTL, E:\APP\ARISTADBA\ORADA

                                                 TA\ORCL111G\CONTROL03.CTL

control_management_pack_access       string      DIAGNOSTIC+TUNING

 

Which shows the same output with 3 files and a 16kb block size! This is sort of hardcoded only and won’t change even if you are going to use a disk which supports variable sector sizes.

To conclude, its good to believe on rumors but its more better to check their authentication since 9/10 times, rumors are just what they are, rumors! Sounds like a great punch line right :-D !

Special thanks to Jared for pointing and correcting a typo in the post :-)

,

17 Comments

Redo & Undo, Partners In Crime Talked About Yet Again….

Anyone who does know about Oracle db and its working even the slightest bit would have heard about these two words for sure, Redo and Undo! Both make Oracle db what it is, a very robust data engine and also come out as one of the most complex algorithms to understand within the same. Almost all the time, there is some confusion that’s there about these two and I am sure, both will be a very hot topic of discussion for many many years yet to come!

I am quite active on OTN Forums ( and no that’s not an excuse of being less active here) and at times, there are some very interesting questions which do pop up there. One such question about Redo, which obviously would include Undo as well did come out few days back over this thread, why not use Redo log for consistent read. Now, it shouldn’t come out as a surprise that why Redo can’t be used if one understands the way it works but it was still a very interesting topic to jump into since the way Oracle db works is one of my favorite areas. So I did try to explain it somewhat. If you think that you don’t know the answer or even if you know you do, I would welcome you to read the thread! As like always, any corrections wherever needed and possible are most welcome!

Though the thread went to great lengths, still I fee that the underlying answer was not really what it should had been. I am sure the reason is my not-so-deep understanding about Oracle db only since I was the one who did give maximum replies there. But I guess, the topic was actually good because of which, Jonathan Lewis, probably the best and the most knowledgeable person over this planet about Oracle, did pick the question and gave the actual (and an awesome one) and very detailed answer over here. I would recommend you to read the complete blog post which clears the concept completely that why Redo should NOT be used for read consistency and Undo should be! Thanks so much Jonathan for such a brilliant explanation!

I am pretty busy at the moment( and no, this is also not the reason for being less active here) because there is a very important writing assignment I am working over but I shall try to write soon about some of these basic concepts over here! Hope that “soon” happens soon :-) !

,

No Comments

Select For Update A DML, Yes It Is….

Okay, if you think that what the heck, this guy didn’t know this much even, well, its true. I didn’t know or to save myself, I would say, I did know what Select For Updat does , like it locks the rows and so on, but didn’t relate to the redo part. So when I came to know about it reading this thread(thanks to Mark) today I tested it out, it appears to be  true. So before I would forget about it, I thought to store it here. If you already knew, refresh your memories, if not , keep it in memory :-) .

A Select statement is merely meant to select the data from the table, that’s it, plain and simple. It doesn’t do anything else. It doesn’t generate any undo and redo in normal conditions but for a delayed block cleanout, it may generate redo as well. There are no locks which are acquired by it , there is nothing which gets changed. Yes it may look for the Snapshot Blocks if there are DMLs happening on the data which it is looking for but that’s it. Other than all this, it behaves properly :-) .

Select for update statement, is a cousin of Select only but it is not that simple as like its sibling. With For Update added , Oracle has to make sure that the candidate rows must be locked. And this means, for all the rows being effected, the lock byte of them would get locked , including the segment also whose row we are going to hit. This means, Select For Update is actually not just a select statement  but its a DML which actually would start a transaction, lock the rows, generate redo for changing their lock records and also would require a rollback explicitly given to get those locks released. Have a look,
<code>
SQL> select * from v$version;
BANNER
—————————————————————-
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 – Prod
PL/SQL Release 10.2.0.1.0 – Production
CORE    10.2.0.1.0      Production
TNS for Linux: Version 10.2.0.1.0 – Production
NLSRTL Version 10.2.0.1.0 – Production
SQL> create table testredo as select * from all_objects;
Table created.
SQL> set autot trace stat
SQL> select * from testredo;
49714 rows selected.
Statistics
———————————————————-
        288  recursive calls
          0  db block gets
       4041  consistent gets
        683  physical reads
          0  redo size
    5337330  bytes sent via SQL*Net to client
      36839  bytes received via SQL*Net from client
       3316  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      49714  rows processed
SQL> /
49714 rows selected.
Statistics
———————————————————-
          0  recursive calls
          0  db block gets
       3956  consistent gets
          0  physical reads
          0  redo size
    5337330  bytes sent via SQL*Net to client
      36839  bytes received via SQL*Net from client
       3316  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      49714  rows processed
</code>
There was no requirement for running the same select twice, I just ran it to ensure me that nothing fishy is going on :-) .

So there is no redo with the normal select. Let’s try out For Update now,

<code>
SQL> select * from testredo for update;
49714 rows selected.
Statistics
———————————————————-
        771  recursive calls
      50738  db block gets
       4862  consistent gets
          1  physical reads
   10369188  redo size
    4444384  bytes sent via SQL*Net to client
      36839  bytes received via SQL*Net from client
       3316  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
      49714  rows processed
WHOOOAAA! There is a lot of redo. Let’s see if there is some transaction record that we have got,
SQL> select * from V$locked_object;
    XIDUSN    XIDSLOT     XIDSQN  OBJECT_ID SESSION_ID
———- ———- ———- ———- ———-
ORACLE_USERNAME                OS_USER_NAME                   PROCESS
—————————— —————————— ————
LOCKED_MODE
———–
         9          8        364      52533        159
AMAN                           oracle                         4989
          3
SQL> select xidusn,xidslot,xidsqn from V$transaction;
    XIDUSN    XIDSLOT     XIDSQN
———- ———- ———-
         9          8        364
SQL>            
</code>
So there is indeed a transaction which got recorded and there is surely a lock that’s there on the object also. Let’s finish it off,
<code>
SQL>roll;
Rollback Complete
SQL> select * from V$locked_object;
no rows selected
SQL> select * from V$transaction;
no rows selected
SQL> set autot trace stat
SQL> select * from testredo;
49714 rows selected.
Statistics
———————————————————-
          0  recursive calls
          0  db block gets
       3956  consistent gets
          1  physical reads
          0  redo size
    5337330  bytes sent via SQL*Net to client
      36839  bytes received via SQL*Net from client
       3316  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      49714  rows processed
SQL>               
</code>
So we are back to from where we started.

A small but good learning , at least for me, thanks to Mark and Charles for their explanation over the the original thread.

Conclusion, think more deeply about something. At times, just knowing about the concept is not enough :-) . And yes, just for the record, select for update is a  DML statement, not just a normal select , that’s the actual conclusion right ;-) !

,

4 Comments

Fixed Table Definitions, A New Learning….

Whenever I used to find out the info of any fixed table(x$) I used to do this by a workaround. I used to set the trace, run the query and see the table’s name. For example, if we are looking for the V$log’s fixed table name so I would do something like that ,

SQL> select * from  V$log;

Execution Plan

———————————————————-

Plan hash value: 2536105608

——————————————————————————–

————

| Id  | Operation                | Name            | Rows  | Bytes | Cost (%CPU)

| Time     |

——————————————————————————–

————

|   0 | SELECT STATEMENT         |                 |     1 |   185 |     0   (0)

| 00:00:01 |

|   1 |  NESTED LOOPS            |                 |     1 |   185 |     0   (0)

| 00:00:01 |

|*  2 |   FIXED TABLE FULL       | X$KCCLE         |     1 |   136 |     0   (0)

| 00:00:01 |

|*  3 |   FIXED TABLE FIXED INDEX| X$KCCRT (ind:1) |     1 |    49 |     0   (0)

| 00:00:01 |

——————————————————————————–

————

(output trimmed)
So I got that x$kccle and x$kccrt are driving it. Well not a bad way IMO.
Till today….
When I found this note,
So there is actually a view which tells us the description of the fixed tables, cool! So doing the same from here,
SQL> select view_definition
2  from v$fixed_view_definition where view_name=’V$LOG’;
VIEW_DEFINITION
——————————————————————————–
select   GROUP# , THREAD# , SEQUENCE# , BYTES , MEMBERS , ARCHIVED , STATUS , FI
RST_CHANGE# , FIRST_TIME from GV$LOG where inst_id = USERENV(’Instance’)
SQL> select view_definition
2  from v$fixed_view_definition where view_name=’GV$LOG’;
VIEW_DEFINITION
——————————————————————————–
select le.inst_id, le.lenum, le.lethr, le.leseq, le.lesiz*le.lebsz, ledup, decod
e(bitand(le.leflg,1),0,’NO’,’YES’), decode(bitand(le.leflg,24), 8, ‘CURRENT’,
16,’CLEARING’,                            24,’CLEARING_
CURRENT’,        decode(sign(leseq),0,’UNUSED’,        decode(sign((to_number(rt
.rtckp_scn)-to_number(le.lenxs))*        bitand(rt.rtsta,2)),-1,’ACTIVE’,’INACTI
VE’))), to_number(le.lelos), to_date(le.lelot,’MM/DD/RR HH24:MI:SS’,’NLS_CALENDA
R=Gregorian’) from x$kccle le, x$kccrt rt where le.ledup!=0 and le.lethr=rt.rtnu
m and  le.inst_id = rt.inst_id
SQL>
Voila! We got the fixed table’s names and much more info about how they are being used. Cool! Its always better to take a straight way rather than a workaround. Learned some thing new today :-) .

2 Comments

Create Table Creating Dirty Buffers….

I always thought that the creation of table is a mere data dictionary update. I mean to say that the data dictionary , dba_tables, OBJ$ and other similar ones would get updated when we create a table. The logic was that its just a definition so where else it would be going? Its correct too that its just a definition but still for couple of things, Oracle does access DataBuffer Cache too when it creates a table and (here is what I didnt even think to see) not just accesses it but also makes couple of its block dirty too. Doesn’t make a sense,well ok let the numbers,bits, flags speak then. Have a look,

 

    1. SQL*Plus: Release 11.1.0.6.0 - Production on Fri Aug 22 00:24:26 2008  
    2.   
    3. Copyright (c) 19822007, Oracle.  All rights reserved.  
    4.   
    5. Enter password:  
    6.   
    7. Connected to:  
    8. Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production  
    9. With the Partitioning, Oracle Label Security, OLAP, Data Mining,  
    10. Oracle Database Vault and Real Application Testing options  
    11.   
    12. SQL> create table t (a number);  
    13.   
    14. Table created.  
    15.   
    16. SQL> select object_id ,object_name from user_objects where object_name=‘T’;  
    17.   
    18.  OBJECT_ID  OBJECT_NAME  
    19. ———-  ———-  
    20.   
    21.      72217  T  
  •  

    Here comes the interesting part:
    1. SQL> SELECT b.block#,b.class#,b.status, object_name, object_type, dirty “Dirty” FROM v$bh b,  
    2.   2   dba_objects o WHERE b.objd = o.data_object_id AND o.owner = ‘AMAN’ and object_name=‘T’;  
    3.   
    4.     BLOCK#     CLASS# STATUS     OBJECT_NAME          OBJECT_TYPE         D  
    5. ———- ———- ———- ——————– ——————- -  
    6.       3746          9 xcur       T                    TABLE               Y  
    7.       3745          8 xcur       T                    TABLE               Y  
    8.       3747          4 xcur       T                    TABLE               Y  
    9.   
    10. SQL>  

    The buffers from the table T are marked Dirty and are accessed in the XCUR mode which means the blocks are accessed in the Exclusive Current mode and is representing that these blocks are going to be modified. The description of the buffer status which is shown from V$BH(or X$BH.status) from 11gr1 docs is given here,

    STATUS VARCHAR2(6) Status of the buffer: 

    • free - Not currently in use
    • xcur - Exclusive
    • scur - Shared current
    • cr - Consistent read
    • read - Being read from disk
    • mrec - In media recovery mode
    • irec - In instance recovery mode

    Here is the link for the same in docs.http://download.oracle.com/docs/cd/B28359_01/server.111/b28320/dynviews_1059.htm#REFRN30029

    A much better description of the same is mentioned in the 9iR2 docs,in the RAC guide. The link is mentioned below and also the description,

    http://download.oracle.com/docs/cd/B10501_01/rac.920/a96597/pslkgdtl.htm#19780

    Block Access Modes and Buffer States

    An additional concurrency control concept is the buffer state which is the state of a buffer in the local cache of an instance. The buffer state of a block relates to the access mode of the block. For example, if a buffer state is exclusive current (XCUR), an instance owns the resource in exclusive mode.

    To see a buffer’s state, query the STATUS column of the V$BH dynamic performance view. The V$BH view provides information about the block access mode and their buffer state names as follows:

    • With a block access mode of NULL the buffer state name is CR–An instance can perform a consistent read of the block. That is, if the instance holds an older version of the data.
    • With a block access mode of S the buffer state name is SCUR–An instance has shared access to the block and can only perform reads.
    • With a block access mode of X the buffer state name is XCUR–An instance has exclusive access to the block and can modify it.
    • With a block access mode of NULL the buffer state name is PI–An instance has made changes to the block but retains copies of it as past images to record its state before changes.

    Only the SCUR and PI buffer states are Real Application Clusters-specific. There can be only one copy of any one block buffered in the XCUR state in the cluster database at any time. To perform modifications on a block, a process must assign an XCUR buffer state to the buffer containing the data block.

    For example, if another instance requests read access to the most current version of the same block, then Oracle changes the access mode from exclusive to shared, sends a current read version of the block to the requesting instance, and keeps a PI buffer if the buffer contained a dirty block.

    At this point, the first instance has the current block and the requesting instance also has the current block in shared mode. Therefore, the role of the resource becomes global. There can be multiple shared current (SCUR) versions of this block cached throughout the cluster database at any time.

    So the blocks are meant for the modification and hence they are marked dirty after being modified. The same is confirmed from the query written by Jonathan Lewis,

    1. SQL> select  
    2.          decode(bitand(flag,power(2,00)),0,‘No’,‘Yes’ ) buffer_dirty,  
    3.          decode(bitand(flag,power(2,01)),0,‘No’,‘Yes’ ) about_to_modify,  
    4.          decode(bitand(flag,power(2,02)),0,‘No’,‘Yes’ ) mod_started,  
    5.          decode(bitand(flag,power(2,03)),0,‘No’,‘Yes’ ) block_has_been_logged,  
    6.          decode(bitand(flag,power(2,04)),0,‘No’,‘Yes’ ) temp_data,  
    7.          decode(bitand(flag,power(2,05)),0,‘No’,‘Yes’ ) being_written,  
    8.          decode(bitand(flag,power(2,06)),0,‘No’,‘Yes’ ) waiting_for_write,  
    9.          decode(bitand(flag,power(2,07)),0,‘No’,‘Yes’ ) checkpoint_wanted,  
    10.          decode(bitand(flag,power(2,08)),0,‘No’,‘Yes’ ) recovery_reading,  
    11.          decode(bitand(flag,power(2,09)),0,‘No’,‘Yes’ ) unlink_from_lock,  
    12.          decode(bitand(flag,power(2,10)),0,‘No’,‘Yes’ ) down_grade_lock,  
    13.          decode(bitand(flag,power(2,11)),0,‘No’,‘Yes’ ) cross_instance_write,  
    14.          decode(bitand(flag,power(2,12)),0,‘No’,‘Yes’ ) reading_as_CR,  
    15.          decode(bitand(flag,power(2,13)),0,‘No’,‘Yes’ ) gotten_in_current_mode,  
    16.          decode(bitand(flag,power(2,14)),0,‘No’,‘Yes’ ) stale,  
    17.          decode(bitand(flag,power(2,15)),0,‘No’,‘Yes’ ) deferred_ping,  
    18.          decode(bitand(flag,power(2,16)),0,‘No’,‘Yes’ ) direct_access,  
    19.          decode(bitand(flag,power(2,17)),0,‘No’,‘Yes’ ) moved_to_lru_tail,  
    20.          decode(bitand(flag,power(2,18)),0,‘No’,‘Yes’ ) ignore_redo,  
    21.          decode(bitand(flag,power(2,19)),0,‘No’,‘Yes’ ) only_sequential_access,  
    22.          decode(bitand(flag,power(2,20)),0,‘No’,‘Yes’ ) unused_0×100000,  
    23.          decode(bitand(flag,power(2,21)),0,‘No’,‘Yes’ ) re_write_needed,  
    24.          decode(bitand(flag,power(2,22)),0,‘No’,‘Yes’ ) flushed,  
    25.          decode(bitand(flag,power(2,23)),0,‘No’,‘Yes’ ) resilvered_already,  
    26.          decode(bitand(flag,power(2,24)),0,‘No’,‘Yes’ ) ckpt_writing,  
    27.          decode(bitand(flag,power(2,25)),0,‘No’,‘Yes’ ) redo_since_read,  
    28.          decode(bitand(flag,power(2,26)),0,‘No’,‘Yes’ ) unused_0×4000000,  
    29.          decode(bitand(flag,power(2,27)),0,‘No’,‘Yes’ ) unused_0×8000000,  
    30.          decode(bitand(flag,power(2,28)),0,‘No’,‘Yes’ ) unused_0×10000000,  
    31.          decode(bitand(flag,power(2,29)),0,‘No’,‘Yes’ ) unused_0×20000000,  
    32.          decode(bitand(flag,power(2,30)),0,‘No’,‘Yes’ ) unused_0×40000000,  
    33.          decode(bitand(flag,power(2,31)),0,‘No’,‘Yes’ ) unused_0×80000000  
    34.  from x$bh where obj=72217;  
    35.   
    36. BUF ABO MOD BLO TEM BEI WAI CHE REC UNL DOW CRO REA GOT STA DEF DIR MOV IGN ONL UNU RE_ FLU RES CKP RED UNU UNU UNU UNU UNU UNU  
    37. — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —  
    38. Yes No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  Yes No  No  No  No  No  No  
    39.   
    40. Yes No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  Yes No  No  No  No  No  No  
    41.   
    42. Yes No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  No  Yes No  No  No  No  No  No  

    So indeed the buffers are marked are as dirty only. But the buffers are supposed to be marked dirty when they are changed and with the create table,what’s going on which has made them change. Each buffer belongs to a particular classLets try to find out the class description of these buffers using their numbers.The block classes are described here*,

    Block Class Description
    1 Data block
    2 Sort block
    3 Save undo block
    4 Segment header
    5 Save undo header
    6 Free list
    7 Extent map
    8 1st level bitmap block
    9 2nd level bitmap block
    10 3rd level bitmap block
    11 Bitmap block
    12 Bitmap index block
    13 File header block
    14 Unused
    15 System undo block
    16 System undo block
    17 Undo header
    18 Undo block

    So according to the result that we have got, the blocks belong to the classes 4,8,9 and these numbers represent the classes(in the same order ),Header Block,1st level Bitmap Block,2nd Level Bitmap Block.

    When oracle creates a table, it has to allocate a segment header block. This is done by picking up a block from a cache,formatting it,writing into it the new description of the data segment.As it is written so it is marked dirty and subsequently is written to the datafile.

    The Automatic Segment Space Management allocates 3 levels of the blocks for example, Level1 , Level2 , Level3. The level 3 is the bitmap header block which resides in the segment header. This is linked to the Level 2 block. This block is used to search Level 1 which is always the first block in the extent. Level 1 marks the blocks for their respective status. So it appears , Oracle does the same what it does for the Segment header block to the Bitmap Blocks i.e. acquires two blocks from the cache , marks as Bitmap blocks, writes into them and marks them dirty. So this is the reason that we get the blocks marked as dirty even when we have not done anything by ourselves to change them. For me it was a new concept as I was not aware about it before. I would still try to dig upon more and will try to post more as and when I shall find it out. Heartiest thanks to Jonathan Lewis for helping me to understand it.

    References,

    http://www.juliandyke.com/Internals/BlockClasses.html

    http://www.jlcomp.demon.co.uk/buf_flag.html

    , ,

    3 Comments

    A New Learning, IMU & Private Redo Strand….

    There are very few things that I know but there are tons of things which I don’t know. I came across one more such thing like this few days ago when this question (http://forums.oracle.com/forums/thread.jspa?messageID=2713561&#2713561 came over OTN. Interestingly, what the OP mentioned was indeed true and I was not having a clue what’s going on. Like most of the times, JL had an answer to the puzzle and he gave a lead. This blog post is not about any “new” invention that I have done. Its merely a collection of my understanding about a new mechanism in >10g SGAs. So with this note ,let’s get the party started.

    Dirty Buffers & Redo Log Buffer

    In Oracle buffer cache, buffers stay in multiple states. When a buffer is not changed and is just available for use, it’s a clean buffer. When the buffer encounters a change, the buffer status is marked as Dirty buffer. This status is used by Oracle to provide read-consistancy for the other users who would access the same buffer but when the transaction is alive. In simple terms, a dirty buffer is that one who got an image in it which is not the one when it was brought into the cache. Oracle maintains for this dirty buffer, CR(consistent read) buffers aka snapshot blocks which provide read conistancy for the others.

    The change vectors for this buffer are also logged in the redo log buffer before they get applied to the actual buffers(write ahead logging) , using which Oracle maintains this assurance that no matter what, if we have the data in the log buffer, its safe. The data in the log buffer is written by LGWR to the redo log files. Once there, its permanently saved and hence protected against any crash.

    In order to log changes in the log buffer, oracle typically follows this process

    1)      A calculation is done that how much space would be needed in the log buffer.

    2)      Server process acquires Redo Copy Latch. This is used for making the announcement that some redo will be copied into the log buffer.

    3)       Acquires Redo Allocation Latch. This is the actual latch which will allocate the space in the log buffer.

    4)       Redo Allocation Latch is released after the space is acquired in the log buffer.

    5)      Redo copy latch is used to copy the redo change vectors in the log buffers.

    6)      Redo copy latch is released.

    After the copy is copy is complete and applying them over the buffers in the buffer cache, depending upon various conditions, LGWR may start writing the contents into the log files. How many latches we have got have at the moment for our database can be seen from V$latch_childeren. Here is an output from my 11gr1 machine.

     1  select latch#,child#,name from V$latch_children

      2* where name like ‘%redo%’ order by 1,2

    SQL> /

     

     

        LATCH#     CHILD# NAME

    ———- ———- —————————————————————-

           169          1 redo copy

           169          2 redo copy

           169          3 redo copy

           169          4 redo copy

           170          1 redo allocation

           170          2 redo allocation

           170          3 redo allocation

           170          4 redo allocation

           170          5 redo allocation

           170          6 redo allocation

           170          7 redo allocation

           170          8 redo allocation

           170          9 redo allocation

           170         10 redo allocation

           170         11 redo allocation

           170         12 redo allocation

           170         13 redo allocation

           170         14 redo allocation

           170         15 redo allocation

           170         16 redo allocation

           170         17 redo allocation

           170         18 redo allocation

           170         19 redo allocation

           170         20 redo allocation

     

    24 rows selected.

    With heavy processing, lots of redo being generated, there may be a contention over the redo allocation latches. So for this purpose, there was a parameter Log_parallelism( http://download.oracle.com/docs/cd/B10501_01/server.920/a96536/ch1105.htm#REFRN10099) in 9i which could be used to make the redo generation go more faster.  This would separate the log buffer into further small buffers. This parameter is only till 9i and from 10.1 onwards its deprecated(http://download.oracle.com/docs/cd/B28359_01/server.111/b28300/changes.htm#UPGRD12575) as in 10g, the mechanism is changed ( and is the actual topic of this post) which handles the excessive redo workload.

    So the moral of the story so far comes up is that when we change some thing in the buffer cache, the change is protected in the log buffer. To do this two latches are majorly used. The changes in the log buffer are protected in the physical log files by LGWR.

    You must be thinking that that is this the replica of docs or what? Well certainly, it’s the basics and I thought before I forget all this, I must write it down and this talks about the terms which will be coming up a little later in the post. So chill J. Told you its not any new invention but collection ;-) .

    What’s Wrong With All This

    Well actually nothing much. This is some this which is happening since the inception of Oracle I believe. Some things came and go but in nutshell things remained the same. This allocation of latches and other stuff did prove to be fine but as they all good things come to an end, this also needed some modifications. As with the workloads being more heavy and with much bigger databases and higher transaction rates, it becomes too tough to manage the redo allocations using latches. Clearly for the reasons that are well documented in PT guide that with lots of redo, the latches required to copy the redo data in the log buffer come under contention and so does the LGWR process too. There is a workaround to invoke multiple slaves for LGWR for heavy systems but that too is not some thing which is not recommended as long as not fully benchmarked. Surely there are tons of guidelines that are there for users in order to make transactions small, have a good media where redo can stay and so on but they can be applied to a certain level. So there was a need for things to be modified from the grass root level and that’s what happened in 10g(10.1) onwards.

    A Strange Output

    With the Dirty Buffers discussion given above, you must be clear about the idea about them. So here is small table creation and modification and we would see some dirty buffers getting created,

    SQL> select * from V$version;

     

    BANNER

    ——————————————————————————–

    Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 – Production

    PL/SQL Release 11.1.0.6.0 – Production

    CORE    11.1.0.6.0      Production

    TNS for 32-bit Windows: Version 11.1.0.6.0 – Production

    NLSRTL Version 11.1.0.6.0 – Production

     

    SQL> alter system flush buffer_cache;

     

    System altered.

     

    SQL> create table t(a number)

      2  /

     

    Table created.

     

    SQL> SELECT b.status, object_name, object_type, dirty “Dirty” FROM v$bh b, dba_objects o

      2  WHERE b.objd = o.data_object_id AND o.owner = ‘AMAN’ and object_name=’T’;

     

    STATUS     OBJECT_NAME

    ———- ———————————————————————————–

    ———

    OBJECT_TYPE         D

    ——————- -

    xcur       T

    TABLE               Y

     

    xcur       T

    TABLE               Y

     

    xcur       T

    TABLE               Y

     

     

    SQL> insert into t values(1);

     

    1 row created.

     

    SQL> SELECT b.status, object_name, object_type, dirty “Dirty” FROM v$bh b, dba_objects o

      2  WHERE b.objd = o.data_object_id AND o.owner = ‘AMAN’ and object_name=’T’;

     

    STATUS     OBJECT_NAME

    ———- ———————————————————————————–

    ———

    OBJECT_TYPE         D

    ——————- -

    xcur       T

    TABLE               Y

     

    xcur       T

    TABLE               N

     

    xcur       T

    TABLE               N

     

    xcur       T

    TABLE               Y

     

    xcur       T

    TABLE               N

     

    xcur       T

    TABLE               Y

     

    xcur       T

    TABLE               N

     

    xcur       T

    TABLE               N

     

     

    8 rows selected.

     

    SQL>

    SQL> commit;

     

    Commit complete.

     

    SQL> SELECT b.status, object_name, object_type, dirty “Dirty” FROM v$bh b, dba_objects o

      2  WHERE b.objd = o.data_object_id AND o.owner = ‘AMAN’ and object_name=’T’;

     

    STATUS     OBJECT_NAME

    ———- —————————————————————————–

    ———

    OBJECT_TYPE         D

    ——————- -

    xcur       T

    TABLE               Y

     

    xcur       T

    TABLE               Y

     

    xcur       T

    TABLE               Y

     

    xcur       T

    TABLE               Y

     

    xcur       T

    TABLE               Y

     

    xcur       T

    TABLE               Y

     

    xcur       T

    TABLE               Y

     

    xcur       T

    TABLE               Y

     

     

    8 rows selected.

     

    SQL>

    Well if you have seen the output with a little care, you must have seen that even when we have given the dmp(insert in this case) there were no buffers which were marked dirty. Oracle did acquire some buffers to keep our changes but they were not marked dirty.  But when we issued commit, all the buffers were marked dirty. Why so?

    The New SGA Of IMU/ Private Redo Strands

    Due to the reasons mentioned above for the allocation and redo latch contentions, from 10.1 onwards, Oracle has introduced private redo allocation areas and In memory Undo.

    IMU(In Memory Undo)

    With the same reason that Oracle needs to protect redo, it also has to protect Undo to make sure that the block’s old contents are also saved. This would require an access to the blocks and than later on change the transaction table to make the changes logged. The transaction records can be seen from V$transaction. In order to access undo segment header blocks directly, we can also check X$KTUXE and for the base table of V$transaction, X$KTCXB can be used. From oracle 10g onwards, Oracle does use the Undo blocks but makes them available in private memory areas which are specifically created in the Shared Pool. Each pool is containing Undo data of a  transaction. Undo data is updated into the segment header but not into the undo blocks. For maintaining them, in the shared pool , memory areas are configured which are used to protect the undo that is coming up. These are controlled by two parameters _in_memory_undo and _imu_pools. Both can be seen from the base tables x$ksppi and x$ksppcv2

    SQL> l

      1  select  KSPFTCTXDVL,kspftctxdf,ksppinm

      2  from x$ksppcv2 a,x$ksppi b

      3* where a.indx=b.indx and b.ksppinm like ‘%imu%’

    SQL> column KSPFTCTXDVL format a10

    SQL> /

     

    KSPFTCTXDV KSPFTC KSPPINM

    ———- —— —————————————————

    2          TRUE   _simulator_upper_bound_multiple

    16         TRUE   _simulator_pin_inval_maxcnt

    10240      TRUE   _simulator_lru_rebalance_thresh

    5          TRUE   _simulator_lru_rebalance_sizthr

    8192       TRUE   _simulator_bucket_mindelta

    8          TRUE   _simulator_lru_scan_count

    10         TRUE   _simulator_internal_bound

    1024       TRUE   _simulator_reserved_obj_count

    4096       TRUE   _simulator_reserved_heap_count

    2          TRUE   _simulator_sampling_factor

    0          TRUE   _simulate_mem_transfer

    0          TRUE   _db_writer_flush_imu

    0          TRUE   _minimum_giga_scn

    1800       TRUE   _log_simultaneous_copies

    d:\oracle\ FALSE  _allow_error_simulation

    flash_reco

    very_area

     

    SLFLUSH    TRUE   _gc_policy_minimum

    FALSE      TRUE   _gc_maximum_bids

    TRUE       TRUE   _minimum_db_flashback_retention

    600        TRUE   _recursive_imu_transactions

    TRUE       TRUE   _smu_error_simulation_site

    TRUE       TRUE   _smu_error_simulation_type

    0          TRUE   _imu_pools

    10         TRUE   _minimum_blocks_to_shrink

    TRUE       TRUE   _minimum_extents_to_shrink

    RUN_TIME   TRUE   _plsql_minimum_cache_hit_percent

     

    25 rows selected.

    And

    SQL> l

      1  select  KSPFTCTXDVL,kspftctxdf,KSPFTCTXVL,KSPFTCTXIS,ksppinm

      2  from x$ksppcv2 a,x$ksppi b

      3* where a.indx=b.indx and b.ksppinm like ‘%undo%’

    SQL> /

     

    KSPFTCTXDV KSPFTC KSPFTCTXVL KSPFTC KSPPINM

    ———- —— ———- —— ——————–

    FALSE      TRUE   FALSE      FALSE  _gc_undo_affinity

    AUTO       TRUE   AUTO       FALSE  _gc_dissolve_undo_af

                                        finity

     

    TRUE       TRUE   TRUE       FALSE  _gc_initiate_undo_af

                                        finity

     

    0          TRUE   0          FALSE  _gc_undo_block_disk_

                                        reads

     

               TRUE              FALSE  _undo_block_compress

                                        ion

     

               TRUE              FALSE  undo_management

               TRUE              FALSE  undo_tablespace

    100        TRUE   100        FALSE  _collect_undo_stats

    0          TRUE   0          FALSE  _undo_debug_mode

    0          TRUE   0          FALSE  _verify_undo_quota

               TRUE              FALSE  _in_memory_undo

    FALSE      TRUE   FALSE      FALSE  _undo_autotune

    FALSE      TRUE   FALSE      FALSE  undo_retention

    0          TRUE   0          FALSE  _undo_debug_usage

    0          TRUE   0          FALSE  _smon_undo_seg_resca

                                        n_limit

     

    0          TRUE   0          FALSE  _optimizer_undo_cost

                                        _change

     

    TRUE       TRUE   TRUE       FALSE  _optimizer_undo_chan

                                        ges

     

     

    17 rows selected.

    Each pool is maintained by a latch and that is called “in memory undo latch”. They can be seen from V$latch_children

    SQL> select name from V$latch_children where name like ‘%undo%’;

     

    NAME

    —————————————————————-

    undo global data

    undo global data

    undo global data

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

    In memory undo latch

     

    21 rows selected.

     

     

    We can see that in the shared pool, regarding Undo,there are several structures available.

    SQL> select pool,name from V$sgastat  where pool=’shared pool’ and lower(name) like ‘%undo%’;

    POOL         NAME

    ———— ————————–

    shared pool  UNDO INFO SEGMENTED ARRAY

                 KCB undo bitvec

                 Undo Meta Data

                 Auto tune undo info

                 UNDO INFO

                 KTI-UNDO

                 UNDO INFO HASH

                 UNDO STAT INFO

    The IMU pools are allocated in the shared pool. The data that we change gets copied to these pools from a  latch called In memory Undo latch. We can see it here from V$latch,

    SQL> select name from V$latch where name like ‘%undo%’;

     

    NAME

    —————————————————————-

    undo global data

    corrupted undo seg lock

    In memory undo latch

    When we modify the data, oracle doesn’t apply the data to the data blocks immediately but It allocates the data in the IMU pools in the shared pool with the help of this IMU latch. This mechanism avoids the pinning of the Undo segment header block and also to the undo data blocks which are otherwise used immediately when the transaction starts. Now they are only used when the transaction commits. The data is in the pool and from there only it gets copied to the buffer cache. As each pool is assigned to one transaction and is maintained by an individual latch, the mechanism works well to reduce the contention. I am still not clear that with a very very heavy OLTP environment, what would happen? Are the pools increased at that time or not, or some thing else happens, I am not sure.

    The IMU pools are also described in the X$ktifp. The size of the pools are 65kb and can be seen as follows,

    SQL> select ktifpno,ktifppsi from x$ktifp;

     

       KTIFPNO   KTIFPPSI

    ———- ———-

             0      65535

             1      65535

             2      65535

             3      65535

             4      65535

             5      65535

             6      65535

             7      65535

             8      65535

             9      65535

            10      65535

     

       KTIFPNO   KTIFPPSI

    ———- ———-

            11      65535

            12      65535

            13      65535

            14      65535

            15      65535

            16      65535

            17      65535

     

    18 rows selected.

    So there are 18 pools allocated and are of 65kb each in size. This undo data is “flushed” to the undo segments when the pools are full. This write happens in the batch format. There are events when the data in the IMU pools gets flushed. These events are mentioned in the base table x$ktiff. We can see the reasons here,

    SQL> l

      1  select  KTIFFCAT, KTIFFFLC

      2* from  x$ktiff

    SQL> /

     

    KTIFFCAT                                   KTIFFFLC

    —————————————- ———-

    Undo pool overflow flushes                        1

    Stack cv flushes                                  9

    Multi-block undo flushes                          0

    Max. chgs flushes                                 2

    NTP flushes                                       0

    Contention flushes                                9

    Redo pool overflow flushes                        0

    Logfile space flushes                             0

    Multiple persistent buffer flushes                0

    Bind time flushes                                 0

    Rollback flushes                                335

    Commit flushes                                 1570

    Recursive txn flushes                             2

    Redo only CR flushes                              0

    Ditributed txn flushes                            0

    Set txn use rbs flushes                           0

    Bitmap state change flushes                      21

    Presumed commit violation                         0

    Securefile direct-write lob update flush          0

    es

     

     

    19 rows selected.

    We can see with simple DML that data is being changed and IMU is being flushed.

    In session 1, I made a table,

    C:\Documents and Settings\Aman>sqlplus / as sysdba

     

    SQL*Plus: Release 11.1.0.6.0 – Production on Sat Aug 30 11:42:21 2008

     

    Copyright (c) 1982, 2007, Oracle.  All rights reserved.

     

     

    Connected to:

    Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 – Production

    With the Partitioning, Oracle Label Security, OLAP, Data Mining,

    Oracle Database Vault and Real Application Testing options

     

    SQL> create table a as select * from dba_objects;

     

    Table created.

     

    SQL>

     

     

    And in session 2(the same session from where we saw the events),

    SQL> /

     

    KTIFFCAT                                   KTIFFFLC

    —————————————- ———-

    Undo pool overflow flushes                        1

    Stack cv flushes                                  9

    Multi-block undo flushes                          0

    Max. chgs flushes                                 2

    NTP flushes                                       0

    Contention flushes                                9

    Redo pool overflow flushes                        0

    Logfile space flushes                             0

    Multiple persistent buffer flushes                0

    Bind time flushes                                 0

    Rollback flushes                                338

    Commit flushes                                 1592

    Recursive txn flushes                             2

    Redo only CR flushes                              0

    Ditributed txn flushes                            0

    Set txn use rbs flushes                           0

    Bitmap state change flushes                      21

    Presumed commit violation                         0

    Securefile direct-write lob update flush          0

    es

     

     

    19 rows selected.

    And in session 1,

    SQL> select n.name, s.value from v$statname n, v$sesstat s

      2  where n.statistic# = s.statistic#

      3  and s.sid = (select sid from v$mystat where rownum = 1)

      4  and n.name in (’IMU Flushes’,’IMU commits’);

     

    NAME                                                                  VALUE

    —————————————————————- ———-

    IMU commits                                                               0

    IMU Flushes                                                               0

     

    SQL>

    We do some change in the data in session 1,

    SQL> insert into a select * from a;

     

    68864 rows created.

     

    SQL> select n.name, s.value from v$statname n, v$sesstat s

      2  where n.statistic# = s.statistic#

      3  and s.sid = (select sid from v$mystat where rownum = 1)

      4  and n.name in (’IMU Flushes’,’IMU commits’);

     

    NAME                                                                  VALUE

    —————————————————————- ———-

    IMU commits                                                               0

    IMU Flushes                                                               1

    And we have got a flush happened as the data generated is too much big. Lets see in session 2,

    SQL> /

     

    KTIFFCAT                                   KTIFFFLC

    —————————————- ———-

    Undo pool overflow flushes                        1

    Stack cv flushes                                  9

    Multi-block undo flushes                          0

    Max. chgs flushes                                 2

    NTP flushes                                       0

    Contention flushes                                9

    Redo pool overflow flushes                        1

    Logfile space flushes                             0

    Multiple persistent buffer flushes                0

    Bind time flushes                                 0

    Rollback flushes                                353

    Commit flushes                                 1654

    Recursive txn flushes                             2

    Redo only CR flushes                              0

    Ditributed txn flushes                            0

    Set txn use rbs flushes                           0

    Bitmap state change flushes                      21

    Presumed commit violation                         0

    Securefile direct-write lob update flush          0

    es

     

     

    19 rows selected.

    After some more changes in session 1,

    SQL> insert into a select * from a;

     

    137728 rows created.

     

    SQL> where n.statistic# = s.statistic#

    SP2-0734: unknown command beginning “where n.st…” – rest of line ignored.

    SQL> select n.name, s.value from v$statname n, v$sesstat s

      2  where n.statistic# = s.statistic#

      3  and s.sid = (select sid from v$mystat where rownum = 1)

      4  and n.name in (’IMU Flushes’,’IMU commits’);

     

    NAME                                                                  VALUE

    —————————————————————- ———-

    IMU commits                                                               0

    IMU Flushes                                                               1

     

    SQL> commit;

     

    Commit complete.

     

    SQL> select n.name, s.value from v$statname n, v$sesstat s

      2  where n.statistic# = s.statistic#

      3  and s.sid = (select sid from v$mystat where rownum = 1)

      4  and n.name in (’IMU Flushes’,’IMU commits’);

     

    NAME                                                                  VALUE

    —————————————————————- ———-

    IMU commits                                                               0

    IMU Flushes                                                               1

     

    SQL>

    We see again session 1,

    SQL>  /

     

    KTIFFCAT                                   KTIFFFLC

    —————————————- ———-

    Undo pool overflow flushes                        1

    Stack cv flushes                                  9

    Multi-block undo flushes                          0

    Max. chgs flushes                                 2

    NTP flushes                                       0

    Contention flushes                               10

    Redo pool overflow flushes                        1

    Logfile space flushes                             0

    Multiple persistent buffer flushes                0

    Bind time flushes                                 0

    Rollback flushes                                359

    Commit flushes                                 1688

    Recursive txn flushes                             2

    Redo only CR flushes                              0

    Ditributed txn flushes                            0

    Set txn use rbs flushes                           0

    Bitmap state change flushes                      21

    Presumed commit violation                         0

    Securefile direct-write lob update flush          0

    es

     

     

    19 rows selected.

     

    SQL>

    This is worth to mention that the same kind of mechanism was introduced by Oracle earlier with the help of Discrete Transaction Management,http://download.oracle.com/docs/cd/B10501_01/server.920/a96524/c17trans.htm#421

    But this didn’t work out too well as the DTM used to pin the data blocks during the transaction. When the transaction used to get committed than the data was written to the redo logs.  In private redo, no pinning of the data blocks is done. Hence they are much better than DTM.  DTM is obsolete now.

     

    Private Redo Strands( Private Redo )

    Earlier there was a single redo allocation latch available for the entire log buffer. This was a major contention point as till the time one won’t be able to pin the latch, he wont be able to write his changed redo vectors in the log buffer. From 10.1 onwards, the log buffer is broken in to smaller sub-buffers called Private Redo Strands(shown in the x$kcrfstrand). These are also allocated in the shared pool. These all are written by redo allocation latches. There are allocation latches now for each private strand. The data when changed now, doesn’t contend to go into the log buffer as like before but goes into the private redo strands and from there, it goes into the log buffer when the buffer pools become full or when the transaction is committed.  We can see the private strands shown in the shared pool from the V$sgastat

     

    SQL> select name,pool from V$sgastat

      2  where name like ‘%private%’;

     

    NAME                       POOL

    ————————– ————

    private strands            shared pool

    KCB private handles        shared pool

    At log file switch , when the data is still is in the Private redo strands, lgwr may wait to get the flush done by DBWR. Hence the event, log file switch (private strand flush incomplete) may occur. See the doc note for it,

    http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/waitevents003.htm#sthref3117

    This is controlled by the parameter _log_parallelism which is a dynamic and hidden parameter in 10g. The value depends on the cpu_count parameter. Oracle starts with 2 strands.

    The benefit of private redo strand is also for the same reason that this would make the LGWR less busy and also the contention over the redo allocation latches will decrease.

    Reference:

    www.tanelpoder.com/files/Oracle_Performance_Scalability_10g_11g.pdf

    Metalink note 372557.1

    http://www.fortunecity.com/skyscraper/oracle/699/orahtml/hayden/discrete.html


    , ,

    4 Comments