Checkpoint Count/SCN, What It Is ….

A question came on this thread at the OTN forums asking that what is the “Checkpoint SCN” and “Checkpoint Count” and followed by another question that from where we can see both. It took a little while to figure out the answer of the question that where is the checkpoint counter stored so in this post, I shall be discussing that along with the answer of the original question.

What is a Checkpoint

Starting from the term, checkpoint, it’s an event when the Database Writer process is going to flush the dirty buffers to the data file. This is an on-going activity and shall be happening all the time. The benefit would be,since the number of the dirty buffers would be constantly flushed out to the disk, most of the buffers would be already saved on the disk with their changed images(without a commit) and in the event of the next instance crash, wouldn’t need to get recovered for the instance recovery. The checkpoint number is going to be constantly incremented in the data file’s header and in the control file and the background process CKPT takes care of it.

I believe the above few lines should make the basic of the checkpoint clear. Now, to make things a little more complex, there isn’t just one type of checkpoint happening at one specific event but there are many! And they do happen whenever the respective event responsible for that checkpoint event occurs. For the complete list of the checkpoint event, you can check this blog post of Jonathan Leiws, http://jonathanlewis.wordpress.com/2007/04/12/log-file-switch/ . As the question asked was about Checkpoint SCN, so we shall just discuss only one type of checkpoint which is related to it, Thread Checkpoint.

Thread Checkpoint

This type of checkpoint is caused when the log switch happens and because of that, DBWR is asked to do a flush of the dirty buffers whose change vectors are protected by the "active" redo log that needs to be checkpointed now to become "inactive" so that it can be reused. This event information is updated in both the data file headers and the control file. Also, this checkpoint number is going to be the same number that’s going to be there in the redo log file that’s going to be the current redo log file and is shown with the Checkpoint_Change# from the V$datafile_header view. This would be the same number that can be verified from the v$log , from the current thread of it. We can see it below. The output is from 11203 running on OEL5.3(x86),

SQL> select  group#, first_change#,status from V$log;

    GROUP# FIRST_CHANGE# STATUS
———- ————- —————-
         1       1407847 CURRENT
         2       1407825 INACTIVE
         3       1407831 INACTIVE

SQL> select file#, checkpoint_change# from V$datafile_header;

     FILE# CHECKPOINT_CHANGE#
———- ——————
         1            1407847
         2            1407847
         3            1407847
         4            1407847
         5            1407847
         6            1407847

6 rows selected.

So we can see the the checkpoint recorded in the headers of the datafile is the same what’s recorded as the lowest SCN in the current redo log file. This means, this is the first ever SCN from where this log file got opened. Now, there is another column in the V$LOG which is NEXT_CHANGE# which is going to be the highest SCN that was ever recorded in that log file before it got closed and was switched over. If this log group is not a Current one, this column would be populated with a valid number that would become the first SCN or lowest SCN from where, the next log group would start functioning and the same would be recorded as the highest checkpoint recorded in the headers of the data files. But if the log group is a Current one, since Oracle can’t determine the highest SCN that would come in it, the value for the column NEXT_CHANGE# is stored as a very high value 281474976710655.

Since, it’s a well known rule that the database is marked to be consistent if the data files, redo logs and the control file, all three files are consistent with each other. So now,we have seen that the most recent checkpoint in both redo log file and the data files is the same, let’s compare this with the control file of our database. And to do that, we shall dump the control file header at level 3. Have a look,

SQL*Plus: Release 11.2.0.3.0 Production on Fri Feb 1 21:41:01 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> alter session set tracefile_identifier=’ctlf’;

Session altered.

SQL> alter session set events ‘immediate trace name controlf level 2’;

Session altered.

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@machine2 ~]$ cd $ORACLE_BASE
[oracle@machine2 oracle]$ cd diag/rdbms/orcl/orcl/trace/
oracle@machine2 trace]$ ls *ctlf*
orcl_ora_4718_ctlf.trc  orcl_ora_4718_ctlf.trm
[oracle@machine2 trace]$ more orcl_ora_4718_ctlf.trc
Trace file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_4718_ctlf.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0/dbhome_1
System name:    Linux
Node name:      machine2
Release:        2.6.18-128.el5PAE
Version:        #1 SMP Wed Jan 21 08:28:30 EST 2009
Machine:        i686
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 32
Unix process pid: 4718, image: oracle@machine2 (TNS V1-V3)

*** 2013-02-01 21:42:00.057
*** SESSION ID:(38.79) 2013-02-01 21:42:00.057
*** CLIENT ID:() 2013-02-01 21:42:00.057
*** SERVICE NAME:(SYS$USERS) 2013-02-01 21:42:00.057
*** MODULE NAME:(sqlplus@machine2 (TNS V1-V3)) 2013-02-01 21:42:00.057
*** ACTION NAME:() 2013-02-01 21:42:00.057

DUMP OF CONTROL FILES, Seq # 3156 = 0xc54
V10 STYLE FILE HEADER:
        Compatibility Vsn = 186646528=0xb200000
        Db ID=1330567865=0x4f4edab9, Db Name=’ORCL’
        Activation ID=0=0x0
        Control Seq=3156=0xc54, File size=612=0x264
        File Number=0, Blksiz=16384, File Type=1 CONTROL

 

***************************************************************************
DATABASE ENTRY
***************************************************************************
(size = 316, compat size = 316, section max = 1, section in-use = 1,
  last-recid= 0, old-recno = 0, last-recno = 0)
(extent = 1, blkno = 1, numrecs = 1)
12/23/2012 12:25:29
DB Name "ORCL"
Database flags = 0x00404001 0x00001200
Controlfile Creation Timestamp  12/23/2012 12:25:32
Incmplt recovery scn: 0x0000.00000000
Resetlogs scn: 0x0000.000c05b9 Resetlogs Timestamp  12/23/2012 12:25:34
Prior resetlogs scn: 0x0000.00000001 Prior resetlogs Timestamp  09/18/2011 17:33:42
Redo Version: compatible=0xb200000
#Data files = 6, #Online files = 6
Database checkpoint: Thread=1 scn: 0x0000.00157b67
Threads: #Enabled=1, #Open=1, Head=1, Tail=1
enabled  threads:  01000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
–More–(30%)

We can see that in the header of the control file, the Database Checkpoint SCN is mentioned as 00157b67 which being converted to Decimal format is 1407847 and that’s the same which is there for the redo logs and the data files, as we saw before.

So what’s the Checkpoint SCN finally?

Well, by now this should be clear that checkpoint SCN happens with the a log switch and this is used to decide that whether the database is in synch or not-with the help of comparing this among all the three mandatory type files of the database.

But this is not the complete puzzle but one half of the puzzle. The other half of the puzzle that how the instance recovery is needed or not is detected is to check the checkpoint count and the SCN of that.

Checkpoint Count

As I said, this is the counter of the checkpoints that have happened on the data files and is recorded in both the control file and in the datafile headers. If the number matches in both along with the Checkpoint SCN,in the next database startup, this would be ensuring that there is no instance recovery needed. To see this number, let’s dump the control file but at this time, at level 4.

SQL*Plus: Release 11.2.0.3.0 Production on Fri Feb 1 21:41:01 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> alter session set tracefile_identifier=’myctlf’;

Session altered.

SQL> alter session set events ‘immediate trace name controlf level 4’;

Session altered.

Now, let’s open the file and see what does the entry for each of the data file in it contains about the checkpoint count.
[oracle@machine2 trace]$ pwd
/u01/app/oracle/diag/rdbms/orcl/orcl/trace
[oracle@machine2 trace]$ ls *myctl*
orcl_ora_4918_myctlf.trc  orcl_ora_4918_myctlf.trm
[oracle@machine2 trace]$ vi orcl_ora_4918_myctlf.trc
<<Output from VI , Data file Records(skipping the parts before it>>

***************************************************************************
DATA FILE RECORDS
***************************************************************************
(size = 520, compat size = 520, section max = 100, section in-use = 6,
  last-recid= 220, old-recno = 0, last-recno = 0)
(extent = 1, blkno = 11, numrecs = 100)
DATA FILE #1:
  name #7: /u01/app/oracle/oradata/orcl/system01.dbf
creation size=0 block size=8192 status=0xe head=7 tail=7 dup=1
tablespace 0, index=1 krfil=1 prev_file=0
unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
Checkpoint cnt:209 scn: 0x0000.00157b67 02/01/2013 20:55:58
Stop scn: 0xffff.ffffffff 02/01/2013 18:40:18
Creation Checkpointed at scn:  0x0000.00000007 09/18/2011 17:33:47
thread:0 rba:(0x0.0.0)
<output snipped>>
DATA FILE #3:
  name #5: /u01/app/oracle/oradata/orcl/undotbs01.dbf
creation size=0 block size=8192 status=0xe head=5 tail=5 dup=1
tablespace 2, index=3 krfil=3 prev_file=0
unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
Checkpoint cnt:136 scn: 0x0000.00157b67 02/01/2013 20:55:58
Stop scn: 0xffff.ffffffff 02/01/2013 18:40:18
Creation Checkpointed at scn:  0x0000.000bface 09/18/2011 18:13:22
thread:0 rba:(0x0.0.0)

So the counter is at 209 and the SCN for the same, in Hexadecimal is 00157b67 which after being converted to Decimal is 1407847, same what we saw earlier for both redo log file and data file headers. But this is an entry from the Control file. Let’s confirm the same from the data files also. And for that, we shall dump the data file headers. See below,

[oracle@machine2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Fri Feb 1 22:20:34 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> alter session set tracefile_identifier=’dfh’;

Session altered.

SQL> alter session set events ‘immediate trace name FILE_HDRS level 1’;

Session altered.

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@machine2 ~]$ cd /u01/app/oracle/diag/rdbms/orcl/orcl/trace/
[oracle@machine2 trace]$ ls -l *dfh*
-rw-r—– 1 oracle oinstall 21574 Feb  1 22:20 orcl_ora_5076_dfh.trc
-rw-r—– 1 oracle oinstall    98 Feb  1 22:20 orcl_ora_5076_dfh.trm
[oracle@machine2 trace]$ more orcl_ora_5076_dfh.trc
Trace file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_5076_dfh.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0/dbhome_1
System name:    Linux
Node name:      machine2
Release:        2.6.18-128.el5PAE
Version:        #1 SMP Wed Jan 21 08:28:30 EST 2009
Machine:        i686
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 28
Unix process pid: 5076, image: oracle@machine2 (TNS V1-V3)

*** 2013-02-01 22:20:58.881
*** SESSION ID:(42.179) 2013-02-01 22:20:58.881
*** CLIENT ID:() 2013-02-01 22:20:58.881
*** SERVICE NAME:(SYS$USERS) 2013-02-01 22:20:58.881
*** MODULE NAME:(sqlplus@machine2 (TNS V1-V3)) 2013-02-01 22:20:58.881
*** ACTION NAME:() 2013-02-01 22:20:58.881

DUMP OF DATA FILES: 6 files in database

DATA FILE #1:
  name #7: /u01/app/oracle/oradata/orcl/system01.dbf
creation size=0 block size=8192 status=0xe head=7 tail=7 dup=1
tablespace 0, index=1 krfil=1 prev_file=0
unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
Checkpoint cnt:209 scn: 0x0000.00157b67 02/01/2013 20:55:58
Stop scn: 0xffff.ffffffff 02/01/2013 18:40:18
Creation Checkpointed at scn:  0x0000.00000007 09/18/2011 17:33:47
thread:0 rba:(0x0.0.0)
<<output snipped>>

So it’s the same count 209 and the same checkpoint SCN 1407847.

I believe, this should clear now that what’s the Checkpoint SCN and what’s the Checkpoint Counter and how it can be used. Now, for the last part of the question, which table contains the information of the checkpoint count(and many other things) of the data file, and that table name is X$KCVFH (Kernel Cache Recovery All File Header) which contains , for checkpoint counter,the column FHCPC(File Header Checkpoint Count).

Conclusion

Checkpoint and SCN are probably two concepts which are discussed most often and many times, are still the most misunderstood concepts. I have tried to explain a little about a very tiny part about the checkpoint here. Since this all doesn’t come under the information which is very elaborated in detail by Oracle itself, the entire thing is a mere reflection of my own understaning about these concepts-which can be wrong as well. So if anything you see here that can be corrected or can be explained in a more better way, feel free to mention that in the comments section below.

Have a good time pinpointing the Checkpoints!!

References

1) Oracle documentation Reference guide

2) IXORA website of Steve Adams(http://ixora.com.au)

3) Various threads on OTN forums at Database-General Discussion forum

PS: Thanks to Joel for sharing the link containing details about the internal table.