
Oracle 12c, ILM Management Part-3, Heat Map….

In the last two parts, we saw two new approaches available in 12c for better management of data that is considered stale or inactive in a database: In-Database Archiving and Temporal Validity. The key point in both techniques is to make sure that data which is no longer relevant to the business doesn't impact the regular queries and DMLs. Both techniques are indeed good, but there is still room for more options when it comes to managing multi-gigabytes (or terabytes) of data which probably isn't required in the daily running of the business and for which some action needs to be taken. This post extends the discussion with further ways of doing ILM in 12c.

Pre-12.1 way of managing things!

Before we discuss what is new in 12c, let's try to understand how things have been done till now when it comes to managing large data sets that are considered stale as per the business policy. To illustrate, let's take the example of a telecom company. In the telecom business, the data related to the current month is of the highest importance, obviously because that's what brings in the rental money. The data preceding the current month is important, but not as much, and moreover, it's not the whole set of old data that's going to be useful. For example, since the current month is July, this month's data would be used to generate the subscribers' bills next month, so it is extremely important not only because it brings in the money but in terms of performance as well: the itemized bills, reports etc. would be generated from it, so access to it must be fast enough. The data of June and May can be considered relevant for reporting, for example if the quarterly report is generated from the cumulative data of all three months, May, June and July. But what about January's data, or even the months before that? Does it have the same importance as June's, or rather, the better question, should it have the same importance? The answer is NO, because it's very old data, and though we shouldn't say it is completely useless six months later, it is certainly much less important now.

So what can we do to manage such data distributions, which become obsolete after some time, at the object level? Probably the most effective technique that comes to mind is partitioning, and since we have taken the example of the telecom sector, it would be a perfect fit for this kind of data as well! For each month, a separate partition is created, and the partition of the most current month is also the most active one. For the older partitions, since they are not going to be of much use, you have a couple of choices: move them to slower storage, and along with that compress the data so that it doesn't consume much space; after some more time, compress it using a more aggressive compression algorithm, make it read-only and move it to some other facility where it would be kept for as long as your company's policy dictates. And why do all this? To make better use of the storage that you have.
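As a rough sketch of that manual approach (the table, partition and tablespace names below are made up purely for illustration, and the ARCHIVE HIGH compression level additionally assumes storage that supports Hybrid Columnar Compression):

-- A hypothetical call-detail table, range partitioned by month
CREATE TABLE cdr_history
( call_id    NUMBER,
  msisdn     VARCHAR2(15),
  call_date  DATE,
  duration   NUMBER )
PARTITION BY RANGE (call_date)
( PARTITION cdr_jan VALUES LESS THAN (TO_DATE('01-02-2013','DD-MM-YYYY')),
  PARTITION cdr_jun VALUES LESS THAN (TO_DATE('01-07-2013','DD-MM-YYYY')),
  PARTITION cdr_jul VALUES LESS THAN (TO_DATE('01-08-2013','DD-MM-YYYY')) );

-- Months later, the DBA manually compresses an old partition and moves it
-- to a tablespace created on slower, cheaper storage
ALTER TABLE cdr_history MOVE PARTITION cdr_jan
  TABLESPACE archive_ts COMPRESS FOR ARCHIVE HIGH;

-- Any local indexes have to be rebuilt after the move
ALTER TABLE cdr_history MODIFY PARTITION cdr_jan REBUILD UNUSABLE LOCAL INDEXES;

-- And once nothing is expected to change, the archive tablespace is made read-only
ALTER TABLESPACE archive_ts READ ONLY;

All of this works, but every step is triggered and executed by the DBA.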

So what's the issue with all this, when this is what DBAs have already been doing for many years? Well, there isn't any issue except one: a DBA has to do it all with proper strategy and planning, and when the time comes to take actions like compressing a segment or moving it to less performance-centric storage, do those tasks manually! So to make your life easier, 12c has made this process automatic, well, almost automatic to be precise! How? By using two new techniques: the first monitors your segments, i.e. the activities happening over them such as queries and DMLs, down to the fine level of data blocks and extents, and the second, based on the results of the first, decides what action(s) to take, for example whether to compress the data, or compress it and move it to some alternative storage, etc.

In this post, we shall look at the first technique, Heat Map, which is used to monitor the activity of segments.

Heat Map, getting started!

A very interesting title, isn't it? But it is indeed a very appropriate one as well! Heat Map tracks the "temperature" of an object, like a table or a partition of a table, in terms of accessibility, for both selects and DMLs, or in other words, for both reads and writes done over the object. To put it simply, it's the tracking of the "hotness" of your object! Three kinds of temperatures are possible:

  • Hot
  • Warm
  • Cold

Hot means that the object is actively participating in both read and write operations. Warm stands for an object which is accessed, but only via queries, i.e. for reads only. Cold means the object is not participating in any kind of activity and thus can safely be considered a suitable candidate for being moved to a facility (for example a tablespace sitting on slow storage) which would hold such "stale data". Using this kind of heat tracking for objects, it becomes very easy to take such ILM decisions!

Enabling Heat Map using the HEAT_MAP parameter
Now, this shouldn't come as any surprise: since the feature's name is Heat Map, the parameter which enables it is also called HEAT_MAP! This parameter can be modified at either the session or the system level and enables or disables segment-activity tracking (a short sketch of switching it on and off follows the list below). The statistics collected for a segment are stored first in memory, in the view V$HEAT_MAP_SEGMENT (X$HEATMAPSEGMENT), before being pushed to disk every hour by a job scheduled through the DBMS_SCHEDULER interface. The statistics collection happens in real time and is categorized as

  • Segment Access (either by a Full Table Scan or a Lookup Scan)
  • Segment Modification
  • Row Modification

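Here is a minimal sketch of switching the tracking on and off (a suitably privileged account is assumed):

-- Enable Heat Map tracking for the whole instance
ALTER SYSTEM SET heat_map = ON SCOPE=BOTH;

-- ...or only for the current session
ALTER SESSION SET heat_map = ON;

-- And switch it off again
ALTER SYSTEM SET heat_map = OFF SCOPE=BOTH;
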
So let's see what the value of the HEAT_MAP parameter is in our non-container database ORCL12. This is very important to remember because at this moment, with 12.1, the Heat Map feature is NOT supported for multitenant databases.
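On a default installation, something along these lines is what you would see (output abridged for illustration):

SQL> show parameter heat_map

NAME      TYPE    VALUE
--------- ------- -----
heat_map  string  OFF

SQL> select count(*) from v$heat_map_segment;

  COUNT(*)
----------
         0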


So the parameter is set to OFF, which means no tracking of any segment is going to be done by default. Since there is no tracking going on, there is nothing recorded in the view V$HEAT_MAP_SEGMENT either.

So now, let's modify the parameter to ON and see the effect by accessing tables of the Scott schema in both queries and DMLs.
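A sketch of that step, assuming the standard SCOTT demo schema is installed (column names as documented for 12.1):

-- Switch the tracking on for the instance
ALTER SYSTEM SET heat_map = ON SCOPE=BOTH;

-- Generate some read and write activity on SCOTT's tables
SELECT * FROM scott.emp;
UPDATE scott.emp SET comm = comm WHERE deptno = 10;
COMMIT;

-- The in-memory view should now show the tracked segments
SELECT object_name, track_time, segment_write, full_scan, lookup_scan
FROM   v$heat_map_segment;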

The view V$HEAT_MAP_SEGMENT is an in-memory view, which means the statistics collected in it are not persistent. So these statistics are flushed to the persistent table HEAT_MAP_STAT$. As mentioned before, this flushing is scheduled using the scheduler interface. After being flushed, you can view the same information from the views DBA_HEAT_MAP_SEG_HISTOGRAM and DBA_HEAT_MAP_SEGMENT. So let's bounce the database, which clears the in-memory view, and then re-populate it by accessing the EMP table of the Scott schema.

So we can see that the view is empty after the database restart. Let's enable heat tracking again by modifying the parameter.
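A sketch of generating both kinds of reads (the lookup below assumes EMP's primary key index on EMPNO):

-- Re-enable the tracking after the bounce
ALTER SYSTEM SET heat_map = ON SCOPE=BOTH;

-- A full table scan...
SELECT * FROM scott.emp;

-- ...and an index-driven lookup on the primary key
SELECT ename, sal FROM scott.emp WHERE empno = 7839;

-- Both access paths should now be visible in the in-memory view
SELECT object_name, full_scan, lookup_scan
FROM   v$heat_map_segment
WHERE  object_name = 'EMP';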

So we have the view populated for both the full table scan and also for a lookup, i.e. an index-based scan. Please note that this access is tracked even for individual partitions. Let's now write into our table by doing a simple insert.
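For example (the values inserted below are just made-up sample data):

-- A simple write into the segment
INSERT INTO scott.emp (empno, ename, job, deptno)
VALUES (9999, 'AMAN', 'DBA', 10);
COMMIT;

-- SEGMENT_WRITE should now be flagged for EMP as well
SELECT object_name, segment_write, full_scan, lookup_scan
FROM   v$heat_map_segment
WHERE  object_name = 'EMP';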

So indeed, that is also tracked in the view! Now, let's check the outputs from the views DBA_HEAT_MAP_SEG_HISTOGRAM and DBA_HEAT_MAP_SEGMENT.
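The persisted information can then be queried along these lines (column names as documented for 12.1):

-- One row per tracked segment, with the latest read and write timestamps
SELECT object_name, segment_write_time, segment_read_time, full_scan, lookup_scan
FROM   dba_heat_map_segment
WHERE  object_name = 'EMP';

-- A time-wise history of which kind of access happened when
SELECT object_name, track_time, segment_write, full_scan, lookup_scan
FROM   dba_heat_map_seg_histogram
WHERE  object_name = 'EMP'
ORDER  BY track_time;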


Using the DBMS_HEAT_MAP package to go even deeper

The default views only show the details of the statistics at the segment level. But if you want to go deeper, for example to check which blocks were accessed and at what time, you need to use the new package DBMS_HEAT_MAP. Using it, you can query the heat map information at different levels:

  1. Object level
  2. Segment level
  3. Tablespace level
  4. Extent level
  5. Block level

Here is an example of using the package to access block-level information for the EMP table.
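A sketch of those calls using the table functions exposed by the package (the exact result columns vary a little across the levels, so SELECT * is used here):

-- Object-level summary for SCOTT.EMP
SELECT * FROM TABLE(dbms_heat_map.object_heat_map('SCOTT', 'EMP'));

-- Block-level detail: which blocks of the segment were touched, and when
SELECT * FROM TABLE(dbms_heat_map.block_heat_map('SCOTT', 'EMP'));

-- Extent-level detail for the same segment
SELECT * FROM TABLE(dbms_heat_map.extent_heat_map('SCOTT', 'EMP'));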


Heat Map, old wine in a new bottle? Not really!

If you are wondering what's so new about this Heat Map feature, and whether it isn't the same, or kind of the same, as auditing the table or monitoring the read and write activity on it using views like V$SEGMENT_STATISTICS, V$OBJECT_USAGE etc. which exist in prior releases, well, the answer is NO. What you get with auditing, or with statistics such as how many reads and writes were done on a table, is very basic, and that information can't be used in any intelligent way by the database itself. Yes, being a DBA you may take some actions manually, but that's about it! Heat Map, on the other hand, though it appears to work in the same fashion, is meant for an entirely different requirement: to decide intelligently and automatically what to do with the data based on its access patterns. How? Well, we shall see that in the next post, where we shall use this heat map data to create policies over our objects which are applied based on the conditions that we specify. Sounds cool, doesn't it? Well, it is indeed very cool stuff, so stay tuned for the next part of this series!
