Understanding Oracle traces and plan

March 7, 2013

Below are few of the excellent notes on execution plans and trace from expert Charles Hooper

http://hoopercharles.wordpress.com/2010/01/30/execution-plans-what-is-the-plan-and-where-do-i-find-it/ (3 part series)
http://hoopercharles.wordpress.com/2009/12/01/10046-extended-sql-trace-interpretation/ (2 part series)
http://hoopercharles.wordpress.com/2010/01/11/explain-plan-lies-autotrace-lies-tkprof-lies-what-is-the-plan/

 

Parallelism and DEFAULT value

March 4, 2013

Recently in one of the client systems, performance issue was reported in sqls which were quite simple. Explain plan revealed that it is going for parallelism but we had not given any parallel hint in the query

So from where it is coming….

10053 trace of the query was showing default parallelism was used. Parallel_query_default_dop parameter in 1003 trace tells us about this default parallelism used. It is optimizer defined run time parameter and is not init.ora parameter.

DEFAULT value of the DEGREE column in the dba_tables for that particular showed the path. It can be set to DEFAULT when we run ALTER table PARALLEL option without degree alter table XA_1 parallel;

The default Degree of Parallelism(DOP) is determined based on the system configuration: People(at least me !!) generally assume that When column DEGREE has value default, it means it will use parallel =1, that is serial mode, but that is not the case.

DEFAULT DOP = cpu_count * parallel_threads_per_cpu* cluster_database_instances

In our case cpu_count was 64 with parallel_thread_per_cpu=2 and cluser_database_instance=2 (Since system was RAC) so DEFAULT DOP = 62*2*2 = 256. This exactly matches with Parallel_query_default_dop parameter found in the 10053 trace

Yes !!!. this was the issue since DEGREE was DEFAULT it was picking parallel plan.

By changing the DEGREE to 1 plans were as expected.

alter table XA_1 parallel 1;

So in case where plans are in PARALLEL option , checking the DEGREE column can be good starting point. Generally we never want this to be DEFAULT.

Below are few good links:

http://gerardnico.com/wiki/database/oracle/dop https://forums.oracle.com/forums/thread.jspa?threadID=1120327

Getting Row lock wait information from AWR repository

December 10, 2012

There was case in one of the Production environments, where “row cache lock” wait event was topmost wait event. Below is one of the approaches that one can follow to see what is the real issue

Row Cache Lock

As seen from below awrs for time period of 12 to 18, most of time is lost on “row cache lock” wait event.

Snap Id Snap Time Sessions Curs/Sess
--------- ------------------- -------- ---------
Begin Snap: 38729 03-Dec-12 12:00:36 408 114.4
End Snap: 38735 03-Dec-12 18:00:10 282 118.7
Elapsed: 359.57 (mins)
DB Time: 6,839.92 (mins)

Top 5 Timed Events Avg %Total
~~~~~~~~~~~~~~~~~~ wait Call
Event Waits Time (s) (ms) Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
row cache lock 3,980,592 285,082 72 69.5 Concurrenc
CPU time 48,706 11.9
gc buffer busy 1,591,825 13,796 9 3.4 Cluster
db file sequential read 5,521,456 12,893 2 3.1 User I/O
enq: TX - row lock contention 202,525 12,454 61 3.0 Applicatio
-------------------------------------------------------------

Now question came, what is major culprit.

Checked the Row lock wait count for that period – 3 Dec (12-18PM )

Used the below sqls

1. To check what are wait events during the particular time period(based on SNAP ids)
It will reconfirm the AWR that row lock wait is coming at top
select
EVENT, count(1)
from
DBA_HIST_ACTIVE_SESS_HISTORY
where
instance_number =1 and
snap_id between 38729 and 38735
group by event order by 2 asc;

2. To check sql ids based on event
It will give what sqls that contributed to this wait event

select distinct sql_id
from
DBA_HIST_ACTIVE_SESS_HISTORY
where
instance_number =1 and
snap_id between 38729 and 38735
and event='row cache lock';

3. To check sql, program details for sqls for particluar wait

select sql_id, PROGRAM, MODULE,count(event)
from
DBA_HIST_ACTIVE_SESS_HISTORY
where
instance_number =2 and
snap_id between 38729 and 38735
and event='row cache lock' and sql_id in ('bbacffquhrm08','b6sddsvvh6cq4')
group by sql_id, PROGRAM, MODULE order by 2 asc;

Row lock waits are related to cache_id 13, which is related to dc_sequences.

SQL_ID PROGRAM MODULE COUNT(EVENT)
————- ——- ————————
b6sdnpwzh6cq4 10535
bbarfqquhrm08 11884

4.
To check various parameters of wait events
For the sqls(captured above) that are contributing major, this will give details of the wait

select distinct sql_id, p1, p1text, p2, p2text, p3, p3text
from
DBA_HIST_ACTIVE_SESS_HISTORY
where
instance_number =2 and
snap_id between 38729 and 38735
and event='row cache lock' and sql_id in (:b1, :b2);
Wait was related to sequences.

SQL_ID P1 P1TEXT P2 P2TEXT
———- —— ——————- ———- —————————-
bbacffquhrm08 13 cache id 0 mode
b6sddsvvh6cq4 13 cache id 0 mode

Queries related to fnd_concurrent_requests were major contributor to “row cache lock” wait.

5. Now you can see what object type it is referring .

select CACHE#, PARAMETER from v$rowcache where cache#=13;

CACHE# PARAMETER
———- ——————————-
13 dc_sequences

It signifies that lock row lock waits were related to the object of SEQUENCES type.

In the same manner you can find for other objects in your case

6. Below query can be used to see other cache related information during that problem time period

select PARAMETER ,b.snap_id, b.instance_number, gets, GETMISSES, SCANS, SCANMISSES, MODIFICATIONS, FLUSHES
from
dba_hist_rowcache_summary a,
dba_hist_snapshot b
where a.SNAP_ID =b.SNAP_ID and a.instance_number=b.instance_number
--and b.instance_numer=1
and parameter like 'dc_sequences%' -- change as per your case
and
(
b.END_INTERVAL_TIME between to_timestamp('2012-12-03-12-30-00','YYYY-MM-DD-HH24-MI-SS')
and to_timestamp('2012-12-03-18-30-00','YYYY-MM-DD-HH24-MI-SS')
) order by 2 asc;

Sequence objects – row cache lock details

PARAMETER SNAP_ID INSTANCE_NUMBER GETS GETMISSES SCANS SCANMISSES MODIFICATIONS FLUSHES
dc_sequences 38730 2 1063376 418020 0 0 1063376 1063376
dc_sequences 38730 1 4019806 411951 0 0 4019806 4019806

So in my case it gave me lead that on RAC some sequences issue is there. This is already known fact but now from here I have to dig more :)

Oracle.. who is generating trace

May 12, 2012

Trace is genrally enabled by means of enabling 10046 trace ie sql_trace =TRUE
Sometime we see trace files are getting generated without enabling the trace.

Refer to metalink note SQL TRACE GENERATING TO TRACE FILES NEVER STOPS [ID 556756.1] that can help in this case

Sudden Performance Change after Upgrade

November 23, 2011

I was working on the upgrade issue from 10.2.0.4 to 11.2.0.2 where some queries were taking long time. In this case, since there were lot of intermediate releases so we had to check in which release actually the issues start coming.

Found the solution for this in the excellent blog by Coskan, where he described step by step methodology to understand and resolve these issues.

http://coskan.wordpress.com/2011/01/17/11-2-0-2-performance-stories/

This is kind of narrow down approach in which things that are causing issues are caught (and fixed for some time)and then work on them with Oracle to have proper fix

It is based on the view v$system_fix_control which contains the list of bugs fixed in particular release.

Following is the approach :

1. Get the sql_id from for which plan is changing across releases
2. Get the explain plan in all the versions(get from the v$system_fix_control) starting from the start version (say 10.2.0.4) to final version (11.2.0.2)
3. Version in which plan changes, get the list of all bug fixed in that
4. Then for that bug fixes, try to “off” one of the fixes to get the right performance
5. Followup with Oracle for that to resolve that issue in proper manner

JVM architecture simplified

July 30, 2011

Last week , I got the chance to look and to tweak some of the JVM options while tuning the Oracle report sever 11g
After doing couple of google searches, following is my understanding on this

JVM is Java Virtual Machine “actual” engine that runs and controls the java runtime enviornment

What actually a java needs while running is following:

1. Classes :- These are stored in the Permanent Generation Space
2. Objects(of the classes at runtime) :- These are stored on Heap
3. Stack space
4. JVM own structure also know as Native area( can’t be set)

and that’s it !!!!

Size wise what seems most important is Heap as it contains all the objects and that is 90% time where tuning time is concentrated

So as heap is main area, JVM runtime memory area is divided into 2 parts
Heap area & Non Heap area(Contains 1, 3 and 4)

Heap or better call it Java Heap , java architects divide in terms of human age :)
as Young generation and Old(Tenured) generation

And going further they divide Young into Eden(object birth like child birth ) and Survivor(survives at last !!!)

So in heap space object travel phase is like below like we move advance in human age!!!

———-YOUNG————————>>>———-OLD———->>>
Eden Space >>> Survivor Space >>> Tenured(Old) Space
———-YOUNG————————>>>———-OLD———->>>

Hats off to mind of architects how they got inspired by nature and human life

Heap Tuning is just related to how much space you wish to give for above spaces and this is in turn related to your application requirements

If application has large number of objects which have LARGE life span(as humans), TENURED generation space should be large.
If application has large number of objects but have SHORT life span than , YOUNG generation space should be large.

Eden is the first place where the java objects are created first time and Survivor is where objects are transfered if they
survive one or two Garbage collector(GC) phases

And if it still survives few more GC phases , they are termed as old as humans ;)

Seems like I am writing technical philposphy ..oops no more

Let have a look at some parameters that we can use to set the values

Heap related
===========

Young space related
——————–
1. -Xmn >>> n for NEW generation size
2. -XX:SurvivorRatio >>> sets the ratio between each survivor space and eden

Tenured space related
———————
3. -XX:NewRatio=3 means that 3:1 ratio of tenured and young generation

Overall space related
———————
4. -Xms >>> s stands for START. Heap start size
5. -Xmx >>> x stands for MAXIMUM. Heap Maximum size
During my testing for JVM 1.6 found that still it can increase JVM size more than specied Xmx ,will find out more on this

Most of the times setting -Xmx,-Xms, -Xmn are sufficient
If you wish to have more control of space then can use rest from the above

Non Heap related
================
6. PermGen Space: -XX:MaxPermSize
This is space where all classes(not objects, they are stored in heap) are stored at runtime

7. Stack Space : -Xss where ss stands for stack space

Example
-Xms512m -Xmx2048m -Xmn256m -Xss512k

When Oracle does not creates indexes automatically

July 11, 2011

I had simple doubt that when does Oracle create indexes in case of creation of Primary and Foreign key constraint. Other constraints do not need indexes so they are not consideration

Performed the following test case

1. Created simple table
create table t2 ( n number);

2. Verified the indexes and constraints on the table
select * from all_constraints where table_name=’T2′
>> No results as no constraints are created

select * from all_indexes where table_name=’T2′
>> No indexes created as of now

Case “Non-existence of Indexes on the columns which will have constraint condition”
3. Added Primary key constraint
alter table t2 add constraint t2_pk primary key(n);

4. Added new column to the table
alter table t2 add u number;

5. Added Unique key constraint
alter table t2 add constraint t2_uniq unique(u);

6. Checked the dba_indexes and found that two new indexes are created

It means creating unique and primary key constraints create the indexes..if the indexes are not there.

7. Tried dropping the primary key and unique index
drop index t2_pk;
drop index t2_uniq;

Above resulted in error
So Indexes are not allowed to drop as they are enforcing the constraint rules

8. Tried dropping Primary key and Unique key constraint
alter table t2 drop constraint t2_pk;
alter table t2 drop constraint t2_uniq;

Both the Constraints were removed. So checked the all_indexes view to check the indexes existing on the table

select * from all_indexes where table_name=’T2′

As expected, both the indexes were dropped !!
So Primary and Foreign key enabling indexes are automatically dropped on dropping the constraint if they are created as part of Constraint creation and does not exist earlier

Case “Existence of Indexes on the columns which will have constraint condition”

9.Created Indexes beforehand

create index t2_n_pk on t2(n);
create index t2_u_uk on t2(u);

So now 2 indexes were created …but there are no constraints
select * from all_indexes where table_name=’T2′

10. Added the Primary and Foreign key constraint
alter table t2 add constraint t2_pk primary key(n);
alter table t2 add constraint t2_uniq unique(u);

Since indexes were already there so no new were created as part of index creation.
It means that when indexes on the constraint columns already exist, Oracle does not create new indexes and use the existing indexes to enforce the constraints.

11. Tried dropping indexes, but it resulted in error as expected as they were used to enforce constraints
drop index T2_U_UK;
drop index T2_N_PK;

So it means the Oracle does not allow indexes which are used to enforce the constraints.

12. Dropped the constraint
alter table t2 drop constraint t2_pk;

Index is not dropped as it was not created as a part of the index

So Oracle creates indexes automatically during creation of Primary and Unique constraints only in case they do not exist earlier, otherwise it use the existing indexes on the constraint columns.

Oracle Fun at Friday 6PM

July 11, 2011

Last friday at 6 PM, I was to apply the patch for Oracle Forms and Reports 11.1.4.0 and then OPatch gave me an error that I have old version. So again went to Oracle’s support site to download the Oracle Opatch utility’s patch. Downloaded the patch and then fun starts :)

Sometimes, even the simple things I fails to understand. Oracle Release notes of Opatch utility mentions that you should unzip the patch’s zip file in Oracle home.  And word that misleads me was “SHOULD”. I thought as usual, that unzipping the patch in Oracle’s Home might corrupt existing installation and then I started looking at the metalink help .

At last after 1 hour search, found the another note that told that you MUST unzip it in Oracle Home  and I had a heartily laugh. OPatch utilty upgrade is one of the easiest upgrade of any Oracle product :)

So it was again a learning .. sometimes note of small things make technical life much simpler .  Will try to take more care of the “little” catches of the documentation !!

 

 

Tuning Full Table Scans

July 2, 2011

Full Table Scan: - These are generally regarded as the worst at first look of the sql plan. But they are not that bad. If the number if rows is really small that is up to 5000 rows I never mind this being happening.
For large number of rows, we have to look at the ways how to avoid it by providing proper index or making it run in parallel mode, if possible.
Few months back, in one of the application there comes a strange situation, Full table scans are necessary for some table and number of rows is around 10 million. Many batch jobs running in serial fashion were hitting this table and so the whole process was taking about 17 to 18 hours thus breaching many SLA’s.
I tried all approaches like using the hash join but not much improvement were there.
Then another idea came to increase the Block size of the table itself. And it works!!!!
From 16K we increased it to 32K, allocated new tablespace to it and setting the DB_32K_CACHE_SIZE.

What made this work..?
Well this is a case where we are trying to the Full table Scan. Here we know that number of rows is same. We can’t reduce this figure, but what we can reduce, is the I/O’s which are very costly in themselves. By putting the more number of rows in block size of larger capacity (in our case we just doubled it) the number of blocks containing the data get reduced. It leads to getting almost 1.5 times the data in single I/O fetch than the previous.

Results were not bad; the process is now running for 8-9 hours when the volume is high thus saving more than 8 hours. When the data volume is just average the whole process is getting completed in under two hours.

Just to mention I made the table to be hash joined and put the FULL hint

Oracle Interview Questions

July 2, 2011

1. Are the following are same.
select a.* from a where a.x in ( select y from b)  select a.* from a,b where a.x=b.x
Answer:- Yes both are same .
2. what are the three stages in sql statement execution
Answer:- Parsing, Execution and fetch
3. Can we put return clause in the Oracle Procedure.. In function it is there .
Answer:- Yes it can be put. It will act there as exit. The place where it is put, the execution of the procedure stops
4. Why we need cursors
Answer:- To create the looping effect on the tables..No other technique can create this effect
5. Why the global indexes are in invalid state where there is any DDL activity on thebase table.
Answer:- Since Oracle feels that while performing the DDL the whole structure is changed, it is better to put the index in invalid mode.It does not happens for the Partitoned indexes as oracle knows that changes are partition based so there should not be whole index changes.
6. Will the following work … There is something missing to make this work in parallel mode
insert /*+ append parallel(t) */ into t select /*+ parallel(t) */ from a;
Answer:- alter session enable parallel dml; alter session disable parallel dml.

7. what is the necessary and minimum condition for the number of joins in the query..
Answer:- As per the database theory minimum is n-1 join condition where is the n is the number of tables.
8. Based on above in what case i can have less than n-1 conditions and still the query is OK
Answer:- If one the tables return 1 row only then it is OK.. and the join of that table to other tables in the query can be avoided as n*1=n .. and it not leading to cartesian product issue
9. You have the option to write a logic both in PL/SQL and SQL .What you prefer and why..?
Answer:- SQL is prefered than PL/SQL as it is to avoid the context switch between SQL and PL/SQL engines thus improving performance.
10. All says SQL is non-proecedural .. why?
Answer:- It lacks the progamming constructs like looping and function and procs.That is why PL/SQL is there
11. Does a sql script stops executing if there is error in one of statements..
Answer:- NO.It will move on next statement.
12. Is it same for PL/SQL ..
Answer:- NO. It will stop procesing . that is why exception handling is there .
13. Is it possible that execution plan and the run time exectuion plan differs.and if yes.. then how to resolve it
Answer:- Yes.it can happen.. First we have to look at the stats.. also if sql design change cane be done. Last resort is adding hints
14. ORA-0155 error.. how we can avoid it..
Answer:- It is basically a scheduling issue so to run the batch and long running jobs at the off peak houers
15. It is said that pl/SQL provides bulk processing feauters…. it speed up.. but how..
Answer:- Bulk processing avoids the heavy context switch among the SQL and PL/SQL engine, thus giving huge performance gains
16.Suppose our application needs the FTS(full table scan) on the table..How can we tune the FTS                                                  Answer:- Increase the data blocksize as it will lead to getting same amount of data in less number of I/O.. Also if there is high DML activity on the table..reorganize the table to reduce the fragmenation issue in the data blocks.. this compacting the size of table by reducing number of data blocks


Follow

Get every new post delivered to your Inbox.