Tuesday, March 27, 2007

HTML help..anyone?

I confess - being a database person is a lot easier than being a web designer..I can come up with Oracle content easily, but it's so hard to make this blog wider..I have looked through every single template and changed them so many times..but still can't get the main section (body/subbody, whatever it's called) to expand..

I would typically like it not to eat up the spaces by itself - so that sql code would look a little more readable..

I even opened up the template code (HTML)..all the width entries match some px value (pixels?)..I thought widths were usually given in characters..(the last time I checked HTML tags was around 5 years ago :) )

anyway, if anyone is willing to make this blog wider & more readable, you are more than welcome..
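In the meantime, here is the kind of change I suspect is needed (just a sketch - the div names and px values below are common in Blogger templates, but may well be different in mine):

```css
/* widen the whole page and the post column - the ids and the
   pixel values here are guesses; check your own template's markup */
#outer-wrapper { width: 980px; }  /* overall page width */
#main-wrapper  { width: 700px; }  /* the post/body column */
```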

Partitioning Headaches - constrained with constraints

Okay, slight change in thought..the follow-up from my dev team didn't happen as much as I expected, probably because the past posts were very primitive..but the good feedback and suggestions from other circles made me think I should turn this into more of my scratch pad..anything I find interesting gets posted here..and that may have nothing to do with development or tuning..(hopefully that makes a few of my DBA friends happy - the ones who didn't like the tone of my previous posts where I had mentioned "..keep your DBA away" :) ).

Now here is something new I learnt while troubleshooting SQL performance at a client site..They have this huge table, range partitioned by date, with local indexes etc...but the sql (I think it was a MERGE statement) was taking forever.


Now before we jump into what the problem is..here is the setup script if you want to validate it (I did this test on 10gR1 and 10gR2)..

-- create partitioned table
-- ------------------------
dmo>ed
Wrote file afiedt.buf

1 Create table test_partition
2 ( trans_date timestamp,
3 trans_id number ,
4 trans_info varchar2(10)
5 )
6 partition by range(trans_date)
7 ( partition p1 values less than (to_date('01-01-2006','DD-MM-YYYY')),
8 partition p2 values less than (to_date('01-01-2007','DD-MM-YYYY')),
9 partition pn values less than (maxvalue)
10* )
dmo>/

Table created.

dmo>ed
Wrote file afiedt.buf

1 Create unique index test_part_id_idx on
2* test_partition(trans_date,trans_id) local
dmo>/

Index created.

dmo>alter table test_partition add (
2 constraint test_partition_pk primary key(trans_date,trans_id)
3 using index test_part_id_idx
4 );

Table altered.

-- load data into partition p1
--
dmo>ed
Wrote file afiedt.buf

1 Insert /*+ APPEND */
2 into test_partition
3 select
4 to_date('04-05-2005','DD-MM-YYYY'),rownum,'junk here'
5 from all_objects a, all_objects b
6* where rownum<99999
dmo>/

99998 rows created.

dmo>commit;

Commit complete.

-- load data into partition p2
--
dmo>Insert /*+ APPEND */
2 into test_partition
3 select
4 to_date('04-05-2006','DD-MM-YYYY'),rownum,'junk here'
5 from all_objects a, all_objects b
6 where rownum<99999
dmo>/

99998 rows created.

dmo>commit;

Commit complete.

-- load data into last partition pn
--

dmo>Insert /*+ APPEND */
2 into test_partition
3 select
4 to_date('04-05-2009','DD-MM-YYYY'),rownum,'junk here'
5 from all_objects a, all_objects b
6 where rownum<99999
dmo>/

99998 rows created.

dmo>commit;

Commit complete.


-- Now let's gather stats and make sure they get populated

dmo>ed
Wrote file afiedt.buf

1 begin
2 dbms_stats.gather_table_stats (
3 ownname=>USER,
4 tabname=>'TEST_PARTITION',
5 cascade=>TRUE
6 );
7* end;
dmo>/

PL/SQL procedure successfully completed.

dmo>ed
Wrote file afiedt.buf

1 select partition_name,NUM_ROWS,blocks from user_tab_partitions
2* where table_name='TEST_PARTITION'
dmo>/

PARTITION_NAME NUM_ROWS BLOCKS
------------------------------ ---------- ----------
P1 99998 403
P2 100644 403
PN 99998 403

So far, everything looks good.

Now let's query and watch the execution plan output

-- query1: get data from partition p2
-- ----------------------------------
dmo>select * from test_partition
2 where trans_date=to_date('04-05-2006','DD-MM-YYYY')
3 /

99998 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1369430197


-------------------------------------------------------------------
| Id | Operation | Name | Rows |Pstart| Pstop |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| | |
| 1 | PARTITION RANGE SINGLE| | 100K| 2 | 2 |
|* 2 | TABLE ACCESS FULL | TEST_PARTITION | 100K| 2 | 2 |
-------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("TRANS_DATE"=TIMESTAMP'2006-05-04 00:00:00')

-- query2
-- ------

dmo>ed
Wrote file afiedt.buf

1 select * from test_partition
2* where trans_date=to_date('04-05-2005','DD-MM-YYYY')
dmo>/

99998 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1369430197

--------------------------------------------------------------------
| Id | Operation | Name | Rows | Pstart| Pstop |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 99998 | | |
| 1 | PARTITION RANGE SINGLE| | 99998 | 1 | 1 |
|* 2 | TABLE ACCESS FULL | TEST_PARTITION | 99998 | 1 | 1 |
--------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("TRANS_DATE"=TIMESTAMP'2005-05-04 00:00:00')

So the Oracle optimizer seems to be coming up with the right estimates..Everything is good so far..

Now the DB analysis team wants to add a constraint to make sure no one enters a date with a time component (remember, a DATE column includes a time component in Oracle) or sub-second units (if the column is a TIMESTAMP)..instead, only the date alone should be stored in the trans_date column..

so what do they do? add a simple check constraint to make sure all data in trans_date are plain dates with a 00:00:00 time component.

-- introduce a check constraint
-- ----------------------------

dmo>ed
Wrote file afiedt.buf

1 alter table test_partition add
2 (
3 constraint do_date_check check (trans_date=trunc(trans_date))
4* )
dmo>/

Table altered.

Now rerun the query and check the execution plan (note the estimated cardinality)

-- check the plan again
-- ---------------------
dmo>select * from test_partition newsql
2 where trans_date=to_date('04-05-2006','DD-MM-YYYY')
3 /

99998 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1369430197

-------------------------------------------------------------------
| Id | Operation | Name | Rows |Pstart| Pstop |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1006 | | |
| 1 | PARTITION RANGE SINGLE| | 1006 | 2 | 2 |
|* 2 | TABLE ACCESS FULL | TEST_PARTITION | 1006 | 2 | 2 |
-------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter(TRUNC(INTERNAL_FUNCTION("TRANS_DATE"))=TIMESTAMP'2006-05-04 00:00:00' AND
"TRANS_DATE"=TIMESTAMP'2006-05-04 00:00:00')

See how the estimate went wrong? Big time? (from ~100K rows down to ~1000 rows) - which is way, way off the mark.


There is no fix yet (at least in 10g)..hopefully this will get fixed in version 11.x..I just happened to discover this odd behavior while troubleshooting some other sql performance issue..and I was so glad I learnt something new that day.
(At the client site I added a hint to make it work)
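I won't reproduce the client's sql here, but as a sketch of the kind of band-aid I mean - pinning the access path the optimizer would have picked with the correct estimate, using our demo table and its local index:

```sql
-- sketch only: the INDEX hint forces the local index access path,
-- side-stepping the broken cardinality estimate from the constraint
select /*+ INDEX(t test_part_id_idx) */ *
  from test_partition t
 where trans_date = to_date('04-05-2006','DD-MM-YYYY')
   and trans_id between 1000 and 2000;
```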

so watch out if you have both partitions and a check constraint on the partitioning key column..You may be getting hit somewhere without knowing it.

now to summarise..I will run a test sql with and without the constraint and let you see the explain plan results..(and if one sql joins many other tables, one wrong cardinality estimate can change the whole plan..change the join orders..enough to make all sorts of mess)


-------
Test sql:
--------
dmo>ed
Wrote file afiedt.buf

1 select * from test_partition newsql
2 where trans_date=to_date('04-05-2006','DD-MM-YYYY')
3* and trans_id between 1000 and 2000
dmo>

Version 1: With Check constraint Enabled:
-----------------------------------------

-----------------------------------------------------------------------
| Id | Operation | Name | Rows |
-----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 |
| 1 | PARTITION RANGE SINGLE | | 10 |
| 2 | TABLE ACCESS BY LOCAL INDEX ROWID| TEST_PARTITION | 10 |
|* 3 | INDEX RANGE SCAN | TEST_PART_ID_IDX | 3 |
-----------------------------------------------------------------------

Version 2: With Check constraint Disabled:
-------------------------------------------
dmo>alter table test_partition disable constraint do_date_check;

Table altered.

run the sql here

-----------------------------------------------------------------------
| Id | Operation | Name | Rows |
-----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1009 |
| 1 | PARTITION RANGE SINGLE | | 1009 |
| 2 | TABLE ACCESS BY LOCAL INDEX ROWID| TEST_PARTITION | 1009 |
|* 3 | INDEX RANGE SCAN | TEST_PART_ID_IDX | 336 |
-----------------------------------------------------------------------


Tomorrow is a busy day, and so are Thursday and Friday..Hopefully I will find something interesting to blog about over the weekend.

Bye until next...:)





Monday, March 19, 2007

"I could swear I created an index..but Oracle doesn't seem to USE it all of a sudden.."

okay, so now we know what LIOs mean and why to take them into consideration while tuning..

now moving on to the Oracle community's famous belief - "Full table scans are evil..Indexes are super good"

I will save myself a lot of textual explanation and just get into the copy & paste business..it's easy, plus you will get used to this blog's style (no theory where it's not necessary)..

just reread the topic title again and think how many times you have said that yourself or heard it from your fellow team members..

-- okay, we have the same table we created before..
-- table with ~ 100K rows..
-- lets say you (or a developer) wanted to get all rows from the table whose id value is
-- greater than 100 (it probably wouldn't make sense to filter on a
-- sequence id like that..but just for this demo, think of it as
-- some meaningful column like salary or basic pay or something)

so here comes the query (with plan and statistics listed below)
---------------------------------------------------------------


DEMO@orcl>select * from test_table1 where id>100
2 /

49780 rows selected.

Elapsed: 00:00:00.57

Execution Plan
----------------------------------------------------------
Plan hash value: 2372207557

---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 49782 | 583K| 33 (7)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST_TABLE1 | 49782 | 583K| 33 (7)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("ID">100)


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
3438 consistent gets
1 physical reads
0 redo size
1118460 bytes sent via SQL*Net to client
36879 bytes received via SQL*Net from client
3320 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
49780 rows processed

Guess what happens next? Our developer freaks out..just short of calling 911 and complaining that the dreadful full table scan is giving him/her a heart attack..

Then our senior developer (or a DBA) comes in with a big smile and says "Buddy, you know..Oracle always does things like that..this is when you use hints..I have been fixing such stupid Oracle mistakes for the last n years..".

What follows is a quick google search for 'Oracle+hints+in+sql+index+use', a quick scan of the top 5 articles, and a quick grasp of which hint should be used..and within the next 10-15 minutes, the same query gets rewritten with an Index hint like the one below

Senior Developer rewrites the query and exhibits his/her sql expertise with hints
--------------------------------------------------------------------

DEMO@orcl>ed
Wrote file afiedt.buf

1 select /*+ INDEX(T TEST_TABLE_UIDX) */ -- thats the new hint added
2* * from test_table1 t where id>100
DEMO@orcl>/

49780 rows selected.

Elapsed: 00:00:00.57

Execution Plan
----------------------------------------------------------
Plan hash value: 3929903905

-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 49782 | 583K| 238 (2)| 00:00:03 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST_TABLE1 | 49782 | 583K| 238 (2)| 00:00:03 |
|* 2 | INDEX RANGE SCAN | TEST_TABLE_UIDX | 49782 | | 113 (2)| 00:00:02 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("ID">100)


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
6857 consistent gets
0 physical reads
0 redo size
1317523 bytes sent via SQL*Net to client
36879 bytes received via SQL*Net from client
3320 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
49780 rows processed

Looking at the explain plan, both the developer and the senior developer now breathe easy and smile..the most dreadful "full table scan" has disappeared, replaced by an "Index Scan".


But in reality what happened was..

Operation     Blocks read (LIOs)
---------     ------------------
Full Table    3438
Index Scan    6857

so the tuning team just managed to double the workload, without realizing that Oracle was indeed doing the right thing before..


so next time you see a full table scan, first check whether the index is in place..if it's there but Oracle is still doing a full table scan, it's time to check what percentage of the table's data you are fetching.

Remember, an index lookup happens block by block (slow)..whereas Oracle performs a table scan in bulk (16 or 32 blocks at a time, depending on the db_file_multiblock_read_count parameter setting)..so a table scan can be a lot faster than an index lookup..
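If you are curious what the multiblock read size is on your instance, you can check the parameter (assuming you have the privileges to see it):

```sql
-- number of blocks Oracle reads per I/O during a full table scan
show parameter db_file_multiblock_read_count

-- or query it directly:
select value from v$parameter
 where name = 'db_file_multiblock_read_count';
```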


I can already hear it.."ok dude, enough!!! Just state your final conclusion..will ya?"..so here it is
  • Full table scans are not ALWAYS bad (there are perfectly valid reasons and scenarios for Oracle to switch to a full table scan and ignore an existing index.)
  • Hints are meant for (temporary) fixes..don't show off your expertise by googling up and adding 10-15 unheard-of hints..Instead think about why Oracle is not doing what you think it should do..and use autotrace to compare your way and Oracle's way..Validate which option performs fewer LIOs.
  • Next time anyone in your team suggests an Index hint to avoid a full table scan..don't look at the time..look at the LIOs of the two versions..The more you talk about blocks and LIOs, the more your DBA fears to come close to you :)



Catch you in next ..until then "Happy Sql"-ing..:)

Sunday, March 18, 2007

What do you mean, "do less work"?

"don't go by time, look at the actual work done (less is better)"
so what does that mean? how do we know how much work a SQL generates at the server end?

the answer is AUTOTRACE..the feature comes built into SQL*Plus, and all you have to do is turn it ON when you need it.

SQL> SET AUTOTRACE ON EXP STAT
SQL> run your query (you will see the query results, followed by the explain plan, followed by some more info, which we call "Statistics")

Yes, it's as simple as that.

when you are done.. Switch it off by
SQL> SET AUTOTRACE OFF
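For reference, SQL*Plus supports a few flavors of autotrace, depending on how much output you want:

```sql
SET AUTOTRACE ON            -- query results + explain plan + statistics
SET AUTOTRACE ON EXPLAIN    -- results + plan only
SET AUTOTRACE ON STATISTICS -- results + statistics only
SET AUTOTRACE TRACEONLY     -- suppress the result rows; show plan + statistics
```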
--

okay now, lets step into tuning world...

this blog post is intended to explain how to see the workload of any SQL using autotrace, how to interpret it, and how to compare two different versions of a SQL doing the same work and pick the better-performing version..


I will be posting the scripts here..(most of my tests are done on my laptop database - 10gR2 running on Windows)


step-1: Create a test table
----------------------------

DEMO@orcl>ed
Wrote file afiedt.buf

1 Create table test_table1 as
2 select cast(rownum as number(6)) id,
3 chr(65+mod(rownum,3)) code,
4 rpad(rownum,5,'x') data
5* from all_objects where rownum<99999
DEMO@orcl>/

Table created.

Notes: this table has about 100K rows..
  1. id column is unique (Primary Key candidate)
  2. code column - is more like some status..contains 'A' or 'B' or 'C'
  3. data column - some junk data

step-2: Table is ready..so lets query
-------------------------------------
DEMO@orcl>set autotrace traceonly exp stat -- start tracing
DEMO@orcl>set timing on
-- shows timer
DEMO@orcl>select * from test_table1 where id=1000;

-- a sample query..to retrieve one row from that table of almost 100K rows


Elapsed: 00:00:00.17
-- that's the timer info..displayed in "Hours:Minutes:Seconds.Hundredths" format


Execution Plan
----------------------------------------------------------
Plan hash value: 2372207557

---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 38 | 33 (7)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST_TABLE1 | 2 | 38 | 33 (7)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("ID"=1000)

Note
-----
- dynamic sampling used for this statement

** The explain plan shows a FULL TABLE SCAN on the table...**

Statistics
----------------------------------------------------------
88 recursive calls
0 db block gets
196 consistent gets
123 physical reads
0 redo size
521 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

-- and that's the stats info we wanted to display..
-- here is a simple formula:
-- total work = db block gets + consistent gets (the unit is Oracle blocks)

-- so for this sql -
we did 0+196 = 196 block reads (let's call it ANSWER-A)
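By the way, if you want to see the same counters without autotrace, one way (assuming your DBA has granted you select on V$MYSTAT and V$STATNAME) is to query your own session's statistics directly:

```sql
-- db block gets + consistent gets = LIOs for your own session
select n.name, s.value
  from v$mystat s, v$statname n
 where s.statistic# = n.statistic#
   and n.name in ('db block gets', 'consistent gets', 'session logical reads');
```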


step-3: And why did Oracle read so many blocks to fetch just 1 row (the row with id=1000)?
-------------------------------------------------------------------

the answer is simple..we know we are asking for 1 row..but Oracle doesn't know how many rows the table has with id=1000..it could be 0, or 1, or many, or even all of them..

so Oracle searched the table from top to bottom (aka a full table scan) and found the matching rows..

let's help Oracle by creating an index (and letting it know more about the data inside the table)

DEMO@orcl>Create index test_table_uidx on test_table1(id)
2 tablespace users;

Index created.

Elapsed: 00:00:00.21

-- we created an index on id column

DEMO@orcl>begin

2 dbms_stats.gather_table_stats(ownname=>USER,
3 tabname=>'TEST_TABLE1',cascade=>TRUE);
4 end;
5 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.82

-- we gathered some stats on the table and index..don't worry about this step for now if you are a developer (this is more of a DBA job and you can always blame it on the DBA :)..As a developer, I always enjoyed blaming my DBAs and giving them a hard time, so let's maintain the same spirit here :) )


step-4: Okay, now Index is ready too..so whats the big difference?
------------------------------------------------------------------

DEMO@orcl>select * from test_table1 where id=1000;

-- rerunning the same query from step 2

Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
Plan hash value: 3929903905

-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST_TABLE1 | 1 | 12 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TEST_TABLE_UIDX | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

-- hey, it shows it used the new index created

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("ID"=1000)


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
525 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed


-- so let's compute the work done now..
-- work done = 0+4 (blocks) = 4 Oracle blocks
(let's call it ANSWER-B)

--
now it's child's play..all we have to do is compare ANSWER-A to ANSWER-B and decide which is better

196 blocks or 4 blocks? You decide !!! (yes, I am not good at Math :) )


so whats the final say here??
-----------------------------
two things..
- never go by time..check above..we got the answers in step-2 and step-4 at almost the same time..both versions of the sql took less than 1 second..and if you had simply gone by the time factor, you wouldn't have even started tuning the sql in the first place (0.17 seconds is pretty darn good)..
- Next time anyone talks about SQL tuning, start asking how much work the SQL does..
and does that sound like layman's language? okay, let's add some Oracle touch to it and make it sound like a pro's question.. How about asking "Well, how many LIOs does that SQL perform...?"

LIOs = Logical I/Os = db block gets + consistent gets



Secret of SQL tuning: the more you focus on reducing LIOs, the faster you become a smart performance tuner..and the good thing is, there are NOT many in the field who seem to know this little trick..:)






So what's there in "SQL Tuning", apart from simply staring at the query for half a day?

here is my first question, from one of my colleagues..(thanks much for getting me started)


"
General and basic idea about how to do tuning, what are the methods available, when to do full table scan ?,
when to use RULE hint, what are the things to keep in mind when working with high volume data.
How often we should do "analyze". Writing efficient SQL, like when to use where clauses, what not to use, etc.....
"

------------------------


General & basic idea -
  • don't go by time, look at the actual work done (less is better)
  • Look at waits (fewer waits = more scalable)
  • don't start with the mindset - "My query should be using an index..because indexes are always good"

Methods available to tune -
  • Explain plan shows what path was chosen by Optimizer to execute the sql
  • Autotrace - shows the actual workload (how many blocks read, Physical IOs required, Sorts done etc)
  • SQL_TRACE - the actual database process is traced and all performance metrics are dumped into a trace file (on the database server)
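As a quick sketch of the SQL_TRACE route (using our demo table; the trace file name mentioned after the code is made up - yours will differ):

```sql
-- trace your own session, run the sql, then stop tracing
alter session set sql_trace = true;

select * from test_table1 where id = 1000;

alter session set sql_trace = false;
```

Then, on the database server, format the raw trace file with tkprof, e.g. tkprof orcl_ora_1234.trc report.txt (file name made up - look in the user_dump_dest directory for yours).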
When to use hints -
Preferred answer: never (in an ideal Oracle world)
Practical answer: when you know the optimizer is not generating the right plan, for one of the reasons below
1. statistics insufficient to tell the optimizer the correct data distribution
2. when you have a better (functionality-wise) idea about the data, which the optimizer could never figure out from stats
3. incorrect init parameters set by the DBA (which make the optimizer always lean towards an unbalanced execution plan)
4. quick fix - when you know how the optimizer should behave..but don't have a clue, or enough time, to investigate why it's not doing so by default
5. after upgrades, if any sql is not working the optimal way - possibly due to bugs


It would take too long to get into each of the sub-items listed above and explain what they mean..(too much theory is always boring to me)..so let's leave that aside for a while and get into some real ACTION..

Hang on with your seat belts fastened..

More in next

PS: I know, not everything in the question (RULE hint, analyze etc) was answered..we will cover those in later posts

Intro...

Hello..
okay, one more Oracle blog added to the million already out there..

Well the reasons why I came up with this blog are...

  • a) it is accessible from everywhere
  • b) it's easier to track comments (if any post happens to get follow-up comments)
  • c) no identity: you don't have to reveal yourself to ask a question, or even to technically dispute anything posted here. Being anonymous will, I think, make open discussion more effective.
Okay, now before I start with technical posts, one clarification..
I intend to keep this blog as simple and understandable as possible..without getting too far into expert level..I would like to keep it fun and readable for developers and starters..so please don't think up some extreme case and reply "well, your example won't work..because if this, this and this happen, yours won't work".. Other than that, I am open to any comments..feel free to correct me if you think I am wrong anywhere..

yes, I do make mistakes and don't claim to be an expert or anything..In fact, I am just like anyone else who bumps into new things in Oracle and learns something every day..Well, to me, the Oracle database changes so much every day and with every version that there cannot really be one expert who knows everything..so let's all learn and benefit from sharing