Sunday, April 29, 2007

Job security - not looking bad recently did a survey on fastest growing jobs (across all areas not just IT) and published the results here.

Database Administrator job shows up at #12..which is not bad at all and especially considering the ratio of developers-to-DBA positions are always something like 25:1 , this is not bad at all..

check out the metrics..
Number in 2004: 104,000
Number in 2014: 144,000 <-- 40000 new positions in 10 years..that is about 4000/year!!
Education required: Bachelor's degree

Number 5 looks very promising too..One more area where developers beat DBAs :)

look at the metrics
Number in 2004: 460,000
Number in 2014: 682,000 <-- 222,000 new positions in 10 years

Well well , the future looks very bright atleast for the next 5 years..Unless Oracle buys over "Pizza-hut" as part of their ongoing strategic acquisition/alliance :)

Saturday, April 21, 2007

I am a developer..Why should I care about REDO?

Thats the sort of mentality most people have in Oracle Development team..The code (SQL or PLSQL or PSP or forms) should work as expected.why should a developer worry about redo? And isnt that anyways a DBA's job?

All perfect questions and assumptions..Lets see some facts abt Redo.

Redo logs are physical structures(files) of database that captures all the changes (similar to Transaction Logs in Sqlserver) and provides guarantee that even if your database crashes next minute right after your transaction is completed, you can be ensured to recover them back.

So how does it work?
Put it in very simple way..think you have a database supporting 100 applications..say 100 different may have data stored across say 100 datafiles (just an example) since database doesnt which files are critical and which files are not..(all are datafiles..whether an application is used by 1 user or 1 million users..a change to data is a change to data!) saving changes to all 100 files every single second is next to impossible (& unwise) task. So what does oracle do? Oracle manages changes to data (to datafiles) at its own when the user changes a row, the data may or may not get into actual datafile.. the changed data for all you know may just be in RAM (volatile memory) and if the box crashes/reboots the contents are gone..(just like - if you have stored something in buffer by using Ctrl+C in windows and restart the machine..your buffer data is lost)

so how is the data guaranteed to recover back? Thats where redo log comes in play..Oracle keeps writing all changes to data (like what row..old value..etc) to an area in SGA called Log buffer..the moment a transaction is committed, instead of writing into data files, log buffer contents are written into redo log files.

okay , so why should a developer be concerned?
Well this has something to do with Oracle's internals..Say your DBA allocated 3 redo files..Oracle will write to #1..when it gets full..move to #2..when that gets full..move on to #3..and when thats full..move onto ..where?

it has to overwrite #1..but if it overwrites #1 logfile contents, how are you doing to get the data if crash happens..Are you thinking? Oracle just does the same thing!!! It starts thinking ..halting everything else going on at the waits for another process called archiver, to take a copy of Redolog#1 to some safe area before it can overwrite the logfile.

Fine..enuf with internals..still why should developer be worried abt redo?

the answer to that question now is very simple..A DBA can only allocate enough number of redo log files to keep it going trouble-free..but its the statements or code which generates its the code written by developers which is root cause of redo generated..Oracle is great in parallelism..horizontal scaling ..multi-user support..all we have heard from marketing folks..but if there one place where its gets really into a bottleneck situation is "REDO".

so think about it..say a developer writes a statement which generates 2K bytes of redo data more than optimal/tuned statement..more such developers = more such statements; more users executing this application = more times the same statement will have to rexecute. potentially generating gobs of redo data at the database..more redo data = more redologs filled in at database..bringing the database to a halt each time Oracle has to wait for archiver to kick-in and back up the logfile before overwriting it.. do I know how much redo my statements are generating..

go back to autotrace..and run the statement with trace would see a statistic showing up as "redo size"..It cant better any easier than that..

And believe me..the DBA can only see if the system is running out of log files and patch it more..DBA doesnt know which statement is causing such excessive redo (althought the session can be traced and then sql..99% that never happens..because the situation will be more of fire-fighting mode..when the database hangs, DBA cannot walk to developer's cube and ask "how dare you spoiled my day?")

So think about it this way..the less redo you (as a developer) generate..the more
you are doing a favor to yourself and your DBA(s) too.. sort of like making his/her life peaceful..If you really work hard on getting keeping your code's redo size down, your DBA better realize your efforts and start treating you a free dinner or thank-you session at a bar.

thats also explains how a developer can easily mess up with DBAs weekend,if needed..write some bad code and put it as batch job to start sharp at Friday 10PM..your database will hang and DBAs get called/paged automatically..what a simple way to ruin their weekend -huh?

Whoever said Developers dont have rights to challenge DBAs - is utterly wrong. :)

Wednesday, April 11, 2007

What you want is most likely coded and available already

Agreed, developers have this passion to code..anything and everything..

I have been to many meetings with Java folks (fun..always fun!!:) ) wherein they start the meeting with "we can do this with really cool JMS implementation ..and built a hash table for this UI and a entity bean here ...another public class with business logic here.." and every other possible term which would make me wonder "Am I really in the same software field as these folks?"..And then after everyone sort of settles down I usually start with - "everything sounds really cool..but what are we trying to do here..Can I know the problem first?" :)

Okay now, back to the topic what I wanted to Blog..lets forget Java developers ..How abt Oracle backend Programmers? Time and again I see they love to code and implement something which is already there..

Here is something I recently encountered at a client place.."We wanted to dump this table data into csv format so that we can send it to xyz this is our PL/SQL program to do that"

The program looks like

- open a file in write mode
- declare a cursor
- open the cursor
- fetch one row at a time to variable
- exit no more data to fetch
- write last the row fetched to file
- move the pointer to next line in file
- go back to fetching step and fetch the next row
- close the cursor

- catch exceptions etc..

the cursor was returning a resultset of 200K rows and this was taking about 2 hours to complete, when it was shown to me..

so whats wrong in the logic? Nothing ..absolutely..except that the programmer didnt realize there is much simpler way and that comes part of Oracle software - which could saved all the coding efforts..

okay is my test table (testtab) and consider if I want to simply dump the data to csv format..

SQL> l
1* select * from test_tab
SQL> /

--- - -----
1 B 1xxxx
2 C 2xxxx
3 A 3xxxx
4 B 4xxxx
5 C 5xxxx

6 A 6xxxx
7 B 7xxxx
8 C 8xxxx
9 A 9xxxx

Thats how the data looks..
All I have to do is change my sql setting to have a unique way to differenciate column seperator( which is by default set to space) other than data.

SQL> set colsep '","' <-- that does what I wanted

SQL> l

1* select '"'|| id,t.code,||'"' from test_tab t

<-- change the query a little bit to add the double quotes to front and end of every row

SQL> spool c:\Venkat\blog\ <-- redirects the output to file

SQL> /

ID ","C","T.DATA
"1 ","B","1xxxx"
"2 ","C","2xxxx"
"3 ","A","3xxxx"
"4 ","B","4xxxx"
"5 ","C","5xxxx"
"6 ","A","6xxxx"

"7 ","B","7xxxx"
"8 ","C","8xxxx"
"9 ","A","9xxxx"
SQL> spool off

-- now open the file and trim off the unnecessary lines from top and bottom (or you can use the sqlplus commands to suppress them)

Open the file ( in Excel and it prompts you with Delimited or Fixed space question.. Pick delimited option and click on next, it would ask for delimiter/seperator used.. and I clicked on Others and entered a comma and also the text qualifier is set to double quotes

and thats all about whats step , Excel converts the data and shows data in Spreadsheet format..


You want to talk performance..?
- here its plain SQL..dumping the text
- NO PLSQL involved
- no row by row FILE operation

What the plsql program was doing in 2 hours was done in less than 5 minutes with simple and straight sql..


Check out sqlplus reporting functionality.. sometimes a simple "break on column skip 1" and TTITLE combination can generate better reports than what a crystal report or BI product out there is doing with repeated queries..

Sometimes its not all about just solving..solving it the efficient way too

Okay time to sign off now

( I had an interesting assignment to work with hibernate last week..and I am confident that Oracle tuning is nowhere close to dying..The options hibernate allows and the mess it creates - Gotto love that product for the job security it creates for Database Tuners :) )

Sunday, April 1, 2007

Software development - a continuous learning process

Its interesting to read this recent post in one of my favourite sites.

I cant agree more - as I get to learn most of new stuff only when I am given a task..dont remember the last time I read something from documentation and understood on my first read.

and about the academic qualification - again a big nod from me.if someone goes back to my college and disclose that I am working as Oracle consultant, I am sure my lecturer would bet any money that you are lying..Because I hardly sat in any sessions..and lab sessions (wherein you have to program and show the results) were really allergic to me..:)..I had a big green book which was titled BDP (Business Data Processing I think)..dont remember how many times I read that..but couldn't understand anything after first page .:)

but this part really made me feel better..
"Start a blog. Pick a writing schedule and stick with it; I recommend once a week at minimum. Select a general theme for your blog and write on topics related..."

I m glad I am atleast advancing in the battlefield in the right direction..but I guess I have work on my timing schedule..

Actually I was working on posting a new blog on Index Organized tables(aka IOTs)..but I am not really convinced with my own test scripts..looking for a better/close-to real-world example to post that.

Catch you in next