Thinking Out Loud

July 19, 2012

Know Filter Data

Filed under: 11g,oracle,performance — mdinh @ 2:12 am

I got a request on how to improve performance for table update with 33,617,240 rows running for 2.5 hours as shown:

UPDATE a
SET a.status =
(CASE WHEN a.dt1 IS NOT NULL THEN ‘PROCESSED’ WHEN a.dt2 IS NULL THEN NULL END);

32935613 rows updated.
Elapsed: 02:23:39.85

As written, 32,935,613 of 33,617,240 rows updated.

This is a good demonstration on how filtering early and understanding the data is a HUGE benefit.

If status is already ‘PROCESSED’, then is it necessary to update?
select count(*) from a where dt1 IS NOT NULL and status<>’PROCESSED’;

COUNT(*)
———-
14032

If status is already NULL, then is it necessary to update?
select count(*) from a where dt2 IS NULL and status IS NOT NULL;

COUNT(*)
———-
572

The answer to both questions is NO and we don’t want to peform blanket update.

The SQL can be written as:

UPDATE (
SELECT /*+ PARALLEL(a,4) FULL(a) */
status,
(CASE
WHEN dt1 IS NOT NULL THEN ‘PROCESSED’
WHEN dt2 IS NULL THEN NULL
END) new_status
FROM a
WHERE (dt1 IS NOT NULL AND status <> ‘PROCESSED’) OR (dt2 IS NULL AND status IS NOT NULL)
)
SET status=new_status;

14604 rows updated.
Elapsed: 00:04:43.04

The result is only 14,604 versus 32,935,613 rows updated.
Huge difference in performnace.

Without hints: 47s direct path read + 28s CPU
With hints: 17s direct path read + 13s CPU

Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: