Thinking Out Loud

June 25, 2011

fuser – identify users of files and devices

Filed under: solaris — mdinh @ 4:22 am

I learned about fuser when I had deleted 64GB file from /tmp but the space was never reclaimed.

This shows you alert log is being opened in another session using vi.

> ll
total 28992
drwxr-xr-x  25 root     root        8192 Oct 23  2010 ..
-rw-r--r--   1 root     sys           60 Oct 23  2010 csn.6249
-rw-r--r--   1 root     root          76 Oct 23  2010 iconf_entries.6185
drwxrwxrwt   2 root     root         117 Oct 29  2010 .oracle
drwxrwxrwt   4 root     sys          536 Jun 24 21:05 .
-rw-r-----   1 oracle   dba      14785325 Jun 24 21:05 alert_dbtest.log

> ps -afe|grep alert_dbtest.log
  oracle 11589 16520   0 21:05:58 pts/8       0:00 grep alert_dbtest.log
  oracle 10555 19200   0 21:05:47 pts/6       0:01 vi alert_dbtest.log

I am now going to delete alert log file.

> rm alert_dbtest.log

> ll
total 112
drwxr-xr-x  25 root     root        8192 Oct 23  2010 ..
-rw-r--r--   1 root     sys           60 Oct 23  2010 csn.6249
drwxrwxrwt   2 root     root         117 Oct 29  2010 .oracle
drwxr-xr-x   2 oracle   dba          117 Nov  1  2010 hsperfdata_oracle
drwxrwxrwt   4 root     sys          463 Jun 24 21:06 .

Who’s got the file opened?

> ps -o pid,args -p "$(fuser /tmp 2>/dev/null)"
  PID COMMAND
10555 vi alert_dbtest.log
16520 -bash
19200 -bash

> kill -9 10555

Note:  tested on Solaris 10

Great reference here as well Less known Solaris Features: fuser

Advertisements

June 15, 2011

Have you seen ? CONSTRAINT_TYPE ?

Filed under: 11g,oracle — mdinh @ 4:00 am

When I was testing DBMS_REDEFINITION to convert a non-partitioned table to a partitioned table, I came across constraint_type “?” from TOAD.

WTH! I have must have done something wrong.

It turns out that TOAD used the following SQL to determine constraint_type and does not label non-constraints:

SELECT CN.NAME constraint_name, c.type#, decode(c.type#, 1, 'Check', 2, 'Primary Key', 3, 'Unique Key',
       4, 'Referential Integrity', 5, 'Check Option on a View', 6, 'Read Only Option on a View', 7,'Check', '?') constraint_type
FROM   SYS.CDEF$ C, SYS.CON$ CN, SYS.OBJ$ O, SYS.USER$ U, SYS.CON$ RC, SYS.USER$ RU, SYS.OBJ$ RO
WHERE  C.CON# = CN.CON#
AND    C.OBJ# = O.OBJ#
AND    O.OWNER# = U.USER#
AND    C.RCON# = RC.CON#(+)
AND    RC.OWNER# = RU.USER#(+)
AND    C.ROBJ# = RO.OBJ#(+)
AND    U.NAME = 'SCOTT'
AND    O.NAME = 'SH'
AND    c.type# not in (8, 12)
order by 1;

Notice how TOAD purposely left out  cdef$.type# 8 & 12, but forgot about the others.

Querying dba_contraints shows different results:

SQL> select CONSTRAINT_NAME,CONSTRAINT_TYPE from dba_constraints where OWNER='SCOTT' and TABLE_NAME='SH';

no rows selected

SQL>

Here’s how I was able to find the definition for the various cdef$.type#

From ORCLE_HOME/rdbms/admin –

> grep 'cdef\$' *.sql
catmeta.sql:          oi.obj# = cdef$.enabled and
catmeta.sql:          bitand(cdef$.defer,8) = 8 and                  /* system generated */
catmeta.sql:          cdef$.type# = 3 and

This hinted me to look at catmeta.sql and found this

  type_num      number,                                  /* constraint type: */
                            /* 1 = table check, 2 = primary key, 3 = unique, */
                             /* 4 = referential, 5 = view with CHECK OPTION, */
                                                 /* 6 = view READ ONLY check */
               /* 7 - table check constraint associated with column NOT NULL */
                                   /* 8 - hash expressions for hash clusters */
                                         /* 9 - Scoped REF column constraint */
                                    /* 10 - REF column WITH ROWID constraint */
                                  /* 11 - REF/ADT column with NOT NULL const */
                                 /* 12 - Log Groups for supplemental logging */
                                    /* 14 - Primary key supplemental logging */
                                     /* 15 - Unique key supplemental logging */
                                    /* 16 - Foreign key supplemental logging */
                                     /* 17 - All column supplemental logging */

Here is a little testing I did to confirm  cdef$.type# = 17 (All column supplemental logging)

SQL> SELECT CN.NAME constraint_name, c.type#, decode(c.type#, 1, 'Check', 2, 'Primary Key', 3, 'Unique Key',
  2         4, 'Referential Integrity', 5, 'Check Option on a View', 6, 'Read Only Option on a View', 7,'Check', '?') constraint_type
  3  FROM   SYS.CDEF$ C, SYS.CON$ CN, SYS.OBJ$ O, SYS.USER$ U, SYS.CON$ RC, SYS.USER$ RU, SYS.OBJ$ RO
  4  WHERE  C.CON# = CN.CON#
  5  AND    C.OBJ# = O.OBJ#
  6  AND    O.OWNER# = U.USER#
  7  AND    C.RCON# = RC.CON#(+)
  8  AND    RC.OWNER# = RU.USER#(+)
  9  AND    C.ROBJ# = RO.OBJ#(+)
 10  AND    U.NAME = 'SCOTT'
 11  AND    O.NAME = 'SH'
 12  AND    c.type# not in (8, 12)
 13  order by 1;

no rows selected

SQL> alter table scott.sh add supplemental log data (all) columns;

Table altered.

SQL> SELECT CN.NAME constraint_name, c.type#, decode(c.type#, 1, 'Check', 2, 'Primary Key', 3, 'Unique Key',
  2         4, 'Referential Integrity', 5, 'Check Option on a View', 6, 'Read Only Option on a View', 7,'Check', '?') constraint_type
  3  FROM   SYS.CDEF$ C, SYS.CON$ CN, SYS.OBJ$ O, SYS.USER$ U, SYS.CON$ RC, SYS.USER$ RU, SYS.OBJ$ RO
  4  WHERE  C.CON# = CN.CON#
  5  AND    C.OBJ# = O.OBJ#
  6  AND    O.OWNER# = U.USER#
  7  AND    C.RCON# = RC.CON#(+)
  8  AND    RC.OWNER# = RU.USER#(+)
  9  AND    C.ROBJ# = RO.OBJ#(+)
 10  AND    U.NAME = 'SCOTT'
 11  AND    O.NAME = 'SH'
 12  AND    c.type# not in (8, 12)
 13  order by 1;

CONSTRAINT_NAME                     TYPE# CONSTRAINT_TYPE
------------------------------ ---------- --------------------------------
SYS_C0010094                           17 ?

SQL>

Guess I can close the SR now.

June 11, 2011

Instrumentation versus Monitoring

Filed under: oracle — mdinh @ 2:47 pm

Please help share advice for my constant battle with instrumentation versus monitoring.

The requirements:

  • The response time analysis and alerting should be automated such that when a process has executed, it should know the timing of the previous execution.
  • If the response time for the current execution exceeds the prior by a defined percentage threshold, then send alert.

Example:

There are many java processes with 10,000+ executions against Oracle database with a normal response of 1 sec.

If the process normally completes in 1 seconds and now takes 5 seconds (exceeding threshold of of 4x), then alert needs to be sent.

Would instrumentation be better?

If there is no available resource to instrument the code, then would monitoring suffice?

My preference would be instrumentation and I would appreciate a different point of view.

Operating System provides method for instrumentation:

> time du -sh /oracle/
 3.6T   /oracle

real    0m0.456s
user    0m0.006s
sys     0m0.065s

Oracle provides method for instrumentation and here is my test case:

SQL> declare
  2   l_sla_warning  constant NUMBER := 10; -- response time in hsecs
  3   l_sla_failed   constant NUMBER := 20; -- response time in hsecs
  4   l_start  NUMBER := dbms_utility.get_time;
  5   l_date   DATE;
  6   l_timing NUMBER;
  7  begin
  8    for x in 1..10000 loop
    select sysdate into l_date from dual;
  9   10    end loop;
 11    l_timing := dbms_utility.get_time - l_start;
 12    case
 13     when l_timing > l_sla_failed then
          dbms_output.put_line('FAILED! Elapsed time in hsecs '||l_timing||' exceeds sla_failed of '||l_sla_failed);
 14   15      when l_timing > l_sla_warning then
 16       dbms_output.put_line('WARNING! Elapsed time in hsecs '||l_timing||' exceeds sla_warning of '||l_sla_warning);
        else null;
 17   18    end case;
end;
 19   20  /
++++++++++ FAILED! Elapsed time in hsecs 34 exceeds sla_failed of 20 ++++++++++ (manually added + for emphasis)

PL/SQL procedure successfully completed.

SQL> declare
  2   l_sla_warning  constant NUMBER := 10; -- response time in hsecs
  3   l_sla_failed   constant NUMBER := 20; -- response time in hsecs
  4   l_start  NUMBER := dbms_utility.get_time;
  5   l_date   DATE;
  6   l_timing NUMBER := 0;
  7  begin
  8    for x in 1..100 loop
  9      select sysdate into l_date from dual;
 10    end loop;
  l_timing := dbms_utility.get_time - l_start;
 11   12    case
 13     when l_timing > l_sla_failed then
 14       dbms_output.put_line('FAILED! Elapsed time in hsecs '||l_timing||' exceeds sla_failed of '||l_sla_failed);
 15      when l_timing > l_sla_warning then
 16       dbms_output.put_line('WARNING! Elapsed time in hsecs '||l_timing||' exceeds sla_warning of '||l_sla_warning);
 17     else null;
 18    end case;
 19  end;
 20  /

PL/SQL procedure successfully completed.

SQL> declare
  2   l_sla_warning  constant NUMBER := 10; -- response time in hsecs
  3   l_sla_failed   constant NUMBER := 20; -- response time in hsecs
  4   l_start  NUMBER := dbms_utility.get_time;
  5   l_date   DATE;
  6   l_timing NUMBER;
  7  begin
  8    for x in 1..5000 loop
    select sysdate into l_date from dual;
  9   10    end loop;
 11    l_timing := dbms_utility.get_time - l_start;
 12    case
        when l_timing > l_sla_failed then
 13   14          dbms_output.put_line('FAILED! Elapsed time in hsecs '||l_timing||' exceeds sla_failed of '||l_sla_failed);
 15      when l_timing > l_sla_warning then
 16       dbms_output.put_line('WARNING! Elapsed time in hsecs '||l_timing||' exceeds sla_warning of '||l_sla_warning);
 17     else null;
 18    end case;
end;
 19  / 20
++++++++++ WARNING! Elapsed time in hsecs 18 exceeds sla_warning of 10  ++++++++++ (manually added + for emphasis)

PL/SQL procedure successfully completed.

SQL>

I have received a recommendation to use ASH (Oracle’s Diagnostic & Tuning Pack feature)

June 2, 2011

Alert Log Monitoring using ADRCI

Filed under: 11g,oracle — mdinh @ 1:48 am

This has been tested on: SunOS 5.10 Generic_142909-17 sun4u sparc SUNW,SPARC-Enterprise

If my memory serves me correctly, you will need to change “ps -eo args” for Linux and other OS accordingly.

Note: db_unique_name = lax_${sid} (primary) / san_${sid} (standby), which you might have already guessed.

Here is the script:

#!/bin/sh

. /home/oracle/.common.conf > /dev/null

DN=`dirname $0`
BN=`basename $0`
EXCEPTIONS=$SCRIPT_DIR/alert_exceptions

DCP=lax # primary data center
DCS=san # standby data center
for sid in `ps -eo args|grep ora_smon|grep -v grep|awk -F_ '{print $3}'`
do
  OUTF=$LOG/${sid}.alert.log
  ORACLE_SID=$sid
  . oraenv
  PRIM=diag/rdbms/${DCP}_${sid}/$sid
  STBY=diag/rdbms/${DCS}_${sid}/$sid
  if [ -r $ORACLE_BASE/$PRIM ]; then
    DIAG=$PRIM
    DB=${DCP}_${sid}
  else
    if [ -r $ORACLE_BASE/$STBY ]; then
      DIAG=$STBY
      DB=${DCS}_${sid}
    fi
  fi
  adrci > $OUTF << EOF   
  set home ${DIAG}   
  SHOW ALERT -term -P "MESSAGE_TEXT like 'ORA-%' and originating_timestamp > systimestamp-1/1440"
  exit
EOF
  if [ `egrep '^ORA-' $OUTF|egrep -cvf $EXCEPTIONS` != 0 ]; then
    mailx -r oracle@domain.com -s "OERR: $BN $DB" $EMAIL < $OUTF
  fi
done
exit

Here is an example of email notification:

From: oracle@domain.com
Sent: Wednesday, June 01, 2011 1:43 PM
To: dba
Subject: OERR: monitor_alert.sh dw02

ADRCI: Release 11.2.0.2.0 - Production on Wed Jun 1 13:43:01 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

ADR base = "/u01/app/oracle"
adrci> adrci>
ADR Home = /u01/app/oracle/diag/rdbms/lax_dw02/dw02:
*************************************************************************
2011-06-01 13:42:10.542000 -07:00
ORA-1652: unable to extend temp segment by 4 in tablespace                 TEMP
ORA-1652: unable to extend temp segment by 4 in tablespace                 TEMP
adrci>

UPDATE: There was an error in the original cut & paste. Improved script.

Create a free website or blog at WordPress.com.