How to identify long running SQL and kill them
From OraWiki
Contents |
Author
| (c) 2009 by Steven of Oz | ||
|---|---|---|
| Author | ||
| Name | Gerald Roehrbein | |
| http | www.OraForecast.com | |
| Gerald.Roehrbein@OraForecast.com | ||
| Release | 0.9 | |
| Created | 2009.03.11 | |
| Last update | 2009.03.11 | |
| Changes | Started article | |
| Estimated duration of installation | a few seconds | |
| License | Freeware | |
Purpose
- How to identify long running SQL and kill such sessions with a regular scheduled job
Generic Oracle 10.2.0.3 - AIX
Sometimes there are long running SQL statements, never ending because they hit a bug. This article describes how to identify and kill the sessions related to this SQL using a database job.
One Query with very long waits for "db file sequential read" up to a few seconds or minutes for one wait. No IO problem. Query used "IN" clause and "subselect".
To identify long running SQL use Fast Small Oracle Remote Analyzer function:
fsora all_session_waits fsora session <sid from all_session_waits>
Installation
Job definition:
DECLARE
X NUMBER;
BEGIN
SYS.DBMS_JOB.SUBMIT
( job => X
,what => 'sys.kill_long_running;'
,next_date => to_date('11.03.2009 11:16:55','dd/mm/yyyy hh24:mi:ss')
,interval => 'SYSDATE+60/1440 '
,no_parse => TRUE
);
SYS.DBMS_OUTPUT.PUT_LINE('Job Number is: ' || to_char(x));
END;
/
commit;
Stored procedure to identify long running SQL's and kill their server process:
CREATE OR REPLACE procedure SYS.kill_long_running is
CURSOR CUR
IS
SELECT A.SID SID, B.SERIAL# SERIAL# , A.EVENT "EVENT" /*, P1, P2,P3 */,
substr(
substr (b.sql_fulltext,1,20) || '...'||
substr (b.sql_fulltext,
instr (upper(b.sql_fulltext),'FROM'),
5+ (instr (upper(b.sql_fulltext),'WHERE')-
instr (upper(b.sql_fulltext),'FROM')
)),1,60) "SQL",
b.program "PROGRAM", to_char(b.logon_time,'dd.mm.yy hh24:mi:ss') "LOGON"
FROM V$SESSION_WAIT A,
(
SELECT
S.SID , S.SERIAL# serial#, t.sql_text sql_fulltext, s.program, s.logon_time
FROM V$SQLAREA T, V$SESSION S, V$PROCESS P
WHERE
S.PADDR = P.ADDR(+)
and t.hash_value=s.sql_hash_value
and t.address=s.sql_address
ORDER BY 1
) B
WHERE
A.SID=B.SID
AND A.EVENT NOT LIKE '%message%'
and b.sql_fulltext like '<sql to search for>'
and (sysdate-b.logon_time)*24*60*60>=<seconds allow to run>
order by 1;
r cur%ROWTYPE;
ddl_cur INTEGER;
result INTEGER;
sqlstmt varchar2(100);
begin
open cur;
loop
FETCH cur INTO r;
exit when cur%NOTFOUND;
sqlstmt:='alter system kill session '''||r.sid||','||r.serial#||'''';
dbms_output.put_line(sqlstmt);
ddl_cur:=DBMS_SQL.OPEN_CURSOR;
DBMS_SQL.PARSE(ddl_cur,sqlstmt,DBMS_SQL.V7);
result:=DBMS_SQL.EXECUTE(ddl_cur);
DBMS_SQL.CLOSE_CURSOR(ddl_cur);
end loop;
close cur;
end;
You have to change code above and replace <sql to search for> with the SQL the procedure have to search for and replace <seconds allow to run> with the max number of seconds the SQL have to run.
To use something like this script should be an exception because this is a "bad" workaround for a problem with not simple solution. In 20 years it was required to use something like this once!
I required this solution to fix a long running SQL which hits a bug inside an application which could not be changed easily. The SQL consumed masses of CPU ressources. To kill it was a solution which supported 99% of the applications functions but the one killed not.
This method also helped to solve a performance problem because the solution stopped CPU consuming Oracle server sessions. After implementation an 8 CPU's server with a load of 40 was 90% idle and 99% of the user said that the performance of their application rapidly increased.
Sometimes there are very expensive ressource consumers. In an IBM LPAR CPU costs are high and Oracle licence costs are an interesting item too! Eight 3 Ghz. CPU's multiplied with 10.000-20.000 Euro, multiplied with number of standby systems is just a high amount of money which could be saved! You can invest in hardware or in performance experts. A good performance expert is less expensive. I bet!
Usage
- Example
Recommendations
- None
Example(s)
Security related issues
- None.
How to make this feature as secure as possible
- None.
Source code of this extension
- See above
Planned enhancements
- Will follow
Discussions
The reason to implement this was an SQL which hits an unknown Oracle bug.
fsora session <sid> produces a report like this:
This report show long waits for "sequential read" but the IO of the system was perfect. To have a workaround until the software vendor changed the SQL solution described in this article was implemented. The real solution was to rewrite the SQL to get same result with better response time.
The averages of "sequential read" are fine but if you look in the right lower corner than you see that system waits since 6 seconds for one (!!!) sequential read and this goes up to 60 seconds. Interesting thing here is that table and indexes performed pretty good but when used in this special query not. It was'nt possible to find further facts generating a session trace.
The real problem seems to be inside the database engine and the symptome shown by trace or V$SESSION_EVENT analysis seems to be false. I've never seen such an effect the last 15 years analyzing thousands of Oracle database so this is really an absolute mystic problem.
The SQL which had the problem used an "IN" clause with a subselect.
- If you need further information or assistance with performance tuning or PL/SQL development feel free to contact mailto:performance@oraforecast.com
History
- 2009.03.11 Start of documentation
Known bugs
- Reported Description of problem Fixed Release
- 2009.03.11 No issues known
Running example
- Not required

