批量错误用户名与密码导致业务用户HANG住(library cache lock)
数据库版本
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for HPUX: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
问题如下
SQL> conn doudou/oracle (HANG住了)
查看等待事件
select
count(*),
CASE WHEN state != 'WAITING' THEN 'WORKING'
ELSE 'WAITING'
END AS state,
CASE WHEN state != 'WAITING' THEN 'On CPU / runqueue'
ELSE event
END AS sw_event
FROM
v$session_wait
GROUP BY
CASE WHEN state != 'WAITING' THEN 'WORKING'
ELSE 'WAITING'
END,
CASE WHEN state != 'WAITING' THEN 'On CPU / runqueue'
ELSE event
END
ORDER BY
1 DESC, 2 DESC
/
library cache lock WAITING 585
rdbms ipc message WAITING 16
Space Manager: slave idle wait WAITING 3
jobq slave wait WAITING 2
Streams AQ: waiting for time management or cleanup tasks WAITING 1
VKRM Idle WAITING 1
smon timer WAITING 1
Streams AQ: qmn coordinator idle wait WAITING 1
pmon timer WAITING 1
Streams AQ: qmn slave idle wait WAITING 1
DIAG idle wait WAITED KNOWN TIME 1
DIAG idle wait WAITING 1
library cache lock WAITED KNOWN TIME 1
VKTM Logical Idle Wait WAITING 1
asynch descriptor resize WAITED SHORT TIME 1
SQL*Net message from client WAITING 1
结合等待事件去分析
1.library cache lock 等待严重,另一方面考虑只有单独的这个业务用户doudou不能登录,其他业务类型的用户doudou01不受任何影响。再次怀疑可能是11g 密码延迟机制导致的这个问题。
2.然后查看了一下用户修改密码的时间
select * from sys.user$ where name='DOUDOU';
PTIME=2013/11/6 11:22:09 --PTIME is the date the password was last changed
CTIME=2013/11/6 11:22:09 --CTIME is the date the user was created
从这里可以看出我们DOUDOU用户,没有修改过密码,但是为什么会出现大量的library cache lock,没有修改密码,但是新业务配置的用户密码会不会有错误呢,这样询问了开发人员,原来他们的配置有错误,用户密码配置错误了。也就是错误的用户和密码批量请求导致了大量的library cache lock。
搜索MOS找到了类似的案例
Library Cache Locks Due to Invalid Login Attempts (Doc ID 1309738.1)
Cause
Numerous failed logins attempts can cause row cache lock waits and/or library cache lock waits.
Set the below event in the spfile or init.ora file and restart the database:
alter system set event ="28401 TRACE NAME CONTEXT FOREVER, LEVEL 1" scope=spfile;
or
EVENT="28401 TRACE NAME CONTEXT FOREVER, LEVEL 1"
3.问题解决,正确的用户密码配置之后,并设置参数EVENT="28401 TRACE NAME CONTEXT FOREVER, LEVEL 1",大量的library cache lock逐渐减少,最后消除。新业务也正常使用了
附表
user$ 视图解释
Test cases below show:
•CTIME is the date the user was created.
•LTIME is the date the user was last locked. (Note that it doesn't get NULLed when you unlock the user).
•PTIME is the date the password was last changed.
•LCOUNT is the number of failed logins.
记录用户登录失败触发器:
CREATE OR REPLACE TRIGGER logon_denied_to_alert
AFTER servererror ON DATABASE
DECLARE
message VARCHAR2(168);
ip VARCHAR2(15);
v_os_user VARCHAR2(80);
v_module VARCHAR2(50);
v_action VARCHAR2(50);
v_pid VARCHAR2(10);
v_sid NUMBER;
v_program VARCHAR2(48);
BEGIN
IF (ora_is_servererror(1017)) THEN
-- get ip FOR remote connections :
IF upper(sys_context('userenv', 'network_protocol')) = 'TCP' THEN
ip := sys_context('userenv', 'ip_address');
END IF;
SELECT sid INTO v_sid FROM sys.v_$mystat WHERE rownum < 2;
SELECT p.spid, v.program
INTO v_pid, v_program
FROM v$process p, v$session v
WHERE p.addr = v.paddr
AND v.sid = v_sid;
v_os_user := sys_context('userenv', 'os_user');
dbms_application_info.read_module(v_module, v_action);
message := to_char(SYSDATE, 'YYYYMMDD HH24MISS') ||
' logon denied from ' || nvl(ip, 'localhost') || ' ' ||
v_pid || ' ' || v_os_user || ' with ' || v_program || ' – ' ||
v_module || ' ' || v_action;
sys.dbms_system.ksdwrt(2, message);
END IF;
END;
/