当前位置:  数据库>oracle

双节点RAC各个节点主机频繁自动重启故障解决

    来源: 互联网  发布时间:2017-04-25

    本文导语: 1)         背景介绍: 最近在vmware中搭建了一个Oracle10g RAC的双节点实验平台并将oracle RAC从10.2.0.1升级到10.2.0.5,后来发现两台linux经常自动重启;  2)         平台信息:vmware7 + OEL5.7X64 + ASMLib2.0 + ORACLE10.2.0.5 3)       ...

1)         背景介绍:

最近在vmware搭建了一个Oracle10g RAC的双节点实验平台并将oracle RAC从10.2.0.1升级到10.2.0.5,后来发现两台linux经常自动重启; 
 
2)         平台信息
vmware7 + OEL5.7X64 + ASMLib2.0 + ORACLE10.2.0.5
 
3)         /var/说明 iis7站长之家/message日志
Ø NODE1:Linux1
Apr 18 20:44:18 Linux1 syslogd 1.4.1: restart.
Apr 18 20:44:18 Linux1 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Apr 18 20:44:18 Linux1 kernel: Initializing cgroup subsys cpuset
Apr 18 20:44:18 Linux1 kernel: Initializing cgroup subsys cpu
Apr 18 20:44:18 Linux1 kernel: Linux version 2.6.32-200.13.1.el5uek (mockbuild@ca-build9.us.oracle.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)) #1 SMP Wed Jul 27 21:02:33 EDT 2011
Apr 18 20:44:18 Linux1 kernel: Command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet
Apr 18 20:44:18 Linux1 kernel: KERNEL supported cpus:
Apr 18 20:44:18 Linux1 kernel:   Intel GenuineIntel
Apr 18 20:44:18 Linux1 kernel:   AMD AuthenticAMD
Apr 18 20:44:18 Linux1 kernel:   Centaur CentaurHauls
Apr 18 20:44:18 Linux1 kernel: BIOS-provided physical RAM map:
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000000ca000 - 00000000000cc000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000000dc000 - 00000000000e4000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000000e8000 - 0000000000100000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 0000000000100000 - 00000000bfef0000 (usable)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000bfef0000 - 00000000bfeff000 (ACPI data)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000bfeff000 - 00000000bff00000 (ACPI NVS)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000bff00000 - 00000000c0000000 (usable)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000fffe0000 - 0000000100000000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 0000000100000000 - 0000000140000000 (usable)
Apr 18 20:44:18 Linux1 kernel: DMI present.
Ø NODE2:Linux2
Apr 18 20:43:35 Linux2 kernel: o2net: connection to node Linux1 (num 0) at 192.168.3.131:7777 has been idle for 30.0 seconds, shutting it down.
Apr 18 20:43:35 Linux2 kernel: (swapper,0,0):o2net_idle_timer:1498 here are some times that might help debug the situation: (tmr 1334752985.559806 now 1334753015.306532 dr 1334752985.559360 adv 1334752985.559806:1334752985.559807 func (b651ea27:504) 1334752951.27068:1334752951.27323)
Apr 18 20:43:35 Linux2 kernel: o2net: no longer connected to node Linux1 (num 0) at 192.168.3.131:7777
Apr 18 20:43:56 Linux2 kernel: o2net: connection to node Linux1 (num 0) at 192.168.3.131:7777 shutdown, state 7
Apr 18 20:44:05 Linux2 kernel: (o2net,3480,0):o2net_connect_expired:1659 ERROR: no connection established with node 0 after 30.0 seconds, giving up and returning errors.
Apr 18 20:44:24 Linux2 avahi-daemon[4341]: Registering new address record for 192.168.0.136 on eth0.
Apr 18 20:44:26 Linux2 kernel: o2net: connection to node Linux1 (num 0) at 192.168.3.131:7777 shutdown, state 7
Apr 18 20:44:28 Linux2 last message repeated 2 times
Apr 18 20:44:28 Linux2 kernel: (o2hb-9938799A41,3564,1):o2dlm_eviction_cb:267 o2dlm has evicted node 0 from group 9938799A418642218A66FE77029DE473
Apr 18 20:44:28 Linux2 kernel: (ocfs2rec,19793,1):ocfs2_replay_journal:1605 Recovering node 0 from slot 0 on device (8,65)
Apr 18 20:44:30 Linux2 kernel: o2net: connection to node Linux1 (num 0) at 192.168.3.131:7777 shutdown, state 8
Apr 18 20:44:31 Linux2 kernel: (ocfs2rec,19793,0):ocfs2_begin_quota_recovery:407 Beginning quota recovery in slot 0
Apr 18 20:44:31 Linux2 kernel: (ocfs2_wq,3567,1):ocfs2_finish_quota_recovery:598 Finishing quota recovery in slot 0
Apr 18 20:44:31 Linux2 kernel: (dlm_reco_thread,3573,0):dlm_get_lock_resource:836 9938799A418642218A66FE77029DE473:$RECOVERY: at least one node (0) to recover before lock mastery can begin
Apr 18 20:44:31 Linux2 kernel: (dlm_reco_thread,3573,0):dlm_get_lock_resource:870 9938799A418642218A66FE77029DE473: recovery map is not empty, but must master $RECOVERY lock now
Apr 18 20:44:31 Linux2 kernel: (dlm_reco_thread,3573,0):dlm_do_recovery:523 (3573) Node 1 is the Recovery Master for the Dead Node 0 for Domain 9938799A418642218A66FE77029DE473
以上信息在两台机器中会交换出现,说明并不是总是固定的一台机器对另外一台超时
 
 
4)         根据message信息报错,应该是o2cb的idle时间超限导致的,系统中O2CB服务状态为:
[oracle@Linux1]service o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Stack glue driver: Loaded
Stack plugin "o2cb": Loaded
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 301
 Network idle timeout: 30000                            /此处单位为毫秒,正式message中报的30秒
 Network keepalive delay: 2000
 Network reconnect delay: 2000
Checking O2CB heartbeat: Active


































































    
 
 
 
本站(WWW.)旨在分享和传播互联网科技相关的资讯和技术,将尽最大努力为读者提供更好的信息聚合和浏览方式。
本站(WWW.)站内文章除注明原创外,均为转载、整理或搜集自网络。欢迎任何形式的转载,转载请注明出处。












  • 相关文章推荐
  • 二叉树常用算法(求总节点个数和叶子节点个数)
  • 请教:如何在一个节点上利用另一个节点上的编译器呢?
  • Hadoop 1.2.1 单节点安装(Single Node Setup)步骤
  • 紧急求助:如何动态修改某个选中jTree节点的节点名称。
  • Hadoop介绍及最新稳定版Hadoop 2.4.1下载地址及单节点安装
  • 给定链表中间节点指针,删除中间节点的方法
  • jquery 取子节点及当前节点属性值
  • jquery 取子节点及当前节点属性值的方法
  • 递归删除一个节点以及该节点下的所有节点示例
  • jQuery获取节点和子节点文本的方法
  • jQuery如何获取节点与子节点文本?
  • 急救,请问如何得到jTree中被选中节点的父节点的path或者row?
  • DevExpress获取节点下可视区域子节点集合的实现方法
  • 请教Swing高手,如何在JTree中如何通过一个TreePath判断它代表的节点是不是叶节点?
  • 如何将一个树的全部节点都展开?(包括子目录下面的节点),最好有源代码!谢谢了!
  • MySQL递归查询树状表的子节点、父节点具体实现
  • c#操作xml的代码(插入节点、修改节点、删除节点等)
  • 如何在一个DOM中更新一个节点或者一部分节点
  • i节点的问题
  • RAC +GPFS添加节点的问题~~~~~~~~~~~~·
  • jquery的父子兄弟节点查找示例代码


  • 站内导航:


    特别声明:169IT网站部分信息来自互联网,如果侵犯您的权利,请及时告知,本站将立即删除!

    ©2012-2021,