ORACLE SOS


My RAC node 2 went down and will not start

Posted on 2015-3-12 11:20:14
Node 1 is healthy:
[grid@rac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS      
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       rac1                                         
ora.FRA.dg
               ONLINE  ONLINE       rac1                                         
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac1                                         
ora.asm
               ONLINE  ONLINE       rac1                     Started            
ora.gsd
               OFFLINE OFFLINE      rac1                                         
ora.net1.network
               ONLINE  ONLINE       rac1                                         
ora.ons
               ONLINE  ONLINE       rac1                                         
ora.registry.acfs
               ONLINE  ONLINE       rac1                                         
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac1                                         
ora.cvu
      1        ONLINE  ONLINE       rac1                                         
ora.itsm.db
      1        ONLINE  ONLINE       rac1                     Open               
      2        ONLINE  OFFLINE                                                   
ora.oc4j
      1        ONLINE  ONLINE       rac1                                         
ora.rac1.vip
      1        ONLINE  ONLINE       rac1                                         
ora.rac2.vip
      1        ONLINE  INTERMEDIATE rac1                     FAILED OVER         
ora.scan1.vip
      1        ONLINE  ONLINE       rac1                        



Node 2 is where the problem is:
[root@rac2 ~]# crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

I tried to start it:
[root@rac2 ~]# crsctl start cluster -all
CRS-4404: The following nodes did not reply within the allotted time:
rac1, rac2
CRS-2672: Attempting to start 'ora.cssd' on 'rac2'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac2'
CRS-2676: Start of 'ora.diskmon' on 'rac2' succeeded
CRS-4705: Start of Clusterware failed on node rac2.
CRS-4000: Command Start failed, or completed with errors.


I checked the ocssd log on node 2. The nodes can ping each other:
2015-02-22 18:41:03.029: [    CSSD][2481449280]clssgmClientConnectMsg: Connect from con(0x28b9) proc(0x2b651d0) pid(3845) version 11:2:1:4, properties: 1,2,3,4,5
2015-02-22 18:41:03.029: [    CSSD][2481449280]clssgmClientConnectMsg: msg flags 0x0000
2015-02-22 18:41:03.031: [    CSSD][2481449280]clssscSelect: cookie accept request 0x2b651d0
2015-02-22 18:41:03.031: [    CSSD][2481449280]clssscevtypSHRCON: getting client with cmproc 0x2b651d0
2015-02-22 18:41:03.031: [    CSSD][2481449280]clssgmRegisterClient: proc(4/0x2b651d0), client(1/0x2b50a80)
2015-02-22 18:41:03.031: [    CSSD][2481449280]clssgmJoinGrock: global grock CRF- new client 0x2b50a80 with con 0x7f1e000028e8, requested num -1, flags 0x4000e00
2015-02-22 18:41:03.031: [    CSSD][2481449280]clssgmJoinGrock: ignoring grock join for client not requiring fencing until group information has been received from the master; group name CRF-, member number -1, flags 0x4000e00
2015-02-22 18:41:03.032: [    CSSD][2481449280]clssgmDiscEndpcl: gipcDestroy 0x28e8
2015-02-22 18:41:03.509: [    CSSD][2465671488]clssgmWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 0
2015-02-22 18:41:03.709: [    CSSD][2470410560]clssnmvDHBValidateNcopy: node 1, rac1, has a disk HB, but no network HB, DHB has rcfg 320163447, wrtcnt, 109095, LATS 41143754, lastSeqNo 109094, uniqueness 1424715417, timestamp 1426129157/97505444
2015-02-22 18:41:04.510: [    CSSD][2465671488]clssgmWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 0
2015-02-22 18:41:04.712: [    CSSD][2470410560]clssnmvDHBValidateNcopy: node 1, rac1, has a disk HB, but no network HB, DHB has rcfg 320163447, wrtcnt, 109096, LATS 41144754, lastSeqNo 109095, uniqueness 1424715417, timestamp 1426129158/97506444
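
The repeated `clssnmvDHBValidateNcopy` lines are the part of this excerpt that stands out: rac2 sees rac1's disk heartbeat but reports no network heartbeat from it. A minimal sketch for tallying those lines per node, assuming the log has been copied to a local file. The function name `missing_network_hb` is made up for illustration, and the path in the usage comment is a placeholder:

```shell
# Count "disk HB, but no network HB" messages per node in an ocssd log.
# $1: path to the log file (hypothetical; pass whatever copy you have).
missing_network_hb() {
    awk '/has a disk HB, but no network HB/ {
        # The node name is the second field after the literal word "node",
        # e.g. "... node 1, rac1, has a disk HB ..."
        for (i = 1; i < NF - 1; i++) {
            if ($i == "node") {
                name = $(i + 2)
                sub(/,$/, "", name)   # drop the trailing comma
                count[name]++
                break
            }
        }
    }
    END { for (n in count) print n, count[n] }' "$1"
}

# Usage (path is a placeholder):
#   missing_network_hb /path/to/ocssd.log
```

A count for rac1 that keeps growing would point toward the private interconnect rather than shared storage, although the log alone does not prove that.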

Then I saw that the database on node 2 was not started, so I tried to start it:
[oracle@rac2 ~]$ sqlplus / as sysdba


SQL*Plus: Release 11.2.0.4.0 Production on Sun Feb 22 18:42:59 2015


Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to an idle instance.


SQL> startup
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA/itsm/spfileitsm.ora'
ORA-17503: ksfdopn:2 Failed to open file +DATA/itsm/spfileitsm.ora
ORA-15077: could not locate ASM instance serving a required diskgroup
SQL>

I also tried to start the ASM instance on node 2. On node 1 both the database and the ASM instance are fine:


[grid@rac2 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Sun Feb 22 18:44:04 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORA-01078: failure in processing system parameters
ORA-29701: unable to connect to Cluster Synchronization Service
I can't work out the cause. Any advice would be appreciated.
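
One detail in the output above may be worth a second look. The SQL*Plus banners and the ocssd log on rac2 are stamped 2015-02-22, while the disk heartbeat lines for rac1 carry `timestamp 1426129157/...`. If the first number is a Unix epoch in seconds (an assumption; the log does not say so), it can be decoded with GNU `date`. The helper name `epoch_to_utc` is hypothetical:

```shell
# Convert a Unix epoch (seconds) to a UTC date string.
# Requires GNU date, which accepts the -d @SECONDS form.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

epoch_to_utc 1426129157   # prints 2015-03-12 02:59:17
```

Under that assumption rac1 wrote its heartbeat on 12 March, so running `date` on both nodes would show whether rac2's clock is really more than two weeks behind.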


Original poster | Posted on 2015-3-12 11:47:00
The logs are attached. Nodes 1 and 2 can ping and SSH to each other without problems.
Administrator | Posted on 2015-3-12 12:04:06
2015-02-22 07:31:49.082: [UiServer][1449077056] CS(0x7f1c5c064560)set Properties ( root,0x361e410)
2015-02-22 07:31:49.094: [UiServer][1451178304]{2:7263:43} Sending message to PE. ctx= 0x7f1c5c07c890, Client PID: 7246
2015-02-22 07:31:49.094: [UiServer][1451178304]{2:7263:43} Master is not known. Rejecting the command: 13
2015-02-22 07:31:49.470: [GIPCXCPT][1729603904] gipchaInternalResolve: failed to resolve ret gipcretKeyNotFound (36), host 'rac2', port 'bb5b-e793-981d-7626', hctx 0x2d85bf0 [0000000000000010] { gipchaContext : host 'rac2', name '84bf-806f-7ec4-8d29', luid '42b44c60-00000000', numNode 1, numInf 1, usrFlags 0x0, flags 0x5 }, ret gipcretKeyNotFound (36)
2015-02-22 07:31:49.471: [GIPCHGEN][1729603904] gipchaResolveF [gipcmodGipcResolve : gipcmodGipc.c : 806]: EXCEPTION[ ret gipcretKeyNotFound (36) ]  failed to resolve ctx 0x2d85bf0 [0000000000000010] { gipchaContext : host 'rac2', name '84bf-806f-7ec4-8d29', luid '42b44c60-00000000', numNode 1, numInf 1, usrFlags 0x0, flags 0x5 }, host 'rac2', port 'bb5b-e793-981d-7626', flags 0x0
2015-02-22 07:31:49.473: [GIPCXCPT][1729603904] gipchaInternalResolve: failed to resolve ret gipcretKeyNotFound (36), host 'rac2', port '0da4-309d-2db4-bf1b', hctx 0x2d85bf0 [0000000000000010] { gipchaContext : host 'rac2', name '84bf-806f-7ec4-8d29', luid '42b44c60-00000000', numNode 1, numInf 1, usrFlags 0x0, flags 0x5 }, ret gipcretKeyNotFound (36)
2015-02-22 07:31:49.473: [GIPCHGEN][1729603904] gipchaResolveF [gipcmodGipcResolve : gipcmodGipc.c : 806]: EXCEPTION[ ret gipcretKeyNotFound (36) ]  failed to resolve ctx 0x2d85bf0 [0000000000000010] { gipchaContext : host 'rac2', name '84bf-806f-7ec4-8d29', luid '42b44c60-00000000', numNode 1, numInf 1, usrFlags 0x0, flags 0x5 }, host 'rac2', port '0da4-309d-2db4-bf1b', flags 0x0
2015-02-22 07:31:49.668: [UiServer][1449077056] CS(0x7f1c5c03e110)set Properties ( grid,0x34f5050)
2015-02-22 07:31:49.680: [UiServer][1451178304]{2:7263:44} Sending message to PE. ctx= 0x7f1c5c03fdd0, Client PID: 3652
2015-02-22 07:31:49.680: [UiServer][1451178304]{2:7263:44} Master is not known. Rejecting the command: 14
2015-02-22 07:31:49.795: [   CRSPE][1453279552]{2:7263:2} Join request has been processed by the Master.

Try pinging rac2, and please also post your hosts file.

Original poster | Posted on 2015-3-12 12:08:53
Below is the output from rac2. rac1 can also be pinged, and SSH works too.
[grid@rac2 ~]$ ping rac2
PING rac2.localdomain (192.168.0.106) 56(84) bytes of data.
64 bytes from rac2.localdomain (192.168.0.106): icmp_seq=1 ttl=64 time=0.037 ms
64 bytes from rac2.localdomain (192.168.0.106): icmp_seq=2 ttl=64 time=0.037 ms
^C
--- rac2.localdomain ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.037/0.037/0.037/0.000 ms
[grid@rac2 ~]$ ping rac2-priv
PING rac2-priv.localdomain (192.168.1.106) 56(84) bytes of data.
64 bytes from rac2-priv.localdomain (192.168.1.106): icmp_seq=1 ttl=64 time=0.039 ms
^C
--- rac2-priv.localdomain ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.039/0.039/0.039/0.000 ms
[grid@rac2 ~]$ ping rac2-vip
PING rac2-vip.localdomain (192.168.0.110) 56(84) bytes of data.
64 bytes from rac2-vip.localdomain (192.168.0.110): icmp_seq=1 ttl=64 time=5.34 ms
^C
--- rac2-vip.localdomain ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.342/5.342/5.342/0.000 ms
[grid@rac2 ~]$ cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1                localhost.localdomain localhost
::1                localhost6.localdomain6 localhost6
192.168.0.105   rac1.localdomain        rac1
192.168.0.106   rac2.localdomain        rac2
# Private
192.168.1.105   rac1-priv.localdomain   rac1-priv
192.168.1.106   rac2-priv.localdomain   rac2-priv
# Virtual
192.168.0.109   rac1-vip.localdomain    rac1-vip
192.168.0.110   rac2-vip.localdomain    rac2-vip
# SCAN
192.168.0.11   scan.localdomain        scan
[grid@rac2 ~]$
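
The pings above were all run from rac2 against rac2's own names, so they do not exercise the private path to rac1-priv. A small sketch that pulls every `-priv` alias out of a hosts file and pings each one, so that the same check covers both ends of the interconnect when run on each node. The function names are illustrative, and a successful ping only shows ICMP reachability, not that cluster heartbeat traffic gets through:

```shell
# Print the short alias (last column) of each hosts entry ending in "-priv".
# $1: hosts file to read; defaults to /etc/hosts.
priv_names() {
    awk '!/^[[:space:]]*#/ && NF >= 2 && $NF ~ /-priv$/ { print $NF }' \
        "${1:-/etc/hosts}"
}

# Ping every private name twice and report the result.
check_interconnect() {
    for h in $(priv_names "$1"); do
        if ping -c 2 -W 2 "$h" >/dev/null 2>&1; then
            echo "$h reachable"
        else
            echo "$h UNREACHABLE"
        fi
    done
}

# Usage, on rac1 and then on rac2:
#   check_interconnect /etc/hosts
```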
