11.2 RAC: CRS will not start after the ownership of /u01 is changed: attempting a repair with rootcrs.pl -init

Contact: QQ (5163721)

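Author: Lunar © All rights reserved [This article may be reproduced, but the source address must be credited with a link; otherwise legal action will be pursued.]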

Reproducing the scenario that breaks the node:

[root@lunardb01 grid]# chown -R oracle:oinstall /u01
[root@lunardb01 grid]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@lunardb01 grid]# ps -ef|grep d.bin
root     27170     1  6 19:27 ?        00:00:01 /u01/app/11.2.0/grid/bin/ohasd.bin reboot
grid     27400     1  3 19:27 ?        00:00:00 /u01/app/11.2.0/grid/bin/oraagent.bin
root     27609 19818  0 19:27 pts/1    00:00:00 grep d.bin
[root@lunardb01 grid]# ps -ef|grep d.bin
root     27170     1  5 19:27 ?        00:00:01 /u01/app/11.2.0/grid/bin/ohasd.bin reboot
grid     27400     1  2 19:27 ?        00:00:00 /u01/app/11.2.0/grid/bin/oraagent.bin
root     27621 19818  0 19:27 pts/1    00:00:00 grep d.bin
[root@lunardb01 grid]# ps -ef|grep d.bin
root     27170     1  1 19:27 ?        00:00:01 /u01/app/11.2.0/grid/bin/ohasd.bin reboot
grid     27400     1  0 19:27 ?        00:00:00 /u01/app/11.2.0/grid/bin/oraagent.bin
root     28150 19818  0 19:28 pts/1    00:00:00 grep d.bin
[root@lunardb01 grid]# 
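
Instead of re-running ps and reading the list by eye, the check can be scripted. The following is only a minimal sketch and not part of the original procedure: missing_daemons is a hypothetical helper, and the list in EXPECTED_DAEMONS is an assumption based on the daemons a typical 11.2 stack starts, so adjust it for your installation.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print every
# expected Clusterware daemon that does not appear in it.
# ASSUMPTION: this daemon list reflects a typical 11.2 stack.
EXPECTED_DAEMONS="ohasd.bin mdnsd.bin gpnpd.bin gipcd.bin ocssd.bin octssd.bin evmd.bin crsd.bin"

missing_daemons() {
    ps_output=$(cat)
    for d in $EXPECTED_DAEMONS; do
        # Anchor on "/bin/" so that crsd.bin does not match ocssd.bin, etc.
        if ! printf '%s\n' "$ps_output" | grep -q "/bin/$d"; then
            echo "$d"
        fi
    done
}
```

It would be called as `ps -ef | missing_daemons`; an empty result means every daemon in the assumed list is present.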

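As shown, CRS no longer comes up: only ohasd.bin and oraagent.bin are running. The background logs report the following errors: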

-----Errors from ohasd:
2014-10-04 19:27:27.643: [   CRSPE][1148361024] {0:0:2} RI [ora.mdnsd 1 1] new internal state: [STARTING] old value: [STABLE]
2014-10-04 19:27:27.643: [   CRSPE][1148361024] {0:0:2} Sending message to agfw: id = 223
2014-10-04 19:27:27.644: [   CRSPE][1148361024] {0:0:2} CRS-2672: Attempting to start 'ora.mdnsd' on 'lunardb01'

2014-10-04 19:27:27.644: [    AGFW][1137854784] {0:0:2} Agfw Proxy Server received the message: RESOURCE_START[ora.mdnsd 1 1] ID 4098:223
2014-10-04 19:27:27.644: [    AGFW][1137854784] {0:0:2} Creating the resource: ora.mdnsd 1 1
2014-10-04 19:27:27.644: [    AGFW][1137854784] {0:0:2} Initializing the resource ora.mdnsd 1 1 for type ora.mdns.type
2014-10-04 19:27:27.644: [    AGFW][1137854784] {0:0:2} SR: acl = owner:grid:rw-,pgrp:oinstall:rw-,other::r--,user:grid:rwx
2014-10-04 19:27:27.645: [   CRSPE][1148361024] {0:0:2} ICE has queued an operation. Details: Operation [START of [ora.gpnpd 1 1] on [lunardb01] : local=0, unplanned=00x2aaab00c68f0] cannot run cause it needs W lock for: WO for Placement Path RI:[ora.mdnsd 1 1] server [lunardb01] target states [ONLINE ], locked by op [START of [ora.mdnsd 1 1] on [lunardb01] : local=0, unplanned=00x2aaab00b72e0]. Owner: CRS-2683: It is locked by 'SYSTEM' for command 'Resource Autostart : lunardb01'
-----Errors from crsd:
2014-10-04 19:26:23.937: [ CRSCOMM][1158867264][FFAIL] Ipc: Couldnt clscreceive message, no message: 11
2014-10-04 19:26:23.938: [ CRSCOMM][1158867264] Ipc: Client disconnected.
2014-10-04 19:26:23.938: [ CRSCOMM][1158867264][FFAIL] IpcL: Listener got clsc error 11 for memNum. 1
2014-10-04 19:26:23.938: [ CRSCOMM][1158867264] IpcL: connection to member 1 has been removed
2014-10-04 19:26:23.938: [CLSFRAME][1158867264] Removing IPC Member:{Relative|Node:0|Process:1|Type:3}
2014-10-04 19:26:23.938: [CLSFRAME][1158867264] Disconnected from AGENT process: {Relative|Node:0|Process:1|Type:3}
2014-10-04 19:26:23.938: [    AGFW][1165171008] {1:33686:190} Agfw Proxy Server received process disconnected notification, count=1
2014-10-04 19:26:23.939: [    AGFW][1165171008] {1:33686:190} /u01/app/11.2.0/grid/bin/oraagent_grid disconnected.
2014-10-04 19:26:23.939: [    AGFW][1165171008] {1:33686:190} Agent /u01/app/11.2.0/grid/bin/oraagent_grid[5646] stopped!
2014-10-04 19:26:23.939: [ CRSCOMM][1165171008] {1:33686:190} IpcL: removeConnection: Member 1 does not exist.
-----Errors from the alert log:
2014-10-04 19:27:23.293
[ohasd(27170)]CRS-2112:The OLR service started on node lunardb01.
2014-10-04 19:27:23.314
[ohasd(27170)]CRS-1301:Oracle High Availability Service started on node lunardb01.
2014-10-04 19:27:23.314
[ohasd(27170)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2014-10-04 19:27:24.351
[/u01/app/11.2.0/grid/bin/orarootagent.bin(27307)]CRS-5016:Process "/u01/app/11.2.0/grid/bin/acfsload" spawned by agent "/u01/app/11.2.0/grid/bin/orarootagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11.2.0/grid/log/lunardb01/agent/ohasd/orarootagent_root/orarootagent_root.log"
2014-10-04 19:27:27.171
[/u01/app/11.2.0/grid/bin/orarootagent.bin(27307)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2014-10-04 19:29:27.802
[/u01/app/11.2.0/grid/bin/oraagent.bin(27400)]CRS-5818:Aborted command 'start' for resource 'ora.mdnsd'. Details at (:CRSAGF00113:) {0:0:2} in /u01/app/11.2.0/grid/log/lunardb01/agent/ohasd/oraagent_grid//oraagent_grid.log.
2014-10-04 19:29:31.812
[ohasd(27170)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.mdnsd'. Details at (:CRSPE00111:) {0:0:2} in /u01/app/11.2.0/grid/log/lunardb01/ohasd/ohasd.log.
2014-10-04 19:31:34.907
[/u01/app/11.2.0/grid/bin/oraagent.bin(29240)]CRS-5818:Aborted command 'start' for resource 'ora.mdnsd'. Details at (:CRSAGF00113:) {0:0:2} in /u01/app/11.2.0/grid/log/lunardb01/agent/ohasd/oraagent_grid//oraagent_grid.log.
2014-10-04 19:31:38.918
[ohasd(27170)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.mdnsd'. Details at (:CRSPE00111:) {0:0:2} in /u01/app/11.2.0/grid/log/lunardb01/ohasd/ohasd.log.
2014-10-04 19:33:41.993
[/u01/app/11.2.0/grid/bin/oraagent.bin(30882)]CRS-5818:Aborted command 'start' for resource 'ora.gpnpd'. Details at (:CRSAGF00113:) {0:0:2} in /u01/app/11.2.0/grid/log/lunardb01/agent/ohasd/oraagent_grid//oraagent_grid.log.
2014-10-04 19:33:46.004
[ohasd(27170)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.gpnpd'. Details at (:CRSPE00111:) {0:0:2} in /u01/app/11.2.0/grid/log/lunardb01/ohasd/ohasd.log.
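
The alert log repeats the same CRS-2757 timeout for each failed start attempt, so counting them per resource shows quickly where startup stalls. A minimal sketch, assuming the message wording shown above; timed_out_resources is a hypothetical helper name.

```shell
# Hypothetical helper: read alert log text on stdin and print each resource
# that hit a CRS-2757 start timeout, followed by the number of timeouts.
timed_out_resources() {
    grep 'CRS-2757' \
        | sed -n "s/.*from the resource '\([^']*\)'.*/\1/p" \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{print $2, $1}'   # reorder to "resource count"
}
```

It reads from standard input, for example `timed_out_resources < alert.log`, where alert.log stands for the path of your node's Clusterware alert log (the path is not shown in the excerpt above, so it is left as a placeholder).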

As shown, startup is stuck at the ora.mdnsd resource, which cannot start:

[root@lunardb01 grid]# crsctl status res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               Instance Shutdown   
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE                                                   
ora.crf
      1        ONLINE  OFFLINE                                                   
ora.crsd
      1        ONLINE  OFFLINE                                                   
ora.cssd
      1        ONLINE  OFFLINE                                                   
ora.cssdmonitor
      1        ONLINE  OFFLINE                                                   
ora.ctssd
      1        ONLINE  OFFLINE                                                   
ora.diskmon
      1        ONLINE  OFFLINE                                                   
ora.drivers.acfs
      1        ONLINE  OFFLINE                                                   
ora.evmd
      1        ONLINE  OFFLINE                                                   
ora.gipcd
      1        ONLINE  OFFLINE                                                   
ora.gpnpd
      1        ONLINE  OFFLINE                                                   
ora.mdnsd
      1        ONLINE  OFFLINE                               STARTING            
[root@lunardb01 grid]# 
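
With this many resources, it can help to filter the listing down to those whose target is ONLINE but whose state is not. A minimal sketch, assuming the tabular layout shown above (a resource name line followed by an instance line); stuck_resources is a hypothetical helper name.

```shell
# Hypothetical helper: read `crsctl status res -t -init` output on stdin and
# print "name state [server/details]" for each resource with TARGET=ONLINE
# whose STATE is anything other than ONLINE.
stuck_resources() {
    awk '
        /^ora\./ { name = $1; next }          # remember the resource name
        name != "" && $1 ~ /^[0-9]+$/ && $2 == "ONLINE" && $3 != "ONLINE" {
            line = name " " $3
            for (i = 4; i <= NF; i++) line = line " " $i
            print line
        }
    '
}
```

It would be called as `crsctl status res -t -init | stuck_resources`. In the listing above, ora.mdnsd is the only resource with a STARTING detail, which matches the timeouts in the alert log.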

An attempt to repair this with the -init option of rootcrs.pl does not work:

[root@lunardb01 lunardb01]# $GRID_HOME/crs/install/rootcrs.pl -init
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
[root@lunardb01 lunardb01]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@lunardb01 lunardb01]# 
[root@lunardb01 ohasd]# ps -ef|grep d.bin
root     12642     1  0 19:48 ?        00:00:01 /u01/app/11.2.0/grid/bin/ohasd.bin reboot
grid     14804     1  0 19:51 ?        00:00:00 /u01/app/11.2.0/grid/bin/oraagent.bin
root     15481 19818  0 19:52 pts/1    00:00:00 grep d.bin
[root@lunardb01 ohasd]# ps -ef|grep d.bin
root     12642     1  0 19:48 ?        00:00:01 /u01/app/11.2.0/grid/bin/ohasd.bin reboot
grid     14804     1  0 19:51 ?        00:00:00 /u01/app/11.2.0/grid/bin/oraagent.bin
root     15663 19818  0 19:52 pts/1    00:00:00 grep d.bin
[root@lunardb01 ohasd]# 

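The errors in the background logs are the same as those shown above.
Evidently, using rootcrs.pl -init to repair directory permissions is of little use after a chown -R on /u01.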
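
One way to see what a repair attempt changed is to summarise the owner:group pairs under a directory before and after running it. The following is a minimal sketch rather than a fix: ownership_summary is a hypothetical helper, and it assumes GNU find (for -printf), as found on Linux.

```shell
# Hypothetical helper: print each owner:group pair found under the given
# directory tree, followed by the number of files and directories that have it.
# ASSUMPTION: GNU find is available (-printf is a GNU extension).
ownership_summary() {
    find "$1" -printf '%u:%g\n' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{print $2, $1}'   # reorder to "owner:group count"
}
```

For example, `ownership_summary /u01/app/11.2.0/grid` could be run before and after rootcrs.pl -init and the two results compared. The ohasd log above shows an ACL with owner grid, so a tree that still reports everything as oracle:oinstall has probably not been restored.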
