While checking the IB network on this Exadata today with infinicheck, the db nodes reported errors while the cell nodes were fine.
The machine is an Exadata X5-2:
[root@dm01db01 ibdiagtools]# /opt/oracle.SupportTools/CheckHWnFWProfile -c loose
[SUCCESS] The hardware and firmware matches supported profile for server=ORACLE_SERVER_X5-2
[root@dm01db01 ibdiagtools]#
Here is the infinicheck output (the command takes a rich set of options, but it can also be run with no arguments at all; the defaults are fine):
INFINICHECK
[Network Connectivity, Configuration and Performance]
[Version IBD VER 2.d ]
Verifying User Equivalance of user=root to all hosts.
(If it isn't setup correctly, an authentication prompt will appear to push keys to all the nodes)
Verifying User Equivalance of user=root to all cells.
(If it isn't setup correctly, an authentication prompt will appear to push keys to all the nodes)
#### CONNECTIVITY TESTS ####
[COMPUTE NODES -> STORAGE CELLS]
(30 seconds approx.)
[SUCCESS]..............Results OK
[SUCCESS]....... All can talk to all storage cells
Verifying Subnet Masks on all nodes
[SUBNET MASKS DIFFER].....2 entries found
Prechecking for uniformity of rds-tools on all nodes
[SUCCESS].... rds-tools version is the same across the cluster
Checking for bad links in the fabric
[SUCCESS].......... No bad fabric links found
[COMPUTE NODES -> COMPUTE NODES]
(30 seconds approx.)
[SUCCESS]..............Results OK
[SUCCESS]....... All hosts can talk to all other nodes
#### PERFORMANCE TESTS ####
[(1) Storage Cell to Compute Node]
(375 seconds approx)
[ INFO ].............Performance test between 192.168.10.5 and 192.168.10.3 has been started.
[ INFO ].............Performance test between 192.168.10.6 and 192.168.10.4 has been started.
[ INFO ].............Performance test between 192.168.10.7 and 192.168.10.2 has been started.
[ INFO ].............Performance test between 192.168.10.8 and 192.168.10.1 has been started.
[ INFO ].............Performance test between 192.168.10.9 and 192.168.10.3 has been started.
[ INFO ].............Performance test between 192.168.10.10 and 192.168.10.4 has been started.
[CRITICAL].............192.168.10.3 rds-stress commands did not run as expected on this host.PLEASE run [./infinicheck -z] to cleanup before re-run.PLEASE ensure that user equivalence for root is setup (./infinicheck -s) Also ensure all other workloads are turned off
[CRITICAL].............192.168.10.4 rds-stress commands did not run as expected on this host.PLEASE run [./infinicheck -z] to cleanup before re-run.PLEASE ensure that user equivalence for root is setup (./infinicheck -s) Also ensure all other workloads are turned off
[CRITICAL].............192.168.10.3 rds-stress commands did not run as expected on this host.PLEASE run [./infinicheck -z] to cleanup before re-run.PLEASE ensure that user equivalence for root is setup (./infinicheck -s) Also ensure all other workloads are turned off
[CRITICAL].............192.168.10.4 rds-stress commands did not run as expected on this host.PLEASE run [./infinicheck -z] to cleanup before re-run.PLEASE ensure that user equivalence for root is setup (./infinicheck -s) Also ensure all other workloads are turned off
[(2) Every COMPUTE NODE to another COMPUTE NODE]
(195 seconds approx)
[ INFO ].............Performance test between 192.168.10.2 and 192.168.10.3 has been started.
[ INFO ].............Performance test between 192.168.10.1 and 192.168.10.4 has been started.
[CRITICAL].............192.168.10.3 rds-stress commands did not run as expected on this host.PLEASE run [./infinicheck -z] to cleanup before re-run.PLEASE ensure that user equivalence for root is setup (./infinicheck -s) Also ensure all other workloads are turned off
[CRITICAL].............192.168.10.4 rds-stress commands did not run as expected on this host.PLEASE run [./infinicheck -z] to cleanup before re-run.PLEASE ensure that user equivalence for root is setup (./infinicheck -s) Also ensure all other workloads are turned off
[(3) Every COMPUTE NODE to ALL STORAGE CELLS]
(looking for SymbolErrors)
(195 seconds approx)
[ INFO ].............Performance test between 192.168.10.5 and 192.168.10.3 has been started.
[ INFO ].............Performance test between 192.168.10.6 and 192.168.10.4 has been started.
[ INFO ].............Performance test between 192.168.10.7 and 192.168.10.2 has been started.
[ INFO ].............Performance test between 192.168.10.8 and 192.168.10.1 has been started.
[ INFO ].............Performance test between 192.168.10.9 and 192.168.10.3 has been started.
[ INFO ].............Performance test between 192.168.10.10 and 192.168.10.4 has been started.
[CRITICAL].............192.168.10.3 rds-stress commands did not run as expected on this host.PLEASE run [./infinicheck -z] to cleanup before re-run.PLEASE ensure that user equivalence for root is setup (./infinicheck -s) Also ensure all other workloads are turned off
[CRITICAL].............192.168.10.4 rds-stress commands did not run as expected on this host.PLEASE run [./infinicheck -z] to cleanup before re-run.PLEASE ensure that user equivalence for root is setup (./infinicheck -s) Also ensure all other workloads are turned off
[CRITICAL].............192.168.10.3 rds-stress commands did not run as expected on this host.PLEASE run [./infinicheck -z] to cleanup before re-run.PLEASE ensure that user equivalence for root is setup (./infinicheck -s) Also ensure all other workloads are turned off
[CRITICAL].............192.168.10.4 rds-stress commands did not run as expected on this host.PLEASE run [./infinicheck -z] to cleanup before re-run.PLEASE ensure that user equivalence for root is setup (./infinicheck -s) Also ensure all other workloads are turned off
[SUCCESS]....... No port errors found
Infinicheck failures reported.. please check log files
From this we can see that every test involving a db node failed.
Under the hood, infinicheck drives the rds-stress command, for example: rds-stress -r 192.168.10.1 -p 10584
Of course, besides infinicheck there are plenty of other ways to check, such as rds-ping (the command that ExaWatcher and OSWatcher call).
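Both of these can be driven by hand over the IB IPs. A minimal sketch is below; the flag meanings are the commonly documented ones for rds-tools (-r local receive address, -s remote peer, -p port), so verify them against your installed version's usage output before relying on them:

```shell
# Manual RDS checks between the two db nodes (a sketch, not a full procedure).
# rds-stress needs a passive listener started first on one side.
PASSIVE_IP=192.168.10.1   # dm01db01's first IB IP
ACTIVE_IP=192.168.10.3    # dm01db02's first IB IP
PORT=10584

# On dm01db01, passive side, start first and let it wait for the peer:
#   rds-stress -r $PASSIVE_IP -p $PORT
# On dm01db02, active side, connects and reports throughput:
#   rds-stress -r $ACTIVE_IP -s $PASSIVE_IP -p $PORT
# Lightweight RDS reachability check (what ExaWatcher/OSWatcher use):
#   rds-ping -c 3 $PASSIVE_IP
command -v rds-stress >/dev/null || echo "rds-tools not installed on this host"
```

If rds-stress hangs or aborts here, the RDS layer itself is the problem rather than name resolution.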
Strange, though: why do only the db nodes report errors?
So I ran infinicheck with the -b and -g options to check and set up SSH connectivity over IB for the DB nodes.
Here I made a mistake: this command needs root SSH equivalence configured against the IB IP addresses, not the hostnames:
INFINICHECK
[Network Connectivity, Configuration and Performance]
[Version IBD VER 2.d ]
ping: unknown host dm01db01-priv
[FAILURE] Host dm01db01-priv is Unreachable and is excluded from testing
ping: unknown host dm01db02-priv
[FAILURE] Host dm01db02-priv is Unreachable and is excluded from testing
Please supply Infiniband IP addresses only in cell_ib_group
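As the last line of that output says, the group file must contain IB IP addresses only, not hostnames. One way to build such a list is to pull the 192.168.x addresses straight out of /etc/hosts; the sketch below inlines a few sample hosts lines for illustration, and the group file name (all_ibip_group) is just the one used later in this post, so adapt both to your setup:

```shell
# Extract IB (192.168.x) IPs from /etc/hosts-style lines, sorted by last
# octet: candidate contents for a group file such as all_ibip_group.
# Input is inlined here for illustration; on the rack, read /etc/hosts itself.
ips=$(awk '$1 ~ /^192\.168\./ { print $1 }' <<'EOF' | sort -t. -k4 -n
192.168.10.1 dm01db01-priv1 dm01db01-priv1.800best.com
192.168.10.10 dm01cel03-priv2 dm01cel03-priv2.800best.com
192.168.10.2 dm01db01-priv2 dm01db01-priv2.800best.com
EOF
)
printf '%s\n' "$ips"
```

The numeric sort on the fourth octet keeps 192.168.10.10 after 192.168.10.2, which a plain text sort would not.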
The output tells us plainly that the ping fails. That makes things easy.
Next, ping by hand:
[root@dm01db01 ~]# ping dm01db02-priv
ping: unknown host
[root@dm01db01 ~]#
Now ping the second node by IP address, to confirm that it really is a name-resolution problem:
[root@dm01db01 ~]# ping 192.168.10.3
PING 192.168.10.3 (192.168.10.3) 56(84) bytes of data.
64 bytes from 192.168.10.3: icmp_seq=1 ttl=64 time=0.026 ms
64 bytes from 192.168.10.3: icmp_seq=2 ttl=64 time=0.025 ms
^C
--- 192.168.10.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1902ms
rtt min/avg/max/mdev = 0.025/0.025/0.026/0.005 ms
[root@dm01db01 ~]#
Sure enough, it is a resolution problem.
The IB network is used only for the Exadata internal interconnect, so it is not registered in DNS; it is resolved only through /etc/hosts.
And /etc/hosts is generated by onecommand (unless the machine was installed by hand, once onecommand is used every configuration file is generated automatically from the configuration XML):
[root@dm01db01 ~]# cat /etc/hosts
#### BEGIN Generated by Exadata. DO NOT MODIFY ####
127.0.0.1 localhost.localdomain localhost
# 192.168.10.1 dm01db01-priv1.lunar.com dm01db01-priv1
# 192.168.10.2 dm01db01-priv2.lunar.com dm01db01-priv2
(vip and scan ip entries omitted here)
#### END Generated by Exadata ####
#### BEGIN Added by Configuration Utility ####
192.168.10.1 dm01db01-priv1 dm01db01-priv1.800best.com
192.168.10.10 dm01cel03-priv2 dm01cel03-priv2.800best.com
192.168.10.2 dm01db01-priv2 dm01db01-priv2.800best.com
192.168.10.3 dm01db02-priv1 dm01db02-priv1.800best.com
192.168.10.4 dm01db02-priv2 dm01db02-priv2.800best.com
192.168.10.5 dm01cel01-priv1 dm01cel01-priv1.800best.com
192.168.10.6 dm01cel01-priv2 dm01cel01-priv2.800best.com
192.168.10.7 dm01cel02-priv1 dm01cel02-priv1.800best.com
192.168.10.8 dm01cel02-priv2 dm01cel02-priv2.800best.com
192.168.10.9 dm01cel03-priv1 dm01cel03-priv1.800best.com
#### END Added by Configuration Utility ####
[root@dm01db01 ~]#
From this we can see that the IB entries were written in the wrong format. The correct format is the one the localhost line uses:
127.0.0.1 localhost.localdomain localhost
while the wrong entries (note they also carry the leftover domain lunar.com instead of 800best.com) look like this:
192.168.10.1 dm01db01-priv1.lunar.com dm01db01-priv1
After correcting the hosts file, pinging by hostname works:
[root@dm01db01 ~]# ping dm01db01-priv1
PING dm01db01-priv1.800best.com (192.168.10.1) 56(84) bytes of data.
64 bytes from dm01db01-priv1.800best.com (192.168.10.1): icmp_seq=1 ttl=64 time=0.010 ms
64 bytes from dm01db01-priv1.800best.com (192.168.10.1): icmp_seq=2 ttl=64 time=0.010 ms
^C
--- dm01db01-priv1.800best.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1586ms
rtt min/avg/max/mdev = 0.010/0.010/0.010/0.000 ms
[root@dm01db01 ~]# ping dm01db01-priv1.800best.com
PING dm01db01-priv1.800best.com (192.168.10.1) 56(84) bytes of data.
64 bytes from dm01db01-priv1.800best.com (192.168.10.1): icmp_seq=1 ttl=64 time=0.011 ms
64 bytes from dm01db01-priv1.800best.com (192.168.10.1): icmp_seq=2 ttl=64 time=0.008 ms
^C
--- dm01db01-priv1.800best.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1525ms
rtt min/avg/max/mdev = 0.008/0.009/0.011/0.003 ms
[root@dm01db01 ~]#
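Rather than pinging each name by hand, the whole set can be checked in one loop. getent consults /etc/hosts through the same resolver path the other tools use; the hostname list below is the one from this rack, so substitute your own:

```shell
# Verify every IB private hostname resolves; prints OK/UNRESOLVED per name.
fail=0
for h in dm01db01-priv1 dm01db01-priv2 dm01db02-priv1 dm01db02-priv2; do
    if getent hosts "$h" >/dev/null; then
        echo "$h OK"
    else
        echo "$h UNRESOLVED"
        fail=1
    fi
done
```

Any UNRESOLVED line means the hosts file still needs fixing on that front.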
One oddity remains: the cell nodes' hosts files contain the same wrong entries, yet pinging from a cell works. I suspect DNS caching is involved:
[root@dm01cel02 ~]# cat /etc/hosts
#### BEGIN Generated by Exadata. DO NOT MODIFY ####
127.0.0.1 localhost.localdomain localhost
# 192.168.10.7 dm01cel02-priv1.800best.com dm01cel02-priv1
# 192.168.10.8 dm01cel02-priv2.800best.com dm01cel02-priv2
10.45.1.194 dm01cel02.800best.com dm01cel02
#### END Generated by Exadata ####
#### BEGIN Added by Configuration Utility ####
192.168.10.1 dm01db01-priv1 dm01db01-priv1.800best.com
192.168.10.10 dm01cel03-priv2 dm01cel03-priv2.800best.com
192.168.10.2 dm01db01-priv2 dm01db01-priv2.800best.com
192.168.10.3 dm01db02-priv1 dm01db02-priv1.800best.com
192.168.10.4 dm01db02-priv2 dm01db02-priv2.800best.com
192.168.10.5 dm01cel01-priv1 dm01cel01-priv1.800best.com
192.168.10.6 dm01cel01-priv2 dm01cel01-priv2.800best.com
192.168.10.7 dm01cel02-priv1 dm01cel02-priv1.800best.com
192.168.10.8 dm01cel02-priv2 dm01cel02-priv2.800best.com
192.168.10.9 dm01cel03-priv1 dm01cel03-priv1.800best.com
#### END Added by Configuration Utility ####
[root@dm01cel02 ~]# ping dm01db01-priv1
PING dm01db01-priv1 (192.168.10.1) 56(84) bytes of data.
64 bytes from dm01db01-priv1 (192.168.10.1): icmp_seq=1 ttl=64 time=0.056 ms
64 bytes from dm01db01-priv1 (192.168.10.1): icmp_seq=2 ttl=64 time=0.064 ms
64 bytes from dm01db01-priv1 (192.168.10.1): icmp_seq=3 ttl=64 time=0.051 ms
^C
--- dm01db01-priv1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2637ms
rtt min/avg/max/mdev = 0.051/0.057/0.064/0.005 ms
[root@dm01cel02 ~]# ping dm01db01-priv1.800best.com
PING dm01db01-priv1 (192.168.10.1) 56(84) bytes of data.
64 bytes from dm01db01-priv1 (192.168.10.1): icmp_seq=1 ttl=64 time=0.043 ms
64 bytes from dm01db01-priv1 (192.168.10.1): icmp_seq=2 ttl=64 time=0.027 ms
64 bytes from dm01db01-priv1 (192.168.10.1): icmp_seq=3 ttl=64 time=0.029 ms
^C
--- dm01db01-priv1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2833ms
rtt min/avg/max/mdev = 0.027/0.033/0.043/0.007 ms
[root@dm01cel02 ~]#
Now run infinicheck with -b -g again to check SSH connectivity over IB for the DB nodes. This time it completed cleanly (the run is long and needs no user input or interaction, so the output is not shown here).
With that configured, a quick test:
[root@dm01db01 oracle.SupportTools]# dcli -g all_ibip_group -l root 'date'
192.168.10.1: Sun Apr 5 08:14:58 CST 2015
192.168.10.2: Sun Apr 5 08:14:58 CST 2015
192.168.10.3: Sun Apr 5 08:14:59 CST 2015
192.168.10.4: Sun Apr 5 08:14:58 CST 2015
192.168.10.5: Sun Apr 5 08:14:58 CST 2015
192.168.10.6: Sun Apr 5 08:14:59 CST 2015
192.168.10.7: Sun Apr 5 08:14:58 CST 2015
192.168.10.8: Sun Apr 5 08:14:58 CST 2015
192.168.10.9: Sun Apr 5 08:14:58 CST 2015
192.168.10.10: Sun Apr 5 08:14:58 CST 2015
[root@dm01db01 oracle.SupportTools]#
Each node shows up with two IPs here because, starting with release 11.2.3.3.0, Exadata no longer bonds the IB interfaces (to gain bandwidth); the whole IB private network behaves like one big VLAN.
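A quick way to see those two addresses on a node is to list the IPv4 addresses on the ib interfaces. ib0/ib1 are the usual interface names on these racks, but that is an assumption here, so adjust if yours differ:

```shell
# List IPv4 addresses on the ib* interfaces. With active-active IB you
# should see ib0 and ib1, each carrying its own 192.168.x address.
ibaddrs=$(ip -o -4 addr show 2>/dev/null | awk '$2 ~ /^ib/ { print $2, $4 }')
echo "${ibaddrs:-no ib interfaces on this host}"
```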
Now run infinicheck once more. Since my connection to the machine was flaky and a full run takes five minutes or more, I ran it in the background under screen:
[root@dm01db01 ~]# screen -S lunar
[root@dm01db01 ~]# date;/opt/oracle.SupportTools/ibdiagtools/infinicheck;date
Sun Apr 5 08:33:13 CST 2015
INFINICHECK
[Network Connectivity, Configuration and Performance]
[Version IBD VER 2.d ]
Verifying User Equivalance of user=root to all hosts.
(If it isn't setup correctly, an authentication prompt will appear to push keys to all the nodes)
Verifying User Equivalance of user=root to all cells.
(If it isn't setup correctly, an authentication prompt will appear to push keys to all the nodes)
#### CONNECTIVITY TESTS ####
[COMPUTE NODES -> STORAGE CELLS]
(30 seconds approx.)
[SUCCESS]..............Results OK
[SUCCESS]....... All can talk to all storage cells
Verifying Subnet Masks on all nodes
[SUCCESS] ......... Subnet Masks is same across the network
Prechecking for uniformity of rds-tools on all nodes
[SUCCESS].... rds-tools version is the same across the cluster
Checking for bad links in the fabric
[SUCCESS].......... No bad fabric links found
[COMPUTE NODES -> COMPUTE NODES]
(30 seconds approx.)
[SUCCESS]..............Results OK
[SUCCESS]....... All hosts can talk to all other nodes
#### PERFORMANCE TESTS ####
[(1) Storage Cell to Compute Node]
(375 seconds approx)
[ INFO ].............Performance test between 192.168.10.5 and 192.168.10.3 has been started.
[ INFO ].............Performance test between 192.168.10.6 and 192.168.10.4 has been started.
[ INFO ].............Performance test between 192.168.10.7 and 192.168.10.2 has been started.
[ INFO ].............Performance test between 192.168.10.8 and 192.168.10.1 has been started.
[ INFO ].............Performance test between 192.168.10.9 and 192.168.10.3 has been started.
[ INFO ].............Performance test between 192.168.10.10 and 192.168.10.4 has been started.
[SUCCESS]..............Results OK
[(2) Every COMPUTE NODE to another COMPUTE NODE]
(195 seconds approx)
[ INFO ].............Performance test between 192.168.10.2 and 192.168.10.3 has been started.
[ INFO ].............Performance test between 192.168.10.1 and 192.168.10.4 has been started.
[SUCCESS]..............Results OK
[(3) Every COMPUTE NODE to ALL STORAGE CELLS]
(looking for SymbolErrors)
(195 seconds approx)
[ INFO ].............Performance test between 192.168.10.5 and 192.168.10.3 has been started.
[ INFO ].............Performance test between 192.168.10.6 and 192.168.10.4 has been started.
[ INFO ].............Performance test between 192.168.10.7 and 192.168.10.2 has been started.
[ INFO ].............Performance test between 192.168.10.8 and 192.168.10.1 has been started.
[ INFO ].............Performance test between 192.168.10.9 and 192.168.10.3 has been started.
[ INFO ].............Performance test between 192.168.10.10 and 192.168.10.4 has been started.
[SUCCESS]..............Results OK
[SUCCESS]....... No port errors found
INFINICHECK REPORTS SUCCESS FOR NETWORK CONNECTIVITY and PERFORMANCE
----------DIAGNOSTICS -----------
6 Cell ips found: ..
192.168.10.5 | 192.168.10.6 | 192.168.10.7 | 192.168.10.8 | 192.168.10.9 | 192.168.10.10
4 Host ips found: ..
192.168.10.3 | 192.168.10.4 | 192.168.10.2 | 192.168.10.1
########## Host to Cell Connectivity ##########
Analyzing cells_conntest.log...
[SUCCESS]..... All nodes can talk to all other nodes
Now Analyzing Compute Node-Compute Node connectivity
########## Inter-Host Connectivity ##########
Analyzing hosts_conntest.log...
[SUCCESS]..... All hosts can talk to all its peers
########## Performance Diagnostics ##########
### [(1) STORAGE CELL to COMPUTE NODE ######
Analyzing perf_cells.log.* logfile(s)....
--------Throughput results using rds-stress --------
2300 MB/s and above is expected for runs on quiet machines
dm01db02( 192.168.10.3 ) to dm01cel01( 192.168.10.5 ) : 3987 MB/s...OK
dm01db02( 192.168.10.4 ) to dm01cel01( 192.168.10.6 ) : 4204 MB/s...OK
dm01db01( 192.168.10.2 ) to dm01cel02( 192.168.10.7 ) : 3848 MB/s...OK
dm01db01( 192.168.10.1 ) to dm01cel02( 192.168.10.8 ) : 3876 MB/s...OK
dm01db02( 192.168.10.3 ) to dm01cel03( 192.168.10.9 ) : 3868 MB/s...OK
dm01db02( 192.168.10.4 ) to dm01cel03( 192.168.10.10 ) : 3971 MB/s...OK
########## Performance Diagnostics ##########
#### [(2) Every DBNODE to its PEER ######
Analyzing perf_hosts.log.* logfile(s)....
--------Throughput results using rds-stress --------
2300 MB/s and above is expected for runs on quiet machines
dm01db02( 192.168.10.3 ) to dm01db01( 192.168.10.2 ) : 3958 MB/s...OK
dm01db02( 192.168.10.4 ) to dm01db01( 192.168.10.1 ) : 3865 MB/s...OK
-------------------------
Results are available in the file diagnostics.output
Sun Apr 5 08:39:30 CST 2015
[root@dm01db01 ~]#
Here we can see throughput of roughly 3.8 to 4.2 GB/s (note the capital B: gigabytes, not gigabits). Quite impressive.
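As a side note, the 2300 MB/s floor that infinicheck prints makes these throughput lines easy to machine-check. A sketch follows, with two sample lines from the run above inlined; on the rack you would feed the diagnostics.output file instead:

```shell
# Pull the source IP, destination IP and MB/s out of infinicheck's throughput
# lines, flagging anything below the 2300 MB/s floor for quiet machines.
result=$(awk '/MB\/s/ {
    mbps = $(NF - 1) + 0             # throughput is the next-to-last field
    print $2, "->", $6, mbps, (mbps >= 2300 ? "OK" : "SLOW")
}' <<'EOF'
dm01db02( 192.168.10.3 ) to dm01cel01( 192.168.10.5 ) : 3987 MB/s...OK
dm01db01( 192.168.10.2 ) to dm01cel02( 192.168.10.7 ) : 3848 MB/s...OK
EOF
)
printf '%s\n' "$result"
```

Any SLOW line would be worth chasing down, for example with a manual rds-stress run between just that pair of endpoints.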