HPE Storage Users Group

A Storage Administrator Community




 Post subject: Oracle RAC Cluster Problem
PostPosted: Tue Oct 16, 2018 2:44 pm 

Joined: Thu Apr 28, 2016 1:29 pm
Posts: 34
Hi all,
I wonder if anybody here is using Oracle Clusterware with 3PAR and has run into this issue before.
We have Oracle Clusterware 12c on CentOS 7.5, zoned to two 8440 arrays (FC SSD only, no NL disks).
Recently, 3PAR node reboots during a 3.2.2 MU6 upgrade caused the Oracle cluster servers to reboot as well.
Can anybody say why?


 Post subject: Re: Oracle RAC Cluster Problem
PostPosted: Wed Oct 17, 2018 2:58 pm 

Joined: Thu Feb 04, 2016 4:12 pm
Posts: 28
With the extremely limited information provided...
Oracle ASM + Linux don't have the best reputation for handling path failures gracefully.

1) Are both arrays configured properly for Persistent Ports (port failover/failback)?
2) Did the ports fail over during the upgrade/node reboot?
3) Did they fail back?
4) http://h20628.www2.hp.com/km-ext/kmcsdirect/emr_na-c03290635-13.pdf
5) multipath.conf should be set up per that guide, as there are some key settings that help with 3PAR path failover (not host failover).
Some of the major ones (a quick verification sketch follows this list):
no_path_retry 18
path_selector "round-robin 0"
rr_weight uniform
rr_min_io_rq 1
path_checker tur
failback immediate
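
A quick way to confirm what the host is actually running with, rather than just what is in the file (just a sketch using the stock device-mapper-multipath tools; exact command syntax can vary a bit between releases):

Code:
# dump the configuration the multipath daemon is actually using and
# check the 3PARdata/VV device stanza against the settings above
multipathd -k"show config" | grep -A 20 3PARdata

# per-LUN view: confirms features, hardware handler and per-path states
multipath -ll | grep -E '3PARdata|policy|features'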

I have not had issues with 3PAR and Oracle RAC. On previous versions of Oracle RAC with other arrays, where the paths failed during a controller/node reboot, we did have issues with the RAC hosts not handling the path failures. With Persistent Ports on 3PAR, that has not occurred in many years. One of their best features, IMHO.

Hope this helps.


 Post subject: Re: Oracle RAC Cluster Problem
PostPosted: Sun Oct 21, 2018 10:21 pm 
Site Admin

Joined: Tue Aug 18, 2009 10:35 pm
Posts: 1328
Location: Dallas, Texas
I suspect zoning/cabling issues. Review your setup to ensure you are following best practice.

Port Persistence should have mitigated many of the multipath issues, assuming the best practices for cabling/zoning with regard to Port Persistence were followed.

_________________
Richard Siemers
The views and opinions expressed are my own and do not necessarily reflect those of my employer.


 Post subject: Re: Oracle RAC Cluster Problem
PostPosted: Mon Oct 22, 2018 1:16 pm 

Joined: Thu Apr 28, 2016 1:29 pm
Posts: 34
Proc_rqrd wrote:
With the extremely limited information provided...
Oracle ASM + Linux don't have the best reputation for handling path failures gracefully.

1) Are both arrays configured properly for Persistent Ports (port failover/failback)?
2) Did the ports fail over during the upgrade/node reboot?
3) Did they fail back?
4) http://h20628.www2.hp.com/km-ext/kmcsdirect/emr_na-c03290635-13.pdf
5) multipath.conf should be set up per that guide, as there are some key settings that help with 3PAR path failover (not host failover).
Some of the major ones:
no_path_retry 18
path_selector "round-robin 0"
rr_weight uniform
rr_min_io_rq 1
path_checker tur
failback immediate

I have not had issues with 3PAR and Oracle RAC. On previous versions of Oracle RAC with other arrays, where the paths failed during a controller/node reboot, we did have issues with the RAC hosts not handling the path failures. With Persistent Ports on 3PAR, that has not occurred in many years. One of their best features, IMHO.

Hope this helps.


Thanks for your reply. Our DBA is checking the multipath config. I'm new at this place. I attached how the DB host is zoned. Personally, I always zone one initiator to one target; they zoned a single initiator to multiple targets. Could that cause any problem? And if it could, why didn't they have any problems during the previous upgrades and node replacements?

I didn't have anybody from the server side watching the hosts. What I observed: when the 3PAR node went down, I checked from the 3PAR side with showhost -pathsum, and all the hosts showed Node0 gone, listing paths on nodes 1,2,3 only. Then Node0 came back up, and all the other hosts showed 0,1,2,3 again, while the DB hosts still showed only 1,2,3. They never recovered the path, and then I got the notification that the servers had rebooted.
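
For the next maintenance window, I plan to have somebody watch the host side as well. Just a rough sketch of what I have in mind, using the standard multipath tools (we did not run anything like this during the incident):

Code:
# run on each DB host while the 3PAR node reboots; the count of failed
# paths should rise when the node goes down and drop back to 0 once the
# node is back and the ports have failed back
watch -n 5 "multipath -ll | grep -c failed"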


Attachment: DB01_Dev_Zone.jpg (zoning screenshot)
 Post subject: Re: Oracle RAC Cluster Problem
PostPosted: Mon Oct 22, 2018 10:27 pm 
Site Admin

Joined: Tue Aug 18, 2009 10:35 pm
Posts: 1328
Location: Dallas, Texas
So Port Persistence is a technology that leverages NPIV on the SAN switch to move target WWNs to surviving 3par nodes so that the hosts don't realize that a path ever went offline.

Node0 port 1 will fail over to Node1 port1. (0 and 1 make a node pair)
Node2 port 1 will fail over to Node3 port1. (2 and 3 make a node pair)

For this to work, both Node0 and Node1 need to have port 1 on the same SAN fabric, SAN-A for example. If you have the system miscabled/misconfigured such that 0_0_1 is on SAN-A and 1_0_1 is on SAN-B, Port Persistence will not work, because when Node0 is offline, Node1 will bring that WWN online on SAN-B instead of SAN-A, where the zones for that WWN are. Based on your screenshot, it looks like you should be good here; just verify that NPIV is enabled on the switches, and maybe verify that the aliases for the 3PAR ports are correct (0_0_1 goes to Node0, slot 0, port 1).
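
A couple of quick checks (just a sketch; the switch command assumes a Brocade FOS fabric, so adjust for your switch vendor):

Code:
# on the 3PAR: host-facing FC ports should list their node-pair partner,
# and FailoverState should read "none" in steady state
cli% showport

# on a Brocade switch: NPIV must be enabled on the ports where the 3PAR
# target ports log in ("NPIV capability" in the per-port output)
switch:admin> portcfgshow <slot/port>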

While Port Persistence could have mitigated your issue, there is still a root issue on the host as to why the paths did not recover, and a crash occurred. Do these servers boot from SAN by chance? Make sure the Host connectivity guide is followed for your OS.

Red Hat support: https://access.redhat.com/solutions/1249543
If you can't log in to RH, don't worry; it points you to this HPE document for your settings, page 101: http://h20628.www2.hp.com/km-ext/kmcsdi ... 635-13.pdf

For RHEL 6.2 and later:
Code:
devices {
    device {
        vendor                  "3PARdata"
        product                 "VV"
        path_grouping_policy    "group_by_prio"
        getuid_callout          "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
        path_selector           "round-robin 0"
        path_checker            "tur"
        features                "0"
        hardware_handler        "1 alua"
        prio                    "alua"
        rr_weight               uniform
        no_path_retry           18
        rr_min_io               100
    }
}
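
After editing /etc/multipath.conf, the daemon needs to re-read it. One way to do that (a sketch; command availability can differ slightly between RHEL releases):

Code:
# sanity-check / dump the merged configuration first
multipath -t | grep -A 20 3PARdata

# tell the running daemon to re-read /etc/multipath.conf
multipathd -k"reconfigure"

# confirm the new settings took effect on the 3PAR LUNs
multipath -ll | head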

_________________
Richard Siemers
The views and opinions expressed are my own and do not necessarily reflect those of my employer.


 Post subject: Re: Oracle RAC Cluster Problem
PostPosted: Tue Oct 23, 2018 11:04 am 

Joined: Thu Feb 04, 2016 4:12 pm
Posts: 28
Only to add to the excellent explanation above...
showhost -pathsum will not show you whether Persistent Ports failover actually occurred; a showport will show the failover state, as in:

N:S:P Mode State ----Node_WWN---- -Port_WWN/HW_Addr- Type Protocol Label Partner FailoverState
0:0:1 target ready xxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxx host FC - 1:0:1 none

Since Node1 would be the "path" for the NPIV-logged-in Node0 ports, Node0 would in fact not appear in showhost -pathsum; that command looks at the physical node/port, not the failover port.

Your root issue is still outstanding, but I understand your limited visibility into, and control over, the other factors (servers/DBA). Hence Persistent Ports further mitigating issues even outside of your control; if you follow best practices, then at least your side is covered.

In my experience, Oracle RAC with ASM volumes on hosts prior to RHEL 7 was somewhat oversensitive to path failures. The PDF I linked to should help.

Also, your single-initiator/multiple-target zoning is supported with the newer arrays; 3PAR previously did not recommend it. As long as Richard's explanation of the physical SAN connectivity and zoning holds, port failover should be possible. From your screenshot of one fabric, if the other fabric is all port 2s, Persistent Ports should be possible. Verify your showport output and see if you have failover partners, etc.


 Post subject: Re: Oracle RAC Cluster Problem
PostPosted: Tue Oct 23, 2018 8:52 pm 

Joined: Thu Apr 28, 2016 1:29 pm
Posts: 34
Thank you so much for the wonderful explanations. So, how about this: I asked our DBA to compare our multipath.conf with the best practices. Here is how we have it (attached). Could those settings cause this issue?


Attachment: 20181023_214932.jpg (photo of the multipath.conf)
 Post subject: Re: Oracle RAC Cluster Problem
PostPosted: Tue Oct 23, 2018 10:57 pm 
Site Admin

Joined: Tue Aug 18, 2009 10:35 pm
Posts: 1328
Location: Dallas, Texas
Glad you found a config issue; now you know what changes need to be tested/verified. This can certainly cause path failover irregularities.

When talking to your Linux OS support guys, be clear that these settings are not just 'best practice', but are "implementation guide" requirements. The entire PDF should be reviewed (skipping the iSCSI/FCoE sections if not applicable) by the Linux OS owner and be included in your build standard for 3PAR-attached Linux hosts.

What does 'multipath -ll' look like? On your 3PAR, what is the host persona set to? (See page 12 of the HP doc linked above to make sure it's set correctly.)
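
If it helps, the persona can be checked from the 3PAR CLI with showhost -persona; the output below is just a sketch with a made-up host name (for Linux hosts using ALUA it should normally show persona 2, Generic-ALUA):

Code:
cli% showhost -persona
# Id Name      Persona_Id Persona_Name  Persona_Caps
#  0 dbhost01           2 Generic-ALUA  UARepLun,ALUA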

_________________
Richard Siemers
The views and opinions expressed are my own and do not necessarily reflect those of my employer.


 Post subject: Re: Oracle RAC Cluster Problem
PostPosted: Wed Oct 24, 2018 7:58 am 

Joined: Thu Apr 28, 2016 1:29 pm
Posts: 34
Richard Siemers wrote:
Glad you found a config issue; now you know what changes need to be tested/verified. This can certainly cause path failover irregularities.

When talking to your Linux OS support guys, be clear that these settings are not just 'best practice', but are "implementation guide" requirements. The entire PDF should be reviewed (skipping the iSCSI/FCoE sections if not applicable) by the Linux OS owner and be included in your build standard for 3PAR-attached Linux hosts.

What does 'multipath -ll' look like? On your 3PAR, what is the host persona set to? (See page 12 of the HP doc linked above to make sure it's set correctly.)


Hosts are set to persona 2.
Here is the output of multipath -ll:

3p4l36 (360002ac000000000000000520001ce56) dm-18 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:36 sdz 65:144 active ready running
|- 2:0:3:36 sdap 66:144 active ready running
|- 1:0:5:36 sdbt 68:112 active ready running
|- 2:0:4:36 sdci 69:96 active ready running
|- 1:0:6:36 sddo 71:96 active ready running
|- 2:0:5:36 sdec 128:64 active ready running
|- 1:0:7:36 sdfi 130:64 active ready running
`- 2:0:6:36 sdfw 131:32 active ready running
3p4l35 (360002ac000000000000000510001ce56) dm-17 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:35 sdx 65:112 active ready running
|- 2:0:3:35 sdan 66:112 active ready running
|- 1:0:5:35 sdbs 68:96 active ready running
|- 2:0:4:35 sdch 69:80 active ready running
|- 1:0:6:35 sddm 71:64 active ready running
|- 2:0:5:35 sdeb 128:48 active ready running
|- 1:0:7:35 sdfg 130:32 active ready running
`- 2:0:6:35 sdfv 131:16 active ready running
3p4l17 (360002ac000000000000000440001ce56) dm-4 3PARdata,VV
size=10G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:17 sdd 8:48 active ready running
|- 1:0:5:17 sdas 66:192 active ready running
|- 2:0:3:17 sdp 8:240 active ready running
|- 2:0:4:17 sdbi 67:192 active ready running
|- 1:0:6:17 sdcm 69:160 active ready running
|- 2:0:5:17 sddb 70:144 active ready running
|- 1:0:7:17 sdeg 128:128 active ready running
`- 2:0:6:17 sdev 129:112 active ready running
3p4l34 (360002ac000000000000000500001ce56) dm-16 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:34 sdv 65:80 active ready running
|- 2:0:3:34 sdam 66:96 active ready running
|- 1:0:5:34 sdbq 68:64 active ready running
|- 2:0:4:34 sdcg 69:64 active ready running
|- 1:0:6:34 sddj 71:16 active ready running
|- 2:0:5:34 sdea 128:32 active ready running
|- 1:0:7:34 sdfe 130:0 active ready running
`- 2:0:6:34 sdfu 131:0 active ready running
3p4l16 (360002ac000000000000000430001ce56) dm-3 3PARdata,VV
size=10G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:16 sdc 8:32 active ready running
|- 1:0:5:16 sdaq 66:160 active ready running
|- 2:0:3:16 sdn 8:208 active ready running
|- 2:0:4:16 sdbf 67:144 active ready running
|- 1:0:6:16 sdcl 69:144 active ready running
|- 2:0:5:16 sdcz 70:112 active ready running
|- 1:0:7:16 sdee 128:96 active ready running
`- 2:0:6:16 sdet 129:80 active ready running
3p4l33 (360002ac0000000000000004f0001ce56) dm-15 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:33 sdt 65:48 active ready running
|- 2:0:3:33 sdak 66:64 active ready running
|- 1:0:5:33 sdbn 68:16 active ready running
|- 2:0:4:33 sdcd 69:16 active ready running
|- 1:0:6:33 sddi 71:0 active ready running
|- 2:0:5:33 sddy 128:0 active ready running
|- 1:0:7:33 sdfc 129:224 active ready running
`- 2:0:6:33 sdfs 130:224 active ready running
3p4l15 (360002ac000000000000000420001ce56) dm-2 3PARdata,VV
size=10G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:15 sdb 8:16 active ready running
|- 1:0:5:15 sdao 66:128 active ready running
|- 2:0:3:15 sdl 8:176 active ready running
|- 2:0:4:15 sdbd 67:112 active ready running
|- 1:0:6:15 sdcj 69:112 active ready running
|- 2:0:5:15 sdcx 70:80 active ready running
|- 1:0:7:15 sded 128:80 active ready running
`- 2:0:6:15 sder 129:48 active ready running
3p4l32 (360002ac0000000000000004e0001ce56) dm-14 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:32 sdq 65:0 active ready running
|- 2:0:3:32 sdai 66:32 active ready running
|- 1:0:5:32 sdbm 68:0 active ready running
|- 2:0:4:32 sdcc 69:0 active ready running
|- 1:0:6:32 sddf 70:208 active ready running
|- 2:0:5:32 sddw 71:224 active ready running
|- 1:0:7:32 sdfa 129:192 active ready running
`- 2:0:6:32 sdfq 130:192 active ready running
3p4l29 (360002ac0000000000000004b0001ce56) dm-11 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:29 sdk 8:160 active ready running
|- 1:0:5:29 sdbg 67:160 active ready running
|- 2:0:3:29 sdac 65:192 active ready running
|- 2:0:4:29 sdbw 68:160 active ready running
|- 1:0:6:29 sdda 70:128 active ready running
|- 2:0:5:29 sddp 71:112 active ready running
|- 1:0:7:29 sdeu 129:96 active ready running
`- 2:0:6:29 sdfj 130:80 active ready running
3p4l31 (360002ac0000000000000004d0001ce56) dm-13 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:31 sdo 8:224 active ready running
|- 2:0:3:31 sdag 66:0 active ready running
|- 1:0:5:31 sdbk 67:224 active ready running
|- 2:0:4:31 sdca 68:224 active ready running
|- 1:0:6:31 sddd 70:176 active ready running
|- 2:0:5:31 sddu 71:192 active ready running
|- 1:0:7:31 sdey 129:160 active ready running
`- 2:0:6:31 sdfo 130:160 active ready running
3p4l28 (360002ac0000000000000004a0001ce56) dm-10 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:28 sdj 8:144 active ready running
|- 1:0:5:28 sdbe 67:128 active ready running
|- 2:0:3:28 sdaa 65:160 active ready running
|- 2:0:4:28 sdbu 68:128 active ready running
|- 1:0:6:28 sdcy 70:96 active ready running
|- 2:0:5:28 sddn 71:80 active ready running
|- 1:0:7:28 sdes 129:64 active ready running
`- 2:0:6:28 sdfh 130:48 active ready running
3p4l30 (360002ac0000000000000004c0001ce56) dm-12 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:30 sdm 8:192 active ready running
|- 1:0:5:30 sdbh 67:176 active ready running
|- 2:0:3:30 sdae 65:224 active ready running
|- 2:0:4:30 sdby 68:192 active ready running
|- 1:0:6:30 sddc 70:160 active ready running
|- 2:0:5:30 sdds 71:160 active ready running
|- 1:0:7:30 sdew 129:128 active ready running
`- 2:0:6:30 sdfl 130:112 active ready running
3p4l27 (360002ac000000000000000490001ce56) dm-9 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:27 sdi 8:128 active ready running
|- 1:0:5:27 sdbc 67:96 active ready running
|- 2:0:3:27 sdy 65:128 active ready running
|- 2:0:4:27 sdbr 68:80 active ready running
|- 1:0:6:27 sdcw 70:64 active ready running
|- 2:0:5:27 sddl 71:48 active ready running
|- 1:0:7:27 sdeq 129:32 active ready running
`- 2:0:6:27 sdff 130:16 active ready running
3p4l26 (360002ac000000000000000480001ce56) dm-8 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:26 sdh 8:112 active ready running
|- 1:0:5:26 sdbb 67:80 active ready running
|- 2:0:3:26 sdw 65:96 active ready running
|- 2:0:4:26 sdbp 68:48 active ready running
|- 1:0:6:26 sdcu 70:32 active ready running
|- 2:0:5:26 sddk 71:32 active ready running
|- 1:0:7:26 sdep 129:16 active ready running
`- 2:0:6:26 sdfd 129:240 active ready running
3p4l25 (360002ac000000000000000470001ce56) dm-7 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:25 sdg 8:96 active ready running
|- 1:0:5:25 sday 67:32 active ready running
|- 2:0:3:25 sdu 65:64 active ready running
|- 2:0:4:25 sdbo 68:32 active ready running
|- 1:0:6:25 sdcs 70:0 active ready running
|- 2:0:5:25 sddh 70:240 active ready running
|- 1:0:7:25 sden 128:240 active ready running
`- 2:0:6:25 sdfb 129:208 active ready running
3p4l42 (360002ac000000000000000580001ce56) dm-24 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:42 sdal 66:80 active ready running
|- 2:0:3:42 sdba 67:64 active ready running
|- 1:0:5:42 sdcf 69:48 active ready running
|- 2:0:4:42 sdcv 70:48 active ready running
|- 1:0:6:42 sddz 128:16 active ready running
|- 2:0:5:42 sdeo 129:0 active ready running
|- 1:0:7:42 sdft 130:240 active ready running
`- 2:0:6:42 sdgc 131:128 active ready running
3p4l39 (360002ac000000000000000550001ce56) dm-21 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:39 sdaf 65:240 active ready running
|- 2:0:3:39 sdav 66:240 active ready running
|- 1:0:5:39 sdbz 68:208 active ready running
|- 2:0:4:39 sdco 69:192 active ready running
|- 1:0:6:39 sddt 71:176 active ready running
|- 2:0:5:39 sdei 128:160 active ready running
|- 1:0:7:39 sdfn 130:144 active ready running
`- 2:0:6:39 sdfz 131:80 active ready running
3p4l24 (360002ac000000000000000460001ce56) dm-6 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:24 sdf 8:80 active ready running
|- 1:0:5:24 sdaw 67:0 active ready running
|- 2:0:3:24 sds 65:32 active ready running
|- 2:0:4:24 sdbl 67:240 active ready running
|- 1:0:6:24 sdcq 69:224 active ready running
|- 2:0:5:24 sddg 70:224 active ready running
|- 1:0:7:24 sdel 128:208 active ready running
`- 2:0:6:24 sdez 129:176 active ready running
3p4l41 (360002ac000000000000000570001ce56) dm-23 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:41 sdaj 66:48 active ready running
|- 2:0:3:41 sdaz 67:48 active ready running
|- 1:0:5:41 sdce 69:32 active ready running
|- 2:0:4:41 sdct 70:16 active ready running
|- 1:0:6:41 sddx 71:240 active ready running
|- 2:0:5:41 sdem 128:224 active ready running
|- 1:0:7:41 sdfr 130:208 active ready running
`- 2:0:6:41 sdgb 131:112 active ready running
3p4l38 (360002ac000000000000000540001ce56) dm-20 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:38 sdad 65:208 active ready running
|- 2:0:3:38 sdat 66:208 active ready running
|- 1:0:5:38 sdbx 68:176 active ready running
|- 2:0:4:38 sdcn 69:176 active ready running
|- 1:0:6:38 sddr 71:144 active ready running
|- 2:0:5:38 sdeh 128:144 active ready running
|- 1:0:7:38 sdfm 130:128 active ready running
`- 2:0:6:38 sdfy 131:64 active ready running
3p4l23 (360002ac000000000000000450001ce56) dm-5 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:23 sde 8:64 active ready running
|- 1:0:5:23 sdau 66:224 active ready running
|- 2:0:3:23 sdr 65:16 active ready running
|- 2:0:4:23 sdbj 67:208 active ready running
|- 1:0:6:23 sdcp 69:208 active ready running
|- 2:0:5:23 sdde 70:192 active ready running
|- 1:0:7:23 sdej 128:176 active ready running
`- 2:0:6:23 sdex 129:144 active ready running
3p4l40 (360002ac000000000000000560001ce56) dm-22 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:40 sdah 66:16 active ready running
|- 2:0:3:40 sdax 67:16 active ready running
|- 1:0:5:40 sdcb 68:240 active ready running
|- 2:0:4:40 sdcr 69:240 active ready running
|- 1:0:6:40 sddv 71:208 active ready running
|- 2:0:5:40 sdek 128:192 active ready running
|- 1:0:7:40 sdfp 130:176 active ready running
`- 2:0:6:40 sdga 131:96 active ready running
3p4l37 (360002ac000000000000000530001ce56) dm-19 3PARdata,VV
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 1:0:4:37 sdab 65:176 active ready running
|- 2:0:3:37 sdar 66:176 active ready running
|- 1:0:5:37 sdbv 68:144 active ready running
|- 2:0:4:37 sdck 69:128 active ready running
|- 1:0:6:37 sddq 71:128 active ready running
|- 2:0:5:37 sdef 128:112 active ready running
|- 1:0:7:37 sdfk 130:96 active ready running
`- 2:0:6:37 sdfx 131:48 active ready running


 Post subject: Re: Oracle RAC Cluster Problem
PostPosted: Thu Oct 25, 2018 1:08 pm 

Joined: Thu Feb 04, 2016 4:12 pm
Posts: 28
To reiterate: the implementation guide should be adhered to 100%. My fault if I represented that linked PDF as just a best practice; I should have said "supported practice".
From my original reply, here are the key parts of the conf file for RHEL 6.2 and later:

5) multipath.conf should be set up per that guide, as there are some key settings that help with 3PAR path failover (not host failover).
Some of the major ones:
no_path_retry 18
path_selector "round-robin 0"
rr_weight uniform
rr_min_io_rq 1
path_checker tur
failback immediate

All of these are important; together they help maintain a healthy SCSI bus on your host. The round-robin policy with a weight of 1 I/O means the host sends one I/O before moving on to the next active path. This assists with timeouts, path failure detection, and the ability to recover from a failed path with outstanding I/Os that may need to be resent/reset.

path_checker tur (Test Unit Ready), a bit of explanation:
"Issue SCSI command TEST UNIT READY to the device; preferred over readsector0 or directio if the LUN supports TUR, as on failure it does not fill up the system log with messages. To verify that TUR works fine, you can issue it manually: sg_turs -v /dev/sdx" (see the sketch below)

With 3PAR Persistent Ports and the above configuration, a Linux host would have a very hard time drowning in its mpath puddle, unless there was a full system outage on the array.

