HPE Storage Users Group

A Storage Administrator Community




Post subject: 3PAR XCP-ng iSCSI performance problems
PostPosted: Thu Jun 21, 2018 5:23 am 

Joined: Thu Jun 21, 2018 5:14 am
Posts: 1
We have a problem with array performance when using iSCSI on XCP-ng.

Hardware:

3x HP ProLiant DL380 Gen9:
2x Xeon E5-2697 v4 2.30 GHz (18 cores / 36 threads each)
RAM 96 GB DDR4 2400 MHz
NICs:
4x NetXtreme BCM5719 1Gig *
4x Intel I350 1Gig

4x DELL PowerEdge R720xd:
2x Xeon E5-2680 v2 2.80 GHz (10 cores / 20 threads each)
RAM 128 GB DDR3 1333 MHz
NICs in the 1st and 3rd servers:
4x NetXtreme BCM5720 1Gig *
2x Intel 82599ES 10Gig

NICs in the 2nd and 4th servers:
1x NetXtreme II BCM57800 (2x 1Gig, 2x 10Gig) *
2x NetXtreme II BCM5709 1Gig

switches: 2x HPE 5130-48G-4SFP+ (JG934A) :
Software Version 7.1.045, Release 3113P05
IRF enabled on 2 Ten-GigabitEthernet ports

array: HP 3PAR 8200:
HPE 3PAR OS 3.2.2.709 (MU6) (currently the newest)
6x SSD 150K 400 GB (AREA0400S5xnNTRI)
12x FC 10K 1.2 TB (STHB1200S5xeN010)
4x iSCSI 10 Gbps QLOGIC QTH8362
SSD RAID1 (set size 4 data), Adaptive Flash Cache disabled

* these NICs are not officially supported for iSCSI on XenServer 7.4 (http://hcl.xenserver.org/)

XCP-ng Free 7.4

We are aware that the SAN network should use only 10Gig ports, but that's the hardware we have.

We have tried several scenarios; each time we get SSD read and write performance of about 200 MB/s inside the VMs (Windows 10 Pro, Windows Server 2012, Debian 9 - all with the virtualization guest additions installed):

1) We connected the HPs with the 4x 1Gig Intel NICs and the DELLs with the 2x BCM5720 and the BCM57800 1Gig ports.
We configured bonding and LACP (hashing on source/destination IP and port) on XCP-ng and on the switch.
In this configuration all ports were in one VLAN and the same subnet, with each 3PAR port on a different IP.
The iSCSI SR was added with multipathing: the four controller IPs comma-separated and Target IQN * (a rough CLI sketch follows the list of effects below).
Effects:
- 4 of 4 path active (4 iSCSI sessions)
- snmp on switch showed almost equally balanced bandwidth on all 4 nics
- all failover scenarios working fine
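
For reference, the equivalent xe CLI for this setup is roughly as follows (whether it is done from the CLI, XenCenter or Xen Orchestra the parameters are the same; UUIDs and IP addresses are placeholders, the SCSIid is the WWID from the multipath output further down):
Code:
      # LACP bond over the four storage PIFs, hashing on source/destination IP+port
      xe bond-create network-uuid=<storage-net-uuid> \
         pif-uuids=<pif1>,<pif2>,<pif3>,<pif4> mode=lacp
      xe bond-param-set uuid=<bond-uuid> properties:hashing_algorithm=tcpudp_ports

      # enable dm-multipath on each host (host in maintenance mode first)
      xe host-param-set uuid=<host-uuid> other-config:multipathing=true \
         other-config:multipathhandle=dmp

      # lvmoiscsi SR with all four 3PAR port IPs and wildcard target IQN
      xe sr-create host-uuid=<host-uuid> type=lvmoiscsi content-type=user shared=true \
         name-label="3PAR iSCSI" \
         device-config:target=10.0.0.11,10.0.0.12,10.0.0.13,10.0.0.14 \
         device-config:targetIQN="*" \
         device-config:SCSIid=360002ac000000000000000230001fee5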

We are aware that mixing bonding with multipathing is not preferred and that multipathing alone should work better in most cases (https://docs.citrix.com/content/dam/doc ... config.pdf).

2) We connected the HPs with the 4x 1Gig Intel NICs and the DELLs with the 2x BCM5720 and the BCM57800 1Gig ports.
We disabled bonding and LACP and moved each NIC and 3PAR port to a separate VLAN (4 VLANs; see the sketch after the effects list).
The iSCSI SR was added with multipathing: the four controller IPs comma-separated and Target IQN *.
Effects:
- 4 of 4 path active (4 iSCSI sessions)
- snmp on switch showed almost equally balanced bandwidth on all 4 nics
- all failover scenarios working fine
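
The per-path VLAN layout corresponds roughly to the following xe commands, repeated for each of the four NIC / 3PAR-port pairs (VLAN IDs, UUIDs and IP addresses are placeholders):
Code:
      # one dedicated network + VLAN per path (here VLAN 101 of 101-104)
      xe network-create name-label="iSCSI-vlan101"
      xe vlan-create network-uuid=<vlan101-net-uuid> pif-uuid=<physical-pif-uuid> vlan=101

      # static IP on the resulting VLAN PIF, one subnet per VLAN
      xe pif-reconfigure-ip uuid=<vlan101-pif-uuid> mode=static \
         IP=10.1.1.21 netmask=255.255.255.0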

3) To exclude a switch/bonding/multipathing configuration error, we connected a DELL's BCM57800 10Gig NIC directly to a 3PAR controller port. We tried a single path as well as forced multiple paths (sketch below).
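
One way to bring up and inspect the extra sessions manually from dom0 in this kind of direct-attach test (portal IP and target IQN are placeholders):
Code:
      # discover the 3PAR targets behind the directly attached port
      iscsiadm -m discovery -t sendtargets -p 10.2.2.1:3260

      # log in to each discovered portal (one iSCSI session per path)
      iscsiadm -m node -T <target-iqn> -p 10.2.2.1:3260 --login

      # verify the sessions and the resulting multipath map
      iscsiadm -m session
      multipath -ll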


All options were tested with:
a) MTU 1500/9000,
b) SSD RAID1/5, Adaptive Flash Cache enabled/disabled, FC HDD RAID1/5
c) As proposed in https://support.hpe.com/hpsc/doc/public ... cale=en_US
we have tested:
- 3PAR Persona 1 with the default config
Code:
# multipath -ll
      360002ac000000000000000230001fee5 dm-0 3PARdata,VV             
      size=200G features='0' hwhandler='0' wp=rw
      `-+- policy='service-time 0' prio=1 status=active
        |- 19:0:0:0 sdb 8:16 active ready running
        |- 20:0:0:0 sdc 8:32 active ready running
        |- 21:0:0:0 sdd 8:48 active ready running
        `- 22:0:0:0 sde 8:64 active ready running

then with:
Code:
      # /etc/iscsi/iscsid.conf
      node.session.timeo.replacement_timeout = 15
      node.conn[0].timeo.noop_out_interval = 5      # also tried 10
      node.startup = automatic

      # and in /etc/multipath.conf:
      path_grouping_policy "group_by_prio"          # also tried multibus

- 3PAR Persona 2 with:
Code:
      # cat /etc/multipath.conf
         defaults {
          polling_interval 10
          find_multipaths yes
         }
         devices {
          device {
          vendor "3PARdata"
          product "VV"
          features "0"
          path_selector "round-robin 0"
          path_grouping_policy "group_by_prio"
          hardware_handler "1 alua"
          prio "alua"
          failback "immediate"
          rr_weight "uniform"
          no_path_retry 18
          rr_min_io 100
          path_checker "tur"
          fast_io_fail_tmo 20
          dev_loss_tmo "infinity"
          }
         }
         
      # multipath -ll
      360002ac000000000000000230001fee5 dm-0 3PARdata,VV             
      size=200G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
      `-+- policy='round-robin 0' prio=50 status=active
        |- 15:0:0:0 sdb 8:16 active ready running
        |- 16:0:0:0 sdc 8:32 active ready running
        |- 17:0:0:0 sdd 8:48 active ready running
        `- 18:0:0:0 sde 8:64 active ready running

d) tried to tune kernel parameters with various values like:
Code:
      sysctl -w net.core.rmem_max=134217728
      sysctl -w net.core.wmem_max=134217728
      sysctl -w net.ipv4.tcp_rmem="4096 87380 134217728"
      sysctl -w net.ipv4.tcp_wmem="4096 65536 134217728"
      sysctl -w net.core.netdev_max_backlog=300000
      modprobe tcp_cubic
      sysctl -w net.ipv4.tcp_congestion_control=cubic
      sysctl -w net.ipv4.tcp_sack=0
      sysctl -w net.ipv4.tcp_fin_timeout=15
      sysctl -w net.ipv4.tcp_timestamps=0

      sysctl -w net.ipv4.tcp_mtu_probing=1
      sysctl -w net.ipv4.conf.all.arp_filter=1
      sysctl -w net.ipv4.tcp_reordering=0
      ifconfig eth$i txqueuelen 10000

e) tried to turn on and off hardware offloading on NICs
Code:
      ethtool -K eth$i gso on/off
      ethtool -K eth$i gro off/on
      ethtool -K eth$i tso on/off
      ethtool -K eth$i lro off/on
      ethtool -K eth$i tx off/on


f) tried both network backends: xe-switch-network-backend bridge and openvswitch (a sketch of how a) and f) were applied follows below)
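
For completeness, a) and f) correspond roughly to the following commands (network/PIF UUIDs are placeholders; jumbo frames must also be enabled on the switch and the 3PAR ports, and switching the network backend requires a host reboot):
Code:
      # a) MTU 9000 on the storage network, then re-plug the PIFs to apply it
      xe network-param-set uuid=<storage-net-uuid> MTU=9000
      xe pif-unplug uuid=<pif-uuid>
      xe pif-plug uuid=<pif-uuid>

      # f) toggle the network backend (reboot afterwards)
      xe-switch-network-backend bridge
      xe-switch-network-backend openvswitch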


With some of the configuration combinations performance was much lower, but with none of them could we get faster than roughly 200 MB/s read or write.
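
For anyone reproducing this, the cap can also be checked below the VM layer by reading straight from the multipath device in dom0, which takes the guest storage stack out of the picture (8 GB sequential read; the WWID is the one from the multipath output above):
Code:
      dd if=/dev/mapper/360002ac000000000000000230001fee5 of=/dev/null \
         bs=1M count=8192 iflag=direct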


On the other hand:
1) VMware ESXi, Dell BCM57800 10Gig port connected directly to a 3PAR port:
a) software iSCSI
- Windows 10 Pro (without VMware Tools):
SSD RAID1 (set size 4 data), Adaptive Flash Cache disabled: 900 MB/s read, 530 MB/s write

- Debian 9 (with open-vm-tools):
SSD RAID1 (set size 4 data), Adaptive Flash Cache disabled: 600 MB/s read, 400 MB/s write
b) hardware iSCSI offload: 150 MB/s read/write


2) Bare-metal Windows 10 Pro, Dell BCM57800 10Gig port connected directly to a 3PAR port:
SSD RAID1 (set size 4 data), Adaptive Flash Cache disabled: 600 MB/s read, 430 MB/s write
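
(For anyone who wants to reproduce the numbers: a large-block sequential fio run like the one below is one way to get comparable figures in a Linux guest. The device path is just an example, and the write job destroys data on that device.)
Code:
      # sequential read, 1M blocks, direct I/O against a dedicated test disk
      fio --name=seqread --filename=/dev/xvdb --rw=read --bs=1M \
          --ioengine=libaio --iodepth=32 --direct=1 --runtime=60 --time_based

      # sequential write (wipes the test disk!)
      fio --name=seqwrite --filename=/dev/xvdb --rw=write --bs=1M \
          --ioengine=libaio --iodepth=32 --direct=1 --runtime=60 --time_based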

