HPE Storage Users Group

A Storage Administrator Community




 Post subject: Adaptive Optimization (AO) - The silver bullet?
Posted: Wed Feb 20, 2019 8:32 pm

Joined: Wed Nov 08, 2017 8:57 am
Posts: 42
Hello. We have had performance issues on this SAN for over a year.

It's a 7200 populated as follows:

(28) 900GB SAS 10K Drives
(41) Virtual Volumes in RAID5 CPG
(27) Virtual Volumes in RAID6 CPG
(8) 92GB SSD 150K Drives
(1) Virtual Volume in RAID1 CPG


The 7200 is connected to an HP C7000 that houses 8 blades, all running ESXi. We have around 95 Horizon VDI VMs running Windows 7 and around 40 VMs running Windows Server.

We are at the point where we are about to spend money to address the issue. I have SSH'd into the 3PAR and run statvlun -hostsum -ni -sortcol 3,dec and statvv -sortcol 2,dec -ni. This is the part that confuses me. I have seen the total I/O stay above 5,000 for a considerable amount of time (say, 5 minutes straight). Of those 5,000 I/O, only a few are on the SSD VV, so basically it's all on the FC CPGs.

I need to know which values here indicate a performance hit. I don't understand how we can pull over 5,000 total I/O when NinjaStar says we should max out at 3,112. I know Ninja is more of a reference tool, but still. Also, we constantly get the alerts below:

FC-10k tier has disks exceeding low water mark of recommended IOPS
FC-10k tier has disks exceeding high water mark of recommended IOPS


Which values should I look at to troubleshoot this and to work out what might fix it?

We do have an RDM that I've seen go over 4,000 I/O at times (it's a SQL database), which we are looking into. But I'm not sure that fixing it alone will address our issues.

Currently we do not have all ports cabled, and we know this. Only two ports are utilized because we have two sup modules with (4) 10Gb ports each. However, when using them as 10Gb ports you can only use (2), which leaves us with (2) 10Gb ports on each sup card for a total of (4). We have (2) going to the C7000 and (2) going to the 3PAR. We cannot enable more ports at the moment.

One question we had: is it possible to connect the Flex10 modules on the C7000 chassis directly to the 3PAR for iSCSI, without a switch in between? Ideally with 2-4 connections from each Flex10 straight into the 3PAR.

We know that using all the ports would help, but we are not sure it would address the issue at hand, which we believe is an I/O / spindle issue.

We were looking at buying (4) 920GB SSDs and adding them to the array to try to address the issue.

We do not have AFC or AO enabled. The only thing on the SSD VV is the VDI replica image. I've been told by HP a few times that it's barely doing anything. Should I look into AO? Are there any downsides to AO?

Thanks so much.


 Post subject: Re: Adaptive Optimization (AO) - The silver bullet?
Posted: Thu Feb 21, 2019 1:33 am

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
https://3parug.com/viewtopic.php?f=18&t=3085

There are some pointers and commands there to see if AO can help and a story where it helped.

You cannot use AO on dedupe volumes if your replicas are deduped and they use all SSD capacity.

And RAID1 VV..... Please enlighten me as to why.

_________________
The views and opinions expressed are my own and do not necessarily reflect those of my current or previous employers.


 Post subject: Re: Adaptive Optimization (AO) - The silver bullet?
Posted: Thu Feb 21, 2019 8:35 am

Joined: Wed Nov 08, 2017 8:57 am
Posts: 42
MammaGutt wrote:
https://3parug.com/viewtopic.php?f=18&t=3085

There are some pointers and commands there to see if AO can help and a story where it helped.

You cannot use AO on dedupe volumes if your replicas are deduped and they use all SSD capacity.

And RAID1 VV..... Please enlighten me as to why.


Thanks for the reply. I did read through that post and was going to reply on that thread, but since my setup is different I didn't want to hijack it.

None of our volumes are deduped.

As for the RAID1 SSD VV: great question. I honestly have no idea. I'll admit I'm green to storage, so I'd appreciate it if you could elaborate on why that isn't the best choice. Ninja shows R5 at 69,600 I/O and R1 at 98,700 I/O. It looks like R5 is around 230GB usable while R1 is only 160GB usable.

Thanks again.


 Post subject: Re: Adaptive Optimization (AO) - The silver bullet?
Posted: Thu Feb 21, 2019 11:09 am

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
Oh. You actually have 92GB drives (the 100GB SLC SSDs). I assumed it was a typo and should be 920GB.

Anyways. If you extract the SR output I can make some assumptions.

Btw, do you have SR and AO licenses? And what 3PAR OS version are you running? 3.2.2 has Express Layout, which allows R5 7+1 (which is okay if the SSDs are only used for AO). Earlier versions only allow 3+1 with 8 SSDs.
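To put rough numbers on the set-size point, here is a minimal Python sketch of how much of the raw capacity each layout keeps for data. The function name and the layout table are made up for illustration, and the result ignores sparing and system overhead, so real usable capacity will be noticeably lower than these figures.

```python
def usable_fraction(data_disks: int, parity_disks: int) -> float:
    """Fraction of raw capacity left for data in one RAID set."""
    return data_disks / (data_disks + parity_disks)


# Layouts discussed in this thread; R1 is treated as a 1+1 mirror.
LAYOUTS = {
    "R1 (1+1)": (1, 1),
    "R5 3+1": (3, 1),
    "R5 7+1": (7, 1),
}

# 8 x 92GB SSDs, as listed in the first post.
RAW_GB = 8 * 92

for name, (data, parity) in LAYOUTS.items():
    frac = usable_fraction(data, parity)
    # Upper bound only: sparing and system overhead are NOT subtracted here.
    print(f"{name}: {frac:.1%} of raw, at most ~{RAW_GB * frac:.0f} GB")
```

The only point is the ratio: a wider set spends a smaller share of the SSDs on parity, which matters a lot when the SSD tier is this small.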


Btw, if you hit the high watermark you have serious performance issues. AO with very limited SSD capacity might not be able to fix that.

 Post subject: Re: Adaptive Optimization (AO) - The silver bullet?
Posted: Thu Feb 21, 2019 11:35 am

Joined: Wed Nov 08, 2017 8:57 am
Posts: 42
Running 3.2.2.612 (MU4). No, no typo :( Only the 100GB ones. Yes, we have both SR and AO licensing.

I've attached some info that will hopefully help. And yes, we do have serious performance issues, lol. I'm looking to see what will fix them. I have a quote for (4) 920GB SSDs, but I'm not buying anything until I know for sure what will fix this.

From what I'm seeing, most of the I/O comes from one particular VV that is an RDM for a SQL server. I have the apps team looking at it because it's generating a TON of I/O, and I'm wondering if they just have some bad queries or something. I'm curious whether my issue would go away completely if they addressed that.

Thanks again for the help.


Attachments:
Report.pdf [3.47 MiB]
 Post subject: Re: Adaptive Optimization (AO) - The silver bullet?
Posted: Thu Feb 21, 2019 11:49 am

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
I will have to look at this on a bigger screen later (worst case over the weekend), but I see a SQL drive really killing the FC drives. There are also some bigger chunks which on their own are OK on FC, but when everything is added together it just becomes too much.

I would not buy anything less than 8 SSDs, but I need to look a little deeper to see how much capacity you should get to get you out of the biggest problems.

I'm just throwing this out there: you have a 7K, which is in its EoL lifecycle and is most likely already on a costly support contract. It might be a better option from an ROI/TCO point of view to refresh the entire array rather than throwing more money into something old, but no matter what you do, I will look at the SR data and see how much SSD capacity you should have.

 Post subject: Re: Adaptive Optimization (AO) - The silver bullet?
Posted: Fri Feb 22, 2019 8:23 am

Joined: Wed Nov 08, 2017 8:57 am
Posts: 42
I can't thank you enough for your time!

As for the EOL: yes, we are aware, and unfortunately a new SAN isn't in the budget right now.

Regarding the 8 SSDs: yes, that was our original intent, but they are $37K (920GB) / $41K (1.92TB). (4) drives are around $11K.

The reason we use RAID1 on those is so we have cage redundancy. Since you need at least 4 cages to do RAID5, we did RAID1. I believe we were supposed to do R1 on every CPG for that reason but didn't on the 10K. Please correct me if I'm wrong regarding cage redundancy and RAID levels.


 Post subject: Re: Adaptive Optimization (AO) - The silver bullet?
Posted: Fri Feb 22, 2019 12:02 pm

Joined: Wed Nov 08, 2017 8:57 am
Posts: 42
Another question.

When troubleshooting, I'm currently mainly looking at I/O with the commands below. If I see a specific host or VV hammering the I/O for a considerable amount of time, I go find out what it is. However, there isn't always an issue with what the host/VM is doing.

Attachment: 1.jpg (statvv -sortcol 2,dec -ni)
Attachment: 2.jpg (statvlun -hostsum -ni -sortcol 3,dec)


What other numbers should I be looking at? SVT is service time, right? What numbers are acceptable? QLEN is queue length, right? That is the amount of I/O waiting, so I'm assuming any amount here would be bad?

Totally unrelated question.

In SSMC, under hosts: would it affect anything to change the names of the hosts? I want to match them up with the actual host names. Whoever set these up thought it would be a good idea to just put "SERVERG9-03" instead of the naming convention we use, e.g. SITECODE-ESXI-03.

Thanks again for your help.


 Post subject: Re: Adaptive Optimization (AO) - The silver bullet?
Posted: Mon Feb 25, 2019 8:05 am

Joined: Wed Nov 08, 2017 8:57 am
Posts: 42
Good morning mate.. I was just wondering if you had a chance to look at this yet. Thanks again!


 Post subject: Re: Adaptive Optimization (AO) - The silver bullet?
Posted: Mon Feb 25, 2019 3:47 pm

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
Sorry for the late response.

Just making this "quick" as I don't have time to calculate everything down from frontend IOPS to backend IOPS to realistic backend IOPS on your disks, and to compare the different volumes to each other.

1.9TB of SSD should take the majority of the hot data (assuming it is somewhat of a fit for AO). A simple glance says it will serve something like 75% of the IOPS, but I didn't take a very deep look now, sorry.
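For anyone wondering what the frontend-to-backend calculation looks like, here is a rough Python sketch. The write penalties are the usual rules of thumb (2 for R1, 4 for R5, 6 for R6), the 70/30 read/write mix is purely an assumption since the thread does not state it, and cache hits are ignored, so treat the output as a ballpark and not as what SR would report.

```python
# Rule-of-thumb backend I/Os generated per frontend write, by RAID level.
WRITE_PENALTY = {"R1": 2, "R5": 4, "R6": 6}


def backend_iops(frontend_iops: float, read_fraction: float, raid_level: str) -> float:
    """Estimate backend IOPS for a given frontend load (no cache hits assumed)."""
    reads = frontend_iops * read_fraction
    writes = frontend_iops * (1.0 - read_fraction)
    return reads + writes * WRITE_PENALTY[raid_level]


FRONTEND_IOPS = 5000   # peak total seen in statvv, from the first post
READ_FRACTION = 0.7    # ASSUMPTION: not stated anywhere in the thread
FC_DISKS = 28          # 900GB 10K drives, from the first post
SSD_SHARE = 0.75       # "something like 75% of the IOPS" moved to SSD by AO

before = backend_iops(FRONTEND_IOPS, READ_FRACTION, "R5")
after = backend_iops(FRONTEND_IOPS * (1 - SSD_SHARE), READ_FRACTION, "R5")

print(f"FC backend IOPS today:    {before:.0f} ({before / FC_DISKS:.0f} per disk)")
print(f"FC backend IOPS after AO: {after:.0f} ({after / FC_DISKS:.0f} per disk)")
```

This also shows why a frontend number can look fine while the disks are drowning: every frontend write on R5 or R6 turns into several backend I/Os.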

Without doing the exact numbers, you should come a long way with 8x 400/480GB, if those are still available for sale. Don't go 4x 920GB. If you get acceptance for 920GB you should get 8 (to keep the same number of SSDs as the 100GB drives and to minimize the capacity allocated for sparing). You might need to set the sparing algorithm to minimal (only lose one SSD to sparing) and use the maximum available set size (which would be R5 7+1). Keep in mind that you should only use the maximum set size for AO CPGs, so you would have to do AO for the "NON-AO-SSD-R1" CPG as well. There is a lot of wasted space there today.

Btw, your system's most active volume/area in terms of writes is ????-SQL-PRD-TEMP. It's small but it's hot.


As for the troubleshooting part: I always look at VLUN service time. In most cases when you see performance problems, people notice an increase in service time. Your system is really having issues, as .srdata is actually amongst the heavy hitters, which in my experience tends to happen when it is never able to catch up.

It isn't always the VV/VLUN with the high service time that is at fault, but it tells you that something is wrong. QLEN should be low, but this is just as much a host/server configuration matter as it could be a storage issue.
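One way to see how service time and queue length hang together is Little's Law: average I/Os in flight is roughly IOPS times service time. A small Python sketch, where the function name and the example service times are just for illustration and are not recommended thresholds:

```python
def expected_outstanding_ios(iops: float, service_time_ms: float) -> float:
    """Little's Law: average I/Os in flight = arrival rate x time in system."""
    return iops * (service_time_ms / 1000.0)


# ~4,000 I/O is what the SQL RDM was seen doing in the first post.
# The service times below are illustrative values, NOT acceptable/unacceptable limits.
for svt_ms in (2, 10, 30):
    in_flight = expected_outstanding_ios(4000, svt_ms)
    print(f"4000 IOPS at {svt_ms} ms -> about {in_flight:.0f} I/Os in flight")
```

The QLEN shown by the stat commands is a sample, so it will not match this exactly, but it explains why a busy volume shows a non-zero queue even when nothing is broken, and why the queue grows when service time goes up at the same load.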

And yes, my OCD tells me that one physical server should have the same reference on any equipment in your environment: switchport tagging, SAN zoning, storage array, CMDB, etc.
