Ah, my first official post during my tenure at Pure, and it couldn’t have come at a better time: just in time for the Purity 4.0 release, which we announced today. While there are plenty of under-the-covers enhancements, I am going to focus on the two biggest parts of the release: new hardware and replication. There are other features, such as hardware security token locking, but I won’t go into those in this post. So first let’s talk about the advancements in hardware!
Prior to this release there was one current version of the FlashArray (FA), the 420 model. Today we announced two additional models: the 405, for those with smaller environments or lower capacity needs, and the beast, the FA 450, which is a big step up in stats from the 405 and 420. Before I go deeper into the numbers, remember that even though these models come in small, medium, and large–the feature set doesn’t! All three FA models include every feature that Purity offers, across the board. Find some more information here. Sweet, now for the technical specifications:
| Model | FA 405 | FA 420 | FA 450 |
|---|---|---|---|
| Scale | 40 TB usable | 125 TB usable | 250 TB usable |
| Performance | 100,000 32 KB IOPS, 3 GB/s | 150,000 32 KB IOPS, 5 GB/s | 200,000 32 KB IOPS, 7 GB/s |
| Connectivity | 4 x 8 Gb/s FC or 4 x 10 Gb/s iSCSI | 8 x 8 Gb/s FC or 8 x 10 Gb/s iSCSI | 12 x 16 Gb/s FC or 12 x 10 Gb/s iSCSI |
| Controllers | 4 x Intel 8-core CPUs, 256 GB DRAM, 2 x 1U HA controllers, 660 W | 4 x Intel 8-core CPUs, 512 GB DRAM, 2 x 2U HA controllers, 800 W | 4 x Intel 12-core CPUs, 1,024 GB DRAM, 2 x 2U HA controllers, 1,000 W |
A few things to call out in these numbers. First, the capacity listed is usable capacity, so the physical capacity in the array is actually much less. To get this number, we take the most common data reduction rate we observe across all of our customers and apply it to the physical capacity we offer in the array. So, when all is said and done, expect to see something around those capacities on your own array.

Second, the IOPS numbers. Notice that we list possible IOPS for 32 KB workloads. Traditionally, IOPS numbers are quoted with much smaller I/O sizes, which inflates the listed figure. At Pure, we decided to look at what our customers are actually doing (on average) using our Cloud Assist monitoring environment. For those of you not familiar with Cloud Assist, it is our dial-home mechanism that constantly monitors performance (among a ton of other information) in near real-time for all of our deployed customer arrays (save dark sites). We noticed that a large majority average I/O sizes much larger than 4, 8, or even 16 KB; 32 KB was pretty much the median, so that is what we list for our performance numbers (if this median changes in the future, our listings will change too).

Another important note concerns the 405 model. You may notice in the picture that the whole array is only 4U: 2U for the controller space and 2U for the storage. In the 420 and 450 the controllers are each 2U, but keeping that controller size while maintaining a 4U limit for this model would hurt availability (a single-controller design would be required). That would not be acceptable, so instead the controllers in the 405 are 1U each, allowing for HA even in the smallest model.

The last thing I want to mention is something I really like about this product: if you buy a 405 or a 420, it can be upgraded non-disruptively to a more advanced model.
So if you decide at a later date that you need more capacity or horsepower and your current model won’t do–an upgrade is totally an option. No data migration or unavailability is required! Furthermore, as Pure releases faster controllers, you can upgrade to those as well. We strive to maintain backwards compatibility as we move forward, to keep this upgrade cycle possible.
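To make the usable-capacity math above concrete, here is a toy calculation. The raw capacity and reduction ratio used below are illustrative assumptions, not Pure’s published specifications:

```python
def usable_capacity_tb(raw_tb: float, reduction_ratio: float) -> float:
    """Usable capacity = physical (raw) flash capacity * expected data-reduction ratio."""
    return raw_tb * reduction_ratio

# Hypothetical example: 10 TB of raw flash at an assumed 4:1 reduction rate.
print(usable_capacity_tb(10, 4.0))  # 40.0
```

The point is simply that the "Scale" row in the table is derived, not raw: your mileage depends on how well your data set reduces.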
Check out the following video overview of the FlashArray 400 series from Jim Sangster:
The flagship software feature in the Purity 4.0 release is of course replication (score!). As many readers may know I did extensive work with replication in my previous job, so I was really excited to see this arrive shortly after I joined Pure. First, Purity 4.0 has three main software feature sets:
- FlashReduce–the data-reduction feature set: compression, deduplication, pattern removal, etc.
- FlashProtect–the combination of reliability and protection features: RAID-3D, data-at-rest encryption, and non-disruptive everything.
- FlashRecover–the newest feature set, which includes the existing local snapshot technology and introduces remote replication and protection policies.
So how does Purity remote replication work? Remote replication essentially leverages the same technology as local snapshots. We create a “FlashReduced” metadata snapshot of the source volume, but instead of storing it on the local array, it rests on the remote array. The first snapshot sends its data over to the target array; after that, only differentials need to be sent over the link. In theory, these remote snapshots may not even require additional capacity on the target array. Since they are reduced in the context of the target array’s data, the data already residing there may allow for high reduction of the data that arrives–the target treats it no differently than anything else it stores on its SSDs.

To configure replication, you simply create a policy for a given device or set of devices (sets create consistent copies across the group). You can create local snapshots at a given time interval and have the system automatically keep or destroy them at given intervals as well; the same goes for remote replication. This version of remote replication is asynchronous, with a replication interval as low as 5 minutes, meaning that every 5 minutes we take a remote snapshot of the device. The current code supports 5,000 snapshots in total.

One of the things that really struck me is how EASY it is to set up and manage–there really is not much to do. I will go into much more detail on how to do all of this in an upcoming post, but at a high level:
- Authorize replication at the array level. Each source and target needs to be paired to allow replication in either direction.
- Run through the quick replication wizard to create a local and/or remote replication policy for a device or set of devices.
- Enable the replication.
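The three steps above can be sketched in code. This is purely an illustrative model of the workflow–the class and method names here are hypothetical, not the actual Purity API or CLI:

```python
# Hypothetical model of the pair -> policy -> enable workflow.
class FlashArray:
    def __init__(self, name: str):
        self.name = name
        self.peers: set[str] = set()
        self.pgroups: dict[str, dict] = {}

    def authorize(self, other: "FlashArray") -> None:
        # Step 1: pair the arrays so replication can run in either direction.
        self.peers.add(other.name)
        other.peers.add(self.name)

    def create_policy(self, pgroup: str, volumes: list[str],
                      target: "FlashArray",
                      interval_min: int = 5, keep: int = 10) -> None:
        # Step 2: a policy for a set of devices, with a replication
        # interval (5-minute minimum per the post) and a retention count.
        assert target.name in self.peers, "arrays must be paired first"
        assert interval_min >= 5, "minimum asynchronous interval is 5 minutes"
        self.pgroups[pgroup] = {
            "volumes": volumes, "target": target.name,
            "interval_min": interval_min, "keep": keep, "enabled": False,
        }

    def enable(self, pgroup: str) -> None:
        # Step 3: turn replication on.
        self.pgroups[pgroup]["enabled"] = True

src, tgt = FlashArray("array-a"), FlashArray("array-b")
src.authorize(tgt)
src.create_policy("pg1", ["vol1", "vol2"], tgt, interval_min=5)
src.enable("pg1")
print(src.pgroups["pg1"]["enabled"])  # True
```

Note how little state is involved: a peer relationship, a policy, and an on/off switch, which matches how simple the real setup feels.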
That’s really it. The hardware connection obviously needs to be in place, but nothing else has to be configured. Note that the FlashArray does not consume front-end host ports for replication. Additional built-in Ethernet connections on the controllers (if you have an array now, you already have this hardware–don’t worry) provide the physical pathway, so replication traffic is sent over TCP/IP, not FC or anything else. Once that is done, select a device or group of devices, create a policy, and enable it. Pretty straightforward.
One of the great things about this design choice is that there is no need to take multiple separate snapshots of a given point-in-time (PiT) on the remote side for backup, test/dev, or anything else. The same snapshot can be used for any of them–simply copy the snap to a volume for testing, and the original snap remains unchanged and can still be used for recovery/backup or future tests of that PiT. Note that snapshots cannot be mounted directly; they must be copied to an actual volume for the data to be accessed (note I said copied, not mounted–the original snapshot remains untouched during volume access).
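The copy-not-mount semantics can be modeled in a few lines. This is a simplified in-memory illustration (the snapshot and volume names are made up), not how Purity stores data internally:

```python
import copy

# A snapshot is an immutable point-in-time image of a volume's blocks.
snapshots = {"vol1.snap1": {0: b"data-A", 1: b"data-B"}}
volumes: dict[str, dict[int, bytes]] = {}

def copy_snap_to_volume(snap_name: str, vol_name: str) -> None:
    # The new volume gets its own logical copy; the snapshot is untouched.
    volumes[vol_name] = copy.deepcopy(snapshots[snap_name])

copy_snap_to_volume("vol1.snap1", "testvol")
volumes["testvol"][0] = b"changed"        # test/dev writes go to the copy...
print(snapshots["vol1.snap1"][0])         # ...while the original PiT is intact
```

Because writes land on the copied volume, the same PiT can later be copied out again for recovery or another test run.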
Furthermore, since this is snapshot-based replication, an application-initiated corruption (or, more commonly, an accidental deletion) on the source will not ruin both the source and target copies. Traditionally the corruption or deletion would be replicated, since the target is usually just one copy that continuously gets replicated to. Here, the latest PiT might pick it up, but the previous PiTs will be unaffected–this eliminates the need to take extra copies of the target snapshot to protect against corruption propagation.
Check out these videos for information and a demo of replication:
Sandeep Singh @storagesandeep
Chadd Kenney @pureflashgeek
That’s all for now–stay tuned for a deeper dive on replication and how it works!