This looks like it would be perfect for Tarsnap -- the data Tarsnap stores is almost always kept for 30+ days, and it's almost always in objects of 128 kB or more. The $0.01/GB for reads would be annoying (one of the reasons Tarsnap is hosted in EC2 is that it gets free data transfer to and from S3; data is regularly retrieved and stored back after filtering out blocks marked for deletion), but it would still be cheaper.
One thing concerns me however: Standard – IA has an availability SLA of 99%.
If this is just a reduced SLA but the actual availability is likely to be similar, that's fine. But if the actual availability is not expected to hit 99.9% -- say, if the backend implementation is "one copy in online storage, plus a backup in Glacier which gets retrieved if the online copy dies" -- that would be completely inadequate.
Thanks! This is a very important detail which isn't documented anywhere: Retries are likely to succeed. A service where 1% of requests fail but failures are completely uncorrelated is far more usable than a service where 0.01% of requests fail but they keep on failing no matter how many times you retry them.
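To make that concrete, here's a minimal sketch of the "just retry" approach, assuming boto3; the bucket and key names are hypothetical:

```python
import time

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def get_with_retries(bucket, key, attempts=5):
    """Fetch an S3 object, retrying with exponential backoff.

    If failures are uncorrelated, each attempt is an independent coin flip,
    so even a 1% per-request failure rate almost never survives five
    attempts. If failures are correlated, no amount of retrying helps.
    """
    for attempt in range(attempts):
        try:
            return s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        except ClientError:
            if attempt == attempts - 1:
                raise
            time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s, ...

# Hypothetical usage:
# block = get_with_retries("tarsnap-blocks", "blocks/ab/cd/abcd1234")
```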
Additionally, assuming your block data is hash-addressed, i.e. the S3 objects never change once they are in S3, adding CloudFront in front of your buckets may go a long way toward increasing that percentage.
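A rough sketch of that, assuming boto3 (the bucket name and key scheme are made up): because a hash-addressed object never changes after it's written, you can mark it cacheable essentially forever, so a CloudFront distribution in front of the bucket rarely needs to hit S3 at all.

```python
import hashlib

import boto3

s3 = boto3.client("s3")

def put_immutable_block(bucket, data):
    """Store a content-addressed block with a very long cache lifetime.

    The key is derived from the content hash, so the object can never
    change once written and a CDN can safely cache it for a year.
    """
    key = "blocks/" + hashlib.sha256(data).hexdigest()
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        CacheControl="public, max-age=31536000, immutable",
    )
    return key
```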
I had to read these sentences a few times to understand what you were trying to say: "You now have the choice of three S3 storage classes (Standard, Standard – IA, and Glacier) that are designed to offer 99.999999999% (eleven nines) of durability. Standard – IA has an availability SLA of 99%."
No availability is mentioned for the others, but I assume it's 100%? Perhaps a simple table would help readers scan and visually compare the two properties across the three storage classes?
When choosing between Standard and this, it would be helpful to understand the pros and cons. With the current description (below), it reads as if the difference is only in pricing, but I assume there is a technical difference as well.
Also, the availability number could be explained better -- why is it different?
> Standard – IA offers the high durability, throughput, and low latency of Amazon S3 Standard, with a low per GB storage price and per GB retrieval fee.
Based on the post, it seems availability drops but durability remains the same. You might need to retry to get an object, and you'll be successful eventually.
Right. But it matters how much availability drops, and also what the correlation is between failures -- if they're completely uncorrelated but there's a 1% failure rate, you just retry, but if 1% of objects are going to be unavailable for the next four hours, that's a problem.
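Back-of-the-envelope, assuming a 1% per-request failure rate:

```python
# Chance that every attempt fails, if failures are fully independent.
p_fail = 0.01
for attempts in (1, 2, 3, 4):
    print(attempts, "attempts, all fail:", p_fail ** attempts)
# Roughly 1e-2, 1e-4, 1e-6, 1e-8 -- retries win very quickly.
# If failures are correlated (the object is simply down for four hours),
# the probabilities don't multiply and retrying buys you nothing.
```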
> it’s vital that customers have immediate, instant access to any of [our photos] at a moment’s notice – even if they haven’t been viewed in years. [IA] offers the same high durability and performance ... so we can continue to deliver the same amazing experience for our customers.
The way this was phrased implies that this customer's use-case had a hard requirement that all of their data be in "online" storage at all times, and their satisfaction implies that IA does, in fact, hit this requirement.
S3 Classic: Your file gets replicated on 3 live HDs (and/or HDs backed by RAID arrays -- not sure about the internal S3 storage topology).
S3 Infrequent: Your file gets stored on 1 live HD (or a single hardware-redundancy component) plus a copy in Glacier. If the live HD dies, your file is automatically restored from Glacier onto a new HD (but your data may be inaccessible during that automatic re-deploy).
Glacier: offline Blu-ray discs combined with error correction, accessed by robot arms and temporarily restored to live HDs on demand.
Whichever they use, most disks are likely unpowered most of the time -- like Facebook's equivalent cold-storage setup, where only 1 out of 12 disks can be powered at any time.
I understand that 99% doesn't mean that 1% of the objects will always be inaccessible. Instead, I guess what they mean is that they allow themselves up to roughly 88 hours a year of downtime for any bucket.
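Rough arithmetic behind that reading, treating the SLA as a yearly downtime budget:

```python
# Downtime allowed per year if availability is read as "fraction of
# time the service may be down".
hours_per_year = 24 * 365  # 8760
for availability in (0.99, 0.999, 0.9999):
    allowed = hours_per_year * (1 - availability)
    print(f"{availability:.2%} -> up to {allowed:.1f} hours/year of downtime")
# 99% -> ~87.6 h, 99.9% -> ~8.8 h, 99.99% -> ~0.9 h
```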
> One thing concerns me however: Standard – IA has an availability SLA of 99%.
> If this is just a reduced SLA but the actual availability is likely to be similar, that's fine. But if the actual availability is not expected to hit 99.9% -- say, if the backend implementation is "one copy in online storage, plus a backup in Glacier which gets retrieved if the online copy dies" -- that would be completely inadequate.
Hopefully we'll get more details over time.