I don't fully understand the underlying cause, so take this with a pinch of salt... but my girlfriend's father's photo collection almost got wiped out by Google Photos.
After being locked out of his Samsung tablet (supposedly it set itself a lockscreen out of the blue), I checked whether it had backed his photos up on Google Photos, but nothing there... After resetting it (it seems Samsung removed the ability to reset the lockscreen password via your Google account) I assumed that the photos were lost. However upon opening the app we rejoiced when they started appearing. Shortly afterwards, the Google Photos app popped up a message stating that an upgrade was required, after which all of the photos had disappeared again.
The workaround was to reset the tablet, open up Google Photos, wait until the photos had synced and then disconnect from the internet as soon as possible to prevent Google Photos from trying to update itself (the message couldn't be dismissed). My hunch was that the version of Google Photos that shipped with the tablet was very old and they have long-since updated the format for storing photos, hence why they wouldn't show up online?
Methinks you could definitely get in touch with Google about this.
Product support may well actually do what it's supposed to do...
Alternatively, keep the tablet on cotton wool until you see someone mention in here they're from Google, check their profile for contact info and email them directly. Keep doing this until you get a reply back, and get them to poke your issue over to the right department. :P
I did. I got a couple of random low-res thumbnails (they were just generic enough that I wasn't sure if they were actually original photos, or Google samples), and that was it – the rest were mysteriously missing.
They were clearly there on the server somewhere, but something was causing them not to be shown on the web, or the latest version of the Google Photos app.
Sure, but I've read a story like this about pretty much every company. In our house we have our photos in 3 places: the device, remote backup, and local offline backup (external hard drive). 2 is easy, 3 takes a little bit of a routine.
Right, it's why I don't trust these services myself. To be fair, while I say that Google Photos almost wiped his photo collection out, we didn't actually realise that sync had been enabled in the first place. I guess in a funny kind of way it saved his photo collection (otherwise it would have been toast when we reset the device to get past the lock), even if it did make it painstakingly frustrating to retrieve them again.
But one has to make sure they subscribe to the paid plan, because Google offers free unlimited storage only if you let them resize/compress the photos. Which is something I refuse.
Anyway, Dropbox wins on the OS-level integration, which also doubles as a bullshit- and hassle-free way to transfer files between your devices.
But well, at this point I'm a paying customer of both Dropbox and Google because I ran out of free-tier storage on both :).
My sister refuses to receive photos by Dropbox because, she says, it's less usable than WeTransfer and it looks like a virus to her. When sitting with her, I noticed:
- Dropbox aggressively pushes recipients to create a new account,
- She's not tech-savvy, so when a popup appears, she assumes it's mandatory. Hence she installed Dropbox without wanting it on her phone and PC and couldn't understand why so many steps are required to download a few photos. Also, "why does it try to upload all my folders?!?" hence the spammy/virus impression.
- Dropbox sends an Android notification for every upload, which is both annoying and worrying to her, because she doesn't want her private life to go online.
I always assumed the masses went with Dropbox. Turns out the first example I see from that audience thinks it's a virus. Big lesson in user experience here.
I can imagine. My use patterns mostly let me avoid this, but I've seen some of the stuff you've described.
It seems that any good business, in its quest to grow, will eventually start doing annoying and user-hostile things. Dropbox definitely was cleaner and better in the past than it is now. It's a story I see repeated by company after company - they reach peak quality, and instead of leaving things as they are, they have to "innovate" in more and more crap.
I'm one of the earlier users of Dropbox, so I still have a "Public" folder with an ability to create direct links to uploaded files. They turned this feature off for new accounts some time ago - I guess because people started using Dropbox as a CDN. Still, it's one of its most useful features for me. I use the "Public" folder almost every time I want to transfer some files to people - it lets the recipient avoid all those popups and captive forms bullshit.
I have 1/2 TB of pictures (kids' pics, videos from cell phones, SLR shots, etc.) backed up to multiple HDDs. If everyone put 1/2 TB or more to google, can google backend really handle that, if so for how long?
Also need to consider how long it will take me to download back those pic once they decide to shutdown the "free" service.
> If everyone put 1/2 TB or more to google, can google backend really handle that, if so for how long?
Not everyone is going to do that though (anytime soon anyway) so that's not a real concern. It's like asking when gmail first launched "okay but what if EVERYONE uses the full gigabyte?"
> Also need to consider how long it will take me to download back those pic once they decide to shutdown the "free" service.
This is an incredibly good point, both in terms of bandwidth considerations (particularly their ratelimiting) and in terms of products randomly disappearing with limited takeout windows.
FWIW, https://get.google.com/albumarchive/<G+ UID> will net you takeout archives of your image albums. Incidentally this works with any Google account that doesn't have public photo access turned off, and is rather fun to play with (as is the site: search operator :D)
--
> can google backend really handle that, if so for how long?
OK. (Been wanting to do this math for a while, actually...) Let's see. This is all back-of-the-envelope and I wouldn't mind some more concrete numbers to work with!
YT reencodes all videos into several formats.
I'm looking at http://youtu.be/1tQ5XwvjPmA, which is 1:20:58 long. It was uploaded fairly recently so has the full complement of encodings. I see:
- 5 DASH audio bitrates: 51k (27.53MB), 66k (31.93MB), and 120k (58.02MB) for clients that can decode OPUS, 89k Vorbis (46.67MB), and 132k M4A (73.16MB)
- 6 DASH video sizes in both WebM/MP4 (so 12 total formats): 256x144 (43.09MB / 63.54MB); 426x240 (39.79MB / 140.34MB); 640x360 (71.80MB / 122.65MB); 854x480 (118.37MB / 266.00MB); 1280x720 (234.63MB / 548.81MB); and 1920x1080 (463.04MB / 1.05GB). (Yes, WebM is amazing compared to MP4.)
- Five legacy video files across three containers (3GP, WebM, MP4): 176x144 3GP (39.51MB), 320x180 3GP (116.05MB), 640x360 WebM (211.30MB), 640x360 MP4 (205.97MB), and 1280x720 MP4 (621.68MB).
So, for this standard, 30fps 1080p video, YouTube is actually storing... 4.51GB of data. Huh! Nice.
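For anyone who wants to check that total, here is a minimal sketch that just adds up the sizes listed above. The one assumption is the unit conversion: the 1.05GB entry is treated as 1.05 × 1024MB, and the final total is divided by 1024 as well. The variable names are mine, not anything YouTube exposes.

```python
# Sizes in MB, copied from the list above.
audio_mb = [27.53, 31.93, 58.02, 46.67, 73.16]
dash_webm_mb = [43.09, 39.79, 71.80, 118.37, 234.63, 463.04]
# The 1080p MP4 was quoted as 1.05GB; assume 1GB = 1024MB here.
dash_mp4_mb = [63.54, 140.34, 122.65, 266.00, 548.81, 1.05 * 1024]
legacy_mb = [39.51, 116.05, 211.30, 205.97, 621.68]

total_mb = sum(audio_mb) + sum(dash_webm_mb) + sum(dash_mp4_mb) + sum(legacy_mb)
total_gb = total_mb / 1024

print(f"{total_mb:.2f} MB = {total_gb:.2f} GB")
```

With binary units this lands on the 4.51GB figure; with decimal units (1GB = 1000MB) it would come out a little higher.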
If this video is 1h20m, 1-(60/80) means I should subtract 25% from 4.51, and I get 3.38GB for one hour of video.
OK. Taking that figure of 700 hours of video uploaded per minute... that's 2366GB (2.31TB) per minute :)
In other words YouTube needs to find disk capacity for 39.42GB of data every second.
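The scaling steps above can be sketched like this. It assumes the same inputs as the comment (4.51GB for an 80-minute video, 700 hours uploaded per minute, 1TB = 1024GB). Carrying unrounded values gives numbers a hair above the quoted ones, because the comment rounds to 3.38GB per hour before multiplying.

```python
video_gb = 4.51             # everything stored for the one example video
video_minutes = 80          # 1:20:58 treated as 1h20m, as in the comment
upload_hours_per_min = 700  # upload-rate figure used in the comment

# Scale the 80-minute total down to one hour of video.
gb_per_video_hour = video_gb * 60 / video_minutes

# Storage needed per minute and per second of wall-clock time.
gb_per_minute = gb_per_video_hour * upload_hours_per_min
tb_per_minute = gb_per_minute / 1024
gb_per_second = gb_per_minute / 60

print(f"{gb_per_video_hour:.2f} GB per hour of video")
print(f"{gb_per_minute:.0f} GB ({tb_per_minute:.2f} TB) per minute")
print(f"{gb_per_second:.2f} GB per second")
```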
I'm not sure how to multiply by an increasing gradient with a back-of-the-envelope calculation, so I'll punt and pretend it was 700 hours/min all the way back to 2014, so the past 2 years. Quite inaccurate, but possibly still interesting:
Uhh.... that's... ah. 22PB. Err, 19.76PB to be precise.
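A sketch for re-running this extrapolation, assuming 365-day years, binary units (1PB = 1024 × 1024GB) and the flat 39.42GB/s rate from above. One caveat worth flagging: with these inputs, two years of seconds gives a total in the low exabytes, not tens of petabytes. The 19.76PB figure is what comes out if the per-second rate is multiplied by the number of minutes in a single year, so treat it as possibly low by a factor of about 120.

```python
gb_per_second = 39.42  # rate quoted above

GB_PER_PB = 1024 * 1024

# Two years of continuous uploads at a flat rate.
seconds_in_two_years = 2 * 365 * 24 * 60 * 60
total_pb = gb_per_second * seconds_in_two_years / GB_PER_PB
total_eb = total_pb / 1024

# How the 19.76PB figure can be reproduced: per-second rate
# multiplied by the minutes in ONE year.
minutes_in_one_year = 365 * 24 * 60
quoted_pb = gb_per_second * minutes_in_one_year / GB_PER_PB

print(f"two years at {gb_per_second} GB/s: {total_pb:.0f} PB ({total_eb:.2f} EB)")
print(f"rate x minutes in one year: {quoted_pb:.2f} PB")
```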
This is for the boring 30fps-and-under 1080p videos out there. Not the 60fps, 2K/4K/8K (!), 360° and similar stuff, and there's an increasing pile of that being uploaded.
22PB = total YouTube data needed for the last two years.
1/2 TB per user (like me)
22PB = 44,000 users.
Google need 1000 times that space in their data centers to handle 44 million users.
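The same comparison as a small sketch, taking the 22PB estimate from upthread at face value and using decimal units (1PB = 1000TB), which is what the 44,000 figure implies. `users_supported` is just a hypothetical helper name; swap in a different pool size to see how the multiple changes.

```python
def users_supported(pool_pb: float, per_user_tb: float) -> float:
    """How many users a storage pool covers at a given per-user size."""
    tb_per_pb = 1000  # decimal units, matching the 44,000 figure
    return pool_pb * tb_per_pb / per_user_tb

youtube_two_years_pb = 22  # estimate from upthread, not a measured number
per_user_tb = 0.5          # "1/2 TB per user"

users = users_supported(youtube_two_years_pb, per_user_tb)
target_users = 44_000_000
multiple = target_users / users

print(f"{users:,.0f} users per {youtube_two_years_pb}PB; "
      f"{multiple:,.0f}x that for {target_users:,} users")
```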
Also, I might think that 1/2 TB of data is very valuable, but only a small part of it is interesting to a few of my friends and family members. It is probably very hard to monetize. Even I only browse it maybe a few times every few years.
If I were a PM for such a product and tried to propose that Alphabet build 1000 new YouTube-sized data centers to handle only 44 million users, I would have a hard time justifying it.
FWIW, I'm not familiar with how and where the Internet Archive gets their funding, but in 2014 they had 50PB of storage (https://archive.org/web/petabox.php). So IA can manage 50PB as a small-to-medium private company. (Incidentally they've been running since '99.)
Both IA and BackBlaze are private/nontraded, which means they have lower operating capital. Diskspace is simply not that expensive now.
There's a guy on a DC++ filesharing server (find a server list - it's one of the biggest ones) who has been sharing 400TB of data for some time. Speaking of DC++, most newer clients show the total shared data for all users connected to the server you're on, and that number on some of those larger servers is usually 1-2PB.
I also saw a guy on reddit a while back who was in exactly the right place at the right time when his workplace was upgrading, and he now has a nice $200/mo electricity bill in the form of, you guessed it, 400TB of diskspace. I'm not sure if he got it all for free, but I think he may have.
So it's not a money problem; it's a space problem and a power problem. This is why flash storage is so interesting, it generates less heat and can be packed somewhat more densely, and it uses less power too. Once Flash-vs-platter hits the 49%/51% in terms of relative cost things are going to get interesting.
At the moment the major retailers are just doing simple things like firmware customizations to run their disks at lower speeds (for nearline storage) or start up with the disk off and stuff like that. Facebook's cold storage datacenters also use Reed-Solomon encoding instead of RAID/ZFS to get redundancy with less space overhead.
I do think Google have actually done the kinds of allocations you speak of, using thin provisioning; after all, literally every new Google account gets 15GB of diskspace! And then there's sync profile data, whatever internal metadata is associated with the account (such as your search history), etc, that needs to be stored too.
I fully believe Google have multiple exabyte-scale datacenters. If they don't I'll be genuinely surprised.
Using thin provisioning (which is ultimately just "how much are they really using, and how can we encourage them not to use more than X") is how they manage it.
So you're right - actually provisioning enough free storage for these users would definitely be an unpleasant task. But they carefully balance what everyone uses with what they have available.
This kind of high quality, high effort comment is why I love this site so much. Thanks for crunching the numbers and making me drop my jaw at the amount of data.