This reminds me of the early binary DOC file format that was essentially a dump of Word's working memory. You could find all sorts of leaked data in the slack space between text chunks. Sometimes from other processes since `malloc` doesn't zero memory. I think there were a few instances of people being caught doing bad things because of identifying information in a Word DOC.
But why is this even happening? The standard procedure for overwriting a file is to save to a temporary file first, so that an error during the write won't damage the existing file. Are they really doing an in-place write? And how does that affect the shadow copy if you want to restore a previous version of the file?
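For reference, the safe pattern really is only a few lines. Here's a minimal sketch in Python (the function name and the fsync-before-rename details are my own choices, not anything the Snipping Tool actually does):

    import os
    import tempfile

    def atomic_save(path, data):
        # Write to a temp file in the same directory so the final rename
        # stays on one filesystem and therefore stays atomic.
        dir_name = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=dir_name)
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())  # make sure the bytes are on disk
            os.replace(tmp_path, path)  # atomically replaces any existing file
        except BaseException:
            os.unlink(tmp_path)
            raise

The rename either happens completely or not at all, so an error mid-write leaves the original file untouched; the trade-offs are that it breaks hard links and resets some file metadata.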
The old .doc format was never "a dump of Word's working memory" in the sense of a raw copy of bytes. It's rather Word's internal object graph serialized into COM Structured Storage (https://en.wikipedia.org/wiki/COM_Structured_Storage), which is basically a FAT-like filesystem inside a single file. This is convenient for the app because it gets an FS-like API and can serialize data into as many "files" as is convenient, and dynamically update data inside without overwriting the entire container file every time (which, back when this was all designed in the late '80s and early '90s, would have been slow).
Thus the reason you could end up with old bits of a Word document sticking around inside the .doc is the same as why your FS has bits of deleted files: the space has been marked as free, and nothing else has overwritten it yet.
But none of this applies to images, so the explanation here ought to be different.
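If you want to see the "filesystem inside a file" for yourself, the third-party olefile package can list the streams inside an old binary .doc (a small sketch; the file name is a placeholder and olefile is assumed to be installed):

    import olefile

    # "example.doc" stands in for any old binary Word document.
    ole = olefile.OleFileIO("example.doc")
    for stream in ole.listdir():
        print("/".join(stream))  # typically streams like WordDocument, 1Table, ...
    ole.close()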
> The old .doc format was never "a dump of Word's working memory" in the sense of a raw copy of bytes. It's rather Word's internal object graph serialized into COM Structured Storage
Probably the "dump of Word's working memory" part emanates from Word for DOS, which predates COM by the order of a decade.
MacWord 3.0+ (and then WinWord 1.1+) had fast-save which leaned on the in-memory piece-table data structure to write to disk only the changes to the Word document.
This is also why "Save As..." with old Word versions would often produce files much smaller than "Save" would - it was writing a brand new, compact file which was effectively "garbage collected".
It was hard on the developer, but some of the features it enabled were very impressive, like the ability to arbitrarily embed documents into other documents in a way that allows composite rendering as a single piece, without the app managing either part being aware of the nature of the other. In fact, I'm not aware of any modern equivalent of this tech, not even on Windows (since Office stopped supporting embedding of its documents via OLE by third-party apps).
It feels like a lot of "things everyone knows" are slowly getting lost over time, as developers work at higher and higher levels of abstraction without deep knowledge of the layers beneath them (which is of course the whole point of abstractions, but they're never perfect).
Being a true "full stack" engineer is a superpower when it comes to performance optimisation, or vulnerability research.
Eh, that’s not really true. Adding abstraction allows for providing APIs that can handle cases like these correctly. For example, Apple provides a very capable versioning system for files that does “the right thing”, which in this case would create a new file for reliability.
Sure, abstractions aren't inherently evil, but bad ones are. The abstraction you described sounds like a sensible one, which couldn't have been designed without a deep understanding of the system as a whole (or at the very least, the adjacent layers).
All abstractions leak. There are some physical facts about software we keep denying for some reason. There is no silver bullet. Every enterprise system turns into a big ball of mud over time. Team structures get imprinted in the design of systems built by these teams.
And every abstraction leaks. Living on a given level without at least an accurate mental model of everything below it limits your ability as a developer. Sure, you can just do scripting for a web dev team your whole career. If that's what you want...
Water dissolves pretty much everything too, and yet we build structures that are useful even when it rains. Likewise, all software engineers get through their careers without learning most things. It's totally fine, as long as they understand how to poke at a leaky abstraction when necessary. If atomic file renames are a performance problem for them, or they're breaking hardlinks, or one of the several other ways that this would leak on them comes up, then they can go learn what's going on and update their understanding as necessary. Good abstractions aren't watertight, but rather don't leak in unexpected and dangerous ways.
But I don't want to waste disk space by storing the same data twice. And doing that with copies only works on a handful of filesystems, most notably BTRFS and XFS.
Symbolic links don't give me an easily obtainable list of all "copies" of that file, and while they might survive atomic writes, they're also vulnerable to the "main" file being renamed/moved/etc.
(Of course the thing that'd solve my actual root problem would be proper OS and file system level support for tagging, but until then it seems that there are only imperfect solutions, each with its own set of drawbacks.
I.e. third-party software is not well integrated with the file explorer and the file open/save dialogues etc., and now I'm dependent on that software lest I lose all my carefully tagged data, whereas hard- or symbolic-link-based solutions are clunky to use and vulnerable to atomic saves, file renaming, etc.)
I guess it depends on how long users are expected to work on the documents/files, and whether there is some other safety mechanism. Basically: a power outage, full disk, etc. corrupting the file at save costs X hours of lost work, and for some X you'd better make sure the app can revert to a known state via a transactional save and/or an autosave feature.
Personally I'd probably always save to a new file even if the amount of work potentially lost is negligible (as in a snipping tool). The cost of doing so is extremely small, so if it ever saves even one customer's file, it has probably paid for the time it took to implement transactional file writing. It's a few extra lines, if you opt out of the more complex scenarios (e.g. you can't rely on atomic renames if the target is a network share).
Windows 10 1607 and newer support it now if you call SetFileInformationByHandle() with FileRenameInfoEx and specify the FILE_RENAME_FLAG_POSIX_SEMANTICS flag. Not sure how commonly it's used in standard libraries, but I've seen it popping up more and more.
So, Acropalypse is “people use image snipping and markup tools to redact images, but those tools often allow the unredacted images to be recovered”? Yeah, that’s a pretty big violation of the implicit expectations of users.
To be more specific "acropalypse" refers to any image editor which does not truncate the original file before overwriting it with the edited version, leading to portions of the original being recoverable by an adversary.
It originally referred to a specific vulnerability in the Markup app found on Google Pixel devices (CVE-2023-21036), but apparently now includes other unrelated apps.
Reminds me of the old Underhanded C competition where people deliberately tried to write undetectable but faulty code for image redaction: http://underhanded-c.org/_page_id_17.html
Oh, that's actually way worse than I thought! It's not that these apps have some kind of undo feature or are aiming for non-destructive editing and that's simply not what users expect - the apps actually are attempting to perform redaction and are poorly implemented.
Kudos on the name by the way, I love a good tight pun name for a vulnerability.
I can't speak for this Windows tool, but for the Android image editor the editor _was_ doing the correct thing. Then the underlying IO library changed in a breaking way (and diverged from other APIs using the same mode strings) to not truncate when opening existing files.
The API was (accidentally?) changed long before the Android tool was written, if I understand correctly. Depending on your point of view, that means it's either a poorly designed and documented API, an incorrect API that doesn't match the documentation, or a testing failure during development of the app.
It would be interesting to see if any popular ZIP libraries have a similar issue. If you add a file that's already in the archive, for example, and the new one is larger, I could see it just sticking the new one at the end of the archive and appending a new central directory structure.
Is the periodic re-encoding of the Huffman tree part of DEFLATE, or is the PNG just using a sequence of separate DEFLATE structures appended together that each have their own Huffman tree? If it's the former, you might be able to recover part of a ZIP archive entry that was reduced in size and written over the original version.
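To poke at the ZIP speculation above: a quick experiment with Python's zipfile module (a rough sketch; other ZIP libraries may well behave differently) suggests that "replacing" an entry in append mode does leave the old entry's data sitting in the archive:

    import os
    import zipfile

    with zipfile.ZipFile("test.zip", "w") as z:
        z.writestr("a.txt", b"original contents " * 100)
    before = os.path.getsize("test.zip")

    # "Replace" the entry by appending a new copy with the same name.
    with zipfile.ZipFile("test.zip", "a") as z:
        z.writestr("a.txt", b"new")  # zipfile only warns about the duplicate name

    print(before, "->", os.path.getsize("test.zip"))  # the file only grows
    print(zipfile.ZipFile("test.zip").namelist())     # a.txt is listed twice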
Weird. If I take a screenshot with Snip & Sketch on Win10 and save it (full.png) - then in S&S crop a small portion of it and save it (crop.png) there is a significant file size difference between full.png and crop.png.
Or re-compressed all JPEGs. Especially Print to PDF probably has no hope of ever being particularly efficient, as it's re-encoding things that have already been processed for printing.
This is beyond scary. I have no idea how many screenshots I've taken, cropped with Snipping Tool, and sent to someone: at work, at home, across multiple machines and accounts.
Isn't that the default on Android phones if you screenshot then crop? As opposed to screenshot, then open it and crop.
I tend to use the rectangular option in the snipping tool as a way to be certain that I won't forget to crop important info. Both of these make me think I need to check my process and see if it is relevant.
It seems like the problem is that if you use the snipping tool to save to a file that already exists, it only modifies as much of the file as is required for the new image you saved to be visible - the rest of the file's data is preserved. The problem is saving to a file that holds more information than the cropped image requires - the original image's "overflow" of information isn't removed. So if you're saving to a new file it's fine; there's no extra information.
My stubborn refusal to relearn and move away from "printscreen+winkey+r+mspaint+enter+select+ctrl+c" pays off. I don't know what it is about snipping tools (and I use ShareX too), but I always find myself going back to Paint, as the tools feel like they get in the way.
I've always used "prtscn+paint" as well. I've tried various on other ways and I'll usually be like "that's kinda cool, maybe I'll try that." Next time I need a screenshot it'll already be pasted in paint far before it occurs to me "i think i forgot something, guess it wasn't important."
Apparently Discord and probably other platforms don't remove the excess data at the end of the file. Maybe they will start doing that now. But for all the existing cropped screenshots out there, it definitely means that there is a vulnerability.
Nope, or at least, not through any obvious paths I've ever seen, including testing just now.
Also, both discord and imgur trimmed an image I uploaded to just the image data. Slack did not, however. I certainly wouldn't want to rely on that behavior, though.
Every techie person I know who deals with Windows has not dared to move to Windows 11 so far, so I'm not too surprised. I looked it up and the data matches my anecdote: only 18% of Windows users are on 11, almost 70% on Windows 10.
(As a non-web-dev, maybe) I don't think I've cared about the file size of cropped images in the past couple of years. When I crop stuff it's because I want to send it to someone, not because I'm gonna archive it or anything.
For Android it was a backwards-incompatible, undocumented API change [0]. For Windows it just looks like the UWP API is complicated enough that you can footgun yourself. For example, OpenAsync+DataWriter would do it as the default behavior [1].
Snipping Tool used to have this "feature" whereby any drawing performed on a screenshot would be animated as an overlay (this was only visible in the UWP Photos app at the time). Funnily enough, that's the first thing that came to mind when I heard of Acropalypse.
You need to take a screenshot, save it, crop it and then resave over the same file.
I mean it's bad, but... who does that?
EDIT: It also happens if you take a completely different screenshot and just save over (overwrite) another (bigger) image. That's worse, but still not that common IMHO.
I don't think the comment was about expected behavior. It's about who would be susceptible to this. Saving text files is fine all the time, you're right. They're saying, how often do people perform the steps in that order?
It is extremely common for me to realize I don't like the screenshot I took, and to save the replacement over the original file. It is pretty common for me to be more selective about what is in the replacement, and hence for it to be smaller than the original triggering this bug.
It's a relatively uncommon scenario, granted, but when you multiply those odds by the number of screenshots that get shared on a daily basis, it's pretty bad imho.
This would have been avoided if the PNG format (or at least one commonly used editor) required that the data filled the whole file, or rendered extra junk when there was extra data at the end.
> The PNG decoder has to determine whether a syntax error is fatal (unrecoverable) or not, depending on its requirements and the situation. For example, most decoders can ignore an invalid IEND
It doesn't explicitly mention what decoders should do on encountering data after IEND, but the general philosophy for decoders seems to be that errors should be handled gracefully where possible, even if the file is technically malformed (which is maybe something that could be clarified or expanded upon?)
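Checking a file for this is straightforward, since anything after the IEND chunk is suspect. A rough sketch of a detector I threw together (assuming the standard chunk layout: 8-byte signature, then 4-byte length, 4-byte type, data, 4-byte CRC per chunk):

    import struct
    import sys

    def trailing_bytes(path):
        # Returns the number of bytes after the IEND chunk, or 0 if there
        # are none (or the file is malformed/truncated).
        with open(path, "rb") as f:
            data = f.read()
        pos = 8  # skip the PNG signature
        while pos + 8 <= len(data):
            length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
            pos += 8 + length + 4  # chunk header + data + CRC
            if ctype == b"IEND":
                return len(data) - pos
        return 0

    print(trailing_bytes(sys.argv[1]), "bytes after IEND")

A cropped-and-overwritten screenshot from an affected tool should report a large number here, while a clean save reports 0.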
Back in the olden days, when people were using protocols like Kermit and XModem to download their questionable images from BBSs, you would often get a file whose size was rounded up to the nearest block size. In that situation, failing on extra data at the end would have been a fatal move for someone implementing an image decoder, and I think PNG might be just barely old enough that the designers remembered that.
From a usability perspective, it's preferable to recover from and work around any data stream errors rather than crash with an error.
Why? Because users have stuff to do. They don't know about and don't care for errors.
The most prominent example is web browsers. Browsers were supposed to refuse to render invalid markup - this was even mandated when XHTML was trying to replace HTML. Users fucking hated it, and XHTML crashed and burned, while HTML with its error-tolerant handling has stayed to this day.
Thanks for explaining, from a user perspective the best thing to do is definitely graceful recovery, but bugs like this highlight that there's value to the ecosystem in failing loudly whenever anything is out of spec. The appropriate balance of those behaviors is obviously a function of the nature of the ecosystem, and could be different for different tools.
I'd suggest that for PNG in particular, one reasonable behavior would be to show a little red "File corrupted!" warning or popup for this sort of out-of-spec-but-probably-recoverable issue, which would probably have been noticed and filed as a bug for the developers of these tools, even if only a small fraction of viewers had that behavior. Something like a thumbnail viewer should maybe just opt to early-abort whenever anything sketchy is going on, especially since out-of-spec behaviors can lead to security issues.
Is Mac's Preview app susceptible to this bug too? Right now it seems only confirmed for Pixel phones and Windows Snipping Tool, but I use Preview and I haven't seen anyone confirm or deny it for that one.
You should notice that the file size didn't actually get any smaller after you cropped it. The full-size image file is not truncated before the cropped version is written. The left-behind data can be recovered.
The root cause seems to be an open mode which opens the original file for writing without truncation, but writing to the original file directly in the first place seems precarious. The software I use tends to write to a temporary file first and then do a rename to replace the original file.
The bit about recovering the LZ77 stream without the prefix sounds very useful as a data recovery tool.
Got sent a few .PSDs (Photoshop format) and there were some artifacts that looked like they weren't intended for the recipient. Of course, this is part of the .PSD format and not a privacy issue per se. Just be careful sending .PSDs in an e-mail with stuff in them you don't want the recipient to see.
Photoshop has a checkbox to "delete cropped pixels"; if it's unchecked, using the crop tool again will show the entire original image. It also stores an undo history, so even with "delete cropped pixels" you can just undo the crop.
Like he says on Twitter - was this deliberate? Doesn't seem like it yet, and there's no proof of it, but it's pretty scary. There's excellent plausible deniability, and the law enforcement benefits could be substantial.
I was just about to ask, "Is this still coincidence, or have we jumped straight to enemy action?"
That two different tools fail in the same, "surface level invisible" but mighty convenient, and "Whoopsie, file operations are hard!" way... raises some interesting questions.
Mostly, "What else of this nature is present in all our modern operating systems?" Because this sort of failure sure looks an awful lot like the "Underhanded C" contest from 2008: http://www.underhanded-c.org/_page_id_17.html Redact an image, while leaking a lot of information in the process to someone who "knows the trick."
> This is an excellent example of the contest’s philosophy: make the code extremely simple, innocent, obvious, and wrong.
I just took a look at spectacle (KDE's screenshot tool, version 22.12.3) and it's not affected. It properly truncates the file when overwriting a file that already exists (after cropping or any other edit I tested). The inode number stays the same, so it's not doing the create-temp-file + rename method.
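For anyone who wants to run the same check against another editor, comparing the inode and size before and after re-saving is enough. A small sketch (the path is a placeholder; st_ino is the inode number on Linux):

    import os

    # Save a screenshot to "shot.png", then crop it in the editor under
    # test and overwrite the same file when prompted below.
    before = os.stat("shot.png")
    input("Crop the image, save over the same file, then press Enter... ")
    after = os.stat("shot.png")

    print("same inode:", before.st_ino == after.st_ino)  # True => in-place write
    print("size:", before.st_size, "->", after.st_size)  # should shrink after a crop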
Android was playing games with the file open mode (open(file, "w") didn't overwrite and truncate), and that kind of change would be unforgivable on linux.
Also on linux you'd quickly notice something was wrong with the file sizes.
The wordage "acropalypse" and this image chosen makes it seem like doing this is going to somehow break your computer. It is a bad vulnerability, but not that bad.
Not sure if this person is being deliberately vague/obtuse, but it's unclear if this affects you if you save each screenshot as a new file, or only if you do the strange "overwrite existing file" thing the OP did.
Overwriting an existing file is strange now? This is just victim-blaming for the incompetence of software engineers. As a user there's nothing wrong with overwriting an existing file.
Saving a cropped image as an original with an edit list strikes me as a completely obvious and normal thing to do. It's an affordance that allows the user to go back and un-crop things. If you undertook a field survey you would find users who were glad for this feature and I imagine zero users who had been harmed by it.
Which you should take as a "maybe I have misunderstood the issue".
This is not an edit list, or the ability to undo operations. The problem is you have an existing file ("foo.png" or whatever):
    oooooooooo
then you make a new file with this data in memory:
    nnnnn
This could be something like you opened the original foo.png file and made a change that results in the file shrinking, or it could be some completely unrelated data. Now you save this as foo.png and the editing program assumes opening a file for writing will erase the old data and writes out the new data. The end result is you have this file:
    nnnnnooooo
i.e. the tail of the old file is still present in the file data. It turns out it's possible to recover the pixel data from those tail bytes, in spite of losing the compression state.
Now, the assumption that writing over an existing file will truncate/erase the original file is reasonable. That's the default behavior of most file IO APIs. I don't know what the Windows app is doing, but the original Android bug was in an API that takes a mode string copied from POSIX's fopen, in which "w" means "open for writing, and erase any existing file" - and then a later undocumented change made opening a file with "w" no longer erase existing files.
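The failure mode is easy to reproduce at the API level. A contrived Python sketch of the same o/n picture from above (the real apps go through platform file APIs rather than this exact call, so this is only an illustration):

    # Simulate a naive "save over the existing file" without truncation.
    with open("foo.bin", "wb") as f:      # original, larger file
        f.write(b"o" * 10)

    with open("foo.bin", "r+b") as f:     # opened for update, NOT truncated
        f.write(b"n" * 5)                 # shorter replacement data

    print(open("foo.bin", "rb").read())   # b'nnnnnooooo' - the old tail survives

Opening with "wb" (or calling truncate() after writing) would leave just b'nnnnn', which is presumably what these tools intended.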
That isn't what is happening in either the original bug for the Google Pixel Markup tool or this bug in the Windows Snipping Tool. What is happening is that the software is overwriting the original image with the new one, but not truncating the file, so if the new compressed image data is smaller than the old, the remainder of the old image still exists at the end of the file.
This data is not made available to the user to uncrop the image at a later time, so it provides no benefit to the user, just a risk of privacy violations.