That complaint about books and stealing strikes me personally as deeply silly, even by permission-culture standards. The whole point of books is to learn from them. Proper summarization already separates plagiarism from original content (even if it is preferable to provide citations). It doesn't matter how it is derived: either the end product is fair use, or it is effectively unauthorized publishing because it includes too much source content.
We should be rejoicing at the ability to have an assistant that digests the world's libraries, not worrying that someone might make a profit off of it without permission.
But we won't have an assistant that digested the world's libraries. We'll have an advertising company gatekeeping the digitally digested world's libraries.
I think that's worth worrying about. As well, if Google, in their drive to monetize content that they don't own, causes the various publishers and IP owners to go on the legal attack, any other option/startup will be quickly dissuaded from building a similar, or better, assistant.
The HathiTrust is a partnership of the academic libraries involved in Google Books and other digitization efforts. They offer the Google-originated scans free to the public for works that are already in the public domain, and allow university members and research partners to access scans of books that are still under copyright.
Deliberately creating uncertainty around the copyright in ML-created works (through legislation) would be a low-key and indirect way of impeding the automation of creative work. Not that I'm advocating it.