> Can I download a book without paying for it, and print copies of it?
No, but you can read a book, learn its contents, and then write and publish your own book to teach the information to others. The operation of an AI is rather closer to that than it is to copyright violation.
"Should" there be protections against AI training? Maybe! But copyright law as it stands is woefully inadequate to the task, and IMHO a lot of people aren't really grappling with this. We need a functioning government to write well-considered laws for the benefit of all here. We'll see what we get.
Yes, but the learning isn't constrained by those laws. If I steal a book and read it, I'm guilty of the crime of theft. You can try me before a jury, fine me, and put me in prison according to whatever laws I broke.
Nothing in my sentence constrains my ability to teach someone else the stuff I learned, though! In fact, the first amendment makes it pretty damn clear that nothing can constrain that freedom.
Also, note that the example is malformed: in almost all these cases, Meta et al. aren't "stealing" anything anyway. They're downloading and reading stuff on the internet that is available for free. If you or I can't be prosecuted for reading a preprint from arXiv.org or whatever, it's very hard to make the case that an AI can.
Again, copyright isn't the tool here. We need better laws.
Sure, but OpenAI (same as Google, and Facebook, and all the others) is illegally copying the book, and they want this to be legal for them.
It's perhaps arguable whether it's OK for an LLM to be trained on freely available but licensed works, such as the Linux source code. There you can get into arguments about learning vs. machine processing, whether the LLM is a derived work, etc.
But it's not arguable that copying a book that you have not even bought to store in your corporate data lake to later use for training is a blatant violation of basic copyright. It's exactly like borrowing a book from a library, photocopying it, and then putting it in your employee-only corporate library.
Downloading a pirated copy and reading it yourself is one thing; running a business based on downloading millions of pirated works is another.
Yes, but this is not the right model. What OpenAI wants is to borrow a book, make a copy of it, and keep using that copy to train their models. This is fully and simply illegal, under any basic copyright law.