@BiNonBi

BiNonBi@lemmy.blahaj.zone · 1 year ago

You are kind of hitting on one of the issues I see. The model and the works created by the model may be considered two separate things. The model itself may not be infringing in and of itself. It’s not actually substantially similar to any of the individual training data. I don’t think anyone can point to part of it and say this is a copy of a given work. But the model may be able to create works that are infringing.

BiNonBi@lemmy.blahaj.zone · 1 year ago

That is not actually one of the criteria for fair use in the US right now. Maybe that’ll change, but it’ll take a court case or legislation to do it.

BiNonBi@lemmy.blahaj.zone · 1 year ago

NPR reported that a “top concern” is that ChatGPT could use The Times’ content to become a “competitor” by “creating text that answers questions based on the original reporting and writing of the paper’s staff.”

That’s something that can currently be done by a human and is generally considered fair use. All a language model really does is drive the cost of doing that from tens or hundreds of dollars down to pennies.

To defend its AI training models, OpenAI would likely have to claim “fair use” of all the web content the company sucked up to train tools like ChatGPT. In the potential New York Times case, that would mean proving that copying the Times’ content to craft ChatGPT responses would not compete with the Times.

A fair use defense does not have to include noncompetition. That’s just one factor in a fair use defense, and the other factors may be enough on their own.

I think it’ll come down to how “the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes” and “the amount and substantiality of the portion used in relation to the copyrighted work as a whole” are interpreted by the courts. Do we judge a language model by the model itself or by its output? Can a model itself be noninfringing and still be able to produce infringing content?

BiNonBi@lemmy.blahaj.zone · 1 year ago

I think you mean RTFM. But in this case it’s WTFM so I can RTFM.