Slightly OT.
I would point out that the jury is still out (literally :-)) on this from a legal pov.
The thing is that a copy is a copy is a copy: if you produce a verbatim copy of something that you “saw” once, you are likely going to run into trouble. How that maps to the specific combination of producers and users of LLMs is going to be interesting, but is currently undecided. There are many dimensions to this; just consider the US fair use doctrine, which doesn’t exist elsewhere in such an expansive form.
In any case, as long as the contributors to a GitHub-hosted project are aware of the situation, IMHO it is their call whether they want to use it or not. OSS isn’t just about licences.