Law in the Internet Society

JaredHopperSecondEssay 4 - 21 Jan 2024 - Main.EbenMoglen
Line: 1 to 1
 
META TOPICPARENT name="SecondEssay"
Line: 12 to 12
 

Wikipedia as a Training Ground

Changed:
<
<
Wikipedia is licensed under Creative Commons and is thus available for commercial usage, meaning that users are relatively free to share and edit the text so long as it is attributed correctly. But how does this work with AI training? It seems that AI could freely use the information to educate itself, but in folding that knowledge into the commercial product that AI will increasingly become, it is not clear how companies can properly attribute Wikipedia’s contributions when the information is integrated into the Digital God’s mind without clear delineation? If it wanted to, Wikipedia could absolutely change their license and prohibit AI usage, but this seems counter to their open-source mission. Some people worry, furthermore, that such free training would “allow[] companies like OpenAI to exploit the open web to create closed commercial datasets for their models.” This concern is less about copyright than it is the integrity of such knowledge, driven by the fear that AI interpretation upon AI interpretation will warp the information displayed on sites like Wikipedia.
>
>
Wikipedia is licensed under Creative Commons and is thus available for commercial usage, meaning that users are relatively free to share and edit the text so long as it is attributed correctly.

No, you've got the license terms quite wrong. It's CC-BY-SA, not CC-BY, and that makes a world of difference. Let's correct for that and see where we are.

But how does this work with AI training? It seems that AI could freely use the information to educate itself, but once that knowledge is folded into the commercial product that AI will increasingly become, it is not clear how companies can properly attribute Wikipedia’s contributions when the information is integrated into the Digital God’s mind without clear delineation. If it wanted to, Wikipedia could absolutely change its license and prohibit AI usage, but this seems counter to its open-source mission. Some people worry, furthermore, that such free training would “allow[] companies like OpenAI to exploit the open web to create closed commercial datasets for their models.” This concern is less about copyright than it is about the integrity of such knowledge, driven by the fear that AI interpretation upon AI interpretation will warp the information displayed on sites like Wikipedia.

 In general, if we want to train AI to be a responsible source of knowledge, allowing free access to open-source sites like Wikipedia seems like a smart move. Monetizing Wikipedia isn’t the goal, but safeguards, such as explicit restrictions and software that monitors and curbs AI behavior, may be necessary for companies that work on a for-profit model.

Revision r4 - 21 Jan 2024 - 21:08:35 - EbenMoglen
Revision r3 - 17 Dec 2023 - 19:10:06 - JaredHopper
All material on this collaboration platform is the property of the contributing authors.
All material marked as authored by Eben Moglen is available under the license terms CC-BY-SA version 4.