[–] wildc8rd@vlemmy.net 3 points 1 year ago

Machine learning is heavily dependent on clean data. Reddit and other sites with strong moderation provide a process for generating clean data that is very valuable for training large language models. By hollowing out its moderation teams and degrading its content, Reddit is probably reducing the overall value of the very training data it's trying to sell by locking down its APIs.


As Reddit chases a legacy social media model, is it killing the golden goose it intends to sell? Large language models and machine learning systems effectively lock in the processes and feedback loops that produced their training data. What will that mean for how we build these models in the future, and should we do more to consider where they were trained before applying them to our lives?