2024-08-31 00:55
A good read on "model collapse", where an ML model's performance degrades over successive generations as it is repeatedly trained on synthetic (model-generated) data.
Telling synthetic data apart from real data will only get harder, so this will likely become a bigger problem over time...
Research paper: https://www.nature.com/articles/s41586-024-07566-y
Article: https://www.nytimes.com/interactive/2024/08/26/upshot/ai-synthetic-data.html?unlocked_article_code=1
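
To get a feel for the effect, here is a toy sketch of my own, not the setup used in the paper or the article. It assumes the "model" is just a Gaussian fitted to a small sample, and that each generation is trained only on samples drawn from the previous generation's fit. The function name `simulate_collapse` and its parameters are made up for illustration.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Repeatedly refit a Gaussian to samples drawn from the previous fit.

    Returns the fitted standard deviation at each generation.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0 stands in for the "real" data
    history = [std]
    for _ in range(generations):
        # "Synthetic data": samples generated by the current model.
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        # "Training": refit the model to the synthetic samples only.
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)
        history.append(std)
    return history


history = simulate_collapse()
print(f"std at generation 0: {history[0]:.3f}")
print(f"std at generation {len(history) - 1}: {history[-1]:.3e}")
```

With a small sample per generation, the fitted spread tends to shrink toward zero: each refit loses a bit of the tails, and the losses compound because no fresh real data ever comes back in. Real models are far more complex, so treat this only as intuition for why the degradation accumulates.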