2023-11-29 18:27
Was reading https://arxiv.org/abs/2311.17035 and wondered: Is the degree of extractable data an indication of incomplete training?
If the model is outright memorizing large chunks of training data, that seems to indicate that some parameters aren't being used efficiently. That's model capacity spent on verbatim storage rather than on rules/generalization.
Which in turn suggests some sort of modification to the training loss function. Maybe the entropy over the final softmax shouldn't be allowed to get _too_ low?
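Rough sketch of what I mean, as a PyTorch-style loss (just my own illustration, not anything from the paper; `min_entropy` and `penalty_weight` are knobs I made up):

```python
import torch
import torch.nn.functional as F

def entropy_floor_loss(logits, targets, min_entropy=0.5, penalty_weight=0.1):
    """Cross-entropy plus a penalty when the output distribution gets too peaked.

    logits:  (batch, vocab) raw model outputs
    targets: (batch,) target token indices
    min_entropy / penalty_weight are arbitrary placeholders, not tuned values.
    """
    # Standard next-token cross-entropy.
    ce = F.cross_entropy(logits, targets)

    # Entropy of the softmax over the vocabulary, averaged over the batch.
    log_probs = F.log_softmax(logits, dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()

    # Only penalize when entropy drops below the floor; leave it alone otherwise.
    entropy_penalty = F.relu(min_entropy - entropy)

    return ce + penalty_weight * entropy_penalty
```

No idea whether this helps or just hurts perplexity; it's the simplest way I can think of to express "don't let the softmax get too confident".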