2023-11-29 18:27
Was reading https://arxiv.org/abs/2311.17035 and wondered: is the degree of extractable data an indication of incomplete training? If the model is outright memorizing large chunks of training data, that seems to indicate that some parameters aren't being used efficiently. That's model capacity that didn't go toward rules/generalization. Which in turn suggests some modification to the training loss function might help. Maybe a term ensuring the entropy over the final softmax isn't _too_ low?
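A minimal sketch of what that loss modification could look like, assuming a hinge-style entropy floor on the output distribution (the function name, the hinge form, and the `min_entropy`/`penalty` parameters are all illustrative assumptions, not from the post or the paper):

```python
import numpy as np

def entropy_regularized_loss(logits, target, min_entropy=0.5, penalty=0.1):
    """Cross-entropy plus a hinge penalty that fires only when the softmax
    entropy drops below min_entropy, discouraging the near-one-hot outputs
    associated with rote memorization. Illustrative sketch, not the paper's method."""
    z = logits - logits.max()            # shift for numerical stability
    p = np.exp(z) / np.exp(z).sum()      # softmax
    ce = -np.log(p[target])              # standard cross-entropy
    entropy = -(p * np.log(p + 1e-12)).sum()
    # penalize only distributions that are *too* confident
    return ce + penalty * max(0.0, min_entropy - entropy)
```

For a uniform distribution the entropy is well above the floor, so the loss reduces to plain cross-entropy; for a sharply peaked one the hinge term kicks in and pushes back against overconfidence.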
Author: Michael O (ywrtrwy)