2025-01-04 04:27
I have a creeping intuition that the residual connections, flowing through the network with no degradation/impediment, are somehow holding back modern large transformer architectures. ResNet was a breakthrough, but I wonder if there's another way that encourages better internal representations and specializations.
3 replies · 3 reposts


Replies
Ryan
prater_ry
Can you explain further?
