New top story on Hacker News: No More Adam: Learning Rate Scaling at Initialization Is All You Need

14 by jinqueeny | 0 comments on Hacker News.

