pl.aiwright - GPT-4 dialogue for Disco Elysium: The Final Cut

Research into efficient optimization techniques seems pretty important given the scale of LLMs these days. Nice to see a second-order approach that achieves reasonable wall-clock improvements.

nsa@kbin.social · 3 years ago

Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training

nsa@kbin.social · 3 years ago

If there isn’t any discussion on reddit (as in this case), I don’t see a reason to link to reddit; you can just link to the project page. That said, if you think there is important discussion there that helps in understanding the paper, use a teddit link instead, like:

https://teddit.net/r/MachineLearning/comments/14pq5mq/r_hardwiring_vit_patch_selectivity_into_cnns/

nsa@kbin.social · 3 years ago

Please don’t post links to reddit.

nsa@kbin.social · 3 years ago

Inverse Scaling: When Bigger Isn't Better

nsa