Parallel Experiments
Stay informed. Stay authentic.

Welcome to the public part of my brain. Here I share curations and thoughts.

Created with ❤️ by @linghao.
https://gregorygundersen.com/blog/2025/10/01/large-language-models/

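I have a feeling this one will become required reading for LLM researchers: the author organizes decades of language-model research into a clear timeline, tracing how we arrived, step by step, at today's transformer-based LLMs. The article reasons very much from first principles and uses consistent notation throughout to tie together the key points of many different papers.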

I really like this passage near the end:

> If you feel that it’s a bit perverse that next-word prediction is a sufficient objective to solve elite math problems, if this feels like a stochastic parrot outsmarting you, then you might feel some of the discomfort early linguists felt at statistical language modeling. This is the visceral feeling of the bitter lesson. Our specialized knowledge feels expendable and our intuitions about understanding seem irrelevant in the face of raw computation and speed.
https://www.imdb.com/title/tt32376165/

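I watched A House of Dynamite, the new film from The Hurt Locker director Kathryn Bigelow, as soon as it came out. I don't want to spoil anything, so I won't say much here, but here is a brief take: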

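This may be the most dramatically charged on-screen portrayal to date of modern US nuclear deterrence and nuclear retaliation plans, and the one that devotes the most screen time to them. Before this, that title probably went to the Madam Secretary episode S04E22, Night Watch.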

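This film works more as an exhibition and a statement of viewpoint, so as a story it may fall short of The Sum of All Fears, my personal number one nuclear-war film. That one is more than 20 years old now, though, and visually it feels somewhat dated.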

Overall, it is well worth watching. I have deliberately left out what I consider the film's biggest highlights.