Published: June 11, 2025

So now Songlin is mad. It began when I saw an obviously wrong MQAR result for RWKV-7 posted by Songlin (see https://x.com/BlinkDL_AI/statu...). I told Songlin to use RWKV-LM, and got a very fierce reply in the official FLA group. Songlin pinned the personal attack for several days. 🙃

Image in tweet by BlinkDL

Contrary to @SonglinYang4's claim, the RWKV community (including zhiyuan and others) has been working hard to align FLA with RWKV-LM. Moreover, there is a reference RWKV-7 implementation at https://github.com/BlinkDL/RWK... with only a few hundred lines of essential code.

Image in tweet by BlinkDL

Here is a fair comparison, made by https://arxiv.org/abs/2505.185... using the RWKV-LM reference implementation: RWKV-7 leads in pretraining loss. Yet we have all seen multiple papers using other, broken RWKV baselines. Transformer baselines are often broken in "new LLM arch" papers too.

Image in tweet by BlinkDL

Finally, there was horrible behavior by another guy, and I asked him to remove it immediately, as everyone can confirm. This happened after @SonglinYang4 used the infamous "shabi" (a crude Chinese insult) to describe RWKV. It is sad that Songlin decided to start a fight on X again.

Image in tweet by BlinkDL
