So now Songlin is mad. It began when I saw an obviously wrong MQAR result for RWKV-7 posted by Songlin (see https://x.com/BlinkDL_AI/statu...). I told Songlin to use RWKV-LM, and got a very fierce reply in the official FLA group. Songlin pinned the personal attack for several days.🙃
Contrary to @SonglinYang4's claim, the RWKV community (including zhiyuan and more) has been working hard to align FLA with RWKV-LM. Moreover, there is a reference RWKV-7 implementation at https://github.com/BlinkDL/RWK... with only a few hundred lines of essential code.
Here is a fair comparison made by https://arxiv.org/abs/2505.185... using the RWKV-LM reference implementation. RWKV-7 leads in pretraining loss. But we have all seen multiple papers using other broken RWKV baselines. Transformer baselines are often broken in "new LLM arch" papers too.
Finally, there was horrible behavior by another guy, and I asked him to remove it immediately, as everyone can confirm. This happened after @SonglinYang4 used the infamous "shabi" (a crude Chinese insult, roughly "idiot") to describe RWKV. It is sad that Songlin decided to start a fight on X again.