
Ahmed Dawod • 4 months ago

This is one of the most intellectual reads I've had in ML in a while. It's a long read, but really, really interesting.
Constantly reading about scaling LLMs makes you feel the world is stupid :)

Fang-Pen Lin • 4 months ago

Thanks for the kind words! 😄

Yeah, I don't like the "if you have a hammer, everything looks like a nail" approach. I like to explore ideas from first principles. I miss the old days when OpenAI was still exploring interesting things, like emergent behavior in hide and seek. Now everybody only focuses on LLMs, which is a bit sad. But you know what? If they don't do the research I like, I'll do it for them! 😂

Matthew Adams • 4 months ago

Fascinating article. Looking forward to more.

One thought struck me while reading it, similar in its bedrock-shaking impact to the notion of throwing out backpropagation. In both backpropagation and marketplace strategies, the architecture of the layers is constant. What if, somehow, the neural network architecture itself could be incrementally altered? This seems to me like some kind of metamarketplace idea, though I obviously haven't really thought it through. It seems like a different enough beast that backpropagation wouldn't apply, or would take a form that hasn't been seen yet. What I find appealing about a metamarketplace approach is that the process of modifying the architecture itself mimics, I suspect, the plasticity of organic neural networks.

Fang-Pen Lin • 4 months ago

Thanks for the kind words! 🙌 Yes, as mentioned in this article, the idea of Marketplace actually stemmed from building MAZE (Massive Augmented Zonal Environments):

https://fangpenlin.com/posts/2025/02/06/maze-how-i-would-build-agi/
https://fangpenlin.com/posts/2025/02/18/maze-my-ai-models-are-finally-evolving/

The main focus of that project was to evolve the structure of neural networks. At the same time, I also wanted to evolve their weights, which is why I took a detour and developed Marketplace. According to the Lottery Ticket Hypothesis, well-performing sub-networks are already present inside a randomly initialized network:

https://nearlyright.com/how...
https://www.youtube.com/watch?v=jeFMWtddkTs

Therefore, I view the training process as akin to CNC machining—removing unwanted parts. However, this approach limits the types of neural networks we can train because you can only remove so much from the initial structure.
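To make the CNC analogy concrete, here is a minimal magnitude-pruning sketch (my own illustration, not code from the post or from Marketplace): starting from a fixed dense weight block, we "machine away" the smallest-magnitude weights, and the surviving mask is the sub-network. The shapes and pruning fraction are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))  # the dense "initial block of material"

def prune_smallest(W, fraction):
    """Zero out the given fraction of smallest-magnitude weights."""
    k = int(W.size * fraction)
    threshold = np.sort(np.abs(W).ravel())[k]
    mask = np.abs(W) >= threshold  # True = weight survives
    return W * mask, mask

W_pruned, mask = prune_smallest(W, fraction=0.75)
print(mask.mean())  # fraction of weights kept: 0.25
```

The limitation mentioned above shows up directly: the mask can only remove entries from `W`; it can never grow `W` beyond its initial 8×8 shape.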

My dream is to build a "3D printer" for neural networks. As you mentioned, we should be able to extend (or evolve) the neural model layer by layer, neuron by neuron, as needed. I believe this is 100% possible, as it mirrors how nature works.

The real challenge is doing it efficiently. Currently, models must be compiled into GPU kernels to run efficiently on the cores; once compiled, they are largely fixed, allowing only changes to the input data. JIT compilers can generate new kernels on the fly, but that process is still time-consuming. As I noted in the article, without that efficiency, the scale required to exhibit useful traits may be prohibitively expensive.
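The "3D printer" direction can be sketched in a few lines (again my own illustration, not an actual Marketplace or MAZE mechanism): growing a hidden layer by one neuron means appending a row to one weight matrix and a column to the next. The new output column starts at zero so the network's function is initially unchanged, and note that this shape change is exactly what invalidates a compiled GPU kernel.

```python
import numpy as np

rng = np.random.default_rng(1)

# A tiny two-layer MLP: 4 inputs -> 3 hidden -> 2 outputs
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(2, 3))

def add_hidden_neuron(W1, W2, rng):
    """Grow the hidden layer by one neuron.

    Appends a new input-weight row to W1 and a matching zero
    output column to W2, so the computed function is unchanged
    until training adjusts the new column.
    """
    new_row = rng.normal(size=(1, W1.shape[1]))
    new_col = np.zeros((W2.shape[0], 1))  # inert at first
    return np.vstack([W1, new_row]), np.hstack([W2, new_col])

W1, W2 = add_hidden_neuron(W1, W2, rng)
x = rng.normal(size=4)
h = np.maximum(W1 @ x, 0.0)  # ReLU hidden layer, now 4 units
y = W2 @ h
print(W1.shape, W2.shape)    # (4, 4) (2, 4)
```

In NumPy this resizing is trivial; the efficiency problem discussed above is that a GPU kernel compiled for the (3, 4)/(2, 3) shapes has to be regenerated after every such growth step.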

I’m also thinking hard about how to integrate my work with MAZE to evolve both structure and weights simultaneously, ideally on GPUs. Otherwise, the process would be too slow.