Why Neural Network Can Discover Symbolic Structures with Gradient-based Training: An Algebraic and Geometric Foundation

Best AI papers explained - A podcast by Enoch H. Kang

This academic paper introduces a theoretical framework explaining how discrete symbolic structures can emerge naturally in neural networks through continuous gradient-based training. The authors model neural network optimization as a Wasserstein gradient flow over a measure space of parameters, and show that under geometric constraints such as group invariance, training induces gradient decoupling and a reduction in the effective degrees of freedom. This drives the network toward compositional representations that align with the task's algebraic operations, yielding emergent solutions to symbolic reasoning tasks. The paper also derives data scaling laws for learning such symbolic tasks and offers guidelines for designing neurosymbolic architectures that combine continuous learning with discrete algebraic reasoning.
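
To make the measure-space viewpoint concrete, here is a minimal sketch (not the paper's construction): gradient descent on a two-layer network can be read as a particle discretization of a Wasserstein gradient flow, where each hidden neuron is one particle and the network is a functional of the empirical measure of neurons. The toy rotation-invariant target, the names, and the hyperparameters below are illustrative assumptions, not the paper's setup.

```python
# Sketch: particle view of gradient descent as a discretized Wasserstein gradient flow.
# Each "particle" is one hidden neuron's parameters; updating all particles by plain
# gradient descent approximates a gradient flow of the risk over the neuron measure.
import numpy as np

rng = np.random.default_rng(0)
d, n_neurons, n_data = 2, 256, 512

# Toy group-invariant target (assumption): f*(x) depends only on ||x||,
# i.e. it is invariant under rotations of the input.
X = rng.normal(size=(n_data, d))
y = np.sin(np.linalg.norm(X, axis=1))

# Particles: hidden unit i has input weights w_i in R^d and output weight a_i.
W = rng.normal(size=(n_neurons, d))
a = rng.normal(size=n_neurons) / n_neurons

def forward(X, W, a):
    # Mean-field parameterization: f(x) = (1/N) * sum_i a_i * tanh(w_i . x)
    return np.tanh(X @ W.T) @ a / n_neurons

lr = 0.5
for step in range(2000):
    pred = forward(X, W, a)
    err = pred - y                    # residual of the squared loss
    H = np.tanh(X @ W.T)              # hidden activations, shape (n_data, n_neurons)
    # Per-particle gradients of the empirical risk; simultaneous updates on all
    # particles are the finite-N discretization of the Wasserstein gradient flow.
    grad_a = (err @ H) / (n_data * n_neurons)
    grad_W = ((err[:, None] * (1 - H**2) * a[None, :]).T @ X) / (n_data * n_neurons)
    a -= lr * n_neurons * grad_a      # mean-field learning-rate scaling
    W -= lr * n_neurons * grad_W
    if step % 500 == 0:
        print(f"step {step}: mse = {np.mean(err**2):.4f}")
```

In this reading, the symmetry of the target restricts which directions in parameter space carry useful gradient signal, which is the informal intuition behind the paper's gradient decoupling and degrees-of-freedom reduction under group invariance.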