Intelligence vs. Skill: A Conversation with Alessandro Pomerini on Bayesian Program Learning
In this thought-provoking discussion, Alessandro Pomerini shares insights from his research at the Santa Fe Institute, where he works with renowned AI researcher Melanie Mitchell. Alessandro, who transitioned from professional gaming to AI research, offers fascinating perspectives on the nature of intelligence, skill acquisition, and program learning.
The Distinction Between Skill and Intelligence
According to François Chollet, whom Alessandro references extensively, there's a fundamental difference between skill at a specific task and intelligence:
"Intelligence for Chollet is the efficiency with which you can take whatever information you're starting with—whether prior knowledge or just data—and turn that information into new skills that perform well across a scope of tasks."
Alessandro illustrates this with his reflections on OpenAI's DOTA 2 bot:
"When I first saw OpenAI Five beat world champions, it was mind-blowing. As a professional gamer, I understood the skill required to play at that level. But looking back years later, I don't find the DOTA system that impressive anymore—it trained on the equivalent of 45,000 years of gameplay data. It's incredibly inefficient in how it uses data to acquire skills."
This inefficiency contrasts sharply with human intelligence, which can learn new skills with remarkable data efficiency. The DOTA bot demonstrates tremendous skill but lacks the intelligence to adapt to new scenarios that weren't included in its training data.
The ARC Challenge: Measuring Intelligence, Not Just Skill
The Abstraction and Reasoning Corpus (ARC) created by François Chollet offers a benchmark for measuring intelligence rather than skill. Alessandro explains:
"For Chollet, there's essentially no task where performing well demonstrates intelligence unless it's a meta-task evaluating how efficiently you can acquire skills across a broad range of tasks while controlling for priors and experience."
ARC consists of a thousand tasks, each containing a few input-output grid pairs that demonstrate transformations. The challenge is to identify the underlying concept from just 2-3 examples and apply it to new inputs. This approach tests a system's ability to induce rules efficiently from minimal data—a hallmark of intelligence rather than mere pattern recognition or memorization.
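To make the setup concrete, here is a minimal sketch of what an ARC-style task looks like and how a solver might induce a rule from a single example pair. The grids, the two candidate transformations, and the `solve` helper are all illustrative stand-ins, not part of the actual ARC toolkit:

```python
# Hypothetical sketch of an ARC-style task: a few input -> output grid
# pairs demonstrate a transformation, and the solver must infer the rule
# and apply it to a fresh test input.

def flip_h(grid):
    """Mirror a grid left-to-right."""
    return [row[::-1] for row in grid]

def transpose(grid):
    """Swap rows and columns."""
    return [list(col) for col in zip(*grid)]

# A tiny hypothesis space of candidate transformations (stand-ins).
CANDIDATES = [flip_h, transpose]

def solve(train_pairs, test_input):
    """Return the output of the first candidate consistent with every example."""
    for f in CANDIDATES:
        if all(f(x) == y for x, y in train_pairs):
            return f(test_input)
    return None  # no consistent hypothesis found

pairs = [([[1, 2], [3, 4]], [[2, 1], [4, 3]])]
print(solve(pairs, [[5, 6], [7, 8]]))  # [[6, 5], [8, 7]]
```

Real ARC tasks are far richer, of course: the hypothesis space is effectively unbounded, which is exactly why efficient search becomes the central problem.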
From Deduction to Induction: The Role of Conjecture
Drawing from philosophers like David Deutsch and Karl Popper, Alessandro highlights the importance of conjecture in solving these kinds of problems:
"Looking at a single ARC task with just two or three examples, there's no way to mechanistically derive the correct program. A lot of it is induction—it's essentially guessing. As Karl Popper said, 'Scientific theories are not derived from observations; they're tested on observations.'"
This perspective frames problem-solving as a search process, where we conjecture possible solutions and test them against observations. The challenge becomes: how do we search efficiently?
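The conjecture-and-test framing can be sketched as enumerative program search: generate candidate programs (compositions of primitives), then test each conjecture against the observed examples. The primitives and tasks below are invented for illustration and do not come from any system discussed here:

```python
from itertools import product

# Conjecture-and-test as naive enumerative search. A "program" is a
# sequence of primitive functions composed left to right.

def inc(x):
    return x + 1

def double(x):
    return x * 2

PRIMITIVES = [inc, double]

def search(examples, max_depth=3):
    """Enumerate compositions up to max_depth; return the first that fits."""
    for depth in range(1, max_depth + 1):
        for prog in product(PRIMITIVES, repeat=depth):
            def run(x, prog=prog):
                for f in prog:
                    x = f(x)
                return x
            # Test the conjecture against every observation.
            if all(run(x) == y for x, y in examples):
                return [f.__name__ for f in prog]
    return None

# (x + 1) * 2 is consistent with both observations.
print(search([(1, 4), (3, 8)]))  # ['inc', 'double']
```

Brute-force enumeration like this blows up exponentially with program depth, which motivates the question the next section takes up: how do we make this search tractable?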
DreamCoder: An Innovative Program Synthesis System
Alessandro delves into DreamCoder, a program induction system developed by Kevin Ellis during his PhD at MIT. DreamCoder learns to solve problems while simultaneously learning how to simplify the search for solutions:
"DreamCoder is an algorithm that, as it's learning and searching for program solutions, also learns how to simplify search for those solutions through two mechanisms: reducing the depth of search and reducing the breadth of search."
The system works in three stages:
1. **Search** for program solutions
2. **Chunk** program components and add them to a library (reducing search depth)
3. **Train** a neural search policy to guide future searches (reducing search breadth)
These stages bootstrap each other in an iterative process of discovery and learning.
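The three-stage loop can be sketched schematically. Everything below is a placeholder skeleton to show how the stages feed one another, not DreamCoder's actual implementation; the function names and stub bodies are assumptions for illustration:

```python
# Schematic skeleton of the search/chunk/train loop described above.
# Each stage body is a stub standing in for the real machinery.

def search(tasks, library, policy):
    """Stage 1: enumerate programs over the library, guided by the policy."""
    return {t: f"solution_for_{t}" for t in tasks}  # stub solutions

def chunk(solutions, library):
    """Stage 2: add recurring program fragments to the library (less depth)."""
    library.append(f"chunk_{len(library)}")  # stub: one new chunk per round
    return library

def train_policy(solutions, policy):
    """Stage 3: fit a neural search policy to the solutions (less breadth)."""
    return policy  # stub: a real system trains on (task, program) pairs

library, policy = ["primitive_0"], None
for iteration in range(3):  # the stages bootstrap each other
    solutions = search(["task_a", "task_b"], library, policy)
    library = chunk(solutions, library)
    policy = train_policy(solutions, policy)
print(library)  # the library grows by one chunk per iteration
```

The point of the skeleton is the feedback structure: each round's solutions enrich the library and the policy, which in turn make the next round's search cheaper.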
Bayesian Program Learning by Decompiling
Alessandro's research builds on DreamCoder with a novel approach: instead of chunking program components based solely on compression efficiency, his system leverages knowledge embedded in the neural search policy:
"Rather than chunk program components based on what best compresses our current solutions, we look at the functionality that the neural network wants to use most often. We try to extract useful program components that it wants to use and add them to the library."
This "decompiling" approach extracts high-level patterns from the neural network's weights, creating useful chunks that align with how the system is already trying to search for solutions. The result is faster learning with less data—essentially improving the system's intelligence (in Chollet's sense) rather than just its skill.
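A rough way to picture the idea: instead of chunking by compression alone, inspect which components the policy favours and promote the most-used pattern to the library. The real approach extracts this knowledge from the network's weights; the sketch below substitutes a frequency count over policy-sampled programs as a simplified stand-in, and the operation names are invented:

```python
from collections import Counter

def sample_programs_from_policy(n=100):
    """Stand-in for sampling the neural policy's preferred programs."""
    return [("rotate", "flip", "recolor")] * 60 + [("flip", "recolor")] * 40

def extract_chunk(samples):
    """Promote the most frequent adjacent pair of operations to a new chunk."""
    pairs = Counter()
    for prog in samples:
        for a, b in zip(prog, prog[1:]):
            pairs[(a, b)] += 1
    return max(pairs, key=pairs.get)

library = ["rotate", "flip", "recolor"]
chunk = extract_chunk(sample_programs_from_policy())
library.append("+".join(chunk))  # the policy's favourite pair becomes one step
print(library[-1])
```

The payoff is alignment: the new chunks are, by construction, the pieces the search policy already wants to reach for, so depth reduction and breadth reduction reinforce each other.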
Results and Future Directions
When asked about results, Alessandro explains that his decompiling approach helps systems learn to solve more problems faster:
"By leveraging knowledge learned by the neural recognition model, we can make more useful chunks when we have very limited experience. This feeds back into the system, allowing it to solve more tasks. We start building a lead, and once we have more tasks, we can learn to make even better chunks and generalize more."
While his current work shows promise, Alessandro is already exploring new approaches to the ARC challenge that move beyond traditional program space search:
"I'm doing a similar sort of combinatorial search that will result in a transformation rule, but it's not searching directly in program space. It has properties that allow the search mechanism to be more efficient, adapt to feedback, prune large parts of the search space, and jump to corresponding places more easily."
Conclusion
As AI systems continue to demonstrate impressive skills in specific domains, Alessandro's research reminds us that true intelligence lies in efficient skill acquisition rather than performance alone. By understanding the mechanisms that allow systems to learn from minimal data—chunking useful components, guiding search efficiently, and leveraging knowledge across tasks—we move closer to building AI with more human-like intelligence.
The conversation highlights a fascinating aspect of AI research: sometimes, the most impressive systems aren't those that perform tasks with superhuman skill, but those that learn with human-like efficiency. This perspective may prove crucial as we work toward more adaptable, generalizable AI systems that can truly understand the world rather than just excel at narrowly defined tasks.