Rethinking AI after ChatGPT: From Monolithic LLMs to Modular, Assured Systems
As part of the Johns Hopkins Institute for Assured Autonomy’s seminar series, AI pioneer Dr. Tom Dietterich recently delivered a timely talk on the state of large language models (LLMs) and on why, despite their breathtaking capabilities, we urgently need a new, modular approach to building safer, more reliable AI systems.
Speaker Background
Dr. Dietterich, Distinguished Professor Emeritus at Oregon State University, is one of the founding voices of machine learning, with over 200 publications and two books to his name; he is a Fellow of both the ACM and AAAI. His current research spans robust AI, human–AI collaboration, and AI for sustainability.
Take‑Home Message
LLMs like GPT‑4 are astonishingly capable “out of the box,” yet they suffer from fundamental flaws—hallucinations, self‑contradictions, lack of updatability, poor reasoning, and more. Industry workarounds (e.g., RLHF, retrieval augmentation, tool plugins) only patch symptoms. To build AI we can truly trust, we must architect modular systems in which an LLM is just one component.
1. Why LLMs Captivated Us
- Web‑scale pretraining equips LLMs with broad “world knowledge,” enabling:
• Question answering across many domains
• Dialogue and negotiation
• Summarization and translation
• Code generation and formal‑language conversions
• “In‑context learning” of new tasks from a handful of examples
2. The Deep Flaws of Today’s LLMs
a. Hallucinations & Contradictions
• Invent facts (e.g., false awards, synthetic citations)
• Self‑contradictory storytelling
b. Dangerous or Offensive Outputs
• Jailbreaking prompts to produce biased or harmful content
c. Verifiability & Attribution Failures
• No way to trace an answer back to source documents
d. Poor Updatability
• Re‑training costs run into the hundreds of millions of dollars, so it is nearly impossible to “teach” the model even a single new fact
e. Shaky Reasoning & Planning
• Struggle with classic AI planning benchmarks (e.g., Blocks World, logistics)
• Tend to “retrieve recipes” rather than perform symbolic reasoning
f. Calibration & Trust
• Overconfident predictions, especially after RLHF fine‑tuning
3. Patching LLMs: Industry Workarounds
• Retrieval‑Augmented Generation
– Embed and index external documents, fetch relevant passages, then condition the LLM on them to reduce hallucinations and add attribution (a minimal sketch appears after this list).
• Reinforcement Learning from Human Feedback (RLHF)
– Rank outputs by human preference, then fine‑tune the model to prefer higher‑rated responses (mitigates—but does not eliminate—toxic outputs).
• Tool & Plugin Ecosystems
– Expose calendar, web search, calculators, code compilers, robotic controllers; have the LLM “call out” to specialized modules when needed.
• Self‑Critique & Consensus Techniques
– Generate multiple candidate answers, cross‑check them, or have the LLM critique and refine its own responses.
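As promised above, here is a minimal retrieval-augmented generation sketch in Python. The `embed` and `llm_complete` functions are hypothetical placeholders (any embedding model and completion API would do); none of this code comes from the talk itself.

```python
import numpy as np

# Hypothetical stand-ins: swap in a real embedding model and LLM API.
def embed(text: str) -> np.ndarray:
    """Map text to a fixed-size vector (placeholder, not a real encoder)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

def llm_complete(prompt: str) -> str:
    """Placeholder for a call to any LLM completion API."""
    return "<answer grounded in, and citing, the quoted passages>"

def build_index(documents: list[str]) -> list[tuple[str, np.ndarray]]:
    """Embed and store each document so it can be searched by similarity."""
    return [(doc, embed(doc)) for doc in documents]

def retrieve(index: list[tuple[str, np.ndarray]], query: str, k: int = 3) -> list[str]:
    """Return the k passages whose embeddings are closest to the query's."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: -float(pair[1] @ q))
    return [doc for doc, _ in ranked[:k]]

def answer(index: list[tuple[str, np.ndarray]], question: str) -> str:
    """Condition the LLM on retrieved passages so the answer can cite sources."""
    passages = retrieve(index, question)
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    prompt = (
        "Answer using ONLY the numbered passages below and cite them by number.\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm_complete(prompt)
```

The key design point is that attribution comes along for free: the model is told which numbered passage each claim must come from, so its citations can be checked against the index.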
These techniques have enabled impressive products: Bing Copilot, GPT‑4V, Perplexity.ai, and others all build on them. Yet each still relies on one giant, monolithic model that is hard to inspect, update, or guarantee against errors.
4. A Vision for Modular AI
Rather than force every capability into a single LLM, Dr. Dietterich advocates a cognitive‑inspired, modular architecture that separates:
1. Language Understanding & Generation
2. Declarative World Knowledge (e.g., Knowledge Graphs)
3. Situation Modeling & Dialogue Management
4. Formal Reasoning & Planning Engines
5. Episodic Memory & Meta‑Cognition
Benefits (a minimal wiring sketch follows this list):
- Easier to update and audit the factual knowledge base (no need to re‑train the entire LLM).
- Stronger guarantees by plugging in verifiable planners or proof checkers.
- Clearer attribution to source documents, improving transparency.
- Resilience against hallucinations, malicious “prompt injections,” and inconsistent behaviors.
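To make the separation concrete, the following Python sketch shows one way the five components might be wired together behind explicit interfaces. It is our illustration of the idea, not code from the talk; every class and method name here is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Protocol

# Each capability hides behind a narrow interface, so a knowledge graph,
# a verified planner, or a different LLM can be swapped in and audited
# independently of the rest of the system.

class LanguageModule(Protocol):
    def parse(self, utterance: str) -> dict: ...
    def generate(self, content: dict) -> str: ...

class KnowledgeBase(Protocol):
    def query(self, pattern: dict) -> list[dict]: ...
    def assert_fact(self, fact: dict) -> None: ...  # update facts without retraining

class Planner(Protocol):
    def plan(self, goal: dict, beliefs: dict) -> list[dict]: ...

@dataclass
class SituationModel:
    """Tracks goals, beliefs, and dialogue history across turns."""
    beliefs: dict = field(default_factory=dict)
    history: list[str] = field(default_factory=list)

@dataclass
class ModularAssistant:
    language: LanguageModule
    knowledge: KnowledgeBase
    planner: Planner
    situation: SituationModel = field(default_factory=SituationModel)

    def respond(self, utterance: str) -> str:
        meaning = self.language.parse(utterance)                     # 1. understand
        self.situation.history.append(utterance)                    # 2. update situation
        facts = self.knowledge.query(meaning)                       # 3. consult declarative knowledge
        steps = self.planner.plan(meaning, self.situation.beliefs)  # 4. reason / plan formally
        return self.language.generate(                              # 5. generate the reply
            {"meaning": meaning, "facts": facts, "plan": steps}
        )
```

The numbered steps mirror the five-way separation above; episodic memory and meta-cognition would wrap this loop, logging each turn and flagging when a module's output should not be trusted.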
5. Building the Knowledge Base
Past efforts (e.g., Tom Mitchell’s Never‑Ending Language Learning (NELL) project) crawled the web to extract triples and gradually curated a large knowledge graph. Newer work harnesses LLMs themselves to:
• Extract and canonicalize relations from text
• Automatically define and merge synonymous predicates
• Populate a graph that can then answer queries or feed back into the LLM (a toy sketch follows)
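A toy version of this extraction loop, with a placeholder `llm_complete` call and a hypothetical synonym table standing in for learned predicate merging:

```python
import json

def llm_complete(prompt: str) -> str:
    """Placeholder for any LLM API; assumed to return JSON triples."""
    return '[["Tom Dietterich", "professor_at", "Oregon State University"]]'

# Hypothetical synonym table; in a real system the LLM itself can propose
# which predicates to merge ("works_at" and "teaches_at" mean "professor_at").
CANONICAL = {"works_at": "professor_at", "teaches_at": "professor_at"}

def extract_triples(passage: str) -> list[tuple[str, str, str]]:
    """Ask the LLM for (subject, relation, object) triples, then canonicalize."""
    prompt = (
        "Extract factual triples from the passage as a JSON list of "
        f"[subject, relation, object] arrays.\n\nPassage: {passage}"
    )
    raw = json.loads(llm_complete(prompt))
    return [(s, CANONICAL.get(r, r), o) for s, r, o in raw]

def add_to_graph(graph: dict, triples: list[tuple[str, str, str]]) -> None:
    """Store triples as sets of objects keyed by (subject, relation)."""
    for s, r, o in triples:
        graph.setdefault((s, r), set()).add(o)

graph: dict = {}
add_to_graph(graph, extract_triples("Tom Dietterich is a professor at OSU."))
print(graph)  # {('Tom Dietterich', 'professor_at'): {'Oregon State University'}}
```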
6. Towards Richer Dialogue: Situation Models
Human conversation relies on an internal “situation model” that tracks goals, beliefs, events, and intentions. A next‑generation system would:
1. Update its situation model on every user utterance.
2. Plan or adapt its dialogue strategy based on that model.
3. Retrieve relevant knowledge, reason or plan formally, and generate the next utterance. (A toy version of this three‑step loop is sketched below.)
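Here is a deliberately simplified Python rendering of that update, plan, and generate cycle; the goal extraction and dialogue strategies are toy stand-ins, not anything proposed in the talk:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SituationModel:
    """Illustrative per-conversation state: goals, beliefs, and events."""
    user_goal: Optional[str] = None
    beliefs: dict = field(default_factory=dict)
    events: list[str] = field(default_factory=list)

def update(model: SituationModel, utterance: str) -> None:
    """Step 1: fold the new utterance into the situation model."""
    model.events.append(utterance)
    if utterance.lower().startswith("i want"):
        model.user_goal = utterance[6:].strip()  # toy goal extraction

def choose_strategy(model: SituationModel) -> str:
    """Step 2: pick a dialogue move based on the current model."""
    return "clarify_goal" if model.user_goal is None else "work_toward_goal"

def respond(model: SituationModel, strategy: str) -> str:
    """Step 3: retrieve knowledge / plan as needed, then generate the reply."""
    if strategy == "clarify_goal":
        return "What would you like to accomplish?"
    return f"Here is a first step toward: {model.user_goal}"

model = SituationModel()
for turn in ["Hello!", "I want to plan a trip to Portland"]:
    update(model, turn)
    print(respond(model, choose_strategy(model)))
```

The point of keeping the model explicit is that off-topic or pragmatically unsafe moves can be caught by inspecting the state, rather than hoping the LLM's hidden activations track the conversation correctly.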
7. Open Challenges & Call to Action
Dr. Dietterich closed by emphasizing that AI is far from solved. Key research questions include:
• How best to extract, represent, and maintain factual knowledge at web scale?
• How to build accurate situation models that prevent “off‑topic” or pragmatically dangerous behavior?
• How to integrate perception, episodic memory, planning, and meta‑cognition?
• How to reason about source trustworthiness and conflicting evidence?
He urged graduate students and researchers to dive into these problems. Commercial LLMs may grab headlines today, but a new wave of assured, modular AI lies ahead—and the future needs fresh talent to build it.