This is the second piece in a series examining the most effective uses of LLMs and why a more holistic approach to AI combined with different forms of reasoning is needed to help us make better decisions.

In our previous article in this series, we explored the current landscape of AI, unraveling its potential and limitations. While Large Language Models (LLMs) are powerful tools for understanding and generating human language, they lack the rigorous precision needed for solving complex problems where accuracy and transparency are paramount.

LLMs are so good at producing fluent, intelligent-sounding language that they are creating expectations that people can speak with machines and have them act in “super intelligent” ways. This is driving unprecedented demand for AI from businesses, which now expect LLMs to solve complex problems that require more formal, precise, and efficient mathematical logic. The problem is that LLMs alone do not, and will not, meet this expectation, based strictly on how they work.

Today, we dive deeper into the notions of precision and complex reasoning, and step into the fascinating realm of language itself. How can we leverage the power of AI to solve complex problems while ensuring its insights are transparent, accountable, and accurate? The answer lies in bridging the gap between natural language and formal reasoning.

The paradox of natural language

Humans effortlessly weave knowledge and meaning using natural language, relying on collective experiences and shared understandings. However, this inherent efficiency comes at a cost. Personal experiences, biases, and emotions infuse our words, blurring the edges of meaning and leading to divergent interpretations of the same text.

Even in situations where clarity is paramount, natural language can be surprisingly ambiguous. Take prescription instructions, for example. The simple phrase “Take one dose bi-weekly” can be read as either twice a week or once every two weeks, a divergence that can lead to potentially harmful misunderstandings. Patients might take the wrong dosage, jeopardizing the effectiveness of the treatment or even experiencing adverse effects. This is just one very simple example that highlights the subjectivity inherent in natural language.

Without the grounding of a formal system, language expressions are just pointers into our common (or not so common) experiences, leaving them open to misinterpretation and misunderstanding. This subjectivity can lead individuals to believe they are on the same page when they’re not, resulting in confusing conversations and a failure to establish shared meaning.

This phenomenon plays out across various situations, from simple instructions to complex legal documents and even in esoteric political discussions. Individuals may believe they understand a message, only to later discover they were off the mark, or conversely, find common ground despite initial disagreements.

This inherent paradox of natural language presents a significant challenge for clear and reliable communication. On the one hand, it allows us to connect and share information with remarkable efficiency. On the other hand, its ambiguity can lead to misunderstandings and hinder our ability to effectively collaborate and solve complex problems.

This is why the development of formal languages and reasoning systems has been so crucial: they provide a framework for precise communication and logical analysis. For complex, high-stakes decisions, we have learned to separate natural language, and even knowledge, from the rigorous and carefully designed mechanisms of formal reasoning.

The unambiguous world of formal language

Humans created formal languages, like mathematics, because natural language is ambiguous, imprecise, and opaque, which makes it a poor medium for accurate thinking and calculation. A sequence of statements in a formal language, by contrast, has a precise meaning and consistent answers. It’s like having a rulebook that everyone follows, so there’s no confusion. That rulebook has precise, unambiguous, and complete definitions, plus a clear, objective procedure for how to follow the rules.

Reliable decision-making processes, like any formal reasoning, require precise semantics, or meaning. Systems like computer programming languages are formal languages. They use clear rules and have specific ways to get correct answers. There is a mathematically precise, objectively correct decision procedure to determine exactly how to interpret each expression, and how its meaning feeds into the meaning of the next expression.

There is no room for alternative interpretations. That logic is encoded by the programming language compiler. You run the compiler, and the program works or it doesn’t. If it works, it produces the same output every single time, output that is definitionally consistent with the meaning of the program. Correctness is grounded and well-defined; it is not subject to interpretation.
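
To make that concrete, here is a minimal sketch in Python of a toy formal language. It is illustrative only (the grammar and names are invented for this article), but it shows what precise semantics means in practice: every well-formed expression has exactly one value, computed by a fixed, objective procedure.

```python
# A toy formal language: expressions built from numbers, +, and *.
from dataclasses import dataclass
from typing import Union

@dataclass
class Num:
    value: float

@dataclass
class Add:
    left: "Expr"
    right: "Expr"

@dataclass
class Mul:
    left: "Expr"
    right: "Expr"

# A well-formed expression is a Num, Add, or Mul -- nothing else.
Expr = Union[Num, Add, Mul]

def evaluate(e: Expr) -> float:
    """The decision procedure: the meaning of an expression is built
    deterministically from the meanings of its sub-expressions."""
    if isinstance(e, Num):
        return e.value
    if isinstance(e, Add):
        return evaluate(e.left) + evaluate(e.right)
    if isinstance(e, Mul):
        return evaluate(e.left) * evaluate(e.right)
    raise TypeError(f"not a well-formed expression: {e!r}")

# (2 + 3) * 4 means 20 -- for every reader of the rulebook, every time.
assert evaluate(Mul(Add(Num(2), Num(3)), Num(4))) == 20
```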

While natural language is great for everyday communication, it’s simply not powerful enough for mission-critical thinking and decision-making. We may use natural language to articulate to others the findings of our critical thinking and decision-making, but natural language is too subjective and slippery a mechanism through which to automate that critical thinking and decision-making process.

For these tasks, we need a more robust and reliable tool like formal language. For any complex reasoning task where you can’t afford to be wrong, we may start with natural language expressions and then go through the process of encoding them into some form of mathematical logic. We do this to transform something that is ambiguous into something grounded in precise semantics to reliably get accurate answers. I address how we can achieve this later in this series. For now, let’s focus on the challenge natural language presents in the context of LLMs.
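
As a small illustration of what encoding buys us (a Python sketch of my own, not the full process described later in this series), consider formalizing the “bi-weekly” prescription from earlier. Writing it down formally forces the ambiguity into the open: each candidate reading becomes a distinct, precisely checkable schedule.

```python
# Two incompatible formalizations of "take one dose bi-weekly".
# Once encoded, each reading gives exact, reproducible answers.

def doses_every_two_weeks(days: int) -> int:
    # Reading 1: one dose on day 0, then one every 14 days.
    return len(range(0, days, 14))

def doses_twice_weekly(days: int) -> int:
    # Reading 2: two doses per 7-day week, here on days 0 and 3 of each week.
    return sum(1 for d in range(days) if d % 7 in (0, 3))

# Over 28 days the two readings disagree by a factor of four:
assert doses_every_two_weeks(28) == 2
assert doses_twice_weekly(28) == 8
```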

The natural language challenge with LLMs

I’ve talked about the paradox of natural language and its fallibility in the face of solving complex problems. We also looked at the necessity of an unambiguous formal language to step in and solve those complex problems without the burden of uncertain semantics.

But of course, what we’re really evaluating here is the ability of LLMs, which rely on natural language, to solve complex problems correctly every single time. And because they rely on natural language, they inherently bring its ambiguity, imprecision, and opaqueness to the answers they give.

Recently, I met a client who is using an LLM to answer their customers’ questions. They shared how the LLM enables fantastically fluent and well-written responses. They also shared that while some answers were good, others were simply wrong, and the LLM even insisted it was right, offering false justifications. The problem? Their customers, lacking expertise in the subject matter, couldn’t tell the difference and took both correct and incorrect answers as truthful responses.

One may suggest that the anecdote is unfair because the LLM sometimes gets it right even if it sometimes gets it wrong, but that is precisely the point. Its reasoning method is unpredictable because it is not a formal reasoning system. It “reasons” by predicting what a human might say next, based on a statistical analysis of what humans have said before in similar contexts.

One might also point out that people should not expect LLMs to reason formally given how they are built. That is correct, but many people increasingly have these expectations, and the more humans and businesses rely on LLMs, the greater this problem becomes.

The challenge with using LLMs to solve complex business problems

All this raises a crucial question: How can we ensure the quality of LLMs when solving complex business problems?

Before we tackle that, let’s ask ourselves what a complex business problem actually is. This is not about solving SAT word problems, word games, or leisure puzzles like sudoku. It is a broad class of problems that involve long chains of mathematically precise reasoning steps, for example:

  1. Generating optimal plans for a supply chain model that can account for many shifting constraints and real-time data across every node in a network.
  2. Analyzing complex investment scenarios to optimize financial portfolios and make major investment decisions.
  3. Accelerating complex pharmaceutical literature review to find new targets for molecules, or secondary indications for existing drugs.

These are complex business problems where:

  1. Lines of reasoning that justify answers are long and multidirectional.
  2. There is a large and complex network of logical dependencies that require precise, flawless forward and backward logical and numerical computation.
  3. Mistakes can have disastrous financial consequences.

This is a class of business problems where the outcome needs to be 100% accurate, and where every step in the long chain of mathematically precise reasoning is an opportunity for the LLM to drop the ball. A single error causes the entire chain to break, because the chain is only ever as strong as its weakest link.

Consider a line of reasoning that has 10 steps. Even if we charitably say an LLM might achieve 90% accuracy at each step, after 10 steps accuracy has dropped to 35%. After 20 steps, typical with complex business problems, it’s 12%. That is unacceptable to businesses when they can’t afford to be wrong.
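
The arithmetic behind those numbers is simple compounding, assuming each step succeeds or fails independently with the same per-step accuracy:

```python
# End-to-end accuracy of an n-step reasoning chain: per-step accuracy
# raised to the power of the chain length.
def chain_accuracy(per_step: float, steps: int) -> float:
    return per_step ** steps

print(f"{chain_accuracy(0.9, 10):.0%}")  # 35%
print(f"{chain_accuracy(0.9, 20):.0%}")  # 12%
```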

In these cases, how can we ensure the quality of LLMs when solving complex business problems?

Many people think the short answer is simply more training data, supervising and fine-tuning the LLM, implementing chain of thought, or using Retrieval Augmented Generation (RAG). However, this isn’t always enough if you really want to ensure accuracy and transparency — it’s like expanding a natural language system’s vocabulary for specific domains or restricting what it’s able to say, not moving it into the realm of complex reasoning.

If you start training, supervising, or fine-tuning the LLM, you’re then likely to have to make additional provisions for its use, including:

  • Establishing a dependable procedure for verifying the accuracy of answers, so that you’re not simply relying on subjective interpretations of the words used.
  • Recognizing that any such verification procedure will itself involve complex reasoning, requiring a flawless application of math and logic at which even experts might fail.

Once you get into the complexity of using an LLM that needs additional functionality like explainability, verifiable accuracy, or the ability to perform complex reasoning, then automating the task with an LLM alone is simply not the best choice.

LLMs have generated demand for AI capable of complex reasoning that they can’t deliver alone

For complex reasoning problems where we need to ground our understanding in a reliable decision-making process, natural language is not the right medium. Without an underlying formalism, natural language’s ambiguity and subjectivity are fine for casually navigating another human’s brain, but they are not suited to ensuring precise, reliable outcomes.

LLMs are incredibly powerful tools for transforming and generating natural language, and they are driving unprecedented demand for AI from businesses, but they have fundamental limits.

Reliable complex reasoning requires formal and efficient mathematical algorithms that it would be absurd to try to learn through natural language. This includes formal systems and computational machinery such as mathematical constraint modeling, efficient constraint propagation and backtracking, dependency management and truth maintenance, and many more.
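
To give a flavor of this machinery, here is a deliberately tiny sketch of backtracking search with constraint checking over a toy scheduling problem. The problem and names are invented for illustration; real solvers add constraint propagation, dependency tracking, and much more.

```python
# A minimal constraint solver: backtracking search over finite domains,
# pruning any partial assignment that violates a constraint.
from typing import Callable, Optional

Assignment = dict[str, int]
Constraint = Callable[[Assignment], bool]

def consistent(partial: Assignment, constraints: list[Constraint]) -> bool:
    # Constraints are written to return True until every variable they
    # mention is bound, so partial assignments can be checked safely.
    return all(c(partial) for c in constraints)

def solve(variables: list[str], domains: dict[str, list[int]],
          constraints: list[Constraint],
          partial: Optional[Assignment] = None) -> Optional[Assignment]:
    partial = partial or {}
    if len(partial) == len(variables):
        return partial                            # every variable assigned
    var = next(v for v in variables if v not in partial)
    for value in domains[var]:
        candidate = {**partial, var: value}
        if consistent(candidate, constraints):    # prune inconsistent branches
            result = solve(variables, domains, constraints, candidate)
            if result is not None:
                return result
        # otherwise: backtrack and try the next value
    return None

# Toy problem: schedule two deliveries on days 1-3, truck B after truck A.
domains = {"truck_a": [1, 2, 3], "truck_b": [1, 2, 3]}
constraints = [
    lambda a: ("truck_a" not in a or "truck_b" not in a
               or a["truck_b"] > a["truck_a"]),
]
print(solve(["truck_a", "truck_b"], domains, constraints))
# -> {'truck_a': 1, 'truck_b': 2}
```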

Using LLMs to create fluent natural language interfaces to these formal systems can be a game-changer for solving complex business problems across many industries, but it’s critical to recognize that LLMs alone are not adequate. LLMs unlock huge opportunities for humans to interact with machines, but AI must go beyond natural language to include the rigor of formal reasoning. We need more holistic AI that is accurate, reliable, and transparent in order to shape a better tomorrow.
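
Schematically, that division of labor looks like this. Everything in the sketch is a stand-in: `llm_translate` fakes an LLM call with a lookup table, and the “solver” is plain arithmetic, but the shape is the point: language in front, formal reasoning behind.

```python
# Schematic only: the LLM layer handles language, a formal system handles
# the reasoning. Both functions are invented stand-ins, not a real API.

def llm_translate(question: str) -> dict:
    """Stand-in for an LLM call: map fluent natural language to a
    structured, formal problem description."""
    canned = {
        "How many units can we ship with 3 trucks of capacity 40?":
            {"op": "multiply", "args": [3, 40]},
    }
    return canned[question]

def formal_solve(model: dict) -> int:
    """Stand-in for a formal reasoning engine: precise semantics,
    the same answer every time."""
    ops = {"multiply": lambda x: x[0] * x[1],
           "add": lambda x: x[0] + x[1]}
    return ops[model["op"]](model["args"])

question = "How many units can we ship with 3 trucks of capacity 40?"
print(formal_solve(llm_translate(question)))  # -> 120
```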

Having dissected the inherent limitations of natural language and the precision of formal systems, next in our series, we will explore how to combine the two to deliver more holistic AI.