AI, particularly generative AI and large language models (LLMs), has made tremendous technological progress and is reaching an inflection point for widespread industry adoption. With McKinsey reporting that AI powerhouses are already “going all in on artificial intelligence,” companies know they must embrace the latest AI technologies or be left behind.
However, the AI safety field is still immature, and this poses enormous risks to companies deploying the technology. It is not difficult to find examples of AI and machine learning (ML) behaving badly. In fields from medicine to law enforcement, algorithms that are supposed to be fair and unbiased have been found to harbor hidden biases that exacerbate existing social inequalities and pose enormous reputational risks to their creators.
Microsoft's Tay chatbot is perhaps the most well-known cautionary tale for businesses. Trained to converse in a teenage conversational style, it was re-educated by internet trolls into spewing unfiltered racist, misogynistic bile and was quickly pulled offline by the embarrassed tech giant. But the damage had been done. Even the much-vaunted ChatGPT has been called “stupider than you think.”
Corporate leaders and boards understand that their companies must begin to leverage the transformative potential of Gen AI. But how do you even begin to identify and prototype early use cases when you are working in a minefield of AI safety issues?
The answer is to focus on a class of use cases I call “needle in a haystack” problems. A haystack problem is one in which it is relatively difficult for humans to search for or generate potential solutions, but relatively easy for them to verify candidate solutions once found. The unique nature of these problems makes them ideally suited for early industrial use cases and adoption. And once you recognize the pattern, you realize that haystack problems abound.
Here are some examples.
1: Copy edit
Checking long documents for spelling and grammar errors is tedious. Computers have been able to spot spelling errors since the early days of Word, but until the generative AI era, accurately spotting grammatical errors proved much harder: older grammar checkers often flagged perfectly valid phrases as ungrammatical.
We can see how copy editing fits the haystack paradigm. It can be difficult for humans to find grammatical errors in long documents, but once AI flags potential errors, humans can easily judge whether each flagged phrase is in fact ungrammatical. This last step is important because even modern AI-based tools are imperfect. Services like Grammarly already leverage LLMs to do this.
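To make the flag-then-verify workflow concrete, here is a minimal sketch. The call_llm helper is a hypothetical stand-in for whichever LLM provider's API you use, and the prompt and JSON format are illustrative assumptions rather than a production recipe:

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for your LLM provider's completion API."""
    raise NotImplementedError("Wire this to the LLM provider of your choice.")

def flag_grammar_issues(document: str) -> list[dict]:
    """Ask the model to flag candidate errors; it never rewrites the text itself."""
    prompt = (
        "List possible grammatical errors in the text below as a JSON array of "
        'objects with keys "excerpt" and "issue". Flag candidates only; do not rewrite.\n\n'
        + document
    )
    return json.loads(call_llm(prompt))

def review(document: str) -> None:
    # The human, not the model, makes the final call on every flagged phrase.
    for candidate in flag_grammar_issues(document):
        print(f"Excerpt: {candidate['excerpt']}")
        print(f"Possible issue: {candidate['issue']}")
        if input("Accept as a real error? [y/n] ").lower() == "y":
            print("-> queued for correction")
```

The key design choice is that the model only narrows the haystack; every accepted correction still passes through a human verdict.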
2: Write boilerplate code
One of the most time-consuming aspects of writing code is learning the syntax and conventions of a new API or library. This process is repeated by millions of software engineers every day and involves hours of digging through documentation and tutorials. Services like GitHub Copilot and Tabnine have leveraged Gen AI trained on the collective code written by these engineers to automate the tedious step of generating boilerplate code as needed.
This problem fits neatly into the haystack paradigm. Although it is time-consuming for a human to do the research required to produce working code for an unfamiliar library, it is relatively easy to verify that the generated code runs correctly (e.g., by executing it). And as with any other AI-generated content, engineers must still verify that the code behaves as intended before shipping it to production.
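One cheap way to exploit the “easy to verify” half of the pattern is to execute a generated snippet against a known input/output pair before a human ever reads it. The sketch below illustrates the idea; note that subprocess provides no real isolation, so untrusted generated code should run in a proper sandbox:

```python
import subprocess
import sys

def smoke_test(generated_code: str, stdin_text: str, expected_stdout: str) -> bool:
    """Run an AI-generated snippet and compare its output to a known answer.

    NOTE: subprocess offers no real isolation; sandbox untrusted code properly.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", generated_code],
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=10,  # don't let a buggy snippet hang the pipeline
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0 and result.stdout.strip() == expected_stdout.strip()

# Example: a (hypothetical) model-generated snippet that should uppercase its input.
candidate = "import sys; print(sys.stdin.read().upper(), end='')"
print(smoke_test(candidate, "hello", "HELLO"))  # True -> worth an engineer's review
```

Passing the smoke test does not make the code production-ready; it only means it has earned a human's attention.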
3: Search scientific literature
With millions of papers published every year, keeping up with the scientific literature is a challenge even for experienced scientists. But these papers are a goldmine of scientific knowledge, with new patents, drugs, and inventions waiting to be discovered by anyone who can process, assimilate, and combine that knowledge.
Particularly challenging are interdisciplinary insights that require expertise in two largely unrelated fields, since few experts have mastered both. Fortunately, this problem also fits the haystack pattern. It is much easier for a human to read the source papers behind an AI-generated candidate idea and check its soundness than it is to generate novel ideas spanning millions of scientific works in the first place.
And if AI can learn molecular biology roughly as well as it learns math, it won't be limited by the disciplinary boundaries that constrain human scientists. Products like Typeset are already a promising step in this direction.
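As a rough illustration of how such a tool might surface cross-disciplinary candidates for human review, the sketch below ranks paper abstracts from one field against a query idea from another using embedding similarity. The embed helper is a hypothetical stand-in for any sentence-embedding model, and the abstracts dictionary is placeholder data:

```python
import math

def embed(text: str) -> list[float]:
    """Hypothetical stand-in for a sentence-embedding model."""
    raise NotImplementedError("Wire this to an embedding model of your choice.")

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def candidate_connections(query_idea: str, abstracts: dict[str, str], top_k: int = 5):
    """Rank abstracts by similarity to a query idea from another field.

    The result is a short reading list for a human expert, not an answer:
    every candidate connection must still be verified against the source paper.
    """
    query_vec = embed(query_idea)
    scored = [(cosine(query_vec, embed(text)), title) for title, text in abstracts.items()]
    return sorted(scored, reverse=True)[:top_k]
```

Again, the AI only shrinks the haystack from millions of papers to a handful; the verification step stays with the scientist.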
Human verification is important
The key insight across all of the above use cases is that while candidate solutions may be generated by AI, they are always verified by humans. Allowing AI to speak directly to the world, or to take action on behalf of a large corporation, is incredibly risky, and recent history is full of such failures.
Human verification of AI-generated output is critical to AI safety. Focusing on haystack problems keeps the cost-benefit analysis of that verification favorable: the AI does the hard work of generating candidate solutions to difficult problems, while the easy but critical work of double-checking and final decision-making stays with human operators.
Focusing on haystack use cases in these early days of LLMs can help companies build AI experience while mitigating potentially serious AI safety concerns.
Tianhui Michael Li is the President of the Pragmatic Institute and the founder and president of The Data Incubator, a data science training and placement company.