There Are No Bees in Florida
Disagree with our chatbot and you could be fired.

DRAFT, last updated 26/08/2025

Databases And You

Databases are a fantastic technology. You put information into digital boxes, stick a bunch of labels on the boxes, and can then search the words in the boxes or on the labels to retrieve and sort them as you see fit.

This is great. Databases work very well, which is why the vast majority of our digital world is built on them.

Importantly, they are not a 'lossy' storage solution.[1.1] You do not lose accuracy when you store information in a database or use text lookups to retrieve the information. A database, by design, should not misrepresent its contents.

Sidenote [1.1]:

Lossy solutions reduce the accuracy/quality of the stored item. JPEG images have smaller file sizes, but they achieve this via compression that reduces the quality of the image compared to the original.
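
To make the boxes-and-labels picture concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and field names are invented for illustration; the point is the behaviour, not the schema.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE boxes (id INTEGER PRIMARY KEY, contents TEXT)")
    conn.execute("CREATE TABLE labels (box_id INTEGER, label TEXT)")

    # Put information into a box and stick some labels on it.
    conn.execute("INSERT INTO boxes VALUES (1, 'Apis mellifera, Alachua County, 1998')")
    conn.executemany("INSERT INTO labels VALUES (?, ?)",
                     [(1, 'bee'), (1, 'Florida')])

    # Retrieve by label. What comes back is exactly what went in.
    row = conn.execute(
        "SELECT contents FROM boxes JOIN labels ON labels.box_id = boxes.id"
        " WHERE labels.label = 'bee'").fetchone()
    print(row[0])  # Apis mellifera, Alachua County, 1998

The query either matches a label or it doesn't, and the stored text comes back byte-for-byte. That is what 'lossless' means here.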

This is in stark contrast to large language models (i.e., 'AI' models like ChatGPT), which are lossy in multiple ways.

Firstly, LLMs do not contain the actual works they are trained on, so they are not a reliable way to store information losslessly. This is why, if an LLM is asked to 'recite' a text it was trained on, its response may very well not match the actual text.

Secondly, even when an LLM is able to directly access a separate source of information (like a database) and relay its contents to the user, this is still a lossy operation. It can easily provide incorrect representations of that information. XXX link to study demonstrating recall and summary issues in retrieval/RAG models XXX
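
To show where that loss creeps in even when the source is perfect, here is a sketch of the relay. Both retrieve() and complete() are hypothetical stand-ins, not any real API.

    def retrieve(query):
        # Deterministic database lookup: lossless, as described above.
        return ["record 1: ...", "record 2: ..."]

    def complete(prompt):
        # Hypothetical LLM call: generates plausible text, token by token.
        return "(model output)"

    def answer(query):
        records = retrieve(query)  # lossless step
        prompt = "Summarise these records for the user:\n" + "\n".join(records)
        return complete(prompt)    # lossy step

    # Everything upstream of complete() preserves the records perfectly.
    # The final step re-generates them as free text, and any part of that
    # text can come out wrong.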

Thirdly, LLMs are not able to reliably use logical reasoning to deduce or summarise information about a text. Nor can they reliably perform actions with rigid criteria (such as 'search for labels with exact matches to X' or 'retrieve all boxes that contain X except those that also contain Y'; both are sketched in code below).[1.2]

Sidenote [1.2]:

This has been robustly demonstrated with the recent GPT-5 models, which still often fail at simple tasks such as 'count the number of letters in this word' or 'select a word from this list with only one vowel'.
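
For contrast, here is what those two rigid criteria look like as ordinary, deterministic code, using made-up data. This is the mechanism every database search relies on: explicit rules, applied identically every time.

    # A toy stand-in for any labelled record store.
    boxes = {
        "box_a": {"bee", "Florida"},
        "box_b": {"bee", "Florida", "fossil"},
        "box_c": {"wasp", "Florida"},
    }

    # 'Search for labels with exact matches to X':
    print([b for b, labels in boxes.items() if "bee" in labels])
    # -> ['box_a', 'box_b']

    # 'Retrieve all boxes that contain X except those that also contain Y':
    print([b for b, labels in boxes.items()
           if "bee" in labels and "fossil" not in labels])
    # -> ['box_a']

No probability is involved, which is exactly why the results can be trusted. An LLM asked to apply the same filters is sampling from a distribution of plausible-looking answers instead.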

“It is not just that LLMs fail to induce proper world models of chess. It’s that they never induce proper worlds of anything. Everything that they do is through mimicry, rather than abstracted cognition across proper world models.”

— Gary Marcus, a highly qualified machine learning researcher, in Generative AI’s crippling and widespread failure to induce robust models of the world

All three of these loss types arise because, while LLMs generate their responses via an incredibly complex network of relationships between concepts, XXX link to the neural exploration demo here XXX they do not perform actual logical reasoning based on a consistent world model.

This is not a subjective analysis. This, much like the inevitability of hallucination, XXX link to the paper proving hallucination inevitability here XXX is a mathematical reality.

Navigating relationships between concepts is not reasoning. It produces a fantastic simulacrum of reasoning, XXX link to the 'linguistic vector-space poetry' video XXX but it is not reasoning.
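
As a cartoon of what 'relationships between concepts' means in practice, consider concepts as points in a vector space, related when they sit close together. The vectors below are made up; real models use thousands of dimensions.

    import math

    vectors = {
        "bee":     [0.9, 0.1, 0.3],
        "wasp":    [0.8, 0.2, 0.3],
        "spatula": [0.1, 0.9, 0.6],
    }

    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    def cosine(a, b):
        return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

    # 'bee' sits near 'wasp' and far from 'spatula'. That is association:
    print(round(cosine(vectors["bee"], vectors["wasp"]), 2))     # ~0.99
    print(round(cosine(vectors["bee"], vectors["spatula"]), 2))  # ~0.35

Nothing in that geometry encodes 'bees are insects, insects have six legs, therefore bees have six legs'. Proximity is association, not deduction.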

This is a problem because many tasks people want to use LLMs for rely on reasoning and information accuracy. In particular, this can be problematic when the suggested use case is 'providing information on a topic to people who have questions about the topic'.

Anyway, iDigBio, a biodiversity research organisation with close ties to the University of Florida, XXX confirm exact relationship. UF logo on iDB website implies UF is a sponsor? XXX has begun integrating an LLM chatbot (iChatBio) into their biodiversity database. It's probably going fine, right?

iChatBio Is Not Going Fine

People are not very pleased with the idea of sticking an LLM in a database. This is understandable, given that the database and its search function already work quite well.

Adding an LLM interface to a database is, straightforwardly, a worse user experience that delivers less accurate results. The kind of people who use massive scientific databases care about usability and accuracy.

An employee of the UF-operated XXX confirm if just on campus or actually UF-operated XXX Florida Museum of Natural History voiced concerns about this integration. They provided evidence of iChatBio making fairly basic errors in response to simple queries. This included the following exchange:

User: How many kinds of bees are in Florida?

iChatBio: According to iDigBio, there are currently 0 bee species listed in their records for Florida using the common name "bee" as a search term.[1.3]

Sidenote [1.3]:

This is obviously incorrect. A manual search of iDigBio returns tens of thousands of individual records of bees in Florida.
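
For the curious, a check along those lines can be scripted against iDigBio's public search API. The endpoint, query fields, and response key below reflect my understanding of that API and should be treated as assumptions; note also that Apidae is just one of several bee families.

    import json
    import urllib.parse
    import urllib.request

    # Ask iDigBio how many specimen records match one bee family in Florida.
    query = {"family": "apidae", "stateprovince": "florida"}
    url = ("https://search.idigbio.org/v2/search/records/?rq="
           + urllib.parse.quote(json.dumps(query)))

    with urllib.request.urlopen(url) as response:
        data = json.load(response)

    # itemCount should be the total number of matching records.
    print(data["itemCount"])

If the manual search above is anything to go by, that count comes back in the thousands for a single family alone.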

Why did the chatbot return this result? Did it enter its query into the wrong search box? Did it misread the results of its search? There is no way for the user to know, which is part of the problem.

It certainly didn't have the reasoning faculties to notice that there should be records of bees in Florida and double-check its findings. It just spat the result out as if it were perfectly acceptable.

The result? The person raising the concerns was fired. XXX this is likely where the link to mossworm's tumblr post on the firing should go XXX

I should stress that this person was an employee of the University of Florida, working at the Florida Museum of Natural History, and not an employee of iDigBio, the organisation that operates iChatBio.

XXX this is where I would like to have an interview/statement from mossworm. need a more detailed timeline, some direct quotes? and a little more context on why UF is so invested in not disparaging iChatBio and iDigBio XXX

Taxonomy Of A Failure

As a science-enabling initiative, iChatBio is already less successful than the regular iDigBio database.

On the most basic level, I am nearly certain that criticism of basic usability errors in the iDigBio database has never been used as pretext for firing a qualified, competent, and engaged biologist. As far as biodiversity conservation efforts go, that seems important.

I am also confident that the regular database interface has never confidently told anyone that it contains no records of "bees" in Florida — at least, not without user error.

Already batting two-for-two, standard database technology sweeps a couple more easy victories by being both more responsive and less energy-intensive per query.

Burrowing down into the technology doesn't improve things. A big problem is that, unlike with a database, there are very few easy ways to incrementally tweak and meaningfully improve an LLM like iChatBio in response to bugs or user issues.

The 'corrections' a user might offer in iChatBio's chat window are not stored in the LLM, and iChatBio does not learn from them. This is not how LLM chatbots work. One can change an LLM's system prompt, but that can only scale so far.
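
Here is a minimal sketch of why, assuming a generic chat-completion interface. complete() is a hypothetical stand-in for any LLM API call; the message format mirrors common chat APIs.

    def complete(messages):
        # Hypothetical call to a frozen model. Its weights never change.
        return "(model output)"

    history = [{"role": "system", "content": "You answer questions about bees."}]

    def chat_turn(user_text):
        # Each turn re-sends the whole transcript. That transcript, not the
        # model, is the only place a user's 'correction' ever lives.
        history.append({"role": "user", "content": user_text})
        reply = complete(history)
        history.append({"role": "assistant", "content": reply})
        return reply

    # When the session ends, history is discarded. The next user talks to
    # the same frozen model, with the same blind spots, from scratch.

The system prompt is just the first entry in that list: editable, yes, but it is a single instruction sheet, not a mechanism for accumulating fixes.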

This is, in fact, a key reason why 95% of AI-related initiatives are not returning on investment. XXX link to the industry report here XXX LLMs do not 'learn' or incorporate new information into a long-term worldview in normal operation, making working with them an endless slog of navigating around the exact same types of issues, every single day.

“They are not built like traditional software. They are not spreadsheets. They are not databases. They do not work like humans. They do not sanity check their own work. They don’t think like economists[...] or even ordinary people [...] They don’t induce abstract models of the world that they can reliably engage, even when those models are given to them on a silver platter (as with the rules of chess).”

— Gary Marcus, LLMs are not like you and me—and never will be.

There is, in short, no coherent reason why a chatbot should be jammed into iDigBio's database. It is a tremendous waste of people-power, regular power, and users' time.

In fact, it is such a staggeringly unpopular and short-sighted idea that it should be profoundly embarrassing to have ever advocated for this project. Perhaps even professionally detrimental.

Everyone who understands how LLMs work is either honest about their lack of value to most organisations or desperately trying to pretend otherwise in order to make money.

iDigBio is either about to reverse course or lose incredible amounts of user and community trust and goodwill. Judging by their open dismissal of critique, XXX link to post discussing their livestream and refusal to answer questions during XXX I suspect the latter.