
Can Semantic Search be more interpretable?

Warning: I am not an information retrieval researcher, so take my blog post with a pinch of salt. In my last blog post, I gave a simplified description of a framework for information retrieval from the paper -

As academic search engines and databases incorporate generative AI into their systems, an important concept that all librarians should grasp is retrieval augmented generation (RAG). You see it in use in all sorts of "AI products" today, from chatbots like Bing Copilot to Adobe's Acrobat AI Assistant, which allows you to chat with your PDF.

Let's be clear here: Google Scholar is ill-designed for systematic reviews. I am not trying to argue otherwise. (Obligatory warning: I am not a real systematic review librarian.) But why exactly?

EDIT - April 2025: Since I wrote this blog post in April 2024, "Deep Research" tools that combine agentic search with the production of long-form reports have become all the rage. New: See updated Oct 2024 review of Undermind here!

I've watched with interest as academic search engines use AI to improve searching. Elicit is probably currently the leading example of this, using transformer-based language models to improve search relevancy ranking, to use RAG (retrieval augmented generation) across multiple documents to generate a paragraph of text that answers the query with citations, and to extract information to create a research matrix table of papers.

One of the tricks to using the newer "AI powered" search systems like Elicit, SciSpace and even JSTOR experiment search is that they recommend typing your query in full natural language rather than keyword search style (where you drop the stop words) for better results. So for example do

I've spent a large part of my career as an academic librarian studying the question of discovery from many angles.

Earlier related pieces - How Q&A systems based on large language models (eg GPT4) will change things if they become the dominant search paradigm - 9 implications for libraries In the ever-evolving landscape of information retrieval and library science, the emergence of large language models, particularly those based on the transformer architecture like GPT-4, has opened up a Pandora's box of possibilities and challenges.

Note: This is a lightly edited version of something I wrote for my institution. What is Google’s Search Generative Experience (SGE)? In past ResearchRadar pieces, we have discussed how search engines, both general (e.g. Bing Chat, Perplexity) and academic (e.g. Elicit, Scite Assistant, Scopus (upcoming)), are integrating search with generative AI (via Large Language Models) using techniques like RAG (Retrieval Augmented

A decade ago, in 2012, I observed how the dominance of Google had slowly affected how academic databases and OPACs/catalogues (now discovery services) work.