
Consensus: Harnessing AI-Powered Engines for Streamlined Research Paper Searches

Whenever I explore a new AI model, I invariably end up sifting through the research papers behind the models, systems, or libraries I’m working with. It is probably a habit from my time in research institutes.

Even if it’s just a cursory glance over the abstracts, introductions, conclusions, and figures, it lends me some insight into what I’m utilizing and the underlying mechanisms that drive it. It elevates the experience beyond merely executing code provided by the development team.

Perhaps my inherent nature leans toward skepticism, but without this preliminary exploration, I would be left with a nagging sense of uncertainty.

Today, I stumbled upon an outstanding research paper search website called Consensus, and I couldn’t help but document it here. What sets Consensus apart is that it doesn’t merely rely on conventional crawling techniques or employ NLP (Natural Language Processing) to extract and display high-ranking papers. Instead, it uses AI models to curate and present results, adding an extra layer of intelligence to the search process.

One advantage of this is that we are no longer limited to “keyword searching” and can instead pose questions in plain, conversational language.


How to Use

First, head over to the search platform at https://consensus.app/search/. Next, all you have to do is type in the question you want to ask.

You can click any result to check the content of the original research paper. If you find a result particularly useful, a button next to each listing lets you share it directly.


How Consensus Works

The official website explains in detail how the AI model behind the search operates. Put simply, it uses a language model to search and compile information from academic research papers. The creators specify that it is not a chatbot (which I take to mean a language model that has not been tuned on conversational data).

The development team sourced their data from the Semantic Scholar database, which holds over 200 million papers spanning all scientific domains. Consensus plans to keep adding papers to its database, and it updates the dataset monthly.

Further, Consensus runs its own extraction model (trained on tens of thousands of papers annotated by Ph.D. holders) over the entire database of 200 million papers. What the model extracts are the sentences in which a paper’s authors state their research findings based on empirical evidence.

The search engine as a whole works as follows:

  1. The user enters a query in the search field.
  2. Stop words such as “what”, “is”, and “are” are removed from the query, and a basic keyword search is run against the database of papers and extracted sentences to narrow the scope.
  3. A vector search is run over the narrowed-down papers and extracted sentences (about 5,000 data points) to evaluate how closely each potential result relates to the query.
  4. The system combines this with other analytical data, calculates relevance scores, and returns up to 20 results.
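
To make the four steps above more concrete, here is a minimal sketch in Python. It is only my own illustration, not Consensus’s actual implementation: the stop-word list, the function names, and the bag-of-words “embedding” are all assumptions I made up for the example (a real system would use a learned embedding model), and the “other analytical data” in step 4 is not modeled at all, so the relevance score here is just the cosine similarity.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the list Consensus actually uses is not given in the post.
STOP_WORDS = {"what", "is", "are", "the", "a", "an", "of", "on", "in", "to", "do", "does"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens):
    """Step 2a: drop stop words from a list of tokens."""
    return [t for t in tokens if t not in STOP_WORDS]


def keyword_filter(query_terms, sentences, limit=5000):
    """Step 2b: keep sentences sharing at least one term with the query.

    The limit mirrors the "about 5,000 data points" mentioned above.
    """
    terms = set(query_terms)
    hits = [s for s in sentences if terms & set(tokenize(s))]
    return hits[:limit]


def embed(text):
    """Assumed stand-in for a learned embedding: a bag-of-words count vector."""
    return Counter(remove_stop_words(tokenize(text)))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, sentences, top_k=20):
    """Steps 1 to 4: clean the query, narrow by keyword, rank by similarity."""
    query_terms = remove_stop_words(tokenize(query))
    candidates = keyword_filter(query_terms, sentences)
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(s)), s) for s in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Made-up example sentences, standing in for extracted findings.
    corpus = [
        "Creatine supplementation improved muscle strength in adults.",
        "Sleep deprivation impairs memory consolidation.",
        "Creatine had no effect on sleep quality.",
    ]
    for score, sentence in search("What is the effect of creatine on muscle strength?", corpus):
        print(f"{score:.3f}  {sentence}")
```

The point of the two-stage design is cost: the cheap keyword pass shrinks the candidate pool first, so the more expensive vector comparison only has to run over a few thousand items instead of the whole database.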

This is indeed an impressive way to search. In my tests so far, on topics such as multimodal models, AGI, and LLMs, among a few others, it turned up some exceptionally insightful papers. The experience felt more “vivid” than with the other research paper search engines I was used to, where you are essentially just matching keywords and can easily overlook significant papers.

