
SciSight for Exploratory COVID-19 Research

With the COVID-19 pandemic continuing into its third year, this week’s Research Bridge takes a look at SciSight, a system for exploratory search of the current COVID-19 literature corpus.

The COVID-19 Open Research Dataset (CORD-19) is a free resource of tens of thousands of scholarly articles about COVID-19, SARS-CoV-2, and related coronaviruses for use by the global research community. The dataset was created by the Allen Institute for AI in partnership with other research groups such as the National Institutes of Health (NIH), and has been fully incorporated into Semantic Scholar.

The first release of the CORD-19 corpus (March 2020) contained 28,000 publications; by February 2022, it had grown to 280,000. To aid discovery over this corpus, SciSight is designed for exploratory search of the COVID-19 literature: finding out which groups are working in which directions, seeing how biomedical concepts interact and evolve over time, and discovering new connections.

Let’s highlight SciSight’s distinct functions with some examples:

Network of Science

Explore the progress being made against COVID-19, with a visualization of research groups, salient authors, topics, and their ties.

Figure 1. Visualizing the network of groups working on COVID-19 vaccination. 


Exploratory Paper Search

See how authors and topics interact over time with this exploratory faceted search tool.

Figure 2. Faceted search displaying papers that study remdesivir as an intervention with viral load as the outcome.

Collocation Explorer

Search for a term to display corpus-wide associations between biomedical entities, such as drugs and conditions.

Figure 3. Visualizing the network of top proteins/genes/cells associated with “E protein” in the corpus.
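
To make the idea of corpus-wide associations concrete, here is a minimal Python sketch that counts how often pairs of entities appear in the same document. It is an illustration only, not SciSight's actual method: the function names, the toy documents, and the assumption that each paper has already been reduced to a list of entity names are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, for each unordered pair of entities, the documents mentioning both."""
    counts = Counter()
    for entities in docs:
        # De-duplicate within a document and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(entities)), 2):
            counts[(a, b)] += 1
    return counts


def top_associations(counts, term, k=3):
    """Return up to k entities that co-occur most often with `term`."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == term:
            related[b] += n
        elif b == term:
            related[a] += n
    return related.most_common(k)


# Toy input (hypothetical): one list of extracted entities per paper.
docs = [
    ["E protein", "M protein", "ACE2"],
    ["E protein", "M protein"],
    ["E protein", "ACE2", "TMPRSS2"],
    ["S protein", "ACE2"],
]
counts = cooccurrence_counts(docs)
print(top_associations(counts, "E protein"))
```

Raw co-occurrence counts favor entities that are frequent everywhere; a real tool would likely normalize them in some way, which this sketch leaves out.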

Download CORD-19

The COVID-19 Open Research Dataset (CORD-19) is a growing resource that supports many COVID-19 text mining and discovery research projects. The dataset is updated weekly and can be downloaded. For an in-depth description of the CORD-19 dataset and what it contains, see this conference paper.
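
Once downloaded, the paper metadata can be explored with a few lines of Python. The sketch below assumes a CSV file with `title`, `abstract`, and `publish_time` columns; these column names, the `find_papers` helper, and the sample rows are assumptions made for illustration, so check them against the release you actually download.

```python
import csv
import io


def find_papers(csv_file, keyword):
    """Return (publish_time, title) pairs for rows mentioning `keyword`.

    `csv_file` is any open text file or file-like object with a header row.
    The column names used here are assumptions about the metadata layout.
    """
    keyword = keyword.lower()
    hits = []
    for row in csv.DictReader(csv_file):
        title = row.get("title") or ""
        abstract = row.get("abstract") or ""
        if keyword in f"{title} {abstract}".lower():
            hits.append((row.get("publish_time") or "", title))
    # Sort by publication date string, then title.
    return sorted(hits)


# Stand-in for a downloaded metadata file (hypothetical rows).
sample = io.StringIO(
    "title,abstract,publish_time\n"
    "Remdesivir and viral load,A trial of remdesivir.,2020-05-01\n"
    "Coronavirus E protein structure,Structural study.,2020-03-15\n"
)
hits = find_papers(sample, "remdesivir")
print(hits)
```

With a real download, replace `sample` with an opened file, for example `open(path, newline="", encoding="utf-8")`, where `path` points to wherever the metadata file was saved.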

– By Jennifer Gu, Library


published February 8, 2022