
Semantic Reader – Augmented Application to Make Reading PDFs Easier

How can researchers read and skim papers more effectively? This question prompted the team at Semantic Scholar to develop an AI-powered application called Semantic Reader. In this post, let’s see how it works.

Challenges of Reading Papers on PDF

While internet technology has transformed how researchers find papers, the experience of reading them has remained largely unchanged for decades. Keeping up with the latest publications remains a daunting task, given the density and quantity of text in scientific literature.

Even though the PDF format is widely used because of its portability, reading PDFs has several downsides, including static content, poor accessibility for low-vision readers, and difficulty reading on mobile devices. 

Introducing Semantic Reader

Semantic Reader is an augmented application that brings AI assistance into the reading process for researchers seeking to digest scientific papers.

By using artificial intelligence, Semantic Reader comprehends a document’s structure and merges it with Semantic Scholar’s academic corpus, providing detailed information in context through overlays and tooltips.

Interface of Semantic Reader

The Semantic Reader is a free interactive interface for research papers. It supports standard reading features, as well as useful augmentations over the existing PDF.

Key Features

Highlighted Overlays

One common method used to quickly assess the relevance of papers is skimming, which involves glancing across pages to identify key information from figures, headings, and paragraphs.

Semantic Reader’s highlighted overlays make it easier to skim a paper for key information. Each highlight carries one of three labels: Goal, Method, or Result.

Highlight Customization

Customization options over highlights

Users can also customize the highlights by adjusting contrast, showing margin labels for easy identification, controlling the density of highlights, and toggling each type of highlight on or off.
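To make the density and toggle controls concrete, here is a minimal Python sketch of how such filtering could work. It is an illustration only, not Semantic Reader’s actual implementation: the `Highlight` class, the `score` field, and the `visible_highlights` function are all hypothetical names, and the sketch assumes that each highlight carries a confidence score and that “density” means the fraction of top-scoring highlights kept per label.

```python
import math
from dataclasses import dataclass

# The three labels named in the post.
FACETS = ("Goal", "Method", "Result")


@dataclass(frozen=True)
class Highlight:
    text: str
    facet: str    # one of FACETS
    score: float  # hypothetical confidence score, higher = more salient


def visible_highlights(highlights, enabled_facets, density):
    """Return the highlights to display.

    enabled_facets: the labels the reader has toggled on.
    density: fraction (0.0 to 1.0) of the top-scoring highlights
             kept for each enabled label (an assumed interpretation).
    """
    if not 0.0 <= density <= 1.0:
        raise ValueError("density must be between 0.0 and 1.0")
    shown = []
    for facet in FACETS:
        if facet not in enabled_facets:
            continue  # this label is toggled off
        ranked = sorted(
            (h for h in highlights if h.facet == facet),
            key=lambda h: h.score,
            reverse=True,
        )
        keep = math.ceil(len(ranked) * density)
        shown.extend(ranked[:keep])
    return shown
```

Calling `visible_highlights(all_highlights, {"Goal", "Result"}, 0.5)` would hide every Method highlight and keep the top-scoring half of the Goal and Result highlights.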

In-line Citation Cards (Tooltips)

Semantic Reader enhances the reading experience by visually augmenting citations (as tooltips) within a paper based on their connections to the reader’s research activities.

Additionally, TLDR (Too Long; Didn’t Read) summaries of each citation are provided, facilitating a quicker understanding of the referenced work.

Citation Cards on Semantic Reader

Semantic Reader connects each citation to Semantic Scholar, providing a TLDR summary and key metrics.
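For readers curious where this kind of data comes from, the sketch below looks up similar information through the public Semantic Scholar Academic Graph API. It is not how Semantic Reader builds its cards internally; the choice of fields and the helper names (`build_paper_url`, `format_card`, `fetch_card`) are assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

# Public Semantic Scholar Academic Graph API endpoint for a single paper.
API_ROOT = "https://api.semanticscholar.org/graph/v1/paper/"
# Fields chosen for this illustration; the real cards may show others.
CARD_FIELDS = ("title", "year", "citationCount", "tldr")


def build_paper_url(paper_id, fields=CARD_FIELDS):
    """Build the request URL for one paper, e.g. paper_id='ARXIV:<id>'."""
    query = urllib.parse.urlencode({"fields": ",".join(fields)})
    return API_ROOT + urllib.parse.quote(paper_id, safe=":") + "?" + query


def format_card(record):
    """Turn an API response (a dict) into a short plain-text card."""
    title = record.get("title") or "Untitled"
    year = record.get("year") or "n.d."
    citations = record.get("citationCount") or 0
    # The tldr field is an object with a 'text' key, or null if unavailable.
    tldr = (record.get("tldr") or {}).get("text") or "No TLDR available."
    return f"{title} ({year})\nCitations: {citations}\nTLDR: {tldr}"


def fetch_card(paper_id):
    """Fetch one paper over the network and format it as a card."""
    with urllib.request.urlopen(build_paper_url(paper_id), timeout=10) as resp:
        return format_card(json.load(resp))
```

`fetch_card` needs network access and is subject to the API’s rate limits, while `format_card` can be tried offline with any dictionary shaped like an API response.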

Personal Highlights and Annotations

Through its integration with the Hypothesis platform, Semantic Reader lets users highlight text and take notes while reading papers. These highlights and annotations can be shared publicly with others.

Highlight And Annotation

Semantic Reader allows you to highlight and take notes while reading papers.

The Semantic Reader Project

No product is built overnight. Let’s take a step back and look at the “Semantic Reader Project”, an exemplary case of iterative design through collaboration across multiple institutions.

The development of Semantic Reader involved ten early research prototypes, which address challenges related to discovery, efficiency, comprehension, synthesis, and accessibility in scientific paper reading. For a comprehensive understanding of the prototypes, refer to this arXiv paper. One notable prototype is Scim, which focused on augmented reading interfaces designed to guide readers’ attention using automatically created in-situ faceted highlights. To learn more about the Scim system, refer to “Scim: Intelligent Skimming Support for Scientific Papers”.

Semantic Reader is still in beta. Challenges remain in accessibility for screen readers and in parsing PDFs (especially legacy PDFs) into HTML. In the latest release of Semantic Reader, in October 2023, the skimming feature is available only for English computer science papers from arXiv (~480,000 records). Semantic Reader is also available only on desktop devices.

Upcoming Talk at HKUST

For those interested in delving deeper into how AI and NLP systems transform scientific text reading, an online guest talk by Dr. Lucy Wang from the University of Washington on November 22 is highly recommended. The talk is part of the Library’s upcoming symposium, “Empowering Research Discovery: Transformative AI Technologies in Scholarly Communication”.

Dr. Lucy Wang will discuss the potential of Generative AI in improving scholarly communication, making research more accessible through short summaries and engaging visuals that enhance understanding and captivate a wide range of readers.

– By Jennifer Gu, Library


published October 28, 2023