Research Bridge
Gale Digital Scholar Lab and Constellate for Text Analysis
Research Tools

Interested in doing text analysis but daunted about where to start? Fear not! Two new library resources, Gale Digital Scholar Lab and Constellate, offer tools, data resources, and user-friendly interfaces to help you begin. 
 

Gale Digital Scholar Lab

Gale Digital Scholar Lab (Gale DSL) is a platform for text analysis, data mining, and data visualization. With access to the Library's licensed content in Gale Primary Sources, you can create and analyze content sets using the digital humanities tools provided. 

To get started, log in with your Microsoft account (your HKUST SSO should work). After agreeing to the terms of use and privacy policy (they promise never to sell your data), you can create a username, select an avatar, and begin exploring. 
 

Build, Clean, and Analyze

Gale DSL provides students and scholars with text and data mining resources, visualization tools, and methodology suggestions. The platform offers self-learning materials to guide you through various stages of a digital scholarship pathway.

 Build Module In Gale DSL 

The “Build” stage involves creating and managing your content sets from our licensed content in the Gale Primary Sources collections. For example, you can analyze papers from government archives about China and Western countries from the 1800s to the 1990s, or explore the full text of The Economist (1843-2020), among other options. Alternatively, you can upload and use your own text files. 

  Clean Module In Gale DSL 

In the "Clean" stage, you can refine your content sets by applying stop words and text correction techniques. The data cleaning configurations can be reused across content sets. 
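To make the idea of stop-word cleaning concrete, here is a minimal Python sketch of the general technique. It is not Gale DSL's own implementation: the stop-word list, the function name `clean_text`, and the sample sentence are all assumptions made up for illustration.

```python
import re

# Hypothetical stop-word list for illustration only;
# it is not the list that Gale DSL ships with.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}

def clean_text(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in stop_words]

# Made-up sample sentence, not taken from any Gale collection.
sample = "The price of tea in London is rising."
print(clean_text(sample))  # ['price', 'tea', 'london', 'rising']
```

Because the stop-word set is passed in as a parameter, the same configuration can be applied to several texts, which mirrors the idea of reusing a cleaning configuration across content sets.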

  Analyze Module In Gale DSL

The "Analyze" stage assists you in selecting the right tools and methods, such as document clustering, named entity recognition, or n-grams. Gale DSL also provides sample projects for self-guided learning.   

An N-gram is a sequence of N consecutive words (N is the length of the sequence), and N-gram analysis shows which word sequences occur most often in a content set. For example, in content sets built around two 18th-century food types, “coffee” and “tea”, the word “young” and words for types of people (men, women, lady, etc.) occur more frequently in the N-gram results for the Coffee content set (Figure a) than in those for the Tea content set (Figure b). While Coffee’s N-grams mention good and evil, liquor, and common sense, Tea’s N-grams focus on the otherness of the product itself.


Figure a. Ngrams - Coffee content set in the 18th century


Figure b. Ngrams - Tea content set in the 18th century   
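For readers curious about what an N-gram count involves behind the interface, the following Python sketch shows the general technique. It is not how Gale DSL computes its results: the function names `ngrams` and `top_ngrams` and the sample sentence are assumptions invented for this example.

```python
import re
from collections import Counter

def ngrams(text, n=2):
    """Return every run of n consecutive word tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def top_ngrams(text, n=2, k=3):
    """Return the k most frequent n-grams with their counts."""
    return Counter(ngrams(text, n)).most_common(k)

# Made-up sample sentence, not taken from the Coffee or Tea content sets.
sample = "young men drink coffee and young men talk"
print(top_ngrams(sample, n=2, k=1))  # [(('young', 'men'), 2)]
```

Setting `n=1` counts single words, while larger values of `n` capture longer phrases at the cost of fewer repeated matches.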
 

Constellate

Constellate is a platform that facilitates text analysis through the world's leading archival repositories of scholarly and primary source content, including JSTOR, Portico, and other Ithaka collections. 

As with Gale DSL, you need to register before using Constellate, in this case by creating a JSTOR user account. Once you've agreed to the terms of use and privacy policy (which also assure data protection), you can begin using the platform. For seamless access to all the available data, it is best to create your account on campus at HKUST, so that your IP address is recognized. If this isn't possible, follow the instructions provided here: https://constellate.org/docs/log-in#pair 
 

Dataset Builder and the Constellate Lab

Like Gale DSL, Constellate offers an integrated text analysis platform with access to scholarly content. It also brings rich open educational resources into a cloud-based lab where you can use Constellate Notebooks and other Jupyter Notebooks, run code, and share your own notebooks with other Constellate users. A beta version of an R environment is also available.

 Infographics Constellate Dashboard 

The Tutorials section offers plenty of text and video material, ranging from beginner topics (the basics of text analysis and basic coding with R and Python) to more advanced methods such as tokenization, topic modelling, and sentiment analysis. 

By leveraging the capabilities of Gale Digital Scholar Lab and Constellate, you can embark on an enriching text analysis journey, gaining valuable insights from a wide range of scholarly resources.

Edited By
Victoria Caplan, Library, lbcaplan@ust.hk
Published
12 Oct 2023