Constellate: Your Learning Companion for Text Analysis Research

In today’s digital age, we have access to more textual data than ever before. Text analysis uses computational techniques to uncover patterns and insights that are hard to spot when reading large volumes of text by hand. However, conducting text analysis often requires programming skills, which can be a hurdle for researchers with limited technical backgrounds. In this post, we introduce Constellate, a platform the Library recently subscribed to, which can serve as your learning companion for text analysis research.

What Is Constellate?

Constellate is a platform designed to facilitate learning and conducting text analysis research. It offers an intuitive workflow for importing, analyzing, visualizing, and interpreting your text data without writing any code. You can explore your data through various techniques, including topic modeling, sentiment analysis, clustering, network analysis, and more. For technical users, Constellate also provides the option to apply more advanced algorithms and custom Python scripts.

Constellate also hosts a rich collection of datasets, primarily covering journal and book literature from the JSTOR and Portico databases, spanning 1700 to the present. You can read more about their sources here.

How to Access Constellate

To access the full datasets and learning materials on Constellate, make sure you reach it through the Library and create a personal account. If you already have a JSTOR account, you can use it to log in.

How Constellate Can Help with Your Research

Let’s take a closer look at how Constellate works with a quick demo. In this example, we will use a sample dataset of Shakespeare literature to uncover the most frequently appearing words in the corpus.

One advantage of using Constellate is that all notebooks run in the Constellate environment, so users do not need to pre-install any tools or software. Additionally, all datasets can be downloaded, allowing users to perform analyses locally if desired.
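
For readers who download a dataset and want to try the same idea locally, counting the most frequent words can be sketched in a few lines of Python. This is a minimal illustration and not Constellate's own notebook code: the sample text, the function name `word_frequencies`, and the simple regular expression used to split words are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical sample text standing in for a downloaded dataset.
sample_text = """
To be, or not to be, that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take arms against a sea of troubles
"""


def word_frequencies(text):
    """Lowercase the text, split it into words, and count each word."""
    # Treat runs of letters and apostrophes as words (a simplifying assumption).
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


frequencies = word_frequencies(sample_text)

# The three most frequent words and their counts.
top_words = frequencies.most_common(3)
print(top_words)
```

In a real corpus the top of this list is usually dominated by function words such as "to" and "the", which is why frequency analyses typically filter out stopwords first.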

Workflow for You to Get Started

First, if you are new to text analysis research, we recommend watching this video and reading an article to learn the basics.

Below are the suggested workflows for different types of users:

For Beginners (no-code):

  1. Start with a sample dataset.
  2. Use the built-in interactive dashboard to explore the dataset. You can see the number of documents over time, term frequency, the publications covered in the dataset, and more. Use the filters to narrow the dataset down.
  3. Go to Analyze and try a method for deeper analysis, such as word frequency or significant terms. This step does not require coding, but if you want to learn some basics, check out the Python Basics tutorials. They are beginner-friendly and easy to follow, with hands-on practice.

For Intermediate Users (low-code):

  1. Once you are familiar with the Constellate environment, try building your own dataset based on your research needs.
  2. Use the built-in dashboard to quickly explore the dataset or go directly to a method for deeper analysis.
  3. Adapt the code to suit your dataset and needs. Check out the tutorials here to advance your skills, for example by learning how to create a stopword list or use regular expressions.
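
To give a sense of what a stopword list and a regular expression do in practice, here is a minimal sketch. It is not taken from the Constellate tutorials: the stopword set, the pattern `TOKEN_PATTERN`, the function `content_words`, and the sample sentence are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real project would tailor it to its corpus.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"}

# Regular expression: a run of letters, optionally followed by an
# apostrophe and more letters (so "king's" stays one token).
TOKEN_PATTERN = re.compile(r"[a-z]+(?:'[a-z]+)?")


def content_words(text, stopwords=STOPWORDS):
    """Tokenize the lowercased text and drop any token in the stopword list."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    return [token for token in tokens if token not in stopwords]


# Made-up sentence used only to demonstrate the filtering.
sample = "The king and the queen spoke of the king's crown in the hall."
filtered = content_words(sample)
counts = Counter(filtered)
print(filtered)
```

Adding or removing entries in `STOPWORDS` is usually the first adjustment to make when adapting a word frequency analysis to a new dataset.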

For Advanced Users:

  1. Build your own dataset based on your research needs.
  2. Use the built-in dashboard to quickly explore the dataset or go directly to a method for deeper analysis. You can also use your own notebook by importing it from a GitHub repository, or simply download the dataset and run your own analysis.
  3. Check out the tutorials here to learn in-demand text analysis techniques. The tutorials cover topics such as tokenization, named entity recognition, topic modeling (with LDA), and sentiment analysis (with VADER).
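
As a rough illustration of the idea behind lexicon-based sentiment analysis, the sketch below scores a sentence by summing per-word scores and flipping the sign after a negation word. It is a toy example and not VADER itself, which relies on a much larger scored lexicon and additional rules. The lexicon entries, their scores, the negation list, and the function name `sentiment_score` are all made up for this example.

```python
import re

# Hypothetical mini-lexicon mapping words to sentiment scores.
# Real lexicons contain thousands of entries with carefully rated scores.
LEXICON = {
    "good": 1.0,
    "great": 2.0,
    "happy": 1.5,
    "bad": -1.0,
    "terrible": -2.0,
    "sad": -1.5,
}

# Words that flip the score of the word immediately after them (a simplification).
NEGATIONS = {"not", "no", "never"}


def sentiment_score(text):
    """Sum the lexicon scores of the words in text, negating after a negation word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    for index, token in enumerate(tokens):
        score = LEXICON.get(token, 0.0)  # words outside the lexicon count as neutral
        if index > 0 and tokens[index - 1] in NEGATIONS:
            score = -score
        total += score
    return total


print(sentiment_score("The play was great"))
print(sentiment_score("The ending was not good"))
```

A positive total suggests positive sentiment and a negative total suggests negative sentiment. Text with no lexicon words scores zero.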

How Other Researchers Use Constellate – Some Use Cases

Text analysis research is not limited to the humanities. Scientists can also use these techniques to explore trends in a particular subject within the literature. Here are two examples. More use cases can be found here.

Research field | Medicine | Humanities
Topic in brief | Analyze trends in drug use | Analyze topics in Signs, a journal in women’s and gender studies
Method used | Sentiment analysis | Topic modeling
Learn more in the literature | 4-Fluoramphetamine in the Netherlands: Text-mining and sentiment analysis of internet forums | An Interactive Topic Model of Signs

Workshops & Tutorials

Constellate regularly hosts online workshops on a wide range of topics. The upcoming workshops, on Data Visualization, will be offered next week (24 – 28 April). You can sign up for free on this page. The three workshops (held at night, Hong Kong time) will cover:

  • what makes a good or bad visualization;
  • how statistical descriptions of data are represented in visualizations; and
  • how to create visualizations in Python. 

In addition to the Classes & Tutorials page, the learning materials (Jupyter notebooks) can also be found on GitHub, under the Creative Commons CC-BY license.

– By Aster Zhao, Library

published April 21, 2023