String comparison is a key step in data pre-processing, but Excel functions such as MATCH and VLOOKUP struggle with fuzzy string matching, where two strings refer to the same thing but are not spelled identically. In this post, let’s explore how the Python library "FuzzyWuzzy" overcomes these limitations.
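To make the idea concrete, here is a minimal sketch of fuzzy matching. It does not use FuzzyWuzzy itself: it relies on `difflib` from the Python standard library as a stand-in so that it runs without installing anything. The helper names `similarity` and `best_match`, the 0-100 scale, and the sample institution names are assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> int:
    """Return a 0-100 similarity score, ignoring case and surrounding spaces."""
    a, b = a.strip().lower(), b.strip().lower()
    return round(100 * SequenceMatcher(None, a, b).ratio())


def best_match(query: str, choices: list[str]) -> tuple[str, int]:
    """Return the choice most similar to the query, together with its score."""
    return max(((c, similarity(query, c)) for c in choices), key=lambda pair: pair[1])


# Hypothetical reference list and a messy input value
names = [
    "Hong Kong University of Science and Technology",
    "University of Hong Kong",
    "Chinese University of Hong Kong",
]
query = "Hong Kong Univ. of Science & Technology"

# An exact lookup finds nothing because the spelling differs
exact = query in names  # False

# A similarity-based lookup still picks the closest candidate
match, score = best_match(query, names)
print(exact, match, score)
```

An exact lookup returns nothing for the abbreviated name, while the similarity score still ranks the intended entry first. In practice you would also set a minimum score below which a match is rejected, and that threshold depends on your data.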
Research data and datasets are becoming increasingly important in the scientific process. As more researchers make their data open and reusable, proper citation is crucial to give credit to the authors and acknowledge where the data came from.
Data is the pillar of integrity in published research. How do journal editors detect integrity issues? How can publishers support integrity? A recent seminar for HKUST researchers prompted a good discussion.