Our role in this project
We are Sherry and Berry, Year 3 and Year 2 students, respectively, in the BSc in Data Science and Technology programme. We were involved in the whole platform development process, including functional design and programming. Our main duty was to work as Python developers on a web platform that performs Named-Entity Recognition (NER) on Chinese paragraphs or articles. The main objective of the project is to give users a straightforward process: they paste text or upload a text file to the platform, and the web tool automatically annotates entities, identifies their occurrences, and provides other features and visualizations in one go. The aim of this initiative is to enable non-technical users to carry out NER tasks effortlessly through our user-friendly web interface.
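To make the "identify their occurrences" step more concrete, here is a minimal sketch of how entity occurrences could be tallied once a model has annotated the text. It is not the platform's actual code: the function name `count_entities`, the `(text, label)` pair format, and the sample entities and labels are all assumptions made up for illustration.

```python
from collections import Counter


def count_entities(entities):
    """Tally how often each (text, label) pair occurs.

    `entities` is assumed to be an iterable of (text, label) tuples,
    a hypothetical shape for whatever the NER model returns.
    """
    return Counter(entities)


# Made-up sample annotations, not real model output.
sample = [("香港", "GPE"), ("李白", "PERSON"), ("香港", "GPE")]

counts = count_entities(sample)
top = counts.most_common(1)  # the most frequent entity and its count
```

A tally like this could feed a frequency table or chart directly, since each key already carries both the entity text and its type.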
Challenges and obstacles
Our first challenge was to identify which NER model(s) to use and to decide which features the platform should include. We researched similar NER tools, such as Peking University’s “Wu Yu Dian” intelligent annotation platform, the CKIP recognition platform and CORPRO. After evaluating these platforms, we summarized their noteworthy functionalities and their areas for improvement, based on our own user experience, into a wish list.
During the development stage, the two of us co-developed the project, which initially posed communication and coordination challenges. For instance, we defined duplicate variables for the same purpose, which caused synchronization issues after some web form actions. To make the platform perform better and to remove these errors, we agreed on a standard set of variables by defining a list of shared variables. We also adopted GitHub for better project management. We found that this approach made it easier to merge our code and record changes, which minimized programming errors.
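The idea of a shared variable list can be sketched as follows. This is only an illustration under stated assumptions, not our actual implementation: the key names, the `SHARED_DEFAULTS` dictionary and the `init_shared_state` helper are hypothetical, and a plain dictionary stands in for the app's state. In a Streamlit app, the same function could presumably be passed `st.session_state`, which supports the same `in` check and item assignment.

```python
# Hypothetical single source of truth for variable names shared by both developers.
SHARED_DEFAULTS = {
    "input_text": "",      # text pasted or uploaded by the user
    "entities": [],        # annotated entities for the current text
    "selected_types": [],  # entity types the user chose to display
}


def init_shared_state(state):
    """Add any missing shared key to `state` without overwriting existing values."""
    for key, default in SHARED_DEFAULTS.items():
        if key not in state:
            # Copy list defaults so different sessions never share one list object.
            state[key] = list(default) if isinstance(default, list) else default
    return state


# A plain dict stands in for the app state here.
state = init_shared_state({"input_text": "已有文字"})
```

Because every page reads and writes the same agreed keys, two developers are less likely to create differently named variables for the same purpose.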
What we have learned
Better communication and user-oriented design
During the design stage, we kick-started the user interface (UI) design in Figma based on the collected requirements and the wish list of features. After completing the first version of the UI design, we sought further comments from team members from the library and from potential end-users. We found that stakeholders' expectations and their interpretation of the operation flow may change slightly between iterations. Setting up milestones and maintaining continuous communication are crucial to creating an intuitive, user-oriented application.
IT proficiency in programming and server environment operations
During the testing stage, we first developed the platform locally on our personal computers and deployed it for review during the regular meetings. However, the environments on different computers were not always identical, which caused different programming errors as well as performance issues. We then realized that running the platform in a shared server environment can minimize these problems and smooth the development process, while leveraging cloud-based computational power for efficient processing.
Because the environment changed, as did the way we delivered functions and features, we had to embrace a mindset of continuous learning and adopt better solutions to implement the platform. We actively sought out new information and explored alternative approaches, techniques and tools for implementing features. In the end, we adopted the following tools to support our development:
Tool | Function |
---|---|
Python | Programming language for building the platform and handling all logical processes. |
Streamlit | Python framework for building and deploying web apps. |
Plotly | Python graphing library for making interactive graphs. |
CKIP Transformers | Python library for Chinese natural language processing (NLP). |
GitHub | Developer platform for storing and managing source code. |
Figma | Collaborative web application for interface design. |