Large terminology datasets that have grown over years without regulated processes often harbour significant potential for errors. The main problems are missing languages in multilingual datasets and duplicate entries, that is, terms that are included several times with identical or similar spellings but actually refer to the same content.
Identifying and cleaning up errors in a large termbase in a meaningful way often seems like an impossible task. After all, terms that have been entered incorrectly often produce a rat’s nest of downstream problems, and relevant and correct information is often closely intertwined with redundant and incorrect entries.
Our Quality Dashboard provides users with an interface to conduct forensic examinations of their termbase to identify different error categories and gradually clean them up using the most important quality steps.
The Quality Dashboard
You access the Quality Dashboard by selecting the ‘Quality’ tab in the termbase navigation.
You then see an analysis view with the error categories Duplicates and Missing languages, sorted according to the languages represented in the termbase.
The number next to the error category indicates the number of occurrences found in the termbase for the respective error category.
You access the detailed view by clicking on the error category and can then see the occurrences and make further changes.
You will find a list of all identified duplicates on the left-hand side in this view.
For each result, you will see the term that was found as a duplicate (single or multiple), the number of entries in which this term was discovered and, below that, a list of all terms identified for this duplicate.
Click on the result to access the editing view.
Here you will see all term entries found for this duplicate, shown next to each other in a reduced form.
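To make the duplicates list more concrete, the following is a minimal sketch of how such a grouping could be computed. It is not the Quality Dashboard's actual implementation: the entry structure (`id`, `terms` per language code) and the functions `normalize` and `find_duplicates` are assumptions made purely for illustration, and "similar spellings" is reduced here to ignoring case and extra whitespace.

```python
from collections import defaultdict

# Hypothetical data model: each term entry has an id and a list of terms per language.
entries = [
    {"id": 1, "terms": {"en": ["Hard disk"], "de": ["Festplatte"]}},
    {"id": 2, "terms": {"en": ["hard  disk "], "fr": ["disque dur"]}},
    {"id": 3, "terms": {"en": ["Keyboard"], "de": ["Tastatur"]}},
]


def normalize(term: str) -> str:
    """Fold case and collapse whitespace so near-identical spellings compare equal."""
    return " ".join(term.casefold().split())


def find_duplicates(entries, language):
    """Map each normalized term to the ids of the entries containing it,
    keeping only terms that occur in more than one entry."""
    seen = defaultdict(set)
    for entry in entries:
        for term in entry["terms"].get(language, []):
            seen[normalize(term)].add(entry["id"])
    return {term: sorted(ids) for term, ids in seen.items() if len(ids) > 1}


duplicates = find_duplicates(entries, "en")
print(duplicates)
```

In this sketch, each key corresponds to one result in the duplicates list, and the length of its value corresponds to the number of entries in which the term was discovered.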
Cleaning up duplicates
Now you can complete several processing steps to clean up the duplicates, i.e. merge terms from different term entries that actually refer to one unit of meaning to create a single term entry.
To do so, first go to the slider at the top right of the entry and select a term entry as the main entry. This is the term entry that you want to keep in the termbase and into which terms from other entries can be merged.
The best way to proceed is to select an entry as the main entry that already contains the most correct terms and information.
The other term entries from this duplicate view are then initially marked for deletion.
You then have the option of selecting terms from all entries either for individual deletion or for merging with the main entry.
Click on the check mark to mark a term for merging with the main entry, or on the Delete button to mark it for deletion. You can make your selection for all languages in the term entry here.
You can also edit the definition of the main entry as part of this step.
The entry will be shaded grey if you select the ‘Ignore Instead’ button. In this case it will not be deleted during the merge and will remain unedited in the termbase.
Once you have finished editing the term entry in the duplicate view and have marked the terms for merging or deletion as required, click on ‘Merge Entries’ at the top.
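The effect of these steps can be sketched in a few lines. This is only an illustration under assumptions, not the product's merge logic: the entry structure and the names `merge_entries`, `merge_terms` and `ignored_ids` are hypothetical, and the sketch assumes that every non-ignored entry other than the main entry is deleted after its selected terms have been copied across.

```python
import copy


def merge_entries(main, others, merge_terms, ignored_ids=frozenset()):
    """Merge selected terms into a copy of the main entry.

    merge_terms: set of (entry_id, language, term) tuples marked for merging.
    ignored_ids: ids of entries marked to be ignored; these are neither
                 merged nor deleted.
    Returns the merged main entry and the ids of the entries to delete.
    """
    merged = copy.deepcopy(main)
    to_delete = []
    for entry in others:
        if entry["id"] in ignored_ids:
            continue  # stays in the termbase unedited
        for language, terms in entry["terms"].items():
            for term in terms:
                if (entry["id"], language, term) in merge_terms:
                    target = merged["terms"].setdefault(language, [])
                    if term not in target:  # avoid creating a new duplicate
                        target.append(term)
        to_delete.append(entry["id"])
    return merged, to_delete


# Hypothetical duplicate group for the English term "hard disk".
main = {"id": 1, "terms": {"en": ["hard disk"], "de": ["Festplatte"]}}
others = [
    {"id": 2, "terms": {"en": ["hard disk"], "fr": ["disque dur"]}},
    {"id": 4, "terms": {"en": ["hard disk"], "es": ["disco duro"]}},
]

merged, deleted = merge_entries(
    main,
    others,
    merge_terms={(2, "fr", "disque dur")},
    ignored_ids={4},
)
print(merged)
print(deleted)
```

Here the French term from entry 2 is carried over into the main entry, entry 2 itself is marked for deletion, and entry 4 is left untouched because it was ignored.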
The quality overview also lets you find entries with missing languages. For example, if you click on ‘English’ and then ‘Missing Language’ in the quality overview, you will be shown all entries for which a term in English has not yet been added.
You can also filter for another missing language from here or search the list using a fuzzy search.
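As a rough illustration of what this view does, the sketch below filters entries that lack a given language and then narrows the list with a fuzzy search. The entry structure and the functions `entries_missing` and `fuzzy_filter` are assumptions for this example, and Python's standard `difflib` module merely stands in for whatever fuzzy matching the Quality Dashboard actually uses.

```python
import difflib

# Hypothetical data model: each term entry has an id and a list of terms per language.
entries = [
    {"id": 1, "terms": {"en": ["hard disk"], "de": ["Festplatte"]}},
    {"id": 2, "terms": {"de": ["Tastatur"]}},
    {"id": 3, "terms": {"de": ["Bildschirm"], "fr": ["écran"]}},
]


def entries_missing(entries, language):
    """Return the entries that have no term in the given language."""
    return [e for e in entries if not e["terms"].get(language)]


def fuzzy_filter(entries, query, cutoff=0.6):
    """Keep entries with at least one term whose spelling is close to the query."""
    hits = []
    for entry in entries:
        terms = [t.casefold() for ts in entry["terms"].values() for t in ts]
        if difflib.get_close_matches(query.casefold(), terms, n=1, cutoff=cutoff):
            hits.append(entry)
    return hits


missing_en = entries_missing(entries, "en")
matches = fuzzy_filter(missing_en, "Tastatuer")  # misspelt on purpose
print([e["id"] for e in missing_en])
print([e["id"] for e in matches])
```

The `cutoff` value controls how tolerant the search is; 0.6 is simply the `difflib` default and not a value taken from the product.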
Adding terms in missing languages
Clicking on the term in the list opens the matching term entry on the right-hand side. You can then select ‘Add Language’ to add the missing language directly.
You can also assign the entries to a term task to add the missing languages.
To do so, check the boxes on the left of the hit list to select the term entries you want to include in the term task.
You can also use the batch check box at the bottom to select all found entries or use the filters to limit your list as described above.
Clicking on the ‘Add to Task’ button opens a dialogue in which you can either add the terms to a current task or create a new task with the selected entries.