Seeding strategies for semantic disambiguation

Abstract

Semantic disambiguation determines the meaning of words and phrases in a text; for this task we use an automatically generated Concept-in-Context (CiC) network. Words and phrases rarely belong to a single concept, so disambiguation in Capisco relies on the interplay between words that appear close together in the text. Disambiguation starts with a seeding process that identifies the first concepts, which then form the context for further disambiguation steps. This paper introduces the seeding algorithm and explores seeding strategies for identifying these initial concepts in text volumes, such as books, that are stored in a digital library.

Citation

Hinze, A., Bainbridge, D., Wilkins, R., Taube-Schock, C., & Downie, J. S. (2018). Seeding strategies for semantic disambiguation. In Proceedings of the 18th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2018) (pp. 343–344). Fort Worth, Texas: ACM. https://doi.org/10.1145/3197026.3203874

Publisher

ACM
