Concept Hierarchy Extraction from Textbooks

Abstract

Concept hierarchies have been useful tools for presenting and organizing knowledge. With the rapid growth in the number of online knowledge resources, automatic concept hierarchy extraction is increasingly attractive. Here, we focus on concept extraction from textbooks based on the knowledge in Wikipedia. Given a book, we extract important concepts in each book chapter using Wikipedia as a resource and from this construct a concept hierarchy for that book. We define local and global features that capture both the local relatedness and global coherence embedded in that textbook. In order to evaluate the proposed features and extracted concept hierarchies, we manually construct concept hierarchies for three well used textbooks by labeling important concepts for each book chapter. Experiments show that our proposed local and global features achieve better performance than using only keyphrases to construct the concept hierarchies. Moreover, we observe that incorporating global features can improve the concept ranking precision and reaffirms the global coherence in the book.

Publication
Proceedings of the 2015 ACM Symposium on Document Engineering