Mining algorithmic complexity in full-text scholarly documents

Bakar, Abu; Safder, Iqra; Hassan, Saeed-Ul

Abstract

Non-textual document elements (NTDE) such as charts, diagrams, and algorithms play an important role in presenting key information in scientific documents [1]. Recent information retrieval systems tap this information to answer more complex queries by mining the text that pertains to non-textual document elements. However, linking document elements to their corresponding text can be non-trivial. For instance, linking text about algorithmic complexity to the algorithm it describes can be challenging. These elements are sometimes placed at the start or end of a page instead of following the flow of the document text, and the discussion of an element may or may not appear on the same page. In recent years, several attempts have been made to extract NTDE [2-3]. These techniques are actively applied to document summarization and to improving existing IR systems. Generally, asymptotic notations are used to identify complexity lines in full text. We mine the relevant complexities of algorithms from full text by comparing the metadata of each algorithm with the context of the paragraph in which the authors discuss complexity. In this paper, we present a mechanism for identifying algorithmic complexity lines using regular expressions, compiling algorithmic metadata, and linking complexity-related textual lines to that metadata.
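
The two steps named in the abstract, finding complexity lines with regular expressions and linking them to algorithm metadata, could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the pattern `COMPLEXITY_RE`, the stopword list, the function names, and the use of Jaccard word overlap as the comparison measure are all assumptions introduced here, since the abstract does not specify them.

```python
import re

# Assumed pattern: O, Theta, or Omega followed by a parenthesized expression
# with at most one level of nested parentheses, e.g. O(n log n) or O(log(n)).
COMPLEXITY_RE = re.compile(
    r"(?<![A-Za-z])(?:O|Θ|Ω|Theta|Omega)\s*\((?:[^()]|\([^()]*\))+\)"
)

# Hypothetical, deliberately tiny stopword list for the overlap measure.
STOPWORDS = {"the", "of", "a", "an", "in", "on", "from", "is", "and", "to"}


def find_complexity_lines(sentences):
    """Return (index, sentence, matches) for sentences with asymptotic notation."""
    hits = []
    for i, sentence in enumerate(sentences):
        matches = COMPLEXITY_RE.findall(sentence)
        if matches:
            hits.append((i, sentence, matches))
    return hits


def tokenize(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def link_to_algorithm(context, algorithms):
    """Pick the algorithm whose metadata overlaps most with the context.

    `algorithms` maps a label to a metadata string (e.g. a caption).
    Jaccard similarity is an assumed stand-in for the paper's comparison.
    """
    ctx = tokenize(context)
    best, best_score = None, 0.0
    for label, metadata in algorithms.items():
        meta = tokenize(metadata)
        union = ctx | meta
        score = len(ctx & meta) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = label, score
    return best, best_score


if __name__ == "__main__":
    sentences = [
        "The proposed sorting procedure runs in O(n log n) time on an array of n elements.",
        "We evaluate on three datasets.",
    ]
    algorithms = {
        "Algorithm 1": "Merge-based sorting of the input array",
        "Algorithm 2": "Breadth-first graph traversal from a source node",
    }
    for _, sentence, matches in find_complexity_lines(sentences):
        print(matches, link_to_algorithm(sentence, algorithms))
```

A real pipeline would likely need a broader pattern (for example, LaTeX-style `\mathcal{O}` or complexities stated in words) and a larger context window than a single sentence, since the abstract notes that the discussion may not sit on the same page as the algorithm.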

Type

Conference Contribution

Citation

Bakar, A., Safder, I., & Hassan, S.-U. (2018). Mining algorithmic complexity in full-text scholarly documents. In ICADL Poster Proceedings. Hamilton, New Zealand: The University of Waikato.

Date

2018

Publisher

The University of Waikato
