    Mining algorithmic complexity in full-text scholarly documents
    (Conference Contribution, The University of Waikato, 2018) Bakar, Abu; Safder, Iqra; Hassan, Saeed-Ul
    Non-textual document elements (NTDEs) such as charts, diagrams, and algorithms play an important role in presenting key information in scientific documents [1]. Recent information retrieval systems tap this information to answer more complex queries by mining the text pertaining to non-textual document elements. However, linking document elements to their corresponding text can be non-trivial. For instance, linking text about algorithmic complexity to the algorithm it describes can be challenging. These elements are sometimes placed at the start or end of a page instead of following the flow of the document text, and the discussion of an element may or may not be on the same page. In recent years, several attempts have been made to extract NTDEs [2-3]. These techniques are actively applied to document summarization and to improving existing IR systems. Generally, asymptotic notations are used to identify complexity lines in full text. We mine the relevant complexities of algorithms from full text by comparing the metadata of an algorithm with the context of the paragraph in which the authors discuss complexity. In this paper, we present a mechanism for identifying algorithmic complexity lines using regular expressions, compiling algorithmic metadata, and linking complexity-related text lines to that metadata.
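    The two steps named in this abstract, spotting complexity lines with a regular expression and linking them to an algorithm, might be sketched roughly as follows. This is only an illustration and not the authors' implementation: the pattern, the function names, and the word-overlap linking rule are all assumptions introduced here.

```python
import re

# Hypothetical pattern (an assumption, not the authors' expression): matches
# asymptotic notation such as O(n log n), O(n log(n)), Theta(n) or Omega(n)
# written with the Greek letters \u0398 and \u03A9. One level of nested
# parentheses is allowed.
COMPLEXITY_PATTERN = re.compile(
    r"(?:\bO|[\u0398\u03A9])\s*\([^()]*(?:\([^()]*\)[^()]*)*\)"
)


def find_complexity_lines(lines):
    """Return (index, line, notations) for every line containing asymptotic notation."""
    hits = []
    for index, line in enumerate(lines):
        notations = [m.group(0) for m in COMPLEXITY_PATTERN.finditer(line)]
        if notations:
            hits.append((index, line, notations))
    return hits


def link_to_algorithm(paragraph, algorithm_captions):
    """Pick the caption sharing the most words with the paragraph.

    Plain word overlap is a stand-in for the metadata/context comparison
    described in the abstract; the real comparison is not specified there.
    Returns None when no caption shares a word with the paragraph.
    """
    paragraph_words = set(re.findall(r"[a-z]+", paragraph.lower()))
    best_caption, best_score = None, 0
    for caption in algorithm_captions:
        caption_words = set(re.findall(r"[a-z]+", caption.lower()))
        score = len(paragraph_words & caption_words)
        if score > best_score:
            best_caption, best_score = caption, score
    return best_caption
```

    A pattern this simple will also fire on ordinary prose such as "O (see above)", so a practical system would likely need additional filtering.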
    Mining scientific trends based on topics in conference call for papers
    (Conference Contribution, The University of Waikato, 2018) Bakar, Abu; Arshad, Noor; Safder, Iqra; Hassan, Saeed-Ul
    As the analysis of scientific topics and the evolution of technology has become vital for researchers, academics, funding institutes and research administration departments, there is a crucial need to mine scientific trends more rigorously. In this paper, we compile a novel Call for Papers (CFP) dataset in order to analyze scientific evolution and the prestige of the conferences that set scientific trends, using scientific publications indexed in DBLP. Using ACM CSS, 1.3 million publications that appear in 146 data mining conferences are mapped into thematic areas by matching the terms that appear in publication titles with ACM CSS. In recent years, an approach termed Topic Detection and Tracking (TDT) [1] has been proposed to address the problem of "well-awareness" on this dynamic data. Conference rankings have been produced by different forums on the basis of mixed indicators. ERA ranks Australia's higher education research institutions. The major contributions of this paper are as follows: (i) compilation of a CFP dataset, (ii) identification of topics and keywords from the CFP corpus, and (iii) measurement of the impact of the hot topics extracted from CFPs.
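    The title-to-thematic-area mapping described in this abstract could look something like the sketch below. The tiny vocabulary, the area names, and the function names are hypothetical placeholders chosen for illustration; they are not taken from ACM CSS or from the paper.

```python
import re
from collections import Counter

# Hypothetical excerpt of a classification vocabulary (thematic area -> terms).
# A real run would load the full term list of the classification scheme instead.
AREA_TERMS = {
    "Information retrieval": ["retrieval", "search engine", "ranking"],
    "Machine learning": ["neural network", "classification", "clustering"],
}


def map_title_to_areas(title, area_terms=AREA_TERMS):
    """Return the thematic areas whose terms occur as whole words in the title."""
    # Lowercase and strip punctuation, then pad with spaces so that a term
    # only matches on word boundaries.
    normalized = " ".join(re.findall(r"[a-z0-9]+", title.lower()))
    padded = f" {normalized} "
    return [
        area
        for area, terms in area_terms.items()
        if any(f" {term} " in padded for term in terms)
    ]


def count_areas(titles, area_terms=AREA_TERMS):
    """Count how many titles fall into each thematic area."""
    counts = Counter()
    for title in titles:
        counts.update(map_title_to_areas(title, area_terms))
    return counts
```

    Counting the mapped areas per year would then give a rough trend line for each theme, under the assumption that title terms are a fair proxy for a paper's topic.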
    Exploring research data management from a data user’s perspective
    (Conference Contribution, The University of Waikato, 2018) Wu, Yejun
    Current research data management practices focus more on data collecting, curating and sharing than on supporting data use and reuse. This research studies research data management from a data user’s perspective and aims to reveal what research data service features may support data reuse. Two data repositories – the TREC website and the GRI data repository – were studied and compared from the perspectives of three types of data users (i.e., insiders, community users, and public users). The TREC website can support data reuse for insiders and community users, but not necessarily public users. The GRI repository can support data reuse for some insiders only. The findings have multiple implications for research data management.
    Deep feature engineering using full-text publications
    (Conference Contribution, The University of Waikato, 2018) Safder, Iqra; Batool, Hafsa; Hassan, Saeed-Ul
    The rapid proliferation of scientific literature and advancements in web technologies have shifted information dissemination to digital libraries [1]. In general, the research conducted by the scientific community is articulated through scholarly publications containing high-quality algorithms along with other algorithm-specific metadata such as achieved results, deployed datasets and runtime complexity. According to one estimate, approximately 900 algorithms were published in top core conferences during the years 2005-2009 [2]. With this significant increase in the number of algorithms reported in these conferences, more efficient search systems with advanced searching capabilities must be designed to search the full body text of an article for an algorithm and its supporting metadata, such as evaluation results (precision, recall, etc.), the dataset on which the algorithm was executed, or the time complexity it achieved. Such advanced search systems could support researchers and software engineers looking for cutting-edge algorithmic solutions. Recently, state-of-the-art systems have been designed to search for an algorithm in full-text articles [3-5]. In this work, we design an advanced search engine for full-text publications that leverages deep learning techniques to classify algorithm-specific metadata and to further improve the searching capabilities of a search system.
    A framework for bibliographic recommendation system based on Heterogeneous Retrieval Model
    (Conference Contribution, The University of Waikato, 2018) Anthony, Poonam; Bhowmick, Plaban Kumar
    In this paper, we propose an architectural framework for recommending heterogeneous resources in a digital library. We present an outline of our proposed recommendation framework and briefly discuss its performance on the SpringerNature SciGraph dataset.