Recent Submissions
Publication Mining algorithmic complexity in full-text scholarly documents (Conference Contribution, The University of Waikato, 2018)
Bakar, Abu; Safder, Iqra; Hassan, Saeed-Ul
Non-textual document elements (NTDE) such as charts, diagrams, and algorithms play an important role in presenting key information in scientific documents [1]. Recent advancements in information retrieval systems tap this information to answer more complex queries by mining text pertaining to non-textual document elements. However, linking document elements to their corresponding text can be non-trivial. For instance, linking text about algorithmic complexity to the corresponding root algorithm can be challenging. These elements are sometimes placed at the start or at the end of the page instead of following the flow of the document text, and the discussion about these elements may or may not be on the same page. In recent years, quite a few attempts have been made to extract NTDE [2-3]. These techniques are actively applied for effective document summarization and to improve existing IR systems. Generally, asymptotic notations are used to identify the complexity lines in full text. We mine the relevant complexities of algorithms from full text by comparing the metadata of an algorithm with the context of the paragraph in which the authors discuss complexity.
In this paper, we presented a mechanism for identifying algorithmic complexity lines using regular expressions, compiling algorithmic metadata, and linking complexity-related textual lines to algorithmic metadata.

Publication Mining scientific trends based on topics in conference call for papers (Conference Contribution, The University of Waikato, 2018)
Bakar, Abu; Arshad, Noor; Safder, Iqra; Hassan, Saeed-Ul
Since analyzing scientific topics and the evolution of technology has become vital for researchers, academics, funding institutes and research administration departments, there is a crucial need to mine scientific trends more rigorously. In this paper, we procured a novel Call for Papers (CFPs) dataset in order to analyze scientific evolution and the prestige of conferences that set scientific trends, using scientific publications indexed in DBLP. Using ACM CSS, 1.3 million publications that appear in 146 data mining conferences are mapped into different thematic areas by matching the terms that appear in publication titles with ACM CSS. In recent years, an attempt termed Topic Detection and Tracking (TDT) [1] has been made to solve the problem of "well-awareness" on this dynamic data. Conference rankings have been produced by different forums on the basis of mixed indicators¹. ERA² ranks Australia's higher education research institutions. The major contributions of this paper are as follows: (i) compilation of a CFP dataset, (ii) identification of topics and keywords from the CFP corpus, and (iii) measurement of the impact of the hot topics extracted from CFPs.

Publication Exploring research data management from a data user’s perspective (Conference Contribution, The University of Waikato, 2018)
Wu, Yejun
Current research data management practices focus more on data collecting, curating and sharing than on supporting data use and reuse.
This research studies research data management from a data user’s perspective and aims to reveal what research data service features may support data reuse. Two data repositories – the TREC website and the GRI data repository – were studied and compared from the perspectives of three types of data users (i.e., insiders, community users, and public users). The TREC website can support data reuse for insiders and community users, but not necessarily public users. The GRI repository can support data reuse for some insiders only. The findings have multiple implications for research data management.

Publication Deep feature engineering using full-text publications (Conference Contribution, The University of Waikato, 2018)
Safder, Iqra; Batool, Hafsa; Hassan, Saeed-Ul
We have observed a rapid proliferation of scientific literature, and advancements in web technologies have shifted information dissemination to digital libraries [1]. In general, the research conducted by the scientific community is articulated through scholarly publications containing high-quality algorithms along with other algorithm-specific metadata such as achieved results, deployed datasets and runtime complexity. According to one estimate, approximately 900 algorithms were published in top core conferences during the years 2005-2009 [2]. With this significant increase in algorithms reported in these conferences, more efficient search systems with advanced search capabilities must be designed to search the full body text of an article for an algorithm and its supporting metadata, such as evaluation results (precision, recall, etc.), the particular dataset on which the algorithm was executed, or the time complexity it achieved. Such advanced search systems could support researchers and software engineers looking for cutting-edge algorithmic solutions. Recently, state designed to search for an algorithm from full text articles [3-5].
In this work, we designed an advanced search engine for full-text publications that leverages deep learning techniques to classify algorithm-specific metadata and thereby improve the searching capabilities of a search system.

Publication A framework for bibliographic recommendation system based on Heterogeneous Retrieval Model (Conference Contribution, The University of Waikato, 2018)
Anthony, Poonam; Bhowmick, Plaban Kumar
In this paper, we propose an architectural framework for recommending heterogeneous resources in a digital library. We present an outline of our proposed recommendation framework, and briefly discuss its performance over the SpringerNature SciGraph¹ dataset.

Publication Clustering of research papers based on sentence roles (Conference Contribution, The University of Waikato, 2018)
Fukuda, Satoshi; Tomiura, Yoichi
In an academic paper search, particularly a search to confirm the originality of a user’s research or to create survey articles, it is important that the search returns comprehensive results related to the user’s information need. [1] proposes a method for efficiently selecting relevant research papers from a vast abstract set, which is based on a topic model and a search formula created by the user. In this paper, we construct a system that visually expresses the categories included in the reduced set of papers from [1], using clustering based on the user’s information need and selecting the clusters that contain relevant papers. To generate clusters based on the user’s information need, it is important to know which structures (such as “background” and “method”) in the abstract are relevant to the information need. This is based on the knowledge that if a user searches for papers related to the automatic construction of a thesaurus, he/she will judge whether a paper is relevant from the sentences in the abstract describing the research purpose and method.
We therefore propose a method that uses only the sentence content matching the information need in the clustering.

Publication Using topic modeling to understand workplace health and safety ownership (Conference Contribution, The University of Waikato, 2018)
Goh, Dion Hoe-Lian; Lee, Chei Sian; Theng, Yin Leng; Zheng, Han; Aung, Htet Htet; Aroor, Megha Rani; Lee, Edmund Wei Jian; Li, Chen
This paper is a first step in understanding the concept of workplace health and safety ownership. Using topic modeling, we identified three major themes in the literature: work related to interventions, issues concerning organizational-level ownership, and issues related to group/personal-level ownership.

Publication Question answering system of management philosophy based on lecture transcripts of business leaders (Conference Contribution, The University of Waikato, 2018)
Mishina, Hirotaka; Aoyama, Atsushi; Maeda, Akira
In recent years, a growing number of companies, including some large ones, have suffered damage and lost trust due to fraud or illegal activities by their own employees. These incidents are thought to happen due to the lack of corporate rules and philosophy concerning management. Based on these circumstances, in order to answer questions on how to manage companies, this paper proposes a question answering system for company managers using lecture transcripts of Dr. Kazuo Inamori, who is one of the most respected business leaders in the world. In the proposed system, we analyze the managers’ questions, extract answers to these questions from the lecture transcripts, and return them to the users.