Statistical Learning in Multiple Instance Problems

Xu, Xin

Authors

Xu, Xin

Files

thesis.pdf (1.16 MB)

Permanent Link

https://hdl.handle.net/10289/2328

Abstract

Multiple instance (MI) learning is a relatively new topic in machine learning. It is a form of supervised learning but differs from standard supervised learning in two respects: (1) each example contains multiple instances (whereas an example in standard supervised learning consists of a single instance), and (2) only one class label is observable for all the instances in an example (whereas each instance has its own class label in standard supervised learning). In MI learning there is a common assumption regarding the relationship between the class label of an example and the "unobservable" class labels of the instances inside it. This assumption, which is called the "MI assumption" in this thesis, states that "an example is positive if at least one of its instances is positive and negative otherwise". In this thesis, we first categorize current MI methods into a new framework. According to our analysis, there are two main categories of MI methods: instance-based and metadata-based approaches. Then we propose a new assumption for MI learning, called the "collective assumption". Although this assumption has been used in some previous MI methods, it has never been explicitly stated (in fact, some of these methods are claimed to use the standard MI assumption stated above), and this is the first time that it is formally specified. Using this new assumption we develop new algorithms: two instance-based methods and one metadata-based method. All of these methods build probabilistic models and thus implement statistical learning algorithms. The exact generative models underlying these methods are explicitly stated and illustrated so that one may clearly understand the situations to which they can best be applied. The empirical results presented in this thesis show that they are competitive on standard benchmark datasets. Finally, we explore some practical applications of MI learning, both existing and new ones.
This thesis makes three contributions: a new framework for MI learning, new MI methods based on this framework, and experimental results for new applications of MI learning.
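
As an illustration of the standard MI assumption quoted in the abstract, the following minimal Python sketch derives an example's label from the labels of its instances. The function name `bag_label` and the toy data are hypothetical and are not taken from the thesis. In a real MI problem the instance labels are unobservable, so this only shows the assumed relationship and is not a learning method.

```python
from typing import Dict, List, Sequence


def bag_label(instance_labels: Sequence[bool]) -> bool:
    """Standard MI assumption: an example (bag) is positive if at least
    one of its instances is positive, and negative otherwise."""
    return any(instance_labels)


# Hypothetical toy data: each example holds several instance labels.
# These labels are hidden in practice; they are shown here only to
# make the assumed relationship concrete.
examples: Dict[str, List[bool]] = {
    "example_a": [False, False, True],  # one positive instance
    "example_b": [False, False],        # no positive instances
}

labels = {name: bag_label(instances) for name, instances in examples.items()}
print(labels)
```

Under this assumption a single positive instance determines the label of the whole example, which is the property that the collective assumption proposed in the thesis departs from.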

Citation

Xu, X. (2003). Statistical Learning in Multiple Instance Problems (Thesis, Master of Science (MSc)). The University of Waikato, Hamilton, New Zealand. Retrieved from https://hdl.handle.net/10289/2328

Type

Thesis

Date

2003

Publisher

The University of Waikato

Degree

Master of Science (MSc)

Supervisor

Frank, Eibe
