MISVM: Multiple-Instance Support Vector Machines

Overview

MISVM contains a Python implementation of numerous support vector machine (SVM) algorithms for the multiple-instance (MI) learning framework. The implementations were created for use in the following publication:

Doran, Gary and Soumya Ray. A theoretical and empirical analysis of support vector machine methods for multiple-instance classification. To appear in Machine Learning Journal. 2013.

Installation

This package can be installed in two ways. The easy way uses pip:

# If needed:
# pip install numpy
# pip install scipy
# pip install cvxopt
pip install -e git+https://github.com/garydoranjr/misvm.git#egg=misvm

or by cloning the repository and running the setup script manually:

git clone [the url for misvm]
cd misvm
python setup.py install

Note that the code depends on the numpy, scipy, and cvxopt packages, so install those first; the build will likely fail if it cannot find them. For more information, see:

NumPy: Library for efficient matrix math in Python
SciPy: Library for more MATLAB-like functionality
CVXOPT: Efficient convex (including quadratic program) optimization
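Before installing, it can help to confirm that the three dependencies are importable. The snippet below is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and is not part of misvm.

```python
import importlib.util

# Packages the README lists as dependencies of misvm.
REQUIRED = ("numpy", "scipy", "cvxopt")


def missing_packages(names=REQUIRED):
    """Return the package names that cannot be found in this environment.

    Hypothetical helper for illustration; it only checks that a top-level
    module with the given name can be located, not its version.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Install these first: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```

Any package reported as missing can then be installed with pip, as in the commented lines of the installation command above.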

MICA

The "multiple-instance classification algorithm" (MICA) represents each bag using a convex combination of its instances. The original formulation solves the resulting optimization program through a series of linear programs. Our formulation uses L2 regularization, so it alternates between solving linear and quadratic programs. For more information on the original algorithm, see:

Mangasarian, Olvi L., and Edward W. Wild. Multiple instance classification via successive linear programming. Journal of Optimization Theory and Applications 137.3 (2008): 555-568.
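To make the bag representation concrete, the sketch below computes a convex combination of a bag's instances with NumPy. It is only an illustration of the representation: in MICA the combination weights are variables found by the optimization, whereas here they are fixed by hand, and the function name `convex_combination` is hypothetical rather than part of misvm.

```python
import numpy as np


def convex_combination(bag, weights):
    """Represent a bag (m x f array) as a single f-dimensional point.

    The weights must be non-negative and sum to one, which is what makes
    the combination convex. Hypothetical helper for illustration only.
    """
    bag = np.asarray(bag, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (bag.shape[0],):
        raise ValueError("need exactly one weight per instance")
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return weights @ bag


# A bag with two instances and two features.
bag = np.array([[0.0, 2.0],
                [4.0, 6.0]])
# Hand-picked weights; MICA would learn these during training.
representative = convex_combination(bag, [0.25, 0.75])
```

Putting all of the weight on one instance recovers that instance exactly, so the representation includes the case where a single "witness" instance stands in for the whole bag.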

sMIL, stMIL, and sbMIL

This family of approaches intentionally biases SVM formulations to handle the assumption that each positive bag contains very few positive instances. For sbMIL, prior knowledge about the "sparsity" of positive bags can be specified or found via cross-validation. For more information, see:

Bunescu, Razvan C., and Raymond J. Mooney. Multiple instance learning for sparse positive bags. Proceedings of the 24th International Conference on Machine Learning. 2007.

How to Use

The classifier implementations are loosely based on those found in the scikit-learn library. First, construct a classifier with the desired parameters:

>>> import misvm
>>> classifier = misvm.MISVM(kernel='linear', C=1.0, max_iters=50)

Use Python's help functionality as in help(misvm.MISVM) or read the documentation in the code to see which arguments each classifier takes. Then, call the fit function with some data:

>>> classifier.fit(bags, labels)

Here, the bags argument is a list of "array-like" (could be NumPy arrays, or a list of lists) objects representing each bag. Each (array-like) bag has m rows and f columns, which correspond to m instances, each with f features. Of course, m can be different across bags, but f must be the same. Then labels is an array-like object containing a label corresponding to each bag. Each label must be either +1 or -1. You will likely get strange results if you try using 0/1-valued labels. After training the classifier, you can call the predict function as:

>>> labels = classifier.predict(bags)

Here bags has the same format as for fit, and the function returns an array of real-valued predictions (use numpy.sign(labels) to get -1/+1 class predictions).
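The input layout and the post-processing of predictions described above can be sketched with NumPy alone. The helpers `to_signed_labels` and `check_bags` are hypothetical and not part of misvm, and the `scores` array is a hand-written stand-in for the output of `classifier.predict(bags)` so that the snippet runs without a trained classifier.

```python
import numpy as np


def to_signed_labels(labels01):
    """Map 0/1 labels to the -1/+1 labels the classifiers expect."""
    labels01 = np.asarray(labels01)
    return np.where(labels01 > 0, 1.0, -1.0)


def check_bags(bags, labels):
    """Validate the layout: one 2-D (m x f) array per bag, a shared f,
    and one -1/+1 label per bag. Returns the bags as float arrays."""
    arrays = [np.asarray(bag, dtype=float) for bag in bags]
    labels = np.asarray(labels, dtype=float)
    if len(arrays) != len(labels):
        raise ValueError("need exactly one label per bag")
    if any(a.ndim != 2 for a in arrays):
        raise ValueError("each bag must have m rows and f columns")
    if len({a.shape[1] for a in arrays}) > 1:
        raise ValueError("all bags must share the same number of features")
    if not np.all(np.isin(labels, (-1.0, 1.0))):
        raise ValueError("labels must be +1 or -1")
    return arrays


# Two bags with different numbers of instances but the same f = 2.
bags = [
    np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]),  # NumPy array, m = 3
    [[1.0, 1.1]],                                    # list of lists, m = 1
]
labels = to_signed_labels([1, 0])
bags = check_bags(bags, labels)

# Stand-in for `classifier.predict(bags)`, which returns real values.
scores = np.array([0.8, -1.3])
predicted = np.sign(scores)            # -1/+1 class predictions
accuracy = np.mean(predicted == labels)
```

Note that `numpy.sign` returns 0 for a score of exactly 0, so a prediction on the decision boundary needs its own tie-breaking rule.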

In order to get instance-level predictions from a classifier, use the instancePrediction flag, as in:

>>> bag_labels, instance_labels = classifier.predict(bags, instancePrediction=True)

The instancePrediction flag is not available for bag-level classifiers such as the NSK. However, you can always predict the labels of "singleton" bags containing a single instance to assign a label to that instance. In this case, one should use caution in interpreting the label of an instance produced by a bag-level classifier, since these classifiers are designed to make predictions based on properties of an entire bag.
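The singleton-bag workaround can be sketched as follows. The helper `singleton_bags` is hypothetical and not part of misvm; it only reshapes the data, and the resulting list would then be passed to `predict` of an already trained bag-level classifier.

```python
import numpy as np


def singleton_bags(bag):
    """Split one bag (m x f) into m bags that each hold a single instance.

    Hypothetical helper for illustration; every returned bag has shape
    (1, f), matching the bag format that fit and predict expect.
    """
    bag = np.asarray(bag, dtype=float)
    if bag.ndim != 2:
        raise ValueError("a bag must have m rows and f columns")
    return [row.reshape(1, -1) for row in bag]


bag = np.array([[0.1, 0.2],
                [0.3, 0.4],
                [0.5, 0.6]])
singletons = singleton_bags(bag)
# With a trained bag-level classifier, one value per instance would come from:
#     instance_scores = classifier.predict(singletons)
```

Each value obtained this way is still a bag-level prediction for a one-instance bag, so the caution about interpretation stated above applies.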

An example script is included that trains classifiers on the musk1 dataset; see:

Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

Before running the example, install the package or add the misvm directory to the PYTHONPATH environment variable. Then run python example.py from within the example directory.

Questions and Issues

If you find any bugs or have any questions about this code, please create an issue on GitHub, or contact Gary Doran at [email protected]. Of course, I cannot guarantee any support for this software.
