Kernels for structured data

This book provides a unique treatment of an important area of machine learning and answers the question of how kernel methods can be applied to structured data. Kernel methods are a class of state-of-the-art learning algorithms that achieve excellent results in several application domains…


Bibliographic Details
Main Author: Gärtner, Thomas
Format: eBook
Language: English
Published: New Jersey : World Scientific Publishing Co. Pte. Ltd, 2008
Edition: 1
Series: Series in Machine Perception and Artificial Intelligence
Subjects:
ISBN: 9812814558; 9789812814555; 9789812814562; 9812814566
Online Access: Get full text
Table of Contents:
  • Intro -- Contents -- Preface -- Notational Conventions
  • 1. Why Kernels for Structured Data? -- 1.1 Supervised Machine Learning -- 1.2 Kernel Methods -- 1.3 Representing Structured Data -- 1.4 Goals and Contributions -- 1.5 Outline -- 1.6 Bibliographical Notes
  • 2. Kernel Methods in a Nutshell -- 2.1 Mathematical Foundations -- 2.1.1 From Sets to Functions -- 2.1.2 Measures and Integrals -- 2.1.3 Metric Spaces -- 2.1.4 Linear Spaces and Banach Spaces -- 2.1.5 Inner Product Spaces and Hilbert Spaces -- 2.1.6 Reproducing Kernels and Positive-Definite Functions -- 2.1.7 Matrix Computations -- 2.1.8 Partitioned Inverse Equations -- 2.2 Recognising Patterns with Kernels -- 2.2.1 Supervised Learning -- 2.2.2 Empirical Risk Minimisation -- 2.2.3 Assessing Predictive Performance -- 2.3 Foundations of Kernel Methods -- 2.3.1 Model Fitting and Linear Inverse Equations -- 2.3.2 Common Grounds of Kernel Methods -- 2.3.3 Representer Theorem -- 2.4 Kernel Machines -- 2.4.1 Regularised Least Squares -- 2.4.2 Support Vector Machines -- 2.4.3 Gaussian Processes -- 2.4.4 Kernel Perceptron -- 2.4.5 Kernel Principal Component Analysis -- 2.4.6 Distance-Based Algorithms -- 2.5 Summary
  • 3. Kernel Design -- 3.1 General Remarks on Kernels and Examples -- 3.1.1 Classes of Kernels -- 3.1.2 Good Kernels -- 3.1.3 Kernels on Inner Product Spaces -- 3.1.4 Some Illustrations -- 3.2 Kernel Functions -- 3.2.1 Closure Properties -- 3.2.2 Kernel Modifiers -- 3.2.3 Minimal and Maximal Functions -- 3.2.4 Soft-Maximal Kernels -- 3.3 Introduction to Kernels for Structured Data -- 3.3.1 Intersection and Crossproduct Kernels on Sets -- 3.3.2 Minimal and Maximal Functions on Sets -- 3.3.3 Kernels on Multisets -- 3.3.4 Convolution Kernels -- 3.4 Prior Work -- 3.4.1 Kernels from Generative Models -- 3.4.2 Kernels from Instance Space Graphs -- 3.4.3 String Kernels -- 3.4.4 Tree Kernels -- 3.5 Summary
  • 4. Basic Term Kernels -- 4.1 Logics for Learning -- 4.1.1 Propositional Logic for Learning -- 4.1.2 First-Order Logic for Learning -- 4.1.3 Lambda Calculus -- 4.1.4 Lambda Calculus with Polymorphic Types -- 4.1.5 Basic Terms for Learning -- 4.2 Kernels for Basic Terms -- 4.2.1 Default Kernels for Basic Terms -- 4.2.2 Positive Definiteness of the Default Kernel -- 4.2.3 Specifying Kernels -- 4.3 Multi-Instance Learning -- 4.3.1 The Multi-Instance Setting -- 4.3.2 Separating MI Problems -- 4.3.3 Convergence of the MI Kernel Perceptron -- 4.3.4 Alternative MI Kernels -- 4.3.5 Learning MI Ray Concepts -- 4.4 Related Work -- 4.4.1 Kernels for General Data Structures -- 4.4.2 Multi-Instance Learning -- 4.5 Applications and Experiments -- 4.5.1 East/West Challenge -- 4.5.2 Drug Activity Prediction -- 4.5.3 Structure Elucidation from Spectroscopic Analyses -- 4.5.4 Spatial Clustering -- 4.6 Summary
  • 5. Graph Kernels -- 5.1 Motivation and Approach -- 5.2 Labelled Directed Graphs -- 5.2.1 Basic Terminology and Notation -- 5.2.2 Matrix Notation and some Functions -- 5.2.3 Product Graphs -- 5.2.4 Limits of Matrix Power Series -- 5.3 Complete Graph Kernels -- 5.4 Walk Kernels -- 5.4.1 Kernels Based on Label Pairs -- 5.4.2 Kernels Based on Contiguous Label Sequences -- 5.4.3 Transition Graphs -- 5.4.4 Non-Contiguous Label Sequences -- 5.5 Cyclic Pattern Kernels -- 5.5.1 Undirected Graphs -- 5.5.2 Kernel Definition -- 5.5.3 Kernel Computation -- 5.6 Related Work -- 5.7 Relational Reinforcement Learning -- 5.7.1 Relational Reinforcement Learning -- 5.7.2 Kernels for Graphs with Parallel Edges -- 5.7.3 Kernel Based RRL in the Blocks World -- 5.7.3.1 State and Action Representation -- 5.7.3.2 Blocks World Kernels -- 5.7.4 Experiments -- 5.7.4.1 Parameter Influence -- 5.7.4.2 Comparison with Previous RRL Implementations -- 5.7.5 Future Work -- 5.8 Molecule Classification -- 5.8.1 Mutagenicity -- 5.8.2 HIV Data -- 5.9 Summary
  • 6. Conclusions -- Bibliography -- Index