The redundancy of view-redundancy for co-training
Blum and Mitchell's co-training is a (very deservedly) popular semi-supervised learning algorithm that relies on class-conditional feature independence and view-redundancy (or view-agreement). I will argue that the view-redundancy assumption is unnecessary, and along the way show how surrogate learning can be plugged into co-training (which is not all that surprising, considering that both are multi-view semi-supervised algorithms that rely on class-conditional view-independence). I'll first explain co-training with an example.

Co-training - The setup

Consider a $y \in \{0,1\}$ classification problem on the feature space $X = X_1 \times X_2$. I.e., a feature vector $x$ can be split into two parts as $x = [x_1, x_2]$. We make the rather restrictive assumption that $x_1$ and $x_2$ are class-conditionally independent for both classes. I.e., $P(x_1, x_2 | y) = P(x_1 | y) P(x_2 | y)$ for $y \in \{0, 1\}$.
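To make the independence assumption concrete, here is a minimal sketch (using NumPy, with made-up Gaussian class-conditional distributions that are not from the original setup) that generates two-view data satisfying the assumption by construction, and checks that the covariance between the views is near zero within each class:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_two_view(n, p_y1=0.5):
    """Draw n points whose two views are independent given the label y.

    Each view is a 1-D Gaussian whose mean depends only on y, so
    P(x1, x2 | y) = P(x1 | y) * P(x2 | y) holds by construction.
    (The means and variances below are illustrative choices.)
    """
    y = (rng.random(n) < p_y1).astype(int)
    x1 = rng.normal(loc=np.where(y == 1, 2.0, -2.0), scale=1.0)  # view 1
    x2 = rng.normal(loc=np.where(y == 1, 1.0, -1.0), scale=1.0)  # view 2
    return np.column_stack([x1, x2]), y

X, y = sample_two_view(10000)
# Class-conditional independence implies zero covariance between the
# views within each class (the marginal covariance is nonzero, since
# both views carry information about y).
for c in (0, 1):
    print(c, np.cov(X[y == c].T)[0, 1])  # approximately 0 for both classes
```

Note that the two views are strongly correlated marginally; the assumption only says the correlation disappears once you condition on the class.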