Linear SVM

We will work with the Iris dataset:

In [7]:
import pandas as pd
iris=pd.read_csv('Iris.csv')
iris
Out[7]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
0 1 5.1 3.5 1.4 0.2 Iris-setosa
1 2 4.9 3.0 1.4 0.2 Iris-setosa
2 3 4.7 3.2 1.3 0.2 Iris-setosa
3 4 4.6 3.1 1.5 0.2 Iris-setosa
4 5 5.0 3.6 1.4 0.2 Iris-setosa
5 6 5.4 3.9 1.7 0.4 Iris-setosa
6 7 4.6 3.4 1.4 0.3 Iris-setosa
7 8 5.0 3.4 1.5 0.2 Iris-setosa
8 9 4.4 2.9 1.4 0.2 Iris-setosa
9 10 4.9 3.1 1.5 0.1 Iris-setosa
10 11 5.4 3.7 1.5 0.2 Iris-setosa
11 12 4.8 3.4 1.6 0.2 Iris-setosa
12 13 4.8 3.0 1.4 0.1 Iris-setosa
13 14 4.3 3.0 1.1 0.1 Iris-setosa
14 15 5.8 4.0 1.2 0.2 Iris-setosa
15 16 5.7 4.4 1.5 0.4 Iris-setosa
16 17 5.4 3.9 1.3 0.4 Iris-setosa
17 18 5.1 3.5 1.4 0.3 Iris-setosa
18 19 5.7 3.8 1.7 0.3 Iris-setosa
19 20 5.1 3.8 1.5 0.3 Iris-setosa
20 21 5.4 3.4 1.7 0.2 Iris-setosa
21 22 5.1 3.7 1.5 0.4 Iris-setosa
22 23 4.6 3.6 1.0 0.2 Iris-setosa
23 24 5.1 3.3 1.7 0.5 Iris-setosa
24 25 4.8 3.4 1.9 0.2 Iris-setosa
25 26 5.0 3.0 1.6 0.2 Iris-setosa
26 27 5.0 3.4 1.6 0.4 Iris-setosa
27 28 5.2 3.5 1.5 0.2 Iris-setosa
28 29 5.2 3.4 1.4 0.2 Iris-setosa
29 30 4.7 3.2 1.6 0.2 Iris-setosa
... ... ... ... ... ... ...
120 121 6.9 3.2 5.7 2.3 Iris-virginica
121 122 5.6 2.8 4.9 2.0 Iris-virginica
122 123 7.7 2.8 6.7 2.0 Iris-virginica
123 124 6.3 2.7 4.9 1.8 Iris-virginica
124 125 6.7 3.3 5.7 2.1 Iris-virginica
125 126 7.2 3.2 6.0 1.8 Iris-virginica
126 127 6.2 2.8 4.8 1.8 Iris-virginica
127 128 6.1 3.0 4.9 1.8 Iris-virginica
128 129 6.4 2.8 5.6 2.1 Iris-virginica
129 130 7.2 3.0 5.8 1.6 Iris-virginica
130 131 7.4 2.8 6.1 1.9 Iris-virginica
131 132 7.9 3.8 6.4 2.0 Iris-virginica
132 133 6.4 2.8 5.6 2.2 Iris-virginica
133 134 6.3 2.8 5.1 1.5 Iris-virginica
134 135 6.1 2.6 5.6 1.4 Iris-virginica
135 136 7.7 3.0 6.1 2.3 Iris-virginica
136 137 6.3 3.4 5.6 2.4 Iris-virginica
137 138 6.4 3.1 5.5 1.8 Iris-virginica
138 139 6.0 3.0 4.8 1.8 Iris-virginica
139 140 6.9 3.1 5.4 2.1 Iris-virginica
140 141 6.7 3.1 5.6 2.4 Iris-virginica
141 142 6.9 3.1 5.1 2.3 Iris-virginica
142 143 5.8 2.7 5.1 1.9 Iris-virginica
143 144 6.8 3.2 5.9 2.3 Iris-virginica
144 145 6.7 3.3 5.7 2.5 Iris-virginica
145 146 6.7 3.0 5.2 2.3 Iris-virginica
146 147 6.3 2.5 5.0 1.9 Iris-virginica
147 148 6.5 3.0 5.2 2.0 Iris-virginica
148 149 6.2 3.4 5.4 2.3 Iris-virginica
149 150 5.9 3.0 5.1 1.8 Iris-virginica

150 rows × 6 columns

In [5]:
import seaborn as sns
sns.scatterplot(x='PetalLengthCm', y='PetalWidthCm', hue='Species', data=iris)
Out[5]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f214cea54e0>

We will use an SVM to separate setosa from versicolor.

In [6]:
setosa_versicolor=iris[(iris.Species=='Iris-setosa')|(iris.Species=='Iris-versicolor')]
setosa_versicolor
Out[6]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
0 1 5.1 3.5 1.4 0.2 Iris-setosa
1 2 4.9 3.0 1.4 0.2 Iris-setosa
2 3 4.7 3.2 1.3 0.2 Iris-setosa
3 4 4.6 3.1 1.5 0.2 Iris-setosa
4 5 5.0 3.6 1.4 0.2 Iris-setosa
5 6 5.4 3.9 1.7 0.4 Iris-setosa
6 7 4.6 3.4 1.4 0.3 Iris-setosa
7 8 5.0 3.4 1.5 0.2 Iris-setosa
8 9 4.4 2.9 1.4 0.2 Iris-setosa
9 10 4.9 3.1 1.5 0.1 Iris-setosa
10 11 5.4 3.7 1.5 0.2 Iris-setosa
11 12 4.8 3.4 1.6 0.2 Iris-setosa
12 13 4.8 3.0 1.4 0.1 Iris-setosa
13 14 4.3 3.0 1.1 0.1 Iris-setosa
14 15 5.8 4.0 1.2 0.2 Iris-setosa
15 16 5.7 4.4 1.5 0.4 Iris-setosa
16 17 5.4 3.9 1.3 0.4 Iris-setosa
17 18 5.1 3.5 1.4 0.3 Iris-setosa
18 19 5.7 3.8 1.7 0.3 Iris-setosa
19 20 5.1 3.8 1.5 0.3 Iris-setosa
20 21 5.4 3.4 1.7 0.2 Iris-setosa
21 22 5.1 3.7 1.5 0.4 Iris-setosa
22 23 4.6 3.6 1.0 0.2 Iris-setosa
23 24 5.1 3.3 1.7 0.5 Iris-setosa
24 25 4.8 3.4 1.9 0.2 Iris-setosa
25 26 5.0 3.0 1.6 0.2 Iris-setosa
26 27 5.0 3.4 1.6 0.4 Iris-setosa
27 28 5.2 3.5 1.5 0.2 Iris-setosa
28 29 5.2 3.4 1.4 0.2 Iris-setosa
29 30 4.7 3.2 1.6 0.2 Iris-setosa
... ... ... ... ... ... ...
70 71 5.9 3.2 4.8 1.8 Iris-versicolor
71 72 6.1 2.8 4.0 1.3 Iris-versicolor
72 73 6.3 2.5 4.9 1.5 Iris-versicolor
73 74 6.1 2.8 4.7 1.2 Iris-versicolor
74 75 6.4 2.9 4.3 1.3 Iris-versicolor
75 76 6.6 3.0 4.4 1.4 Iris-versicolor
76 77 6.8 2.8 4.8 1.4 Iris-versicolor
77 78 6.7 3.0 5.0 1.7 Iris-versicolor
78 79 6.0 2.9 4.5 1.5 Iris-versicolor
79 80 5.7 2.6 3.5 1.0 Iris-versicolor
80 81 5.5 2.4 3.8 1.1 Iris-versicolor
81 82 5.5 2.4 3.7 1.0 Iris-versicolor
82 83 5.8 2.7 3.9 1.2 Iris-versicolor
83 84 6.0 2.7 5.1 1.6 Iris-versicolor
84 85 5.4 3.0 4.5 1.5 Iris-versicolor
85 86 6.0 3.4 4.5 1.6 Iris-versicolor
86 87 6.7 3.1 4.7 1.5 Iris-versicolor
87 88 6.3 2.3 4.4 1.3 Iris-versicolor
88 89 5.6 3.0 4.1 1.3 Iris-versicolor
89 90 5.5 2.5 4.0 1.3 Iris-versicolor
90 91 5.5 2.6 4.4 1.2 Iris-versicolor
91 92 6.1 3.0 4.6 1.4 Iris-versicolor
92 93 5.8 2.6 4.0 1.2 Iris-versicolor
93 94 5.0 2.3 3.3 1.0 Iris-versicolor
94 95 5.6 2.7 4.2 1.3 Iris-versicolor
95 96 5.7 3.0 4.2 1.2 Iris-versicolor
96 97 5.7 2.9 4.2 1.3 Iris-versicolor
97 98 6.2 2.9 4.3 1.3 Iris-versicolor
98 99 5.1 2.5 3.0 1.1 Iris-versicolor
99 100 5.7 2.8 4.1 1.3 Iris-versicolor

100 rows × 6 columns

In [8]:
sns.scatterplot(x='PetalLengthCm', y='PetalWidthCm', hue='Species', data=setosa_versicolor)
Out[8]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f214ce25d68>
In [9]:
X=setosa_versicolor[['PetalLengthCm', 'PetalWidthCm']]
y=setosa_versicolor['Species']
In [10]:
from sklearn.svm import SVC
svm_clf = SVC(kernel='linear', C=float('inf'))  # Define (C=inf means hard margin)
svm_clf.fit(X, y)  # Fit
Out[10]:
SVC(C=inf, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
  kernel='linear', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)

The weights $w_1$ and $w_2$:

In [11]:
svm_clf.coef_
Out[11]:
array([[1.29411744, 0.82352928]])

The bias $b$:

In [12]:
svm_clf.intercept_
Out[12]:
array([-3.78823471])

The support vectors:

In [14]:
svm_clf.support_vectors_
Out[14]:
array([[1.9, 0.4],
       [3. , 1.1]])
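A quick numerical sanity check: plugging the support vectors into the coefficients reported above shows they sit exactly on the gutters, where $w \cdot x + b = \pm 1$ (values copied from the outputs of the cells above):

```python
import numpy as np

# Coefficients and support vectors reported by svm_clf above
w = np.array([1.29411744, 0.82352928])
b = -3.78823471
svs = np.array([[1.9, 0.4], [3.0, 1.1]])

# For a hard-margin fit, each support vector satisfies w.x + b = -1 or +1
margins = svs @ w + b
print(margins)  # approximately [-1.  1.]
```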

To plot the separating hyperplane:

In [13]:
import matplotlib.pyplot as plt
import numpy as np

def plot_svc_decision_boundary(svm_clf, xmin, xmax):
    w = svm_clf.coef_[0]
    b = svm_clf.intercept_[0]

    # At the decision boundary, w0*x0 + w1*x1 + b = 0
    # => x1 = -w0/w1 * x0 - b/w1
    x0 = np.linspace(xmin, xmax, 200)
    decision_boundary = -w[0]/w[1] * x0 - b/w[1]

    margin = 1/w[1]  # vertical offset to each gutter, where w0*x0 + w1*x1 + b = +-1
    gutter_up = decision_boundary + margin
    gutter_down = decision_boundary - margin

    svs = svm_clf.support_vectors_
    plt.scatter(svs[:, 0], svs[:, 1], s=180, facecolors='#FFAAAA')
    plt.plot(x0, decision_boundary, "r-", linewidth=2)
    plt.plot(x0, gutter_up, "k--", linewidth=2)
    plt.plot(x0, gutter_down, "k--", linewidth=2)
In [17]:
plot_svc_decision_boundary(svm_clf, 0, 5.5)
sns.scatterplot(x='PetalLengthCm', y='PetalWidthCm', hue='Species', data=setosa_versicolor)
plt.axis([0,5.5,0,2])
Out[17]:
[0, 5.5, 0, 2]
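The distance between the two dashed gutters (the width of the "street") is $2/\lVert \mathbf{w} \rVert$; with the coefficients reported above:

```python
import numpy as np

w = np.array([1.29411744, 0.82352928])  # svm_clf.coef_[0] from above
street_width = 2 / np.linalg.norm(w)  # distance between the dashed gutters
print(round(street_width, 3))  # -> 1.304
```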

Sensitivity to feature scales

In [23]:
Xs=np.array([[1,50],[5,20],[3,80],[5,60]]).astype(np.float64)
ys=np.array([0,0, 1,1])
svm_clf = SVC(kernel='linear',C=100)
svm_clf.fit(Xs,ys)
Out[23]:
SVC(C=100, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
  kernel='linear', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)
In [24]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled=scaler.fit_transform(Xs)
svm_clf_scaled = SVC(kernel='linear',C=100)
svm_clf_scaled.fit(X_scaled,ys)
Out[24]:
SVC(C=100, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
  kernel='linear', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)
In [25]:
plt.figure(figsize=(12,3.2))
plt.subplot(121)
plot_svc_decision_boundary(svm_clf, 0, 6)
sns.scatterplot(x=Xs[:,0], y=Xs[:,1], hue=ys)
plt.title("Unscaled", fontsize=16)
plt.axis([0, 6, 0, 90])

plt.subplot(122)
plot_svc_decision_boundary(svm_clf_scaled, -2, 2)
sns.scatterplot(x=X_scaled[:,0], y=X_scaled[:,1], hue=ys)
plt.title("Scaled", fontsize=16)
plt.axis([-2, 2, -2, 2])
Out[25]:
[-2, 2, -2, 2]
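In practice, scaling is usually chained with the SVM in a pipeline, so the scaler's statistics are learned during `fit` and applied consistently at prediction time. A minimal sketch with the same toy data, using scikit-learn's `make_pipeline`:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

Xs = np.array([[1, 50], [5, 20], [3, 80], [5, 60]], dtype=np.float64)
ys = np.array([0, 0, 1, 1])

# The pipeline standardizes inside fit/predict, so new points
# are transformed with the training statistics automatically
scaled_svm = make_pipeline(StandardScaler(), SVC(kernel='linear', C=100))
scaled_svm.fit(Xs, ys)
print(scaled_svm.score(Xs, ys))
```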

Soft margin

Now we will separate virginica from versicolor.

In [26]:
virginica_versicolor = iris[(iris.Species=='Iris-virginica')|(iris.Species=='Iris-versicolor')]
virginica_versicolor
Out[26]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
50 51 7.0 3.2 4.7 1.4 Iris-versicolor
51 52 6.4 3.2 4.5 1.5 Iris-versicolor
52 53 6.9 3.1 4.9 1.5 Iris-versicolor
53 54 5.5 2.3 4.0 1.3 Iris-versicolor
54 55 6.5 2.8 4.6 1.5 Iris-versicolor
55 56 5.7 2.8 4.5 1.3 Iris-versicolor
56 57 6.3 3.3 4.7 1.6 Iris-versicolor
57 58 4.9 2.4 3.3 1.0 Iris-versicolor
58 59 6.6 2.9 4.6 1.3 Iris-versicolor
59 60 5.2 2.7 3.9 1.4 Iris-versicolor
60 61 5.0 2.0 3.5 1.0 Iris-versicolor
61 62 5.9 3.0 4.2 1.5 Iris-versicolor
62 63 6.0 2.2 4.0 1.0 Iris-versicolor
63 64 6.1 2.9 4.7 1.4 Iris-versicolor
64 65 5.6 2.9 3.6 1.3 Iris-versicolor
65 66 6.7 3.1 4.4 1.4 Iris-versicolor
66 67 5.6 3.0 4.5 1.5 Iris-versicolor
67 68 5.8 2.7 4.1 1.0 Iris-versicolor
68 69 6.2 2.2 4.5 1.5 Iris-versicolor
69 70 5.6 2.5 3.9 1.1 Iris-versicolor
70 71 5.9 3.2 4.8 1.8 Iris-versicolor
71 72 6.1 2.8 4.0 1.3 Iris-versicolor
72 73 6.3 2.5 4.9 1.5 Iris-versicolor
73 74 6.1 2.8 4.7 1.2 Iris-versicolor
74 75 6.4 2.9 4.3 1.3 Iris-versicolor
75 76 6.6 3.0 4.4 1.4 Iris-versicolor
76 77 6.8 2.8 4.8 1.4 Iris-versicolor
77 78 6.7 3.0 5.0 1.7 Iris-versicolor
78 79 6.0 2.9 4.5 1.5 Iris-versicolor
79 80 5.7 2.6 3.5 1.0 Iris-versicolor
... ... ... ... ... ... ...
120 121 6.9 3.2 5.7 2.3 Iris-virginica
121 122 5.6 2.8 4.9 2.0 Iris-virginica
122 123 7.7 2.8 6.7 2.0 Iris-virginica
123 124 6.3 2.7 4.9 1.8 Iris-virginica
124 125 6.7 3.3 5.7 2.1 Iris-virginica
125 126 7.2 3.2 6.0 1.8 Iris-virginica
126 127 6.2 2.8 4.8 1.8 Iris-virginica
127 128 6.1 3.0 4.9 1.8 Iris-virginica
128 129 6.4 2.8 5.6 2.1 Iris-virginica
129 130 7.2 3.0 5.8 1.6 Iris-virginica
130 131 7.4 2.8 6.1 1.9 Iris-virginica
131 132 7.9 3.8 6.4 2.0 Iris-virginica
132 133 6.4 2.8 5.6 2.2 Iris-virginica
133 134 6.3 2.8 5.1 1.5 Iris-virginica
134 135 6.1 2.6 5.6 1.4 Iris-virginica
135 136 7.7 3.0 6.1 2.3 Iris-virginica
136 137 6.3 3.4 5.6 2.4 Iris-virginica
137 138 6.4 3.1 5.5 1.8 Iris-virginica
138 139 6.0 3.0 4.8 1.8 Iris-virginica
139 140 6.9 3.1 5.4 2.1 Iris-virginica
140 141 6.7 3.1 5.6 2.4 Iris-virginica
141 142 6.9 3.1 5.1 2.3 Iris-virginica
142 143 5.8 2.7 5.1 1.9 Iris-virginica
143 144 6.8 3.2 5.9 2.3 Iris-virginica
144 145 6.7 3.3 5.7 2.5 Iris-virginica
145 146 6.7 3.0 5.2 2.3 Iris-virginica
146 147 6.3 2.5 5.0 1.9 Iris-virginica
147 148 6.5 3.0 5.2 2.0 Iris-virginica
148 149 6.2 3.4 5.4 2.3 Iris-virginica
149 150 5.9 3.0 5.1 1.8 Iris-virginica

100 rows × 6 columns

In [27]:
sns.scatterplot(x='PetalLengthCm', y='PetalWidthCm', hue='Species', data=virginica_versicolor)
Out[27]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f214bd0b780>

They are not linearly separable.

In [28]:
X=virginica_versicolor[['PetalLengthCm','PetalWidthCm']]
y=virginica_versicolor['Species']
In [29]:
# Scale X
X=scaler.fit_transform(X)

Models with different values of C

In [30]:
svm_C_grande = SVC(kernel='linear', C=100)
svm_C_pequeño = SVC(kernel='linear', C=1)
In [31]:
svm_C_grande.fit(X,y)
svm_C_pequeño.fit(X,y)
Out[31]:
SVC(C=1, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
  kernel='linear', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)
In [34]:
plt.figure(figsize=(12,3.2))

plt.subplot(121)
plot_svc_decision_boundary(svm_C_grande, -3,3)
sns.scatterplot(x=X[:,0], y=X[:,1], hue=y)
plt.title('C=100')

plt.subplot(122)
plot_svc_decision_boundary(svm_C_pequeño, -3,3)
sns.scatterplot(x=X[:,0], y=X[:,1], hue=y)
plt.title('C=1')
Out[34]:
Text(0.5, 1.0, 'C=1')
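As a sanity check on what C controls (a sketch on synthetic blobs, not the iris fit above): a smaller C shrinks $\lVert \mathbf{w} \rVert$, so the street width $2/\lVert \mathbf{w} \rVert$ grows.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
# Two overlapping Gaussian blobs (illustrative toy data)
X_toy = np.vstack([rng.randn(20, 2) - 1, rng.randn(20, 2) + 1])
y_toy = np.array([0] * 20 + [1] * 20)

big_C = SVC(kernel='linear', C=100).fit(X_toy, y_toy)
small_C = SVC(kernel='linear', C=0.1).fit(X_toy, y_toy)

# Smaller C shrinks ||w||, i.e. widens the street 2/||w||
width_big = 2 / np.linalg.norm(big_C.coef_)
width_small = 2 / np.linalg.norm(small_C.coef_)
print(width_small >= width_big)  # -> True
```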

Nonlinear SVM

We will use make_moons, a function in Scikit-learn that randomly generates two interleaving half-moon-shaped clusters of points in the plane.

In [35]:
from sklearn.datasets import make_moons
In [36]:
help(make_moons)
Help on function make_moons in module sklearn.datasets.samples_generator:

make_moons(n_samples=100, shuffle=True, noise=None, random_state=None)
    Make two interleaving half circles
    
    A simple toy dataset to visualize clustering and classification
    algorithms. Read more in the :ref:`User Guide <sample_generators>`.
    
    Parameters
    ----------
    n_samples : int, optional (default=100)
        The total number of points generated.
    
    shuffle : bool, optional (default=True)
        Whether to shuffle the samples.
    
    noise : double or None (default=None)
        Standard deviation of Gaussian noise added to the data.
    
    random_state : int, RandomState instance or None (default)
        Determines random number generation for dataset shuffling and noise.
        Pass an int for reproducible output across multiple function calls.
        See :term:`Glossary <random_state>`.
    
    Returns
    -------
    X : array of shape [n_samples, 2]
        The generated samples.
    
    y : array of shape [n_samples]
        The integer labels (0 or 1) for class membership of each sample.

In [37]:
X, y = make_moons(n_samples=100, noise=0.15, random_state=42)

sns.scatterplot(x=X[:,0], y=X[:,1], hue=y)
Out[37]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f214bc53518>

Let's scale the data:

In [38]:
X = scaler.fit_transform(X)
In [39]:
sns.scatterplot(x=X[:,0], y=X[:,1], hue=y)
Out[39]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f214ba2cc88>

We will use a helper function to plot the classification boundaries:

In [40]:
def plot_predictions(clf, axes):
    x0s = np.linspace(axes[0], axes[1], 100)
    x1s = np.linspace(axes[2], axes[3], 100)
    x0, x1 = np.meshgrid(x0s, x1s)
    X = np.c_[x0.ravel(), x1.ravel()]
    y_pred = clf.predict(X).reshape(x0.shape)
    y_decision = clf.decision_function(X).reshape(x0.shape)
    plt.contourf(x0, x1, y_pred, cmap=plt.cm.brg, alpha=0.2)
    plt.contourf(x0, x1, y_decision, cmap=plt.cm.brg, alpha=0.1)

Polynomial kernel

$K(\mathbf{a},\mathbf{b})=(\mathbf{a}\cdot \mathbf{b} + r)^d$

  • $d$ is degree
  • $r$ is coef0
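The formula can be checked numerically against scikit-learn's pairwise helper; note that `polynomial_kernel` also includes a gamma factor, $(\gamma\, \mathbf{a}\cdot\mathbf{b} + r)^d$, so `gamma=1` recovers the expression above:

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

a = np.array([[1.0, 2.0]])
b = np.array([[0.5, -1.0]])
d, r = 3, 1

manual = (a @ b.T + r) ** d  # (a.b + r)^d
sk = polynomial_kernel(a, b, degree=d, gamma=1, coef0=r)
print(manual[0, 0], sk[0, 0])  # both -0.125
```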
In [41]:
poly_kernel_svm=SVC(kernel='poly', degree=3, coef0=1, C=5)
poly_kernel_svm.fit(X,y)
Out[41]:
SVC(C=5, cache_size=200, class_weight=None, coef0=1,
  decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
  kernel='poly', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)
In [43]:
plot_predictions(poly_kernel_svm, [-2.5,2.5,-2.5,2.5])
sns.scatterplot(x=X[:,0], y=X[:,1], hue=y)
plt.title('$d=3, r=1, C=5$')
Out[43]:
Text(0.5, 1.0, '$d=3, r=1, C=5$')
In [44]:
# With other values of d and r
poly_kernel_svm=SVC(kernel='poly', degree=10, coef0=100, C=5)
poly_kernel_svm.fit(X,y)
plot_predictions(poly_kernel_svm, [-2.5,2.5,-2.5,2.5])
sns.scatterplot(x=X[:,0], y=X[:,1], hue=y)
plt.title('$d=10, r=100, C=5$')
Out[44]:
Text(0.5, 1.0, '$d=10, r=100, C=5$')

Radial kernel (Radial Basis Function, RBF)

$K(\mathbf{a},\mathbf{b})=\exp(-\gamma ||\mathbf{a}-\mathbf{b}||^2)$
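As with the polynomial kernel, this formula can be verified against scikit-learn's `rbf_kernel` helper:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

a = np.array([[0.0, 0.0]])
b = np.array([[1.0, 1.0]])
gamma = 0.5

manual = np.exp(-gamma * np.sum((a - b) ** 2))  # exp(-gamma * ||a-b||^2)
sk = rbf_kernel(a, b, gamma=gamma)
print(manual, sk[0, 0])  # both exp(-1), about 0.3679
```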

In [50]:
gammas = [0.1, 5]
Cs = [0.001, 1000]

for gamma in gammas:
    for C in Cs:
        rbf_kernel_svm_clf = SVC(kernel='rbf', gamma=gamma, C=C) # Definir
        rbf_kernel_svm_clf.fit(X,y) # Ajustar
        plot_predictions(rbf_kernel_svm_clf, [-2.5,2.5,-2.5,2.5])
        sns.scatterplot(x=X[:,0], y=X[:,1], hue=y)
        plt.title(r"$\gamma = {}, C={}$".format(gamma,C))
        plt.show()
In [ ]: