Nonparametric Two-Graph Testing

[1]:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(8888)

from graspy.inference import NonparametricTest
from graspy.embed import AdjacencySpectralEmbed
from graspy.simulations import sbm, rdpg
from graspy.utils import symmetrize
from graspy.plot import heatmap, pairplot

%matplotlib inline

Generate a stochastic block model graph

We generate a graph from a stochastic block model (SBM) with four blocks of 50 vertices each. Its adjacency matrix and its adjacency spectral embedding are shown below.

[2]:
n_components = 4 # the number of embedding dimensions for ASE
P = np.array([[0.9, 0.11, 0.13, 0.2],
              [0, 0.7, 0.1, 0.1],
              [0, 0, 0.8, 0.1],
              [0, 0, 0, 0.85]])

P = symmetrize(P)
csize = [50] * 4
A = sbm(csize, P)
X = AdjacencySpectralEmbed(n_components=n_components).fit_transform(A)
heatmap(A, title='4-block SBM adjacency matrix')
pairplot(X, title='4-block adjacency spectral embedding')
../../_images/tutorials_inference_nonpar_3_1.png
../../_images/tutorials_inference_nonpar_3_2.png
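To make the sampling step concrete, here is a minimal NumPy-only sketch of how an undirected, loopless SBM graph could be drawn from a block probability matrix. The helper name `sample_sbm` and the two-block example are illustrative assumptions, not graspy's implementation, and the library's `sbm` function may differ in its details.

```python
import numpy as np

def sample_sbm(block_sizes, block_probs, rng):
    """Sample an undirected, loopless SBM adjacency matrix (hypothetical helper)."""
    block_probs = np.asarray(block_probs, dtype=float)
    # Community label of every vertex, e.g. [0, 0, ..., 1, 1, ...]
    labels = np.repeat(np.arange(len(block_sizes)), block_sizes)
    # Edge probability for every vertex pair, looked up from the block matrix
    edge_probs = block_probs[np.ix_(labels, labels)]
    n = labels.size
    # Draw each potential edge once (strict upper triangle), then mirror it
    upper = np.triu(rng.random((n, n)) < edge_probs, k=1)
    return (upper | upper.T).astype(float)

rng = np.random.default_rng(0)
B = np.array([[0.9, 0.1],
              [0.1, 0.7]])  # symmetric block probability matrix
adj = sample_sbm([30, 30], B, rng)

# Vertices in the same block connect far more often than vertices across blocks
within_density = adj[:30, :30].mean()
between_density = adj[:30, 30:].mean()
```

Sampling only the strict upper triangle and mirroring it guarantees a symmetric matrix with an empty diagonal, which is what an undirected graph without self-loops requires.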

Nonparametric test where null is true

Now we want to know whether two graphs were generated from the same latent positions. Below, we sample a second graph from the same SBM, so we know that the null hypothesis is true. The test should therefore find that the differences between the two graphs (up to a rotation) are no greater than those expected by chance.

In other words, we are testing

\begin{align*} H_0:&\ X_1 = X_2\\ H_A:&\ X_1 \neq X_2 \end{align*}

and we want to see that the p-value for the nonparametric test is high, so that we fail to reject the null.
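The qualifier "up to a rotation" matters because, in a random dot product graph, the edge probabilities depend on the latent positions only through their inner products. The sketch below uses toy latent positions (made up for illustration) to show that applying an orthogonal transformation to the latent positions leaves the edge probability matrix unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy latent positions: 5 vertices in 2 dimensions, chosen so that
# all pairwise inner products are valid probabilities
X = rng.uniform(0.1, 0.6, size=(5, 2))

# A random orthogonal matrix, taken from the QR decomposition of a Gaussian matrix
Q, _ = np.linalg.qr(rng.normal(size=(2, 2)))

X_rot = X @ Q                 # transformed latent positions
P_orig = X @ X.T              # edge probabilities from the original positions
P_rot = X_rot @ X_rot.T       # edge probabilities from the transformed positions

same_probabilities = np.allclose(P_orig, P_rot)
```

Because both sets of positions generate the same graph distribution, a comparison of embeddings has to account for this ambiguity rather than compare coordinates directly.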

We generate a second SBM graph in the same way and run a nonparametric test on the pair. The test produces a distance between the two graphs (the test statistic) as well as a null distribution of distances obtained under permutation. The second graph and its embedding are shown below.

[3]:
A1 = sbm(csize, P)
heatmap(A1, title='4-block SBM adjacency matrix A1')
X1 = AdjacencySpectralEmbed(n_components=n_components).fit_transform(A1)
pairplot(X1, title='4-block adjacency spectral embedding A1')
../../_images/tutorials_inference_nonpar_5_1.png
../../_images/tutorials_inference_nonpar_5_2.png
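As a rough illustration of the idea behind the test, the sketch below runs a permutation test on two point clouds, which stand in for two graph embeddings. The choice of statistic (a biased kernel MMD estimate with a Gaussian kernel), the bandwidth, the add-one p-value correction, and all function names are assumptions made for this example. They are not taken from graspy, whose `NonparametricTest` may compute its statistic and null distribution differently.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2 * bandwidth ** 2))

def mmd_statistic(x, y, bandwidth=1.0):
    """Biased estimate of the squared maximum mean discrepancy."""
    return (gaussian_kernel(x, x, bandwidth).mean()
            + gaussian_kernel(y, y, bandwidth).mean()
            - 2 * gaussian_kernel(x, y, bandwidth).mean())

def permutation_test(x, y, n_permutations=200, rng=None):
    """Return the observed statistic, the permutation null, and a p-value."""
    rng = np.random.default_rng() if rng is None else rng
    observed = mmd_statistic(x, y)
    pooled = np.vstack([x, y])
    null = np.empty(n_permutations)
    for i in range(n_permutations):
        # Shuffle the pooled rows, then split them back into two groups
        shuffled = rng.permutation(pooled)
        null[i] = mmd_statistic(shuffled[:len(x)], shuffled[len(x):])
    # Add-one correction keeps the estimated p-value strictly positive
    p_value = (1 + np.sum(null >= observed)) / (1 + n_permutations)
    return observed, null, p_value

rng = np.random.default_rng(0)
same_a = rng.normal(0.0, 1.0, size=(100, 2))
same_b = rng.normal(0.0, 1.0, size=(100, 2))   # same distribution as same_a
shifted = rng.normal(1.5, 1.0, size=(100, 2))  # clearly different distribution

_, null_same, p_same = permutation_test(same_a, same_b, rng=rng)
_, null_diff, p_diff = permutation_test(same_a, shifted, rng=rng)
```

When the two clouds share a distribution, the observed statistic looks like a typical draw from the permutation null. When they differ, the observed statistic lands far in the right tail and the p-value is small.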

Plot of Null Distribution

We plot the null distribution in blue and the test statistic as a red vertical line. The test statistic is small relative to the null distribution, resulting in a p-value of 0.94. Thus, we cannot reject the null hypothesis that the two graphs come from the same generating distribution.

[4]:
nonpar = NonparametricTest()
p = nonpar.fit(A, A1)

fig, ax = plt.subplots(figsize=(10, 6))
ax.hist(nonpar.null_distribution_, 50)
ax.axvline(nonpar.sample_T_statistic_, color='r')
ax.set_title("P-value = {}".format(p), fontsize=20)
plt.show()
../../_images/tutorials_inference_nonpar_7_0.png

Nonparametric test where null is false

We generate a second SBM with different block probabilities and run a nonparametric test comparing the first graph, A, with the new one, A2.

[5]:
P2 = np.array([[0.8, 0.2, 0.2, 0.5],
               [0, 0.9, 0.3, 0.2],
               [0, 0, 0.5, 0.2],
               [0, 0, 0, 0.5]])

P2 = symmetrize(P2)
A2 = sbm(csize, P2)
heatmap(A2, title='4-block SBM adjacency matrix A2')
X2 = AdjacencySpectralEmbed(n_components=n_components).fit_transform(A2)
pairplot(X2, title='4-block adjacency spectral embedding A2')
../../_images/tutorials_inference_nonpar_9_1.png
../../_images/tutorials_inference_nonpar_9_2.png
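Each graph in this tutorial is embedded with adjacency spectral embedding (ASE) before it is plotted. As a sketch of what such an embedding computes, the example below builds one from an eigendecomposition using only NumPy. The function `ase_sketch` is a hypothetical name, and graspy's `AdjacencySpectralEmbed` may differ in its details, so treat this as an outline of the idea rather than the library's method.

```python
import numpy as np

def ase_sketch(adjacency, n_components):
    """Embed a symmetric matrix using its leading eigenpairs (illustrative only)."""
    eigvals, eigvecs = np.linalg.eigh(adjacency)
    # Keep the n_components eigenvalues that are largest in magnitude
    top = np.argsort(np.abs(eigvals))[::-1][:n_components]
    # Scale each kept eigenvector by the square root of its eigenvalue magnitude
    return eigvecs[:, top] * np.sqrt(np.abs(eigvals[top]))

# Noise-free check: embed an exact edge probability matrix of rank 2
rng = np.random.default_rng(0)
latent = rng.uniform(0.1, 0.6, size=(8, 2))   # toy latent positions
P_true = latent @ latent.T
X_hat = ase_sketch(P_true, n_components=2)

# The embedding reproduces the probabilities, although X_hat itself
# generally differs from `latent` by an orthogonal transformation
reconstructs = np.allclose(X_hat @ X_hat.T, P_true)
```

On a sampled adjacency matrix the reconstruction is only approximate, since the embedding then estimates the latent positions from noisy edges.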

Plot of Null Distribution

We plot the null distribution in blue and the test statistic as a red vertical line. The test statistic is large relative to the null distribution, resulting in a p-value of 0. Thus, we reject the null hypothesis that the two graphs come from the same generating distribution.

[6]:
nonpar = NonparametricTest()
p = nonpar.fit(A, A2)

fig, ax = plt.subplots(figsize=(10, 6))
ax.hist(nonpar.null_distribution_, 50)
ax.axvline(nonpar.sample_T_statistic_, color='r')
ax.set_title("P-value = {}".format(p), fontsize=20)
plt.show()
../../_images/tutorials_inference_nonpar_11_0.png