Matching

Graph Matching

class graspologic.match.GraphMatch(n_init=1, init_method='barycenter', max_iter=30, shuffle_input=True, eps=0.1, gmp=True)[source]

This class solves the Graph Matching Problem and the Quadratic Assignment Problem (QAP) through an implementation of the Fast Approximate QAP Algorithm (FAQ) (these two problems are the same up to a sign change) [1].

This algorithm can be thought of as finding an alignment of the vertices of two graphs which minimizes the number of induced edge disagreements, or, in the case of weighted graphs, the sum of squared differences of edge weight disagreements. The option to add seeds (known vertex correspondence between some nodes) is also available [2].

Parameters:

n_init : int, positive (default = 1)

Number of random initializations of the starting permutation matrix that the FAQ algorithm will undergo. n_init automatically set to 1 if init_method = 'barycenter'

init_method : string (default = 'barycenter')

The initial position chosen

"barycenter" : the non-informative “flat doubly stochastic matrix,” \(J=1*1^T /n\) , i.e the barycenter of the feasible region

"rand" : some random point near \(J, (J+K)/2\), where K is some random doubly stochastic matrix

max_iter : int, positive (default = 30)

Integer specifying the max number of Franke-Wolfe iterations. FAQ typically converges with modest number of iterations.

shuffle_input : bool (default = True)

Gives users the option to shuffle the nodes of A matrix to avoid results from inputs that were already matched.

eps : float (default = 0.1)

A positive, threshold stopping criteria such that FW continues to iterate while Frobenius norm of \((P_{i}-P_{i+1}) > eps\)

gmp : bool (default = True)

Gives users the option to solve QAP rather than the Graph Matching Problem (GMP). This is accomplished through trivial negation of the objective function.

Attributes:

perm_inds_ : array, size (n,) where n is the number of vertices in the fitted graphs.

The indices of the optimal permutation (with the fixed seeds given) on the nodes of B, to best minimize the objective function \(f(P) = trace(A^T PBP^T )\).

score_ : float

The objective function value of for the optimal permutation found.

n_iter_ : int

Number of Frank-Wolfe iterations run. If n_init > 1, n_iter_ reflects the number of iterations performed at the initialization returned.

References

[Refcf056df51c-1]J.T. Vogelstein, J.M. Conroy, V. Lyzinski, L.J. Podrazik, S.G. Kratzer, E.T. Harley, D.E. Fishkind, R.J. Vogelstein, and C.E. Priebe, “Fast approximate quadratic programming for graph matching,” PLOS one, vol. 10, no. 4, p. e0121002, 2015.
[Refcf056df51c-2]D. Fishkind, S. Adali, H. Patsolic, L. Meng, D. Singh, V. Lyzinski, C. Priebe, Seeded graph matching, Pattern Recognit. 87 (2019) 203–215
fit(self, A, B, seeds_A=[], seeds_B=[])[source]

Fits the model with two assigned adjacency matrices

Parameters:

A : 2d-array, square, positive

A square adjacency matrix

B : 2d-array, square, positive

A square adjacency matrix

seeds_A : 1d-array, shape (m , 1) where m <= number of nodes (default = [])

An array where each entry is an index of a node in A.

seeds_B : 1d-array, shape (m , 1) where m <= number of nodes (default = [])

An array where each entry is an index of a node in B The elements of seeds_A and seeds_B are vertices which are known to be matched, that is, seeds_A[i] is matched to vertex seeds_B[i].

Returns:
self : returns an instance of self
fit_predict(self, A, B, seeds_A=[], seeds_B=[])[source]

Fits the model with two assigned adjacency matrices, returning optimal permutation indices

Parameters:

A : 2d-array, square, positive

A square, positive adjacency matrix

B : 2d-array, square, positive

A square, positive adjacency matrix

seeds_A : 1d-array, shape (m , 1) where m <= number of nodes (default = [])

An array where each entry is an index of a node in A.

seeds_B : 1d-array, shape (m , 1) where m <= number of nodes (default = [])

An array where each entry is an index of a node in B The elements of seeds_A and seeds_B are vertices which are known to be matched, that is, seeds_A[i] is matched to vertex seeds_B[i].

Returns:

perm_inds_ : 1-d array, some shuffling of [0, n_vert)

The optimal permutation indices to minimize the objective function

get_params(self, deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : mapping of string to any

Parameter names mapped to their values.

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters:

**params : dict

Estimator parameters.

Returns:

self : object

Estimator instance.

Sinkhorn-Knopp Algorithm

class graspologic.match.SinkhornKnopp(max_iter=1000, epsilon=0.001)[source]

Sinkhorn Knopp Algorithm Takes a non-negative square matrix P, where P =/= 0 and iterates through Sinkhorn Knopp's algorithm to convert P to a doubly stochastic matrix. Guaranteed convergence if P has total support [1]:

Parameters:

max_iter : int, default=1000

The maximum number of iterations.

epsilon : float, default=1e-3

Metric used to compute the stopping condition, which occurs if all the row and column sums are within epsilon of 1. This should be a very small value. Epsilon must be between 0 and 1.

Attributes:

_max_iter : int, default=1000

User defined parameter. See above.

_epsilon : float, default=1e-3

User defined paramter. See above.

_stopping_condition: string

Either "max_iter", "epsilon", or None, which is a description of why the algorithm stopped iterating.

_iterations : int

The number of iterations elapsed during the algorithm's run-time.

_D1 : 2d-array

Diagonal matrix obtained after a stopping condition was met so that _D1.dot(P).dot(_D2) is close to doubly stochastic.

_D2 : 2d-array

Diagonal matrix obtained after a stopping condition was met so that _D1.dot(P).dot(_D2) is close to doubly stochastic.

References

[Rc787bb116b67-1]Sinkhorn, Richard & Knopp, Paul. (1967). "Concerning nonnegative matrices and doubly stochastic matrices," Pacific Journal of Mathematics. 21. 10.2140/pjm.1967.21.343.
fit(self, P)[source]

Fit the diagonal matrices in Sinkhorn Knopp's algorithm

Parameters:

P : 2d array-like

Must be a square non-negative 2d array-like object, that is convertible to a numpy array. The matrix must not be equal to 0 and it must have total support for the algorithm to converge.

Returns:
P_eps : A double stochastic matrix.