Nomination¶
Spectral Vertex Nomination¶

class
graspologic.nominate.
SpectralVertexNomination
(input_graph: bool = True, embedder: Union[str, graspologic.embed.base.BaseSpectralEmbed] = 'ase', n_neighbors=None, metric: str = 'euclidean', metric_params: dict = None)[source]¶ Class for spectral vertex nomination on a single graph.
Given a graph \(G=(V,E)\) and a subset of \(V\) called \(S\) (the "seed"), Single Graph Vertex Nomination is the problem of ranking all \(V\) in order of relation to members of \(S\). Spectral Vertex Nomination solves this problem by embedding \(G\) into a low dimensional euclidean space (tutorials), and then generating a nomination list by some distance based algorithm. In the simple unattributed case, for each seed vertex \(u\), the other vertices are ranked in order of euclidean distance from \(u\).
Parameters: input_graph: bool, default = True
Flag whether to expect two full graphs, or the embeddings.
 True
 .fit and .fit_predict() expect graphs as adjacency matrix, provided as
ndarray of shape (n, n). They will be embedded using the specified
embedder
.
 False
 .fit() and .fit_predict() expect an embedding of the graph, i.e. a ndarray of size (n, d).
embedder: str or BaseEmbed, default = 'ASE'
May provide either a embed object or a string indicating which embedding method to use, which may be either: "ASE" for
AdjacencySpectralEmbed
or "LSE" forLaplacianSpectralEmbed
.n_neighbors: int, default=None
The number of vertices to nominate for each seed.
metric : str, default = 'euclidean'
Distance metric to use when finding the nearest neighbors, all sklearn metrics available.
metric_params : dict, default = None
Arguments for the sklearn DistanceMetric specified via metric parameter.
Attributes: nearest_neighbors_ : sklearn.neighbors.NearestNeighbors
A fit sklearn NearestNeighbors classifier used to find closest vertices to each seed.
References
[R0bdc65bd2adf1] Fishkind, D. E.; Lyzinski, V.; Pao, H.; Chen, L.; Priebe, C. E. Vertex nomination schemes for membership prediction. Ann. Appl. Stat. 9 2015. https://projecteuclid.org/euclid.aoas/1446488749 [R0bdc65bd2adf2] Jordan Yoder, Li Chen, Henry Pao, Eric Bridgeford, Keith Levin, Donniell E. Fishkind, Carey Priebe, Vince Lyzinski, Vertex nomination: The canonical sampling and the extended spectral nomination schemes, Computational Statistics & Data Analysis, Volume 145, 2020. http://www.sciencedirect.com/science/article/pii/S0167947320300074 
fit
(self, X: numpy.ndarray, y=None)[source]¶ Constructs the embedding if not provided, then calculates the pairwise distance from each seed to each vertex in graph.
Parameters: X : np.ndarray
 If input_graph is True
 Expects a graph as an adjacency matrix, i.e. an ndarray of shape (n, n). Will be embedded using the specified embedder.
 If input_graph is False
 Expects an embedding of the graph, i.e. a ndarray of size (n, d).
y : NoneType
Included by sklearn convention.
Returns: self : object
Returns an instance of self.

predict
(self, y: numpy.ndarray) → Tuple[numpy.ndarray, numpy.ndarray][source]¶ Nominates vertices for each seed vertex. Methodology is distance based ranking.
Parameters: y : np.ndarray
The indices of the seed vertices. Should be a dim 1 array with length less than \(V\).
Returns: Nomination List : np.ndarray
Shape is
(number_vertices, number_vertices_in_seed)
. Each column is a seed vertex, and the rows of each column are a list of vertex indexes from the original adjacency matrix in order degree of match.Distance Matrix : np.ndarray
The matrix of distances associated with each element of the nomination list.

fit_predict
(self, X: numpy.ndarray, y: numpy.ndarray) → Tuple[numpy.ndarray, numpy.ndarray][source]¶ Calls this class' fit and then predict methods.
Parameters: X : np.ndarray
 If input_graph is True
 Expects a graph as an adjacency matrix, i.e. an ndarray of shape (n, n). Will be embedded using the specified embedder.
 If input_graph is False
 Expects an embedding of the graph, i.e. a ndarray of size (n, d).
y : np.ndarray.
List of unattributed seed vertex indices.
Returns: Nomination List : np.ndarray
Shape is
(number_vertices, number_vertices_in_seed)
. Each column is a seed vertex, and the rows of each column are a list of vertex indexes from the original adjacency matrix in order degree of match.Distance Matrix : np.ndarray
The matrix of distances associated with each element of the nomination list.

get_params
(self, deep=True)¶ Get parameters for this estimator.
Parameters: deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params : dict
Parameter names mapped to their values.

set_params
(self, **params)¶ Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as
Pipeline
). The latter have parameters of the form<component>__<parameter>
so that it's possible to update each component of a nested object.Parameters: **params : dict
Estimator parameters.
Returns: self : estimator instance
Estimator instance.
Vertex Nomination via SGM¶

class
graspologic.nominate.
VNviaSGM
(order_voi_subgraph=1, order_seeds_subgraph=1, n_init=100, max_nominations=None, graph_match_kws={})[source]¶ This class implements Vertex Nomination via Seeded Graph Matching (VNviaSGM) with the algorithm described in [1].
Rather than providing a 11 matching for the vertices of two graphs, as in
GraphMatch
, VNviaSGM ranks the potential matches for a vertex of interst (VOI) in one to graph to the vertices in another graph, based on probability of matching.Parameters: order_voi_subgraph: int, positive (default = 1)
Order used to create induced subgraph on
A
about VOI where the max distance between VOI and other nodes isorder_voi_subgraph
. This induced subgraph will be used to determine what seeds are used when the SGM algorithm is called. If no seeds are in this subgraph about VOI, then a UserWarning is thrown, andnomination_list_
is None.order_seeds_subgraph: int, positive (default = 1)
Order used to create induced subgraphs on
A
andB
. These subgraphs are centered about the seeds that were determined by the subgraph generated byorder_voi_subgraph
. These two subgraphs will be passed into the SGM algorithm.n_init: int, positive (default = 100)
Number of random initializations of the seeded graph matching algorithm (SGM). Increasing the number of restarts will make the probabilities returned more precise.
max_nominations: int (default = None)
Max number of nominations to include in the nomination list. If None is passed, then all nominations computed will be returned.
graph_match_kws : dict (default = {})
Gives users the option to pass custom arguments to the graph matching algorithm. Format should be {'arg_name': arg_value, ...}. See
GraphMatch
Attributes: n_seeds_: int
Number of seeds passed in seedsA that occured in the induced subgraph about VOI
nomination_list_: 2darray
An array containing vertex nominations in the form nomination list = [[j, p_val],...] where p_val is the probability that the VOI matches to node j in graph B (sorted by descending probability)
Notes
VNviaSGM generates an initial induced subgraph about the VOI to determine which seeds are close enough to be used. If no seeds are close enough, then a warning is thrown and
nomination_list_
is set to None.All the seeds that are close enough are then used to generate subgraphs in both
A
andB
. These subgraphs are matched using several random initializations of the seeded graph matching algorithm (SGM), and a nomination list is returned. SeeGraphMatch
for SGM docsReferences
[R9555ba47fdc01] Patsolic, HG, Park, Y, Lyzinski, V, Priebe, CE. Vertex nomination via seeded graph matching. Stat Anal Data Min: The ASA Data Sci Journal. 2020; 13: 229– 244. https://doi.org/10.1002/sam.11454 
fit
(self, A, B, voi, seeds)[source]¶ Fits the model to two graphs.
Parameters: A: 2darray, square
Adjacency matrix, the graph where
voi
is knownB: 2darray, square
Adjacency matrix, the graph where
voi
is not knownvoi: int
Vertex of interest
seeds: list, 2darray
List of length two, of form [seedsA, seedsB]. The elements of seedsA and seedsB are vertex indices from
A
andB
, respectively, which are known to be matched; that is, vertex seedsA[i] is matched to vertex seedsB[i]. Note: len(seedsA)==len(seedsB).Returns:  self: An instance of self

fit_predict
(self, A, B, voi, seeds)[source]¶ Fits model to two adjacency matrices and returns nomination list
Parameters: A: 2darray, square
Adjacency matrix, the graph where
voi
is knownB: 2darray, square
Adjacency matrix, the graph where
voi
is not knownvoi: int
Vertex of interest
seeds: list, 2darray
List of length two, of form [seedsA, seedsB]. The elements of seedsA and seedsB are vertex indices from
A
andB
, respectively, which are known to be matched; that is, vertex seedsA[i] is matched to vertex seedsB[i]. Note: len(seedsA)==len(seedsB).Returns: nomination_list_ : 2darray
The nomination list.

get_params
(self, deep=True)¶ Get parameters for this estimator.
Parameters: deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params : dict
Parameter names mapped to their values.

set_params
(self, **params)¶ Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as
Pipeline
). The latter have parameters of the form<component>__<parameter>
so that it's possible to update each component of a nested object.Parameters: **params : dict
Estimator parameters.
Returns: self : estimator instance
Estimator instance.
