Pipeline

Multigraph Embeddings

class graspy.pipeline.mug2vec(pass_to_ranks='simple-nonzero', omnibus_components=None, omnibus_n_elbows=2, cmds_components=None, cmds_n_elbows=2)

Multigraphs-2-vectors (mug2vec).

mug2vec is a sequence of three algorithms that learns a feature vector for each input graph.

Steps:

1. Pass to ranks - ranks all edge weights from smallest to largest, then normalizes the ranks by a constant.

2. Omnibus embedding - jointly learns a low-dimensional matrix representation for all graphs under the random dot product graph (RDPG) model.

3. Classical MDS (cMDS) - learns a feature vector for each graph by computing the Euclidean distance between each pair of graph embeddings from the omnibus embedding, followed by an eigendecomposition.
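Steps 2 and 3 can be illustrated with a minimal NumPy sketch. It is not graspy's implementation: the function names omnibus_embed and classical_mds are hypothetical, the graphs are assumed to be undirected (symmetric adjacency matrices), the omnibus matrix is assumed to use the averaged blocks (A_i + A_j) / 2, and a plain eigendecomposition stands in for whatever decomposition the library uses internally.

```python
import numpy as np

def omnibus_embed(graphs, d):
    """Jointly embed m undirected graphs; returns shape (m, n_vertices, d)."""
    m, n, _ = graphs.shape
    # Block (i, j) of the omnibus matrix is the average (A_i + A_j) / 2,
    # so each diagonal block is the graph A_i itself.
    blocks = (graphs[:, None] + graphs[None, :]) / 2      # (m, m, n, n)
    omni = blocks.transpose(0, 2, 1, 3).reshape(m * n, m * n)
    # Spectral embedding: keep the d eigenpairs largest in magnitude.
    eigvals, eigvecs = np.linalg.eigh(omni)
    top = np.argsort(np.abs(eigvals))[::-1][:d]
    latent = eigvecs[:, top] * np.sqrt(np.abs(eigvals[top]))
    return latent.reshape(m, n, d)

def classical_mds(embeddings, n_components):
    """Return one feature vector per graph plus the pairwise distances."""
    m = embeddings.shape[0]
    flat = embeddings.reshape(m, -1)
    # Euclidean distance between every pair of (flattened) graph embeddings.
    diff = flat[:, None, :] - flat[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Double-center the squared distances, then eigendecompose.
    centering = np.eye(m) - np.full((m, m), 1.0 / m)
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale, dist

rng = np.random.default_rng(0)
upper = np.triu(rng.random((4, 6, 6)), k=1)   # 4 toy weighted graphs, 6 nodes
graphs = upper + upper.transpose(0, 2, 1)     # symmetrize: undirected, loopless
latent = omnibus_embed(graphs, d=2)           # one (6, 2) matrix per graph
vectors, dist = classical_mds(latent, n_components=3)
```

With four graphs, three cMDS components are enough to reproduce the pairwise distances between the omnibus embeddings exactly, which makes this toy case easy to check by hand.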

Parameters:
pass_to_ranks: {'simple-nonzero' (default), 'simple-all', 'zero-boost'} string, or None
  • 'simple-nonzero'
    assigns ranks to all non-zero edges, settling ties using the average. Each rank is then scaled to \(\frac{rank(\text{non-zero edges})}{\text{total non-zero edges} + 1}\)
  • 'simple-all'
    assigns ranks to all non-zero edges, settling ties using the average. Each rank is then scaled to \(\frac{rank(\text{non-zero edges})}{n^2 + 1}\), where n is the number of nodes
  • 'zero-boost'
    preserves the edge weight for all 0s, but ranks the other edges as if the ranks of all 0 edges have been assigned. If there are 10 zero-valued edges, the lowest non-zero edge gets weight 11 / (number of possible edges). Ties are settled by the average of the weights that those edges would have received. The number of possible edges is determined by the type of graph (loopless or looped, directed or undirected).
  • None
    No pass to ranks applied.
omnibus_components, cmds_components : int or None, default = None

Desired dimensionality of the output data for the omnibus embedding and cMDS steps, respectively (referred to below as n_components). If "full", n_components must be <= min(X.shape). Otherwise, n_components must be < min(X.shape). If None, the optimal dimension is chosen by select_dimension using the corresponding n_elbows argument.

omnibus_n_elbows, cmds_n_elbows : int, optional, default = 2

If the corresponding components parameter (omnibus_components or cmds_components) is None, the optimal embedding dimension is computed using select_dimension with this number of elbows. Otherwise, ignored.
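The 'simple-nonzero' and 'simple-all' options of pass_to_ranks described above can be sketched in NumPy as follows. This is an illustration of the documented formulas, not the library's code; the helper names average_ranks and simple_ptr are hypothetical.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1 to len(values); tied values share their average rank."""
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)             # highest rank inside each tie group
    avg_rank = last_rank - (counts - 1) / 2   # mean of the ranks in the group
    return avg_rank[inverse]

def simple_ptr(adjacency, method="simple-nonzero"):
    """Replace non-zero edge weights by normalized ranks; zeros stay zero."""
    graph = np.asarray(adjacency, dtype=float).copy()
    nonzero = graph != 0
    ranks = average_ranks(graph[nonzero])
    if method == "simple-nonzero":
        normalizer = ranks.size + 1            # total non-zero edges + 1
    elif method == "simple-all":
        normalizer = graph.shape[0] ** 2 + 1   # n^2 + 1, n = number of nodes
    else:
        raise ValueError(f"unsupported method: {method!r}")
    graph[nonzero] = ranks / normalizer
    return graph

A = np.array([[0.0, 5.0, 2.0],
              [5.0, 0.0, 0.0],
              [2.0, 0.0, 0.0]])
ptr_nonzero = simple_ptr(A)               # weights 5 -> 0.7, weights 2 -> 0.3
ptr_all = simple_ptr(A, "simple-all")     # weights 5 -> 0.35, weights 2 -> 0.15
```

In this toy matrix the four non-zero entries receive average ranks 1.5 and 3.5, which are divided by 5 for 'simple-nonzero' and by 10 for 'simple-all'.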

Attributes:
omnibus_n_components_ : int

Equals the parameter omnibus_components. If omnibus_components was None, then equals the optimal embedding dimension.

cmds_n_components_ : int

Equals the parameter cmds_components. If cmds_components was None, then equals the optimal embedding dimension.

embeddings_ : array, shape (n_graphs, n_features)

Embeddings from the pipeline. Each graph is a point in n_features dimensions, where n_features is the dimensionality used in the cMDS step.

fit(self, graphs, y=None)

Computes a vector for each graph.

Parameters:
graphs : list of nx.Graph or ndarray, or ndarray

If list of nx.Graph, each Graph must contain same number of nodes. If list of ndarray, each array must have shape (n_vertices, n_vertices). If ndarray, then array must have shape (n_graphs, n_vertices, n_vertices).

y : Ignored
Returns:
self : returns an instance of self.
fit_transform(self, graphs, y=None)

Computes a vector for each graph.

Parameters:
graphs : list of nx.Graph or ndarray, or ndarray

If list of nx.Graph, each Graph must contain same number of nodes. If list of ndarray, each array must have shape (n_vertices, n_vertices). If ndarray, then array must have shape (n_graphs, n_vertices, n_vertices).

y : Ignored
Returns:
embeddings : the embeddings generated by fit (see embeddings_).
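A possible way to call the pipeline, assuming graspy is installed, is sketched below. The helpers make_graph_stack and embed_graphs are hypothetical and exist only for this example; the toy graphs are random and carry no meaning. The call to embed_graphs is left commented out so that the data preparation can be run on its own.

```python
import numpy as np

def make_graph_stack(n_graphs=6, n_vertices=10, seed=0):
    """Toy input: random undirected weighted graphs stacked in one ndarray."""
    rng = np.random.default_rng(seed)
    upper = np.triu(rng.random((n_graphs, n_vertices, n_vertices)), k=1)
    return upper + upper.transpose(0, 2, 1)

def embed_graphs(graphs):
    # Imported lazily so the data preparation above runs without graspy.
    from graspy.pipeline import mug2vec
    model = mug2vec(pass_to_ranks="simple-nonzero")
    return model.fit_transform(graphs)

graphs = make_graph_stack()   # ndarray of shape (n_graphs, n_vertices, n_vertices)
graph_list = list(graphs)     # equivalent input: list of (n_vertices, n_vertices) arrays
# embeddings = embed_graphs(graphs)   # uncomment when graspy is available
```

Both input forms shown here match the shapes that fit and fit_transform document for the graphs parameter.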
get_params(self, deep=True)

Get parameters for this estimator.

Parameters:
deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : mapping of string to any

Parameter names mapped to their values.

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Returns:
self
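The <component>__<parameter> convention can be illustrated with a toy class. ToyEstimator is a hypothetical stand-in written for this example; it only mimics the naming convention and is neither scikit-learn's nor graspy's implementation.

```python
class ToyEstimator:
    """Hypothetical estimator that mimics the get_params/set_params convention."""

    def __init__(self, n_components=None, inner=None):
        self.n_components = n_components
        self.inner = inner   # optional nested estimator

    def get_params(self, deep=True):
        params = {"n_components": self.n_components, "inner": self.inner}
        if deep and hasattr(self.inner, "get_params"):
            # Expose nested parameters under the "inner__" prefix.
            for key, value in self.inner.get_params(deep=True).items():
                params["inner__" + key] = value
        return params

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, sub_key = key.partition("__")
            if name not in ("n_components", "inner"):
                raise ValueError(f"invalid parameter: {key!r}")
            if sep:
                # "inner__x" is forwarded to the nested estimator as "x".
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self

outer = ToyEstimator(n_components=2, inner=ToyEstimator(n_components=5))
result = outer.set_params(n_components=3, inner__n_components=7)
params = outer.get_params()
```

Because set_params returns self, calls can be chained, and get_params(deep=True) reports the nested value under the key inner__n_components.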