Layouts

NodePosition

class graspologic.layouts.NodePosition

Contains the node id, 2d coordinates, size, and community id for a node.

Create new instance of NodePosition(node_id, x, y, size, community)

node_id

Alias for field number 0

x

Alias for field number 1

y

Alias for field number 2

size

Alias for field number 3

community

Alias for field number 4

count(self, value, /)

Return number of occurrences of value.

index(self, value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.
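Since NodePosition is a named tuple, each field can be read either by name or by position. The sketch below uses a stand-in built with collections.namedtuple that mirrors the documented field order; the stand-in and its sample values are assumptions for illustration, not the library's own class.

```python
from collections import namedtuple

# Stand-in mirroring the documented field order; not the library class itself.
NodePosition = namedtuple("NodePosition", ["node_id", "x", "y", "size", "community"])

# Made-up sample values.
pos = NodePosition(node_id="a", x=1.5, y=-2.0, size=4.0, community=3)

# Fields are reachable by name or by tuple index.
by_name = (pos.x, pos.y)
by_index = (pos[1], pos[2])

# count() and index() are the ordinary tuple methods.
first_index_of_size = pos.index(4.0)  # size is field number 3
occurrences_of_three = pos.count(3)   # only the community field equals 3
```

Calling index() with a value that is not in the tuple raises ValueError, as noted above.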

Automatic Graph Layout

graspologic.layouts.layout_tsne(graph: networkx.classes.graph.Graph, perplexity: int, n_iter: int, max_edges: int = 10000000, random_seed: Union[int, NoneType] = None) → Tuple[networkx.classes.graph.Graph, List[graspologic.layouts.classes.NodePosition]]

Automatic graph layout generation by creating a generalized node2vec embedding, then using t-SNE for dimensionality reduction to 2d space.

By default, this function automatically attempts to prune each graph to a maximum of 10,000,000 edges by removing the lowest weight edges. This pruning is approximate: it leaves the graph with at most max_edges edges, but the result is not guaranteed to contain exactly max_edges.
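One way such pruning can end up approximate is a weight cutoff: every edge at or below the cutoff is dropped, so ties at the cutoff remove more edges than strictly necessary. The following is a minimal sketch under that assumption; the function name prune_lowest_weight_edges and the edge-list representation are hypothetical, and the library's actual pruning may work differently.

```python
from typing import List, Tuple

Edge = Tuple[str, str, float]  # (source, target, weight)


def prune_lowest_weight_edges(edges: List[Edge], max_edges: int) -> List[Edge]:
    """Keep at most max_edges edges by dropping everything at or below a cutoff weight."""
    if max_edges <= 0:
        return []
    if len(edges) <= max_edges:
        return list(edges)
    weights = sorted((weight for _, _, weight in edges), reverse=True)
    # Weight of the heaviest edge that has to go. Edges tied with it are dropped
    # as well, which is why the result can be smaller than max_edges.
    cutoff = weights[max_edges]
    return [edge for edge in edges if edge[2] > cutoff]


edges = [("a", "b", 5.0), ("b", "c", 2.0), ("c", "d", 2.0), ("d", "e", 1.0)]
pruned = prune_lowest_weight_edges(edges, max_edges=2)
```

In this example two edges share the cutoff weight, so only one edge survives even though two were allowed.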

In addition to pruning edges by weight, this function also only operates over the largest connected component in the graph.
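To illustrate what restricting a graph to its largest connected component means, the sketch below finds that component with a breadth-first search over a plain edge list. It is an illustration only: the function name and the edge-list input are assumptions, and it says nothing about how the library performs this step internally.

```python
from collections import defaultdict, deque
from typing import Dict, Hashable, Iterable, Set, Tuple


def largest_connected_component(
    edges: Iterable[Tuple[Hashable, Hashable]]
) -> Set[Hashable]:
    """Return the node set of the largest connected component of an undirected edge list."""
    adjacency: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen: Set[Hashable] = set()
    largest: Set[Hashable] = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from start.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in component:
                    component.add(neighbor)
                    queue.append(neighbor)
        seen |= component
        if len(component) > len(largest):
            largest = component
    return largest


component = largest_connected_component([(1, 2), (2, 3), (4, 5)])
```

Nodes outside the returned set (4 and 5 here) would receive no position in the layout.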

After dimensionality reduction, sizes are generated for each node based upon their degree centrality, and these sizes and positions are further refined by an overlap removal phase. Lastly, a global partitioning algorithm (graspologic.partition.leiden()) is executed for the largest connected component and the partition ID is included with each node position.
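Degree centrality is commonly defined as a node's degree divided by the number of other nodes (n - 1). The sketch below computes it from an edge list for a simple undirected graph. How centrality is converted into a node size is not described here, so the sketch stops at the centrality values; the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Tuple


def degree_centrality(
    edges: Iterable[Tuple[Hashable, Hashable]]
) -> Dict[Hashable, float]:
    """Degree divided by (n - 1) for each node of a simple undirected graph."""
    degree: Counter = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    if n <= 1:
        return {node: 0.0 for node in degree}
    return {node: count / (n - 1) for node, count in degree.items()}


# A star graph: "a" is connected to every other node.
centrality = degree_centrality([("a", "b"), ("a", "c"), ("a", "d")])
```

The hub of the star reaches the maximum centrality, while each leaf has one third of it.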

Parameters:

graph : networkx.Graph

The graph to generate a layout for. This graph may have edges pruned if the count is too high and only the largest connected component will be used to automatically generate a layout.

perplexity : int

The perplexity is related to the number of nearest neighbors that is used in other manifold learning algorithms. Larger datasets usually require a larger perplexity. Consider selecting a value between 4 and 100. Different values can produce significantly different results.

n_iter : int

Maximum number of iterations for the optimization. We have found in practice that larger graphs require more iterations. We hope to eventually have more guidance on the number of iterations based on the size of the graph and the density of the edge connections.

max_edges : int

The maximum number of edges to use when generating the embedding. Default is 10000000. The edges with the lowest weights will be pruned until at most max_edges exist. Warning: this pruning is approximate and more edges than are necessary may be pruned. When running in a 32-bit environment, you will most likely need to reduce this number or you will run out of memory.

random_seed : int

Seed to be used for reproducible results. Default is None, which produces a new random state. Specifying a seed provides consistent results between runs. In addition, the environment variable PYTHONHASHSEED must be set to control hash randomization.
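PYTHONHASHSEED is read when the interpreter starts, so it has to be set in the environment before Python launches; assigning it in os.environ inside an already running process does not change that process's hashing. The sketch below demonstrates the effect by starting two child interpreters with the same seed. The helper name string_hash_in_child is hypothetical.

```python
import os
import subprocess
import sys


def string_hash_in_child(seed: str) -> str:
    """Start a fresh interpreter with PYTHONHASHSEED set and report hash('node')."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('node'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Two separate interpreter runs with the same seed hash the string identically.
first = string_hash_in_child("0")
second = string_hash_in_child("0")
```

In practice this means exporting PYTHONHASHSEED in the shell, or in the launcher of the script that calls the layout function, in addition to passing random_seed.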

Returns:

Tuple[nx.Graph, List[NodePosition]]

The largest connected component and a list of NodePositions for each node in the largest connected component. Each NodePosition contains the node_id, x coordinate, y coordinate, size, and community.
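The returned list is often easier to work with once it is turned into dictionaries keyed by node id. The sketch below does this with a namedtuple stand-in for NodePosition and made-up positions, both of which are assumptions for illustration rather than output from the library.

```python
from collections import namedtuple

# Stand-in mirroring the documented NodePosition fields; not the library class.
NodePosition = namedtuple("NodePosition", ["node_id", "x", "y", "size", "community"])

# Made-up values standing in for the list returned by a layout function.
positions = [
    NodePosition("a", 0.0, 1.0, 3.0, 0),
    NodePosition("b", 2.5, -1.0, 1.0, 0),
    NodePosition("c", -4.0, 0.5, 1.0, 1),
]

# Node id -> (x, y) coordinates.
coordinates = {p.node_id: (p.x, p.y) for p in positions}

# Node id -> partition id, the Dict[Any, int] shape that categorical_colors takes.
partitions = {p.node_id: p.community for p in positions}
```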

References

[1] van der Maaten, L.J.P.; Hinton, G.E. Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9:2579-2605, 2008.

graspologic.layouts.layout_umap(graph: networkx.classes.graph.Graph, min_dist: float = 0.75, n_neighbors: int = 25, max_edges: int = 10000000, random_seed: Union[int, NoneType] = None) → Tuple[networkx.classes.graph.Graph, List[graspologic.layouts.classes.NodePosition]]

Automatic graph layout generation by creating a generalized node2vec embedding, then using UMAP for dimensionality reduction to 2d space.

By default, this function automatically attempts to prune each graph to a maximum of 10,000,000 edges by removing the lowest weight edges. This pruning is approximate: it leaves the graph with at most max_edges edges, but the result is not guaranteed to contain exactly max_edges.

In addition to pruning edges by weight, this function also only operates over the largest connected component in the graph.

After dimensionality reduction, sizes are generated for each node based upon their degree centrality, and these sizes and positions are further refined by an overlap removal phase. Lastly, a global partitioning algorithm (graspologic.partition.leiden()) is executed for the largest connected component and the partition ID is included with each node position.

Parameters:

graph : networkx.Graph

The graph to generate a layout for. This graph may have edges pruned if the count is too high and only the largest connected component will be used to automatically generate a layout.

min_dist : float

The effective minimum distance between embedded points. Default is 0.75. Smaller values will result in a more clustered/clumped embedding where nearby points on the manifold are drawn closer together, while larger values will result in a more even dispersal of points. The value should be set relative to the spread value, which determines the scale at which embedded points will be spread out.

n_neighbors : int

The size of local neighborhood (in terms of number of neighboring sample points) used for manifold approximation. Default is 25. Larger values result in more global views of the manifold, while smaller values result in more local data being preserved.

max_edges : int

The maximum number of edges to use when generating the embedding. Default is 10000000. The edges with the lowest weights will be pruned until at most max_edges exist. Warning: this pruning is approximate and more edges than are necessary may be pruned. When running in a 32-bit environment, you will most likely need to reduce this number or you will run out of memory.

random_seed : int

Seed to be used for reproducible results. Default is None and will produce random results.

Returns:

Tuple[nx.Graph, List[NodePosition]]

The largest connected component and a list of NodePositions for each node in the largest connected component. Each NodePosition contains the node_id, x coordinate, y coordinate, size, and community.

References

[1] McInnes, L.; Healy, J. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. ArXiv e-prints 1802.03426, 2018.
[2] Böhm, Jan Niklas; Berens, Philipp; Kobak, Dmitry. A Unifying Perspective on Neighbor Embeddings along the Attraction-Repulsion Spectrum. ArXiv e-prints 2007.08902v1, 17 Jul 2020.

Colors

graspologic.layouts.categorical_colors(partitions: Dict[Any, int], light_background: bool = True, theme_path: Union[str, NoneType] = None) → Dict[Any, str]

Generates a node -> color mapping based on the partitions provided.

The partitions are ordered by population descending, and a series of perceptually balanced, complementary colors are chosen in sequence.

If a theme_path is provided, it must be the path to a JSON file generated by Thematic; otherwise, the theme packaged with this library is used.

Colors will be different when selecting for a light background vs. a dark background, using the principles defined by Thematic.

If there are more partitions than available colors (100), the colors are cycled through again.
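The ordering and cycling described above can be sketched as follows. This is an illustration only: the function name assign_categorical_colors and the two-color palette are hypothetical, chosen small so that the cycling is visible, and the bundled theme is not reproduced here.

```python
from collections import Counter
from typing import Any, Dict, List


def assign_categorical_colors(
    partitions: Dict[Any, int], palette: List[str]
) -> Dict[Any, str]:
    """Color nodes by partition, largest partition first, cycling through the palette."""
    population = Counter(partitions.values())
    # most_common() orders partitions by population, descending.
    ordered = [partition for partition, _ in population.most_common()]
    color_for_partition = {
        partition: palette[rank % len(palette)]
        for rank, partition in enumerate(ordered)
    }
    return {node: color_for_partition[p] for node, p in partitions.items()}


# Hypothetical two-color palette; a third partition reuses the first color.
palette = ["#1f77b4", "#ff7f0e"]
partitions = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 2}
node_colors = assign_categorical_colors(partitions, palette)
```

Partition 0 is the most populous and takes the first color, partition 1 takes the second, and partition 2 wraps around to the first color again.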

Parameters:

partitions : Dict[Any, int]

A dictionary of node ids to partition ids.

light_background : bool

Default is True. Colors chosen for a light background differ slightly in hue and saturation from those chosen for a dark background, so that they complement the background in use.

theme_path : Optional[str]

A color scheme is provided with graspologic, but if you wish to use your own you can generate one with Thematic and provide the path to it to override the bundled theme.

Returns:

Dict[Any, str]

Returns a dictionary of node id -> color based on the partitions provided.

graspologic.layouts.sequential_colors(node_and_value: Dict[Any, float], light_background: bool = True, use_log_scale: bool = False, theme_path: Union[str, NoneType] = None) → Dict[Any, str]

Generates a node -> color mapping where a color is chosen for the value as it maps the value range into the sequential color space.

If a theme_path is provided, it must be the path to a JSON file generated by Thematic; otherwise, the theme packaged with this library is used.

Colors will be different when selecting for a light background vs. a dark background, using the principles defined by Thematic.

If there are more partitions than available colors (100), the colors are cycled through again.

Parameters:

node_and_value : Dict[Any, float]

A node to value mapping. The value is a single entry in a continuous range, which is then mapped into a sequential color space.

light_background : bool

Default is True. Colors chosen for a light background differ slightly in hue and saturation from those chosen for a dark background, so that they complement the background in use.

use_log_scale : bool

Default is False.

theme_path : Optional[str]

A color scheme is provided with graspologic, but if you wish to use your own you can generate one with Thematic and provide the path to it to override the bundled theme.

Returns:

Dict[Any, str]

Returns a dictionary of node id -> color based on the original value provided for the node as it relates to the total range of all values.
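Mapping a value range into a sequential color space can be sketched as below. The function name, the four-color palette, and the binning rule are all hypothetical. The behavior of use_log_scale is not described in this reference, so the sketch assumes it means taking the logarithm of each (strictly positive) value before the range is mapped; the library may behave differently.

```python
import math
from typing import Any, Dict, List


def assign_sequential_colors(
    node_and_value: Dict[Any, float],
    palette: List[str],
    use_log_scale: bool = False,
) -> Dict[Any, str]:
    """Map each value's position within the overall range onto an ordered palette."""
    if not node_and_value:
        return {}
    if use_log_scale:
        # Assumption: log scaling is applied to the values; they must be positive.
        scaled = {node: math.log(value) for node, value in node_and_value.items()}
    else:
        scaled = dict(node_and_value)
    low, high = min(scaled.values()), max(scaled.values())
    span = high - low
    colors = {}
    for node, value in scaled.items():
        fraction = 0.0 if span == 0 else (value - low) / span
        # Clamp so the maximum value lands in the last palette slot.
        index = min(int(fraction * len(palette)), len(palette) - 1)
        colors[node] = palette[index]
    return colors


# Hypothetical light-to-dark palette and values spanning three orders of magnitude.
palette = ["#c6dbef", "#6baed6", "#2171b5", "#08306b"]
values = {"a": 1.0, "b": 10.0, "c": 100.0, "d": 1000.0}
linear = assign_sequential_colors(values, palette)
logged = assign_sequential_colors(values, palette, use_log_scale=True)
```

On a linear scale the three smallest values collapse into the first color, while the log scale spreads the four values across all four colors. This is the usual reason to enable a log scale for heavily skewed values.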

Rendering

graspologic.layouts.save_graph(output_path: str, graph: networkx.classes.graph.Graph, positions: List[graspologic.layouts.classes.NodePosition], node_colors: Dict[Any, str], vertex_line_width: float = 0.01, vertex_alpha: float = 0.55, edge_line_width: float = 0.5, edge_alpha: float = 0.02, figure_width: float = 15.0, figure_height: float = 15.0, light_background: bool = True, vertex_shape: str = 'o', arrows: bool = False, dpi: int = 100)

Renders a graph to file.

Edges will be displayed with the same color as the source node.
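The edge coloring rule can be illustrated with a small sketch that derives one color per edge from the node color mapping. The function name and the sample data are hypothetical, and the check for missing entries reflects the requirement that node_colors cover every node; it is not a description of the library's own validation.

```python
from typing import Any, Dict, List, Tuple


def edge_colors_from_source(
    edges: List[Tuple[Any, Any]], node_colors: Dict[Any, str]
) -> List[str]:
    """Return one color per edge, taken from the edge's source node."""
    missing = {node for edge in edges for node in edge if node not in node_colors}
    if missing:
        raise KeyError(f"node_colors has no entry for: {sorted(missing, key=repr)}")
    return [node_colors[source] for source, _target in edges]


node_colors = {"a": "#ff0000", "b": "#00ff00", "c": "#0000ff"}
edges = [("a", "b"), ("b", "c")]
edge_colors = edge_colors_from_source(edges, node_colors)
```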

Parameters:

output_path : str

The output path to write the rendered graph to. Suggested file extension is .png.

graph : nx.Graph

The graph to be displayed. If the networkx Graph contains only nodes, no edges will be displayed.

positions : List[graspologic.layouts.NodePosition]

The positions for every node in the graph.

node_colors : Dict[Any, str]

A mapping of node id to colors. Must contain an entry for every node in the graph.

vertex_line_width : float

Line width of vertex outline. Default is 0.01.

vertex_alpha : float

Alpha (transparency) of vertices in visualization. Default is 0.55.

edge_line_width : float

Line width of edge. Default is 0.5.

edge_alpha : float

Alpha (transparency) of edges in visualization. Default is 0.02.

figure_width : float

Width of figure. Default is 15.0.

figure_height : float

Height of figure. Default is 15.0.

light_background : bool

Light background or dark background. Default is True.

vertex_shape : str

Matplotlib marker for the vertex shape. See https://matplotlib.org/api/markers_api.html for a list of allowed values. Default is 'o' (i.e., a circle).

arrows : bool

For directed graphs, if True, draw arrow heads. Default is False.

dpi : int

Dots per inch of the figure. Default is 100.

graspologic.layouts.show_graph(graph: networkx.classes.graph.Graph, positions: List[graspologic.layouts.classes.NodePosition], node_colors: Dict[Any, str], vertex_line_width: float = 0.01, vertex_alpha: float = 0.55, edge_line_width: float = 0.5, edge_alpha: float = 0.02, figure_width: float = 15.0, figure_height: float = 15.0, light_background: bool = True, vertex_shape: str = 'o', arrows: bool = False, dpi: int = 500)

Renders and displays a graph.

Attempts to display it via a platform-specific display library such as TkInter.

Edges will be displayed with the same color as the source node.

Parameters:

graph : nx.Graph

The graph to be displayed. If the networkx Graph contains only nodes, no edges will be displayed.

positions : List[graspologic.layouts.NodePosition]

The positions for every node in the graph.

node_colors : Dict[Any, str]

A mapping of node id to colors. Must contain an entry for every node in the graph.

vertex_line_width : float

Line width of vertex outline. Default is 0.01.

vertex_alpha : float

Alpha (transparency) of vertices in visualization. Default is 0.55.

edge_line_width : float

Line width of edge. Default is 0.5.

edge_alpha : float

Alpha (transparency) of edges in visualization. Default is 0.02.

figure_width : float

Width of figure. Default is 15.0.

figure_height : float

Height of figure. Default is 15.0.

light_background : bool

Light background or dark background. Default is True.

vertex_shape : str

Matplotlib marker for the vertex shape. See https://matplotlib.org/api/markers_api.html for a list of allowed values. Default is 'o' (i.e., a circle).

arrows : bool

For directed graphs, if True, draw arrow heads. Default is False.

dpi : int

Dots per inch of the figure. Default is 500.