outputs

dsstools.outputs

GraphDescriptor(graph, include_defaults=True, round_floats_to=4, max_level=None) dataclass

This class provides a dataframe (~table) view of the given graph.

Every metric you add becomes its own column, and every node its own row. You can add custom metrics for more detailed analysis and save the dataframe as either a CSV or an XLSX document.

The naming hierarchy is as follows:
  • if activated, default metrics are always set first
  • if a custom metric has the same name as a default metric, the default values are replaced
  • if a node attribute has the same name as a default or custom metric in the dataframe, the node attribute gets the number of duplicates as a suffix
  • if two nodes have an attribute with the same name, it is treated as the same attribute and both values go into the same column

Parameters:

Name Type Description Default
graph Graph

The graph you want to save/analyse

required
include_defaults bool

The class adds betweenness, degree and centrality as default metrics for all nodes. Set this to False to deactivate this behaviour.

True
round_floats_to int

The class rounds every float to 4 decimal places by default. This keeps cells from growing too wide, which would make the data hard to analyse. Increase this value for more detail.

4
max_level int | None

If your nodes hold a nested structure (dict of dicts), this value defines how many levels deep the unpacking goes. The unpacked values become their own columns. If set to None, all levels are unpacked.

None

__create_dataframe()

Creates a dataframe view of a graph.

Every node has its own row (index) and every attribute its own column.

If not all nodes have the same attributes, 'None' is set as a placeholder value.
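The row/column layout described above can be illustrated without any library. This is a minimal sketch, not the class's actual implementation: `build_rows` and the sample `nodes` mapping are hypothetical names, and the input is assumed to be a plain mapping of node to attribute dict.

```python
def build_rows(node_attrs):
    """Return one row per node over a shared column set.

    Hypothetical helper: attributes a node lacks are filled with None.
    """
    # Collect every attribute name once, in first-seen order.
    columns = []
    for attrs in node_attrs.values():
        for key in attrs:
            if key not in columns:
                columns.append(key)
    # dict.get returns None for attributes a node does not have.
    return {
        node: {col: attrs.get(col) for col in columns}
        for node, attrs in node_attrs.items()
    }


nodes = {"a": {"color": "red", "size": 3}, "b": {"color": "blue"}}
rows = build_rows(nodes)
```

Here node `b` has no `size` attribute, so its row carries `None` in that column while still sharing the column with node `a`.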

__ensure_uniqueness(col_name)

Ensures that no node attribute overrides a metric column.

Warns the user if an attribute has the same name as a metric.

Parameters:

Name Type Description Default
col_name str

The name of the node attribute that needs to be checked.

required

Returns:

Type Description
str

A unique name for the attribute.
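One way such a uniqueness check could work is sketched below. The exact suffix format used by the class is not documented here, so the `_1`, `_2`, ... scheme, the function name `ensure_unique` and the `existing` parameter are all assumptions made for illustration.

```python
import warnings


def ensure_unique(col_name, existing):
    """Return col_name, or a suffixed variant if it collides with existing.

    Hypothetical sketch: the suffix scheme (name_1, name_2, ...) is assumed.
    """
    if col_name not in existing:
        return col_name
    warnings.warn(f"Attribute '{col_name}' has the same name as a metric column.")
    # Count upwards until a free name is found.
    suffix = 1
    while f"{col_name}_{suffix}" in existing:
        suffix += 1
    return f"{col_name}_{suffix}"


metric_columns = {"degree", "betweenness"}
unique_name = ensure_unique("degree", metric_columns)
untouched_name = ensure_unique("label", metric_columns)
```

With these inputs, the colliding attribute `degree` is renamed and a warning is emitted, while `label` passes through unchanged.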

__flatten_dict(flat_data, parent_key='', sep='.', level=0)

Flattens a nested dictionary up to a specified max depth.

If a dictionary is encountered at max depth, it is replaced with "PLACEHOLDER".

Parameters:

Name Type Description Default
flat_data dict

The dictionary to flatten.

required
parent_key str

The base key for nested keys.

''
sep str

Separator used for flattened keys.

'.'
level int

Current recursion depth.

0

Returns:

Type Description

A flattened dictionary.
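The flattening described above can be sketched as a short recursive function. This is an illustration under stated assumptions, not the class's private method: `max_level` is passed as an explicit argument here (in the class it comes from the constructor), and counting the top-level dictionary as level 0 is an assumption.

```python
def flatten_dict(data, parent_key="", sep=".", level=0, max_level=None):
    """Flatten nested dicts into one dict with sep-joined keys.

    Sketch only: a dict found at max_level is replaced with "PLACEHOLDER".
    """
    flat = {}
    for key, value in data.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            if max_level is not None and level >= max_level:
                # Depth limit reached: do not unpack any further.
                flat[new_key] = "PLACEHOLDER"
            else:
                flat.update(flatten_dict(value, new_key, sep, level + 1, max_level))
        else:
            flat[new_key] = value
    return flat


nested = {"name": "a", "meta": {"lang": "de", "geo": {"lat": 1.0}}}
fully_flat = flatten_dict(nested)
limited = flatten_dict(nested, max_level=1)
```

With `max_level=None` every level is unpacked into keys such as `meta.geo.lat`; with `max_level=1` the inner `geo` dictionary is replaced by the placeholder.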

add_custom_metrics(custom_metrics)

Allows you to add custom graph metrics by passing a dictionary of metric names and functions that operate on the graph.

Custom metrics will override default metrics if they are named the same.

Examples:

import networkx as nx
from dsstools.outputs import GraphDescriptor

def calculate_clustering(graph):
    return nx.clustering(graph)

# Note how some values must be wrapped in a dictionary first,
# else pandas will read them as NaN
def calculate_shortest_path_length(graph):
    return dict(nx.shortest_path_length(graph))

custom_metrics = {
    'Clustering': calculate_clustering,
    'Shortest path length': calculate_shortest_path_length,
    'Closeness': lambda graph: nx.closeness_centrality(graph)
}

# mygraph is an existing NetworkX graph
GraphDescriptor(graph=mygraph).add_custom_metrics(custom_metrics)

Parameters:

Name Type Description Default
custom_metrics dict[str, callable]

A dictionary where keys are metric names and values are functions that accept a NetworkX graph and return a dictionary of node-based metric values (otherwise values in the dataframe might be NaN).

required

Returns:

Type Description
'GraphDescriptor'

self
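The override rule (a custom metric replaces a default metric of the same name) and the requirement that each function returns a node-keyed dictionary can be sketched without NetworkX. Everything below is hypothetical: `compute_columns` is not part of the class, and the graph is modelled as a plain adjacency mapping for the sake of a self-contained example.

```python
def compute_columns(graph, default_metrics, custom_metrics):
    """Evaluate metric functions; custom metrics win on name clashes."""
    # Later entries in the merged dict replace earlier ones with the same key.
    metrics = {**default_metrics, **custom_metrics}
    # Each function must return {node: value} so values line up with rows.
    return {name: func(graph) for name, func in metrics.items()}


# Toy graph as an adjacency mapping (assumption, stands in for a real graph).
graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}

defaults = {"degree": lambda g: {n: len(nbrs) for n, nbrs in g.items()}}
custom = {
    # Same name as the default: replaces the raw degree with a normalised one.
    "degree": lambda g: {n: len(nbrs) / (len(g) - 1) for n, nbrs in g.items()},
    "is_leaf": lambda g: {n: len(nbrs) == 1 for n, nbrs in g.items()},
}

columns = compute_columns(graph, defaults, custom)
```

The resulting `degree` column holds the normalised values from the custom function, and `is_leaf` is added as a new column.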

write_file(save_path, *, excel_engine='openpyxl')

Saves the dataframe at the given location in the provided format.

The file format is determined dynamically from the path suffix.

Parameters:

Name Type Description Default
save_path str | Path

The save location; its suffix determines the format.

required
excel_engine str

The engine used for saving the file in XLSX format. Defaults to 'openpyxl', which must be installed for this to work.

'openpyxl'

Returns:

Type Description
'GraphDescriptor'

self
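Choosing the format from the path suffix could look roughly like the sketch below. The helper `detect_format` is hypothetical, and treating any suffix other than `.csv` and `.xlsx` as an error is an assumption; the class's actual behaviour for other suffixes is not documented here.

```python
from pathlib import Path


def detect_format(save_path):
    """Map a save path to 'csv' or 'xlsx' based on its suffix (sketch)."""
    suffix = Path(save_path).suffix.lower()
    if suffix == ".csv":
        return "csv"
    if suffix == ".xlsx":
        return "xlsx"
    # Assumption: unknown suffixes are rejected rather than silently guessed.
    raise ValueError(f"Unsupported file suffix: {suffix!r}")


csv_format = detect_format("results/nodes.csv")
xlsx_format = detect_format(Path("results/nodes.xlsx"))
```

In pandas, the two branches would typically correspond to `DataFrame.to_csv` and `DataFrame.to_excel`, the latter accepting an `engine` argument such as `'openpyxl'`.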