outputs
dsstools.outputs
GraphDescriptor(graph, include_defaults=True, round_floats_to=4, max_level=None)
dataclass
This class provides a dataframe (~table) view of the given graph.
Every metric you add is its own column and every node its own row. You can add custom metrics for more detailed analysis and save the dataframe as either a CSV or an XLSX document.
The naming hierarchy is as follows:
- if activated, default metrics are always set first
- if a custom metric has the same name as a default metric, the default values are replaced
- if a node attribute has the same name as a default or custom metric in the dataframe, the node attribute gets the number of duplicates as a suffix
- if two nodes have an attribute with the same name, it is treated as the same attribute and their individual values go in the same column
Parameters:
Name | Type | Description | Default |
---|---|---|---|
graph | Graph | The graph you want to save or analyse. | required |
include_defaults | bool | The class adds betweenness, degree and centrality as default metrics for all nodes. Set this to False to deactivate this behaviour. | True |
round_floats_to | int | The number of decimal places floats are rounded to. The default of 4 keeps cells from growing too large, which would make the data hard to analyse. Increase this value for more detail. | 4 |
max_level | int \| None | If your nodes hold a nested structure (dict of dicts), this value defines how deep the unpacking goes. The unpacked values become their own columns. If set to None, all values are unpacked. | None |
__create_dataframe()
Creates a dataframe view of a graph.
Every node has its own row (index) and every attribute its own column.
If not all nodes have the same attributes, None is set as a placeholder for the missing values.
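The row-per-node, column-per-attribute layout described above can be sketched with plain dictionaries. This is only an illustration, not the library's implementation: `node_table` is a hypothetical helper, and the real class builds a pandas dataframe from a graph rather than a dict of dicts.

```python
def node_table(nodes: dict) -> dict:
    """Build a row-per-node view; attributes a node lacks become None.

    `nodes` maps a node name to its attribute dict (hypothetical input shape).
    """
    # Collect every attribute name seen on any node, in first-seen order.
    columns = []
    for attrs in nodes.values():
        for key in attrs:
            if key not in columns:
                columns.append(key)
    # Every row gets every column; dict.get returns None for missing attributes.
    return {
        node: {col: attrs.get(col) for col in columns}
        for node, attrs in nodes.items()
    }


table = node_table({"a": {"color": "red"}, "b": {"size": 3}})
```

Here both rows end up with a `color` and a `size` column, and the attribute a node does not define is filled with None.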
__ensure_uniqueness(col_name)
Ensures that no node attribute overrides a metric column.
Warns the user if an attribute has the same name as a metric.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
col_name | str | The name of the node attribute that needs to be checked. | required |
Returns:
Type | Description |
---|---|
str | A unique name for the attribute. |
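A minimal sketch of this kind of uniqueness check, under stated assumptions: the helper name `ensure_unique`, the explicit `existing` argument, and the `_1`, `_2` suffix format are all hypothetical. The source only says that a clashing attribute gets the number of duplicates as a suffix and that the user is warned.

```python
import warnings


def ensure_unique(col_name: str, existing: set) -> str:
    """Return col_name, or a suffixed variant if it clashes with `existing`."""
    if col_name not in existing:
        return col_name
    warnings.warn(
        f"Attribute '{col_name}' has the same name as an existing column "
        "and will be renamed."
    )
    # Assumed suffix scheme: count upwards until the name is free.
    suffix = 1
    while f"{col_name}_{suffix}" in existing:
        suffix += 1
    return f"{col_name}_{suffix}"
```

Counting upwards until a free name is found keeps metric columns untouched while still preserving the node attribute under a predictable name.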
__flatten_dict(flat_data, parent_key='', sep='.', level=0)
Flattens a nested dictionary up to a specified max depth.
If a dictionary is encountered at max depth, it is replaced with "PLACEHOLDER".
Parameters:
Name | Type | Description | Default |
---|---|---|---|
flat_data | dict | The dictionary to flatten. | required |
parent_key | str | The base key for nested keys. | '' |
sep | str | Separator used for flattened keys. | '.' |
level | int | Current recursion depth. | 0 |
Returns:
Type | Description |
---|---|
dict | A flattened dictionary. |
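The depth-limited flattening can be sketched as a standalone function. This is an illustrative version, not the class's private method: here `max_level` is passed explicitly (the class presumably reads it from its own configuration), and the exact depth at which the cut-off applies is an assumption.

```python
def flatten_dict(data, parent_key="", sep=".", level=0, max_level=None):
    """Flatten nested dicts into one level, joining keys with `sep`.

    A dict found once `level` has reached `max_level` is not unpacked
    further and is replaced with the string "PLACEHOLDER".
    """
    flat = {}
    for key, value in data.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            if max_level is not None and level >= max_level:
                flat[new_key] = "PLACEHOLDER"
            else:
                flat.update(
                    flatten_dict(value, new_key, sep, level + 1, max_level)
                )
        else:
            flat[new_key] = value
    return flat


nested = {"a": {"b": {"c": 1}}, "x": 2}
full = flatten_dict(nested)                 # unpack everything
limited = flatten_dict(nested, max_level=1)  # stop one level down
```

With no limit, the nested value ends up under the key `a.b.c`; with `max_level=1` in this sketch, the inner dict under `a.b` is replaced by the placeholder.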
add_custom_metrics(custom_metrics)
Allows you to add custom graph metrics by passing a dictionary of metric names and functions that operate on the graph.
Custom metrics will override default metrics if they are named the same.
Examples:
import networkx as nx
from dsstools.outputs import GraphDescriptor

def calculate_clustering(graph):
    return nx.clustering(graph)

# Note how some values must be wrapped in a dictionary first,
# else pandas will read them as NaN
def calculate_shortest_path_length(graph):
    return dict(nx.shortest_path_length(graph))

custom_metrics = {
    'Clustering': calculate_clustering,
    'Shortest path length': calculate_shortest_path_length,
    'Closeness': lambda graph: nx.closeness_centrality(graph)
}

# mygraph is an existing NetworkX graph
GraphDescriptor(graph=mygraph).add_custom_metrics(custom_metrics)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
custom_metrics | dict[str, callable] | A dictionary where keys are metric names and values are functions that accept a NetworkX graph and return a dictionary of node-based metric values (otherwise values in the dataframe might be NaN). | required |
Returns:
Type | Description |
---|---|
'GraphDescriptor' | self |
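The override rule (a custom metric replaces a default metric of the same name) can be illustrated without NetworkX or pandas. Everything in this sketch is a hypothetical stand-in: the graph is a plain adjacency dict, and `apply_metrics` is not part of the library.

```python
def degree(graph):
    """Default-style metric: number of neighbours per node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def apply_metrics(graph, defaults, custom):
    """Evaluate every metric on the graph and return one row per node."""
    # Later keys win, so a custom metric overrides a default of the same name.
    metrics = {**defaults, **custom}
    columns = {name: fn(graph) for name, fn in metrics.items()}
    # A metric that returns no value for a node leaves None in that cell.
    return {
        node: {name: values.get(node) for name, values in columns.items()}
        for node in graph
    }


graph = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
rows = apply_metrics(
    graph,
    defaults={"degree": degree},
    custom={
        "degree": lambda g: {n: 0 for n in g},  # overrides the default
        "is_hub": lambda g: {n: len(nb) > 1 for n, nb in g.items()},
    },
)
```

The sketch also shows why each metric function should return a per-node dictionary: values are looked up by node, so anything not keyed by node cannot be placed in a row.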
write_file(save_path, *, excel_engine='openpyxl')
Saves the dataframe at the given location.
The file format is determined from the suffix of the path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
save_path | str \| Path | The saving location; its suffix determines the format. | required |
excel_engine | str | The engine used for saving the file in xlsx format. 'openpyxl' must be installed for the default to work. | 'openpyxl' |
Returns:
Type | Description |
---|---|
'GraphDescriptor' | self |
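Choosing the output format from the path suffix can be sketched as follows. The helper `pick_format`, the case-insensitive matching, and the ValueError for unsupported suffixes are assumptions made for illustration; the source does not say how the class handles other suffixes.

```python
from pathlib import Path

# Suffixes named in the class description; anything else is rejected here.
SUPPORTED = {".csv": "csv", ".xlsx": "xlsx"}


def pick_format(save_path):
    """Map a str or Path to 'csv' or 'xlsx' based on its suffix."""
    suffix = Path(save_path).suffix.lower()  # assumption: ignore case
    if suffix not in SUPPORTED:
        raise ValueError(f"Unsupported file suffix: {suffix!r}")
    return SUPPORTED[suffix]
```

Since both `add_custom_metrics` and `write_file` return self, calls such as `GraphDescriptor(graph=mygraph).add_custom_metrics(custom_metrics).write_file("nodes.csv")` can presumably be chained.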