inputs
dsstools.inputs
Copyright (C) 2024 dssTools Developers David Seseke david.seseke@uni-hamburg.de Katherine Shay katherine.shay@studium.uni-hamburg.de Professur Digitale Sozialwissenschaften Universität Hamburg
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see https://www.gnu.org/licenses/.
clean_graph_data_attributes(graph)
Replace empty strings in data attributes with np.nan.
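The replacement can be pictured with a minimal sketch. It assumes that "data attributes" refers to node attributes, which the source does not state, and `replace_empty_strings` is a hypothetical helper, not the dsstools implementation.

```python
import networkx as nx
import numpy as np


def replace_empty_strings(graph: nx.DiGraph) -> nx.DiGraph:
    """Hypothetical helper: set empty-string node attributes to np.nan in place."""
    for _, attrs in graph.nodes(data=True):
        for key, value in attrs.items():
            # Only exact empty strings are replaced; other values stay untouched.
            if isinstance(value, str) and value == "":
                attrs[key] = np.nan
    return graph


g = nx.DiGraph()
g.add_node("a.example", category="news")
g.add_node("b.example", category="")
replace_empty_strings(g)
```

Using `np.nan` rather than an empty string lets later pandas or NumPy steps treat the value as missing.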
import_attributes_from_csv(graph, filepath, import_columns, index_label='', cleanup_functions=None)
Import attributes from a CSV file with some cleanup.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
graph | DiGraph | Graph to which the data should be applied. | required |
filepath | str | Path of the CSV file. | required |
import_columns | list[str] | Columns to be imported, can be None. | required |
index_label | | Column name used as index, defaults to first column. | '' |
cleanup_functions | | Functions to be applied on the DataFrame. | None |
Returns:
Type | Description |
---|---|
DiGraph | nx.DiGraph: Graph with the applied data. |
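As a rough illustration of what such an import involves, the sketch below reads a CSV with pandas and copies selected columns onto matching nodes. `attach_csv_attributes` and the sample column names are hypothetical, and the actual dsstools cleanup steps are not reproduced here.

```python
import io

import networkx as nx
import pandas as pd


def attach_csv_attributes(graph, csv_source, import_columns=None, index_label=""):
    """Hypothetical helper: copy CSV columns onto matching graph nodes."""
    # Fall back to the first column as index when no label is given.
    index_col = index_label if index_label else 0
    frame = pd.read_csv(csv_source, index_col=index_col)
    if import_columns is not None:
        frame = frame[import_columns]
    # Only rows whose index value is a node in the graph are used.
    frame = frame[frame.index.isin(list(graph.nodes))]
    for column in frame.columns:
        nx.set_node_attributes(graph, frame[column].to_dict(), name=column)
    return graph


csv_text = (
    "domain,category,country\n"
    "a.example,news,DE\n"
    "b.example,blog,FR\n"
    "c.example,shop,US\n"
)
g = nx.DiGraph([("a.example", "b.example")])
attach_csv_attributes(g, io.StringIO(csv_text), import_columns=["category"])
```

Rows for nodes that are not in the graph (here `c.example`) are ignored, and columns left out of `import_columns` are not attached.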
import_from_dsscode(slug, snapshot, cache=True, remove_selfloops=True, contract_redirects=False, explicit_include=False)
Import Graph object from dssCode.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
slug | str | Name slug of the project (see dssCode-Interface). | required |
snapshot | str | Snapshot hash. | required |
cache | (bool, Path, str) | Cache directory to use. Defaults to a temporary directory. | True |
remove_selfloops | bool | Remove self-loop edges. | True |
contract_redirects | bool | Contract redirecting nodes into one. | False |
explicit_include | bool | Include only explicitly marked nodes in the graph. | False |
Returns:
Type | Description |
---|---|
DiGraph | nx.DiGraph: Graph with the imported data. |
import_network(filepath, remove_selfloops=True)
Import a network as a NetworkX directed graph and, if remove_selfloops is set, remove self-loops (edges from a node to itself).
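The self-loop cleanup can be sketched with plain NetworkX calls. `drop_selfloops` is a hypothetical helper for illustration; how dsstools reads the file itself is not shown in the source and is left out.

```python
import networkx as nx


def drop_selfloops(graph: nx.DiGraph) -> nx.DiGraph:
    """Hypothetical helper: remove every edge that points from a node to itself."""
    # Materialize the generator first so the graph is not modified while iterating.
    graph.remove_edges_from(list(nx.selfloop_edges(graph)))
    return graph


g = nx.DiGraph([("a", "b"), ("b", "b"), ("b", "c")])
drop_selfloops(g)
```

The nodes themselves are kept; only the `("b", "b")` edge is dropped.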
read_from_pickle(folder='', timestamp='')
Read cached graph from directory.
Automatically selects the newest instance unless a timestamp is given.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
folder | (str, Path) | Path to the directory to search for pickles. If empty, defaults to the temp dir. | '' |
timestamp | str | Timestamp to explicitly select. | '' |
Returns:
Type | Description |
---|---|
DiGraph | nx.DiGraph: Graph with the imported data. |
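The "newest unless a timestamp is given" selection could look roughly like the sketch below. The file naming scheme (`graph_<timestamp>.pickle` with a sortable timestamp) and the helper `load_newest_pickle` are assumptions made for illustration; the source does not describe how dsstools names its cache files.

```python
import pickle
import tempfile
from pathlib import Path

import networkx as nx


def load_newest_pickle(folder, timestamp=""):
    """Hypothetical helper: load the newest pickle, or the one matching a timestamp."""
    candidates = sorted(Path(folder).glob("*.pickle"))
    if timestamp:
        candidates = [path for path in candidates if timestamp in path.name]
    if not candidates:
        raise FileNotFoundError(f"No matching pickle found in {folder}")
    # Assumed naming: sortable timestamps, so the last name is the newest file.
    with candidates[-1].open("rb") as handle:
        return pickle.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    older = nx.DiGraph([("a", "b")])
    newer = nx.DiGraph([("a", "b"), ("b", "c")])
    for name, graph in [("2024-01-01T10-00-00", older), ("2024-02-01T10-00-00", newer)]:
        with (Path(tmp) / f"graph_{name}.pickle").open("wb") as handle:
            pickle.dump(graph, handle)
    latest = load_newest_pickle(tmp)
    selected = load_newest_pickle(tmp, timestamp="2024-01-01")
```

As with any pickle, only files from a trusted source should be loaded this way.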