textsearch

dsstools.textsearch

This module allows for text search in graph nodes.

logger = get_logger('WDCAPI') module-attribute

TextSearch(identifier=None, *, token=None, api='https://dss-wdc.wiso.uni-hamburg.de/api', insecure=False, timeout=60, params=None)

Bases: WDC

Class for searching for keywords via the WDC API.

Parameters:

identifier (str | None, default: None): Identifier of the network data. For the text search this is normally of the form 20121227_intermediaries (a date string with a short text appended).

token (str | None, default: None): Token for authorization.

api (str, default: 'https://dss-wdc.wiso.uni-hamburg.de/api'): API address to send requests to. Leave this as is.

insecure (bool, default: False): Hide the warning regarding missing https.

timeout (int, default: 60): Timeout for requests to the server. Increase this if you request large networks.

Returns:

Instance of TextSearch.

api = api[:-1] if api.endswith('/') else api instance-attribute
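
The expression above strips a single trailing slash from the API address. A minimal sketch of that normalization follows; the helper name `normalize_api` is hypothetical, and the way the base address and the `endpoint` value are joined is an assumption made only for illustration, not a statement about how the class builds its URLs.

```python
def normalize_api(api: str) -> str:
    """Drop one trailing slash, mirroring the expression shown for the `api` attribute."""
    return api[:-1] if api.endswith("/") else api


base = normalize_api("https://dss-wdc.wiso.uni-hamburg.de/api/")

# Assumption for illustration only: the endpoint is appended after a single slash.
url = f"{base}/snapshot"
print(url)
```

With the slash removed up front, an address given with or without a trailing slash yields the same base string.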

endpoint = 'snapshot' class-attribute instance-attribute

identifier = identifier instance-attribute

params = params if params else {} instance-attribute

session = requests.Session() instance-attribute

timeout = timeout instance-attribute

token property writable

Get the password token.

_(domains, terms)

__query_domains(domains, query_term, missing_domains=None, key=None)

get_missing(domains)

Compare given domains and hits on the API and return the difference.

Parameters:

domains (Iterable, required): Domains to compare against.

Returns:

set: Difference of domains.
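
The comparison described for get_missing can be pictured as a plain set difference. The sketch below is not the library's implementation: the function name `missing_domains` and the `hits` argument are hypothetical, and the direction of the difference (requested domains without a hit) is an assumption. The data is made up.

```python
from typing import Iterable, Set


def missing_domains(domains: Iterable[str], hits: Iterable[str]) -> Set[str]:
    """Return the domains that were requested but do not appear among the hits."""
    return set(domains) - set(hits)


requested = ["example.org", "example.com", "example.net"]
api_hits = ["example.org", "example.net"]  # made-up stand-in for the API response

missing = missing_domains(requested, api_hits)
print(missing)
```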

get_snapshots(name_tag='')

List available snapshots by name.

Parameters:

name_tag (default: ''): Filter for name tag.

Returns:

set: Available snapshot ids.

search(domains, terms)

Searches the given keywords across a Graph or iterator.

To pass a complex, already existing Solr query, the recommended structure is {"some-key": "your-query OR some-other-query"} (see the description of the terms parameter).
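
A small sketch of building such a dict from a list of alternatives: the helper `build_query` and the example terms are hypothetical. Only the resulting shape, one key mapped to a query string joined with OR, is taken from the text above.

```python
from typing import Dict, Iterable


def build_query(key: str, alternatives: Iterable[str]) -> Dict[str, str]:
    """Join alternative query strings with OR under a single result key."""
    return {key: " OR ".join(alternatives)}


terms = build_query("climate", ["klimawandel", "erderwaermung"])
print(terms)
```

The resulting dict could then be passed as the terms argument of search.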

Parameters:

domains (Graph | List, required): Set of identifiers to search in. Both graphs and lists are allowed.

terms (List[str] | List[List[str]] | dict[str, str] | dict[str, List[str]] | Series | DataFrame, required): Terms to search for. Various structures are allowed. A list of lists combines the response values of each inner list into one value, e.g. [[A, B], [C, D]] means the counts for A and B are combined into one value. This is helpful for synonyms. In legends, the first value of each inner list sets the "key". A dict[str, List[str]] combines the values of each list in the same way but gives the result the selected key.

Returns:

Updated graph or dict containing the responses.
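
To illustrate the list-of-lists form of terms, the sketch below combines per-term counts for each inner list and uses the first entry as the key. It is a standalone illustration, not the library's code: the function `combine_counts` is hypothetical, the counts are made up, and treating "combined" as a sum is an assumption.

```python
from typing import Dict, List


def combine_counts(groups: List[List[str]], counts: Dict[str, int]) -> Dict[str, int]:
    """Combine the counts of each inner list; the first term names the group."""
    combined: Dict[str, int] = {}
    for group in groups:
        if not group:
            continue  # skip empty synonym groups
        # Assumption: combining means summing the per-term counts.
        combined[group[0]] = sum(counts.get(term, 0) for term in group)
    return combined


per_term = {"A": 3, "B": 2, "C": 0, "D": 5}  # made-up counts per search term
result = combine_counts([["A", "B"], ["C", "D"]], per_term)
print(result)
```

The dict[str, List[str]] form would work the same way, except that the dict key, rather than the first list entry, labels each combined value.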