textsearch

`dsstools.textsearch`

This module provides text search over the nodes of a graph.

`logger = get_logger('WDCAPI')` `module-attribute`

`TextSearch(identifier=None, *, token=None, api='https://dss-wdc.wiso.uni-hamburg.de/api', insecure=False, timeout=60, params=None)`

Bases: WDC

Class for searching for keywords through the WDC API.

Parameters:

Name	Type	Description	Default
`identifier`	`str \| None`	Identifier of the network data. For the text search this is normally in the form `20121227_intermediaries` (a date string with a short text appended).	`None`
`token`	`str \| None`	Token for authorization.	`None`
`api`	`str`	API address to send requests to. Normally left at its default.	`'https://dss-wdc.wiso.uni-hamburg.de/api'`
`insecure`	`bool`	Hide warning regarding missing https.	`False`
`timeout`	`int`	Timeout for requests to the server. Increase this if you request large networks.	`60`

Returns:

Type	Description
`TextSearch`	Instance of `TextSearch`.

`api = api[:-1] if api.endswith('/') else api` `instance-attribute`

`endpoint = 'snapshot'` `class-attribute` `instance-attribute`

`identifier = identifier` `instance-attribute`

`params = params if params else {}` `instance-attribute`

`session = requests.Session()` `instance-attribute`

`timeout = timeout` `instance-attribute`

`token` `property` `writable`

Get the password token.
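
The expressions shown for the `api` and `params` attributes can be tried out in isolation. The following is a minimal sketch that only reproduces those two expressions; the helper name `normalise_settings` is hypothetical and not part of `dsstools`.

```python
def normalise_settings(api, params=None):
    """Apply the two attribute expressions listed above (illustrative helper)."""
    # Drop a single trailing slash from the API address.
    api = api[:-1] if api.endswith("/") else api
    # Fall back to an empty dict when no parameters are given.
    params = params if params else {}
    return api, params


api, params = normalise_settings("https://dss-wdc.wiso.uni-hamburg.de/api/")
print(api)     # https://dss-wdc.wiso.uni-hamburg.de/api
print(params)  # {}
```

An address passed with or without a trailing slash therefore ends up stored in the same form.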

`_(domains, terms)`

`__query_domains(domains, query_term, missing_domains=None, key=None)`

`get_missing(domains)`

Compare given domains and hits on the API and return the difference.

Parameters:

Name	Type	Description	Default
`domains`	`Iterable`	Domains to compare against the hits on the API.	required

Returns:

Type	Description
`set`	Difference of domains
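
The comparison can be pictured as a plain set difference. The sketch below assumes that "difference" means the requested domains for which the API returned no hit; the helper `missing_domains` and the sample data are hypothetical and do not call the API.

```python
def missing_domains(domains, hits):
    """Return the requested domains that have no hit (illustrative helper)."""
    # Assumption: the difference is "requested minus found".
    return set(domains) - set(hits)


requested = ["example.org", "example.com", "example.net"]
found = ["example.org"]  # hypothetical hits reported by the API

print(sorted(missing_domains(requested, found)))
# ['example.com', 'example.net']
```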

`get_snapshots(name_tag='')`

List available snapshots by name.

Parameters:

Name	Type	Description	Default
`name_tag`	`str`	Name tag to filter the snapshots by.	`''`

Returns:

Type	Description
`set`	Available snapshot ids.

`search(domains, terms)`

Search for the given keywords across a graph or an iterable of domains.

To use a complex, already existing Solr query, pass it in the following structure: `{"some-key": "your-query OR some-other-query"}` (see the description of the `terms` parameter).

Parameters:

Name	Type	Description	Default
`domains`	`Graph \| List`	Set of identifiers to search in. Both graphs and lists are allowed.	required
`terms`	`List[str] \| List[List[str]] \| dict[str, str] \| dict[str, List[str]] \| Series \| DataFrame`	Terms to search for. Several structures are accepted. In a list of lists, the response values of each inner list are combined into one value, e.g. `[[A, B], [C, D]]` means the counts for A and B are combined into one value. This is helpful for synonyms. In legends, the first value of the inner list sets the key. A `dict[str, List[str]]` combines the values of each list in the same way but stores the result under the given key.	required

Returns:

Type	Description
`Graph \| dict`	Updated graph or dict containing the responses.
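
The grouping rules for `terms` can be illustrated without contacting the API. The sketch below is an assumption-laden illustration, not the library's implementation: the helpers `group_terms` and `combine_counts` and the sample counts are hypothetical, and `Series`/`DataFrame` inputs are not covered.

```python
def group_terms(terms):
    """Normalise list-style or dict-style terms into {key: [term, ...]}."""
    if isinstance(terms, dict):
        # dict[str, str] or dict[str, List[str]]: the dict key names the group.
        return {
            key: [value] if isinstance(value, str) else list(value)
            for key, value in terms.items()
        }
    groups = {}
    for entry in terms:
        # A plain string forms its own group; in an inner list the
        # first value sets the key.
        members = [entry] if isinstance(entry, str) else list(entry)
        groups[members[0]] = members
    return groups


def combine_counts(groups, counts):
    """Sum the per-term counts of each group into one value per key."""
    return {
        key: sum(counts.get(term, 0) for term in members)
        for key, members in groups.items()
    }


terms = [["bank", "lender"], ["insurer", "underwriter"]]
counts = {"bank": 3, "lender": 2, "insurer": 1}  # hypothetical per-term hits

combined = combine_counts(group_terms(terms), counts)
print(combined)  # {'bank': 5, 'insurer': 1}
```

With a dict such as `{"finance": ["bank", "lender"]}`, the same sums would be stored under `"finance"` instead of under the first list value.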
