diggrtoolbox package

Module contents

diggrtoolbox is the main package bundling all the small tools developed in the diggr group. Each tool is located in a separate subpackage.

All tools are made available at package level. Because most subpackages contain only a single class or function, requiring imports from the individual subpackages did not seem useful.

Copyright (C) 2018 Leipzig University Library <info@ub.uni-leipzig.de>

@author F. Rämisch <raemisch@ub.uni-leipzig.de>
@author P. Mühleder <muehleder@ub.uni-leipzig.de>
@license https://opensource.org/licenses/MIT MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

class diggrtoolbox.Configgr(config_filename, inspect_locals=True, try_lower_on_fail=True)[source]

Bases: object

Developers define a default configuration for their programs using constants in the source code. Upon instantiation, these constants are inspected and saved into the config object. The config file is then read, and all of its settings are imported as well. Settings from the file overwrite the corresponding constants in the config object, but the constants themselves of course remain usable in the program.

As a result, you can set a default behaviour in the source code and list the setting in the config file so that users can configure it, but comment it out upon shipping to indicate that configuring this setting is not required.
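The pattern described above can be sketched without diggrtoolbox itself. The following is a minimal illustration, not the Configgr implementation: the helper names `collect_constants` and `merge_settings`, the INI format, and the `[settings]` section name are all assumptions made for this example.

```python
import configparser


def collect_constants(namespace):
    """Return all upper-case, non-private names from a namespace dict."""
    return {name: value for name, value in namespace.items()
            if name.isupper() and not name.startswith("_")}


def merge_settings(defaults, config_text, section="settings"):
    """Overlay values from an INI-style string onto the defaults.

    The INI format and the section name are assumptions of this sketch.
    """
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(config_text)
    merged = dict(defaults)
    if parser.has_section(section):
        merged.update(parser[section])
    return merged


# Hypothetical program namespace: two constants and one ordinary name.
namespace = {"TIMEOUT": 30, "OUTPUT_DIR": "out", "helper": None}

# TIMEOUT is commented out in the shipped file, so the default applies.
CONFIG_TEXT = """
[settings]
# TIMEOUT = 60
OUTPUT_DIR = results
"""

defaults = collect_constants(namespace)
config = merge_settings(defaults, CONFIG_TEXT)
```

In this sketch, values read from the file arrive as strings, while untouched defaults keep their original Python type.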

diggrtoolbox.deepget(obj, keys)[source]

deepget is a small function that lets the user “cherrypick” specific values from deeply nested dicts or lists. This is useful if just one specific value is needed that is hidden beneath multiple levels of nesting.

Example:
>>> import diggrtoolbox as dt
>>> ENTRY = {'data' : {'raw': {'key1': 'value1',
                               'key2': 'value2'}}}
>>> KEY2 = ['data', 'raw', 'key2']
>>> dt.deepget(ENTRY, KEY2) == 'value2'
True
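
For readers curious how such a lookup can work, here is a minimal stand-alone sketch. It is not the diggrtoolbox implementation; `deepget_sketch` is a hypothetical name, and the sketch simply raises `KeyError` or `IndexError` when a step of the route is missing.

```python
from functools import reduce
from operator import getitem


def deepget_sketch(obj, keys):
    """Follow `keys` one level at a time through nested dicts and lists."""
    return reduce(getitem, keys, obj)


# Dict keys and list indices can be mixed freely in the route.
entry = {"data": {"raw": [{"key1": "value1"}, {"key2": "value2"}]}}
value = deepget_sketch(entry, ["data", "raw", 1, "key2"])
```
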
diggrtoolbox.match_titles(titles_a, titles_b, rules=[<function first_letter_rule>, <function numbering_rule>])[source]

Returns match value for two lists of titles.

titles_a: List of title strings
titles_b: List of title strings
rules: List of matching rules
class diggrtoolbox.PlatformMapper(dataset, sep=', ')[source]

Bases: object

Reads in the diggr platform mapping file and provides a mapping dict.

class diggrtoolbox.TreeExplore(tree, tab_symbol=' ')[source]

Bases: object

TreeExplore provides easy-to-use methods to explore complex data structures obtained, e.g., from online REST APIs. Because the underlying data structures have often grown over the years, the internal structure of the returned objects is frequently not logical.

By providing a full-text search and a show method, this tool can be helpful when first investigating what information the data contains and how it is structured.

Example:
>>> import diggrtoolbox as dt
>>> test_dict = {'id' : 123456789,
...              'data' : {'name' : 'diggr project',
...                        'city' : 'Leipzig',
...                        'field': 'Video Game Culture'},
...              'references':[{'url' : 'http://diggr.link',
...                             'name' : 'diggr website'},
...                            {'url' : 'http://ub.uni-leipzig.de',
...                             'name' : 'UBL website'}]}
>>> tree = dt.TreeExplore(test_dict)
>>> results = tree.search("leipzig")
Search-Term: leipzig
Route: references, 1, url,
Embedding: 'http://ub.uni-leipzig.de'
>>> print(results)
[{'embedding': 'http://ub.uni-leipzig.de',
  'route': ['references', 1, 'url'],
  'unique_in_embedding': False,
  'term': 'leipzig'}]

Note

Currently, the search is always case-sensitive!
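
The idea of a route-recording full-text search can be sketched as a short recursive generator. This is an illustration only, not the TreeExplore code: `search_sketch` and its `ignore_case` flag are hypothetical and exist here just to show what case sensitivity changes.

```python
def search_sketch(tree, term, route=(), ignore_case=False):
    """Yield (route, embedding) for every leaf whose text contains `term`."""
    if isinstance(tree, dict):
        for key, value in tree.items():
            yield from search_sketch(value, term, route + (key,), ignore_case)
    elif isinstance(tree, list):
        for index, value in enumerate(tree):
            yield from search_sketch(value, term, route + (index,), ignore_case)
    else:
        haystack, needle = str(tree), str(term)
        if ignore_case:
            haystack, needle = haystack.lower(), needle.lower()
        if needle in haystack:
            yield list(route), tree


test_dict = {'id': 123456789,
             'data': {'name': 'diggr project',
                      'city': 'Leipzig',
                      'field': 'Video Game Culture'},
             'references': [{'url': 'http://diggr.link',
                             'name': 'diggr website'},
                            {'url': 'http://ub.uni-leipzig.de',
                             'name': 'UBL website'}]}

# Case-sensitive: only the URL matches, because 'Leipzig' is capitalized.
strict = list(search_sketch(test_dict, "leipzig"))
# Case-insensitive: the city name matches as well.
relaxed = list(search_sketch(test_dict, "leipzig", ignore_case=True))
```
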

find_key(key)[source]

Wrapper for the _search function to ease access to a non-printing search function.

Parameters:key (str, int, float) – the key to be found in the tree.
search(term)[source]

Wrapper for the _search function, stripping all the parameters not to be used by the end user.

Parameters:term (str, int, float) – the term/object to be found in the tree.
show(tree=None, indent=0)[source]

Visualizes the whole tree. If no tree-like structure (dict, list, or a nesting of both) is given, self.tree is used. This function is called recursively with the nested subtrees.

Parameters:
  • tree (dict, list) – The tree to be shown.
  • indent (int) – Current indentation level of this tree
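
The recursion over nested subtrees with a growing indentation level might look roughly like the sketch below. The output format of the real show method is not documented here, so the layout produced by the hypothetical `render_tree` is an assumption; it returns the lines instead of printing them to keep the example easy to check.

```python
def render_tree(tree, indent=0, tab_symbol="  "):
    """Return one text line per key or index, indented by nesting depth."""
    if isinstance(tree, dict):
        items = tree.items()
    elif isinstance(tree, list):
        items = enumerate(tree)
    else:
        # A bare leaf value is rendered on its own line.
        return [tab_symbol * indent + repr(tree)]

    lines = []
    for key, value in items:
        prefix = tab_symbol * indent
        if isinstance(value, (dict, list)):
            lines.append("{}{}:".format(prefix, key))
            # Recurse into the subtree, one indentation level deeper.
            lines.extend(render_tree(value, indent + 1, tab_symbol))
        else:
            lines.append("{}{}: {!r}".format(prefix, key, value))
    return lines


sample = {"id": 1, "refs": [{"url": "http://diggr.link"}]}
lines = render_tree(sample)
```
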
show_search_result(result)[source]

Displays a search result together with its embedding and path.

Parameters:result (dict) – the result dict generated by _prepare_search_result
diggrtoolbox.treehash(var)[source]

Returns the hash of any dict or list, by using a string conversion via the json library.
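
A hash built on a JSON string conversion can be sketched as follows. This is not the treehash implementation: sorting the keys and using SHA-256 are assumptions chosen so that the result is stable across runs and independent of dict insertion order.

```python
import hashlib
import json


def treehash_sketch(var):
    """Hash a JSON-serializable dict or list via its JSON string form."""
    # sort_keys makes the string independent of dict insertion order.
    serialized = json.dumps(var, sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


a = {"x": 1, "y": [1, 2]}
b = {"y": [1, 2], "x": 1}   # same content, different key order
c = {"x": 1, "y": [2, 1]}   # list order differs, so the content differs
```

Going through a string is what makes this possible at all, since dicts and lists are unhashable in Python.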

class diggrtoolbox.ZipSingleAccess(filename, file_ext='.json')[source]

Bases: diggrtoolbox.zipaccess.zip_access.ZipAccess

This class is meant to provide access to a single JSON-file in a zipfile.

json()[source]

Opens the zipfile and returns the zipped JSON file as a Python object.
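
Reading a single JSON member from a zip archive needs only the standard library. The sketch below is an assumption about the general approach, not the ZipSingleAccess code; `read_single_json` is a hypothetical helper, and the archive is built in memory so the example is self-contained.

```python
import io
import json
import zipfile


def read_single_json(zip_source, file_ext=".json"):
    """Return the parsed content of the only `file_ext` member in a zip."""
    with zipfile.ZipFile(zip_source) as archive:
        names = [n for n in archive.namelist() if n.endswith(file_ext)]
        if len(names) != 1:
            raise ValueError("expected exactly one {} file, found {}"
                             .format(file_ext, len(names)))
        with archive.open(names[0]) as handle:
            return json.load(handle)


# Build a small archive in memory instead of touching the file system.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("dataset.json", json.dumps({"id": 1, "tags": ["a"]}))

data = read_single_json(buffer)
```
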

class diggrtoolbox.ZipMultiAccess(filename, file_ext='.json')[source]

Bases: diggrtoolbox.zipaccess.zip_access.ZipAccess

This class is meant to provide access to a zip file containing one base JSON file and a folder with other JSON files extending the first.

ZipMultiAccess provides a __getitem__ method to allow easier access to the contents.

get(file_id)[source]

Returns a specific object, which is not the base object.

Parameters:file_id (str) – Identifier of the object to be returned.
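
How a `get` method and `__getitem__` can share one lookup is sketched below. The archive layout used here (a `base.json` file plus an `items/` folder) and the class name `MultiJsonZipSketch` are purely hypothetical; the layout that ZipMultiAccess actually expects is not specified in this section.

```python
import io
import json
import zipfile


class MultiJsonZipSketch:
    """Read JSON members from a zip with a base file and an item folder."""

    def __init__(self, zip_source, base_name="base", folder="items",
                 file_ext=".json"):
        self.zip_source = zip_source
        self.base_name = base_name
        self.folder = folder
        self.file_ext = file_ext

    def _load(self, member):
        with zipfile.ZipFile(self.zip_source) as archive:
            with archive.open(member) as handle:
                return json.load(handle)

    def json(self):
        """Return the base object."""
        return self._load(self.base_name + self.file_ext)

    def get(self, file_id):
        """Return one of the extending objects by its identifier."""
        return self._load("{}/{}{}".format(self.folder, file_id,
                                           self.file_ext))

    def __getitem__(self, file_id):
        # Subscript access is just a shorthand for get().
        return self.get(file_id)


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("base.json", json.dumps({"ids": ["42"]}))
    archive.writestr("items/42.json", json.dumps({"title": "example"}))

access = MultiJsonZipSketch(buffer)
item = access["42"]
```
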
class diggrtoolbox.ZipListAccess(filename, file_ext='.json')[source]

Bases: diggrtoolbox.zipaccess.zip_access.ZipAccess

Class to read a zip file.

read_archive()[source]

Reads the archive zipfile and returns its contents as a list of dicts.