I’ve updated, refreshed, and reorganised the Trove newspapers section of the GLAM Workbench. There are currently 22 Jupyter notebooks organised under the following headings:
- Trove newspapers in context – Notebooks in this section look at the Trove newspaper corpus as a whole, to try to understand what’s there and what’s not.
- Visualising searches – Notebooks in this section demonstrate some ways of visualising searches in Trove newspapers – seeing everything rather than just a list of search results.
- Useful tools – Notebooks in this section provide useful tools that extend or enhance the Trove web interface and API.
- Tips and tricks – Notebooks in this section provide some useful hints for working with the Trove API.
- Get creative – Notebooks in this section look at creative ways you can use data from Trove newspapers.
There are also a number of pre-harvested datasets.
Recently refreshed analyses, visualisations, and datasets include:
- Number of Trove newspaper articles by year and state (notebook)
- Analysing OCR correction in Trove’s newspapers (notebook)
- List of Trove newspapers in languages other than English (markdown formatted list)
- Newspapers with content from beyond the 1954 copyright ‘cliff of death’ (CSV file)
As part of the update, notebooks that are intended to run as apps (with all the code hidden) have been updated to use Voila. But perhaps the thing I’m most excited about is the new set of options for running the notebooks. As well as being able to launch the notebooks on Binder, you can now create your very own persistent environment on Reclaim Cloud with just the click of a button.
There’s also an automatically-built Docker image of this repository, containing everything you need to run the notebooks on your own computer. Check out the new Run these notebooks section for details. I’m gradually rolling this out across all the repositories in the GLAM Workbench. #dhhacks
This is a companion discussion topic for the original entry at https://updates.timsherratt.org/2021/05/12/updates-to-the.html