Dashboard data sources

What are the Dashboard’s data sources?

To see the data sources for a specific Dashboard, open About & FAQ on that Dashboard and consult the list under Data Sources. The only obligatory data source is title metadata in ONIX format; each publisher then chooses which other data sources to include.

The data sources currently available to be visualised in the Dashboard are detailed in the tables below. The tables cover the standard data sources and variables; other data sources and variables may be supported as an add-on service.

Where the Dashboard gets title metadata from

Where the Dashboard gets usage and mentions data from

Mentions and page views

Book views and downloads

Chapter downloads

Public access data sources

Public access data sources are those whose data is made publicly available by the source itself. Dashboard partners do not need to grant any additional access permission for the Dashboard to use the following data sources if they want them included on their dashboards.

Crossref Event Data

Crossref Event Data captures online discussion about research outputs, such as ‘a citation in a dataset or patent, a mention in a news article, Wikipedia page or on a blog, or discussion and comment on social media’. Event Data is retrieved using the Crossref Event Data API. Crossref Event Data must be queried using a DOI, which BAS obtains from Crossref metadata.
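
As a minimal sketch of what a DOI-based query might look like, the snippet below builds a request URL for the Event Data API without sending it. The endpoint and the obj-id, mailto, and rows parameters reflect our reading of Crossref's public documentation and should be checked against it; the helper name, example DOI, and email address are illustrative assumptions and are not taken from the BAS workflow.

```python
from urllib.parse import urlencode

# Assumed base URL of the public Crossref Event Data query API.
EVENT_DATA_URL = "https://api.eventdata.crossref.org/v1/events"


def build_event_query(doi: str, mailto: str, rows: int = 100) -> str:
    """Return a query URL for events whose object is the given DOI.

    Hypothetical helper: it only builds the URL and makes no network call.
    """
    params = {
        "obj-id": doi.strip(),  # the DOI the events refer to
        "mailto": mailto,       # contact address sent with the request
        "rows": rows,           # assumed page-size parameter
    }
    return f"{EVENT_DATA_URL}?{urlencode(params)}"


# Example with a placeholder DOI and email address.
print(build_event_query("10.5555/12345678", "name@example.org"))
```

Building the URL separately from sending it keeps the DOI handling easy to test before any request is made.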

Crossref metadata

Crossref is a not-for-profit membership organisation and an official Digital Object Identifier (DOI) Registration Agency of the International DOI Foundation. It makes metadata available for all DOIs registered with Crossref. BAS uses Crossref metadata to match the ISBNs obtained from a publisher's ONIX feed to DOIs, which are then used to query Crossref Event Data.
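
The ISBN-to-DOI matching described above can be sketched roughly as follows. This is an illustration only, not the BAS implementation: it assumes Crossref records are dictionaries with a "DOI" string and an "ISBN" list, and the function names and sample values are hypothetical.

```python
def normalise_isbn(isbn: str) -> str:
    """Strip hyphens and spaces so differently formatted ISBNs compare equal."""
    return isbn.replace("-", "").replace(" ", "").upper()


def build_isbn_to_doi(crossref_records: list[dict]) -> dict[str, str]:
    """Index Crossref records by normalised ISBN (first DOI seen wins)."""
    index: dict[str, str] = {}
    for record in crossref_records:
        doi = record.get("DOI")
        if not doi:
            continue  # skip records without a DOI
        for isbn in record.get("ISBN", []):
            index.setdefault(normalise_isbn(isbn), doi)
    return index


def match_onix_isbns(onix_isbns: list[str], index: dict[str, str]) -> dict:
    """Map each ONIX ISBN to its DOI, or None when no match is found."""
    return {isbn: index.get(normalise_isbn(isbn)) for isbn in onix_isbns}


# Made-up sample data for illustration.
records = [
    {"DOI": "10.5555/book-a", "ISBN": ["978-1-234-56789-7"]},
    {"DOI": "10.5555/book-b", "ISBN": ["9780000000002"]},
]
index = build_isbn_to_doi(records)
print(match_onix_isbns(["9781234567897", "9789999999999"], index))
```

Keeping the first DOI seen for an ISBN is a simplification; a real workflow would need to decide how to handle several DOIs that share one ISBN.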

OAPEN metadata

OAPEN enables libraries and aggregators to use the metadata of all available titles in the OAPEN Library. The metadata is available in several formats; BAS harvests it in XML format and converts it into ONIX format for the OAPEN platform.

Thoth

Thoth is a free, open metadata service that publishers can use as a metadata storage solution. Thoth can provide metadata in a number of formats. BAS uses the Thoth export API to download metadata for publishers in ONIX format.

UCL Discovery

University College London (UCL) is an eBook publisher and a partner in the BAD project. UCL Discovery is UCL's open access repository, showcasing and providing access to the full texts of UCL research publications.

Private access data sources - access permission required

Google Analytics Universal

Google Analytics Universal monitors and records web traffic for specific websites. If a Dashboard partner has configured Google Analytics on their publisher website, the Google Analytics data can be used to find out which countries and territories website visitors come from.

Google Books

The Google Books Partner program hosts eBooks, including some free open access eBooks. eBook publishers can then download usage reports from Google Books. BAS uses data from the Google Play sales transaction report and the Google Books Traffic Report.

JSTOR

JSTOR is a digital library offering over 7,000 open access eBooks. Publisher usage reports give details about the use (views and downloads) of eBooks by institution and country.

ONIX-FTP feed from publishers

ONIX is a standard that book publishers use to share information about the books that they have published. BAS dashboard partners that have ONIX feeds are given credentials and access to their own upload folder on the Mellon SFTP server. Each publisher uploads their ONIX feed to their upload folder on a weekly, fortnightly, or monthly basis. The BAS data workflow downloads the ONIX data, transforms it (with the ONIX parser Java command line tool) and then loads it into BigQuery for further processing.
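
To illustrate the transform step, the sketch below pulls the ISBN-13 and title out of a tiny ONIX 3.0 style record and returns flat rows of the kind that could later be loaded into a table. It is a Python stand-in written for this page, not the ONIX parser Java tool used in the BAS workflow. The sample record is made up, the output field names are hypothetical, and the sample omits the XML namespace that real feeds usually declare. The download and BigQuery load steps are left out.

```python
import xml.etree.ElementTree as ET

# Made-up, heavily trimmed ONIX 3.0 style message (no namespace, for brevity).
SAMPLE_ONIX = """<ONIXMessage release="3.0">
  <Product>
    <RecordReference>example-001</RecordReference>
    <ProductIdentifier>
      <ProductIDType>15</ProductIDType>
      <IDValue>9781234567897</IDValue>
    </ProductIdentifier>
    <DescriptiveDetail>
      <TitleDetail>
        <TitleType>01</TitleType>
        <TitleElement>
          <TitleElementLevel>01</TitleElementLevel>
          <TitleText>An Example Open Access Book</TitleText>
        </TitleElement>
      </TitleDetail>
    </DescriptiveDetail>
  </Product>
</ONIXMessage>"""


def onix_to_rows(onix_xml: str) -> list[dict]:
    """Flatten each Product into a dict with its ISBN-13 and title."""
    root = ET.fromstring(onix_xml)
    rows = []
    for product in root.iter("Product"):
        isbn13 = None
        for identifier in product.findall("ProductIdentifier"):
            # ProductIDType code 15 denotes an ISBN-13 in the ONIX code lists.
            if identifier.findtext("ProductIDType") == "15":
                isbn13 = identifier.findtext("IDValue")
                break
        title = product.findtext(
            "DescriptiveDetail/TitleDetail/TitleElement/TitleText"
        )
        rows.append({"isbn13": isbn13, "title": title})
    return rows


print(onix_to_rows(SAMPLE_ONIX))
```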

Private access data sources - no additional access permission required

IRUS Fulcrum

IRUS provides COUNTER standard access reports for eBooks hosted on the Fulcrum platform. Fulcrum is a “community-developed, open source platform for digital scholarship” which provides “users the ability to read books with associated digital enhancements, such as: 3-D models, embedded audio, video, and databases; zoomable online images, and interactive media”.

IRUS OAPEN

IRUS provides COUNTER standard access reports for eBooks hosted on the OAPEN library and platform. OAPEN "promotes and supports the transition to open access for academic books by providing open infrastructure services to stakeholders in scholarly communication". Almost all eBooks on OAPEN are provided as a PDF file for the whole book. The reports show access figures for each month, and the location (IP address) of the access. Within the OAPEN Google Cloud project (located in Europe), IP addresses are replaced with geographical information (city and country). This means that IP addresses are not stored within BAS data, and only de-identified geographical information is transferred to BAS.
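
The de-identification step can be pictured with the following sketch, in which each access record has its IP address swapped for a city and country before the record is passed on. The lookup table, field names, and sample record are hypothetical stand-ins; the actual geolocation method used within the OAPEN Google Cloud project is not described here.

```python
# Hypothetical lookup table standing in for a real IP geolocation database.
# The addresses come from ranges reserved for documentation.
GEO_LOOKUP = {
    "192.0.2.10": {"city": "Amsterdam", "country": "Netherlands"},
    "198.51.100.7": {"city": "Perth", "country": "Australia"},
}


def deidentify(record: dict) -> dict:
    """Return a copy of the record with the IP replaced by city and country."""
    geo = GEO_LOOKUP.get(record["ip_address"], {"city": None, "country": None})
    # Copy every field except the IP address itself.
    cleaned = {key: value for key, value in record.items() if key != "ip_address"}
    cleaned.update(geo)
    return cleaned


# Made-up access record for illustration.
raw = {"isbn13": "9781234567897", "month": "2023-01", "ip_address": "192.0.2.10"}
print(deidentify(raw))
```

Because the function builds a new dictionary and never copies the ip_address field, only the derived location leaves this step.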
