IRUS Fulcrum

Documentation for the IRUS Fulcrum telescope

The IRUS Fulcrum telescope collects usage statistics for titles accessed via the Fulcrum Platform. Usage data is accessible through IRUS in much the same way as for the IRUS OAPEN telescope. Unlike IRUS OAPEN, IRUS Fulcrum does not record sensitive IP address information, which makes the data much simpler to handle.

The earliest available data for the Fulcrum platform is from April 2022. It follows that all data conforms to the COUNTER 5 standard.

| Property | Value |
| --- | --- |
| Dataset Name | irus |
| Table Name | irus_fulcrum |
| Table Type | Partitioned |
| Average Runtime | 10 min |
| Average Download Size | 1-10 MB |
| Harvest Type | API |
| Run Schedule | Monthly on the 4th |
| Catch-up Missed Runs | ✅ |
| Each Run Includes All Data | ❌ |

Airflow connections

The following Airflow connections are required:

| Name | Description |
| --- | --- |
| irus_api | The IRUS requestor_id/api_key, required to access the IRUS platform |

Telescope kwargs

These are fields passed as keyword arguments to the telescope upon instantiation.

Publishers (publishers)

This is a list of publisher names. Usage stats from Fulcrum will be filtered on these publisher names. Many institutions have multiple publisher names associated with them, so it is important that all related names are provided.
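As a rough illustration of why every related name matters, the sketch below filters usage rows on a list of publisher names. The row layout, the exact-match comparison, and the function name `filter_by_publishers` are assumptions made for this example, not the telescope's actual implementation.

```python
def filter_by_publishers(rows, publishers):
    """Keep only rows whose publisher is one of the given names.

    Assumption: names are compared by exact string match; the real
    telescope may normalise or match names differently.
    """
    wanted = set(publishers)
    return [row for row in rows if row.get("publisher") in wanted]


# Hypothetical rows and publisher names, for illustration only.
rows = [
    {"book_title": "Title A", "publisher": "Example Press"},
    {"book_title": "Title B", "publisher": "Example Press Imprint"},
    {"book_title": "Title C", "publisher": "Another Publisher"},
]
kept = filter_by_publishers(rows, ["Example Press", "Example Press Imprint"])
```

If "Example Press Imprint" were left out of the list, Title B would be dropped silently, which is why all names associated with an institution need to be supplied.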

Telescope Tasks

Data Download

The download is done via an API call to IRUS:

https://irus.jisc.ac.uk/api/v3/irus/reports/irus_ir/?platform=235&requestor_id={requestor_id}&begin_date={start_date}&end_date={end_date}

A second call to the API is made with the following appended to the above URL:

&attributes_to_show=Country

This splits the data by country, leaving us with two datasets. These datasets will be referred to as the total and country datasets.

Before any changes are made to the data, both datasets are uploaded to a Google Cloud Storage bucket.
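The two API calls described above could be assembled along the following lines. This is a minimal sketch, not the telescope's code: the function name `build_report_urls` and the placeholder requestor ID are hypothetical, and it assumes a single YYYY-MM value is used for both `begin_date` and `end_date`, as this page describes for per-month retrieval.

```python
from typing import Tuple
from urllib.parse import urlencode

# Endpoint and platform ID as given in the URL on this page.
IRUS_ENDPOINT = "https://irus.jisc.ac.uk/api/v3/irus/reports/irus_ir/"
FULCRUM_PLATFORM_ID = "235"


def build_report_urls(requestor_id: str, month: str) -> Tuple[str, str]:
    """Return (total_url, country_url) for one month given as YYYY-MM."""
    params = {
        "platform": FULCRUM_PLATFORM_ID,
        "requestor_id": requestor_id,
        "begin_date": month,
        "end_date": month,
    }
    total_url = f"{IRUS_ENDPOINT}?{urlencode(params)}"
    # The second call appends the country attribute to the same URL.
    country_url = f"{total_url}&attributes_to_show=Country"
    return total_url, country_url


# "my-requestor-id" is a placeholder, not a real key.
total_url, country_url = build_report_urls("my-requestor-id", "2022-04")
```

The first URL yields the total dataset and the second the country dataset.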

Data Transform

The transform step has a few things to achieve:

  • Collate the total and country datasets into a single object

  • Remove columns that are not of interest to us

  • Add the release month to each row as a partitioning column

  • Remove rows from the data that do not relate to the publisher of interest

The results of the first three points are evident in the table schema. The final point requires some communication with the publisher, because a single publisher may have published titles under more than one publisher name. For example, the University of Michigan has 10 associated publisher names. These names are listed as part of a dictionary in the telescope.

The resulting transformed file is uploaded to a Google Cloud bucket.
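The collation step can be pictured as nesting the country rows under the matching total row. The sketch below assumes both datasets have already been flattened into dictionaries keyed by `proprietary_id`, and that country rows carry `country_name` and `country_code` fields; these input shapes and the function name `collate` are hypothetical, chosen only to mirror the output schema.

```python
METRICS = (
    "total_item_investigations",
    "total_item_requests",
    "unique_item_investigations",
    "unique_item_requests",
)


def collate(total_rows, country_rows, release_date):
    """Merge total and country rows into one record per title."""
    by_id = {}
    for row in total_rows:
        record = dict(row)
        record["country"] = []                  # filled from country_rows
        record["release_date"] = release_date   # partitioning column
        by_id[row["proprietary_id"]] = record
    for row in country_rows:
        parent = by_id.get(row["proprietary_id"])
        if parent is None:
            continue  # country row with no matching total row
        entry = {"name": row["country_name"], "code": row["country_code"]}
        entry.update({metric: row[metric] for metric in METRICS})
        parent["country"].append(entry)
    return list(by_id.values())


# Made-up numbers, for illustration only.
counts = dict.fromkeys(METRICS, 3)
totals = [{"proprietary_id": "id-1", "book_title": "Title A", **counts}]
countries = [
    {"proprietary_id": "id-1", "country_name": "Australia",
     "country_code": "AU", **counts},
]
records = collate(totals, countries, "2022-04-30")
```

Each resulting record holds the title-level totals plus a repeated `country` list, which is the shape the table schema expects.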

BigQuery Load

The transformed data is loaded from the Google Cloud bucket into a partitioned BigQuery table in the irus dataset, which will be created if it does not yet exist. Since the data is partitioned on the release month, there will only be a single table named irus_fulcrum.
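Because the table is partitioned on the release month, each row needs a concrete date for that month; the schema describes `release_date` as the last day of the release month. A minimal sketch of deriving that value is shown below, assuming the month is available as a YYYY-MM string. The helper name `release_date_for` is hypothetical.

```python
import calendar
from datetime import date


def release_date_for(month: str) -> date:
    """Return the last day of a month given as YYYY-MM."""
    year, month_number = (int(part) for part in month.split("-"))
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(year, month_number)[1]
    return date(year, month_number, last_day)


april = release_date_for("2022-04")
leap_february = release_date_for("2024-02")
```

Using `calendar.monthrange` handles month lengths and leap years without hard-coded tables.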

Table Schema

| Name | Type | Mode | Description |
| --- | --- | --- | --- |
| proprietary_id | STRING | NULLABLE | Proprietary identifier of the book. |
| ISBN | STRING | NULLABLE | ISBN of the book. |
| book_title | STRING | NULLABLE | Title of the book. |
| publisher | STRING | NULLABLE | The publisher. |
| authors | STRING | NULLABLE | The names of the authors. |
| event_month | STRING | NULLABLE | The investigated month. |
| total_item_investigations | INTEGER | NULLABLE | The total number of item investigations. |
| total_item_requests | INTEGER | NULLABLE | The total number of item requests. |
| unique_item_investigations | INTEGER | NULLABLE | The number of unique item investigations. |
| unique_item_requests | INTEGER | NULLABLE | The number of unique item requests. |
| country | RECORD | REPEATED | Record to store statistics on the country level. |
| country.name | STRING | NULLABLE | The country name of the client registered by IRUS. |
| country.code | STRING | NULLABLE | The country code of the client registered by IRUS. |
| country.total_item_investigations | INTEGER | NULLABLE | The total number of item investigations. |
| country.total_item_requests | INTEGER | NULLABLE | The total number of item requests. |
| country.unique_item_investigations | INTEGER | NULLABLE | The number of unique item investigations. |
| country.unique_item_requests | INTEGER | NULLABLE | The number of unique item requests. |
| release_date | DATE | REQUIRED | Last day of the release month. Table is partitioned on this column. |

In the Data Download URL, requestor_id is the API key for the IRUS API, which is stored in the irus_api Airflow connection. The telescope sets begin_date and end_date to the same month (YYYY-MM) so that data is retrieved on a per-month basis.
