Rogue Scholar Digest February 21, 2024


This is a summary of the Rogue Scholar blog posts published February 14 - February 20, 2024.


Martin Fenner

Front Matter


February 21, 2024

import requests
import locale
import re
from typing import Optional
import datetime
from IPython.display import Markdown

locale.setlocale(locale.LC_ALL, "en_US")
baseUrl = ""
published_since = "2024-02-14"
published_until = "2024-02-20"
feature_image = 2
include_fields = "title,authors,published_at,summary,blog_name,blog_slug,doi,url,image"
url = (
    baseUrl
    + f"posts?&published_since={published_since}&published_until={published_until}&language=en&sort=published_at&order=asc&per_page=50&include_fields={include_fields}"
)
response = requests.get(url)
result = response.json()

def get_post(post):
    return post["document"]

def format_post(post):
    doi = post.get("doi", None)
    url = f"[{doi}]({doi})\n<br />" if doi else ""
    title = f"[{post['title']}]({doi})" if doi else f"[{post['title']}]({post['url']})"
    published_at = datetime.datetime.utcfromtimestamp(post["published_at"]).strftime(
        "%B %-d, %Y"
    )
    blog = f"[{post['blog_name']}]({post['blog_slug']})"
    author = ", ".join([f"{x['name']}" for x in post.get("authors", None) or []])
    summary = post["summary"]
    return f"### {title}\n{url}Published {published_at} in {blog}<br />{author}<br />{summary}\n"

posts = [get_post(x) for x in result["hits"]]
posts_as_string = "\n\n".join([format_post(x) for x in posts])

def doi_from_url(url: str) -> Optional[str]:
    """Return a DOI from a URL"""
    match =
    if match is None:
        return None

images = [x["image"] for x in posts if x.get("image", None) is not None]
image = images[feature_image]
markdown = f"![]({image})\n\n"
markdown += posts_as_string
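
The `doi_from_url` helper above takes a URL and returns a DOI or `None`. A minimal sketch of that idea follows, assuming DOIs are expressed as `doi.org` (or `dx.doi.org`) URLs. The function name `extract_doi`, the regular expression, and the sample DOIs are illustrative assumptions, not the pattern used in the original notebook.

```python
import re
from typing import Optional

# Assumption: DOIs appear as doi.org (or dx.doi.org) URLs. This pattern is
# illustrative and is not the pattern used in the original notebook.
DOI_URL_PATTERN = re.compile(
    r"^https?://(?:dx\.)?doi\.org/(10\.\d{4,9}/\S+)$", re.IGNORECASE
)


def extract_doi(url: str) -> Optional[str]:
    """Return the lowercased DOI from a doi.org URL, or None (hypothetical helper)."""
    match = DOI_URL_PATTERN.match(url.strip())
    if match is None:
        return None
    # DOIs are case-insensitive, so lowercasing gives a stable form for comparison.
    return match.group(1).lower()


# Made-up example inputs, for illustration only.
print(extract_doi("https://doi.org/10.1234/Example-5678"))
print(extract_doi("https://example.org/not-a-doi"))
```

Returning `None` for anything that is not a DOI URL lets callers fall back to the post's plain `url`, in the same way `format_post` does when `doi` is missing.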

Introducing rOpenSci Champions - Cohort 2023-2024
Published February 15, 2024 in rOpenSci - open tools for open science
Ezekiel Adebayo Ogundepo, Sehrish Kanwal, Andrea Gomez Vargas, Liz Hare, Francesca Belem Lopes Palmeira, Yi-Chin Sunny Tseng, Mirna Vazquez Rosas Landa, Erika Siregar, Jacqui Levy, Yanina Bellini Saibene
The rOpenSci Champions Program starts 2024 with a new cohort of Champions. We are pleased to introduce you to our Champions and their projects! Ezekiel Adebayo Ogundepo: Hello, I’m Ezekiel, a data science professional deeply fascinated by the intersection of mathematics, statistics, and real-world challenges.

New paper: pneumaticity in a rebbachisaurid caudal vertebra
Published February 15, 2024 in Sauropod Vertebra Picture of the Week
Matt Wedel
Fig. 2. Rebbachisauridae indet. (MDPA-Pv 007) from the Sierra Chata locality (Candeleros Formation), Cenomanian (Upper Cretaceous). Anterior caudal vertebra in anterior (A1, A3), posterior (A4, A6), and left lateral (A7, A9) views.

Brms hacking: linear predictors for random effect standard deviations
Published February 17, 2024 in Martin Modrák
Martin Modrák
brms is a great package. It allows you to put predictors on a lot of things. Its power is however not absolute — one thing it doesn’t let you directly do is use data to predict variances of random/varying effects.

What is LLMLingua?
Published February 18, 2024 in Stories by Research Graph on Medium
Research Graph
The AI Helper Turning Mountains of Data into Bite-Sized Instructions. Author: Aland Astudillo. LLMs have been changing the way the entire world deals with problems and day-to-day tasks. To make them better for specific applications, they need huge amounts of data and complex, expensive training approaches.

Help make waywiser better! User requests wanted
Published February 19, 2024 in rOpenSci - open tools for open science
Mike Mahoney, Maëlle Salmon
The package waywiser, maintained by Mike Mahoney, provides ergonomic methods for assessing spatial models. Assessing predictive models of spatial data can be challenging, both because these models are typically built for extrapolating outside the original region represented by training data and due to potential spatially structured errors, with “hot spots” of higher than expected error clustered geographically due to spatial structure in the

Thank you for your warm work… earnestly
Published February 19, 2024 in Stories by Adam Day on Medium
Adam Day
So, we’ve established that papermills like to use templates. We see templates in referee reports and in the text of cookie-cutter research papers. There’s an important insight here: Templates are used in legit academic behaviour as well as in industrial research fraud.

Mechanistic templates computed for the Grubbs alkene-metathesis reaction.
Published February 19, 2024 in Henry Rzepa’s Blog
Henry Rzepa
Following on from my template exploration of the Wilkinson hydrogenation catalyst, I now repeat this for the Grubbs variant of the alkene metathesis reaction. As with the Wilkinson, here I focus on the stereochemistry of the mechanism as first suggested by Chauvin[1], an aspect lacking in e.g. the Wikipedia entry.

Read and Play Digital Music (MIDI) in R using the fluidsynth package
Published February 20, 2024 in rOpenSci - open tools for open science
Jeroen Ooms
A few weeks ago, prof Matt Crump wrote a blog post in which he explores tools to handle MIDI data in R, in preparation for a cognition experiment that involves creating musical stimuli. In the article he ends up using a mix of external command line tools ffmpeg and fluidsynth and a python module.

Taking Spatial Omics into the Next Dimension
Published February 20, 2024 in GigaBlog
Scott Edmunds
A multitude of papers on novel methods for Spatial Omics are published in a cross-journal series launching today in GigaScience and GigaByte Journals. Spatial Omics is a new field that is taking large-scale data-rich biological and biomedical research into new dimensions, which is having a significant impact on the fundamental fields of biology and biomedicine.

Problems with the DataCite Data Citation Corpus
Published February 20, 2024 in iPhylo
Roderic Page
DataCite have released the Data Citation Corpus, together with a dashboard that summarises the corpus. This is billed as: The goal is to build a citation database between scholarly articles and data, such as datasets in repositories, sequences in GenBank, protein structures in PDB, etc. Access to the corpus can be obtained by submitting a form, then having a (very pleasant) conversation with DataCite about the nature of the corpus.

commonmeta-py now supports metadata lists
Published February 20, 2024 in Front Matter
Martin Fenner
This week the commonmeta-py Python library adds an important new feature: metadata lists. With this feature commonmeta-py no longer only operates on metadata for a single scholarly work (e.g. a journal article, book, dataset, software, or blog post), but can handle lists of scholarly works.
