Rogue Scholar Digest April 10, 2024


This is a summary of the Rogue Scholar blog posts published March 27 - April 9, 2024.


Martin Fenner

Front Matter


April 10, 2024

import requests
import locale
import re
from typing import Optional
import datetime
from IPython.display import Markdown

locale.setlocale(locale.LC_ALL, "en_US")
baseUrl = ""
published_since = "2024-03-27"
published_until = "2024-04-09"
feature_image = 0
include_fields = "title,authors,published_at,summary,blog_name,blog_slug,doi,url,image"
url = (
    baseUrl
    + f"posts?&published_since={published_since}&published_until={published_until}&language=en&sort=published_at&order=asc&per_page=50&include_fields={include_fields}"
)
response = requests.get(url)
result = response.json()

def get_post(post):
    return post["document"]

def format_post(post):
    doi = post.get("doi", None)
    url = f"[{doi}]({doi})\n<br />" if doi else ""
    title = f"[{post['title']}]({doi})" if doi else f"[{post['title']}]({post['url']})"
    published_at = datetime.datetime.utcfromtimestamp(post["published_at"]).strftime(
        "%B %-d, %Y"
    )
    blog = f"[{post['blog_name']}]({post['blog_slug']})"
    author = ", ".join([f"{x['name']}" for x in post.get("authors", None) or []])
    summary = post["summary"]
    return f"### {title}\n{url}Published {published_at} in {blog}<br />{author}<br />{summary}\n"

posts = [get_post(x) for x in result["hits"]]
posts_as_string = "\n\n".join([format_post(x) for x in posts])

def doi_from_url(url: str) -> Optional[str]:
    """Return a DOI from a URL"""
    match =
    if match is None:
        return None

images = [x["image"] for x in posts if x.get("image", None) is not None]
image = images[feature_image]
markdown = f"![]({image})\n\n"
markdown += posts_as_string
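
The `doi_from_url` helper in the listing above is incomplete as shown. Here is a minimal sketch of how such a helper could work. The regular expression, the name `doi_from_url_sketch`, and the choice to lowercase the result are assumptions for illustration, not necessarily what was used to build this digest.

```python
import re
from typing import Optional

# Assumed pattern: an optional https://doi.org/ (or dx.doi.org) or "doi:" prefix,
# followed by a DOI that starts with "10.", a registrant code, and a suffix.
DOI_PATTERN = re.compile(
    r"\A(?:https?://(?:dx\.)?doi\.org/|doi:)?(10\.\d{4,9}/\S+)\Z",
    re.IGNORECASE,
)


def doi_from_url_sketch(url: str) -> Optional[str]:
    """Return the lowercased DOI contained in a URL, or None if there is none."""
    match = DOI_PATTERN.search(url.strip())
    if match is None:
        return None
    # DOIs are case-insensitive, so lowercasing gives a stable form for comparison.
    return match.group(1).lower()


if __name__ == "__main__":
    # The DOI below is the one cited in a post summary further down.
    print(doi_from_url_sketch("https://doi.org/10.3233/ISU-2010-0613"))
    print(doi_from_url_sketch("https://example.org/some-post"))
```

With this sketch, both `https://doi.org/10.3233/ISU-2010-0613` and `doi:10.3233/ISU-2010-0613` resolve to the same string, while a URL without a DOI yields `None`.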

Tracking references of Rogue Scholar blog posts
Published March 27, 2024 in Front Matter
Martin Fenner
The Rogue Scholar science blog archive has been collecting the references of blog posts since June 2023, and has registered Crossref DOIs for 1,114 blog posts with references as of today.

Hugging Face Autotrain
Published March 27, 2024 in iPhylo
Roderic Page
How to cite: Page, R. (2024). Hugging Face Autotrain. These are notes to myself on using Hugging Face AutoTrain. The first version of this had a very nice interface where you could simply upload a folder of images and train a model. It was limited in the range of tasks and models, but made up for that in ease of use.

Nothing to lose but our tiny royalty cheques: on the proposed open access books policy for the next REF
Published March 27, 2024 in Samuel Moore
Samuel Moore
Open access policy mandates have never been an effective way of convincing researchers of the benefits of exploring alternative, open publishing practices. Forcing someone to do something will not help them engage with the reasons for doing it. Instead, the mandate feels like a simple tickbox exercise that can be ignored once fulfilled.

The changing tunes of science policy: mapping research priorities of consecutive governments
Published March 28, 2024 in Leiden Madtrics
Julián D. Cortés, Catalina Ramírez
Imagine national science policy as a musical chair game. The contestants are the science system actors, such as researchers, research groups, universities, companies, among others. Some actors can have more expertise dancing at the rhythm of salsa than hip-hop, while others might be more agile in finding a seat when the music pauses. The government plays or pauses the music, modulates its speed or changes the genre.

rOpenSci News Digest, March 2024
Published March 29, 2024 in rOpenSci - open tools for open science
The rOpenSci Team
Dear rOpenSci friends, it’s time for our monthly news roundup! You can read this post on our blog. Now let’s dive into the activity at and around rOpenSci! rOpenSci HQ: Leadership changes at rOpenSci. After 13 years at the helm of rOpenSci, our founding executive director Karthik Ram is stepping down. Noam Ross, rOpenSci’s current lead for peer review, will be our new Executive Director.

From the Founding Director: My Farewell to rOpenSci
Published March 29, 2024 in rOpenSci - open tools for open science
Karthik Ram
Dear rOpenSci community, This is a bit of a bittersweet announcement… After nearly 13 years, it’s time for me to step down as the Executive Director of rOpenSci. In the summer of 2011 I co-founded rOpenSci alongside a group of dedicated colleagues. What began as a casual collaboration among open science enthusiasts quickly evolved into something far more meaningful.

Hello from our New Executive Director!
Published March 29, 2024 in rOpenSci - open tools for open science
Noam Ross
I am pleased, excited, and humbled to announce that I am stepping into the role of Executive Director of rOpenSci starting April 1. First, let me give my gratitude to our outgoing Executive Director and friend Karthik Ram for his leadership and mentorship running rOpenSci the past decade. He’s been a steady hand and visionary that helped our community accomplish so much together in this time.

Automatic Reply
Published March 29, 2024 in Everything is Connected
Ernesto Priego
What would it take to get up without machines to not be powered and pushed down this is poor because it’s us spring comes and we say it feels like winter but it’s not the same, is it you should feel it at the crack of dawn at least before the clocks are changed once more this is poor because it’s us What would it take to reconsider our roles, the day-to-day the routines encoded in our brains witness the changes in the world beyond open windows,

Four paths to studying pneumaticity inexpensively, and why you should
Published March 29, 2024 in Sauropod Vertebra Picture of the Week
Matt Wedel

Community vis-à-vis Forum
Published March 29, 2024 in Donny Winston
Donny Winston
I think of a community as a state (-ity) of having a purpose in mind (mmun->mean) together (co-), not as an endurable space. I think of a forum as an endurable space, as a doored (from the Latin fores, i.e. door) space of focus (from the French foyer). How many makes a community? I don’t know.

Open Science Retreat #1: impressions
Published March 31, 2024 in chem-bla-ics
Egon Willighagen
Last week I attended the Open Science Retreat (#osr24nl) in a quiet and relaxing region in North-Holland. The meeting was how I like all meetings to be (and I count myself lucky many of my meetings are like this): open, welcoming, constructive, diverse, and intellectually challenging. Not all scientific meetings are like this and it is easy to end up going to obligatory meetings where the discussions are of a different level.

Combining Knowledge Graphs With Language Models for Interpretability
Published April 1, 2024 in Stories by Research Graph on Medium
Amanda Kau
Incorporating Knowledge Graphs to explain reasoning processes. Author: Amanda Kau (ORCID: 0009-0004-4949-9284). Introduction: Large language models (LLMs) like GPT-4 possess remarkable language abilities, allowing them to function as chatbots, translators, and much more.

Open Science Retreat #2: CiTO Nanopublications
Published April 2, 2024 in chem-bla-ics
Egon Willighagen
During the Open Science Retreat I organized a short session where we looked into typing citation intentions using a new nanopublication template. First, let’s describe nanopublications (originally used in doi:10.3233/ISU-2010-0613) a bit. Scholia gives a nice overview of (macro?)publications on the topic.

NLU vs. NLG: Unveiling the Two Sides of Natural Language Processing
Published April 2, 2024 in Stories by Research Graph on Medium
Dhruv Gupta
Understanding the Power and Applications of Natural Language Processing. Author: Dhruv Gupta (ORCID: 0009-0004-7109-5403). Introduction: We are living in the era of generative AI. In an era where you can ask AI models almost anything, they will most certainly have an answer to the query. With the increased computational power and the amount of textual data, these models are bound to improve their performance.

Tuning Vision-Language Models and Generative Models with Knowledge Graph
Published April 2, 2024 in Stories by Research Graph on Medium
Vaibhav Khobragade
Bridging Human Perception and AI’s Future: The Convergence of Visual Understanding and Semantic Networks. Author: Vaibhav Khobragade (ORCID: 0009-0009-8807-5982). Introduction: The fusion of Vision-Language Models (VLMs), Generative Models, and Knowledge Graphs (KGs) is reshaping how artificial intelligence (AI) understands and interacts with the world.

Recent Advances in Using Machine Learning with Graphs — Part 2
Published April 2, 2024 in Stories by Research Graph on Medium
Xuzeng He
Latest findings in multiple research directions for handling graph construction and network security issues. Author: Xuzeng He (ORCID: 0009-0005-7317-7426). Introduction: A graph, in short, is a description of items linked by relations, where the items of a graph are called nodes (or vertices) and their relations are called edges (or links). Examples of

Transforming Research Culture - Introducing the Evaluation & Culture Focal Area at CWTS
Published April 2, 2024 in Leiden Madtrics
Leiden Madtrics
What reforms in how we assess and value research are necessary to better equip public science systems for the existential challenges of the 21st century? How can we understand and tackle issues such as inequitable access to scientific literature, increasing strain on peer review systems, and publisher oligopolies?

Research integrity, preprints and where should the responsibility lie?
Published April 2, 2024 in Samuel Moore
Samuel Moore
Last week, The Scholarly Kitchen posted an article by Angela Cochran, Vice President of Publishing at the American Society of Clinical Oncology, about the inability of publishers to deal with research fraud. She writes: Cochran’s argument is that although publishers manage the peer review process, it was never an expectation of peer review that they would perform ‘forensic analysis’ of datasets and associated materials.

Silencing with Red Tape
Published April 2, 2024 in I.D.E.A.S.
Tejas S. Sathe
“Do you have an IRB?” is often the first question anyone asks when a new research idea is proposed. It shouldn’t be the last. Despite the personal frustrations of surgeons who deal with them, institutional review boards (IRBs) maintain a revered status within our profession. For instance, the perceived necessity of IRB review is so ubiquitous that most readers will know to ask the above question even if they do not know why.

New paper: pneumatic dorsal ribs in Apatosaurus and Brontosaurus
Published April 3, 2024 in Sauropod Vertebra Picture of the Week
Matt Wedel

Keystone: papermill detection with network analysis
Published April 3, 2024 in Stories by Adam Day on Medium
Adam Day
TL;DR: We can detect individuals with a high probability of being involved in milling papers. The question is: how should we respond? A few years ago, my bike was stolen. It was an organised job. The thieves arrived after dark, cut a hefty lock clean off, and my bike disappeared silently forever. I imagined it dismantled and sold off into some vast black market network. Poor bike. The police admitted that they weren’t going to do anything.

JOSSCast #7: Adding defect analysis to the Materials Project – Jimmy Shen on pymatgen-analysis-defects
Published April 4, 2024 in Journal of Open Source Software Blog
Arfon M. Smith
JOSSCast: Open Source for Researchers, by The Journal of Open Source Software. Subscribe now: Apple, Spotify, YouTube, RSS. Jimmy Shen sat

Visualizing {dplyr}’s mutate(), summarize(), group_by(), and ungroup() with animations
Published April 4, 2024 in Andrew Heiss’s blog
Andrew Heiss
I’ve used Garrick Aden-Buie’s tidyexplain animations since he first made them in 2018. They’re incredibly useful for teaching—being able to see which rows left_join() includes when merging two datasets, or which cells end up where when pivoting longer or pivoting wider is so valuable.

Eclipse Day: 8 April 2024
Published April 5, 2024 in Triton Station
Stacy McGaugh
Perhaps the most compelling astronomical phenomenon accessible to a naked-eye observer is a total eclipse of the sun. These rare events have always fascinated us, and often terrified us. It is abnormal and disturbing for the sun to be blotted from the sky! A solar eclipse will occur on Monday, 8 April 2024. A partial eclipse will be visible from nearly every part of North America.

cdk2024 #1: NWO Open Science grant for the Chemistry Development Kit
Published April 7, 2024 in chem-bla-ics
Egon Willighagen
We recently got awarded our second NWO Open Science grant (OSF23.2.097), this time for the Chemistry Development Kit (CDK). “We” here is me and Alyanne de Haan, René van der Ploeg, and Marc Teunis from Hogeschool Utrecht. The proposal has been submitted for public dissemination in RIO Journal, like we did with the first NWO Open Science grant.

A conceptual view of information retrieval - can we do better with AI?
Published April 7, 2024 in Aaron Tay’s Musings about librarianship
Aaron Tay
I’ve watched with interest, as academic search engines use AI to improve searching. Elicit is probably currently the leading example of this, using transformer based language models to

Walking the talk: a peak into Open Science practices at CWTS
Published April 8, 2024 in Leiden Madtrics
Ana Parrón Cabañero
Open Science at CWTS in retrospect Leiden University sees Open Science (OS) as a key element on the path towards making greater scientific and societal impact and fostering research quality and integrity.

Simplifying Rogue Scholar Infrastructure
Published April 8, 2024 in Front Matter
Martin Fenner
From the start last year one important goal for the Rogue Scholar science blog archive was to make it easy to use for blog authors and readers. Today I want to focus on another aspect: keep it simple to run Rogue Scholar infrastructure. To address that goal I started development work last week to further simplify one important aspect of Rogue Scholar infrastructure: metadata conversion.

Feeding the Scholarly Need
Published April 8, 2024 in Donny Winston
Donny Winston
This post marks the re-introduction of a feed for each tag on this blog. I want this so that I can post without worrying about contributing to “pollution” of the scholarly record. I can accomplish this by tagging posts as #scholarly when I want them to be e.g. fetched by The Rogue Scholar for DOI minting and for subsequent linking to my ORCiD profile. This post should hopefully be my last act of such pollution.

Beginner’s Guide to Recurrent Neural Networks (RNNs) with Keras
Published April 8, 2024 in Stories by Research Graph on Medium
Wenyi Pi
Understanding Sequential Data Modelling with Keras for Time Series Prediction. Author: Wenyi Pi (ORCID: 0009-0002-2884-2771). Introduction: Recurrent Neural Networks (RNNs) are a special type of neural network suitable for learning representations of sequential data like text in Natural Language Processing (NLP). We will walk through a complete example of using RNNs for time series prediction, covering

- a different type of AI agent style search optimized for high recall?
Published April 9, 2024 in Aaron Tay’s Musings about librarianship
Aaron Tay
In the last blog post, I argued that despite the advancements in AI thanks to transformer based large language models, most academic search is still focused mostly on supporting exploratory searches and does not focus on optimizing recall, and in fact trades off low latency for accuracy.
