Rogue Scholar Digest February 14, 2024


This is a summary of the Rogue Scholar blog posts published February 7 - February 13, 2024.


Martin Fenner

Front Matter


February 14, 2024

import requests
import locale
import re
from typing import Optional
import datetime
from IPython.display import Markdown

locale.setlocale(locale.LC_ALL, "en_US")
baseUrl = ""
published_since = "2024-02-07"
published_until = "2024-02-13"
feature_image = 0
include_fields = "title,authors,published_at,summary,blog_name,blog_slug,doi,url,image"
url = (
    baseUrl
    + f"posts?&published_since={published_since}&published_until={published_until}&language=en&sort=published_at&order=asc&per_page=50&include_fields={include_fields}"
)
response = requests.get(url)
result = response.json()

def get_post(post):
    return post["document"]

def format_post(post):
    doi = post.get("doi", None)
    url = f"[{doi}]({doi})\n<br />" if doi else ""
    title = f"[{post['title']}]({doi})" if doi else f"[{post['title']}]({post['url']})"
    published_at = datetime.datetime.utcfromtimestamp(post["published_at"]).strftime(
        "%B %-d, %Y"
    )
    blog = f"[{post['blog_name']}]({post['blog_slug']})"
    author = ", ".join([f"{x['name']}" for x in post.get("authors", None) or []])
    summary = post["summary"]
    return f"### {title}\n{url}Published {published_at} in {blog}<br />{author}<br />{summary}\n"

posts = [get_post(x) for x in result["hits"]]
posts_as_string = "\n\n".join([format_post(x) for x in posts])

def doi_from_url(url: str) -> Optional[str]:
    """Return a DOI from a URL"""
    match =
    if match is None:
        return None

images = [x["image"] for x in posts if x.get("image", None) is not None]
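# Use the feature image only if enough posts have an image
image = images[feature_image] if len(images) > feature_image else None
markdown = f"![]({image})\n\n" if image else ""
markdown += posts_as_string
# Render the digest in the notebook output
Markdown(markdown)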
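
The `doi_from_url` helper in the code above has no pattern on its `match =` line. As a minimal sketch of how such a helper might work, the following uses an assumed regular expression (a `10.` prefix with a 4 to 9 digit registrant code, a slash, then a suffix). The name `doi_from_url_sketch`, the pattern, and the lowercasing step are illustrative assumptions, not the code used to generate this digest.

```python
import re
from typing import Optional

# Assumed pattern, chosen for illustration only: "10.", a 4-9 digit
# registrant code, "/", then a suffix that ends at whitespace, "?" or "#".
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s?#]+")


def doi_from_url_sketch(url: str) -> Optional[str]:
    """Return the bare DOI contained in a URL, or None if none is found."""
    match = DOI_PATTERN.search(url)
    if match is None:
        return None
    # DOIs are case-insensitive, so this sketch normalizes to lowercase.
    return match.group(0).lower()


# Example calls with made-up URLs
print(doi_from_url_sketch("https://doi.org/10.53731/ABCD-1234"))
print(doi_from_url_sketch("https://example.org/some-post"))
```

Lowercasing makes DOIs easier to compare and deduplicate, but it is a design choice; a stricter version could also check that the host is `doi.org` before accepting a match.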

JOSSCast #3: Studying Superbugs – Juliette Hayer on Baargin
Published February 8, 2024 in Journal of Open Source Software Blog
Arfon M. Smith
Juliette Hayer joins Arfon and Abby to discuss Baargin, an open source tool she created to analyze bacterial genomes, especially those resistant to antibiotics.

Introducing the Rogue Scholar Advisory Board
Published February 8, 2024 in Front Matter
Martin Fenner
In January 2024 the new Rogue Scholar Advisory Board had its first meeting. It consists of six people with diverse expertise in scholarly blogging. Advisory Board members come from different scholarly disciplines and geographic regions, write in several languages besides English, and have different levels of technical expertise.

Announcing Rogue Scholar Preview
Published February 12, 2024 in Front Matter
Martin Fenner
Today the Rogue Scholar science blog archive launched a new feature: Rogue Scholar Preview. This new functionality enables the import of new science blogs into the preview version of the production service. This allows users to see what their blog posts will look like in the Rogue Scholar service, and to resolve issues if necessary.

How to use GROBID
Published February 12, 2024 in Stories by Research Graph on Medium
Research Graph
How to use GROBID to extract text from PDF. Author: Aland Astudillo. GROBID is a powerful and useful tool based on machine learning that can extract text information from PDF files and other files into a structured format. One of the key challenges in knowledge mining from academic articles is reading the content of PDF files.

INFORMATE: When Are the Data?
Published February 13, 2024 in Upstream
Ted Habermann, Jamaica Jones, Howard Ratner, Tara Packer
In a recent Upstream blog post we explored where data connected to papers funded by several U.S. Federal Agencies are published. Different data sharing practices across these agencies led to very different distributions of datasets across various repositories. We used CHORUS reports that combine linked article and dataset metadata as input for that work.
