Rogue Scholar Digest January 31, 2024

digest

This is a summary of the Rogue Scholar blog posts published January 24 - January 30, 2024.

Author: Martin Fenner
Affiliation: Front Matter
Published: January 31, 2024

Code
import requests
import locale
import re
from typing import Optional
import datetime
from IPython.display import Markdown

locale.setlocale(locale.LC_ALL, "en_US")
baseUrl = "https://api.rogue-scholar.org/"
published_since = "2024-01-24"
published_until = "2024-01-30"
feature_image = 1
include_fields = "title,authors,published_at,summary,blog_name,blog_slug,doi,url,image"
url = (
    baseUrl
    + f"posts?published_since={published_since}&published_until={published_until}&language=en&sort=published_at&order=asc&per_page=50&include_fields={include_fields}"
)
response = requests.get(url, timeout=30)
response.raise_for_status()  # stop early on an HTTP error instead of parsing an error body
result = response.json()


def get_post(post):
    return post["document"]


def format_post(post):
    doi = post.get("doi", None)
    url = f"[{doi}]({doi})\n<br />" if doi else ""
    title = f"[{post['title']}]({doi})" if doi else f"[{post['title']}]({post['url']})"
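    # utcfromtimestamp() is deprecated since Python 3.12; use a timezone-aware datetime
    published = datetime.datetime.fromtimestamp(
        post["published_at"], tz=datetime.timezone.utc
    )
    # Build the date by hand: "%-d" (day without leading zero) is not supported on Windows
    published_at = f"{published:%B} {published.day}, {published.year}"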
    blog = f"[{post['blog_name']}](https://rogue-scholar.org/blogs/{post['blog_slug']})"
    author = ", ".join([f"{x['name']}" for x in post.get("authors", None) or []])
    summary = post["summary"]
    return f"### {title}\n{url}Published {published_at} in {blog}<br />{author}<br />{summary}\n"


posts = [get_post(x) for x in result["hits"]]
posts_as_string = "\n\n".join([format_post(x) for x in posts])

def doi_from_url(url: str) -> Optional[str]:
    """Return a DOI from a URL"""
    match = re.search(
        r"\A(?:(http|https)://(dx\.)?(doi\.org|handle\.stage\.datacite\.org|handle\.test\.datacite\.org)/)?(doi:)?(10\.\d{4,5}/.+)\Z",
        url,
    )
    if match is None:
        return None
    return match.group(5).lower()

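images = [x["image"] for x in posts if x.get("image", None) is not None]
# Fall back to no feature image when too few posts provide one
image = images[feature_image] if len(images) > feature_image else None
markdown = f"![]({image})\n\n" if image else ""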
markdown += posts_as_string
Markdown(markdown)
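The listing above defines `doi_from_url` but never calls it. As a minimal sketch of what it returns, the block below repeats the function unchanged and applies it to three inputs. The first DOI URL is taken from this digest; the uppercase `doi:` form and the non-DOI URL are illustrative inputs chosen here, not values from the API.

```python
import re
from typing import Optional


def doi_from_url(url: str) -> Optional[str]:
    """Return a DOI from a URL"""
    match = re.search(
        r"\A(?:(http|https)://(dx\.)?(doi\.org|handle\.stage\.datacite\.org|handle\.test\.datacite\.org)/)?(doi:)?(10\.\d{4,5}/.+)\Z",
        url,
    )
    if match is None:
        return None
    # Group 5 is the bare DOI (prefix/suffix), normalized to lowercase
    return match.group(5).lower()


# Illustrative inputs (assumptions, see the paragraph above)
examples = [
    "https://doi.org/10.59349/hg16v-pxx10",  # DOI expressed as URL
    "doi:10.53731/BXGH0-DHJ87",  # doi: prefix, uppercase suffix
    "https://rogue-scholar.org/blogs/front_matter",  # not a DOI
]

for value in examples:
    print(value, "->", doi_from_url(value))
```

The function returns the bare, lowercased DOI for the first two inputs and `None` for the last one, so it could be used to decide whether a post link is a DOI before formatting it.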

An open approach for classifying research publications

Published January 24, 2024 in Leiden Madtrics
Leiden Madtrics
In this post, Nees Jan van Eck and Ludo Waltman introduce an open approach for classifying research publications, contributing to a broader development toward open approaches to bibliometrics. Classifying research publications into research topics or research areas is crucial for many bibliometric analyses. While there are lots of approaches for classifying publications, most of these approaches lack transparency.

JOSSCast #2: Astronomy in the Open – Dr. Taylor James Bell on Eureka!

https://doi.org/10.59349/hg16v-pxx10
Published January 25, 2024 in Journal of Open Source Software Blog |
Arfon M. Smith
In this episode of Open Source for Researchers, hosts Abby and Arfon explore the world of open source software in astronomy with Dr. Taylor James Bell, a BAER Institute postdoc at NASA Ames. Eureka! is an end-to-end pipeline designed for JWST (James Webb Space Telescope) time series observations.

Introducing JOSSCast: Open Source for Researchers 🎉

https://doi.org/10.59349/96nng-1rg33
Published January 25, 2024 in Journal of Open Source Software Blog |
Arfon M. Smith
We’re thrilled to announce the launch of “JOSSCast: Open Source for Researchers” - a podcast exploring new ways open source can accelerate your work. Hosted by Arfon Smith and Abby Cabunoc Mayes, each episode features an interview with different authors of published papers in JOSS. There are 3 episodes available for you to listen to today!

JOSSCast #1: Eva Maxfield Brown on Speakerbox – Open Source Speaker Identification for Political Science

https://doi.org/10.59349/h0hf5-w2786
Published January 25, 2024 in Journal of Open Source Software Blog |
Arfon M. Smith
In the first episode of Open Source for Researchers, hosts Arfon and Abby sit down with Eva Maxfield Brown to discuss Speakerbox, an open source speaker identification tool. Originally part of the Council Data Project, Speakerbox was used to train models to identify city council members speaking in transcripts, starting with cities like Seattle.

rOpenSci News Digest, January 2024

https://doi.org/10.59350/jps9f-gsh89
Published January 25, 2024 in rOpenSci - open tools for open science
The rOpenSci Team
Dear rOpenSci friends, it’s time for our monthly news roundup! You can read this post on our blog. Now let’s dive into the activity at and around rOpenSci! rOpenSci HQ: R-Universe: The R-Universe now builds MacOS ARM64 binaries for use on Apple Silicon (aka M1/M2/M3) systems! Find out more in the related tech note. Coworking: Read all about coworking in our recent post! Join us for social coworking &

Mini RAG using Neo4j

https://doi.org/10.59350/4hcyx-g4j10
Published January 25, 2024 in Research Graph
Research Graph
Authors: Nakul Nambiar, Amir Aryani. Knowledge graphs, which offer a structured representation of data and its relationships, are revolutionising how we organise and access information. With large amounts of data, it sometimes becomes difficult to draw insights from it. This blog article examines how to combine Neo4j, a graph database, with OpenAI’s Retrieval-Augmented Generation (RAG) model to build a robust knowledge management system.

Adding automated end-to-end testing to Rogue Scholar

https://doi.org/10.53731/bxgh0-dhj87
Published January 25, 2024 in Front Matter
Martin Fenner
Last week I reported a small change to the Rogue Scholar science blog submission form. By asking for the homepage URL of the blog instead of the feed URL, I hope to make it easier for users to register their blog. At the same time, I fixed a bug in the submission form, caused by an issue with the database backend. Unfortunately, that bug fix didn’t work as expected.

The Macintosh computer at 40.

https://doi.org/10.59350/f11dr-93t29
Published January 25, 2024 in Henry Rzepa’s Blog
Henry Rzepa
On 24th January 1984, the Macintosh computer was released, as all the media are informing us. Apparently, some are still working. I thought I would give my own personal recollections of that period. In fact, the Mac reached UK stores via a dealership only in 1985.

Commonmeta adds DataCite schema 4.5 support

https://doi.org/10.53731/293b8-kbq75
Published January 26, 2024 in Front Matter
Martin Fenner
On Wednesday DataCite released version 4.5 of the DataCite metadata schema. Today I released updated versions of the commonmeta Ruby and Python libraries that fully support the new schema. You can install them via Rubygems and PyPI, respectively.

Things I am still wondering about generative AI + Search in 2024 - impact of semantic search, generation of answers with citations and more..

https://doi.org/10.59350/vh0zy-9k287
Published January 26, 2024 in Aaron Tay’s Musings about librarianship
Aaron Tay
Earlier related pieces: How Q&A systems based on large language models (e.g. GPT4) will change things if they become the dominant search paradigm - 9 implications for libraries. In the ever-evolving landscape of information retrieval and library science, the emergence of large language models, particularly those based on the transformer architecture like GPT-4, has opened up a Pandora’s box of possibilities and challenges.

Lessons Answer.AI can learn from history’s greatest R&D labs

https://doi.org/10.59350/tc7xn-gnn19
Published January 26, 2024 in FreakTakes
Eric Gilliam
Today’s piece was put together with the help of several conversations with Answer.AI co-founder Jeremy Howard. It is not a “traditional” FreakTakes piece; the research and advice are much more tailored to a specific group of individuals than usual.

3D Molecular model visualisation: 3 Million atoms +

https://doi.org/10.59350/d20n9-rbx62
Published January 27, 2024 in Henry Rzepa’s Blog
Henry Rzepa
In the late 1980s, as I recollected here[1], the equipment needed for real-time molecular visualisation, as it came to be known, was still expensive, requiring custom systems such as Evans and Sutherland PS390 workstations.

Grab-bag post: Parapropalaehoplophorus, a favorite book, Tate 2024

https://doi.org/10.59350/j5jfg-n3k62
Published January 29, 2024 in Sauropod Vertebra Picture of the Week
Matt Wedel
Eoneophron, Parapropalaehoplophorus, Ia io, and friends. The other day Mike wrote to me about the new Hell Creek oviraptorosaur Eoneophron (Atkins-Weltman et al. 2024), commenting that he liked the ‘eoneo’ (old new) part of the name. That sent me down a little etymological rabbit hole.

An update on Rogue Scholar in the fediverse

https://doi.org/10.53731/dnfg0-hge29
Published January 29, 2024 in Front Matter
Martin Fenner
The Rogue Scholar science blogging archive joined the fediverse in August of last year. This week I want to report on an updated strategy for Rogue Scholar, and what it means for science blogs participating in Rogue Scholar. In August I launched a Mastodon instance at Rogue Scholar Social that accepted Science Blog bots as accounts, (semi-)automatically publishing summaries of blog posts via Rogue Scholar.

Open sourcing the news

https://doi.org/10.59350/1fv7c-7c536
Published January 30, 2024 in Chris Hartgerink
Chris Hartgerink
I read a lot of news, but I do not like being consumed by it. The balance between what’s happening now (news), the short or long past (history), and potential futures (foresight or analysis) is a rough one to keep. This post is about making sure that sources are readily accessible where possible, straight from the news. Part of not becoming consumed by the news and its biases is doing source research and verifying information.

Looking back to look ahead: OpenCitations’ achievements in 2023

https://doi.org/10.59350/2v9xz-dxv31
Published January 30, 2024 in OpenCitations blog
Chiara Di Giambattista
The first month of the new year has almost come to an end, and we at OpenCitations have dedicated these weeks after the holiday season to retracing the progress we made as an open infrastructure throughout 2023, an activity that has become a tradition in the past few years.

Opening up the CWTS Leiden Ranking: Toward a decentralized and open model for data curation

Published January 30, 2024 in Leiden Madtrics
Leiden Madtrics
Today, CWTS released the Open Edition of the Leiden Ranking.

Introducing the Leiden Ranking Open Edition

Published January 30, 2024 in Leiden Madtrics
Leiden Madtrics
This post introduces the Open Edition of the CWTS Leiden Ranking, published today by CWTS in collaboration with the Curtin Open Knowledge Initiative, Sesame Open Science, and OurResearch.
