Rogue Scholar Digest March 27, 2024

digest

This is a summary of the Rogue Scholar blog posts published March 13 - March 26, 2024.

Author: Martin Fenner

Affiliation: Front Matter

Published: March 27, 2024

Code

import requests
import locale
import re
from typing import Optional
import datetime
from IPython.display import Markdown

locale.setlocale(locale.LC_ALL, "en_US")
baseUrl = "https://api.rogue-scholar.org/"
published_since = "2024-03-12"
published_until = "2024-03-26"
feature_image = 0
include_fields = "title,authors,published_at,summary,blog_name,blog_slug,doi,url,image"
url = (
    baseUrl
    + f"posts?published_since={published_since}&published_until={published_until}&language=en&sort=published_at&order=asc&per_page=50&include_fields={include_fields}"
)
response = requests.get(url, timeout=30)
response.raise_for_status()  # fail early on HTTP errors instead of parsing an error body
result = response.json()


def get_post(post):
    return post["document"]


def format_post(post):
    doi = post.get("doi", None)
    url = f"[{doi}]({doi})\n<br />" if doi else ""
    title = f"[{post['title']}]({doi})" if doi else f"[{post['title']}]({post['url']})"
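    # published_at is a Unix timestamp; utcfromtimestamp() is deprecated since Python 3.12
    published_at = datetime.datetime.fromtimestamp(
        post["published_at"], tz=datetime.timezone.utc
    ).strftime("%B %-d, %Y")  # %-d (day without leading zero) is platform-specific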
    blog = f"[{post['blog_name']}](https://rogue-scholar.org/blogs/{post['blog_slug']})"
    author = ", ".join([f"{x['name']}" for x in post.get("authors", None) or []])
    summary = post["summary"]
    return f"### {title}\n{url}Published {published_at} in {blog}<br />{author}<br />{summary}\n"


posts = [get_post(x) for x in result["hits"]]
posts_as_string = "\n\n".join([format_post(x) for x in posts])

def doi_from_url(url: str) -> Optional[str]:
    """Return a DOI from a URL"""
    match = re.search(
        r"\A(?:(http|https)://(dx\.)?(doi\.org|handle\.stage\.datacite\.org|handle\.test\.datacite\.org)/)?(doi:)?(10\.\d{4,5}/.+)\Z",
        url,
    )
    if match is None:
        return None
    return match.group(5).lower()

images = [x["image"] for x in posts if x.get("image", None) is not None]
image = images[feature_image]
markdown = f"![]({image})\n\n"
markdown += posts_as_string
Markdown(markdown)
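
The `doi_from_url` helper in the listing above is defined but never called there. As a minimal sketch of what it returns, the snippet below repeats the function and runs it on three inputs. The first is a DOI URL from this digest. The mixed-case `doi:` string and the non-DOI URL are made-up inputs for illustration only.

```python
import re
from typing import Optional


def doi_from_url(url: str) -> Optional[str]:
    """Return the lowercased DOI from a DOI URL or bare DOI string, else None."""
    match = re.search(
        r"\A(?:(http|https)://(dx\.)?(doi\.org|handle\.stage\.datacite\.org|handle\.test\.datacite\.org)/)?(doi:)?(10\.\d{4,5}/.+)\Z",
        url,
    )
    if match is None:
        return None
    return match.group(5).lower()


examples = [
    "https://doi.org/10.59350/hkh11-qhg09",  # DOI URL taken from this digest
    "doi:10.53731/2KKKQ-P4H51",  # hypothetical mixed-case input
    "https://rogue-scholar.org/",  # not a DOI, so the helper returns None
]
results = [doi_from_url(x) for x in examples]
for source, doi in zip(examples, results):
    print(f"{source} -> {doi}")
```

The pattern is anchored at both ends, so a string that is neither a bare DOI nor a DOI resolver URL yields `None` rather than a partial match.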

Preprints: Opening Surgical Science to Greater Innovation and Transparency

https://doi.org/10.59350/hkh11-qhg09
Published March 12, 2024 in I.D.E.A.S.
Tejas S. Sathe
This is a preprint of a perspective paper that has been submitted for peer review. Think about the last time you submitted a paper. After meticulous editing, you uploaded your manuscript and hit the “submit” button. Then, you waited. And waited. The reviewers requested revisions, which you obliged. More waiting. Finally, the paper was accepted and sent for proofing. You were one of the lucky ones.

Retraction activities of EdTech journals

https://doi.org/10.59350/vv1gh-0z147
Published March 12, 2024 in Posts | Prof. Dr. Marco Kalz
Marco Kalz
Due to my current side-project of getting a plagiarised, AI-generated book retracted, I was curious to see how selected top-ranked journals in Educational Technology have been actively retracting articles. I have been using the Google Scholar ranking and have focused on journals I would include in this list. Here is a table of my findings.

How Much Can Vision Language Models Really “See”?

https://doi.org/10.59350/zryt2-f3v34
Published March 12, 2024 in Stories by Research Graph on Medium
Research Graph
Exploring the potentials and limitations of Vision Language Models Author: Amanda Kau (ORCID: 0009-0004-4949-9284) The human brain is more extraordinary than any machine we could build. From an early age, many of us gain the ability to comprehend what our eyes tell us and articulate it. Furthermore, we combine evidence from all our senses to reason.

Language Models: Deep Dive into BERT

https://doi.org/10.59350/tfm92-t8y93
Published March 12, 2024 in Stories by Research Graph on Medium
Research Graph
Unlocking the power of language models: A deep dive into BERT Author: Dhruv Gupta (ORCID: 0009-0004-7109-5403) Clive Humby rightly said in 2006, “Data is the new oil”. With data being present everywhere, it has never been more valuable.

Ethics and AI: Confronting the Challenges Ahead

https://doi.org/10.59350/t5gj9-1fe45
Published March 12, 2024 in Stories by Research Graph on Medium
Research Graph
Exploring AI’s Ethical Terrain: Addressing Bias, Security, and Beyond Author: Vaibhav Khobragade (ORCID: 0009-0009-8807-5982) Large language models (LLMs) like OpenAI’s GPT-4, Meta’s LLaMA, and Google Gemini (previously called Bard) have showcased their vast capabilities, from passing bar exams and crafting articles to generating images and website code.

Unveiling Research Trends through OpenAlex Visualization

https://doi.org/10.59350/mcb06-tyv62
Published March 12, 2024 in Stories by Research Graph on Medium
Research Graph
Exploring the OpenAlex Data Structure and Visualization Author: Qingqin Fang (ORCID: 0009-0003-5348-4264) Introduction to OpenAlex In today’s world, the realm of research papers is brimming with countless hot topics, and the sheer volume of publications can be overwhelming.

Why were prehistoric sea creatures so tiny?

https://doi.org/10.59350/y0cds-hgw36
Published March 12, 2024 in Sauropod Vertebra Picture of the Week
Mike Taylor
My friend Toby Lowther wrote to me back in December to ask this question: It’s strange, isn’t it? The last I knew, Shonisaurus was the largest ichthyosaur, at about 20 m and 50 tonnes, and this is considerably bigger than any plesiosaur or mosasaur I know of. It’s up in the sperm-whale size category, but not even close to the bigger baleen whales. Why not?

Rogue Scholar launches project plan for grant-funded projects

https://doi.org/10.53731/2kkkq-p4h51
Published March 13, 2024 in Front Matter
Martin Fenner
The Rogue Scholar science blog archive is launching a new pricing plan this week: Project . Blogs that participate in the Project plan get all the benefits of the existing Team plan – unlimited blog posts with DOI registration, full-text search, and long-term archiving with the Internet Archive – for a one-time fee of $150.

Bird’s Eye View: using R to generate inventory maps for lab reagents

https://doi.org/10.59350/mk6vy-xfr45
Published March 14, 2024 in quantixed
Stephen Royle
This is a rather niche post, but the method can likely be adapted for other use cases. In the lab we have many different cell lines stored in liquid nitrogen. The arrangement is: the vials are in specific positions in a box (10 x 10); there are 13 boxes to a cane; we have 5 canes. Ideally, retrieving the correct vial from the cell store requires a map.
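
The storage arrangement described above (10 x 10 positions per box, 13 boxes per cane, 5 canes) amounts to simple index arithmetic. The original post uses R. The Python sketch below is only an illustration of that arithmetic. The `locate` function, the `VialLocation` type, and the fill order are assumptions, not the post's method.

```python
from typing import NamedTuple

ROWS, COLS = 10, 10  # positions per box (10 x 10), as stated in the post
BOXES_PER_CANE = 13  # as stated in the post
CANES = 5  # as stated in the post
CAPACITY = ROWS * COLS * BOXES_PER_CANE * CANES


class VialLocation(NamedTuple):
    cane: int
    box: int
    row: int
    col: int


def locate(index: int) -> VialLocation:
    """Map a zero-based vial index to 1-based cane/box/row/column labels.

    Assumed (hypothetical) fill order: columns within a row, rows within
    a box, boxes within a cane, then canes.
    """
    if not 0 <= index < CAPACITY:
        raise ValueError(f"index must be in [0, {CAPACITY})")
    box_index, position = divmod(index, ROWS * COLS)
    cane, box = divmod(box_index, BOXES_PER_CANE)
    row, col = divmod(position, COLS)
    return VialLocation(cane + 1, box + 1, row + 1, col + 1)


print(CAPACITY)
print(locate(0))
print(locate(CAPACITY - 1))
```

Under these assumptions the store holds 6,500 vials, and the last index maps to cane 5, box 13, row 10, column 10.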

Jisc should not buy services that it does not want

https://doi.org/10.59350/5vg5w-edj97
Published March 15, 2024 in Sauropod Vertebra Picture of the Week
Mike Taylor
In the first of two disappointing scholarly-communication announcements last week, Jisc announced its report on progress towards open access in the UK. The key finding is: But that’s not the part that disappoints me. Here’s the part that disappoints me: Sometimes I think people don’t know what “transitional” means.

Critical Perspectives on the Metascience Reform Movement

https://doi.org/10.59350/fvz8n-z5287
Published March 15, 2024 in Critical Metascience
Mark Rubin
Few Notes on the Centre for Open Science’s Recent Symposium

The rise of the scinfluencers and the new narrative emerging from pseudo-scientific policy-making

https://doi.org/10.59350/dsf0d-xvj36
Published March 16, 2024 in Posts | Prof. Dr. Marco Kalz
Marco Kalz
The topic of science communication has gained a lot of attention in the last years and no higher education institution has not been involved in concepts related to “transfer” activities or the so-called “third mission” of higher education institutions.

Reusing data: two new papers

https://doi.org/10.59350/zds99-03s42
Published March 17, 2024 in chem-bla-ics
Egon Willighagen
My research is about the interaction of (machine) representation and the impact on the success of data analysis (machine learning, chemometrics, AI, etc). See the posts about molecular chemometrics. This got me into FAIR: making data interoperable and being able to (really) reuse data is the starting point of doing research.

Don’t be an absolutist. Use the here package for reproducible workflows

https://doi.org/10.59350/4a9fr-acc34
Published March 17, 2024 in JP’s blog
JP Monteagudo
TL;DR: Don’t be an absolutist: use relative paths. Use the here package instead of setwd() or getwd() to increase reproducibility and avoid wasting your and other people’s time.

On the tragic fate of PeerJ

https://doi.org/10.59350/75xn9-09028
Published March 17, 2024 in Sauropod Vertebra Picture of the Week
Mike Taylor
I said last time that Jisc’s feeble transition-to-open-access report was the first of two disappointing scholarly-communication announcements that week. The second was of course the announcement that PeerJ has been acquired by Taylor and Francis. Matt and I have both been big fans of PeerJ since before it launched, and we were delighted to have our 2013 neck-anatomy paper in the first batch of articles published there.

An Example of the DRY/DAMP Principles for Package Tests

https://doi.org/10.59350/yt047-xf054
Published March 18, 2024 in rOpenSci - open tools for open science
Maëlle Salmon
rOpenSci’s second cohort of Champions has been onboarded! Their training first started with a session on code style, was followed by three sessions on the basics of R package development, and ended with a session on advanced R package development, which consisted of a potpourri of tips with discussion, followed by time for applying these principles to the participants’ packages. Here, I want to share one of the topics covered: Package testing, and

Rogue Scholar reaches another milestone with 15,000 science blog posts

https://doi.org/10.53731/frdn9-grw19
Published March 18, 2024 in Front Matter
Martin Fenner
Last week the Rogue Scholar science blog archive reached another milestone with 15,000 (15,332 as of today) science blog posts. These posts come from 86 participating blogs with 69% written in English and 28% in German.

Large Language Models and Knowledge Graphs: Ways to combine them

https://doi.org/10.59350/49h4w-14432
Published March 18, 2024 in Stories by Research Graph on Medium
Research Graph
Latest findings in multiple research directions for tackling reasoning and common sense challenges Author: Xuzeng He (ORCID: 0009-0005-7317-7426) Knowledge Graphs, such as Wikidata, contain rich relational information between entities and have been widely used as a structured format for storing and representing relational information.

Detecting anomeric effects in tetrahedral carbon bearing four oxygen substituents.

https://doi.org/10.59350/dfkt5-k2b20
Published March 18, 2024 in Henry Rzepa’s Blog
Henry Rzepa
I have written a few times about the so-called “anomeric effect“, which relates to stereoelectronic interactions in molecules such as sugars bearing a tetrahedral carbon atom with at least two oxygen substituents. The effect can be detected when the two C-O bond lengths in such molecules are inspected, most obviously when one of these bonds has a very different length from the other.

Uncovering The Secrets Behind The Most Influential Scholarly Publications With AceMap

https://doi.org/10.59350/pdh4d-2zf28
Published March 18, 2024 in Stories by Research Graph on Medium
Research Graph
Understanding Knowledge Networks From A Graph Perspective Author: Amanda Kau (ORCID: 0009-0004-4949-9284) Since 2020, over ten million scholarly articles have been published annually. To put that into perspective, say all ten million articles were released on the first day of the year.

The problem for REF 2029

https://doi.org/10.59348/fmt65-4zk03
Published March 19, 2024 in Martin Paul Eve
Martin Paul Eve
The Research Excellence Framework is the UK system for rewarding unhypothecated research funding from the government to universities. It gives a block of funding that can be used in any way that the institution sees fit to advance research. It’s particularly useful in disciplines with less project funding to give research time to individual academics. The problem is, lots of academics hate REF.

Exploring Methods of Cypher Query Optimisations

https://doi.org/10.59350/v989y-tnc93
Published March 19, 2024 in Stories by Research Graph on Medium
Research Graph
Boosting Performance for Knowledge Graphs with Neo4j APOC Library Author: Wenyi Pi (ORCID: 0009-0002-2884-2771) A knowledge graph (graph database) captures information about main entities in a domain and the relationships between them. It was an augmented feature store for connected data which gave access to compute, access and operationalise structure features.

Prompt Engineering

https://doi.org/10.59350/2spqn-rqe66
Published March 19, 2024 in Stories by Research Graph on Medium
Research Graph
Using intelligence to use artificial Intelligence: A deep dive into Prompt Engineering Author: Dhruv Gupta (ORCID: 0009-0004-7109-5403) Large Language Models (LLMs) have become the new normal in the field of Natural Language Processing (NLP). With their improved performance and generative power, people around the world are relying on it for

The Journey of Large Language Models: Evolution, Application, and Limitations

https://doi.org/10.59350/cxn0s-kgr39
Published March 19, 2024 in Stories by Research Graph on Medium
Research Graph
Unlocking the Future of AI: The Transformative Journey of Large Language Models Author: Vaibhav Khobragade (ORCID: 0009-0009-8807-5982) Introduction Human language development is innate and evolves throughout life. Machines lack this ability to evolve without advanced AI algorithms.

Unlocking Intelligence: The Journey from Data to Knowledge Graph

https://doi.org/10.59350/bvg2t-4bb37
Published March 19, 2024 in Stories by Research Graph on Medium
Research Graph
An Overview of Constructing a Knowledge Graph Author: Qingqin Fang (ORCID: 0009-0003-5348-4264) 1. Introduction 1.1 What is a Knowledge Graph Knowledge Graphs are structured semantic knowledge bases used to rapidly describe concepts and their relationships in the physical world.

Desirable Characteristics of Persistent Identifiers

https://doi.org/10.54900/c3hdq-0ev76
Published March 19, 2024 in Upstream
John Chodacki, Todd Carpenter
Considerations in the context of open scholarship and open infrastructure Persistent identifiers (PIDs) in scholarly communications and research infrastructure have garnered growing attention over the last several years, especially from governments who are recognizing the vital role PIDs play in creating a more efficient and trustworthy research ecosystem.

Monster Movie: making movie files from microscopy data

https://doi.org/10.59350/1hyc3-tz213
Published March 19, 2024 in quantixed
Stephen Royle
What’s the best way to make a movie file from microscopy data? Maybe you need to generate a movie for the supplementary info for a paper, or insert one into your electronic lab notebook, or to show in a talk. The problem is that the requirements for each of those are different. This situation is compounded by the fact that there are so many options to make movie files and not much guidance on what is the best method.

Tutorial 43: how to do creative work

https://doi.org/10.59350/zyb68-f4x05
Published March 19, 2024 in Sauropod Vertebra Picture of the Week
Mike Taylor
I was struck by a Mastodon post where classic game developer Ron Gilbert quoted film critic Roger Ebert as follows: And Gilbert commented: In a reply, Gretchen Anderson said her favourite version of this is: I couldn’t find the original source for this, but as I was trying to track it down I ran into this, attributed to Pablo Picasso: When I mentioned these observations to Matt, he sent me a longer-form exposition of the same phenomenon,

The Joel Test: 12 Steps to Better Code

https://doi.org/10.59350/my3v6-cjv02
Published March 19, 2024 in lab.sub - Articles
Michelle Weidling, Mathias Göbel
A few years ago an internal initiative on software quality launched at the Research and Development Department. There is always room for improvements and some guidance can help implement best practices for better workflows and for better code. The very first stop on our ride was at Joel’s Test for better code, published 24 years ago. It is time for a review, again. Do you use source control? Who doesn’t? Can you make a build in one step?

rOpenSci Champions Pilot Year: Projects Wrap-Up

https://doi.org/10.59350/268cb-p0021
Published March 20, 2024 in rOpenSci - open tools for open science
Yanina Bellini Saibene
Our first cohort of the rOpenSci Champions Program has now completed the second phase of the program by developing their project and carrying out outreach activities.

The FAIR for Research Software Principles after two years: an adoption update

https://doi.org/10.59350/h5fvg-sfh38
Published March 20, 2024 in Research Software Alliance
Michelle Barker, Leyla Jael Castro, Bernadette Fritzsch, Daniel S. Katz, Carlos Martinez-Ortiz, Anna Niehues, Alexander Struck, Qian Zhang
By Michelle Barker, Leyla Jael Castro, Bernadette Fritzsch, Daniel S. Katz, Carlos Martinez-Ortiz, Anna Niehues, Alexander Struck, Qian Zhang The FAIR for Research Software (FAIR4RS) Principles aim to promote and encourage the findability, accessibility, interoperability, and reusability (FAIR) of research software. The FAIR4RS Principles were released in 2022, with a number of organisations already planning adoption at that time.

JOSSCast #6: Streamlining Molecular Dynamics – Marjan Albooyeh and Chris Jones on FlowerMD

https://doi.org/10.59349/0twkg-97352
Published March 21, 2024 in Journal of Open Source Software Blog
Arfon M. Smith
Marjan Albooyeh and Chris Jones chat with Arfon and Abby about their experience building FlowerMD, an open-source library of recipes for molecular dynamics workflows. Marjan and Chris are both grad students in Dr. Jankowski’s lab at Boise State University where they use molecular dynamics to study materials for aerospace applications and organic solar cells.

Demystifying causal inference estimands: ATE, ATT, and ATU

https://doi.org/10.59350/c9z3a-rcq16
Published March 21, 2024 in Andrew Heiss’s blog
Andrew Heiss
In my causal inference class, I spend just one week talking about the Rubin causal model and potential outcomes.
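
The three estimands named in the title have standard definitions in the potential outcomes framework: the ATE averages the individual effect over everyone, the ATT over treated units only, and the ATU over untreated units only. The sketch below is not taken from the post. It uses invented toy data and pretends that both potential outcomes are known for every unit, which is never the case with real data.

```python
from statistics import mean

# Toy data invented for illustration:
# (treated?, outcome without treatment, outcome with treatment)
people = [
    (True, 3.0, 7.0),
    (True, 4.0, 6.0),
    (False, 5.0, 6.0),
    (False, 2.0, 5.0),
]

# Individual-level effect: outcome with treatment minus outcome without
effects = [(treated, y1 - y0) for treated, y0, y1 in people]

ate = mean(e for _, e in effects)  # average over all units
att = mean(e for t, e in effects if t)  # average over treated units
atu = mean(e for t, e in effects if not t)  # average over untreated units

print(f"ATE={ate}, ATT={att}, ATU={atu}")
```

With equal numbers of treated and untreated units, as in this toy data, the ATE is the simple average of the ATT and the ATU.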

Global reach, local insights: Using book ISBNs to map publishing behaviour

https://doi.org/10.59350/qyt8j-f4b98
Published March 21, 2024 in Leiden Madtrics
Eleonora Dagiene
Scholarly book evaluation often prioritises ‘prestige’, which leads to inconsistent and unfair outcomes. My previous research shows that such systems consider neither the intrinsic quality of the research nor the accessibility of the work itself.

Internet Archeology: reviving a 2001 article published in the Internet Journal of Chemistry.

https://doi.org/10.59350/xqerh-wam97
Published March 21, 2024 in Henry Rzepa’s Blog
Henry Rzepa
In the mid to late 1990s as the Web developed, it was becoming more obvious that one area it would revolutionise was that of scholarly journal publishing. Since the days of the very first scientific journals in the 1650s, the medium had been firmly rooted in paper. Even printed colour only became common (and affordable) from the 1980s. An opportunity to move away from these restrictions was provided by the Web.

The (potential) impact of AI on the individual research process and science in general

Published March 22, 2024 in Elephant in the Lab
Sascha Schönig
AI has the potential to change research. There’s a lot of positive potential, but at the same time, there are also fears associated with the influence on academic activity.

We are looking for a PhD student

https://doi.org/10.59350/y34ra-34f08
Published March 22, 2024 in bjoern.brembs.blog
Björn Brembs
We are looking for a PhD student interested in the functional, molecular and structural profile of neuronal circuits underlying learning, memory and behavior. In a 30-year research effort (lay summary, paper), we have recently identified a new gene (atypical PKC, aPKC) necessary for a form of motor learning in the fruit fly Drosophila, as well as the neurons in which it is required.

The Anatomy of Gossip: Dissecting Dynamics and Impacts in Surgical Residency

https://doi.org/10.59350/mehsk-d4t04
Published March 22, 2024 in I.D.E.A.S.
Joseph L’Huillier
This is a preprint of a manuscript currently under peer review. Abstract Importance: Gossip, defined by social scientists as “evaluative talk about an absent third party,” is anecdotally pervasive, yet poorly understood in surgical residency programs. Objective: This study sought to deconstruct the role of gossip in surgical residency and evaluate its impact through the lens of surgical residents.

Why entering your query in natural question leads to better result than keyword searching with the latest AI powered (Dense retrieval/embedding models) search

https://doi.org/10.59350/t3vmh-wnd26
Published March 23, 2024 in Aaron Tay’s Musings about librarianship
Aaron Tay
One of the tricks about using the newer “AI powered” search systems like Elicit, SciSpace and even JSTOR experiment search is that they recommend that you type in your query or what you want in full natural language and not keyword search style (where you drop the stop words) for better results. So for example do

The Pitfalls of Preregistration

https://doi.org/10.59350/dymqx-wk326
Published March 25, 2024 in Critical Metascience

Notes on Presentations by Chris Donkin and Stephan Lewandowsky

Recent Advances in using Machine Learning with Graphs

https://doi.org/10.59350/m6a7z-dp464
Published March 25, 2024 in Stories by Research Graph on Medium
Research Graph
Latest findings in multiple research directions for handling graph prediction and optimization Author: Xuzeng He (ORCID: 0009-0005-7317-7426) A graph, in short, is a description of items linked by relations, where the items of a graph are called nodes (or vertices) and their relations are called edges (or links). Examples of graphs can include social networks (e.g. Instagram)

The SolidWorks Model of Simulation

https://doi.org/10.59350/64rk5-eez50
Published March 25, 2024 in Corin Wagen
Corin Wagen
Apologies for the long hiatus: we’ve had some health issues in the family, and startup life has been particularly overwhelming. With any luck, I’ll be able to return to a more regular posting frequency soon. What’s the right relationship between theory, computation, and experiment? Much has been written on this.

Automated Knowledge Graph Construction with Large Language Models

https://doi.org/10.59350/e9jz5-r3357
Published March 25, 2024 in Stories by Research Graph on Medium
Research Graph
Harvesting the Power and Knowledge of Large Language Models Author: Amanda Kau (ORCID: 0009-0004-4949-9284) Introduction Knowledge graphs (KGs) are networks that represent data in a graphical format.

Navigating the Long Context Conundrum: Challenges in Language Models’ Information Processing

https://doi.org/10.59350/jdd9b-v7v26
Published March 26, 2024 in Stories by Research Graph on Medium
Research Graph
Author: Qingqin Fang (ORCID: 0009-0003-5348-4264) Introduction In the era of AI breakthroughs, large language models (LLMs) are not just advancements; they are revolutions, transforming how we interact with technology, from casual conversations with chatbots to the intricate mechanisms behind sophisticated data analysis tools.

An Introduction to Recurrent Neural Networks (RNNs)

https://doi.org/10.59350/gjsvh-zbk81
Published March 26, 2024 in Stories by Research Graph on Medium
Research Graph
Understanding how RNNs work and its applications Author: Wenyi Pi (https://orcid.org/0009-0002-2884-2771) Introduction In the ever-evolving landscape of artificial intelligence (AI), bridging the gap between humans and machines has seen remarkable progress.

Generative AI’s Leap into Video in 2024 and its Ethical Horizon

https://doi.org/10.59350/xjcaa-rff55
Published March 26, 2024 in Stories by Research Graph on Medium
Research Graph
Exploring the Boundaries of Creativity and Responsibility in the Age of AI-Driven Media Author: Vaibhav Khobragade (ORCID: 0009-0009-8807-5982) Introduction In 2024, the discipline of Generative AI takes a big step forward with the launch of revolutionary models that convert text into dynamic films, altering the landscape of digital content creation.

Smaller: methods to reduce the size of a PDF file

https://doi.org/10.59350/2vmnj-7g870
Published March 26, 2024 in quantixed
Stephen Royle
Hot on the heels of the post on how to downsize microscopy movie files, let’s look at ways to shrink the size of a PDF file. There are several ways to tackle this – suggestions came from this thread on Mastodon. Scenario: you have created a preprint/manuscript/proposal in PDF format.

AIBC – India. Biocuration, long days, even longer traffic jams.

https://doi.org/10.59350/3n60a-3c451
Published March 26, 2024 in GigaBlog
Chris Hunter
The Annual International Biocuration Conference (AIBC) was held for the first time in India, at the Indian Biological Data Centre (IBDC), Regional Centre for Biotechnology (RCB), Faridabad and co-hosted by the Department of Plant Molecular Biology, University of Delhi South Campus. As usual, GigaDB had representation at the event (see write-ups of many previous meetings here): Mary Ann Tuli and Chris Hunter, both of whom were wearing two hats!
