“An article a day keeps ideas in play”

It’s difficult to overstate how important it is for any knowledge worker to read widely and often. All things are connected in some way or another, and finding those connections is the first step toward a meaningful contribution to the scientific community.

Use the tags below to search through our reading recommendations.

Developing a healthcare dataset information resource (DIR) based on Semantic Web
18 Oct 2023 · pubmed:30453940
An impressive tool that lets users ask a variety of questions about a candidate dataset. It supports the basics, like “how many patients” and “is it open source”, but can also go into more detail: for example, which statistical methods have been used on the dataset (extracted from publications in PubMed) and which data elements are used in it. The paper’s focus is not the tool itself but how applying semantic methods enables this kind of functionality. In that sense, it is a great primer on Semantic Web basics such as RDF, SPARQL, and how to combine several disparate ontologies. The extraction of statistical methods from publications is almost a sub-paper of its own, describing their rule-based NER and its results. The paper also covers the basics of the 12 datasets they included, most of which you should read up on if you are a researcher in the health informatics space. Unfortunately, they are still limited by the great equalizer: manual extraction. The Discussion section has some good things to say about how to move away from manual curation.
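To give a feel for what rule-based NER for statistical methods can look like, here is a minimal sketch in Python. It is not the paper's rule set: the `METHOD_PATTERNS` lexicon, the `extract_methods` helper, and the sample abstract are all hypothetical and exist only to illustrate matching surface forms to a canonical method name.

```python
import re

# Hypothetical lexicon: canonical method name -> regex of surface forms.
# These patterns are illustrative assumptions, not the rules from the paper.
METHOD_PATTERNS = {
    "logistic regression": r"logistic\s+regression",
    "Cox proportional hazards": r"cox\s+(?:proportional[\s-]+hazards?|regression|model)",
    "chi-square test": r"chi[\s-]?squared?(?:\s+test)?",
    "t-test": r"(?:student'?s\s+)?t[\s-]?tests?",
    "Kaplan-Meier": r"kaplan[\s-]+meier",
}

# Compile once, case-insensitively, with word boundaries around each rule.
COMPILED = {
    name: re.compile(r"\b(?:" + pattern + r")\b", re.IGNORECASE)
    for name, pattern in METHOD_PATTERNS.items()
}


def extract_methods(text):
    """Return canonical names of the methods mentioned in text, sorted case-insensitively."""
    found = [name for name, rx in COMPILED.items() if rx.search(text)]
    return sorted(found, key=str.lower)


# Made-up sentence standing in for a PubMed abstract.
abstract = (
    "Survival was estimated with Kaplan-Meier curves and compared using "
    "Cox regression; categorical variables were analysed with a chi-squared test."
)
print(extract_methods(abstract))
```

A lexicon like this is cheap to audit and extend, but it only finds methods someone has already written a rule for, which is the same manual-curation bottleneck the review points out.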

Lab Website Template

I had a great experience using the Lab Website Template from the Greene Lab. The documentation is careful and thorough. I debated whether to build a Bootstrap site instead, but ultimately decided that the citation functionality in this template is well worth the dependence on an external group.