Explore all episodes of the Machine-Centric Science podcast
| Title | Date | Duration | |
|---|---|---|---|
| Sandra Gesing | 17 Feb 2023 | 00:41:21 | |
An interview about FAIR software, workflows, and virtual research environments (VREs) / science gateways with Sandra Gesing, currently a Senior Research Scientist and Scientific Outreach and Diversity, Equity, and Inclusion (DEI) Lead at the Discovery Partners Institute at the University of Illinois, Chicago. | |||
| Christophe Blanchi | 18 Jan 2023 | 01:12:10 | |
https://doi.org/20.500.14132/chris --> Digital Object Identifier Resolution Protocol (DO-IRP): https://www.dona.net/sites/default/files/2022-06/DO-IRPV3.0--2022-06-30.pdf | |||
| Patrick Huck | 21 Jul 2022 | 00:53:08 | |
Materials Project (MP) website: https://materialsproject.org/ Novel Materials Discovery (NOMAD) Laboratory: https://nomad-lab.eu/ Contributor Roles Taxonomy: https://credit.niso.org/ Authentication resources (FAIR A1.2): U.S. Department of Energy resources: Connecting with Patrick: | |||
| FAIR Implementation Profile (FIP) Ontology | 15 Jul 2022 | 00:09:43 | |
The FAIR Implementation Profile (FIP) Ontology: https://w3id.org/fair/fip/terms/FIP-Ontology | |||
| R1.3: metadata and data meet domain-relevant community standards | 20 Jun 2022 | 00:08:10 | |
Linked Open Vocabularies (LOV): https://lov.linkeddata.es/dataset/lov/ FAIRSharing: https://fairsharing.org/ PageRank of Linked Open Vocabularies (LOV): https://donnywinston.com/posts/pagerank-of-linked-open-vocabularies-lov/ Principles of Open Scholarly Infrastructure (POSI): https://openscholarlyinfrastructure.org/ | |||
| R1.2: Metadata and data are associated with detailed provenance | 02 Jun 2022 | 00:07:29 | |
https://www.w3.org/TR/prov-dm/#dfn-provenance # Component 1: Entities/Activities: Relation: Trigger/Starter of Start of Act (trigger E, starter Act)
Relation: Revision (E-E) # Component 3: Agents, Responsibility, and Influence Relation: Influencer/Influencee ({E,Act,Agt}-[usage,start,end,generation,invalidation,communication,derivation,attribution,association,delegation]-{E,Act,Agt}) 3 core types: entities, activities, agents. “instantaneous events” are put in context of activities. 10 influencing relations (not including 3 included subtypes of derivation - (1) [was] revision [of], (2) quotation ("was quoted from"), (3) [had] primary source). | |||
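As a rough illustration of the notes above, here is a minimal Python sketch of the three core PROV types and a few influence relations between them. The relation strings are PROV-O property names; the class layout, the `ex:` identifiers, and the `influencers` helper are assumptions made for this example, not part of the episode or of any PROV library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    id: str


@dataclass(frozen=True)
class Activity:
    id: str


@dataclass(frozen=True)
class Agent:
    id: str


# Hypothetical example resources.
raw = Entity("ex:raw-data")
clean = Entity("ex:clean-data")
cleaning = Activity("ex:cleaning-run")
alice = Agent("ex:alice")

# Each statement is (influencee, relation, influencer).
statements = [
    (cleaning, "used", raw),                  # usage
    (clean, "wasGeneratedBy", cleaning),      # generation
    (clean, "wasDerivedFrom", raw),           # derivation
    (cleaning, "wasAssociatedWith", alice),   # association
    (clean, "wasAttributedTo", alice),        # attribution
]


def influencers(subject):
    """Return everything recorded as influencing the given subject, in statement order."""
    return [obj for subj, _, obj in statements if subj == subject]
```

Calling `influencers(clean)` walks the statements and returns the activity, entity, and agent that the cleaned data depends on, which is the kind of question provenance metadata is meant to answer.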
| R1.1: (Meta)data are released with a clear and accessible data usage license | 25 May 2022 | 00:11:45 | |
The Creative Commons suite of licenses: CC0, CC BY, CC BY-SA, CC BY-ND, CC BY-NC, CC BY-NC-SA, CC BY-NC-ND. Code licenses: Server Side Public License, Affero GPL (AGPL), Lesser GPL (LGPL), Mozilla Public License (MPL), Business Source License (used e.g. by Sentry, <https://github.com/getsentry/sentry/blob/master/LICENSE>), Elastic License (for Elasticsearch), Apache 2.0, BSD, MIT. Spectrum of user freedom and redistributor freedom. "The CRAPL: An academic-strength open source license": <https://matt.might.net/articles/crapl/> | |||
| R1: (Meta)data are richly described with a plurality of accurate and relevant attributes | 18 May 2022 | 00:09:12 | |
* https://queryunderstanding.com | |||
| I3: (Meta)data include qualified references to other (meta)data | 12 May 2022 | 00:05:47 | |
In the W3C Provenance Ontology: The HTML Anchor Element: | |||
| I2: (Meta)data use vocabularies that follow the FAIR principles | 04 May 2022 | 00:06:26 | |
Heather Hedden, "Foundation for a Knowledge Graph Taxonomy Design Best Practices", slides at https://zenodo.org/record/6510205 Teodora Petkova, "The Dialogic Potential of the Web of Data", slides at https://zenodo.org/record/6518557 https://en.wikipedia.org/wiki/Bohm_Dialogue Tim Berners-Lee's bag of chips https://www.w3.org/TR/vocab-dcat-2/#Class:Dataset https://schema.org/Dataset | |||
| I1: (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation | 27 Apr 2022 | 00:08:25 | |
GUPRIs, RDF, RDFS, OWL, SHACL, JSON, JSON-LD, JSON Schema, ActivityPub, "fediverse", XMPP, SMTP. | |||
| A2. Metadata are accessible, even when the data are no longer available | 19 Apr 2022 | 00:04:39 | |
Archival Resource Key (ARK) specification (section on policy metadata): https://datatracker.ietf.org/doc/html/draft-kunze-ark-34#section-5.1.1. Permanence Levels and the Archives for NIH NLM's Permanent Web Documents: https://www.nlm.nih.gov/pubs/techbull/ma05/ma05_archive.html. | |||
| Vineeth Venugopal | 31 Oct 2022 | 00:59:19 | |
https://en.wikipedia.org/wiki/Interatomic_potential | |||
| A1.2: The protocol allows for authentication and authorisation where necessary | 13 Apr 2022 | 00:05:04 | |
A brief dip into the world of HTTP auth. The Authorization request header. The WWW-Authenticate response header. Basic authentication. Bearer-based authentication. Authenticating securely. Shared secrets versus asymmetric encryption (for non-repudiation). | |||
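A minimal sketch of the two schemes mentioned above, using only the Python standard library. It builds `Authorization` header values for Basic and Bearer authentication without sending any request; the function names and the sample credentials are hypothetical.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return an Authorization header value for the Basic scheme."""
    # Basic auth is base64 of "username:password". This is an encoding,
    # not encryption, so it should only be sent over TLS.
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def bearer_auth_header(token: str) -> str:
    """Return an Authorization header value for the Bearer scheme."""
    return f"Bearer {token}"


# Hypothetical credentials, for illustration only.
headers = {"Authorization": basic_auth_header("alice", "s3cret")}
```

A server that rejects the request would typically answer with status 401 and a `WWW-Authenticate` response header naming the scheme it expects.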
| A1.1: The protocol is open, free and universally implementable | 05 Apr 2022 | 00:03:41 | |
Protocol versus implementation. HTTP, SMTP, Zulip. | |||
| A1: (Meta)data are retrievable by their identifier using a standardized communication protocol | 29 Mar 2022 | 00:02:48 | |
You want to avoid protocols with limited implementation, poor documentation, and, when possible, components involving human intervention. It may not be possible to provide secure access through a fully mechanized protocol like HTTP, for example, for highly sensitive data. However, the protocol must be clear and explicit in the metadata, whether it involves a verbal request, email, telephone number, Slack username, et cetera. The important thing is that the communication protocol for how to access is explicit and clearly defined in the metadata, whether fully mechanized or not. | |||
| F4: (Meta)data are registered or indexed in a searchable resource | 22 Mar 2022 | 00:06:42 | |
The goal here is leverage: increasing the ratio of machine action to user action in getting to the data that they want. Otherwise, your data is technically findable, but it's going to require a lot of user action. They might have to do a full data download, scan through a full table, scroll through a long webpage, and it's unlikely that they're going to actually find what they need, because they're just not going to put in that much effort. So you really want indexing. You want this leverage to have your machine help do some of the action that a user might otherwise do. | |||
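The leverage described above can be sketched in a few lines of Python: a full scan reads every record on every query, while an inverted index is built once and then answers each query with a single lookup. The records, identifiers, and function names here are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Set

# Hypothetical metadata records, keyed by identifier.
records = {
    "ex:1": "band gap of silicon",
    "ex:2": "thermal conductivity of silicon carbide",
    "ex:3": "band structure of graphene",
}


def scan(term: str) -> Set[str]:
    """Without an index: read every record on every query."""
    return {rid for rid, text in records.items() if term in text.split()}


def build_index(recs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build an inverted index once: token -> identifiers of records containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for rid, text in recs.items():
        for token in text.lower().split():
            index[token].add(rid)
    return index


index = build_index(records)


def lookup(term: str) -> Set[str]:
    """With an index: one dictionary lookup per query term."""
    return index.get(term.lower(), set())
```

Both `scan("silicon")` and `lookup("silicon")` return the same identifiers; the difference is how much work happens per query as the collection grows.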
| F3: Metadata clearly and explicitly include the identifier of the data they describe | 15 Mar 2022 | 00:03:01 | |
Literature references with and without DOIs. Tables of data in articles with and without unique identifiers in each row for what that row is about. The magic of including identifiers in the metadata you share. The Data Catalog (DCAT) Vocabulary: https://www.w3.org/TR/vocab-dcat-2/ | |||
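To make the idea concrete, here is a small sketch of a metadata record that names the data it describes, written as a JSON-LD-style Python dictionary using DCAT and Dublin Core terms. The `example.org` identifier, the title, and the `has_explicit_identifier` helper are assumptions for illustration only.

```python
import json

# Hypothetical dataset description; the identifier below is a placeholder.
metadata = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "https://example.org/dataset/42",
    "@type": "dcat:Dataset",
    "dct:identifier": "https://example.org/dataset/42",
    "dct:title": "Example dataset",
}


def has_explicit_identifier(record: dict) -> bool:
    """Check that the record states which resource it is about."""
    return bool(record.get("@id")) and bool(record.get("dct:identifier"))


serialized = json.dumps(metadata, indent=2)
```

If the record is copied away from the data (into a search index, a catalog, a spreadsheet), the identifier travels with it, so a reader can still get back to what it describes.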
| F2: Data are described with rich metadata | 08 Mar 2022 | 00:03:13 | |
Kinds of metadata - "intrinsic" (machine-defined or machine-controlled; immutable) and "extrinsic" (user-defined or user-controlled). Other-than-technical interoperability. "Quality" in the eye of the beholder / data consumer. Analogy to web-browser feature detection, and application to search engine "rich results". | |||
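One way to picture the intrinsic/extrinsic split, as a hedged Python sketch: fields the machine derives from the bytes themselves sit apart from fields a user supplies. Treating byte size and a SHA-256 digest as "intrinsic" is an assumption of this example, as are the function and field names.

```python
import hashlib


def intrinsic_metadata(data: bytes) -> dict:
    """Metadata derived from the bytes themselves; it changes only if the data changes."""
    return {
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def describe(data: bytes, extrinsic: dict) -> dict:
    """Combine machine-derived fields with user-supplied ones, kept in separate namespaces."""
    return {"intrinsic": intrinsic_metadata(data), "extrinsic": dict(extrinsic)}


# Hypothetical data and user-supplied description.
record = describe(b"a,b\n1,2\n", {"title": "Example table", "keywords": ["demo"]})
```

Keeping the two groups in separate namespaces means a user edit to the title can never be mistaken for a change to the data.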
| F1: (Meta)data have globally unique, persistent identifiers | 01 Mar 2022 | 00:06:57 | |
| |||
| What to expect from this podcast | 22 Feb 2022 | 00:01:10 | |
A rundown of what I'm planning: FAIRdowns, inside the Box, and FIP calls, oh my! | |||
| walk-and-talk: DIKW pyramid/hierarchy | 27 Sep 2022 | 00:08:57 | |
DIKW pyramid / DIKW hierarchy - https://en.wikipedia.org/wiki/DIKW_pyramid "Data becomes information when it is stored *in* a given *formation*." "There are only three things we can do with data. We can accrete data by adding it to an existing collection, reduce data by discarding information from an existing collection, or reshape data by placing it in a different kind of collection." types of information: situational, methodological, philosophical (epistemological, axiological, ontological) Inductions vs deductions vs abductions "programs must be written for people to read, and only incidentally for machines to execute." | |||
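The three operations in the quotation above (accrete, reduce, reshape) can be shown on a tiny collection. This is only one reading of the quote; the sample readings and variable names are invented for the example.

```python
# A small, hypothetical collection of (key, value) readings.
readings = [("a", 1), ("b", 2)]

# Accrete: add data to an existing collection.
accreted = readings + [("c", 3)]

# Reduce: discard information (here, the keys) from an existing collection.
reduced = [value for _, value in accreted]

# Reshape: place the same data in a different kind of collection.
reshaped = dict(accreted)
```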
| I Fought the Law | 07 Sep 2022 | 00:01:12 | |
`.split()`s on strings and `filter`s on `None` I varied my output with the latest fad Scatterin' parsing like a shotgun I varied my output with the latest fad | |||
| Martynas Jusevičius | 29 Aug 2022 | 00:29:59 | |
- Linked Data | |||
| FAIR-Enabling Services | 19 Aug 2022 | 00:09:57 | |
I was thinking about FAIR-enabling resources and wanted to distinguish between things that actually have to be running in order for data to be alive and for you to actually find it, access it, interoperate with it, and reuse it, versus "one-time" things that those services will need.
| |||
| Stuck Data Mining Again (Lodi) | 09 Aug 2022 | 00:02:07 | |
Just about a week ago, Things got bad, and things got worse, Rode in on semantics, The man from Stack Overflow If I only had metadata | |||
| Don't Silo Me In | 04 Aug 2022 | 00:01:19 | |
Oh give me mappings, lots of mappings, with resolving URIs. Don’t silo me in. Let me prance through semantics of namespaces that I love. Don’t silo me in. Let me use an open protocol to access these bytes, and for metadata promise me you’ll keep on the lights. Authenticate me repeatedly, but give clear usage rights. Don’t silo me in. Just give me data bare. Let me reuse my old CPUs and mint my URIs. With my own software, let me wander over yonder with least surprise. I want to probe the provenance of metadata rich and plural, and represent my knowledge to be machine actionable. And I can’t look at schemas if they’re not interoperable. Don’t silo me in. | |||
| Shreyas Cholia | 29 Jul 2022 | 00:29:46 | |
* [Materials Project](https://materialsproject.org/) * [Environmental Systems Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE)](https://ess-dive.lbl.gov/) * [National Microbiome Data Collaborative (NMDC)](https://microbiomedata.org/) * [W3C Provenance (PROV) specs](https://www.w3.org/TR/prov-overview/) * [Research Equals (R=)](https://www.researchequals.com/) * [JSON-LD](https://json-ld.org/) * [Ecological Metadata Language (EML)](https://eml.ecoinformatics.org/) * [DataCite](https://datacite.org/) * [OSTI](https://www.osti.gov/) * [DOI](https://www.doi.org/) * schema.org * [OAuth](https://oauth.net/2/) * [OpenID Connect (OIDC)](https://openid.net/connect/) * [OpenAPI](https://www.openapis.org/) * [REST](https://en.wikipedia.org/wiki/Representational_state_transfer) * [IGSN](https://www.igsn.org/) * [Data Observation Network for Earth (DataONE)](https://www.dataone.org/) * [Frictionless Data](https://frictionlessdata.io/) | |||