Is an ontology “better” than a relational data model? “More expressive power” doesn’t always mean “better”. However, ontologies allow you to ratchet up power while keeping logic in data structures.
Leave beacons in your code. I would have avoided a silly error if a variable named xgb_train_data had been named, for example, xgb_train_data_filepath instead. When you can’t leave globally unique, persistent, resolvable identifiers (GUPRIs), mind your beacons. References: F. Hermans, The Programmer’s Brain: What Every Programmer Needs to Know About Cognition, pp. 28-30. Shelter Island, NY: Manning, 2021.
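A minimal sketch of the naming point, assuming the training data lives in a CSV file. The function name read_training_rows and the column names in the usage note are hypothetical, chosen only for illustration; the idea is that the _filepath suffix tells the reader the value is a location, not the loaded data.

```python
import csv
from pathlib import Path


def read_training_rows(xgb_train_data_filepath: Path) -> list[dict[str, str]]:
    """Load training rows from the CSV file at the given path.

    The parameter name is the beacon: the `_filepath` suffix signals that
    the caller passes a path, not a DataFrame or an in-memory matrix.
    """
    with xgb_train_data_filepath.open(newline="") as handle:
        # DictReader uses the first row as column names; values stay strings.
        return list(csv.DictReader(handle))
```

Calling read_training_rows(Path("train.csv")) on a file with a header row such as feature,label would return one dictionary per data row, with string values keyed by those column names.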
Add a CITATION.cff file to your git repository. The Citation File Format is automatically rendered on GitHub and usable by Zenodo and Zotero. Already have a DOI? Let’s see about a DOI-to-CFF tool. Looks like there’s doi2cff, but it’s currently restricted to DOIs on Zenodo that are tagged as software releases.
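As a rough sketch of what such a file contains, the helper below renders a minimal CITATION.cff using only the Python standard library. The function name render_citation_cff is hypothetical, and the title, author, and DOI in the usage note are placeholders. It assumes CFF schema version 1.2.0 and emits only the required keys (cff-version, message, title, authors) plus an optional doi.

```python
import json


def render_citation_cff(title, authors, doi=None):
    """Return the text of a minimal CITATION.cff file.

    `authors` is a list of (family_names, given_names) tuples.
    json.dumps is used for quoting because a JSON string is also a valid
    YAML double-quoted scalar, so embedded quotes are escaped safely.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f"title: {json.dumps(title)}",
        "authors:",
    ]
    for family_names, given_names in authors:
        lines.append(f"  - family-names: {json.dumps(family_names)}")
        lines.append(f"    given-names: {json.dumps(given_names)}")
    if doi is not None:
        lines.append(f"doi: {json.dumps(doi)}")
    return "\n".join(lines) + "\n"
```

For example, render_citation_cff("Example Project", [("Doe", "Jane")], doi="10.5281/zenodo.0000000") returns text that can be written to CITATION.cff at the repository root; the DOI shown is a placeholder, not a real record.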
Datasets are easier to reuse if they use standards that are well-established, particularly in a given domain. A first approach is to ask around: ask your coauthors, people you trust in your field, etc. A follow-on approach is to examine the “graph reputation” of relevant standards, particularly if they may be represented as resources with outbound links.
Lean manufacturing aims to reduce waste in production processes and to reduce response times to consumers from producers. Womack and Jones [1] authored five key principles for lean thinking in the context of manufacturing: Value: Identify the value of a product to a consumer. Value Stream: Identify the minimal process (steps, time, information, material) to produce the value.
“Dataset” is a derived notion, a psychological construct, where “versions” of the dataset are a succession of values that we perceive to be causally related. “Dataset” is a side effect.
If it’s not consistent, it can’t be valid. If it’s not valid, it can’t be accurate. If it’s not accurate, who cares if it’s timely? Subscribe to get short notes like this on Machine-Centric Science delivered to your email.
The World Wide Web Consortium (W3C) publishes a range of specifications and guidelines which help move web standards forward. However, even when restricting scope to the Latest version of specifications with the status Recommendation and with the tag Data, there are currently 77 of them: https://www.w3.org/TR/?tag=data&status=REC&version=latest
I noticed a pattern at the top of each case study listed by Stemma.ai, which provides data catalog software as a service based on the open-source Amundsen code. Each case study’s so-called “Data Stack” comprises up to four distinct categories of functionality – Data Catalog, Data Warehouse, ETL, and Business Intelligence.
For evolvable data exchange, you need to be able to continually add qualified references galore so that participants can reason by analogy, i.e., each new thing resembles something known before. This is FAIR principle I3, which depends on I1 and I2 for robustness. References: M. Minsky, The Society of Mind. New York: Simon and Schuster, 1986, p.