-
Notifications
You must be signed in to change notification settings - Fork 25
Assign a property to ALL enum values. #458
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Ensure that all enum values throughout the specification have an explicit property assigned to them. This is important for consistency of the RDF serialisation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing controversial or breaking here?
The only think it breaks is the current RDF serialisation. For now, enum values that do not have an associated property are rendered as string literals. For example,
With this PR, it will be rendered instead as
But since the RDF serialisation has yet to be officially specified, I won’t lose too much sleep over that change (and I plan to make my RDF reader still accept literal values to soften the impact of the change). As for “controversial”, I don’t think so, but since we are naming properties, there’s a lot of potential for bike-shedding. ;) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Encoding of version number should be changed to keep it a string value.
I don't like to bring this up, but please consider the following: is version really an enum? For comparison, I assume we would agree that versionDate (or releaseDate or creationData) are not enums, because they could be anything, and everyone knows what a date format is. So we use a string for that. But they couldn't be 'anything' in the real world of a particular system or entity. versionDate could only be one of a small set of strings that have been issued when the system is released. And every time you do a new version, you anoint another date as a versionDate, so the number of dates increases by 1. And you don't necessarily know ahead of time which string will be the next date, but you know how to generate it. So what is the difference between versionDate and version, that makes versionDate a string and version an enum? I'd argue that in the same way that there can be any number of versionDates in the wild (beyond the ones we end up using for a given system) , there can be any number of versions, even semantic versions. 0.0.0 through infinite #.#.#. Is the idea that by giving version strings enums we have somehow made the world more interoperable? In the justification above, it seems we are using the version for a second purpose than just indicating the version:
I agree with that statement on its own. But it's not because this is a version identifier, it's because this is a versioned IRI for the specification. So The point is that I've seen versionedIRIs and version IRIs and specification IRIs, but I've never seen any of them treated as an enum (unless it's to create a drop down menu to select your specification version, I suppose) and I've never seen any of them in the wild expressed as a CURIE. They are always strings. Sorry to make this controversial. If you still think it's worth it the sky won't fall if you do it, I just have a different take on whether it will be helpful in the bigger scheme of things. |
Yes it is. Because the version we are talking about here is not something like “the version of the mapping set”, but the version of the SSSOM specification that the mapping set declares itself to be compliant with. It is not and must not be an open-ended field.
Yes.
The CURIE is nothing but a syntactic sugar for the full IRI. Whther the CURIE ends up being used or not is left to the discretion of the RDF writer, depending on whether it is configured to write IRI in their full-length form or in their short form. A SSSOM/TSV file like this:
can be rendered in RDF either like this:
or like this:
This is of no consequence. The version IRI for the version 1.1 of the spec is still |
@gouttegd wrote:
You are right, in this case the version could also be treated differently. Either an URI as suggested, or not included in RDF at all just like XML version number is not included in most DOM representations of the document tree. In the end, this issue seems of less relevance compared to how to express the actual content of a mapping set. |
Not sure the XML version number is the appropriate comparison here. I’d rather compare
which is always included in the DOM representation of the underlying XML tree – as it should be, because if you are writing a XSL/T engine, you’d want to know which version of the XSL/T language is used by the stylesheet you just read. Granted, it is less important for SSSOM for now given that there are only two versions (1.0 and 1.1) with objectively very little differences between them. But that may very well change in the future. |
For the `mapping_cardinality_enum`, representing values with a resource identifier in RDF doesn’t really bring any benefit compared to representing the values with a string literal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect indentation meant that the `meaning: sssom:NegatedPredicated` would have been interpreted as _another PredicateModifier value_, instead of the meaning of the `Not` value. Co-authored-by: Emily Hartley <ehartley@c-path.org>
It is not possible to combine the "short syntax" for enum descriptions with the use of a `meaning` qualifier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Related to #454
[ ]docs/
have been added/updated if necessarymake test
has been run locally[ ] tests have been added/updated (if applicable)[ ] CHANGELOG.md has been updated.This PR ensures that all enum values (except for the
mapping_cardinality_enum
) throughout the LinkML model have a property assigned to them. This is important for RDF serialisation, so that all enum values are consistently rendered as resource identifiers, instead of having some values being rendered as resource identifiers and some values being rendered as string literals.I normally agree with @nichtich ’s sentiment (expressed in #450) that we should as much as possible reuse existing properties from outside SSSOM (as we already do, for example, for most values of the
entity_type_enum
) and avoid creating SSSOM-specific properties in thehttps://w3id.org/sssom/
namespace. However, for all the enum values concerned in this PR, I am not aware of pre-existing properties that would be suitable, so I did in fact create a bunch of SSSOM-specific properties. I am open to replacing some of them by more appropriate existing properties if someone can suggest some.Also open to discussing the naming of those properties.
Of note, it would be nice if those properties could resolve to something meaningful to users. In particular, for the values in
sssom_version_enum
, I think most people would reasonably expecthttps://w3id.org/sssom/version1.0
to resolve to a frozen version of the full specification as it was in the 1.0 release (and likewise forhttps://w3id.org/sssom/version1.1
and any futur version). But this is an infrastructure issue that I consider out of scope for this PR.