Syntax Comparison of N3 and Turtle
From Jena wiki
This note describes syntax differences between N3 and Turtle, with some references to SPARQL, with all three restricted to expressing RDF.
The differences affect writers, not just parsers. Because of them, output generated by one writer cannot be read by a strict parser, even when the intuition is that N3 and Turtle are "the same" for a core of RDF.
This note is not a complete list of differences.
Please send additions and corrections.
References
- N3 : N3 Grammar (2006/07/09, 2007/09/11) and Notation 3
- Turtle : Turtle Grammar (2006/12/04, 2007/09/12)
- SPARQL : SPARQL grammar (2007/06/14)
cwm does not directly implement the N3 grammar; it uses a custom-written parser (notation3.py) which should agree with the N3 description. The testing with cwm was done with version 1.1.9b1.
Forms
The syntax form [:p 1]. is legal in N3 and SPARQL; it is the
same as [] :p 1 . Similarly for the syntax form
(1 2 3) . Both are illegal in Turtle.
[]. and (). are legal in N3
(they both generate no triples in cwm).
White space
n3.n3 is not completely clear about white space. A comment gives the white space rule.
# Absorb anything until end of regexp, then stil white space
# period followed IMMEDIATELY by an opener or name char is taken as "!".
# Except after a "." used instead of in those circumstances,
# ws may be inserted between tokens.
# WS MUST be inserted between tokens where ambiguity would arise.
# (possible ending characters of one and beginning characters overlap)
<a><b><c> . is legal in N3(cwm) and SPARQL, but
not in Turtle.
"abc" @en (white space between " and @) is illegal in Turtle and N3. It is legal in SPARQL.
In N3, white space is allowed in URIs delimited by <>; it is ignored.
Escape Sequences
(This section is definitely not complete.)
n3.n3 does not mention any \ processing, either for \u or \t, \n etc.
cwm supports \u in strings. In URIs, \u sequences are legal but left unprocessed. \u is illegal elsewhere.
Turtle allows \u in strings and relative URIs, but not in prefixed names.
Turtle allows > in relative URIs as \>.
SPARQL assumes that \u has been processed before the parser gets to see the input stream. So \u can be anywhere which is then subject to all the grammar rules.
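The SPARQL approach of resolving \u before the parser sees the input can be sketched as a preprocessing pass over the raw text. This is only an illustration, not code from any of the parsers discussed: the function name unescape_codepoints is hypothetical, and it assumes the two codepoint escape forms \uXXXX and \UXXXXXXXX.

```python
import re

# Assumed escape forms: \uXXXX (4 hex digits) and \UXXXXXXXX (8 hex digits).
_CODEPOINT_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def unescape_codepoints(text):
    """Replace every codepoint escape in the raw input, wherever it occurs."""
    def replace(match):
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))

    return _CODEPOINT_ESCAPE.sub(replace, text)


# The escape is resolved inside an IRI and inside a prefixed name alike;
# the grammar is only applied to the result.
raw_query = r"SELECT * WHERE { <http://example/\u0070> ex:\u0061bc ?o }"
processed_query = unescape_codepoints(raw_query)
```

Because the pass runs before tokenizing, the result must still satisfy every grammar rule, which is why an escape can appear anywhere in SPARQL but only in specific tokens in Turtle.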
Strings
The Turtle grammar does not exclude the use of """ inside long strings
because the lcharacter production includes ".
SPARQL allows either single or double quotes (' or ") to delimit strings. The triple-quoted forms for long literals are also supported. N3 and Turtle do not allow the use of single quotes (or triple single quotes).
Line ending processing inside long strings is not fixed in any of the 3 languages.
Numbers
123. is a decimal in Turtle and SPARQL but not in N3.
123 matches both the decimal and integer rules in Turtle.
30.e0 is a double in Turtle and SPARQL. It is an integer in
cwm, followed by the start of a path expression.
The "." is interpreted as a path character in cwm. n3.n3 uses ! for a path.
Turtle allows alternative lexical forms for abbreviated numbers (integer, decimal, double) but not for typed literals in general. (It should also say that the alternative lexical form should have the same value as the input, but it does not.)
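The overlap between the Turtle number rules can be checked mechanically. The sketch below assumes the regular expressions are a faithful transcription of the integer, decimal and double productions in the Turtle grammar referenced above; the names TURTLE_NUMBER_RULES and matching_rules are made up for this illustration.

```python
import re

# Assumption: these patterns mirror the Turtle grammar's number productions.
EXPONENT = r"[eE][-+]?[0-9]+"
TURTLE_NUMBER_RULES = {
    "integer": r"[-+]?[0-9]+",
    "decimal": r"[-+]?(?:[0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)",
    "double": rf"[-+]?(?:[0-9]+\.[0-9]*{EXPONENT}|\.[0-9]+{EXPONENT}|[0-9]+{EXPONENT})",
}


def matching_rules(token):
    """Return the names of all number rules that match the whole token."""
    return {
        name
        for name, pattern in TURTLE_NUMBER_RULES.items()
        if re.fullmatch(pattern, token)
    }
```

Under these patterns, matching_rules("123") reports both integer and decimal, while "123." matches only decimal and "30.e0" only double.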
Prefixed Names
The term prefixed name is used in preference to qname because the RDF usage, with the concatenation rule to produce IRIs, is not the same as XML usage where a qname is a pair (namespace, local name) but with no general association to an IRI.
Namespace part
_abc: is legal in N3; illegal in Turtle and SPARQL.
_: is not mentioned as blank node syntax in n3.n3.
Prefixed names do not need to have a ":" in them in N3 - this is for
the @keywords directive. Whether a token is legal as a
prefixed name will depend on what keywords have been declared.
N3 grammar has a single long regex for qnames (it's 727 characters). It can match the empty string.
Local part
SPARQL: The local part can start with a digit (0-9). This was a later change in response to community comments.
SPARQL: The local part can use a '.' but must not end in '.'.
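The two SPARQL rules for the local part can be illustrated with a small check. This is a sketch only: it is restricted to ASCII characters (the real rule also admits many non-ASCII characters), and the names SPARQL_LOCAL_ASCII and is_sparql_local_part are hypothetical.

```python
import re

# ASCII-only simplification of the SPARQL local part:
# may start with a letter, digit or underscore; may contain '.', but not end with it.
SPARQL_LOCAL_ASCII = re.compile(r"[A-Za-z0-9_](?:[A-Za-z0-9_.-]*[A-Za-z0-9_-])?")


def is_sparql_local_part(text):
    """True if the whole text is a legal local part under the simplified rule."""
    return SPARQL_LOCAL_ASCII.fullmatch(text) is not None
```

With this check, "123abc" and "a.b" are accepted, while "a." is rejected because of the trailing dot.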
IRIs
N3: <[^>]*>
Turtle: '<' (( character - #x3E ) | '\>') * '>'
SPARQL: '<' ([^<>"{}|^`\]-[#x00-#x20])* '>'
The IRI syntax rules are too complicated to include directly in any of the grammars and indeed depend on the IRI scheme.
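To see how the three token rules diverge, they can be approximated as Python regular expressions and run against the same input. The N3 and SPARQL patterns follow the rules quoted above; the Turtle pattern is an approximation that ignores the finer detail of the character production. The names IRI_PATTERNS and accepted_by are invented for this sketch.

```python
import re

IRI_PATTERNS = {
    "N3": re.compile(r"<[^>]*>"),
    # Approximation: any character except '>', or the escaped form '\>'.
    "Turtle": re.compile(r"<(?:\\>|[^>])*>"),
    # Excludes < > " { } | ^ ` \ and the characters #x00 to #x20.
    "SPARQL": re.compile(r'<[^<>"{}|^`\\\x00-\x20]*>'),
}


def accepted_by(token):
    """Return the languages whose IRI token rule matches the whole token."""
    return {
        name
        for name, pattern in IRI_PATTERNS.items()
        if pattern.fullmatch(token)
    }
```

For example, an IRI containing a space is accepted by the N3 and Turtle patterns but not the SPARQL one, and an IRI containing \> is accepted only by the Turtle pattern.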
Syntax and Values
Turtle allows any short form of a number (integer, decimal, double) to result in a datatyped literal with any legal lexical form (presumably this means one with the same value). This allows for canonicalization of numbers; it is not term preserving for abbreviated numbers.
The N3 grammar does not make any statement about this; cwm canonicalises values but that is a feature of the store, not the language syntax.
SPARQL is defined for simple entailment where "01"^^xsd:integer is different from "1"^^xsd:integer. SPARQL filters do value testing so these two different RDF terms will test equal using = but be different by sameTerm.
SPARQL defines an extension framework whereby different entailment regimes can be used. This would include D-entailment. That is, SPARQL makes this a semantic issue, not a syntactic issue.
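The distinction between term identity and value equality can be sketched in a few lines. This is a toy model, not SPARQL's actual operator definitions: the Literal type and the same_term and value_equal functions are hypothetical, and only xsd:integer is given value semantics here.

```python
from typing import NamedTuple

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


class Literal(NamedTuple):
    lexical: str   # lexical form exactly as written
    datatype: str  # datatype IRI


def same_term(a, b):
    """Term identity: lexical form and datatype must both be identical."""
    return a == b


def value_equal(a, b):
    """Value comparison, modelled here for xsd:integer only."""
    if a.datatype == b.datatype == XSD_INTEGER:
        return int(a.lexical) == int(b.lexical)
    return a == b  # fall back to term identity for everything else


padded = Literal("01", XSD_INTEGER)
plain = Literal("1", XSD_INTEGER)
```

Here same_term(padded, plain) is false while value_equal(padded, plain) is true, which mirrors the difference between sameTerm and = described above.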
Other
Literals as Subjects
Literals are legal as subjects in N3 and SPARQL but not in Turtle.
This has been noted by RDF-core.
"[The RDF core Working Group] noted that it is aware of no reason why literals should not be subjects and a future WG with a less restrictive charter may extend the syntaxes to allow literals as the subjects of statements."
@base
@base has been recently added to Turtle. N3 already had it. It can occur anywhere in a file that a directive can and can take a relative URI.
Directives in N3 and Turtle look like language tags so parser designers must be aware of this and not tokenize to language tags, or admit the directive tokens as language tags.
@base can break the concatenation of Turtle files. If a file has a relative IRI, and a file with a @base directive is put first in a concatenation, the relative IRI will be resolved differently; otherwise it would have been resolved relative to where the file was read from. This is the only feature that affects concatenation of files.
In N3, this also happens, but in addition there is an assumed @prefix : <#> . so if an earlier file in the concatenation redefines this, later files need an explicit @prefix : <#> . declaration to retain the same meaning.
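The effect of an earlier @base on a later relative IRI can be illustrated with ordinary IRI resolution. The sketch uses urljoin from the Python standard library as a stand-in for a parser's resolver; the two IRIs and the resolve helper are hypothetical examples.

```python
from urllib.parse import urljoin


def resolve(relative_iri, base_iri):
    """Resolve a relative IRI against a base, as a parser would."""
    return urljoin(base_iri, relative_iri)


# Hypothetical locations, chosen only for illustration.
FILE_LOCATION = "http://example.org/data/b.ttl"  # where the second file was read from
DECLARED_BASE = "http://other.example/base/"     # @base in the file placed first

# Read on its own, the relative IRI <thing> resolves against the file location.
standalone = resolve("thing", FILE_LOCATION)

# After concatenation, the earlier @base is still in force when <thing> is read.
concatenated = resolve("thing", DECLARED_BASE)
```

The same token <thing> therefore names http://example.org/data/thing in one case and http://other.example/base/thing in the other.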
Notes on some other syntax features of N3
@keywords
The @keywords directive declares barewords (matched by
the qname rule without a ':'). With this in force, an undeclared bareword
foo is the same as if it had the empty prefix: :foo.
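The bareword rule can be written as a small decision function. This is a sketch of the rule as described here, not of cwm's implementation: expand_bareword is a hypothetical name, and the function assumes a @keywords directive is already in force with the given set of declared keywords.

```python
def expand_bareword(token, keywords):
    """Classify a token when @keywords is in force.

    Returns a (kind, text) pair, where kind is 'keyword' or 'prefixed_name'.
    """
    if ":" in token:
        # Already a prefixed name; nothing to expand.
        return ("prefixed_name", token)
    if token in keywords:
        # Declared barewords are keywords.
        return ("keyword", token)
    # Undeclared barewords behave as if they had the empty prefix.
    return ("prefixed_name", ":" + token)
```

With keywords {"a", "is", "of"}, the token a is a keyword while foo is read as :foo, so the legality of a token depends on the declaration that precedes it.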
Paths forms
"!" and "^" define forward and backward paths of properties. Intermediate nodes are blank nodes. In cwm, "." is a synonym for "!".
Other forms
- x @has :p :y .
- x @is :p @of :z .
@has, @is and @of can be declared as keywords, in which case the @ can be omitted.
"=" is owl:sameAs
"=>" is log:implies
"<=" is log:implies but reversing subject and object.
