-
Notifications
You must be signed in to change notification settings - Fork 661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[css-syntax] Review requested of new Parsing text #8834
Comments
Agenda+ to get approval on this, and ask for a new publication of Syntax. |
@cdoublev Please don't delete your comments, it's very confusing reading an email, coming here to answer the questions, and finding there's no comment anymore. Restoring the comment:
Ah, it's because I'm dumb.
The return value isn't at all handwavey - it's a list of declarations and a list of rules. |
Sorry about that. My intent was to avoid hijacking the thread with a minor mistake and something I was wrong about, indeed.
...and more consistent with A link to consume a block's contents in the description of
I think it should be "To parse a list of declarations from input:". |
If you'd been wrong about it I wouldn't have said anything, but I had to repost your comment because you did catch some mistakes.
Done.
No, it's the heading that I didn't update. I fixed the algo name and the heading's own ID, but not the heading's text. ^_^ Fixed. |
I think we should treat error recovery in nested and unnested rules the same (#7961 (comment)), but other than that the new text looks good to me. I've enabled this in Firefox Nightly in bug 1835066, fwiw. |
@emilio I'm a little confused - the error recovery is the same. I noted that as the only (non-Nesting related) normative change, because switching the algos around meant that I was changing the non-nested error-recovery (to match the nested error recovery). |
Unless you're referring to wanting to change top-level error-recovery to be the same as what you get inside of rule bodies? (afaict this isn't what you were asking for in your linked comment, but I can't tell what you're asking for otherwise) |
No, I meant that as long as we assure that all rule bodies recover errors the same regardless of whether we're nested on a style rule, we're good I believe. Which is the case with the current spec text AFAICT, I just wanted to explicitly point out that, since that was one of the breaking changes I noticed while implementing this, sorry if I wasn't clear on that :) |
Sorry, I guess I shouldn't have worded it with "other than that" :) |
The CSS Working Group just discussed
The full IRC log of that discussion<florian> TabAtkins: as part of refactorings for nesting, I realized that there is some common structures we could use to make more sense<florian> TabAtkins: this results into 1 non editorial change <fantasai> scribe+ <fantasai> TabAtkins: Rules that previously only contained rules, like @media, previously used an lagorithm that only cared about rules, not declarations <fantasai> TabAtkins: now that they do care about declarations, at least in nested form -- ideally top-level form parses the same way -- and that causes an error-recovery change <TabAtkins> @media { garbage; bar {...} } <fantasai> TabAtkins: specifically ... <fantasai> TabAtkins: this ^ previously would have resulted in a single rule, "garbage; bar { ... }" <fantasai> TabAtkins: now it results in two statements, a "garbage;" statement and a "bar { ... }" statement <fantasai> TabAtkins: only affects when there's a semicolon <florian> q+ <fantasai> TabAtkins: but changes parsing on existing stylesheets <emilio> q+ <fantasai> TabAtkins: question is if that's OK <emilio> ack hober <Rossen_> ack florian <fantasai> florian: So you say previously we wouldn't discard "garbage;" and keep "bar { ... }" <fantasai> florian: that seems better, but is it Web-compatible? <fantasai> TabAtkins: Suspect so, since only small cases <fantasai> florian: Doesn't seem unliely ppl will put in a media query and forgot selector and then add full rules with a selector later, and the garbage which used to gobble up next rule and thereby disabled it <fantasai> florian: nolonger disables it <fantasai> florian: seems plausible that it could be written by accident <fantasai> florian: Let's try, but it would be possible it causes problems <Rossen_> ack emilio <fantasai> emilio: Fact that nested and unnested parse differently seems super weird, so +1 to try fixing this <fantasai> emilio: I *think* that's how I implemented in Firefox, but might have been conservative when implementing, don't remember <fantasai> Rossen_: Hearing support, any reasons to not resolve? <fantasai> RESOLVED: Accept parsing behavior change as described in issue <fantasai> TabAtkins: can I republish Syntax? <fantasai> TabAtkins: all the changes should be captured in Changes list; will double-check before publishing <fantasai> TabAtkins: still need to do review of tests that might need changing, but that's separate <fantasai> Rossen_: Any objections to new CR? <fantasai> RESOLVED: Republish CSS Syntax as CRD |
I don't know if this is particularly the place to make a public comment, but I wanted to make sure that y'all are aware for the spec that there is legacy CSS code floating around that leans on certain CSS property declarations being invalid as a load bearing compatibility """feature""". For example, older versions of Bootstrap contain declarations like |
Yes, that's very specifically still supported. It will fail to parse as both a property and a rule, and stop at the semicolon, exactly as it does today. |
Can you please clarify (not necessarily in the spec) the meaning of valid in the current context in consume a declaration, consume an at-rule, consume a qualified rule. I first thought it was about what rules/properties/descriptors are accepted in the style sheet or rule:
But then I wondered if their grammar must also be valid, and how these two validations would address the new needs for parsing nested style rules. Does it correspond to the validations in |
I'm not sure how to clarify it further than what's already there. The current parsing context (some declaration or rule, possible with parent rules, etc) determines whether any given thing is valid, according to specific grammars and/or prose in the relevant specs. The function in my CSS parser library is just a generic, relatively dumb version of this that will usually correctly tell apart a rule from a declaration, without knowing anything about what is actually allowed. You just need something that can spot that |
Understood, thanks. Assuming that consume an at-rule receives |
Yeah. It's purposely somewhat handwavy about what "current context" is, because it's obvious in context what the necessary behavior is in any given algo and there's no real need to lock it down more precisely. It just needs a way to fail on invalid stuff. |
I think my point was that I find a bit inconsistent that the algorithms for consuming rules no longer process one level at a time, leaving lower levels as a list of component values. I would expect to process the block with parse a block’s contents before checking its grammar:
But it expects a list of component values, as it is only intended for an independent block, eg. When rule blocks were consumed with consume a simple block while consuming a rule, the resulting simple block value was a list of component values to consume with the appropriate algorithm before checking the grammar. Anyway, it is probably not important. |
Isn't it possible to implement something that can spot this in consume a declaration and consume a list of component values? The former would run the latter with a flag when the declaration name is not a custom property. When this flag is set, consume a list of component values would return nothing when it processes edit: 77e9601 now fulfills this requirement, so there may no longer be a need for consume a declaration, consume an at-rule, consume a qualified rule, to require that the declaration or rule be valid in the current context. |
The only reason the old spec did so was because I didn't have a unified parsing model for all possible block contents. You had to know whether to parse a block's contents as "at-rules and declarations" or "at-rules and qualified rules", and that depended on what type of rule the block was in; it was easier to write the spec to have it delay this decision until later, when doing grammar-checking. The new rules don't actually require having any grammar knowledge, because we've baked some restrictions into the grammar itself, so I was able to unify the two algorithms into one. (Knowing the grammar just lets you parse more efficiently; it doesn't change the end result.) So there was no longer any need to delay the parsing, and we could just get everything done in one pass.
Correct, but I'm not sure why that's relevant. The "parse" algorithms are the high-level algos called by other specs, while the "consume" algorithms are the low-level things that actually do the parsing. "Parse" algorithms call into "consume" ones; not vice versa.
In theory, you can indeed remove the check from all of these places. You could just check, in "consume a block's contents", whether the returned declaration contains an improperly-used {}-block, and then do all the rest of the validity checking after parsing. But that wouldn't be very efficient; every top-level rule that starts like So, it's better to write the spec in a way that guides implementations to do the better thing. This also ensures that if we did accidentally make a change that would cause the "check validity now" and "check validity later" impls to diverge, the "check validity now" impls would stay correct; the spec wouldn't be accidentally requiring impls to change to the less efficient method. |
I thought that the "parse [a CSS]" algos could be used recursively. This is no longer possible with the unified model, because nested contents are already consumed. You also cannot use parse something according to a CSS grammar with a
In my comment, I suggested to stop consuming a list of component values (for a non-custom property value) when encoutering the But for both of these concerns, the current spec is fine with me. I understand that this is not a detailed implementation guide. When you try to conform to it as closely as possible, it can just be tricky to put pieces together sometimes. Thank you for your clarifications. |
I hope feedback is still welcome! I have a few notes from implementing this myself last week:
|
Since we resolved on the new Nesting behavior (try to parse as a declaration, then parse as a rule if that's invalid), I had to do some decent rewriting of Syntax's algorithms, and went ahead and dove into a larger rewrite to clean it up in general. I've implemented the new text in my CSS parser library and the (fairly limited, admittedly) tests I've run look good, but I'd appreciate a larger review.
Significant changes from the previous version:
Aside from allowing the new nesting behavior, all of these changes should be only editorial, with one exception: blocks that previously only contained rules (
@media
,@keyframes
, etc) previously used the "consume a list of rules", but now use the unified "consume a block's contents", which means their error-recovery in the face of semicolons changes.For example,
@media { garbage; bar {...} }
previously would contain a style rule with agarbage; bar
selector. (This is what happens at the top-level of a stylesheet, still.) Now the rule's selector will be justbar
, since thegarbage;
part will get dropped as an invalid declaration. This means that rules which were accidentally invalid and dropped due to garbage might now be valid, if there's a semicolon separating them from preceding garbage.I suspect this is fine, and I'd really like it to be, because it means the overall parsing behavior doesn't need to branch on grammar knowledge (and thus, whether a rule is known or unknown won't change its generic parsing). It used to be the case that parsing depended on this kind of knowledge, and it was super awkward to use in tooling. Also, it means that parsing doesn't change between a top-level
@media
and a nested one, except for declarations becoming valid instead of invalid; all the rules inside of the@media
remain precisely the same.But if necessary, we can hardcode some at-rules to trigger a different parsing behavior that preserves backwards compatibility more completely.
(Technically parsing in general depends on grammar knowledge anyway, since you need to know whether a declaration is valid in a given context to tell if you should try and redo parsing as a rule. But it turns out there's a simple and reliable rule you can use generically to get approximately the right behavior without having to know anything about grammars.)
The text was updated successfully, but these errors were encountered: