Skip to content

Conversation

@laemtl
Copy link
Contributor

@laemtl laemtl commented May 28, 2020

Minor fix to decode an issue description special characters (&<>).

@laemtl laemtl added 23.0.0-testing Category: Bug PR or issue that aims to report or fix a bug labels May 28, 2020
@laemtl laemtl force-pushed the 2020-05-28-Issue-tracker-desc-html-entities branch from 70a1419 to 8d5c90a Compare May 28, 2020 23:05
Copy link
Contributor

@christinerogers christinerogers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine by me.
Just to check -- @laemtl Is this the same way we strip / clean special character input elsewhere in the codebase?

@ridz1208 or @johnsaigle or @driusan can you comment/review? re using html_entity_decode

@pierre-p-s Do you have time to pull and test?

@ridz1208
Copy link
Collaborator

@laemtl were you able to identify where it was being encoded in the first place ? the PR @johnsaigle linked in the issue (#6223) was the result of sequentially encoding the data during the process of saving.

Is it the same case here ? is the data being encoded several times before being saved and if so, can you identify where it is happening ?

@laemtl
Copy link
Contributor Author

laemtl commented May 29, 2020

@ridz1208 Yes, it is the same case here, encoding happens on save, so the encoded version is saved in the db. My understanding of the problem is the data is not encoded twice, but the string is not decoded on load so the encoded characters are interpreted as plain text.

@ridz1208
Copy link
Collaborator

@laemtl yeah I just did my own little investigation. your are correct.

It is slightly different then the instrument bug because in this case its a single escape happening but when the data is loaded, it's being loaded into a static field and the browser does not seem to un-escape static fields. so the bug here is not to remove an escaping step but rather to add an unescaping step

which brings the question @johnsaigle whats the use of escaping the data if it's gonna get unescaped before being sent to the browser...

@ridz1208 ridz1208 added the State: Discussion required PR or issue that requires the resolution of a discussion with the relevant parties to proceed label May 29, 2020
@ridz1208 ridz1208 self-assigned this May 29, 2020
@johnsaigle
Copy link
Contributor

johnsaigle commented Jun 1, 2020

whats the use of escaping the data if it's gonna get unescaped before being sent to the browser...

The best way to handle data safely is to:

  • to validation on data as it comes in (reject bad data)
  • "sanitize" the output as far downstream in the rendering as possible (i.e. Smarty/React)

We have a bunch of escaping/unescaping because:

  • LORIS (still) passes around a bunch of raw HTML. (Mostly because of instruments but also some core stuff). This causes Smarty to break if we enable its autoescaping feature (Set smarty's autoescape template variable option #2956)
  • Modules are very ad-hoc and it makes it difficult to determine in one central place where output can be sanitized
  • Not everything is React. React safely handles data in innerHTML by default which takes care of a lot of our security needs. (Props are not sanitized, however.)
  • Old codebase, not enough development time, refactoring needs to be prioritized, code rot, etc etc etc

I'll make a detailed issue about this because this is a problem that we should track and it would be best to have a place I can refer people to to explain why the codebase behaves the way it does.

@johnsaigle
Copy link
Contributor

@ridz1208 See #6648

@ridz1208
Copy link
Collaborator

ridz1208 commented Jun 1, 2020

@johnsaigle what would be your recommendation for this release ? just unescape the HTML for now ?

@johnsaigle
Copy link
Contributor

@ridz1208 Yeah this is a very deep and longstanding problem. We won't be able to solve it for this release. Until we have the time and will to fix the root of this problem, we will have to keep unescaping things as bugs arise.

$issueData['whoami'] = $username;
$issueData['othersWatching'] = getWatching($issueID);
$issueData['desc'] = $desc[0]['issueComment'] ?? '';
$issueData['desc'] = html_entity_decode($comment) ?? '';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment (to the code) explaining why this should be safe from XSS attacks?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@driusan I approved the PR, can you just make sure the comment added is good for you ?

@laemtl laemtl force-pushed the 2020-05-28-Issue-tracker-desc-html-entities branch from 8d5c90a to 8610f36 Compare June 2, 2020 17:01
@laemtl laemtl force-pushed the 2020-05-28-Issue-tracker-desc-html-entities branch from 8610f36 to 9a33353 Compare June 2, 2020 17:06
@laemtl laemtl requested a review from driusan June 2, 2020 17:07
@ridz1208 ridz1208 removed their assignment Jun 8, 2020
@driusan driusan merged commit 034eacc into aces:23.0-release Jun 8, 2020
@ridz1208 ridz1208 added this to the 23.0.0 milestone Jun 9, 2020
@laemtl laemtl removed the State: Discussion required PR or issue that requires the resolution of a discussion with the relevant parties to proceed label Jun 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Category: Bug PR or issue that aims to report or fix a bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[issue_tracker] Special characters in description

5 participants