Separate content from file identity in lint()#3017
Conversation
|
Maybe this is a good reason to finally add a dedicated |
|
That is a good idea! Do you want me to include it in this PR? |
6d1bb83 to
ab8a81f
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3017 +/- ##
==========================================
+ Coverage 99.25% 99.29% +0.04%
==========================================
Files 128 128
Lines 7355 7367 +12
==========================================
+ Hits 7300 7315 +15
+ Misses 55 52 -3 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
@MichaelChirico |
388c8c2 to
f6dbf7c
Compare
Change mustWork = !allow_rproj to mustWork = FALSE so that find_package() returns NULL instead of erroring when given a non-existent file path. This is needed for the upcoming lint(filename, text=) mode where filename serves as identity but may not exist on disk. All callers already handle NULL return gracefully.
Split normalization of source exclusions (from # nolint comments) and config exclusions (from .lintr file) so that source exclusions use normalize_path=FALSE. This avoids Sys.glob/file.exists calls that would fail for non-existent identity paths in the upcoming lint(filename, text=) mode.
lint(filename, text=buf) now takes content from text while using filename as file identity for config discovery, knitr detection, exclusions, and output. The file need not exist on disk. Key changes: - New normalize_identity_path() helper expands relative paths for non-existent identity files - cache_file() now receives lint_obj instead of filename, matching retrieve_file() which already uses lint_obj - Documentation updated for the new usage mode
When both file and content are provided as character, expect_lint() now uses Mode 3 (lint(file, text=content)) instead of writing content to a tempfile. This lets tests specify an identity path for config discovery and knitr detection while providing content inline. Extract check_lints() helper from expect_lint() to keep both functions within the cyclocomp limit. The is.character(content) guard prevents accidental Mode 3 activation when a linter list lands in the content parameter via R's positional argument matching.
f6dbf7c to
a44ca02
Compare
|
@atusy for book-keeping, was an LLM used in whole or in part here? |
| ) | ||
| }) | ||
|
|
||
| # Mode 3: lint(filename, text=) — content from text, identity from filename |
There was a problem hiding this comment.
non-ASCII —, leftover from CJK mode?
|
I deployed Gemini to run a joint review of this and #3034 which I understand to be attempting to solve the same issue working with {lintr} in vscode. @atusy what do you make of this caution about Executive Summary & RecommendationWe recommend accepting the approach of PR 3017 with improvements, and rejecting PR 3034.
Detailed Analysis of PR 3034ApproachPR 3034 adds a guard at the beginning of find_package <- function(path, allow_rproj = FALSE, max_depth = 2L) {
+ if (!nzchar(path)) {
+ return(NULL)
+ }
path <- normalize_path(path, mustWork = !allow_rproj)Assessment
Detailed Analysis of PR 3017ApproachPR 3017 refactors
Assessment
Edge Case: Empty Filename
|
| The latter (inline data) applies whenever \code{filename} has a newline character (\\n).} | ||
| The latter (inline data) applies whenever \code{filename} has a newline character (\\n). | ||
| When combined with \code{text}, \code{filename} serves as the file identity for config discovery, exclusions, | ||
| knitr detection, and output — the file need not exist on disk.} |
There was a problem hiding this comment.
non-ASCII — here too
| if (is.null(file)) on.exit(unlink(file), add = TRUE) | ||
| file <- maybe_write_content(file, content) # NB: the lint consistency fuzz suite anchors here. | ||
| if (!is.null(file) && !missing(content) && is.character(content)) { | ||
| # Both file and content → Mode 3 (identity-aware linting) |
|
|
||
| exclusions <- normalize_exclusions(c(source_exclusions, exclusions)) | ||
| # Source exclusions use already-normalized paths from source_expression$filename. | ||
| # normalize_path=FALSE avoids Sys.glob/file.exists which fail for non-existent files (Mode 3). |
There was a problem hiding this comment.
somewhat odd to refer to Mode 3 in this file, it's a pretty "internal" terminology. I think we can just omit the parenthetical.
| source_exclusions <- normalize_exclusions(source_exclusions, normalize_path = FALSE) | ||
| config_exclusions <- normalize_exclusions(exclusions) | ||
| exclusions <- c(source_exclusions, config_exclusions) |> | ||
| remove_file_duplicates() |> |
There was a problem hiding this comment.
I'm not following these remove_*_duplicates?
|
|
||
| if (is.null(file)) on.exit(unlink(file), add = TRUE) | ||
| file <- maybe_write_content(file, content) # NB: the lint consistency fuzz suite anchors here. | ||
| if (!is.null(file) && !missing(content) && is.character(content)) { |
There was a problem hiding this comment.
do we need to support this in expect_lint()? Wouldn't it be better to keep expect_lint() simpler?
| class(lints) <- c("lints", "list") | ||
|
|
||
| cache_file(lint_cache, filename, linters, lints) | ||
| cache_file(lint_cache, lint_obj, linters, lints) |
There was a problem hiding this comment.
this change is puzzling too -- the helper is cache_file() and the argument is filename, now we're passing lint_obj. I think it's fine since this is all just digest() adornment, but maybe we should rename some internals for clarity?
Motivation
When IDEs like those using REditorSupport/languageserver lint unsaved buffers, they need to pass buffer content but resolve config, exclusions, knitr detection, and output against the real file path. Currently
lint()conflates these two concerns:filenameis both "where to read" and "who am I." This means users must save before getting up-to-date lints.This PR gives meaning to
lint(filename, text=)used together, enabling IDEs to lint unsaved buffer content while preserving full file identity.Three modes
lint("R/foo.R")lint(text = "x=1")<text>lint("R/foo.R", text = buf)text, identity fromfilenameIn Mode 3,
filenamedoes NOT need to exist on disk. Output shows the real path.Example
The key difference: lint output is attributed to the real file path, enabling IDEs to map diagnostics back to the correct buffer.
Beyond the output path, the identity filename also drives:
parse_settings = TRUE):.lintris found relative to the identity path# nolintcomments and config-based file exclusions work against the identity path.Rmdvs.R)Note:
parse_settingsdefaults toFALSEfor text-based linting to avoid breaking changes. PassTRUEexplicitly for Mode 3 if config discovery is needed.Changes
R/settings_utils.R:find_package()usesmustWork = FALSEto tolerate non-existent pathsR/exclude.R:exclude()normalizes source exclusions (from# nolint) separately from config exclusions, avoidingSys.glob/file.existson identity pathsR/lint.R: Addshas_filenameflag,normalize_identity_path()helper to expand relative paths for non-existent files, passeslint_objtocache_file()(matchingretrieve_file())R/expect_lint.R: When bothfileandcontentare provided, uses Mode 3 (lint(file, text = content))Test plan
find_package()with non-existent pathsexclude()with non-existent file pathsexpect_lint()withfile+content