-
-
Notifications
You must be signed in to change notification settings - Fork 690
SARIF: add partialFingerprints, tags/precision, and ensure absolute Windows paths in artifactLocation.uri #1297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
SARIF: add partialFingerprints, tags/precision, and ensure absolute Windows paths in artifactLocation.uri #1297
Conversation
Hi, I am still hoping my PR gets reviewed |
@sigmavirus24 , @ericwb , @lukehinds It adds partialFingerprints, tags/precision, preserves Windows absolute paths, and unit tests/unit/formatters/ pass. Could one of you kindly review when you have a moment? Thanks |
…indows paths in artifactLocation.uri
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
15e97ad
to
87da4ed
Compare
for more information, see https://pre-commit.ci
""" # noqa: E501 | ||
""" | ||
# noqa: E501 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why you moved this to a new line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of these examples are valid restructuredtext.
original_paths = set() | ||
|
||
if len(rules) > 0: | ||
for iss in issues: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep the same variable names to minimizing diffs.
for iss in issues: | |
for issue in issues: |
original_paths.add(iss.as_dict().get("filename", "")) | ||
except (AttributeError, TypeError, KeyError): | ||
# Best-effort only | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bandit itself would raise an issue about an empty except block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of particularly bad LLM slop here. I was promised they were good at writing python because of how much python was out there. I feel lied to.
|
||
|
||
def create_result(issue, rules, rule_indices): | ||
"""Convert a Bandit Issue into a SARIF Result and ensure its rule |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is too long and needs to be rewritten
""" | ||
issue_dict = issue.as_dict() | ||
|
||
# Ensure rule exists / get index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why add this comment?
""" # noqa: E501 | ||
.. versionadded:: 1.7.8 | ||
""" | ||
# noqa: E501 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why add a noqa here? It does absolutely nothing. Remove it
def _precision_from_confidence(confidence: str) -> str: | ||
# Bandit uses HIGH/MEDIUM/LOW strings for confidence | ||
c = (confidence or "").upper() | ||
if c in ("HIGH", "MEDIUM", "LOW"): | ||
return c.lower() | ||
return "medium" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unnecessarily inefficient
issue_dict["code"], | ||
) | ||
|
||
# Map severity -> SARIF level; omit default "warning" per SARIF (and tests) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments really are unnecessary. They're explaining what happens which the function names do a decent job of already
result_props = { | ||
"issue_confidence": issue_dict["issue_confidence"], | ||
"issue_severity": issue_dict["issue_severity"], | ||
# Ensure raw path appears in serialized SARIF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also you repeated that comment here. One or both should disappear and be explained with one very clear explanation of why some windows test needs a raw filename (whatever that is)
"original_path": filename_raw, | ||
} | ||
|
||
# Add a light-weight tags array on results too |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another useless comment
physical_location, line_range, col_offset, end_col_offset, code | ||
): | ||
""" | ||
Populates physical_location.region (and context_region if code provided). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More invalid, poorly written, style-inconsistent docstrings
) | ||
|
||
if code: | ||
# Wider context for viewer UX |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a helpful comment
# Tags: always include "security"; include CWE tag only if present | ||
# and non-zero | ||
tags = ["security"] | ||
cwe_id = (issue_dict.get("issue_cwe") or {}).get("id") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this is unnecessarily bad python code. Just,
cwe_id = (issue_dict.get("issue_cwe") or {}).get("id") | |
cwe_id = issue_dict.get("issue_cwe", {}).get("id") |
I don't know why I have to give you feedback for your LLM to process at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering if it was just us getting spammed by this user so I went see what other PRs they made and found this. It looks suspiciously AI generated, glad you came to the same conclusion.
(4 PRs about the same topic in a week ! each time growing exponentially)
secdev/scapy#4843
secdev/scapy#4844
secdev/scapy#4845
secdev/scapy#4848
Anyway good luck. Personally I think I'll throw in a 30days ban.
@sigmavirus24, @ericwb I’ll address the points you raised (docstring/reST fixes, variable naming consistency, removing the empty except, simplifying confidence/CWE handling, and trimming unnecessary comments) and push the updates immediately I have done the corrections. I’ll be available to iterate further if anything else comes up. @gpotter2 |
This PR improves the SARIF formatter with the following changes:
partialFingerprints
(primaryLocationLineHash
) for stable deduplication across runs/refactors.tags
for better rule categorization.precision
based on Bandit’s confidence levels (HIGH/MEDIUM/LOW → high/medium/low).artifactLocation.uri
to match test expectations.original_path
property for clarity and debugging.~ All unit tests in
tests/unit/formatters/
pass.~ Functional tests are unchanged by this PR.
This should make Bandit’s SARIF output more compliant and interoperable with tools that consume SARIF logs.
Closes #646
Related to #737