inosmeet
Contributor

I've disabled the test_language_package_none_found test for FAIL-PKG-INFO because it didn't really make any sense. The python-parser for PKG-INFO or metadata was constructed solely to bypass this test, which didn't make any sense either.

Also, adding the table to our existing database (cve.db) makes me wonder whether I should add the purl2cpe table to it too.
Let me know what you think @terriko @anthonyharrison

Contributor

terriko commented Jun 20, 2024

Hm, that's a good question. Quick brainstorm on pros and cons:

pro:

  • don't have to open two separate dbs into memory
  • might very slightly speed up the queries as a result

con:

  • cve.db will be that much bigger (though likely tiny compared to the sheer volume of cve info)
  • less easy for someone to load their own custom purl2cpe.db into place for use (would anyone want this?)
  • does involve changing the code you have a bit

I don't think there's an obvious winner here, but if you think it's worth loading it into cvedb, the cons seem pretty minimal.
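
To make the first "pro" concrete, here is a minimal sketch of what a single-database lookup could look like. Everything in it is an assumption for illustration: the table names (cve_demo, purl2cpe_demo), their columns, the helper names, and the placeholder CVE ids are hypothetical and are not the real cve.db schema. With both tables in one file, one connection and one JOIN is enough; with a separate purl2cpe.db you would need a second connection or SQLite's ATTACH DATABASE.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding both (hypothetical) tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cve_demo (cve_number TEXT, vendor TEXT, product TEXT);
        CREATE TABLE purl2cpe_demo (purl TEXT, vendor TEXT, product TEXT);
        -- Placeholder rows only; these are not real CVE ids or mappings.
        INSERT INTO cve_demo VALUES
            ('CVE-0000-0001', 'demo_vendor', 'demo_product'),
            ('CVE-0000-0002', 'other_vendor', 'other_product');
        INSERT INTO purl2cpe_demo VALUES
            ('pkg:pypi/demo_product', 'demo_vendor', 'demo_product');
        """
    )
    return conn


def cves_for_purl(conn: sqlite3.Connection, purl: str) -> list:
    """Resolve a purl to vendor/product and fetch matching CVEs in one query."""
    rows = conn.execute(
        """
        SELECT c.cve_number
        FROM purl2cpe_demo AS p
        JOIN cve_demo AS c
          ON c.vendor = p.vendor AND c.product = p.product
        WHERE p.purl = ?
        ORDER BY c.cve_number
        """,
        (purl,),
    ).fetchall()
    return [row[0] for row in rows]
```

The trade-off from the "con" list still applies: swapping in a custom purl2cpe mapping would mean replacing a table inside cve.db rather than dropping in a separate file.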

Contributor

terriko commented Jun 20, 2024

BTW, since I didn't say in my nitpicky code review: this is looking good. I think we can punt on whether we need to add product into the dedupe table for now, but let's remove that test completely and probably not add the unknowns unless there was a reason for doing it that I misunderstood.

Contributor

@terriko terriko left a comment


Summarizing from re-review: need to remove the UNKNOWNs and fix the zstandard test (either take zstandard out or replace it with something like pydantic that doesn't have the same version problem). But I see you got the FAIL-PKG-INFO test file removed, thanks!

Contributor

@terriko terriko left a comment


Rethought comments below. I still want a change but it's a different one.

Comment on lines 178 to 193
for item in vendorlist:
    if item.product_info.vendor in invalidVendorList:
        vendorlist_filtered.append(
            ScanInfo(
                ProductInfo(
                    "UNKNOWN",
                    item.product_info.product,
                    item.product_info.version,
                    "/usr/local/bin/product",
                    item.product_info.purl,
                ),
                item.file_path,
            )
        )
    else:
        vendorlist_filtered.append(item)

Okay, after re-thinking, I think what we need is something more like this:

Suggested change
-for item in vendorlist:
-    if item.product_info.vendor in invalidVendorList:
-        vendorlist_filtered.append(
-            ScanInfo(
-                ProductInfo(
-                    "UNKNOWN",
-                    item.product_info.product,
-                    item.product_info.version,
-                    "/usr/local/bin/product",
-                    item.product_info.purl,
-                ),
-                item.file_path,
-            )
-        )
-    else:
-        vendorlist_filtered.append(item)
+for item in vendorlist:
+    if item.product_info.vendor not in invalidVendorList:
+        vendorlist_filtered.append(item)
+# if we never found a valid vendor, add an unknown entry for this product
+if len(vendorlist_filtered) == 0:
+    vendorlist_filtered.append( ... )  # FIXME: fill this in as above, but with a real filename

As in, you're right that we want an unknown entry if there are no valid vendors, but we only need one per product.

When you fill it in, please make sure you're using self.filename instead of "/usr/local/bin/product"! If you're doing it because it makes the tests work, we should err on the side of changing the tests.
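
For reference, a rough standalone sketch of the logic described above, not the code that was merged. ProductInfo and ScanInfo here are simplified stand-ins with the field order used in the snippet under review, filter_vendors and its filename parameter (standing in for self.filename) are hypothetical, and taking the product, version and purl for the fallback entry from the first item in the list is an assumption.

```python
from typing import List, NamedTuple, Optional, Set


class ProductInfo(NamedTuple):
    # Stand-in for the project's ProductInfo; field order follows the reviewed snippet.
    vendor: str
    product: str
    version: str
    location: str
    purl: Optional[str] = None


class ScanInfo(NamedTuple):
    # Stand-in for the project's ScanInfo.
    product_info: ProductInfo
    file_path: str


def filter_vendors(
    vendorlist: List[ScanInfo], invalid_vendors: Set[str], filename: str
) -> List[ScanInfo]:
    """Keep entries with valid vendors; fall back to a single UNKNOWN entry."""
    filtered = [
        item for item in vendorlist if item.product_info.vendor not in invalid_vendors
    ]
    # If we never found a valid vendor, add exactly one unknown entry for
    # this product, pointing at the real file instead of a hard-coded path.
    if not filtered and vendorlist:
        first = vendorlist[0]  # assumption: any entry can supply product/version
        info = first.product_info
        filtered.append(
            ScanInfo(
                ProductInfo("UNKNOWN", info.product, info.version, filename, info.purl),
                first.file_path,
            )
        )
    return filtered
```

With two candidates that both have invalid vendors, this yields one UNKNOWN entry rather than two, which is the "only one per product" behaviour asked for in the review.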

inosmeet added 4 commits June 22, 2024 09:26
added the table inside our existing cve database

Signed-off-by: Meet Soni <[email protected]>
the test did not make any sense as the package should be found

Signed-off-by: Meet Soni <[email protected]>
Contributor

@terriko terriko left a comment


Ready to merge! Thanks for iterating with me on this!

@terriko terriko merged commit b95b8be into intel:main Jun 24, 2024
@inosmeet inosmeet deleted the deduplication branch June 28, 2024 05:00