@ziadhany ziadhany commented Jan 8, 2026

Signed-off-by: ziad hany <ziadhany2016@gmail.com>

ziadhany commented Jan 8, 2026

INFO 2026-01-26 19:15:30.575619 UTC Pipeline [AlpineLinuxImporterPipeline] starting
INFO 2026-01-26 19:15:30.575748 UTC Step [collect_and_store_advisories] starting
Importing data using alpine_linux_importer_v2
INFO 2026-01-26 22:39:08.084020 UTC Successfully collected 108,252 advisories
INFO 2026-01-26 22:39:08.084139 UTC Step [collect_and_store_advisories] completed in 12218 seconds (3.4 hours)
INFO 2026-01-26 22:39:08.084171 UTC Pipeline completed in 12218 seconds (3.4 hours)

from vulnerabilities.models import AdvisoryV2
from django.db.models import Count
duplicates = (
    AdvisoryV2.objects
    .values('avid')
    .annotate(count=Count('id'))
    .filter(count__gt=1)
)
len(duplicates)
Out[2]: 0
AdvisoryV2.objects.count()
Out[3]: 108252

…aseImporterPipelineV2

Signed-off-by: ziad hany <ziadhany2016@gmail.com>

ziadhany commented Jan 15, 2026

@TG1999 @pombredanne I have a question about the Alpine migration. We fetch one URL and process its data without grouping by CVE.

The problem is that each URL reports a package version along with the CVEs fixed in it, so how can we obtain a unique advisory identifier for this importer? Would it be a good idea to restructure the data into one large mapping, using the CVE as the unique identifier?

Proposed structure:
CVE: [purl_1, purl_2, ...]

Example:
Package: aom

Sources:
https://secdb.alpinelinux.org/v3.22/main.json -> CVEs: "CVE-2021-30473", "CVE-2021-30474", "CVE-2021-30475"
https://secdb.alpinelinux.org/v3.21/main.json -> CVEs: "CVE-2021-30473", "CVE-2021-30474", "CVE-2021-30475"
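
A minimal sketch of the proposed `CVE: [purl_1, purl_2, ...]` mapping, for discussion only. It assumes each fetched file parses to a dict with `distroversion`, `reponame`, and a `packages` list whose `pkg` entries carry `name` and `secfixes` (fixed version -> CVE list). The function name `group_purls_by_cve`, the purl layout, and the `0.0.0-r0` version in the sample are hypothetical placeholders, not what the importer does today.

```python
from collections import defaultdict


def group_purls_by_cve(documents):
    """Invert secdb-style documents into {CVE: sorted list of purl strings}."""
    purls_by_cve = defaultdict(set)
    for doc in documents:
        distroversion = doc["distroversion"]
        reponame = doc["reponame"]
        for entry in doc["packages"]:
            pkg = entry["pkg"]
            for version, cves in pkg["secfixes"].items():
                # Assumed purl layout; the importer may build purls differently.
                purl = (
                    f"pkg:apk/alpine/{pkg['name']}@{version}"
                    f"?distroversion={distroversion}&reponame={reponame}"
                )
                for cve in cves:
                    # A set keeps each purl once even if a file repeats an entry.
                    purls_by_cve[cve].add(purl)
    return {cve: sorted(purls) for cve, purls in purls_by_cve.items()}


# Hand-written sample shaped like two secdb files; "0.0.0-r0" is a placeholder version.
SAMPLE_DOCUMENTS = [
    {
        "distroversion": release,
        "reponame": "main",
        "packages": [
            {
                "pkg": {
                    "name": "aom",
                    "secfixes": {
                        "0.0.0-r0": [
                            "CVE-2021-30473",
                            "CVE-2021-30474",
                            "CVE-2021-30475",
                        ]
                    },
                }
            }
        ],
    }
    for release in ("v3.22", "v3.21")
]

for cve, purls in sorted(group_purls_by_cve(SAMPLE_DOCUMENTS).items()):
    print(cve, purls)
```

With this shape, every CVE becomes one key and collects one purl per release file that lists it, which is the trade-off in question: a unique identifier per CVE, at the cost of holding all release files in memory before emitting anything.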

Signed-off-by: ziad hany <ziadhany2016@gmail.com>
)

for cve in aliases:
    advisory_id = f"{pkg_infos['name']}/{qualifiers['distroversion']}/{cve}"

@ziadhany ziadhany Jan 26, 2026

ex:

apache2/v3.20/2.4.26-r0/CVE-2017-7668
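
A small sketch of why the fixed version may need to be part of the ID, assuming the duplication comes from the same CVE being listed under more than one fixed version of a package within one release. The helper names are hypothetical, and the second record below (`9.9.9-r0`) is invented purely to force a collision; it is not real secdb data.

```python
from collections import Counter


def advisory_id_without_version(name, distroversion, cve):
    # Same shape as the f-string in the diff: name/distroversion/cve.
    return f"{name}/{distroversion}/{cve}"


def advisory_id_with_version(name, distroversion, version, cve):
    # Same shape as the example ID: name/distroversion/version/cve.
    return f"{name}/{distroversion}/{version}/{cve}"


def duplicated(ids):
    """Return the IDs that occur more than once, sorted."""
    return sorted(value for value, count in Counter(ids).items() if count > 1)


# Hypothetical records: (name, distroversion, fixed version, CVE).
RECORDS = [
    ("apache2", "v3.20", "2.4.26-r0", "CVE-2017-7668"),
    ("apache2", "v3.20", "9.9.9-r0", "CVE-2017-7668"),  # invented version
]

without_version = [advisory_id_without_version(n, d, c) for n, d, _, c in RECORDS]
with_version = [advisory_id_with_version(n, d, v, c) for n, d, v, c in RECORDS]

print(duplicated(without_version))
print(duplicated(with_version))
```

Under that assumption, the three-part ID collides for the two records while the four-part ID stays unique.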

Fix duplication on advisory_id

Signed-off-by: ziad hany <ziadhany2016@gmail.com>