Skip to content

Conversation

@Tahoora-Tabassum
Copy link

@Tahoora-Tabassum Tahoora-Tabassum commented Jan 7, 2026

Description

This PR introduces a -v/--verbose CLI flag for the databusclient, enabling detailed HTTP request and response logging during interactions with Databus and Vault. This helps users debug network or authentication issues, trace requests, and inspect API communication without changing default behavior.

Key Features

  • Added top-level -v / --verbose CLI option for all commands.
  • Detailed HTTP logging via Python’s logging module (urllib3), active only when -v is used.
  • Default behavior remains silent; sensitive data is not exposed.
  • Tests updated to remove network dependency and validate verbose behavior.
  • Documentation updated to include verbose usage examples.

Related Issues

Resolves #27

Type of change

  • New feature (non-breaking)
  • Documentation update

Checklist

  • Code follows project style guidelines.
  • Self-reviewed code changes.
  • Tests added/updated for verbose behavior.
  • Documentation updated.
  • All unit tests pass (poetry run pytest).
  • No linting errors (poetry run ruff check).

Summary

  • New -v/--verbose flag for detailed HTTP/Vault request logging.
  • Tests adjusted to stabilize auth-related behavior.
  • CLI documentation updated with usage examples.

Summary by CodeRabbit

  • New Features

    • Global -v/--verbose flag: enables redacted HTTP request/response logging for downloads
    • mkdist command for building distribution strings with content-variant support
    • Shell completion command (bash/zsh)
  • Documentation

    • README updated with verbose flag, examples and security note advising against enabling verbose in public CI
    • WARNING: README contains unresolved merge conflict markers; CLI option doc renamed --version-id → --versionid
  • Tests

    • Added unit/integration tests validating verbose logging and redaction behavior

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Jan 7, 2026

📝 Walkthrough

Walkthrough

Adds a global CLI verbose flag (-v/--verbose) that enables redacted HTTP request/response logging. Implements conditional redacted HTTP logging and threads a verbose parameter through download, SPARQL, and Vault token-exchange flows. Adds CLI commands (mkdist, completion) and tests; README and CHANGELOG updated.

Changes

Cohort / File(s) Summary
Changelog & README
CHANGELOG.md, README.md
Adds Unreleased changelog entry and README docs for verbose mode, mkdist, and completion; README contains unresolved merge conflict markers.
HTTP Logging Utilities & Utils
databusclient/api/utils.py
Adds redacted HTTP logging paths (new/extended log_http usage) and a verbose parameter on fetch_databus_jsonld() to emit redacted request/response logs when enabled.
Download API Instrumentation
databusclient/api/download.py
Many download/SPARQL/Vault functions gain verbose: bool = False and call redacted HTTP logging at HEAD/GET/POST and auth exchanges; internal call sites updated to propagate verbose.
CLI Interface & Commands
databusclient/cli.py
Top-level app() accepts global -v/--verbose and stores it in context; adds mkdist and completion commands; download() updated to accept ctx and forward verbose flag.
Tests: Verbose & CLI
tests/test_cli_verbose.py, tests/test_utils_verbose.py, tests/test_download_auth.py, tests/test_cli.py
New/updated tests covering CLI verbose logging and redaction, mkdist behavior, completion output, and verbose integration across download/SPARQL/Vault flows.
Shell/Test Script
test.sh
CLI flag renamed from --version-id to --versionid.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant CLI
    participant DownloadAPI
    participant SPARQLEndpoint
    participant DatabusHTTP
    participant VaultAuth

    User->>CLI: run download -v ...
    CLI->>DownloadAPI: download(..., verbose=True)
    DownloadAPI->>SPARQLEndpoint: GET/POST SPARQL (log_http -> redacted log)
    DownloadAPI->>DatabusHTTP: HEAD/GET file URL (log_http -> redacted log)
    alt 401 requires vault
        DownloadAPI->>VaultAuth: POST token exchange (log_http -> redacted log)
        VaultAuth-->>DownloadAPI: return token
        DownloadAPI->>DatabusHTTP: retry GET with token (log_http -> redacted log)
    end
    DownloadAPI-->>CLI: return status/output
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning Most changes align with the verbose logging objective; however, the README contains unresolved merge conflicts and the mkdist command appears to be an out-of-scope feature addition beyond issue #27. Resolve merge conflicts in README.md and clarify whether mkdist command is part of this PR's scope or should be separated into a different PR.
Docstring Coverage ⚠️ Warning Docstring coverage is 51.22% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the primary change: adding a verbose flag for HTTP request logging and references the related issue.
Description check ✅ Passed The description covers key features, related issues, type of change, and checklist items aligned with the template, though some checklist items lack explicit confirmation.
Linked Issues check ✅ Passed The PR successfully implements the verbose (-v) parameter to provide detailed HTTP request/response logging as requested in issue #27, with proper redaction of sensitive data.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
README.md (1)

318-353: Resolve unresolved merge conflict markers.

The documentation contains unresolved merge conflict markers (<<<<<<< HEAD, =======, >>>>>>> upstream/main) that will break the rendered documentation and must be resolved before merging.

📍 Locations of merge conflicts

Additional conflict markers appear at:

  • Lines 370-385
  • Lines 421-438

Please resolve all conflicts by:

  1. Reviewing both versions of the conflicting sections
  2. Choosing the correct content or merging both appropriately
  3. Removing all conflict markers
  4. Ensuring documentation consistency
tests/test_download_auth.py (1)

1-13: Remove duplicate imports.

The file contains duplicate imports that should be consolidated:

  • patch is imported on both line 1 and line 11
  • DownloadAuthError is imported on both line 10 and line 13
🔧 Proposed fix
 from unittest.mock import Mock, patch
 
 import pytest
 
 import requests
 import logging
 
 import databusclient.api.download as dl
 
-from databusclient.api.download import VAULT_REQUIRED_HOSTS, DownloadAuthError
-from unittest.mock import patch
-
-from databusclient.api.download import download, DownloadAuthError
+from databusclient.api.download import VAULT_REQUIRED_HOSTS, DownloadAuthError, download
🤖 Fix all issues with AI agents
In @databusclient/api/utils.py:
- Around line 58-68: The _redact_headers function currently uses
key.startswith("x-api-key") which unintentionally matches headers like
"x-api-key-extra"; change that logic to check for an exact header name match
(compare key == "x-api-key") alongside the existing authorization check so only
the intended "X-API-KEY" header is redacted; update the condition in
_redact_headers to use equality for the lowercase key and keep the rest of the
function intact.

In @README.md:
- Around line 524-564: The README contains duplicated "mkdist command" and
"Completion" sections; remove the second duplicate block (the repeated "mkdist
command" and its following "Completion" snippet) so only the first occurrence
remains, ensuring the remaining section titles "mkdist command" and "Completion"
are preserved exactly.

In @test.sh:
- Line 4: Replace the incorrect CLI flag usage `--versionid` in test.sh with the
correct `--version-id` to match the flag defined in databusclient/cli.py; also
search README.md for any leftover merge-conflict or inconsistent occurrences of
`--versionid`/`--version-id` and normalize them to `--version-id` so docs and
tests use the exact flag name declared in the CLI.
🧹 Nitpick comments (6)
README.md (1)

323-327: Add language specifiers to code blocks.

Several code blocks are missing language specifiers for proper syntax highlighting. This is flagged by markdownlint and reduces documentation quality.

♻️ Suggested improvement

Add appropriate language identifiers to code blocks. For example:

-```
+```bash
 databusclient deploy --versionid https://databus.dbpedia.org/...

Additional locations missing language specifiers:
- Lines 529, 534, 541, 550, 555, 562

Use `bash` or `shell` for command-line examples.

</details>

</blockquote></details>
<details>
<summary>databusclient/api/utils.py (1)</summary><blockquote>

`71-71`: **Move import to top of file.**

The `import logging` statement should be at the top of the file with other imports, per PEP 8 conventions.


<details>
<summary>Proposed fix</summary>

Move the import to the top of the file:

```diff
 from typing import Optional, Tuple
 
 import requests
+import logging

And remove line 71.

databusclient/cli.py (1)

274-277: Consider using Click's default parameter for consistency.

The default value "bash" is specified in the function signature rather than in the click.argument decorator. While this works, using Click's default parameter would be more idiomatic and would show the default in help text.

Proposed fix
 @app.command()
-@click.argument("shell", type=click.Choice(["bash","zsh","fish","powershell"]), required=False)
-def completion(shell="bash"):
+@click.argument("shell", type=click.Choice(["bash","zsh","fish","powershell"]), default="bash")
+def completion(shell):
     click.echo(f"Run: eval \"$(_DATABUSCLIENT_COMPLETE=source_{shell} python -m databusclient)\"")
databusclient/api/download.py (2)

75-83: Consider importing log_http at module level to reduce repetition.

The from databusclient.api.utils import log_http statement is repeated inside each conditional block. Moving it to the module-level imports would reduce duplication and slightly improve performance by avoiding repeated import lookups.

Proposed fix

Add to the imports at the top:

 from databusclient.api.utils import (
     fetch_databus_jsonld,
     get_databus_id_parts_from_file_url,
+    log_http,
 )

Then remove all the inline from databusclient.api.utils import log_http statements throughout the file.


304-311: Hardcoded status=200 in SPARQL logging may not reflect actual status.

The log_http call after the SPARQL query uses a hardcoded status=200. While this is generally true on success (since exceptions would be raised on failure), it's not guaranteed to be accurate. SPARQLWrapper doesn't easily expose the actual HTTP status code.

Consider either:

  • Omitting the status parameter for the response log
  • Adding a comment explaining the assumption
Proposed fix (omit status)
     if verbose:
         from databusclient.api.utils import log_http

-        log_http("POST", endpoint_url, req_headers={"X-API-KEY": databus_key} if databus_key is not None else None, status=200)
+        # Note: actual status not available from SPARQLWrapper; logged as successful since no exception was raised
+        log_http("POST", endpoint_url, req_headers={"X-API-KEY": databus_key} if databus_key is not None else None)
tests/test_cli_verbose.py (1)

27-38: Patch targets could be more isolated for better test robustness.

The current patches at requests.head and requests.get work correctly, since the test invokes only the download command and all mocked responses are configured. However, the patches affect the entire requests module globally. Since download.py uses requests directly and also calls utils.py (which also imports requests), patching at databusclient.api.download.requests.head and databusclient.api.download.requests.get would provide better test isolation and avoid unintended interactions if other modules using requests are imported in the future.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9901fca and faadd58.

📒 Files selected for processing (12)
  • CHANGELOG.md
  • PR_BODY.md
  • README.md
  • databusclient/api/download.py
  • databusclient/api/utils.py
  • databusclient/cli.py
  • file.txt
  • test.sh
  • tests/test_cli.py
  • tests/test_cli_verbose.py
  • tests/test_download_auth.py
  • tests/test_utils_verbose.py
🧰 Additional context used
🧬 Code graph analysis (5)
databusclient/cli.py (1)
databusclient/api/deploy.py (1)
  • create_distribution (176-207)
tests/test_cli.py (1)
databusclient/cli.py (1)
  • app (18-32)
tests/test_cli_verbose.py (2)
tests/test_download_auth.py (1)
  • iter_content (22-26)
databusclient/cli.py (1)
  • app (18-32)
tests/test_utils_verbose.py (3)
databusclient/api/download.py (3)
  • download (669-797)
  • _get_sparql_query_of_collection (249-276)
  • _query_sparql_endpoint (279-311)
databusclient/api/utils.py (1)
  • fetch_databus_jsonld (33-55)
tests/conftest.py (1)
  • convert (24-25)
databusclient/api/download.py (1)
databusclient/api/utils.py (2)
  • log_http (74-103)
  • fetch_databus_jsonld (33-55)
🪛 markdownlint-cli2 (0.18.1)
README.md

323-323: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


327-327: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


529-529: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


534-534: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


541-541: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


550-550: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


555-555: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


559-559: Multiple headings with the same content

(MD024, no-duplicate-heading)


562-562: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (17)
CHANGELOG.md (1)

1-7: LGTM!

The changelog entry clearly documents the new verbose feature, header redaction behavior, and security considerations. The format is standard and informative.

PR_BODY.md (1)

1-19: LGTM!

The PR description is well-structured and comprehensive. It clearly explains the feature, changes, motivation, and security considerations.

README.md (1)

185-186: LGTM!

The verbose flag documentation is clear and appropriately includes security warnings about redacted headers and avoiding verbose in public CI logs.

tests/test_download_auth.py (1)

110-137: LGTM!

The test effectively validates the verbose logging feature with proper token redaction. The test setup appropriately mocks the HTTP auth flow, and assertions correctly verify that:

  1. HTTP logging is active when verbose is enabled
  2. Sensitive headers are redacted
  3. Actual token values don't leak into logs
tests/test_cli.py (4)

5-19: LGTM! Good test coverage for mkdist happy path.

The test correctly validates:

  • Multiple --cv options are accepted
  • Keys are sorted alphabetically in output (a=1_b=2 not b=2_a=1)
  • All components (format, compression, sha-length) are correctly assembled

22-26: LGTM! Validates error handling for malformed content variants.


29-35: LGTM! Validates sha-length format enforcement.


38-42: LGTM! Basic validation for shell completion output.

tests/test_utils_verbose.py (4)

12-22: LGTM! Well-structured mock response helper.

The make_response helper correctly simulates a requests.Response object with status_code, headers, text, and raise_for_status behavior.


25-34: LGTM! Validates API key redaction in verbose logging.

The test correctly verifies that:

  • HTTP activity is logged with [HTTP] GET
  • Sensitive data is replaced with REDACTED
  • The actual secret value SECRET never appears in logs

38-47: LGTM! Validates redaction in SPARQL collection query flow.


51-76: LGTM! Good mock implementation for SPARQLWrapper testing.

The MockSPARQL class adequately simulates the SPARQLWrapper interface to test verbose logging around SPARQL POST operations.

databusclient/api/utils.py (1)

74-103: LGTM! Robust HTTP logging with graceful fallbacks.

The log_http function handles various response header formats defensively, which is appropriate for logging utilities that shouldn't break on unexpected input types. The fallback chain for handling Mock objects and different response types is well thought out.

databusclient/cli.py (2)

15-32: LGTM! Well-implemented verbose flag with proper logging setup.

The verbose flag is correctly:

  • Defined at the group level for global availability
  • Stored in Click context for propagation to subcommands
  • Configured with a handler only if none exists (prevents duplicate handlers)
  • Set to DEBUG level when enabled

234-271: LGTM! Well-structured mkdist command with proper validation.

The command properly:

  • Validates content variant format and forbidden characters
  • Prevents duplicate keys
  • Validates sha-length with a strict regex pattern
  • Ensures deterministic output via sorted keys
  • Supports both plain string and JSON output formats
databusclient/api/download.py (2)

9-11: LGTM! Module-level logger setup.

The logger is correctly configured at the module level, allowing consistent logging behavior across all functions.


669-797: LGTM! Consistent verbose flag propagation throughout download flows.

The verbose parameter is correctly accepted by the main download() function and propagated to all sub-functions across all code paths (collections, files, versions, artifacts, groups, and SPARQL queries).

Comment on lines 58 to 68
def _redact_headers(headers):
if not headers:
return headers
redacted = {}
for k, v in headers.items():
key = k.lower()
if key == "authorization" or key.startswith("x-api-key"):
redacted[k] = "REDACTED"
else:
redacted[k] = v
return redacted
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Issue with x-api-key header matching logic.

The condition key.startswith("x-api-key") will match x-api-key exactly but also any header starting with that prefix (e.g., x-api-key-extra). This is likely unintended. Additionally, the header name is X-API-KEY (exact match), so startswith seems overly broad.

Proposed fix
 def _redact_headers(headers):
     if not headers:
         return headers
     redacted = {}
     for k, v in headers.items():
         key = k.lower()
-        if key == "authorization" or key.startswith("x-api-key"):
+        if key == "authorization" or key == "x-api-key":
             redacted[k] = "REDACTED"
         else:
             redacted[k] = v
     return redacted
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _redact_headers(headers):
if not headers:
return headers
redacted = {}
for k, v in headers.items():
key = k.lower()
if key == "authorization" or key.startswith("x-api-key"):
redacted[k] = "REDACTED"
else:
redacted[k] = v
return redacted
def _redact_headers(headers):
if not headers:
return headers
redacted = {}
for k, v in headers.items():
key = k.lower()
if key == "authorization" or key == "x-api-key":
redacted[k] = "REDACTED"
else:
redacted[k] = v
return redacted
🤖 Prompt for AI Agents
In @databusclient/api/utils.py around lines 58 - 68, The _redact_headers
function currently uses key.startswith("x-api-key") which unintentionally
matches headers like "x-api-key-extra"; change that logic to check for an exact
header name match (compare key == "x-api-key") alongside the existing
authorization check so only the intended "X-API-KEY" header is redacted; update
the condition in _redact_headers to use equality for the lowercase key and keep
the rest of the function intact.


databusclient deploy \
--version-id "https://d8lr.tools.dbpedia.org/hopver/testGroup/testArtifact/1.0-alpha/" \
--versionid "https://d8lr.tools.dbpedia.org/hopver/testGroup/testArtifact/1.0-alpha/" \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Find the test.sh file and verify the line in question
find . -name "test.sh" -type f 2>/dev/null | head -20

Repository: dbpedia/databus-python-client

Length of output: 82


🏁 Script executed:

# Search for references to both flag names across the codebase
rg -n '\-\-version-id|\-\-versionid|version_id' --type-list | head -5

Repository: dbpedia/databus-python-client

Length of output: 462


🏁 Script executed:

# Search for both flag variations
rg -n 'version-id|versionid' -C 2

Repository: dbpedia/databus-python-client

Length of output: 3596


🏁 Script executed:

# Read the CLI implementation around the flag definition
sed -n '35,45p' databusclient/cli.py

Repository: dbpedia/databus-python-client

Length of output: 517


🏁 Script executed:

# Search for all CLI option definitions related to version
rg -n 'version.id|versionid' databusclient/cli.py -C 3

Repository: dbpedia/databus-python-client

Length of output: 1846


🏁 Script executed:

# Check if there are any aliases or alternative names defined
rg -n '@click.option|--version' databusclient/cli.py -C 2 | head -50

Repository: dbpedia/databus-python-client

Length of output: 1831


Fix test.sh: Use correct CLI flag name --version-id instead of --versionid.

The actual CLI implementation in databusclient/cli.py defines the flag as --version-id (with hyphen), but test.sh line 4 uses --versionid (no hyphen). This will cause the test to fail at runtime. Additionally, the README.md file contains unresolved merge conflicts with inconsistent flag names that should be resolved to use --version-id consistently.

🤖 Prompt for AI Agents
In @test.sh at line 4, Replace the incorrect CLI flag usage `--versionid` in
test.sh with the correct `--version-id` to match the flag defined in
databusclient/cli.py; also search README.md for any leftover merge-conflict or
inconsistent occurrences of `--versionid`/`--version-id` and normalize them to
`--version-id` so docs and tests use the exact flag name declared in the CLI.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
README.md (3)

324-359: Critical: Unresolved merge conflict markers must be removed.

The README contains unresolved merge conflict markers (<<<<<<<, =======, >>>>>>>). These must be resolved before the PR can be merged.

Please resolve the conflicts by:

  1. Choosing the correct version of the content
  2. Removing all conflict markers
  3. Ensuring consistent formatting and examples

376-391: Critical: Additional unresolved merge conflict markers.

More merge conflict markers present in this section. Please resolve these conflicts and ensure the documentation is consistent throughout.


427-444: Critical: More unresolved merge conflict markers.

This section also contains unresolved merge conflict markers. All merge conflicts must be resolved before merging.

🤖 Fix all issues with AI agents
In @README.md:
- Around line 530-570: Remove the duplicated "### mkdist command" and "##
Completion" sections so only one copy of each remains; locate the repeated
headings ("mkdist command" and "Completion") and delete the second occurrence
(the block starting at the duplicate heading), preserving the first instance
with its Usage, Example, and code blocks; also ensure fenced code blocks are
marked with the language (e.g., ```bash or ```text/```python as appropriate) to
satisfy markdownlint and keep formatting consistent.
- Line 310: The README has an inconsistent CLI option name `--versionid`; change
it to the hyphenated `--version-id` to match the CLI code and other
examples—update the example line that currently shows `--versionid TEXT` to
`--version-id TEXT` and verify any surrounding examples or references use the
same `--version-id` token for consistency.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between faadd58 and a6bb139.

📒 Files selected for processing (5)
  • CHANGELOG.md
  • README.md
  • databusclient/api/utils.py
  • databusclient/cli.py
  • tests/test_download_auth.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • databusclient/api/utils.py
  • CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (2)
databusclient/cli.py (2)
databusclient/api/download.py (1)
  • download (669-797)
databusclient/api/deploy.py (1)
  • create_distribution (218-249)
tests/test_download_auth.py (2)
tests/test_utils_verbose.py (1)
  • make_response (12-22)
databusclient/api/download.py (1)
  • _download_file (31-213)
🪛 markdownlint-cli2 (0.18.1)
README.md

329-329: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


333-333: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


535-535: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


540-540: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


547-547: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


556-556: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


561-561: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


565-565: Multiple headings with the same content

(MD024, no-duplicate-heading)


568-568: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (6)
tests/test_download_auth.py (1)

107-134: LGTM! Well-structured test for verbose logging and redaction.

The test properly validates that:

  • HTTP operations are logged when verbose is enabled
  • Sensitive authorization tokens are redacted ("REDACTED" appears)
  • Actual token values ("ACCESS", "VAULT") do not leak into logs

This aligns with the PR objectives and provides good coverage for the verbose feature.

databusclient/cli.py (4)

16-36: LGTM! Well-implemented global verbose flag.

The verbose flag is properly integrated as a global CLI option with appropriate logging configuration:

  • Context passing is correctly set up for downstream commands
  • Logger configuration is conditional and defensive (checks for existing handlers)
  • Log level appropriately set to DEBUG when verbose is enabled

182-205: LGTM! Correct context passing for verbose flag.

The download command properly receives and propagates the verbose flag from the global context to the underlying API call. The use of .get("verbose", False) provides a safe default.


232-271: LGTM! Robust implementation of mkdist command.

The command includes thorough validation:

  • Content variant format and key constraints are properly enforced
  • SHA-256 validation regex is correct (64 hex chars + ':' + length)
  • Deterministic ordering via sorted keys prevents ambiguity
  • Clear error messages for invalid inputs

The implementation aligns well with the underlying create_distribution API.


274-277: LGTM! Simple and effective completion command.

The completion command provides helpful instructions for enabling shell completion across multiple shells. Relies on Click's built-in completion support.

README.md (1)

191-192: LGTM! Clear documentation of verbose flag.

The documentation properly explains the verbose flag functionality and includes an appropriate warning about avoiding its use in public CI logs due to the sensitive nature of HTTP request/response details.


Options:
--version-id TEXT Target databus version/dataset identifier of the form <h
--versionid TEXT Target databus version/dataset identifier of the form <h
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix inconsistent option name.

Line 310 shows --versionid but the CLI code and other examples use --version-id (with hyphen). Please correct this to --version-id for consistency.

🤖 Prompt for AI Agents
In @README.md at line 310, The README has an inconsistent CLI option name
`--versionid`; change it to the hyphenated `--version-id` to match the CLI code
and other examples—update the example line that currently shows `--versionid
TEXT` to `--version-id TEXT` and verify any surrounding examples or references
use the same `--version-id` token for consistency.

Comment on lines +530 to +570
### mkdist command
Create a distribution string from components.
Usage:
```
databusclient mkdist URL --cv key=value --cv key2=value2 --format ttl --compression gz --sha-length <sha256hex>:<length>
```
Example:
```
python -m databusclient mkdist "https://example.org/file.ttl" --cv lang=en --cv part=sorted --format ttl --compression gz --sha-length aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa:12345
```
## Completion
Enable shell completion (bash example):
```
eval "$(_DATABUSCLIENT_COMPLETE=source_bash python -m databusclient)"
```
### mkdist command
Create a distribution string from components.
Usage:
```
databusclient mkdist URL --cv key=value --cv key2=value2 --format ttl --compression gz --sha-length <sha256hex>:<length>
```
Example:
```
python -m databusclient mkdist "https://example.org/file.ttl" --cv lang=en --cv part=sorted --format ttl --compression gz --sha-length aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa:12345
```
## Completion
Enable shell completion (bash example):
```
eval "$(_DATABUSCLIENT_COMPLETE=source_bash python -m databusclient)"
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Duplicate sections must be removed.

The mkdist and Completion sections are duplicated (lines 530-549 and 551-570). This duplication likely resulted from incomplete merge conflict resolution.

Please:

  1. Remove the duplicate sections
  2. Keep only one instance of each section
  3. Ensure proper formatting and language specification for code blocks (as flagged by markdownlint)
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

535-535: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


540-540: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


547-547: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


556-556: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


561-561: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


565-565: Multiple headings with the same content

(MD024, no-duplicate-heading)


568-568: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In @README.md around lines 530 - 570, Remove the duplicated "### mkdist command"
and "## Completion" sections so only one copy of each remains; locate the
repeated headings ("mkdist command" and "Completion") and delete the second
occurrence (the block starting at the duplicate heading), preserving the first
instance with its Usage, Example, and code blocks; also ensure fenced code
blocks are marked with the language (e.g., ```bash or ```text/```python as
appropriate) to satisfy markdownlint and keep formatting consistent.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
databusclient/api/download.py (7)

145-166: Critical: verbose parameter missing from function signature but used in the body.

The docstring documents a verbose parameter (line 165), and the function body references verbose at multiple locations (lines 190, 195, 239, 246, 275, 280, 285), but verbose is not in the function signature. This will raise a NameError at runtime.

🐛 Proposed fix: add verbose parameter to signature
 def _download_file(
     url,
     localDir,
     vault_token_file=None,
     databus_key=None,
     auth_url=None,
     client_id=None,
     validate_checksum: bool = False,
     expected_checksum: str | None = None,
+    verbose: bool = False,
 ) -> None:

354-389: Critical: verbose parameter missing from _download_files signature.

The docstring documents verbose (line 374), but the parameter is not in the function signature. Additionally, this function should propagate verbose to _download_file calls.

🐛 Proposed fix
 def _download_files(
     urls: List[str],
     localDir: str,
     vault_token_file: str = None,
     databus_key: str = None,
     auth_url: str = None,
     client_id: str = None,
     validate_checksum: bool = False,
     checksums: dict | None = None,
+    verbose: bool = False,
 ) -> None:

And update the _download_file call:

         _download_file(
             url=url,
             localDir=localDir,
             vault_token_file=vault_token_file,
             databus_key=databus_key,
             auth_url=auth_url,
             client_id=client_id,
             validate_checksum=validate_checksum,
             expected_checksum=expected,
+            verbose=verbose,
         )

561-603: Critical: verbose parameter missing from _download_collection signature but used in body.

The docstring mentions verbose (line 582), and the variable is used on lines 584 and 586, but the signature on lines 561-570 lacks the parameter.

🐛 Proposed fix
 def _download_collection(
     uri: str,
     endpoint: str,
     localDir: str,
     vault_token: str = None,
     databus_key: str = None,
     auth_url: str = None,
     client_id: str = None,
-    validate_checksum: bool = False
+    validate_checksum: bool = False,
+    verbose: bool = False,
 ) -> None:

606-645: Critical: verbose parameter missing from _download_version signature.

The docstring mentions verbose (line 625), and it's used on line 627, but the signature lacks the parameter.

🐛 Proposed fix
 def _download_version(
     uri: str,
     localDir: str,
     vault_token_file: str = None,
     databus_key: str = None,
     auth_url: str = None,
     client_id: str = None,
-    validate_checksum: bool = False,
+    validate_checksum: bool = False,
+    verbose: bool = False,
 ) -> None:

648-695: Critical: verbose parameter missing from _download_artifact signature.

The docstring mentions verbose (line 669), and it's used on lines 671 and 677, but the signature lacks the parameter.

🐛 Proposed fix
 def _download_artifact(
     uri: str,
     localDir: str,
     all_versions: bool = False,
     vault_token_file: str = None,
     databus_key: str = None,
     auth_url: str = None,
     client_id: str = None,
-    validate_checksum: bool = False,
+    validate_checksum: bool = False,
+    verbose: bool = False,
 ) -> None:

762-798: Critical: verbose parameter missing from _download_group signature.

The docstring mentions verbose (line 783), and it's used on line 785, but the signature lacks the parameter. Also, _download_artifact calls need to propagate verbose.

🐛 Proposed fix
 def _download_group(
     uri: str,
     localDir: str,
     all_versions: bool = False,
     vault_token_file: str = None,
     databus_key: str = None,
     auth_url: str = None,
     client_id: str = None,
-    validate_checksum: bool = False,
+    validate_checksum: bool = False,
+    verbose: bool = False,
 ) -> None:

And propagate to _download_artifact:

         _download_artifact(
             artifact_uri,
             localDir,
             all_versions=all_versions,
             vault_token_file=vault_token_file,
             databus_key=databus_key,
             auth_url=auth_url,
             client_id=client_id,
             validate_checksum=validate_checksum,
+            verbose=verbose,
         )

834-859: Critical: verbose parameter missing from download function signature.

The docstring mentions verbose (line 858), and it's used on line 969, but the signature lacks the parameter. This is the public API entry point, so it must accept verbose from the CLI.

🐛 Proposed fix
 def download(
     localDir: str,
     endpoint: str,
     databusURIs: List[str],
     token=None,
     databus_key=None,
     all_versions=None,
     auth_url="https://auth.dbpedia.org/realms/dbpedia/protocol/openid-connect/token",
     client_id="vault-token-exchange",
-    validate_checksum: bool = False
+    validate_checksum: bool = False,
+    verbose: bool = False,
 ) -> None:

Then propagate verbose to all internal function calls throughout the function body.

databusclient/cli.py (1)

151-215: Critical: download command missing @click.pass_context decorator and not propagating verbose.

The function signature includes ctx (line 188), but the @click.pass_context decorator is missing. This will cause a runtime error. Additionally, verbose from ctx.obj is not passed to api_download.

🐛 Proposed fix
 @app.command()
+@click.pass_context
 @click.argument("databusuris", nargs=-1, required=True)
 @click.option(
     "--localdir",

And pass verbose to the API call:

+    verbose = ctx.obj.get("verbose", False)
     try:
         api_download(
             localDir=localdir,
             endpoint=databus,
             databusURIs=databusuris,
             token=vault_token,
             databus_key=databus_key,
             all_versions=all_versions,
             auth_url=authurl,
             client_id=clientid,
-            validate_checksum=validate_checksum
+            validate_checksum=validate_checksum,
+            verbose=verbose,
         )
🤖 Fix all issues with AI agents
In @databusclient/api/download.py:
- Around line 449-452: The verbose logging currently calls log_http("POST",
endpoint_url, ..., status=200) after sparql.query().convert(), which is
misleading because SPARQLWrapper does not expose an HTTP status code; remove the
hardcoded status argument (or pass None) and add a short inline comment in the
verbose block noting that SPARQLWrapper abstracts HTTP status so none is
available; update the log_http call in the verbose branch that references
databus_key and endpoint_url (and the surrounding if verbose block) accordingly.

In @databusclient/api/utils.py:
- Around line 59-71: Add the missing import for log_http at the top of the
module (e.g., from databusclient.api.logging_utils import log_http or adjust the
path to match where log_http is defined) so calls to log_http() in the GET
request flow (the two calls around requests.get) no longer raise NameError; also
insert a single blank line before the def compute_sha256_and_length function to
satisfy PEP8 spacing.
🧹 Nitpick comments (3)
databusclient/api/utils.py (1)

73-75: Missing blank line between function definitions (PEP8 E302).

There should be two blank lines before the compute_sha256_and_length function definition per PEP8 style guidelines.

✏️ Proposed fix
     response.raise_for_status()
     return response.text
+
+
 def compute_sha256_and_length(filepath):
databusclient/api/download.py (1)

190-198: Repeated inline imports of log_http within the same function.

The from databusclient.api.utils import log_http statement is repeated multiple times within _download_file. While functionally correct (Python caches imports), this adds noise and is unconventional.

✏️ Suggestion: move import to function top or module level

Either import once at the top of the function:

def _download_file(...):
    from databusclient.api.utils import log_http
    # ... rest of function

Or import at the module level (preferred if used across multiple functions):

from databusclient.api.utils import log_http

Also applies to: 239-249, 280-288

databusclient/cli.py (1)

281-284: Fix the Click 8.x shell completion environment variable format in the completion command.

The current implementation uses an incorrect environment variable format for Click 8.x. The format should be {shell}_source (e.g., bash_source), not source_{shell}. Additionally, since the project defines a console_script entry point (databusclient), the instruction can invoke it directly rather than via python -m.

For the immediate fix, update line 284 to:

Minimal fix
click.echo(f"# Add this to your shell config:\neval \"$(_DATABUSCLIENT_COMPLETE=bash_source databusclient)\"")

Or more robustly:

click.echo(f"# Add this to your shell config:\neval \"$(_DATABUSCLIENT_COMPLETE={shell}_source databusclient)\"")

For a more complete implementation that actually outputs the completion script (allowing users to pipe/source it directly), use the subprocess approach to invoke the CLI with the completion environment variable set.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a6bb139 and 467bce1.

📒 Files selected for processing (3)
  • databusclient/api/download.py
  • databusclient/api/utils.py
  • databusclient/cli.py
🧰 Additional context used
🧬 Code graph analysis (3)
databusclient/cli.py (1)
databusclient/api/deploy.py (1)
  • create_distribution (218-249)
databusclient/api/utils.py (2)
tests/test_download_auth.py (1)
  • raise_for_status (27-29)
tests/test_utils_verbose.py (1)
  • raise_for_status (18-20)
databusclient/api/download.py (1)
databusclient/api/utils.py (1)
  • fetch_databus_jsonld (40-74)
🔇 Additional comments (1)
databusclient/cli.py (1)

239-278: LGTM! The mkdist command implementation is well-structured.

Good validation of content-variant keys (lines 253-254), duplicate detection (lines 255-256), SHA-length format validation with regex (lines 261-262), and deterministic ordering via sorted keys (line 266).

Comment on lines +449 to +452
if verbose:
from databusclient.api.utils import log_http

log_http("POST", endpoint_url, req_headers={"X-API-KEY": databus_key} if databus_key is not None else None, status=200)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Inaccurate status code in verbose logging for SPARQL query.

The log_http call hardcodes status=200 but this is called after sparql.query().convert() which could fail or return other status codes. The actual status isn't available from SPARQLWrapper, so either omit the status or note this limitation.

💡 Suggestion

Consider logging without the status code since SPARQLWrapper abstracts it away, or add a comment explaining the limitation:

     if verbose:
         from databusclient.api.utils import log_http
-
-        log_http("POST", endpoint_url, req_headers={"X-API-KEY": databus_key} if databus_key is not None else None, status=200)
+        # Note: actual status unknown; SPARQLWrapper abstracts HTTP layer
+        log_http("POST", endpoint_url, req_headers={"X-API-KEY": databus_key} if databus_key is not None else None)
🤖 Prompt for AI Agents
In @databusclient/api/download.py around lines 449 - 452, The verbose logging
currently calls log_http("POST", endpoint_url, ..., status=200) after
sparql.query().convert(), which is misleading because SPARQLWrapper does not
expose an HTTP status code; remove the hardcoded status argument (or pass None)
and add a short inline comment in the verbose block noting that SPARQLWrapper
abstracts HTTP status so none is available; update the log_http call in the
verbose branch that references databus_key and endpoint_url (and the surrounding
if verbose block) accordingly.

Comment on lines +59 to +71
if verbose:
log_http("GET", uri, req_headers=headers)

response = requests.get(uri, headers=headers, timeout=30)
response.raise_for_status()

return response.text
if verbose:
log_http(
"GET",
uri,
req_headers=headers,
status=response.status_code,
resp_headers=response.headers,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Find where log_http is defined and where it's imported
rg -n "def log_http" --type=py
echo "---"
rg -n "from.*import.*log_http|import.*log_http" --type=py

Repository: dbpedia/databus-python-client

Length of output: 1128


🏁 Script executed:

# Check the imports section of the specific file
head -30 databusclient/api/utils.py

Repository: dbpedia/databus-python-client

Length of output: 797


🏁 Script executed:

cat -n databusclient/api/utils.py

Repository: dbpedia/databus-python-client

Length of output: 3039


log_http is used but not imported, causing NameError when verbose=True.

The function calls log_http() at lines 60 and 65–71, but there's no import statement for it. This will raise a NameError at runtime when verbose mode is enabled. Additionally, log_http is not defined in this file, yet other modules (e.g., databusclient/api/download.py) attempt to import it from utils.py.

Add the import statement to the top of the file:

from databusclient.api.logging_utils import log_http

(adjust the import path if log_http is defined elsewhere)

Also, add a blank line before line 75 (def compute_sha256_and_length) to comply with PEP 8.

🤖 Prompt for AI Agents
In @databusclient/api/utils.py around lines 59 - 71, Add the missing import for
log_http at the top of the module (e.g., from databusclient.api.logging_utils
import log_http or adjust the path to match where log_http is defined) so calls
to log_http() in the GET request flow (the two calls around requests.get) no
longer raise NameError; also insert a single blank line before the def
compute_sha256_and_length function to satisfy PEP8 spacing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

verbose param

1 participant