-
Notifications
You must be signed in to change notification settings - Fork 12
Add -v/--verbose flag for detailed HTTP request logging (issue #27) #48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…EY headers; propagate verbose through download flows; add tests and docs
📝 WalkthroughWalkthroughAdds a global CLI verbose flag (-v/--verbose) that enables redacted HTTP request/response logging. Implements conditional redacted HTTP logging and threads a Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant CLI
participant DownloadAPI
participant SPARQLEndpoint
participant DatabusHTTP
participant VaultAuth
User->>CLI: run download -v ...
CLI->>DownloadAPI: download(..., verbose=True)
DownloadAPI->>SPARQLEndpoint: GET/POST SPARQL (log_http -> redacted log)
DownloadAPI->>DatabusHTTP: HEAD/GET file URL (log_http -> redacted log)
alt 401 requires vault
DownloadAPI->>VaultAuth: POST token exchange (log_http -> redacted log)
VaultAuth-->>DownloadAPI: return token
DownloadAPI->>DatabusHTTP: retry GET with token (log_http -> redacted log)
end
DownloadAPI-->>CLI: return status/output
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
README.md (1)
318-353: Resolve unresolved merge conflict markers.The documentation contains unresolved merge conflict markers (
<<<<<<< HEAD,=======,>>>>>>> upstream/main) that will break the rendered documentation and must be resolved before merging.📍 Locations of merge conflicts
Additional conflict markers appear at:
- Lines 370-385
- Lines 421-438
Please resolve all conflicts by:
- Reviewing both versions of the conflicting sections
- Choosing the correct content or merging both appropriately
- Removing all conflict markers
- Ensuring documentation consistency
tests/test_download_auth.py (1)
1-13: Remove duplicate imports.The file contains duplicate imports that should be consolidated:
patchis imported on both line 1 and line 11DownloadAuthErroris imported on both line 10 and line 13🔧 Proposed fix
from unittest.mock import Mock, patch import pytest import requests import logging import databusclient.api.download as dl -from databusclient.api.download import VAULT_REQUIRED_HOSTS, DownloadAuthError -from unittest.mock import patch - -from databusclient.api.download import download, DownloadAuthError +from databusclient.api.download import VAULT_REQUIRED_HOSTS, DownloadAuthError, download
🤖 Fix all issues with AI agents
In @databusclient/api/utils.py:
- Around line 58-68: The _redact_headers function currently uses
key.startswith("x-api-key") which unintentionally matches headers like
"x-api-key-extra"; change that logic to check for an exact header name match
(compare key == "x-api-key") alongside the existing authorization check so only
the intended "X-API-KEY" header is redacted; update the condition in
_redact_headers to use equality for the lowercase key and keep the rest of the
function intact.
In @README.md:
- Around line 524-564: The README contains duplicated "mkdist command" and
"Completion" sections; remove the second duplicate block (the repeated "mkdist
command" and its following "Completion" snippet) so only the first occurrence
remains, ensuring the remaining section titles "mkdist command" and "Completion"
are preserved exactly.
In @test.sh:
- Line 4: Replace the incorrect CLI flag usage `--versionid` in test.sh with the
correct `--version-id` to match the flag defined in databusclient/cli.py; also
search README.md for any leftover merge-conflict or inconsistent occurrences of
`--versionid`/`--version-id` and normalize them to `--version-id` so docs and
tests use the exact flag name declared in the CLI.
🧹 Nitpick comments (6)
README.md (1)
323-327: Add language specifiers to code blocks.Several code blocks are missing language specifiers for proper syntax highlighting. This is flagged by markdownlint and reduces documentation quality.
♻️ Suggested improvement
Add appropriate language identifiers to code blocks. For example:
-``` +```bash databusclient deploy --versionid https://databus.dbpedia.org/...Additional locations missing language specifiers: - Lines 529, 534, 541, 550, 555, 562 Use `bash` or `shell` for command-line examples. </details> </blockquote></details> <details> <summary>databusclient/api/utils.py (1)</summary><blockquote> `71-71`: **Move import to top of file.** The `import logging` statement should be at the top of the file with other imports, per PEP 8 conventions. <details> <summary>Proposed fix</summary> Move the import to the top of the file: ```diff from typing import Optional, Tuple import requests +import loggingAnd remove line 71.
databusclient/cli.py (1)
274-277: Consider using Click'sdefaultparameter for consistency.The default value
"bash"is specified in the function signature rather than in theclick.argumentdecorator. While this works, using Click'sdefaultparameter would be more idiomatic and would show the default in help text.Proposed fix
@app.command() -@click.argument("shell", type=click.Choice(["bash","zsh","fish","powershell"]), required=False) -def completion(shell="bash"): +@click.argument("shell", type=click.Choice(["bash","zsh","fish","powershell"]), default="bash") +def completion(shell): click.echo(f"Run: eval \"$(_DATABUSCLIENT_COMPLETE=source_{shell} python -m databusclient)\"")databusclient/api/download.py (2)
75-83: Consider importinglog_httpat module level to reduce repetition.The
from databusclient.api.utils import log_httpstatement is repeated inside each conditional block. Moving it to the module-level imports would reduce duplication and slightly improve performance by avoiding repeated import lookups.Proposed fix
Add to the imports at the top:
from databusclient.api.utils import ( fetch_databus_jsonld, get_databus_id_parts_from_file_url, + log_http, )Then remove all the inline
from databusclient.api.utils import log_httpstatements throughout the file.
304-311: Hardcodedstatus=200in SPARQL logging may not reflect actual status.The
log_httpcall after the SPARQL query uses a hardcodedstatus=200. While this is generally true on success (since exceptions would be raised on failure), it's not guaranteed to be accurate. SPARQLWrapper doesn't easily expose the actual HTTP status code.Consider either:
- Omitting the status parameter for the response log
- Adding a comment explaining the assumption
Proposed fix (omit status)
if verbose: from databusclient.api.utils import log_http - log_http("POST", endpoint_url, req_headers={"X-API-KEY": databus_key} if databus_key is not None else None, status=200) + # Note: actual status not available from SPARQLWrapper; logged as successful since no exception was raised + log_http("POST", endpoint_url, req_headers={"X-API-KEY": databus_key} if databus_key is not None else None)tests/test_cli_verbose.py (1)
27-38: Patch targets could be more isolated for better test robustness.The current patches at
requests.headandrequests.getwork correctly, since the test invokes only the download command and all mocked responses are configured. However, the patches affect the entirerequestsmodule globally. Sincedownload.pyusesrequestsdirectly and also callsutils.py(which also importsrequests), patching atdatabusclient.api.download.requests.headanddatabusclient.api.download.requests.getwould provide better test isolation and avoid unintended interactions if other modules usingrequestsare imported in the future.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
CHANGELOG.mdPR_BODY.mdREADME.mddatabusclient/api/download.pydatabusclient/api/utils.pydatabusclient/cli.pyfile.txttest.shtests/test_cli.pytests/test_cli_verbose.pytests/test_download_auth.pytests/test_utils_verbose.py
🧰 Additional context used
🧬 Code graph analysis (5)
databusclient/cli.py (1)
databusclient/api/deploy.py (1)
create_distribution(176-207)
tests/test_cli.py (1)
databusclient/cli.py (1)
app(18-32)
tests/test_cli_verbose.py (2)
tests/test_download_auth.py (1)
iter_content(22-26)databusclient/cli.py (1)
app(18-32)
tests/test_utils_verbose.py (3)
databusclient/api/download.py (3)
download(669-797)_get_sparql_query_of_collection(249-276)_query_sparql_endpoint(279-311)databusclient/api/utils.py (1)
fetch_databus_jsonld(33-55)tests/conftest.py (1)
convert(24-25)
databusclient/api/download.py (1)
databusclient/api/utils.py (2)
log_http(74-103)fetch_databus_jsonld(33-55)
🪛 markdownlint-cli2 (0.18.1)
README.md
323-323: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
327-327: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
529-529: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
534-534: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
541-541: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
550-550: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
555-555: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
559-559: Multiple headings with the same content
(MD024, no-duplicate-heading)
562-562: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (17)
CHANGELOG.md (1)
1-7: LGTM!The changelog entry clearly documents the new verbose feature, header redaction behavior, and security considerations. The format is standard and informative.
PR_BODY.md (1)
1-19: LGTM!The PR description is well-structured and comprehensive. It clearly explains the feature, changes, motivation, and security considerations.
README.md (1)
185-186: LGTM!The verbose flag documentation is clear and appropriately includes security warnings about redacted headers and avoiding verbose in public CI logs.
tests/test_download_auth.py (1)
110-137: LGTM!The test effectively validates the verbose logging feature with proper token redaction. The test setup appropriately mocks the HTTP auth flow, and assertions correctly verify that:
- HTTP logging is active when verbose is enabled
- Sensitive headers are redacted
- Actual token values don't leak into logs
tests/test_cli.py (4)
5-19: LGTM! Good test coverage for mkdist happy path.The test correctly validates:
- Multiple
--cvoptions are accepted- Keys are sorted alphabetically in output (
a=1_b=2notb=2_a=1)- All components (format, compression, sha-length) are correctly assembled
22-26: LGTM! Validates error handling for malformed content variants.
29-35: LGTM! Validates sha-length format enforcement.
38-42: LGTM! Basic validation for shell completion output.tests/test_utils_verbose.py (4)
12-22: LGTM! Well-structured mock response helper.The
make_responsehelper correctly simulates a requests.Response object with status_code, headers, text, and raise_for_status behavior.
25-34: LGTM! Validates API key redaction in verbose logging.The test correctly verifies that:
- HTTP activity is logged with
[HTTP] GET- Sensitive data is replaced with
REDACTED- The actual secret value
SECRETnever appears in logs
38-47: LGTM! Validates redaction in SPARQL collection query flow.
51-76: LGTM! Good mock implementation for SPARQLWrapper testing.The MockSPARQL class adequately simulates the SPARQLWrapper interface to test verbose logging around SPARQL POST operations.
databusclient/api/utils.py (1)
74-103: LGTM! Robust HTTP logging with graceful fallbacks.The
log_httpfunction handles various response header formats defensively, which is appropriate for logging utilities that shouldn't break on unexpected input types. The fallback chain for handling Mock objects and different response types is well thought out.databusclient/cli.py (2)
15-32: LGTM! Well-implemented verbose flag with proper logging setup.The verbose flag is correctly:
- Defined at the group level for global availability
- Stored in Click context for propagation to subcommands
- Configured with a handler only if none exists (prevents duplicate handlers)
- Set to DEBUG level when enabled
234-271: LGTM! Well-structured mkdist command with proper validation.The command properly:
- Validates content variant format and forbidden characters
- Prevents duplicate keys
- Validates sha-length with a strict regex pattern
- Ensures deterministic output via sorted keys
- Supports both plain string and JSON output formats
databusclient/api/download.py (2)
9-11: LGTM! Module-level logger setup.The logger is correctly configured at the module level, allowing consistent logging behavior across all functions.
669-797: LGTM! Consistent verbose flag propagation throughout download flows.The verbose parameter is correctly accepted by the main
download()function and propagated to all sub-functions across all code paths (collections, files, versions, artifacts, groups, and SPARQL queries).
databusclient/api/utils.py
Outdated
| def _redact_headers(headers): | ||
| if not headers: | ||
| return headers | ||
| redacted = {} | ||
| for k, v in headers.items(): | ||
| key = k.lower() | ||
| if key == "authorization" or key.startswith("x-api-key"): | ||
| redacted[k] = "REDACTED" | ||
| else: | ||
| redacted[k] = v | ||
| return redacted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue with x-api-key header matching logic.
The condition key.startswith("x-api-key") will match x-api-key exactly but also any header starting with that prefix (e.g., x-api-key-extra). This is likely unintended. Additionally, the header name is X-API-KEY (exact match), so startswith seems overly broad.
Proposed fix
def _redact_headers(headers):
if not headers:
return headers
redacted = {}
for k, v in headers.items():
key = k.lower()
- if key == "authorization" or key.startswith("x-api-key"):
+ if key == "authorization" or key == "x-api-key":
redacted[k] = "REDACTED"
else:
redacted[k] = v
return redacted📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _redact_headers(headers): | |
| if not headers: | |
| return headers | |
| redacted = {} | |
| for k, v in headers.items(): | |
| key = k.lower() | |
| if key == "authorization" or key.startswith("x-api-key"): | |
| redacted[k] = "REDACTED" | |
| else: | |
| redacted[k] = v | |
| return redacted | |
| def _redact_headers(headers): | |
| if not headers: | |
| return headers | |
| redacted = {} | |
| for k, v in headers.items(): | |
| key = k.lower() | |
| if key == "authorization" or key == "x-api-key": | |
| redacted[k] = "REDACTED" | |
| else: | |
| redacted[k] = v | |
| return redacted |
🤖 Prompt for AI Agents
In @databusclient/api/utils.py around lines 58 - 68, The _redact_headers
function currently uses key.startswith("x-api-key") which unintentionally
matches headers like "x-api-key-extra"; change that logic to check for an exact
header name match (compare key == "x-api-key") alongside the existing
authorization check so only the intended "X-API-KEY" header is redacted; update
the condition in _redact_headers to use equality for the lowercase key and keep
the rest of the function intact.
|
|
||
| databusclient deploy \ | ||
| --version-id "https://d8lr.tools.dbpedia.org/hopver/testGroup/testArtifact/1.0-alpha/" \ | ||
| --versionid "https://d8lr.tools.dbpedia.org/hopver/testGroup/testArtifact/1.0-alpha/" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# Find the test.sh file and verify the line in question
find . -name "test.sh" -type f 2>/dev/null | head -20Repository: dbpedia/databus-python-client
Length of output: 82
🏁 Script executed:
# Search for references to both flag names across the codebase
rg -n '\-\-version-id|\-\-versionid|version_id' --type-list | head -5Repository: dbpedia/databus-python-client
Length of output: 462
🏁 Script executed:
# Search for both flag variations
rg -n 'version-id|versionid' -C 2Repository: dbpedia/databus-python-client
Length of output: 3596
🏁 Script executed:
# Read the CLI implementation around the flag definition
sed -n '35,45p' databusclient/cli.pyRepository: dbpedia/databus-python-client
Length of output: 517
🏁 Script executed:
# Search for all CLI option definitions related to version
rg -n 'version.id|versionid' databusclient/cli.py -C 3Repository: dbpedia/databus-python-client
Length of output: 1846
🏁 Script executed:
# Check if there are any aliases or alternative names defined
rg -n '@click.option|--version' databusclient/cli.py -C 2 | head -50Repository: dbpedia/databus-python-client
Length of output: 1831
Fix test.sh: Use correct CLI flag name --version-id instead of --versionid.
The actual CLI implementation in databusclient/cli.py defines the flag as --version-id (with hyphen), but test.sh line 4 uses --versionid (no hyphen). This will cause the test to fail at runtime. Additionally, the README.md file contains unresolved merge conflicts with inconsistent flag names that should be resolved to use --version-id consistently.
🤖 Prompt for AI Agents
In @test.sh at line 4, Replace the incorrect CLI flag usage `--versionid` in
test.sh with the correct `--version-id` to match the flag defined in
databusclient/cli.py; also search README.md for any leftover merge-conflict or
inconsistent occurrences of `--versionid`/`--version-id` and normalize them to
`--version-id` so docs and tests use the exact flag name declared in the CLI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
README.md (3)
324-359: Critical: Unresolved merge conflict markers must be removed.The README contains unresolved merge conflict markers (
<<<<<<<,=======,>>>>>>>). These must be resolved before the PR can be merged.Please resolve the conflicts by:
- Choosing the correct version of the content
- Removing all conflict markers
- Ensuring consistent formatting and examples
376-391: Critical: Additional unresolved merge conflict markers.More merge conflict markers present in this section. Please resolve these conflicts and ensure the documentation is consistent throughout.
427-444: Critical: More unresolved merge conflict markers.This section also contains unresolved merge conflict markers. All merge conflicts must be resolved before merging.
🤖 Fix all issues with AI agents
In @README.md:
- Around line 530-570: Remove the duplicated "### mkdist command" and "##
Completion" sections so only one copy of each remains; locate the repeated
headings ("mkdist command" and "Completion") and delete the second occurrence
(the block starting at the duplicate heading), preserving the first instance
with its Usage, Example, and code blocks; also ensure fenced code blocks are
marked with the language (e.g., ```bash or ```text/```python as appropriate) to
satisfy markdownlint and keep formatting consistent.
- Line 310: The README has an inconsistent CLI option name `--versionid`; change
it to the hyphenated `--version-id` to match the CLI code and other
examples—update the example line that currently shows `--versionid TEXT` to
`--version-id TEXT` and verify any surrounding examples or references use the
same `--version-id` token for consistency.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
CHANGELOG.mdREADME.mddatabusclient/api/utils.pydatabusclient/cli.pytests/test_download_auth.py
🚧 Files skipped from review as they are similar to previous changes (2)
- databusclient/api/utils.py
- CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (2)
databusclient/cli.py (2)
databusclient/api/download.py (1)
download(669-797)databusclient/api/deploy.py (1)
create_distribution(218-249)
tests/test_download_auth.py (2)
tests/test_utils_verbose.py (1)
make_response(12-22)databusclient/api/download.py (1)
_download_file(31-213)
🪛 markdownlint-cli2 (0.18.1)
README.md
329-329: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
333-333: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
535-535: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
540-540: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
547-547: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
556-556: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
561-561: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
565-565: Multiple headings with the same content
(MD024, no-duplicate-heading)
568-568: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (6)
tests/test_download_auth.py (1)
107-134: LGTM! Well-structured test for verbose logging and redaction.The test properly validates that:
- HTTP operations are logged when verbose is enabled
- Sensitive authorization tokens are redacted ("REDACTED" appears)
- Actual token values ("ACCESS", "VAULT") do not leak into logs
This aligns with the PR objectives and provides good coverage for the verbose feature.
databusclient/cli.py (4)
16-36: LGTM! Well-implemented global verbose flag.The verbose flag is properly integrated as a global CLI option with appropriate logging configuration:
- Context passing is correctly set up for downstream commands
- Logger configuration is conditional and defensive (checks for existing handlers)
- Log level appropriately set to DEBUG when verbose is enabled
182-205: LGTM! Correct context passing for verbose flag.The download command properly receives and propagates the verbose flag from the global context to the underlying API call. The use of
.get("verbose", False)provides a safe default.
232-271: LGTM! Robust implementation of mkdist command.The command includes thorough validation:
- Content variant format and key constraints are properly enforced
- SHA-256 validation regex is correct (64 hex chars + ':' + length)
- Deterministic ordering via sorted keys prevents ambiguity
- Clear error messages for invalid inputs
The implementation aligns well with the underlying
create_distributionAPI.
274-277: LGTM! Simple and effective completion command.The completion command provides helpful instructions for enabling shell completion across multiple shells. Relies on Click's built-in completion support.
README.md (1)
191-192: LGTM! Clear documentation of verbose flag.The documentation properly explains the verbose flag functionality and includes an appropriate warning about avoiding its use in public CI logs due to the sensitive nature of HTTP request/response details.
|
|
||
| Options: | ||
| --version-id TEXT Target databus version/dataset identifier of the form <h | ||
| --versionid TEXT Target databus version/dataset identifier of the form <h |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix inconsistent option name.
Line 310 shows --versionid but the CLI code and other examples use --version-id (with hyphen). Please correct this to --version-id for consistency.
🤖 Prompt for AI Agents
In @README.md at line 310, The README has an inconsistent CLI option name
`--versionid`; change it to the hyphenated `--version-id` to match the CLI code
and other examples—update the example line that currently shows `--versionid
TEXT` to `--version-id TEXT` and verify any surrounding examples or references
use the same `--version-id` token for consistency.
| ### mkdist command | ||
| Create a distribution string from components. | ||
| Usage: | ||
| ``` | ||
| databusclient mkdist URL --cv key=value --cv key2=value2 --format ttl --compression gz --sha-length <sha256hex>:<length> | ||
| ``` | ||
| Example: | ||
| ``` | ||
| python -m databusclient mkdist "https://example.org/file.ttl" --cv lang=en --cv part=sorted --format ttl --compression gz --sha-length aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa:12345 | ||
| ``` | ||
| ## Completion | ||
| Enable shell completion (bash example): | ||
| ``` | ||
| eval "$(_DATABUSCLIENT_COMPLETE=source_bash python -m databusclient)" | ||
| ``` | ||
| ### mkdist command | ||
| Create a distribution string from components. | ||
| Usage: | ||
| ``` | ||
| databusclient mkdist URL --cv key=value --cv key2=value2 --format ttl --compression gz --sha-length <sha256hex>:<length> | ||
| ``` | ||
| Example: | ||
| ``` | ||
| python -m databusclient mkdist "https://example.org/file.ttl" --cv lang=en --cv part=sorted --format ttl --compression gz --sha-length aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa:12345 | ||
| ``` | ||
| ## Completion | ||
| Enable shell completion (bash example): | ||
| ``` | ||
| eval "$(_DATABUSCLIENT_COMPLETE=source_bash python -m databusclient)" | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Duplicate sections must be removed.
The mkdist and Completion sections are duplicated (lines 530-549 and 551-570). This duplication likely resulted from incomplete merge conflict resolution.
Please:
- Remove the duplicate sections
- Keep only one instance of each section
- Ensure proper formatting and language specification for code blocks (as flagged by markdownlint)
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
535-535: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
540-540: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
547-547: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
556-556: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
561-561: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
565-565: Multiple headings with the same content
(MD024, no-duplicate-heading)
568-568: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
In @README.md around lines 530 - 570, Remove the duplicated "### mkdist command"
and "## Completion" sections so only one copy of each remains; locate the
repeated headings ("mkdist command" and "Completion") and delete the second
occurrence (the block starting at the duplicate heading), preserving the first
instance with its Usage, Example, and code blocks; also ensure fenced code
blocks are marked with the language (e.g., ```bash or ```text/```python as
appropriate) to satisfy markdownlint and keep formatting consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (8)
databusclient/api/download.py (7)
145-166: Critical:verboseparameter missing from function signature but used in the body.The docstring documents a
verboseparameter (line 165), and the function body referencesverboseat multiple locations (lines 190, 195, 239, 246, 275, 280, 285), butverboseis not in the function signature. This will raise aNameErrorat runtime.🐛 Proposed fix: add verbose parameter to signature
def _download_file( url, localDir, vault_token_file=None, databus_key=None, auth_url=None, client_id=None, validate_checksum: bool = False, expected_checksum: str | None = None, + verbose: bool = False, ) -> None:
354-389: Critical:verboseparameter missing from_download_filessignature.The docstring documents
verbose(line 374), but the parameter is not in the function signature. Additionally, this function should propagateverboseto_download_filecalls.🐛 Proposed fix
def _download_files( urls: List[str], localDir: str, vault_token_file: str = None, databus_key: str = None, auth_url: str = None, client_id: str = None, validate_checksum: bool = False, checksums: dict | None = None, + verbose: bool = False, ) -> None:And update the
_download_filecall:_download_file( url=url, localDir=localDir, vault_token_file=vault_token_file, databus_key=databus_key, auth_url=auth_url, client_id=client_id, validate_checksum=validate_checksum, expected_checksum=expected, + verbose=verbose, )
561-603: Critical:verboseparameter missing from_download_collectionsignature but used in body.The docstring mentions
verbose(line 582), and the variable is used on lines 584 and 586, but the signature on lines 561-570 lacks the parameter.🐛 Proposed fix
def _download_collection( uri: str, endpoint: str, localDir: str, vault_token: str = None, databus_key: str = None, auth_url: str = None, client_id: str = None, - validate_checksum: bool = False + validate_checksum: bool = False, + verbose: bool = False, ) -> None:
606-645: Critical:verboseparameter missing from_download_versionsignature.The docstring mentions
verbose(line 625), and it's used on line 627, but the signature lacks the parameter.🐛 Proposed fix
def _download_version( uri: str, localDir: str, vault_token_file: str = None, databus_key: str = None, auth_url: str = None, client_id: str = None, - validate_checksum: bool = False, + validate_checksum: bool = False, + verbose: bool = False, ) -> None:
648-695: Critical:verboseparameter missing from_download_artifactsignature.The docstring mentions
verbose(line 669), and it's used on lines 671 and 677, but the signature lacks the parameter.🐛 Proposed fix
def _download_artifact( uri: str, localDir: str, all_versions: bool = False, vault_token_file: str = None, databus_key: str = None, auth_url: str = None, client_id: str = None, - validate_checksum: bool = False, + validate_checksum: bool = False, + verbose: bool = False, ) -> None:
762-798: Critical:verboseparameter missing from_download_groupsignature.The docstring mentions
verbose(line 783), and it's used on line 785, but the signature lacks the parameter. Also,_download_artifactcalls need to propagateverbose.🐛 Proposed fix
def _download_group( uri: str, localDir: str, all_versions: bool = False, vault_token_file: str = None, databus_key: str = None, auth_url: str = None, client_id: str = None, - validate_checksum: bool = False, + validate_checksum: bool = False, + verbose: bool = False, ) -> None:And propagate to
_download_artifact:_download_artifact( artifact_uri, localDir, all_versions=all_versions, vault_token_file=vault_token_file, databus_key=databus_key, auth_url=auth_url, client_id=client_id, validate_checksum=validate_checksum, + verbose=verbose, )
834-859: Critical:verboseparameter missing fromdownloadfunction signature.The docstring mentions
verbose(line 858), and it's used on line 969, but the signature lacks the parameter. This is the public API entry point, so it must acceptverbosefrom the CLI.🐛 Proposed fix
def download( localDir: str, endpoint: str, databusURIs: List[str], token=None, databus_key=None, all_versions=None, auth_url="https://auth.dbpedia.org/realms/dbpedia/protocol/openid-connect/token", client_id="vault-token-exchange", - validate_checksum: bool = False + validate_checksum: bool = False, + verbose: bool = False, ) -> None:Then propagate
verboseto all internal function calls throughout the function body.databusclient/cli.py (1)
151-215: Critical:downloadcommand missing@click.pass_contextdecorator and not propagatingverbose.The function signature includes
ctx(line 188), but the@click.pass_contextdecorator is missing. This will cause a runtime error. Additionally,verbosefromctx.objis not passed toapi_download.🐛 Proposed fix
@app.command() +@click.pass_context @click.argument("databusuris", nargs=-1, required=True) @click.option( "--localdir",And pass verbose to the API call:
+ verbose = ctx.obj.get("verbose", False) try: api_download( localDir=localdir, endpoint=databus, databusURIs=databusuris, token=vault_token, databus_key=databus_key, all_versions=all_versions, auth_url=authurl, client_id=clientid, - validate_checksum=validate_checksum + validate_checksum=validate_checksum, + verbose=verbose, )
🤖 Fix all issues with AI agents
In @databusclient/api/download.py:
- Around line 449-452: The verbose logging currently calls log_http("POST",
endpoint_url, ..., status=200) after sparql.query().convert(), which is
misleading because SPARQLWrapper does not expose an HTTP status code; remove the
hardcoded status argument (or pass None) and add a short inline comment in the
verbose block noting that SPARQLWrapper abstracts HTTP status so none is
available; update the log_http call in the verbose branch that references
databus_key and endpoint_url (and the surrounding if verbose block) accordingly.
In @databusclient/api/utils.py:
- Around line 59-71: Add the missing import for log_http at the top of the
module (e.g., from databusclient.api.logging_utils import log_http or adjust the
path to match where log_http is defined) so calls to log_http() in the GET
request flow (the two calls around requests.get) no longer raise NameError; also
insert a single blank line before the def compute_sha256_and_length function to
satisfy PEP8 spacing.
🧹 Nitpick comments (3)
databusclient/api/utils.py (1)
73-75: Missing blank line between function definitions (PEP8 E302).There should be two blank lines before the
compute_sha256_and_lengthfunction definition per PEP8 style guidelines.✏️ Proposed fix
response.raise_for_status() return response.text + + def compute_sha256_and_length(filepath):databusclient/api/download.py (1)
190-198: Repeated inline imports oflog_httpwithin the same function.The
from databusclient.api.utils import log_httpstatement is repeated multiple times within_download_file. While functionally correct (Python caches imports), this adds noise and is unconventional.✏️ Suggestion: move import to function top or module level
Either import once at the top of the function:
def _download_file(...): from databusclient.api.utils import log_http # ... rest of functionOr import at the module level (preferred if used across multiple functions):
from databusclient.api.utils import log_httpAlso applies to: 239-249, 280-288
databusclient/cli.py (1)
281-284: Fix the Click 8.x shell completion environment variable format in the completion command.The current implementation uses an incorrect environment variable format for Click 8.x. The format should be
{shell}_source(e.g.,bash_source), notsource_{shell}. Additionally, since the project defines a console_script entry point (databusclient), the instruction can invoke it directly rather than viapython -m.For the immediate fix, update line 284 to:
Minimal fix
click.echo(f"# Add this to your shell config:\neval \"$(_DATABUSCLIENT_COMPLETE=bash_source databusclient)\"")Or more robustly:
click.echo(f"# Add this to your shell config:\neval \"$(_DATABUSCLIENT_COMPLETE={shell}_source databusclient)\"")For a more complete implementation that actually outputs the completion script (allowing users to pipe/source it directly), use the subprocess approach to invoke the CLI with the completion environment variable set.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
databusclient/api/download.pydatabusclient/api/utils.pydatabusclient/cli.py
🧰 Additional context used
🧬 Code graph analysis (3)
databusclient/cli.py (1)
databusclient/api/deploy.py (1)
create_distribution(218-249)
databusclient/api/utils.py (2)
tests/test_download_auth.py (1)
raise_for_status(27-29)tests/test_utils_verbose.py (1)
raise_for_status(18-20)
databusclient/api/download.py (1)
databusclient/api/utils.py (1)
fetch_databus_jsonld(40-74)
🔇 Additional comments (1)
databusclient/cli.py (1)
239-278: LGTM! Themkdistcommand implementation is well-structured.Good validation of content-variant keys (lines 253-254), duplicate detection (lines 255-256), SHA-length format validation with regex (lines 261-262), and deterministic ordering via sorted keys (line 266).
| if verbose: | ||
| from databusclient.api.utils import log_http | ||
|
|
||
| log_http("POST", endpoint_url, req_headers={"X-API-KEY": databus_key} if databus_key is not None else None, status=200) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inaccurate status code in verbose logging for SPARQL query.
The log_http call hardcodes status=200 but this is called after sparql.query().convert() which could fail or return other status codes. The actual status isn't available from SPARQLWrapper, so either omit the status or note this limitation.
💡 Suggestion
Consider logging without the status code since SPARQLWrapper abstracts it away, or add a comment explaining the limitation:
if verbose:
from databusclient.api.utils import log_http
-
- log_http("POST", endpoint_url, req_headers={"X-API-KEY": databus_key} if databus_key is not None else None, status=200)
+ # Note: actual status unknown; SPARQLWrapper abstracts HTTP layer
+ log_http("POST", endpoint_url, req_headers={"X-API-KEY": databus_key} if databus_key is not None else None)🤖 Prompt for AI Agents
In @databusclient/api/download.py around lines 449 - 452, The verbose logging
currently calls log_http("POST", endpoint_url, ..., status=200) after
sparql.query().convert(), which is misleading because SPARQLWrapper does not
expose an HTTP status code; remove the hardcoded status argument (or pass None)
and add a short inline comment in the verbose block noting that SPARQLWrapper
abstracts HTTP status so none is available; update the log_http call in the
verbose branch that references databus_key and endpoint_url (and the surrounding
if verbose block) accordingly.
| if verbose: | ||
| log_http("GET", uri, req_headers=headers) | ||
|
|
||
| response = requests.get(uri, headers=headers, timeout=30) | ||
| response.raise_for_status() | ||
|
|
||
| return response.text | ||
| if verbose: | ||
| log_http( | ||
| "GET", | ||
| uri, | ||
| req_headers=headers, | ||
| status=response.status_code, | ||
| resp_headers=response.headers, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# Find where log_http is defined and where it's imported
rg -n "def log_http" --type=py
echo "---"
rg -n "from.*import.*log_http|import.*log_http" --type=pyRepository: dbpedia/databus-python-client
Length of output: 1128
🏁 Script executed:
# Check the imports section of the specific file
head -30 databusclient/api/utils.pyRepository: dbpedia/databus-python-client
Length of output: 797
🏁 Script executed:
cat -n databusclient/api/utils.pyRepository: dbpedia/databus-python-client
Length of output: 3039
log_http is used but not imported, causing NameError when verbose=True.
The function calls log_http() at lines 60 and 65–71, but there's no import statement for it. This will raise a NameError at runtime when verbose mode is enabled. Additionally, log_http is not defined in this file, yet other modules (e.g., databusclient/api/download.py) attempt to import it from utils.py.
Add the import statement to the top of the file:
from databusclient.api.logging_utils import log_http(adjust the import path if log_http is defined elsewhere)
Also, add a blank line before line 75 (def compute_sha256_and_length) to comply with PEP 8.
🤖 Prompt for AI Agents
In @databusclient/api/utils.py around lines 59 - 71, Add the missing import for
log_http at the top of the module (e.g., from databusclient.api.logging_utils
import log_http or adjust the path to match where log_http is defined) so calls
to log_http() in the GET request flow (the two calls around requests.get) no
longer raise NameError; also insert a single blank line before the def
compute_sha256_and_length function to satisfy PEP8 spacing.
Description
This PR introduces a -v/--verbose CLI flag for the databusclient, enabling detailed HTTP request and response logging during interactions with Databus and Vault. This helps users debug network or authentication issues, trace requests, and inspect API communication without changing default behavior.
Key Features
Related Issues
Resolves #27
Type of change
Checklist
Summary
Summary by CodeRabbit
New Features
Documentation
Tests
✏️ Tip: You can customize this high-level summary in your review settings.