@akhilnittala
Contributor

What type of PR is this?

Uncomment only one /kind line, and delete the rest.
For example, > /kind bug would simply become: /kind bug

/kind bug

/kind cleanup
/kind failing-test
/kind enhancement
/kind documentation
/kind code-refactoring

What does this PR do / why we need it:
Issue Description / Context:
During an upgrade from an older version (1.17.2) to a newer version (1.17.3), we observed that the TargetDown alert was being triggered. The specific versions are mentioned for testing and validation purposes only. Investigation showed that the metrics endpoint was returning a 403 Forbidden response.

Root Cause Analysis:
The 403 response indicates an authorization failure. One identified cause is that the RBAC permissions required to access the /metrics endpoint were missing. As a result, Prometheus could not scrape metrics, which triggered the TargetDown alert.

Resolution / Fix:
To address this issue, we updated the RBAC configuration to explicitly grant access to the /metrics non-resource URL in the relevant ClusterRole/ClusterRoleBinding. With this change in place, Prometheus can access the metrics endpoint, and the TargetDown alert is resolved.
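
For reviewers less familiar with non-resource URL rules, here is a rough Python sketch of why a rule set without a `/metrics` entry leads to a denial. It is a simplified model, not the Kubernetes authorizer, and `RULES_BEFORE` / `RULES_AFTER` are hypothetical rule sets made up for illustration; they are not copied from this PR's diff.

```python
# Simplified model of RBAC non-resource URL matching.
# Assumption: rules are plain dicts shaped like ClusterRole rules.


def url_matches(rule_url: str, path: str) -> bool:
    """Match exactly, via '*', or via a trailing '*' used as a prefix wildcard."""
    if rule_url == "*" or rule_url == path:
        return True
    return rule_url.endswith("*") and path.startswith(rule_url.rstrip("*"))


def verb_matches(rule_verb: str, verb: str) -> bool:
    """A rule verb matches if it is '*' or equal to the requested verb."""
    return rule_verb == "*" or rule_verb == verb


def allows(rules: list, verb: str, path: str) -> bool:
    """Return True if any rule grants `verb` on the non-resource `path`."""
    for rule in rules:
        urls = rule.get("nonResourceURLs", [])
        verbs = rule.get("verbs", [])
        if any(verb_matches(v, verb) for v in verbs) and any(
            url_matches(u, path) for u in urls
        ):
            return True
    return False


# Hypothetical rule sets: resource rules alone never cover /metrics.
RULES_BEFORE = [{"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]}]
RULES_AFTER = RULES_BEFORE + [{"nonResourceURLs": ["/metrics"], "verbs": ["get"]}]
```

With these made-up rule sets, `allows(RULES_BEFORE, "get", "/metrics")` is false and `allows(RULES_AFTER, "get", "/metrics")` is true, which mirrors the 403 before the change and the successful scrape after it.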

Outcome:
Post-change verification confirms that metrics scraping works as expected and no further 403 errors are observed.
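
One way to repeat this check outside Prometheus is to request the endpoint directly and look at the status code. The sketch below uses only the Python standard library; the URL and bearer token are supplied by the caller and are assumptions (this PR does not state the endpoint address), and TLS setup such as a custom CA bundle is left out for brevity.

```python
import urllib.error
import urllib.request


def probe_metrics(url: str, token: str = None, timeout: float = 5.0) -> int:
    """Send a GET to `url` and return the HTTP status code."""
    request = urllib.request.Request(url)
    if token:
        # Assumption: the endpoint accepts a bearer token for authentication.
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still what we want.
        return err.code


def describe(status: int) -> str:
    """Map a status code to a short diagnosis for the scrape problem."""
    if status == 200:
        return "ok: endpoint is scrapeable"
    if status == 401:
        return "unauthorized: credentials missing or rejected"
    if status == 403:
        return "forbidden: caller lacks permission for this URL"
    return f"unexpected status {status}"
```

Calling `describe(probe_metrics(url, token))` with the scraper's service account token should report `forbidden` before the RBAC change and `ok` after it, if the missing permission was indeed the cause.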
Have you updated the necessary documentation?

  • Documentation update is required by this PR.
  • Documentation has been updated.

Which issue(s) this PR fixes:
https://issues.redhat.com/browse/GITOPS-8591
Fixes #?
Test acceptance criteria:

  • Unit Test
  • E2E Test

How to test changes / Special notes to the reviewer:
Install the GitOps Operator using a version that includes the relevant changes.

Navigate to the OLM UI and verify the metrics targets.

Confirm that all targets are in an UP state and correctly labeled with the gitops identifier.

Ensure that no alerts are triggered for TargetDown.
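
The target check above can also be scripted against the JSON returned by the Prometheus HTTP API endpoint `/api/v1/targets`. The sketch below only inspects an already-parsed response. How to reach Prometheus in a given cluster is not covered here, and matching the "gitops" identifier against any label value is an assumption, since the exact label is not named above.

```python
def targets_not_up(payload: dict, label_substring: str = None) -> list:
    """Return scrape URLs of active targets whose health is not 'up'.

    If `label_substring` is given, only targets with at least one label
    value containing it are considered (assumed way to pick gitops targets).
    """
    targets = payload.get("data", {}).get("activeTargets", [])
    failing = []
    for target in targets:
        labels = target.get("labels", {})
        if label_substring is not None and not any(
            label_substring in str(value) for value in labels.values()
        ):
            continue  # not one of the targets we care about
        if target.get("health") != "up":
            failing.append(target.get("scrapeUrl", "<unknown>"))
    return failing
```

An empty list from `targets_not_up(payload, "gitops")` corresponds to all matching targets being UP; any returned URL is a candidate for the TargetDown alert.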

@openshift-ci openshift-ci bot added the kind/bug Something isn't working label Dec 29, 2025
@openshift-ci

openshift-ci bot commented Dec 29, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci

openshift-ci bot commented Dec 29, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign jannfis for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@akhilnittala akhilnittala marked this pull request as ready for review December 29, 2025 13:27
