
@RahulC7 (Contributor) commented Dec 21, 2025

Summary:
A fused pattern is one where the quantizer recognizes a sequence of operations and treats it as a single unit for quantization purposes. For example, for a Conv2D + ReLU fusion, rather than quantizing each op separately like this:

```
  input → [quantize] → conv2d → [dequantize] → [quantize] → relu → [dequantize] → output
```

a fused pattern quantizes them together like so:

```
  input → [quantize] → conv2d → relu → [dequantize] → output
```

We need to make a few changes in our framework to test this.

Change 1: We allow graph builders to return a 3rd element for fused patterns

For fused patterns like conv+relu, the quantization annotations are split across two nodes:

  • Output annotation is on the relu node (the final output of the fused pattern)
  • Input annotations are on the conv node (where the quantized inputs enter)

The existing graph builders return (gm, target_node), which works for single-op patterns where both annotations are on the same node. For fused patterns, we need to know both nodes, so graph builders can now optionally return (gm, output_node, input_source_node).
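
For illustration, here is a minimal sketch of what a fused-pattern builder returning the 3-tuple could look like. The module, the builder name `_build_conv_relu_graph`, and the use of `torch.export` are hypothetical stand-ins; the real builders in the test follow the existing single-op builder conventions:

```python
import torch


class ConvRelu(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.conv(x))


def _build_conv_relu_graph() -> tuple[torch.fx.GraphModule, torch.fx.Node, torch.fx.Node]:
    """Hypothetical fused-pattern builder: returns (gm, output_node, input_source_node)."""
    gm = torch.export.export(ConvRelu(), (torch.randn(1, 3, 16, 16),)).module()
    conv_node = next(
        n for n in gm.graph.nodes
        if n.op == "call_function" and "conv" in str(n.target)
    )
    relu_node = next(
        n for n in gm.graph.nodes
        if n.op == "call_function" and "relu" in str(n.target)
    )
    # The relu node carries the output annotation; the conv node carries
    # the input annotations, so return both.
    return gm, relu_node, conv_node
```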

Change 2: We check annotations on the correct nodes for fused patterns

The test previously assumed output_qspec and input_qspec_map were both on the same node. For fused patterns, they're on different nodes:

  • output_qspec is checked on the output node (relu)
  • input_qspec_map is checked on the input source node (conv)

This change is backwards-compatible: for non-fused patterns, both nodes are the same.
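
Roughly, the updated check looks like the sketch below. Names like `check_fused_annotations`, `expected_output_qspec`, and `expected_input_qspecs` are illustrative, not the exact identifiers in the test; the `"quantization_annotation"` meta key and `QuantizationAnnotation` fields follow the standard PT2E quantizer convention:

```python
from torch.ao.quantization.quantizer import QuantizationAnnotation


def check_fused_annotations(test, build_result, expected_output_qspec, expected_input_qspecs):
    """Illustrative sketch: verify output_qspec on the output node and
    input_qspec_map on the input source node (the same node for non-fused patterns)."""
    if len(build_result) == 3:
        gm, output_node, input_source_node = build_result
    else:
        gm, output_node = build_result
        input_source_node = output_node  # non-fused: both annotations live on one node

    out_ann: QuantizationAnnotation = output_node.meta["quantization_annotation"]
    test.assertEqual(out_ann.output_qspec, expected_output_qspec)

    in_ann: QuantizationAnnotation = input_source_node.meta["quantization_annotation"]
    for arg_node, expected_qspec in expected_input_qspecs.items():
        test.assertEqual(in_ann.input_qspec_map[arg_node], expected_qspec)
```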

Differential Revision: D89630759

Copilot AI review requested due to automatic review settings December 21, 2025 22:20

pytorch-bot bot commented Dec 21, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16358

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Unrelated Failure

As of commit fd92112 with merge base 050e2ee:

NEW FAILURE - The following job has failed:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla bot added the CLA Signed label Dec 21, 2025

meta-codesync bot commented Dec 21, 2025

@RahulC7 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D89630759.

@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Copilot AI (Contributor) left a comment

Pull request overview

This PR adds comprehensive test coverage for the CadenceFusedConvReluQuantizer and several other previously untested Cadence quantizers. The key innovation is extending the test framework to support fused quantization patterns, where multiple operations (e.g., conv2d + relu) are quantized as a single unit, requiring annotations to be split across different nodes in the computation graph.

  • Updated the graph builder function signature to optionally return a third element (input source node) for fused patterns
  • Modified the test assertion logic to check output annotations on the output node and input annotations on the input source node
  • Added 13 new test cases covering 6 previously untested quantizer classes


Summary: Add annotation tests for CadenceWith16BitConvActivationsQuantizer covering both conv1d and conv2d operations.

Differential Revision: D88895865
Differential Revision: D88896712
Differential Revision: D88898823
Differential Revision: D88898933
Differential Revision: D88899457
RahulC7 added a commit to RahulC7/executorch that referenced this pull request Dec 22, 2025
Summary:
Pull Request resolved: pytorch#16358

A fused pattern is one where the quantizer recognizes a sequence of operations and treats it as a single unit for quantization purposes. For example, for a Conv2D + ReLU fusion, rather than quantizing each op separately like this:
```
  input → [quantize] → conv2d → [dequantize] → [quantize] → relu → [dequantize] → output
```
a fused pattern quantizes them together like so:

```
  input → [quantize] → conv2d → relu → [dequantize] → output
```

We need to make a few changes in our framework to test this.

# Change 1: We allow graph builders to return a 3rd element for fused patterns
 For fused patterns like conv+relu, the quantization annotations are split across two nodes:
  - Output annotation is on the relu node (the final output of the fused pattern)
  - Input annotations are on the conv node (where the quantized inputs enter)

The existing graph builders return (gm, target_node), which works for single-op patterns where both annotations are on the same node. For fused patterns, we need to know both nodes, so graph builders can now optionally return (gm, output_node, input_source_node).

# Change 2: We check annotations on the correct nodes for fused patterns

The test previously assumed output_qspec and input_qspec_map were both on the same node. For fused patterns, they're on different nodes:
  - output_qspec is checked on the output node (relu)
  - input_qspec_map is checked on the input source node (conv)

This change is backwards-compatible: for non-fused patterns, both nodes are the same.

Reviewed By: hsharma35

Differential Revision: D89630759
Copilot AI review requested due to automatic review settings December 22, 2025 17:42
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.



Comment on lines +341 to +364
```python
def _build_layer_norm_graph(self) -> tuple[torch.fx.GraphModule, torch.fx.Node]:
    """Build a simple graph with a layer_norm operation."""
    # Input shape: (batch, features)
    x = torch.randn(1, 10)
    # normalized_shape must match the last dimension(s) of input
    normalized_shape = [10]
    gm = single_op_builder(
        placeholders=(x,),
        op=torch.ops.aten.layer_norm.default,
        args=(x, normalized_shape),
    )

    layer_norm_nodes = gm.graph.find_nodes(
        op="call_function",
        target=torch.ops.aten.layer_norm.default,
    )
    self.assertEqual(
        len(layer_norm_nodes), 1, "Should find exactly one layer_norm node"
    )
    # Add source_fn_stack metadata required by quantizer pattern matching
    layer_norm_nodes[0].meta["source_fn_stack"] = [
        ("layer_norm", torch.ops.aten.layer_norm.default)
    ]
    return gm, layer_norm_nodes[0]
```
Copilot AI Dec 22, 2025
This builder is inconsistent with the others. Most builders use GraphBuilder and include NodeMetadata with source_fn_stack at creation time. This builder uses single_op_builder and then manually adds source_fn_stack metadata after the fact. Consider either using GraphBuilder for consistency or documenting why single_op_builder is necessary for layer_norm.

```python
# Find the index of this input node in the input source node's args
arg_index = None
args = input_source_node.args
assert isinstance(args, tuple)
```
Copilot AI Dec 22, 2025
Using Python's assert statement in test code is not ideal because it can be disabled with optimization flags. Consider using self.assertIsInstance(args, tuple) instead to ensure the check always runs.

Suggested change:

```diff
- assert isinstance(args, tuple)
+ self.assertIsInstance(args, tuple)
```


Labels: CLA Signed, fb-exported, meta-exported
