New demo: cat detector #465

jholveck · 2026-01-22T07:04:11Z

This detects whether a cat is on the screen. By that I mean displayed on the screen, not sitting on your laptop.

This is meant as a simple demo of using MSS for AI. It works as-is, but it still needs documentation, and there are some bits that could do with cleanup.

There are a lot of additional features that could be added, such as showing a window with bounding boxes, but that's probably more complexity than is called for here.

BoboTiG · 2026-01-22T08:04:04Z

I like it, great inspiration!

halldorfannar

Excellent PR. Well documented, enjoyable to read. Just some minor improvements suggested.

halldorfannar · 2026-01-26T13:22:17Z

demos/cat-detector.py

+# identify what it's seeing on its cameras.
+#
+# For this demo, we want to tell if a cat is anywhere on the screen, not if the whole screen is a picture of a cat.
+# That means that we want to use an detector, not a classifier.


Suggested change

-# That means that we want to use an detector, not a classifier.
+# That means that we want to use a detector, not a classifier.
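
To illustrate the distinction this comment draws, here is a minimal sketch in plain Python. It assumes the per-image output layout that Torchvision's detection models use (parallel `boxes`, `labels`, and `scores` sequences), with ordinary lists standing in for tensors. `CAT_LABEL`, `SCORE_THRESHOLD`, `contains_label`, and all sample values are made up for illustration and are not taken from the demo.

```python
# Hypothetical values for illustration only; the demo's real label index and
# score threshold may differ.
CAT_LABEL = 17
SCORE_THRESHOLD = 0.5

# A classifier answers "what is this whole image?" with a single label.
classifier_output = {"label": 63, "score": 0.91}

# A detector answers "what objects are in this image, and where?" with one
# box, label, and score per object it finds.
detector_output = {
    "boxes": [[40.0, 60.0, 400.0, 380.0], [900.0, 20.0, 960.0, 70.0]],
    "labels": [63, 17],
    "scores": [0.88, 0.74],
}


def contains_label(detections: dict, label: int, min_score: float) -> bool:
    """Return True if any detection has the given label with enough confidence."""
    return any(
        found == label and score >= min_score
        for found, score in zip(detections["labels"], detections["scores"])
    )


print(contains_label(detector_output, CAT_LABEL, SCORE_THRESHOLD))
```

With a classifier, a small cat in one corner of a busy screen would likely be lost in the single whole-image label; with a detector, it shows up as its own entry.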

halldorfannar · 2026-01-26T13:37:13Z

demos/cat-detector.py

+# Performance
+# ===========
+#
+# The biggest determinant of performance is whether the model runs on a GPU or on the CPU.  GPUs are extremely


Maybe we should mention here, right away, that this particular model will work on both? I know this becomes clearer at the end of this section, where GPU and CPU performance are compared.
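
For readers unfamiliar with how one script can run on either kind of hardware, here is a minimal sketch of the usual PyTorch pattern. It is not taken from the demo: `pick_device_name` is a hypothetical helper, and the `find_spec` guard exists only so the snippet still runs where PyTorch is not installed.

```python
import importlib.util


def pick_device_name() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    # Guard so this sketch still runs where PyTorch is not installed.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device_name())
```

The returned name can then be passed to `model.to(...)` and to the input tensors, so the rest of the code stays the same on both.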

halldorfannar · 2026-01-26T13:40:27Z

demos/cat-detector.py

+# ===========
+#
+# The first time you run this demo, Torchvision will download a 167 MByte DNN.  This is cached in
+# ~/.cache/torch/hub/checkpoints on Unix.  I'm not sure where it's cached on other platforms, but it will tell you.


Third person was used earlier in the text. I would avoid switching to first person here. Perhaps just go with this:

Suggested change

-# ~/.cache/torch/hub/checkpoints on Unix. I'm not sure where it's cached on other platforms, but it will tell you.
+# ~/.cache/torch/hub/checkpoints on Unix. If you want to know where the cache is stored on other platforms, this information will be displayed after downloading the DNN.
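
As a rough aid for the "other platforms" question, the sketch below mirrors what I understand to be PyTorch's default lookup: `TORCH_HOME` if set, otherwise `XDG_CACHE_HOME` (defaulting to `~/.cache`) plus `torch`, with checkpoints under `hub/checkpoints`. Treat this as an assumption rather than a specification; `default_checkpoint_dir` is a hypothetical helper, and with PyTorch installed `torch.hub.get_dir()` is the authoritative way to ask.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def default_checkpoint_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Best-effort guess at where downloaded model weights are cached."""
    env = os.environ if env is None else env
    # Assumed lookup order: TORCH_HOME, then XDG_CACHE_HOME/torch, then ~/.cache/torch.
    cache_root = env.get("XDG_CACHE_HOME", "~/.cache")
    torch_home = env.get("TORCH_HOME", os.path.join(cache_root, "torch"))
    return Path(os.path.expanduser(torch_home)) / "hub" / "checkpoints"


print(default_checkpoint_dir())
```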

halldorfannar · 2026-01-26T13:42:55Z

demos/cat-detector.py

+import torchvision.models.detection
+import torchvision.transforms.v2
+
+# You'll also need to "pip install mss pillow".


Rather than assume the user is leveraging pip (I'm a fan of uv), I would suggest the more general:

Suggested change

-# You'll also need to "pip install mss pillow".
+# You'll also need to install mss and pillow.

This also aligns with the earlier text, which suggests pip but leaves the specific installation method up to the user.

halldorfannar · 2026-01-26T13:47:27Z

demos/cat-detector.py

+# If an image is too small, then it's got a pretty decent chance of being a false positive: it's hard to tell if a
+# Discord or Slack reaction icon is a cat or something different.  We ignore any results that are too small to be
+# reliable.  Here, this cutoff is 0.1% of the whole monitor (about 1.5 cm square on a 27" monitor, the diameter of a
+# AA battery).  Like the score threshold, this is just something you try and see what the model seems to be able to


Suggested change

-# AA battery). Like the score threshold, this is just something you try and see what the model seems to be able to
+# AA battery). Like the score threshold, this is just something you try and see what the model is able to
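
The size cutoff described in this hunk can be sketched as follows. Only the 0.1% figure comes from the comment; the function names, the `(x1, y1, x2, y2)` pixel box layout, and the 16:9 aspect ratio are assumptions for illustration. Under those assumptions, a 27" monitor gives a cutoff square of roughly 1.4 cm per side, in line with the "about 1.5 cm" in the comment.

```python
import math

MIN_AREA_FRACTION = 0.001  # 0.1% of the monitor, as stated in the comment


def is_large_enough(box, monitor_width, monitor_height, min_fraction=MIN_AREA_FRACTION):
    """Return True if an (x1, y1, x2, y2) pixel box covers at least min_fraction of the monitor."""
    x1, y1, x2, y2 = box
    box_area = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    return box_area >= min_fraction * monitor_width * monitor_height


def cutoff_side_cm(diagonal_inches, aspect_w=16, aspect_h=9, min_fraction=MIN_AREA_FRACTION):
    """Side length, in cm, of a square whose area is min_fraction of the monitor."""
    diagonal_cm = diagonal_inches * 2.54
    # Scale the aspect-ratio triangle so its hypotenuse matches the diagonal.
    scale = diagonal_cm / math.hypot(aspect_w, aspect_h)
    monitor_area_cm2 = (aspect_w * scale) * (aspect_h * scale)
    return math.sqrt(min_fraction * monitor_area_cm2)


# On a 2560x1440 monitor the cutoff is about 3686 square pixels.
print(is_large_enough((0, 0, 64, 64), 2560, 1440))  # 4096 square pixels
print(is_large_enough((0, 0, 32, 32), 2560, 1440))  # 1024 square pixels
print(round(cutoff_side_cm(27), 2))
```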

halldorfannar · 2026-01-26T13:53:14Z

demos/cat-detector.py

+    preprocess = weights.transforms()
+
+    # The labels ("what type of object is this") that the model gives us are just integers; for this model, they're
+    # from 0 to 90.  The English words describing them ("cat") are in a list, stored in the weight's metadata.


Suggested change

-# from 0 to 90. The English words describing them ("cat") are in a list, stored in the weight's metadata.
+# from 0 to 90. The English words describing them (like "cat") are in a list, stored in the weight's metadata.
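
A small sketch of the label lookup this comment describes, using a shortened, made-up `categories` list in place of the real metadata (Torchvision weights commonly expose the full list as `weights.meta["categories"]`, but the indices below are illustrative only and do not match the real model).

```python
# Stand-in for the category list in the weights' metadata. The real list has
# one entry per integer label; this shortened one is made up for illustration.
categories = ["__background__", "person", "bicycle", "cat", "dog"]

# Look up the integer label for "cat" once, up front...
cat_label = categories.index("cat")

# ...so each detection only needs a cheap integer comparison.
detected_labels = [1, 3, 4]
names = [categories[label] for label in detected_labels]

print(cat_label, names, cat_label in detected_labels)
```

Resolving the name to an integer once avoids comparing strings for every detection in every frame.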

jholveck and others added 3 commits January 22, 2026 10:37

Start on front-of-file comments

84c8076

Add many more comments

bc2ce7a

Improve comments, and add screenshot_to_tensor

cda4c2e
