Capture API

dcc_mcp_core (capture module)

Screen capture for DCC applications using platform-specific backends.

Capturer

High-level capturer wrapper with automatic backend selection.

Constructor

python
from dcc_mcp_core import Capturer

capturer = Capturer.new_auto()

Static Methods

| Method | Returns | Description |
| --- | --- | --- |
| `new_auto()` | `Capturer` | Create capturer with best available backend (full-screen / display) |
| `new_window_auto()` | `Capturer` | Create capturer configured for single-window capture (HWND PrintWindow on Windows; Mock elsewhere) |
| `new_mock(width=1920, height=1080)` | `Capturer` | Create capturer with mock backend (for testing/CI) |
| `capture_window_png(pid, *, timeout_ms=1000)` | `bytes \| None` | One-shot helper: resolve the main window of pid, capture it, return PNG-encoded bytes. Returns None on any failure (window not found, backend error, …) instead of raising |
| `capture_region_png(pid, x, y, w, h, *, timeout_ms=1000)` | `bytes \| None` | One-shot helper: capture window of pid and CPU-crop to the rectangle (x, y, w, h) (window-local pixels). Zero-width / zero-height regions short-circuit to None without touching the backend |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| `capture(format="png", jpeg_quality=85, scale=1.0, timeout_ms=5000, process_id=None, window_title=None)` | `CaptureFrame` | Capture a frame (display or, when process_id/window_title is set, the matching window) |
| `capture_window(*, process_id=None, window_handle=None, window_title=None, format="png", jpeg_quality=85, scale=1.0, timeout_ms=5000, include_decorations=True)` | `CaptureFrame` | Capture a single window. At least one of process_id / window_handle / window_title must be provided |
| `backend_name()` | `str` | Name of the active backend (e.g. "DXGI Desktop Duplication", "HWND PrintWindow") |
| `backend_kind()` | `CaptureBackendKind` | Enum form of the active backend |
| `stats()` | `tuple[int, int, int]` | Running statistics: (capture_count, total_bytes, error_count) |
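The tuple returned by stats() is positional, so a small named wrapper can make call sites easier to read. The sketch below is not part of dcc_mcp_core: `CaptureStats` and `summarize_stats` are hypothetical helper names, and the source does not say whether capture_count includes failed attempts, so only an average over capture_count is derived.

```python
from typing import NamedTuple


class CaptureStats(NamedTuple):
    """Named view over the documented (capture_count, total_bytes, error_count) tuple."""

    capture_count: int
    total_bytes: int
    error_count: int


def summarize_stats(stats: tuple) -> dict:
    """Turn the raw stats tuple into a dict with an average payload size."""
    s = CaptureStats(*stats)
    # Guard against division by zero before any capture has happened.
    avg = s.total_bytes / s.capture_count if s.capture_count else 0.0
    return {
        "capture_count": s.capture_count,
        "error_count": s.error_count,
        "avg_bytes_per_capture": avg,
    }
```

With a live capturer this would be called as `summarize_stats(capturer.stats())`.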

CaptureFrame

python
frame = capturer.capture(format="png")
print(frame.width, frame.height)  # Frame dimensions
print(frame.format)               # Format string: "png", "jpeg", or "raw_bgra"
print(frame.mime_type)            # MIME type, e.g. "image/png"
print(frame.byte_len())           # Byte length of encoded data
print(frame.data)                 # Encoded image bytes

CaptureFrame Properties

| Property | Type | Description |
| --- | --- | --- |
| `width` | `int` | Frame width in pixels |
| `height` | `int` | Frame height in pixels |
| `data` | `bytes` | Encoded image bytes (PNG, JPEG) or raw BGRA32 data |
| `format` | `str` | Format string: "png", "jpeg", or "raw_bgra" |
| `mime_type` | `str` | MIME type for the encoded bytes (e.g. "image/png") |
| `timestamp_ms` | `int` | Milliseconds since Unix epoch at capture time |
| `dpi_scale` | `float` | Display scale factor (1.0 standard, 2.0 HiDPI) |
| `window_rect` | `tuple[int, int, int, int] \| None` | (x, y, width, height) of the source window in screen coordinates, or None for full-screen / display captures |
| `window_title` | `str \| None` | Source window title, or None for full-screen / display captures |

CaptureFrame Methods

| Method | Returns | Description |
| --- | --- | --- |
| `byte_len()` | `int` | Byte length of the encoded image data |
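As an illustration of how the frame properties combine, the sketch below derives a file name from format, timestamp_ms, width and height. `frame_filename` is a hypothetical helper, the `.bgra` extension is an arbitrary choice for raw data, and a `SimpleNamespace` stands in for a real CaptureFrame so the example runs without a display.

```python
from datetime import datetime, timezone
from types import SimpleNamespace

# Extension per documented format string; ".bgra" is an arbitrary pick for raw data.
_EXTENSIONS = {"png": ".png", "jpeg": ".jpg", "raw_bgra": ".bgra"}


def frame_filename(frame, prefix: str = "capture") -> str:
    """Build '<prefix>_<UTC timestamp>_<width>x<height><ext>' from frame attributes."""
    # timestamp_ms is milliseconds since the Unix epoch, so divide by 1000 for seconds.
    taken = datetime.fromtimestamp(frame.timestamp_ms / 1000, tz=timezone.utc)
    stamp = taken.strftime("%Y%m%dT%H%M%S")
    ext = _EXTENSIONS.get(frame.format, ".bin")
    return f"{prefix}_{stamp}_{frame.width}x{frame.height}{ext}"


# Stand-in object exposing only the attributes the helper reads.
fake_frame = SimpleNamespace(width=1920, height=1080, format="png", timestamp_ms=0)
print(frame_filename(fake_frame))  # capture_19700101T000000_1920x1080.png
```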

CaptureFormat

| Format | Description |
| --- | --- |
| `png` | PNG image format (lossless, larger) |
| `jpeg` / `jpg` | JPEG image format (lossy, smaller) |
| `raw_bgra` | Raw BGRA32 bytes (no encoding) |
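Many image consumers expect RGBA channel order rather than BGRA. The sketch below swaps the blue and red channels of a raw_bgra buffer using only the standard library. It assumes the buffer is tightly packed (4 bytes per pixel, no row padding), which the source does not state explicitly; `bgra_to_rgba` is a hypothetical helper name.

```python
def bgra_to_rgba(data: bytes, width: int, height: int) -> bytes:
    """Return a copy of a packed BGRA32 buffer with channels reordered to RGBA."""
    expected = width * height * 4
    if len(data) != expected:
        # A mismatch suggests row padding or a different pixel layout.
        raise ValueError(f"expected {expected} bytes, got {len(data)}")
    out = bytearray(data)
    out[0::4] = data[2::4]  # red goes to the first byte of each pixel
    out[2::4] = data[0::4]  # blue goes to the third byte of each pixel
    return bytes(out)
```

For a frame captured with format="raw_bgra", the call would be `bgra_to_rgba(frame.data, frame.width, frame.height)`.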

Capture Parameters

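| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `format` | `str` | `"png"` | Output format |
| `jpeg_quality` | `int` | `85` | JPEG quality (1-100) |
| `scale` | `float` | `1.0` | Scale factor |
| `timeout_ms` | `int` | `5000` | Capture timeout |
| `process_id` | `int` | `None` | Capture specific process |
| `window_title` | `str` | `None` | Capture specific window |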
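How the library reacts to out-of-range parameter values is not documented here, so callers may prefer to validate them up front. The following sketch checks the documented ranges and builds a keyword dictionary for capture(). `build_capture_kwargs` is a hypothetical helper, and the rules that scale must be positive and timeout_ms non-negative are assumptions.

```python
_FORMATS = {"png", "jpeg", "jpg", "raw_bgra"}  # format names listed under CaptureFormat


def build_capture_kwargs(format="png", jpeg_quality=85, scale=1.0,
                         timeout_ms=5000, process_id=None, window_title=None):
    """Validate capture parameters and return them as a kwargs dict."""
    fmt = format.lower()
    if fmt not in _FORMATS:
        raise ValueError(f"unsupported format: {format!r}")
    if not 1 <= jpeg_quality <= 100:
        raise ValueError("jpeg_quality must be in 1-100")
    if scale <= 0:  # assumption: only positive scale factors make sense
        raise ValueError("scale must be positive")
    if timeout_ms < 0:  # assumption: negative timeouts are invalid
        raise ValueError("timeout_ms must be non-negative")
    kwargs = {"format": fmt, "jpeg_quality": jpeg_quality,
              "scale": scale, "timeout_ms": timeout_ms}
    # Only forward window selectors that were actually set.
    if process_id is not None:
        kwargs["process_id"] = process_id
    if window_title is not None:
        kwargs["window_title"] = window_title
    return kwargs
```

The result can be splatted into the documented method, for example `capturer.capture(**build_capture_kwargs(format="jpeg", scale=0.5))`.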

Window-Target Capture

Capture a single application window instead of the entire display.

python
from dcc_mcp_core import Capturer, CaptureTarget, WindowFinder

# High-level: auto-select window backend, capture by PID / title / handle
cap = Capturer.new_window_auto()
frame = cap.capture_window(window_title="Maya 2024", include_decorations=True)
print(frame.window_rect, frame.window_title)   # ((x, y, w, h), "Maya 2024 - ...")

# Low-level: resolve a target to a concrete HWND before capture
finder = WindowFinder()
info = finder.find(CaptureTarget.process_id(12345))
if info is not None:
    print(info.handle, info.pid, info.title, info.rect)
    frame = cap.capture_window(window_handle=info.handle)

# Enumerate every visible top-level window
for info in finder.enumerate():
    print(info.handle, info.title)

One-shot PNG sugar

When a caller only needs PNG bytes for a single window and does not want to manage a Capturer instance, use the static helpers. They swallow every failure mode (window not found, backend error, decode error, out-of-bounds crop) and return None:

python
from pathlib import Path

from dcc_mcp_core import Capturer

png = Capturer.capture_window_png(pid=12345, timeout_ms=1000)
if png is not None:
    Path("maya.png").write_bytes(png)

# CPU-crop to a sub-rectangle (window-local coordinates, no backend change)
thumb = Capturer.capture_region_png(12345, 0, 0, 320, 180, timeout_ms=1000)
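Because the one-shot helpers return either PNG bytes or None, a caller may want a quick sanity check on the result before using it. The sketch below reads the width and height from the PNG IHDR header using only the standard library. `png_dimensions` is a hypothetical helper, not part of dcc_mcp_core.

```python
import struct
from typing import Optional, Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(png: Optional[bytes]) -> Optional[Tuple[int, int]]:
    """Return (width, height) from a PNG header, or None if the input is not a PNG."""
    if png is None or len(png) < 24:
        return None
    # Bytes 0-7: signature; 8-11: chunk length; 12-15: chunk type, IHDR comes first.
    if png[:8] != PNG_SIGNATURE or png[12:16] != b"IHDR":
        return None
    # IHDR starts with two big-endian 32-bit integers: width, then height.
    width, height = struct.unpack(">II", png[16:24])
    return width, height
```

Applied to the call above, `png_dimensions(thumb)` would be expected to report the cropped size when the capture succeeded, and None when the helper returned None.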

CaptureTarget

Opaque window / display target descriptor. Construct via the static factories below.

| Factory | Description |
| --- | --- |
| `CaptureTarget.primary_display()` | The primary display (full-screen capture) |
| `CaptureTarget.monitor_index(index)` | A specific monitor by 0-based index |
| `CaptureTarget.process_id(pid)` | The main window belonging to a process |
| `CaptureTarget.window_title(title)` | The first window whose title contains the substring |
| `CaptureTarget.window_handle(handle)` | A specific HWND / X11 window ID |

WindowFinder

Resolves a CaptureTarget to a concrete WindowInfo without raising when no match is found.

| Method | Returns | Description |
| --- | --- | --- |
| `WindowFinder()` | `WindowFinder` | Construct a finder (platform-native enumeration on Windows; stubbed elsewhere) |
| `.find(target)` | `WindowInfo \| None` | Resolve a CaptureTarget; returns None when no matching window exists |
| `.enumerate()` | `list[WindowInfo]` | Every visible top-level window |
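CaptureTarget.window_title resolves only the first matching window, and the source does not say whether that match is case-sensitive. When every candidate is needed, the list from enumerate() can be filtered directly. The sketch below is a hypothetical helper (`windows_matching`) that does a case-insensitive substring match; `SimpleNamespace` objects stand in for WindowInfo so the example runs on any platform.

```python
from types import SimpleNamespace


def windows_matching(windows, needle: str) -> list:
    """Return every window whose title contains needle, ignoring case."""
    wanted = needle.casefold()
    return [w for w in windows if wanted in w.title.casefold()]


# Stand-ins exposing the documented handle / title attributes.
sample = [
    SimpleNamespace(handle=1, title="Autodesk Maya 2024 - untitled"),
    SimpleNamespace(handle=2, title="Script Editor"),
    SimpleNamespace(handle=3, title="maya output window"),
]
print([w.handle for w in windows_matching(sample, "Maya")])  # [1, 3]
```

With the real API the input would be `WindowFinder().enumerate()`.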

WindowInfo

| Property | Type | Description |
| --- | --- | --- |
| `handle` | `int` | Native window handle (HWND on Windows, X11 window ID on Linux) |
| `pid` | `int` | Owner process ID |
| `title` | `str` | Window title |
| `rect` | `tuple[int, int, int, int]` | (x, y, width, height) in screen coordinates |
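WindowInfo.rect is expressed in screen coordinates, whereas capture_region_png expects window-local pixels. The sketch below converts a screen-space region into window-local coordinates and clips it to the window bounds. `to_window_local` is a hypothetical helper. It assumes the window-local origin is the top-left corner of rect and ignores DPI scaling and window decorations, none of which the source specifies.

```python
from typing import Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def to_window_local(window_rect: Rect, screen_region: Rect) -> Optional[Rect]:
    """Clip a screen-space region to the window and express it in window-local pixels."""
    wx, wy, ww, wh = window_rect
    sx, sy, sw, sh = screen_region
    # Intersect the two rectangles in screen space.
    left, top = max(sx, wx), max(sy, wy)
    right, bottom = min(sx + sw, wx + ww), min(sy + sh, wy + wh)
    if right <= left or bottom <= top:
        return None  # no overlap, so nothing to crop
    # Shift the intersection so the window's top-left corner becomes (0, 0).
    return (left - wx, top - wy, right - left, bottom - top)
```

A None result signals that the requested region lies entirely outside the window, so the capture call can be skipped.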

Backends

| Backend | Platform | Kind | Description |
| --- | --- | --- | --- |
| `dxgi` | Windows | `DxgiDesktopDuplication` | DXGI Desktop Duplication API (full-screen / display) |
| `hwnd` | Windows | `HwndPrintWindow` | GDI PrintWindow + BitBlt fallback (single window) |
| `x11` | Linux | `X11Xshm` | X11 XShmGetImage (full-screen) |
| `pipewire` | Linux | `PipeWire` | PipeWire screencast (Wayland), reserved |
| `screencapturekit` | macOS | `ScreenCaptureKit` | ScreenCaptureKit, reserved |
| `mock` | All | `Mock` | Synthetic checkerboard for testing |

Backend selection is automatic:

  • Capturer.new_auto() — picks the best full-screen / display backend.
  • Capturer.new_window_auto() — picks the best window-target backend (HWND on Windows; Mock elsewhere).

CaptureBackendKind

Enum exposed as CaptureBackendKind.<Variant> class attributes. Useful for branching on the backend without parsing backend_name():

python
from dcc_mcp_core import Capturer, CaptureBackendKind

cap = Capturer.new_window_auto()
if cap.backend_kind() == CaptureBackendKind.HwndPrintWindow:
    ...  # Windows window capture path

Error Handling

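Capture errors from the Capturer instance methods are raised as RuntimeError. The one-shot static helpers (capture_window_png, capture_region_png) return None instead of raising.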

python
try:
    frame = capturer.capture(timeout_ms=1000)
except RuntimeError as e:
    print(f"Capture failed: {e}")
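Transient failures such as a timeout can sometimes be handled by retrying. The sketch below wraps any zero-argument callable in a retry loop with exponential backoff and catches only RuntimeError, the exception type documented above. `capture_with_retry` and its parameters are hypothetical, and whether a given capture error is actually worth retrying depends on the backend.

```python
import time


def capture_with_retry(capture, attempts: int = 3, delay_s: float = 0.1):
    """Call capture() up to `attempts` times, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return capture()
        except RuntimeError as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Back off: delay_s, 2*delay_s, 4*delay_s, ...
                time.sleep(delay_s * (2 ** attempt))
    # Every attempt failed; surface the most recent error to the caller.
    raise last_error
```

A typical call would be `capture_with_retry(lambda: capturer.capture(timeout_ms=1000))`.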

Platform-Specific Notes

Windows

The DXGI backend requires:

  • Windows 8 or later
  • DirectX 11 compatible GPU
  • Desktop Duplication support

Linux

The X11 backend requires:

  • X11 display server
  • Read access to the X server

macOS

macOS currently uses the Mock backend, which is suitable for testing only. The ScreenCaptureKit backend is reserved, so production capture on macOS requires a platform-specific implementation.

Released under the MIT License.