daitomanabe/gaze-effect

Gaze Effect

Gaze Effect thumbnail

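Gaze Effect is a macOS camera effect that makes the person in a camera feed appear to be looking at the camera at all times. It analyzes the eye contours and pupil positions and gently shifts only the gaze toward the camera, giving online meetings, streams, and recordings a sense of being looked at.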

The concept is simple: make every captured face look toward the camera, continuously and in real time.

Concept

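In video calls and streams, simply looking at the other person or at material on screen makes your gaze appear to drift away from the camera. Gaze Effect corrects this offset in the video itself, so the person always appears to be looking at the camera.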

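The correction does not rebuild the whole face; it is limited to the area around the left and right eyes. It takes the eye contours and pupil positions from Vision face landmarks and moves each pupil slightly toward the center of the eye. This preserves expression, blinking, and head orientation while bringing only the gaze closer to the camera direction.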
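A minimal sketch of that pupil shift, assuming the eye contour is a list of (x, y) points and using the contour centroid as a stand-in for the eye center. The function names and the `strength` parameter are hypothetical illustrations and are not taken from GazeEffectCore.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def contour_centroid(contour: List[Point]) -> Point:
    """Average of the contour points, used here as an approximate eye center."""
    if not contour:
        raise ValueError("contour must contain at least one point")
    xs = [p[0] for p in contour]
    ys = [p[1] for p in contour]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def shift_pupil_toward_center(pupil: Point, contour: List[Point], strength: float = 0.5) -> Point:
    """Move the pupil a fraction of the way toward the eye center.

    strength is a hypothetical tuning value: 0.0 leaves the pupil in place,
    1.0 moves it all the way to the centroid.
    """
    strength = max(0.0, min(1.0, strength))
    cx, cy = contour_centroid(contour)
    return (pupil[0] + (cx - pupil[0]) * strength,
            pupil[1] + (cy - pupil[1]) * strength)


# A diamond-shaped eye contour centered on (4, 0), with the pupil off to the right.
eye = [(0.0, 0.0), (4.0, 2.0), (8.0, 0.0), (4.0, -2.0)]
print(shift_pupil_toward_center((6.0, 0.0), eye, strength=0.5))  # (5.0, 0.0)
```

Keeping `strength` below 1.0 reflects the "small movement" described above: the pupil is nudged, not snapped, so the rest of the eye region stays untouched.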

Ultimately, the goal is to register a virtual camera with macOS as a Core Media I/O Camera Extension, so that it can be selected as a regular camera in FaceTime, Zoom, OBS, AVFoundation clients, and similar apps.

Examples

The still-image example below uses a public domain source image and applies the same eye-contact correction concept. The pair shows the original image and the corrected output generated by GazeEffectImageTool.

Cecil Beaton / Tyneside Shipyards

Original | Corrected
Cecil Beaton Tyneside original | Cecil Beaton Tyneside corrected

Source: Wikimedia Commons, public domain

Video Test

This short test uses a local camera recording, samples it at 12 fps, applies gaze correction frame by frame with the slower inpaint fill mode, and encodes the clip at 36 fps for roughly 3x playback. Assets/test-video-2.mp4 is included as a source clip for reproducible testing; other local source recordings remain ignored.

For debugging accuracy, the second test also includes diagnostic renders. The first diagnostic clip paints the detected eye regions white. The second paints the same eye regions white and replaces the corrected pupil/iris target with a red circle. These clips make it easier to see that the current eye-region and pupil localization are still experimental and not yet production accurate.

Status

This repository currently contains the public project description, the first testable Swift core for estimating eye-contact correction vectors, a local camera test app, and an offline rendering test app. The production Camera Extension target, Metal/Core Image renderer, signing, and notarized installer are the next implementation steps.

The included installer is a developer-preview package. It installs the command-line validation tool and project documentation, but it does not yet install a virtual camera device.

Accuracy is still a work in progress. The current implementation is useful for verifying the Vision-only pipeline, but eye-region detection, pupil localization, and camera-facing target estimation still need improvement before the effect is reliable across different faces, gaze angles, lighting, and glasses.

How It Works

flowchart LR
    A["Physical camera"] --> B["AVCaptureSession in Camera Extension"]
    B --> C["CMSampleBuffer frames"]
    C --> D["Vision face landmark analyzer"]
    D --> E["EyeContactEstimator"]
    C --> F["Metal/Core Image eye warp renderer"]
    E --> F
    F --> G["CMIOExtensionStream"]
    G --> H["FaceTime / Zoom / OBS / AVFoundation clients"]

Processing

  1. Capture frames from the selected physical camera.
  2. Detect eye geometry with either the built-in Vision fallback or MediaPipe Face Landmarker sidecars.
  3. Select the primary face by largest bounding box.
  4. Estimate the white-of-eye region from each eye contour.
  5. Estimate the pupil/iris position from MediaPipe iris landmarks, or with a dark-blob search inside that region when using the offline refinement mode.
  6. Remove the original pupil/iris by blending it into surrounding sclera color. The preview app uses a realtime blend; offline frame processing can use an iterative inpaint-like fill.
  7. Estimate a per-eye camera-facing target from the eye corners and eyelid bounds, using a small vertical bias to avoid pushing pupils toward the eyelids.
  8. Paint the original pupil/iris texture back at the target position with a feathered mask.
  9. Do not temporally interpolate pupil positions or correction vectors; eye motion is fast, so each analyzed frame uses the current measurement directly.
  10. Emit the processed pixel buffer through CMIOExtensionStream.
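Steps 3 and 7 above can be sketched in a few lines. This is an illustration under stated assumptions, not the estimator in GazeEffectCore: faces are plain dictionaries with a hypothetical `bounds` entry of (x, y, width, height), the eye corners and eyelid bounds are taken as the extremes of the eye contour, and both the sign and the size of `vertical_bias` are guesses.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def select_primary_face(faces: List[Dict]) -> Dict:
    """Step 3: pick the face whose bounding box has the largest area."""
    if not faces:
        raise ValueError("no faces detected")
    return max(faces, key=lambda f: f["bounds"][2] * f["bounds"][3])


def camera_facing_target(contour: List[Point], vertical_bias: float = 0.1) -> Point:
    """Step 7: estimate where the pupil should sit to look at the camera.

    Horizontal position: midpoint between the eye corners (leftmost and
    rightmost contour points). Vertical position: midpoint between the eyelid
    bounds, nudged by vertical_bias, a fraction of the eye opening height.
    """
    if not contour:
        raise ValueError("contour must contain at least one point")
    xs = [p[0] for p in contour]
    ys = [p[1] for p in contour]
    left, right = min(xs), max(xs)
    top, bottom = min(ys), max(ys)
    center_x = (left + right) / 2.0
    center_y = (top + bottom) / 2.0
    return (center_x, center_y + vertical_bias * (bottom - top))


faces = [
    {"id": "background", "bounds": (0.0, 0.0, 10.0, 10.0)},
    {"id": "foreground", "bounds": (5.0, 5.0, 30.0, 20.0)},
]
print(select_primary_face(faces)["id"])  # foreground

eye = [(0.0, 5.0), (5.0, 3.0), (10.0, 5.0), (5.0, 7.0)]
print(camera_facing_target(eye, vertical_bias=0.25))  # (5.0, 6.0)
```

Because step 9 rules out temporal interpolation, a function like `camera_facing_target` would be called with the current frame's contour only, with no smoothing state carried between frames.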

Detection Pipelines

Gaze Effect now has two detector paths that feed the same renderer:

  • realtime: lightweight MediaPipe Face Landmarker / iris landmarks. This avoids Vision's coarse eye contour as the primary source and is designed for the future Camera Extension path. The current Swift preview keeps Vision as a fallback until MediaPipe is integrated natively into the app target.
  • offline: MediaPipe landmarks plus local dark-blob pupil refinement, then the slower inpaint fill mode in GazeEffectImageTool. This path is for README/video generation and accuracy debugging, where latency is less important than quality.

Both paths write or consume the same JSON sidecar shape: leftContour, rightContour, leftPupil, rightPupil, and faceBounds. That keeps the renderer independent of the detector implementation.
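As a rough illustration of consuming such a sidecar, the sketch below checks that the five keys are present. Only the key names come from this README; the value layout (points as [x, y], bounds as [x, y, width, height]) and the `parse_sidecar` helper are assumptions made for the example.

```python
import json
from typing import Any, Dict

REQUIRED_KEYS = ("leftContour", "rightContour", "leftPupil", "rightPupil", "faceBounds")


def parse_sidecar(text: str) -> Dict[str, Any]:
    """Parse one sidecar document and check that all five keys are present."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("sidecar must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError("sidecar is missing keys: " + ", ".join(missing))
    return data


# Hypothetical value layout: points as [x, y], bounds as [x, y, width, height].
example = json.dumps({
    "leftContour": [[10, 20], [14, 18], [18, 20], [14, 22]],
    "rightContour": [[30, 20], [34, 18], [38, 20], [34, 22]],
    "leftPupil": [14, 20],
    "rightPupil": [34, 20],
    "faceBounds": [0, 0, 50, 60],
})
sidecar = parse_sidecar(example)
print(sorted(sidecar))
```

Validating the shape at the boundary is what lets a renderer stay unaware of whether the landmarks came from Vision or MediaPipe.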

Apple APIs

Apple's Camera Extension workflow is documented in Creating a camera extension with Core Media I/O.

Repository Layout

  • Sources/GazeEffectCore/GazeEffectCore.swift: frame-independent eye-contact estimation logic.
  • Sources/GazeEffectCoreCheck/main.swift: geometry and safety checks that run without Xcode.
  • Sources/GazeEffectPreviewApp/main.swift: local camera test app with Effect / Debug preview modes.
  • Sources/GazeEffectImageTool/main.swift: still-image and frame-sequence correction tool used by the offline renderer and README examples.
  • Sources/GazeEffectOfflineRendererApp/main.swift: GUI wrapper for offline video rendering and diagnostic output.
  • scripts/mediapipe-eye-landmarks.py: MediaPipe Face Landmarker / iris sidecar generator for realtime and offline detector modes.
  • scripts/build-camera-test-app.sh: builds the local camera test app.
  • scripts/build-offline-renderer-app.sh: builds the offline video renderer app.
  • scripts/build-test-apps.sh: builds both local test apps.
  • scripts/build-installer.sh: builds an unsigned developer-preview macOS installer package.

The core package intentionally keeps Vision, AVFoundation, Metal, and Core Media I/O out of the library target. This keeps the correction logic testable and allows the same estimator to run inside a Camera Extension, preview app, or offline renderer.

Build

Requirements:

  • macOS 13 or later.
  • Xcode command line tools: xcode-select --install.
  • ffmpeg for offline video rendering. For example: brew install ffmpeg.
  • Python packages for MediaPipe sidecar generation: python3 -m pip install mediapipe opencv-python numpy.

Run the core geometry and safety checks:

swift run GazeEffectCoreCheck

Build both local test apps:

./scripts/build-test-apps.sh

This writes:

build/GazeEffectCameraTest.app
build/GazeEffectOfflineRenderer.app

These app bundles are ad-hoc signed when no local Apple Development identity is available. They are intended for local testing, not notarized public distribution.

Camera Test App

Build and launch:

./scripts/build-camera-test-app.sh
open build/GazeEffectCameraTest.app

The camera test app opens the physical camera and runs the current realtime preview pipeline. Use the segmented control in the top-right corner:

  • Effect: previews the eye-contact correction on the live camera image.
  • Debug: shows face bounds, eye contours, detected pupils, target pupils, and correction vectors.

The camera test app is not a virtual camera device. It is a local validation app for tuning detection and rendering before the Core Media I/O Camera Extension target is completed.

Offline Rendering App

Build and launch:

./scripts/build-offline-renderer-app.sh
open build/GazeEffectOfflineRenderer.app

Default input is:

Assets/test-video-2.mp4

Default output is:

build/offline-renderer/gaze-effect-offline-corrected.mp4

The offline renderer runs this sequence:

  1. Extract frames from the input movie at 12 fps and resize to the selected max width.
  2. Generate MediaPipe eye/iris sidecar JSON files.
  3. Run GazeEffectImageTool on the frame sequence.
  4. Encode the corrected frames at 36 fps for roughly 3x playback.
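
Steps 1 and 4 map onto two ffmpeg invocations. The sketch below only builds the argument lists and does not reproduce the app's actual commands: the `%06d.png` frame naming, the `libx264` codec choice, the 1280-pixel width, and the decision not to upscale are all assumptions for illustration.

```python
from typing import List


def extract_frames_command(movie: str, frames_dir: str, max_width: int, fps: int = 12) -> List[str]:
    """Step 1: sample the movie at a fixed rate and cap the frame width."""
    # min(iw, max_width) avoids upscaling; -2 keeps the aspect ratio with an even height.
    video_filter = f"fps={fps},scale='min(iw,{max_width})':-2"
    return ["ffmpeg", "-y", "-i", movie, "-vf", video_filter, f"{frames_dir}/%06d.png"]


def encode_frames_command(frames_dir: str, output: str, fps: int = 36) -> List[str]:
    """Step 4: encode the numbered frames at the playback rate."""
    return ["ffmpeg", "-y", "-framerate", str(fps), "-i", f"{frames_dir}/%06d.png",
            "-c:v", "libx264", "-pix_fmt", "yuv420p", output]


def playback_speed(sample_fps: int, encode_fps: int) -> float:
    """Frames sampled at sample_fps and played at encode_fps run this many times faster."""
    return encode_fps / sample_fps


print(" ".join(extract_frames_command("Assets/test-video-2.mp4", "frames", 1280)))
print(" ".join(encode_frames_command(
    "corrected-frames", "build/offline-renderer/gaze-effect-offline-corrected.mp4")))
print(playback_speed(12, 36))  # 3.0
```

Each list could be passed to `subprocess.run(command, check=True)`. The 3x figure follows directly from the two rates: 36 / 12 = 3.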

Detector modes:

  • Realtime: MediaPipe iris landmarks with realtime sclera fill.
  • Offline: MediaPipe iris landmarks, dark-blob refinement, and slower inpaint-like sclera fill.

Render modes:

  • Effect: corrected gaze render.
  • White eyes: fills the detected eye regions for mask debugging.
  • White eyes + red pupils: fills eye regions and draws the corrected pupil target as red circles.

If the first MediaPipe run cannot find Assets/models/face_landmarker.task, the script downloads it automatically. The model file is ignored by git.

Still image processing:

swift run GazeEffectImageTool -- \
  --input source.jpg \
  --output corrected.jpg \
  --fill-mode realtime

Frame-sequence processing with slower inpaint-like filling:

swift run GazeEffectImageTool -- \
  --input-dir frames \
  --output-dir corrected-frames \
  --fill-mode inpaint

MediaPipe realtime sidecars:

scripts/mediapipe-eye-landmarks.py \
  --input-dir frames \
  --output-dir landmarks \
  --mode realtime

swift run GazeEffectImageTool -- \
  --input-dir frames \
  --output-dir corrected-frames \
  --landmarks-dir landmarks \
  --fill-mode realtime

Offline quality pass:

scripts/mediapipe-eye-landmarks.py \
  --input-dir frames \
  --output-dir landmarks-offline \
  --mode offline

swift run GazeEffectImageTool -- \
  --input-dir frames \
  --output-dir corrected-frames \
  --landmarks-dir landmarks-offline \
  --fill-mode inpaint

Diagnostic frame-sequence rendering:

swift run GazeEffectImageTool -- \
  --input-dir frames \
  --output-dir white-eyes \
  --render-mode white-eyes

swift run GazeEffectImageTool -- \
  --input-dir frames \
  --output-dir white-eyes-red-pupils \
  --render-mode white-eyes-red-pupils

Legacy Local App Command

For compatibility, this command still builds the camera test app:

./scripts/build-app.sh
open build/GazeEffectCameraTest.app

Build Installer

./scripts/build-installer.sh

The generated package is written to:

dist/GazeEffect-DeveloperPreview-0.1.0.pkg

Install locally:

sudo installer -pkg dist/GazeEffect-DeveloperPreview-0.1.0.pkg -target /
gaze-effect-check

Installed files:

  • /usr/local/bin/gaze-effect-check
  • /usr/local/share/gaze-effect/README.md
  • /usr/local/share/gaze-effect/LICENSE

Gatekeeper

The developer-preview installer is unsigned unless it is built with a Developer ID Installer certificate. macOS may show a warning such as:

Apple could not verify "GazeEffect-DeveloperPreview-0.1.0.pkg" is free of malware.

For local development only, remove the quarantine attribute and install from Terminal:

xattr -d com.apple.quarantine dist/GazeEffect-DeveloperPreview-0.1.0.pkg
sudo installer -pkg dist/GazeEffect-DeveloperPreview-0.1.0.pkg -target /

For public distribution, build a signed and notarized package:

DEVELOPER_ID_INSTALLER="Developer ID Installer: Your Name (TEAMID)" \
NOTARY_PROFILE="gaze-effect-notary" \
./scripts/build-installer.sh

Create the notary profile once:

xcrun notarytool store-credentials gaze-effect-notary \
  --apple-id "you@example.com" \
  --team-id "TEAMID"

The public distribution package should pass:

pkgutil --check-signature dist/GazeEffect-DeveloperPreview-0.1.0.pkg
spctl --assess --type install -vv dist/GazeEffect-DeveloperPreview-0.1.0.pkg

Camera Extension Roadmap

  1. Create a macOS app target, for example GazeEffectHost.
  2. Add a Camera Extension target, for example GazeEffectCameraExtension.
  3. Add this package as a local Swift package dependency and link GazeEffectCore to the extension target.
  4. In the extension provider source, create one CMIOExtensionDevice and one video CMIOExtensionStream.
  5. In the stream source, start an AVCaptureSession and receive physical camera frames.
  6. Add a Vision analyzer that converts VNFaceObservation landmarks into FaceLandmarks.
  7. Feed the latest FaceLandmarks to EyeContactEstimator.
  8. Apply a Metal/Core Image ROI warp using EyeCorrection.delta.
  9. Wrap the processed CVPixelBuffer in a CMSampleBuffer.
  10. Send it through the stream source.

Visual Design

The first production version should avoid replacing the whole eye. A small local pupil and iris shift is more stable:

  • mask: ellipse around each eye with soft feather
  • source: original eye texture
  • warp: vector field strongest near pupil and fading at the eyelid boundary
  • clamp: conservative horizontal/vertical limits around the pupil patch
  • fallback: pass-through frame when face/eyes are unstable
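
The mask, warp, clamp, and fallback items can be combined into a per-pixel displacement, sketched below. This is a simplification under stated assumptions: a single ellipse serves as both the mask and the warp falloff, the falloff uses a smoothstep curve, and every name and default value here is hypothetical and not taken from the renderer.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def smoothstep(edge0: float, edge1: float, x: float) -> float:
    """Smooth 0..1 ramp between edge0 and edge1."""
    t = max(0.0, min(1.0, (x - edge0) / (edge1 - edge0)))
    return t * t * (3.0 - 2.0 * t)


def ellipse_mask(point: Point, center: Point, radii: Point, feather: float = 0.3) -> float:
    """Soft elliptical mask: 1.0 near the center, fading to 0.0 at the boundary."""
    feather = max(feather, 1e-6)  # avoid a zero-width ramp
    dx = (point[0] - center[0]) / radii[0]
    dy = (point[1] - center[1]) / radii[1]
    distance = math.hypot(dx, dy)  # 1.0 on the ellipse boundary
    return 1.0 - smoothstep(1.0 - feather, 1.0, distance)


def clamp_delta(delta: Point, max_dx: float, max_dy: float) -> Point:
    """Conservative horizontal and vertical limits on the correction vector."""
    return (max(-max_dx, min(max_dx, delta[0])),
            max(-max_dy, min(max_dy, delta[1])))


def warp_offset(point: Point, center: Point, radii: Point, delta: Point,
                max_dx: float, max_dy: float, stable: bool = True) -> Point:
    """Displacement for one pixel; (0.0, 0.0) means pass-through."""
    if not stable:
        return (0.0, 0.0)  # fallback: leave the frame untouched
    weight = ellipse_mask(point, center, radii)
    dx, dy = clamp_delta(delta, max_dx, max_dy)
    return (dx * weight, dy * weight)


center, radii = (0.0, 0.0), (10.0, 5.0)
print(warp_offset((0.0, 0.0), center, radii, (8.0, 1.0), 3.0, 2.0))   # (3.0, 1.0)
print(warp_offset((10.0, 0.0), center, radii, (8.0, 1.0), 3.0, 2.0))  # (0.0, 0.0)
```

In this sketch the requested shift of 8 pixels is clamped to 3 at the center of the ellipse and fades to zero at its edge, so pixels at the eyelid boundary are never moved.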

For a stronger future version, a 3D eye model or learned gaze-redirection model can be added. The local ROI warp remains the practical MVP for a real-time camera extension.

License

MIT License.
