
feat: add configurable selection actions for floating action bar #238

Open
leyle wants to merge 7 commits into Chevey339:master from leyle:feature/configurable-selection-actions

Conversation

leyle commented Jan 8, 2026

  • Add SelectionAction model with configurable name, icon, and script path
  • Create dedicated 'Selection Actions' settings pane for managing custom scripts
  • Update floating action bar to dynamically display configured actions
  • Add hover detection to keep bar visible while interacting
  • Support CRUD operations for actions (add, edit, delete)
  • Scripts receive selected text as argument when executed
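The last bullet above can be sketched in Python. `SelectionAction` and its fields here are illustrative names inferred from this description, not the actual Dart model in the PR:

```python
import subprocess
from dataclasses import dataclass


@dataclass
class SelectionAction:
    # Illustrative fields, inferred from the PR description.
    name: str
    icon: str
    script_path: str


def run_action(action: SelectionAction, selected_text: str) -> str:
    """Invoke the configured script with the selection as argv[1]."""
    result = subprocess.run(
        [action.script_path, selected_text],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```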
[screenshots]

leyle (Author) commented Jan 8, 2026

For example, the script below passes the selected word to the Eudic dictionary (欧路词典).
To use it, save it as a shell script, then select that script in the settings.

```bash
#!/bin/bash

# The word comes from argv[1], falling back to $POPCLIP_TEXT
# when invoked from PopClip.
word="${1:-$POPCLIP_TEXT}"

if [[ -z "$word" ]]; then
    echo "Usage: $0 <word>" >&2
    exit 1
fi

eudic_id="${EUDIC_BUNDLE_ID:-com.eusoft.eudic}"

# Drive Eudic via AppleScript to show the dictionary entry.
osascript <<EOF
tell application id "$eudic_id"
    reopen
    activate
    show dic with word "$word"
end tell
EOF
```

The script below calls NextAI-Translator; it is used the same way.

```bash
#!/bin/bash

set -euo pipefail

text="${1:-}"

if [[ -z "$text" ]]; then
    echo "Usage: $0 <text>" >&2
    exit 1
fi

# POST the selection to the translator's Unix domain socket.
send_text() {
    curl -d "$text" --unix-socket /tmp/openai-translator.sock http://openai-translator
}

# If the app is not running yet, launch it in the background,
# give it a moment to create the socket, and retry once.
if ! send_text; then
    open -g -a "OpenAI Translator"
    sleep 2
    send_text
fi
```
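The curl call above speaks HTTP over a Unix domain socket. For reference, here is a stdlib-only Python sketch of the same request; the socket path matches the script, and everything else is illustrative:

```python
import http.client
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that tunnels over a Unix domain socket."""

    def __init__(self, socket_path: str):
        super().__init__("localhost")  # host is a placeholder; connect() is overridden
        self.socket_path = socket_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.socket_path)


def send_text(text: str, socket_path: str = "/tmp/openai-translator.sock") -> int:
    """POST the selection to the translator socket; return the HTTP status."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("POST", "/", body=text.encode("utf-8"))
        return conn.getresponse().status
    finally:
        conn.close()
```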

leyle (Author) commented Jan 8, 2026

The following is a Python script; save it as openai_tts.py and remember to replace the API key inside.

However, this Python script cannot be invoked directly (the execution environment's PATH is restricted, and we only support calling shell scripts directly). So you also need a small shell script that calls the Python script, for example:

```bash
#!/bin/bash
/Users/axel/.pyenv/shims/python /Users/axel/github/popclip-extensions/providers/openai_tts.py "$1"
```
```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import sys
import os
import subprocess
import hashlib
from pathlib import Path
from datetime import datetime
from openai import OpenAI

# Import shared utilities
from tts_utils import (
    log_message,
    process_text_for_tts,
    validate_audio_data,
    validate_cache_file,
    MIN_AUDIO_SIZE
)

# --- Configuration ---
MPV_PATH = "/opt/homebrew/bin/mpv"
BASE_URL = "https://aihubmix.com/v1"
API_KEY = "sk-api-key"
MODEL = "gpt-4o-mini-tts"
# VOICE = "alloy"
VOICE = "nova"
INSTRUCTIONS = """
Read the text below as an IELTS Listening narrator (Sections 3–4 style).
Use a natural British or Australian accent.
Apply natural speech features: connected speech, weak forms, reductions, small hesitations, soft fillers (“uh,” “um,” “right”), and light self‑corrections.
You may adjust wording slightly to sound natural, but keep the original meaning.
Use typical IELTS-style delivery with realistic rhythm and intonation.
Add minimal, subtle background ambience.
Now read the following text:
"""
CACHE_DIR = Path.home() / ".cache" / "openai_tts"
LOG_FILE = CACHE_DIR / "tts.log"
# -------------------


def get_cache_key(text: str, model: str, voice: str) -> str:
    """
    Generate a unique cache key based on normalized text, model, and voice.

    Args:
        text (str): The text to speak (will be normalized internally)
        model (str): TTS model name
        voice (str): Voice name

    Returns:
        str: SHA256 hash as cache key
    """
    normalized = process_text_for_tts(text, for_cache=True, log_file=LOG_FILE)
    cache_string = f"{normalized}|{model}|{voice}"
    return hashlib.sha256(cache_string.encode('utf-8')).hexdigest()


def get_cached_audio(cache_key: str) -> Path:
    """Get the path to cached audio file."""
    return CACHE_DIR / f"{cache_key}.mp3"


def ensure_cache_dir():
    """Create cache directory if it doesn't exist."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)


def save_to_cache(cache_key: str, audio_data: bytes):
    """Save audio data to cache."""
    cache_file = get_cached_audio(cache_key)
    with open(cache_file, 'wb') as f:
        f.write(audio_data)


def play_audio(audio_path: Path):
    """Play audio file using mpv."""
    mpv_command = [MPV_PATH, "--no-terminal", "--force-window=no", str(audio_path)]
    subprocess.run(mpv_command)


def main():
    """Main function to stream OpenAI TTS audio to the mpv player with caching."""
    # 1. Check for input text
    if len(sys.argv) < 2:
        error_msg = "Usage: python3 openai_tts.py \"<text to speak>\""
        log_message(error_msg, LOG_FILE, "ERROR")
        print(error_msg, file=sys.stderr)
        sys.exit(1)

    input_text = sys.argv[1]
    log_message(f"Received input text: '{input_text}'", LOG_FILE, "DEBUG")

    # 2. Process text for TTS (preserve formatting for better TTS quality)
    text_to_speak = process_text_for_tts(input_text, for_cache=False, log_file=LOG_FILE)
    log_message(f"Text for TTS API: '{text_to_speak}'", LOG_FILE, "INFO")

    # 3. Generate normalized text for cache key
    normalized_text = process_text_for_tts(input_text, for_cache=True, log_file=LOG_FILE)
    log_message(f"Normalized for cache: '{normalized_text}'", LOG_FILE, "DEBUG")

    # Validate that we have text to speak
    if not text_to_speak:
        error_msg = "No valid text to speak after processing"
        log_message(error_msg, LOG_FILE, "ERROR")
        print(f"Error: {error_msg}", file=sys.stderr)
        sys.exit(1)

    # 4. Ensure cache directory exists
    ensure_cache_dir()

    # 5. Generate cache key (uses normalized version)
    cache_key = get_cache_key(input_text, MODEL, VOICE)
    cache_file = get_cached_audio(cache_key)
    log_message(f"Cache key: {cache_key}, Model: {MODEL}, Voice: {VOICE}", LOG_FILE, "DEBUG")

    try:
        # 6. Check if audio is already cached
        if cache_file.exists():
            log_message(f"Cache hit! Using cached file: {cache_file}", LOG_FILE, "INFO")
            # Validate cached file before playing
            if validate_cache_file(cache_file, "MP3", LOG_FILE):
                log_message(f"Playing cached audio (size: {cache_file.stat().st_size} bytes)", LOG_FILE, "INFO")
                play_audio(cache_file)
            else:
                # Invalid cache file, remove it and re-fetch
                log_message("Invalid cached file, removing and re-fetching", LOG_FILE, "WARNING")
                cache_file.unlink()
                raise Exception("Invalid cached file, re-fetching from API")
        else:
            log_message("Cache miss. Fetching from API...", LOG_FILE, "INFO")

            # 7. Initialize OpenAI client
            client = OpenAI(
                base_url=BASE_URL,
                api_key=API_KEY
            )

            # 8. Fetch audio from OpenAI API (using original formatting)
            log_message(f"Sending to TTS API - Text: '{text_to_speak}', Model: {MODEL}, Voice: {VOICE}", LOG_FILE, "INFO")
            audio_data = b""
            try:
                with client.audio.speech.with_streaming_response.create(
                    model=MODEL,
                    voice=VOICE,
                    instructions=INSTRUCTIONS,
                    input=text_to_speak,  # Use original formatting for better TTS
                    response_format="mp3"
                ) as response:
                    # Check response status
                    if hasattr(response, 'status_code') and response.status_code != 200:
                        error_msg = f"API returned status code: {response.status_code}"
                        log_message(error_msg, LOG_FILE, "ERROR")
                        raise Exception(error_msg)

                    log_message("Receiving audio data from API...", LOG_FILE, "DEBUG")
                    # Collect all audio data
                    for chunk in response.iter_bytes(chunk_size=4096):
                        audio_data += chunk

                    log_message(f"Received {len(audio_data)} bytes from API", LOG_FILE, "INFO")

            except Exception as api_error:
                error_msg = f"Failed to fetch audio from API: {api_error}"
                log_message(error_msg, LOG_FILE, "ERROR")
                raise Exception(error_msg)

            # 9. Validate audio data before saving
            if not validate_audio_data(audio_data, "MP3", LOG_FILE):
                error_msg = f"Invalid audio data received (size: {len(audio_data)} bytes)"
                log_message(error_msg, LOG_FILE, "ERROR")
                raise Exception(error_msg)

            log_message(f"Audio validation passed (size: {len(audio_data)} bytes)", LOG_FILE, "DEBUG")

            # 10. Save to cache (using normalized cache key)
            save_to_cache(cache_key, audio_data)
            log_message(f"Saved audio to cache: {cache_file}", LOG_FILE, "INFO")

            # 11. Play the audio
            log_message("Playing audio...", LOG_FILE, "INFO")
            play_audio(cache_file)

        log_message("TTS playback completed successfully", LOG_FILE, "INFO")
        log_message("-" * 80, LOG_FILE, "DEBUG")  # Separator for readability

    except Exception as e:
        # Write any errors to a log file
        error_msg = f"An error occurred: {e}"
        log_message(error_msg, LOG_FILE, "ERROR")
        log_message("-" * 80, LOG_FILE, "DEBUG")

        # Also write to the old error log for compatibility
        with open(os.path.expanduser("/tmp/popclip_openai_tts_error.log"), "a") as f:
            f.write(f"{error_msg}\n")

        print(error_msg, file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    main()
```
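The script keys its cache on a SHA-256 of the normalized text plus model and voice. Since `tts_utils` is not included in this PR, here is a standalone sketch where normalization is reduced to lower-casing and whitespace collapsing (an assumption, not the real `process_text_for_tts`):

```python
import hashlib


def cache_key(text: str, model: str, voice: str) -> str:
    # Stand-in normalization: collapse whitespace and lowercase.
    # The real script delegates to tts_utils.process_text_for_tts.
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(f"{normalized}|{model}|{voice}".encode("utf-8")).hexdigest()
```

With this scheme, texts that differ only in whitespace or case hit the same cache entry, while changing the model or voice produces a new key.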

- Replace static list with ReorderableListView.builder
- Add drag handle icon (GripVertical) that appears on hover
- Connect to existing reorderSelectionActions method in SettingsProvider
- Items can now be dragged and reordered, with order persisted automatically
luosc (Contributor) commented Jan 15, 2026

Does this support Windows?

leyle (Author) commented Jan 15, 2026

> Does this support Windows?

I don't have a Windows machine, so I haven't tested it.
In principle, though, it should work as long as Windows can invoke scripts.
If you have a Windows machine, it would be great if you could help test it.

leyle added 5 commits January 16, 2026 10:55
…ollow-ups

When making follow-up requests after tool calls in streaming mode,
historical messages were being copied with only 'role' and 'content',
stripping tool-related fields. This caused Azure/OpenAI API errors:
'Missing required parameter: input[N].call_id'

Fixed 7 locations:
- chat_api_service.dart: initial body construction (2 locations)
- chat_api_service.dart: streaming follow-up paths (5 locations)
- message_builder_service.dart: add final content after tool messages

All locations now preserve tool_calls, tool_call_id, and name fields.
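The fix described in this commit message can be illustrated with a small Python sketch; the actual change is in Dart (`chat_api_service.dart`), and the field names below come from the commit message itself:

```python
def copy_message(msg: dict) -> dict:
    """Copy a chat message for a follow-up request, preserving tool fields.

    Copying only 'role' and 'content' strips 'tool_calls', 'tool_call_id',
    and 'name', which is what triggered the API error
    'Missing required parameter: input[N].call_id'.
    """
    copied = {"role": msg["role"], "content": msg.get("content")}
    for field in ("tool_calls", "tool_call_id", "name"):
        if field in msg:
            copied[field] = msg[field]
    return copied
```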
- Add scrollOffset field to Conversation model (@HiveField 12)
- Persist scroll position to Hive storage across app restarts
- Save position before switching/creating conversations
- Restore saved position after switching (with delayed retry for reliability)
- Add ChatService.saveScrollOffset to properly handle draft conversations
- Fix analyzer compatibility in ios_switch.dart (numeric separator syntax)
- Add Escape key support to close the image viewer
- Add left/right arrow key navigation to switch between images
- Add left/right arrow buttons on screen for mouse navigation
- Bump version to 1.1.7+24