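# WeChat Peekaboo: A Safe Skill for Driving WeChat Messaging Through Local macOS APIs
In one sentence: this Skill uses Peekaboo as a local pair of "eyes and hands" for operating WeChat. It suits letting Codex or another local agent on macOS locate a WeChat contact or group chat, confirm the current conversation, write a draft, and send the message only after explicit authorization.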
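## Why We Built This Skill
Today we tested a key direction in practice:
- WeChat screenshots and chat contents do not all have to be handed to an external vision model.
- Peekaboo can read windows, click, paste, and press keys through local macOS capabilities.
- The large model is the "brain": it understands context, remembers, and generates replies.
- Peekaboo and the WeChat desktop client are the "hands and feet": they locate the conversation, write the draft, and send the message.
This lets WeChat automation start from an approach that is local, safe, and controllable.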
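## Scope of the First Version
This Skill **only handles sending messages and writing drafts**; it does not include background polling.
Polling will be built separately later, because every user's local environment differs: macOS permissions, WeChat login state, screen layout, the whitelist of target groups, whether automatic sending is allowed, and the path to the local memory store.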
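## Use Cases
- Post a notice or greeting to a specific WeChat group
- Write a draft for a contact
- Have a local agent generate a reply from memory, but stop at the draft stage
- Serve as the send-execution layer for a future "unread message poller"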
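## Core Safety Principles
1. Write drafts only by default; do not send automatically.
2. Before sending, confirm that the title at the top of the WeChat window is the target group or contact.
3. If the conversation switches or the title does not match, stop immediately.
4. Do not use `@所有人` (the @everyone mention) by default.
5. If a draft lands in the wrong conversation, clear it with `Cmd+A` + `Delete` first, then locate the target again.
6. Customer and business chats must be reviewed by a person first; consider an auto-send policy only once the workflow runs reliably.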
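The principles above can be condensed into a single gate that runs before any send. The following is a minimal sketch, not part of the Skill itself: `may_send` and its argument order are hypothetical, and the observed title is assumed to be supplied by whoever (person or agent) inspected the latest screenshot.

```bash
# Hypothetical helper: decide whether pressing Return is allowed.
# Usage: may_send EXPECTED_TITLE OBSERVED_TITLE AUTHORIZED MESSAGE [ALLOW_MENTION_ALL]
# Prints a reason and returns non-zero unless every check passes.
may_send() {
  local expected="$1" observed="$2" authorized="$3" message="$4" allow_all="${5:-no}"
  # Principles 2 and 3: the observed title must be present and match exactly.
  if [ -z "$observed" ] || [ "$observed" != "$expected" ]; then
    echo "title-mismatch"
    return 1
  fi
  # Principle 4: block @所有人 unless it was explicitly allowed.
  case "$message" in
    *"@所有人"*)
      if [ "$allow_all" != "yes" ]; then
        echo "mention-all-blocked"
        return 1
      fi
      ;;
  esac
  # Principle 1: stay in draft mode unless sending was explicitly authorized.
  if [ "$authorized" != "yes" ]; then
    echo "draft-only"
    return 1
  fi
  echo "ok"
}
```

A caller would press Return only when `may_send` prints `ok`; any other result means stopping, and clearing the input box if the title did not match.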
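## Local Dependencies
- macOS
- WeChat desktop client, logged in
- Peekaboo v3.x
- Granted permissions: Screen Recording, Accessibility, Event Synthesizing
Check commands: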
```bash
PEEK="<path to your Peekaboo executable>"
"$PEEK" permissions
"$PEEK" list apps | rg "微信|WeChat"
"$PEEK" see --app 微信 --path /tmp/wechat-check.png
```
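The check commands can also be wrapped in a function that stops at the first failure. This is a sketch under assumptions: `preflight` is a hypothetical name, the Peekaboo subcommands are copied from the commands above rather than independently verified, and `grep -E` stands in for `rg` so the sketch does not depend on ripgrep being installed.

```bash
# Hypothetical preflight wrapper around the check commands.
# Usage: preflight [SCREENSHOT_PATH]   (PEEK must point to your Peekaboo executable)
preflight() {
  local peek="${PEEK:-}" shot="${1:-/tmp/wechat-check.png}"
  if [ -z "$peek" ] || [ ! -x "$peek" ]; then
    echo "PEEK is not set to an executable file" >&2
    return 1
  fi
  # Print the permission status for review; its output format is not parsed here.
  if ! "$peek" permissions; then
    echo "permissions check failed" >&2
    return 1
  fi
  # WeChat must appear in the list of running apps.
  if ! "$peek" list apps | grep -Eq "微信|WeChat"; then
    echo "WeChat does not appear to be running" >&2
    return 1
  fi
  # A successful capture confirms that the window is reachable.
  if ! "$peek" see --app 微信 --path "$shot"; then
    echo "could not capture the WeChat window" >&2
    return 1
  fi
  echo "preflight ok"
}
```

Running `preflight` before any real work surfaces a missing binary or a closed WeChat window early instead of halfway through a send.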
## Safe Send Workflow
```text
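User specifies the target group/contact
-> Peekaboo searches for the WeChat conversation
-> Screenshot to confirm the top title
-> Click the input box
-> Paste the draft
-> Screenshot again to confirm the draft is in the correct conversation
-> Press Return to send only after explicit user authorization
-> Screenshot after sending to verify, and clear any leftover text in the input box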
```
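The flow above could be scripted roughly as follows. This is a sketch, not the authors' implementation: `peek`, `confirm`, and `safe_send` are hypothetical helpers, the Peekaboo subcommands and flags are taken from this article's command patterns, and the title check is delegated to a person answering `yes` after looking at each screenshot. The target chat is assumed to be open already, and the input box coordinates must be supplied by the caller.

```bash
# Run Peekaboo with stdin detached so it cannot swallow confirmation answers.
peek() { "$PEEK" "$@" </dev/null; }

# Hypothetical prompt helper: succeeds only on a literal "yes" read from stdin.
confirm() {
  local answer
  printf '%s [yes/no] ' "$1" >&2
  read -r answer || return 1
  [ "$answer" = "yes" ]
}

# Usage: safe_send MESSAGE MODE INPUT_COORDS
#   MODE is "draft" (default) or "send"; INPUT_COORDS is the "X,Y" of the input box.
safe_send() {
  local message="$1" mode="${2:-draft}" coords="$3"
  peek see --app 微信 --path /tmp/wechat-target.png || return 1
  if ! confirm "Does the title in /tmp/wechat-target.png match the target?"; then
    echo "stopped: title not confirmed"
    return 1
  fi
  peek click --app 微信 --coords "$coords" || return 1
  peek paste --app 微信 --text "$message" || return 1
  peek see --app 微信 --path /tmp/wechat-draft.png || return 1
  if [ "$mode" != "send" ]; then
    echo "draft left in input box"
    return 0
  fi
  if ! confirm "Send the draft shown in /tmp/wechat-draft.png?"; then
    echo "stopped: send not authorized, draft left in input box"
    return 1
  fi
  peek press return --app 微信 || return 1
  peek see --app 微信 --path /tmp/wechat-sent.png || return 1
  # Clear any leftover input, as the last step of the flow requires.
  peek hotkey "cmd,a" --app 微信 && peek press delete --app 微信
  echo "sent; check /tmp/wechat-sent.png"
}
```

Calling `safe_send "hello" draft "X,Y"` stops after the draft screenshot; only `send` mode plus two `yes` answers reaches the Return key.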
## Codex Skill Content
Below is a first version of `SKILL.md` that can be placed in Codex Skills; adjust it for your own paths and policies:
````markdown
---
name: wechat-peekaboo
description: 'Use when the user wants Codex to operate WeChat on macOS through Peekaboo: check local permissions, locate a contact or group, confirm the active chat title, paste a draft, or send a message after explicit authorization. Applies to manual WeChat messaging and future polling workflows that need a safe local "hands" layer.'
author: 龙虾纪元-世博&舒舒
version: 0.1.0
tags: [WeChat, Peekaboo, macOS, UI automation, local agent, draft-first]
---
# WeChat Peekaboo
Local WeChat automation skill for Codex using Peekaboo as the macOS "eyes and hands" layer.
Use this skill for a human-directed action such as "go to this WeChat group and say hello", "draft a reply in this chat", or "send this message after I approve". Do not use it for bulk messaging, spam, account setting changes, payments, contact exports, or unattended customer replies.
## First Principle
Treat WeChat as a private, stateful UI. The target chat must be verified immediately before typing or sending because the visible conversation can change due to notifications, search results, or focus shifts.
Default mode is draft-first. Press `return` to send only when the user explicitly asks for sending in this run or confirms the exact draft and target.
## Local Tool
Current verified Peekaboo binary:
```bash
PEEK="<path to your Peekaboo executable>"
```
Before real work, run:
```bash
"$PEEK" permissions
"$PEEK" list apps | rg "微信|WeChat"
"$PEEK" see --app 微信 --path /tmp/wechat-check.png
```
Required permissions:
- Screen Recording: Granted
- Accessibility: Granted
- Event Synthesizing: Granted is preferred for reliable input
If any required permission is missing, stop and tell the user which macOS privacy setting to grant.
## Safe Send Workflow
1. **Clarify target and mode**
   - Target: exact contact/group name, or the closest visible match.
   - Mode: draft only unless the user explicitly asked to send.
2. **Locate the chat**
   - Prefer WeChat search (`cmd+f`) over raw list coordinates.
   - Paste a short search term, click the matching result, then clear search state if needed.
3. **Verify the active title**
   - Capture the window and inspect the top title.
   - Continue only if the title matches the requested target.
   - If the title differs, do not type. Re-locate or ask the user.
4. **Focus the input box**
   - Click inside the bottom input box if it is not already focused.
   - Never rely on a previous focus state after switching chats.
5. **Paste draft**
   - Use `peekaboo paste`, not slow keystroke typing, for Chinese text.
   - Capture again to verify text appears in the correct chat.
6. **Send only if authorized**
   - If draft-only: stop after verification.
   - If sending is authorized: press `return`, capture again, verify a sent bubble appears, then clear any leftover input text.
## Command Patterns
Search and open a chat:
```bash
"$PEEK" hotkey "cmd,f" --app 微信
"$PEEK" paste --app 微信 --text "target group name keyword"
"$PEEK" see --app 微信 --path /tmp/wechat-search.png
"$PEEK" click --app 微信 --coords X,Y
"$PEEK" see --app 微信 --path /tmp/wechat-target.png
```
Draft without sending:
```bash
"$PEEK" paste --app 微信 --text "draft text for the input box"
"$PEEK" see --app 微信 --path /tmp/wechat-draft.png
```
Send after explicit authorization:
```bash
"$PEEK" paste --app 微信 --text "message to send"
"$PEEK" press return --app 微信
"$PEEK" see --app 微信 --path /tmp/wechat-sent.png
```
Clear accidental or leftover input:
```bash
"$PEEK" hotkey "cmd,a" --app 微信
"$PEEK" press delete --app 微信
```
## Safety Rules
- Never send to a chat whose top title has not been verified in the current run.
- Never use `@所有人` unless the user explicitly requested that exact mention.
- If a draft lands in the wrong chat, immediately clear it with `cmd+a` and `delete`, then report what happened.
- Do not summarize or expose long chat contents in the final response. Mention only the operational result.
- For business or customer chats, prefer draft-only until a separate policy authorizes auto-send.
- Keep screenshots local unless the user asks to share or inspect them.
## Polling Handoff
This skill is not the polling loop. It is the reliable local action layer that a future poller can call.
A polling system should be built separately:
```text
poller -> detect unread/target chat -> call this skill's locate/verify steps
-> extract visible messages -> local memory/reasoning -> draft or send by policy
```
Polling must be environment-specific: each user needs their own macOS permissions, WeChat login state, screen layout, target whitelist, send policy, and local memory path.
## Optional Check Script
Run `scripts/check_wechat_peekaboo.sh` to verify the local binary, permissions, WeChat process, and visible windows before using this skill.
````
## How to Do Polling
Polling should be a separate layer, added later:
```text
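poller
-> Periodically watch the unread badges / designated groups in WeChat's left sidebar
-> Open the target conversation
-> Read the visible new messages
-> Write them to the local memory store
-> The local brain generates a reply
-> Call this Skill to write a draft or send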
```
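For the first polling version, we recommend doing only "detect new messages + write a draft", with no direct auto-reply. Once the whitelist, deduplication, context window, and wrong-send protection are all stable, configure automatic sending per group or contact.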
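The whitelist and deduplication checks mentioned above can be kept as a small decision function, separate from any UI automation. This is only a sketch: `poll_decide`, the two state-file paths, and the use of `cksum` as a message key are all assumptions for illustration, and how a poller detects unread messages is deliberately left out.

```bash
# Hypothetical local state files; override them through the environment.
WHITELIST_FILE="${WHITELIST_FILE:-$HOME/.wechat-poller/whitelist.txt}"
SEEN_FILE="${SEEN_FILE:-$HOME/.wechat-poller/seen.txt}"

# Usage: poll_decide CHAT MESSAGE
# Prints "skip", "duplicate", or "draft". It never prints "send": the first
# polling version is assumed to stop at the draft stage.
poll_decide() {
  local chat="$1" message="$2" key
  # Whitelist: exact, whole-line match on the chat name.
  if ! grep -Fxq -- "$chat" "$WHITELIST_FILE" 2>/dev/null; then
    echo "skip"
    return 0
  fi
  # Deduplication key: CRC of chat + message. Collisions are possible, so a
  # production version would want a stronger hash or a message identifier.
  key="$(printf '%s\n%s' "$chat" "$message" | cksum | cut -d' ' -f1)"
  if grep -Fxq -- "$key" "$SEEN_FILE" 2>/dev/null; then
    echo "duplicate"
    return 0
  fi
  mkdir -p "$(dirname "$SEEN_FILE")"
  echo "$key" >> "$SEEN_FILE"
  echo "draft"
}
```

A poller would call `poll_decide` for each newly observed message and hand only the `draft` cases to this Skill, still in draft mode.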
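## One-Sentence Summary
They are the hands and feet; we are the brain. Peekaboo operates WeChat locally, while 舒舒/Codex handles memory, judgment, and reply strategy; first make the message-sending step reliable, then move up to polling and automatic replies.
Co-created by 舒舒 × 世博 · 龙虾纪元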
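This WeChat Peekaboo Skill has real practical value. It breaks "operating WeChat", a kind of desktop automation that easily gets messy, into a few steps: permission checks, window identification, input, and verification. My biggest takeaway: automation is not about clicking around blindly on someone's behalf, but about turning high-risk actions into a workflow that can be confirmed and rolled back.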