Claude-style computer use on your Mac in 2026 (step by step)
A practical setup for a Mac agent that sees the screen, moves the cursor, types into real apps, and finishes the task without you holding its hand.
Install Whisply, grant Screen Recording and Accessibility, enable Computer Use on Pro Undetected, then press Cmd+Return and tell the agent what to do.
- Mac-native menu bar agent, summoned with Cmd+Return, no browser extension and no cloud VM in the loop.
- Computer Use mode clicks, scrolls, and types into the real macOS apps you already have open.
- On Pro Undetected, the overlay stays out of screen sharing and screen recording while it works.
Whisply runs the agent against the actual macOS apps on your machine, not a sandboxed Linux desktop in a data center, so it can move a file in Finder, send a Slack message from your account, or fill a Numbers sheet using the same Cmd+S you would.
Why the usual approaches fall short
- Cloud sandbox demos run in a fake Ubuntu desktop with no access to your logged-in Mac apps, so they cannot finish a real task in Mail, Calendar, or Slack.
- Python scripts driving the official sample need Accessibility granted to Terminal, depend on a fragile screenshot pipeline, and break the next time macOS changes the permission dialog.
- Browser extensions can only see one tab at a time and cannot move a file in Finder, rename a folder, or touch any native Mac app.
- Generic remote-control tools have no model in the loop, so you still have to write every click yourself instead of describing the outcome.
- Mobile-first AI assistants ignore the keyboard-driven reality of macOS work, where the win is automating Cmd+Tab and Cmd+S, not asking Siri a question.
Step by step
- 1. Install Whisply and sign in
Download the Mac app from the Whisply site, drag it to Applications, and open it. Whisply runs on macOS 13 Ventura and later on both Apple Silicon and Intel. Sign in with email; the model access is included with your plan so there is no API key to paste.
- 2. Upgrade to Pro Undetected to turn on Computer Use
Computer Use is a Pro Undetected feature. From the Whisply menu bar icon, open Settings, then Billing, and pick Pro Undetected. Billed annually, it is $44.99 per month; billed monthly, it is $149.99 per month. After the upgrade, a Computer Use toggle appears in the main overlay.
- 3. Grant Screen Recording and Accessibility
Whisply prompts for both permissions on first launch. Open System Settings, then Privacy & Security, then Screen & System Audio Recording (labeled Screen Recording on older macOS versions), and turn on Whisply. Then go to Privacy & Security, then Accessibility, and turn on Whisply there too. Quit and reopen the app so both permissions take effect.
- 4. Summon the overlay with Cmd+Return
From anywhere on your Mac, press Cmd+Return. The Whisply overlay appears on top of whatever you are doing. Switch the mode selector to Computer Use. The overlay stays out of screen sharing and screen recording on Pro Undetected by default.
- 5. Describe the task in plain English
Type or speak what you want done. Be specific about apps and files. For example: open the three PDFs at the top of my Downloads folder, pull the invoice totals into a new Numbers sheet, save it to iCloud Drive in the Invoices folder. Press Return.
- 6. Watch the first run, then step away
The agent narrates its actions in the overlay as it clicks and types. Move the trackpad or hit Esc to stop it at any point. Once you have watched the same kind of task succeed twice, you can switch to another window and let it finish.
- 7. Tighten scope for anything sensitive
For tasks that send email, move money, or touch shared docs, add a stop rule. Tell the agent to draft and pause before sending, or to ask before overwriting any file. Keep risky runs in a dedicated folder until you trust the pattern.
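The draft-and-pause rule in step 7 can be pictured as a small gate in front of risky actions. This is a minimal sketch, not Whisply's implementation: the `Action` type, the `RISKY_KINDS` names, and the `approve` callback are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass

# Hypothetical action kinds that should never run without a human "yes".
RISKY_KINDS = {"send_email", "move_money", "overwrite_file", "edit_shared_doc"}


@dataclass(frozen=True)
class Action:
    kind: str    # what the agent wants to do, e.g. "send_email"
    detail: str  # human-readable description shown to the user


def run_with_stop_rule(actions, approve):
    """Run actions in order, pausing before any risky one.

    `approve(action)` returns True to continue or False to stop.
    Returns (completed_actions, action_paused_at_or_None).
    """
    done = []
    for action in actions:
        if action.kind in RISKY_KINDS and not approve(action):
            return done, action  # stop here; the draft stays unsent
        done.append(action)
    return done, None


plan = [
    Action("type_text", "draft reply to Bob"),
    Action("send_email", "send reply to Bob"),
    Action("click", "close the Mail window"),
]
done, paused_at = run_with_stop_rule(plan, approve=lambda a: False)
print([a.kind for a in done], paused_at.kind)  # ['type_text'] send_email
```

The point of the design is that the safe default is to stop: if nobody approves, the draft exists but nothing has been sent.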
What a computer-use agent on Mac actually does
A computer-use agent looks at the pixels on your screen, decides what to click, and then clicks it. That is the whole trick. Where a chatbot writes you a paragraph about how to rename 200 files, a computer-use agent opens Finder, selects the folder, runs the rename sheet, and watches the count tick up. The work happens in your real apps with your real account, not in a research demo of a fake browser.
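The look, decide, click cycle described above can be sketched as a short loop. This is an illustration under stated assumptions, not how Whisply is built: `observe`, `decide`, and `act` are hypothetical callables standing in for screenshot capture, the model, and input control, and the "screen" here is a toy counter rather than real pixels.

```python
def run_agent(observe, decide, act, max_steps=20):
    """Generic observe -> decide -> act loop.

    observe() returns the current view of the screen,
    decide(view) returns the next action or None when the task is done,
    act(action) performs it. max_steps guards against endless loops.
    """
    history = []
    for _ in range(max_steps):
        view = observe()
        action = decide(view)
        if action is None:
            break
        act(action)
        history.append(action)
    return history


# Toy stand-in for "rename files until none are left".
state = {"remaining": 3}
history = run_agent(
    observe=lambda: dict(state),
    decide=lambda view: "rename_next" if view["remaining"] > 0 else None,
    act=lambda action: state.update(remaining=state["remaining"] - 1),
)
print(history)  # ['rename_next', 'rename_next', 'rename_next']
```

Each pass re-reads the screen before choosing the next action, which is what lets an agent notice that a dialog appeared or that a click missed.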
On a Mac, that means three things have to be true. The agent needs Screen Recording permission so it can see what is in front of you. It needs Accessibility permission so macOS will let it move the cursor and send keystrokes. And it needs a model good enough to read a noisy Retina screenshot, find the small button it wants, and not click the wrong thing. Whisply covers all three in one app: it requests both permissions through the standard macOS prompts and comes with the model already wired up, so you do not pay a separate API bill or stitch SDKs together.
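If you ever need to jump straight to the two permission panes, a small helper can build the links. This is a sketch that relies on the `x-apple.systempreferences:` URL scheme and the `Privacy_ScreenCapture` and `Privacy_Accessibility` anchors. Many Mac apps use these, but treat them as an assumption, since they may change between macOS releases. The function names are hypothetical.

```python
import subprocess
import sys

# Anchor names are an assumption based on common usage, not a guaranteed API.
_PANES = {
    "screen_recording": "Privacy_ScreenCapture",
    "accessibility": "Privacy_Accessibility",
}


def privacy_pane_url(pane):
    """Build the System Settings deep link for a privacy pane."""
    anchor = _PANES[pane]
    return f"x-apple.systempreferences:com.apple.preference.security?{anchor}"


def open_privacy_pane(pane):
    """Open the pane with the macOS `open` command; no-op on other systems."""
    if sys.platform != "darwin":
        return False
    subprocess.run(["open", privacy_pane_url(pane)], check=True)
    return True


print(privacy_pane_url("accessibility"))
```

Opening the pane only saves a few clicks. The toggle itself still has to be flipped by you, which is how macOS intends it.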
The result is the kind of help that used to require a junior assistant. You describe the outcome, the agent does the clicking, you keep working in another window. When something looks off, you take the cursor back. There is nothing to deploy and nothing to babysit.
Why most setups fall apart on macOS
Most computer-use demos online run inside a virtual Ubuntu desktop in a browser tab. They are great for screenshots and useless for the thing you actually want to do, which is move money in your banking app, reschedule a calendar invite in your real Calendar, or clean up your Downloads folder. The moment you try to bridge a cloud sandbox back to your laptop, you end up wiring tunnels, copying cookies, and explaining to the agent that no, the Chrome inside its container does not have your logged-in tabs.
The other common path is a Python script that drives the official Anthropic computer-use sample on your machine. It works, briefly, until macOS asks for Accessibility permission for a Terminal-launched process, until the screenshot resolution does not match what the model expects, until the script crashes mid-task and leaves a half-renamed folder behind. You spend an evening on plumbing and you still have no UI to pause the agent when it heads somewhere you did not intend.
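The resolution mismatch is worth a concrete picture. If the model looks at a downscaled screenshot, the coordinates it returns have to be mapped back to screen points before any click. A minimal sketch, assuming the screenshot covers the whole display; the function name and the example sizes are illustrative only.

```python
def to_screen_point(model_xy, shot_size, screen_size):
    """Map a point in screenshot space to screen space.

    model_xy:    (x, y) chosen by the model on the screenshot it saw
    shot_size:   (width, height) of that screenshot
    screen_size: (width, height) of the display in points
    The result is clamped so it always lands inside the display.
    """
    mx, my = model_xy
    shot_w, shot_h = shot_size
    screen_w, screen_h = screen_size
    x = round(mx * screen_w / shot_w)
    y = round(my * screen_h / shot_h)
    return (min(max(x, 0), screen_w - 1), min(max(y, 0), screen_h - 1))


# The model saw a 1280x800 image of a 1440x900-point display.
print(to_screen_point((640, 400), (1280, 800), (1440, 900)))  # (720, 450)
```

Skip this step and every click lands a little up and to the left of its target, which is exactly the kind of bug that makes a home-grown script feel haunted.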
Whisply is the version that treats macOS as the target, not as a footnote. The agent is a real Mac app with a menu bar icon, a hotkey, a pause button, and a transcript you can scroll. Permissions are requested the way macOS expects. Screenshots come from the system, not from a screen-capture hack that breaks on the next OS update.
What you can hand to it on day one
Start with the chores that are too small to script and too annoying to do by hand. Sort the screenshots on your desktop into a folder by month. Open the last three PDFs in Downloads, pull the totals into a Numbers sheet, and save it to a specific iCloud folder. Reply to every unread message in a Slack channel with a one-line acknowledgment and a thumbs up. The agent will narrate what it is doing in the overlay, so you can watch it work the first few times and then stop watching.
Move up to multi-app workflows next. Take the invoice PDF that just landed in Mail, drop the line items into a Google Sheet in Chrome, then go back to Mail and reply with a confirmation. Open a research paper in Preview, summarize it into Notes, and start a draft in Mail to your advisor with the summary pasted in. These are exactly the tasks that fall through the cracks between single-purpose tools, and they are where a computer-use agent earns its keep.
The third tier is the one people did not expect to like. Hand it the boring stretch in the middle of a project. Renaming exported assets to a style guide. Filling a form a hundred times with rows from a CSV. Walking through a settings checklist on a new install. The agent does not get bored. You get the hour back.
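For the form-filling case, it can help to generate one plain-English task per CSV row before handing anything to the agent. This is a rough sketch: the column names, the template wording, and the `prompts_from_csv` helper are hypothetical, and it only produces text for you to review or paste. It does not call Whisply.

```python
import csv
import io


def prompts_from_csv(csv_text, template):
    """Turn each CSV row into one task description.

    `template` uses {column_name} placeholders matching the CSV header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [template.format(**row) for row in reader]


rows = "name,email\nAda,ada@example.com\nGrace,grace@example.com\n"
template = (
    "Fill the signup form with name {name} and email {email}, "
    "then pause before submitting."
)
for prompt in prompts_from_csv(rows, template):
    print(prompt)
```

Keeping each row as its own task makes it easy to spot which one went wrong and rerun only that row.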
Staying in control while the agent runs
A good computer-use agent has a brake pedal and an off switch. In Whisply, moving the trackpad or pressing Esc stops the run instantly. On slower steps, the overlay shows you what the agent is about to do before it does it, so you can intervene before it clicks the wrong Send button. Nothing about the design assumes you want to walk away and trust it blindly on the first try.
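One common way to build that kind of brake is a shared stop flag that the run checks before every step. The sketch below uses Python's `threading.Event`. It is an assumption about the general pattern, not a description of Whisply's internals, and in a real app the flag would be set by a keyboard or trackpad listener rather than by a step.

```python
import threading


def run_until_stopped(steps, stop):
    """Run callables in order, checking the stop flag before each one."""
    completed = []
    for step in steps:
        if stop.is_set():
            break  # user hit the brake; leave the remaining steps undone
        completed.append(step())
    return completed


stop = threading.Event()


def user_presses_esc():
    # Stand-in for an input listener firing while the agent is working.
    stop.set()
    return "esc pressed"


steps = [lambda: "click Reply", user_presses_esc, lambda: "click Send"]
print(run_until_stopped(steps, stop))  # ['click Reply', 'esc pressed']
```

Because the check happens before each step, the step after the interruption ("click Send" here) never runs.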
Scope matters too. Tell the agent which apps it should touch and which files it should not. Hand it a specific folder rather than your whole home directory. For anything that costs money or sends an email to a real human, ask it to draft and pause before committing. Treat it like a sharp intern on day one, then loosen the leash as you watch it get the same job right three times in a row.
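Handing over a specific folder is easy to express as a check. A minimal sketch using `pathlib`; the helper name and the example paths are hypothetical, and this shows the general idea of a folder allowlist rather than a Whisply setting.

```python
from pathlib import Path


def is_in_scope(target, allowed_root):
    """Return True only if `target` is the allowed folder or lives inside it.

    Paths are resolved first so "../" tricks cannot escape the folder.
    """
    target_path = Path(target).expanduser().resolve()
    root_path = Path(allowed_root).expanduser().resolve()
    return target_path == root_path or root_path in target_path.parents


print(is_in_scope("/tmp/agent/invoices/a.pdf", "/tmp/agent"))  # True
print(is_in_scope("/tmp/agent/../secrets.txt", "/tmp/agent"))  # False
```

Resolving both paths before comparing matters: a plain string prefix check would wrongly accept a sibling folder such as `/tmp/agent-other`.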
Because Whisply runs locally on your Mac, the agent is not piping every screenshot to a third-party log server. The model sees what it needs to see in the moment and does not keep a permanent archive of your screen. That matters when the task involves a banking app, a medical record, or a private channel where the rest of the room never agreed to be part of an AI training run.
Where this fits next to a chat model
A chat model is a thinker. A computer-use agent is a doer. The trick is knowing which one to call. If you need a draft, an explanation, or a plan, the chat panel in Whisply gives you that without touching the cursor. If you need the plan executed across three apps, you switch on Computer Use and let the agent take the wheel. Same hotkey, same app, different mode.
The two modes feed each other in practice. You ask the chat side to write the spreadsheet formula you want, then ask the agent to paste it into the right cell in Numbers and drag it down the column. You ask the chat side to summarize a meeting, then ask the agent to file the summary into Notion under the correct project. The split keeps each part honest. Thinking stays cheap and fast. Doing happens in your real apps, with your eyes on it.
Related questions
Is this the same Computer Use that Anthropic showed off?
Same idea, packaged for Mac. Anthropic showed a research demo where a model could control a virtual desktop. Whisply takes that pattern and points it at the macOS you actually use, with a real menu bar app, a hotkey, native permission flows, and a pause button. You do not have to spin up a Docker container, set up a VNC server, or run a sample script from a terminal.
Do I need to bring my own API key?
No. Model access is included in every Whisply plan, including Pro Undetected. You pay one subscription and the computer-use model is wired up out of the box. There is no separate bill from a model provider and no key to rotate.
Will the agent work in apps I have logged into with SSO?
Yes. Because Whisply drives your real Mac, every app that is already signed in stays signed in. The agent uses your Chrome with your cookies, your Slack with your account, your Mail with your mailbox. There is no second browser to authenticate and no session to copy across.
How do I stop the agent if it heads somewhere I did not want?
Move the trackpad or press Esc. The run stops immediately and the overlay shows you what it had planned next. You can also close the overlay with the same Cmd+Return hotkey. Nothing about Computer Use locks you out of your machine while it runs.
Is my screen being sent anywhere?
The model sees the screenshots it needs to plan the next click, and nothing is kept as a permanent archive of your screen. Whisply does not stream your desktop to a third-party recording service. For the privacy details, see the undetectability page.
Will the agent show up in Zoom, Meet, or a screen recording?
No. On Pro Undetected, the Whisply overlay is excluded from screen sharing and screen recording at the system level. Your screen looks normal to the other side of the call while the agent works behind the scenes for you.
What kinds of tasks is it bad at?
Anything that depends on a real-time animation, a drag-and-drop in a custom game engine, or a pixel-perfect selection in a video timeline is still rough. It is also not the right tool for code review, where the chat side of Whisply is faster. Save Computer Use for clicking, typing, and moving things between apps.
Try Whisply free.
Mac only. macOS 13 or later. No bot in your calls.