I’m not convinced that most people want to talk to their computer, even if it’s great for accessibility and scenarios like getting help with apps. “Doctors are taking transcriptions while they’re performing examinations, people use it for searching, and our work with the accessibility community has taught us a lot about how to make voice access and voice typing really valuable,” says Mehdi.
For AI to control a PC and take actions on behalf of the user, it must first be granted access to see what’s on the screen. In recent months, Microsoft has been testing Copilot Vision, a feature that can scan everything on your screen and coach you through using apps or answer questions about photos and documents.
Copilot Vision is now rolling out worldwide in all markets where Copilot is available, and it will let you get help using apps, troubleshoot PC problems, learn new tasks, and even get step-by-step guidance in games. Unlike the Recall feature that automatically takes a snapshot of your PC, Copilot Vision is an opt-in feature where you essentially stream what you’re seeing on your screen much like you would in a Teams call.
The next step beyond Copilot Vision is Copilot Actions, allowing Microsoft’s AI assistant to take actions on a local PC, like making edits to a folder full of photos. Microsoft is starting to test these actions on Windows PCs through a preview program, limited to a narrow set of use cases while Microsoft optimizes the AI model.
“In the beginning you might see the agent make some mistakes, or encounter some challenges when trying to use some really complex applications,” explains Navjot Virk, corporate vice president of Windows Experiences. An AI agent making mistakes using a computer doesn’t fill me with confidence, which is probably why Microsoft is limiting this to Copilot Labs for now. “We’re absolutely committed to learning from how people use it, and we want to continue to improve the experience to make it more capable and streamlined over time,” says Virk.
Copilot Actions launches in a separate Windows desktop “secure and contained environment,” and uses an AI agent to complete the job you’ve asked it to do. You can leave it running in the background while you do something else. It lists all the steps it’s taking, so you can also sit and watch it complete them.
Microsoft is also integrating Copilot into the Windows taskbar, with one-click access to these new Copilot Vision and Voice features. It also has a new integrated search experience to make it faster to find local files, apps, and settings.
After the Recall fiasco last year, I think Microsoft will have a hard time convincing people to trust its Copilot Vision and Copilot Actions features, and an equally challenging time getting people to talk to their PCs. That’s not stopping Microsoft from trying, though. The company is planning to run television ads that highlight these new AI features in Windows 11, with the tagline “meet the computer you can talk to.”
The ads coincide with Windows 10 reaching its end of support earlier this week, and Microsoft is once again promoting Windows 11 PCs that consumers can upgrade to. “We want every person making the move to experience what it means to have a PC that’s not just a tool, but a true partner,” says Mehdi.