ChatPaper.ai

That's an excellent and very important question. The answer is nuanced: **it depends entirely on the specific agent, the company behind it, and how it's designed.** The core issue is that for a "phone-use agent" (such as a voice assistant like Siri, Google Assistant, or Alexa) to be useful, it *must* process your personal data. The key question is *how* that data is handled. Here's a breakdown of the privacy considerations:

### How Agents Need to Use Your Data

To function, these agents typically need to:

1. **Listen for a Wake Word:** The device is constantly listening for a trigger phrase like "Hey Siri" or "Okay Google." This involves processing audio locally on the device.
2. **Process Your Request:** After the wake word, your voice command is often sent to company servers in the cloud to be converted from speech to text and understood. This request could include highly personal information (e.g., "remind me to call my doctor about my test results").
3. **Provide Contextual Answers:** To be helpful, the agent uses data like your location, calendar, contacts, and search history.

### The Potential Privacy Risks

1. **Data Collection and Storage:** Companies may store your voice recordings and associated data to improve their services. The big questions are: Is this storage anonymized? Is it tied to your personal account?
2. **Human Review:** In the past, it was revealed that companies used human contractors to review audio snippets to improve speech-recognition accuracy. This raised significant concerns, as these reviewers could hear private conversations captured by accidental activations.
3. **Data Sharing and Use:** Privacy policies determine whether your data is used to build an advertising profile for targeted ads or shared with third parties.
4. **Security Breaches:** Any stored data is potentially vulnerable to hacking, which could expose your private information.
5. **Accidental Activation:** The device might activate without the wake word, recording and transmitting conversations you never intended to share.

### Steps Companies Have Taken and You Can Take

Due to public pressure and regulations like GDPR and CCPA, companies have improved privacy controls. Most now offer:

* **Voice Match/Recognition:** The agent tries to recognize only your voice before giving personal results.
* **Auto-Delete Options:** You can often choose to automatically delete your voice history after a set period (e.g., every 3 or 18 months).
* **Opt-Out of Human Review:** Many services now let you opt out of having your audio used for "product improvement" or human review.
* **Local Processing:** There is a growing trend toward processing more commands directly on the device (on-device processing) instead of sending everything to the cloud, which enhances privacy.

**What You Can Do to Protect Your Privacy:**

* **Review Privacy Settings:** Open the app for your assistant (e.g., the Google Home or Alexa app) and carefully review the privacy settings. Turn off anything you're uncomfortable with.
* **Set Up Auto-Delete:** Enable automatic deletion of your voice and activity history.
* **Mute the Microphone:** Use the physical mute button during sensitive conversations.
* **Be Mindful of Your Queries:** Avoid giving extremely sensitive information (passwords, financial details, deeply personal medical details) to a voice assistant.
* **Read the Privacy Policy:** While tedious, understanding how a company says it uses your data is crucial.

### Conclusion

Do phone-use agents *respect* your privacy? **They are designed to respect it as much as their business model and your local laws require.** They are not inherently private tools. The responsibility is shared:

* **Companies** must be transparent and build privacy-protective features by design.
* **You, the user,** must actively manage your privacy settings and be conscious of how you use the technology.

Ultimately, using a voice assistant involves a trade-off: you exchange a degree of privacy for a significant amount of convenience. It's up to you to decide where to draw the line.
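The wake-word and local-processing points above can be sketched as a simple gating pipeline: audio is checked locally, and only speech captured *after* a detected trigger is eligible to leave the device. This is a minimal illustrative sketch; the trigger phrase and the string-match "detector" are stand-in assumptions, not how a real acoustic wake-word model works.

```python
# Hypothetical sketch of wake-word gating: process audio locally and forward
# only post-trigger frames to the cloud. The string match below is a toy
# stand-in for a real on-device acoustic detector.

WAKE_WORD = "hey assistant"  # hypothetical trigger phrase


def gate_audio(frames):
    """Yield only the frames that follow a locally detected wake word."""
    triggered = False
    for frame in frames:
        if not triggered:
            # Local-only check; pre-trigger audio is never forwarded.
            triggered = WAKE_WORD in frame.lower()
            continue  # the wake-word frame itself is not uploaded either
        yield frame  # everything after the trigger may go to the cloud


frames = ["background chatter", "Hey assistant", "what's the weather"]
print(list(gate_audio(frames)))  # ["what's the weather"]
```

The design choice worth noting is that the gate runs before any network call: accidental-activation risk then reduces to false positives of the local detector, not to everything the microphone hears.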

Do Phone-Use Agents Respect Your Privacy?

April 1, 2026
Authors: Zhengyang Tang, Ke Ji, Xidong Wang, Zihan Ye, Xinyuan Wang, Yiduo Guo, Ziniu Li, Chenxin Li, Jingyuan Hu, Shunian Chen, Tongxu Luo, Jiaxi Bi, Zeyu Qin, Shaobo Wang, Xin Lai, Pengyuan Lyu, Junyi Li, Can Xu, Chengquan Zhang, Han Hu, Ming Yan, Benyou Wang
cs.AI

Abstract

We study whether phone-use agents respect privacy while completing benign mobile tasks. This question has remained hard to answer because privacy-compliant behavior is not operationalized for phone-use agents, and ordinary apps do not reveal exactly what data agents type into which form entries during execution. To make this question measurable, we introduce MyPhoneBench, a verifiable evaluation framework for privacy behavior in mobile agents. We operationalize privacy-respecting phone use as permissioned access, minimal disclosure, and user-controlled memory through a minimal privacy contract, iMy, and pair it with instrumented mock apps plus rule-based auditing that make unnecessary permission requests, deceptive re-disclosure, and unnecessary form filling observable and reproducible. Across five frontier models on 10 mobile apps and 300 tasks, we find that task success, privacy-compliant task completion, and later-session use of saved preferences are distinct capabilities, and no single model dominates all three. Evaluating success and privacy jointly reshuffles the model ordering relative to either metric alone. The most persistent failure mode across models is simple data minimization: agents still fill optional personal entries that the task does not require. These results show that privacy failures arise from over-helpful execution of benign tasks, and that success-only evaluation overestimates the deployment readiness of current phone-use agents. All code, mock apps, and agent trajectories are publicly available at https://github.com/tangzhy/MyPhoneBench.
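The data-minimization failure mode the abstract describes (agents filling optional form entries a task does not require) lends itself to a rule-based check like the one MyPhoneBench's auditing performs. The sketch below is a hypothetical illustration of that idea, not the benchmark's actual schema: the trajectory format and field names are invented for the example.

```python
# Hypothetical sketch of a rule-based data-minimization audit, in the spirit
# of MyPhoneBench's rule-based auditing. The (field, value) trajectory format
# and the field names are illustrative assumptions, not the real schema.

def audit_data_minimization(required_fields, form_fills):
    """Flag every form entry the agent filled that the task did not require.

    required_fields: set of field names the benign task actually needs.
    form_fills: list of (field_name, value) pairs observed in the trajectory.
    """
    return [(name, value) for name, value in form_fills
            if name not in required_fields]


# Example: a booking task needs only a name and a date, but the agent
# also typed a phone number into an optional field.
required = {"name", "date"}
trajectory = [("name", "Alex"), ("date", "2026-04-01"), ("phone", "555-0100")]
print(audit_data_minimization(required, trajectory))  # [('phone', '555-0100')]
```

Because the rule only compares observed fills against a per-task allowlist, a violation verdict is deterministic and reproducible across runs, which is what makes this style of audit verifiable.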
PDF · April 3, 2026