Akshay 🚀's Twitter Thread

MCP security is completely broken! Let's understand tool poisoning attacks and how to defend against them:

MCP allows AI agents to connect with external tools and data sources through a plugin-like architecture. It's rapidly taking over the AI agent landscape with millions of requests processed daily. But there's a serious problem... 👇

1️⃣ What is a Tool Poisoning Attack (TPA)? When Malicious instructions are hidden within MCP tool descriptions that are: ❌ Invisible to users ✅ Visible to AI models These instructions trick AI models into unauthorized actions, unnoticed by users.

Here's how the attack works: AI models see complete tool descriptions (including malicious instructions), while users only see simplified versions in their UI. First take a look at this malicious tool:

Let me quickly show the attack in action by connecting this server to my cursor IDE. Check this out👇

Now let's understand a few other ways these attacks can happen and then we'll also talk about solutions...👇

2️⃣ Tool hijacking Attacks: When multiple MCP servers are connected to same client, a malicious server can poison tool descriptions to hijack behavior of TRUSTED servers. Here's an example of an email sending server hijacked by another server:

Take a look at these two MCP servers before we actually use them to demonstrate tool hijacking. `add()` tool in the second server secretly tries to hijack the operation of send email tool in the first server.

Let's see tool hijacking attack in action, again by connecting the above two servers to my cursor IDE! Check this out👇

3️⃣ MCP Rug Pulls ⚠️ Even worse - malicious servers can change tool descriptions AFTER users have approved them. Think of it like a trusted app suddenly becoming malware after installation. This makes the attack even more dangerous and harder to detect.

🛡️Mitigation Strategies: - Display full tool descriptions in the UI - Pin (lock) server versions - Isolate servers from one another - Add guardrails to block risky actions Until security issues are fixed, use EXTREME caution with.

Finally, here's a summary of how MCP works and how these attacks can occur. This visual explains it all. I hope you enjoyed today's post. Stay tuned for more! 🙌

If you found it insightful, reshare with your network. Find me → @akshay_pachaar ✔️ For more insights and tutorials on LLMs, AI Agents, and Machine Learning! https://x.com/akshay_pachaar/s...

@akshay_pachaar This is super important. I have seen MCP servers mess with local filesystems. Thanks Akshay.

@_avichawla True, sandboxing MCP server is one of the precautionary measure. Helps avoid messing with local files.

@akshay_pachaar It's so funny how we secure the front facing access to most apps with complex passwords, 2FA codes, FIDO keys, email and SMS authentification, fail2ban, etc. Then an MCP server promises us 0.0001% less work and we open the backdoor with 2 clicks. Humanity! 😀

@SingularityAge True! xD

@akshay_pachaar MCP opened new classes of security vulnerabilities. I run mine locally in docker containers which provides some protection. We need new security suites scanning for these types of attacks. Huge business opportunity for whoever can figure it out.

@JacksonAtkinsX Couldn't agree more!

@akshay_pachaar How could that be solved 🤔, maybe an external instruction analyzer 🤔

@Zeeshan3472 Adding client side guardrails is one of the easiest solution i can think of. Context sent to any MCP server must go through a security layer.

@akshay_pachaar Thanks for sharing, very informative post!

@kmeanskaran Glad you found it helpful! 🙌

@akshay_pachaar This is such an early post! In a good way…

@DanThielDOTcom Glad you found it helpful!

@akshay_pachaar Good post!

@Samyak0606 Glad you found it helpful!

@akshay_pachaar Exactly! MCP servers can be malicious, and create unnecessary risk That’s why we released UTCP, an alternative protocol that allows LLMs to call native endpoints WITHOUT needing MCP servers DM to know more https://github.com/universal-t...

@juanvierag Interesting take, i'll check this.

@akshay_pachaar Finally a detailed post on mcp attacks, a very insightful read.

@Sreenad44720197 Glad you found it helpful!

@akshay_pachaar Mcp client tech has to improve

@VirajSharma2000 Absolutely! 💯

@akshay_pachaar wow, this is so valuable. thanks

@filiksyos You’re welcome!

@akshay_pachaar Indeed. Also, tool descriptions and data returned from MCP servers can contain invisible Unicode Tags characters that many LLMs interpret as instructions and AI apps often don't consider removing or showing to user. https://embracethered.com/blog...

Share this thread

Read on Twitter

Navigate thread