• Towards AGI
  • Posts
  • The New AI Hack That Can Trick Your Agents Into Betraying You

The New AI Hack That Can Trick Your Agents Into Betraying You

A structural weakness in autonomy

Here is what’s new in the AI world.

AI news: Prompt injection becomes an enterprise-grade threat

Hot Tea: The rise of consumer-grade autonomous agents

Closed AI: Figure AI’s Helix 02 - The birth of embodied agents

Open AI: Prism - The emerging agentic stack

Prompt Injection Attacks - The Fastest-Growing Threat to Agentic AI

As enterprises accelerate toward autonomous AI, a new class of cyber risk is emerging faster than most security frameworks can adapt: prompt injection attacks.

Once considered an academic edge case, prompt injection has now become one of the most scalable and dangerous attack vectors in modern AI systems, particularly as organizations deploy agentic AI with real system access.

What makes today’s prompt injection attacks uniquely dangerous is their evolution beyond text.

Attackers are now embedding malicious instructions inside images, documents, web pages, emails, and calendar invites, exploiting cross-modal vulnerabilities that dramatically expand the attack surface.

Recent research underscores the scale of the problem: a 2025 benchmark challenge recorded 461,640 prompt injection attempts, including 208,095 unique attack prompts: a signal that this threat is no longer theoretical, but industrialized.

In fact, the risk is now widely recognized.

OWASP ranked prompt injection as the #1 security threat in its 2025 Top 10 for LLM Applications.

The real danger emerges when prompt injection intersects with Agentic AI.

Unlike traditional chatbots, AI agents can take actions, trigger workflows, access multiple tools, and operate across systems with minimal human oversight.

When such agents are manipulated, the consequences extend beyond misinformation into financial fraud, confidential data exposure, system misconfiguration, and reputational damage.

Demonstrations at major security conferences have already shown AI assistants being hijacked to control connected devices, generate convincing phishing alerts, and leak sensitive information.

Thus, for business leaders, the stakes are clear: deploying Agentic AI without a comprehensive security strategy is no longer an option.

Protecting your enterprise against prompt injection attacks requires both technological safeguards and operational discipline, and GenAI itself can be a force multiplier in building these defenses.

1) The first step is preemptive threat modeling. Use GenAI to simulate potential prompt injection scenarios against your own workflows.

By feeding AI agents adversarial prompts in a controlled environment, you can uncover vulnerabilities before attackers exploit them.

2) Next, segregate instruction layers from user inputs. GenAI tools can automatically sanitize and classify incoming data, filtering out suspicious instructions embedded in text, images, or documents.

These AI-driven classifiers can detect subtle manipulations, such as hidden commands in zero-width text or encoded Unicode characters

3) Another critical lever is real-time monitoring and adaptive defense. GenAI can continuously track agent behavior, identify deviations from expected workflows, and correlate anomalies across connected systems.

When suspicious activity is detected, AI-driven alerts can trigger automatic containment measures, such as restricting system access or isolating affected agents, reducing the potential for a cascade of compromised actions.

Platforms like DataManagement.AI take this a step further with their ‘RealTime Alerts & Notifications feature,’ delivering immediate, data-driven alerts to the right stakeholders the moment predefined conditions occur.

By retrieving streaming or near-real-time metrics, fetching stakeholder contact preferences, and mapping alert rules to escalation paths, the system ensures that every alert comes with full context and a clear resolution workflow.

Meet Moltbot: The Crustacean-Inspired AI That Actually Does Things

The latest wave of AI innovation has a surprising mascot: a lobster.

Moltbot, formerly known as Clawdbot, went viral almost overnight, capturing attention for its unusual theme and bold promise: an AI assistant that doesn’t just chat, it acts.

It is designed to manage everyday tasks autonomously, from scheduling meetings and sending messages to checking in for flights.

Thousands of early adopters have embraced it, willing to tackle the technical setup required, despite its origins as a solo project created by Austrian developer Peter Steinberger.

Steinberger previously developed PSPDFkit, but stepped away from coding for three years before returning with Moltbot.

His goal was personal: to build a digital assistant that could handle his day-to-day workflow while exploring the limits of human-AI collaboration.

What started as Clawd, his “crusted assistant,” has now evolved into Molty, the foundation of Moltbot’s publicly available version.

Setting up Moltbot safely currently requires technical savvy: isolating it on a separate virtual server, limiting access to sensitive credentials, and carefully choosing which AI models it uses.

This trade-off between utility and security is emblematic of the broader enterprise challenge as organizations scale Agentic AI.

Moltbot shows that autonomous AI is no longer a novelty, it’s a real, actionable tool, but one that demands careful oversight as it begins to interact with complex digital environments.

Figure AI’s Helix 02 Signals a Leap Toward Truly Autonomous Humanoid Robots

Figure AI has unveiled Helix 02, a major upgrade to its humanoid robotics platform that brings the industry closer to a long-promised goal: robots that can walk, manipulate objects, and adapt in real-world environments without human intervention.

Unlike earlier systems that separated movement, balance, and object handling into distinct control modules, Helix 02 runs on a single, unified neural system that governs the robot’s entire body using only onboard sensors.

Helix 02 completed a four-minute kitchen task end-to-end, walking to a dishwasher, unloading dishes, moving across the room, stacking items, reloading the dishwasher, and starting it, all with no resets and no human oversight.

The robot executed 61 ordered actions while maintaining task context for several minutes, recovering implicitly from minor errors along the way.

This marks Figure’s longest fully autonomous run to date and a tangible milestone in continuous, real-world robot autonomy.

What sets Helix 02 apart is its shift from traditional, rule-based control logic to a learning-driven whole-body model.

At its core is System 0, a neural controller trained on over 1,000 hours of human motion data, replacing more than 100,000 lines of hand-written control code with a single adaptive model.

The architecture layers additional intelligence on top: System 2 interprets scenes and high-level goals, while System 1 converts sensor data into joint-level commands for the entire body.

Together, these layers allow the robot to perceive, decide, and act continuously eliminating the rigid “state machine” approach where robots must stop walking before reaching or manipulating objects.

This mirrors core principles of Agentic AI:

  • Persistent memory of task context

  • Continuous policy execution across perception

  • Actuation, and hierarchical decision-making across abstraction layers

Crucially, Helix 02 replaces brittle state machines and hand-engineered control trees with a learned policy that generalizes across tasks, environments, and failure modes.

The result is not just better robotics, but an embodied instance of agentic intelligence one that plans, simulates outcomes, adapts policies online, and executes multi-step workflows with minimal human oversight.

This convergence of agentic reasoning and physical autonomy points toward a new class of systems that behave less like programmable machines and more like autonomous operators capable of managing complex, real-world processes end to end.

From Concept to Execution: Orchestrating AI Agents with AgentsX

As organizations move beyond single-task automation toward autonomous, multi-step workflows, the challenge is no longer just building AI models. It’s designing agents that can reason, coordinate, and act across real business systems.

AgentsX positions itself as an infrastructure layer for this shift, providing a structured environment to create, deploy, and govern intelligent agents at scale.

Instead of hardcoding brittle rules or stitching together disconnected tools, AgentsX enables teams to architect multi-agent workflows where different AI components specialize in planning, execution, monitoring, and exception handling.

These agents can operate across domains such as banking, insurance, retail, and enterprise operations, translating high-level business intent into continuous, automated action.

By abstracting away low-level orchestration complexity, the platform allows organizations to move faster from experimentation to production.

What starts as a conceptual use case like automating customer onboarding, detecting fraud, optimizing operations, or managing compliance can be turned into a coordinated network of agents that learn, adapt, and improve over time.

OpenAI Prism Signals the Rise of AI-Native Scientific Workspaces

OpenAI’s launch of Prism signals a quiet but meaningful shift in how scientific research is produced: away from fragmented toolchains and toward AI-native research environments where reasoning, writing, and collaboration live in a single computational layer.

Rather than positioning GPT-5.2 as a separate chatbot, Prism embeds the model directly into the document pipeline, allowing it to operate over the full structural graph of a scientific manuscript including LaTeX source, equations, citations, figures, and narrative context at inference time.

This turns GPT-5.2 into a context-aware co-author that can parse paper structure, track logical dependencies across sections, resolve citation graphs, format symbolic math, and suggest revisions based on the evolving state of the entire document rather than isolated prompts.

Because Prism is LaTeX-native, the model interacts with raw source code instead of rendered text, enabling it to reason over mathematical notation, reference labels, bibliographies, and formatting directives with higher fidelity.

This reduces context fragmentation, a long-standing failure mode in AI-assisted writing, where meaning is lost when switching between PDFs, word processors, citation managers, and chat interfaces.

Prism also introduces real-time, cloud-synchronized collaboration at the model layer, meaning multiple researchers can edit, comment, and iterate while the AI continuously updates its understanding of the shared manuscript state.

Thus, the system acts less like a writing assistant and more like an agent embedded inside the research workflow tracking evolving hypotheses, maintaining consistency across revisions, and supporting long-horizon intellectual tasks like argument refinement, literature synthesis, and structural coherence.

Apple’s Q.ai Acquisition Signals the Next Phase of AI-Native Audio and Sensor Intelligence

Apple’s acquisition of Israeli AI startup Q.ai marks more than another strategic talent and technology grab, it signals a deeper push toward embedding machine intelligence directly into the sensory and perceptual layer of consumer devices.

While Apple has not disclosed the financial terms, reports valuing the deal at roughly $1.6 billion suggest this is a materially significant bet on the future of AI-driven audio, speech perception, and biometric sensing.

At its core, Q.ai specializes in applying machine learning to decode subtle audio and visual signals that traditional systems struggle to interpret.

The company has been developing models capable of understanding whispered speech, enhancing degraded or low-signal audio, and interpreting speech in acoustically hostile environments.

Technically, this points toward a shift from classical signal-processing pipelines to multimodal neural inference stacks, where audio is no longer treated as an isolated waveform but as part of a richer sensor fusion problem.

Instead of relying solely on microphones, Q.ai’s use vision-based and biometric cues, including micro-movements in facial skin to infer spoken content, speaker identity, emotional state, and physiological signals such as heart rate and respiration.

This implies a layered perception architecture in which convolutional and transformer-based models operate across synchronized video frames, acoustic spectrograms, and temporal biometric signals.

The resulting system can reconstruct semantic intent even when audio data is incomplete or unreliable .

For Apple, the strategic implications span multiple product layers.

On the consumer side, this technology could dramatically enhance AirPods, iPhones, Vision Pro, and future wearable devices by enabling more accurate voice input, real-time translation, whisper-level speech detection, and context-aware audio enhancement.

Rather than simply amplifying sound, Apple devices could increasingly infer user intent by modeling the physical and emotional state of the speaker, enabling adaptive audio experiences that respond dynamically to stress, fatigue, or environmental noise.

On the platform side, integrating Q.ai’s models into Apple Silicon opens the door to tightly optimized on-device inference.

Apple’s vertical integration spanning custom neural engines, real-time sensor processing, and OS-level privacy controls creates a path to deploying these capabilities without relying on cloud processing.

That matters not only for latency and power efficiency, but also for Apple’s long-standing positioning around privacy-preserving AI, where sensitive biometric and speech data remains local to the device.

There is also a broader pattern emerging.

Q.ai’s CEO Aviad Maizels previously founded PrimeSense, whose depth-sensing technology became foundational to Face ID and Apple’s transition from fingerprint-based authentication to facial recognition.

That precedent suggests Q.ai’s work could similarly become a backbone capability, not a feature-layer experiment, but a structural upgrade to how Apple devices perceive and interpret humans.

This acquisition also moves Apple closer to building systems that don’t just recognize commands, but model users continuously over time.

By combining speech inference, biometric sensing, emotional classification, and environmental awareness, Apple can construct persistent user-state representations, a key ingredient for AI agents that adapt behavior across sessions, anticipate needs, and personalize interactions dynamically.

This is also a step toward embodied intelligence at the edge.

If devices can infer intent from micro-signals, track physiological state, and contextualize audio within broader sensor inputs, they begin to function less like passive tools and more like perceptual agents (systems that monitor, interpret, and respond to human behavior in real time).

Your opinion matters!

Hope you loved reading our piece of newsletter as much as we had fun writing it. 

Share your experience and feedback with us below ‘cause we take your critique very critically. 

How's your experience?

Login or Subscribe to participate in polls.

Thank you for reading

-Shen & Towards AGI team