🤖 Daily Inference

Happy Sunday! This week's AI news has been anything but quiet - and today we're digging into six stories that really matter. We've got OpenAI's most powerful model yet pushing hard toward autonomous agents, Anthropic's Claude pulling off a remarkable security feat inside Firefox, and the messy Pentagon-Anthropic saga heading to court. Plus: a surprisingly capable compact model from Microsoft, North Korean agents using AI to infiltrate Western companies, and Netflix making a bold bet on AI-powered filmmaking. Let's get into it.

🚀 OpenAI's GPT-5.4 Is Here - and It's Gunning for Autonomous Agents

OpenAI has officially launched GPT-5.4, with the company billing it as its best model ever. According to both TechCrunch and The Verge, GPT-5.4 is available in two versions - a standard Pro version and a Thinking version - and represents a significant leap forward not just in raw capability, but in how the model is designed to operate. The Verge specifically highlights that GPT-5.4 marks a big step toward autonomous AI agents, meaning the model is increasingly built to take sequences of actions, make decisions, and complete long-horizon tasks with less human hand-holding.

The Thinking variant appears to bring enhanced reasoning capabilities - similar in concept to the o-series models - directly into the GPT-5.4 product line, giving users a choice between speed and deeper deliberation. This consolidation of reasoning and general capability into a single flagship model is a notable strategic move. Rather than maintaining separate product lines for chat and reasoning, OpenAI seems to be merging these into one unified, highly capable system.

The agent-forward design is the most important signal here. As AI moves from answering questions to actually doing things - browsing, coding, booking, managing workflows - models like GPT-5.4 become the engine powering a whole new category of software. Keep an eye on our OpenAI coverage for all the follow-up as developers start stress-testing this one in production.

🛠️ Anthropic's Claude Found 22 Vulnerabilities in Firefox in Two Weeks

In what may be one of the most striking demonstrations of AI-powered security research to date, Anthropic's Claude discovered 22 vulnerabilities in Firefox over just a two-week period, according to TechCrunch. This is the kind of result that security teams typically spend months or years chasing - and it signals a genuine inflection point for how AI can be deployed in offensive and defensive cybersecurity work.

The significance here goes beyond just the number of bugs found. Firefox is a mature, heavily audited open-source browser with decades of security scrutiny behind it. Finding 22 vulnerabilities in two weeks suggests that Claude is able to reason about complex, real-world codebases at a level that meaningfully augments - or in some cases surpasses - what human researchers can do in equivalent time. This aligns with the broader trend of AI models being applied to vulnerability detection, validation, and patch generation, an area OpenAI is also pursuing with its new Codex Security research preview.

For developers and security professionals, this is both exciting and sobering. The same capability that helps defenders find vulnerabilities can, in the wrong hands, be used to discover and exploit them faster than patches can be deployed. The cybersecurity community will have to move quickly to establish norms around responsible disclosure when AI is doing the hunting.

⚠️ Anthropic vs. the Pentagon: The Battle Heads to Court

The most dramatic ongoing story in AI right now just got more serious. The Pentagon has formally labeled Anthropic a supply-chain risk - a designation that effectively restricts how government contractors can use Claude - and Anthropic has announced it will challenge that label in court, according to TechCrunch. President Trump reportedly described firing Anthropic from the deal in colorful terms, and the company's CEO Dario Amodei has been navigating a difficult position: pushing back against the Pentagon's classification while also reportedly still seeking to salvage some form of working relationship with defense customers.

Meanwhile, Microsoft, Google, and Amazon have all clarified that Claude remains available to their non-defense customers through their respective cloud platforms - a move designed to reassure enterprise clients that the Pentagon dispute doesn't affect commercial availability. This is a crucial distinction: the supply-chain-risk label applies to defense procurement, not the broader market.

The deeper story here is about the peril of chasing federal contracts without fully understanding the bureaucratic and political risks involved. As TechCrunch notes, this is shaping up as a cautionary tale for any AI startup eyeing government deals. The military AI landscape is treacherous terrain - and Anthropic's experience is a real-time case study in how quickly things can unravel. We've been tracking this story closely; see our earlier coverage of the Anthropic-Pentagon standoff for background.

⚡ Microsoft's Phi-4-Reasoning-Vision-15B: Big Brains in a Small Package

Not every powerful AI model needs to be enormous. Microsoft has released Phi-4-Reasoning-Vision-15B, a compact 15-billion-parameter multimodal model designed for math, science, and GUI (graphical user interface) understanding. According to MarkTechPost, the model combines reasoning capability with visual understanding - meaning it can interpret diagrams, charts, and interface elements alongside text, making it genuinely useful for technical and scientific applications.

The 15B parameter size is deliberately lean. While frontier models from OpenAI and Google operate at scales that require massive cloud infrastructure, Phi-4-Reasoning-Vision-15B is built to deliver strong performance at a fraction of the compute cost. This matters enormously for enterprise AI deployments where running a 100B+ parameter model for every query is simply not economical, and for on-device or edge scenarios where bandwidth and compute are constrained.
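To make that scale difference concrete, here's a rough back-of-the-envelope sketch of the memory needed just to hold model weights. The precisions and the 100B comparison point are illustrative assumptions, not official figures for any specific model:

```python
def model_memory_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate GB of memory needed just to store model weights.

    bytes_per_param: 2 for fp16/bf16, 1 for int8 quantization.
    Ignores activations, KV cache, and serving overhead.
    """
    return params_billion * 1e9 * bytes_per_param / 1e9

# Phi-4-Reasoning-Vision-15B at fp16 vs. a hypothetical 100B frontier model
phi_gb = model_memory_gb(15)        # ~30 GB: within reach of a single high-end GPU
frontier_gb = model_memory_gb(100)  # ~200 GB: typically requires multi-GPU serving

print(f"15B @ fp16:  ~{phi_gb:.0f} GB of weights")
print(f"100B @ fp16: ~{frontier_gb:.0f} GB of weights")
```

Even this crude estimate shows why the economics shift: a 15B model can run on hardware an order of magnitude cheaper, which compounds across millions of queries.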

Microsoft's Phi series has consistently punched above its weight - and this new release continues that trend by adding vision capabilities to an already strong reasoning foundation. For developers building scientific tools, educational platforms, or automation systems that interact with software UIs, this is a model worth evaluating seriously. If you're comparing model costs for your use case, our token calculator can help you run the numbers.
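If you want to run those numbers yourself, the core arithmetic of a token-cost comparison is simple. The per-million-token prices below are placeholders for illustration, not real pricing for any model:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in_per_m: float, price_out_per_m: float) -> float:
    """Estimate a single request's cost from per-million-token prices."""
    return (input_tokens / 1e6) * price_in_per_m + (output_tokens / 1e6) * price_out_per_m

# Example: 2,000 input + 500 output tokens at hypothetical rates
small_model = estimate_cost(2_000, 500, price_in_per_m=0.50, price_out_per_m=1.50)
large_model = estimate_cost(2_000, 500, price_in_per_m=5.00, price_out_per_m=15.00)

print(f"Small model: ~${small_model:.5f}/request")
print(f"Large model: ~${large_model:.5f}/request")
```

At high request volumes, that 10x per-request gap is often the deciding factor in choosing a compact model over a frontier one.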

🏢 North Korea Is Using AI to Land Jobs at Western Tech Firms

In a striking warning, Microsoft says North Korean agents are using artificial intelligence to impersonate qualified tech workers and trick Western companies into hiring them, according to The Guardian. The scheme involves using AI-generated profiles, fake credentials, and likely AI-assisted communication to pass job interviews and background checks - allowing North Korean operatives to gain access to corporate systems and sensitive data, and potentially to generate income for the regime.

This isn't an abstract threat. As AI makes it cheaper and easier to generate convincing written communication, professional headshots, and even synthetic voice and video, the traditional signals that hiring teams use to verify identity become less reliable. The problem is compounded by the rise of fully remote work, which removes the in-person verification step that once served as a natural safeguard.

For HR teams and hiring managers, this is a wake-up call to invest in more rigorous identity verification - particularly for remote technical roles with access to sensitive infrastructure. The story also underscores a broader point: AI doesn't just create new capabilities for legitimate users; it lowers the bar for sophisticated deception at scale. This is a story with serious national security implications that will only grow more pressing as AI tools improve.

🎬 Netflix Buys Ben Affleck's AI Filmmaking Startup InterPositive

In a move that signals just how seriously the entertainment industry is taking AI-powered production, Netflix has acquired InterPositive, the AI post-production startup co-founded by Ben Affleck. According to both The Guardian and TechCrunch, the deal brings Affleck's AI-focused filmmaking company inside one of the world's most powerful streaming platforms - giving Netflix access to whatever technology and workflows InterPositive has developed for AI-assisted post-production.

Post-production is one of the most labor-intensive and expensive phases of filmmaking - covering everything from visual effects and color grading to sound mixing and editing. AI tools that can accelerate or automate parts of this pipeline have enormous commercial value, and Netflix, which produces an extraordinary volume of original content, has obvious incentives to reduce per-title production costs. Bringing this capability in-house rather than licensing it externally gives Netflix a potential competitive edge in production efficiency.

The acquisition also speaks to the broader convergence of Hollywood talent and AI technology. It's no longer just tech companies building AI tools for entertainment - now we're seeing creative-industry insiders building and selling those tools. This won't be the last acquisition of its kind. Speaking of building quickly with AI - if you're inspired to launch your own project, 60sec.site lets you build a professional AI-powered website in under a minute. Worth checking out.

💬 What Do You Think?

Claude finding 22 Firefox vulnerabilities in two weeks is genuinely impressive - but it cuts both ways. The same AI capability that helps defenders find bugs faster also gives potential attackers a powerful new tool. Do you think the benefits of AI-powered security research outweigh the risks, or are we moving too fast without the right guardrails in place? Hit reply and tell me what you think - I read every response.

That's your Sunday briefing. A lot is moving at once - from courtrooms to code repositories to film studios - and we'll keep tracking all of it. If a friend or colleague would find this useful, please share Daily Inference with them. See you tomorrow.
