The Signal — March 6, 2026
GPT-5.4 dropped. The Pentagon labeled Anthropic a national security risk. And researchers caught AI coding agents being hijacked through GitHub issues.
GPT-5.4: OpenAI Ships the Kitchen Sink
OpenAI released GPT-5.4 and GPT-5.4-pro yesterday, packaging coding, reasoning, computer use, and a 1-million-token context window into a single model. That context window is the real number here. Up from roughly 128K, it means you can feed entire codebases or document archives straight into a prompt without building retrieval infrastructure around them. The model also ships with native browser and desktop control, matching a capability Anthropic has offered since last year.
The new "Thinking" mode extends chain-of-thought processing for harder problems. OpenAI reports 83% on their internal knowledge benchmark, though self-reported benchmarks from the company selling the model are marketing first, science second. What the number actually measures remains vague.
GPT-5.4 is already live in GitHub Copilot and standard API tiers. The interesting question isn't whether it performs well (it almost certainly does) but whether 1M tokens of context replaces RAG pipelines in production, or whether cost-per-token at that scale keeps it niche.
Sources: OpenAI Blog · Simon Willison · The Decoder · ZDNET · GitHub Changelog
The Pentagon Called Anthropic a National Security Risk
The Department of Defense formally designated Anthropic a "supply-chain risk," the first time an American AI company has received this label from the Pentagon. TechCrunch, The Verge, NPR, and Platformer all confirmed the story independently. The WSJ broke the original reporting.
The designation puts Anthropic in a strange spot: the company that arguably leads on AI safety is now classified alongside foreign technology vendors the military considers unreliable. CEO Dario Amodei plans to challenge the label in court.
The timing makes it weirder. Wired separately reported that the Pentagon tested OpenAI models through Microsoft infrastructure, potentially sidestepping OpenAI's own usage policies. So the DOD is labeling one AI company a security threat while quietly running another company's technology through a backdoor. That contradiction tells you more about the government's fractured AI posture than it does about either company.
Sources: TechCrunch · The Verge · NPR · Wired · Platformer
Your AI Coding Assistant Can Be Hijacked Through GitHub Issues
Security researcher Adnan Khan published a proof-of-concept called "Clinejection" showing how AI coding agents can be compromised through prompt injection hidden in GitHub issue text. The attack targets Cline, a popular AI coding assistant, by embedding malicious instructions in issues that the agent's triager reads. A routine "check the issue queue" workflow becomes a code execution vector.
This is prompt injection moving from theoretical concern to working exploit. Any AI agent that reads untrusted text (issue trackers, emails, documents) is a potential target. The fix isn't straightforward: you'd need to wall off the agent's interpretation of external text from its ability to execute code, and most current architectures don't draw that line cleanly.
Sources: Adnan Khan via Simon Willison
On the Editor's Desk
Three of the council's top five picks yesterday didn't survive the pipeline gate. The White House AI energy pledge (seven tech giants agreeing to self-fund data center power) and CyberStrikeAI (an open-source attack tool that reportedly breached 600+ FortiGate devices across 55 countries) were both sourced by the council from Brave Search but never appeared in our editorial pipeline with sufficient primary sourcing to publish. They're real stories we couldn't verify to our standard.
DeepSeek V4 stays on hold. TechNode said "this week" on March 2. Today is March 6. No official announcement, no API update, no HuggingFace release. The only source our pipeline captured was a commercial real estate investment blog. We'll cover V4 when it ships, not when rumors say it should.
Two arxiv papers worth flagging: Reasoning Theater found that language models sometimes generate chain-of-thought reasoning that doesn't match their actual internal confidence. They already know the answer but produce misleading reasoning traces anyway. And FlashAttention-4 targets Nvidia's Blackwell GPUs with optimized kernel pipelining, relevant to anyone building inference infrastructure at scale.
Kill count today: 54 out of 100 events. YouTube reaction videos, trade show press releases, Wikipedia pages, SEO listicles. The pipeline casts wide. The editor cuts deep.