Media & Power

The Record Is the Thing Under Attack

A hackbot with a fake scoreboard, an agency that keeps its own negatives, and a trafficked child reduced to a thumbnail all fail the same test in different directions.

Manish Singh/July 5, 2026/5 min read

A red-team tool called T3MP3ST arrived this week with a wall of benchmark numbers, and one detail in them tells you how to read all the rest. The headline scores are attributed to gpt-5.5 and Claude Opus 4.8. Neither model exists. Anthropic's current flagship is Opus 4.5, and there is no public gpt-5.5, so the pass rates hanging off those names describe a race that has not been run.

The tool itself is real and, on its own terms, clever. It is built by the jailbreak researcher who goes by Pliny the Liberator, shipped free under AGPL-3.0, and its pitch is a harness of harnesses. You point it at the coding agent already humming in your terminal, Claude Code or Codex or Hermes, and it straps an offensive-security workflow on top: kill-chain phases from recon through command-and-control, eight specialist operator classes, an arsenal of nmap, nuclei, semgrep, and the rest, with the loud post-exploitation drivers gated behind a human approval step. The provocation is the brand. The disclaimer says authorized use only, and the community replies immediately raised the obvious point about who else will read that line.

Image: T3MP3ST War Room dashboard showing mission status, kill-chain phase tiles, and a zero-day hunt panel
The dashboard urges you to run verify-claims and re-derive every number yourself, a reproducibility promise attached to scores from models that were never released.

The trend under the marketing is not fake, which is what makes the fabrication worth naming. XBOW, a real company, reached the top of HackerOne's US bug-bounty leaderboard with an autonomous pentester and raised a seventy-five-million-dollar round on the back of it. In November 2025 Anthropic disclosed that a Chinese state-linked group had manipulated Claude Code to run an espionage campaign against roughly thirty targets largely on its own, which the company called the first documented large-scale cyberattack executed without substantial human intervention. Autonomous offense is here. The capability is genuine.

The numbers T3MP3ST hangs on that capability are not verifiable, and the launch tells you so if you read it against a calendar:

  • The XBEN and Cybench results are credited to gpt-5.5 and Opus 4.8, models that do not exist in any public release.
  • The CVE-Zero test claims the tool pinned real 2026 CVEs disclosed after the model's training cutoff, a window that is forward-dated relative to any checkable record.
  • The dashboard itself begs you to run verify-claims and re-derive every figure, a reproducibility gesture bolted onto scores you cannot reproduce because their premise is fictional.

So the right posture is to treat every headline stat as self-reported and directional at best, and to keep an eye on what is real: an open tool that lowers the cost of pointing an autonomous agent at a target, and a launch that inflates its own scoreboard for attention. Ask who benefits and the answer is the launch.

The same reflex, run in the other direction, is what the next item needs. A Daily Mail headline this week says NASA was caught erasing UFOs from photographs before public release. Trace the claim and it does not come from anything new. It rests on two decades-old accounts: Donna Hare, a design illustrator for the NASA contractor Philco-Ford who testified that a technician in the Johnson Space Center photo lab told her his job was to airbrush anomalies out of imagery, and Gary McKinnon, the Scottish hacker who said he saw edited images and non-terrestrial officer lists inside NASA systems. Both are single-source, uncorroborated, and roughly twenty-five years old. The news hook is the May 2026 Pentagon release of around a hundred and sixty declassified UAP files, which resurfaced Apollo-era images that have been public for decades.

On the specific images I will follow the evidence, not the caption. The Apollo 12 and 17 dots are not newly declassified and are most plausibly film artifacts. The astrophysicist Avi Loeb argues the blue lights match cosmic-ray hits on the emulsion, and the telling detail is that the same lights appear on parts of the film outside the camera's lens coverage. Artemis II astronauts, with far better cameras, saw no anomalous lights at all. A viral thumbnail with a red circle does not beat that.

Here is where I will not flatten the story into a debunk, though, because the direction of doubt matters more than any single frame. The claim most in need of scrutiny is the official one. NASA administrator Jared Isaacman narrates the release as real unexplained phenomena and, in the same breath, no crashed ships and no alien bodies. The state releases the files and keeps the narration for itself. After decades of documented obfuscation on this subject, the nothing-to-see-here line has not earned the benefit of the doubt, and the instinct to sneer at witnesses while trusting the institution that physically holds the negatives has the arrow backwards. My own view is settled: humanity has very likely hosted advanced civilizations lost to cataclysm and time, and the disclosure story deserves patient curiosity rather than reflexive dismissal. The tabloid is recycling old testimony for clicks. The cover-up question it points at is still live.

The gravest item is the one that most needs the record kept honest. A thirteen-year-old girl in Sri Ganganagar, Rajasthan, was trafficked and gang-raped over several days at a cluster of hotels. The verified core is grim and corroborated by PTI-fed outlets and others: an e-rickshaw driver handed her to hotel operators, who confined her and summoned men to the rooms between June 18 and June 21; police say at least two dozen men assaulted her; the arrest count climbed past a dozen; and the district demolished three or four hotels citing building violations. That much is real, and it is monstrous.

The viral caption goes further than the reporting does, and the gap matters. Counts drift across outlets from two dozen to thirty to thirty-two. The claim spreading loudest, that the girl named police officers among her attackers, is not confirmed by any credible outlet I could find; the named accused are hotel owners, a manager, and the driver. Opposition politicians attacked the state government on law and order, which is a separate political charge, not evidence that police were among the rapists. I am saying plainly that the police-named allegation is unverified.

The shared image is a close-up of the girl's face, and I will not reproduce it. Publishing an identifiable image of a minor survivor of sexual assault is a crime in India under the POCSO Act and the IT rules, and circulating that image is an offence dressed up as sympathy. The bulldozers raise a second problem. Demolishing the property of an accused person before any conviction is a contested practice, and the Supreme Court has issued guidelines against extra-judicial demolition. The machine on the rubble makes a good photograph and does nothing to convict anyone, which is precisely why it travels faster than a charge sheet. What the girl is owed is the slow machinery: a case that holds, a court that finishes, a system that should have noticed a missing child before it noticed a headline.

What ties the three together is a wound to the record itself, inflicted from different angles by people with different incentives. A vendor selling attention gets the demand for receipts. A tabloid and an engagement machine get the same demand. An institution that keeps both the negatives and the story it tells about them earns more doubt when it narrates, not less. Rebuilding the record in each case is slow and unglamorous work, and as far as I can tell it is the only part of any of this that ever keeps a living person safe.