How AI Misread a Chemical Incident in Lebanon

Introduction

A single Facebook post by Beirut Today on 4 February 2026 reached over 1.5 million views, while the same content on Instagram accumulated 574,000 likes and 125,000 shares. It described Israel’s aerial spraying over southern Lebanon as “an environmental crime”. Three days earlier, Israel had told United Nations (UN) peacekeepers the same substance was “non-toxic”. Both claims circulated simultaneously, and artificial intelligence (AI) tools were about to make that contradiction much harder to resolve. This article examines how AI language models shaped the information environment around Israel’s aerial spraying of glyphosate in southern Lebanon in February 2026, and what this reveals about the epistemic hierarchies embedded in these tools.

Screenshots of the post by Beirut Today on Facebook and Instagram.

Framing

Words do not simply describe events; they determine which events count as security problems in the first place. Carol Cohn’s foundational analysis of technostrategic language showed how specialised terminology structures what can be thought, said and ultimately acted upon within security communities. Choosing a word to describe a substance (herbicide, toxic agent, chemical weapon) is not a neutral act of classification. It is a political act with immediate consequences for legal accountability, institutional response and public perception.

This dynamic becomes acutely visible in chemical, biological, radiological and nuclear (CBRN) incidents, where classification disputes are not incidental. As King’s Reader in Science & International Security Dr Filippa Lentzos has documented, CBRN disinformation is deliberately constructed to exploit these classification grey zones: to delay response, erode trust and insulate perpetrators from accountability. The Lebanon case fits this pattern as a legible episode within a broader architecture of narrative manipulation around CBRN events.

What is less understood is how AI language models interact with these disputes. Humphreys argues that Large Language Models (LLMs) are trained on human-generated data through processes that amplify dominant preferences and marginalise minority voices. Applied to conflict reporting, this means that current AI tools are more likely to reproduce the narratives of powerful state actors than those of affected local populations. The Lebanon case offers a test of this hypothesis in real time.

UNIFIL map showing the location of Ayta ash-Shaab and the ‘blue line’ (source: https://unifil.unmissions.org/ar/unifil-map-operations).

Case study

The February 2026 incident did not occur in a vacuum. Israeli forces have carried out identical operations in southern Lebanon since October 2023, framing them as security measures to clear areas that could provide cover for militant activity. These operations have caused “more than $700 million in damage and losses” to the country’s agricultural sector. Similar patterns of vegetation destruction have been documented in Syria’s Quneitra region since late January 2026 and across Gaza, establishing a precedent that informed both the Lebanese response and the international information environment surrounding this case.

Week 1

Citing the same justification, Israeli aircraft conducted aerial spraying operations on 1 February 2026 over agricultural land in Ayta ash-Shaab, a village in southern Lebanon near the Blue Line, the de facto boundary between Lebanon and Israel. The following day, the United Nations Interim Force in Lebanon (UNIFIL) confirmed that Israeli forces had notified it in advance, describing the substance as a “non-toxic chemical substance” intended to clear vegetation. Lebanese authorities immediately contested this framing. Within 72 hours, soil samples collected by the Lebanese Ministry of Agriculture identified glyphosate, a widely used herbicide classified as “probably carcinogenic” by the World Health Organisation’s (WHO) International Agency for Research on Cancer, at concentrations 20 to 30 times above recommended safety thresholds.

The classification dispute erupted immediately. While UNIFIL cautiously documented “unknown chemical substances”, Lebanese President Aoun called it “an environmental and health crime” and “a flagrant violation of Lebanese sovereignty”. The NHRC-CPR invoked the Rome Statute, calling for an International Criminal Court (ICC) investigation and describing the act as a potential war crime, crime against humanity and collective punishment. This created a triad of clashing taxonomies, alongside Israel’s “non-toxic” claim and the UN/OHCHR’s “highly toxic herbicide” label, each carrying radically different legal implications. Euro-Med Monitor went further, using the term “lethal chemicals” and framing the incident as a war crime targeting civilian survival.

On social media, the incident spread rapidly across platforms before any institutional investigation had concluded. The Beirut Today posts on Facebook and Instagram are a concrete example, mobilising global audiences around the term “environmental crime”. In the comments on its viral post, users openly questioned the verifiability of the claims: “How can this be independently verified?” and “How do you know its israeli airplane ?”. These comments point to eroding trust and to a public already aware of the epistemic fragility of the information environment.

Week 2

By the second week, two divergent patterns had emerged. In the Lebanese and Arab press, coverage deepened and terms escalated. L’Orient Le Jour intensified the classification on social media, using “écocide” (ecocide) and “violation du droit international” (violation of international law). These terms were stronger than those in its own written press coverage, illustrating how platforms amplify classifications beyond editorial caution. The escalation was visible in real time in an Instagram post by Green Southerners. The post circulated with the hashtags #warcrime #ecocide #southlebanon and showed how local NGO classifications migrated directly into international digital discourse, bypassing traditional media filters entirely.

The New Arab cited multiple field experts, including Ayta ash-Shaab’s mukhtar, MP Najat Saliba and Syrian agriculture directors, who described a “scorched earth policy”. It also noted explicitly that glyphosate “isn’t classified as a chemical weapon under the Chemical Weapons Convention” (CWC), naming the legal gap that enabled Israeli impunity. Lebanon formally announced its intention to file charges before the UN Security Council.

Western institutional actors moved in the opposite direction. The Israel Defense Forces’ (IDF) public communications on X during this period focused exclusively on military operations against Hezbollah, with no mention of chemical spraying. When UK Green Party peer Baroness Bennett raised the incident in a written parliamentary question, the UK government’s response used the terms “herbicide” and “reports”, avoiding any stronger classification and creating a striking distance from Lebanese and UN language. This divergence in language had a concrete effect: it reinforced the asymmetry that would later shape how AI tools processed the incident.

Screenshots from the Green Southerners Instagram post and the IDF post on X.

Week 3

By mid-February, physical evidence began to materialise as the Lebanese Agriculture Minister Hani confirmed visible damage to trees across 400 to 500 hectares, describing the affected zone as a deliberately created buffer strip between Lebanon and Israel, which corroborated the scorched earth narrative of earlier weeks.

On Instagram, Baroness Bennett used cautious language, referring to “Israel glyphosate allegations” and asking whether the government had raised the matter with Israel. This framing implicitly placed the burden on Lebanon to verify rather than on Israel to account for its actions. In the post’s comments, users escalated sharply: “GLYPHOSATE IS A WEAPON OF MASS STARVATION”.

As evidence accumulated over the three weeks, Lebanese and NGO language began to settle on “ecocide”, “attacks on civilians” and “health crime”, but the juridical classification never intensified beyond a legal-diplomatic register, suggesting that institutional framings had already set a ceiling on what could be said. The gap between the documented evidence and official language eroded trust in official sources, with users treating social media classifications as more credible than governmental ones.

Screenshot of the Instagram post by Baroness Bennett, alongside screenshots from a video by L’Orient.

LLMs join the discussion

In May 2026, as part of this research, six major AI language models were prompted with two questions about the incident: (1) “What happened in Ayta ash-Shaab, southern Lebanon, in February 1 2026?”; (2) “What substance did Israel spray over southern Lebanon in February 2026 and how was it classified?” Five out of six (Gemini, ChatGPT, Le Chat, Claude and Perplexity) failed to identify the aerial spraying when asked what happened in Ayta ash-Shaab on 1 February. Instead, they described Israeli military operations, drone activity and stun explosives. The chemical incident was invisible and unnamed.

Only Grok answered correctly on the first prompt, likely because it is trained on data from X, which was saturated with local narratives about the spraying. All models performed better on the second prompt, but defaulted to Western sources such as The Guardian and the BBC, systematically underrepresenting Lebanese and Arabic-language classifications.

Screenshots from ChatGPT, Gemini and Grok.
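For readers who want to repeat this kind of check, the scoring step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used for this article: the marker terms, the function names and the two sample answers are all hypothetical, and the sample answers are invented placeholders rather than real model outputs.

```python
# Hypothetical marker terms. A real audit would need a reviewed,
# multilingual list (for example, Arabic and French equivalents).
SPRAYING_TERMS = ("spray", "herbicide", "glyphosate", "chemical")


def mentions_spraying(response: str) -> bool:
    """Return True if the response contains any marker term (case-insensitive)."""
    text = response.lower()
    return any(term in text for term in SPRAYING_TERMS)


def score_responses(responses: dict[str, str]) -> dict[str, bool]:
    """Map each model name to whether its answer mentions the spraying."""
    return {model: mentions_spraying(text) for model, text in responses.items()}


# Invented placeholder answers, NOT real model outputs.
sample = {
    "model_a": "Israeli forces carried out military operations and drone activity.",
    "model_b": "Israeli aircraft sprayed a herbicide over farmland.",
}

results = score_responses(sample)
missed = [model for model, hit in results.items() if not hit]
print(results)
print(f"{len(missed)} of {len(results)} placeholder answers missed the spraying")
```

Keyword matching is a blunt instrument: it cannot tell whether a model affirms or disputes the spraying, so a human reading of each answer would still be needed.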

Analysis

Who benefits directly from the definition battle?

The answer to that question is not conspiratorial. Israel benefits from the absence of a shared classification, not because it manufactured the dispute, but because the existing legal architecture made ambiguity its default ally. Glyphosate’s explicit exclusion from the CWC’s definition of chemical weapons meant that no binding international mechanism could be automatically triggered. Strategic non-disclosure reinforced this: the IDF issued no comment when contacted by the BBC, while its public X communications during this period focused exclusively on military priorities against Hezbollah, describing the conflict in purely security terms that left no room for a chemical attack to exist. Even Haaretz, arguably Israel’s most critical outlet, covered only military operations during this period.

The original i24 News article justifying the spraying as “weed clearing” could not be retrieved through any archival tool for this blog post. The narrative existed, and it was cited both across multiple sources and by LLMs, but its source was inaccessible, limiting accountability. This is a recognisable pattern: CBRN disinformation does not need to fabricate facts; it can thrive simply by delaying international consensus and exploiting regulatory loopholes. The colonial dimension of this tactic was named by Hisham Younes on one of Green Southerners’ posts, declaring that “the very concept of ‘scorched’ or ‘dead’ land is rooted in a colonial tradition of warfare”. This situates the attack within a longer history of deliberate agricultural destruction used as a tool of displacement, a history that predates and exceeds any single classification dispute.

Rob Nixon’s concept of “slow violence” helps explain why this incident struggled to gain institutional traction. He describes it as violence that resists media attention because it unfolds too gradually to register as breaking news. Unlike spectacular military violence, herbicide-based agricultural destruction occurs across seasons and years, making it harder to archive, to classify and to bring before any legal accountability mechanism. Patrick Wolfe’s analysis of “settler-colonial” elimination through agricultural dispossession adds another dimension: the deliberate destruction of crops and lands as a tool for severing communities from their territory is a constitutive strategy of dispossession operating under the cover of military operations. Together, these frameworks suggest that the classification dispute is a defining feature of how certain forms of violence are rendered invisible, both in media cycles and in the training data that AI systems draw upon.

Is there an epistemic hierarchy?

This case echoes Cohn’s research, which shows that mainstream structures decide which voices are taken seriously. Here, Lebanese official sources communicated through state-run press agencies whose web infrastructure presents technical barriers to deep-linking and external digital archiving. Meanwhile, Western institutional actors, such as the UK government and UNIFIL in its initial framing, adopted the IDF’s own terminology. The result was a hierarchy not of truth but of durability: Israeli framing was more readily retrievable online and therefore more legible to AI systems.

As Humphreys argued, LLMs amplify the preferences of the majority through reinforcement learning from human feedback (RLHF) training processes. For this specific attack on Lebanese territory, five out of six models erased the chemical incident entirely on prompt 1, aligning with the high-volume military reporting prevalent in English-language coverage. This disparity is consistent with that mechanism and suggests that the architecture of training data, not the event itself, determines algorithmic visibility. This is a faithful reproduction of an epistemic hierarchy already encoded, one that LLMs inherited from the information environment and then amplified at scale.

What are the solutions? 

For policymakers, the CWC’s exclusion of glyphosate-class substances creates a governance gap that dual-use actors can exploit. Notification regimes (even voluntary ones) for high-concentration herbicide deployment in conflict zones would create an evidentiary baseline that makes strategic non-disclosure harder to sustain.

For journalists and fact-checkers covering CBRN incidents in active conflict zones, it helps to name semantic conflicts explicitly as competing legal frameworks with different accountability implications. The WHO’s risk communication guidelines offer a model for distinguishing scientific uncertainty from political contestation.

For AI developers and platform auditors, the Grok anomaly is instructive. Developers should audit their models’ performance on non-English-language conflict events and publish transparency reports on source language distribution. The Disinformation Tracker developed by Lentzos’s team offers a model for systematic documentation.
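As a rough illustration of what a source language distribution report could start from, the Python sketch below computes the share of cited sources per language. It assumes each cited source has already been labelled with a language code, which is the hard part in practice. The function name and the sample labels are hypothetical placeholders, not measurements from this study.

```python
from collections import Counter


def language_shares(source_languages: list[str]) -> dict[str, float]:
    """Return the fraction of cited sources in each language."""
    counts = Counter(source_languages)
    total = sum(counts.values())
    if total == 0:
        # No sources cited: nothing to report.
        return {}
    return {lang: count / total for lang, count in counts.items()}


# Placeholder language labels for the sources cited in one model answer.
# These are illustrative only, not data collected for this article.
cited = ["en", "en", "en", "ar"]

shares = language_shares(cited)
for lang, share in shares.items():
    print(f"{lang}: {share:.0%}")
```

Aggregating such shares across many prompts about the same event would give auditors a simple baseline to publish and compare over time.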

For students, the general public and anyone else not covered above: the next time you use an AI tool to research a conflict, ask yourself what language the sources are in and whose perspective is absent. When consuming news about CBRN incidents specifically, take a closer look at which words are used to describe the substance, and by whom. Classification is a political matter.

Conclusion

In early February, Ayta ash-Shaab was sprayed with a substance that was simultaneously “non-toxic”, “highly toxic” and invisible to five out of six AI tools. What this case reveals is a new form of information disorder, and a reminder that what AI can find depends entirely on what humans chose to archive, in which language and from whose perspective. The questions worth asking now are: how do LLM training datasets weight Arabic sources relative to English in conflict reporting? And, perhaps even more crucially, how many CBRN incidents in active conflict zones have already been erased from AI-accessible memory because they were documented in non-dominant languages?

Victoria Gomez

King’s Postgraduate Research Fellow (KPRF) 2026

https://www.linkedin.com/in/victoriagomezspacesecurity/