When a browser decides to skip rendering off-screen content, users notice faster page loads. Web agents notice something else: elements that exist in the DOM but won't render until scrolled into view. The distinction matters when you're running thousands of concurrent sessions and need predictable behavior.
That performance optimization (content-visibility: auto, enabled by default in WebKit in 2024) came from Igalia, where Mario Sánchez Prada coordinates the WebKit team. Igalia is a worker-owned consultancy in Galicia, Spain, and the second-largest contributor to WebKit after Apple. When Igalia contributes browser engine work (13% of all WebKit commits from October 2023 to October 2024, plus major contributions to Chromium and Gecko), it shapes the environment that web automation must navigate. Mario sees how these decisions ripple outward.
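To make the content-visibility case concrete, here is a minimal sketch of how an agent might check whether an element is currently being rendered before trying to read it. It assumes the engine exposes Element.checkVisibility() with the contentVisibilityAuto option. The ElementLike interface, renderState, and scrollIfNotRendered are hypothetical names introduced for illustration, not part of any library.

```typescript
// Structural stand-in for the parts of a DOM element this sketch touches.
// (Hypothetical type; in a browser you would pass a real element.)
interface ElementLike {
  checkVisibility?(options?: { contentVisibilityAuto?: boolean }): boolean;
  scrollIntoView(): void;
}

type RenderState = "rendered" | "not-rendered" | "unknown";

// Classify an element. With contentVisibilityAuto set, checkVisibility() also
// returns false when rendering is skipped because of content-visibility: auto.
// It returns false for other reasons too (for example display: none), so
// "not-rendered" does not prove that content-visibility is the cause.
function renderState(el: ElementLike): RenderState {
  // Engines without checkVisibility: report "unknown" instead of guessing.
  if (typeof el.checkVisibility !== "function") return "unknown";
  return el.checkVisibility({ contentVisibilityAuto: true })
    ? "rendered"
    : "not-rendered";
}

// Ask the browser to bring a non-rendered element into view and return the
// state observed before scrolling. The caller should wait for at least one
// frame before re-checking, because rendering is updated asynchronously.
function scrollIfNotRendered(el: ElementLike): RenderState {
  const state = renderState(el);
  if (state === "not-rendered") el.scrollIntoView();
  return state;
}
```

Returning "unknown" instead of a boolean keeps the engine-support question visible to the caller, which matters when sessions run across several browser versions.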
Upstream Decisions, Downstream Complexity
At the WebKit Contributors Meeting in October 2024, Mario presented work that seems purely about browser performance and security. Yet each of these technical decisions creates operational realities for web agents.
Consider Trusted Types, designed to reduce DOM-based XSS attacks. On pages that enforce it, the browser rejects plain strings passed to injection sinks such as innerHTML unless they have first gone through a policy that sanitizes or vets them. An automation step that previously worked by injecting markup directly into the DOM now has to go through such a policy. For a single page, this is manageable. For web agents handling thousands of different sites, each with its own Trusted Types configuration, it's a new layer of variability to handle reliably.
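As a rough illustration of what "going through a policy" means, the sketch below wraps strings in a Trusted Types policy when the browser provides one and falls back to plain escaping otherwise. The policy name "agent-escape", the escapeHtml helper, and makeHtmlWrapper are assumptions made for this example. A page's Content Security Policy can restrict which policy names are allowed, so createPolicy may throw on sites that do not permit this name.

```typescript
// Minimal structural types so the sketch compiles without Trusted Types typings.
interface HtmlPolicy {
  createHTML(input: string): unknown;
}
interface PolicyFactory {
  createPolicy(
    name: string,
    rules: { createHTML: (input: string) => string }
  ): HtmlPolicy;
}

// Hypothetical sanitizer: escapes markup-significant characters.
// A production agent would more likely plug in a vetted sanitizer here.
function escapeHtml(input: string): string {
  return input
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a wrapper once per page. The policy is created a single time because
// pages may forbid registering the same policy name twice.
function makeHtmlWrapper(factory?: PolicyFactory): (input: string) => unknown {
  // No Trusted Types support: return the escaped string itself.
  if (!factory) return escapeHtml;
  const policy = factory.createPolicy("agent-escape", { createHTML: escapeHtml });
  // With Trusted Types: return the policy's typed value for use at an HTML sink.
  return (input: string) => policy.createHTML(input);
}
```

In a browser, the factory argument would be the global trustedTypes object where it exists. The value returned by the wrapper is what gets assigned to a sink such as innerHTML.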
The Cairo-to-Skia migration for the Linux ports tells a similar story. A different rendering backend means different visual fingerprints, different performance characteristics, and different edge cases in how pages render. Each change is technically sound. Each change also means extraction logic must account for new behavior.
Igalia's position across three major browser engines reveals something important. When an organization implements web standards in Chromium, WebKit, and Gecko, they see how different architectural choices create different automation surfaces. Mario's experience coordinating WebKit work, combined with his previous Chromium involvement, gives him perspective on how these differences play out. There's no single "correct" way to handle browser interaction. The web's complexity is baked into its multiple implementations.
Browser security improvements create the authentication and interaction complexity that web agents must navigate reliably.
The Invisible Architecture
Browser engine work operates mostly out of sight. Igalia's roughly 140 employees work remotely within a flat, worker-owned structure, maintaining critical web infrastructure: WPE WebKit for embedded devices and major Chromium components. They also shape browser behavior at the specification level through the W3C, WHATWG, and ECMA standards bodies.
This standards work matters more than individual implementations. When Igalia contributes to specifications, they're defining behavior that will persist across browser versions and vendors. A decision made at the standards level becomes permanent infrastructure that web agents must navigate for years. Understanding these upstream decisions (rather than just reacting to them) separates infrastructure that adapts from infrastructure that constantly breaks.
We recognize this work's significance because we've encountered its downstream effects. When browser engines implement new security features, authentication flows become more complex to navigate reliably. When rendering behavior changes, web agents must adapt their interaction patterns. Mario's team is making the web more secure, which necessarily means making it more complex to automate.
A worker-owned consultancy in Spain shapes enterprise web automation in ways most people never see. When we encounter a new authentication challenge or an unexpected rendering behavior, understanding that it emerged from thoughtful engineering trade-offs (rather than arbitrary complexity) changes how we build infrastructure. We're building systems that anticipate how browser engines will evolve. Mario's work reminds us that the operational challenges we solve today are shaped by decisions made upstream, often by people most of the industry has never heard of.
Things to follow up on...
- WebDriver BiDi protocol: The next-generation browser automation standard uses WebSockets for two-way communication, addressing limitations of both classic WebDriver and Chrome DevTools Protocol.
- Ladybird browser engine: A completely new engine written from scratch with no code from Blink, WebKit, or Gecko, backed by a non-profit that refuses search deal funding.
- Chrome's security vulnerabilities: Chrome reported over 50 critical vulnerabilities in 2024, creating constant security emergencies that affect enterprise automation reliability.
- Browser engine diversity: For the first time ever, five open-source browser engines are in active development, each with different architectural choices that create different automation surfaces.

