Inside iOS 27's Reworked Stub Islands

At WWDC 2026, Apple announced iOS 27 with performance improvements, but the keynote, as usual, skipped most of the details. Below are a few observations from the disassembly; some may relate to those performance gains, some may not. This article covers only the aarch64 implementation.

How Stubs Worked Before iOS 27

In the dyld_shared_cache, external symbols—whether functions or data references from other libraries—are pre-bound during cache optimization. Pointers to other libraries become rebase operations (or relative offsets) because the precise distances between libraries are known and fixed.

For contrast, here is how dyld typically resolves a dynamic symbol reference in a standalone binary outside the dyld_shared_cache:

__text:

BL              _os_unfair_lock_unlock ; __auth_stubs
__auth_stubs: _os_unfair_lock_unlock

ADRL            X17, _os_unfair_lock_unlock_ptr
LDR             X16, [X17] ; load from GOT entry
BRAA            X16, X17 ; jump to the resolved function

Here _os_unfair_lock_unlock_ptr is an entry in __auth_got. The dynamic linker (dyld) binds and signs the pointer to the actual implementation (__imp__os_unfair_lock_unlock) and stores it in __auth_got.

_os_unfair_lock_unlock_ptr DCQ __imp__os_unfair_lock_unlock

In dyld_shared_cache, most of __auth_stubs and __auth_got are aggregated into stub island pages, which don't belong to any specific binary.

There is still a per-binary __auth_stubs section in the dyld_shared_cache, but you will rarely find cross-references to it, because the branch instructions are rewritten to point to the stub island pages.

iOS still makes heavy use of Objective-C, so there is first-class support for method calls (message dispatch).

Most Objective-C method calls are compiled into a branch instruction that points to stubs in a dedicated section __objc_stubs:

__objc_stubs: _objc_msgSend$URLByAppendingPathComponent_
ADRP            X1, #selRef_URLByAppendingPathComponent_@PAGE ; load from __objc_selrefs
LDR             X1, [X1,#selRef_URLByAppendingPathComponent_@PAGEOFF] ; load selector
ADRL            X17, _objc_msgSend_ptr ; GOT entry
LDR             X16, [X17] ; load _objc_msgSend
BRAA            X16, X17 ; dispatch message

What iOS 27 Changes

Redundant sections are gone

As mentioned, after dyld cache optimization, branch instructions to __objc_stubs are updated to jump to stub island pages, while the unused sections remain in each binary.

On iOS 27 beta, those sections are removed; only the stub island pages in the dyld_shared_cache remain.

Some other sections related to Objective-C are also removed from source binaries, such as __objc_methname, __objc_methtype and __objc_classname. Now the data references (selectors, method types, class names) point to the __OBJC_RO region.

It's worth noting that a while ago, the schema of __objc2_meth_list changed.

__objc2_meth_list contains a list of method information for Objective-C methods, including each method's selector, type encoding, and implementation pointer. The selector field used to be an offset from the field itself. Then the dyld_shared_cache optimizer changed the semantics to use a global base address for selector offsets, which can be found through the cache header:

struct dyld_cache_header {
    // other fields omitted
    uint64_t    objcOptsOffset;         // VM offset from cache_header* to ObjC optimizations header
    uint64_t    objcOptsSize;           // size of ObjC optimizations header
};

This pair points to the ObjCOptimizationHeader struct:

dyld/common/DyldSharedCache.h#L89

struct VIS_HIDDEN ObjCOptimizationHeader
{
    uint32_t version;
    uint32_t flags;
    uint64_t headerInfoROCacheOffset;
    uint64_t headerInfoRWCacheOffset;
    uint64_t selectorHashTableCacheOffset;
    uint64_t classHashTableCacheOffset;
    uint64_t protocolHashTableCacheOffset;
    uint64_t relativeMethodSelectorBaseAddressOffset;

    // Added in version 2
    uint64_t relativeMethodSelectorBufferSize;
    uint64_t relativeMethodTypesBufferSize; // this buffer starts at the end of the selectors buffer
};

So on iOS 26, to get the selector string, you need to add the selector offset to the base address given by relativeMethodSelectorBaseAddressOffset, instead of to the address of the selector field itself. 🤯

In version 2 of the Objective-C optimizations, dyld also applies this same offset schema to method type encoding strings.

The throughline: Apple is actively removing redundant sections.

Rethinking the stub trampolines

We've mentioned two types of stubs: one for cross-module symbols and one for Objective-C method calls.

On iOS 27, the old __auth_stubs style — ADRL and LDR to load a function pointer from the GOT, then BRAA to branch with pointer validation — still exists. But there are two new variants.

The first has neither a memory load nor pointer validation:

_stubs: _open
ADRL            X16, _open
BR              X16

Then here comes an interesting pattern:

ADR             X16, 0x188060060
MOV             X17, #0x9B7
ADD             X16, X16, X17, LSL #21
BR              X16

If you have no clue what it is supposed to do, try simulating the arithmetic instructions.

> X16 = 0x188060060 + (0x9B7 << 21)
      = 0x188060060 + 0x136E00000
      = 0x2BEE60060

0x2BEE60060 is the address of libsystem_m.dylib!_acosl.
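When walking many such stubs in a dump, it helps to script the same arithmetic. The following is a small sketch that emulates the three data-processing instructions; the function name far_stub_target is hypothetical, and it takes the already-resolved ADR operand as shown by the disassembler rather than decoding raw instruction words.

```python
MASK64 = (1 << 64) - 1


def far_stub_target(adr_operand: int, imm16: int, shift: int = 21) -> int:
    """Emulate ADR X16, <adr_operand>; MOV X17, #imm16; ADD X16, X16, X17, LSL #shift."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("MOV immediate must fit in 16 bits")
    x16 = adr_operand                      # ADR X16, ...
    x17 = imm16                            # MOV X17, #imm16
    x16 = (x16 + (x17 << shift)) & MASK64  # ADD X16, X16, X17, LSL #shift
    return x16                             # BR X16 jumps here


print(hex(far_stub_target(0x188060060, 0x9B7)))  # 0x2bee60060
```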

So the first variant can reach ±4 GiB from the stub. ADRL is a pseudo-instruction for an ADRP+ADD pair: ADRP gives a page-granular displacement of ±4 GiB, and ADD fills in the byte offset within that 4 KiB page. It is byte-exact and symmetric in both directions, but capped at 4 GiB.

The second variant exists for everything that lives further than that. Notice that our example target, 0x2BEE60060, sits 0x136E00000 ≈ 4.857 GiB away from the stub — already past what a single ADRP+ADD can encode.

In the instruction ADD X16, X16, X17, LSL #21, the shifted operand imm16 << 21 is a multiple of 2 MiB, ranging up to 0xFFFF << 21 ≈ 128 GiB. ADR supplies a byte-exact address within ±1 MiB of the stub, which fills the gaps between those 2 MiB steps. Add the two together and the stub can reach up to roughly +128 GiB, about 32 times the range of ADRL. Note that the shifted displacement is unsigned, so the range is forward-biased, unlike ADRL, which can reach just as far backward.
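The two ranges suggest a simple selection rule, sketched below. This is a guess at how the operands could be chosen, not the cache builder's actual logic: use ADRL when the target page is within ADRP range, otherwise split the distance into a 2 MiB multiple for MOV and a remainder within ADR range. The function name plan_stub and the stub address 0x188000000 are hypothetical; the article only gives the ADR operand, not the stub's own address.

```python
ADRP_PAGES = 1 << 20   # ADRP: signed 21-bit page count, i.e. +/-4 GiB
ADR_BYTES = 1 << 20    # ADR: signed 21-bit byte offset, i.e. +/-1 MiB
SHIFT = 21             # LSL #21 in the ADD, so imm16 counts 2 MiB units


def plan_stub(stub_pc: int, target: int):
    """Return (kind, adr_operand, imm16) for a stub at stub_pc, or None."""
    page_delta = (target >> 12) - (stub_pc >> 12)
    if -ADRP_PAGES <= page_delta < ADRP_PAGES:
        return ("adrl", target, 0)
    delta = target - stub_pc
    # Round to the nearest 2 MiB step so the remainder fits ADR's +/-1 MiB.
    imm16 = (delta + ADR_BYTES) >> SHIFT
    if not 0 <= imm16 <= 0xFFFF:
        return None                      # too far, or too far backward
    adr_operand = target - (imm16 << SHIFT)
    return ("adr+mov+add", adr_operand, imm16)


kind, adr_operand, imm16 = plan_stub(0x188000000, 0x2BEE60060)
print(kind, hex(adr_operand), hex(imm16))  # adr+mov+add 0x188060060 0x9b7
```

With that assumed stub address, the sketch reproduces the operands from the listing. A backward target more than 4 GiB away returns None, which is where a stub would have to use some other form, such as the GOT-based one.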

The motivation behind this pattern is easy to guess: performance.

This new arithmetic way to encode large offsets eliminates the memory load and pointer validation overhead.

Objective-C trampolines

The Objective-C trampolines now look like this:

ADRL            X1, sel_length
B               _objc_msgSend ; /usr/lib/objc/libobjcMsgSend.dylib

It's much shorter than the previous stub, which needed two loads and an authenticated branch. But wait a second. This branch instruction can only reach ±128 MiB from the current PC. The dyld_shared_cache is several gigabytes, so how can a single objc_msgSend serve every framework?

Write a parser to dump the image list from the cache, and we'll see the answer:

  • /usr/lib/objc/libobjcMsgSend.dylib
  • /usr/lib/objc/libobjcMsgSend1.dylib
  • /usr/lib/objc/libobjcMsgSend2.dylib
  • ...
  • /usr/lib/objc/libobjcMsgSend33.dylib

The optimizer makes dozens of copies of the same objc_msgSend code and distributes them across the cache. Every binary then branches to whichever copy sits within range.
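The range constraint can be sketched as a simple reachability check. B encodes a signed 26-bit word offset, so a call site can reach targets from -128 MiB up to +128 MiB minus 4 bytes. The helper name reachable_copy and the addresses below are made up for illustration; they are not taken from a real cache, and the optimizer's actual placement policy is not known.

```python
B_REACH = 1 << 27  # 128 MiB: B holds a signed 26-bit offset counted in 4-byte words


def reachable_copy(call_site: int, copies: dict):
    """Return the name of the nearest copy a B at call_site can reach, or None."""
    best = None
    for name, addr in copies.items():
        offset = addr - call_site
        in_range = -B_REACH <= offset <= B_REACH - 4 and offset % 4 == 0
        if in_range and (best is None or abs(offset) < abs(best[1])):
            best = (name, offset)
    return best[0] if best else None


# Hypothetical layout: two copies placed 256 MiB apart.
copies = {
    "libobjcMsgSend.dylib": 0x180000000,
    "libobjcMsgSend1.dylib": 0x190000000,
}
print(reachable_copy(0x18F000000, copies))  # libobjcMsgSend1.dylib
print(reachable_copy(0x1A0000000, copies))  # None
```

Under this model one copy covers a window of at most 256 MiB, so the number of copies has to grow with the span of code that calls objc_msgSend.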