Apple lays foundation for mixed reality development with iOS 16 APIs

Without saying a word about it, Apple is gearing up developers to create apps for its long-awaited AR/VR device.

Source: Apple

When Apple’s WWDC 2022 keynote kicked off, the world was eagerly awaiting announcements about the much-discussed mixed reality headset.

At a minimum, developers expected a toolkit similar to the DTK that eased the M1 transition, and there were plenty of rumors about a possible announcement of realityOS. Yet the event passed without a single mention of Apple’s most ambitious project.

What’s even more mysterious is that the RealityKit and SceneKit frameworks received virtually no updates this year. Instead, we were greeted by an M2-based Mac, Stage Manager in iPadOS, an updated iOS, and a significant upgrade to CarPlay.

The mixed reality headset, originally expected in 2020 and then pushed back to 2023, now looks set to slip further to 2024.

In Apple’s defense, cautious progress is understandable. For the new product to reach a wide audience, it needs to be tightly integrated into the ecosystem, and developers need to be drawn into building the metaverse.

Thanks to its success with Continuity, the Apple ecosystem is more unified than ever before. I can’t help but mention the new iOS 16 feature that lets you use your iPhone as a webcam for your Mac; it looks like a dry run for how an Apple headset might work alongside an iPhone.

At the same time, despite the lack of news about the development of realityOS, the iPhone maker is making significant improvements to its APIs and frameworks to prepare developers for the future of mixed reality.

Let’s take a look at some of the new APIs announced during WWDC 2022. Many of them got plenty of attention during the event, but the role they could play in AR/VR development was far less prominent.

Live Text API and updated PDFKit for scanning text from media

With iOS 15, Apple introduced the Live Text feature for extracting text from images. In iOS 16, they’ve taken things a step further by releasing the Live Text API for capturing text from images and video frames. The DataScannerViewController class, shipped as part of the VisionKit framework, lets you configure various scanning settings. Under the hood, the Live Text API uses VNRecognizeTextRequest to detect text.
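As a quick illustration, here is a minimal sketch of presenting the scanner and reading tapped text. The class name and configuration values are my own, and availability checks and error handling are simplified:

```swift
import UIKit
import VisionKit

// A minimal sketch of presenting the Live Text scanner from a view controller.
final class TextScanningViewController: UIViewController, DataScannerViewControllerDelegate {

    func presentScanner() {
        guard DataScannerViewController.isSupported,
              DataScannerViewController.isAvailable else { return }

        // Configure the scanner to look for text only.
        let scanner = DataScannerViewController(
            recognizedDataTypes: [.text()],
            qualityLevel: .balanced,
            recognizesMultipleItems: true,
            isHighlightingEnabled: true
        )
        scanner.delegate = self
        present(scanner, animated: true) {
            try? scanner.startScanning()
        }
    }

    // Called when the user taps a recognized item in the camera feed.
    func dataScanner(_ dataScanner: DataScannerViewController,
                     didTapOn item: RecognizedItem) {
        if case let .text(text) = item {
            print("Recognized text: \(text.transcript)")
        }
    }
}
```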

At first glance, the Live Text API looks like Google Lens on steroids. But think of the possibilities it opens up for Apple’s next big gadget. Imagine turning your head and extracting the information you need just by looking at it. Head tracking was already possible in iOS 15 thanks to AirPods’ spatial awareness, which relies on CMHeadphoneMotionManager. Add the new personalized Spatial Audio in iOS 16, and I can already see the VR mechanics taking shape.
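For context, here is a small sketch of reading AirPods head-tracking data with CMHeadphoneMotionManager; the HeadTracker wrapper class is just an illustration:

```swift
import CoreMotion

// A minimal sketch of reading head-tracking data from AirPods via
// CMHeadphoneMotionManager. Attitude values could drive gaze-like
// interactions of the kind described above.
final class HeadTracker {
    private let manager = CMHeadphoneMotionManager()

    func start() {
        guard manager.isDeviceMotionAvailable else { return }
        manager.startDeviceMotionUpdates(to: .main) { motion, _ in
            guard let attitude = motion?.attitude else { return }
            print("pitch: \(attitude.pitch), yaw: \(attitude.yaw), roll: \(attitude.roll)")
        }
    }

    func stop() {
        manager.stopDeviceMotionUpdates()
    }
}
```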

Similarly, two enhancements to the PDFKit framework, the ability to parse text fields and to convert document pages into images, will go a long way toward creating rich AR lens experiences.
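As an illustration (not the exact WWDC sample), here is how you might read text-field annotations and render a page to an image with PDFKit; the document URL is a placeholder:

```swift
import PDFKit
import UIKit

// A rough sketch of the two PDFKit tasks mentioned above: reading
// text-field annotations and rendering a page as an image.
func inspectPDF(at url: URL) {
    guard let document = PDFDocument(url: url),
          let page = document.page(at: 0) else { return }

    // Walk the widget annotations on the first page and read text-field values.
    for annotation in page.annotations where annotation.widgetFieldType == .text {
        print("Field \(annotation.fieldName ?? "?"): \(annotation.widgetStringValue ?? "")")
    }

    // Render the page into a UIImage, e.g. for use as a texture in an AR scene.
    let pageImage = page.thumbnail(of: CGSize(width: 1024, height: 1448),
                                   for: .mediaBox)
    print("Rendered page image of size \(pageImage.size)")
}
```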

Source: WWDC 2022 Video – PDFKit

To keep Apple’s mixed reality device from becoming just another smartwatch, it’s important to provide a rich set of tools for interacting with text, images, and graphics.

With these two powerful image recognition APIs, I believe the iPhone maker is taking steps in the right direction, a path that leads to AR/VR applications with rich interactive interfaces.

. . .

Speech input and improved speech recognition

Setting text and images aside for a moment, iOS 16 also updated the dictation feature, allowing users to seamlessly switch between voice and touch.

For example, you’re walking down a hallway and want to quickly edit a text message on your phone. In iOS 16, you can simply use your voice to change a piece of text on the spot.

Want more? The Speech framework received a small but welcome improvement: the ability to toggle automatic punctuation on SFSpeechRecognitionRequest through addsPunctuation. I’m optimistic that this will lead to richer communication apps, since it has already found its way into Live Captions during FaceTime calls.
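A minimal sketch of how this can be wired up, with authorization and audio setup omitted for brevity:

```swift
import Speech

// Requesting speech recognition with automatic punctuation (iOS 16+).
func transcribe(audioURL: URL) {
    guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")),
          recognizer.isAvailable else { return }

    let request = SFSpeechURLRecognitionRequest(url: audioURL)
    request.addsPunctuation = true   // new in iOS 16: punctuated transcripts

    recognizer.recognitionTask(with: request) { result, _ in
        if let result, result.isFinal {
            print(result.bestTranscription.formattedString)
        }
    }
}
```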

From a mixed reality perspective, these are huge changes. Using voice to enter text will reduce our reliance on keyboards in the VR world. Apple is also making it easier to integrate Siri into our apps with the new App Intents framework.
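For a taste of App Intents, here is a bare-bones intent; the StartScanIntent name and its behavior are hypothetical:

```swift
import AppIntents

// A minimal App Intent that Siri and Shortcuts can invoke by name.
struct StartScanIntent: AppIntent {
    static var title: LocalizedStringResource = "Start Scanning"

    func perform() async throws -> some IntentResult {
        // Kick off the app's scanning flow here.
        return .result()
    }
}
```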

. . .

More precise control over the user interface

iOS 16 introduced a host of new UI controls, mostly in SwiftUI, the declarative framework that promises to be a one-stop solution for building apps across all Apple platforms.

Among the many SwiftUI updates, I was most interested in the changes to the WidgetKit framework. In iOS 16, developers can create lock screen widgets, and the same code can power complications on watch faces. We can also build Live Activities with WidgetKit to deliver real-time updates to the user.
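Here is a minimal sketch of a widget that opts into the new lock screen families; all type names (StatusEntry, StatusProvider, StatusWidget) are made up for illustration:

```swift
import WidgetKit
import SwiftUI

struct StatusEntry: TimelineEntry {
    let date: Date
    let message: String
}

struct StatusProvider: TimelineProvider {
    func placeholder(in context: Context) -> StatusEntry {
        StatusEntry(date: .now, message: "Hello")
    }
    func getSnapshot(in context: Context, completion: @escaping (StatusEntry) -> Void) {
        completion(StatusEntry(date: .now, message: "Hello"))
    }
    func getTimeline(in context: Context, completion: @escaping (Timeline<StatusEntry>) -> Void) {
        let entry = StatusEntry(date: .now, message: "Hello")
        completion(Timeline(entries: [entry], policy: .never))
    }
}

struct StatusWidgetView: View {
    let entry: StatusEntry
    var body: some View {
        Text(entry.message)
    }
}

struct StatusWidget: Widget {
    var body: some WidgetConfiguration {
        StaticConfiguration(kind: "StatusWidget", provider: StatusProvider()) { entry in
            StatusWidgetView(entry: entry)
        }
        .configurationDisplayName("Status")
        // The accessory families place the same widget on the lock screen.
        .supportedFamilies([.systemSmall, .accessoryCircular,
                            .accessoryRectangular, .accessoryInline])
    }
}
```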

I really believe that widgets will be the future of apps – at least in the AR/VR space, as users will look to consume information and view content without having to open the app.

Along with WidgetKit, we get the new WeatherKit framework to help keep your apps and widgets up to date, for example with current conditions right on the lock screen.
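A quick sketch of fetching the current temperature with WeatherKit (iOS 16+, requires the WeatherKit capability); the coordinates are just a placeholder:

```swift
import WeatherKit
import CoreLocation

// Fetch the current temperature for a fixed location as a formatted string.
func currentTemperature() async throws -> String {
    let location = CLLocation(latitude: 37.3349, longitude: -122.0090)
    let weather = try await WeatherService.shared.weather(for: location)
    return weather.currentWeather.temperature.formatted()
}
```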

Beyond the frameworks, SwiftUI also gained many smaller controls, such as Gauge, which fits neatly into widgets. There is also SpatialTapGesture, a gesture for tracking the location of a touch within a SwiftUI view.
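A small sketch combining the two, with layout values chosen arbitrarily:

```swift
import SwiftUI

// A Gauge plus a SpatialTapGesture that reports the tap location.
struct ControlsDemo: View {
    @State private var progress = 0.4
    @State private var lastTap: CGPoint = .zero

    var body: some View {
        VStack(spacing: 20) {
            Gauge(value: progress) {
                Text("Progress")
            }
            .gaugeStyle(.accessoryCircular)

            Text("Last tap: \(Int(lastTap.x)), \(Int(lastTap.y))")
                .frame(width: 240, height: 120)
                .background(.quaternary)
                .gesture(
                    SpatialTapGesture()
                        .onEnded { value in
                            lastTap = value.location  // touch location in local coordinates
                        }
                )
        }
        .padding()
    }
}
```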

I especially liked the ImageRenderer API, which lets you turn SwiftUI views into images. Combined with the Transferable protocol, dragging media between applications becomes much easier, especially since SwiftUI now has a built-in share sheet control. See how drag and drop makes it easy to lift objects out of photos and share them in other applications:

Source: Apple
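Going back to ImageRenderer for a moment, here is a rough sketch of rendering a SwiftUI view into a UIImage; the badge view and scale factor are arbitrary:

```swift
import SwiftUI
import UIKit

// Snapshot a small SwiftUI view into a UIImage (iOS 16+).
// ImageRenderer must be used on the main actor.
@MainActor
func snapshotBadge() -> UIImage? {
    let badge = Text("Hello, AR")
        .font(.largeTitle.bold())
        .padding()
        .background(.blue, in: Capsule())
        .foregroundStyle(.white)

    let renderer = ImageRenderer(content: badge)
    renderer.scale = 3   // render at @3x for sharpness
    return renderer.uiImage
}
```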

To create interactive applications for a mixed reality headset, our ways of interacting with text, images, voice, graphics, and other forms of media must become more efficient.

I think Apple has made notable moves in that direction, not only with the aforementioned UI and gesture controls, but also with the updated SharePlay API, the new Shared with You framework, and the Collaboration API.

They all look like very promising building blocks for Apple’s long-awaited headset.

. . .

RoomPlan API and Background Assets Framework

The Background Assets framework is another tool that hasn’t received much publicity yet. It was introduced to handle large file downloads in various application states, but I think its potential goes well beyond that utility role.

By downloading 3D assets from the cloud, we can quickly build and ship augmented reality applications with much smaller app bundles.

Likewise, the RealityKit framework has not undergone any major changes. However, Apple quietly introduced the new RoomPlan API.

Powered by ARKit 6 (which received some notable improvements this year), the Swift-only API provides support for scanning rooms and building 3D models of them.

You could think of the RoomPlan API as an extension of the Object Capture API. But viewed through an AR/VR lens, and given that Apple’s mixed reality headset is expected to feature LiDAR sensors and multiple cameras, RoomPlan could be a game changer for developers. Expect plenty of AR apps that let you remodel your house.
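To give a sense of the API surface, here is a condensed sketch of the RoomCaptureView flow; error handling and UI are stripped down, and the export path is a placeholder:

```swift
import UIKit
import RoomPlan

// A minimal RoomPlan scanning flow (iOS 16+, LiDAR-equipped devices).
final class RoomScanViewController: UIViewController, RoomCaptureViewDelegate {
    private var captureView: RoomCaptureView!

    override func viewDidLoad() {
        super.viewDidLoad()
        captureView = RoomCaptureView(frame: view.bounds)
        captureView.delegate = self
        view.addSubview(captureView)

        // Start scanning the room.
        captureView.captureSession.run(configuration: RoomCaptureSession.Configuration())
    }

    // Decide whether to post-process the raw scan once capture ends.
    func captureView(shouldPresent roomDataForProcessing: CapturedRoomData,
                     error: Error?) -> Bool {
        return true
    }

    // Receive the final parametric 3D model and export it as USDZ.
    func captureView(didPresent processedResult: CapturedRoom, error: Error?) {
        let url = FileManager.default.temporaryDirectory.appendingPathComponent("Room.usdz")
        try? processedResult.export(to: url)
        print("Exported room model to \(url)")
    }
}
```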

. . .

Those were the main APIs I think will fit well into the future of mixed reality. Spatial is another new framework worth mentioning: it lets you work with 3D mathematical primitives and can be useful when dealing with graphics in virtual space.

In the end, Apple hasn’t said a single word about its plans for AR/VR headsets, but the new APIs they’ve released this year will play a critical role in bringing all the pieces together to develop the metaverse.

It is important to start preparing developers to build applications for the new ecosystem today. After all, for a product to be widely adopted it needs a mature application ecosystem, and that means getting developers on board.


A public session is coming soon at OTUS where we will look at how to turn an Android app into an iOS app by migrating to KMM, and what pitfalls to expect. Registration link.

We also invite everyone to the webinar “Flux in SwiftUI – the most efficient architecture for 2022?”, where we will discuss:
– The obvious problems of MVVM when building iOS applications with SwiftUI 2.
– Possible extensions to MVVM using the SOA and Coordinator patterns.
– Why most SwiftUI applications are built on the Flux architectural concept.
Registration for the webinar.
