Making VoiceOver More Human-Friendly

My name is Zhenya Tyutyuev, and I am an iOS developer at 2GIS. I want to share how I adapted our application for VoiceOver:

  • a story about a paradigm shift: how I moved from the stage of “doing it because Apple says so” to a completely new one – “doing it for people”;

  • how I developed a new type of snapshot testing so that nothing breaks when new, non-adapted elements are added;

  • several nuances that are important to keep in mind: escape, dynamic accessibility calculation, and enlargement of elements.

I hope my experience will inspire you and help you make your own applications accessible to more users.

What is VoiceOver

VoiceOver is a feature available on all Apple devices: iPhone, iPad, Mac, Apple Watch, Apple TV and Vision Pro. It reads interface elements aloud, allowing visually impaired users to control the device with gestures. For example, tapping the screen focuses an element and speaks it, a double tap activates it, and a three-finger swipe scrolls the screen. A list of the basic gestures is available on the Apple website.

Here, for example, is a video of how to use VoiceOver:

Like many developers, I first encountered VoiceOver when I started writing tests. I scattered accessibilityIdentifier values throughout the project; in UI tests you can use such an id to find an element on the screen, check its state, or interact with it.

let app = XCUIApplication()
app.launch()
// Find the button by its accessibility id
let button = app.buttons["myButtonAccessibilityID"]

// Check that the button was found
XCTAssertTrue(button.exists)
// Tap the button
button.tap()
// Example check that the label changed after the tap
XCTAssertEqual(button.label, "New Button Label")

Years passed, and I periodically had to test more and more complex things; accessibilityIdentifier alone was no longer enough, and I had to dig into the documentation. At the time, I thought of VoiceOver as a utilitarian technology for robots, and I lived in that paradigm for a long time. As I watched WWDC sessions, a strong feeling began to grow that this feature had actually been developed not for testing, but for people.

After watching a couple more videos, I was inspired: you can mark up regular standard controls, give them proper names, and profit! I did it simply and quickly, the way Apple advises. I opened their Accessibility Inspector, inspected the app following the instructions from the videos, and checked that the buttons were tappable and there were no layout issues. Well, I thought, now everything is for real: urgent release, accessibility hot off the press! Well, how urgent… I spent about two months labelling all the elements: no one said it would be easy. Everything shipped in the release and worked for a while. My conscience was clear: I had done everything I could, and even more.

And for a long time it was just impersonal work: I read the documentation, followed it, and checked the results against it. Then I came across a video from Wylsacom about how a blind man uses an iPhone, MacBook, and Apple Watch. That was when I realized that there are people in the world who really use VoiceOver. It is their connection to the world, their “eyes”, and it helps them significantly in their daily lives.

And on the wave of overwhelming enthusiasm, I decided to do irreparable good to humanity on warm evenings. From the stage of “doing because Apple advises” I moved to another – “doing for people”. This turned out to be a completely different approach. I decided to do everything honestly.

VoiceOver, version 1.0

Apple has all the tools for adaptation; they practically whisper: take them and use them. But the information is mostly superficial, and there are few “live” examples. Everything is also great as long as the project uses standard UIKit with standard system buttons. One step to the left, one step to the right, and there is no understanding at all of how to adapt everything so that people can use it comfortably.

The 2GIS application is heavy (189,214 files and 29,452,584 lines of code at the time of writing) and relatively well covered with UI tests: 700+ honest UI tests (the ones with XCUIApplication) and 1000+ dishonest UI tests (a homegrown setup that touches the UI within unit tests, without launching the whole application). After poking around the entire application, I determined which scenarios had to be taken into work and which could be neglected for a start. Having no experience, I relied on intuition.

The thing I couldn't start diving into this without was separating the code for users from the code for robots. In the first iteration I didn't even think about it. The more accessible, the better, right? Spoiler: not really.

To separate the environment variables for tests, you need to write a special flag:

extension XCUIApplication {
	/// Launch the application for tests or as for a regular user
	@discardableResult
	func customLaunch(asUser: Bool = false) -> XCUIApplication {
		if !asUser {
			self.launchEnvironment["v4ios_uitests"] = "YES"
		}
		self.launch()
		return self
	}
}

extension ProcessInfo {
	static var isUITests: Bool {
		ProcessInfo.processInfo.environment["v4ios_uitests"] == "YES"
	}
}

All the snippets and examples below can be tried out on GitHub: there is a fully working project with all the examples described in this article.

I used to think that covering 80% of the important scenarios was good enough. However, while this adaptation allowed real people to work with the app, it wasn't something they'd want to come back to again and again. I wanted someone with real experience of assistive technologies to inspect our app and give recommendations.

On a wave of enthusiasm, I contacted Pavel, the hero of the Wylsacom video. The collaboration with him did not work out for reasons unknown to me, but he advised me to reach out to another community. Users also wrote to us periodically with feedback on the app. I created a chat in WhatsApp (yes, WhatsApp, as it turned out to be more accessible than Telegram on iOS) and invited everyone I could find.

Little by little, I added features to the app that were adapted from Apple's documentation and asked users to leave feedback. This helped me find and fix bugs. The process was iterative: I immediately uploaded some updates to production, and sent others as TestFlight builds.

A year later, we had a version of the application that could be called adapted. It didn't work perfectly, but everything worked. The most important thing is that we took into account the opinion of real users, for whom the new adaptation became more convenient and understandable.

But years passed, the app went through redesigns and a rewrite in Swift, and the whole adaptation gradually turned into a pumpkin. To be honest, I was a little (actually, quite) burned out. People weren't giving me feedback, and I thought: either everything works fine, or no one uses it. The chat gradually died.

VoiceOver, version 2.0

In 2022, sanctions happened, which made things worse for users: Google left the market as a provider of transit data, and other Russian applications were not particularly accessible. Users had no alternatives left, and they began to write to us en masse (as en masse as it gets for such a small audience) that everything was working poorly.

Around the same time, I managed to get in touch with accessibility testing experts from Sber. Without their help and advice, it would have been difficult to move forward. Also, quite unexpectedly, a caring typhlopedagogue (a teacher of visually impaired people; I didn't even know such a profession existed) wrote to me and added motivation to continue the work I had started.

Also around that moment, the titanic work of Mikhail Rubanov appeared, and thanks to him for it! His book helped solve the typical problems. I highly recommend starting your VoiceOver onboarding with it.

Thanks to all these factors, I found the desire and strength to take a new, more serious pass at adapting the application for VoiceOver.

Important introductory information

From a product development perspective, the VoiceOver feature always loses to other tasks by every criterion. There is never time for its development and testing. If you don't push everyone (designers, product managers, QA, developers, and even end users), entropy does its merciless work. Without regular regression testing and control over new features, the probability that some random scenario becomes inaccessible within 2–3 weeks of active development steadily approaches 100%. And given these inputs, it is important to understand that almost no one will ever run this regression.

And one more disclaimer.

2GIS is a real Nero Burning ROM (except it can no longer burn disks; those who get it will get it). We have dozens of controls on a screen and hundreds of implemented user scenarios. There is production code written back in 2012. Under such conditions, I honestly understood that it was impossible to adapt everything. Only what people definitely use needs to be adapted, and I determined that from user feedback.

And what follows is not the ideal adaptation, but the adaptation that one person can do within story points won with sweat and blood. In a gigantic project supported by many people, any of whom can accidentally break something (not out of malice, but out of ignorance). In a project where new screens appear that I may not even know about, and existing ones change constantly. And most importantly, an adaptation that real people can actually use. And it would be nice if it didn't all break the next day.

Therefore, in the second approach I started development with testing. I came to a clear understanding that soulless tests were needed, so that developers who were not involved in this feature, or who were writing UI tests for an existing screen, would not accidentally break anything. And I strongly wanted these tests not to take forever (that is, not to be UI tests, which run for tens of seconds each). I also wanted them to be super easy to write, even easier to maintain, and for it to be immediately clear what went wrong when they fail. A typical Ideal Final Result from TRIZ. It seems like such things don't exist, but in fact they do.

Snapshot testing

Fortunately, we don't need to take screenshots, because such tests don't solve the main problem: quickly understanding what went wrong.

Everything is much simpler, you just need to use a simple Soviet…

snapshot of the letters the user hears.

To do this, we had to come up with a new type of testing that would quickly integrate into our testing system, start up in no time on the existing infrastructure, and would easily get into Jenkins.

These tests create a text snapshot of the entire screen, capturing the text that a blind user would hear. The description of the accessible elements is “nailed” into the tests for the screen under test: if someone adds an inaccessible element or removes an existing one, the test fails and the code does not get merged = profit.

If you are just starting out, or have already been doing serious development for years, I advise you to begin with this. At first you can, like us, “play around” with the first approach, but you will then find that it is unreliable.

For such snapshot testing, you need to build a tree of all the views, find the accessible ones among them, extract the necessary data from them (id, label, value, actions), collapse everything into a single text value and, finally, flatten the tree, getting an array of texts, one per accessible element.
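A rough sketch of that idea, under the assumption that a simple enum and a recursive walk are enough (the names here are illustrative, and a real implementation also has to handle UIAccessibilityElement containers, modal views and much more):

import UIKit

/// Illustrative element description used in the examples below (not a library type).
enum AccessibilityElementSnapshot: Equatable {
	case label(String)
}

/// Walks the view tree, collects accessible views in reading order and collapses
/// each of them into a single text value (label + value + traits).
func accessibilityHierarchySnapshot(of view: UIView) -> [AccessibilityElementSnapshot] {
	// Hidden and fully transparent views are not spoken by VoiceOver.
	guard !view.isHidden, view.alpha > 0.01 else { return [] }

	if view.isAccessibilityElement {
		var parts: [String] = []
		if let label = view.accessibilityLabel, !label.isEmpty { parts.append(label) }
		if let value = view.accessibilityValue, !value.isEmpty { parts.append(value) }
		if view.accessibilityTraits.contains(.button) { parts.append("Button") }
		return [.label(parts.joined(separator: ". ") + ".")]
	}

	// The view itself is not accessible: flatten its children, preserving their order.
	return view.subviews.flatMap { accessibilityHierarchySnapshot(of: $0) }
}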

I didn't want to parse all of this myself and didn't have the time, but these wonderful people had already done it all before me: https://github.com/cashapp/AccessibilitySnapshot.

This is already not bad, but the data type the library returns is inconvenient for testing, so we write our own wrapper that turns a UIView into something human-readable. For assertions we use Nimble, which displays errors nicely. One more “squat”, and a “matcher” is ready that lets you test any view in a convenient form. You can print any hierarchy to the log like this: printAccessibilityHierarchySnapshot.

The result of the call will be the following message in the log:

[
	.label("Some text. Button."),
]

Next, you take these lines and pass them to the matcher haveAccessibilityHierarchyOfElements(…).
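haveAccessibilityHierarchyOfElements is our own matcher, not part of Nimble. A minimal sketch of how such a matcher could be built on top of Nimble, reusing the hypothetical snapshot function and element type from the sketch above:

import Nimble
import UIKit

/// Compares the flattened accessibility hierarchy of a view with the expected elements.
func haveAccessibilityHierarchyOfElements(
	_ expected: [AccessibilityElementSnapshot]
) -> Nimble.Predicate<UIView> {
	Predicate { expression in
		guard let view = try expression.evaluate() else {
			return PredicateResult(status: .fail, message: .fail("expected a non-nil UIView"))
		}
		let actual = accessibilityHierarchySnapshot(of: view)
		return PredicateResult(
			bool: actual == expected,
			message: .expectedActualValueTo("have accessibility hierarchy \(expected), got \(actual)")
		)
	}
}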

So now we have code that writes the tests and compares the hierarchy for the developers. In its final form, a test looks like this:

func test_проверяем_иерахию_простого_контроллера() throws {
	let vc = ViewController()
	self.uiTester.showChild(vc)
	expect(vc.view).to(haveAccessibilityHierarchyOfElements([
		.label("Some text. Button."),
	]))
}

If the hierarchy changes, the test fails with an error:

Once the tests are ready, you can start adapting the application.

Escape

There is an important point in VoiceOver: the user can start any scenario (open a new screen), but there must always be a way to exit that screen. This is something few people think about. So, as a rule of good manners, if we forget or don't have time to adapt a screen, the best thing we can do is give the user a way to leave it.

The best method for this is accessibilityPerformEscape (here is an example). At any moment the user can make a Z gesture with two fingers on the screen, and iOS will walk the responder chain and find the first element that returns true from this method. I strongly recommend implementing a way to escape from the screen in this method, and adding an implementation of it to every ViewController the user can reach, as in the sketch below.
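A minimal sketch of such an override in a hypothetical screen (the controller name and close() are illustrative stand-ins for whatever dismissal logic the screen already has):

final class PlaceCardViewController: UIViewController {
	/// Called when the user performs the two-finger Z gesture.
	/// Returning true tells iOS that the escape was handled; otherwise the system
	/// keeps walking up the responder chain.
	override func accessibilityPerformEscape() -> Bool {
		self.close()
		return true
	}

	private func close() {
		// Stand-in for the screen's real dismissal logic.
		self.dismiss(animated: true)
	}
}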

The ultimate trick is to make the first element of the hierarchy a button that closes the screen. A typical example is UINavigationController with its back button.

Dynamic accessibility calculation

We want to manage the accessible elements manually, so that only the things we know about are accessible. Our goal is to have all of our accessibility code in one predictable place, rather than scattered all over the screen. This also helps us win the unequal battle with the test automation engineers who write accessibility for robots.

iOS 17 introduced native lazy dynamic accessibility, for example block-based setters:

    /*
     Block based setters take precedence over single line setters (i.e setAccessibilityLabel:(NSString *)) and property overrides (i.e. accessibilityLabel).
     These methods require the block to have a specific return type that corresponds to the attribute's type.
     Each of these block based setters have a corresponding accessibility property.
     See the notes for the property for more specific information about that property.
    */

But that is not our case: we need to support iOS 15. In addition, we need checks for isHidden and alpha. So we write a special block that returns a predictable, lazy hierarchy for a ViewController (a sketch of a possible implementation of this block follows the list of reasons below).

self.customAccessibilityElementsBlock = { [weak self] in
	guard let self else { return nil }
	// if we want to set the order of accessible elements manually, taking their visibility into account
	// by default VoiceOver reads top to bottom, left to right
	// but if for some reason we want the top element to come last, we can reshuffle the default order
	return [
		self.carousel,
		self.label2,
		self.label,
	]
}

We needed this for several reasons:

  • Improve performance by avoiding unnecessary work.

  • We are not afraid of screen changes: when the screen is updated (for example, when data arrives from the network) or the user interacts with it, new elements can appear or existing ones can be removed. No extra actions are required; the accessibility hierarchy is always consistent.

  • Making the setup convenient. All the settings for the Accessibility hierarchy are concentrated in one place, which makes it easy to separate the accessibility code for people and robots.
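customAccessibilityElementsBlock itself is our own helper, not a UIKit API. A sketch of how such a block can be wired up, assuming it lives on a base view class and is consulted from the accessibilityElements getter (in the real project the wiring details differ):

import UIKit

class AccessibilityConfigurableView: UIView {
	/// Returns the desired order of accessible elements; nil falls back to the default behaviour.
	var customAccessibilityElementsBlock: (() -> [UIView]?)?

	override var accessibilityElements: [Any]? {
		get {
			guard let elements = self.customAccessibilityElementsBlock?() else {
				return super.accessibilityElements
			}
			// Recalculated on every VoiceOver pass, so hidden or transparent
			// views simply drop out of the hierarchy.
			return elements.filter { !$0.isHidden && $0.alpha > 0.01 }
		}
		set { super.accessibilityElements = newValue }
	}
}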

Now about the accessibility of the elements themselves. Usually we are interested in dynamic properties (label, value, actions). We try to calculate them lazily when necessary, that is, in getters:

override var accessibilityLabel: String? {
	get { "ячейка" }
	set {}
}
override var accessibilityValue: String? {
	get { self.label.text }
	set {}
}

If the hierarchy has changed, you can call this method:

UIAccessibility.post(notification: .layoutChanged, argument: nil)

After that, iOS will recalculate the entire accessibility tree. Any accessible element can be passed as the argument, and the focus will move to it. If nothing is passed, the focus stays on the current element, if it still exists.
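For example, after a cell is deleted it can be useful to move the focus to a concrete element explicitly (titleLabel here is just an illustrative view):

// Recalculate the accessibility tree and move VoiceOver focus to a specific element.
UIAccessibility.post(notification: .layoutChanged, argument: self.titleLabel)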

Enlargement of elements

Another pillar of accessibility that makes the adaptation more resistant to change is combining several blocks of text or buttons into one semantic unit. Such a block is usually more convenient to use, easier to maintain, and harder to break.

The most obvious example of a semantic block is a cell in a table, which can contain several labels, buttons, images and descriptions.

This entire structure can be turned into one accessible element.

First, we make the entire cell an accessible button intended exclusively for humans. For UI tests, all the elements of the cell still need to be identifiable by their ids, and after merging the elements we risk losing access to them. Apple did ship a native tool for this, but it is only supported since iOS 17 and does not meet the requirements of our current tests. So we use the good old tried and tested solution:

if !ProcessInfo.isUITests {
	// outside of tests, the cell becomes one big button
	cell.isAccessibilityElement = true
	cell.accessibilityTraits.formUnion(.button)
}

In our project we use MVVM, and each view has its own ViewModel; for each cell there is a corresponding ViewModel that contains all the raw data. We collect the data of all the cell's labels into one lazy construct, represented as a single piece of text, using the accessibilityLabel() method:

var accessibilityLabel: String? {
	[
		self.title, 
		self.subtitle,
		…
	].accessibilityLabel()
}
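The accessibilityLabel() call on the array is our own helper, not a system API. A sketch of what it might look like: drop nil and empty strings and join the rest so that VoiceOver reads them as one phrase.

extension Array where Element == String? {
	/// Collapses several optional text fragments into a single spoken phrase.
	func accessibilityLabel() -> String? {
		let parts = self.compactMap { $0 }.filter { !$0.isEmpty }
		return parts.isEmpty ? nil : parts.joined(separator: ". ")
	}
}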

We will also collect all the buttons in the cell (if there are any) into the following structure:

override var accessibilityCustomActions: [UIAccessibilityCustomAction]? {
	get {
		guard let name = self.buttonContent?.accessibilityLabel else { return nil }
		return [
			UIAccessibilityCustomAction(name: name) { [weak self] _ in
				guard let self else { return false }
				return self.performCallToAction()
			},
		]
	}
	set { _ = newValue }
}

In the cell itself, you need to override these properties:

override var accessibilityLabel: String? {
	get { viewModel.accessibilityLabel }
	set {}
}
override var accessibilityCustomActions: [UIAccessibilityCustomAction]? {
	get { viewModel.accessibilityCustomActions }
	set {}
}

One large cell may not be the most convenient way for the user to interact, but it is a very simple and universal approach for development and testing. Such code is easy to write, because there is a clear rule: one cell, one accessible element. There are only four ways to deliver data to a cell: via label, value, accessibilityCustomActions, and sometimes hint. You can also use accessibilityCustomContent for the strangest scenarios, but I do not cover them in this article.

The output will be the following scenario:

Stories

Another scenario that has become relevant for every new app after Snapchat is stories. As a rule, this is one of the first elements on the screen. 2GIS is no exception:

If you adapt stories head-on, the user will need to scroll through approximately X stories, where X is much greater than 10, before moving on to the next available element.

The book “Accessibility for all” provides a well-described example of solving this problem. There are also Apple's recommendations.

These references suggest using an AccessibilityCarousel component for such horizontal lists. However, as a self-respecting developer, I decided to reinvent my own wheel, one that better suits our needs. The component integrates easily with any UICollectionView and makes it behave as a carousel; there is a usage example in the demo project, and a sketch of the idea follows below.
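The sketch shows only the general idea, not our actual component: the whole collection becomes one adjustable element, and the one-finger swipe up/down gestures move between items instead of forcing the user to flick through every story.

import UIKit

final class AccessibilityCarouselCollectionView: UICollectionView {
	private var focusedItem = 0

	override var isAccessibilityElement: Bool {
		get { true }
		set {}
	}

	override var accessibilityTraits: UIAccessibilityTraits {
		get { [.adjustable] }
		set {}
	}

	override var accessibilityValue: String? {
		// Illustrative: a real implementation would ask the view model for the story title.
		get { "Story \(self.focusedItem + 1) of \(self.numberOfItems(inSection: 0))" }
		set {}
	}

	// Swipe up / swipe down with one finger while the carousel is focused.
	override func accessibilityIncrement() { self.moveFocus(by: 1) }
	override func accessibilityDecrement() { self.moveFocus(by: -1) }

	private func moveFocus(by delta: Int) {
		let count = self.numberOfItems(inSection: 0)
		guard count > 0 else { return }
		self.focusedItem = max(0, min(count - 1, self.focusedItem + delta))
		self.scrollToItem(
			at: IndexPath(item: self.focusedItem, section: 0),
			at: .centeredHorizontally,
			animated: false
		)
	}
}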

In the main application it looks like this:

Nuances that are good to know

One important point the Sber specialists mentioned: users often turn off accessibility hints. So you should not put important, relevant information there; use hints only for onboarding instructions for specific elements that may behave in a non-obvious way.

For marking headings, it is better to use the header trait.

This adds a lot of structure to large lists, and with the rotor you can navigate them very quickly: https://support.apple.com/en-us/111796.
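For example, a section title can be marked as a header like this (titleLabel is an illustrative view):

// Mark a section title as a header so the VoiceOver rotor can jump between sections.
titleLabel.accessibilityTraits.insert(.header)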

In action it looks like this:

New, trendy SwiftUI developers are joining us, and I have to become more modern myself. Some of the specifics in this article are no longer relevant for SwiftUI, but the general principles still apply. Fortunately, SwiftUI still uses UIKit under the hood, so the hierarchies it creates are also understood by the snapshot testing, and there is no need to invent a new kind of test yet.

An app with poor markup and no merged elements, but with labeled buttons that can be pressed in VoiceOver mode and a way to leave an inaccessible screen, is much better than waiting for the perfectly accessible app that may never arrive.

The Accessibility Inspector is very unreliable; you shouldn't count on it when working with the simulator. It works much better on a device. In fact, there is nothing better than personally testing the scenarios with your eyes closed or with the screen turned off on a real device.

As a result

We have adapted scenarios that solve real user problems. For example, this is how VoiceOver will voice the items on the app's start screen and help with navigation:

The killer feature (at the moment) is the adaptation of the public transport route screen for VoiceOver. It is based on visual signs and symbols that need to be voiced for visually impaired users. We added logic that allows complex routes to be voiced.

For example, the hint “M (14) → 2” will be voiced as: “Get on at the MCC stop Gagarin Square, get off at the MCC stop Kutuzovskaya towards Luzhniki station, turn left from the carriage, exit number 2.”

The start and end of a route segment are announced as “Get on at the stop” and “Get off at the stop” instead of just the name of the stop.

Finally

Apple has done a tremendous job – developed technologies, frameworks and tools that continue to improve every year. Their work is impressive, hats off. Any application can be made accessible to any user, adaptation is necessary not only for visually impaired people.

But in real life, the app adaptation process often faces a chicken-and-egg dilemma:

  • if no one uses accessibility, why do it;

  • but if you don't do it, no one will start using it.

So, again, in large commercial organizations, app adaptation usually takes a backseat to other features. But this circle can be broken. And it takes real courage to see the adaptation through to the end and keep supporting it.

From the stage of not understanding how to take on VoiceOver, we finally arrived at the moment when we released version 6.34, and began to receive thanks from users for the ability to find companies, study information about them, and build routes.

Many years of work on this feature are taking their toll: fatigue is accumulating and burnout is setting in. Just the other day “Favorites” was rewritten, and, as usual, accessibility was forgotten.

But I understand that I can't stop. There are still a number of inaccessible scenarios left. And I keep working as best I can: I'm building the VoiceOver navigator that users asked me for.

If you also decide to adapt the application, then come to the comments or private messages. Let's communicate and share experiences!

Our engineering team runs a Telegram channel: take a peek, have a read. And if you want to work on our team, we have an open iOS developer vacancy.
