Apple Vision Pro Spatial UX Design

Spatial Experience using Apple Vision Pro
Apple’s Spatial Design is a booming field, and starting from the basics is the best way to understand it.
This is part 1 of an ongoing series; part 2 and part 3 are also available.
Spatial Design refers to the design of user interfaces and experiences in a 3D spatial environment, offering a more immersive and intuitive way to interact with digital content.
Here are the first 12 important terms to remember:
1. Windows
* A window displays 2D or 3D content in a container.
* There can be a single window or multiple windows, and people can resize them.


Multiple windows
2. Volumes
* A volume displays 3D content that people can view from any angle. For example, products from an e-commerce website can be viewed in 3D to see how they look in real life before purchasing them.

E-commerce virtual object in 3D
* A volume doesn’t display a frame around 3D objects, unlike a window.

3D object in volume, next to 2D window
3. Ornaments
* An ornament contains controls or information related to a window.
* Ornaments can hold components such as toolbars, tab bars, and video playback controls. For example, when a video plays in a window, its controls, such as Stop, Pause, and Forward, are displayed in an ornament.

4. Toolbars
* The toolbar provided by visionOS is horizontal and appears along the bottom edge of a window, slightly in front of the window along the z-axis.
* Toolbars contain frequently used controls to perform actions on the current view.

Toolbar (Ornament)
5. Tab bars
* A tab bar is always vertical and floats on a window’s leading side.
* Tab bars are used to navigate between different sections in an app.

Tab bar & Sidebar
6. Sidebars
* A sidebar extends from a window’s leading side when a tab within the tab bar is selected. It is used to display additional navigation options.
* For example, in the image above, when the Library tab is selected in the tab bar, the sidebar containing the Library’s navigation options opens, extending from the window.
7. Menus and Popovers
* Menus and popovers can expand outside the window.
* When a button is selected, use black labels on a white background. This gives people clear feedback about which button invoked the popover, without the need for arrows.
* As a general rule on this platform, avoid using buttons with white backgrounds unless they are selected.

8. Sheets
* A sheet is a modal that floats in front of its parent window, requesting specific information from people or presenting a simple task that they can complete before returning to the parent view.
* For example, a sheet might ask the user to subscribe to the Music app.

9. Spaces
* Shared Space: This is the default space. Windows can exist side by side, and people can reposition them wherever they like.
* Full Space: For a more immersive experience, people can transition an app to a Full Space, where other apps are hidden and only the selected app is visible. The app can also open a portal to a different world, or fully immerse people in an environment.

Immersion Spectrum in Shared Space & Full Space
10. Immersion Spectrum
* Window (can be in Shared Space as well as in Full Space)
* Panorama View (in Full Space)
* Environment (in Full Space)
This spectrum is flexible, so an app can transition fluidly between different states of immersion. An app can run in a window in the Shared Space alongside other apps. If it needs more room, it can run in a Full Space, where other apps are hidden. By default, start an app in a window in the Shared Space. This gives people control over how immersed they want to be.
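The spectrum can be read as a small state model. The following is a minimal sketch in plain Python, not visionOS API: the names `Space`, `Level`, `ALLOWED`, and `AppState` are hypothetical and only encode which immersion levels the list above places in which space.

```python
from dataclasses import dataclass
from enum import Enum


class Space(Enum):
    SHARED = "Shared Space"
    FULL = "Full Space"


class Level(Enum):
    WINDOW = "window"
    PANORAMA = "panorama"
    ENVIRONMENT = "environment"


# Which immersion levels each space allows, following the list above.
ALLOWED = {
    Space.SHARED: {Level.WINDOW},
    Space.FULL: {Level.WINDOW, Level.PANORAMA, Level.ENVIRONMENT},
}


@dataclass(frozen=True)
class AppState:
    # Defaults mirror the guidance: start in a window in the Shared Space.
    space: Space = Space.SHARED
    level: Level = Level.WINDOW

    def transition(self, space: Space, level: Level) -> "AppState":
        """Return the new state, rejecting combinations the spectrum excludes."""
        if level not in ALLOWED[space]:
            raise ValueError(f"{level.value} is not available in the {space.value}")
        return AppState(space, level)

    @property
    def other_apps_visible(self) -> bool:
        # Other apps stay visible only in the Shared Space.
        return self.space is Space.SHARED


app = AppState()                                          # window, Shared Space
focused = app.transition(Space.FULL, Level.WINDOW)        # other apps hide
immersed = focused.transition(Space.FULL, Level.ENVIRONMENT)
```

Keeping the default state in the Shared Space means any move toward more immersion is an explicit transition, which matches the advice to let people decide how immersed they want to be.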
In the example below, Keynote opens in a window. When it’s time to play the slideshow, it transitions to a Full Space, and dimming is applied by default to bring focus to the presentation.
Dimming is a simple way to create contrast between content and people’s surroundings without taking them out of their space.

Dimming is used to bring focus
When it’s time to rehearse the presentation, a large stage environment can surround people, fully immersing them in a theater. Life-size experiences like these require more room, so Keynote is now in a Full Space and other apps are hidden.

Life-size experiences in Full Space
Another example, shown below, is the Photos library. Browsing through photos feels familiar, and when a photo is selected, it grows larger in space and the surroundings dim. Seeing memories at a lifelike scale is truly magical. And when it’s time to view a panorama, it makes people feel like they are really there. Panoramas transport users in a way that’s only possible with infinite space.

Photos transitioning from large size to panorama
11. Passthrough
* Passthrough lets people wearing Apple Vision Pro see their physical surroundings through real-time video from the device’s external cameras.
* Turning the Digital Crown adjusts the amount of Passthrough, changing how much of their surroundings people can see and how immersive the experience is.

Digital Crown on Apple Vision Pro to control Immersive Experience
* Increasing Passthrough makes more of the real surroundings visible. Decreasing Passthrough hides more of them and can bring up a fully immersive experience using an Environment.

Increasing Immersive Experience by bringing up an Environment
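The relationship between the Digital Crown, Passthrough, and immersion can be sketched as a clamped value. This is an illustration in plain Python, not visionOS API. It assumes immersion runs linearly from 0 (only the real surroundings) to 1 (fully inside an Environment) and that Passthrough is simply the remainder. The function names and the linear mapping are assumptions, not something the platform documents in this form.

```python
def turn_crown(immersion: float, delta: float) -> float:
    """Return the new immersion level after a crown turn, clamped to [0, 1]."""
    return min(1.0, max(0.0, immersion + delta))


def passthrough(immersion: float) -> float:
    """Share of the real surroundings still visible at this immersion level."""
    return 1.0 - immersion


level = 0.0                       # fully in the real surroundings
level = turn_crown(level, 0.25)   # bring in some of the Environment
level = turn_crown(level, 2.0)    # over-rotation stops at full immersion
```

Clamping keeps the value meaningful however far the crown is turned, so the two ends of the range always mean "all surroundings" and "all Environment".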
12. Transitions & Subtle Animations
* Design smooth, predictable transitions to create continuity between different states of experience.
* Blend thoughtfully with reality — when blending entire scenes into someone’s space, make sure to use soft edges to smoothly integrate the app. This avoids abrupt transitions and keeps people focused on the content.
* The most inspiring experiences make things feel alive. Subtle animation can bring liveliness to a scene.
How do we interact with apps and objects in a 3D immersive experience? Our eyes and hands are the new spatial inputs for performing interactions, and this article explores them in detail.
I. Eyes
Focus & Hover effect
When people look at an interactive component with the intention of interacting with it, a subtle highlight appears. This is known as the hover effect, and it gives visual feedback confirming which object they are targeting.

Hover effect when eyes move through photos and buttons
Long Focus
When people focus on an element such as a button or tab bar for some time, it reveals more information. This keeps the UI clean until the extra detail is needed.

Long focus on the tab bar reveals a label for each tab | Long focus on a button reveals its tooltip
Focusing on the microphone glyph in a search field triggers Speak to Search, revealing the layer shown below and letting people perform a search using just their eyes and voice.

Activating Speak to Search with long focus
Dwell Control Feature — Assistive Technology
People can select content with just their eyes by activating the Dwell Control feature. In this example, focusing on a button for a short time shows the Dwell Control UI and selects the button, without the need for a hand tap gesture.

Using the Dwell Control feature to select content just with eyes
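The idea behind dwell selection can be sketched as a timer over gaze samples. This is a conceptual illustration in plain Python, not how visionOS implements Dwell Control. The `dwell_select` function, the `(timestamp, target)` sample format, and the one-second default threshold are all assumptions made for the example.

```python
from typing import Iterable, Optional, Tuple

# Each sample is (timestamp in seconds, id of the focused element or None).
Sample = Tuple[float, Optional[str]]


def dwell_select(samples: Iterable[Sample], dwell_seconds: float = 1.0) -> Optional[str]:
    """Return the first element focused continuously for dwell_seconds, else None."""
    current: Optional[str] = None
    started = 0.0
    for timestamp, target in samples:
        if target != current:
            # Focus moved (or was lost), so the dwell timer restarts.
            current, started = target, timestamp
        elif current is not None and timestamp - started >= dwell_seconds:
            return current
    return None


gaze = [(0.0, "play"), (0.5, "play"), (0.75, "next"), (1.25, "next"), (2.0, "next")]
selected = dwell_select(gaze)  # "play" was abandoned early, "next" was held long enough
```

Restarting the timer whenever focus moves is what keeps a passing glance from triggering a selection.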
II. Hands
1. Indirect Gestures
An indirect gesture is primarily performed from a distance. When people focus on a button, the hover effect highlights it to indicate that it is targeted. They can then activate or select it by quickly tapping a finger to their thumb, which is the indirect tap gesture.

Indirect tap gesture to select
Indirect gestures can also be used to scroll, zoom, and rotate.

Scroll, Zoom & Rotate
Using eye direction combined with hand gestures can create precise and satisfying interactions.
For example, when zooming an image, the origin point of the zoom is determined by where within the image the eyes are focused at that moment. As a result, that particular area is magnified and centered as the image zooms in.

Focus on a point and zoom in to magnify that area
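Zooming around a gaze point comes down to simple arithmetic. The sketch below, in plain Python, uses the focused point as the zoom origin so that the content under the eyes stays where it is while everything around it scales. The function names and the 2D coordinate convention (the image's top-left `offset` in view space plus a uniform `scale`) are assumptions for illustration, and the extra centering step the platform applies is not modeled.

```python
from typing import Tuple

Point = Tuple[float, float]


def to_view(image_point: Point, offset: Point, scale: float) -> Point:
    """Map a point in image coordinates to view coordinates."""
    return (offset[0] + image_point[0] * scale, offset[1] + image_point[1] * scale)


def zoom_about(offset: Point, scale: float, focus: Point, factor: float) -> Tuple[Point, float]:
    """Scale the image by `factor` while keeping the view-space `focus` point fixed."""
    new_offset = (
        focus[0] - (focus[0] - offset[0]) * factor,
        focus[1] - (focus[1] - offset[1]) * factor,
    )
    return new_offset, scale * factor


# The eyes rest on view point (100, 50) while the image is unzoomed at the origin.
gaze = (100.0, 50.0)
new_offset, new_scale = zoom_about((0.0, 0.0), 1.0, gaze, 2.0)

# The image point that was under the gaze is still under the gaze after zooming.
still_under_gaze = to_view((100.0, 50.0), new_offset, new_scale)
```

Using the image center as the origin instead would make the looked-at detail drift away as the image grows, which is the behavior gaze-anchored zoom avoids.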
Another example of this behavior is pointer movement in Markup. To draw, people control the brush cursor with their hands, similar to a mouse pointer. If they look at the other side of the canvas and tap, the cursor jumps there, landing right where they are looking.

Look at a point on the canvas, then tap and draw to start drawing in that area
Standard gestures are available by default in visionOS to perform interactions.

Standard — Indirect gestures
2. Direct Gestures
A direct gesture affects the virtual object that people are touching (bringing a finger close to an object). For direct gestures, virtual objects are placed within reach so that people can interact with them using their fingertips, for example by scrolling a page, typing on a virtual keyboard, or manipulating 3D content.

Scrolling | Typing

Manipulating a virtual 3D object using fingers
Use direct touch for experiences that invite up-close inspection or object manipulation, for interactive mechanics that build on muscle memory from real-world experiences, and when physical activity is at the center of the experience.

Use direct touch in these scenarios
When people interact with components in virtual space through direct touch, provide visual and audio feedback to compensate for the missing tactile information and to make direct interactions feel reliable and satisfying.
For example, while a finger is above the keyboard, buttons display a hover state and a highlight that gets brighter as the finger approaches the button surface. When a button is tapped, the state change is quick and responsive, and is accompanied by a matching spatial sound effect.

Visual feedback from virtual keyboard
Other familiar inputs
Devices such as a keyboard, trackpad, or game controller can also be connected to interact with apps in the spatial experience.

Game controller used to play a game | Physical keyboard used to enter data in a spatial window
Using eyes and hands to interact with virtual content is new for most people. That’s why it’s so important to guide them by providing clear feedback and to rely on familiar interaction patterns and gestures where possible.

App Icons
App icons are three-dimensional, with depth between their layers, and they expand when people look at them.

Icons expand when looking at them
How to create App Icons?
An app icon can have up to three flat layers: a background layer and up to two foreground layers on top. Each layer is a square image of 1024 by 1024 pixels. Foreground layers should have a transparent background.

All layers are then cropped by a circular mask. Finally, when the layers are combined, a glass layer is applied automatically, adding depth, specular highlights, and shadows. Always keep the graphics centered.

Three layers combine to form an app icon
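The layer rules above are easy to check before exporting artwork. The following is a minimal sketch in plain Python that encodes only what this section states: at most three layers, each 1024 by 1024 pixels, transparent foreground layers, and a circular crop. The `Layer` type and both functions are hypothetical helpers and not part of any Apple tool.

```python
import math
from dataclasses import dataclass
from typing import List

ICON_SIZE = 1024   # each layer is a 1024 x 1024 square
MAX_LAYERS = 3     # one background plus up to two foreground layers


@dataclass
class Layer:
    name: str
    width: int
    height: int
    has_transparency: bool


def validate_icon(layers: List[Layer]) -> List[str]:
    """Check layers (ordered back to front) and return a list of problems."""
    problems = []
    if not 1 <= len(layers) <= MAX_LAYERS:
        problems.append(f"expected 1 to {MAX_LAYERS} layers, got {len(layers)}")
    for index, layer in enumerate(layers):
        if (layer.width, layer.height) != (ICON_SIZE, ICON_SIZE):
            problems.append(f"{layer.name}: must be {ICON_SIZE} x {ICON_SIZE} pixels")
        if index > 0 and not layer.has_transparency:
            problems.append(f"{layer.name}: foreground layers need a transparent background")
    return problems


def inside_circular_mask(x: float, y: float, size: int = ICON_SIZE) -> bool:
    """True if pixel (x, y) survives the circular crop applied to every layer."""
    center = size / 2
    return math.hypot(x - center, y - center) <= center


icon = [
    Layer("background", 1024, 1024, has_transparency=False),
    Layer("middle", 1024, 1024, has_transparency=True),
    Layer("front", 1024, 1024, has_transparency=True),
]
issues = validate_icon(icon)  # an empty list means the stack follows the rules
```

The mask check also shows why graphics should stay centered: anything drawn near a corner of the square falls outside the circle and is cropped away.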
Windows are designed with a new visual language, known as the Glass Material, that adapts to different lighting conditions. The spatial platform does not have a distinct light or dark mode. Instead, glass and UI naturally adapt, becoming brighter in front of light backgrounds and darker in front of dark ones.

Window becomes brighter or darker based on background lighting
Windows live in our space and feel like part of our surroundings. Interfaces are placed within windows so people can comfortably see them and use them.
Design Familiar Elements & Windows
Interfaces in spatial apps should feel recognizable and familiar, so design them to resemble their 2D counterparts.
In the example below, the elements in the Music spatial app are similar to those in the Music app on 2D devices.

Similarity in design between 2D app and Spatial app
How to design legible windows?
* To maintain contrast between separate sections of an app, use a darker material on top of a lighter material and use a lighter material on top of a darker material.
* For example, in the Music app below, the window uses a lighter material, so the sidebar uses a darker material to maintain enough contrast. A lighter material then sits on top of the sidebar to bring attention to interactive elements, like buttons. To increase contrast for standard components, like input fields, an even darker material can be used.

Contrast between different sections of a window
* Use white text or symbols so they are always clearly visible. If you need to use color, prefer system colors and apply them to a background layer or an entire button so people can see them.

Use color in button or background layer, avoid colored text
* In the example below, since the text is white and there are lighter buttons, it’s better to use a darker cell behind each region to add more contrast.

Use a darker cell behind each region to add more contrast.
* As shown below, avoid stacking lighter materials on top of each other, as it impacts legibility and reduces contrast.

Avoid stacking lighter materials on top of each other
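The layering guidance in this list can be expressed as a small check over a stack of materials. This is a sketch in plain Python with made-up names. It flags only the one combination the list explicitly warns against, a lighter material directly on top of another lighter material, and it treats "lighter" and "darker" as simple labels instead of measured values.

```python
from typing import List, Tuple

# Each entry is (element name, tone), listed from back to front.
Material = Tuple[str, str]


def stacking_problems(stack: List[Material]) -> List[str]:
    """Return a warning for every lighter material placed directly on a lighter one."""
    problems = []
    for (below, below_tone), (above, above_tone) in zip(stack, stack[1:]):
        if below_tone == "lighter" and above_tone == "lighter":
            problems.append(f"{above} on {below}: lighter on lighter reduces contrast")
    return problems


# The Music example: lighter window, darker sidebar, lighter buttons on top.
music = [("window", "lighter"), ("sidebar", "darker"), ("button", "lighter")]
flat = [("window", "lighter"), ("card", "lighter")]

music_issues = stacking_problems(music)   # no warnings
flat_issues = stacking_problems(flat)     # one warning for the card
```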
SF Pro is the system font in visionOS and text defaults to white.
visionOS uses bolder versions of the Dynamic Type body and title styles, such as medium instead of regular for body text and bold instead of semibold for titles, to keep text clear at all times.
visionOS also introduces Extra Large Title 1 and Extra Large Title 2 for wide, editorial-style layouts.

Extra large title used in wide layout
Use 2D text for legibility. Avoid 3D text, which is difficult to read.

Use 2D text for legibility, not 3D
Vibrancy is one of the most important details for maintaining legibility across the system. On a spatial platform, the background can change constantly, so vibrancy updates in real time to keep text legible. In the example below, vibrancy is turned on and off to show the difference.

Vibrancy turned on increases legibility
Vibrancy has three modes: primary, secondary, and tertiary. Use primary for standard text, and use secondary for description text, footnotes, and subtitles.