Web Application Clipboard and How It Stores Different Data

If you've been using a computer for long enough, you probably know that the clipboard can store a variety of data (pictures, formatted text, files, etc.). As a developer, I began to get frustrated by the lack of a clear understanding of how the clipboard stores and organizes different types of data.

I recently decided to dig into the mechanics of the clipboard and wrote this post based on my findings. It's mostly about the web clipboard and its APIs, but we'll also discuss how it interacts with the system clipboard.

Let's start by exploring the various APIs and their history. These APIs have some interesting limitations on the types of data, and we'll see how some companies have gotten around these limitations. We'll also look at some proposals that aim to address these limitations (most notably, Web Custom Formats).

If you've ever wondered how the web clipboard works, then this article is for you.

Using the Asynchronous Clipboard API

When we copy website content and paste it into Google Docs, some of the text formatting is preserved, including links, font size, and colors.

However, if you paste the same text into VS Code, only the plain text content is pasted, without formatting.

This is possible because the clipboard stores data in different representations keyed by MIME type. The W3C Clipboard specification requires support for three data types for reading and writing:

  • text/plain for plain text.

  • text/html for HTML.

  • image/png for PNG images.

In the example above, Google Docs used the text/html representation and preserved the formatting based on it. VS Code is only interested in plain text, so it used the text/plain representation when pasting.

Getting the representation we need with the read method of the asynchronous Clipboard API is quite simple:

const items = await navigator.clipboard.read();

for (const item of items) {
  if (item.types.includes("text/html")) {
    const blob = await item.getType("text/html");
    const html = await blob.text();
    // Do stuff with HTML...
  }
}
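Different applications want different representations, as in the Google Docs and VS Code example. Here is a minimal sketch of that choice. The helper pickPreferredType and the two preference lists are hypothetical names, not part of the Clipboard API:

```javascript
// Hypothetical helper: returns the first preferred MIME type that the
// clipboard item actually offers, or null if none of them match.
function pickPreferredType(availableTypes, preferredTypes) {
  for (const type of preferredTypes) {
    if (availableTypes.includes(type)) {
      return type;
    }
  }
  return null;
}

// A rich-text editor would prefer HTML and fall back to plain text.
const richTextPreference = ["text/html", "text/plain"];
// A code editor only cares about plain text.
const plainTextPreference = ["text/plain"];

// Browser-only usage (assumes the user has allowed clipboard reads).
async function readPreferred(preferredTypes) {
  const items = await navigator.clipboard.read();
  for (const item of items) {
    const type = pickPreferredType(item.types, preferredTypes);
    if (type) {
      const blob = await item.getType(type);
      return { type, text: await blob.text() };
    }
  }
  return null;
}
```

Calling readPreferred(richTextPreference) would return the HTML when it is available, while readPreferred(plainTextPreference) would always return the plain text.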

Writing to the clipboard with the write method takes a bit more work, but is still quite simple. First, we create a Blob for each representation we want to write to the clipboard:

const textBlob = new Blob(["Hello, world"], { type: "text/plain" });
const htmlBlob = new Blob(["Hello, <em>world</em>"], { type: "text/html" });

Then we pass them to a new ClipboardItem as key-value pairs, where the data type is the key and the Blob is the corresponding value:

const clipboardItem = new ClipboardItem({
  [textBlob.type]: textBlob,
  [htmlBlob.type]: htmlBlob,
});

Note: I like that ClipboardItem takes a key-value format, because it fits the idea of using data structures that make invalid states unrepresentable, as discussed in “Parse, don't validate”.

And finally, we call write with the ClipboardItem we created:

await navigator.clipboard.write([clipboardItem]);
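The three steps above can be folded into one small helper. This is only a sketch: toBlobMap and copyRepresentations are hypothetical names, and the write call assumes a secure context and a browser that allows the write (typically in response to a user gesture).

```javascript
// Hypothetical helper: turns { mimeType: string } into { mimeType: Blob },
// which is the shape the ClipboardItem constructor expects.
function toBlobMap(representations) {
  const blobs = {};
  for (const [type, data] of Object.entries(representations)) {
    blobs[type] = new Blob([data], { type });
  }
  return blobs;
}

// Browser-only: builds a single ClipboardItem and writes it.
async function copyRepresentations(representations) {
  const item = new ClipboardItem(toBlobMap(representations));
  await navigator.clipboard.write([item]);
}

// Example call from a click handler:
// copyRepresentations({
//   "text/plain": "Hello, world",
//   "text/html": "Hello, <em>world</em>",
// });
```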

What about other types of data?

HTML and images are great, but what about other data exchange formats like JSON? If I were writing an app that supported copy/paste, I could imagine a situation where I would need to write JSON or some binary data to the clipboard.

Let's try to write JSON to the clipboard:

// Create JSON blob
const json = JSON.stringify({ message: "Hello" });
const blob = new Blob([json], { type: "application/json" });

// Write JSON blob to clipboard
const clipboardItem = new ClipboardItem({ [blob.type]: blob });
await navigator.clipboard.write([clipboardItem]);

When we try to execute this code, we get an error:

Failed to execute 'write' on 'Clipboard':
  Type application/json not supported on write.

Why is that? The specification for the write method requires that data types other than text/plain, text/html, and image/png be rejected:

If type is not in the mandatory data types list, then reject […] and abort these steps.

Interestingly, application/json was on the list of mandatory data types from 2012 to 2021, but was removed from the specification in w3c/clipboard-apis#155. Before this change, the list of mandatory types was much longer: 16 types for reading from the clipboard and 8 for writing. After the change, only text/plain, text/html, and image/png remained.

This was done after browsers decided to stop supporting many of the required types due to possible security issues. This is stated in the warning in the section on mandatory data types in the specification:

Warning! The data types that untrusted scripts are allowed to write to the clipboard are limited as a security precaution.

Untrusted scripts can attempt to exploit security vulnerabilities in local software by placing data known to trigger those vulnerabilities on the clipboard.
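To avoid running into the rejection at write time, the requested types can be checked up front. The sketch below is a conservative pre-check based only on the three mandatory types named above. MANDATORY_TYPES and partitionWritableTypes are hypothetical names, and a given browser may accept additional types that this check would report as rejected.

```javascript
// The three data types the specification lists as mandatory.
const MANDATORY_TYPES = new Set(["text/plain", "text/html", "image/png"]);

// Hypothetical helper: splits the requested types into those the async
// write method is expected to accept and those it is expected to reject.
function partitionWritableTypes(types) {
  const writable = [];
  const rejected = [];
  for (const type of types) {
    if (MANDATORY_TYPES.has(type.toLowerCase())) {
      writable.push(type);
    } else {
      rejected.push(type);
    }
  }
  return { writable, rejected };
}
```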

So we can only write a limited number of types of data to the clipboard, but what about the mention of “untrusted scripts”? Can we execute code with a “trusted” script that will allow us to write other types of data to the clipboard?

isTrusted property

Perhaps events whose isTrusted property is set are the ones considered “trusted”. isTrusted is a read-only property that takes the value true only if the event was sent by the user agent (for example, in response to a user action), and not generated by a script.

document.addEventListener("copy", (e) => {
  if (e.isTrusted) {
    // This event was triggered by the user agent
  }
})

By “sent by the user agent” we mean that the event was triggered by the user, for example a copy event caused by the user pressing Ctrl+C. This is in contrast to a synthetic event that a script dispatches via dispatchEvent():

document.addEventListener("copy", (e) => {
  console.log("e.isTrusted is " + e.isTrusted);
});

document.dispatchEvent(new ClipboardEvent("copy"));
//=> "e.isTrusted is false"
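If a handler should only ever react to real user actions, the isTrusted check can be wrapped once and reused. A minimal sketch, where onlyTrusted is a hypothetical name:

```javascript
// Hypothetical wrapper: runs the handler only for events created by the
// user agent and silently ignores events dispatched by scripts.
function onlyTrusted(handler) {
  return (event) => {
    if (event.isTrusted) {
      handler(event);
    }
  };
}

// Browser usage:
// document.addEventListener("copy", onlyTrusted((e) => {
//   // Only reached for copy events triggered by the user agent
// }));
```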

Let's see what clipboard events exist and whether they allow us to write arbitrary types of data to the clipboard.

Clipboard Events API

A ClipboardEvent is dispatched for copy, cut, and paste events and has a clipboardData property of type DataTransfer. The Clipboard Events API uses DataTransfer to store multiple representations of data.

Writing to the clipboard in a copy event handler is easy:

document.addEventListener("copy", (e) => {
  e.preventDefault(); // Prevent default copy behavior

  e.clipboardData.setData("text/plain", "Hello, world");
  e.clipboardData.setData("text/html", "Hello, <em>world</em>");
});

Reading from the clipboard in a paste event handler is just as easy:

document.addEventListener("paste", (e) => {
  e.preventDefault(); // Prevent default paste behavior

  const html = e.clipboardData.getData("text/html");
  if (html) {
    // Do stuff with HTML...
  }
});

The main question is, can we write JSON to the clipboard?

document.addEventListener("copy", (e) => {
  e.preventDefault();

  const json = JSON.stringify({ message: "Hello" });
  e.clipboardData.setData("application/json", json); // No error
});

There are no errors, but was the JSON actually written to the clipboard? Let's check by writing a paste event handler that loops through all the items in the clipboard and prints their types and contents to the console:

document.addEventListener("paste", (e) => {
  for (const item of e.clipboardData.items) {
    const { kind, type } = item;
    if (kind === "string") {
      item.getAsString((content) => {
        console.log({ type, content });
      });
    }
  }
});
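The callback-based getAsString can be awkward when all items are needed at once. A possible promise-based variant is sketched below. readStringItems is a hypothetical name, and it has to be called from inside the paste handler, while the DataTransfer is still readable.

```javascript
// Hypothetical helper: collects every string item of a DataTransfer into
// an array of { type, content } objects.
function readStringItems(dataTransfer) {
  const pending = [];
  for (const item of dataTransfer.items) {
    // Read kind and type right away, while the handler is still running.
    const { kind, type } = item;
    if (kind === "string") {
      pending.push(
        new Promise((resolve) => {
          item.getAsString((content) => resolve({ type, content }));
        })
      );
    }
  }
  return Promise.all(pending);
}

// Browser usage:
// document.addEventListener("paste", (e) => {
//   readStringItems(e.clipboardData).then((entries) => console.log(entries));
// });
```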

After adding these two handlers and copying and pasting, we will see the following output in the console:

{ type: "application/json", content: "{\"message\":\"Hello\"}" }

It works! Apparently clipboardData.setData does not restrict data types the way the asynchronous write method does.

But why? Why can we read and write arbitrary data types using clipboardData, but not using the asynchronous Clipboard API?

History of clipboardData

The relatively new asynchronous Clipboard API was added to the specification in 2017, while clipboardData has been around much longer. The 2006 W3C draft of the Clipboard API describes clipboardData and its setData and getData methods (which shows that MIME types were not used at that time):

setData() This takes one or two parameters. The first must be set to either 'text' or 'URL' (case-insensitive).

getData() This takes one parameter, that allows the target to request a specific type of data.

But it turns out that clipboardData is even older than this 2006 document. Take a look at this quote from the “Status of this document” section:

In large part [this document] describes the functionalities as implemented in Internet Explorer…

The intention of this document is […] to specify what actually works in current browsers, or [be] a simple target for them to improve interoperability, rather than adding new features.

An article from 2003 describes how, at that time, Internet Explorer 4 and above allowed clipboardData to be used to read the user's clipboard without their consent. Since Internet Explorer 4 was released in 1997, the clipboardData interface appears to be at least 26 years old at the time of writing.

MIME types were introduced in the 2011 version of the specification:

The dataType argument is a string, for example but not limited to a MIME type…

If a script calls getData('text/html').

At that time, the specification did not yet define what data types should be used.

While it is possible to use any string for setData()'s type argument, sticking to common types is recommended.

[Issue] Should we list some “common types”?

The ability to use any string in setData and getData remains to this day. For example:

document.addEventListener("copy", (e) => {
  e.preventDefault();
  e.clipboardData.setData("foo bar baz", "Hello, world");
});

document.addEventListener("paste", (e) => {
  const content = e.clipboardData.getData("foo bar baz");
  if (content) {
    console.log(content); // Logs "Hello, world"
  }
});

If you paste the snippet above into DevTools and hit copy and paste, you'll see the message “Hello, world” in the console.

Apparently, the ability to use arbitrary data types in clipboardData has been kept for historical reasons. “Don't break the web.”

Back to isTrusted

Let's look again at this sentence from the section on mandatory data types:

The data types that untrusted scripts are allowed to write to the clipboard are limited as a security precaution.

What happens if we try to write data to the clipboard in a synthetic clipboard event?

document.addEventListener("copy", (e) => {
  e.preventDefault();
  e.clipboardData.setData("text/plain", "Hello");
});

document.dispatchEvent(new ClipboardEvent("copy", {
  clipboardData: new DataTransfer(),
}));

It will run successfully, but will not make any changes to the clipboard. This is expected behavior, described in the specification:

Synthetic cut and copy events must not modify data on the system clipboard.

Synthetic paste events must not give a script access to data on the real system clipboard.

Only copy and paste events triggered by the user agent can modify or read the clipboard. This makes sense, since no one wants sites to be able to freely read the contents of the clipboard and, for example, steal passwords.


Let's sum it up:

  • The asynchronous Clipboard API, introduced in 2017, limits the types of data that can be written to and read from the clipboard. However, it can read and write data to the clipboard at any time, provided that the user has granted permission to do so (and the page is in focus).

  • The older Clipboard Events API does not have strict restrictions on the types of data that can be written to and read from the clipboard. However, it can only be used in copy and paste event handlers initiated by the user agent (that is, when isTrusted is true).

Using the Clipboard Events API seems to be the only option if you want to write more than just text, HTML, or images to the clipboard. It has fewer limitations in this respect.

But what if you want to make a Copy button that writes non-standard types of data to the clipboard? It doesn't seem like you can use the Clipboard Events API unless the user triggers a copy event, right?

Creating a copy button that writes arbitrary data

I tested the copy buttons in different web applications and looked at what was written to the clipboard. The result was interesting.

Google Docs has a copy button available in the right-click menu.

This button writes three data representations to the clipboard:

Note: The third representation contains JSON data.

They write their own custom data type to the clipboard, so they cannot be using the asynchronous Clipboard API. How do they do this from a click handler?

I ran the profiler, clicked the copy button, and examined the result. It turned out that clicking the copy button triggers a call to document.execCommand("copy").

This surprised me. My first thought was: “Isn't execCommand the old, deprecated way of copying text to the clipboard?”

It is, but Google uses it for a reason. What makes execCommand special is that it lets you programmatically dispatch a trusted copy event, as if the user had invoked the copy command themselves.

document.addEventListener("copy", (e) => {
  console.log("e.isTrusted is " + e.isTrusted);
});

document.execCommand("copy");
//=> "e.isTrusted is true"

Note: Safari requires an active selection for execCommand("copy") to dispatch a copy event. This can be simulated by adding a non-empty input element to the DOM and selecting it before calling execCommand("copy"), after which the input can be removed from the DOM.

So, execCommand allows us to write arbitrary data to the clipboard in response to click events. Great!
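Putting the pieces together, a copy button could look roughly like this. It is a sketch, not how Google Docs actually implements it: copyWithExecCommand is a hypothetical name, the button ID in the usage comment is an assumption, and the Safari selection workaround from the note above is left out.

```javascript
// Hypothetical helper: writes arbitrary representations to the clipboard by
// triggering a copy event through the deprecated execCommand("copy").
// `doc` is the Document to use; `representations` maps type -> string.
function copyWithExecCommand(doc, representations) {
  const onCopy = (event) => {
    event.preventDefault(); // Replace the default copy behavior
    for (const [type, data] of Object.entries(representations)) {
      event.clipboardData.setData(type, data);
    }
  };

  doc.addEventListener("copy", onCopy);
  try {
    // Returns false if the command is unsupported or disabled.
    return doc.execCommand("copy");
  } finally {
    // Always remove the listener so later copies by the user are unaffected.
    doc.removeEventListener("copy", onCopy);
  }
}

// Browser usage from a click handler:
// document.querySelector("#copy-button").addEventListener("click", () => {
//   copyWithExecCommand(document, {
//     "text/plain": "Hello",
//     "application/json": JSON.stringify({ message: "Hello" }),
//   });
// });
```

Registering the listener just before the call and removing it right after keeps the custom behavior scoped to the button, so a regular Ctrl+C elsewhere on the page still copies the selection as usual.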

What about pasting? Can we use execCommand("paste")?

Creating a paste button

Let's try the paste button in Google Docs and see what it does.

On my MacBook, I got a notification saying that an extension needs to be installed to use the paste button.

But on a Windows laptop, the paste button just worked.

That's strange. Where does this inconsistency come from? You can check whether the paste button will work by calling queryCommandSupported("paste"):

document.queryCommandSupported("paste");

On my MacBook I got false in Chrome and Firefox and true in Safari.

Safari, being privacy-conscious, asked for confirmation before pasting. In my opinion, this is a good idea, as it makes it clear that the site is about to read something from the clipboard.

On a Windows laptop I got true in Chrome and Edge and false in Firefox. Chrome's inconsistency is puzzling – why does it allow you to use execCommand("paste") on Windows but not on macOS? I couldn't find any information on this topic.

I also find it strange that Google doesn't try to fall back to the asynchronous Clipboard API when execCommand("paste") is unavailable. Even if they couldn't use the application/x-vnd.google-[...] representation, the HTML representation contains internal IDs that could be used.

<!-- HTML representation, cleaned up -->
<meta charset="utf-8">
<b id="docs-internal-guid-[guid]" style="...">
  <span style="...">Copied text</span>
</b>
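A paste button that falls back to the asynchronous Clipboard API could start by picking a strategy along these lines. This is a sketch of the idea only. choosePasteStrategy and the three strategy names are hypothetical, and `doc` and `nav` stand for the page's document and navigator objects.

```javascript
// Hypothetical helper: decides how a paste button could obtain clipboard data.
function choosePasteStrategy(doc, nav) {
  let execPasteSupported = false;
  try {
    execPasteSupported = doc.queryCommandSupported("paste");
  } catch {
    // Treat an implementation that throws as "not supported".
    execPasteSupported = false;
  }

  if (execPasteSupported) {
    return "execCommand"; // Trusted paste event, arbitrary data types
  }
  if (nav.clipboard && typeof nav.clipboard.read === "function") {
    return "async-clipboard-api"; // Only the standard data types
  }
  return "unavailable";
}

// Browser usage:
// const strategy = choosePasteStrategy(document, navigator);
```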

Another example of an app with a paste button is Figma, and their approach is completely different. Let's take a look at how they do it.

Copy and Paste in Figma

Figma is a web-based application (their native app uses Electron). Let's see what their copy button writes to the clipboard.

When copying in Figma, two representations are written to the clipboard: text/plain and text/html. At first glance, this is surprising. How does Figma represent its various layout and styling features in plain HTML?

But if we look at the HTML content, we see two empty span elements with data-metadata and data-buffer attributes:

<meta charset="utf-8">
<div>
  <span data-metadata="<!--(figmeta)eyJma[...]9ifQo=(/figmeta)-->"></span>
  <span data-buffer="<!--(figma)ZmlnL[...]P/Ag==(/figma)-->"></span>
</div>
<span style="white-space:pre-wrap;">Text</span>

Note: The data-buffer string is about 26,000 characters long for an empty frame. Beyond that, data-buffer grows linearly with the amount of content copied.

This looks like base64. The eyJ at the start of the string tells us that data-metadata is a base64-encoded JSON string. Decoding data-metadata with JSON.parse(atob()) gives us the following:

{
  "fileKey": "4XvKUK38NtRPZASgUJiZ87",
  "pasteID": 1261442360,
  "dataType": "scene"
}

Note: The original fileKey and pasteID have been replaced.
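The decoding step can be sketched as a small function that takes the pasted HTML string. The name decodeFigmaMetadata is hypothetical, and the pattern relies only on the (figmeta) markers visible in the HTML above. Figma's own code may well do this differently.

```javascript
// Hypothetical helper: pulls the base64 payload out of the
// "(figmeta)...(/figmeta)" marker and decodes it as JSON.
function decodeFigmaMetadata(html) {
  const match = html.match(/\(figmeta\)([A-Za-z0-9+/=]+)\(\/figmeta\)/);
  if (!match) {
    return null; // No Figma metadata in this HTML
  }
  return JSON.parse(atob(match[1]));
}

// Browser usage inside a paste handler:
// const metadata = decodeFigmaMetadata(e.clipboardData.getData("text/html"));
```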

But what about data-buffer? Decoding from base64 gives us the following:

fig-kiwiF\x00\x00\x00\x1CK\x00\x00µ½\v\x9CdI[...]\x197Ü\x83\x03

Looks like a binary format. After a little digging, I concluded that this is the Kiwi message format, created by Figma co-founder and former CTO Evan Wallace and used to encode .fig files.

Since Kiwi is schema-based, we can't parse the decoded data without knowing the schema. Luckily for us, Evan created a public .fig file parser. Let's try it on our buffer!

To convert the buffer into a .fig file, I wrote a small script that generates a Blob URL:

const base64 = "ZmlnL[...]P/Ag==";
const blob = base64toBlob(base64, "application/octet-stream");

console.log(URL.createObjectURL(blob));
//=> blob:<origin>/1fdf7c0a-5b56-4cb5-b7c0-fb665122b2ab
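The base64toBlob helper is not shown in the script. One possible implementation, given here as an assumption rather than the code that was actually used, decodes the string with atob and copies the bytes into a typed array:

```javascript
// One possible implementation of the base64toBlob helper used above.
function base64toBlob(base64, type) {
  const binary = atob(base64); // Each character holds one byte (0-255)
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new Blob([bytes], { type });
}
```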

Then I downloaded the resulting blob as a .fig file, loaded it into the parser, and voilà:

It turns out that copying in Figma creates a small Figma file and encodes it as base64. The resulting string is then placed in the data-buffer attribute of an empty HTML span element and stored in the user's clipboard.

Benefits of Using HTML

At first glance, this seemed a bit ridiculous to me, but there are serious advantages to this approach. To understand why, you need to look at how the web-based Clipboard API interacts with the Clipboard APIs of different operating systems.

Windows, macOS, and Linux each have their own formats for writing data to the clipboard. To write HTML to the clipboard, Windows has CF_HTML and macOS has NSPasteboard.PasteboardType.html.

All operating systems provide types for “standard” formats (plain text, HTML, PNG images). But which OS format should the browser use when the user tries to copy a custom format such as application/foo-bar to the clipboard?

Since there is no suitable match, the browser does not write it under any of the common formats of the operating system clipboard. Instead, it creates a representation that only browsers understand, which allows arbitrary data to be copied between browser tabs, but not between applications.

That's why using the common formats text/plain, text/html, and image/png is so convenient. They map to formats commonly used by the OS clipboard and can be easily read by other applications, which makes copy and paste work across applications. In Figma's case, using text/html makes it possible to copy a Figma element from figma.com and paste it into the native app, and vice versa.
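The same trick can be sketched for your own application. Everything named here is hypothetical (the "myapp" marker, the function names, the payload shape). It illustrates the general idea of carrying a base64 payload inside text/html and is not Figma's actual implementation.

```javascript
// Hypothetical marker name; Figma uses its own "(figma)" and "(figmeta)".
const MARKER = "myapp";

// Encodes a JSON-serializable payload as UTF-8 bytes, then as base64.
function encodePayload(payload) {
  const bytes = new TextEncoder().encode(JSON.stringify(payload));
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Builds the text/html representation: an empty span carrying the payload,
// followed by readable fallback text for other applications.
function buildClipboardHtml(payload, fallbackText) {
  const escaped = fallbackText
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  return (
    `<span data-buffer="(${MARKER})${encodePayload(payload)}(/${MARKER})"></span>` +
    `<span style="white-space:pre-wrap;">${escaped}</span>`
  );
}

// Recovers the payload on paste, or returns null if the marker is absent.
function extractPayload(html) {
  const pattern = new RegExp(`\\(${MARKER}\\)([A-Za-z0-9+/=]+)\\(/${MARKER}\\)`);
  const match = html.match(pattern);
  if (!match) {
    return null;
  }
  const bytes = Uint8Array.from(atob(match[1]), (ch) => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}
```

Going through TextEncoder and TextDecoder is a deliberate choice: btoa only accepts characters in the Latin-1 range, so encoding the JSON as UTF-8 bytes first keeps payloads with arbitrary Unicode text from throwing.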

What is written to the clipboard by browsers when copying custom types?

We've learned that we can read and write custom data types using the clipboard between browser tabs, but not between applications. But what exactly is written to the OS clipboard when we write a custom data type to the web clipboard?

I ran the following copy event handler in each of the major browsers on my MacBook:

document.addEventListener("copy", (e) => {
  e.preventDefault();
  e.clipboardData.setData("text/plain", "Hello, world");
  e.clipboardData.setData("text/html", "<em>Hello, world</em>");
  e.clipboardData.setData("application/json", JSON.stringify({ type: "Hello, world" }));
  e.clipboardData.setData("foo bar baz", "Hello, world");
});

I then inspected the contents of the clipboard using Pasteboard Viewer. Chrome adds four entries to the pasteboard:

  • public.html contains the HTML representation.

  • public.utf8-plain-text contains the plain text representation.

  • org.chromium.web-custom-data contains the custom representations.

  • org.chromium.source-url contains the URL of the page the data was copied from.

Looking at the contents of org.chromium.web-custom-data, we can see the data we copied:

I imagine the accented “î” and the inconsistent line breaks are the result of incorrectly displayed line break characters.

Firefox also creates public.html and public.utf8-plain-text entries, but writes custom data to org.mozilla.custom-clipdata. Unlike Chrome, it does not store the source URL anywhere.

As you might have guessed, Safari also creates public.html and public.utf8-plain-text entries. Custom data is written to com.apple.WebKit.custom-pasteboard-data which, interestingly, stores the full list of representations (including HTML and plain text) as well as the source URL.

Note: Safari only allows copying data between tabs if the source URL (domain) is the same. This limitation does not apply to Chrome or Firefox, although Chrome also stores the source URL.

Raw Clipboard Access for Web Applications

In 2019, a Raw Clipboard Access API was proposed to give web applications direct read and write access to the operating system clipboard.

The following excerpt from the “Motivation” section on chromestatus.com for the Raw Clipboard Access API briefly describes its benefits:

Without Raw Clipboard Access […] web applications are generally limited to a small subset of formats, and are unable to interoperate with the long tail of formats. For example, Figma and Photopea are unable to interoperate with most image formats.

However, the proposal was rejected due to security concerns, such as the possibility of remote code execution in native applications.

A newer proposal for writing custom data to the clipboard is Web Custom Formats (also called pickling).

Web Custom Formats (Pickling)

In 2022, Chromium implemented support for Web Custom Formats in the asynchronous Clipboard API.

It allows web applications to write custom data via the asynchronous Clipboard API by prefixing data types with “web ” (the word web followed by a space).

// Create JSON blob
const json = JSON.stringify({ message: "Hello, world" });
const jsonBlob = new Blob([json], { type: "application/json" });

// Write JSON blob to clipboard as a Web Custom Format
const clipboardItem = new ClipboardItem({
  [`web ${jsonBlob.type}`]: jsonBlob,
});
await navigator.clipboard.write([clipboardItem]);

Reading custom data is also done using the Clipboard API:

const items = await navigator.clipboard.read();
for (const item of items) {
  if (item.types.includes("web application/json")) {
    const blob = await item.getType("web application/json");
    const json = await blob.text();
    // Do stuff with JSON...
  }
}
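Since not every browser implements Web Custom Formats, it may be worth checking for support before writing. The sketch below assumes the static ClipboardItem.supports() method is available, which is itself not the case in every browser. webCustomTypeIfSupported is a hypothetical name, and the ItemCtor parameter exists only to make the function easy to test.

```javascript
// Hypothetical helper: returns the "web "-prefixed type if the browser
// reports support for it, otherwise null.
function webCustomTypeIfSupported(mimeType, ItemCtor = globalThis.ClipboardItem) {
  const webType = `web ${mimeType}`;
  if (
    ItemCtor &&
    typeof ItemCtor.supports === "function" &&
    ItemCtor.supports(webType)
  ) {
    return webType;
  }
  return null;
}

// Browser usage:
// const type = webCustomTypeIfSupported("application/json");
// if (type) { /* write a ClipboardItem keyed by `type` */ }
```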

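What's more interesting is that the data is written to the system clipboard. When writing to the system clipboard, the following information is written:

  • The raw data from each blob, in its own clipboard entry.

  • A mapping from data types to the names of those clipboard entries.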

On macOS, the mapping is written to org.w3.web-custom-format.map and looks like this:

{
  "application/json": "org.w3.web-custom-format.type-0",
  "application/octet-stream": "org.w3.web-custom-format.type-1"
}

The org.w3.web-custom-format.type-[index] values correspond to entries in the system clipboard that contain the raw data from the blobs. This allows native applications to look at the mapping to determine whether a given representation is available, and then read the raw data from the corresponding clipboard entry.

Note: Windows and Linux use a different naming scheme for the mapping and the clipboard entries.

This mitigates the security issues, because web applications cannot write raw data in arbitrary system clipboard formats. It comes with a compatibility trade-off, which is explicitly described in the pickling specification for the asynchronous Clipboard API:
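The lookup a native application would perform can be sketched as follows. A real native app would do this in its own language against the OS clipboard API. It is shown in JavaScript here only for consistency with the rest of the post, and resolveEntryName is a hypothetical name.

```javascript
// Hypothetical helper: given the JSON text of the mapping entry, returns the
// name of the clipboard entry that holds the raw data for a MIME type,
// or null if that type was not written.
function resolveEntryName(mappingJson, mimeType) {
  const mapping = JSON.parse(mappingJson);
  return Object.prototype.hasOwnProperty.call(mapping, mimeType)
    ? mapping[mimeType]
    : null;
}
```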

Non-goals

Allow interoperability with legacy native applications, without update. This was explored in a raw clipboard proposal, and may be explored further in the future, but comes with significant security challenges (remote code execution in system native applications).

As a result, native applications will need to be updated to ensure clipboard compatibility with web applications when using non-standard data types.

Web Custom Formats have been available in Chromium-based browsers since 2022, but other browsers have not yet implemented this API.

Conclusion

There is currently no good way to write custom data types to the clipboard that works across all browsers. Figma's approach of embedding base64-encoded strings in the HTML representation is crude, but effective at working around the limitations of the Clipboard API. It's a reasonable approach if you need to pass custom data types through the clipboard.

I find the proposal for Web Custom Formats interesting and hope it will be implemented by major browsers. It would allow safe and convenient writing of custom formats to the clipboard.

Thank you for reading and I hope you found the article interesting.
