Meet WOPI! How to set up work with documents in the browser

Hi all! My name is Alexey Simonov. I am a developer at ELMA.

Today we will talk about a protocol called WOPI. It lets you work with document files through a cloud document server: you select the file you want to view or edit, and it immediately opens in a web editor in your browser. WOPI is supported by products such as Onlyoffice, P7-Office, My Office, Microsoft Online Office, and ELMA365, which I help develop.

Our low-code platform has custom modules that extend the system's functionality and make it more flexible. More about modules here. I got to know the WOPI protocol while implementing one such module: the task was to organize work with files using various cloud document servers.

In this article I will cover the basic terms, the structure of the protocol and the principles of implementing it. The material should be useful to web developers of any level, and to team leads who need to gauge the complexity of such a task and break it down.

How WOPI Works - Host, Client and Browser

What is WOPI?

WOPI (Web Application Open Platform Interface) is, as the name suggests, an interface for interaction between web applications and, importantly, an open one. Oddly enough, the name says nothing about the main object this interface exists for: the file. WOPI is an interface through which clients (acting on behalf of ordinary users) interact with the server to perform operations on files. Typically these are office documents: text, spreadsheets, reports, presentations. The protocol lets you read and edit files, create new ones, and convert files from one format to another.

Let's take a quick look at the term "interface". In programming, an interface is a set of functions. If your object implements every function in an interface, it implements the interface. This will come in handy a little later.

This protocol was developed by Microsoft in January 2012. Version 1.0 was released on October 8, 2012. At the time of writing, the current major version is 14.0, released on November 16, 2021. You can see the list of versions here.

How does the WOPI protocol work?

Let's look at the simplest example from the documentation: the user wants to open a file for viewing.
The standard diagram for this example can be viewed here.

So, we have three entities: a browser, a WOPI server (host) and a WOPI client. Two of them act as clients and one as the server, and the server is the part we are responsible for, provided our system has to let users work with files over the WOPI protocol. If you are developing, say, a cloud document server, then your product plays the WOPI client role instead. This article looks at the situation from the server side.

It all starts with the browser – the user wants to open a document for viewing.

Interaction scheme using the WOPI protocol

The browser sends a request to the WOPI server, which also acts as the file storage. The server holds information about a specific WOPI client and asks that client for a description of the functionality it offers. This request is missing from the original large diagram: it is described separately, and the authors assume that the information is already stored on the server. In response, the WOPI client sends a list of functions that it can perform on files, each with a corresponding URL. This list is passed on to the browser. The browser, now knowing the specific function (file viewing) and its address on the client, goes to that address.

After that, a more detailed exchange begins. First the client requests information about the file from the server, then the file itself. As a result, the client's window opens in the user's browser with the file ready for viewing. That covers the general algorithm. Next, we will go through the main steps of the WOPI exchange between the server and the client.

WOPI server-client exchange

WOPI discovery

WOPI discovery begins with a request from the server to the client for a description of its functionality:

GET https://<wopi_client_address>/hosting/discovery

Getting WOPI discovery

In response, the client sends an XML document describing possible actions with files.

<wopi-discovery>
  <net-zone name="external-http">
    <app name="Word" favIconUrl="https://<favicon_url>/favicon.ico">
      <action name="view" ext="pdf" urlsrc="https://<action_url>?&<rs=DC_LLCC&><dchat=DISABLE_CHAT&><embed=EMBEDDED&><fs=FULLSCREEN&><hid=HOST_SESSION_ID&><rec=RECORDING&><sc=SESSION_CONTEXT&><thm=THEME_ID&><ui=UI_LLCC&><wopisrc=WOPI_SOURCE&>&"/>
      <action name="edit" ext="docx" default="true" requires="locks,update" urlsrc="https://<action_url>?&<rs=DC_LLCC&><dchat=DISABLE_CHAT&><embed=EMBEDDED&><fs=FULLSCREEN&><hid=HOST_SESSION_ID&><rec=RECORDING&><sc=SESSION_CONTEXT&><thm=THEME_ID&><ui=UI_LLCC&><wopisrc=WOPI_SOURCE&>&"/>
    </app>
  </net-zone>
</wopi-discovery> 

Document structure:

<wopi-discovery>
	<app name = … >
		<action name = … requires = … ext = … urlsrc = …/>
	</app>
</wopi-discovery>  

The root tag of such a document is wopi-discovery. Each action on a file is described by an action tag. Actions are grouped by the application that processes the files, for example app name="Word" for doc and docx, app name="Excel" for xls and xlsx.

<action name="view" ext="pdf" urlsrc="https://<action_url>?&<rs=DC_LLCC&><dchat=DISABLE_CHAT&><embed=EMBEDDED&><fs=FULLSCREEN&><hid=HOST_SESSION_ID&><rec=RECORDING&><sc=SESSION_CONTEXT&><thm=THEME_ID&><ui=UI_LLCC&><wopisrc=WOPI_SOURCE&>&"/>

The action contains the following attributes:

  • name – the name of the action

  • ext (extension) – the file extension the action applies to

  • requires – the set of WOPI functions that your server must implement for the action to work correctly

  • urlsrc – the address on the WOPI client at which the action is performed

For name, two values are the most common: view (viewing) and edit (editing). You may also encounter editnew (creating an empty file and then editing it) and convert (converting obsolete formats to modern ones for editing, for example doc to docx).

The values of the requires attribute include:

  • update – implies the PutFile and PutRelativeFile functions

  • locks – includes the Lock, RefreshLock, Unlock and UnlockAndRelock functions

We'll talk about the functions themselves a little later.
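The discovery document described above can be turned into a lookup table on the server. Below is a minimal sketch, assuming the discovery XML has already been downloaded as a string. The function name `parse_discovery`, the sample document and the `client.example` host are hypothetical. Note that in a well-formed XML document the angle brackets and ampersands inside urlsrc arrive escaped as entities, and the parser decodes them back.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened discovery document used only for illustration.
SAMPLE_DISCOVERY = """<wopi-discovery>
  <net-zone name="external-http">
    <app name="Word">
      <action name="view" ext="pdf"
              urlsrc="https://client.example/view?&lt;ui=UI_LLCC&amp;&gt;"/>
      <action name="edit" ext="docx" default="true" requires="locks,update"
              urlsrc="https://client.example/edit?&lt;ui=UI_LLCC&amp;&gt;"/>
    </app>
  </net-zone>
</wopi-discovery>"""


def parse_discovery(xml_text):
    """Map (extension, action name) to the action's description."""
    root = ET.fromstring(xml_text)
    actions = {}
    for app in root.iter("app"):
        for action in app.iter("action"):
            key = (action.get("ext"), action.get("name"))
            requires = action.get("requires", "")
            actions[key] = {
                "app": app.get("name"),
                "urlsrc": action.get("urlsrc"),
                # "locks,update" -> ["locks", "update"]; empty when absent
                "requires": [r for r in requires.split(",") if r],
            }
    return actions


actions = parse_discovery(SAMPLE_DISCOVERY)
print(actions[("docx", "edit")])
```

With such a table the server can pick the urlsrc for a given file extension and action, and check from requires whether it implements everything that action needs.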

The urlsrc address may include placeholder values, in other words parameters that can be configured when a particular action is called. Among them you may find:

  • ui – the interface language

  • dchat (disable chat) – controls whether the chat is displayed inside the file

There is also one required parameter, wopisrc. It must contain the general URL at which your server serves WOPI function calls. Let's look at it in detail.
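Turning a urlsrc template into a real address can be sketched as follows. This is an illustration under assumptions, not a reference implementation. The helper name `build_action_url` and the example hosts are hypothetical, and the placeholder syntax `<name=VALUE&>` is taken from the discovery sample shown earlier. Placeholders without a supplied value are dropped. If the template has no wopisrc placeholder, the sketch appends a WOPISrc parameter, which is the name Microsoft's documentation uses.

```python
import re
from urllib.parse import quote

# Matches one placeholder such as <ui=UI_LLCC&> and captures its name.
PLACEHOLDER = re.compile(r"<(\w+)=\w+&>")


def build_action_url(urlsrc, wopi_src, params=None):
    """Fill known placeholders, drop the rest, always pass the WOPI source."""
    values = {k.lower(): v for k, v in (params or {}).items()}
    values["wopisrc"] = wopi_src
    used = set()

    def fill(match):
        name = match.group(1)
        value = values.get(name.lower())
        if value is None:
            return ""  # no value supplied: remove the placeholder
        used.add(name.lower())
        return f"{name}={quote(value, safe='')}&"

    base, _, query = PLACEHOLDER.sub(fill, urlsrc).partition("?")
    parts = [p for p in query.split("&") if p]  # discard empty segments
    if "wopisrc" not in used:
        parts.append("WOPISrc=" + quote(wopi_src, safe=""))
    return base + "?" + "&".join(parts)


template = "https://client.example/edit?&<rs=DC_LLCC&><ui=UI_LLCC&><wopisrc=WOPI_SOURCE&>&"
print(build_action_url(template, "https://host.example/wopi/files/42", {"ui": "en-US"}))
```

The WOPI source is percent-encoded because it is itself a URL passed as a query parameter value.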

WOPI source

Aka wopisrc. This is the entry point for accessing the functionality of your WOPI server.

There are two rules defined for this address:

  1. It must end with exactly:
    /…/wopi…/files

To make it clearer, here are examples with explanations.

https://my-wopi-server.com/my-wopi/files/… is incorrect, because the part of the URL responsible for the entry point does not begin with "wopi". At the same time,
https://my-wopi-server.com/wopi-my/files/… is correct.

  2. After /wopi…/files there must be a file id, like this:
    /wopi…/files/<id>

The second rule is actually not that strict. The id does not have to be a bare file identifier: in its place there can be any information your server is designed to process, as long as the file id is contained in it. The file id itself is needed to uniquely identify the file the action is called for. So the second rule can be read as follows:

/wopi…/files/<information containing the file id>

WOPI host page

What does the server need in order to display the client's editor window with the contents of the file?

We create an HTML page of the following type:

<!-- The form that must be submitted to initialize the editor -->
<!-- id: the form identifier, needed to submit the form -->
<!-- action: the urlsrc we need, with the wopisrc parameter filled in and pointing to the entry point on our server -->
<!-- access_token: a token generated by the server to identify the client -->
<!-- access_token_ttl: the token lifetime; specifying it is optional -->
<form
    id="office_form"
    name="office_form"
    target="office_frame"
    action="<%= actionUrl %>"
    method="post">

    <input name="access_token" value="<%= token %>" type="hidden" />
    <input name="access_token_ttl" value="<%= tokenTtl %>" type="hidden" />
</form>

<!-- The page element the editor iframe will be attached to -->
<span id="frameholder"></span>

<!-- Script that creates the iframe -->
<!-- The WOPI client's editor component is rendered in it after the form is submitted -->
<script type="text/javascript">
    var frameholder = document.getElementById('frameholder');
    var office_frame = document.createElement('iframe');
    office_frame.name = 'office_frame';
    office_frame.id = 'office_frame';

    office_frame.title = 'Office Frame';
    office_frame.setAttribute('allowfullscreen', 'true');

    office_frame.setAttribute('sandbox', 'allow-scripts allow-same-origin allow-forms allow-popups allow-top-navigation allow-popups-to-escape-sandbox allow-downloads allow-modals');
    // Features in the allow attribute are separated by semicolons
    office_frame.setAttribute('allow', 'autoplay; camera; microphone; display-capture');
    frameholder.appendChild(office_frame);

    document.getElementById('office_form').submit();
</script>

For this to work, we need to fill in all the fields of the form and submit it. If we made no mistakes, the WOPI client's editor will load into the iframe with the file open in it. The editor will offer the functionality of the action whose urlsrc we specified (taken from the action tag of the discovery document).

How to make WOPI exchange secure

The protocol uses a token (access_token) to identify the source of requests (the user) and that user's rights to the file the request concerns.

The token is created by the server and sent to the client when the session is initialized. The client sends it back in each of its requests as a query parameter. The server's job is to validate the token and return the correct data for the request, or an authorization error if validation fails.
The WOPI exchange can be secured further with proof keys. The client signs each request with its private key and sends the signature in the X-WOPI-PROOF header. The public key can be found in WOPI discovery in the proof-key tag. To confirm that a request really comes from the client, the server builds the expected proof value using a documented algorithm and uses the public key from the proof-key tag to check that the signature in the X-WOPI-PROOF header matches it. Implementing this check is not mandatory.
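The protocol does not prescribe what the access token looks like: it only has to be something the server can validate later. One possible approach, shown here purely as a sketch, is an HMAC-signed payload. The names `issue_token`, `verify_token` and `SECRET`, as well as the payload fields, are assumptions made for this example. The second returned value follows Microsoft's documentation, which defines access_token_ttl as the expiry time in milliseconds since the Unix epoch rather than a duration.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"replace-with-a-real-secret"  # hypothetical server-side key


def _sign(body):
    return hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()


def issue_token(user_id, file_id, can_write, ttl_seconds=3600, now=None):
    """Return (access_token, access_token_ttl) for the host page form."""
    now = time.time() if now is None else now
    expires_at = int(now) + ttl_seconds
    claims = {"uid": user_id, "fid": file_id, "w": can_write, "exp": expires_at}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    return f"{body}.{_sign(body)}", expires_at * 1000


def verify_token(token, file_id, now=None):
    """Return the claims if the token is valid for this file, else None."""
    now = time.time() if now is None else now
    body, _, signature = token.partition(".")
    # Constant-time comparison of the received and expected signatures.
    if not hmac.compare_digest(signature.encode(), _sign(body).encode()):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    if claims["fid"] != file_id or claims["exp"] <= now:
        return None
    return claims


token, ttl = issue_token("user-1", "file-42", can_write=True)
print(verify_token(token, "file-42"))
```

A server that gets None back from such a check would answer the WOPI request with an authorization error.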

Common WOPI request headers

When a client sends requests to our server, it usually adds service headers to them. According to the section of the documentation that describes these headers, most of them serve logging purposes, so I see no point in describing them here. You can go here and get acquainted.
I will single out just one header that is not listed at the link above: X-WOPI-OVERRIDE contains a string with the name of the operation in the current request. This header is very important when several operations share a single address.

WOPI request response codes

I will list the main HTTP codes that our server can send in response to WOPI client requests and provide links to information about them.

Functions for working with files

These functions must be implemented on the server for correct interaction with the WOPI client. Description in the documentation here.

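| Function address | Function name | X-WOPI-OVERRIDE value | Required for reading a file | Required for reading and editing a file |
|---|---|---|---|---|
| GET /wopi…/files/<id> | CheckFileInfo | | Yes | Yes |
| POST /wopi…/files/<id> | Lock | LOCK | No | Yes |
| POST /wopi…/files/<id> | GetLock | GET_LOCK | No | No |
| POST /wopi…/files/<id> | PutRelativeFile | PUT_RELATIVE | No | No |
| POST /wopi…/files/<id> | Unlock | UNLOCK | No | Yes |
| POST /wopi…/files/<id> | RefreshLock | REFRESH_LOCK | No | Yes |
| POST /wopi…/files/<id> | UnlockAndRelock | LOCK + X-WOPI-OldLock header | No | Yes |
| POST /wopi…/files/<id> | DeleteFile | DELETE | No | No |
| POST /wopi…/files/<id> | RenameFile | RENAME_FILE | No | No |
| GET /wopi…/files/<id>/contents | GetFile | | Yes | Yes |
| POST /wopi…/files/<id>/contents | PutFile | PUT | No | Yes |
| GET /wopi…/containers/<id> | CheckContainerInfo | | No | No |
| POST /wopi…/containers/<id> | CreateChildContainer | CREATE_CHILD_CONTAINER | No | No |
| POST /wopi…/containers/<id> | CreateChildFile | CREATE_CHILD_FILE | No | No |
| POST /wopi…/containers/<id> | DeleteContainer | DELETE_CONTAINER | No | No |
| POST /wopi…/containers/<id> | RenameContainer | RENAME_CONTAINER | No | No |
| GET /wopi…/containers/<id>/ancestry | EnumerateAncestors (containers) | | No | No |
| GET /wopi…/containers/<id>/children | EnumerateChildren (containers) | | No | No |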

The function addresses in the table must match exactly: the WOPI client calls these addresses to invoke one function or another. The names of the functions in your code, however, can be arbitrary, since they are not visible from the outside.
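Since several operations share one address, the server has to look at the HTTP method, the path and the X-WOPI-OVERRIDE header together. Below is a rough sketch of such a dispatcher for the file operations. The function name `resolve_operation` is hypothetical, the path is assumed to arrive without its query string, and the old-lock header is matched under the name X-WOPI-OldLock used in Microsoft's documentation.

```python
import re

# /wopi…/files/<id> with an optional /contents suffix
FILE_ROUTE = re.compile(
    r"/wopi[^/]*/files/(?P<file_id>[^/]+)(?P<contents>/contents)?$"
)

# Override values that map one-to-one to an operation name.
OVERRIDES = {
    "GET_LOCK": "GetLock",
    "REFRESH_LOCK": "RefreshLock",
    "UNLOCK": "Unlock",
    "PUT_RELATIVE": "PutRelativeFile",
    "DELETE": "DeleteFile",
    "RENAME_FILE": "RenameFile",
}


def resolve_operation(method, path, headers):
    """Return (operation name, file id), or None if nothing matches."""
    match = FILE_ROUTE.search(path)
    if match is None:
        return None
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {k.lower(): v for k, v in headers.items()}
    override = lowered.get("x-wopi-override", "").upper()
    file_id = match.group("file_id")
    is_contents = match.group("contents") is not None

    if method == "GET":
        return ("GetFile" if is_contents else "CheckFileInfo"), file_id
    if method != "POST":
        return None
    if is_contents:
        return ("PutFile", file_id) if override == "PUT" else None
    if override == "LOCK":
        # The same override value serves Lock and UnlockAndRelock.
        relock = "x-wopi-oldlock" in lowered
        return ("UnlockAndRelock" if relock else "Lock"), file_id
    name = OVERRIDES.get(override)
    return (name, file_id) if name else None


print(resolve_operation("POST", "/wopi/files/42", {"X-WOPI-Override": "LOCK"}))
```

In a real server the returned name would select the handler to call. Container operations are left out because the article does not use them.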

To implement file reading via the WOPI protocol, it is sufficient to implement only two functions on the server:

  • CheckFileInfo

  • GetFile

If you want your server to support not only viewing files but also editing them, you will have to add the following functions:

  • Lock

  • Unlock

  • UnlockAndRelock

  • RefreshLock

  • PutFile

Let's talk about each of them in more detail.

CheckFileInfo

GET /wopi…/files/<id>

This is the very first request from the client to the server, asking for information about the file we are going to work with. The server must be able to find the required file by the id at the end of the address. The response is a JSON structure with fields such as:

  • BaseFileName — file name

  • Version – file version

  • OwnerId — ID of the user who created the file

  • Size — file size in bytes

  • UserId — identifier of the user on whose behalf the request is made

  • UserFriendlyName – the name of the user making the request

  • SupportsLocks — whether the server has locking functions implemented

  • UserCanWrite – whether the current user can edit the file

  • DisablePrint – ability to disable the print button in the WOPI client window

  • DownloadUrl – link from which you can download the file
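A CheckFileInfo response built from the fields listed above might be assembled like this. The shapes of the `file_meta` and `user` dictionaries and the function name `check_file_info` are assumptions made for the sketch. Only the keys of the returned structure come from the protocol.

```python
import json


def check_file_info(file_meta, user, can_write):
    """Build the CheckFileInfo payload from hypothetical host-side records."""
    return {
        "BaseFileName": file_meta["name"],
        "Version": str(file_meta["version"]),
        "OwnerId": file_meta["owner_id"],
        "Size": file_meta["size"],  # in bytes
        "UserId": user["id"],
        "UserFriendlyName": user["name"],
        "SupportsLocks": True,      # the server implements the lock functions
        "UserCanWrite": can_write,
        "DisablePrint": False,
    }


info = check_file_info(
    {"name": "report.docx", "version": 3, "owner_id": "user-1", "size": 18432},
    {"id": "user-2", "name": "Jane Doe"},
    can_write=True,
)
print(json.dumps(info, indent=2))
```

The dictionary is serialized to JSON and returned as the body of a 200 response.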

GetFile

GET /wopi…/files/<id>/contents

At this address the client receives the file from the server. Depending on the implementation, the client may sometimes use the DownloadUrl field of the CheckFileInfo response instead, but this function must be implemented in any case.

PutFile

POST /wopi…/files/<id>/contents

X-WOPI-OVERRIDE header with value PUT.

When this function is called, the server must replace the contents of the file with the request body. It is called after editing is complete. Clients usually lock the file with the Lock function before calling it. The file should only be replaced if the X-WOPI-LOCK header of the request contains the same value that was set by the Lock function. Otherwise the server returns 409 Conflict with the current lock identifier in the X-WOPI-LOCK header.
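The lock check described above fits in a few lines. In this sketch `storage` and `locks` are plain in-memory dictionaries standing in for whatever storage the server really uses, and `put_file` is a hypothetical name. The function returns a status code and the response headers.

```python
def put_file(storage, locks, file_id, request_lock, body):
    """Replace the file contents only if the caller holds the current lock."""
    current_lock = locks.get(file_id, "")
    if not current_lock or current_lock != request_lock:
        # Lock mismatch (or no lock at all): report the current lock value.
        return 409, {"X-WOPI-Lock": current_lock}
    storage[file_id] = body
    return 200, {}


storage = {"42": b"old contents"}
locks = {"42": "lock-A"}
print(put_file(storage, locks, "42", "lock-A", b"new contents"))
print(put_file(storage, locks, "42", "lock-B", b"other contents"))
```

The second call is rejected and leaves the stored contents untouched.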

Lock functions

Lock

POST /wopi…/files/<id>

X-WOPI-OVERRIDE header with value LOCK.

This function is used to lock a file. When a user opens a file for editing, the client sends a lock request after receiving the file, to prevent anyone else from editing it. The request must include the X-WOPI-OVERRIDE header with the LOCK value. Along with it, the client sends a second header, X-WOPI-LOCK, which contains the string identifier of the lock. The lock identifier is used to determine unambiguously which client has locked the file.

If the server has no lock identifier for the file at that moment, it must lock the file and remember the identifier.

If there is an identifier and it matches the value from the request, the lock needs to be refreshed, that is, the RefreshLock function is executed. In all other cases the server should return 409 Conflict and put the current lock value in the X-WOPI-LOCK response header. The lock expires automatically after 30 minutes.

GetLock

POST /wopi…/files/<id>

X-WOPI-OVERRIDE header with value GET_LOCK.

Returns the current value of the lock identifier. If the file is not locked, an empty string is sent in the X-WOPI-LOCK header. If there is a lock, the header carries its identifier. If the current identifier for some reason does not match the expected format, the server responds with 409 Conflict.

RefreshLock

POST /wopi…/files/<id>

X-WOPI-OVERRIDE header with value REFRESH_LOCK.

This function extends the current lock. WOPI clients typically prefer calling Lock again over this function. If the stored lock identifier differs from the one in the request, someone else has taken over the file. In that case the server responds with 409 Conflict and returns the current identifier. If the file is not locked by anyone, an empty string is returned as the identifier.

Unlock

POST /wopi…/files/<id>

X-WOPI-OVERRIDE header with value UNLOCK.

Signals that editing is complete and the file can be released from the lock. If the identifier in the X-WOPI-LOCK request header matches the current one, the lock is released and the response must contain X-WOPI-LOCK with an empty string as its value. If the current identifier differs from the value in the request, the server responds with 409 Conflict and returns the current identifier. If the file is already unlocked, we do the same, but leave the X-WOPI-LOCK header empty.

UnlockAndRelock

POST /wopi…/files/<id>

X-WOPI-OVERRIDE header with value LOCK.

This function sets a new lock with a given value. It is essentially identical to the Lock function, and the two can be told apart by their headers. In Lock, the current identifier is sent in the X-WOPI-LOCK header. Here that header carries the new identifier, and the current one is sent in X-WOPI-OldLock. The X-WOPI-LOCK header should only be included in the response if something went wrong, in which case it is set to the current lock value. If the request succeeds, 200 OK should be returned.
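The rules for the five lock functions can be gathered into one small component. The class below is an in-memory sketch under assumptions: the `LockStore` name, the method names and the injectable clock are invented for the example, and a real server would keep locks in shared storage. Each method returns a status code and the response headers, following the behavior described in this section.

```python
import time

LOCK_TTL_SECONDS = 30 * 60  # locks expire after 30 minutes


class LockStore:
    def __init__(self, clock=time.monotonic):
        self._locks = {}  # file_id -> (lock_id, expires_at)
        self._clock = clock

    def _current(self, file_id):
        """Return the active lock id, or "" if none or expired."""
        entry = self._locks.get(file_id)
        if entry is None:
            return ""
        lock_id, expires_at = entry
        if self._clock() >= expires_at:
            del self._locks[file_id]
            return ""
        return lock_id

    def _set(self, file_id, lock_id):
        self._locks[file_id] = (lock_id, self._clock() + LOCK_TTL_SECONDS)

    def _conflict(self, file_id):
        return 409, {"X-WOPI-Lock": self._current(file_id)}

    def lock(self, file_id, lock_id):
        # A free file gets locked; the same id simply refreshes the lock.
        if self._current(file_id) in ("", lock_id):
            self._set(file_id, lock_id)
            return 200, {}
        return self._conflict(file_id)

    def get_lock(self, file_id):
        return 200, {"X-WOPI-Lock": self._current(file_id)}

    def refresh_lock(self, file_id, lock_id):
        current = self._current(file_id)
        if current and current == lock_id:
            self._set(file_id, lock_id)
            return 200, {}
        return self._conflict(file_id)

    def unlock(self, file_id, lock_id):
        current = self._current(file_id)
        if current and current == lock_id:
            del self._locks[file_id]
            return 200, {"X-WOPI-Lock": ""}
        return self._conflict(file_id)

    def unlock_and_relock(self, file_id, new_lock_id, old_lock_id):
        current = self._current(file_id)
        if current and current == old_lock_id:
            self._set(file_id, new_lock_id)
            return 200, {}
        return self._conflict(file_id)


store = LockStore()
print(store.lock("42", "lock-A"))
print(store.lock("42", "lock-B"))
print(store.unlock_and_relock("42", "lock-B", "lock-A"))
```

Passing the clock in makes the expiry rule easy to check without waiting half an hour.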

I have not encountered the use of other functions, so I will not talk about them. As always, if necessary, you can view the official documentation.

Now that you are familiar with the implementation details of the WOPI protocol, the diagram for opening a file for viewing can be redrawn with the descriptive labels on the arrows replaced by specific functions:

Interaction diagram with WOPI functions indicated

Case Study

As mentioned at the beginning of the article, we implemented all work with files over the WOPI protocol in a module. A widget was created along the lines of the host page, with some changes. The functions needed for reading and editing files were implemented as the module's API methods; their list can be found in the table above. To support tokens in the WOPI exchange, we had to slightly modify the ELMA365 platform itself. The result is a tool for working with files over the WOPI protocol using various cloud document servers. Below is the interface of the OnlyOffice editor loaded in the WOPI module's widget inside ELMA365.

OnlyOffice interface works via WOPI protocol inside ELMA365

Instead of a conclusion

As for the pros and cons of the WOPI protocol, I see two main advantages: universality and speed of implementation. By universality I mean the ability to plug in any document server that implements the protocol as a WOPI client. Just change the client address, and voila, the editor from the new vendor is open in the browser. I managed to experiment with My Office, Onlyoffice and MS Online Office, and they all worked more or less the same. Speed of implementation comes from the fact that few functions actually need to be implemented. Just think about it: you only need two to read a file! If you know all the subtleties and tricks, one day will be enough for an MVP (it won't), well, maybe two.

Of course, this solution also has disadvantages. The first and main one is that anything universal is rigid and hard to customize, and this protocol is no exception. It supports the minimum set of operations required to work with files. As far as I can remember, there is only one interface setting: whether to show the print button or not. And the unique features of a particular product are not accessible at all.

The second disadvantage, which may be critical for some, is that WOPI is inherently slower than a product's native API, because that same API is often running under the hood of WOPI. The difference in performance is not always visible to the naked eye, but as Murphy's law has it, the gun can go off at the most inopportune moment.

If you have experience with WOPI, please share in the comments.
