What to do if your customer is a scale, or How to speak the same language as household appliances

My name is Alexey Plaksin, I am a systems analyst at KODE, and this is the story of how I reverse engineered household appliances.

One day we were approached by a large household appliance brand that also makes and sells “smart” appliances for the home. We needed to quickly develop a new mobile application for managing smart devices.

We had three devices on hand – scales, a kettle and a multicooker. We only knew how they worked from the instructions and theoretical information on the website. We had no experience developing applications to control such smart devices. We also didn't know how Bluetooth and Wi-Fi control worked.

The project promised to be interesting.

We research and determine the source of traffic

A quick search turned up so much material on the Internet of Things that it became clear we had to narrow it down. From the description on the boxes we understood that the devices work via Bluetooth. Searching for keywords like “multicooker [brand name] control via Bluetooth”, we found a YouTube channel where the valiant open-source community explained how to find patterns in the data sent to and received from the devices. This was our starting point: now everything had to be checked and clarified.

I remembered Wireshark from my university days: a utility for analyzing network traffic. It turned out that it also handles Bluetooth traffic.

The tool had been found; all that remained was to figure out what to analyze. I remembered the rule “Whatever gets onto the Internet stays on the Internet.” I was not after meme archaeology, though, but the APK of the old version of the application. I installed it, registered with a test email, and voila: I had a traffic source for analysis.

But the devices were in Kaliningrad, where a whole table in the office was covered with them, and I was working remotely from Yekaterinburg. So I asked my colleagues from the QA team for help (greetings and rays of goodness to them). They needed to:

1. Install and run the old version of the application,

2. Enable screen recording on the phone,

3. Complete tasks like “Boil the kettle, turn on the water heating and the backlight.”

Then they pulled device logs from the phone and sent them to me along with the screencast. I opened the logs in Wireshark, compared them with what was happening on the screencast, and tried to decipher the traffic.

A lyrical digression: the log had been recorded over a whole day, so I had to master filtering. Filtering by the device's MAC address turned out to be the simplest. bluetooth.addr == a4:c1:38:d6:a7:b5 is a display filter that finds all packets sent to or received from this device.

Let's delve a little deeper into Bluetooth theory to highlight the key points. Imagine that you are at an interview and the recruiter asks about your skills. A similar process occurs between the phone's Bluetooth adapter and the smart device. The controller queries the device's service addresses in order, and the device responds with what it can do: DeviceName, Appearance, Preferred Connection Parameters. What interested me, though, was the mysterious abbreviation UART: a universal asynchronous receiver-transmitter with two channels, Rx and Tx. This was the channel of communication with the device: you can write commands to Tx, and from Rx you can read the data that the device sends, for example the firmware version. A small success!

Learning to speak one language

It’s not enough to just send commands; you need to comprehensively understand how the application communicates with the device.

To do this, I used the application screencast and logs. The logs contained a timestamp that allowed me to understand what action on the phone caused a specific set of bytes. It was like learning a foreign language with a phrasebook – first you read the phrase, and then what it means.

So I compiled a “phrase book” of actions and the byte sequences they produce. I also had to keep in mind the commands that ran in the background. For example:

  1. The request 69|01|01|69 receives the answer 69|01|01|02|3b|69. This means that the firmware version shown in the application is 2.3b.

  2. The request 69|02|f0|69 receives the answer 69|02|f0|4c|34|fd|d7|fd|fb|69. This means that the mac-address of the device is fb:fd:d7:fd:34:4c.

  3. The request 69|03|ff|19|40|50|d4|80|dc|87|a5|69 receives the answer 69|03|ff|01|69. And this is the transfer of an access key to the device, without which the session will be terminated by the device itself.
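The three exchanges above suggest a simple framing, which can be sketched in Python. This is only an illustration built on assumptions: that 69 marks the start and end of every frame, that the second byte is a packet counter (01, 02, 03 in the examples) and that the third byte is a command code (01, f0, ff). The names Frame, parse_frame, firmware_version and mac_address are hypothetical and do not come from the original application.

```python
from dataclasses import dataclass

FRAME_MARKER = 0x69  # first and last byte of every example above


@dataclass
class Frame:
    counter: int    # assumed: packet counter (01, 02, 03 in the examples)
    command: int    # assumed: command code (01, f0, ff in the examples)
    payload: bytes  # everything between the command byte and the end marker


def parse_frame(text: str) -> Frame:
    """Parse a frame written as pipe-separated hex, e.g. '69|01|01|69'."""
    raw = bytes.fromhex(text.replace("|", ""))
    if len(raw) < 4 or raw[0] != FRAME_MARKER or raw[-1] != FRAME_MARKER:
        raise ValueError(f"not a valid frame: {text}")
    return Frame(counter=raw[1], command=raw[2], payload=raw[3:-1])


def firmware_version(frame: Frame) -> str:
    # Payload 02 3b is read as "2.3b": first byte as a number, second as hex text.
    major, minor = frame.payload
    return f"{major}.{minor:02x}"


def mac_address(frame: Frame) -> str:
    # The address arrives with its bytes in reverse order.
    return ":".join(f"{b:02x}" for b in reversed(frame.payload))


print(firmware_version(parse_frame("69|01|01|02|3b|69")))          # 2.3b
print(mac_address(parse_frame("69|02|f0|4c|34|fd|d7|fd|fb|69")))   # fb:fd:d7:fd:34:4c
```

A parser like this makes the phrase book easier to check against new logs: any frame that fails validation, or whose payload has an unexpected length, is a sign that the assumed layout needs revisiting.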

Collecting requirements

When I wrote the requirements for this implementation, I took into account that the team had little experience with Bluetooth. So I made an artifact that I am proud of: a complete collection of the logic for constructing the known commands for all supported devices. And this is just the introduction)

Despite its cumbersomeness, this artifact significantly reduced the time it took to work out system requirements for devices: instead of re-writing commands for each device each time, I could refer to the command described in the collection.

As for the general format of the requirements, each device comes with two inseparable artifacts:

  1. Functional requirements – a basic description of the functionality of the application to work with the device.

  2. System requirements – description of the algorithm.

Based on the information received, it was possible to get a general idea of how the state machine inside the equipment operates. Success? Success, but too early to celebrate. Having received the key to the common protocol, I also received a whole fleet of devices to add to the supported list. Here a new challenge awaited: it turned out that not all devices work the same way. Therefore, in addition to describing the commands for working with the device, I added state diagrams and a text description of the operating logic to the system requirements (God bless PlantUML!).

They helped to clearly understand the difference in the “behavior” of devices. For example, a kettle and a multicooker have a similar set of parameters. But if you give the kettle the “Start” command, it will start boiling water. And if you give the “Start” command to the multicooker, it will not do anything until you explain in detail what exactly you want from it.
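The difference can be sketched as two tiny state machines. This is a hypothetical model for illustration only: the class names, the two states and the set_program method are assumptions, not the real protocol or the real state diagrams from the requirements.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    IDLE = "idle"
    RUNNING = "running"


class Kettle:
    """Hypothetical model: Start needs no further input."""

    def __init__(self) -> None:
        self.state = State.IDLE

    def start(self) -> State:
        self.state = State.RUNNING  # begins boiling right away
        return self.state


class Multicooker:
    """Hypothetical model: Start is ignored until a program is set."""

    def __init__(self) -> None:
        self.state = State.IDLE
        self.program: Optional[str] = None

    def set_program(self, name: str) -> None:
        self.program = name

    def start(self) -> State:
        if self.program is not None:  # only start once it knows what to cook
            self.state = State.RUNNING
        return self.state


cooker = Multicooker()
print(Kettle().start())   # State.RUNNING
print(cooker.start())     # State.IDLE
cooker.set_program("soup")
print(cooker.start())     # State.RUNNING
```

The same command name leads to different transitions, which is why a per-device state diagram is worth keeping next to the shared command collection.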

Passing the main test

But the main test was the antagonist from the title of the article – scales.

Standard Bluetooth device capabilities are described in the documentation of the Bluetooth Special Interest Group (Bluetooth SIG). For example, it has a dedicated profile for floor scales: a beautiful one, with biometrics transfer and all that. I logically assumed that the new scales with biometrics used it. However, in the exchange I was greeted by Unknown Service.

The interview analogy will help me again here. The recruiter expects the applicant to be able to work with databases and basic statistics, and asks: “Which is closer to you, SQL or NoSQL?” And in response he receives “I prefer to count through transformation matrices of multidimensional spaces at midnight with charcoal ink on last year’s birch bark” (substitute any incomprehensible gibberish).

Naturally, such an answer would surprise the recruiter, and I was just as surprised. After all, I expected the scales to report status, time, measurements and biometrics. And they gave this:

Okay, I had the tools and decided to use them.

Timestamp

It synchronizes with the phone, great. But what's inside? What format? A number of seconds? Yep, just the number of seconds since 01/01/01. Noted.

Weight data

The scales probably measure weight in kilograms. Let's see.

The scales show 23.1 kg. Hmm, something interesting. If we simply convert the raw value to decimal, we get 101,253,374, which is nonsense. But we need a value with a decimal point. The search leads us to a medical device standard (!) and the floating-point transmission format it defines in two-byte and four-byte variants. And now, sleight of hand and no trickery: we reverse the byte order of the value.

The first byte, fe, is the exponent: it tells us where the decimal point goes (in this case -2). The rest is the value itself. We convert it in the calculator and get 2310, that is, 23.10. Hurray!
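The decoding step can be reproduced in a few lines of Python. This is a sketch under assumptions: the format is taken to be a four-byte float with an 8-bit signed exponent and a 24-bit signed mantissa (as in the IEEE 11073 family of medical device standards), the bytes are assumed to arrive least significant first, and the input 06 09 00 fe is inferred from the number 101,253,374 above, which is 0x060900FE. The function name decode_float32 is hypothetical, and the reserved special values the standard defines (such as NaN) are not handled.

```python
from decimal import Decimal


def decode_float32(raw: bytes) -> Decimal:
    """Decode a 4-byte value: 8-bit signed exponent, 24-bit signed mantissa."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    value = int.from_bytes(raw, "little")  # assumed wire order: reverse the bytes
    exponent = value >> 24                 # top byte, fe in the example
    if exponent >= 0x80:                   # two's complement, 8 bits
        exponent -= 0x100
    mantissa = value & 0xFFFFFF            # remaining three bytes, 00 09 06
    if mantissa >= 0x800000:               # two's complement, 24 bits
        mantissa -= 0x1000000
    return Decimal(mantissa).scaleb(exponent)  # mantissa * 10 ** exponent


print(int.from_bytes(bytes.fromhex("060900fe"), "big"))  # 101253374, the "nonsense"
print(decode_float32(bytes.fromhex("060900fe")))         # 23.10
```

Decimal is used instead of float so that the result keeps the exact decimal digits the device sent rather than a binary approximation.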

Status

Here I had to remember that hexadecimal values can be converted not only to decimal, but also to binary. The result is a string of 0s and 1s, which is in fact a set of Boolean flags.

Abstract example: 0f = 1111

Each bit here indicates some state of the device: whether there is a timestamp, whether there is data from the sensor, whether the user is authorized, whether there is a measurement result, as well as the unit of measurement. In fact, this is the device identifier.

There are static readings that change only depending on the units of measurement, and there are dynamic readings that are sent with each measurement: body resistance measurement system, weighing status, accuracy level.
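Reading a status byte as a set of flags might look like this in Python. The flag names and bit positions below are hypothetical, chosen only to mirror the kinds of states mentioned above; the real layout of the scales' status field is not given here.

```python
from enum import IntFlag
from typing import List


class ScaleStatus(IntFlag):
    # Bit positions are hypothetical; the real ones are not listed in the text.
    TIMESTAMP_PRESENT = 0x01
    SENSOR_DATA_PRESENT = 0x02
    USER_AUTHORIZED = 0x04
    MEASUREMENT_READY = 0x08


def decode_status(byte: int) -> List[str]:
    """Return the names of the flags that are set in a status byte."""
    return [flag.name for flag in ScaleStatus if byte & flag]


print(format(0x0F, "08b"))   # 00001111
print(decode_status(0x08))   # ['MEASUREMENT_READY']
print(decode_status(0x05))   # ['TIMESTAMP_PRESENT', 'USER_AUTHORIZED']
```

Naming the bits once and decoding through a helper keeps the comparison between two logs readable: the diff shows which flags changed rather than which hex digits did.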

Biometrics

The most important feature of the scale is biometrics. The protocol describes the transfer of ready-made parameters:

  • fat percentage,

  • percentage of water,

  • bone mass,

  • muscle mass.

But we have only two fields left in the data from the device, and one of them is always 00. So the answer must lie in the second one, which holds a value with a decimal point. It turns out to be 595.9. Very interesting, but nothing is clear.

We dig deeper into the documentation and find that this is a measure of the body’s resistance to a milliampere current with a frequency of 50 kHz. So now you know: every time you step on the scales, you get a tiny electric shock) Next we find an algorithm for calculating the indicators and that’s it, we nailed it!

Did we nail it?

We implement data recognition, draw a beautiful design, add data storage on the backend. With copywriting and design guidelines. Lavish, no expense spared. Then we test.

The developer comes and says: “The scales don’t work, and it’s not clear why. Everything is implemented as you wrote, but they don’t even provide measurements. The session gets terminated and I can’t do anything about it.” So I went to the tester and asked him to find a phone that had never been connected to these scales, install the application on it and send me the log of the first connection. I compared the two logs, the first connection and a reconnection, and found a difference.

The scales transmitted two values, and in response I sent back one. Moreover, the command 00 25 01a was present on every connection. It turned out that the scales were asking for something on the first connection. In the end I recognized an exclusive “or” (XOR), a Boolean operation that differs from the usual “or” in that for two ones it returns a zero. That is, we had implemented the internal logic correctly, but initially had not looked at what happens on the first connection, and that is where the problem came from.

The answer is simple: authorization! The scales required authorization.

Conclusions

Algorithm for working with embedded systems:

  1. Study all available data about similar types of devices.

  2. Find a traffic source to work with the device.

  3. Use analysis tools – Postman, Charles, Wireshark.

  4. Compare data and device states.

  5. Reproduce the logic.

  6. Check.

  7. Don't be upset if it doesn't work out.

What is important to remember:

  • An embedded system exists in a vacuum. It has nowhere to get data from except the controlling device, which means that its set of states and its internal logic are most likely limited.

  • It is better to try standard solutions first, and only then non-standard ones.

  • If the logic is broken, there is a reset option.

  • The main thing is not to release the magical white smoke)
