Who we are
Hello, everyone! We are Ekaterina Galitskaya and Daria Egorushkina from Kaspersky Lab (documentation and localization department). More specifically, our team is responsible for writing and localizing interface texts and help content for mobile applications.
The main trigger for change was development needs: development switched to frequent releases, once every two weeks. The scope per release shrank, but we had to translate more often and do it faster. In effect, localization became a development bottleneck. And while earlier the project managers did not even know the localizers' names (why would they, when translations magically appeared by themselves?), now almost everyone was aware of the problems and even knew what linguistic testing was 🙂
The localization cycle took 3 weeks:
- 3–5 days for translation;
- 2 weeks for linguistic testing.
Translation is self-explanatory, but why linguistic testing, and what is it exactly?
The main goal of linguistic testing is to verify the translation in context, that is, to do real localization. The translators knew our terminology, but they were still translating bare text, without seeing whether a string was a button or a heading, or what text stood next to it.
In addition, linguistic testing catches inconsistencies, missing translations, and hard-coded text that was never extracted into string resources, and it reduces legal risks (for example, when payment texts do not end up in the right field). Linguistic testing is usually done on screenshots.
There is a myth that a mobile application is small, so what is there to translate, really?
Ha! Some statistics:
- interface texts: 25 thousand words per project on average;
- 10 applications;
- 19 localizations per project on average;
- interface texts updated and documentation translated every week.
Why couldn't we go faster?
Let's look at what each localization stage consisted of.
Translation stage (9 steps):
- pull strings from the VCS manually, from different branches;
- build the translation delta manually;
- create translation packages;
- upload them to FTP;
- write a pile of emails to agencies, freelancers, and local offices;
- after translation, pick the files up from FTP, load them into the CAT tool, and check them;
- commit to the VCS, trying not to mix up the branches;
- start the build, fix errors, rebuild;
- kick off additional translations and bug fixes whenever the translation process had to be restarted.
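To make the routine concrete, the manual delta step above can be sketched in a few lines. This is an illustrative reconstruction, not our actual tooling; the string IDs and texts are made up.

```python
# Illustrative delta computation: given two dumps of source strings
# (string ID -> English text), only new or changed strings go out
# for translation; the rest is already covered by translation memory.

def translation_delta(old: dict, new: dict) -> dict:
    """Return the strings that are new or whose source text has changed."""
    return {
        key: text
        for key, text in new.items()
        if key not in old or old[key] != text
    }

previous = {"btn_ok": "OK", "title_scan": "Scan"}
current = {"btn_ok": "OK", "title_scan": "Quick scan", "btn_cancel": "Cancel"}
delta = translation_delta(previous, current)
# -> {"title_scan": "Quick scan", "btn_cancel": "Cancel"}
```

Trivial in code; the pain was doing exactly this by hand, per branch, per iteration.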
Problems of the translation stage, in short: the limitations of the old processes and a lot of routine work in the old CAT tools:
- Collecting strings from several branches was not supported: the delta for translation was assembled from all branches by hand, and the translations were laid back out into the branches by hand. It was hard to maintain, easy to mix up, and impossible to forget this horror.
- Maintaining consistency within a project and across languages by hand was impossible.
- Translation rounds could not run in parallel, that is, source resources could not be updated while translation was in progress: we first had to receive the first translation package and only then start the additional translation.
- Builds failed more and more often because of errors in variables, apostrophes, and other localization mistakes.
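The build failures in the last point come from a small set of mechanical mistakes, so they can be caught before a translation is committed. The sketch below is illustrative, not our production check: it assumes printf-style placeholders and Android-style apostrophe escaping.

```python
import re

# Illustrative pre-commit check for the two classic build breakers:
# printf-style placeholder mismatches between source and translation,
# and unescaped apostrophes in Android string resources.

PLACEHOLDER = re.compile(r"%\d+\$[sd]|%[sd]")

def find_issues(source: str, translation: str) -> list:
    """Return a list of localization problems that would break the build."""
    issues = []
    if sorted(PLACEHOLDER.findall(source)) != sorted(PLACEHOLDER.findall(translation)):
        issues.append("placeholder mismatch")
    if re.search(r"(?<!\\)'", translation):
        issues.append("unescaped apostrophe")
    return issues

find_issues("Found %1$s threats", "Menaces trouvées")
# -> ["placeholder mismatch"]: the variable was lost in translation
```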
Linguistic testing stage (19 steps):
- Start the build and wait for it.
- Restart it if it failed because of localization errors.
- Set up a special environment if there is no debug menu.
- Take screenshots according to the plan for one language.
- Repeat for 20+ languages.
- Work out with the testers how to get the screenshots that could not be taken.
- Form the packages: rename the screenshots.
- Share them on FTP.
- Create a task for the agencies.
- Answer questions from the agencies.
- Accept the completed task.
- Make the edits.
- Run the build and wait for it (sometimes builds sit in the queue for a long time).
- Rebuild in case of errors.
- Take regression screenshots (screenshots confirming that the changes were made).
- Formulate tasks for the agencies.
- Share them on FTP.
- Correspond with the agencies.
- If necessary (for example, if there was an additional translation round), go through another round of regression.
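The "form the packages: rename the screenshots" step is pure mechanics. A minimal sketch of such renaming, with a made-up naming scheme; a real package follows whatever convention the agency expects.

```python
# Illustrative renaming for the "form packages" step. The naming scheme
# (language folder, build number, screen ID) is invented for the example.

def package_name(language: str, screen: str, build: str) -> str:
    """Target file name for one screenshot inside a language package."""
    return f"{language}/{build}_{screen}_{language}.png"

languages = ["de-DE", "fr-FR"]
screens = ["scan_main", "scan_result"]
package = [package_name(lang, screen, "11.2.4")
           for lang in languages for screen in screens]
# package[0] -> "de-DE/11.2.4_scan_main_de-DE.png"
```

Again: trivial per file, exhausting when repeated for 40 screens and 20+ languages every two weeks.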
Problems of the linguistic testing stage: manual screenshots took the lion's share of the time. With about 40 screens per feature and 20 languages, that could add up to 70 hours of manual screenshotting…
On top of that, there was the human factor.
It is one thing to go through these steps once every three months. It is quite another to repeat them every two weeks. With each new iteration, the localizers sank deeper into the swamp of routine: send, accept, upload, repeat.
We had to find a solution, and fairly quickly.
What were the options? We could:
- hire more students;
- reduce the amount of localization work (and thus sacrifice quality);
- automate the routine tasks.
We settled on the latter.
What we wanted
We did not have a hundred years to sit down with a cup of coffee, roll up our sleeves, and spend a year analyzing the entire market of cloud solutions. We were looking for a ready-made solution we could start working with tomorrow. Our goal was to solve the problem.
What other requirements did we have:
- Fewer approvals: no waiting around until a purchase is approved; they issue the keys and that's it.
- Ready-made basic functionality: something we could sit down and start using, that did not need to be written from scratch, and that was stable. The rest could be tuned along the way.
- No huge server capacity required: again, to avoid getting bogged down in lengthy approvals.
- An inexpensive (preferably free) entry point to the service.
- No in-house developer required: that is, adequate server-side support and the ability to deploy it ourselves.
- Compliance with our internal security requirements: we connect to the service, not the service to us.
- Support for working with multiple branches simultaneously: translating several features in parallel.
- Running additional translation rounds in parallel.
Of the various options, we looked most closely at Zing (a translation service from the Evernote developers).
Pros:
- it can be customized to your needs;
- a free installation package: only server capacity was needed;
- no monthly fee;
- you can connect your own translators;
- private access (it can be hosted internally).
Cons: to connect translators and give them access, at least two more components had to be connected, which sharply raised the cost of the service in time and resources.
What we chose
Since we cannot connect a CAT system directly to our internal version control system, we needed a connector in between. You can write one yourself or take an existing one. So we tried out the Git – Serge – Smartcat chain.
Pros:
- Support for working with multiple branches.
- Resources are updated on the fly.
- Independence from the CAT tool's parsers (the configuration files are written on our side); Smartcat works with PO files.
- Correspondence with freelancers happens practically "in one window".
- Built-in search and selection of freelancers (direct communication, selection to fit the project's needs; in our case, translation speed and quality matter).
- Work in all languages and projects can be paid from a single account.
- At our request, they raised the priority of developing new features: they introduced new capabilities (text search across all project files, etc.) and fixed some problems.
- Quick tech support that helps with setup.
- Essentially free access to the service (a subscription is optional).
Cons:
- There was no text search across the whole project (and a project can have more than 1000 files), but the Smartcat developers introduced this feature at the end of last year.
- You cannot open multiple documents in one browser tab.
- There can be up to 200 resource files (documents in Smartcat) per language. After checking the text on the screenshots, the user needs to correct the translations but does not know which document a segment is in, so they have to open all 200 documents and search for the line.
- Notifications for freelancers remain a problem: they turn them off and then miss document updates, so in those cases we still write to them in the chat.
What we did and how things are now
In short: we changed the process of working with interface texts 🙂
- We tested the Git – Serge – Smartcat chain.
- We agreed with the developers on branch naming rules for writers and localizers (this removes the back-and-forth with the developers and lets us configure the rules for the loco-bot).
- We moved three complex projects to the new solution (each about 25 thousand words per language of pure interface texts, with 20+ localizations).
- We filled the glossaries in Smartcat and created the configs for Serge.
- We connected internal linguists and an agency.
- We extended the parsers on the Serge side: at the translation stage, the linguist can now see the segment ID, comments on the segment, and links to reference screenshots.
- We launched a cron job that finds branches by mask for localization and for editing the source (English) language.
- We piloted the translation of online help (successfully).
- We recruited our first batch of freelancers and taught them our workflow: translation with screenshots, comments, and a glossary.
- We added monorepo support: new Serge configuration files that automatically find the files to localize across the monorepository.
- Our developers implemented feature-based screenshotting in the Kaspresso framework. This solved not only the autoscreenshot problem* but also gave translators context: for each new line in a file, a link to a screenshot is added showing where and how that line is used. When a file with new lines "flies off" to Smartcat, the screenshot links land in the "Comments on segment" field.
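The cron step above boils down to matching branch names against the mask agreed with the developers. A minimal sketch; the `loc/*` mask is a purely illustrative naming convention, not our real one.

```python
import fnmatch

# Illustrative version of what the cron job does: out of all branches in
# the repository, take only those matching the naming convention agreed
# with the developers.

def branches_for_localization(branches, mask="loc/*"):
    """Select the branches that should go through localization."""
    return [branch for branch in branches if fnmatch.fnmatch(branch, mask)]

all_branches = ["master", "feature/vpn-ui", "loc/vpn-ui", "loc/antitheft"]
todo = branches_for_localization(all_branches)
# -> ["loc/vpn-ui", "loc/antitheft"]
```

This is why the naming rules had to be agreed first: without a predictable mask, no robot can tell which branches to pick up.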
What localization looks like now (9 steps in total):
- The writer commits new strings to Git. The strings are processed automatically and fly off to Smartcat.
- The localizer assigns translators (this step will soon be gone, right, Smartcat folks?))))
- Translators translate not blindly but with screenshots, that is, in context.
- Localizers check the translation (completing the whole file). The robot picks the translation up not line by line but once work on the whole file is finished; the translation automatically flies back and is committed to Git.
- Localizers run the autoscreenshots.
- Localizers upload the screenshots to FTP.
- Localizers answer the linguists' questions.
- Localizers make changes in Smartcat if necessary. The edits are automatically committed to Git.
- Localizers close the pull request.
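The "whole file" rule in step 4 can be expressed as a simple predicate: the robot commits a file back only when no segment is empty. An illustrative sketch, not the robot's actual code:

```python
# Illustrative "complete file" predicate: a translation goes back to Git
# only when every segment in the document is translated, never segment
# by segment.

def is_complete(segments: dict) -> bool:
    """True if no segment in the file is missing or empty."""
    return all(text and text.strip() for text in segments.values())

in_progress = {"btn_ok": "OK", "title_scan": ""}
finished = {"btn_ok": "OK", "title_scan": "Schnellscan"}
# is_complete(in_progress) -> False; is_complete(finished) -> True
```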
Of course, there is still room for automation and improvement. But the difference from where we started is already tangible.
What is Serge
It is an open-source solution, a connector between a version control system (SVN, Git, Gerrit (a Git-based code review system), Mercurial) and a TMS, in our case Smartcat.
Why we "signed up" for it: all cloud TMSs have connectors out of the box, but such boxed connectors connect directly to the repository, which is impossible in our case. The options were:
- expose part of the version control system;
- clone the folders with resource files for public access;
- receive and process the resource files before sending them to the TMS, then export them to the TMS.
Exposing part of the system is risky.
Making a clone is possible, but it takes time and people.
Serge is exactly what can receive resource files and process them before sending them to the TMS. As a result, the architecture is: Git – Serge – TMS.
Serge takes files from Git and processes them according to certain rules. It then converts them to PO format and sends them to Smartcat. Serge receives the translated PO files back from Smartcat, converts them, and commits them to Git.
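The idea of that conversion can be sketched as follows. The exact PO layout Serge emits may differ; this only shows how a string ID and a screenshot link (as a comment) travel with the source text into a PO entry.

```python
# Illustrative sketch of the Git -> PO direction: a parsed resource string
# becomes a gettext PO entry, with the string ID and the screenshot link
# carried along as comments for the translator.

def to_po_entry(string_id: str, source: str, comment: str = "") -> str:
    """Render one source string as a gettext PO entry."""
    lines = []
    if comment:
        lines.append(f"#. {comment}")  # extracted comment shown to the translator
    lines.append(f"#: {string_id}")    # reference: the resource string ID
    lines.append(f'msgid "{source}"')
    lines.append('msgstr ""')          # empty until translated
    return "\n".join(lines)

entry = to_po_entry("title_scan", "Quick scan",
                    comment="screenshot: https://example.com/scan.png")
```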
Another big plus of Serge for us is that it is deployed inside our company. The whole "kitchen" stays behind a stone wall, and nothing secret goes outside 🙂
Serge features we use:
- Matching source and target by file and resource string ID.
- Selecting files by a mask on the path or by content.
- Processing the contents of resource files before and after parsing.
- Configurable parsers.
Most importantly, in a relatively short time, about three months, we solved the problem and stopped being the bottleneck.
Results and numbers
| Stage | Hours before (2018) | Hours after (end of 2019) |
|---|---|---|
| Collect strings from all branches, manually | 1 | 0 |
| Get only the new or changed strings for the iteration and load them into the old CAT tool for 20 languages | 4 | 0.25 |
| Create translation packages; repeat for 20 languages | 0.5 | 0 |
| Create tasks for agencies/translators (1 language = 1 agency) | 2 | 0 |
| Download translation packages from FTP for each language; repeat for 20 languages | 0.5 | 0 |
| Write to the agency/translator and get confirmation that the task was taken; repeat for 20 languages | 2–3 | 0 |
| Answer the translator's questions; repeat for 20 languages | 2–4 | 0.5 |
| Accept the translation for each language | 1 | 0.25 |
| Run the build | <8 (fixing bugs from the old CAT tool) | 0.25 |
| Additional translation (repeat all of the above) | 8 | 0.25 |
| Get screenshots | 16–32 (manually, by yourself) | 8 (autoscreenshots) |
| Upload to FTP | 8 | 1 |
| Correspond with the agency/freelancers | 8 | 1 |
| Push the changes to Git | 8 | 0.25 |
- Builds do not fail anymore: variables and non-translatable words are wrapped in placeholders, and apostrophes are escaped at the parser stage.
- We do not take devices away from the testers.
- We do not waste developers' and testers' time on fixing the build or figuring out how to take a particular screenshot.
- Translation happens in context: English screenshots are available already at the translation stage, and they are EASY to open and view.
- Smartcat lets us treat untranslated segments as a critical error; that is how we caught some important strings lost in the old CAT tool.
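The first point, placeholders and escaping at the parser stage, can be sketched like this. The `<ph>` wrapper tag is a made-up convention for the example; the escaping targets the Android resource format.

```python
import re

# Illustrative version of "variables are placed in placeholders,
# apostrophes are escaped at the parser stage".

def protect(text: str) -> str:
    """Wrap printf-style variables so translators cannot damage them."""
    return re.sub(r"(%\d+\$[sd]|%[sd])", r"<ph>\1</ph>", text)

def escape_apostrophes(text: str) -> str:
    """Escape bare apostrophes that would otherwise break the Android build."""
    return re.sub(r"(?<!\\)'", r"\\'", text)

protect("Found %1$s threats")    # -> "Found <ph>%1$s</ph> threats"
escape_apostrophes("L'analyse")  # the bare apostrophe becomes \'
```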
Moreover, the Git – Serge – Smartcat chain allowed us to move the work of UX writers into Smartcat as well. How we did that, we will tell in the next article :)
*More about autoscreenshots: our colleagues wrote autotests and created Kaspresso, a framework for autotests. It produces autoscreenshots as a by-product of the autotests, and that is exactly what we use in localization.