Yaml is the king of meta descriptions
XML/HTML – the peculiarity of this format is that nodes are objects, which is necessary above all for encoding rich “human visualizations”. Hence the suffix “ML”: markup language. The format is used for websites and, for example, for storing MS Excel data. Nodes are represented by objects because tags have attributes and because the tree must be traversable in all directions. In the early 2000s, XML was widely used for transferring data between remote hosts, but it was often supplanted by JSON, since the latter is much simpler and the power of XML (along with its complexity) was not in demand.
JSON – also represents hierarchical data, but node keys cannot be objects, only scalars: strings in objects and implicit integer indices in arrays. The peculiarity of the format is its minimal set of rules, just enough for the practical implementation of parsers. Hence its popularity, including for exchanging messages between remote hosts. JSON is often not read by people at all, and encoding and decoding happen automatically.
YAML – in everyday use its node keys are also scalars rather than objects. The format includes JSON as a subset, is often used for configuration, and is oriented more towards human reading and writing than JSON is. This article compares the last two formats, but IMHO it should be emphasized that Yaml is intended for humans, while JSON is not only for humans. So asking “which one is cooler” is not quite the right question.
The situation with Yaml is probably the most complicated and confusing; for example, this whining post tells how bad everything is. But negative circumstances can and should be turned around, and to do that, everything should be put in its place. 23 years have passed since Yaml appeared, and it is still actively used in various programming languages for configuration, including in Symfony, the flagship PHP framework. In my opinion, this fact alone is enough to say that Yaml has its own “paradise”; you just need to understand it. You probably have to grow into Yaml: for example, I wrote my own Yaml parser and understood much of what I write about here only last year, although I have been programming for a long time.
I think there is a truly amazing fuckup in the architecture of standard Yaml. The initial idea is brilliant, but the final details are unsatisfactory. As the saying goes, “started well, finished badly”… Nevertheless, over its 23 years of life, Yaml has become established for local use. Its viability and its locality have the same explanation: it is really beautiful, minimalistic and expressive, but its parser will always be more complex, more voluminous and less efficient than a JSON parser. Performance is not a problem, since the compiled data can always be placed in a cache. I call Yaml the king of meta-descriptions because I think its syntax has absorbed all the most powerful ways of representing hierarchical structures in a form that looks beautiful and expressive to a person. In fact, Yaml is convenient not only for configuration but also for programming. But more on that later.
I didn't find a clear definition of “meta-descriptions” on Wikipedia, so I'll describe what I mean. Sometimes in programming it is effective to invent your own data format, one that reflects the algorithmic features of the parser working with it. I call data in such an ad hoc format meta-descriptions. For example, among my past works there is a PHP implementation of the CSS framework Tailwind. Up to 200 meta-descriptions are used to generate all (!) Tailwind utilities. Here is one meta-description, which generates 42 “Grid Column Start / End” utilities:
- MetaName: col #grid
ForMenu: 2col span-start-end
Body: |
auto|span-full|start-auto|end-auto|@num
grid-column: auto
.auto
grid-column: 1 / -1
.span-full
grid-column-start: auto
.start-auto
grid-column-end: auto
.end-auto
span-~num!=: span {0} / span {0}|start-~num!=-start: {0}|end-~num!=-end: {0}
grid-column{@}
.@num
You can test in the console how Vesper (my implementation of Tailwind) generates utilities. Here are three examples:
>php vendor/bin/sky venus tw col-auto
.col-auto {
grid-column: auto;
}
>php vendor/bin/sky venus tw col-span-2
.col-span-2 {
grid-column: span 2 / span 2;
}
>php vendor/bin/sky venus tw col-start-2
.col-start-2 {
grid-column-start: 2;
}
By the way, this article presents the work of a PHP programmer. For those who want to learn more about my work, an empty demo application, “Hole”, is available for installation. Disclaimer: Coresky is an experimental framework; more details are in the spoiler:
This means that I use the framework only for developing new architectural solutions and do not recommend using it on the public Internet. A lot of the work was done on endorphins and the pleasure of creativity alone. A lot of painstaking, non-creative work has not been done, and you can easily find errors. They are partly there because I still make backwards-incompatible changes between versions. Nevertheless, what I write about in this article works great. Most of the code has release-candidate status.
>composer create-project coresky/hole
# and you can immediately start the PHP dev web server:
>cd hole
>php vendor/bin/sky s
Let's fix the standard Yaml
Look, there are JSON and XML for transferring data between hosts, and we will use Yaml only locally. So there is no problem with writing your own parser. Let's fix the architectural fuckup in order to minimize the annoying “pitfalls” and, on the other hand, reveal the full potential of Yaml. Of course, we should try to preserve standard Yaml as much as possible; this will make the Coresky modification easier to learn. The Composer package symfony/yaml, like many other popular implementations, does not fully support the standard either. For example, multiple Yaml documents are usually not supported. In Coresky, the Yaml parser is a single file of 577 lines. I use a tokenization technique, while Symfony uses regular expressions.
Implicit typing. The Norway issue is some kind of nonsense. Let's keep only three working literals, which are standard in JSON and correspond to types in PHP: true, false, null. Of the numbers, we will keep only decimal integers and floating-point numbers (float), in full compliance with PHP syntax. Binary and hexadecimal integers and all other implicit typing are removed. For those purposes, it is better to use transformation functions or operators. More on that below.
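To make the reduced rule set concrete, here is a minimal sketch of a scalar resolver that recognizes only the three literals, decimal integers and floats, and leaves everything else as a string. The function name `resolveScalar` and the exact regular expressions are assumptions made for illustration; this is not the actual Coresky parser code.

```php
<?php
// Hypothetical helper, not taken from Coresky: resolves an unquoted scalar
// using only the reduced set of implicit types described above.
function resolveScalar(string $raw)
{
    $s = trim($raw);

    // The only three literals kept, the same as in JSON.
    $literals = ['true' => true, 'false' => false, 'null' => null];
    if (array_key_exists($s, $literals)) {
        return $literals[$s];
    }

    // Decimal integers only: no 0x.., 0b.. prefixes and no leading zeros.
    if (preg_match('/^[+-]?(?:0|[1-9][0-9]*)$/', $s)) {
        return (int) $s;
    }

    // Floats: digits with a dot and/or an exponent.
    $float = '/^[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+|[0-9]+(?=[eE]))(?:[eE][+-]?[0-9]+)?$/';
    if (preg_match($float, $s)) {
        return (float) $s;
    }

    // Everything else ("no", "yes", "0x1F", "0b11", ...) stays a string.
    return $s;
}

var_dump(resolveScalar('no'));   // a plain string, so the Norway problem does not arise
var_dump(resolveScalar('0x1F')); // also a plain string; an operator would be needed
```

With rules this narrow, the country code `no` and a hexadecimal-looking value both survive as strings, and any other conversion has to be requested explicitly.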
Explicit typing. Here we need to broaden the strategy and use the more appropriate terms “operator” or “transformation function”. Coresky has its own view compiler, Jet, which is a descendant of Laravel/Blade; it uses operators that start with the at sign (@). We use the same approach in Yaml. Implementing most transformation functions takes literally a couple of lines of PHP code. In the examples, the result of each transformation is written in the comments in standard JSON notation:
# @base64 is the same as !!binary in standard Yaml
img: @base64 |
  R0lGODdhDQAIAIAAAAAAANnZ2SwAAAAADQAIAAACF4SDGQ
  ar3xxbJ9p0qa7R0YxwzaFME1IAADs=
array: @csv(:) a:b:c # transformed into an array: ["a", "b", "c"]
hi: @hex2bin > # transformed into a string: "Hello word!"
  48 65 6C 6C 6F 20 77 6F 72 64
  21
# decimal numbers in special notation:
card: @dec 1234_5678_9012_3456 # >>> 1234567890123456
big-number: @dec 1 000 000 000 # >>> 1000000000
phone: @dec 2-777-222-22-22 # >>> 27772222222
# binary number:
five: @bin 101 # >>> 5
# Search-And-Replace, uses the PHP function preg_replace(..)
sar: @sar(|=+| is ) a========1, b==2 # >>> "a is 1, b is 2"
# @left(string), @right(string) - prepend or append "string"
key: @left(left-) value # >>> "left-value"
# and so on ..
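As a rough illustration of the “couple of lines of PHP” claim, the sketch below implements stand-ins for several of these operators as closures. The operator names mirror the examples, but the `$transforms` table, the closure signatures and the way parameters are passed are assumptions of this sketch, not the Coresky implementation.

```php
<?php
// Hypothetical stand-ins for a few operators; each takes the raw value
// and, where relevant, the parameters from the brackets.
$transforms = [
    // @dec: keep digits only (sign handling is omitted in this sketch)
    'dec' => fn(string $v): int => (int) preg_replace('/\D+/', '', $v),
    // @bin: binary notation, with or without the 0b prefix
    'bin' => fn(string $v): int => (int) bindec(preg_replace('/^0b/i', '', trim($v))),
    // @hex2bin: hex pairs separated by any whitespace
    'hex2bin' => fn(string $v): string => hex2bin(preg_replace('/\s+/', '', $v)),
    // @csv(sep): split the value by a separator
    'csv' => fn(string $v, string $sep = ','): array => explode($sep, trim($v)),
    // @sar: search and replace via preg_replace
    'sar' => fn(string $v, string $pattern, string $to): string => preg_replace($pattern, $to, $v),
    // @left(str): prepend a string
    'left' => fn(string $v, string $str = ''): string => $str . $v,
];

echo $transforms['dec']('2-777-222-22-22'), "\n";
echo $transforms['hex2bin']("48 65 6C 6C 6F 20 77 6F 72 64\n21"), "\n";
echo $transforms['sar']('a========1, b==2', '|=+|', ' is '), "\n";
```

Keeping the transformations in a lookup table like this is only one possible design; it makes adding a new operator a one-line change.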
Operators can only be specified at the beginning of a value, in either Yaml or JSON notation. Multiple operators can be specified for a single value, and they can be cascaded to all values in a sub-array. The order of execution is from right to left and from bottom to top. To cancel cascade execution for an individual value, you can use @deny:
one: @right( RR)
  two: @bin
    - 0b11
    - 101 # the 0b prefix can be omitted
    - @deny @left(LL ) 1010
  three: four
color: @left(#) @each @bang(. ) @sar(| +| ) > # not in cascade
  aliceblue.f0f8ff antiquewhite.faebd7
  beige.f5f5dc bisque.ffe4c4
# to encode the colors we specified only the informational "essence"
# this is why Yaml is the king of meta-descriptions
{
  "one": {
    "two": ["3 RR", "5 RR", "LL 1010"],
    "three": "four RR"
  },
  "color": {
    "aliceblue": "#f0f8ff",
    "antiquewhite": "#faebd7",
    "beige": "#f5f5dc",
    "bisque": "#ffe4c4"
  }
}
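The right-to-left order of the `color` chain can be traced in plain PHP. This is one plausible reading of the chain, written by hand: the behaviour assumed here for `@bang(. )` (split items on the space, then key and value on the dot) and for `@each` (apply the next operator to every element) is inferred from the resulting JSON, not taken from the Coresky sources.

```php
<?php
// The folded scalar (>) roughly as a parser would hand it over;
// extra spaces are added on purpose to show what @sar does.
$value = "aliceblue.f0f8ff antiquewhite.faebd7 beige.f5f5dc   bisque.ffe4c4\n";

// Rightmost operator first: @sar(| +| ) collapses runs of spaces.
$value = preg_replace('| +|', ' ', trim($value));

// @bang(. ), as assumed here: split items on " ", then key/value on ".".
$pairs = [];
foreach (explode(' ', $value) as $item) {
    [$name, $hex] = explode('.', $item, 2);
    $pairs[$name] = $hex;
}

// @each @left(#): prepend "#" to every element, keeping the string keys.
$colors = array_map(fn(string $hex): string => '#' . $hex, $pairs);

echo json_encode($colors, JSON_PRETTY_PRINT), "\n";
```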
Anchors and aliases. We will not extend the syntax; instead, we can use the transformation function @path:
one:
  two: [1, 3]
three: @path(one.two.1) # this value will be equal to 3
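A dot-separated path lookup of this kind is easy to sketch. The helper name `resolvePath` and the choice to throw on a missing key are assumptions made for this example; how Coresky handles missing paths is not stated in the article.

```php
<?php
// Hypothetical resolver for a dot-separated path such as "one.two.1".
function resolvePath(array $data, string $path)
{
    $node = $data;
    foreach (explode('.', $path) as $key) {
        // Each path segment must address an existing key of an array.
        if (!is_array($node) || !array_key_exists($key, $node)) {
            throw new OutOfBoundsException("Path not found: $path");
        }
        $node = $node[$key];
    }
    return $node;
}

$doc = ['one' => ['two' => [1, 3]]];
$doc['three'] = resolvePath($doc, 'one.two.1');
var_dump($doc['three']);
```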
Multiple Yaml documents. The parts must be named! Then you can freely compose Yaml data and apply the DRY principle. The Jet compiler supports file part markers, and we use the same strategy in Coresky Yaml:
#.run =========================
- @inc(.test) # @inc means include
- 2
#.run
#.test =========================
+ 123
#.test
<?php
# the yml(..) function executes inline Yaml, can cache the compiled PHP
# and pass variables, similar to how it is done in Jet (Blade)
print_r(yml('+ @inc(run) filename.yml'));
# stdout: array(0 => 123, 1 => 2)
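Extracting a named part from a file with such markers can be sketched with a single regular expression. The helper `extractPart` is hypothetical, and the marker grammar it accepts (an opening `#.name` optionally followed by decoration, a bare closing `#.name`) is inferred only from the example above.

```php
<?php
// Hypothetical helper: returns the text between the opening "#.name ..."
// marker and the closing "#.name" marker, or null if the part is absent.
function extractPart(string $source, string $name): ?string
{
    $q = preg_quote($name, '/');
    // m: ^ and $ match at line boundaries; s: the dot also matches newlines.
    $pattern = '/^#\.' . $q . '(?:[ \t][^\n]*)?\n(.*?)^#\.' . $q . '[ \t]*$/ms';
    return preg_match($pattern, $source, $m) ? rtrim($m[1], "\n") : null;
}

$file = <<<YAML
#.run =========================
- @inc(.test)
- 2
#.run
#.test =========================
+ 123
#.test
YAML;

echo extractPart($file, 'test'), "\n";
```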
PHP hybrid arrays. There is one inconsistency in standard Yaml notation: to specify a value with a string key at a lower level of the hierarchy, an indent is required, but for enumerations the indent is optional:
aaa:
  bbb: # { "aaa": { "bbb": null } } indent ==> depth
aaa:
bbb: # { "aaa": null, "bbb": null } <--- the depth did not change here
aaa:
  - bbb # { "aaa": [ "bbb" ] } indent ==> depth
aaa:
- bbb # { "aaa": [ "bbb" ] } <--- the depth changed here although there is no indent!!!
# Let's fix the syntax for the previous example in Coresky like this:
aaa:
- bbb # { "aaa": null, "0": "bbb" }
Yaml compiles to a PHP array, and PHP arrays can be hybrid. Let's allow string keys and enumeration keys (hyphen) to be combined in Yaml notation. This will expand the potential of Yaml usage. Below is the current code for generating the Coresky system configuration HTML form for the Root-Admin-Section:
#.system
- <fieldset><legend>Primary settings</legend>
- ['', [[<b><u>Production</u></b>, li]]]
trace_root: [Debug mode on production for `root` profile, chk]
trace_cli: [Use X-tracing for CLI, chk]
error_403: [Use 403 code for `die`, chk]
empty_die: [Empty response for `die`, chk]
gate_404: [Gate errors as 0.404 (soft), chk]
log_error: [Log ERROR, radio, [Off, On]]
log_crash: [Log CRASH, radio, [Off, On]]
- [Hard cache, {
cache_act: ['', radio, [Off, On]],
cache_sec: ['Default TTL, seconds', number, style="width:100px", 300]
}]
- </fieldset>
- <fieldset><legend>"Visitor's & users settings"</legend>
- [Cookie name, {
c_name: ['', '', '', sky],
c_upd: ['Cookie updates, minutes', number, style="width:100px", 60]
}]
visit: ['One visit break after, off minutes', number, '', 5]
reg_req: [Users required for registrations, radio, [Both, Login, E-mail]]
- </fieldset>
#.system
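To show why a hybrid array is convenient for this kind of data, here is a small hand-written sketch of how a consumer might walk one: integer keys are treated as literal markup, string keys as named fields. The array below only imitates the shape of the form definition, and the rendering is a placeholder; the real Coresky form generator is not shown in the article.

```php
<?php
// A hand-written hybrid array, similar in shape to what a form definition
// like the one above might compile to (an assumption of this sketch).
$form = [
    0 => '<fieldset><legend>Primary settings</legend>',
    'trace_cli' => ['Use X-tracing for CLI', 'chk'],
    'log_error' => ['Log ERROR', 'radio', ['Off', 'On']],
    1 => '</fieldset>',
];

$out = [];
foreach ($form as $key => $item) {
    if (is_int($key)) {
        // Enumerated item (a hyphen in Yaml): treated here as literal markup.
        $out[] = $item;
    } else {
        // String key: treated here as a named field [label, type, ...].
        [$label, $type] = $item;
        $out[] = sprintf('<!-- field %s (%s): %s -->', $key, $type, $label);
    }
}

echo implode("\n", $out), "\n";
```

PHP preserves insertion order for mixed keys, so the markup and the fields come out in the same order in which they were written.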
Coresky Yaml – Jet's little brother
Coresky has functionality for generating HTML forms from PHP arrays. Now forms can also be defined in Yaml. Let's add the operators @php and @preflight. Below is an example of Yaml code and the corresponding compiled PHP cache file:
#.test =======================================
+ @preflight($v_1, &$v_2) |
  return SomeClass::method_1($v_1, $v_2);
- string
- @php OtherClass::method_2($__return)
- {a: b, c: @php(OtherClass::method_3([1, $v_3])), x: y} # json notation
#.test
<?php
# preflight code
$__return = call_user_func(function() use ($v_1, &$v_2) {
    return SomeClass::method_1($v_1, $v_2);
});
# other yaml after compile
return array(
    0 => 'string',
    1 => OtherClass::method_2($__return),
    2 => array(
        'a' => 'b',
        'c' => OtherClass::method_3([1, $v_3]),
        'x' => 'y'
    ),
);
In the operator @php, PHP code can be specified either in the parameter (in brackets) or in the value (after the operator). Note that on the last line of the Yaml example above, everything inside the brackets of @php is PHP code, not JSON-notation data, even though Yaml syntax highlighters tend to render it incorrectly. If @preflight has brackets, the code is executed in an isolated scope, which is organized using a Closure and call_user_func; if there are no brackets, it is executed without isolation.
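The effect of the isolated scope can be checked in a few lines. The variable names and values below are made up for the demonstration; only the `call_user_func` plus `use (...)` pattern comes from the compiled example above.

```php
<?php
$v_1 = 10;
$v_2 = 1;
$leak = 'outer';

// With brackets, as in @preflight($v_1, &$v_2): only the listed variables
// are visible inside, and only those passed by reference can be changed.
$__return = call_user_func(function () use ($v_1, &$v_2) {
    $leak = 'inner'; // local to the closure, the outer $leak is untouched
    $v_2 += $v_1;    // captured by reference, so the change is visible outside
    return $v_1 * 2;
});

var_dump($__return, $v_2, $leak);
```

Without the closure, the same statements would run in the surrounding scope and the assignment to `$leak` would overwrite the outer variable.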
Calling such code and passing variables to it can be done with the Coresky function yml(..):
<?php
$array = yml('cache_filename', '+ @inc(test) filename.yaml', [
    'v_1' => $var_1,
    'v_2' => $var_2,
    'v_3' => $var_3,
]);
In the examples above, the scope of Coresky Yaml clearly overlaps with that of the Jet view compiler. Imagine an accounting application with a huge number of forms. Now, next to the folder mvc you can create a folder forms and place all the application's forms in it. The files containing Yaml forms can be named after the controllers, and several forms can be placed in one file, separated by markers named after the actions where these forms are used. Can you imagine how much this will unload the code of models, controllers and views? As the author of the article “Some YAML techniques” said, may KISS and DRY be with us; “divide and conquer” is relevant here too.
Yaml forms are not a panacea; look, for example, at this file. Quite often you need to “juggle” pieces of code and data to generate the final code. Previously, I used the capabilities of Jet in such cases. Now it is more convenient to do this in Yaml. The “Hole” application, which you can install using Composer, has several examples of forms in Yaml, including one that passes variables to the compiled cache.
You probably noticed the “plus” in the Yaml notation in the examples above, where there would normally be a hyphen. This is another new feature of the Coresky Yaml syntax. There are other new features as well; read more in the documentation. I tried to show the good side and the hidden potential of Yaml, although the standard does indeed have shortcomings. Often you can write a web application without Yaml, but in this project, for example, it would be tough without it. Yaml in Coresky is another powerful mechanism for programming and for simplifying code. If your project has a more or less significant array of data that is not rational to place in the DB, you have the opportunity to use Yaml.
Other interesting ideas in Coresky
I will write a few words about each and put the text in spoilers, as it is a little off-topic. If you are not interested – you can skip this part.
Plans and Products:
No more reinventing the wheel! The product system in Coresky is an attempt to make reusing functional code as easy as possible. Products come in three types: prod (functional code), dev (basically just plugins for the developer tools) and view (additional sets of Jet templates and CSS styles). The product name of the main application is always main. Products are similar to Zend/Laminas modules, but the latter have neither special tools for integrating them into the application nor special system tools for supporting them. Products are installed through the web interface of the developer tools, and a special initialization of the product may happen at that moment: creating tables in the database, etc.
Plans are an add-on to application entities. Standard plans that are available in any Sky application: app, cache, mem, gate, jet. Plans are essentially managed by the abstract cache engine.
Sky Gate – heavenly gates
Coresky does not have a conventional routing system. SkyGate is a web-based utility that configures external input data for controllers. The first part of the request address defines the controller, and the second part defines the action in it. If non-standard addressing is required, Coresky Rewrites must be used.
Jet Presentation Compiler
The idea of Laravel/Blade was taken as a basis. Jet has many innovations: a preprocessor, a binary idea of blocks, including the operators @block, @use, #use, file part markers, and three types of visualization generation: top, sub, block. This means that each visualization runs an action in the controller, prepares variables, and generates the visualization using a compiled Jet template.
Conclusion
If you work with a significant amount of data, especially hierarchical data, and you don't really like Yaml, take the time to learn it better. It is 23 years old and is doing great, despite all the bad rumors. If that is so, then someone needs it; maybe you do too? Once you, like Jake Sully, say “Yaml, I see you..”, you won't be able to tear yourself away from it.
Standard Yaml has hidden potential, and in this article I tried to show how to reveal it.
And if you “don't see Yaml” yet, be aware of the pitfalls and be careful.