What is a COM object, how is it developed, and what are the features of the Microsoft COM implementation?

Good afternoon everyone. I would like to get short answers, understandable to a beginner, to the following questions:
1) What is a COM object?
2) How is a COM object developed?
3) What are the features of Microsoft’s COM implementation?
Please refrain from pointing me to multi-page literature.
Thank you.

I came across this question and offer my detailed answer. I would be interested in the assessment of the professional community, and I hope newcomers find something useful too.

In practice, a COM object is usually a C++ class written according to the rules of COM (the rules themselves define a binary standard and are not tied to C++). One of the fundamental rules of COM is that objects are identified by the functionality, that is, the interfaces, they implement. To understand what a COM object is, one would have to study not only the fairly extensive documentation on developing COM objects, but also at least one practical example in which the technology is used. Thus the answer to the second question is really also the answer to the first.
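To make "identified by the functionality they implement" a bit more concrete, here is a minimal sketch in Python rather than real COM code. In actual COM this role is played by IUnknown::QueryInterface and 128-bit interface identifiers (IIDs). The names IID_DECODER, IID_RENDERER, Mpeg4Decoder and query_interface below are made up for illustration.

```python
import uuid

# Hypothetical interface identifiers; real COM uses GUIDs called IIDs.
IID_DECODER = uuid.UUID("11111111-1111-1111-1111-111111111111")
IID_RENDERER = uuid.UUID("22222222-2222-2222-2222-222222222222")


class Mpeg4Decoder:
    """Toy component that implements only the (hypothetical) decoder interface."""

    supported_interfaces = {IID_DECODER}

    def query_interface(self, iid):
        # Hand back the object if it supports the requested interface, else None.
        return self if iid in self.supported_interfaces else None

    def decode(self, packet):
        return f"decoded({packet})"


component = Mpeg4Decoder()
decoder = component.query_interface(IID_DECODER)    # supported
renderer = component.query_interface(IID_RENDERER)  # not supported

print(decoder.decode("frame-0"))
print(renderer)
```

The caller never needs to know the concrete class. It only asks whether the object supports a given interface identifier.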

Let me try to explain, based on my experience and a well-known example, what COM objects are and why modeling software as components is needed at all, by analogy with the world of physical devices. The idea is simple: just as different smartphones are assembled from a chipset, resistors, and capacitors, large applications can be assembled from self-contained components that are created in isolation and stored in separate files (DLLs). Such components are independent of each other and therefore, at the very least, cannot break each other’s compilation.

For example, consider the task of creating a universal player for video files (video player).

Implementation complexity problem (not to be confused with runtime complexity)

Few people think about how many formats exist for encoding images (video), for encoding audio, and for packaging encoded audio and video streams into a single file.

To get a feel for the scale of a program that claims to play any video, suppose there are 10 formats of each type. Our universal video player must then handle 10 * 10 * 10 = 1000 possible configurations of the file being played. Imagine a switch with a thousand cases! This is, of course, a very rough abstraction, but, oddly enough, it gives a clear and entirely adequate estimate of the scale of the problem. In the monolithic case the complexity is determined by a product of integers greater than one, while with the component approach described next it is determined by their sum plus some constant. By the way, for those not in the know, 10 options per type is a very moderate estimate: in reality the number runs to several dozen, and there are many more than three types of codecs.

The solution, as usual, becomes obvious once someone has formulated it. Instead of writing 1000 variants of the algorithm, one for each case, you need to:

1. Write a separate component (a DLL) for each decoder, unpacker, renderer, …; in the case considered, that is only 30 DLLs instead of 1000 cases in a switch!

2. Write some logic that finds the components needed to play a file with an arbitrary combination of data types inside, instantiates these components, and connects them into a chain that processes the input data streams from the file.

For those who like to express complexity in formulas, the original complexity was:

O(n_1 * n_2 * ... * n_m), where

n_i is the number of codecs of type i,

m is the number of codec types.

In the second case, the complexity is:

O(N_1 + N_2 + ... + N_m + L), where

N_i is the number of codecs of type i designed as components,

m is the number of codec types, and

L is the complexity of the logic from point 2. It might seem that N will be slightly higher than n, since some additional design work is needed. In fact, this design work often simplifies development, because it forces you to focus on the really significant aspects of the component’s implementation and brings a significant degree of uniformity to the code.
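The arithmetic behind the two estimates can be checked with a few lines of Python. This is only a sketch: the counts are the illustrative "10 formats of each type" from the text, and the variable names are my own.

```python
from math import prod

# Illustrative numbers from the text: 10 video, 10 audio, 10 container formats.
codecs_per_type = [10, 10, 10]

# Monolithic player: one case per combination of formats (a product).
monolithic_cases = prod(codecs_per_type)

# Component player: one component per format (a sum), plus the linking logic L.
component_count = sum(codecs_per_type)

print(monolithic_cases, component_count)
```

Adding a fourth codec type with 10 formats would multiply the first number by 10 but only add 10 to the second.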

The component approach has a separate, huge advantage. We can develop decoders (for example, for Mpeg4 and MP4) completely isolated from each other: in different projects, by different people, in parallel, … whatever! The only condition is that each component complies with certain rules of component design.

Now we can dwell on these component design rules in more detail, since they need to be understood in some form in order to understand the whole technology. To understand them, try to formulate the logic by which the video player application is built and run when each of the codecs it uses can be loaded from a separate DLL.

When you open a video player, it has no idea what type of file you will choose to play. It therefore makes no sense to load codecs in advance, even if there are only 30 of them. Which codecs to load can be determined only by analyzing the file opened for playback. Video files naturally store information about the formats in which the streams are recorded and how they are packed. An identification system has been established for this purpose, and it also makes it possible to determine which codecs must be loaded to decode these streams and send them to the appropriate rendering (playback) devices, which, by the way, are also represented by corresponding components.
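The "load only what the file asks for" logic can be sketched roughly as follows. Everything here is a toy model: the catalog, the component names, and the header layout are assumptions made for illustration, and load_component only records a name instead of loading a real DLL.

```python
# Hypothetical catalog: format identifier -> name of the component that handles it.
CATALOG = {
    "mp4": "mp4-demuxer",
    "avi": "avi-demuxer",
    "h264": "h264-decoder",
    "mpeg4": "mpeg4-decoder",
    "aac": "aac-decoder",
    "mp3": "mp3-decoder",
}

loaded_components = []  # stands in for DLLs actually loaded into the process


def load_component(name):
    """Pretend to load a DLL; here we only record that it happened."""
    loaded_components.append(name)
    return name


def open_file(header):
    """Load only the components named by the identifiers found in the file."""
    needed = [header["container"], header["video"], header["audio"]]
    return [load_component(CATALOG[fmt]) for fmt in needed]


# Nothing is loaded when the player starts.
assert loaded_components == []

# A made-up header, as if parsed from the beginning of the opened file.
chain = open_file({"container": "mp4", "video": "h264", "audio": "aac"})
print(chain)
```

Of the six components in the catalog, only the three named by the file are ever touched.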

What it looks like in a diagram

In the picture below you can see the usual result of the video player’s work when a file is opened. The result is a graph built from components of various types that were found, loaded, and connected to send and receive data; these components are needed to parse, decode, and display the data after the “play” button is pressed.

Oddly enough, in the world of COM objects the video player itself DOES NOT decode and DOES NOT play the video. It builds a connected graph from the components that will play the video, and THEN translates user commands (such as pressing the Stop and Play buttons) into commands for this graph. As a slight digression, the graph is itself represented by an object inside the video player, and the translation of user commands to this object is also largely regulated, although I’m not sure whether that is part of COM technology itself.
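A rough sketch of this division of labor: the player only builds the graph and forwards commands to it. For brevity the sketch models the graph as a simple linear chain, and the class and method names (Filter, Graph, command) are invented for this example rather than taken from any real API.

```python
class Filter:
    """Toy component: knows its downstream neighbor and its own state."""

    def __init__(self, name):
        self.name = name
        self.downstream = None
        self.state = "stopped"

    def connect(self, other):
        self.downstream = other


class Graph:
    """Toy graph object: connects components and relays commands to all of them."""

    def __init__(self, filters):
        self.filters = list(filters)
        # Connect each component to the next one in the chain.
        for upstream, downstream in zip(self.filters, self.filters[1:]):
            upstream.connect(downstream)

    def command(self, name):
        new_state = {"play": "running", "stop": "stopped"}[name]
        for f in self.filters:
            f.state = new_state

    def describe(self):
        return " -> ".join(f.name for f in self.filters)


graph = Graph([Filter("file reader"), Filter("splitter"),
               Filter("video decoder"), Filter("video renderer")])
print(graph.describe())

graph.command("play")  # what the player does when the user presses Play
states_after_play = [f.state for f in graph.filters]
graph.command("stop")  # and when the user presses Stop
states_after_stop = [f.state for f in graph.filters]
```

The player code contains no decoding or rendering logic at all. It only knows how to build the graph and which command to send.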

I have already noted that my method of estimating implementation complexity is very rough (although adequate in terms of the results obtained). If we dive even a little into how the right codec is selected for the next stage of data processing, for example decoding, we will see that the choice depends not only on the type of data in the stream to be processed, but also, for example, on how the previous codec feeds this data to it (for instance, an MP4 decoder must receive data from a file reader that parses the file and extracts the video and audio streams). This diversity grows with the depth of the implementation, yet all of it is successfully managed within the concept of COM objects. COM technology makes it possible to keep the code manageable and to avoid the old but still relevant problem known as DLL hell.

I think it is now quite clear that the point of COM, as a technology or a concept, is to break down one very large task, such as developing a universal video player, into many subtasks that can be solved independently of each other. It is easy to see that a solution that divides a huge monolithic program into many independent components, plus a lightweight kernel that links them into a complete application, requires support from the operating system. The operating system must be able to find the component required to process an identified data format. In effect, the OS must maintain a database of executable components and their descriptions, so that components (usually in the form of DLLs) can be added to (registered in) this database and later found, activated, and used according to the identification system defined in it. The component identification system is also one of the OS’s global infrastructure functions: components receive identifiers regardless of whether they have already been added to the operating system’s component database.
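The component database can be pictured with a small model like the one below. On Windows this role is played by the system registry, with GUIDs (CLSIDs) as component identifiers. The ComponentRegistry class, its methods, and the DLL file name are hypothetical and exist only for this sketch.

```python
import uuid


class ComponentRegistry:
    """Toy stand-in for the OS database of registered components."""

    def __init__(self):
        self._dll_by_id = {}
        self._ids_by_format = {}

    def register(self, component_id, handled_format, dll_path):
        self._dll_by_id[component_id] = dll_path
        self._ids_by_format.setdefault(handled_format, []).append(component_id)

    def find(self, handled_format):
        """Return identifiers of components able to process the given format."""
        return list(self._ids_by_format.get(handled_format, []))

    def locate(self, component_id):
        """Return the DLL that implements the component."""
        return self._dll_by_id[component_id]


# The identifier exists before (and independently of) registration.
H264_DECODER_ID = uuid.uuid4()

registry = ComponentRegistry()
found_before = registry.find("h264")  # nothing registered yet

registry.register(H264_DECODER_ID, "h264", "h264dec.dll")  # made-up file name
found_after = registry.find("h264")
print(registry.locate(found_after[0]))
```

The application asks the database by data format or by identifier and never hard-codes which DLL to load.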

Class factories and other design patterns

It is interesting to note that, for example, the “class factory” design pattern is an integral part of COM technology. To make it possible to create objects of the classes implemented in a DLL by identifier (or by some structure describing the functionality of the required object), each DLL implements one or more factory classes for its components. The same factory classes are called to obtain information about the DLL’s contents when it is registered in the component database.
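Here is a minimal sketch of the class factory idea, in Python instead of real COM code. In actual COM a DLL hands out factories that implement IClassFactory, and objects are created through its CreateInstance method. The identifiers, the ClassFactory class, and its methods below are simplified assumptions.

```python
class H264Decoder:
    def describe(self):
        return "H.264 video decoder"


class AacDecoder:
    def describe(self):
        return "AAC audio decoder"


class ClassFactory:
    """Toy factory that a 'DLL' exposes instead of exposing its classes directly."""

    def __init__(self, classes):
        self._classes = dict(classes)

    def create_instance(self, class_id):
        # The caller names the class by identifier and never sees the class itself.
        return self._classes[class_id]()

    def list_class_ids(self):
        # Used at registration time to learn what the 'DLL' contains.
        return sorted(self._classes)


# Hypothetical identifiers; real COM would use GUIDs (CLSIDs) here.
factory = ClassFactory({"decoder.h264": H264Decoder, "decoder.aac": AacDecoder})

decoder = factory.create_instance("decoder.h264")
print(decoder.describe())
print(factory.list_class_ids())
```

The same factory serves both purposes mentioned above: creating objects by identifier and reporting what the module contains.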

As I understand it, most design patterns were, if not originally formulated, then at least worked out in practice in the course of developing the concept of COM objects. So this is a very useful topic to study and, it seems to me, will remain so forever.

Third question: What are the specifics of Microsoft’s COM implementation?

Frankly, the question does not look quite right to me. As far as I know, COM technology is implemented only by Microsoft :). An analysis of distinguishing features implies a comparison with some other implementation, and there simply is no other implementation, or at least I don’t know of one.

As an answer to the third question, it is interesting to try to compare COM technology with some comparable or similar technology. Take, for example (unexpectedly), the technology of instantiating (creating, injecting, …) bean objects from Java Spring.

MS COM is essentially also a technology for instantiating COM objects.

To some extent, Dependency Injection is an interpretation of the idea of creating class objects by identifier, multiplied by the idea of a universal algorithm that instantiates objects based on code analysis.
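To illustrate what I mean, here is a toy injector in Python. It is not Spring’s API and not how Spring resolves beans; it only combines the two ideas named above: creating an object by identifier, and deciding what else to create by inspecting the constructor. The Container class and all other names are invented for this sketch.

```python
import inspect


class Container:
    """Toy injector: creates objects by identifier and resolves their dependencies."""

    def __init__(self):
        self._classes = {}

    def register(self, identifier, cls):
        self._classes[identifier] = cls

    def create(self, identifier):
        cls = self._classes[identifier]
        # "Code analysis": read the constructor's parameter names and treat each
        # name as the identifier of a dependency that must be created first.
        params = inspect.signature(cls.__init__).parameters
        deps = {name: self.create(name) for name in params if name != "self"}
        return cls(**deps)


class Decoder:
    def __init__(self):
        self.name = "decoder"


class Renderer:
    def __init__(self):
        self.name = "renderer"


class Player:
    def __init__(self, decoder, renderer):
        self.decoder = decoder
        self.renderer = renderer


container = Container()
container.register("decoder", Decoder)
container.register("renderer", Renderer)
container.register("player", Player)

player = container.create("player")
print(player.decoder.name, player.renderer.name)
```

The caller asks only for "player". Which other objects must be created is worked out by the container, not by the caller.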

Thus, Dependency Injection for beans in Java Spring seems to me, in some sense, to be a development and transfer to a new context of ideas originally formulated within COM technology. At the same time, I do not rule out that COM technology was itself, in its day, a development and transfer to a new context of ideas originally formulated somewhere else. I am a developer, not a historian, so historical accuracy is not my first priority.