A simple code analyzer for a program written in 8051 Assembler

The main goal was to reduce the size of a program: the microcontroller had little memory, while the product had to offer a lot of functionality. That led to the idea of using a code analyzer. A search on the Internet turned up nothing, so I had to write one myself.

I decided to share my ideas because I think someone could use them to write a better analyzer for 8051 assembler programs.

In this article I will describe the main stages of the resulting analyzer. Some of these steps can be used to analyze programs written in other languages.

Stage 1. First, you need to convert the source code of the program to the simplest possible form. It is more convenient to work with the program code from which everything unnecessary has been removed.

Example:

Function_ADD:

                mov     A,Peremenay1

                add     A,#Constanta1

                mov     Peremenay2,A

is reduced to

code_0001:      mov     A,020h

                add     A,#030h

                mov     021h,A

To get this form, I decided to disassemble the HEX file instead of processing the source. This turned out to be much more convenient, and the assembled file already has the code placed at its final addresses in memory, which is useful later.
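As a rough illustration of this stage, here is a minimal Python sketch that reads Intel HEX records into a memory map and decodes the three instructions from the example. The names `parse_intel_hex`, `disassemble` and `OPCODES` are made up for this sketch, and the opcode table is deliberately partial; a real analyzer would need the full 8051 table. Python is an assumption too, since the language of the original analyzer is not stated.

```python
def parse_intel_hex(text):
    """Return {address: byte} built from the data records of an Intel HEX image."""
    memory = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError(f"not an Intel HEX record: {line!r}")
        raw = bytes.fromhex(line[1:])
        # All bytes of a record, checksum included, must sum to 0 modulo 256.
        if sum(raw) & 0xFF != 0:
            raise ValueError(f"bad checksum: {line!r}")
        count, address, record_type = raw[0], (raw[1] << 8) | raw[2], raw[3]
        if record_type == 0x00:            # data record
            for offset, value in enumerate(raw[4:4 + count]):
                memory[address + offset] = value
        elif record_type == 0x01:          # end-of-file record
            break
        # Other record types (extended addressing) are ignored in this sketch.
    return memory


# Partial opcode table (assumption: only what the example needs):
# opcode -> (instruction size in bytes, text template for the operand bytes).
OPCODES = {
    0xE5: (2, "mov A,{0:03X}h"),    # MOV A,direct
    0x24: (2, "add A,#{0:03X}h"),   # ADD A,#data
    0xF5: (2, "mov {0:03X}h,A"),    # MOV direct,A
    0x22: (1, "ret"),               # RET
}


def disassemble(memory):
    """Decode a contiguous memory map into (address, text) pairs.

    Raises KeyError on any opcode missing from the partial table above.
    """
    lines = []
    address, end = min(memory), max(memory)
    while address <= end:
        size, template = OPCODES[memory[address]]
        operands = [memory[address + i] for i in range(1, size)]
        lines.append((address, template.format(*operands)))
        address += size
    return lines
```

Feeding it the record `:06000000E5202430F5218B` followed by `:00000001FF` yields the three instructions of the simplified example together with their addresses.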

Stage 2. I created a table in which I entered the lines of the source program, the type of the instruction, the type of the operands, the address in memory, etc.

Example:

Line number | Label (if any) | Operation name | Operand 1 | Operand 2 | Jump target | Operand type 1 | Operand type 2 |

1|code_0001|mov|A|020h| |ACC|DIR|

2| |add|A|030h| |ACC|CONST|

3| |mov|021h|A| |DIR|ACC|

The more columns the table has, covering the different possible properties of an instruction, the easier the analysis becomes.
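One possible way to hold such a table in memory is a list of records, one per instruction. The sketch below is only an illustration: the `Row` fields mirror the columns of the example, while `make_row`, `operand_type`, and the `REG` and `INDIRECT` type names are assumptions added here (the example itself only shows `ACC`, `DIR` and `CONST`).

```python
from dataclasses import dataclass
from typing import Optional

# Instructions whose last operand is a code label (a partial list).
JUMP_OPS = {"ljmp", "sjmp", "ajmp", "lcall", "acall", "jz", "jnz", "jc", "jnc"}
REGISTERS = {f"R{n}" for n in range(8)}


@dataclass
class Row:
    line: int
    label: Optional[str]
    op: str
    operand1: Optional[str] = None
    operand2: Optional[str] = None
    target: Optional[str] = None
    type1: Optional[str] = None
    type2: Optional[str] = None


def operand_type(text):
    """Classify one operand; REG and INDIRECT are names invented for this sketch."""
    if text is None:
        return None
    if text == "A":
        return "ACC"
    if text.startswith("#"):
        return "CONST"
    if text.startswith("@"):
        return "INDIRECT"
    if text in REGISTERS:
        return "REG"
    return "DIR"


def make_row(number, text):
    """Build a table row from one simplified line; the line must hold an instruction."""
    label = None
    if ":" in text:
        label, text = (part.strip() for part in text.split(":", 1))
    parts = text.split(None, 1)
    op = parts[0].lower()
    rest = parts[1] if len(parts) > 1 else ""
    operands = [item.strip() for item in rest.split(",") if item.strip()]
    if op in JUMP_OPS:
        return Row(number, label, op, target=operands[-1])
    operands += [None] * (2 - len(operands))
    first, second = operands[:2]
    return Row(number, label, op, first, second, None,
               operand_type(first), operand_type(second))
```

For instance, `make_row(1, "code_0001: mov A,020h")` produces a row with label `code_0001`, operation `mov`, and operand types `ACC` and `DIR`, matching the first row of the table.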

Stage 3. Now for the analysis itself. So far I have only analyzed pairs of adjacent instructions. The algorithm for this is quite simple: each pair is checked against a set of rules.

Example:

1. Substitution of procedure call commands

LCALL code_0001

RET

can be replaced with the command

LJMP code_0001

This saves 1 byte: LCALL plus RET take 3 + 1 bytes, while LJMP takes 3.

2. Replacement of assignment commands (valid only if the value left in A is not needed afterwards)

MOV A,#CONSTANTA

MOV @R0,A

can be replaced with the command

MOV @R0, #CONSTANTA

or

MOV A,@R0

MOV RAM,A

can be replaced with the command

MOV RAM,@R0

3. Replacing jump instructions

JNZ label1

SJMP label2

label1:

.................

label2:

can be replaced with the command

JZ label2

label1:

..........

label2:

There are many such patterns. A program is not always written with space saving in mind; more often it is written to be easy to understand later.
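The three rules above can be sketched as a small peephole pass. This is not the original analyzer, only a minimal illustration under some assumptions: each instruction is a hypothetical `(label, op, operands)` tuple, mnemonics are upper-case, and rule 2 is applied blindly even though it is only safe when A is not read afterwards (a real tool would have to check that). The JZ, JC and JNC entries of `INVERSE` go beyond the JNZ case shown in the example.

```python
REGISTERS = {"A"} | {f"R{n}" for n in range(8)}
INVERSE = {"JNZ": "JZ", "JZ": "JNZ", "JC": "JNC", "JNC": "JC"}


def is_direct(operand):
    """True for a direct RAM address (not immediate, indirect, or a register)."""
    return not operand.startswith(("#", "@")) and operand not in REGISTERS


def peephole(program):
    """Apply the three two-instruction rules; program is a list of (label, op, operands)."""
    out, i = [], 0
    while i < len(program):
        label, op, args = program[i]
        nxt = program[i + 1] if i + 1 < len(program) else None
        after = program[i + 2] if i + 2 < len(program) else None
        # The second instruction must carry no label: something may jump to it.
        if nxt is not None and nxt[0] is None:
            _, n_op, n_args = nxt
            # Rule 1: LCALL x / RET  ->  LJMP x
            if op == "LCALL" and n_op == "RET":
                out.append((label, "LJMP", args))
                i += 2
                continue
            # Rule 2: MOV A,src / MOV dst,A  ->  MOV dst,src
            # (only the two forms from the text; assumes A is dead afterwards)
            if op == "MOV" and n_op == "MOV" and args[0] == "A" and n_args[1] == "A":
                src, dst = args[1], n_args[0]
                if (src.startswith("#") and dst.startswith("@")) or \
                        (src.startswith("@") and is_direct(dst)):
                    out.append((label, "MOV", (dst, src)))
                    i += 2
                    continue
            # Rule 3: Jcc l1 / SJMP l2 / l1:  ->  inverted Jcc l2 / l1:
            if op in INVERSE and n_op == "SJMP" and after is not None \
                    and after[0] == args[0]:
                out.append((label, INVERSE[op], n_args))
                i += 2
                continue
        out.append(program[i])
        i += 1
    return out
```

Applied to the LCALL/RET pair from rule 1, the pass returns a single `LJMP code_0001`; an instruction pair whose second instruction has a label is left untouched, because another part of the program may jump to it.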

Stage 4. Count how many references there are to each jump address; the goal is to find addresses that are referenced only once. We also look at which instruction comes immediately before the jump address in the program. Based on this, we add extra columns about the jumps to the table.

Example:

                MOV     A,030h

                MOV     020h,A

                RET

code_0001:      MOV     A,020h

If “code_0001” is referenced only once, the part of the program that starts at this address can be moved closer to the instruction that jumps to it, or the jump instruction can be removed altogether and that part of the program put in its place.

There are also addresses with zero references. This is the case when an address could only be reached through a jump instruction, but the program contains no such jump. In the example above, the instruction before “code_0001” is RET, so execution can never fall through to it.
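Counting references and checking the preceding instruction could look roughly like the sketch below. It assumes the same hypothetical `(label, op, operands)` tuples with upper-case mnemonics, treats the first row as the reset entry point, and ignores indirect jumps such as `JMP @A+DPTR` and interrupt vectors, so its output is a list of candidates and not a proof.

```python
from collections import Counter

# Instructions whose last operand is a code label.
JUMP_OPS = {"LJMP", "SJMP", "AJMP", "LCALL", "ACALL", "JZ", "JNZ", "JC", "JNC",
            "JB", "JNB", "JBC", "DJNZ", "CJNE"}
# Instructions after which execution never continues with the next line.
NO_FALLTHROUGH = {"LJMP", "SJMP", "AJMP", "RET", "RETI"}


def classify_labels(program):
    """Return {label: (reference count, reachable by falling through)}."""
    refs = Counter(args[-1] for _, op, args in program if op in JUMP_OPS)
    report = {}
    for index, (label, _op, _args) in enumerate(program):
        if label is None:
            continue
        # Assumption: the first row is the reset entry point and always reachable.
        falls_in = index == 0 or program[index - 1][1] not in NO_FALLTHROUGH
        report[label] = (refs[label], falls_in)
    return report


def single_hits(report):
    """Labels referenced exactly once and not reachable by falling through."""
    return sorted(l for l, (n, falls_in) in report.items() if n == 1 and not falls_in)


def zero_hits(report):
    """Labels that nothing references and nothing falls into."""
    return sorted(l for l, (n, falls_in) in report.items() if n == 0 and not falls_in)
```

A label in `single_hits` is a candidate for moving its code next to the one jump that reaches it; a label in `zero_hits` marks code that, within the limits of this sketch, nothing in the program can reach.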

Stage 5. Count how many times each cell of data memory (RAM) is used and how it is used: as source data for an instruction (a read) or as a destination whose contents are changed (a write).

For example, it may turn out that a memory cell is only written to and never used as source data anywhere. The instructions that write to such a cell can be removed.
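A minimal sketch of this counting follows, again over hypothetical `(label, op, operands)` tuples. The `ACCESS` table is an assumption and covers only a few instructions. Two caveats apply to any real use: indirect accesses through `@R0` or `@R1` can read any cell, and direct addresses 80h and above are special function registers, where a write-only pattern is normal (for example, an output port).

```python
from collections import defaultdict

REGISTERS = {"A"} | {f"R{n}" for n in range(8)}

# op -> (operand indices that are read, operand indices that are written).
# Partial table for illustration only.
ACCESS = {
    "MOV": ((1,), (0,)),
    "ADD": ((0, 1), (0,)),
    "INC": ((0,), (0,)),
    "DEC": ((0,), (0,)),
}


def is_direct(operand):
    """True for a direct address (not immediate, indirect, or a register)."""
    return not operand.startswith(("#", "@")) and operand not in REGISTERS


def ram_usage(program):
    """Return {address: [reads, writes]} for directly addressed cells."""
    usage = defaultdict(lambda: [0, 0])
    for _label, op, args in program:
        reads, writes = ACCESS.get(op, ((), ()))
        for i in reads:
            if is_direct(args[i]):
                usage[args[i]][0] += 1
        for i in writes:
            if is_direct(args[i]):
                usage[args[i]][1] += 1
    return dict(usage)


def write_only(usage):
    """Cells that are written but never read: candidates for removal."""
    return sorted(a for a, (reads, writes) in usage.items() if reads == 0 and writes > 0)
```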

Stage 6. Reassemble the source code. After assembly, the number of bytes between each jump instruction and its target address can be checked, for example to decide whether moving a part of the code closer to the jump would allow LJMP to be replaced with SJMP.
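The distance check itself is short. SJMP is a 2-byte instruction whose signed 8-bit offset is counted from the address of the instruction that follows it, so the test can be sketched as below. The function names are made up for this sketch. For forward jumps it uses the target address from before the shortening, which errs on the safe side, since replacing a 3-byte LJMP with SJMP pulls the target one byte closer.

```python
def fits_sjmp(jump_address, target_address):
    """True if a jump located at jump_address can reach the target with SJMP."""
    # The offset is relative to the instruction after the 2-byte SJMP.
    offset = target_address - (jump_address + 2)
    return -128 <= offset <= 127


def shortenable(jumps):
    """Filter (address, target) pairs of LJMP instructions that could become SJMP."""
    return [(address, target) for address, target in jumps if fits_sjmp(address, target)]
```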

Using this analysis, I reduced an 8-kilobyte program by 300 bytes.

Further plans: to analyze the code based on what it does, i.e. to work out what a particular part of the program does and how it could be replaced.
