Caesar Cipher in Assembly Language

Friends, colleagues, greetings to everyone! In this article we will write an encoder in “dirty”, macro-flavoured assembly language, using the Caesar cipher (a shift cipher with key k = n) as an elementary cipher. The article is written for beginners taking their first steps in cryptography. P.S. The code can be modified as you wish (for example, to implement a different cipher) and used, for instance, in university coursework. Fortunately, specialized universities still devote some time to assembly language 🙂

So, we will write the encoder in MASM (you can use FASM, etc. – any assembler whose syntax you know).

A little theory…

The Caesar cipher is one of the simplest and best-known encryption methods, and it dates back to Ancient Rome. It is named after Julius Caesar, who, according to the Roman historian Suetonius, used it with a shift of three in his private correspondence.

The principle of the Caesar cipher is very simple: each letter in the plaintext is replaced by the letter a fixed number of positions forward or backward in the alphabet. The number of positions by which the letters are shifted is called the encryption key (we will use the key k = 1 for ease of understanding).
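To make the principle concrete, here is a minimal sketch in C of shifting a single letter. The function name caesar_shift is made up for this illustration. Unlike the assembly program in this article, which simply adds 1 to every byte, this sketch assumes an ASCII character set, touches only the Latin letters, and wraps from the end of the alphabet back to the start. A negative key shifts backward, which is also how decryption works.

```c
/* Hypothetical helper, not part of the article's program.
   Shifts one ASCII letter by k positions and wraps inside A-Z or a-z.
   Characters that are not Latin letters are returned unchanged. */
char caesar_shift(char c, int k)
{
    /* Bring k into the range 0..25 so that negative keys work too. */
    k = ((k % 26) + 26) % 26;

    if (c >= 'A' && c <= 'Z')
        return (char)('A' + (c - 'A' + k) % 26);
    if (c >= 'a' && c <= 'z')
        return (char)('a' + (c - 'a' + k) % 26);
    return c;
}
```

Calling caesar_shift(c, k) encrypts a character, and caesar_shift(c, -k) with the same key turns it back into the original.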

Let's start by declaring directives:

.386
.model flat, stdcall
option casemap:none
  1. .386: This directive tells the assembler to accept the instruction set of the 80386 and later x86 processors, which includes the 32-bit registers (eax, esi, edi and so on) used below.

  2. .model flat, stdcall: This directive defines the memory model and the default calling convention for the program. In this case:

    • flat selects the flat memory model, in which code and data share a single linear address space and all addresses are the same size.

    • stdcall selects the stdcall calling convention: arguments are pushed onto the stack from right to left, and the called function (the callee) removes them from the stack before it returns. This differs from cdecl, where the calling function (the caller) is responsible for cleaning up the stack. WinAPI functions use stdcall.

  3. option casemap:none: Makes the assembler treat identifiers (labels, procedure names, constants) as case-sensitive, so that the mixed-case WinAPI names are matched exactly.

Connecting libraries:

include C:\masm32\include\kernel32.inc
include C:\masm32\include\windows.inc
includelib C:\masm32\lib\user32.lib
includelib C:\masm32\lib\kernel32.lib

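The include directives pull in the prototypes of the WinAPI functions and the definitions of constants and data structures (such as STD_INPUT_HANDLE and NULL), while the includelib directives tell the linker which import libraries to link against. Together they let the program call into the Windows operating system.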

We declare the BSIZE constant:

BSIZE equ 128

When the assembler encounters BSIZE in the code, it replaces it with 128.

Next, let's define a data section (.data) and declare several variables:

.data
buf db BSIZE dup(?)
result db BSIZE dup(?)
stdin dd ?
stdout dd ?
cRead dd ?
cWritten dd ?

buf db BSIZE dup(?): This line declares a byte array called buf. Its size is BSIZE, which we defined earlier as 128. Each element occupies one byte (db), and dup(?) repeats that element BSIZE times. The question mark (?) means that the memory is reserved without being given an initial value. result is declared the same way, and stdin, stdout, cRead and cWritten are each a single uninitialized doubleword (dd, 4 bytes).
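For readers who are more at home in C, the data section can be pictured roughly as the declarations below. This is only an analogy, not part of the program. The handle and counter variables are renamed (h_stdin, h_stdout, c_read, c_written) because stdin and stdout are already taken by the C standard library, and C zero-fills globals while ? in MASM promises nothing about the contents.

```c
#include <stdint.h>

#define BSIZE 128               /* BSIZE equ 128 */

/* db reserves bytes, dd reserves 32-bit doublewords. */
uint8_t  buf[BSIZE];            /* buf      db BSIZE dup(?) */
uint8_t  result[BSIZE];         /* result   db BSIZE dup(?) */
uint32_t h_stdin;               /* stdin    dd ?            */
uint32_t h_stdout;              /* stdout   dd ?            */
uint32_t c_read;                /* cRead    dd ?            */
uint32_t c_written;             /* cWritten dd ?            */
```

The point of the comparison is the sizes: each array takes 128 bytes and each dd variable takes 4 bytes.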

Next comes the code section:

.code
start:
invoke GetStdHandle, STD_INPUT_HANDLE
mov stdin, eax
invoke ReadConsole, stdin, ADDR buf, BSIZE, ADDR cRead, NULL
  1. .code: This directive marks the beginning of the code section. Everything after it is assembled as processor instructions rather than data.

  2. start:: This label marks the program's entry point (it is named again in the end start directive that closes the program). Execution begins here.

  3. invoke GetStdHandle, STD_INPUT_HANDLE: This line calls the WinAPI GetStdHandle function, which returns a handle to standard input. The return value comes back in the eax register, and we save the handle in the stdin variable (mov stdin, eax).

  4. invoke ReadConsole, stdin, ADDR buf, BSIZE, ADDR cRead, NULL: This line calls the WinAPI ReadConsole function, which reads characters from the console input handle we saved in stdin. It stores up to BSIZE characters in buf and writes the number of characters actually read to cRead.

Description of the ReadConsole function on the MS website

The last parameter (pInputControl) requires Unicode input by default. For ANSI mode, which we use here, set it to NULL.

Now it's time to write our encryption algorithm. ReadConsole has placed our keyboard input in buf, so we copy the starting addresses of the buf and result arrays into the esi and edi registers, respectively (the “processed” characters will go into result). We also load the number of characters read into ecx, because the loop instruction used below counts down in ecx:

mov esi, offset buf
mov edi, offset result
mov ecx, cRead

Let's start writing a loop in which our input will be “processed”:

loop1:
mov bl, [esi]
cmp bl, 13
je endloop
add bl, 1
mov [edi], bl
inc esi
inc edi
loop loop1

Here's how each instruction works:

  1. loop1:: This is a label marking the start of the loop.

  2. mov bl, [esi]: This instruction loads the byte (8 bits) stored at the address held in esi into the bl register (on the first pass, the first character we typed).

  3. cmp bl, 13: This instruction compares the value in bl with 13, the code of the carriage return character. The carriage return marks the end of the line typed on the keyboard, so we use it as the condition for leaving the loop.

  4. je endloop: If the comparison shows that the value in bl is 13, the program jumps to the endloop label, breaking the loop.

  5. add bl, 1: If the value in bl is not 13, then 1 is added to it (this is our key k = 1).

  6. mov [edi], bl: The value in bl is copied into memory at the address held in edi (we save the “processed” character).

  7. inc esi: The esi register is incremented by 1 to point to the next element of the input array.

  8. inc edi: The edi register is also incremented by 1 to point to the next element of the second array, which receives the “processed” characters.

  9. loop loop1: This instruction decrements ecx by 1 and, if the result is not 0, jumps to the loop1 label. ecx therefore has to hold the maximum number of iterations before the loop starts (the number of characters read, cRead, is a natural choice), which caps the loop at the length of the input even if no carriage return is found.
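The same loop can be modelled in C, which may help when tracing it by hand. This is a rough sketch rather than a translation: the function name shift_until_cr is invented here, src stands in for esi, dst for edi and count for ecx. One deliberate difference is that the sketch checks the counter before the first pass, whereas the loop instruction only tests ecx after the body has already run once.

```c
#include <stddef.h>

/* Hypothetical C model of the assembly loop.
   Copies bytes from src to dst, adding 1 to each (key k = 1), and stops
   at a carriage return (13) or after count bytes, whichever comes first.
   Returns the number of bytes stored in dst. */
size_t shift_until_cr(const unsigned char *src, unsigned char *dst, size_t count)
{
    size_t written = 0;

    while (count > 0) {
        unsigned char c = *src;          /* mov bl, [esi]            */
        if (c == 13)                     /* cmp bl, 13 / je endloop  */
            break;
        *dst = (unsigned char)(c + 1);   /* add bl, 1 / mov [edi], bl */
        src++;                           /* inc esi                  */
        dst++;                           /* inc edi                  */
        written++;
        count--;                         /* loop loop1               */
    }
    return written;
}
```

The return value corresponds to how far edi has moved from the start of result, that is, the number of “processed” characters.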

Let's write the code after the endloop label:

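endloop:
sub edi, offset result
invoke GetStdHandle, STD_OUTPUT_HANDLE
mov stdout, eax
invoke WriteConsoleA, stdout, ADDR result, edi, ADDR cWritten, NULL
invoke ExitProcess, 0
end start
  1. sub edi, offset result: edi points just past the last character we stored, so subtracting the starting address of result leaves the number of “processed” characters in edi. WinAPI functions preserve edi, so the value survives the call that follows.

  2. invoke GetStdHandle, STD_OUTPUT_HANDLE: This line calls the WinAPI GetStdHandle function to obtain a handle to standard output. The handle is saved in the stdout variable (mov stdout, eax).

  3. invoke WriteConsoleA, stdout, ADDR result, edi, ADDR cWritten, NULL: This line calls the WinAPI WriteConsoleA function to write the “processed” string in result to standard output. The third argument is the number of characters to write. We pass the count computed in edi rather than cRead, because cRead also counts the carriage return and line feed that end the input line, and those were never copied to result.

  4. invoke ExitProcess, 0: This line calls the WinAPI ExitProcess function to terminate the process. The argument 0 indicates successful completion of the program.

  5. end start: This directive marks the end of the source file and names start as the program's entry point.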

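Let's look at how the program works: with the key k = 1, every character is replaced by the next one in the ASCII table, so entering HELLO prints IFMMP.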

Full program listing:

.386
.model flat, stdcall
option casemap:none

include C:\masm32\include\kernel32.inc
include C:\masm32\include\windows.inc
includelib C:\masm32\lib\user32.lib
includelib C:\masm32\lib\kernel32.lib
BSIZE equ 128
.data
buf db BSIZE dup(?)
result db BSIZE dup(?)
stdin dd ?
stdout dd ?
cRead dd ?
cWritten dd ?
.code
start:

invoke GetStdHandle, STD_INPUT_HANDLE
mov stdin, eax
invoke ReadConsole, stdin, ADDR buf, BSIZE, ADDR cRead, NULL

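mov esi, offset buf
mov edi, offset result
mov ecx, cRead          ; loop counter used by the loop instruction

mov ebx, 0

loop1:
mov bl, [esi]
cmp bl, 13              ; carriage return ends the input
je endloop
add bl, 1               ; key k = 1
mov [edi], bl
inc esi
inc edi
loop loop1

endloop:
sub edi, offset result  ; number of characters stored in result
invoke GetStdHandle, STD_OUTPUT_HANDLE
mov stdout, eax
invoke WriteConsoleA, stdout, ADDR result, edi, ADDR cWritten, NULL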

invoke ExitProcess, 0

end start

P.S. All that remains is to implement the wrap-around from the last letter of the alphabet to the first. I think if you understand this code, then this will not be a problem for you! 🙂

Thank you all for your attention!
