Assembler commands. Basic information about the assembly language. Course work for the discipline "System Programming": programming in the assembly language

National University of Uzbekistan named after Mirzo Ulugbek

Faculty of computer technology

Topic: Semantic analysis of the EXE file.

Performed:

Tashkent 2003.

Preface.

Assembly language and command structure.

EXE file structure (semantic analysis).

The structure of the COM file.

How a virus works and spreads.

Disassembler.

Programs.

Preface

The profession of programmer is remarkable and unique. Today neither science nor everyday life can be imagined without the newest technology. Everything connected with human activity relies on computing technology, and this drives its rapid development and refinement. Although the development of personal computers began not so long ago, software products have made colossal strides in that time, and these products will remain in wide use for a long time. The body of computer-related knowledge has exploded, as has the corresponding technology. Leaving the commercial side aside, one can say that this field is open to more than professionals: many people develop programs not for profit or earnings but of their own free will, out of passion. Of course, this should not affect the quality of the program: in this "business", so to speak, there is competition, and there is demand for quality of execution, stable operation, and compliance with all modern requirements. It is also worth noting the appearance of microprocessors at the beginning of the 1970s, which replaced large numbers of vacuum tubes. Microprocessors come in varieties that differ greatly from one another, both in word size and in their built-in system commands. The most common are Intel, IBM, Celeron, AMD, and so on. All of these processors are related to the developed architecture of Intel processors. The spread of microcomputers led to a reassessment of assembly language for two main reasons. First, programs written in assembly language require significantly less memory and execution time. Second, knowledge of assembly language and of the resulting machine code gives an understanding of the machine's architecture that working in a high-level language is unlikely to provide.
Although most software specialists develop in high-level languages such as Pascal, C, or Delphi, in which programs are easier to write, the most powerful and efficient software is written wholly or partly in assembly language. High-level languages were designed to hide the technical details of specific computers. Assembly language, in turn, is designed around the specifics of a particular processor. Consequently, to write an assembly language program for a particular computer, you need to know its architecture. Today the main type of software product is the EXE file. Given its advantages, the author of a program may feel confident in its integrity. But often this is not the case. There are also disassemblers. With a disassembler one can find out which interrupts and code a program uses. For a person who knows assembler well, it is not hard to rework an entire program to their own taste. Perhaps this is the source of the most intractable problem: the virus. Why do people write viruses? Some ask this question with surprise, some with anger, yet there are still people who are interested in this task not for the sake of causing harm but out of interest in system programming. Viruses are written for various reasons. Some people like system calls, others want to improve their knowledge of assembler. I will try to set all this out in my course work. It deals not only with the structure of the EXE file but also with the assembly language.

Assembly language.

It is interesting to trace how programmers' ideas about assembly language have changed from the time of the first computers to the present day.

Once, assembler was the language without which a computer could not be made to do anything useful. Gradually the situation changed, and more convenient means of communicating with a computer appeared. But unlike other languages, assembler did not die; moreover, it could not die in principle. Why? In search of an answer, let us try to understand what assembly language is in general.

In brief, assembly language is a symbolic representation of machine language. All processes in the machine at the lowest, hardware level are driven only by machine language commands (instructions). Hence it is clear that, despite the common name, each type of computer has its own assembly language. This applies both to the appearance of programs written in assembler and to the ideas that the language reflects.

Problems related to hardware (or, even more so, problems that depend on hardware, such as speeding up a program) cannot truly be solved without knowledge of assembler.

A programmer or any other user can use any high-level tools, up to programs for building virtual worlds, and perhaps never suspect that the computer actually executes not the commands of the language in which the program is written but their transformed representation: dull, dreary sequences of commands of an entirely different language, machine language. Now imagine that such a user runs into a non-standard problem, or that something simply does not work. For example, the program must work with some unusual device or perform other actions that require knowledge of how computer hardware operates. However clever the programmer, and however good the language in which the wonderful program was written, this cannot be done without knowing assembler. It is no accident that almost all compilers of high-level languages provide means of linking their modules with assembler modules or support access to the assembler level of programming.

Of course, the age of computer generalists has passed. As the saying goes, you cannot embrace the boundless. But there is something common, a kind of foundation on which any serious computer education is built: knowledge of the principles of computer operation, of computer architecture, and of assembly language as the reflection and embodiment of that knowledge.

A typical modern computer (based on i486 or Pentium) consists of the following components (Fig. 1).

Fig. 1. Computer and peripheral devices

Fig. 2. Structural scheme of a personal computer

Fig. 1 shows that the computer is made up of several physical devices, each of which is connected to one block called the system unit. Logically, the system unit plays the role of a coordinating device. Let us look inside the system unit (do not try to get inside the monitor: there is nothing interesting there, and it is dangerous). Opening the case, we see boards, blocks, and connecting wires. To understand their functional purpose, let us look at the structural scheme of a typical computer (Fig. 2). It does not claim absolute accuracy and is intended only to show the purpose, the interconnection, and the typical composition of the elements of a modern personal computer.

Let us discuss the scheme in Fig. 2 in a somewhat unconventional style.
It is natural for a person meeting something new to look for associations that can help make sense of the unknown. What associations does the computer evoke? For me, for example, the computer is often associated with the human being. Why?

In creating the computer, humans, somewhere deep down, thought they were creating something like themselves. The computer has organs for perceiving information from the outside world: the keyboard, the mouse, magnetic disk drives. In Fig. 2 these organs are located to the right of the system buses. The computer has organs that "digest" the information received: the central processor and RAM. And finally, the computer has speech organs that output the results of processing. These are also some of the devices on the right.

Modern computers are, of course, still far from a human being. They can be compared with creatures that interact with the outside world through a large but limited set of unconditioned reflexes.
This set of reflexes forms the system of machine commands. However high the level at which you communicate with the computer, ultimately everything comes down to a dull and monotonous sequence of machine commands.
Each machine command is a kind of stimulus that triggers a particular unconditioned reflex. The reaction to this stimulus is always unambiguous and is "hard-wired" into the microcode unit as a microprogram. This microprogram carries out the actions that implement the machine command, but at the level of signals applied to particular logic circuits of the computer, thereby controlling the computer's subsystems. This is the so-called principle of microprogram control.

Continuing the analogy with a human being: so that the computer eats properly, many operating systems, compilers for hundreds of programming languages, and so on have been invented. But all of them are, in effect, only the dish on which food (programs) is delivered to the stomach (the computer) according to certain rules. The computer's stomach, however, likes dietary, monotonous food: give it structured information, in the form of strictly organized sequences of zeros and ones whose combinations make up machine language.

Thus, although outwardly a polyglot, the computer understands only one language, the language of machine commands. Of course, to communicate and work with a computer it is not necessary to know this language, but almost any professional programmer sooner or later faces the need to study it. Fortunately, the programmer does not have to puzzle out the meaning of various combinations of binary numbers: back in the 1950s programmers began to use a symbolic analogue of machine language, which was called assembly language. This language accurately reflects all the features of machine language. That is why, unlike high-level languages, each type of computer has its own assembly language.

From all of the above we can conclude that, since assembly language is "native" to the computer, the most efficient program can only be written in it (provided a qualified programmer writes it). There is one small "but": this is a very laborious process that demands great attention and practical experience. Therefore, in practice, assembler is used mainly for programs that must work efficiently with hardware. Sometimes the parts of a program that are critical in execution time or memory consumption are written in assembler. They are then packaged as subroutines and combined with code in a high-level language.

It makes sense to start studying the assembly language of any computer only after finding out which part of the computer is visible and accessible to programming in that language. This is the so-called programming model of the computer, part of which is the programming model of the microprocessor, containing 32 registers that are available to the programmer to one degree or another.

These registers can be divided into two large groups:

16 user registers;

16 system registers.

In assembly language programs registers are used very intensively. Most registers have a specific functional purpose.

As the name implies, user registers are so called because the programmer can use them when writing programs. These registers include (Fig. 3):

eight 32-bit registers that programmers can use for storing data and addresses (they are also called general-purpose registers, GPRs);

six segment registers: CS, DS, SS, ES, FS, GS;

status and control registers:

the flags register EFLAGS/FLAGS;

the instruction pointer register EIP/IP.

Fig. 3. User registers of the i486 and Pentium microprocessors

Why are many of these registers shown with a slash separator? These are not different registers but parts of one large 32-bit register. They can be used in a program as separate objects. This is done so that programs written for the earlier 16-bit models of Intel microprocessors, starting with the i8086, continue to work. The i486 and Pentium microprocessors have mostly 32-bit registers. Their number, with the exception of the segment registers, is the same as in the i8086, but their size is larger, which is reflected in their names: they have the prefix E (Extended).

General-purpose registers
All registers of this group allow access to their low-order parts (see Fig. 3). Looking at this figure, note that only the low-order 16-bit and 8-bit parts of these registers can be addressed independently. The high-order 16 bits of these registers are not available as independent objects. As noted above, this is done for compatibility with the earlier 16-bit models of Intel microprocessors.
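The relationship between a 32-bit register and its addressable low-order parts can be illustrated with a small Python model. This is only a sketch: the class `Reg32` and its attribute names (`e`, `x`, `h`, `l`) are hypothetical and simply mimic the EAX/AX/AH/AL naming.

```python
class Reg32:
    """Hypothetical model of one 32-bit general-purpose register (e.g. EAX)."""

    def __init__(self, value=0):
        self.e = value & 0xFFFFFFFF          # full 32-bit register (EAX)

    @property
    def x(self):                             # low-order 16 bits (AX)
        return self.e & 0xFFFF

    @x.setter
    def x(self, value):
        self.e = (self.e & 0xFFFF0000) | (value & 0xFFFF)

    @property
    def h(self):                             # bits 8..15 (AH)
        return (self.e >> 8) & 0xFF

    @h.setter
    def h(self, value):
        self.e = (self.e & 0xFFFF00FF) | ((value & 0xFF) << 8)

    @property
    def l(self):                             # bits 0..7 (AL)
        return self.e & 0xFF

    @l.setter
    def l(self, value):
        self.e = (self.e & 0xFFFFFF00) | (value & 0xFF)


eax = Reg32(0x12345678)
eax.h = 0xAB            # writing AH changes only bits 8..15
eax.l = 0xCD            # writing AL changes only bits 0..7
print(hex(eax.e))       # 0x1234abcd
print(hex(eax.x))       # 0xabcd
```

Note that the model offers no accessor for the high-order 16 bits on their own, which mirrors the restriction described above.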

Let us list the registers that belong to the group of general-purpose registers. Since these registers are physically located in the microprocessor inside the arithmetic logic unit (ALU), they are also called ALU registers:

EAX/AX/AH/AL (Accumulator register) - the accumulator.
It is used to store intermediate data. In some commands the use of this register is mandatory;

EBX/BX/BH/BL (Base register) - the base register.
It is used to store the base address of some object in memory;

ECX/CX/CH/CL (Count register) - the counter register.
It is used in commands that perform repeated actions. Its use is often implicit and hidden in the algorithm of the corresponding command.
For example, the loop-organizing command LOOP, besides transferring control to a command at some address, checks and decrements the value of the ECX/CX register;

EDX/DX/DH/DL (Data register) - the data register.
Just like the EAX/AX/AH/AL register, it stores intermediate data. In some commands its use is mandatory; in some commands it is used implicitly.

The following two registers are used to support the so-called string operations, that is, operations that sequentially process strings of elements, each of which can be 32, 16, or 8 bits long:

ESI/SI (Source index register) - the source index.
In string operations this register contains the current address of the element in the source string;

EDI/DI (Destination index register) - the destination index.
In string operations this register contains the current address in the destination string.

The microprocessor architecture supports, at the hardware and software level, a data structure known as the stack. To work with the stack there are special commands in the microprocessor's command set, and special registers in its programming model:

ESP/SP (Stack pointer register) - the stack pointer register.
It contains the pointer to the top of the stack in the current stack segment.

EBP/BP (Base pointer register) - the stack frame base pointer register.
It is intended for organizing random access to data inside the stack.

The stack is an area of the program used for temporary storage of arbitrary data. Of course, data can also be saved in the data segment, but then a separate named memory cell must be set aside for each item stored, which increases the size of the program and the number of names used. The convenience of the stack is that its area is reused, and data are saved to the stack and retrieved from it with the efficient PUSH and POP commands without specifying any names.
The stack is traditionally used, for example, to save the contents of registers used by the program before calling a subroutine, which, in turn, will use the processor registers "for its own purposes". The original contents of the registers are popped from the stack after the return from the subroutine. Another common technique is passing a subroutine the parameters it requires through the stack. The subroutine, knowing the order in which the parameters were placed on the stack, can retrieve them from there and use them during its execution. A distinctive feature of the stack is the peculiar order in which its data are retrieved: at any moment only the top element is available, that is, the element pushed onto the stack last. Popping the top element makes the next element available. Stack elements are located in the memory area allotted to the stack, starting from the bottom of the stack (that is, from its maximum address) at successively decreasing addresses. The address of the top, accessible element is stored in the stack pointer register SP. Like any other area of program memory, the stack must belong to some segment or form a separate segment. In either case, the segment address of this segment is placed in the stack segment register SS. Thus, the register pair SS:SP describes the address of the accessible stack cell: SS holds the segment address of the stack, and SP holds the offset of the last item saved on the stack (Fig. 4, a). Note that in the initial state the stack pointer SP points to the cell lying just below the bottom of the stack and not belonging to it.

Figure 4. Stack organization: a - initial state; b - after pushing one element (in this example, the contents of the AX register); c - after pushing a second element (the contents of the DS register); d - after popping one element; e - after popping two elements and returning to the initial state.

Pushing onto the stack is done by the special stack command PUSH. This command first decreases the contents of the stack pointer by 2 and then places the operand at the address in SP. If, for example, we want to save the contents of the AX register temporarily on the stack, we must execute the command

push AX

The stack goes to the state shown in Fig. 4, b. The stack pointer has shifted two bytes up (toward lower addresses), and the operand specified in the push command has been written at that address. The next push command, for example,

push DS

puts the stack in the state shown in Fig. 4, c. The stack now holds two elements, and only the top one, to which the stack pointer SP points, is accessible. If after some time we need to restore the original contents of the registers saved on the stack, we must execute the POP commands:

pop DS
pop AX
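The push and pop behaviour described above can be traced with a short Python simulation. It is only a sketch under stated assumptions: the class `Stack8086`, the segment value 2000h, and the 256-byte stack size are hypothetical, and the physical address is computed by the real-mode rule SS * 16 + SP.

```python
class Stack8086:
    """Hypothetical model of a 16-bit stack addressed through SS:SP."""

    def __init__(self, ss=0x2000, size=256):
        self.memory = bytearray(1 << 20)   # 1 MB address space (assumption: real mode)
        self.ss = ss
        self.sp = size                     # initially points just below the stack bottom

    def _address(self):
        # Assumption: real-mode physical address = segment * 16 + offset
        return ((self.ss << 4) + self.sp) & 0xFFFFF

    def push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF   # first decrease SP by 2 ...
        addr = self._address()
        self.memory[addr] = word & 0xFF    # ... then store the operand at SS:SP
        self.memory[addr + 1] = (word >> 8) & 0xFF

    def pop(self):
        addr = self._address()
        word = self.memory[addr] | (self.memory[addr + 1] << 8)
        self.sp = (self.sp + 2) & 0xFFFF   # the next element becomes the top
        return word


stack = Stack8086()
ax, ds = 0x1234, 0x5678
stack.push(ax)                 # push AX
stack.push(ds)                 # push DS
restored_ds = stack.pop()      # pop DS  (last in, first out)
restored_ax = stack.pop()      # pop AX
print(hex(restored_ds), hex(restored_ax), stack.sp)   # 0x5678 0x1234 256
```

After the two pops the stack pointer has returned to its initial value, which corresponds to the final state in Fig. 4.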

How large should the stack be? That depends on how intensively the program uses it. If, for example, an array of 10,000 bytes is to be stored on the stack, the stack must be at least that large. Bear in mind that in some cases the system uses the stack automatically, in particular when the interrupt command INT 21h is executed. On this command the processor first pushes the return address onto the stack, and then DOS pushes the contents of the registers and other information about the interrupted program. Therefore, even if a program does not use the stack at all, a stack must still be present in the program and be at least a few dozen words in size. In our first example we allotted 128 words to the stack, which is certainly enough.

Structure of an assembler program

An assembler program is a set of memory blocks called memory segments. A program may consist of one or more such segments. Each segment contains a set of language statements, each of which occupies a separate line of program code.

Assembler statements are of four types:

commands, or instructions, which are symbolic analogues of machine commands. During translation, assembler instructions are converted into the corresponding commands of the microprocessor's command set;

macro commands - statements of the program text, formatted in a certain way, that are replaced by other statements during translation;

directives, which tell the assembler translator to perform certain actions. Directives have no counterparts in the machine representation;

comment lines, which may contain any characters, including letters of the Russian alphabet. Comments are ignored by the translator.

Assembler syntax

The statements that make up a program may be a syntactic construction corresponding to a command, a macro command, a directive, or a comment. For the assembler translator to recognize them, they must be formed according to certain syntactic rules. It is best to describe these rules with a formal description of the language's syntax, like the rules of a grammar. The most common ways of describing a programming language in this way are syntax diagrams and extended Backus-Naur forms. For practical use, syntax diagrams are more convenient. For example, the syntax of assembler statements can be described by the syntax diagrams shown in the following figures.

Fig. 5. Format of an assembler statement

Fig. 6. Format of directives

Fig. 7. Format of commands and macro commands

In these figures:

label name - an identifier whose value is the address of the first byte of the source statement that it marks;

name - an identifier that distinguishes this directive from other directives of the same kind. When the assembler processes a particular directive, certain characteristics can be assigned to this name;

operation code and directive - mnemonic designations of the corresponding machine command, macro command, or translator directive;

operands - parts of a command, macro command, or assembler directive that denote the objects on which actions are performed. Assembler operands are described by expressions with numeric and text constants, labels, and identifiers of variables, using operation signs and some reserved words.

How are syntax diagrams used? It is very simple: find and follow a path from the diagram's entry (on the left) to its exit (on the right). If such a path exists, the statement or construction is syntactically correct. If there is no such path, the compiler will not accept the construction. When working with syntax diagrams, pay attention to the direction of traversal indicated by the arrows, since some paths may run from right to left. In essence, syntax diagrams reflect the logic the translator follows when parsing the input statements of a program.

The permissible characters in program text are:

all Latin letters: A-Z, a-z. Uppercase and lowercase letters are considered equivalent;

digits from 0 to 9;

the signs ?, @, $, _, &;

the separators , . ( ) < > { } + / * % ! " " ? \ = # ^.

Assembler statements are formed from lexemes, which are syntactically indivisible sequences of permissible characters that are meaningful to the translator.

The lexemes are:

identifiers - sequences of permissible characters used to denote program objects such as operation codes, variable names, and label names. The rule for writing identifiers is as follows: an identifier may consist of one or more characters. The characters may be letters of the Latin alphabet, digits, and some special signs: _, ?, $, @. An identifier cannot begin with a digit. An identifier may be up to 255 characters long, although the translator takes only the first 32 into account and ignores the rest. The number of significant characters can be adjusted with the mv command-line option. In addition, the translator can be told either to distinguish uppercase from lowercase letters or to ignore the difference (which is the default).
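The identifier rule stated above can be expressed as a short check in Python. This is a sketch, not the translator's actual algorithm: the function names `is_valid_identifier` and `canonical` are hypothetical, and the limits (255 characters, 32 significant, case ignored) are taken from the default behaviour described in the text.

```python
import re

# One letter or special sign first, then letters, digits, or special signs.
_IDENTIFIER = re.compile(r"[A-Za-z_?$@][A-Za-z0-9_?$@]*\Z")

MAX_LENGTH = 255     # longest identifier allowed
SIGNIFICANT = 32     # characters the translator actually compares (default)


def is_valid_identifier(name):
    """Return True if `name` follows the identifier rule from the text."""
    return len(name) <= MAX_LENGTH and _IDENTIFIER.match(name) is not None


def canonical(name):
    """Form under which two identifiers are considered the same by default."""
    if not is_valid_identifier(name):
        raise ValueError("not a valid identifier: %r" % name)
    return name[:SIGNIFICANT].upper()    # case is ignored by default


print(is_valid_identifier("@loop_1"))                 # True
print(is_valid_identifier("1abc"))                    # False: starts with a digit
print(canonical("count") == canonical("COUNT"))       # True
```

Two names that differ only after the 32nd character map to the same canonical form, which is why long identifiers should differ early.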

Assembler commands.

Assembler commands open up the ability to pass your requirements to the computer and provide the mechanism for transferring control within a program (loops and jumps) for logical comparisons and program organization. However, programmable tasks are rarely that simple. Most programs contain loops, in which several commands are repeated until a certain condition is met, and various checks that determine which of several actions should be performed. Some commands can transfer control, changing the normal sequence of steps by directly modifying the offset value in the instruction pointer. As mentioned earlier, different processors have different commands; we will consider a number of commands for the 80186, 80286, and 80386 processors.

To describe the state of the flags after a command executes, we will use an excerpt from the table reflecting the structure of the EFLAGS flags register:

The bottom row of this table gives the values of the flags after the command executes. The following notation is used:

1 - after the command executes, the flag is set (equal to 1);

0 - after the command executes, the flag is cleared (equal to 0);

r - the value of the flag depends on the result of the command;

After executing the command, the flag is not defined;

space - after the command executes, the flag does not change;

To represent operands in syntax diagrams, the following notation is used:

r8, r16, r32 - an operand in one of the registers of byte, word, or doubleword size;

m8, m16, m32, m48 - an operand in memory of byte, word, doubleword, or 48-bit size;

i8, i16, i32 - an immediate operand of byte, word, or doubleword size;

a8, a16, a32 - a relative address (offset) in the code segment.

Commands (in alphabetical order):

* These commands are described in detail.

ADD
(Addition)

Addition

Command syntax:

add destination, source

Purpose: addition of two operands, source and destination, of byte, word, or doubleword size.

Operation algorithm:

add the source and destination operands;

write the result of the addition to the destination;

set the flags.

Flag status after executing the command:

Application:
The ADD command is used to add two integer operands. The result of the addition is placed at the address of the first operand. If the result goes beyond the bounds of the destination operand (overflow occurs), this situation should be taken into account by analyzing the CF flag and then possibly applying the ADC command. For example, let us add the values in the AX register and the memory area ch. When adding, the possibility of overflow should be taken into account.
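The role of the CF flag and of the ADC command can be sketched in Python. This is an illustrative model rather than the processor's implementation: the function names `add` and `add32_with_16bit_ops` are hypothetical, and only the result and the carry flag are modelled.

```python
def add(dest, src, carry_in=0, bits=16):
    """Model of ADD (carry_in=0) or ADC (carry_in=CF) on `bits`-wide operands.

    Returns (result, CF), where CF is 1 if the sum did not fit in the destination.
    """
    mask = (1 << bits) - 1
    total = (dest & mask) + (src & mask) + carry_in
    return total & mask, int(total > mask)


def add32_with_16bit_ops(a, b):
    """Add two 32-bit values using only 16-bit additions."""
    low, cf = add(a & 0xFFFF, b & 0xFFFF)             # ADD on the low-order words
    high, cf = add(a >> 16, b >> 16, carry_in=cf)     # ADC on the high-order words
    return (high << 16) | low, cf


print(add(0xFFFF, 1))                                  # (0, 1): overflow, CF is set
print(hex(add32_with_16bit_ops(0x0001FFFF, 1)[0]))     # 0x20000
```

The second call shows why the carry matters: without passing CF into the addition of the high-order words, the result would be 0x10000 instead of 0x20000.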

| 000000dw | mod reg r/m |

| 0000010w | --data-- | data, if w = 1 |
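As an illustration of the first format, the following Python sketch assembles a register-to-register 16-bit ADD. The function name `encode_add_r16_r16` is hypothetical. The register numbers and the value mod = 11 for a register operand are the standard 8086 encoding and are not given in the text above, so treat them as assumptions of this example.

```python
# Assumption: standard 8086 numbering of the 16-bit registers.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}


def encode_add_r16_r16(dest, src):
    """Encode `add dest, src` for two 16-bit registers as | 000000dw | mod reg r/m |."""
    d = 0                                   # d = 0: the reg field holds the source
    w = 1                                   # w = 1: word-sized operands
    opcode = 0b00000000 | (d << 1) | w
    mod = 0b11                              # assumption: mod = 11 means r/m is a register
    modrm = (mod << 6) | (REG16[src] << 3) | REG16[dest]
    return bytes([opcode, modrm])


print(encode_add_r16_r16("ax", "bx").hex())   # 01d8
```

With d = 0 the destination goes into the r/m field and the source into the reg field; setting d = 1 would swap the roles of the two fields.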

CALL
(Call)

Call a procedure or task

Command syntax:

Purpose:

transfer of control to a near or far procedure, saving the return address on the stack;

task switching.

Operation algorithm:
determined by the type of the operand:

near label - the contents of the instruction pointer EIP/IP are pushed onto the stack, and the new address corresponding to the label is loaded into the same register;

far label - the contents of the instruction pointer EIP/IP and of CS are pushed onto the stack. Then the same registers are loaded with the new address values corresponding to the far label;

r16, r32 or m16, m32 - specify a register or memory cell containing the offset in the current code segment to which control is transferred. When control is transferred, the contents of the instruction pointer EIP/IP are pushed onto the stack;

memory pointer - specifies a memory cell containing a 4- or 6-byte pointer to the called procedure. The structure of such a pointer is 2 + 2 or 2 + 4 bytes. The interpretation of such a pointer depends on the operating mode of the microprocessor:

Flag status after executing the command (except for task switching):

execution of the command does not affect the flags

When a task is switched, the flag values are changed according to the EFLAGS information in the TSS (task state segment) of the task being switched to.
Application:
The CALL command makes it possible to organize a flexible and varied transfer of control to a subroutine while saving the return address.

Object code (four formats):

Direct addressing within a segment:

| 11101000 | disp-low | disp-high |

Indirect addressing within a segment:

| 11111111 | mod 010 r/m |

Indirect addressing between segments:

| 11111111 | mod 011 r/m |

Direct addressing between segments:

CMP
(Compare operands)

Comparison of operands

Command syntax:

cmp operand1, operand2

Purpose: comparison of two operands.

Operation algorithm:

perform the subtraction (operand1 - operand2);

depending on the result, set the flags; do not change operand1 and operand2 (that is, the result is not stored).

Application:
This command is used to compare two operands by subtraction, while the operands themselves do not change. The flags are set according to the result of the command. The CMP command is used together with the conditional jump commands and with the set-byte-on-condition command SETcc.
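How CMP cooperates with conditional jumps can be sketched as follows. The Python functions `cmp`, `je`, `jb`, and `ja` are hypothetical stand-ins for the corresponding commands. Only the ZF, CF, and SF flags are modelled, and the flag conditions used for the jumps are the standard ones for unsigned comparison, which the text above does not spell out.

```python
def cmp(operand1, operand2, bits=16):
    """Subtract operand2 from operand1, keep only the flags, discard the result."""
    mask = (1 << bits) - 1
    operand1 &= mask
    operand2 &= mask
    diff = (operand1 - operand2) & mask
    return {
        "ZF": int(diff == 0),            # operands are equal
        "CF": int(operand1 < operand2),  # borrow: operand1 is below operand2 (unsigned)
        "SF": diff >> (bits - 1),        # sign bit of the discarded difference
    }


def je(flags):
    return flags["ZF"] == 1                        # jump if equal


def jb(flags):
    return flags["CF"] == 1                        # jump if below (unsigned)


def ja(flags):
    return flags["CF"] == 0 and flags["ZF"] == 0   # jump if above (unsigned)


flags = cmp(5, 7)
print(flags, jb(flags))   # {'ZF': 0, 'CF': 1, 'SF': 1} True
```

The operands passed to `cmp` are never modified; only the returned flags change, which is exactly what distinguishes CMP from a subtraction command.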

Object code (three formats):

| 001110dw | mod reg r/m |

Immediate value with register AX (AL):

| 0011110w | --data-- | data, if w = 1 |

Immediate value with register or memory:

DEC
(Decrement operand by 1)

Decrease the operand by one

Command syntax:

dec operand

Purpose: decrease the value of an operand in memory or in a register by 1.

Operation algorithm:
the command subtracts 1 from the operand. Flag status after executing the command:

Application:
The DEC command is used to decrease by one the value of a byte, word, or doubleword in memory or in a register. Note that the command does not affect the CF flag.
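The remark about CF can be made concrete with a small Python sketch. The function name `dec` and the dictionary of flags are hypothetical, and only ZF, SF, and CF are modelled.

```python
def dec(value, flags, bits=16):
    """Model of DEC: subtract 1, update ZF and SF, leave CF exactly as it was."""
    mask = (1 << bits) - 1
    result = (value - 1) & mask
    new_flags = dict(flags)                  # CF is copied over unchanged
    new_flags["ZF"] = int(result == 0)
    new_flags["SF"] = result >> (bits - 1)
    return result, new_flags


result, flags = dec(0, {"CF": 0, "ZF": 1, "SF": 0})
print(hex(result), flags)    # 0xffff {'CF': 0, 'ZF': 0, 'SF': 1}
```

Decrementing 0 wraps around to 0FFFFh, yet CF stays 0. A program that needs the borrow must therefore use a subtraction command instead of DEC.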

Register or memory: | 1111111w | mod 001 r/m |

DIV
(Divide unsigned)

Unsigned division

Command syntax:

div divisor

Purpose: division of two unsigned binary values.

Operation algorithm:
For the command, two operands must be defined: the dividend and the divisor. The dividend is specified implicitly, and its size depends on the size of the divisor, which is specified in the command:

if the divisor is a byte, the dividend must be in the AX register. After the operation the quotient is placed in AL and the remainder in AH;

if the divisor is a word, the dividend must be in the register pair DX:AX, with the low-order part of the dividend in AX. After the operation the quotient is placed in AX and the remainder in DX;

if the divisor is a doubleword, the dividend must be in the register pair EDX:EAX, with the low-order part of the dividend in EAX. After the operation the quotient is placed in EAX and the remainder in EDX.

Flag status after executing the command:

Application:
The command performs integer division of the operands, producing the result as a quotient and a remainder. During division an exception may occur: 0 - divide error. This happens in one of two cases: the divisor is 0, or the quotient is too large to fit in the EAX/AX/AL register.
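The division rules and the two error cases can be sketched in Python. The names `div` and `DivideError` are hypothetical; the `bits` parameter selects the size of the divisor (8, 16, or 32), and the dividend is taken to be twice that wide, as described above.

```python
class DivideError(Exception):
    """Hypothetical stand-in for exception 0 (divide error)."""


def div(dividend, divisor, bits):
    """Model of DIV: returns (quotient, remainder) or raises DivideError.

    For bits=8 the dividend plays the role of AX, the quotient goes to AL and
    the remainder to AH; for bits=16 the dividend is DX:AX, and so on.
    """
    mask = (1 << bits) - 1
    dividend &= (1 << (2 * bits)) - 1
    divisor &= mask
    if divisor == 0:
        raise DivideError("divisor is 0")
    quotient, remainder = divmod(dividend, divisor)
    if quotient > mask:
        raise DivideError("quotient does not fit in the destination register")
    return quotient, remainder


al, ah = div(0x0105, 0x10, bits=8)     # AX = 0105h divided by a byte divisor 10h
print(al, ah)                          # 16 5

try:
    div(0x1000, 2, bits=8)             # quotient 800h does not fit in AL
except DivideError as error:
    print("divide error:", error)
```

The second call shows that a divide error does not require a zero divisor: a small divisor with a large dividend is enough.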

Object code:

| 1111011w | mod 110 r/m |

INT
(Interrupt)

Call an interrupt service routine

Command syntax:

int interrupt_number

Purpose: call the interrupt service routine whose interrupt number is specified by the operand of the command.

Operation algorithm:

push the EFLAGS/FLAGS flags register and the return address onto the stack. When the return address is saved, the contents of the CS segment register are pushed first, followed by the contents of the instruction pointer EIP/IP;

clear the IF and TF flags to zero;

transfer control to the interrupt handler with the specified number. The mechanism for transferring control depends on the operating mode of the microprocessor.

Flag status after executing the command:

Application:
As can be seen from the syntax, there are two forms of this command:

int 3 - has its own opcode 0CCh and occupies one byte. This makes it very convenient for use in software debuggers to set breakpoints by replacing the first byte of any command. When the microprocessor encounters a command with opcode 0CCh in the command stream, it calls the interrupt handler with vector number 3, which serves to communicate with the debugger.

The second form of the command occupies two bytes, has opcode 0CDh, and makes it possible to call the interrupt handler with a vector number in the range 0-255. The details of the transfer of control, as noted, depend on the operating mode of the microprocessor.

Machine code (two formats):

int 3: | 11001100 |

int number: | 11001101 | number |
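The operation algorithm of INT can be sketched as follows. This is a simplified Python model that assumes real mode, where the handler address is taken from a table of vectors. The names `int_real_mode`, `stack` and `ivt` are invented for the illustration, and the dictionary `ivt` stands in for the vector table in memory.

```python
IF_MASK = 1 << 9  # interrupt enable flag, bit 9 of FLAGS
TF_MASK = 1 << 8  # trap flag, bit 8 of FLAGS

def int_real_mode(n, flags, cs, next_ip, stack, ivt):
    """Model of INT n in real mode (a sketch, not a full emulation).

    stack   - Python list used as the stack (append = push)
    ivt     - dict: vector number -> (segment, offset) of the handler
    next_ip - address of the command that follows INT (the return address)
    Returns the new (flags, cs, ip).
    """
    stack.append(flags)    # push FLAGS
    stack.append(cs)       # push CS first ...
    stack.append(next_ip)  # ... then IP
    flags &= ~(IF_MASK | TF_MASK) & 0xFFFF  # clear IF and TF
    new_cs, new_ip = ivt[n]  # handler address for vector n
    return flags, new_cs, new_ip
```

The handler would later return with IRET, which pops the three saved values in the reverse order.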

Jcc
JCXZ/JECXZ
(Jump if condition)

(Jump if CX = Zero / Jump if ECX = Zero)

Jump if the condition is satisfied

Jump if CX/ECX is zero

Command syntax:

jcc label
jcxz label
jecxz label

Purpose: a jump within the current code segment, depending on some condition.

Operation algorithm (except JCXZ/JECXZ):
Check the state of the flags according to the operation code (it reflects the condition being tested):

if the tested condition is true, jump to the cell designated by the operand;

if the tested condition is false, pass control to the next command.

Operation algorithm of JCXZ/JECXZ:
Check whether the contents of the ECX/CX register are equal to zero:

if the verifiable condition

Topic 1.4 Assembler mnemonics. Structure and formats of commands. Addressing types. The microprocessor command system

Plan:

1 Assembler language. Basic concepts

2 Symbols of the assembler language

3 Types of assembler operators

4 Assembler directives

5 Processor command system

1 Assembler language. Basic concepts

Assembler language is a symbolic representation of machine language. All processes in the machine at the lowest, hardware level are driven only by commands (instructions) of the machine language. It follows that, despite the common name, the assembler language is different for each type of computer.

An assembler program is a collection of memory blocks called memory segments. The program may consist of one or more such segments. Each segment contains a set of language statements, each of which occupies a separate line of program code.

Assembler statements are of four types:

1) commands or instructions, which are symbolic counterparts of machine commands. During translation, assembler instructions are converted into the corresponding commands of the microprocessor command system;

2) macro commands - statements of the program text, formatted in a certain way, that are replaced by other statements during translation;

3) directives, which instruct the assembler translator to perform certain actions. Directives have no counterparts in the machine representation;

4) comment lines, which may contain any characters, including letters of the Russian alphabet. Comments are ignored by the translator.

Structure of an assembler program. Assembler syntax.

The statements that make up a program can be syntactic constructs corresponding to a command, macro command, directive or comment. For the assembler translator to recognize them, they must be formed according to certain syntactic rules. The best way to do this is to use a formal description of the language syntax, like the rules of a grammar. The most common ways of describing a programming language in this way are syntax diagrams and extended Backus-Naur forms. Syntax diagrams are more convenient for practical use. For example, the syntax of assembler statements can be described using the syntax diagrams shown in Figures 10, 11 and 12.

Figure 10 - Assembler statement format

Figure 11 - Directive format

Figure 12 - Format of commands and macro commands

In these figures:

label name - an identifier whose value is the address of the first byte of the source statement that it marks;

name - an identifier that distinguishes this directive from other directives of the same name. When the assembler processes a particular directive, certain characteristics may be assigned to this name;

operation code (opcode) and directive - mnemonic designations of the corresponding machine command, macro command or translator directive;

operands - parts of a command, macro command or assembler directive that denote the objects on which actions are performed. Assembler operands are described by expressions with numeric and text constants, labels and variable identifiers, using operators and some reserved words.

Syntax diagrams are used by finding and then following a path from the diagram's entry (left) to its exit (right). If such a path exists, the statement or construct is syntactically correct. If there is no such path, the translator will not accept the construct.

2 Assembler Language Symbols

The characters allowed in program text are:

1) all Latin letters: A-Z, a-z. Uppercase and lowercase letters are considered equivalent;

2) digits from 0 to 9;

3) signs ? , @ , $ , _ , & ;

4) separators , . () < > { } + / * % ! " " ? = # ^ .

Assembler statements are formed from lexemes, which are syntactically indivisible sequences of allowed characters that have meaning for the translator.

Lexemes are:

1) identifiers - sequences of allowed characters used to denote program objects such as operation codes, variable names and label names. The rule for writing identifiers is as follows: an identifier may consist of one or more characters;

2) character strings - sequences of characters enclosed in single or double quotes;

3) integers in one of the following number systems: binary, decimal, hexadecimal. Numbers written in assembler programs are identified according to certain rules:

Decimal numbers require no additional characters to identify them, for example 25 or 139. To identify binary numbers in the program source text, the Latin letter "b" must be written after the zeros and ones that make them up, for example 10010101b.

Hexadecimal numbers have more conventions:

First, they consist of the digits 0...9 and the lowercase or uppercase Latin letters a, b, c, d, e, f or A, B, C, D, E, F.

Second, the translator may have difficulty recognizing hexadecimal numbers because they can consist only of the digits 0...9 (for example, 190845) or begin with a letter of the Latin alphabet (for example, ef15). To "explain" to the translator that a lexeme is not a decimal number or an identifier, the programmer must mark the hexadecimal number specially. To do this, the Latin letter "h" is written at the end of the sequence of hexadecimal digits that make up the number. This is mandatory. If the hexadecimal number begins with a letter, a leading zero is written before it: 0ef15h.
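The rules for recognizing numbers can be summarized in a few lines of code. The following is only a sketch in Python of how a translator might classify a numeric lexeme. The function name `parse_number` is invented for the example, and a real assembler accepts more forms than the three described here.

```python
def parse_number(token):
    """Convert an assembler numeric lexeme to an integer (simplified model).

    Supported forms: decimal (25), binary with suffix b (10010101b)
    and hexadecimal with suffix h (0ef15h).
    """
    t = token.strip().lower()  # uppercase and lowercase letters are equivalent
    if not t or t[0] not in "0123456789":
        # a lexeme that starts with a letter is an identifier, not a number
        raise ValueError("not a number: " + repr(token))
    if t.endswith("h"):
        return int(t[:-1], 16)  # hexadecimal digits followed by h
    if t.endswith("b"):
        return int(t[:-1], 2)   # zeros and ones followed by b
    return int(t, 10)           # no suffix: decimal
```

Here `parse_number("0eF15h")` is accepted as a hexadecimal number, while `parse_number("eF15h")` is rejected because without the leading zero the lexeme looks like an identifier.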

Almost every statement contains a description of the object on which, or with which, some action is performed. These objects are called operands. They can be defined as follows: operands are objects (values, registers or memory cells) on which instructions or directives act, or objects that define or refine the action of instructions or directives.

Operands can be classified as follows:

constant or immediate operands;

address operands;

relocatable operands;

location counter;

base and index operands;

structure operands;

records.

Operands are the elementary components that form the part of a machine command denoting the objects on which the operation is performed. More generally, operands may be components of more complex constructs called expressions.

Expressions are combinations of operands and operators treated as a whole. The result of evaluating an expression may be the address of a memory cell or some constant (absolute) value.

3 Types of assembler operators

We list the possible types of assembler operators that are used, following the syntactic rules, to form assembler expressions:

arithmetic operators;

shift operators;

comparison operators;

logical operators;

index operator;

type redefinition operator;

segment override operator;

structure type naming operator;

operator for obtaining the segment component of the address of an expression;

operator for obtaining the offset of an expression.

4 Assembler directives

Assembler directives are:

1) segmentation directives. The preceding discussion covered the basic rules for writing commands and operands in an assembler program. The question that remains is how to arrange a sequence of commands so that the translator can process them and the microprocessor can execute them.

When considering the microprocessor architecture, we learned that it has six segment registers, through which it can work simultaneously:

with one code segment;

with one stack segment;

with one data segment;

with three additional data segments.

Physically, a segment is a memory area occupied by commands and (or) data whose addresses are calculated relative to the value in the corresponding segment register. The syntactic description of a segment in assembler is the construct shown in Figure 13:

Figure 13 - Syntactic description of a segment in assembler

It is important to note that the purpose of a segment is somewhat wider than simply splitting a program into blocks of code, data and stack. Segmentation is part of a more general mechanism associated with the concept of modular programming. This concept involves unifying the format of the object modules created by compilers, including compilers for different programming languages, which makes it possible to combine programs written in different languages. The operands of the SEGMENT directive are intended for implementing the various options of such combining.

2) listing control directives. Listing control directives are divided into the following groups:

general listing control directives;

directives for output of include files to the listing;

directives for output of conditional assembly blocks;

directives for output of macro commands to the listing;

directives for output of cross-reference information to the listing;

directives for changing the listing format.

5 Processor command system

The processor command system is shown in Figure 14.

Consider the main groups of commands.

Figure 14 - Classification of assembler commands

The commands are:

1 Data transfer commands. These commands occupy a very important place in the command system of any processor. They perform the following major functions:

saving the contents of internal processor registers in memory;

copying contents from one memory area to another;

writing to I/O devices and reading from I/O devices.

In some processors, all these functions are performed by a single command, MOV (MOVB for byte transfers), with different operand addressing methods.

In other processors, there are several more commands besides MOV for performing these functions. The data transfer commands also include exchange commands (their designation is based on the word Exchange). Information can be exchanged between internal registers, between the two halves of one register (Swap) or between a register and a memory cell.

2 Arithmetic commands. Arithmetic commands treat operand codes as binary or binary-coded decimal numbers. These commands can be divided into five main groups:

fixed-point operation commands (addition, subtraction, multiplication, division);

floating-point operation commands (addition, subtraction, multiplication, division);

clear commands;

increment and decrement commands;

compare command.

3 Fixed-point operation commands work with codes in processor registers or in memory as ordinary binary codes. Floating-point operation commands use a number format with an exponent and a mantissa (usually these numbers occupy two consecutive memory cells). In modern powerful processors, the set of floating-point commands is not limited to the four arithmetic operations and contains many other, more complex commands, for example the calculation of trigonometric and logarithmic functions, as well as complex functions needed for sound and image processing.

4 Clear commands write a zero code into a register or memory cell. They can be replaced by commands that transfer a zero code, but special clear commands are usually faster than transfer commands.

5 Increment (increase by one) and decrement (decrease by one) commands are also very convenient. In principle they can be replaced by commands that add or subtract one, but increment and decrement are performed faster than addition and subtraction. These commands require one input operand, which is also the output operand.

6 The compare command compares two input operands. In essence, it computes the difference of the two operands, but it does not produce an output operand; it only changes bits in the processor status register according to the result of the subtraction. The command that follows the compare command (usually a jump command) analyzes the bits in the processor status register and acts according to their values. Some processors provide commands for chained comparison of two sequences of operands in memory.

7 Logical commands. Logical commands perform logical (bitwise) operations on operands, that is, they treat operand codes not as a single number but as a set of individual bits. This is how they differ from arithmetic commands. Logical commands perform the following basic operations:

logical AND, logical OR, addition modulo 2 (exclusive OR);

logical, arithmetic and cyclic shifts;

testing of bits and operands;

setting and clearing bits (flags) of the processor status register (PSW).

Logical operation commands compute the basic logic functions of two input operands bit by bit. In addition, the AND operation is used to force specified bits to zero (one of the operands is a mask code in which the bits to be cleared are set to zero). The OR operation is used to force specified bits to one (one of the operands is a mask code in which the bits to be set are equal to one). The exclusive OR operation is used to invert specified bits (one of the operands is a mask code in which the bits to be inverted are set to one). These commands require two input operands and produce one output operand.
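The three uses of a mask described above can be checked with a short sketch. The Python operators `&`, `|` and `^` play the role of the AND, OR and exclusive OR commands, and the function names are invented for the illustration.

```python
def clear_bits(value, mask):
    """AND: bits that are 0 in the mask are forced to 0 in the result."""
    return value & mask

def set_bits(value, mask):
    """OR: bits that are 1 in the mask are forced to 1 in the result."""
    return value | mask

def invert_bits(value, mask):
    """Exclusive OR: bits that are 1 in the mask are inverted in the result."""
    return value ^ mask
```

For example, `clear_bits(0b11111111, 0b11110000)` clears the four low-order bits, and `invert_bits(0b10101010, 0b00001111)` inverts only the four low-order bits and leaves the rest unchanged.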

8 Shift commands shift the operand code bit by bit to the right (toward the low-order bits) or to the left (toward the high-order bits). The shift type (logical, arithmetic or cyclic) determines what the new value of the high-order bit will be (on a shift to the right) or of the low-order bit (on a shift to the left), and also determines whether the previous value of the high-order bit (on a shift to the left) or of the low-order bit (on a shift to the right) is saved somewhere. Cyclic shifts move the bits of the operand code around a ring (clockwise on a shift to the right, counterclockwise on a shift to the left). The shift ring may or may not include the carry flag. The carry flag bit (if it is used) receives the value of the high-order bit on a cyclic shift to the left and of the low-order bit on a cyclic shift to the right. Accordingly, the previous value of the carry flag bit is written into the low-order bit on a cyclic shift to the left and into the high-order bit on a cyclic shift to the right.
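The difference between a shift ring without the carry flag and a ring that includes it can be shown with a sketch for an 8-bit operand. The names `rol8` and `rcl8` are chosen for the example by analogy with the x86 commands ROL and RCL. Each function returns the new operand value and the new value of the carry flag.

```python
def rol8(value):
    """Cyclic shift left by one bit; the carry flag is outside the ring."""
    high = (value >> 7) & 1                  # bit that leaves on the left
    result = ((value << 1) & 0xFF) | high    # ... and re-enters on the right
    return result, high                      # the carry flag receives a copy of it

def rcl8(value, carry):
    """Cyclic shift left by one bit through the carry flag (9-bit ring)."""
    high = (value >> 7) & 1                  # bit that goes into the carry flag
    result = ((value << 1) & 0xFF) | carry   # old carry enters the low-order bit
    return result, high
```

With the value 10000001b, `rol8` returns 00000011b, while `rcl8` with a zero carry flag returns 00000010b; in both cases the carry flag becomes 1.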

9 Jump commands. Jump commands are designed to organize all kinds of loops, branches, subroutine calls and so on, that is, they break the sequential flow of the program. These commands write a new value into the program counter register and thereby make the processor go not to the next command in order but to any other command in program memory. Some jump commands provide for a later return to the point from which the jump was made, others do not. If a return is provided, the current processor parameters are saved on the stack. If a return is not provided, the current processor parameters are not saved.

Jump commands without return are divided into two groups:

unconditional jumps;

conditional jumps.

The names of these commands use the words Branch and Jump.

Unconditional jump commands cause a jump to a new address regardless of anything. They can cause a jump by a specified offset (forward or backward) or to a specified memory address. The offset value or the new address value is given as an input operand.

Conditional jump commands cause a jump not always but only when specified conditions are met. The conditions are usually the values of flags in the processor status register (PSW). That is, the jump condition is the result of a preceding operation that changes the flag values. In total there may be from 4 to 16 such jump conditions. Some examples of conditional jump commands:

jump if zero;

jump if not zero;

jump on overflow;

jump on no overflow;

jump if greater than zero;

jump if less than or equal to zero.

If the jump condition is met, a new value is loaded into the program counter register. If the jump condition is not met, the program counter is simply incremented, and the processor fetches and executes the next command in order.

The compare command (CMP), placed before a conditional jump command (or even several conditional jump commands), is used specifically for checking jump conditions. But flags can also be set by any other command, such as a data transfer command or any arithmetic or logical command. Note that jump commands themselves do not change the flags, which makes it possible to place several jump commands one after another.
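The interaction of a compare command and the conditional jumps that follow it can be sketched as below. The model works with 8-bit operands and uses x86 flag names (ZF, CF, SF, OF) and x86 jump conditions as an assumption; other processors name their flags and conditions differently. The function names are invented for the illustration.

```python
def cmp8(a, b):
    """Model of a compare: subtract b from a, keep only the flags."""
    diff = (a - b) & 0xFF
    return {
        "ZF": diff == 0,                           # result is zero
        "CF": a < b,                               # unsigned borrow
        "SF": bool(diff & 0x80),                   # sign bit of the result
        "OF": bool((a ^ b) & (a ^ diff) & 0x80),   # signed overflow
    }

def jz(flags):
    """Jump if zero (operands are equal)."""
    return flags["ZF"]

def ja(flags):
    """Jump if above (unsigned greater)."""
    return not flags["CF"] and not flags["ZF"]

def jg(flags):
    """Jump if greater (signed greater)."""
    return not flags["ZF"] and flags["SF"] == flags["OF"]
```

Because the jump conditions only read the flags, several of them can be tested after one compare. For example, after `cmp8(0x7F, 0xFF)` the condition `jg` holds (127 is greater than -1 as signed numbers) while `ja` does not (127 is below 255 as unsigned numbers).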

Interrupt commands occupy a special place among the jump commands. These commands require an interrupt number (vector address) as an input operand.

Conclusions:

Assembler language is a symbolic representation of machine language. The assembler language is different for each type of computer. An assembler program is a collection of memory blocks called memory segments. Each segment contains a set of language statements, each of which occupies a separate line of program code. Assembler statements are of four types: commands or instructions, macro commands, directives, comment lines.

The characters allowed in program text are all Latin letters: A-Z, a-z (uppercase and lowercase letters are considered equivalent); digits from 0 to 9; signs ? , @ , $ , _ , & ; separators , . () < > { } + / * % ! " " ? = # ^ .

The following types of assembler operators are used, following the syntactic rules, to form assembler expressions: arithmetic operators, shift operators, comparison operators, logical operators, index operator, type override operator, segment override operator, structure type naming operator, operator for obtaining the segment component of the address of an expression, operator for obtaining the offset of an expression.

The command system is divided into 8 major groups.

Control questions:

1 What is an assembler language?

2 What characters can be used to write commands in assembler?

3 What are labels and what is their purpose?

4 Describe the structure of assembler commands.

5 List the 4 types of assembler statements.

Structures in the assembler language

The arrays considered above are collections of elements of the same type. But applications often need to treat some set of data of different types as a single type.

This is very relevant, for example, for database programs, where a set of data of different types must be associated with one object.

For example, we previously looked at Listing 4, which worked with an array of three-byte elements. Each element, in turn, consisted of two elements of different types: a one-byte counter field and a two-byte field that could carry some other information needed for storage and processing. A reader who is familiar with one of the high-level languages knows that such an object is usually described using a special data type, the structure.

To make the assembler language more convenient to use, this data type was introduced into it as well.

By definition, a structure is a data type consisting of a fixed number of elements of different types.

To use structures in a program, three actions must be performed:

Define the structure template.

In essence, this means defining a new data type, which can later be used to define variables of this type.

Define an instance of the structure.

This step means initializing a specific variable with the structure described earlier (by means of the template).

Organize access to the elements of the structure.

It is very important to understand from the very beginning the difference between describing a structure in a program and defining it.

Describing a structure in a program means only specifying its scheme or template; no memory is allocated.

The template can be regarded only as information for the translator about the location of the fields and their default values.

Defining a structure means instructing the translator to allocate memory and to assign a symbolic name to this memory area.

A structure can be described in a program only once, but defined any number of times.

Description of the structure template

The description of a structure template has the following syntax:

structure_name STRUC

<field descriptions>

structure_name ENDS

Here <field descriptions> is a sequence of the data definition directives db, dw, dd, dq and dt.

Their operands determine the size of the fields and, if necessary, the initial values. The corresponding fields will be initialized with these values when the structure is defined.

As already noted, no memory is allocated when the template is described, since it is only information for the translator.

The template can be placed anywhere in the program, but, following the logic of a one-pass translator, it must be located before the place where a variable with the type of this structure is defined. That is, when a variable with the type of some structure is described in the data segment, its template must be placed at the beginning of the data segment or before it.

Consider working with structures using the example of modeling a database of the employees of some department.

For simplicity, to avoid the problems of converting information on input, we agree that all fields are character fields.

We define the record structure of this database with the following template:

Defining data with a structure type

To use a structure described by a template in a program, a variable with the type of this structure must be defined. The following syntactic construct is used for this:

[variable name] structure_name <[list of values]>

variable name - the identifier of the variable of this structure type.

Specifying the variable name is optional. If it is not specified, a memory area whose size is the sum of the lengths of all elements of the structure is simply allocated.

list of values - a comma-separated list of initial values of the structure elements, enclosed in angle brackets.

Specifying it is also optional.

If the list is incomplete, all the remaining fields of the structure for this variable are initialized with the values from the template, if any are specified.

Individual fields may be initialized, but in this case the skipped fields must be separated by commas. The skipped fields are initialized with the values from the structure template. If, when defining a new variable with the type of this structure, we accept all the field values in its template (that is, the default values), then only the angle brackets need to be written.

For example: victor worker <>.

For example, we define several variables with the type described above.

Methods of working with structures

The idea of introducing a structure type into any programming language is to combine variables of different types into one object.

The language must provide a means of accessing these variables within a specific instance of a structure. To refer in a command to a field of a structure, a special operator is used: the symbol "." (period). It is used in the following syntactic construct:

address_expression.structure_field_name

address_expression - the identifier of a variable of some structure type, or an expression in brackets that follows the syntactic rules given below (Fig. 5);

structure_field_name - the name of a field from the structure template.

This is in fact also an address, or more precisely the offset of the field from the beginning of the structure.

Thus, the "." (period) operator computes the sum of the value of the address expression and the offset of the structure field.

Fig. 5. Syntax of the address expression in the structure field access operator

Using the worker structure we defined, we will demonstrate some techniques for working with structures.

For example, load the value of the age field into ax. Since the age of a working person is unlikely to exceed 99 years, after the contents of this character field are placed in the ax register it is convenient to convert them to binary representation with the aad command.

Be careful: because data is stored on the principle "low-order byte at the lower address", the high-order digit of the age will be placed in al and the low-order digit in ah.

To correct this, it is enough to use the command xchg al, ah:

mov ax, word ptr sotr1.age ; age of sotr1 in al

xchg ah, al

or it can be done like this:

Further work with an array of structures is performed in the same way as with a one-dimensional array. Several questions arise here:

How is the size handled, and how is the indexing of the array elements organized?

Like other identifiers defined in the program, the name of a structure type and the name of a variable of a structure type are given a type attribute by the translator. The value of this attribute is the size in bytes occupied by the fields of this structure. This value can be retrieved using the type operator.

Once the size of a structure instance is known, indexing in an array of structures presents no particular difficulty.

For example:

How is a field copied from one structure to the corresponding field of another structure? Or how is a whole structure copied? Let us copy the nam field of the third employee to the nam field of the fifth employee:

mas_sotr worker 10 dup (<>)

mov bx, offset mas_sotr

mov si, (type worker) * 2 ; si = 77 * 2

mov di, (type worker) * 4 ; di = 77 * 4
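The indexing used in this fragment can be modeled outside the assembler. The sketch below is in Python and rests on assumptions: the structure size 77 is taken from the comments in the fragment, while the position (offset 0) and the length (30 bytes) of the nam field are hypothetical, since the template itself is not shown here. The function names are also invented for the illustration.

```python
WORKER_SIZE = 77  # value of "type worker", taken from the comments above
NAM_OFFSET = 0    # hypothetical: nam is assumed to be the first field
NAM_LENGTH = 30   # hypothetical length of the nam field in bytes

def element_offset(index):
    """Offset of element number index (counting from 0) from the array start."""
    return WORKER_SIZE * index

def copy_nam(memory, src_index, dst_index):
    """Copy the nam field of one array element into another element."""
    src = element_offset(src_index) + NAM_OFFSET
    dst = element_offset(dst_index) + NAM_OFFSET
    memory[dst:dst + NAM_LENGTH] = memory[src:src + NAM_LENGTH]
```

The third employee has index 2 and the fifth has index 4, which gives the offsets 77 * 2 and 77 * 4 loaded into si and di. With `memory = bytearray(WORKER_SIZE * 10)` standing in for mas_sotr, the call `copy_nam(memory, 2, 4)` performs the copy.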

It seems to me that the programmer's craft sooner or later makes a person resemble a good housewife. Like her, the programmer is constantly looking for where to save something, where to trim, and how to make a wonderful dinner from a minimum of products. And when this succeeds, the moral satisfaction is no less, and perhaps more, than a housewife gets from a wonderful dinner. The degree of this satisfaction, it seems to me, depends on the degree of love for one's profession.

On the other hand, progress in software and hardware has a somewhat relaxing effect on the programmer, and quite often a situation arises that resembles the well-known proverb about the fly and the elephant: heavyweight tools are brought in to solve some small task, although their effectiveness is, in general, significant only in relatively large projects.

The presence of the following two data types in the language is probably explained by the desire of the "hostess" to use the working area of the table (RAM) as efficiently as possible when preparing food or laying out products (program data).

Assembler commands (lecture)

Lecture plan

1. Basic groups of operations.

2. Mnemonic codes of Pentium processor commands.

1. Basic groups of operations

Microprocessors execute a set of commands that implement the following main groups of operations:

transfer operations,

arithmetic operations,

logical operations,

shift operations,

comparison and testing operations,

bit operations,

program control operations;

processor control operations.

2. Mnemonic codes of Pentium processor commands

Commands are usually described by their mnemonic designations (mnemonic codes), which are used to specify a command when programming in assembler language. Some commands may have different mnemonic codes in different versions of the assembler. For example, the subroutine call command uses the mnemonic code CALL or JSR ("Jump to SubRoutine"). However, the mnemonic codes of most commands for the main types of microprocessors coincide or differ only slightly, since they are abbreviations of the English words that describe the operation performed. Consider the mnemonic codes of the commands adopted for Pentium processors.

Transfer commands. The main command of this group is MOV, which transfers data between two registers or between a register and a memory cell. Some microprocessors implement transfers between two memory cells, as well as group transfers of the contents of several registers to and from memory. For example, microprocessors of the Motorola 68xxx family execute the MOVE command, which transfers data from one memory cell to another, and the MOVEM command, which writes to memory or loads from memory the contents of a specified set of registers (up to 16 registers). The XCHG command exchanges the contents of two processor registers or of a register and a memory cell.

The input command IN and the output command OUT transfer data from a processor register to an external device or receive data from an external device into a register. These commands specify the number of the interface device (I/O port) through which the data is transferred. Note that many microprocessors have no special commands for accessing external devices. In this case, data input and output are performed with the MOV command, which specifies the address of the required interface device. The external device is thus addressed as a memory cell, and a certain part of the address space is set aside for the addresses of the interface devices (ports) connected to the system.

Commands of arithmetic operations. The main commands of this group are addition, subtraction, multiplication and division, which come in several variants. The addition command ADD and the subtraction command SUB perform the corresponding operations on the contents of two registers, of a register and a memory cell, or with an immediate operand. The ADC and SBB commands perform addition and subtraction taking into account the value of the carry flag C, which is set when a carry is produced by the preceding operation. These commands are used to add, part by part, operands whose width exceeds the width of the processor. The NEG command changes the sign of the operand by converting it to two's complement.
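Addition of operands wider than the processor word can be sketched as follows. The Python model assumes a 16-bit word and adds two 32-bit numbers in two steps: the low-order words with the equivalent of ADD, then the high-order words with the equivalent of ADC. The function names are invented for the example.

```python
def add16(a, b, carry_in=0):
    """16-bit addition; returns (result, carry out).

    With carry_in = 0 this models ADD, with the previous carry it models ADC.
    """
    total = a + b + carry_in
    return total & 0xFFFF, total >> 16

def add32_via_16(x, y):
    """Add two 32-bit numbers using only 16-bit additions."""
    low, carry = add16(x & 0xFFFF, y & 0xFFFF)   # ADD on the low-order words
    high, carry = add16(x >> 16, y >> 16, carry)  # ADC on the high-order words
    return (high << 16) | low, carry
```

For example, adding 0001FFFFh and 00000001h produces a carry out of the low-order word, which the second step adds to the high-order words to give 00020000h.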

Multiplication and division operations can be performed above numbers with a sign (teamsI. MUL, I. Div ) or an instance (commands Mul, Div ). The idearanDeradoDevexEvegistra, the second may be in the register, memory cell or be direct operand. The result of the operation is located in the register. When multiplying (commandsMul. , Immul ) It turns out the result of a double bit, for the placement of which two registers are used. When dividing (teamsDiv , Idiv ) The dual-time discharge operand is used as a division, placed in two registers, and as a result, a private and residue is recorded in two registers.

Commands of logical operations . Almost all microprocessors produce logical operations, or eliminating or, which are performed over the same names of the operands using commands And, or, X. Or. . Operations are performed on the contents of two registers, register and memory cells or using the immediate operand. Team Not. Inverts the value of each operand discharge.

Shift commands. Microprocessors perform arithmetic, logical and cyclic (rotate) shifts of the addressed operands by one or more bits. The operand to be shifted may be in a register or a memory cell, and the number of bit positions is given either by an immediate operand contained in the command or by the contents of a specified register. The shift usually involves the carry flag C in the status register (SR or EFLAGS), which receives the last bit shifted out of the register or memory cell.
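The difference between the shift kinds, and the role of the carry flag, can be sketched in Python. The 8-bit width is chosen only for readability, the names shl, shr, sar and rol follow x86 mnemonics, and the shift count is assumed to lie between 1 and WIDTH - 1.

```python
WIDTH = 8                  # assumed operand width, kept small for readability
MASK = (1 << WIDTH) - 1


def shl(value, count=1):
    """Shift left; returns (result, carry)."""
    carry = (value >> (WIDTH - count)) & 1      # last bit pushed out on the left
    return (value << count) & MASK, carry


def shr(value, count=1):
    """Logical shift right: vacated bits are filled with zeros."""
    carry = (value >> (count - 1)) & 1          # last bit pushed out on the right
    return value >> count, carry


def sar(value, count=1):
    """Arithmetic shift right: the sign bit is replicated."""
    carry = (value >> (count - 1)) & 1
    result = value >> count
    if value >> (WIDTH - 1):                    # negative number in two's complement
        result |= (MASK << (WIDTH - count)) & MASK
    return result, carry


def rol(value, count=1):
    """Rotate left: bits leaving on the left re-enter on the right."""
    result = ((value << count) | (value >> (WIDTH - count))) & MASK
    return result, result & 1                   # carry = last bit that wrapped around
```

For example, shifting 0x90 (the 8-bit value -112) right arithmetically by two positions gives 0xE4 (-28), while the logical shift would give 0x24.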

Compare and test commands. Operands are usually compared with the CMP command, which subtracts the operands and sets the flags N, Z, V, C in the status register according to the result. The result of the subtraction is not saved, and the values of the operands do not change. Subsequent analysis of the flag values makes it possible to determine the relative magnitude (>, <, =) of signed or unsigned operands. The use of different addressing modes makes it possible to compare the contents of two registers, a register and a memory cell, or an immediate operand with the contents of a register or memory cell.
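How the four flags encode the outcome of a comparison can be shown with a short Python model. It assumes 8-bit operands given as unsigned bit patterns and the x86 convention that C means "borrow" after a subtraction (some other processors invert this flag); the helper names are hypothetical.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)


def cmp_flags(a, b):
    """Model of CMP a, b: compute a - b and keep only the flags."""
    diff = (a - b) & MASK
    return {
        "N": int(bool(diff & SIGN)),                  # sign bit of the result
        "Z": int(diff == 0),                          # result is zero
        "C": int(a < b),                              # borrow out (x86 convention)
        "V": int(bool((a ^ b) & (a ^ diff) & SIGN)),  # signed overflow
    }


def unsigned_less(flags):
    """a < b for unsigned operands."""
    return flags["C"] == 1


def signed_less(flags):
    """a < b for signed operands."""
    return flags["N"] != flags["V"]
```

The same pair of bit patterns can compare differently in the two interpretations: 0x01 against 0xFF is "less" as unsigned numbers (1 < 255) but "greater" as signed numbers (1 > -1).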

Some microprocessors have a test command TST, which is a single-operand version of the compare command. When this command is executed, the flags N and Z are set according to the sign and value (equal or not equal to zero) of the addressed operand.

Bit operation commands. These commands set the flag C in the status register according to the value of the tested bit bn in the addressed operand. In some microprocessors, the flag Z is set according to the result of the bit test. The number n of the tested bit is given either by the contents of a register named in the command or by an immediate operand.

The commands of this group implement different ways of changing the tested bit. The BT command leaves the value of the bit unchanged. The BTS command sets bn = 1 after the test, and the BTR command sets bn = 0. The BTC command inverts the value of the bit bn after testing it.
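A compact Python sketch of this command group, using the x86 mnemonics BT, BTS, BTR and BTC as function names. Each function returns the new operand value together with the carry flag, which always receives the bit as it was before the change.

```python
def bt(value, n):
    """BT: copy bit n into C, operand unchanged."""
    return value, (value >> n) & 1


def bts(value, n):
    """BTS: copy bit n into C, then set the bit to 1."""
    return value | (1 << n), (value >> n) & 1


def btr(value, n):
    """BTR: copy bit n into C, then reset the bit to 0."""
    return value & ~(1 << n), (value >> n) & 1


def btc(value, n):
    """BTC: copy bit n into C, then invert the bit."""
    return value ^ (1 << n), (value >> n) & 1
```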

Program control operations. A large number of commands are used to control the flow of a program, among which the following can be distinguished:

- unconditional transfer of control commands;

- conditional jump commands;

- program loop commands;

- interrupt commands;

- flag modification commands.

Unconditional transfer of control is performed by the JMP command, which loads new contents into the program counter PC; these contents are the address of the next command to be executed. This address is either given directly in the JMP command (direct addressing) or computed as the sum of the current contents of PC and an offset specified in the command, which is a signed number (relative addressing). Since PC contains the address of the next command of the program, the latter method gives a jump address displaced from the next address by the specified number of bytes. With a positive offset the jump goes forward to later commands of the program, with a negative offset back to earlier ones.
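The relative-addressing arithmetic can be sketched in Python as follows. The helper names and the default 8-bit offset field are assumptions for illustration; real encodings also use 16-bit and 32-bit offsets.

```python
def to_signed(value, bits):
    """Interpret a raw field of the given width as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def relative_target(instr_addr, instr_len, raw_offset, offset_bits=8):
    """Target of a relative jump: address of the next command plus signed offset."""
    next_addr = instr_addr + instr_len   # value held in PC while the jump executes
    return next_addr + to_signed(raw_offset, offset_bits)
```

A two-byte jump at address 0x100 with offset byte 0xFE (that is, -2) therefore targets 0x100 again, which is the classic "jump to itself" loop.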

A subroutine call is also an unconditional transfer of control, performed with the CALL (or JSR) command. In this case, however, before the new contents giving the address of the first command of the subroutine are loaded into PC, its current value (the address of the next command) must be saved, so that after the subroutine has executed, control returns to the main program (or to the calling subroutine when subroutines are nested). Conditional jump commands (program branches) load new contents into PC if certain conditions are met, which are usually defined by the current values of various flags in the status register. If the condition is not met, the next command of the program is executed.
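The saving and restoring of the return address can be modelled with a toy class. This is a hypothetical sketch: TinyCpu is not a real processor model, and it assumes that return addresses are kept on a stack, as x86 does.

```python
class TinyCpu:
    """Toy model with only a program counter and a return-address stack."""

    def __init__(self, pc=0):
        self.pc = pc
        self.stack = []

    def call(self, target, instr_len):
        """CALL: save the address of the next command, then jump."""
        self.stack.append(self.pc + instr_len)
        self.pc = target

    def ret(self):
        """RET: resume at the most recently saved address."""
        self.pc = self.stack.pop()

    def jump_if(self, condition, target, instr_len):
        """Conditional jump: taken only when the condition holds."""
        self.pc = target if condition else self.pc + instr_len
```

Because the saved addresses form a stack, nested calls return in reverse order of the calls, first to the calling subroutine and then to the main program.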

Flag control commands write and read the contents of the status register in which the flags are stored, and change the values of individual flags. For example, Pentium processors implement the commands LAHF and SAHF: LAHF loads the low byte of the status register EFLAGS, which contains the flags, into the AH register (the second byte of EAX), and SAHF fills the low byte of EFLAGS from AH. The commands CLC and STC set the carry flag to CF = 0 and CF = 1, and the command CMC inverts the value of this flag. Since the flags determine the course of program execution at conditional jumps, flag modification commands are commonly used to control the program.
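A hedged Python sketch of these flag commands is given below. The bit positions are those of the x86 EFLAGS register (CF in bit 0, PF in bit 2, AF in bit 4, ZF in bit 6, SF in bit 7); registers are modelled as plain integers, and on x86 the byte that LAHF and SAHF work with is AH, that is, bits 8 to 15 of EAX.

```python
CF, PF, AF, ZF, SF = 1 << 0, 1 << 2, 1 << 4, 1 << 6, 1 << 7  # x86 flag bits


def clc(eflags):
    """CLC: clear the carry flag."""
    return eflags & ~CF


def stc(eflags):
    """STC: set the carry flag."""
    return eflags | CF


def cmc(eflags):
    """CMC: invert the carry flag."""
    return eflags ^ CF


def lahf(eflags, eax):
    """LAHF: copy the low byte of EFLAGS into AH; returns the new EAX."""
    return (eax & ~0xFF00) | ((eflags & 0xFF) << 8)


def sahf(eflags, eax):
    """SAHF: copy SF, ZF, AF, PF, CF from AH into EFLAGS; returns new EFLAGS."""
    mask = SF | ZF | AF | PF | CF
    return (eflags & ~mask) | ((eax >> 8) & mask)
```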

Processor control commands. This group includes the halt command, the no-operation command, and a number of commands that determine the operating mode of the processor or of its individual units. The HLT command stops program execution and puts the processor into the halted state, which it leaves when an interrupt or reset signal (Reset) arrives. The NOP ("empty") command, which causes no operation to be performed, is used to implement software delays or to fill gaps left in the program.

The special commands CLI and STI disable and enable the servicing of interrupt requests. In Pentium processors, the control bit (flag) IF in the EFLAGS register is used for this purpose.

Many modern microprocessors have an identification command that allows the user or other devices to obtain information about the type of processor used in the system. In Pentium processors this is the CPUID command; when it is executed, the required data about the processor is placed in the registers EAX, EBX, ECX, EDX and can then be read by the user or the operating system.

Depending on the operating modes implemented by the processor and the data types it handles, the set of executable commands can be considerably larger.

Some processors perform arithmetic operations on binary-coded decimal (BCD) numbers, or provide special commands that correct the result when such numbers are processed. Many high-performance processors include an FPU, a unit for processing floating-point numbers.

In a number of modern processors, several integers or floating-point numbers are processed together by a single command, following the SIMD principle ("Single Instruction - Multiple Data"), that is, "one command, many data". Performing an operation on several operands simultaneously significantly increases processor performance when working with video and audio data. Such operations are widely used for processing images and sound signals and in other applications. To perform them, processors contain special units that implement the corresponding command sets, which in various processor types (Pentium, Athlon) are called MMX ("Multi-Media Extension", multimedia extension), SSE ("Streaming SIMD Extension", streaming SIMD extension) and "3D Extension" (three-dimensional extension).
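The idea of one command acting on several data items can be imitated in Python. The sketch below treats a 64-bit integer as eight independent unsigned bytes, roughly in the spirit of the MMX packed-byte additions (wrap-around and saturating); the function name packed_add_bytes is an assumption, and the loop stands in for what the hardware does on all lanes at once.

```python
def packed_add_bytes(a, b, saturate=False):
    """Add eight unsigned bytes packed into two 64-bit values, lane by lane."""
    result = 0
    for lane in range(8):
        shift = 8 * lane
        total = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)
        # No carry crosses a lane boundary: each byte wraps or saturates alone.
        total = min(total, 0xFF) if saturate else total & 0xFF
        result |= total << shift
    return result
```

Saturation is useful for pixel data: a bright pixel that overflows stays at the maximum value 0xFF instead of wrapping around to a dark one.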

A characteristic feature of Intel processors, starting with the 80286 model, is priority control of memory access, which is provided when the processor operates in protected virtual address mode ("Protected Mode"). To implement this mode, special groups of commands are used that organize memory protection in accordance with the adopted priority algorithm.

The command structure in the assembler language. Programming at the level of machine commands is the lowest level at which computer programming is possible. The set of machine commands must be sufficient to carry out the required actions by issuing instructions to the machine's hardware. Each machine command consists of two parts: the operation part, which defines "what to do", and the operand part, which defines the objects to be processed, that is, "what to do it to". A microprocessor machine command written in the assembler language is a single line of the following form:

    label  command/directive  operand(s)  ; comment

The label, the command/directive and the operands are separated by at least one space or tab character. The operands of a command are separated by commas.

An assembler command tells the translator which action the microprocessor must perform. Assembler directives are parameters specified in the program text that affect the assembly process or the properties of the output file. An operand defines the initial value of data (in the data segment) or the elements on which the command acts (in the code segment). A command can have one or two operands, or none at all. The number of operands is implied by the command code. If a command or directive has to be continued on the next line, the backslash character "\" is used. By default, the assembler does not distinguish between upper-case and lower-case letters in commands and directives. Examples of a directive and a command:

    COUNT DB 1      ; name, directive, one operand
    MOV EAX, 0      ; command, two operands

Identifiers are sequences of permitted characters used to denote variable names and label names. An identifier may consist of one or more of the following characters: all letters of the Latin alphabet; the digits 0 to 9; the special characters _, @, $, ?. A period may be used as the first character of a label. Identifiers cannot be reserved assembler names (directives, operators, command names). The first character of an identifier must be a letter or a special character. The maximum length of an identifier is 255 characters, but the translator recognizes only the first 32 and ignores the rest.
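The identifier rules listed above translate naturally into a small checker. The Python sketch below is only an illustration: the reserved-word set is a tiny hypothetical subset, and the function names are assumptions rather than anything provided by a real translator.

```python
import re

# First character: letter or special character; then letters, digits, specials.
_IDENT = re.compile(r"[A-Za-z_@$?][A-Za-z0-9_@$?]*\Z")
RESERVED = {"MOV", "ADD", "DB", "DD", "END", "OFFSET"}  # illustrative subset only
MAX_LENGTH = 255
SIGNIFICANT = 32   # characters the translator is said to distinguish


def is_valid_identifier(name):
    """Check length, permitted characters and reserved names."""
    return (len(name) <= MAX_LENGTH
            and _IDENT.match(name) is not None
            and name.upper() not in RESERVED)


def same_identifier(a, b):
    """Names that agree in their first 32 characters are indistinguishable."""
    return a[:SIGNIFICANT].upper() == b[:SIGNIFICANT].upper()
```

The comparison is case-insensitive because, by default, the assembler does not distinguish upper-case and lower-case letters.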

Labels. All labels written on a line that does not contain an assembler directive must end with a colon ":". The label, the command (directive) and the operand do not have to begin at any particular position in the line. It is recommended to write them in columns to make the program more readable.

Comments. Using comments in a program improves its clarity, especially where the purpose of a group of commands is not obvious. A comment can begin anywhere on a line of the source module with a semicolon (;). All characters to the right of ";" up to the end of the line are the comment. A comment may contain any printable characters, including spaces. A comment may occupy a whole line or follow a command on the same line.
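The line layout described above (an optional label ending in a colon, a command, comma-separated operands, and a comment after a semicolon) can be sketched as a small Python splitter. It is a deliberately simplified assumption: it handles only colon-terminated labels, so a data definition such as COUNT DB 1 is not recognized as having a name, and a semicolon inside a string literal is not treated specially.

```python
def split_line(line):
    """Split one source line into (label, mnemonic, operands, comment)."""
    code, _, comment = line.partition(";")   # everything after ';' is comment
    tokens = code.split(None, 1)             # first word and the rest
    label = None
    if tokens and tokens[0].endswith(":"):
        label = tokens[0][:-1]
        tokens = tokens[1].split(None, 1) if len(tokens) > 1 else []
    mnemonic = tokens[0].upper() if tokens else None
    operands = ([op.strip() for op in tokens[1].split(",")]
                if len(tokens) > 1 else [])
    return label, mnemonic, operands, comment.strip()
```

The mnemonic is upper-cased because the assembler does not distinguish letter case in commands, while operands are left as written.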

Program structure in the assembler language. A program written in the assembler language may consist of several parts, called modules, each of which can define one or more data, stack and code segments. Any complete assembler program must include one main module, from which its execution begins. A module may contain code segments, data segments and stack segments, declared with the appropriate directives.

Memory models. Before declaring segments, the memory model must be specified with the directive

    .MODEL modifier memory_model, calling_convention, OS_type, stack_parameter

Basic memory models of the assembler language:

    Memory model  Code addressing  Data addressing  Operating system                      Interleaving of code and data
    TINY          NEAR             NEAR             MS-DOS                                Allowed
    SMALL         NEAR             NEAR             MS-DOS, Windows                       No
    MEDIUM        FAR              NEAR             MS-DOS, Windows                       No
    COMPACT       NEAR             FAR              MS-DOS, Windows                       No
    LARGE         FAR              FAR              MS-DOS, Windows                       No
    HUGE          FAR              FAR              MS-DOS, Windows                       No
    FLAT          NEAR             NEAR             Windows 2000, Windows XP, Windows NT  Allowed

The TINY model works only in 16-bit MS-DOS applications. In this model, all data and code are located in one physical segment, and the size of the program file does not exceed 64 KB. The SMALL model supports one code segment and one data segment; data and code in this model are addressed as near (NEAR). The MEDIUM model supports several code segments and one data segment; by default, all references within code segments are treated as far (FAR), and references within the data segment as near (NEAR). The COMPACT model supports several data segments, which use far data addressing (FAR), and one code segment with near addressing (NEAR). The LARGE model supports several code segments and several data segments; by default, all references to code and data are treated as far (FAR). The HUGE model is almost equivalent to the LARGE memory model.

The FLAT model assumes an unsegmented program layout and is used only in 32-bit operating systems. It is similar to the TINY model in that data and code are placed in one segment, except that the segment is 32-bit. To develop a program for the FLAT model, one of the directives .386, .486, .586 or .686 must be placed before the .MODEL FLAT directive. The processor selection directive determines the set of commands available when writing programs. The letter P after the processor selection directive additionally enables the privileged commands used in protected mode. Data and code addressing is near (NEAR), and all addresses and pointers are 32-bit.

In the .MODEL directive, the modifier parameter determines the segment types and can take the values USE16 (the segments of the selected model are used as 16-bit) or USE32 (the segments of the selected model are used as 32-bit). The calling_convention parameter determines how parameters are passed when a procedure is called from other languages, including high-level languages (C++, Pascal). It can take the following values: C, BASIC, FORTRAN, PASCAL, SYSCALL, STDCALL.

The OS_type parameter is OS_DOS by default, and at present this is the only supported value of this parameter. The stack_parameter is set to NEARSTACK (the SS register is equal to DS; the data area and the stack are placed in the same physical segment) or FARSTACK (the SS register is not equal to DS; the data area and the stack are placed in different physical segments). The default is NEARSTACK.

An example of a "do-nothing" program:

    .686P
    .MODEL FLAT, STDCALL
    .DATA
    .CODE
    START:
        RET
    END START

RET is a microprocessor command. It ensures that the program terminates correctly. The rest of the program relates to the operation of the translator. .686P allows the commands of the Pentium 6 family (Pentium II), including protected-mode commands. This directive selects the supported set of assembler commands by indicating the processor model. .MODEL FLAT, STDCALL selects the flat memory model, which is the model used in the Windows operating system. STDCALL is the procedure calling convention used.

.DATA is the program segment containing data. This program does not use the stack, so the .STACK segment is absent. .CODE is the program segment containing code. START is a label. END START marks the end of the program and tells the translator that execution of the program must begin at the START label. Every program must contain an END directive marking the end of its source code. All lines that follow the END directive are ignored. The label given after the END directive tells the translator the name of the main module, from which execution of the program begins. If the program consists of a single module, the label after the END directive may be omitted.

Assembler language translators. A translator is a program or hardware device that converts a program written in one programming language into a program in a target language, called object code. Besides supporting the mnemonics of machine commands, each translator has its own set of directives and macro facilities, often incompatible with anything else. The main assembler language translators are: MASM (Microsoft Assembler); TASM (Borland Turbo Assembler); FASM (Flat Assembler), a freely distributed multi-pass assembler written by Tomasz Grysztar (Poland); NASM (Netwide Assembler), a free assembler for the Intel x86 architecture, created by Simon Tatham together with Julian Hall and currently developed by a small team of developers on SourceForge.net.

Translating a program in Microsoft Visual Studio 2005.

1) Create a project by choosing File->New->Project and specifying the project name (hello.prj) and the project type Win32 Project. In the additional options of the project wizard, select "Empty Project".

2) In the project tree (View->Solution Explorer), add the file that will contain the program text: Source Files->Add->New Item.

3) Select the file type Code C++, but specify a name with the extension .asm.

5) Set the compiler options. Right-click the project and choose the menu item Custom Build Rules..., and in the window that appears select Microsoft Macro Assembler. To check, right-click the file hello.asm in the project tree, choose Properties and set General->Tool: Microsoft Macro Assembler.

6) Compile the file by choosing Build->Build hello.prj.

7) Run the program by pressing F5 or choosing Debug->Start Debugging.

Programming in Windows. Programming for Windows is based on the use of API functions (Application Program Interface). Their number reaches 2000. A Windows program consists largely of such calls. All interaction with external devices and operating system resources takes place, as a rule, through such functions. The Windows operating system uses a flat memory model: the address of any memory cell is determined by the contents of a single 32-bit register. Three types of program structure are possible for Windows: dialog-based (the main window is a dialog), console or windowless structure, and classic structure (windowed, frame-based).

Calling Windows API functions. In the help file, any API function is presented in the form

    type function_name(FA1, FA2, FA3)

where type is the type of the return value and FAn is the list of formal arguments in order. For example,

    int MessageBox(HWND hWnd, LPCTSTR lpText, LPCTSTR lpCaption, UINT uType);

This function displays a window with a message and an exit button (or buttons). The meaning of the parameters: hWnd is the handle of the window in which the message window will appear, lpText is the text that will appear in the window, lpCaption is the text in the window title, and uType is the window type; in particular, it can define the number of exit buttons.

Almost all parameters of API functions are in reality 32-bit integers: HWND is a 32-bit integer, LPCTSTR is a 32-bit pointer to a string, UINT is a 32-bit integer. The suffix "A" is often added to the function name; it denotes the ANSI-string version of the function (MessageBoxA), as opposed to the Unicode version with the suffix "W".

When MASM is used, @N must be added at the end of the function name, where N is the number of bytes that the passed arguments occupy on the stack. For Win32 API functions this number can be determined as the number of arguments n multiplied by 4 (the number of bytes in each argument): N = 4 * n. The function is called with the assembler command CALL. All arguments of the function are passed to it through the stack (command PUSH). The arguments are pushed from right to left, so that on the stack they lie left to right from the bottom up; the uType argument is pushed first. The call of the function above looks like this: CALL MessageBoxA@16
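The naming rule and the push order can be checked with a short Python sketch that generates the text of the call sequence. The helper names decorated_name and call_sequence are hypothetical, and the operand spellings (HW, STR1, STR2, MB_OK) are placeholders for whatever the program defines.

```python
def decorated_name(name, arg_count, bytes_per_arg=4):
    """Append @N, where N is the stack space occupied by the arguments."""
    return f"{name}@{arg_count * bytes_per_arg}"


def call_sequence(name, args):
    """Emit PUSH lines (last argument first), followed by the CALL line."""
    lines = [f"PUSH {arg}" for arg in reversed(args)]
    lines.append(f"CALL {decorated_name(name, len(args))}")
    return lines
```

For a call with the arguments (HW, OFFSET STR2, OFFSET STR1, MB_OK), the generated sequence starts with PUSH MB_OK and ends with CALL MessageBoxA@16.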

The result of any API function is, as a rule, an integer returned in the EAX register. The OFFSET directive gives the "offset within the segment" or, in terms of high-level languages, a "pointer" to the beginning of a string. The EQU directive, like #define in the C language, defines a constant. The EXTERN directive tells the translator that a function or identifier is external to this module.

Example of the program "Hello everyone!":

    .686P
    .MODEL FLAT, STDCALL
    .STACK 4096
    .DATA
    MB_OK   EQU 0
    STR1    DB "My First Program", 0
    STR2    DB "Hello everyone!", 0
    HW      DD ?
    EXTERN  MessageBoxA@16:NEAR
    .CODE
    START:
        PUSH MB_OK
        PUSH OFFSET STR1
        PUSH OFFSET STR2
        PUSH HW
        CALL MessageBoxA@16
        RET
    END START

The INVOKE directive. The MASM translator makes it possible to simplify function calls with the macro facility INVOKE:

    INVOKE function, parameter1, parameter2, ...

There is no need to add @16 to the function name, and the parameters are written in exactly the order in which they are given in the function description. The translator's macro facilities place the parameters on the stack. To use the INVOKE directive, a description of the function prototype must be given with the PROTO directive in the form:

    MessageBoxA PROTO :DWORD, :DWORD, :DWORD, :DWORD

If the program uses many Win32 API functions, it is advisable to use the directive include C:\masm32\include\user32.inc