Terminology and basic concepts of relational databases. The overall characteristics of the relational data model. Introduction to data normalization

Database language support

Special languages, generally called database languages, are used to work with a database.

The first database systems used two languages:

1. SDL, the schema definition language.

2. DML, the data manipulation language.

The first served to define the logical structure of the database, and the second contained a set of operators for manipulating the data, that is, for entering data into the database and deleting it. A modern DBMS usually supports a single language that contains all the facilities needed to work with the database. This language makes it possible both to create a database and to give users access to it.

To date, the most common such language is SQL (Structured Query Language).

This language supports the creation of a database schema and allows the data to be manipulated. It contains all the means needed to ensure the integrity of the database. The integrity constraints are kept in special catalogs, which makes it possible to control the state of the whole database at the language level. Special SQL operators define so-called database views. A view is a query that is stored in the database. To the user, a view is a table with which the visibility of the database can be narrowed or widened for a particular user. SQL also contains special operators that authorize access to database objects. Since different users have different privileges for working with the data, these privileges are described in special catalog tables that are maintained at the language level.

The main concepts of relational databases are: data type, domain, attribute, tuple, primary key, relation.

In the relational model, a data type is understood in the same way as a data type in programming languages: data may be character data, numeric data, bit strings, specialized numeric data (money), or special temporal data (time, time interval).

In the most general form, a domain is defined by specifying some base data type to which the elements of the domain belong. A domain is best understood as a permissible set of values in the database. The domain carries a semantic load: data are considered comparable only when they belong to the same domain.

A tuple is understood as a set of pairs that contains one entry for each attribute belonging to the relation schema.

A relation schema is a named set of pairs <attribute name, domain name>.

A tuple is a set of pairs <attribute name, value>, i.e., a set of named values of the given types.

A relation is a set of tuples that correspond to a single schema, and a relational database is a set of relations whose names coincide with the names of the relation schemas in the database schema.

Lecture 4.

· Basic concepts of relational databases

We single out the following basic concepts of relational databases: data type, domain, attribute, tuple, relation, primary key.

To begin with, we show the meaning of these concepts using the example of the relation EMPLOYEES, which contains information about the employees of some enterprise (Fig. 2.1).

Fig. 2.1.

· Data type

Data values stored in a relational database are typed, i.e., the type of each stored value is known. The concept of a data type in the relational data model corresponds fully to the concept of a data type in programming languages. Recall that the traditional (informal) definition of a data type has three main components: a definition of the set of values of the type; a definition of the set of operations applicable to values of the type; and a definition of the external representation of values of the type (literals).

Modern relational databases usually allow the storage of character data, numeric data (exact and approximate), specialized numeric data (such as "money"), and special "temporal" data (date, time, time interval). In addition, relational systems let users define their own data types.

In the example in Fig. 2.1 we deal with data of three types: character strings, integers and "money".

· Domain

The concept of a domain is more specific to databases, although it has analogies with subtypes in some programming languages. In general, a domain is defined by specifying a base data type to which the elements of the domain belong and an arbitrary logical expression applied to an element of that data type (the domain constraint). A data element is an element of the domain if and only if evaluating this logical expression yields true (we write the logical values as true and false). Each domain has a name that is unique among the names of all domains of the database.

The most intuitive reading of the concept of a domain is to see it as a permissible, potential, restricted subset of the values of a given type. For example, the domain NAMES in our example is defined on the base type of character strings, but its values may include only those strings that can represent names (in particular, to represent Russian surnames, such strings cannot begin with a soft or hard sign and cannot be longer than, say, 20 characters). If an attribute of a relation is defined on some domain (as, for example, the employee-name attribute in Fig. 2.1 is defined on the domain NAMES), then from that point on the domain constraint acts as an integrity constraint imposed on the values of that attribute.
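The definition of a domain as a base type plus a logical expression can be sketched as follows. This is a minimal illustration, not part of any DBMS: the class `Domain`, its method `contains` and the concrete constraint for NAMES are assumptions made only for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Domain:
    name: str                             # unique among the domains of the database
    base_type: type                       # base data type of the elements
    constraint: Callable[[object], bool]  # the domain constraint (a logical expression)

    def contains(self, value) -> bool:
        # A value is an element of the domain if and only if it has the
        # base type and the constraint evaluates to true for it.
        return isinstance(value, self.base_type) and bool(self.constraint(value))


# Hypothetical NAMES domain: non-empty strings of at most 20 characters
# that do not begin with the Russian soft or hard sign.
NAMES = Domain("NAMES", str, lambda s: 0 < len(s) <= 20 and s[0] not in "ьъЬЪ")

print(NAMES.contains("Ivanov"))   # a permissible name
print(NAMES.contains("x" * 21))   # too long
print(NAMES.contains(2934))       # wrong base type
```

Declaring an attribute on such a domain would mean calling `contains` on every value before it is stored, which is how the domain constraint turns into an integrity constraint on the attribute.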

The concept of a domain also carries a semantic load: data are considered comparable only when they belong to the same domain. In our example, the values of the domains PASS_NUMBERS and DEPT_NUMBERS are both of integer type, but they are not comparable (allowing them to be compared would be meaningless).

· Elements of a relation

The concept of a relation is the most fundamental one in the relational approach to database organization, since the n-ary relation is the only generic data structure stored in a relational database. This is reflected in the name of the approach: the term relational derives from relation. However, the term relation by itself is extremely imprecise, because when we speak of any stored data we must keep in mind the type of the data, the values of that type, and the variables in which the values are stored. Accordingly, to refine the term, the concepts of relation header, relation value and relation variable are distinguished. In addition, we need the auxiliary concept of a tuple.

So, the header (or schema) Hr of relation r is a finite set of ordered pairs of the form <A, T>, where A is the name of an attribute and T denotes the name of some base type or of a previously defined domain. By definition, all attribute names in a relation header must be distinct. In the example in Fig. 2.1, the header of the relation EMPLOYEES is the set of pairs {<EMP_NO, PASS_NUMBERS>, <EMP_NAME, NAMES>, <EMP_SAL, PAY_AMOUNTS>, <EMP_DEPT_NO, DEPT_NUMBERS>}.

If all the attributes of a relation header are defined on different domains, then, to avoid multiplying names, it is reasonable to name the attributes after their domains (bearing in mind, of course, that this is only a convenient naming convention and does not remove the distinction between the concepts of domain and attribute).

A tuple tr corresponding to header Hr is a set of ordered triples of the form <A, T, v>, one triple for each attribute in Hr. The third element, v, of a triple <A, T, v> must be a permissible value of the data type or domain T. The header of the relation EMPLOYEES is matched, for example, by the following tuples: {<EMP_NO, PASS_NUMBERS, 2934>, <EMP_NAME, NAMES, Ivanov>, <EMP_SAL, PAY_AMOUNTS, 22.000>, <EMP_DEPT_NO, DEPT_NUMBERS, 310>}, {<EMP_NO, PASS_NUMBERS, 2940>, <EMP_NAME, NAMES, Kuznetsov>, <EMP_SAL, PAY_AMOUNTS, 35.000>, <EMP_DEPT_NO, DEPT_NUMBERS, 320>}.

The body Br of relation r is an arbitrary set of tuples tr. One possible body of the relation EMPLOYEES is shown in Fig. 2.1. Note that in the general case, as Fig. 2.1 and the example in the previous paragraph demonstrate, there may exist tuples tr that correspond to Hr but are not included in Br.

The value Vr of relation r is the pair of sets Hr and Br. One permissible value of the relation EMPLOYEES is shown in Fig. 2.1.
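The definitions of header, tuple, body and relation value can be mirrored with ordinary sets. The sketch below is only an illustration under simplifying assumptions: Python types stand in for domains, a tuple is stored as a set of <attribute, value> pairs rather than triples, the helper names `make_tuple` and `conforms` are invented for the example, and the salary figures are illustrative.

```python
# Header Hr: a set of <attribute name, type> pairs with distinct attribute names.
HEADER = frozenset({
    ("EMP_NO", int), ("EMP_NAME", str), ("EMP_SAL", float), ("EMP_DEPT_NO", int),
})


def make_tuple(**values):
    """A tuple as a set of <attribute name, value> pairs; order is irrelevant."""
    return frozenset(values.items())


def conforms(t, header):
    """True if t has exactly one value of the right type for every attribute."""
    types, vals = dict(header), dict(t)
    return vals.keys() == types.keys() and all(
        isinstance(vals[a], types[a]) for a in types
    )


t1 = make_tuple(EMP_NO=2934, EMP_NAME="Ivanov", EMP_SAL=22000.0, EMP_DEPT_NO=310)
t2 = make_tuple(EMP_NO=2940, EMP_NAME="Kuznetsov", EMP_SAL=35000.0, EMP_DEPT_NO=320)

# Body Br: an arbitrary set of conforming tuples; relation value Vr = (Hr, Br).
body = frozenset({t1, t2, t1})   # the repeated t1 collapses: a set has no duplicates
relation_value = (HEADER, body)

print(len(body))                                # 2
print(all(conforms(t, HEADER) for t in body))   # True
```

Because both the header and each tuple are sets, neither attributes nor tuples have a position, which matches the properties of relations discussed in these notes.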

A changing relational database stores relations whose values change over time. A relation variable VARr is a named container that can hold any permissible value Vr. Naturally, the definition of any VARr must specify the corresponding relation header Hr.

It is worth emphasizing that any database update operation used in practice, whether INSERT (inserting a tuple into a relation variable), DELETE (deleting a tuple from the relation value of a relation variable) or UPDATE (modifying a tuple of the relation value of a relation variable), is, from the point of view of the model, an assignment of some new relation value to the relation variable. This does not mean that a DBMS must carry out the operations in exactly this way: what matters is that the result of the operations matches these model semantics.
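The reading of INSERT, DELETE and UPDATE as assignments can be sketched with immutable sets. This is an illustration of the model semantics only, not of how a DBMS works internally; for brevity the tuples are written positionally as (number, name, department), and the employee Petrov is invented for the example.

```python
# Relation variable: a name bound to the current relation value (a set of tuples).
employees = frozenset({
    (2934, "Ivanov", 310),
    (2940, "Kuznetsov", 320),
})

# INSERT: assign the union of the old value and the new tuple.
employees = employees | {(3000, "Petrov", 320)}

# DELETE: assign the old value without the tuples that match the condition.
employees = frozenset(t for t in employees if t[0] != 2940)

# UPDATE: assign a value in which the matching tuples are replaced by modified ones.
employees = frozenset(
    (no, name, 310 if no == 3000 else dept) for no, name, dept in employees
)

print(sorted(employees))  # [(2934, 'Ivanov', 310), (3000, 'Petrov', 310)]
```

Each statement builds a complete new value and rebinds the variable; no value is ever changed in place.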

Note that from here on, where the exact meaning is clear from the context, we use the term relation both in the sense of relation value and in the sense of relation variable.

By definition, the degree, or "arity", of a relation header, of a tuple corresponding to that header, of a relation body, of a relation value and of a relation variable is the cardinality of the relation header. For example, the degree of the relation EMPLOYEES is four, i.e., it is 4-ary (quaternary).

With these definitions in hand, it is reasonable to regard the schema of a relational database as the set of pairs <name of VARr, Hr> comprising the names and headers of all relation variables defined in the database. A relational database is a set of pairs <VARr, Vr> (of course, every relation variable contains some relation value at any time, possibly an empty one).

Note that in classical relational databases, once the database schema was defined, only the values of relation variables could change. Today, however, most implementations also allow the database schema to change: new relation variables can be defined and the headers of existing ones altered. This is called evolution of the database schema.

· Primary key

By definition, a primary key of a relation variable is a subset S of the set of attributes of its header such that, at any time, the value of the primary key (composite, if the primary key comprises more than one attribute) in any tuple of the relation body differs from the primary key value in any other tuple of that body, and no proper subset of S has this property. In the next section we show that the existence of a primary key in any relation value follows from one of the fundamental properties of relations, namely that the body of a relation is a set of tuples.

In the usual everyday view, a relation is a table whose heading is the relation schema and whose rows are the tuples of the relation instance; the attribute names correspond to the names of the columns of this table. That is why people sometimes speak of the "columns of a table" when they mean the "attributes of a relation".

Of course, this is rather rough terminology, because in ordinary tables both rows and columns are ordered, whereas the attributes and tuples of relations are elements of unordered sets. Nevertheless, when we turn to the practical questions of organizing relational databases and the means of managing them, we will use this "everyday" terminology. Most commercial relational DBMSs adhere to it. Sometimes the terms file (as an analog of a table), record (as an analog of a row) and field (as an analog of a column) are also used. Recall that we used this terminology in lecture 1.

· Fundamental properties of relationships

Let us now dwell on some important properties of relations that follow from the definitions given above.

Absence of duplicate tuples
Primary and candidate keys of relations

The property that the body of any relation never contains duplicate tuples follows from the definition of a relation body as a set of tuples. In classical set theory, by definition, every set consists of distinct elements.

It is from this property that every relation value has a primary key: a minimal set of attributes, a subset of the relation header, whose composite value uniquely identifies a tuple of the relation. Indeed, since at any time all tuples of the body of any relation are distinct, in any relation value at least the full set of its attributes has the uniqueness property. However, the formal definition of a primary key requires it to be "minimal", i.e., the set of primary key attributes must not include attributes that can be discarded without losing the main property, the unique identification of a tuple. A little later we show why minimality of the primary key is critical. It is clear that if a relation has some set of attributes with the uniqueness property, then it also has a minimal set of attributes with the uniqueness property.

Of course, there may be relation values with several different minimal sets of attributes that have the uniqueness property. For example, if we return to the assumptions of lecture 1 about the uniqueness of the values of the attributes EMP_NO and EMP_NAME of the relation EMPLOYEES, then for every value of this relation we have two sets of attributes that qualify as a primary key: {EMP_NO} and {EMP_NAME}. In this case the database designer must decide which of the alternative sets of attributes to call the primary key; the remaining minimal sets of attributes with the uniqueness property are called candidate keys.
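The notion of a minimal attribute set with the uniqueness property can be checked mechanically for one given relation value. The function below is a sketch written for this purpose; its name, the English attribute names and the sample rows are assumptions of the example. It examines a single value only, whereas a declared key is a statement about all values the variable may ever hold.

```python
from itertools import combinations


def candidate_keys(attributes, body):
    """Return every minimal attribute set whose values are unique within body.

    body is a list of distinct rows, each a dict from attribute name to value.
    """
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller unique set, so it is not minimal
            projected = {tuple(row[a] for a in combo) for row in body}
            if len(projected) == len(body):
                keys.append(combo)
    return keys


ATTRS = ["EMP_NO", "EMP_NAME", "EMP_SAL", "EMP_DEPT_NO"]
rows = [
    {"EMP_NO": 2934, "EMP_NAME": "Ivanov", "EMP_SAL": 22000, "EMP_DEPT_NO": 310},
    {"EMP_NO": 2935, "EMP_NAME": "Petrov", "EMP_SAL": 30000, "EMP_DEPT_NO": 310},
    {"EMP_NO": 2936, "EMP_NAME": "Sidorov", "EMP_SAL": 22000, "EMP_DEPT_NO": 320},
    {"EMP_NO": 2937, "EMP_NAME": "Fedorov", "EMP_SAL": 22000, "EMP_DEPT_NO": 310},
]

print(candidate_keys(ATTRS, rows))  # [('EMP_NO',), ('EMP_NAME',)]
```

The full attribute set is always unique, but it is skipped here because it contains the smaller unique sets, which is exactly the minimality requirement.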

The concept of a primary key is extremely important in connection with the concept of database integrity. Note that although formally the existence of a primary key of a relation value follows from the body of a relation being a set, in practice the primary (and candidate) keys of relation variables arise from explicit declarations by the designer of the relation. In defining a relation variable, the designer models a part of the subject area whose data the database will hold, and of course the designer must know the nature of that data. For example, the designer must know that no two employees can ever have passes with the same number. Hence the designer can (and, as shown a little later, should) explicitly declare {EMP_NO} a candidate key. If the enterprise has ruled that all employees must have different full names, the designer can (and again should) declare {EMP_NAME} a candidate key as well. The designer must then judge which candidate key is the more reliable (its uniqueness will never be revoked) and choose it as the primary key (in our case that would be {EMP_NO}, because a rule on the uniqueness of employees' full names looks artificial and could easily be revoked by the management of the enterprise).

Now we explain why the designer should explicitly declare the primary and candidate keys of relation variables. The point is that these declarations give the DBMS information that it then uses as integrity constraints. The DBMS will never allow a relation variable to take a relation value containing two tuples with the same value of the attribute EMP_NO (the definition of the primary key of a given relation variable cannot be revoked). The appearance of two tuples with the same value of the attribute EMP_NAME will likewise be impossible for as long as the definition of {EMP_NAME} as a candidate key remains in force. Thus, declarations of primary and candidate keys enable the DBMS to maintain database integrity even when attempts are made to enter incorrect data.

Finally, let us return to the minimality of primary and candidate keys. As noted above, this property is critical, and its importance shows in the interpretation of primary and candidate keys as integrity constraints. In our example with the relation EMPLOYEES, the uniqueness property is possessed not only by the attribute set {EMP_NO} but also, for example, by the set {EMP_NO, EMP_DEPT_NO}. But if we declared uniqueness of {EMP_NO, EMP_DEPT_NO} as the integrity constraint, the DBMS would guarantee the absence of tuples with the same value of EMP_NO not across the whole relation EMPLOYEES but only within groups of tuples with the same value of EMP_DEPT_NO. Clearly, this does not match the meaning of the modeled subject area.
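The difference between declaring a minimal key and a non-minimal one can be observed in any SQL system. The sketch below uses SQLite through Python's standard `sqlite3` module purely as a convenient stand-in; the table and column names are English placeholders chosen for the example, and Petrov is an invented employee.

```python
import sqlite3


def try_insert(conn, row):
    """Attempt one insertion and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO emp VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# Minimal key: the employee number alone is declared unique.
good = sqlite3.connect(":memory:")
good.execute(
    "CREATE TABLE emp (emp_no INTEGER PRIMARY KEY, emp_name TEXT, emp_dept_no INTEGER)"
)

# Non-minimal "key": only the pair (emp_no, emp_dept_no) is declared unique.
weak = sqlite3.connect(":memory:")
weak.execute(
    "CREATE TABLE emp (emp_no INTEGER, emp_name TEXT, emp_dept_no INTEGER, "
    "PRIMARY KEY (emp_no, emp_dept_no))"
)

# Two employees with the same pass number but different departments.
rows = [(2934, "Ivanov", 310), (2934, "Petrov", 320)]
results_good = [try_insert(good, r) for r in rows]
results_weak = [try_insert(weak, r) for r in rows]

print(results_good)  # ['ok', 'rejected']
print(results_weak)  # ['ok', 'ok']
```

With the non-minimal declaration the duplicate pass number slips through, because uniqueness is enforced only within one department.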

Looking ahead, note that many practical implementations of relational DBMSs allow the uniqueness of tuples to be violated in intermediate relations produced implicitly during query execution. Such relations are not sets but multisets, which in some cases brings certain advantages but often leads to serious problems. We discuss this in more detail when we cover the SQL language.
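A small experiment shows such a multiset appearing. The sketch uses SQLite via Python's `sqlite3` module as an example of an SQL-oriented system; the table and the employee with number 2935 are assumptions of the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_no INTEGER PRIMARY KEY, emp_dept_no INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)", [(2934, 310), (2935, 310), (2940, 320)]
)

# Projection onto the department number: SQL keeps the duplicates by default.
bag = conn.execute("SELECT emp_dept_no FROM emp").fetchall()

# DISTINCT must be requested explicitly to obtain a set of tuples.
as_set = conn.execute("SELECT DISTINCT emp_dept_no FROM emp").fetchall()

print(len(bag), len(as_set))  # 3 2
```

The stored table satisfies its key constraint, yet the plain projection returns department 310 twice, so the query result is not a relation in the strict sense.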

Absence of tuple ordering

Formally, of course, the absence of ordering of tuples in a relation value also follows from the definition of a relation body as a set of tuples. However, this property can be viewed from another angle as well. Yes, the fact that the body of a relation is a set of tuples makes it easier to build a complete apparatus for the relational data model, including the basic means of data manipulation, relational algebra and relational calculus. But, in my opinion, this is not the main reason.

Users of relational DBMSs and developers of information systems are quite often annoyed that they cannot store the tuples of their relations at the physical level in the order they need. References to the requirements of relational theory are of little help here. One could develop another theory in which ordered "relations" were permitted. However, maintaining ordered lists of tuples in an intensively updated database is technically much harder, and support for ordering entails significant overhead.

The absence of a requirement to maintain order on the set of tuples of a relation gives the DBMS extra flexibility in storing databases in external memory and in executing queries. This does not contradict the fact that a database query, for example in SQL, may ask for the resulting table to be sorted by the values of some columns. Such a result is, generally speaking, not a relation but an ordered list of tuples, and it can only be a final result to which no further queries can be addressed.

Absence of attribute ordering

The attributes of a relation are not ordered, because by definition a relation header is a set of pairs <attribute name, domain name>. An attribute value in a tuple of a relation is always referred to by the attribute name. There is an obvious analogy between relation headers and structure types in programming languages. Even in the C programming language, with its practically unlimited pointer facilities, it is strongly recommended to access the fields of structures only by name. If, for example, a structure variable is defined in C as

struct {int a; char b; int c;} d;

then it is not recommended to access the character field b with a construct such as *((char *)&d + sizeof(int)) (take the address of the structure variable d, add to it the number of bytes in an integer, and take the value of the byte at the resulting address). The reason is that the physical layout of the fields is left to the implementation: on many computers the field c must be aligned on a suitable address boundary, so unused padding bytes appear inside the structure, and their number and position depend on the platform. Standard C keeps the fields in declaration order, but implementations of other languages may additionally rearrange fields such as b and c to save space, in which case the construct above no longer reaches the field b. To access the field b of the variable d correctly, use the construct d.b or (&d)->b, i.e., specify the field name explicitly.
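The layout that a C compiler on the current platform would choose for this structure can be inspected from Python with the standard `ctypes` module. This is only a sketch for looking at offsets and padding; the class name `D` is an assumption of the example, and the printed numbers depend on the platform, so none are stated here.

```python
import ctypes


class D(ctypes.Structure):
    # Same field list as the C structure in the text: int a; char b; int c;
    _fields_ = [("a", ctypes.c_int), ("b", ctypes.c_char), ("c", ctypes.c_int)]


int_size = ctypes.sizeof(ctypes.c_int)

# Byte offset of every field inside the structure.
offsets = {name: getattr(D, name).offset for name, _ in D._fields_}

# Bytes occupied by the structure beyond the sum of its field sizes.
padding = ctypes.sizeof(D) - (2 * int_size + ctypes.sizeof(ctypes.c_char))

print(offsets)   # platform-dependent
print(padding)   # platform-dependent

d = D(a=1, b=b"x", c=2)
print(d.b)       # access by name does not depend on the layout
```

Code that reads `d.b` keeps working whatever the offsets turn out to be, which is the same argument the text makes for referring to attributes by name.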

Similar practical considerations justify the absence of attribute ordering in a relation header. The DBMS itself decides in what physical order to store the attribute values of tuples (although usually the same physical order is maintained for all tuples of a given relation). In addition, this property makes it easier to modify the schemas of existing relations, not only by adding new attributes but also by removing existing ones.

Looking ahead again, we note that SQL in some cases allows attributes to be referred to by index, using as the implicit order their order in the linear form of the relation schema definition (this is one of the criticized features of the SQL language).

Atomicity of attribute values
First normal form

The values of all attributes are atomic (or, more precisely, scalar). This follows from the definition of a domain as a potential set of values of a scalar data type, i.e., the values of a domain cannot include values with a visible structure, in particular sets of values (relations). Note that this does not contradict what was said in the section "Basic concepts of relational databases" about the possibility of specifying attributes with user-defined data types. For example, one could add to the schema of the relation EMPLOYEES an attribute EMP_PHOTO defined on the domain (or data type) PHOTOS. The essence of the atomicity of attribute values is that a relational DBMS must not give users explicit visibility of the internal structure of a value. Values can be handled only through the operations defined in the corresponding data type.

It is customary to say that relational databases admit only normalized relations, or relations in first normal form.

An example of a non-normalized relation is shown in Fig. 2.2. We may say that here we have a binary relation in which the values of the attribute DEPARTMENT are relations. Note that the original relation EMPLOYEES is a normalized variant of the relation DEPARTMENTS-EMPLOYEES. The normalized variant is shown in Fig. 2.3.

Normalized relations form the basis of the classical relational approach to database organization. They have certain limitations (not all information is conveniently represented as flat tables), but they greatly simplify data manipulation. Consider, for example, two similar tuple-insertion operators:

1. Enroll employee Kuznetsov (pass number 3000, salary 25000.00) in department number 320;

2. Enroll employee Kuznetsov (pass number 3000, salary 25000.00) in department number 310.

Fig. 2.2.

Fig. 2.3. Relation EMPLOYEES: the normalized variant of the relation DEPARTMENTS-EMPLOYEES

If the information about employees is represented as the relation EMPLOYEES, both operators are executed in the same way (a tuple is inserted into the relation EMPLOYEES). If we work with the non-normalized relation DEPARTMENTS-EMPLOYEES, the first operator results in the simple insertion of a tuple, while the second adds a tuple to the relation-valued attribute DEPARTMENT of the tuple with primary key 310.

Similar difficulties arise with non-normalized relations when tuples are deleted or modified.
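The asymmetry between the two representations can be sketched with plain Python collections. The structures and function names below are assumptions made for the illustration (positional tuples, a dict for the nested form, salary omitted for brevity); the two enrollments are run one after the other only to exercise both branches.

```python
# Normalized relation: a flat set of (number, name, department) tuples.
employees = {(2934, "Ivanov", 310)}


def enroll_flat(emp_no, name, dept_no):
    # Always the same operation: insert one tuple.
    employees.add((emp_no, name, dept_no))


# Non-normalized relation: department number -> nested relation of employees.
dept_employees = {310: {(2934, "Ivanov")}}


def enroll_nested(emp_no, name, dept_no):
    if dept_no in dept_employees:
        # Existing department: modify the relation-valued attribute of its tuple.
        dept_employees[dept_no].add((emp_no, name))
        return "modified existing tuple"
    # New department: insert a new outer tuple.
    dept_employees[dept_no] = {(emp_no, name)}
    return "inserted new tuple"


enroll_flat(3000, "Kuznetsov", 320)
enroll_flat(3000, "Kuznetsov", 310)
first = enroll_nested(3000, "Kuznetsov", 320)
second = enroll_nested(3000, "Kuznetsov", 310)

print(first)   # inserted new tuple
print(second)  # modified existing tuple
```

In the flat form the caller never needs to know whether the department already has employees; in the nested form every update must branch on that fact.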

· Relational data model

When we discussed the basic concepts of relational databases in the previous sections, we did not rely on any specific implementation. That discussion applies equally to any system built on the relational approach.

In other words, we used the concepts of the so-called relational data model. A data model (in the context of the database field) describes a set of generic concepts and features that all specific DBMSs, and the databases they manage, must have if they are based on that model. Having a data model makes it possible to compare specific implementations using one common language.

Although the concept of a data model is general, and one can speak of hierarchical, network, semantic and other data models, it should be noted that in the database field this concept was introduced by Edgar Codd with respect to relational systems and is used most effectively in that context. Attempts to apply similar models directly to pre-relational organizations show that the relational model is too "large" for them, while for post-relational organizations it turns out to be too "small".

General characteristics

Although the concept of the relational data model was first introduced by the founder of the relational approach, Edgar Codd, the most widespread interpretation of the relational data model apparently belongs to Christopher Date, a well-known popularizer of Codd's ideas, who reproduces it (with various refinements) in almost all of his books (see, for example, C. Date. An Introduction to Database Systems. 6th ed., M.; St. Petersburg: Williams, 2000). According to Date's interpretation, the relational model consists of three parts that describe different aspects of the relational approach: the structural part, the manipulative part and the integrity part.

The structural part of the model fixes that the only generic data structure used in relational databases is the normalized n-ary relation. It defines the concepts of domain, attribute, tuple, header, body and relation variable. In essence, in the two previous sections of this lecture we examined the concepts and properties of the structural component of the relational model.

The manipulative part of the model defines two fundamental mechanisms for manipulating relational databases: relational algebra and relational calculus. The first is based mainly on classical set theory (with some refinements and additions), and the second on the classical logical apparatus of first-order predicate calculus. We consider these mechanisms in more detail in the following lectures; for now we note only that the main function of the manipulative part of the relational model is to provide a measure of the relationality of any particular relational database language: a language is called relational if it is no less expressive and powerful than relational algebra or relational calculus.

Entity integrity and referential integrity

Finally, the integrity part of the relational data model fixes two basic integrity requirements that must be supported in any relational DBMS. The first is called the entity integrity requirement. An object, or entity, of the real world is represented in relational databases by tuples of relations. Specifically, the requirement is that any tuple of any relation value of any relation variable be distinguishable from any other tuple of that relation value by the composite values of a predetermined set of attributes of the relation variable; in other words, any relation variable must have a primary key. As we saw in the previous section, this requirement is satisfied automatically if the basic properties of relations are not violated in the system.

In full, the entity integrity requirement reads as follows: every relation variable must have a primary key, and no primary key value in the tuples of the relation value of the relation variable may contain undefined values. To make this formulation fully clear, we must at least briefly discuss the concept of an undefined value (NULL).

Of course, in theory any tuple entered into a relation should contain all the characteristics of the real-world entity that we want to keep in the database. In practice, however, not all of these characteristics may be known at the moment the entity has to be recorded in the database. A simple example is the hiring of a person whose salary has not yet been determined. In this case the personnel officer who enters into the relation EMPLOYEES the tuple describing the new employee simply cannot supply a value for the attribute EMP_SAL (any value of the domain PAY_AMOUNTS would characterize the new employee's salary incorrectly).

Edgar Codd proposed using undefined values in such cases. An undefined value does not belong to any data type and may be present among the values of any attribute defined on any data type (unless this is explicitly forbidden in the definition of the attribute). If a is a value of some data type or NULL, op is any binary "arithmetic" operation of this data type (e.g., +), and lop is a comparison operation for values of this type (e.g., =), then by definition:

a op NULL = NULL

NULL op a = NULL

a lop NULL = unknown

NULL lop a = unknown

Here unknown is the third value of the logical, or Boolean, type, with the following properties:

NOT unknown = unknown

true AND unknown = unknown

true OR unknown = true

false AND unknown = false

false OR unknown = unknown

(Recall that the operations AND and OR are commutative.) For this lecture this brief introduction to undefined values suffices, but we return to the topic repeatedly in the following lectures.
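The rules above can be turned into a few small functions. In this sketch Python's `None` plays the role of both NULL and unknown, which is an assumption made for brevity; the function names are invented for the example.

```python
UNKNOWN = None  # the third logical value, represented here by None
NULL = None     # the undefined value, also represented by None in this sketch


def not3(a):
    return UNKNOWN if a is UNKNOWN else (not a)


def and3(a, b):
    if a is False or b is False:
        return False            # false AND anything = false
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return True


def or3(a, b):
    if a is True or b is True:
        return True             # true OR anything = true
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return False


def add3(a, b):
    """An arithmetic operation: the result is NULL if either operand is NULL."""
    return NULL if a is NULL or b is NULL else a + b


def eq3(a, b):
    """A comparison: the result is unknown if either operand is NULL."""
    return UNKNOWN if a is NULL or b is NULL else a == b


print(and3(True, UNKNOWN), or3(True, UNKNOWN))    # None True
print(and3(False, UNKNOWN), or3(False, UNKNOWN))  # False None
print(add3(1, NULL), eq3(NULL, NULL))             # None None
```

Note the last result: comparing NULL with NULL yields unknown rather than true, which is why undefined values cannot serve to identify a tuple.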

So, the first of the requirements, the entity integrity requirement, means that the primary key must fully identify each entity, and therefore no undefined values are allowed in any primary key value. (In the classical relational model this requirement extends to candidate keys as well; as shown in the following lectures, SQL-oriented DBMSs do not support this requirement for candidate keys.)
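The difference between primary and candidate keys with respect to undefined values can be observed in SQLite, used here through Python's `sqlite3` module as one example of an SQL-oriented system. The table is hypothetical; NOT NULL is spelled out on the primary key because SQLite does not imply it for every primary key declaration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE emp (
        emp_no   INT  NOT NULL PRIMARY KEY,  -- primary key: NULL is rejected
        emp_name TEXT UNIQUE,                -- candidate key: NULL is tolerated
        emp_sal  NUMERIC                     -- the salary may be undefined
    )""")


def try_insert(row):
    """Attempt one insertion and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO emp VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_insert((None, "Ivanov", 22000)),  # undefined value in the primary key
    try_insert((2935, None, None)),       # undefined name and salary
    try_insert((2936, None, None)),       # a second undefined name
]
print(results)  # ['rejected', 'ok', 'ok']
```

The UNIQUE column accepts two rows with an undefined name, so in this system the candidate key does not identify those employees, in line with the parenthetical remark above.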

The second requirement, called the referential integrity requirement, is more complex. Obviously, when relations are kept normalized, complex real-world entities are represented in a relational database by several tuples of several relations. For example, suppose that we need to represent in a relational database the entity DEPARTMENT with the attributes DEPT_NO (department number), DEPT_SIZE (number of employees) and DEPT_EMPS (the set of employees of the department). For each employee we need to store EMP_NO (employee number), EMP_NAME (employee name) and EMP_SAL (employee salary). As we will see in lecture 7, with correct design the corresponding database will contain two relations: DEPARTMENTS {DEPT_NO, DEPT_SIZE} (primary key {DEPT_NO}) and EMPLOYEES {EMP_NO, EMP_NAME, EMP_SAL, EMP_DEPT_NO} (primary key {EMP_NO}).

As can be seen, the attribute EMP_DEPT_NO is introduced into the relation EMPLOYEES not because the department number is a property of the employee as such, but only so that the full entity DEPARTMENT can be reconstructed when needed. The value of the attribute EMP_DEPT_NO in any tuple of the relation EMPLOYEES must match the value of the attribute DEPT_NO in some tuple of the relation DEPARTMENTS. An attribute of this kind (possibly composite) is called a foreign key, because its values uniquely characterize the entities represented by tuples of some other relation (i.e., they specify the values of that relation's primary key). Of course, a foreign key may be composite, i.e., consist of several attributes. The relation in which a foreign key is defined is said to reference the relation in which the same attribute is the primary key.

The referential integrity requirement, or foreign key integrity requirement, is that for each value of the foreign key appearing in a tuple of the relation value of the referencing relation variable, either a tuple with the same primary key value must be found in the relation value of the referenced relation variable, or the foreign key value must be wholly undefined (i.e., point to nothing). For our example this means that if a department number is specified for an employee, that department must exist.
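The requirement can be tried out in SQLite through Python's `sqlite3` module. The sketch uses hypothetical English table and column names for the two relations of the example; note that SQLite checks foreign keys only after `PRAGMA foreign_keys = ON` has been issued, and Petrov is an invented employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE dept (dept_no INTEGER PRIMARY KEY, dept_size INTEGER)")
conn.execute("""
    CREATE TABLE emp (
        emp_no      INTEGER PRIMARY KEY,
        emp_name    TEXT,
        emp_dept_no INTEGER REFERENCES dept(dept_no)  -- foreign key
    )""")

conn.execute("INSERT INTO dept VALUES (310, 1)")
conn.execute("INSERT INTO emp VALUES (2934, 'Ivanov', 310)")   # the department exists
conn.execute("INSERT INTO emp VALUES (2935, 'Petrov', NULL)")  # wholly undefined foreign key

try:
    # Department 999 does not exist, so this reference would dangle.
    conn.execute("INSERT INTO emp VALUES (2940, 'Kuznetsov', 999)")
    outcome = "accepted"
except sqlite3.IntegrityError:
    outcome = "rejected"

print(outcome)  # rejected
```

Both permitted cases from the definition appear here: a foreign key value that matches an existing primary key value, and a foreign key value that is undefined.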

Note that, like primary key, external key must be specified in determining variable relationship and is a limitation for permissible values relations This variable. In other words, definition external key is a definition limitations of integrity database.

The entity integrity and referential integrity constraints must be supported by the DBMS. To maintain entity integrity, it is enough to guarantee that no relation variable ever holds a value containing tuples with the same primary key value (and to prohibit undefined values within a primary key value). With referential integrity the situation is somewhat more complicated.

It is clear that when the referencing relation is updated (new tuples are inserted or the foreign key value in existing tuples is modified), it is enough to ensure that no incorrect foreign key values appear. But what should be done when a tuple that is referenced by others is deleted?

There are three approaches here, each of which maintains referential integrity. The first approach is to forbid deleting a tuple that is referenced (i.e., one must first either delete the referencing tuples or change their foreign key values accordingly). With the second approach, when a referenced tuple is deleted, the foreign key value in all referencing tuples automatically becomes completely undefined. Finally, the third approach (cascading delete) is that when a tuple of the referenced relation is deleted, all tuples that reference it are automatically deleted from the referencing relation.

In mature relational DBMSs, the way referential integrity is maintained can usually be chosen separately for each foreign key definition. Of course, making such a decision requires analyzing the requirements of the specific application domain.
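The three approaches to deleting a referenced tuple can be tried out with a small sketch. It uses Python's standard sqlite3 module and assumes that the approaches map onto the SQL clauses ON DELETE RESTRICT, ON DELETE SET NULL and ON DELETE CASCADE; the table and column names (departments, employees, dept_no, emp_dept_no) are hypothetical and chosen only for illustration.

```python
import sqlite3

def make_db(on_delete: str) -> sqlite3.Connection:
    """Build an in-memory database whose foreign key uses the given delete rule."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE departments (dept_no INTEGER PRIMARY KEY, dept_size INTEGER)"
    )
    conn.execute(
        "CREATE TABLE employees ("
        " emp_no INTEGER PRIMARY KEY,"
        " emp_name TEXT,"
        " emp_dept_no INTEGER REFERENCES departments(dept_no)"
        " ON DELETE " + on_delete + ")"
    )
    conn.execute("INSERT INTO departments VALUES (10, 1)")
    conn.execute("INSERT INTO employees VALUES (1, 'Ivanov', 10)")
    return conn

def delete_department(on_delete: str):
    """Delete the referenced department and report what happened to the employee."""
    conn = make_db(on_delete)
    try:
        conn.execute("DELETE FROM departments WHERE dept_no = 10")
    except sqlite3.IntegrityError:
        return "delete rejected"
    return conn.execute("SELECT emp_no, emp_dept_no FROM employees").fetchall()

# One run per approach: forbid, set to undefined, cascade.
results = {rule: delete_department(rule)
           for rule in ("RESTRICT", "SET NULL", "CASCADE")}
```

With RESTRICT the delete is rejected, with SET NULL the employee remains but its department number becomes undefined, and with CASCADE the employee is removed together with the department.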

Conclusion

Most likely, the potential readers of this course work or will work with some SQL-oriented DBMS. Every company producing such DBMSs calls them relational systems. It is very important to understand clearly which properties of such systems are truly relational and which do not quite correspond to the original, clear and strict ideas of the relational approach, or even contradict them. This will help to organize databases and build applications more correctly in the environment of an SQL-oriented DBMS.

Several lectures of this course discuss in sufficient detail the capabilities of the current SQL standards, SQL:1999 and SQL:2003. But first readers are offered material that presents the relational approach in its pure form. In this lecture, the conceptual basis of the relational approach is introduced, the main terms are defined, and some important consequences of the basic definitions are examined. The relational data model considered here is intended primarily for assessing how well various DBMS implementations conform to the general relational approach.

Introduction

Experts call the beginning of the 21st century the age of computer technology. Humanity is entering a fundamentally new information era. All aspects of people's way of life are changing. The level of informatization is becoming one of the indicators of a state's level of development.

Many developing countries have come to appreciate the advantages brought by the spread and development of information and communication technologies. There is no doubt that the move toward the information society is a path directed at the future of human civilization.

In the relational model, a database is a set of tables on which operations formulated in terms of relational algebra and relational calculus are performed.

In the relational model, operations on database objects are set-theoretic in nature, and this forms the core of any relational database. The model comprises data structures, integrity constraints and data manipulation operations.

Basic concepts of relational data model

The main concepts of relational databases are the data type, domain, attribute, tuple, primary key and relation. We will first illustrate the meaning of these concepts using the example of the relation EMPLOYEES, which contains information about the employees of some organization.

The concept of a data type in the relational data model corresponds to the concept of a data type in programming languages. Modern relational databases typically store character data, numeric data, bit strings, and special "temporal" data; the range of supported types is actively developing as the capabilities of relational systems expand.

The concept of a domain is more specific to databases, although it has some analogy with subtypes in certain programming languages. In general, a domain is defined by specifying a base type to which the elements of the domain belong and an arbitrary logical expression applied to an element of that data type. If evaluating this logical expression yields "true", the element belongs to the domain.

A more correct interpretation of the concept is to understand a domain as a permissible potential set of values of the given type.

For example, the domain "Names" in our example is defined on the base type of character strings, but its values include only those strings that can represent a name (in particular, such strings cannot begin with a soft sign). The semantic load of the domain concept should also be noted: data are considered comparable only when they belong to one and the same domain.

In our example, the values of the domains "pass numbers" and "group numbers" both have the integer type, yet they are not comparable. Note that in some systems the domain concept is not used at all, although it is already supported in Oracle V.7.
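The idea of a domain as a base type plus a logical condition, with comparison allowed only inside one domain, can be sketched in Python as follows. The classes Domain and DomainValue, the function same, and the concrete conditions (non-empty names that do not start with a soft sign, positive numbers) are assumptions made for this illustration, not part of any DBMS.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Domain:
    """A domain: a base type plus a logical condition on values of that type."""
    name: str
    base_type: type
    condition: Callable[[Any], bool]

    def contains(self, value: Any) -> bool:
        # A value belongs to the domain if it has the base type
        # and the logical condition evaluates to true for it.
        return isinstance(value, self.base_type) and self.condition(value)

@dataclass(frozen=True)
class DomainValue:
    """A value tagged with the domain it was drawn from."""
    domain: Domain
    value: Any

    def __post_init__(self):
        if not self.domain.contains(self.value):
            raise ValueError(f"{self.value!r} is not in domain {self.domain.name}")

def same(a: DomainValue, b: DomainValue) -> bool:
    """Compare two values; comparison is meaningful only within one domain."""
    if a.domain is not b.domain:
        raise TypeError(f"cannot compare {a.domain.name} with {b.domain.name}")
    return a.value == b.value

# Hypothetical conditions, chosen only to mirror the examples in the text.
NAMES = Domain("Names", str, lambda s: s != "" and not s.startswith("ь"))
PASS_NUMBERS = Domain("PassNumbers", int, lambda n: n > 0)
GROUP_NUMBERS = Domain("GroupNumbers", int, lambda n: n > 0)
```

Here same(DomainValue(PASS_NUMBERS, 7), DomainValue(GROUP_NUMBERS, 7)) raises TypeError even though both values are the integer 7, because the two values come from different domains.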

A relation schema is a named set of pairs (attribute name, domain name), or (attribute name, type) if the domain concept is not supported. The degree, or "arity", of a relation schema is the cardinality of this set.

Thus the degree of the relation EMPLOYEES is four, i.e. the relation is 4-ary. If all attributes of a relation are defined on different domains, it makes sense to name the attributes after the corresponding domains, bearing in mind that this is only a convenient naming convention and does not remove the distinction between the concepts of domain and attribute. A database schema is a set of named relation schemas.

A tuple corresponding to a given relation schema is a set of pairs (attribute name, value) that contains exactly one occurrence of each attribute name belonging to the relation schema.

The "value" is a permissible value of the attribute's domain (or of its data type, if the domain concept is not supported). Consequently, the degree of the tuple, i.e. the number of its elements, coincides with the degree of the corresponding relation schema.

Put simply, a tuple is a set of named values of the specified types.

A relation is a set of tuples that correspond to one relation schema. The relation schema is sometimes called the heading of the relation, and the relation as a set of tuples is called its body. In fact, the concept of a relation schema is closest to the concept of a structured data type in programming languages. It would therefore be logical to allow a relation schema to be defined separately, followed by one or more relations with that schema, but this is not the accepted practice in relational databases.
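The definitions of relation schema, tuple and relation can be mirrored almost literally with Python sets. The following is only a sketch: the attribute names (emp_no, emp_name, emp_salary, dept_no) and the helper functions degree and make_tuple are hypothetical.

```python
# A relation schema (heading): a set of (attribute name, type) pairs.
EMPLOYEES_SCHEMA = frozenset({
    ("emp_no", int),
    ("emp_name", str),
    ("emp_salary", float),
    ("dept_no", int),
})

def degree(schema) -> int:
    """The degree (arity) of a schema is the number of its attributes."""
    return len(schema)

def make_tuple(schema, **values):
    """Build a tuple: exactly one (attribute name, value) pair per schema attribute."""
    expected = dict(schema)
    if set(values) != set(expected):
        raise ValueError("tuple must contain each schema attribute exactly once")
    for name, value in values.items():
        if not isinstance(value, expected[name]):
            raise TypeError(f"{name} must be of type {expected[name].__name__}")
    return frozenset(values.items())

# A relation (body): a set of tuples that all conform to one schema.
# The third tuple repeats the first, so the set keeps only one copy of it.
employees = {
    make_tuple(EMPLOYEES_SCHEMA, emp_no=1, emp_name="Ivanov", emp_salary=100.0, dept_no=10),
    make_tuple(EMPLOYEES_SCHEMA, emp_no=2, emp_name="Petrov", emp_salary=120.0, dept_no=10),
    make_tuple(EMPLOYEES_SCHEMA, emp_no=1, emp_name="Ivanov", emp_salary=100.0, dept_no=10),
}
```

Because a relation is a set, it cannot contain duplicate tuples and has no inherent order; every tuple has the same degree as the schema.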

In such databases the name of a relation schema in most cases coincides with the name of the corresponding relation instance. In classical relational databases, once the database schema has been defined, only the relation instances change: new tuples may appear in them, and existing tuples may be deleted or modified. However, many implementations also allow the database schema itself to change, by defining new relation schemas and modifying existing ones; this is usually called database schema evolution.

The usual representation of a relation is a table whose heading is the relation schema and whose rows are the tuples of the relation instance; in this case the attribute names label the columns of the table. For this reason people sometimes say "column of the table" when they mean "attribute of the relation". As can be seen, the main structural concepts of the relational data model (except for the concept of a domain) have a very simple intuitive interpretation, although in relational database theory they are all defined quite formally and precisely.

As mentioned, relational data models are the most popular. In the relational data model, data is represented as a set of tables on which operations formulated in terms of relational algebra or relational calculus can be performed.

Unlike the hierarchical and network data models, in the relational model operations on objects are set-theoretic in nature. This allows users to formulate their queries more compactly, in terms of larger units of data.

Consider the terminology used when working with relational databases.

Primary key. The primary key is a field or a set of fields that unambiguously identifies a record.

Often there are several candidates for the primary key. For example, in a small organization the primary key of the Employee entity could be the personnel number, the combination of surname, first name and patronymic (provided there are no complete namesakes in the organization), or the passport series and number (if all employees have passports). In such cases, preference is given to the simplest key (in this example, the personnel number). The other candidates for the role of the primary key are called alternative keys.

Requirements for primary key:

uniqueness - that is, the table must not contain two or more records with the same primary key value;

the primary key must not contain empty values.

When choosing a primary key, it is recommended to choose an attribute whose value does not change during the entire lifetime of the instance (in this case, the personnel number is preferable to the surname, since a surname can change).
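The two requirements for a primary key can be observed directly in a DBMS. The sketch below uses Python's standard sqlite3 module; the table employee and its columns personnel_no and surname are hypothetical names chosen to match the example of the personnel number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# personnel_no is the primary key. NOT NULL is spelled out because SQLite
# does not imply it for non-integer primary key columns.
conn.execute(
    "CREATE TABLE employee ("
    " personnel_no TEXT NOT NULL PRIMARY KEY,"
    " surname TEXT NOT NULL)"
)
conn.execute("INSERT INTO employee VALUES ('0001', 'Ivanov')")

def try_insert(personnel_no, surname) -> bool:
    """Return True if the row was accepted, False if a key constraint rejected it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (personnel_no, surname))
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_ok = try_insert("0001", "Petrov")   # same key value as an existing row
empty_ok = try_insert(None, "Sidorov")        # empty (NULL) key value
namesake_ok = try_insert("0002", "Ivanov")    # same surname, different key value
```

Only the third insert succeeds: a repeated surname is acceptable, while a repeated or empty key value is not.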

Secondary keys are set on fields that are often used for searching and sorting data: they help the system find the required data much faster. Unlike primary keys, index fields (secondary keys) may contain non-unique values.

Primary keys are used to establish relationships between tables in a relational database. In this case, the primary key of one table (the parent) corresponds to the foreign key of another table (the child). The foreign key contains values of the linked field that is the primary key. Values in the foreign key may be non-unique, but should not be empty. The primary and foreign keys must be of the same type.

Relationships between tables. Records in one table may depend on one or more records of another table. Such dependencies between tables are called relationships. A relationship is defined as follows: a field or several fields of one table, called the foreign key, refer to the primary key of another table. Consider an example. Since each order must come from a specific customer, each record of the table Orders must refer to the corresponding record of the table Customers. This is the relationship between the tables Orders and Customers. The table Orders must have a field that stores references to records of the table Customers.

Types of relationships. There are three types of relationships between tables.

One to one - each record of the parent table is associated with only one record of the child table. This relationship occurs in practice much less often than one to many and is implemented by defining a unique foreign key. A one-to-one relationship is used when one does not want a table to swell with a large number of fields. Databases that include tables with such a relationship cannot be considered fully normalized.

One to many - each record of the parent table is associated with one or more records of the child table. For example, one customer can place several orders, but several customers cannot place one order. The one-to-many relationship is the most common in relational databases.

Many to many - several records of one table are associated with several records of another table. For example, one author can write several books, and several authors can write one book. With such a relationship it is, in general, impossible to determine which record of one table corresponds to a selected record of the other table, which makes a physical implementation (at the level of indexes and triggers) of such a relationship between the tables impossible. Therefore, before moving to the physical model, all many-to-many relationships must be redefined (some CASE tools, if they are used for data design, do this automatically). Such a relationship between two tables is implemented by creating a third table and establishing a one-to-many relationship between each of the original tables and this intermediate table.
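The replacement of a many-to-many relationship by an intermediate table can be sketched with Python's standard sqlite3 module. The table names authors, books and book_authors, their columns, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE authors (author_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE books   (book_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Intermediate table: one-to-many from authors and one-to-many from books.
    CREATE TABLE book_authors (
        author_id INTEGER NOT NULL REFERENCES authors(author_id),
        book_id   INTEGER NOT NULL REFERENCES books(book_id),
        PRIMARY KEY (author_id, book_id)
    );
    INSERT INTO authors VALUES (1, 'Author A'), (2, 'Author B');
    INSERT INTO books VALUES (1, 'Book X'), (2, 'Book Y');
    INSERT INTO book_authors VALUES (1, 1), (1, 2), (2, 1);
""")

def books_of(author_id):
    """Titles of all books written by the given author."""
    rows = conn.execute(
        "SELECT b.title FROM books b"
        " JOIN book_authors ba ON ba.book_id = b.book_id"
        " WHERE ba.author_id = ? ORDER BY b.title",
        (author_id,),
    )
    return [title for (title,) in rows]

def authors_of(book_id):
    """Names of all authors of the given book."""
    rows = conn.execute(
        "SELECT a.name FROM authors a"
        " JOIN book_authors ba ON ba.author_id = a.author_id"
        " WHERE ba.book_id = ? ORDER BY a.name",
        (book_id,),
    )
    return [name for (name,) in rows]
```

Each row of book_authors links exactly one author to exactly one book, so both directions of the original relationship are answered through two ordinary one-to-many relationships.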

Database (DB) - a named set of structured data relating to a certain subject area and intended for storage, accumulation and processing by computer.

Relational database (RDB) - a set of relations whose names coincide with the names of the relation schemas in the database schema.

Basic concepts of relational databases:

· Data type - the type of values of a specific column.

· Domain - the set of all valid values of an attribute.

· Attribute - the heading of a table column, which characterizes a named property of an object, such as a student's surname, the date of an order, an employee's position, etc.

· Tuple - a row of a table, which is a set of values of logically related attributes.

· Relation - a table reflecting information about objects of the real world, for example, about students, orders, employees, residents, etc.

· Primary key - a field (or set of fields) of a table that unambiguously identifies each of its records.

· Alternative key - a field (or set of fields) that does not coincide with the primary key and uniquely identifies a record instance.

· Foreign key - a field (or set of fields) whose values coincide with existing values of the primary key of another table. When two tables are linked, the primary key of the first table is associated with the foreign key of the second table.

· Relational data model (RDM) - organization of data in the form of two-dimensional tables.

Each relational table must have the following properties:

1. Each table record is unique, i.e. the set of values in its fields is not repeated.

2. Each value recorded at the intersection of a row and a column is atomic (indivisible).

3. The values of each field must be of one type.

4. Each field has a unique name.

5. The order of the records is insignificant.

Basic database elements:

Field - the elementary unit of the logical organization of data. The following characteristics are used to describe a field:

· Name, such as last name, first name, patronymic, date of birth;

· Type, for example, string, character, numeric, date;

· Length, for example, in bytes;

· Precision for numeric data, for example, two decimal places for displaying the fractional part of a number.

Record - a set of values of logically related fields.

Index - a means of speeding up the search for records, also used to establish relationships between tables. A table for which an index is used is called indexed. When working with indexes, attention should be paid to how they are organized, which is the basis for their classification. A simple index is built on one field or on a logical expression over one field. A composite index is built on several fields, possibly with the use of various functions. Table indexes are stored in an index file.

Data integrity - a means of protecting data in linked fields that keeps tables in a consistent state (that is, it does not allow the child table to contain records that have no corresponding records in the parent table).

Query - a question addressed to one or more related tables that contains data selection criteria. A query is expressed in the structured query language SQL (Structured Query Language). Selecting data from one or more tables yields a set of records called a view.

View - a named query stored in the database that selects data (from one or more tables).

A view is essentially a temporary table formed as a result of executing the query. The result of the query itself can be sent to a separate file, a report, a temporary table, a table on disk, etc.
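A stored query of this kind can be sketched with Python's standard sqlite3 module. The table employees, the view dept10_staff and the sample rows are hypothetical; the sketch assumes the usual SQL statement CREATE VIEW as the way of storing the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        emp_no INTEGER PRIMARY KEY, name TEXT, salary REAL, dept_no INTEGER
    );
    INSERT INTO employees VALUES (1, 'Ivanov', 100.0, 10), (2, 'Petrov', 120.0, 20);
    -- The view stores the query text, not a copy of its result.
    CREATE VIEW dept10_staff AS
        SELECT emp_no, name FROM employees WHERE dept_no = 10;
""")

# The view is queried like a table; the salary column is hidden from its users.
before = conn.execute("SELECT * FROM dept10_staff ORDER BY emp_no").fetchall()

# A later change to the underlying table shows up the next time the view is read.
conn.execute("INSERT INTO employees VALUES (3, 'Sidorov', 90.0, 10)")
after = conn.execute("SELECT * FROM dept10_staff ORDER BY emp_no").fetchall()
```

The view behaves like a table built on demand: each time it is read, the stored query is executed against the current contents of employees.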

Report - a component of the system whose main purpose is to describe and print documents based on information from the database.

General characteristics of working with an RDB:

The most widespread interpretation of the relational data model apparently belongs to Date, who reproduces it (with various refinements) in almost all of his books. According to Date, the relational model consists of three parts describing different aspects of the relational approach: the structural part, the manipulation part and the integrity part.

The structural part of the model states that the only data structure used in relational databases is the normalized n-ary relation.

The manipulation part of the model establishes two fundamental mechanisms for manipulating relational databases: relational algebra and relational calculus. The first mechanism is based mainly on classical set theory (with some refinements), and the second on the classical logical apparatus of first-order predicate calculus. Note that the main function of the manipulation part of the relational model is to provide a measure of relationality for any particular relational database language: a language is called relational if it has no less expressiveness and power than relational algebra or relational calculus.
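The set-theoretic character of relational algebra can be shown with a few operations written over Python sets. This is only a sketch of three operations (restriction, projection and natural join); the function names, the helper row, and the sample relations employees and departments are assumptions made for the example.

```python
def row(**values):
    """A tuple of a relation: a frozen set of (attribute name, value) pairs."""
    return frozenset(values.items())

def select(relation, predicate):
    """Restriction: keep the tuples for which the predicate holds."""
    return {t for t in relation if predicate(dict(t))}

def project(relation, attrs):
    """Projection: keep only the named attributes; duplicates vanish in the set."""
    return {frozenset((k, v) for k, v in t if k in attrs) for t in relation}

def natural_join(r, s):
    """Natural join: combine tuples that agree on all their common attributes."""
    result = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            common = dt.keys() & du.keys()
            if all(dt[a] == du[a] for a in common):
                result.add(frozenset({**dt, **du}.items()))
    return result

employees = {
    row(emp_no=1, name="Ivanov", dept_no=10),
    row(emp_no=2, name="Petrov", dept_no=20),
    row(emp_no=3, name="Sidorov", dept_no=10),
}
departments = {
    row(dept_no=10, dept_name="Sales"),
    row(dept_no=20, dept_name="Research"),
}

# Names of the employees who work in the Sales department.
sales_names = project(
    select(natural_join(employees, departments),
           lambda t: t["dept_name"] == "Sales"),
    {"name"},
)
```

Every operation takes whole relations and returns a relation, so operations compose freely; this is what allows queries to be stated in terms of large units of data rather than record by record.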

28. Algorithmic languages. Translators (interpreters and compilers). The algorithmic language BASIC. Program structure. Identifiers. Variables. Statements. Processing one-dimensional and two-dimensional arrays. User-defined functions. Subroutines. Working with data files.

High-level language - a programming language whose concepts and structure are convenient for human perception.

Algorithmic language - a programming language: an artificial (formal) language designed for writing algorithms. A programming language is defined by its description and is implemented as a special program: a compiler or an interpreter. Examples of algorithmic languages are Borland Pascal, C++, Basic, etc.

Basic concepts of an algorithmic language:

Composition of the language:

An ordinary spoken language consists of four main elements: symbols, words, phrases and sentences. An algorithmic language contains similar elements, except that words are called elementary constructions, phrases are called expressions, and sentences are called statements (operators).

Symbols, elementary constructions, expressions and statements form a hierarchical structure, since elementary constructions are formed from a sequence of symbols, and each higher level is built from the levels below it.

Expression - a sequence of elementary constructions and symbols.

Statement - a sequence of expressions, elementary constructions and symbols.

Language description:

The description of the symbols consists of listing the permissible symbols of the language. The description of elementary constructions means the rules for forming them. The description of expressions gives the rules for forming any expressions that make sense in the language. The description of statements consists of examining all types of statements permissible in the language. The description of each language element is given by its syntax and semantics.

Syntactic definitions establish the rules for constructing language elements.

Semantics determines the meaning and the rules of use of those language elements for which syntactic definitions have been given.

Language symbols - the basic indivisible signs in terms of which all texts in the language are written.

Elementary constructions - the minimal units of the language that have independent meaning. They are formed from the basic symbols of the language.

Expression - in an algorithmic language, an expression consists of elementary constructions and symbols; it specifies a rule for computing some value.

Statement - specifies a complete description of some action that must be performed. Describing a complex action may require a group of statements.

In this case the statements are combined into a compound statement or block. The actions specified by statements are performed on data. Sentences of an algorithmic language that give information about data types are called declarations or non-executable statements. A set of declarations and statements united by a single algorithm forms a program in the algorithmic language. When studying an algorithmic language, one must distinguish the language itself from the language used to describe it. The language being studied is usually called simply the language, and the language in whose terms it is described is called the metalanguage.

Translator - a program that performs translation. It converts a program written in one of the high-level languages into a program consisting of machine instructions.

A program written in a high-level algorithmic language cannot be executed directly on a computer. The computer understands only the language of machine instructions. Consequently, a program in an algorithmic language must be translated into the instruction language of the specific computer. Such translation is carried out automatically by special translator programs created for each algorithmic language and for each type of computer.

There are two main methods of translation: compilation and interpretation.

1. Compilation: a compiler reads the entire program, translates it and creates a complete version of the program in machine language, which is then executed.

In compilation, the whole source program is converted at once into a sequence of machine instructions. The resulting program is then executed by the computer with the available source data. The advantage of this method is that translation is performed once, and the (repeated) execution of the resulting program can proceed at high speed. At the same time, the resulting program can take up a lot of space in the computer's memory, since one language statement is replaced by hundreds or even thousands of instructions. In addition, debugging and modifying the translated program is very difficult.

2. Interpretation: an interpreter translates and executes the program line by line.

In interpretation, the source program is stored in the computer's memory almost unchanged. The interpreter program decodes the statements of the source program one at a time and immediately carries them out with the available data. An interpreted program takes up little space in the computer's memory and is easy to debug and modify. But the program executes rather slowly, because its statements are decoded anew on every execution.

Compiled programs run faster, but interpreted programs are easier to correct and change.

Each particular language is oriented either toward compilation or toward interpretation, depending on the purpose for which it was created. For example, Pascal is usually used to solve fairly complex problems in which program speed is important. Therefore this language is usually implemented with a compiler.

On the other hand, BASIC was created as a language for novice programmers, for whom line-by-line execution of a program has undeniable advantages.

Sometimes a language has both a compiler and an interpreter. In this case, the interpreter can be used to develop and test the program, and the debugged program can then be compiled to increase its execution speed.
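The difference between translating a program on every run and translating it once can be illustrated with Python's built-in functions compile and eval. This is only an analogy under stated assumptions: Python's compile produces bytecode for the Python virtual machine rather than machine instructions, and the one-line "program" and its variable names (price, quantity, delivery) are hypothetical.

```python
# A hypothetical one-line source program, kept as plain text.
source = "price * quantity + delivery"

def run_interpreted(data_sets):
    """Translate the source text again on every execution, as an interpreter does."""
    return [eval(source, {}, dict(data)) for data in data_sets]

def run_compiled(data_sets):
    """Translate the source once, then execute the translated form many times."""
    code = compile(source, "<program>", "eval")   # translation step, done once
    return [eval(code, {}, dict(data)) for data in data_sets]

data_sets = [
    {"price": 2, "quantity": 3, "delivery": 1},
    {"price": 5, "quantity": 4, "delivery": 0},
]

interpreted_results = run_interpreted(data_sets)
compiled_results = run_compiled(data_sets)
```

Both functions compute the same results; the second pays the cost of translation only once, which is where the speed advantage of compilation for repeatedly executed programs comes from.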