
27.06.2011, Nate McAlmond

I chose three candidates: WhatsUp Gold Premium from Ipswitch, OpManager Professional from ManageEngine, and ipMonitor from SolarWinds. Each of these network scanners costs no more than $3,000 (for 100 devices), and each offers a trial period during which you can test the product for free.

I work in a medium-sized company, and we have been using the same network monitoring system for about seven years. It provides our administrators with basic information about the availability of servers and services, and sends SMS text messages to our mobile phones when problems occur. I concluded that we needed to replace the system, or at least supplement it with an effective tool offering better performance and detailed information about the status of the terminal servers, Exchange servers, and SQL systems on our network. Let's compare our candidates.

The discovery process

To prepare for testing, the first step was to enable the SNMP service on all devices, including the Windows servers. By changing the SNMP service settings, I set read-only access on all devices the monitoring process should cover. On Windows Server 2003/2000, the SNMP service is installed with the Windows Components Wizard in the Add/Remove Programs panel; on Windows Server 2008, the SNMP components are added with the Server Manager wizard. After the wizard completes, run the Services applet in the Control Panel folder; configuring the SNMP service there is easy. Managed network devices such as firewalls, switches, routers, and printers also have SNMP management tools, and configuration is usually a simple operation. For more information about the SNMP service, see the document "Simple Network Management Protocol" (technet.microsoft.com/en-us/library/bb726987.aspx).

Next, I installed each of the three monitoring systems on one of my two test machines running Windows XP SP3. After installation, each system consisted of two parts: a database and a web server. Each system can be managed through its web interface by several administrators, and you can configure accounts with different levels of access. Common to all three systems is that each user can add, delete, and move panels in their own workspace. Panels display data of a given type, such as processor load or memory usage, for various devices on the network.

Before starting a network scan (the so-called discovery process), I specified the account settings each system should use to access the devices discovered on the network. As the comparative table shows, Ipswitch WhatsUp Gold Premium allows you to configure accounts for SNMP, WMI, Telnet, SSH, ADO, and VMware services. ManageEngine OpManager Professional works over SNMP, WMI, Telnet, SSH, and URL protocols, and SolarWinds ipMonitor uses SNMP, WMI, and URL protocols.

After configuring the SNMP service on the network devices and the accounts (Windows and SNMP) for each monitoring system, I launched discovery for the IP address range of my local network. All systems discovered about 70 devices. Using the default scan settings, the test systems did well at identifying device types and provided detailed information about device state. All three systems include sensors for the main operating characteristics of devices and servers, such as processor load, memory usage, disk usage and capacity, packet loss and delay, Exchange, Lotus, and Active Directory services, and all Windows services. Each system could add sensors both for individual devices and for large groups of devices.
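As a rough illustration of what a discovery run does, the sketch below expands an address range into candidate hosts and keeps the ones that answer a probe. The probe function and the responder list are invented stand-ins for a real SNMP/WMI/ping check:

```python
import ipaddress

# Simulated responders; a real discovery run would probe the network instead.
KNOWN_UP = {"192.168.1.1", "192.168.1.10", "192.168.1.20"}

def probe(addr):
    """Stand-in for a real reachability/SNMP probe (hypothetical)."""
    return addr in KNOWN_UP

def discover(cidr):
    """Return the addresses in `cidr` that answered the probe."""
    return [str(h) for h in ipaddress.ip_network(cidr).hosts() if probe(str(h))]

print(discover("192.168.1.0/28"))  # → ['192.168.1.1', '192.168.1.10']
```

A real scanner would also fingerprint each responder (device type, OS, services) before attaching sensors to it.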

The OpManager and WhatsUp Gold packages have interfaces for identifying and collecting VMware data from servers and guest systems. In addition, both products can survey the ports of managed switches and report which devices are connected to which ports. This information helps determine through which switch port a specific business application is connected, with no need to manually trace cables in the server room. You can then configure alerts for specific switch ports. With OpManager, to obtain port survey results it is enough to select a switch and run the Switch Port Mapper tool; the system returns results in a few seconds. The similar tool in WhatsUp Gold is called MAC Address; it must be run with the Get Connectivity option checked. WhatsUp Gold takes longer to produce results because it tries to scan devices and collect connectivity information across the whole network.

IPSWITCH WHATSUP GOLD PREMIUM
PROS: the most accurate results of the three competitors; lets you create your own sensors; comprehensive VMware monitoring tools; integrates with AD.
CONS: fewer built-in sensors and a higher cost than the competitors (when purchasing a license for fewer than 500 devices).
RATING: 4.5 out of 5.
PRICE: $7,495 for 500 devices, $2,695 for 100 devices, $2,195 for 25 devices.
RECOMMENDATION: I recommend WhatsUp Gold to IT departments serving large VMware environments or wishing to create their own sensors.
CONTACT: Ipswitch, www.ipswitch.com

When working with the ipMonitor and OpManager systems, I occasionally encountered puzzling readings that left me at a dead end. ipMonitor could display negative values in its panels when processor load dropped sharply. In another case, when processor usage was close to zero, ipMonitor sent me a notice that the processor was 11,490% utilized! OpManager, while tracking and reporting domain controller disk usage correctly, in some cases omitted controllers from its list of the 10 servers using the most disk space, even while a neighboring panel showed that one of my domain controllers belonged not merely in the top ten but in the top three. With WhatsUp Gold I never ran into such situations. WhatsUp Gold tracks processor load per core in its panels, and when I compared the results from the WhatsUp Gold panels with the Windows Performance Monitor readings, they matched exactly for each core. Similarly, hard disk usage information was reported correctly to all the relevant workspace applications.

WhatsUp Gold has a built-in sensor library that allows you to create new sensors based on existing ones. Large organizations will find this feature useful, as it allows creating uniform sensors for monitoring many types of devices; it is the most efficient way to configure sensors for a group of devices.

Unlike the OpManager package, which includes vendor-specific sensors for Dell, HP, and IBM devices, WhatsUp Gold has no sensors for individual manufacturers' hardware (except a sensor for APC UPS power supplies), but it does let you create Active Script sensors. This sensor type lets you develop your own monitoring processes in the VBScript and JScript languages. Active Script sensors have a dedicated online support area where WhatsUp Gold users can find and download ready-made scripts.

The only improvement I would like to see in WhatsUp Gold concerns the interface (Screen 1), mainly because it is too linear. For example, it can take up to 5 clicks on the Cancel and Close buttons to return from the Active Monitor Library window back to the workspace. WhatsUp Gold also lacks a sensor (unless you write one yourself) that checks the state of a web site, which may be needed, especially when the site is hosted on a third-party server and no other means of access exist.


Screen 1. WhatsUp Gold Premium interface

To handle situations where devices are down for some time, you can configure notifications to be resent every 2, 5, or 20 minutes. In this way the administrator's attention can be drawn to the continued absence of responses from the most important nodes.
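The repeat-notification behavior described above is simple to model. A minimal sketch (the function name and time units are my own, not WhatsUp Gold's):

```python
def reminder_times(outage_minutes, interval):
    """Minutes (from outage start) at which a repeat notification fires,
    for a device that stays down for `outage_minutes`."""
    return list(range(interval, outage_minutes + 1, interval))

# A node down for 10 minutes with a 2-minute repeat interval:
print(reminder_times(10, 2))  # → [2, 4, 6, 8, 10]
```

Choosing a longer interval (5 or 20 minutes) for less critical nodes keeps the alert stream from drowning out the devices that matter.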

WhatsUp Gold is the only one of the systems under consideration that can integrate with an LDAP environment, which may be decisive when choosing a solution for large networks.

ManageEngine OpManager Professional
PROS: the best user interface of the three products; more built-in sensors than the other two systems; the lowest price for licenses of 50 or fewer devices.
CONS: during testing not all devices were displayed correctly; some debugging time may be needed to make the system fully functional.
RATING: 4.5 out of 5.
PRICE: $1,995 for 100 devices, $995 for 50 devices, $595 for 25 devices.
RECOMMENDATION: IT departments that want the maximum number of built-in features (except AD integration) will appreciate OpManager Professional. For licenses in the 26-50 device range, it costs roughly half as much as the other two products.
CONTACT: ManageEngine, www.manageengine.com

After installing OpManager, I found it easy to configure its huge number of functions and convenient to move between them. Along with email and SMS, OpManager can send direct messages to a Twitter account, a pleasant alternative to email. Using a Twitter account keeps me aware of what is happening on the network, but since my phone does not ring when Twitter messages arrive, I still want text notifications for the most important events in parallel. I can view threshold-crossing information for any server via Twitter messages and thus keep a log of current events on the network, but this channel should not be relied on for critical situations.

In addition to the standard sensors, OpManager offers vendor-developed SNMP performance monitoring for devices such as Dell PowerEdge, HP ProLiant, and IBM BladeCenter. OpManager can also be integrated with the Google Maps API so that you can place your devices on a Google map. However, you will need to purchase a Google Maps API Premium account (unless you plan to make your network map publicly available) to comply with the licensing terms of the free version of the Google Maps API.

To handle situations where an administrator receives a warning but does not respond within a certain time, OpManager can be configured to send an additional warning to another administrator. For example, the employee usually responsible for handling critical events for a specific group of servers may be busy or ill. In that case it makes sense to configure an additional warning that will attract another administrator's attention if the first warning is not viewed or cleared within a specified number of hours or minutes.
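A minimal sketch of that escalation rule (the names and the 30-minute timeout are illustrative, not OpManager's actual settings):

```python
def escalation_target(elapsed_min, acknowledged, primary, backup, timeout_min=30):
    """Who should be alerted now? None once the alert is acknowledged."""
    if acknowledged:
        return None                                  # alert handled, stop notifying
    return primary if elapsed_min < timeout_min else backup

print(escalation_target(10, False, "alice", "bob"))  # → alice
print(escalation_target(45, False, "alice", "bob"))  # → bob (escalated)
print(escalation_target(45, True,  "alice", "bob"))  # → None
```

The key design point is that escalation is driven by acknowledgment, not just delivery: the backup is only paged if nobody marks the alert as handled.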

Of the three products under consideration, only OpManager has a section designed to monitor VoIP call quality across a wide-area network. To use the VoIP monitoring tools, the devices on both the source and destination networks must support Cisco IP SLA technology. In addition, OpManager, whose interface is shown in Screen 2, includes more sensors and dashboards than either competing product.


Screen 2. OpManager Professional interface

SolarWinds ipMonitor
PROS: an unlimited number of devices at a very low price; easy to use.
CONS: no mechanism for coordinating administrators' actions.
RATING: 4 out of 5.
PRICE: $1,995, unlimited devices (25 sensors free).
RECOMMENDATION: if your budget is limited, you need to monitor a large number of devices, the monitoring process does not require complex solutions, and you can live without alert acknowledgment, SolarWinds is your choice.
CONTACT: SolarWinds, www.solarwinds.com

At first acquaintance, the ipMonitor interface, shown in Screen 3, seemed very confusing to me. It took me nearly forever to find where the polling frequency of individual system sensors is configured (by default, polling was performed every 300 seconds). However, after using ipMonitor for several weeks, I found the system extremely easy to use and quite capable of high-quality network monitoring. With ipMonitor you can configure the default scan so that any service or performance parameter is always included in future scans. In addition to the standard (and previously mentioned) sensors, ipMonitor offers a Windows event sensor that can be used to send warnings when critical events are detected.


Screen 3. SolarWinds ipMonitor interface

On the other hand, ipMonitor has no alert tracking or assignment. That does not matter if the company has one network administrator, but larger IT departments will likely find the inability to acknowledge warnings, assign them to specific people, and clear them a significant drawback. If administrators forget to coordinate their actions outside the system, situations arise where several administrators receive the same warning and start working on the same problem. However, such conflicts can be resolved by agreeing on a coherent alert-response procedure; for example, if responsibility for network devices is divided among administrators, there will be no question of who should deal with a given problem.

Time to make a decision

I have already decided which of the three products suits my environment best. I settled on the ManageEngine OpManager system with a license for 50 devices, for several reasons.

First of all, I need to track as many parameters of my environment as possible, since this is the best way to avoid unexpected failures, and here OpManager is definitely ahead of its competitors. The second reason is budget: I can continue to use our old on/off-style monitoring tools for workstations and printers and thus avoid the cost of additional licenses. Finally, I really liked the approach ManageEngine's developers took with OpManager, taking advantage of new technologies, and I consider it worthwhile to purchase the annual service and support package, which allows downloading the updates that appear as the product develops.

Nate McAlmond ([Email Protected]) is the IT director at a social services agency; he holds MCSE, Security, and Network+ certifications and specializes in thin-client solutions and medical databases.



Original title: A Summary of Network Traffic Monitoring and Analysis Techniques

Link to the original text: http://www.cse.wustom.edu/~jain/cse567-06/ftp/net_monitoring/index.html

Note: this is not a professional translation. Deviations from the text, loose interpretation of certain terms and concepts, and the translator's subjective opinions are possible. All illustrations are reproduced unchanged.

Alisha Cecil

Overview of analyzing and monitoring network traffic

As companies' private internal networks continue to grow, it is increasingly important that network administrators know how to monitor and manage the various types of traffic traveling across their networks. Monitoring and analyzing traffic is necessary to diagnose and solve problems more effectively when they occur, so that network services are not brought down for long periods. Many different tools are available to help administrators monitor and analyze network traffic. This article discusses router-based monitoring methods and non-router-based methods (active and passive). It gives an overview of the three most widely used router-based network monitoring methods (SNMP, RMON, and Cisco NetFlow) and provides information on two newer methods that combine passive and active monitoring (WREN and SCNM).

1. The importance of monitoring and analyzing the network

Network monitoring is a demanding task that is a vital part of a network administrator's job. Administrators constantly strive to keep their networks running smoothly. If a network is down even for a short period, productivity in the company declines and, for organizations providing public services, the ability to deliver basic services is jeopardized. Administrators therefore need to monitor traffic flow and performance throughout the network and check for security breaches.

2. Methods of monitoring and analysis

"Network analysis is the process of capturing network traffic and examining it closely to determine what is happening on the network," according to Angela Orebaugh. The following sections discuss two approaches to network monitoring: router-based and non-router-based. Monitoring functionality built into the routers themselves, requiring no additional software or hardware installation, is called router-based. Non-router-based methods require the installation of hardware and software but provide greater flexibility. Both approaches are discussed in the relevant sections below.

2.1. Methods of monitoring based on the router

Router-based monitoring methods are hard-coded into routers and therefore offer little flexibility. Brief descriptions of the most frequently used methods are given below. Each method took many years of development before becoming a standardized monitoring method.

2.1.1. Simple Network Management Protocol (SNMP), RFC 1157

SNMP is an application-layer protocol that is part of the TCP/IP suite. It allows administrators to manage network performance, find and fix network problems, and plan for network growth. It collects traffic statistics through passive sensors that are implemented from the router down to the end hosts. While there are two versions in common use (SNMPv1 and SNMPv2), this section describes only SNMPv1. SNMPv2 builds on SNMPv1 and offers a number of improvements, such as additional protocol operations. A further version, SNMPv3, is in the process of being standardized.

The SNMP protocol has three key components: managed devices, agents, and network management systems (NMSs). They are shown in Fig. 1.

Fig. 1. SNMP components.

Managed devices contain an SNMP agent and can include routers, switches, hubs, personal computers, printers, and similar elements. They are responsible for collecting information and making it available to the network management system (NMS).

Agents contain software that has knowledge of the management information and translates that information into a form compatible with SNMP. They reside on the managed device.

Network management systems (NMSs) run the applications that monitor and control managed devices. The processor and memory resources needed to manage the network are provided by the NMS. Any managed network must have at least one management system. An SNMP entity can act solely as an NMS or as an agent, or it can perform the duties of both.

There are 4 basic commands used by an SNMP NMS to monitor and control managed devices: the read, write, trap, and traversal operations. The read operation examines the variables stored by managed devices. The write operation changes the values of variables stored by managed devices. Traversal operations determine which variables a managed device supports and gather information from the supported tables of variables. The trap operation is used by managed devices to report certain events to the NMS.

Correspondingly, SNMP uses 4 protocol operations: Get, GetNext, Set, and Trap. The Get command is used when the NMS issues a request for information to a managed device. An SNMPv1 request consists of a message header and a protocol data unit (PDU). The PDU contains the information needed to complete the request successfully, which will either retrieve information from the agent or set a value in the agent. The managed device uses its SNMP agents to obtain the required information and then sends the NMS a message answering the request. If the agent has no information relevant to the request, it returns nothing. The GetNext command retrieves the value of the next object instance. The NMS can also send a request that sets the value of an element within an agent (the Set operation). When an agent needs to inform the NMS of an event, it uses the Trap operation.

As mentioned earlier, SNMP is an application-layer protocol that uses passive sensors to help the administrator track network traffic and performance. Although SNMP can be a useful tool for a network administrator, it creates an opening for security threats because it lacks authentication capabilities. It differs from remote monitoring (RMON), discussed in the next section, in that RMON works at the network layer and below, rather than at the application layer.

2.1.2. Remote Monitoring (RMON), RFC 1757

RMON comprises network monitors and console systems that exchange the data gathered during network monitoring. It is an extension of the SNMP management information base (MIB). Unlike SNMP, which must poll for information, RMON can set alarms that "watch" the network against specific criteria. RMON lets administrators manage local networks as well as remote ones from a single location. Its monitors for the network layer are described below. RMON has two versions, RMON and RMON2, but this article discusses only RMON. RMON2 allows monitoring at all network layers; it focuses on IP traffic and application-level traffic.

Although an RMON monitoring environment has 3 key components, only two of them are covered here. They are shown in Fig. 2 below.


Fig. 2. RMON components

The two RMON components are the sensor, also known as an agent or monitor, and the client, also known as the management station. Unlike with SNMP, the RMON sensor or agent collects and stores the network information. The sensor is software built into a network device (such as a router or switch); it can also run on a personal computer. A sensor must be placed on each segment of the local or wide-area network to be monitored, since sensors can only see the traffic that flows through their own link and know nothing about traffic beyond it. The client is usually a management station that communicates with the sensor via SNMP to obtain and correlate the RMON data.

RMON uses 9 different monitoring groups to obtain network information.

  • Statistics - statistics measured by the sensor for each monitored interface on the device.
  • History - records periodic statistical samples from the network and stores them for later retrieval.
  • Alarm - periodically takes statistical samples and compares them with a set of thresholds to generate events.
  • Host - contains statistics associated with each host discovered on the network.
  • HostTopN - prepares tables describing the top hosts by some statistic.
  • Filters - enables packet filtering based on a filter equation, for capturing events.
  • Packet Capture - captures packets after they pass through a channel.
  • Events - controls the generation and logging of events from the device.
  • Token Ring - support for Token Ring networks.
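The Alarm group's behavior, sampling a variable and firing on threshold crossings with hysteresis, can be sketched as follows (the thresholds and samples are invented):

```python
def alarm_events(samples, rising, falling):
    """RMON-style rising alarm with hysteresis: fire when a sample crosses
    `rising`, then re-arm only after a sample falls back to `falling`."""
    events, armed = [], True
    for i, value in enumerate(samples):
        if armed and value >= rising:
            events.append((i, "risingAlarm", value))
            armed = False          # suppress repeats until re-armed
        elif not armed and value <= falling:
            armed = True           # value recovered; arm the alarm again
    return events

cpu = [30, 55, 92, 95, 60, 40, 91]
print(alarm_events(cpu, rising=90, falling=50))
# → [(2, 'risingAlarm', 92), (6, 'risingAlarm', 91)]
```

The hysteresis (separate rising and falling thresholds) is what keeps a value hovering around a single threshold from generating a flood of events.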

As established above, RMON builds on the SNMP protocol. Although traffic monitoring can be performed this way, the analytical value of the information obtained via SNMP and RMON is limited. The NetFlow utility, discussed in the next section, works with many analysis software packages and makes the administrator's work much easier.

2.1.3. NetFlow, RFC 3954

NetFlow is an extension introduced on Cisco routers that collects IP network traffic on any interface where it is enabled. By analyzing the data NetFlow provides, a network administrator can identify things such as the source and destination of traffic, the class of service, and the causes of congestion. NetFlow comprises 3 components: FlowCaching, the FlowCollector, and the Data Analyzer. Fig. 3 shows the NetFlow infrastructure; each component shown in the figure is explained below.


Fig. 3. NetFlow infrastructure.

FlowCaching analyzes and collects data about the IP flows entering an interface and prepares the data for export.

The following information can be obtained from NetFlow packets:

  • Source and destination address.
  • Input and output interface number.
  • Source and destination port number.
  • Layer 4 protocol.
  • Number of packets in the flow.
  • Number of bytes in the flow.
  • Timestamps of the flow.
  • Source and destination autonomous system (AS) number.
  • Type of service (ToS) and TCP flags.

The first packet of a flow passing through the standard switching path is processed to create a cache entry. Packets with matching flow characteristics are merged into a flow record, which is kept in the cache for every active flow. Each record counts the number of packets and the number of bytes in its flow. The cached information is then periodically exported to the flow collector.
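A flow cache of this kind can be sketched in a few lines: packets sharing the same key (source, destination, protocol, source port, destination port) are merged into one record that accumulates packet and byte counts. The addresses and sizes below are invented:

```python
from collections import defaultdict

def build_cache(packets):
    """Aggregate (src, dst, proto, sport, dport, size) packets into
    NetFlow-style flow records keyed on the 5-tuple."""
    cache = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src, dst, proto, sport, dport, size in packets:
        rec = cache[(src, dst, proto, sport, dport)]
        rec["packets"] += 1
        rec["bytes"] += size
    return dict(cache)

packets = [
    ("10.0.0.5", "10.0.0.9", "tcp", 41000, 80, 1500),
    ("10.0.0.5", "10.0.0.9", "tcp", 41000, 80, 400),
    ("10.0.0.7", "10.0.0.9", "udp", 5353, 53, 80),
]
cache = build_cache(packets)
print(cache[("10.0.0.5", "10.0.0.9", "tcp", 41000, 80)])
# → {'packets': 2, 'bytes': 1900}
```

This aggregation is why NetFlow export is so much more compact than per-packet capture: three packets collapse into two flow records here.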

The FlowCollector is responsible for collecting, filtering, and storing the data. It keeps a history of the flow information received over its connections. Data volume is also reduced in the FlowCollector through selective filtering and aggregation.

The Data Analyzer is required when the data needs to be presented. As the figure shows, the collected data can serve various purposes beyond network monitoring, such as planning, accounting, and network engineering.

The advantage of NetFlow over other monitoring methods such as SNMP and RMON is that software packages exist for many kinds of traffic analysis that take the data from NetFlow packets and present it in a more user-friendly form.

Using tools such as NetFlow Analyzer (just one of the tools available for analyzing NetFlow packets), the information listed above can be extracted from NetFlow packets to create charts and ordinary graphs that the administrator can study for a better understanding of the network. The greatest advantage of using NetFlow with the available analysis packages is that numerous graphs describing network activity can be built for any point in time.

2.2. Technologies not based on routers

Although technologies not built into routers are still limited in their capabilities, they offer greater flexibility than the router-built-in technologies. These methods are classified as active and passive.

2.2.1. Active monitoring

Active monitoring reports on network problems by collecting measurements between two endpoints. An active measurement system deals with metrics such as: utilization, routes, packet delay, packet reordering, packet loss, jitter (variation in inter-arrival times), and bandwidth measurement.
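The delay, loss, and jitter metrics named above can be computed from a list of probe results, where each probe is a (send time, receive time) pair, or None if the reply was lost. The times below are illustrative milliseconds:

```python
def probe_stats(probes):
    """Derive loss fraction, mean delay, and jitter (mean absolute
    difference of consecutive RTTs) from active probe results."""
    rtts = [rx - tx for pair in probes if pair is not None for tx, rx in [pair]]
    loss = (len(probes) - len(rtts)) / len(probes)
    mean = sum(rtts) / len(rtts)
    jitter = sum(abs(a - b) for a, b in zip(rtts, rtts[1:])) / (len(rtts) - 1)
    return {"loss": loss, "mean_delay": mean, "jitter": jitter}

# Four probes, one lost: RTTs are 20, 18, and 25 ms.
print(probe_stats([(0, 20), (100, 118), None, (300, 325)]))
# → {'loss': 0.25, 'mean_delay': 21.0, 'jitter': 4.5}
```

A real ping-style tool would obtain these (send, receive) pairs by sending ICMP echo requests and timing the replies; the arithmetic over them is exactly this.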

Tools such as the ping command, which measures packet delay and loss, and traceroute, which helps determine the network topology, are examples of basic active measurement tools. Both send ICMP probe packets to a destination and wait for the destination to respond to the sender. Fig. 4 shows an example of the ping command, which applies an active measurement method by sending an echo request from the source through the network to a destination. The recipient then sends an echo reply back to the source the request came from.


Fig. 4. The ping command (active measurement)

This method can not only collect individual active-measurement metrics but also determine the network topology. Another important example of active measurement is the iperf utility. iperf measures TCP and UDP bandwidth quality, reporting the channel bandwidth, the existing delay, and packet loss.

The problem with active monitoring is that the probes injected into the network can interfere with normal traffic. Often the active probes are also treated differently from normal traffic, which calls into question the value of the information they provide.

For the reasons described above, active monitoring is very rarely used as a monitoring method on its own. Passive monitoring, by contrast, does not impose heavy overhead on the network.

2.2.2. Passive monitoring

Passive monitoring, unlike active monitoring, does not add traffic to the network and does not change the traffic that already exists on the network. Also, unlike active monitoring, passive monitoring collects information about only one point in the network, whereas active measurements are taken between two points. Fig. 5 shows the setup of a passive monitoring system: the monitor is placed on a single link between two endpoints and watches the traffic as it passes along the link.


Fig. 5. Installation of passive monitoring

Passive measurements deal with information such as: traffic and protocol mix, bit rate, packet timing, and inter-arrival times. Passive monitoring can be performed with any program that captures packets.
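A sketch of the passive measurements just named: given a list of captured (timestamp, protocol) pairs, derive the protocol mix and the packet inter-arrival gaps. The capture below is invented, with timestamps in milliseconds:

```python
from collections import Counter

def summarize(capture):
    """From captured (timestamp, protocol) pairs, compute the protocol
    mix and the gaps between consecutive packet arrivals."""
    mix = Counter(proto for _, proto in capture)
    times = [t for t, _ in capture]
    gaps = [b - a for a, b in zip(times, times[1:])]
    return mix, gaps

capture = [(0, "tcp"), (10, "tcp"), (50, "udp"), (60, "tcp")]
mix, gaps = summarize(capture)
print(mix)   # → Counter({'tcp': 3, 'udp': 1})
print(gaps)  # → [10, 40, 10]
```

Note that everything here is derived after the fact from the capture, which reflects the offline-analysis limitation of passive monitoring discussed below.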

Although passive monitoring avoids the overhead of active monitoring, it has its drawbacks. With passive monitoring, measurements can only be analyzed offline rather than as they are collected. This creates a problem with processing the large data sets gathered during measurement.

Passive monitoring is better than active monitoring in that no signaling overhead is added to the network, but the post-processing can be very time-consuming. That is why combinations of the two monitoring methods exist.

2.2.3. Combined monitoring

Having read the sections above, you can safely conclude that combining active and passive monitoring is better than using either one alone. Combined technologies use the best features of both passive and active monitoring. Two new technologies representing combined monitoring are described below: Watching Resources from the Edge of the Network (WREN) and the Self-Configuring Network Monitor (SCNM).

2.2.3.1. Watching Resources from the Edge of the Network (WREN)

WREN uses a combination of active and passive monitoring techniques, actively injecting probes when traffic is light and passively processing data when traffic is heavy. It watches traffic at both the source and the receiver, which makes more accurate measurements possible. WREN uses packet traces of traffic generated by applications to measure the available bandwidth. WREN is divided into two levels: a fast kernel-level packet processor and a user-level trace analyzer.

The kernel-level fast packet handler is responsible for collecting information about incoming and outgoing packets. Fig. 6 lists the information collected for each packet. Web100 provides a buffer for collecting these characteristics. The buffer is accessed through two system calls: one starts a trace and supplies the information needed for the collection, while the other returns the trace from the kernel.

Fig. 6. Information collected at the main packet trace level

The packet trace facility can coordinate measurements between different machines. One machine activates tracing on another by setting a flag in an outgoing packet header, instructing it to begin processing a certain range of packets. The other machine in turn traces every packet in which it sees that flag set. This coordination ensures that information about the same packets is stored at each endpoint, regardless of the connection and of what happens between them.

The user-level trace analyzer is the other level of the WREN environment. It is the component that starts packet traces and then collects and processes the data returned from the kernel level. By design, the user-level components do not need to read information from the packet trace facility immediately: traces can be analyzed as soon as they complete to draw real-time conclusions, or the data can be saved for later analysis.
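The two-call interface between the kernel buffer and the user-level analyzer can be sketched as follows. Everything here is an illustrative assumption: the class, method names, and in-memory "kernel buffer" are invented to show the shape of the interaction, not the real WREN or Web100 API.

```python
# Illustrative sketch of the start-trace / read-trace pair described above.
# Names and the in-memory buffer are assumptions, not the real WREN interface.

class PacketTraceBuffer:
    def __init__(self):
        self._armed = False
        self._records = []

    def start_trace(self, connection_id):
        """First call: arm the kernel to trace packets for one connection."""
        self._armed = True
        self._connection_id = connection_id
        self._records = []

    def _kernel_sees_packet(self, seq, ack, timestamp):
        # The fast kernel-level path appends one fixed-size record per packet.
        if self._armed:
            self._records.append((seq, ack, timestamp))

    def read_trace(self):
        """Second call: hand the collected trace to the user-level analyzer."""
        self._armed = False
        return list(self._records)

buf = PacketTraceBuffer()
buf.start_trace(connection_id=1)
buf._kernel_sees_packet(100, 0, 0.001)
buf._kernel_sees_packet(200, 100, 0.002)
trace = buf.read_trace()
print(len(trace))  # 2
```

The user-level analyzer is then free to process `trace` immediately or to save it for later, matching the design choice described in the text.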

When traffic is light, WREN actively injects traffic into the network while preserving the ordering of the measured flows. Numerous studies have found that WREN produces similar measurements in both saturated and unsaturated environments.

In the current WREN implementation, users are not restricted to capturing only the traces they initiated themselves. Although any user can observe the traffic of other users' applications, the information obtainable from other users' traces is limited: only sequence and acknowledgement numbers are visible, not the actual data segments of the packets.

Overall, WREN is a very useful facility that exploits the advantages of both active and passive monitoring. Although the technology is at an early stage of development, WREN can give administrators useful data for monitoring and analyzing their networks. The Self-Configuring Network Monitor (SCNM) is another toolkit that combines active and passive monitoring technologies.

2.2.3.2. Self-Configuring Network Monitor (SCNM)

SCNM is a monitoring tool that combines passive and active measurements to collect information at layer 3 on edge routers and at other important monitoring points in the network. The SCNM environment includes both hardware and software components.

The hardware is installed at critical points in the network and is responsible for the passive collection of packet headers. The software runs at an end point of the network. Fig. 7 below shows the software component of the SCNM environment.


Fig. 7. SCNM software component

The software is responsible for creating and sending activation packets, which are used to start network monitoring. Users send activation packets into the network describing the packets they want collected for monitoring. They do not need to know the location of the SCNM hosts, on the assumption that all hosts listen for ("sniff") the packets. Based on the information carried in the activation packet, a filter is installed in the data-collection stream that also runs on the end point. The network- and transport-layer headers of packets matching the filter are collected. The filter times out automatically after a specified period if no further activation packets are received. The packet-capture service running on the SCNM host uses the tcpdump utility (a packet-capture program) to handle the received requests and record the traffic that matches each request.
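The step from an activation request to a capture filter can be sketched as below. This is a hedged illustration: the request fields (`src`, `dst`, `port`) and the timeout logic are assumptions chosen for the example, not the actual SCNM activation-packet format; only the tcpdump/BPF filter syntax in the output is standard.

```python
# Sketch: turn an SCNM-style activation request into a tcpdump/BPF filter
# expression, plus the timeout rule the text describes. Field names are
# illustrative assumptions, not the real activation-packet layout.

def build_filter(request):
    """Compose a BPF filter expression from the requested packet criteria."""
    parts = []
    if "src" in request:
        parts.append(f"src host {request['src']}")
    if "dst" in request:
        parts.append(f"dst host {request['dst']}")
    if "port" in request:
        parts.append(f"port {request['port']}")
    return " and ".join(parts)

def filter_expired(installed_at, timeout_s, now):
    """The filter times out unless refreshed by further activation packets."""
    return now - installed_at >= timeout_s

req = {"src": "10.0.0.5", "dst": "10.0.1.9", "port": 80}
expr = build_filter(req)
print(expr)  # src host 10.0.0.5 and dst host 10.0.1.9 and port 80
```

The capture host could then run something like `tcpdump 'src host 10.0.0.5 and dst host 10.0.1.9 and port 80'` to record only the matching traffic.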

When the passive monitoring tools detect a problem, traffic can be generated with the active monitoring tools, allowing additional data to be collected for a more detailed study of the problem. By deploying such a monitor on every router along a path, we can study only the network segments that are experiencing problems.

SCNM is intended to be installed and used primarily by administrators. Nevertheless, ordinary users can use part of its functionality, although they are allowed to see only their own data.

In conclusion, SCNM is another combined-monitoring approach that uses both active and passive methods to help administrators monitor and analyze their networks.

3. Conclusion

When selecting particular tools for network monitoring, the administrator must first decide whether to rely on well-proven systems that have been in use for many years, or on newer ones. If an existing system is the more appropriate solution, NetFlow is the most useful tool, since packet-analyzing utilities can be used together with it to present the data in a more user-friendly form. However, if the administrator is prepared to try a new system, combined-monitoring solutions such as WREN or SCNM are the best direction for further work.

Monitoring and analyzing the network are vital parts of a system administrator's job. Administrators must strive to keep their networks running smoothly, both for uninterrupted operation within the company and for connectivity to any existing public services. Based on the information above, a number of router-based and non-router-based technologies are suitable for helping network administrators with the day-to-day monitoring and analysis of their networks. SNMP, RMON, and Cisco's NetFlow were briefly described here as examples of router-based technologies; active monitoring, passive monitoring, and their combination are the non-router-based technologies discussed in this article.

Management and monitoring of the IT infrastructure is one of the main tasks of any company's IT department. HP Software solutions simplify this task for system administrators and provide effective control over the organization's network.

A modern IT infrastructure is a complex heterogeneous network combining telecommunications, server, and software solutions from different manufacturers, operating on the basis of different standards. Its complexity and scale demand a high level of automated monitoring and management tooling to ensure reliable network operation. HP Software products help solve monitoring tasks at every level, from the infrastructure (network equipment, servers, and storage systems) to controlling the quality of business services and business processes.

Monitoring systems: What are they?

Modern IT monitoring platforms show three directions of development that take monitoring to a new level. The first is called the "bridge" (an "umbrella system", or "manager of managers"). Its concept is to reuse the investments in already existing systems that monitor individual parts of the infrastructure, turning those systems into information agents. This approach is the logical evolution of conventional IT-infrastructure monitoring. The prerequisites for implementing a bridge-type system include the IT department's wish to consolidate scattered monitoring systems and move to monitoring IT services as a whole; the inability of scattered systems to show the whole picture or diagnose a serious application failure; a large number of warnings and alarms; and the absence of unified coverage, prioritization, and detection of cause-and-effect relationships.

The result of the implementation is automated collection of all available IT-infrastructure events and metrics, correlation of their states, and assessment of their influence on the "health" of each service. In the event of a failure, the operator gets access to a panel displaying the root cause of the failure together with recommendations for eliminating it. For typical failures, a script can be assigned that automates the necessary operator actions.
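The core of such root-cause correlation can be sketched very simply: events from many sources are mapped onto a resource-service model, and a failure is reported as a root cause only if it is not explained by a failed dependency. The model and the component names below are invented for illustration; real products use far richer topology and correlation rules.

```python
# Minimal sketch of "umbrella" correlation over a resource-service model.
# The dependency map and component names are illustrative assumptions.

# Resource-service model: component -> the component it depends on.
DEPENDS_ON = {
    "web-frontend": "app-server",
    "app-server": "database",
    "database": None,
}

def root_cause(failed_components):
    """Yield failed components whose failure is not explained by a failed
    dependency; these are the candidate root causes shown to the operator."""
    failed = set(failed_components)
    for component in failed:
        parent = DEPENDS_ON.get(component)
        if parent is None or parent not in failed:
            yield component

# All three alarms collapse to a single root cause for the operator.
events = ["web-frontend", "app-server", "database"]
print(sorted(root_cause(events)))  # ['database']
```

This is the mechanism that turns "a large number of warnings and alarms" into one actionable panel entry.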

The next trend is called "anomaly analytics". Here, as in the first case, metrics and events are collected from a number of infrastructure monitoring systems, and in addition the collection of IT and security logs is configured. A huge amount of information thus accumulates every minute, and the company wants to benefit from having it. There are several reasons to introduce anomaly analytics: the difficulty of timely collection, storage, and analysis of all the data; the need to eliminate previously unknown problems reactively; the inability to quickly single out the information that matters for eliminating failures; the difficulty of manually searching individual logs; and the need to detect deviations and recurring failures.

Implementing such a system enables automated collection of events, metrics, and logs; storage of this information for the required period; and analysis of any of it, including logs, performance information, and system data. In addition, it becomes possible to predict and resolve various types of problems and to prevent known failures.

Finally, there is "application performance management": identifying and eliminating failures in end-user transactions. Such a solution can be a useful addition tightly integrated with the previous two, and it can also deliver quick results on its own. The scenario here is a company with business-critical applications, where the availability and quality of services matter and an application (Internet banking, CRM, billing, etc.) is one of their key elements. When the availability or quality of such a service drops, the conversation usually turns to proactivity and rapid recovery. A system of this kind is typically implemented when it is necessary to increase application availability and service performance and to reduce the mean time to recovery. The approach is also good for eliminating extra costs, reducing the risks associated with service level agreements (SLAs), and preventing customer churn (protecting the business).

The results of implementation vary depending on the main task. In general, such a system can execute typical user actions with a "robot" from different regions or network segments, analyze mirrored traffic, check the availability and quality of services to identify bottlenecks, and inform the operator when performance needs to be restored, indicating the point of degradation. If necessary, it becomes possible to diagnose the application in depth to find the causes of systematic service degradation.

The approaches above can be implemented using the HP Software products discussed below.

"Bridge" from HP

HP Operations Bridge represents the newest generation of "umbrella" monitoring systems. The solution combines monitoring data from its own agents, from various HP Software monitoring modules, and from other vendors' monitoring tools. The stream of events from all information sources is overlaid on a resource-service model, and correlation mechanisms are applied to it to determine which events are causes, which are symptoms, and which are consequences.

The resource-service model deserves separate mention: without an accurate model, the information cannot be analyzed from different angles, and the ability of correlation to cope with the flow of events depends on the model's completeness and relevance. To keep the model up to date, discovery facilities based on agent and agentless technologies are used, providing detailed information about the components of a service, the relationships between them, and their mutual influence. It is also possible to import service topology data from external sources such as other monitoring systems.

Another important aspect is ease of administration. In complex and dynamically changing environments, it is important that the monitoring system keep pace with changes in system structure and the addition of new services. Operations Bridge includes a Monitoring Automation component that configures the systems within the monitoring perimeter automatically, using data from the service and resource models. Reconfiguration and modification of previously applied monitoring settings are supported as well.

Where administrators previously had to repeat the same settings for infrastructure components of the same type (for example, metrics on Windows, Linux, or Unix servers), which took considerable time and effort, threshold values for a metric can now be configured dynamically per service or per service component.

Application analytics

The traditional approach to monitoring assumes that it is known in advance which parameters to watch and which events to track. The growing complexity and pace of change of IT infrastructures force a search for other approaches, since it becomes ever harder to control every aspect of the system.

HP Operations Analytics collects and stores all application data: log files, telemetry, business and performance metrics, system events, and so on, and applies analytical mechanisms to identify trends and make forecasts. The solution converts the collected data to a single format and then, through contextual selection based on the log-file data, displays on a timeline what happened, when, and on which system. The product provides several forms of data visualization (for example, an interactive "heat map" and a topology of log-file relationships) and offers an assistant function for searching the entire data set collected over a specific period, either in the context of an event or via the search box. This helps the operator understand what led to a failure (or, when HP SHA data is used along with HP OA data, make the corresponding forecast), and to identify both the culprit and the root cause of the failure. HP Operations Analytics makes it possible to replay a picture of the service and its environment at the moment of failure and to isolate it in context and time.

Another analytical tool is HP Service Health Analyzer. HP SHA identifies abnormal behavior of monitored infrastructure elements in order to prevent a possible denial of service or a violation of the specified service-delivery parameters. The product applies special statistical algorithms to data organized around the topological service and resource model of HP BSM. With their help, a profile of normal values can be built for performance parameters collected from software and hardware platforms and from other BSM modules (for example, HP RUM and HP BPM) that characterize the state of the services. These profiles record typical parameter values, taking into account the day of the week and the time of day. SHA performs historical and statistical analysis of the accumulated data (to understand the nature of identified deviations) and also compares current values against the dynamic profile (baselining).
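The baselining idea above can be sketched in a few lines: profile typical values per (weekday, hour) slot, then flag samples far outside the profile. The class, the 3-sigma rule, and the sample data are illustrative assumptions, not the actual HP SHA algorithms.

```python
# Sketch of dynamic baselining: per-(weekday, hour) profiles of normal values,
# with a simple n-sigma anomaly test. Thresholds and data are assumptions.
from statistics import mean, pstdev

class BaselineProfile:
    def __init__(self):
        self._history = {}  # (weekday, hour) -> list of observed values

    def observe(self, weekday, hour, value):
        self._history.setdefault((weekday, hour), []).append(value)

    def is_anomalous(self, weekday, hour, value, n_sigma=3.0):
        samples = self._history.get((weekday, hour), [])
        if len(samples) < 2:
            return False  # not enough history for this slot to judge
        mu, sigma = mean(samples), pstdev(samples)
        return abs(value - mu) > n_sigma * max(sigma, 1e-9)

profile = BaselineProfile()
for v in (100, 102, 98, 101, 99):       # normal Monday-09:00 response times, ms
    profile.observe(0, 9, v)
print(profile.is_anomalous(0, 9, 100))  # False
print(profile.is_anomalous(0, 9, 500))  # True
```

The per-slot keying is what lets the profile distinguish, say, a busy Monday morning from a quiet Sunday night, as the text describes.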

Application performance control

When it comes to controlling application performance, the following components of the HP solution should be highlighted:
  • HP Real User Monitoring (HP RUM) - monitoring of real users' transactions;
  • HP Business Process Monitoring (HP BPM) - monitoring of application availability by emulating user actions;
  • HP Diagnostics - monitoring of how requests pass through the application.
HP RUM and HP BPM allow you to evaluate application availability from the end user's point of view.

HP RUM dissects network traffic, identifying real users' transactions in it. It can monitor the data exchange between application components: the client part, the application server, and the database. This makes it possible to track user activity and the processing times of various transactions, and to correlate user actions with business metrics. Using HP RUM, monitoring-service operators can receive immediate notification of service-availability problems and information about the errors users encounter.

HP BPM is an active monitoring tool that performs synthetic user transactions, indistinguishable from real ones to the monitored systems. HP BPM monitoring data is convenient for calculating a real SLA, since the "robot" performs identical checks at equal time intervals, providing constant quality control over the processing of typical (or most critical) requests. By configuring synthetic transactions to run from several points (for example, from different company offices), you can also assess service availability for different users, taking into account their location and communication channels. To emulate activity, HP BPM uses the Virtual User Generator (VuGen) tool, which is also used in the popular HP LoadRunner load-testing product. VuGen supports a huge range of protocols and technologies, so you can monitor the availability of almost any service and use a single set of scripts for both testing and monitoring.
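Computing a "real SLA" from such evenly spaced synthetic checks is straightforward arithmetic; a minimal sketch is below. The check results and the 99% target are invented for illustration, not values from any product.

```python
# Sketch: availability from the pass/fail results of synthetic transactions
# run at fixed intervals by the monitoring "robot". Data is illustrative.

def availability_pct(check_results):
    """Share of successful synthetic transactions, as a percentage."""
    if not check_results:
        return 0.0
    return 100.0 * sum(1 for ok in check_results if ok) / len(check_results)

def sla_met(check_results, target_pct=99.0):
    """Compare measured availability against an agreed SLA target."""
    return availability_pct(check_results) >= target_pct

# One failed check out of 200 runs -> 99.5 % availability.
results = [True] * 199 + [False]
print(availability_pct(results))  # 99.5
print(sla_met(results))           # True
```

Because the robot's checks are identical and evenly spaced, each result carries equal weight, which is what makes this simple ratio a defensible SLA figure.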
If the cause of failures or slowdowns of a service lies within technologies such as Java or .NET, HP Diagnostics will help.

The solution provides deep monitoring of Java, .NET, and Python on Windows, Linux, and Unix platforms. The product supports a variety of application servers (Tomcat, JBoss, WebLogic, Oracle, etc.), middleware, and databases. Specialized HP Diagnostics agents are installed on the application servers and collect data specific to each technology. For a Java application, for example, you can see which requests are executed, which methods are used, and how much time they take. The application structure is drawn automatically, making it clear how its components interact. HP Diagnostics makes it possible to track business transactions as they pass through complex applications, identify bottlenecks, and give experts the information they need to make decisions.

Distribution of HP solutions in

ABSTRACT

This document is the technical design for developing and implementing a network monitoring system for the public-access city data network of Gerkon LLC in Verkhnyaya Pyshma. The project includes a survey of existing network monitoring systems, an analysis of the current situation at the enterprise, and a justification of the choice of specific components for the network monitoring system.

The document contains a description of the design solutions and equipment specifications.

The results of the design, intended for implementing and operating the system, are:

§ a full description of all stages of the system's design, development, and implementation;

§ a System Administrator's Guide, including a description of the system's user interface.

This document presents completed design solutions and can be used to implement the system.

List of sheets of graphic documents

Table 1 - List of sheets of graphic documents

1. Network monitoring system structure - 220100 401000
2. Logical structure of the network - 220100 401000
3. Algorithm of the network monitoring and alerting kernel - 220100 401000
4. Structure of the network interfaces - 220100 401000
5. Structure of the system event logs - 220100 401000
6. Nagios interface - 220100 401000
7. Published network monitoring system structure - 220100 401000

List of abbreviations, symbols, and terms

Ethernet - a data transmission standard issued by the IEEE. It defines how data is sent to or received from a shared transmission medium, forms the lower transport layer, and is used by various higher-level protocols. Provides a data transfer rate of 10 Mbps.

Fast Ethernet - a 100 Mbps data transmission technology using the CSMA/CD method, like 10Base-T.

FDDI - Fiber Distributed Data Interface - a fiber-optic data transmission interface; a 100 Mbps data transmission technology using the token-ring method.

IEEE - Institute of Electrical and Electronics Engineers - an organization that develops and publishes standards.

LAN - Local Area Network - a local network. MAC address - Media Access Control address - the identification number of a network device, usually assigned by the manufacturer.

RFC - Request for Comments - a series of documents produced by the IETF containing descriptions of standards, specifications, etc.

TCP/IP - Transmission Control Protocol / Internet Protocol.

LAN - local computing network.

OS - operating system.

SW - software.

SCS - structured cabling system.

DBMS - database management system.

Trend - long-term statistics from which a general tendency (trend) can be plotted.

EVM - electronic computing machine (computer).

Introduction

The information infrastructure of a modern enterprise is a most complex conglomerate of networks and systems of different scales and kinds. To keep them working smoothly and efficiently, a corporate-scale management platform with integrated tools is necessary. Until recently, however, the structure of the network management industry itself prevented the creation of such systems: the "players" in this market sought leadership by releasing products of limited scope that used tools and technologies incompatible with other suppliers' systems.

Today the situation is changing for the better: products have appeared that aspire to universal management of the whole variety of corporate information resources, from desktop systems to mainframes and from local networks to network resources. At the same time there is a growing realization that management applications must be open to solutions from all suppliers.

The relevance of this work stems from the fact that, with the spread of personal computers and the creation of automated workstations, the importance of local area networks (LANs) has increased, and the diagnosis of LANs is the object of our study. The subject of the study is the main methods of organizing and conducting diagnostics of modern computer networks.

"LAN diagnostics" is the (continuous) process of analyzing the state of the information network. When a network device malfunctions, the fault is registered and its location and type are determined. A malfunction message is sent, and the device is switched off and replaced with a backup.

The network administrator, on whom diagnostic duties most often fall, should begin studying the features of the network already during its formation phase: knowing the network diagram and having a detailed description of the software configuration with all parameters and interfaces. Special network documentation systems are suitable for creating and storing this information. Using them, the system administrator will know in advance all the possible "hidden defects" and "bottlenecks" of the system, so that in an emergency it is clear whether the problem is related to the equipment or the software, whether a program has been damaged, or whether an operator's actions caused the error.

The network administrator should remember that, from the users' point of view, the quality of the network is determined by how well application software works in it. All other criteria, such as the number of data transmission errors, the degree of utilization of network resources, equipment performance, and so on, are secondary. A "good network" is one whose users do not notice how it works.

Company

Pre-graduation practice took place at Gerkon LLC, in the support department, in the role of system administrator. The company has offered Internet access services in the cities of Verkhnyaya Pyshma and Sredneuralsk over Ethernet and dial-up channels since 1993 and is one of the first Internet service providers in these cities. The terms of service are governed by the public offer and the service rules.

Scientific and Production Tasks of the Division

The support department solves the following range of tasks within the enterprise:

§ technical and technological organization of Internet access over dial-up and dedicated channels;

§ technical and technological organization of wireless Internet access;

§ allocation of disk space for storing and running websites (hosting);

§ support for mailboxes or a virtual mail server;

§ placement of client equipment at the provider's site (colocation);

§ rent of dedicated and virtual servers;

§ data backup;

§ deployment and support of corporate networks of private enterprises.

1. Network monitoring systems

Despite the many techniques and tools for detecting and troubleshooting faults in computer networks, the ground under network administrators' feet is still shaky. Computer networks increasingly include fiber-optic and wireless components, which render pointless the traditional technologies and tools designed for conventional copper cables. Moreover, at speeds above 100 Mbps, traditional approaches to diagnostics often stop working even when the transmission medium is ordinary copper cable. However, perhaps the most serious change in computer networking that administrators have had to face is the inevitable transition from Ethernet networks with a shared transmission medium to switched networks, in which individual servers or workstations often act as switched segments.

True, as technological transformations proceed, some old problems have resolved themselves. Coaxial cable, in which electrical faults were always harder to find than in twisted pair, is becoming a rarity in corporate environments. Token Ring networks, whose main problem was their incompatibility with Ethernet rather than any weakness of their own, are gradually being replaced by switched Ethernet networks. The numerous network-layer protocols that generated countless error messages, such as SNA, DECnet, and AppleTalk, are being replaced by IP. The IP protocol stack itself has become more stable and easier to support, as millions of clients and billions of Web pages on the Internet prove. Even die-hard opponents of Microsoft have to admit that connecting a new Windows client to the Internet is substantially simpler and more reliable than installing the formerly used third-party TCP/IP stacks and separate dial-up software.

However much modern technologies complicate troubleshooting and performance management, the situation could have been even harder had ATM technology become widespread at the PC level. It also helped that in the late 1990s, before gaining acceptance, several other high-speed data exchange technologies were rejected, including Token Ring at 100 Mbps, 100VG-AnyLAN, and advanced ARCnet networks. Finally, the very complex OSI protocol stack was rejected in the United States (although it was legalized by a number of European governments).

Consider some topical problems arising from network administrators of enterprises.

The hierarchical topology of computer networks, with Gigabit Ethernet backbone channels and dedicated switch ports of 10 or even 100 Mbps for individual client systems, has increased the maximum bandwidth potentially available to users by at least 10 to 20 times. Of course, most computer networks still have bottlenecks at the server or access-router level, so the bandwidth actually available to a single user is significantly less than 10 Mbps. Therefore, replacing a 10 Mbps hub port with a dedicated 100 Mbps switch port for an end node does not always yield a significant increase in speed. However, given that the cost of switches has recently fallen, that most enterprises already have Category 5 cable supporting 100 Mbps Ethernet, and that installed network cards can operate at 100 Mbps immediately after a system reboot, it becomes clear why the temptation to modernize is hard to resist. In a traditional LAN with a shared transmission medium, a protocol analyzer or monitor can examine all the traffic of a given network segment.

Fig. 1.1 - Traditional local network with a shared transmission medium and a protocol analyzer

Although the performance advantage of a switched network is sometimes barely noticeable, the spread of switched architectures has had catastrophic consequences for traditional diagnostic tools. In a heavily segmented network, a protocol analyzer can see only the unicast traffic of a single switch port, unlike networks of the former topology, where it could examine any packet in the collision domain. Under such conditions, traditional monitoring tools cannot collect statistics on all "conversations", because each pair of communicating endpoints uses, in essence, its own network.

Fig. 1.2 - Switched network

In a switched network, a protocol analyzer at a single point can "see" only one segment, unless the switch is able to mirror multiple ports simultaneously.

To retain control over heavily segmented networks, switch manufacturers offer various means of restoring full network "visibility", but there are many difficulties along the way. Switches shipping today usually support port "mirroring", in which the traffic of one port is duplicated to a previously unused port to which a monitor or analyzer is connected.

However, mirroring has a number of shortcomings. First, only one port is visible at any given moment, so problems affecting several ports at once are very hard to identify. Second, mirroring can degrade switch performance. Third, physical-layer failures are usually not reproduced on the mirror port, and sometimes even virtual LAN tags are lost. Finally, full-duplex Ethernet channels often cannot be mirrored completely.

A partial solution for analyzing aggregated traffic parameters is to use the monitoring capabilities of Mini-RMON agents, especially since they are built into each port of most Ethernet switches. Although Mini-RMON agents do not support the Capture object group of the RMON II specification, which provides full-featured protocol analysis, they nevertheless make it possible to assess resource utilization, error rates, and multicast volume.
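The utilization figure that RMON-style per-port statistics make possible is just a delta of octet counters over a polling interval. The sketch below illustrates the arithmetic with invented counter values; a real poller would read the counters over SNMP and would also have to handle counter wrap-around.

```python
# Sketch: port utilization from two polls of an octet counter, as enabled by
# per-port RMON-style statistics. Counter values here are illustrative.

def utilization_pct(octets_t0, octets_t1, interval_s, link_speed_bps):
    """Percentage of link capacity used between two counter polls.
    Assumes the counter did not wrap between the polls."""
    bits = (octets_t1 - octets_t0) * 8
    return 100.0 * bits / (link_speed_bps * interval_s)

# 30 MB transferred in 60 s on a 100 Mbit/s port.
print(utilization_pct(0, 30_000_000, 60, 100_000_000))  # 4.0
```

Polled per port, this single number already gives the aggregated view of a heavily segmented network that mirroring alone cannot provide.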

Some shortcomings of port mirroring can be overcome by installing "passive taps", produced for example by Shomiti. These devices are pre-installed Y-connectors that let a protocol analyzer or other device monitor the real signal rather than a regenerated one.

The next pressing issue is the problem of optical fiber. Computer network administrators typically use specialized optical diagnostic equipment only to solve problems with optical cables. Ordinary SNMP-based or command-line management software can identify problems on switches and routers with optical interfaces. Only a few network administrators ever face the need to diagnose SONET devices.

As for fiber-optic cables, there are far fewer causes of possible faults in them than in copper cable. Optical signals do not cause crosstalk, which arises when the signal in one conductor induces a signal in another, and which most complicates diagnostic equipment for copper cable. Optical cables are immune to electromagnetic noise and induced signals, so they need not be routed away from elevator motors and fluorescent lamps; all these variables can be excluded from the diagnostic scenario.

Signal strength, or optical power, is then practically the only variable worth measuring when troubleshooting optical networks. If you can determine the signal loss along the entire optical channel, you can identify almost any problem. Inexpensive add-on modules for copper cable testers allow optical measurements.
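Since optical power is essentially the only variable, the troubleshooting arithmetic reduces to a loss budget: total attenuation along the channel versus the transmitter and receiver specifications. A minimal sketch (all loss figures and function names here are illustrative assumptions, not vendor data):

```python
# Illustrative optical link loss budget; the per-element loss figures
# below are typical assumed values, not specifications.
def link_loss_db(fiber_km, connectors, splices,
                 fiber_loss_db_per_km=0.35,   # assumed single-mode fiber loss
                 connector_loss_db=0.5,       # assumed loss per connector
                 splice_loss_db=0.1):         # assumed loss per splice
    """Total expected attenuation over an optical channel, in dB."""
    return (fiber_km * fiber_loss_db_per_km
            + connectors * connector_loss_db
            + splices * splice_loss_db)

def within_budget(tx_power_dbm, rx_sensitivity_dbm, loss_db, margin_db=3.0):
    """True if received power stays above receiver sensitivity plus a safety margin."""
    return tx_power_dbm - loss_db >= rx_sensitivity_dbm + margin_db
```

Comparing the measured loss with such a budget quickly shows whether a link's attenuation is within the expected range or indicates a damaged segment.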

Enterprises that deploy a large optical infrastructure and maintain it themselves may need to purchase an optical time-domain reflectometer (OTDR), which performs the same function for optical fiber as a time-domain reflectometer (TDR) does for copper cable. The device acts like radar: it sends pulse signals down the cable and analyzes their reflections, detecting damage or other anomalies in the conductor, and then tells the specialist where along the cable to look for the source of the problem.

Although various suppliers of cable connectors have simplified fiber termination and splicing, a certain level of special skill is still required, and with a reasonable policy an enterprise with a developed optical infrastructure will have to train its own employees. No matter how well the cable plant is laid out, there is always the possibility of physical damage to the cable from some unexpected incident.

Diagnosing 802.11b wireless LANs can also present problems. Diagnosis itself is as simple as in the case of hub-based Ethernet networks, since the wireless transmission medium is shared among all holders of client radio devices. Sniffer Technologies was the first to offer a protocol-analysis solution for such networks at up to 11 Mbit/s, and most of the leading analyzer vendors have since introduced similar systems.

Unlike a wired Ethernet hub, the quality of wireless client connections is far from stable. The microwave radio signals used in all local wireless transmission variants are weak and sometimes unpredictable. Even small changes in antenna position can seriously affect connection quality. Wireless LAN access points come with a device management console, and this is often a more effective diagnostic method than visiting wireless clients and monitoring throughput and error conditions with a portable analyzer.

Although the data synchronization and device installation problems of personal digital assistant (PDA) users fit more naturally into the tasks of the technical support group than the responsibilities of the network administrator, it is easy to foresee that in the near future many such devices will evolve from individual accessories that complement PCs into full network clients.

As a rule, operators of corporate wireless networks will (or should) prevent the deployment of excessively open systems, in which any user within range who has a compatible interface card gains access to every information frame of the system. The Wired Equivalent Privacy (WEP) security protocol provides user authentication, integrity guarantees and data encryption; however, as usually happens, strong security complicates the analysis of network problems. In protected networks with WEP, diagnostic specialists must know the keys or passwords that protect information resources and control access to the system. With access to all packets, a protocol analyzer will be able to see all frame headers, but without the keys the information contained in them will be meaningless.

When diagnosing tunneled links, which many vendors call remote-access virtual private networks, the problems that arise are similar to those in analyzing encrypted wireless networks. If traffic does not pass through the tunnel, the cause of the fault is not easy to determine: it may be an authentication error, a failure at one of the endpoints, or congestion in the public Internet zone. Trying to use a protocol analyzer to identify high-level errors in tunneled traffic is a waste of effort, because the data content, as well as the application-, transport- and network-layer headers, are encrypted. In general, measures taken to raise the security of corporate networks usually make it harder to identify faults and performance problems. Firewalls, proxy servers and intrusion detection systems can complicate troubleshooting even further.

Thus, the problem of diagnosing computer networks remains relevant, and ultimately fault diagnosis is a management task. For most critical corporate systems, lengthy recovery work is unacceptable, so the only solution is to use backup devices and processes that can take over the necessary functions immediately after a failure. Some enterprises always keep an additional backup component in case the main one fails, that is, 2 x N components, where N is the number of main components needed for acceptable performance. If the mean time to repair (MTTR) is long enough, even greater redundancy may be required. The point is that troubleshooting time is hard to predict, and significant costs during an unpredictable recovery period are a sign of poor management.
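The relationship between MTTR and redundancy can be made concrete with the standard availability formula A = MTBF / (MTBF + MTTR) and a simple k-of-N redundancy calculation. A hedged sketch (function names and figures are illustrative, not from the text):

```python
from math import comb

def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: the fraction of time a component is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def redundant_availability(a_single, n_total, n_required):
    """Probability that at least n_required of n_total independent,
    identical components are up (binomial k-of-N model)."""
    return sum(comb(n_total, k) * a_single**k * (1 - a_single)**(n_total - k)
               for k in range(n_required, n_total + 1))
```

For example, duplicating a component with 99% availability (the 2 x N scheme, N = 1) raises the availability of the pair to 99.99%, which shows why a long or unpredictable MTTR pushes designs toward redundancy.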

For less important systems, redundancy may be economically unjustified; in that case it is advisable to invest in the most effective tools (and in personnel training) to speed up the diagnosis and elimination of faults in the enterprise as much as possible. In addition, support for certain systems can be entrusted to third-party specialists: by engaging them under contract, by using external data processing centers, or by turning to Application Service Providers (ASPs) or management service providers. Besides cost, the most significant factor in the decision to use third-party services is the competence level of in-house staff. Network administrators must decide whether a specific function is so closely tied to the specific tasks of the enterprise that a third-party specialist cannot be expected to perform it better than the company's own employees.

Almost immediately after the first corporate networks, whose reliability left much to be desired, were deployed, manufacturers and developers put forward the concept of "self-healing networks". Modern networks are certainly more reliable than those of the 1990s, but not because problems began to heal themselves. Eliminating software and hardware failures in modern networks still requires human intervention, and no fundamental change in this state of affairs is foreseen in the near future. Diagnostic methods and tools are fully in line with modern practice and technology, but they have not yet reached a level that would significantly save network administrators' time in their struggle against failures and performance deficits.

1.1 Diagnostic Software

Among software tools for diagnosing computer networks, one can single out network management systems (Network Management Systems): centralized software suites that collect data on the status of nodes and communication devices, as well as on the traffic circulating in the network. These systems not only monitor and analyze the network but also perform control actions in automatic or semi-automatic mode: enabling and disabling device ports, changing the parameters of the address tables of bridges, switches and routers, and so on. Examples of management systems are the popular HP OpenView, SunNet Manager and IBM NetView.

System management tools (System Management) perform functions similar to those of network management systems, but in relation to the software and hardware of the network's computers rather than communication equipment. At the same time, some functions of these two kinds of systems can overlap; for example, system management tools can perform simple network traffic analysis.

Expert systems. Systems of this type accumulate human knowledge about identifying the causes of abnormal network operation and possible ways of bringing the network back to a working state. Expert systems are often implemented as separate subsystems of various network monitoring and analysis tools: network management systems, protocol analyzers, network analyzers. The simplest variant of an expert system is a context-sensitive help system. More complex expert systems are so-called knowledge bases with elements of artificial intelligence. An example of such a system is the expert system built into the Cabletron Spectrum management system.

1.1.1 Protocol analyzers

When designing a new network or modernizing an old one, it is often necessary to quantify certain characteristics of the network, such as the intensity of data flows over network links, the delays arising at various stages of packet processing, response times to requests of one kind or another, the frequency of certain events, and other characteristics.

Different means can be used for these purposes, first of all the monitoring tools in the network management systems already discussed. Some network measurements can also be made by software meters built into the operating system; an example is the Windows Performance Monitor component. Even modern cable testers are able to capture packets and analyze their contents.

But the most advanced network research tool is the protocol analyzer. Protocol analysis consists of capturing the packets circulating in the network that implement a given network protocol and studying their contents. Based on the results of the analysis, one can make reasoned, balanced changes to any network component, optimize its performance, and troubleshoot problems. Obviously, to draw any conclusions about the effect of a change on the network, the protocols must be analyzed both before and after the change is made.

A protocol analyzer is either a standalone specialized device or a personal computer, usually a portable notebook-class machine, equipped with a special network card and the corresponding software. The network card and software used must match the network topology (ring, bus, star). The analyzer connects to the network just like an ordinary node. The difference is that the analyzer can receive all data packets transmitted over the network, while an ordinary station receives only those addressed to it. The analyzer software consists of a kernel that supports the operation of the network adapter and decodes the received data, plus additional program code that depends on the topology of the network under study. In addition, a number of protocol-specific decoding routines, for example for IPX, are supplied. Some analyzers may also include an expert system that can give the user recommendations on which experiments to run in a given situation, what particular measurement results may mean, and how to eliminate certain kinds of network faults.
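The kernel's decoding step can be illustrated on the simplest case, the fixed 14-byte Ethernet II header. A self-contained sketch, not the code of any particular analyzer:

```python
import struct

def decode_ethernet_header(frame: bytes):
    """Parse the 14-byte Ethernet II header: destination MAC,
    source MAC, and the EtherType field identifying the payload protocol."""
    if len(frame) < 14:
        raise ValueError("frame too short for an Ethernet header")
    # "!" = network byte order; 6s = 6-byte MAC; H = 16-bit EtherType.
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    fmt_mac = lambda mac: ":".join(f"{b:02x}" for b in mac)
    return {"dst": fmt_mac(dst), "src": fmt_mac(src), "ethertype": hex(ethertype)}
```

Higher-layer decoders would then be dispatched on the EtherType value (for example, 0x0800 for IPv4), which is exactly the layered structure the text describes.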

Despite the relative variety of protocol analyzers on the market, certain features are, in one way or another, inherent in all of them:

User interface. Most analyzers have a well-developed, friendly interface, usually based on Windows or Motif. This interface allows the user to display the results of traffic intensity analysis; obtain instantaneous and averaged statistical estimates of network performance; define certain events and critical situations and track their occurrence; and decode protocols of different levels, presenting the contents of packets in an understandable form.

Capture buffer. The buffers of different analyzers differ in size. The buffer may be located on the installed network card, or space for it may be allocated in the RAM of one of the network's computers. If the buffer is located on the network card, it is managed in hardware, which increases input speed but raises the cost of the analyzer. If the capture procedure is not fast enough, some information will be lost and analysis will be impossible. The size of the buffer determines the ability to analyze more or less representative samples of the captured data. But however large the capture buffer is, sooner or later it will fill up. In that case, either the capture stops or the filling starts again from the beginning of the buffer.
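The two buffer-full policies just described, stop capturing versus wrap around and overwrite from the beginning, can be sketched like this (an illustrative model, not a real analyzer's implementation):

```python
from collections import deque

class CaptureBuffer:
    """Fixed-capacity capture buffer with two full-buffer policies:
    wrap=True overwrites the oldest packets; wrap=False stops capturing."""
    def __init__(self, capacity, wrap=True):
        self.capacity = capacity
        self.wrap = wrap
        # deque with maxlen silently drops the oldest item when full.
        self.packets = deque(maxlen=capacity if wrap else None)
        self.stopped = False

    def capture(self, packet):
        """Returns True if the packet was stored, False if capture has stopped."""
        if self.stopped:
            return False
        if not self.wrap and len(self.packets) >= self.capacity:
            self.stopped = True   # stop-when-full policy
            return False
        self.packets.append(packet)
        return True
```

In wrap mode the buffer always holds the most recent packets, which is what an operator usually wants when waiting for an intermittent fault to recur.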

Filters. Filters allow you to control the data capture process and thereby save buffer space. Depending on the values of certain packet fields, specified as filtering conditions, a packet is either ignored or written to the capture buffer. Using filters significantly speeds up and simplifies analysis, since it eliminates viewing packets that are of no current interest.

Triggers (switches) are operator-defined conditions for starting and stopping the data capture process. Such conditions may be manual commands to start and stop capture, the time of day, the duration of the capture process, or the appearance of certain values in data frames. Triggers can be used together with filters, allowing more detailed and fine-grained analysis and more productive use of the limited capture buffer.

Search. Some protocol analyzers can automate the viewing of information in the buffer and find data in it by specified criteria. While filters check the input stream against filtering conditions, search functions are applied to data already accumulated in the buffer.
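The division of labor between filters (applied to the input stream at capture time) and search (applied to data already in the buffer) can be sketched as follows; the packet fields used here are hypothetical:

```python
def make_filter(field, value):
    """Capture-time filter: keep only packets whose field matches the value."""
    return lambda pkt: pkt.get(field) == value

def capture(stream, pkt_filter, buffer):
    """Write matching packets from the input stream into the capture buffer."""
    for pkt in stream:
        if pkt_filter(pkt):
            buffer.append(pkt)

def search(buffer, predicate):
    """Search runs over already-captured data, unlike the capture-time filter."""
    return [pkt for pkt in buffer if predicate(pkt)]
```

So a filter decides what ever enters the buffer, while search merely re-examines what is already there; the two are complementary, exactly as the text states.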

The methodology for conducting analysis can be presented in the form of the following six stages:

Capture data.

View captured data.

Data analysis.

Error search. (Most analyzers make this work easier by classifying error types and identifying the station from which an erroneous packet came.)

Performance analysis. The network bandwidth utilization factor or the average response time to a request is calculated.

Detailed study of particular sections of the network. The content of this stage is refined as the analysis proceeds.

Typically, the protocol analysis process takes relatively little time: one or two business days.

Most modern analyzers can analyze several wide-area network protocols, such as X.25, PPP, SLIP, SDLC/SNA, Frame Relay, SMDS, ISDN, and bridge/router protocols (3Com, Cisco, Bay Networks and others). Such analyzers can measure various protocol parameters, analyze traffic on the network, conversion between local and wide-area networks, and delays on routers during these conversions, and so on. More advanced devices provide for modeling and decoding of WAN protocols, "stress" testing, measurement of maximum throughput, and testing of quality of service. For the sake of universality, almost all WAN protocol analyzers also implement LAN testing functions and all the basic interfaces. Some devices can analyze telephony protocols. And the most modern models can decode and present all seven OSI layers in a convenient form. The advent of ATM led manufacturers to equip their analyzers for testing these networks as well. Such devices can fully test ATM networks at E-1/E-3 rates with support for monitoring and modeling. The analyzer's set of service functions is very important. Some of them, such as the ability to control the device remotely, are simply indispensable.

Thus, modern WAN/LAN/ATM protocol analyzers can detect errors in the configuration of routers and bridges; establish the type of traffic sent over the wide-area network; determine the range of speeds used and optimize the ratio between throughput and the number of channels; localize the source of incorrect traffic; test serial interfaces and perform full ATM testing; carry out full monitoring and decoding of the main protocols on any channel; and analyze real-time statistics, including the analysis of LAN traffic across wide-area networks.

1.1.2 Monitoring Protocols

The SNMP protocol (Simple Network Management Protocol) is a communication-network management protocol based on the TCP/IP architecture.

Based on the TMN concept, in 1980-1990 various standardization bodies developed a number of management protocols covering different ranges of TMN functions. SNMP is one such type of management protocol. The SNMP protocol was designed for checking the operation of network routers and bridges. Later its scope grew to cover other network devices as well, such as hubs, gateways, terminal servers, LAN Manager servers, Windows NT machines, and so on. In addition, the protocol allows changes to be made to the operation of these devices.

This technology is intended to provide control and monitoring of devices and applications on a communication network by exchanging management information between agents residing on network devices and managers residing at control stations. SNMP defines a network as a set of network management stations and network elements (hosts, gateways and routers, terminal servers) that together provide administrative links between the network management stations and the network agents.

When using SNMP, there are managed and managing systems. A managed system includes a component called an agent, which sends reports to the managing system. In essence, SNMP agents pass management information to the managing systems as variables (such as "free memory", "system name", "number of running processes").

In the SNMP protocol, the agent is a processing element that gives managers at network management stations access to the values of MIB variables, and thereby enables them to implement device monitoring and control functions.

A software agent is a resident program that performs management functions and collects statistics for transfer to the network device's information base.

A hardware agent is built-in hardware (with a processor and memory) in which software agents are stored.

Variables accessible via SNMP are organized into a hierarchy. These hierarchies and other metadata (such as a variable's type and description) are described by Management Information Bases (MIBs).
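The idea of a MIB as a hierarchy of named variables addressed by dotted object identifiers can be sketched with a toy table. The three OIDs shown are real MIB-II names, but the lookup code itself is only an illustration, not a MIB compiler:

```python
# Toy MIB fragment: dotted OIDs mapped to names and metadata.
MIB = {
    "1.3.6.1.2.1.1.1": {"name": "sysDescr",  "type": "string"},
    "1.3.6.1.2.1.1.3": {"name": "sysUpTime", "type": "timeticks"},
    "1.3.6.1.2.1.2.2": {"name": "ifTable",   "type": "table"},
}

def resolve(oid):
    """Map a dotted OID to its symbolic name, or 'unknown'."""
    return MIB.get(oid, {}).get("name", "unknown")

def subtree(prefix):
    """All OIDs under a prefix: how a manager walks part of the hierarchy."""
    return sorted(o for o in MIB if o == prefix or o.startswith(prefix + "."))
```

Walking a subtree by OID prefix is the essence of how a manager enumerates all variables of a group without knowing them in advance.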

Today there are several standards for management information bases. The main ones are the MIB-I and MIB-II standards, as well as the RMON MIB version for remote management. In addition, there are standards for special MIBs for specific device types (for example, MIBs for hubs or MIBs for modems), as well as private MIBs from specific equipment manufacturers.

The original MIB-I specification defined only operations for reading variable values. Operations for changing or setting object values are part of the MIB-II specification.

The MIB-I version (RFC 1156) defines 114 objects, divided into 8 groups:

System - general data about the device (for example, vendor identifier, time of last system initialization).

Interfaces - describes the parameters of the device's network interfaces (for example, their number, types, exchange rates, maximum packet size).

Address Translation Table - describes the mapping between network and physical addresses (for example, via the ARP protocol).

Internet Protocol - data related to the IP protocol (addresses of IP gateways, hosts, statistics on IP packets).

ICMP - data related to the ICMP control-message protocol.

TCP - data related to the TCP protocol (for example, on TCP connections).

UDP - data related to the UDP protocol (the number of transmitted, received and erroneous UDP datagrams).

EGP - data related to the Exterior Gateway Protocol used on the Internet to exchange routing information (the number of messages received with and without errors).

From this list of variable groups it is clear that the MIB-I standard was developed with a strict orientation toward managing routers supporting the TCP/IP protocol stack.

The MIB-II version (RFC 1213), adopted in 1992, significantly expanded the set of standard objects (to 185), and the number of groups increased to 10.

RMON Agents

The newest addition to SNMP functionality is the RMON specification, which provides remote interaction with the MIB base.

The RMON standard appeared in November 1991, when the Internet Engineering Task Force released RFC 1271, "Remote Network Monitoring Management Information Base". This document described RMON for Ethernet networks. RMON is a protocol for monitoring computer networks, an extension of SNMP, and, like SNMP, is based on collecting and analyzing information about the nature of the traffic transmitted over the network. As in SNMP, information is collected by hardware and software agents whose data is delivered to the computer where the network management application is installed. RMON differs from its predecessor primarily in the nature of the information collected: whereas in SNMP this information characterizes only events occurring on the device where the agent is installed, RMON requires that the collected data characterize traffic between network devices.

Before RMON appeared, the SNMP protocol could not be used remotely; it allowed only local device management. The RMON MIB has an improved set of properties for remote management, since it contains aggregated information about the device, which avoids transferring large volumes of information across the network. RMON MIB objects include additional packet error counters, more flexible means of trend and statistical analysis, more powerful filtering tools for capturing and analyzing individual packets, and more sophisticated alarm conditions. RMON MIB agents are more intelligent than MIB-I or MIB-II agents and perform a significant part of the device-information processing that managers used to perform. These agents can reside inside various communication devices, and can also be implemented as separate software modules running on universal PCs and laptops (an example is Novell's LANalyzer).

The intelligence of RMON agents allows them to perform simple fault-diagnosis and failure-prevention actions. For example, RMON technology can be used to collect data on normal network operation (so-called baselining) and then raise warning signals when the network's operating mode deviates from the baseline, which may indicate, in particular, incompletely functioning equipment. By pulling together the information obtained from RMON agents, the management application can help the network administrator (located, say, thousands of kilometers from the network segment being analyzed) localize a fault and work out the best plan of action to eliminate it.
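Baselining and threshold alarms can be sketched as follows: characterize normal operation from historical samples, then flag deviations. The three-sigma rule used here is an assumption for illustration, not something RMON prescribes:

```python
from statistics import mean, stdev

def baseline(samples):
    """Characterize normal operation from historical utilization samples."""
    return mean(samples), stdev(samples)

def alarms(samples, base_mean, base_std, k=3.0):
    """Flag samples deviating more than k standard deviations from the baseline
    (k = 3 is an assumed, conventional choice)."""
    return [s for s in samples if abs(s - base_mean) > k * base_std]
```

A deliberately loose threshold like this also addresses the "alarms over trifles" problem discussed later: only genuinely abnormal readings generate traffic toward the manager.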

RMON information is collected by hardware and software probes connected directly to the network. To collect and perform primary analysis of the data, a probe must have sufficient computing resources and RAM. Probes of three types are currently on the market: embedded probes, computer-based probes, and stand-alone probes. A product is considered to support RMON if it implements at least one RMON group. Naturally, the more RMON data groups a product implements, the more expensive it is, on the one hand, and the more complete the picture of network operation it provides, on the other.

Embedded probes are extension modules for network devices. Such modules are produced by many manufacturers, in particular by large companies such as 3Com, Cabletron, Bay Networks and Cisco. (Incidentally, 3Com and Bay Networks recently acquired Axon and ARMON, recognized leaders in the development and production of RMON management tools. Such interest in this technology from the largest network equipment manufacturers shows once again how necessary remote monitoring is for users.) The most natural solution is to embed RMON modules in hubs, since it is from observing these devices that one can form a picture of segment operation. The advantage of such probes is obvious: they provide information on all the main RMON data groups at a relatively low price. The disadvantage is, first of all, their limited performance, which shows up in particular in the fact that embedded probes often do not support all the RMON data groups. Not long ago, 3Com announced its intention to release RMON-capable drivers for EtherLink III and Fast Ethernet network adapters. As a result, it will be possible to collect and analyze RMON data directly at network workstations.

Computer-based probes are simply network-attached computers with an RMON software agent installed. Such probes (which include, for example, Network General's Cornerstone Agent 2.5 product) have higher performance than embedded probes and, as a rule, support all the RMON data groups. They are more expensive than embedded probes but much cheaper than stand-alone probes. In addition, computer-based probes are rather large, which can sometimes limit their applicability.

Stand-alone probes have the highest performance; as is easy to understand, they are also the most expensive of all the products described. As a rule, a stand-alone probe is built around a processor (i486-class or a RISC processor) equipped with sufficient RAM and a network adapter. The leaders in this market sector are Frontier and Hewlett-Packard. Probes of this type are small and very mobile: they are easy to connect to and disconnect from the network. When managing a global-scale network this is not a particularly important property, but if RMON tools are used to analyze a medium-sized corporate network, then (given the high cost of the devices) the mobility of the probes can play a very positive role.

The RMON object is assigned the number 16 in the MIB object set, and in accordance with RFC 1271 the RMON object itself consists of ten data groups.

Statistics - current accumulated statistics on packet characteristics, the number of collisions, and so on.

History - statistical data saved at certain intervals for later analysis of trends in their changes.

Alarms - threshold values of statistical indicators; when one is exceeded, the RMON agent sends a message to the manager. Alarms let the user define a number of threshold levels (these thresholds can refer to very different things: any parameter from the Statistics group, the amplitude or rate of its change, and much more), on exceeding which an alarm signal is generated. The user can also determine the conditions under which exceeding a threshold should actually produce an alarm; this avoids generating signals "over trifles", which is bad, first, because nobody pays attention to a constantly burning red light, and second, because transmitting unnecessary alarms over the network places an excessive load on the communication lines. An alarm is usually passed to the Events group, which determines what to do with it next.

Hosts - data on the network's hosts, including their MAC addresses.

HostTopN - a table of the busiest network hosts. The top-N hosts table (HostTopN) contains a list of the N hosts with the maximum value of a given statistical parameter over a specified interval. For example, you can request a list of the 10 hosts with the greatest number of errors over the past 24 hours. This list is compiled by the agent itself, and the management application receives only the addresses of these hosts and the values of the corresponding statistical parameters. It is clear to what extent this approach saves network resources.

TrafficMatrix - statistics on traffic intensity between every pair of network hosts, arranged as a matrix. The rows of this matrix are numbered according to the MAC addresses of the source stations, and the columns according to the addresses of the receiving stations. The matrix elements characterize the traffic intensity between the corresponding stations and the number of errors. After analyzing such a matrix, the user can easily find out which pairs of stations generate the most intensive traffic. Again, this matrix is formed by the agent itself, so there is no need to transfer large volumes of data to the central computer responsible for managing the network.

Filter - packet filtering conditions. The criteria by which packets are filtered can be quite varied; for example, you can require that all packets shorter than some specified length be filtered out as erroneous. One can say that setting up a filter corresponds to organizing a channel for packet transmission. Where this channel leads is defined by the user: for example, all erroneous packets can be intercepted and sent to an appropriate buffer. In addition, the appearance of a packet matching an installed filter can be treated as an event to which the system should react in a predefined way.

PacketCapture - packet capture conditions. The packet capture group includes capture buffers to which packets are sent whose attributes satisfy the conditions formulated in the Filter group. It is possible to capture not the entire packet but, say, only its first few dozen bytes. The contents of the capture buffers can later be analyzed with various software tools to reveal a number of very useful characteristics of the network. By rebuilding the filters for particular attributes, different parameters of network operation can be characterized.

Event - conditions for registering and generating events. The Events group determines when an alarm should be sent to the management application, when packets should be intercepted, and in general how to react to various events occurring in the network, for example to the exceeding of the threshold values defined in the Alarms group: whether to notify the management application, or simply log the event and continue working. Events need not be connected with alarms; for example, sending a packet to the capture buffer is also an event.
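Two of the groups above, HostTopN and TrafficMatrix, lend themselves to a short sketch of the agent-side computation that spares the network from transferring full tables (the MAC addresses and counters here are hypothetical):

```python
from collections import defaultdict

def host_top_n(error_counts, n):
    """HostTopN: the agent returns only the N worst hosts, not the full table."""
    return sorted(error_counts.items(), key=lambda kv: kv[1], reverse=True)[:n]

def traffic_matrix(packets):
    """TrafficMatrix: rows are source MACs, columns destination MACs,
    cells are packet counts between the pair."""
    matrix = defaultdict(lambda: defaultdict(int))
    for src, dst in packets:
        matrix[src][dst] += 1
    return matrix

def busiest_pair(matrix):
    """The station pair generating the most intensive traffic."""
    return max(((s, d, c) for s, row in matrix.items() for d, c in row.items()),
               key=lambda t: t[2])
```

In both cases the heavy data stays on the agent; only the small sorted result or the pair of interest would cross the network to the management application.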

These groups are numbered in the order given, so, for example, the Hosts group has the numeric name 1.3.6.1.2.1.16.4.

The tenth group consists of special objects of the Token Ring protocol.

In all, the RMON MIB standard defines about 200 objects in 10 groups, recorded in two documents: RFC 1271 for Ethernet networks and RFC 1513 for Token Ring networks.

A distinctive feature of the RMON MIB standard is its independence from the network-layer protocol (unlike the MIB-I and MIB-II standards, which are oriented toward the TCP/IP protocols). It is therefore convenient to use in heterogeneous environments with different network-layer protocols.

1.2 Popular Network Management Systems

A network management system (Network Management System) is hardware and/or software for monitoring and managing network nodes. The software of a network management system consists of agents that reside on network devices and pass information to the network management platform. The method of information exchange between the managing applications and the agents on the devices is defined by protocols.

Network management systems must have a number of qualities:

true distribution in accordance with the client/server concept,

scalability,

openness, allowing the system to cope with heterogeneous equipment - from desktop computers to mainframes.

The first two properties are closely related. Good scalability is achieved through distribution of the management system. Distribution means that the system may include multiple servers and clients. Servers (managers) collect data on the current network state from agents (SNMP, CMIP or RMON) embedded in the network equipment and accumulate it in their database. Clients are graphical consoles operated by network administrators. The client software of the management system accepts a request to perform some action from the administrator (for example, building a detailed map of part of the network) and turns to the server for the necessary information. If the server has the necessary information, it immediately transmits it to the client; if not, it tries to collect it from the agents.

Early versions of management systems combined all functions in one computer operated by the administrator. For small networks or networks with a small amount of managed equipment such a structure is quite satisfactory, but with a large amount of managed equipment a single computer receiving information from all network devices becomes a bottleneck: the network cannot cope with the large data stream, and the computer itself has no time to process it. In addition, a large network is managed by more than one administrator; therefore, besides several servers, a large network must have several consoles at which network administrators work, and each console must present the specific information matching the current needs of a particular administrator.

Support for heterogeneous equipment is more a desired than an actual property of today's management systems. The four most popular network management products are Spectrum by Cabletron Systems, OpenView by Hewlett-Packard, NetView by IBM and Solstice by SunSoft, a division of Sun Microsystems. Three of the four companies themselves produce communication equipment. Naturally, the Spectrum system best manages Cabletron equipment, OpenView - Hewlett-Packard equipment, and NetView - IBM equipment.

When building a map of a network consisting of equipment from other manufacturers, these systems begin to make mistakes, taking some devices for others, and when managing those devices they support only their basic functions. Many useful additional capabilities that distinguish a particular device from the rest the management system simply does not understand and therefore cannot use.

To correct this shortcoming, management system developers include support not only for the standard MIB-I, MIB-II and RMON MIB bases, but also for numerous private MIBs of manufacturing companies. The leader in this area is the Spectrum system, which supports about 1000 MIB bases of various manufacturers.

Another way to better support specific equipment is to run, on top of some management platform, an application from the company that produces this equipment. Leading manufacturers of communication equipment have developed and supply highly complex and multifunctional management systems for their own equipment. The best-known systems of this class are Optivity by Bay Networks, CiscoWorks by Cisco Systems and Transcend by 3Com. The Optivity system, for example, allows you to monitor and manage networks consisting of Bay Networks routers, switches and hubs, fully using all their capabilities and properties; equipment of other manufacturers is supported only at the level of basic management functions. Optivity runs on the OpenView platform of Hewlett-Packard and on SunNet Manager (the predecessor of Solstice) of SunSoft. However, working with several systems such as Optivity on top of a management platform is too complicated, and requires that the computers on which all this runs have very powerful processors and large amounts of RAM.

However, if the network is dominated by equipment from a single manufacturer, the availability of that manufacturer's management applications for some popular management platform allows network administrators to successfully solve many tasks. Therefore, developers of management platforms ship tools with them that simplify application development, and the availability of such applications and their number are considered a very important factor when choosing a management platform.

The openness of a management platform also depends on the storage format of the collected network state data. Most leading platforms allow data to be stored in commercial databases such as Oracle, Ingres or Informix. Using a universal DBMS reduces the speed of the management system compared with storing data in operating system files, but it allows the data to be processed by any application able to work with these DBMSs.

2. Statement of the task

In accordance with the current situation, it was decided to develop and implement a network monitoring system that would solve all the above problems.

2.1 Technical task

Develop and implement a monitoring system that can track both switches and routers of different manufacturers and servers of various platforms. The focus is on using open protocols and systems, making maximum use of ready-made free software developments.

2.2 Refined technical task

In the course of further formulation of the problem and study of the subject area, taking into account the economic and time constraints, the technical assignment was refined:

The system must meet the following requirements:

§ minimum hardware requirements;

§ open source codes of all components of the complex;

§ expandability and scalability of the system;

§ standard means providing diagnostic information;

§ availability of detailed documentation for all software used;

§ the ability to work with the equipment of various manufacturers.

3. Proposed system

3.1 Selecting a network monitoring system

In accordance with the refined technical assignment, the Nagios system is suitable as the core of the network monitoring system, since it has the following qualities:

§ there are means of generating diagrams;

§ there are means of generating reports;

§ there is a possibility of logical grouping;

§ there is a built-in system for recording trends and predicting them;

§ new devices can be added automatically (autodiscovery) with the help of an official plugin;

§ extended host monitoring is possible using an agent;

§ SNMP protocol support via a plugin;

§ Syslog protocol support via a plugin;

§ support for external scripts;

§ support for self-written plugins and the possibility of their rapid and simple creation;

§ built-in triggers and events;

§ a full-featured web interface;

§ the possibility of distributed monitoring;

§ inventory via a plugin;

§ the ability to store data both in files and in SQL databases, which is very important as volumes grow;

§ GPL license, and therefore a free basic supply, support and open source code of the system kernel and accompanying components;

§ dynamic and customizable maps;

§ access control;

§ a built-in language for describing hosts, services and checks;

§ the ability to track users.

The Zabbix network monitoring system has a similar set of characteristics, but at the time of implementation it had much less functionality than Nagios and carried beta status. In addition, a study of thematic forums and news feeds showed that Nagios was the most widespread among users, which means the availability of user-written documentation and detailed descriptions of the trickiest points of the setup.

Nagios allows you to monitor such network services as SMTP, Telnet, SSH, HTTP, DNS, POP3, IMAP, NNTP and many others. In addition, you can watch the use of server resources, such as disk space consumption, free memory and processor load. It is possible to create your own event handlers. These handlers are executed when events initiated by service or host checks occur. Such an approach makes it possible to react actively to events and to try to resolve arising problems automatically. For example, you can create an event handler that will independently restart a hung service. Another advantage of the Nagios monitoring system is the ability to manage it remotely via the WAP interface of a mobile phone. Using the concept of "parent" hosts, it is easy to describe the hierarchy and relationships between hosts. This approach is extremely useful for large networks, since it allows complex diagnostics: it helps to distinguish hosts that are down from hosts that are unreachable at the moment because of failures in intermediate links. Nagios can build work graphs of the observed systems and maps of the monitored network infrastructure.
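A minimal sketch of such an event handler, written as a shell function. In a real installation Nagios passes the state via macros such as $SERVICESTATE$ and $SERVICESTATETYPE$ in the event_handler command definition; the service name "apache2" here is an assumption, not taken from this work.

```shell
# handle_service_event STATE STATE_TYPE - restart the service only on a
# confirmed (HARD) CRITICAL state, per the usual event handler pattern.
handle_service_event() {
    state="$1"       # OK / WARNING / CRITICAL / UNKNOWN
    state_type="$2"  # SOFT / HARD
    if [ "$state" = "CRITICAL" ] && [ "$state_type" = "HARD" ]; then
        echo "restarting apache2"
        # /etc/init.d/apache2 restart   # the real action would go here
    fi
}

handle_service_event CRITICAL HARD   # prints: restarting apache2
handle_service_event OK HARD         # prints nothing
```

Acting only on HARD states avoids restarting a service because of a single transient check failure.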

From his own practice with Nagios the author can give an example showing how useful it is. On the external network interface of the firewall, packet loss began to appear with a periodicity of several hours; because of the malfunction, up to 20 percent of passing traffic was lost. A minute or so later the interface again began to work as it should. Because of the floating nature of this problem, for several weeks it was impossible to find out why short-term failures occurred periodically when working with the Internet. Without Nagios, tracking down the malfunction would have taken a long time.

Many administrators know Nagios's ancestor well under the name NetSaint. Although the NetSaint project site still works regularly, new development is based on the Nagios source code. Therefore, everyone is advised to migrate to Nagios without haste.

The documentation supplied with Nagios states that it will also work stably with many other UNIX-like systems. To display the Nagios web interface we will need Apache. You are free to use any other web server, but in this paper Apache will be considered, as the most common web server on UNIX platforms. You can install the monitoring system without a web interface at all, but we will not do that, because it significantly reduces usability.

4. Software Development

An ordinary IBM-compatible computer can be used as the hardware of the implemented system; however, taking into account a possible further growth of the load, the requirements of reliability and fault tolerance, and also the requirements of the state communications supervision authority, certified server equipment from Aquarius was purchased.

The Debian operating system, based on the Linux kernel, is actively used on the existing network; there is extensive experience in using this system, and most of the operations for managing and configuring it and ensuring the stability of its work have been worked out. In addition, this OS is distributed under the GPL license, which means it is free and open source, which corresponds to the refined technical assignment for the design of the network monitoring system. GNU/Linux (also written in some languages as "GNU+Linux", "GNU-Linux", etc.) is the general name of UNIX-like operating systems based on the same kernel together with the libraries and system programs developed within the GNU project. GNU/Linux runs on PC-compatible systems of the Intel x86 family, as well as on IA-64, AMD64, PowerPC, ARM and many others.

The GNU/Linux operating system also includes programs that complement it, and application programs that make it a full-fledged multifunctional operating environment.

Unlike most other operating systems, GNU/Linux does not have a single "official" configuration. Instead, GNU/Linux is supplied in a large number of so-called distributions, in which GNU programs are combined with the Linux kernel and other programs. The best-known GNU/Linux distributions are Ubuntu, Debian GNU/Linux, Red Hat, Fedora, Mandriva, SUSE, Gentoo, Slackware and Arch Linux. Russian distributions include ALT Linux and ASPLinux.

Unlike Microsoft Windows (Windows NT), Mac OS (Mac OS X) and commercial UNIX-like systems, GNU/Linux has no geographical development center. There is no organization that owns this system; there is not even a single coordination center. Programs for Linux are the result of the work of thousands of projects. Some of these projects are centralized, some are concentrated in companies, and many unite hackers from all over the world who know each other only by correspondence. Anyone can create their own project or join an existing one and, if successful, the results of the work will become known to millions of users. Users take part in testing free programs and communicate with the developers directly, which makes it possible to quickly find and correct errors and implement new features.

GNU/Linux continues the history of UNIX systems: it is UNIX-compatible, but is based on its own source code.

It is precisely this flexible and dynamic development system, impossible for closed-code projects, that determines the exceptional economic efficiency of GNU/Linux. The low cost of free development, well-established testing and distribution mechanisms, the involvement of people from different countries with different visions of problems, and the protection of the GPL license - all this has caused the success of free programs.

Of course, such high development efficiency could not fail to interest large firms, which began to open their own projects. This is how Mozilla (Netscape, AOL), OpenOffice.org (Sun), Firebird - the free clone of InterBase (Borland) - and SAP DB (SAP) appeared. IBM contributed to the porting of GNU/Linux to its mainframes.

On the other hand, open code significantly reduces the cost of developing closed systems for GNU/Linux and lowers the price of the solution for the user. That is why GNU/Linux has become a platform often recommended for products such as the Oracle, DB2, Informix and Sybase DBMSs, SAP R/3 and Domino.

The GNU/Linux community keeps in touch through Linux user groups.

Most users install GNU/Linux from distributions. A distribution is not just a set of programs, but a number of solutions for different user tasks, united by common systems for installing, managing and updating packages, configuration and support.

The most common distributions in the world are:

§ Ubuntu - a distribution that quickly gained popularity, focused on ease of learning and use.

§ openSUSE - the freely supported version of the SUSE distribution owned by Novell; convenient to configure and maintain thanks to the YaST utility.

§ Fedora - supported by the community and the Red Hat corporation; it precedes the releases of the commercial version, RHEL.

§ Debian GNU/Linux - an international distribution developed by an extensive community of developers for non-commercial purposes. It served as the basis for many other distributions and takes a strict approach to the inclusion of non-free software.

§ Mandriva - a French-Brazilian distribution, the union of the former Mandrake and Conectiva.

§ Slackware - one of the oldest distributions, distinguished by a conservative approach to development and use.

§ Gentoo - a distribution built from source code. It allows very flexible tuning of the final system and optimization of performance, which is why it often calls itself a meta-distribution. It is aimed at experts and experienced users.

§ Arch Linux - aimed at the latest versions of programs and constantly updated, supporting both binary installation and installation from source code. Built on the KISS philosophy of simplicity, this distribution is focused on competent users who want all the power and customizability of Linux without sacrificing maintenance time.

In addition to those listed, there are many other distributions, both based on the ones listed and created from scratch, often intended to perform a limited number of tasks.

Each of them has its own concept, its own set of packages, its own advantages and disadvantages. None can satisfy all users, and therefore other firms and associations of programmers offering their own solutions, distributions and services coexist comfortably alongside the leaders. There are many LiveCDs based on GNU/Linux, for example Knoppix. A LiveCD allows you to run GNU/Linux directly from a CD, without installing it on a hard disk.

For those who want to get a thorough grasp of GNU/Linux, any of the distributions will do, but quite often so-called source-based distributions are used for this purpose, that is, distributions that assume independent assembly of all (or part of) the components from source code, such as LFS, Gentoo, Arch Linux or CRUX.

4.1 Installing the system kernel

Nagios can be installed in two ways - from source code and from pre-built packages. Both methods have advantages and disadvantages; let us consider them.

Pros of installing the package from source code:

§ the possibility of detailed system configuration;

§ a high degree of application optimization;

§ the most complete understanding of how the program works.

Cons of installing the package from source code:

§ additional time is required to build the package, often exceeding the time needed to configure and tune it;

§ the inability to remove the package together with its configuration files;

§ the inability to update the package together with its configuration files;

§ the impossibility of centralized management of installed applications.

When installing Nagios from a pre-built package, the advantages of the source-based installation method become shortcomings, and vice versa. However, as practice has shown, the pre-built package satisfies all requirements for the system, and there is no point in spending time on building the package manually.

Since both installation methods were originally tested, let us consider each of them in more detail.

4.1.1 Description of installing the kernel from source code

Required packages.

Make sure that the following packages are installed before starting the Nagios deployment. A detailed consideration of their installation is beyond the scope of this work.

· Apache 2

· PHP

· GCC compiler and developer libraries

· GD developer libraries

You can use the apt-get utility (or, better, aptitude) to install them as follows:

% sudo apt-get install apache2

% sudo apt-get install libapache2-mod-php5

% sudo apt-get install build-essential

% sudo apt-get install libgd2-dev

1) Creating a new unprivileged user account

A new account is created to run the Nagios service. One could also run it under the superuser account, but this would create a serious threat to system security.

Become a superuser:

Create a new nagios user account and give it a password:

# /usr/sbin/useradd -m -s /bin/bash nagios

# passwd nagios

Create a nagios group and add the nagios user to it:

# /usr/sbin/groupadd nagios

# /usr/sbin/usermod -g nagios nagios

Create a nagcmd group to allow external commands to be passed through the web interface. Add the nagios and Apache users to this group:

# /usr/sbin/groupadd nagcmd

# /usr/sbin/usermod -a -G nagcmd nagios

# /usr/sbin/usermod -a -G nagcmd www-data

2) Downloading Nagios and its plugins

Create a directory for storing downloaded files:

# mkdir ~/downloads

# cd ~/downloads

Download the compressed source code archives of Nagios and its plugins using wget.

3) Compile and install Nagios

Unpacking compressed source codes Nagios:

# cd ~/downloads

# tar xzf nagios-3.2.0.tar.gz

# cd nagios-3.2.0

Run the Nagios configuration script by passing it the name of the group that we created earlier:

# ./configure --with-command-group=nagcmd

Full list of configuration script parameters:

# ./configure --help

`configure" Configures This Package to Adapt to Many Kinds of Systems: ./configure ... ... Assign Environment Variables (EG, CC, CFLAGS ...), Specify them as \u003d value. See Below for Descriptions of Some Of The Useful Variables.For the Options Are Specified in Brackets.:

h, --Help Display This Help and Exit

Help \u003d Short Display Options Specific To This Package

Help \u003d Recursive Display The Short Help of All The Included Packages

V, --Version Display Version Information and Exit

q, --Quiet, --silent Do Not Print` Checking ... "Messages

Cache-file \u003d File Cache Test Results in File

C, --Config-Cache Alias \u200b\u200bfor `--cache-file \u003d config.cache"

n, --No-Create Do Not Create Output Files

Srcdir \u003d Dir Find The Sources in Dir Directories:

Prefix \u003d Prefix Install Architecture-Independent Files in Prefix

Exec-Prefix \u003d EPrefix Install Architecture-Dependent Files in EPrefixDefault, `make install" Will Install All the Files in `/ usr / local / nagios / bin", `/ usr / local / nagios / lib" etc. You can Specify An Installation Prefix Other Than` / USR / Local / Nagios "Using` --Prefix ", for instance` --prefix \u003d $ home ".Better Control, Use The Options Below.Tuning of the Installation Directories:

Bindir \u003d Dir User Executables

Sbindir \u003d Dir System Admin Executables

Libexecdir \u003d Dir Program Executables

Datadir \u003d Dir Read-Only Architecture-Independent Data

Sysconfdir \u003d Dir Read-Only Single-Machine Data

SharedStateDir \u003d Dir Modifiable Architecture-Independent Data

Localstatedir \u003d Dir Modifiable Single-Machine Data

LIBDIR \u003d Dir Object Code Libraries

Includedir \u003d Dir C Header Files

Oldincludedir \u003d Dir C Header Files for Non-GCC

Infodir \u003d Dir Info Documentation

Mandir \u003d Dir Man Documentation Types:

BUILD \u003d Build Configure for Building On Build

Host \u003d Host Cross-Compile to Build Programs to Run on Host Features:

Disable-Feature Do Not Include Feature (SAME AS --ENABLE-Feature \u003d NO)

Enable-Feature [\u003d Arg] include Feature

Disable-statusmap \u003d disables compiling of statusmap cgi

DISABLE-STATUSWRL \u003d DISABLES COMPILATION OF STATUSWRL (VRML) CGI

Enable-Debug0 SHOWS FUNCTION ENTRY AND EXIT

Enable-Debug1 SHOWS GENERAL INFO Messages

Enable-Debug2 Shows Warning Messages

Enable-Debug3 SHOWS SCHEDULED EVENTS (Service and Host Checks ... etc)

ENABLE-DEBUG4 SHOWS SERVICE AND HOST NOTIFICATIONS

Enable-Debug5 SHOWS SQL QUERIES

Enable-Debugall Shows All Debugging Messages

Enable-Nanosleep Enables Use Of Nanosleep (Instad Sleep) in Event Timing

Enable-Event-Broker Enables Integration Of Event Broker Routines

Enable-Embedded-Perl Will Enable Embedded Perl Interpreter

Enable-Cygwin Enables Building Under the Cygwin EnvironmentPackages:

WITH-PACKAGE [\u003d ARG] Use Package

WITHOUT-PACKAGE DO NOT USE PACKAGE (SAME AS --WITH-PACKAGE \u003d NO)

WITH-NAGIOS-USER \u003d Sets User Name to Run Nagios

With-nagios-group \u003d Sets Group Name to Run Nagios

With-Command-user \u003d Sets User Name for Command Access

With-Command-Group \u003d Sets Group Name for Command Access

With-mail \u003d Sets Path to Equivalent Program to Mail

WITH-INIT-DIR \u003d Sets Directory To Place Init Script Into

With-Lockfile \u003d Sets Path and File Name for Lock File

WITH-GD-LIB \u003d DIR Sets Location of the GD Library

WITH-GD-INC \u003d DIR SETS LOCATION OF THE GD INCLUDE FILES

With-cgiurl \u003d Sets URL for CGI Programs (Do Not Use A Tradition Slash)

WITH-HTMURL \u003d Sets Url for Public HTML

With-Perlcache Turns on Cacheing Of Internally Compiled Perl ScriptsInfluential Environment Variables: C Compiler Commandc Compiler Flagslinker Flags, E.G. -L. IF You Have Libraries in Adirectory C / C ++ Preprocessor Flags, E.G. -I. IF You Havein A Nonstandard Directory C Preprocessorthese Variables to Override The Choes Made By `Configure" Or to Helpto Find Libraries and Programs with nonstandard Names / Locations.

Compile the Nagios source code:

# make all

Install the binary files, the initialization script and examples of configuration files, and set permissions on the external command directory:

# make install

# make install-init

# make install-config

# make install-commandmode

4) Change the configuration

Examples of configuration files are installed in the /usr/local/nagios/etc directory. They should work right away. Only one change needs to be made before proceeding.

Edit the configuration file /usr/local/nagios/etc/objects/contacts.cfg with any text editor and change the email address associated with the nagiosadmin contact definition to the address at which we are going to receive messages.

# vi /usr/local/nagios/etc/objects/contacts.cfg
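For reference, a minimal sketch of the contact definition being edited; the directive names follow the stock Nagios sample configuration, and the address is a placeholder to replace with your own:

```cfg
define contact{
        contact_name    nagiosadmin            ; also the web interface login name
        use             generic-contact        ; inherit defaults from a template
        alias           Nagios Admin
        email           nagios@example.com     ; <-- put your address here
        }
```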

5) Configuring a web interface

Install the Nagios web interface configuration file into the Apache conf.d directory:

# make install-webconf

Create a nagiosadmin account for logging into the Nagios web interface:

# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

Restart Apache to apply the changes:

# /etc/init.d/apache2 reload

Measures should be taken to strengthen CGI security and prevent this account from being stolen, since monitoring information is quite sensitive.

6) Compile and install the Nagios plugins

Unpack the compressed source code of the Nagios plugins:

# cd ~/downloads

# tar xzf nagios-plugins-1.4.11.tar.gz

# cd nagios-plugins-1.4.11


Complete and install plugins:

# ./configure --with-nagios-user=nagios --with-nagios-group=nagios

# make install

7) Launch the Nagios service

Configure Nagios to start automatically when the operating system boots:

# ln -s /etc/init.d/nagios /etc/rcS.d/S99nagios

Check the syntactic correctness of the sample configuration files:

# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

If there are no errors, then launch Nagios:

# /etc/init.d/nagios start

8) Logging into the web interface

You can now enter the Nagios web interface at the corresponding URL. You will be prompted for the user name (nagiosadmin) and the password that we set earlier.

# "Justify"\u003e) Other settings

To receive email reminders about Nagios events, you must install the mailx package and an MTA (Postfix):

% sudo apt-get install mailx

% sudo apt-get install postfix

Edit the Nagios reminder commands file /usr/local/nagios/etc/objects/commands.cfg and change all references from "/bin/mail" to "/usr/bin/mail". After that, restart the Nagios service:
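A hedged sketch of making that replacement with sed; it is demonstrated here on a temporary copy, since the real file lives at /usr/local/nagios/etc/objects/commands.cfg (make a backup before editing it in place).

```shell
# Demonstrate the /bin/mail -> /usr/bin/mail substitution on a scratch file.
f=$(mktemp)
printf '%s\n' 'command_line /bin/mail -s "Nagios alert" $CONTACTEMAIL$' > "$f"
sed -i 's|/bin/mail|/usr/bin/mail|g' "$f"
cat "$f"    # the path is now /usr/bin/mail
rm -f "$f"
```

Using `|` as the sed delimiter avoids having to escape the slashes in the paths.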

# sudo /etc/init.d/nagios restart

Detailed mail module configuration is described in Appendix G.

4.1.2 Description of installing the system kernel from the repository

As shown above, installing Nagios from source takes considerable time and makes sense only when fine optimization of the application is required or one wants to understand the system's operating mechanism thoroughly. In production conditions, most software is installed from repositories in the form of precompiled packages. In this case, the installation comes down to entering one command:

% sudo aptitude install nagios

The package manager will independently satisfy all dependencies and install the necessary packages.

4.2 Configuring the system kernel

Before detailed configuration, you should understand how the Nagios kernel works. A graphical description is given below.

4.2.1 Description of the operation of the system kernel

The following figure shows a simplified scheme of the Nagios service.

Fig. 4.1 - System Core

The Nagios service reads the main configuration file, which, in addition to the basic parameters of the service, contains references to resource files, object description files and CGI configuration files.

The algorithm and logic of the network monitoring core are shown below.

Fig. 4.2 - Nagios alert algorithm

4.2.2 Description of the interaction of configuration files

The /etc/apache2/conf.d/ directory contains the nagios3.conf file, from which the Apache web server takes the settings for Nagios.

Nagios configuration files are located in the / etc / nagios3 directory.

The /etc/nagios3/htpasswd.users file contains passwords for Nagios users. The command for creating the file and setting the password for the default nagios user is given above. In the future, the "-c" argument must be omitted when setting a password for a new user; otherwise the new file will overwrite the old one.

The /etc/nagios3/nagios.cfg file contains the main configuration of Nagios itself - for example, the event log files or the paths to the other configuration files that Nagios reads at startup.

New hosts and services are defined in the /etc/nagios3/objects directory.

4.2.3 Filling of the descriptions of hosts and services

As shown above, the system kernel can be configured with a single description file for hosts and services, but this method becomes inconvenient as the amount of tracked equipment grows, so a structure of directories and files with the descriptions of hosts and services needs to be created.

The created structure is shown in Appendix Z.

File hosts.cfg.

First you need to describe the hosts to be observed. Any number of hosts can be described, but in this file we will limit ourselves to parameters common to all hosts.

Here the described host is not a real host but a template on which the descriptions of all other hosts are based. The same mechanism can be seen in other configuration files, where the configuration is built on a predefined set of default values.
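A sketch of what such a template might look like; the directive names are standard Nagios ones, and the values are illustrative:

```cfg
define host{
        name                  generic-host    ; template name, not a real host
        check_command         check-host-alive
        max_check_attempts    5
        notification_interval 60
        notification_options  d,u,r           ; down, unreachable, recovery
        register              0               ; 0 = template only, not a real object
        }
```

Real hosts then inherit these values with `use generic-host` and only override what differs.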

File hostgroups.cfg.

Here hosts are added to a host group (hostgroup). Even in a simple configuration with a single host, it still needs to be added to a group so that Nagios knows which contact group to use for sending alerts. More on contact groups below.
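A sketch of a host group definition under the same assumptions (the group and member names are placeholders):

```cfg
define hostgroup{
        hostgroup_name  linux-servers         ; placeholder group name
        alias           Linux Servers
        members         srv-web,srv-db        ; placeholder host names
        }
```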

ContactGroups.cfg file

We defined a contact group and added users to it. This configuration ensures that all the users will receive a warning if something goes wrong with the servers the group is responsible for. However, keep in mind that individual settings for any of the users can override these settings.
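A sketch of the corresponding contact group definition (the names are placeholders):

```cfg
define contactgroup{
        contactgroup_name  admins
        alias              Nagios Administrators
        members            nagiosadmin        ; comma-separated contact_name values
        }
```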

The next step is to specify contact information and alert settings.

Contacts.cfg file

In addition to the extra contact information given in this file, one of the fields, contact_name, has another purpose: CGI scripts use the names specified in these fields to decide whether the user has the right to access a resource or not. You must configure .htaccess-based authentication and, in addition, use the same names used above so that users can work through the web interface.

Now that hosts and contacts are configured, you can move on to configuring the individual services to be monitored.

File Services.cfg.

Here, as in the hosts.cfg file for hosts, only parameters common to all services are specified.

A huge number of additional Nagios modules are available, but if some check is still missing, you can always write it yourself. For example, there is no module that checks whether Tomcat is running or not. You can write a script that loads a JSP page from a remote Tomcat server and returns a result depending on whether or not some text is present on the loaded page. (When adding a new command, it must be referenced in the checkcommands.cfg file, which we have not touched.)
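A minimal sketch of such a check, written as a shell function so the page text can be supplied from any source. The URL, the expected text and the wget call are assumptions, and the exit codes follow the usual Nagios plugin convention (0 = OK, 2 = CRITICAL).

```shell
# check_page PAGE_TEXT EXPECTED_TEXT - classify a fetched page
check_page() {
    case "$1" in
        *"$2"*) echo "OK - page contains '$2'"; return 0 ;;
        *)      echo "CRITICAL - '$2' missing"; return 2 ;;
    esac
}

# In a real plugin the page would first be fetched, for example:
#   page=$(wget -q -O - "http://tomcat.example.com:8080/probe.jsp")
#   check_page "$page" "All systems OK"; exit $?
check_page "<html>All systems OK</html>" "All systems OK"   # prints: OK - page contains 'All systems OK'
```

Separating fetching from classification keeps the decision logic easy to test without a live Tomcat server.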

Next, for each individual host we create its own description file; in the same file we store the descriptions of the services that we will monitor on this host. This is done for convenience and logical organization.

It is worth noting that Windows hosts are monitored via the SNMP protocol and NSClient, which is supplied with Nagios. Below is a scheme of its operation.

Fig. 4.3 - Windows monitoring scheme of hosts

The *NIX hosts, in turn, are monitored via SNMP, as well as the NRPE plugin. The scheme of its operation is shown in the figure.

Fig. 4.4 - Monitoring scheme * NIX hosts
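The command definition used to invoke the NRPE plugin from Nagios might look as follows (a sketch: the macro-based command_line follows common NRPE usage, and $USER1$ is assumed to point at the plugin directory):

```cfg
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }
```

A service then passes the name of the remote check through $ARG1$, for example `check_nrpe!check_load`.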

4.2.4 Writing plugins

In addition to the initialization scripts and the definitions of hosts and services, the following plugins were used:

├── check_disk

├── check_dns

├── check_http

├── check_icmp

├── check_ifoperstatus

├── check_ifstatus

├── check_imap -> check_tcp

├── check_linux_raid

├── check_load

├── check_mrtg

├── check_mrtgtraf

├── check_nrpe

├── check_nt

├── check_ping

├── check_pop -> check_tcp

├── check_sensors

├── check_simap -> check_tcp

├── check_smtp

├── check_snmp

├── check_snmp_load.pl

├── check_snmp_mem.pl

├── check_spop -> check_tcp

├── check_ssh

├── check_ssmtp -> check_tcp

├── check_swap

├── check_tcp

└── check_time

Most of them come with the Nagios package. The source code of the plugins that are not included in the package but are used in the system is presented in Appendix I.

4.2.5 Setting up SNMP on remote hosts

To be able to monitor via the SNMP protocol, the agents of this protocol must be configured in advance. The scheme of SNMP operation in conjunction with the network monitoring system kernel is shown in the figure below.

Fig. 4.5 - Monitoring scheme via SNMP protocol

The configuration parameters of the hosts are presented in Appendix C. Security is ensured by configuring a packet filter individually on each host and by organizing protected service subnets to which only authorized staff have access. In addition, SNMP is configured so that parameters can only be read, not written.
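With net-snmp, for example, read-only access of the kind described can be sketched in /etc/snmp/snmpd.conf as follows (the community name and subnet are illustrative):

```
# snmpd.conf fragment -- read-only access for the monitoring subnet
rocommunity  monitoring  192.168.0.0/24   # "read only" community, no write access
sysLocation  server room
sysContact   admin@example.org
```

Because only a rocommunity (and no rwcommunity) is defined, parameters can be read via SNMP but not written.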

4.2.6 Setting up the agent on remote hosts

For advanced monitoring of hosts and services, a Nagios agent called nagios-nrpe-server must be installed on them:

# aptitude install nagios-nrpe-server

The agent configuration is presented in Appendix L. The operation of the agent was shown in Figure 4.5 above.
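The essential part of the agent configuration boils down to a few lines in nrpe.cfg; the sketch below is illustrative, and the real configuration is in Appendix L:

```
# /etc/nagios/nrpe.cfg fragment -- illustrative values
server_port=5666
allowed_hosts=127.0.0.1,192.168.0.1    # address of the monitoring server
command[check_load]=/usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,8,6
command[check_disk]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /
```

The kernel then runs these commands remotely through the check_nrpe plugin, e.g. check_nrpe -H <host> -c check_load.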

4.4 Installing and configuring the load tracking module

MRTG (Multi Router Traffic Grapher) is a service that collects information from several devices via the SNMP protocol and displays channel load graphs (incoming, outgoing, maximum, and average traffic) in the browser window by minutes, hours, days, and years.

Installation requirements

The following libraries are required for MRTG:

§ gd (Graph Drawing Library) - the library responsible for rendering the graphs;

§ libpng - required by gd to create graphics in PNG format.

In our case, the installation reduces to the execution of a single command, because the method of installing a prebuilt package from the repository was chosen:

# aptitude install mrtg

You can create the configuration files manually, or you can use the configuration generators shipped as part of the package:

# cfgmaker <community>@<host> > <config file>

After generating a configuration file, it is recommended to check it, because it may describe interfaces whose load we do not need to analyze. In that case, the corresponding lines in the file are commented out or deleted. An example of an MRTG configuration file is given in Appendix M; owing to the large volume of these files, only one example file is given.

# indexmaker <config file> > <index page>

Index pages are ordinary HTML files and their content is of no particular interest, so there is no point in giving examples of them. Appendix H shows an example of the interface load graphs.

Finally, you must arrange for the scheduled polling of interface load. The easiest way to achieve this is with operating system tools, namely crontab.
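For example, a crontab entry of the following kind would poll the interfaces every five minutes (the paths are illustrative):

```
# /etc/cron.d/mrtg -- run MRTG every 5 minutes
*/5 * * * *   root   /usr/bin/mrtg /etc/mrtg/mrtg.cfg >/dev/null 2>&1
```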

4.5 Installing and configuring the event log collection module

The syslog-ng (Syslog Next Generation) package, a multifunctional system message logging service, was selected as the event log collection module. Compared with the standard syslogd service, it has a number of differences:

§ improved configuration scheme

§ filtering of messages not only by priority, but also by their content

§ regexp (regular expression) support

§ more flexible manipulation and organization of logs

§ ability to encrypt the data channel using IPsec / Stunnel

The following table presents supported hardware platforms.

Table 4.1 - Supported Hardware Platforms

Platform | x86 | x86_64 | SUN SPARC | ppc32 | ppc64 | PA-RISC
AIX 5.2 & 5.3 | No | No | No | Yes | On request | No
Debian etch | Yes | Yes | No | No | No | No
FreeBSD 6.1 * | Yes | On request | On request | No | No | No
HP-UX 11i | No | No | No | No | No | Yes
IBM System i | No | No | No | Yes | No | No
Red Hat ES 4 / CentOS 4 | Yes | Yes | No | No | No | No
Red Hat ES 5 / CentOS 5 | Yes | Yes | No | No | No | No
SLES 10 / openSUSE 10.0 | Yes | On request | No | No | No | No
SLES 10 SP1 / openSUSE 10.1 | Yes | Yes | No | No | No | No
Solaris 8 | No | No | Yes | No | No | No
Solaris 9 | On request | No | Yes | No | No | No
Solaris 10 | On request | Yes | Yes | No | No | No
Windows | Yes | Yes | No | No | No | No

Note: * Access to the Oracle database is not supported.

A detailed comparison of technical features is given in Appendix P.

The rules and filter description files, as well as the configuration of remote hosts, are shown in Appendix R.

There is an RFC document that describes the syslog protocol in general form; the operation of the system log collection module can be represented by the following scheme.

Fig. 4.6 - Scheme operation of the Module for collecting system logs

On the client host, each individual application writes its own event log, thereby forming a source. The log message stream then passes through the filters, which determine its network direction; upon arriving at the logging server, the storage location is determined again for each message. The selected module has great scaling capabilities and sophisticated configuration options: filters may branch, so that system event messages are sent in several directions depending on multiple conditions, as shown in the figure below.

Fig. 4.7 - Branching filters
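The branching of filters described above can be sketched in syslog-ng configuration syntax like this (source, filter, and destination names are invented for the example):

```
# syslog-ng.conf fragment -- one message stream branching into two directions
source s_net { udp(ip(0.0.0.0) port(514)); };

filter f_routers { host("router_.*"); };
filter f_auth    { facility(auth, authpriv); };

destination d_routers { file("/var/log/$HOST/$YEAR.$MONTH.$DAY.log"); };
destination d_auth    { file("/var/log/auth.log"); };

log { source(s_net); filter(f_routers); destination(d_routers); };
log { source(s_net); filter(f_auth);    destination(d_auth); };
```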

The scaling capability implies that, in order to distribute the load, the administrator can deploy a network of auxiliary filtering servers, the so-called relays.

Fig. 4.8 - Zooming and load distribution

Ultimately, the operation of the module can be described simply: client hosts transmit event log messages from different applications to offloading servers, which in turn can pass them along a chain of relays up to the central collection servers.

Fig. 4.9 - generalized module operation scheme

In our case, the data stream is not large enough to warrant deploying a system of offloading servers, so a simplified client-server scheme was adopted.

Fig. 4.10 - Adopted work scheme

5. System administrator's guide

In general, the system administrator is advised to adhere to the existing hierarchy of configuration files and directories. Adding new hosts and services to the monitoring system reduces to creating new configuration files and initialization scripts, as shown in the software development section, so there is no point in re-describing the parameters and principles of system configuration here; instead, it is worth dwelling in more detail on the interfaces of the individual system modules.

5.1 System Web Interface Description

For interactive monitoring of services, a web interface is integrated into the system. The web interface is also valuable because, through skillful use of graphics and the provision of additional statistical information, it gives a complete picture of the system.

When entering the Nagios web page, the user name and password that we set during configuration are requested. The start page of the web interface is shown in the figure below.

Fig. 5.1 - Start page of the system web interface

On the left is the navigation panel; on the right are various presentations of data on the status of the network, hosts, and services. We are interested in the first section, Monitoring. Let us look at the Tactical Overview page.

Fig. 5.2 - Tactical Overview page

This page contains summary information on all monitoring parameters and the states of hosts and services. No details are given, but if any problems arise, they are highlighted with a special color and become a hyperlink leading to a detailed description of the problem. In our case, at the current moment there is one unresolved problem among all hosts and services; we follow this link (1 unhandled problem).

Fig. 5.3 - Detected service problem

Here, in tabular form, we see on which host the problem arose, the service that caused it (in our case, high CPU load on a router), the error status (normal, warning, or critical), the last check time, how long the problem has existed, the check attempt number in the cycle, and detailed information with the specific values returned by the installed plugin.

Fig. 5.4 - Detailed status description

Here we see a complete description of the problem. This page is useful for in-depth analysis, when the reason for the problem is not entirely clear: for example, the criticality thresholds may be set too strictly, or the plugin parameters may be specified incorrectly, which the system would also evaluate as a critical state. In addition to the description, commands can be executed on the service from this page: disable checks, schedule the next check for another time, accept data passively, acknowledge the problem, disable alerts, send an alert manually, schedule service downtime, disable flap detection, and write a comment.

Go to the Service Detail page.

Fig. 5.5 - Detailed view of all services

Here we see a list of all hosts and services, regardless of their current state. This feature can also be useful, although viewing such a long list of hosts and services is not entirely convenient; it rather serves to visualize the amount of work performed by the system. Here each host and service, as in Figure 5.3, is a link leading to a more detailed description of the parameter.

Fig. 5.6 - Full detailed list of hosts

This table presents a complete detailed list of hosts, their statuses, the last check time, the duration of the current status, and additional information. In our system, it is customary for host status to be verified by checking host availability via ICMP (ping), but in general the check can be anything. The icons in the column to the right of the host name indicate the group it belongs to; this is done for ease of perception. The lamp icon is a link leading to a detailed list of that host's services; there is no sense in describing that table separately, as it is exactly the same as the service table shown above, only with information about a single host.

The following links lead to various modifications of the previous tables, and understanding their content will not be difficult. The most interesting capability of the web interface is the ability to build a network map in semi-automatic mode.

Fig. 5.7 - Circular network map

Through the parents parameter of each host, we can create a structure or hierarchy of our network, which determines the logic of the monitoring kernel and the representation of hosts and services on the network map. There are several display modes; besides the circular one, the most convenient are the balanced tree mode and the ball-shaped mode.
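The hierarchy is built with the parents parameter in the host definitions; a sketch (host names are illustrative):

```
# Host definition fragment -- the parents parameter builds the map hierarchy
define host{
        use         generic-host
        host_name   switch-floor2
        address     192.168.0.20
        parents     router-core     ; drawn below router-core on the status map
        }
```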

Fig. 5.8 - Network Map - Balanced Tree Mode

Fig. 5.9 - Network map - ball-shaped mode

In all modes, the image of each host is a link to its table of services and their states.

The next important part of the monitoring kernel interface is the trend builder. With its help, one can plan the replacement of equipment with more productive hardware. Let us give an example: click the Trends link and select the report type, service.

STEP 1: SELECT REPORT TYPE: SERVICE

We choose the reporting period and generate a report.

Fig. 5.10 - Trend.

We have generated the CPU load trend for the router. From it we can conclude that over the course of the month this parameter has constantly worsened, and measures must be taken: either optimize the host's operation or prepare to replace it with a more productive one.

5.2 Web interface of the interface load tracking module

The web interface of the interface load tracking module is a list of directories containing the index pages of the tracked hosts, with load graphs for each interface.

Fig. 5.11 - Interface load tracking module start page

Following any of the links, we get the load graphs. Each graph is a link leading to statistics for the week, month, and year.

5.3 Description of the event log collection module

Improved filtering of the system logs and the ability to search them through a single web interface have not yet been implemented, because problems requiring these logs to be viewed arise quite rarely; the development of a database and web interface for them has therefore been postponed. At present, they are accessed via SSH by browsing the directories in the mc file manager.

As a result of the operation of this module, the following directories were obtained:

├── apache2

├── asterix

├── bgp_router

├── dbconfig-common

├── installer

│   └── cdebconf

├── len58a_3lvl

├── monitoring

├── nagios3

│   └── archives

├── ocsinventory-client

├── ocsinventory-server

├── quagga

├── router_krivous36b

├── router_lenina58a

├── router_su

├── router_ur39a

├── shaper

├── ub13_router

├── univer11_router

└── voip

Each directory is a repository of event logs for each individual host.

Fig. 5.13 - viewing data collected by the event logging module

6. Testing

When implementing the system, each component was tested step by step, starting from the system kernel. Functionality was expanded only after the final configuration of the underlying levels of the monitoring system's module hierarchy, owing to the many dependencies between the various subsystems. Overall, the deployment and testing process can be described as follows:

1) installation and configuration of the Nagios core;

2) setting up monitoring of remote hosts using the basic Nagios functionality;

3) setting up the network interface load tracking module based on MRTG;

4) expanding the system kernel functionality and integrating it with the MRTG module;

5) setting up the system log collection module;

6) writing a packet-filter initialization script for the monitoring system in order to ensure its security.

7. Life safety

7.1 Workplace characteristics

The harmful factors affecting work with a personal computer include:

· increased mains voltage;

· noise;

· electromagnetic radiation;

· electrostatic field.

To ensure efficient and safe work, it is necessary to create working conditions that are comfortable and minimize the impact of these harmful factors. The listed harmful factors must conform to the established rules and norms.

7.2 Labor Safety

7.2.1 Electrical safety

The designed software runs on an existing server located in a specially equipped technical room fitted with cable ducts for laying cables. Each server is connected to a ~220 V, 50 Hz power supply with a working ground. Circuit breakers that cut off the power supply in the event of a short circuit are installed before the power input to the room. Protective grounding is laid separately.

When connecting a computer, it is necessary to connect the hardware chassis to the protective ground wire, so that in the event of insulation failure or for any other reason, the dangerous voltage of the power supply cannot, when a person touches the chassis, create a current of dangerous magnitude through the human body.

For this purpose, the third contact in the electrical outlets, connected to the protective ground core, is used. The hardware chassis are grounded via the power cable through a specially dedicated conductor.

The following technical measures are applied to protect against electric shock when touching the chassis of an electrical installation in the event of insulation breakdown of its current-carrying parts:

· protective grounding;

· protective neutral bonding (zeroing);

· protective shutdown.

7.2.2 Noise Protection

Studies show that under noise conditions, hearing functions suffer first of all. But the effect of noise is not limited to hearing: it causes noticeable shifts in a number of physiological and mental functions. Noise harms the nervous system, reduces the speed and accuracy of sensorimotor processes, and increases the number of errors made in solving intellectual tasks. Noise noticeably affects a person's attention and causes negative emotions.

The main sources of noise in rooms where computers are located are the air-conditioning equipment, printing and copying equipment, and, in the computers themselves, the cooling fans.

The following noise-control measures are actively used in the production facilities:

· the use of silent cooling mechanisms;

· insulation of noise sources from the environment with sound insulation and sound absorption;

· the use of sound-absorbing materials for facing the premises.

The following noise sources are present in the workplace:

· system unit: cooler (25 dB), hard disk (29 dB), power supply (20 dB);

· printer (49 dB).

The total noise level L emitted by these devices is calculated by the formula:

L = 10 * lg(Σ 10^(Li / 10)), (7.1)

where Li is the noise level of an individual device, dB.

L = 10 * lg(316.23 + 794.33 + 100 + 79432.82) = 10 * 4.91 = 49.1 dB
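The summation can be cross-checked with a short script (a sketch; the source levels are taken from the list above):

```python
import math

def total_noise_level(levels_db):
    """Total level of incoherent noise sources: L = 10 * lg(sum(10^(Li/10)))."""
    return 10 * math.log10(sum(10 ** (level / 10) for level in levels_db))

# cooler 25 dB, hard disk 29 dB, power supply 20 dB, printer 49 dB
print(round(total_noise_level([25, 29, 20, 49]), 1))  # → 49.1
```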

According to SN 2.2.4/2.1.8.562-96, the noise level at the workplace of mathematician-programmers and video terminal operators should not exceed 50 dB.

7.2.3 Protection against electromagnetic radiation

Protection from electromagnetic exposure is provided by screens with an electrically conductive surface and by using monitors equipped with the Low Radiation system, which minimizes the level of harmful radiation, as well as liquid crystal monitors, in which electromagnetic radiation is completely absent.

7.2.4 Protection from the Electrostatic Field

To protect against electrostatic charge, a grounded protective filter and air humidifiers are used, and the floors have an antistatic coating. To maintain the normalized concentrations of positive and negative ions in rooms with computers, air conditioners and air-ionization devices are installed, and natural ventilation is carried out for at least 10 minutes after every 2 hours of operation.

To prevent the harmful effect on workers of dust with aeroions, wet cleaning of the premises is carried out daily, and dust is removed from the screens at least once a day, with the monitor turned off.

7.3 Working conditions

7.3.1 Microclimate of the production room

The equipment considered in this diploma project does not emit any harmful substances. Thus, the air in the room where it is used has no harmful effect on the human body and meets the requirements for category I work according to GOST 12.1.005-88.

The optimal norms of temperature, relative humidity, and air velocity in the working area of industrial premises are set by GOST 12.1.005-88 and are shown in Table 7.1.

Table 7.1 - Microclimate parameters

Parameter | Optimal | Permissible | Actual
Air temperature, °C | 20 - 22 | 18 - 20 | 20
Relative humidity, % | 40 - 60 | not more than 80 | 45
Air velocity, m/s | 0.2 | 0.3 | 0.3

The microclimate corresponds to optimal conditions.

7.3.2 Production lighting

Let us calculate the lighting of the support department at Gercon LLC in the city of Verkhnyaya Pyshma, where this project was developed:

· the area of the room is 60 m2;

· the area of the light openings is 10 m2;

· 4 automated workstations are installed.

The calculation of natural illumination is made according to the SNiP 23.05-95 formula:

S0 = SP * eN * KZ * N0 * KZD / (100% * T0 * T1) (7.2)

where S0 is the area of the light openings, m2;

SP is the floor area of the room, m2: 60;

eN is the natural light coefficient: 1.6;

KZ is the reserve coefficient: 1.5;

N0 is the light characteristic of the windows: 1;

KZD is the coefficient taking into account the shading of the windows by opposing buildings: 1.2;

T0 is the total transmittance coefficient: 0.48;

T1 is the reflection coefficient from the surfaces of the room: 1.2.

The values of all coefficients are taken from SNiP 23.05-95.

As a result of the calculation, we obtain a required light-opening area of S0 = 3.4 m2. The actual area of the openings is 10 m2, which exceeds the minimum permissible area of light openings for premises of this type and is sufficient in the daytime.

Let us calculate the artificial lighting for a room illuminated by 15 LDC-60 fluorescent lamps of 60 W each.

According to SNiP 23.05-95, the illumination with fluorescent lamps in the horizontal plane should be not lower than 300 lx for a general lighting system. Taking into account visual work of high accuracy, the illumination value can be raised to 1000 lx.

The luminous flux of a fluorescent lamp is calculated by the formula from SNiP 23.05-95:

FI = EN * S * Z * K / (N * η) (7.3)

where EN is the normalized illumination of the room, lx: 200;

S is the floor area of the room, m2: 60;

Z is the coefficient taking into account the ratio of average illumination to minimum: 1.1;

K is the reserve coefficient taking into account air pollution: 1.3;

N is the number of lamps: 15;

η is the utilization coefficient of the luminous flux: 0.8.

As a result, we obtain FI = 1340 lm; the total luminous flux of all the lamps is 3740 lm; therefore, the illumination of the laboratory is above the minimum permissible.

7.4 Workplace Ergonomics

7.4.1 Workplace organization

In accordance with SanPiN 2.2.2/4.2.1340-03, the VDT (video display terminal) must meet the following technical requirements:

· screen luminance of at least 100 cd/m2;

· minimum light-point size of no more than 0.1 mm for a color display;

· character image contrast of at least 0.8;

· line scan frequency of at least 7 kHz;

· at least 640 points per line;

· anti-glare screen coating;

· screen size of at least 31 cm diagonally;

· character height on the screen of at least 3.8 mm;

· distance from the operator's eyes to the screen of about 40 - 80 cm.

The VDT must be equipped with a turntable that allows it to be moved in the horizontal and vertical planes within 130 - 220 mm, and the screen tilt angle to be changed by 10 - 15 degrees.

The diploma project was carried out on a computer with a ViewSonic monitor with a 39 cm diagonal. This monitor is made in accordance with world standards and meets all of the above technical requirements.

The following requirements apply to the keyboard:

· the case is colored in calm soft tones with diffuse light scattering;

· the surface is matte, with a reflection coefficient of 0.4 - 0.6, and has no shiny parts capable of creating glare.

The project was carried out on a Logitech keyboard, which meets all of the above requirements.

The system units are installed at the workplace taking into account easy reach to the floppy-disk drives and convenient access to the connectors and controls on the rear side. Frequently used diskettes are stored near the system unit in a cell protected from dust and electromagnetic fields. The printer is placed to the right of the user, so that the printed text is visible to the operator in the main working position. Clean paper and other necessary accessories are stored in special compartments near the printer.

The connecting cables are laid in special channels. The channels should be designed so that the connectors do not interfere with removing the cables.

For the mouse, there is a free area on the tabletop to the right of the user, which in shape and size should be identical to the screen surface.

The operator's workplace meets the requirements of GOST 12.2.032-78 SSBT.

The spatial organization of the workplace provides an optimal working posture:

· the head is tilted forward by 10 - 20 degrees;

· the back is supported; the angles between the shoulder and forearm, and between the thigh and shin, are right angles.

The main parameters of the workplace must be adjustable. This makes it possible to create favorable working conditions for each individual, taking into account anthropometric characteristics.

The main parameters of the workplace and of the furniture equipped with a personal computer are as follows (Fig. 7.1):

Fig. 7.1 - PC operator's workplace

· seat height 42 - 45 cm;

· keyboard height from the floor 70 - 85 cm;

· keyboard tilt angle from the horizontal 7 - 15 degrees;

· distance of the keyboard from the edge of the table 10 - 6 cm;

· distance from the center of the screen to the floor 90 - 115 cm;

· screen tilt angle from the vertical 0 - 30 degrees (optimal 15);

· distance of the screen from the edge of the table 50 - 75 cm;

· height of the working surface for notes 74 - 78 cm.

The workplace has a footrest, recommended for all types of work involving prolonged sitting.

According to SanPiN 2.2.2.542-96, the work of a computer operator is considered light and belongs to category 1a.

Breaks are scheduled 2 hours after the start of the work shift and 2 hours after the lunch break, 15 minutes each. During these regulated breaks, exercise complexes are performed in order to reduce neuro-emotional stress and fatigue and to eliminate the influence of hypodynamia.

7.5 Fire safety

According to NPB 105-03, the room where this project was carried out is assigned a fire-hazard category for premises containing combustible and non-combustible liquids, solid combustible and non-combustible substances and materials (including dust and fibers), and substances and materials that can burn when interacting with water, atmospheric oxygen, or one another, provided that the premises where they are present or formed do not belong to categories A or B. The building housing the premises is of fire-resistance degree I according to SNiP 21-01-97.

The following fire-safety rules are observed in the production facilities:

· passages, exits from the room, and access to fire-extinguishing equipment are kept free;

· equipment in operation is properly checked each time before work begins;

· at the end of work, the room is inspected, the power supply is de-energized, and the room is locked.

There are two evacuation exits from the room. The width of the evacuation exit (door) is 2 m. Conventional staircases and swing doors are used on the evacuation routes. The staircases contain no rooms, technological communications, or exits of passenger or freight elevators. Both natural and artificial emergency lighting is provided on the evacuation routes.

As primary fire-extinguishing means, two portable carbon dioxide fire extinguishers are kept in the room.

To detect the initial stage of a fire and alert the fire protection service, an automatic fire alarm system (AFS) is used. It activates the fire-extinguishing installation on its own, before the fire grows large, and notifies the city fire service.

In addition to the AFS, computer-center facilities must be equipped with stationary automatic fire-extinguishing installations. Gas fire-extinguishing installations are used, whose action is based on rapidly filling the room with a fire-extinguishing gas, which reduces the oxygen content in the air.

7.6 Emergency situations

In this room, the most likely emergency is a fire. If a fire occurs, the staff must be evacuated and the incident reported to the fire service. The evacuation plan is presented in Figure 7.2.

Fig. 7.2 - Evacuation plan for fire

8. Economic part

This section discusses the costs of developing a network monitoring system, its implementation and maintenance, as well as related materials and equipment.

The cost of the project reflects the cost of the means and objects of labor consumed in the course of development and production (depreciation, cost of equipment, materials, fuel, energy, etc.), part of the cost of living labor (wages), and the cost of the purchased system modules.

In the course of operations, as the volume of services provided grew, the problem arose of pre-emptively detecting faults and weak points in the network organization; that is, the task of implementing a solution that would predict the need to replace or upgrade sections of the network before malfunctions affect the operation of subscriber nodes.

With the growth of the client base and, as a result, the amount of active equipment, it became necessary to quickly monitor the state of the network as a whole and of its individual elements in detail. Before the implementation of the network monitoring system, the network administrator had to connect via the Telnet, HTTP, SNMP, SSH, and other protocols to each network object and use its built-in monitoring and diagnostic tools. At the moment, the network comprises 5000 ports, 300 layer-2 switches, 15 routers, and 20 internal-use servers.

In addition, network overloads and intermittent malfunctions were discovered only when users encountered serious problems, which made it impossible to draw up plans for upgrading the network.

All this led, first of all, to a continuous deterioration in the quality of the services offered and increased the load on the system administrators and user technical support, which entailed colossal losses.

In accordance with the current situation, it was decided to develop and implement a network monitoring system that would solve all of the above problems, which, in summary, can be stated as follows:

It is necessary to develop and implement a monitoring system capable of tracking switches and routers from different manufacturers as well as servers on various platforms. The focus is on the use of open protocols and systems, with maximum use of ready-made solutions from the pool of free software, which from an economic point of view reduces the cost of licensing the final system to zero.

The system must meet the following economic requirements:

· minimum hardware requirements (reduces the cost of the project's hardware);

· open source code of all components of the complex (allows the principle of operation of the system to be changed independently, without resorting to third-party proprietary developments, and reduces the cost of licensing the product);

· extensibility and scalability of the system (allows the scope of application to be expanded without resorting to third-party proprietary developments, and reduces the cost of licensing the product);

· standard means of providing diagnostic information (reduces the cost of maintaining the system);

· availability of detailed documentation for all software used (makes it possible to train a new employee quickly);

· the ability to work with equipment from various manufacturers (makes it possible to use a single software product). (A complete list of equipment is given in Appendix B.)

In total, project development took 112 hours (2 weeks). Implementation of the project will require 56 hours (1 week).

8.1 Calculation of project development costs

The cost of project development is made up from:

· salary costs;

· costs for damping equipment and software products;

· costs for electricity;

· overhead costs.

Salary costs.

When calculating wages, we assume that this project was developed by one person: a systems engineer.

The average market salary of the system engineer of the required level in the region is 30000 rubles.

Calculate the cost of 1 hour of the engineer, relying on the following data:

· 25% premium;

· district coefficient of 15%;

· the working time fund in 2010, in accordance with the production calendar, is 1988 hours.

Thus, the hourly rate, taking into account the premium and the district coefficient, will be:

RF = 30000 * 1.25 * 1.15 * 12 / 1988 = 260 rubles

When calculating wages, the deductions paid on the accrued wages are taken into account; the total amount of insurance contributions equals the maximum rate of the unified social tax (ESN) - 26.2%, including:

· PFR - 20%;

· FSS - 2.9%;

· FFOMS - 1.1%;

· TFOMS - 2%;

· compulsory social insurance against accidents - 0.2%.

The total amount of deductions will be:

CO = RF * 0.262 = 260 * 0.262 = 68 rubles

Taking into account the engineer's working time (112 hours for development and 56 hours for implementation), we calculate the salary costs:

Zp = (112 + 56) * (RF + CO) = 168 * 328 = 55104 rubles
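The hourly rate and salary figures above can be reproduced with a short calculation. This is an illustrative sketch only; the variable names (base_salary, premium and so on) are my own, not part of the original calculation, and the rounding follows the text (the hourly rate and the deductions are rounded to whole rubles).

```python
# Salary cost sketch for the development and implementation phases.
# All input figures are taken from the text above.

base_salary = 30000      # average monthly salary of the engineer, rubles
premium = 1.25           # 25% premium
district_coeff = 1.15    # 15% district coefficient
work_hours_2010 = 1988   # working time fund for 2010, hours

# Hourly rate (RF): annual pay divided by the annual working time fund
rf = round(base_salary * premium * district_coeff * 12 / work_hours_2010)

# Deductions (CO): 26.2% insurance contributions on the accrued wage
co = round(rf * 0.262)

# Salary costs (Zp): 112 h of development + 56 h of implementation
zp = (112 + 56) * (rf + co)

print(rf, co, zp)  # 260 68 55104
```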

Equipment and software depreciation costs.

A personal computer and an Aquarius Server T40 S41 server were used as the main equipment during development of the network project. The current cost of the computer is approximately 17,000 rubles, and of the server 30,000 rubles.

Thus, the one-time investment in equipment will be:

RVA = 47000 rubles

During the service life of the computer and the server, their modernization is allowed; this type of cost is also taken into account in the calculation. We allocate 50% of RVA to modernization:

RMA = RVA * 0.5 = 23500 rubles

The computer was used at the following steps:

· search for literature;

· search for the design solutions of the network monitoring system;

· development of structures and subsystems;

· design of the network monitoring system;

· registration of a document.

The server was used during the implementation of the system and for direct work with the system.

The software products used in the development were obtained under free licenses, which means they cost nothing and there is no need to depreciate them.

Thus, the total cost of equipment, taking into account depreciation, will be:

Oz = RVA + RMA = 47000 + 23500 = 70500 rubles

The useful service life is taken as 2 years. The cost of one hour of operation (assuming 22 working days per month and an 8-hour working day) is:

Cch = Oz / VR = 70500 / 4224 = 16.69 rubles

For the period of development and implementation, the depreciation deductions will accordingly be:

Sachrv = Cch * TRV = 16.69 * 168 = 2803.92 rubles

Electricity costs.

Electricity costs consist of the electricity consumed by the computer and server and the electricity spent on lighting. The cost of electricity:

SEN = 0.80 rubles / kWh (under the contract with the owner of the premises)

RK = RS = 200 W - the power consumed by the computer or the server.

TRC = 168 h - computer operating time at the stage of development and implementation of the system.

TRS = 52 h - server operating time at the stage of development and implementation of the system.

Thus, the cost of electricity at the stage of development and implementation of the project will be:

SENP = RK * TRC * SEN + RS * TRS * SEN = (200 * 168 * 0.80 + 200 * 52 * 0.80) / 1000 = (26880 + 8320) / 1000 = 35.2 rubles

The workplace at which this work was carried out is equipped with a 100 W lamp. Let us calculate the cost of electricity spent by the lighting device during the development and implementation of the system:

SENO = 100 * TRC * SEN = (100 * 168 * 0.80) / 1000 = 13.44 rubles

The total electricity costs amounted to:

OZHAN = SENP + SENO = 35.2 + 13.44 = 48.64 rubles
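The electricity figures can be checked with a few lines. This is an illustrative sketch using the values from the text; the division by 1000 converts watt-hours to kilowatt-hours.

```python
# Electricity cost sketch; figures follow the text above.
sen = 0.80        # electricity tariff, rubles per kWh
rk = rs = 200     # power of the computer and the server, W
trc = 168         # computer operating time, h
trs = 52          # server operating time, h
lamp = 100        # lighting lamp power, W

# Equipment electricity (SENP), W*h converted to kWh by dividing by 1000
senp = (rk * trc * sen + rs * trs * sen) / 1000

# Lighting electricity (SENO)
seno = (lamp * trc * sen) / 1000

total = round(senp + seno, 2)
print(senp, seno, total)  # 35.2 13.44 48.64
```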

2 Calculation of overhead costs

This cost item covers the cost of other equipment and consumables, as well as unforeseen expenses.

Overhead costs in the enterprise budget amount to 400% of accrued wages:

HP = Zp * 4 = 55104 * 4 = 220416 rubles

Thus, the cost of developing and implementing the project amounted to:

SRV = Zp + Sachrv + OZHAN + HP = 55104 + 2803.92 + 48.64 + 220416 = 278372.56 rubles
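The final total can be verified by recomputing each component from the inputs given earlier. A sketch with the text's figures, rounding to kopecks as the text does; variable names are my own:

```python
# Summing up the project cost (SRV), as in the text.
zp = 55104                     # salary costs, rubles
oz = 47000 + 23500             # equipment cost including modernization (Oz)
vr = 2 * 12 * 22 * 8           # useful life: 2 years, 22 days/month, 8 h/day
cch = round(oz / vr, 2)        # cost of one hour of equipment operation
sachrv = round(cch * 168, 2)   # depreciation over 168 h of work
ozhan = 48.64                  # electricity costs
hp = zp * 4                    # overhead: 400% of accrued wages
srv = round(zp + sachrv + ozhan + hp, 2)
print(cch, sachrv, srv)  # 16.69 2803.92 278372.56
```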

3 Efficiency

As a result of the economic calculations, the minimum price for the development and implementation of the network monitoring system was set at 278,372.56 rubles.

As can be seen from the calculations, the overwhelming part of the costs falls on materials and equipment. This is explained by the fact that the manufacturers of the main equipment are foreign companies and, accordingly, the prices for these products are quoted in US dollars at the CBRF exchange rate + 3%. The increase in customs duties on imported products also negatively affects the price for end customers.

To justify independent development of the system, let us compare its cost with ready-made solutions available on the market:

· D-Link D-View - 360,000 rubles

Monitoring and analysis of networks

Permanent control over the operation of the network is needed to keep it in working condition. Control is the first stage in managing the network. This process is usually divided into two stages: monitoring and analysis.

At the monitoring stage, the simpler procedure is performed: the collection of primary data on the operation of the network - statistics on the number of frames and packets of various protocols circulating on the network, the state of the ports of hubs, switches and routers, and so on.

Next comes the analysis stage, a more complex and intellectual process of making sense of the information collected at the monitoring stage, comparing it with data obtained earlier, and forming assumptions about the possible causes of slow or unreliable network operation.
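The division into monitoring (raw data collection) and analysis (comparison with earlier data) can be illustrated with a toy sketch. The packet records, the baseline shares and the 5% threshold below are invented for the example and carry no data from the text:

```python
from collections import Counter

# Monitoring stage: collect primary statistics - count frames per protocol.
captured = ["ip", "ip", "arp", "ip", "ipx", "ip", "arp"]
stats = Counter(captured)

# Analysis stage: compare with a baseline collected earlier and flag
# protocols whose share of traffic grew noticeably (threshold is arbitrary).
baseline = {"ip": 0.50, "arp": 0.40, "ipx": 0.10}
total = sum(stats.values())
suspicious = [p for p, n in stats.items()
              if n / total > baseline.get(p, 0) + 0.05]

print(stats["ip"], suspicious)
```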

Tools for monitoring the network and detecting "bottlenecks" in its operation can be divided into two main classes:

  • strategic;
  • tactical.

Strategic tools are intended for controlling a wide range of parameters of the operation of the entire network and for solving LAN configuration problems.

Tactical tools are intended for monitoring and troubleshooting network devices and network cabling.

The strategic means include:

  • network management systems
  • built-in diagnostic systems
  • distributed monitoring systems
  • diagnostic tools of operating systems running on large machines and servers.

The most complete control over network operation is provided by network management systems developed by firms such as DEC, Hewlett-Packard, IBM and AT&T. These systems are typically based on a separate computer and include systems for monitoring workstations, the cable system, connecting and other devices, a database containing control parameters for networks of various standards, as well as a variety of technical documentation.

One of the best network management products, giving the network administrator access to all of the network's elements down to the workstation, is Intel's LANDesk Manager package, which provides various tools for monitoring application programs, inventorying hardware and software, and protecting against viruses. This package provides, in real time, a variety of information about applications and servers, as well as data on user activity on the network.

Built-in diagnostic systems have become a standard component of network devices such as bridges, repeaters and modems. Examples of such systems are the OpenView Bridge Manager package from Hewlett-Packard and the Remote Bridge Management Software from DEC. Unfortunately, most of them are oriented toward the equipment of a single manufacturer and are practically incompatible with the equipment of other firms.

Distributed monitoring systems are special devices installed on network segments and intended for obtaining comprehensive information about traffic, as well as about violations in the operation of the network. These devices, usually connected to the administrator's workstation, are mainly used in multi-segment networks.

Tactical tools include various types of testing devices (network cable testers and scanners), as well as devices for integrated network analysis: protocol analyzers. Testing devices help the administrator detect faults in the network cable and connectors, while protocol analyzers are used to obtain information about the exchange of data on the network. In addition, this category of tools includes special software that makes it possible to receive detailed reports on the status of the network in real time.

Monitoring and analysis tools

Classification

The whole variety of tools used for monitoring and analyzing computer networks can be divided into several large classes:

Network management systems (Network Management Systems) - centralized software systems that collect data on the state of network nodes and communication devices, as well as data on the traffic circulating on the network. These systems not only monitor and analyze the network, but also perform network management actions in automatic or semi-automatic mode: enabling and disabling device ports, changing the parameters of bridges, switches and routers, and so on. Popular examples of control systems are HP OpenView, SunNet Manager and IBM NetView.

System management tools (System Management). System management tools often perform functions similar to those of control systems, but with respect to different objects. In the first case the control object is the software and hardware of the network's computers, and in the second it is the communication equipment. At the same time, some functions of these two types of control systems may be duplicated; for example, system management tools can perform the simplest analysis of network traffic.

Built-in diagnostic and control systems (Embedded Systems). These systems are implemented as software and hardware modules installed in communication equipment, or as software modules embedded in operating systems. They perform diagnostic and control functions for only one device, and this is their main difference from centralized control systems. An example of this class of tools is the control module of the Distributed 5000 concentrator, which implements port auto-segmentation when faults are detected, assignment of ports to the concentrator's internal segments, and some other functions. As a rule, built-in management modules also act as SNMP agents supplying device status data to control systems.

Protocol analyzers (Protocol Analyzers). These are software or hardware-software systems that, unlike control systems, are limited to the functions of monitoring and analyzing traffic on networks. A good protocol analyzer can capture and decode a large number of protocols used in networks, usually several dozen. Protocol analyzers make it possible to set logical conditions for capturing individual packets and to perform full decoding of captured packets, that is, to show the specialist, in a convenient form, how protocol packets of different levels are nested within one another, with decoding of the content of the individual fields of each packet.

Expert systems. This type of system accumulates the knowledge of technical specialists about identifying the causes of abnormal network operation and possible ways of bringing the network back to a working state. Expert systems are often implemented as separate subsystems of various network monitoring and analysis tools: network management systems, protocol analyzers, network analyzers. The simplest variant of an expert system is a context-sensitive help system. More complex expert systems are so-called knowledge bases with elements of artificial intelligence. An example of such a system is the expert system built into the Cabletron Spectrum control system.

Equipment for diagnosing and certifying cable systems. Conventionally, this equipment can be divided into four main groups: network monitors, devices for certifying cable systems, cable scanners, and testers (multimeters).

Network monitors (also called network analyzers) are intended for testing cables of various categories. Network monitors should be distinguished from protocol analyzers. Network monitors collect data only on the statistical indicators of traffic: the average intensity of the total network traffic, the average intensity of the stream of packets with a specific error type, and so on.

The purpose of devices for cable system certification follows directly from their name. Certification is performed in accordance with the requirements of one of the international standards for cable systems.

Cable scanners are used to diagnose copper cable systems.

Testers are designed to check cables for the absence of a physical break.

Multifunctional analysis and diagnostic devices. In recent years, due to the widespread distribution of local networks, it has become necessary to develop inexpensive portable devices that combine the functions of several devices: protocol analyzers, cable scanners, and even some network management features. An example of this kind of device is the Compas from Microtest Inc. or the 675 LANMeter from Fluke Corp.

Protocol analyzers

When designing a new network or modernizing an old one, it is often necessary to quantify certain characteristics of the network, such as the intensity of data flows over network communication lines, the delays arising at various stages of packet processing, the reaction times to requests of one kind or another, the frequency of occurrence of certain events, and other characteristics.

Different tools can be used for these purposes, first of all the monitoring tools in the network management systems already discussed in the previous sections. Some network measurements can also be made by software meters built into the operating system; an example is the Windows NT Performance Monitor component. Even cable testers, in their modern implementations, are capable of capturing packets and analyzing their contents.

But the most advanced network research tool is the protocol analyzer. The process of protocol analysis includes capturing the packets circulating on the network that implement a given network protocol and studying the contents of these packets. Based on the results of the analysis, one can make reasonable and balanced changes to some component of the network, optimize its performance, and troubleshoot problems. Obviously, in order to draw any conclusions about the effect of some change on the network, it is necessary to analyze the protocols both before and after the change is made.

The protocol analyzer is either an independent specialized device or a personal computer, usually a portable notebook-class machine, equipped with a special network card and the corresponding software. The network card and software used must match the network topology (ring, bus, star). The analyzer connects to the network in the same way as an ordinary node. The difference is that the analyzer can receive all the data packets transmitted over the network, whereas an ordinary station receives only those addressed to it. The analyzer software consists of a kernel, which supports the operation of the network adapter and decodes the received data, and additional program code that depends on the topology of the network under study. In addition, a number of decoding procedures oriented toward a specific protocol, such as IPX, are supplied. Some analyzers may also include an expert system that can give the user recommendations on which experiments should be carried out in a given situation, what particular measurement results may mean, and how to eliminate certain kinds of network faults.

Despite the relative variety of protocol analyzers on the market, certain features can be named that are, to one degree or another, inherent in all of them:

  • User interface. Most analyzers have a well-developed, friendly interface, usually based on Windows or Motif. This interface allows the user to: output the results of traffic intensity analysis; obtain instantaneous and averaged statistical assessments of network performance; specify certain events and critical situations in order to track their occurrence; decode protocols of different levels and present the contents of packets in an understandable form.
  • Capture buffer. The buffers of different analyzers differ in size. The buffer can be located on the installed network card, or it can be assigned space in the RAM of one of the network's computers. If the buffer is located on the network card, it is controlled in hardware, which increases the input speed. However, this raises the cost of the analyzer. If the capture procedure is not fast enough, some information will be lost and analysis will become impossible. The size of the buffer determines the ability to analyze more or less representative samples of the captured data. But however large the capture buffer is, sooner or later it will fill up. In this case, either the capture stops, or filling starts again from the beginning of the buffer.
  • Filters. Filters make it possible to control the data capture process and thereby save buffer space. Depending on the values of certain packet fields, specified in the form of filtering conditions, a packet is either ignored or written to the capture buffer. The use of filters considerably speeds up and simplifies analysis, as it eliminates the viewing of packets that are not needed at the moment.
  • Switches are operator-defined conditions for starting and stopping the process of capturing data from the network. Such conditions may be the execution of manual commands to start and stop the capture process, the time of day, the duration of the capture process, or the appearance of certain values in the data frames. Switches can be used together with filters, allowing more detailed and fine-grained analysis, as well as productive use of the limited capture buffer volume.
  • Search. Some protocol analyzers make it possible to automate the viewing of the information in the buffer and to find data in it by specified criteria. While filters check the input stream for compliance with the filtering conditions, the search functions are applied to the data already accumulated in the buffer.
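The interplay of the capture buffer, filters and search described above can be sketched as a toy model. The packet fields, the filter condition and the buffer size are invented for the illustration; a real analyzer works at the frame level, not with Python dictionaries:

```python
from collections import deque

# Toy model of a capture buffer with a filter: the buffer is a ring
# (when full, the oldest packets are overwritten), and only packets
# matching the filtering condition are written into it.
capture_buffer = deque(maxlen=3)          # small ring buffer

def packet_filter(pkt):
    # filtering condition on packet fields: keep only TCP to port 80
    return pkt["proto"] == "tcp" and pkt["dst_port"] == 80

packets = [
    {"proto": "udp", "dst_port": 53},
    {"proto": "tcp", "dst_port": 80},
    {"proto": "tcp", "dst_port": 22},
    {"proto": "tcp", "dst_port": 80},
    {"proto": "tcp", "dst_port": 80},
    {"proto": "tcp", "dst_port": 80},
]

for pkt in packets:
    if packet_filter(pkt):                # filter checks the input stream
        capture_buffer.append(pkt)        # oldest entry drops when full

# Search: applied to the data already accumulated in the buffer
found = [p for p in capture_buffer if p["dst_port"] == 80]
print(len(capture_buffer), len(found))  # 3 3
```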

The methodology for conducting analysis can be presented in the form of the following six stages:

  1. Capture data.
  2. View captured data.
  3. Data analysis.
  4. Error search. (Most analyzers facilitate this work by defining the types of errors and identifying the station from which an erroneous packet came.)
  5. Performance research. The utilization coefficient of the network bandwidth or the average response time to a request is calculated.
  6. A detailed study of certain sections of the network. The content of this stage is specified as the analysis is carried out.

Typically, the process of protocol analysis takes relatively little time: one or two business days.

Network analyzers

Network analyzers (not to be confused with protocol analyzers) are reference measuring instruments for diagnosing and certifying cables and cable systems. Examples are the Hewlett-Packard network analyzers HP 4195A and HP 8510C.

Network analyzers contain a high-precision frequency generator and a narrow-band receiver. By transmitting signals of different frequencies into the transmitting pair and measuring the signal in the receiving pair, one can measure attenuation and NEXT. Network analyzers are precision, large-sized and expensive (costing more than $20,000) devices intended for use in laboratory conditions by specially trained technical personnel.

Cable scanners

These devices make it possible to determine the cable length, NEXT, attenuation, impedance, the wiring scheme and the electrical noise level, and to evaluate the results obtained. The price of these devices varies from $1,000 to $3,000. There are many devices of this class, for example, scanners from Microtest Inc., Fluke Corp., Datacom Technologies Inc. and Scope Communication Inc. Unlike network analyzers, scanners can be used not only by specially trained technical personnel, but even by novice administrators.

To determine the location of a cable system fault (a break, a short circuit, an incorrectly installed connector, and so on), the "cable radar" method, or Time Domain Reflectometry (TDR), is used. The essence of this method is that the scanner emits a short electrical pulse into the cable and measures the delay time until the reflected signal arrives. The polarity of the reflected pulse indicates the nature of the cable damage (short circuit or break). In a correctly installed and terminated cable there is no reflected pulse at all.

The accuracy of the distance measurement depends on how accurately the speed of propagation of electromagnetic waves in the cable is known, and this speed differs from cable to cable. The speed of propagation of electromagnetic waves in a cable (NVP, Nominal Velocity of Propagation) is usually given as a percentage of the speed of light in a vacuum. Modern scanners contain a table of NVP data for all the main types of cables and allow the user to set these parameters independently after preliminary calibration.
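The distance computation behind TDR is simple enough to sketch: the distance to the fault is half the round-trip delay multiplied by the signal speed in the cable, which is the NVP fraction of the speed of light. The NVP value and the delay below are example numbers, not measured data:

```python
# TDR distance sketch: the scanner measures the round-trip delay of a
# reflected pulse; the one-way distance is (speed * delay) / 2.
C = 3.0e8            # speed of light in a vacuum, m/s

def fault_distance(delay_s, nvp):
    # delay_s: round-trip delay of the reflected pulse, seconds
    # nvp: nominal velocity of propagation, as a fraction of c
    return C * nvp * delay_s / 2

d = fault_distance(100e-9, 0.69)   # 100 ns delay; 0.69 is an example NVP
print(round(d, 2))  # 10.35 (meters)
```

Note how sensitive the result is to NVP: an error of a few percent in NVP translates directly into the same relative error in the distance, which is why scanners ship NVP tables and support calibration.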

The best-known manufacturers of compact cable scanners (their dimensions usually do not exceed those of a VHS video cassette) are Microtest Inc., Wavetek Corp. and Scope Communication Inc.

Testers

Cable system testers are the simplest and cheapest cable diagnostic devices. They make it possible to determine the continuity of the cable but, unlike cable scanners, do not answer the question of where the fault occurred.

Built-in network monitoring and analysis tools

SNMP Agents

Today there are several standards for management information bases. The main ones are the MIB-I and MIB-II standards, as well as the RMON MIB database version for remote control. In addition, there are standards for special MIBs for devices of a specific type (for example, MIBs for concentrators or MIBs for modems), as well as the private MIBs of specific equipment manufacturers.

The initial MIB-I specification defined only operations for reading the values of variables. Operations for changing or setting the values of objects are part of the MIB-II specification.
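The read-only versus read-write distinction can be illustrated with a toy in-memory model of an agent's MIB. This is a deliberate simplification, not a real SNMP implementation; only the OIDs (standard MIB-II system-group objects, sysDescr and sysName) come from the real specification, while the class, values and access check are invented for the example:

```python
# Toy model of an SNMP agent's MIB: get is always allowed, set only
# when the agent was configured with read-write access.

class ToyAgent:
    def __init__(self, read_write=False):
        self.read_write = read_write
        self.mib = {
            "1.3.6.1.2.1.1.1.0": "example router",   # sysDescr (MIB-II)
            "1.3.6.1.2.1.1.5.0": "gw-01",            # sysName (MIB-II)
        }

    def get(self, oid):
        # read operation: the only kind MIB-I defined
        return self.mib[oid]

    def set(self, oid, value):
        # set operation: refused under read-only access
        if not self.read_write:
            raise PermissionError("community is read-only")
        self.mib[oid] = value

ro = ToyAgent(read_write=False)
print(ro.get("1.3.6.1.2.1.1.5.0"))  # gw-01

rw = ToyAgent(read_write=True)
rw.set("1.3.6.1.2.1.1.5.0", "gw-core")
print(rw.get("1.3.6.1.2.1.1.5.0"))  # gw-core
```

This mirrors the "read only" community configuration mentioned in the introduction: a monitoring system polling such an agent can read every object but cannot change any of them.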



