Adapter

Since the line input of an audio adapter is the main route for an external signal during recording, every manufacturer tries to provide adequate gain at this input. The sensitivity of the line inputs of most audio adapters is about the same, and their quality parameters are proportional to the overall quality of the board. The situation with microphone inputs is quite different: a board costing $100 may have a microphone input far worse in sensitivity and quality than the line input of a simple board for $8. The reason is that the microphone input of a sound adapter is secondary: its function is most often limited to connecting the simplest cheap microphone for feeding voice commands, where noise level and frequency response are not so critical.

The microphone inputs of modern adapters are designed, as a rule, for electret microphones with a built-in amplifier powered by the adapter. Such a microphone has a high output impedance and produces up to 50-100 mV at its output, so a fairly simple preamplifier suffices to raise the signal to line-input level (about 500 mV). According to their documentation, some adapters also accept dynamic microphones, which need no power; but such a microphone produces only 1-3 mV at its output and requires a sufficiently sensitive, low-noise amplifier, which is quite rare on sound boards. A typical board will therefore, at best, give you a quiet, muffled sound from such a microphone, rich in noise and pickup, and in the worst case you will get no sound from a dynamic microphone at all. Electret microphones are preferred because the computer is a source of many electromagnetic emissions that cause tangible interference at a sensitive microphone input, and this is hard to fight: a low-noise amplifier would require a special board layout, thorough filtering of the supply voltages, shielding of the input-circuit area and other complex and expensive tricks.
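The gain figures above can be checked with a little arithmetic. The sketch below computes the voltage gain, in decibels, needed to raise each microphone type to line level; the signal levels are the ballpark values from the text, and the function name is ours, purely illustrative.

```python
import math

# Ballpark signal levels from the text (illustrative, not measured values).
line_level_mv = 500.0   # typical line-input level
electret_mv = 75.0      # electret capsule with built-in amplifier, 50-100 mV
dynamic_mv = 2.0        # bare dynamic microphone, 1-3 mV

def gain_db(v_out, v_in):
    """Voltage gain in decibels: 20 * log10(Vout / Vin)."""
    return 20.0 * math.log10(v_out / v_in)

print(f"electret to line: {gain_db(line_level_mv, electret_mv):.1f} dB")
print(f"dynamic  to line: {gain_db(line_level_mv, dynamic_mv):.1f} dB")
```

The roughly 48 dB needed for a dynamic microphone, versus about 16-17 dB for an electret, is why the simple preamp on a typical board is not enough.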

The microphone input connector on most adapters is monophonic: only the tip contact of the plug is used, the one that on a stereo connector carries the left-channel signal. The ring contact, which on a stereo connector carries the right channel, is either not used at all in the microphone connector or carries the +5 V supply voltage for an electret microphone. When there is no separate power contact, the supply voltage is fed directly to the signal input, and in that case the amplifiers must have capacitively coupled input and output.

Microphone

As we have found out, electret microphones are best suited for direct connection to the adapter. They are usually produced in fairly miniature form: as "pencils" on stands or as clips attached to clothing or to the monitor housing. They are inexpensive and sold in computer accessory stores; if you do not need a recording of near-professional quality, such a microphone is quite adequate. Otherwise a high-quality professional microphone is needed, which means a trip to a music equipment store, and its price will be about an order of magnitude higher.

Connecting a professional microphone will certainly raise a number of problems. Such microphones are most often dynamic and deliver a signal with an amplitude of only a few millivolts, and the microphone input of most sound adapters, as already mentioned, cannot properly handle such weak signals. There are two ways out: either buy a microphone preamplifier in the same music store (which can be a rather expensive toy) and connect its output not to the microphone input but to the line input of the adapter, or use a microphone with a built-in preamplifier and its own power supply (battery). If you have some electronics skills, you can assemble a simple amplifier yourself: circuit variants are easy to find in books and on the Internet.

In addition, professional microphones usually have XLR connectors, while computer audio adapters use mini-jack sockets, so an adapter cable will be needed. Such adapters are sometimes sold in music stores, but it may turn out that you have to solder one yourself.

Finally, it may well happen that any professional microphone will far exceed your sound adapter in quality parameters, and the sound you get with such a microphone will in the end be no better than what a simple electret can provide. So if you have doubts about the quality of your adapter (and simple adapters priced around $10, to say nothing of embedded ones, have very mediocre parameters), it makes sense to negotiate with the store about possibly returning the purchased microphone if you fail to obtain sufficiently high-quality sound with it.

Recording technique

Unlike fixed signal sources, a microphone has a number of peculiarities that must be taken into account when working with it. First of all, it is prone to acoustic feedback: if the amplified signal from the microphone reaches the speakers, the microphone picks it up, the signal is amplified again, and so on. So-called positive feedback forms, which drives the sound path into self-excitation, manifested as loud whistling, ringing or rumbling. Even if the path does not go into full self-excitation, positive feedback can add a ringing or whistling coloration that noticeably spoils the signal. A sensitive microphone can successfully pick up the signal even from headphones if the sound in them is loud enough and their outer sound insulation is weak. You therefore need to determine experimentally the microphone position and direction, and the volume of the amplified sound, at which positive feedback shows itself least. It is recommended to make the final recording with the speakers disconnected or turned down as far as possible.

Sensitive microphones, especially simple and cheap ones, readily pick up extraneous sounds such as fingers drumming on the microphone body or slight shaking of the body itself, even from a minor squeeze (you have surely heard such sounds during telephone conversations). To avoid this kind of interference, it is better to put the microphone on a convenient stand or hold it loosely, without gripping it with your fingers.

Another unpleasant aspect of using a microphone is breath blast, which shows up particularly on plosive consonants such as "p", "b", "t" and the like. The intense air pulse hitting the membrane produces a sharp jump in signal amplitude that overloads the amplifier and/or the ADC. Professional microphones have a windscreen against this: a mesh or soft pad placed at some distance from the capsule. But even that does not always help, so with any microphone you have to get used to holding it either at an angle, so that direct air flows pass by, or at a sufficient distance, so that they reach the microphone in an already weakened state.

Experimenting with the microphone, you will find that the timbre of the recorded voice depends quite strongly on the distance from mouth to microphone and on the microphone's angle to the face. This is because the low-frequency components of the voice disperse and weaken most with distance, while the high-frequency components weaken less but are more directional. The richest, most velvety voice timbre is obtained by placing the microphone right at the mouth, but then you will have to fiddle with the angle of inclination and practice a lot to avoid "popping".

Recording via external devices

Recently, rather exotic ways of recording sound from a microphone and transferring it to the computer have appeared. Creative, for example, produces the Jukebox digital player, which contains a miniature hard disk drive, an autonomous controller and a USB interface. The player's main function is to play sound files uploaded to it from the computer, but the built-in microphone lets it serve as a standalone voice recorder: sound is written to the hard disk, allowing several hours of continuous recording, and the phonogram can later be transferred to the computer. Another Creative product, the PC-CAM, is a hybrid of digital camera, camcorder and voice recorder that records sound into built-in flash memory, from which it is retrieved over the same USB interface.

Removing noise and interference

Since a voice signal has a rather narrow spectrum (from hundreds of hertz to a few kilohertz), noise removal can be applied with greater depth than in the case of an arbitrary music signal. It may also happen during recording that in the most successful take (from an artistic point of view) the microphone was nevertheless overloaded in one or several places, and attempts to repeat the phrase or verse with equally successful placement of accents do not give the desired result. In such cases you can try rounding off the overload pulses by smoothing or reducing their amplitude. With a small number of pulses this is conveniently done by hand, zooming into the waveform until the nodal points appear that can be dragged with the mouse.

Voice processing methods

As we have already said, a complex music signal contains many heterogeneous components on which most sound-processing methods act differently, so the range of universal methods for processing such a signal is very narrow. The most popular is reverberation, which imitates multiple reflections of sound waves and creates the effect of a space: a room, a hall, a stadium, a mountain canyon, etc. Reverb allows you to give richness and volume to a "dry" sound. The remaining universal processing methods come down to manipulating the frequency response (equalizer) and cleaning the phonogram of noise and interference.

For a primary, simple sound signal, by contrast, the entire range of existing processing methods can be used successfully: amplitude, frequency, phase, time-based, formant and the like. Methods that on a complex signal produce unintelligible cacophony are often able, on simple signals, to create very interesting and vivid effects widely used in the sound industry.

Editing

Computer editing of speech phonograms, a typical journalist's job after recording an interview, is both simple and complicated. At first it seems simple, thanks to the structure of speech, which is convenient for visual analysis: noticeable pauses between words, amplitude bursts at points of stress, and so on. However, when you try, for example, to swap two phrases separated by literally seconds, it turns out that they refuse to join: the intonation, the phase of breathing, the background noise have managed to change, and the seam is clearly audible. Such seams are easily discernible in almost any radio interview with a person who is not a professional radio journalist and therefore does not know how to say only what should go on air. Too much is cut out of the speech, some fragments are rearranged for better conformity with the meaning, and as a result the ear is constantly "surprised", because in the flow of natural human speech such intonational and dynamic transitions do not occur.

To smooth out transition effects you can use crossfading, although it will match fragments of speech only in amplitude, not in intonation or background noise. We therefore consider it necessary to warn those to whom computer editing may seem a convenient way to falsify a recording, for example of negotiations: forensic examination can easily reveal even a splice indistinguishable by ear, just as in the case of documents faked using a scanner and printer.
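The crossfade just mentioned can be sketched in a few lines. This is a minimal linear crossfade over lists of samples; real editors usually work on interleaved multichannel audio and often use equal-power (cosine) curves instead of linear ones. The function name is ours.

```python
# Minimal linear crossfade between two speech fragments (lists of samples).
def crossfade(a, b, overlap):
    """Join fragment a to fragment b, blending the last `overlap` samples
    of a with the first `overlap` samples of b."""
    if overlap > min(len(a), len(b)):
        raise ValueError("overlap longer than a fragment")
    blended = []
    for i in range(overlap):
        t = i / overlap            # fade coefficient, 0 -> 1
        blended.append(a[len(a) - overlap + i] * (1 - t) + b[i] * t)
    return a[:len(a) - overlap] + blended + b[overlap:]

joined = crossfade([1.0] * 8, [0.0] * 8, 4)
```

Over the overlap region the first fragment fades out exactly as the second fades in, which hides the amplitude discontinuity at the splice.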

Amplitude processing

The simplest kind of dynamic amplitude processing of a voice is modulating it with a periodic signal: the amplitudes of the two signals are multiplied, and the voice acquires the amplitude characteristics of the modulating signal. Modulating with a low-frequency (a few hertz) sinusoidal signal gives a "pulsing" voice; increasing the frequency of the signal gives a vibrating one. Using a rectangular, triangular or sawtooth waveform instead of a sinusoidal one, you can give the voice metallic, distorted, "robotic" intonations.
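The modulation described here amounts to multiplying the signal by a low-frequency oscillator. A minimal tremolo sketch follows; the function name and parameters are ours, not Cool Edit's.

```python
import math

def tremolo(samples, sample_rate, rate_hz, depth=1.0):
    """Multiply the signal by a sine LFO; depth=1.0 swings the level 0..1.
    A few hertz gives the 'pulsing' voice, higher rates a vibrating one."""
    out = []
    for n, x in enumerate(samples):
        lfo = 0.5 * (1.0 + math.sin(2 * math.pi * rate_hz * n / sample_rate))
        out.append(x * (1.0 - depth + depth * lfo))
    return out

modulated = tremolo([1.0] * 100, sample_rate=100, rate_hz=1.0)
```

Swapping the sine for a square or sawtooth LFO gives the harsher "robotic" colorations mentioned above.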

Amplitude modulation of a selected fragment of the phonogram is performed as part of the Generate > Tones operation. The Base Frequency field sets the main frequency of the signal in hertz, the Flavor field the waveform, and the Duration field the duration in seconds. The Volume sliders set the signal level.

The Frequency Components slider group defines the levels of the harmonics of the main signal with the specified numbers. Frequency modulation of the signal can be obtained using the Modulate By field (deviation from the main frequency in hertz) and the Modulation Frequency field (the modulation rate). With the Lock ... box checked, all these parameters, including the main frequency, are stationary; unchecking it lets you set their initial and final values on the Initial/Final Settings tabs, and they will change linearly over the generated segment.

The Source Modulation group of fields defines how the generated signal will be used. By default, when none of these fields is checked, the signal is inserted into the phonogram or replaces the selected fragment; otherwise it is used to perform the given operation on the selected fragment: Modulate performs ordinary modulation (multiplication), Demodulate performs demodulation (division), and Overlap (Mix) simply mixes the signals. Sequential modulation and demodulation with the same signal restores the original (possibly with a changed overall level). Experiments with different combinations of parameters sometimes give very amusing and unexpected results.

Time-based processing

This type of processing is based on shifting the source signal in time and mixing the result with the source signal, after which the shift and mix may be applied again. With shifts by small intervals, comparable to the period of the original signal, interference-like phase effects arise and the sound acquires a specific coloration; this effect is called flanging (flanger) and is used both with a fixed amount of shift and with a periodically varying or even random one. With shifts by intervals exceeding the period but no more than 20 ms, the chorus effect arises. Thanks to the common technique, these two effects are often implemented by a single software block with different parameters.
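The flanger can be sketched as a delay line whose length sweeps over a few milliseconds; fixing the delay at around 20 ms would turn the same block into a crude chorus, as the text notes. The names and parameter choices below are our own illustration.

```python
import math

def flanger(samples, sample_rate, min_delay_ms=1.0, max_delay_ms=5.0,
            rate_hz=0.5, mix=0.5):
    """Mix the signal with a copy whose delay sweeps sinusoidally
    between min_delay_ms and max_delay_ms."""
    out = []
    for n, x in enumerate(samples):
        sweep = 0.5 * (1 + math.sin(2 * math.pi * rate_hz * n / sample_rate))
        delay_ms = min_delay_ms + (max_delay_ms - min_delay_ms) * sweep
        d = int(delay_ms * sample_rate / 1000)   # delay in samples
        delayed = samples[n - d] if n >= d else 0.0
        out.append((1 - mix) * x + mix * delayed)
    return out

processed = flanger([1.0] * 2000, sample_rate=8000)
```

The moving delay produces the sweeping comb-filter coloration; replacing the sine sweep with a random one gives the "random" variant mentioned above.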

With multiple shifts at intervals of 20...50 ms, the effect of reverberation arises: boominess, spaciousness, because the auditory apparatus interprets the delayed copies of the signal as reflections from surrounding objects. At intervals over 50 ms the ear ceases to clearly associate the individual copies with each other, and the echo effect arises.

In Cool Edit 2000 the time-delay effects are grouped under Transform > Delay Effects. The Flanger and Chorus effects are created by the Flanger operation:

The Original/Delayed slider controls the ratio of the source and delayed signals (the intensity, or depth, of the effect). Initial/Final Mix Delay sets the initial and final delay of the copy; the delay changes cyclically within these limits. Stereo Phasing, the phase-shift angle between channels, lets you create a curious effect of "twisting" the sound, especially in headphones. Feedback sets the depth of feedback (the share of the resulting signal mixed back into the input) and lets you control harshness and sharpness.

The Rate group sets the cyclic behaviour of the effect. Period is the time over which the delay passes from the initial value to the final and back; Frequency is the reciprocal value, the rate of these back-and-forth passes; Total Cycles is the number of full passes over the selected fragment. Setting any one parameter causes automatic recalculation of the others.

The Mode group controls features of the effect: Inverted inverts the delayed signal, Special EFX additionally inverts both the original and delayed signals, and Sinusoidal makes the delay change from initial to final by a sinusoidal law (when it is disabled, the delay changes linearly).

A set of presets lets you explore the operation's features quickly. Try selecting a few presets, changing the parameters in each of them, and do not forget to roll back (Undo) each time, so as to compare how various parameter combinations affect the sound.

The reverb effect in Cool Edit 2000 can be implemented in two ways: by the Echo Chamber operation, a simulator of a room with specified dimensions and acoustic properties, and by the Reverb operation, a generator of the spaciousness effect based on an algorithm of multiple reflections built into the editor. Since this type of processing is universal and applicable to any sound material, we shall briefly describe the second way as the more popular.

The Total Reverb Length field/slider defines the reverberation time over which the reflected signals fade away completely; it is indirectly related to the volume of the space in which the sound propagates. Attack Time is the time for the reverberation to build up to its nominal level; it serves to let the effect appear smoothly over the processed fragment. High Frequency Absorption Time is the absorption time of the high-frequency components, proportional to the "softness" and "envelopment" of the reverberant sound. Perception is the degree of distinctness: smaller values (smooth) give weak, soft reflections that do not interrupt the main signal; larger values (echoey) give clear, strong, distinctly audible reflections that can worsen the intelligibility of speech.

The Mixing sliders/fields define the ratio of the original (dry) and processed (wet) signals in the result.
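The multiple-reflection idea behind the Reverb operation can be illustrated with a single feedback comb filter; real reverbs, including Cool Edit's, combine several combs and all-pass filters, so treat this only as a sketch of the principle. The dry/wet mix mirrors the Mixing controls described above; the function name is ours.

```python
def comb_reverb(samples, sample_rate, delay_ms=40.0, decay=0.5, wet=0.5):
    """Mix the dry signal with a feedback-delayed copy of itself:
    each sample picks up a fading 'reflection' from delay_ms earlier."""
    d = int(delay_ms * sample_rate / 1000)
    out = [0.0] * len(samples)
    for n, x in enumerate(samples):
        reflected = out[n - d] * decay if n >= d else 0.0
        out[n] = x + reflected
    return [(1 - wet) * x + wet * y for x, y in zip(samples, out)]

# An impulse produces a train of reflections fading by `decay` each time.
processed = comb_reverb([1.0] + [0.0] * 199, sample_rate=1000)
```

The 40 ms delay sits in the 20...50 ms range the text gives for reverberation; `decay` plays the role of the reverb length, and `wet` the dry/wet mixing ratio.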

The echo effect is implemented by the Echo operation, which adds to the signal gradually fading copies of it, shifted by equal intervals of time. The Decay control sets the damping: the level of each successive copy as a percentage of the previous one. Initial Echo Volume is the level of the first copy as a percentage of the source level. Delay is the delay between copies in milliseconds. The Successive Echo Equalization group of controls adjusts an equalizer through which each successive copy is passed, which allows you to set various acoustic characteristics of the simulated space.
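The Decay, Initial Echo Volume and Delay parameters map directly onto a simple loop. The sketch below is our own naive implementation, not Cool Edit's algorithm; it also reserves room for the echo "tail" beyond the source fragment.

```python
def echo(samples, sample_rate, delay_ms, initial_volume, decay, copies):
    """Mix progressively fading, equally spaced copies into the signal.
    initial_volume and decay are fractions (0..1), not percentages."""
    d = int(delay_ms * sample_rate / 1000)
    out = list(samples) + [0.0] * (d * copies)   # room for the "tail"
    level = initial_volume
    for k in range(1, copies + 1):
        for n, x in enumerate(samples):
            out[n + k * d] += x * level
        level *= decay
    return out

# A single impulse: copies at equal intervals, each half the previous one.
tail = echo([1.0], sample_rate=1000, delay_ms=100,
            initial_volume=0.5, decay=0.5, copies=3)
```

Each repetition lands `delay_ms` later and is scaled by `decay` relative to the previous copy, exactly the relationship the Echo dialog exposes.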

Since the effect "continues" in time, it can create a sound fragment longer than the source. For this there is the Continue Echo Beyond Selection checkbox: permission to mix the echo signal into the section of the phonogram that continues beyond the selected fragment. In that case only the selected fragment is taken as the source signal, and the remaining part of the phonogram is used solely to accommodate the "tail". If the phonogram does not have enough room for the "tail", an error message is issued and you will have to append a section of silence (Generate > Silence) to the end of the phonogram.

The effect is best perceived on relatively short sounds. On long words or phrases, to avoid "gibberish" (multiple repetitions of different syllables or words interrupting one another), it is better to apply the effect "at the end", selecting for repetition only a short final fragment or even just the last stressed syllable of the word. Try experimenting with different words and phrases to get a feel for which final part is best "multiplied" in each case.

Spectral processing

The most striking and interesting effect of this class implemented in Cool Edit 2000 is the change of pitch and speed. Everyone knows the effect of raising or lowering the pitch of a signal when the tape speed of a recorder, or the rotation speed of a record, is changed. With the development of digital signal-processing methods it became possible to implement each of these effects separately: a change of pitch while preserving the time characteristics, or vice versa.

This type of processing in Cool Edit 2000 is provided by the Transform > Time/Pitch > Stretch operation. Two variants are possible: with a constant or with a sliding (gliding) coefficient. The coefficients are set by the Initial/Final Ratio fields, which for convenience are also linked to sliders. The coefficient can also be set indirectly, via the Transpose field, as a number of chromatic semitones up (sharps) or down (flats). In the duration-change mode, the Length field is also available, in which you can specify the desired length of the resulting fragment.

The Precision switch sets the processing accuracy: Low, Medium or High. It is needed because spectral processing requires a mass of computation, and reducing the accuracy speeds up processing, which is useful at least at the experimental stage. The Stretching Mode switch sets the type of processing: Time Stretch is acceleration or deceleration in time, Pitch Shift is a shift of pitch, and Resample is simple resampling, similar to changing the speed of a tape or record.

The Pitch and Time Settings parameter group controls the details of the operation. Processing is performed by splitting the fragment into small sound blocks; the Splicing Frequency parameter sets the number of such blocks per second of the fragment. Increasing this "sampling frequency" makes the blocks smaller, improving the naturalness of the processing, but at the same time intensifies the chopping effect, generating unpleasant artifacts. The Overlapping parameter sets the degree of overlap of adjacent blocks when assembling the resulting signal; a small mutual overlap helps smooth out the artifacts at their joins. Choose Appropriate Defaults automatically sets these parameters to the values most suitable from the editor's point of view.
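Of the three Stretching Mode variants, resampling is simple enough to sketch directly: it changes pitch and duration together, like changing tape speed. Proper time stretch or pitch shift needs the block-splicing machinery described above. The function below is our illustration, using linear interpolation.

```python
def resample(samples, ratio):
    """ratio > 1 speeds up and raises pitch; ratio < 1 slows and lowers it.
    Linear interpolation between neighbouring samples."""
    out = []
    pos = 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
        pos += ratio
    return out

sped_up = resample(list(range(10)), 2.0)   # half the length, an octave up
```

A transposition by n chromatic semitones, as in the Transpose field, corresponds to a ratio of 2 ** (n / 12).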

This article completes a short series on recording and processing sound on a home computer.

ComputerPress 12'2002

In addition, various mathematical methods are applied to digital phonograms, for example interpolation of samples (Repair) or proportional level correction (Normalize).
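Normalization, for instance, is just proportional scaling so that the loudest sample reaches a target peak level. A minimal sketch (the function name is ours):

```python
def normalize(samples, target_peak=1.0):
    """Scale the whole phonogram so its peak equals target_peak."""
    peak = max(abs(x) for x in samples)
    if peak == 0:
        return list(samples)          # silence: nothing to scale
    gain = target_peak / peak
    return [x * gain for x in samples]

normalized = normalize([0.1, -0.5, 0.25])
```

Because every sample is multiplied by the same gain, the relative dynamics of the recording are preserved.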

Spectral transformations affect the timbre of the sound. These include various filters, high-pass, low-pass or band-pass, and equalizers, parametric or graphic.
An important particular case of spectral transformations are formant transformations: manipulations of formants, the characteristic frequency bands found in sounds uttered by humans. By changing formant parameters you can emphasize or mask individual sounds, change one vowel into another, shift the register of the voice, etc.
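The simplest of these spectral transformations, a first-order low-pass filter, reduces to a one-line recurrence; high-pass and band-pass filters are built from the same ingredients. This is our own illustration, not any particular editor's filter, and the cutoff is only approximate.

```python
import math

def low_pass(samples, sample_rate, cutoff_hz):
    """First-order low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1]),
    with a derived from the RC time constant of the cutoff."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    a = dt / (rc + dt)
    out = []
    y = 0.0
    for x in samples:
        y += a * (x - y)
        out.append(y)
    return out

# Step input: the output rises smoothly toward 1, as high frequencies
# (the sharp edge) are attenuated.
smoothed = low_pass([1.0] * 100, sample_rate=8000, cutoff_hz=1000)
```

Subtracting the low-passed signal from the original would give a crude high-pass; cascading the two gives a band-pass.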

Delay effects are based on the time delay of one copy of the signal relative to another. Such effects can create the illusion of a space or room (reverb, echo, etc.), the illusion of multiple sound sources (chorus), or the illusion of motion (phasers, flangers).

Modulation of signal parameters. In effects such as the phaser, the signal's phase is modulated by a low-frequency oscillation (with a frequency well below the minimum audible frequency of 20 Hz). With amplitude modulation the tremolo effect is implemented, and with frequency modulation, vibrato.
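Vibrato, for example, can be sketched by modulating the read position of the signal with a sub-audio sine, which periodically raises and lowers the pitch. A naive illustration with linear interpolation (the names are ours):

```python
import math

def vibrato(samples, sample_rate, rate_hz=5.0, depth_samples=8.0):
    """Oscillate the read position around n; the changing slope of the
    position modulates the instantaneous frequency of the output."""
    out = []
    for n in range(len(samples)):
        pos = n - depth_samples * (1 + math.sin(2 * math.pi * rate_hz * n
                                                / sample_rate))
        i = max(0, min(len(samples) - 2, int(pos)))
        frac = min(max(pos - i, 0.0), 1.0)
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
    return out

wobbled = vibrato([float(n) for n in range(1000)], sample_rate=1000)
```

The 5 Hz rate is well below the 20 Hz audibility limit mentioned above, so the listener hears it as a pitch wobble rather than as a tone of its own.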

Sound editors

This type of program includes software for editing and generating audio data. A sound editor can be implemented wholly or partly as a library, an application, a web application or an OS kernel extension module.

A wave editor is a digital audio editor usually designed for recording and editing music, applying effects and filters, assigning stereo channels, etc.

A digital audio workstation (DAW) is a program with broader capabilities, usually consisting of a variety of components combined under one graphical interface. The most obvious practical distinguishing feature of a DAW is a full-featured MIDI sequencer. Many DAWs also have video-editing tools intended for creating music videos.

Sound editors designed to work with music, as a rule, allow the user to:

  • import and export audio files of various formats,
  • record sound from one or more inputs and save it in the computer's memory in digital form,
  • edit the phonogram on a timeline using transitions (fade in, fade out, crossfading),
  • mix multiple sound sources/tracks with different levels of volume, panning, etc., and route them to one or more output channels,
  • apply various effects and filters, including compression, expansion, various types of modulation, reverb, noise suppression, equalization, etc.,
  • play back the sound, routing it to output devices such as speakers, external processors or recording devices,
  • convert sound between audio formats and change the characteristics of the analog-to-digital conversion (bit depth and sampling rate).

"Destructive" and "non-destructive" editing

Sound editors allow both "non-destructive" editing in real time and "destructive" editing, i.e. a separate transformation process not tied to the playback or export of the phonogram, as well as combinations of the two.

Destructive editing changes the source audio file, while non-destructive editing only changes its playback parameters. For example, if part of a track is removed during destructive editing, that data is really deleted. With non-destructive or real-time editing, the deleted data remains but is not played.
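The difference can be sketched in a few lines: a destructive cut discards samples, while a non-destructive one merely records which regions of the untouched source to play. The names below are illustrative, not any editor's API.

```python
def destructive_cut(samples, start, end):
    """Physically remove samples[start:end] from the data."""
    return samples[:start] + samples[end:]

def nondestructive_playlist(length, start, end):
    """Describe playback as (begin, end) regions of the intact source."""
    return [(0, start), (end, length)]

source = list(range(10))
cut = destructive_cut(source, 3, 6)              # data really gone
plan = nondestructive_playlist(len(source), 3, 6)
played = [x for a, b in plan for x in source[a:b]]  # same audible result
```

Both approaches sound identical on playback, but only the second leaves `source` unchanged, which is what makes later undo and reordering of edits possible.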

Advantages of destructive editing:

  • In a graphical editor, all the changes made can be observed visually.
  • The number of effects that can be applied is almost unlimited (or limited only by the disk space allocated for the history).
  • Editing is usually precise, down to an individual sample.
  • Effects can be applied to a strictly defined region, with sample accuracy.
  • Mixing and export of the edited sound is fast, since it does not require computing the applied effects.

Restrictions of destructive editing:

  • Once applied, an effect cannot be changed. True, there is the option to "undo" the last action; usually the editor supports many levels of undo history, so several actions can be undone in the reverse of the order in which they were applied.
  • The undo order cannot be changed (the last edit is always undone first, and so on).

Advantages of real-time editing:

  • Effects can usually be adjusted during playback or at any other time.
  • Editing can be cancelled or adjusted at any time, in any order.
  • Several effects can be applied in sequence, and the sequence can be changed; effects can be removed from the chain or added to it.
  • Many editors support effect automation, i.e. automatic changes of an effect's parameters during playback.

Restrictions of real-time editing:

  • The waveform displayed on the timeline remains the same; the applied effects do not affect it.
  • The number of effects that can be applied is limited by the power of the computer or device. Some editors have a track "freezing" function (destructive rendering of the effects stack).
  • As a rule, an effect cannot be applied to only part of a track. To apply a real-time effect to part of a track, the effect is switched on at one point and off at another.
  • In multi-track editors, if audio is copied or moved from one track to another, the sound on the new track may differ from how it sounded on the source track, since different real-time effects may be applied to each track.
  • Mixing and export are slower, since the applied real-time effects must additionally be computed.

B. Blesser, J. M. Kates

2.1. Introduction

The tasks of sound engineering include the recording, storage, transmission and reproduction of signals perceived by people through the organs of hearing. In practice such signals are most often ordinary music, though they also include birdsong, electronic music, theatrical performances, hydroacoustic signals, etc. In contrast to the tasks of digital speech processing, where the main requirement is intelligibility, digital sound processing must in most cases also take into account certain criteria of fidelity of sound reproduction. Such criteria are inevitably subjective, since the final judgement on sound quality rests on the listeners' perception of the signals. For this reason this chapter will often speak of human perception, and one of the main problems for specialists in acoustics is determining the technical parameters of sound signals that affect how these sounds are perceived by people. Owing to the wide prevalence and importance of devices for reproducing music, most work in the field of digital sound systems is associated with music. In the remainder of this chapter, music converted to a digital signal will be treated as representative of the wide class of signals called audio signals.

From its inception, sound engineering has stood at the junction of various disciplines, drawing on the achievements of chemistry and physics, especially such areas as electronics, magnetism and acoustics. Digital signal processing, which in its essence is perhaps rooted most of all in mathematics, is the newest branch of science to enter the "sound family". Many experts believe it will lead to a leap in the quality characteristics of sound systems. Although digital signal-processing methods are only beginning to be applied in sound engineering, the potential opportunities associated with them are already visible. At the time this book was written, this area of technology was at the initial stage of its development; many of the most complex digital processing methods had not yet found application in audio systems. There is no doubt that this situation will change in the near future.

The need for digital processing of audio signals is not obvious at first glance. Consider, therefore, at least some of the difficulties involved in bringing music into the listener's home. The chain of technical devices through which sound passes from microphone to loudspeaker turns out to be very long. It may include up to 100 independent systems, each of which performs its useful function but introduces distortion. Quite often each instrument of an ensemble is recorded on a separate track of a multichannel tape recorder, and the number of these channels can reach 24. Such a process gives the sound engineer great flexibility: one can, for example, re-record the part of any single instrument if necessary. It also helps the performer get rid of background acoustic noise. However, with such recording the sound becomes somewhat unnatural and differs from what is heard at a performance in a concert hall, since the recording lacks reverberation, and noticeable spectral distortions may appear depending on the position of the microphone. Such shortcomings can often be eliminated by correcting the signals while mixing them. The mixing console allows the sound engineer to process each track of the primary recording in a different way. The most common processing methods include the introduction of artificial reverberation and other special effects, spectrum equalization, dynamic-range compression, noise suppression and limiting. In its complexity this process, and the devices performing it, approach the functions and equipment of a space-flight control center (NASA).

After a highly qualified sound engineer combines the processed primary signals into a secondary stereo or quadraphonic recording, it undergoes additional processing to form a signal suitable for recording on disc or magnetic tape. The resulting master tape is used to drive a precision record-cutting lathe or tape duplicator. Recently, record cutters have also acquired their own complex signal-processing systems, intended for dynamic control of the cutting head and for creating the compensation and pre-distortion used, within the framework of nonlinear processing, both in manufacturing and in playing records. Moreover, the master copy obtained on the cutting lathe is only the result of the first stage of a complex process at the end of which the record is obtained that is played at home or in the studio. Sound follows a similarly long path in broadcasting. The loudspeakers and acoustic system in the listener's home form the important last link of the sound-reproducing chain. Thus the process of sound reproduction can be represented as three main stages:

1. Creating and writing initial signals.

2. Storage and transmission of these signals.

3. Playing signals in the form of acoustic waves.

It may seem that some elements of this complex reproduction process are optional; it turns out, however, that each stage matters, often as a means of correcting technical errors introduced at another stage. For example, signal compression at the initial recording stage is necessary because storage devices have a limited dynamic range.

Many developments in digital sound engineering are intended to replace the weak links of the recording or transmission chain. Examples include digital tape recorders and digital audio broadcasting systems. Simple in theory, these systems are complicated in implementation, yet their creation has led to a sharp improvement in the quality of sound reproduction. Mixing-console control has also been moved to digital equipment, freeing the sound engineer from the difficult duty of manually adjusting hundreds of parameters in real time. Digital electronic reverberators have replaced mechanical reverberation devices. Synthesizers have been created that, from an ordinary stereo pair at home, recreate the acoustic fields characteristic of large halls.

In laboratories, still more advanced methods have been applied to restore old sound recordings. Restored records now exist of the performances of Caruso made at the beginning of the twentieth century; after correction, recordings of extremely low quality began to sound much better. Digital processing is also applied in research aimed at improving electroacoustic transducers. In the sound-reproducing chain the loudspeaker is one of the weakest and least studied links. It affects the amplitude, phase and spatial characteristics of the resulting sound signal and introduces various kinds of distortion. Digital signal processing is used for experimental determination of the physical characteristics of acoustic transducers, as well as for assessing the effect of these characteristics on the perception of sound.

All such systems share common blocks: analog-to-digital and digital-to-analog converters (ADC and DAC). In view of their fundamental importance, these converters will be considered here separately. Any distortion introduced at this conversion stage can largely negate the benefits of digital processing. The characteristics of the converters must be matched to the peculiarities of the perception of sound signals, for several reasons.

1. An excessively large bit depth when quantizing samples in the ADC comes at great economic cost, and because of the high rate at which information then arrives, it may demand too much speed at the subsequent processing stages.

2. Distortions registered by instruments are not always audible.

The question is further complicated by design problems that can significantly affect the quality of the system. For these reasons various conversion methods exist, and the choice among them is determined by the purpose of the whole system.

The engineer should know the relationship between the physical and electrical characteristics of the system and the perceived sound quality. The classic definition of the signal-to-noise ratio, for example, is based on the ratio of the maximum signal power to the noise power measured in the absence of a signal. However, the perception of noise depends on the degree of its spectral similarity to, or difference from, the signal, on its probability distribution, and on how the noise changes over time. Thus two noise processes differing in power by 20 dB can be perceived by the ear as equally disturbing.

Such examples indicate that the theory of sound systems must rest on psychoacoustic studies more than on systems theory. Systems theory offers ways to solve the problem, while psychoacoustics in this case describes the nature of the desired result. In the example above, the goal is to make the noise inaudible, although it is not necessary to suppress it completely. The economic consequences of choosing the wrong goal can be very sad. As a rule, the noise of a 16-bit ADC is neither audible nor instrumentally noticeable, but such a converter costs nearly a hundred times more than a 12-bit ADC. Sound equipment should therefore be designed with the peculiarities of both the hardware and the human hearing system in mind, so as to optimize the subjective assessment of reproduction quality.

Sampling is the recording of sound samples of a real musical instrument. Sampling is the basis of wavetable synthesis (WT synthesis) of musical sounds. Whereas in frequency-modulation synthesis (FM synthesis) new sounds are obtained by variously processing simple standard oscillations, the basis of WT synthesis is pre-recorded sounds of traditional musical instruments, or sounds accompanying various processes in nature and technology. With samples you can do whatever you like. You can leave them as they are, and the WT synthesizer will speak in voices almost indistinguishable from those of the original sources. Or you can subject the samples to modulation, filtering and effects, and obtain the most fantastic, unearthly sounds.

In principle, a sample is nothing but a sequence of digital readings stored in the synthesizer's memory, obtained by analog-to-digital conversion of a musical instrument's sound. If saving memory were not a concern, the sound of every note of every musical instrument could simply be recorded, and playing such a synthesizer would amount to replaying these recordings at the necessary moments. In reality, samples are not stored in the form in which they leave the ADC. The recording is first dissected: it is divided into characteristic parts (phases) - the beginning, the sustained portion, and the end of the sound. Depending on the proprietary technology used, these parts may be divided into even smaller fragments. Not the entire recording is stored in memory, but only the minimum information about its fragments needed to reconstruct it. The length of the sound is changed by controlling the number of repetitions of individual fragments.
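A minimal sketch of this fragment-repetition idea (in Python with NumPy; the function name and the attack/loop boundaries are illustrative, not taken from any real synthesizer): a long note is rebuilt by playing the recorded attack once and then repeating the sustained fragment as many times as needed.

```python
import numpy as np

def sustain(sample, attack_len, loop_len, total_len):
    """Rebuild a long note from a short recording: play the attack
    phase once, then repeat the steady (sustained) fragment until
    the note reaches total_len samples."""
    attack = sample[:attack_len]
    loop = sample[attack_len:attack_len + loop_len]
    reps = int(np.ceil((total_len - attack_len) / loop_len))
    return np.concatenate([attack] + [loop] * reps)[:total_len]
```

Only `attack_len + loop_len` samples are stored, however long the note must sound.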

For even greater memory savings, a synthesis method was developed that stores samples not for every note but only for some. The pitch of the remaining notes is then obtained by changing the playback speed of the sample.
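As a rough illustration of this speed-change technique (Python/NumPy; the function and parameter names are mine, and linear interpolation is the simplest possible resampler, not what any particular synthesizer chip actually uses), replaying the stored readings faster raises the pitch:

```python
import numpy as np

def resample_pitch(sample, semitones):
    """Shift the pitch of a stored sample by replaying it at a
    different speed, linearly interpolating between readings."""
    ratio = 2.0 ** (semitones / 12.0)           # playback-speed factor
    positions = np.arange(0, len(sample) - 1, ratio)
    idx = positions.astype(int)
    frac = positions - idx
    return (1 - frac) * sample[idx] + frac * sample[idx + 1]

# A 440 Hz sine replayed 12 semitones up becomes an 880 Hz sine
sr = 8000
t = np.arange(sr) / sr
a4 = np.sin(2 * np.pi * 440 * t)
a5 = resample_pitch(a4, 12.0)
```

Note the side effect the text implies: the resampled note is also shorter (here, half the length), which is why practical synthesizers combine this with loop-based sustain.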

A synthesizer is used to create and play samples. Nowadays the synthesizer fits into one or two chip packages and is a specialized processor that performs all the necessary conversions. From fragments encoded and compressed with proprietary algorithms it assembles the sample, sets the pitch of its sound, and shapes the envelope of the oscillation according to the musician's intent, imitating either a barely perceptible touch or a sharp strike on a key or string. In addition, the processor adds various effects and changes the timbre with filters and modulators.

Sound cards use synthesizers from several different firms.

Along with the samples recorded in the ROM of the sound card, many sets of samples (banks) are now available, created both in the laboratories of firms specializing in synthesizers and by computer-music amateurs. These banks can be found on numerous laser disks and on the Internet.

Modulation effects:

Delay means just that - a delay. The need for this effect arose with the advent of stereo. By its very nature the human hearing apparatus in most situations receives two audio signals that differ in arrival time. If the sound source is "before your eyes", on the perpendicular to the line passing through the ears, the direct sound reaches both ears at the same time. In all other cases the distances from the source to the two ears differ, so one ear perceives the sound before the other.

The delay time (the difference in the arrival time of the signals at the ears) is maximal when the source is located opposite one of the ears. Since the distance between the ears is about 20 cm, the maximum delay is about 0.8 ms. These values correspond to a sound wave with a frequency of about 1.1 kHz. For higher-frequency oscillations the wavelength becomes smaller than the distance between the ears, and the difference in arrival times becomes imperceptible. The limiting frequency at which the delay is still perceived depends on the direction to the source: it grows as the source moves from a point opposite one of the ears toward a point directly in front of the listener.

Delay is used primarily when a recording of a voice or an acoustic musical instrument, made with a single microphone, is embedded in a stereo composition; the effect underlies the technology of creating such stereo recordings. Delay can also be used to obtain a single repetition of a sound. In this case the delay between the direct signal and its copy is chosen greater than the natural delay of 0.8 ms. For short, sharp sounds the delay at which the signal and its copy remain distinguishable is smaller than for sustained sounds; for works performed at a slow tempo the delay may be greater than for fast compositions.

At certain ratios of the levels of the direct and delayed signals, a psychoacoustic effect can occur: the apparent location of the sound source on the stereo panorama changes.

The effect is implemented with devices capable of delaying acoustic or electrical signals. Today this device is most often a digital delay line - a chain of elementary cells, delay flip-flops. For our purposes it is enough to know that the principle of a delay flip-flop reduces to the following: a binary signal arriving at its input at a certain clock instant appears at its output not immediately but at the next clock instant. The total delay of the line is larger the more flip-flops are included in the chain, and smaller the shorter the clock interval (the higher the clock frequency). Memory devices can also serve as digital delay lines.

Of course, before a digital delay line can be applied, the signal must first be converted into digital form; after its copy passes through the delay line, the reverse, digital-to-analog conversion takes place. The source signal and its delayed copy can be routed to different stereo channels, or mixed in various proportions; the combined signal can be directed either to one of the stereo channels or to both.

In audio editors, delay is implemented in software (mathematically), by shifting the sample indices of the signal's copy relative to the original.
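This index-shifting implementation can be sketched as follows (Python/NumPy; the function name and parameter defaults are mine, chosen only for illustration):

```python
import numpy as np

def delay_mix(signal, sr, delay_ms=30.0, mix=0.5):
    """Mix a signal with one delayed copy of itself.  In software
    the "delay line" is just a shift of sample indices."""
    d = int(sr * delay_ms / 1000.0)      # delay in samples
    out = np.zeros(len(signal) + d)
    out[:len(signal)] += signal          # direct signal
    out[d:] += mix * signal              # attenuated, delayed copy
    return out
```

Feeding a single impulse through it shows the structure of the effect directly: one spike at time zero and a quieter spike one delay later.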

Signal delay also underlies the related effects of the flanger and the phaser.

The effect of repeated sound can be caused by sound propagating from the source to the receiver along different paths (for example, the sound may arrive, first, directly and, second, reflected from an obstacle lying slightly off the direct path). In both cases the delay time remains constant. In real life this corresponds to the unlikely situation in which the sound source, the receiver and the reflecting objects are all motionless relative to one another. The frequency of the sound then does not change, whichever way it arrives.

If any of the three elements moves, the frequency of the received sound cannot remain equal to the frequency of the transmitted sound. This is nothing other than the Doppler effect.

Both the flanger and the phaser imitate the effects of mutual motion of these three elements: source, receiver and reflector. In essence, both effects combine a delayed copy of the signal with frequency or phase modulation. The difference between them is purely quantitative: in the flanger both the delay time of the copy (or copies) and the change of the signal's frequency are much greater than in the phaser. Figuratively speaking, flanging would be observed if the singer rushed toward a listener in the hall at the speed of a car; to feel the phaser in its original form, no moving sound source is required - it is enough for the listener to shake his head quickly from side to side.

These quantitative differences between the effects lead to qualitative ones: first, sounds processed with them acquire different acoustic and musical properties; second, the effects are implemented by different technical means.

The delay times characteristic of the flanger considerably exceed the period of the sound oscillation, so multi-bit, multi-tap digital delay lines are used to implement the effect. From each tap its own signal is taken, which in turn is subjected to frequency modulation.

The phaser, on the contrary, is characterized by a very small delay time - so small that it is comparable to the period of the sound oscillation. With such small relative shifts it is customary to speak not of delaying copies of the signal in time but of their phase difference. If this phase difference is not constant but varies periodically, we are dealing with the phaser effect. The phaser can thus be regarded as a limiting case of the flanger.

To obtain a flanger, several loudspeaker systems placed at different distances from the listeners were once used instead of one. At the required moments the signal source was switched from speaker to speaker in such a way that an impression of the sound source approaching or receding was created. Audio delay was also performed with tape recorders having a through record/playback path: one head records, the other reproduces the sound with a delay equal to the time the tape takes to travel from head to head. For frequency modulation no special measures had to be invented. Every analog tape recorder has an inherent defect called wow, which manifests itself as a "floating" of the sound. It was enough to strengthen this effect a little on purpose, by varying the voltage feeding the motor, and frequency modulation was obtained.

To implement the phaser by analog means, chains of electrically controlled phase shifters were used. And sometimes one could observe such a picture: in an acoustic system connected to an electronic musical instrument or electric guitar, something like a fan began to rotate. The sound crossed the moving blades and was reflected from them, and phase modulation was obtained.

Reverberation is among the most interesting and popular sound effects. Its essence is that the original signal is mixed with copies of itself delayed by various time intervals. In this, reverberation resembles delay; however, in reverberation the number of delayed copies can be considerably larger - theoretically infinite. In addition, in reverberation, the later a copy arrives, the smaller its amplitude (volume). The result depends on the intervals between the copies and on the rate at which their levels decrease. If the intervals between copies are small, the actual reverberation effect is obtained: a feeling of a spacious hall arises, the sounds of musical instruments become juicy and voluminous, with a rich timbre, and singers' voices acquire melodiousness while their inherent shortcomings become less noticeable.

If the gaps between the copies are large (more than 100 ms), it is more correct to speak not of reverberation but of the echo effect: the intervals between the repeated sounds become distinguishable, the sounds cease to merge and seem to be reflections from distant obstacles.
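The common core of both effects - decaying delayed copies - can be sketched like this (Python/NumPy; the names and defaults are illustrative, and with a 250 ms gap this produces the echo variant described above):

```python
import numpy as np

def echo(x, sr, delay_ms=250.0, decay=0.5, repeats=5):
    """Add progressively quieter delayed copies of the signal.
    With gaps over ~100 ms the repeats are heard as distinct
    echoes; with very short gaps they merge into reverberation."""
    d = int(sr * delay_ms / 1000.0)
    out = np.zeros(len(x) + d * repeats)
    for k in range(repeats + 1):
        out[k * d : k * d + len(x)] += (decay ** k) * x
    return out
```

Each copy arrives one delay later and at half the level of the previous one, giving the gradual fading the text describes.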

The main element implementing the reverberation effect is a device that creates an echo signal.

An echo chamber is a room with highly reflective walls in which a sound source (loudspeaker) and a receiver (microphone) are placed. The advantage of the echo chamber is that the sound decays in it naturally (which is very difficult to achieve by other means). While the sound continues to reverberate in three dimensions, the initial wave splits into a multitude of reflected waves that reach the microphone at ever shorter intervals.

Along with echo chambers, steel plates - or rather, large steel sheets - were used to simulate reverberation. Oscillations were introduced into them and picked off using devices similar in design and principle to electromagnetic headphones. To obtain a satisfactorily uniform amplitude-frequency response, the thickness of the sheet must be held to tolerances at the limit of what ordinary steel-rolling technology provides. Reverberation here was not three-dimensional but flat, and the signal had a characteristic metallic tinge.

From the mid-1960s, spring reverberators were used to obtain the effect. An electromagnetic transducer connected to one end of a spring excited mechanical oscillations in it, which reached, with a delay, the other end of the spring, connected to a pickup. The repetition of the sound is due to the repeated reflection of the mechanical oscillation waves from the ends of the spring.

These imperfect devices were superseded by tape reverberators. Here the echo signal is formed by writing the source signal to tape with the recording head and, after the time the tape takes to travel to the playback head, reading it back with that head. Through a feedback circuit the delayed signal, reduced in amplitude, is fed back to the input, which creates the effect of repeated sound reflections with gradual decay. The sound quality is determined by the parameters of the tape recorder. The drawback of the tape reverberator is that at acceptable tape speeds only the echo effect can be obtained; to obtain true reverberation one would have to bring the magnetic heads much closer together (which their design does not allow) or considerably increase the speed of the tape.

With the development of digital technology and the appearance of integrated circuits containing hundreds and thousands of the digital flip-flops already mentioned, it became possible to create high-quality digital reverberators. In such devices the signal can be delayed by any time required, for both reverberation and echo.

In sound cards, reverberation is ultimately also based on digital signal delay.

Looking back at the stages in the development of reverberation devices, one may suppose that mathematical models of spring and tape reverberators will someday appear: it is quite possible that there are people who feel nostalgia for the sound of music colored by the rattle of springs or the hiss of magnetic tape.

Methods used to process sound:

1. Montage. It consists in cutting some sections out of a recording, inserting others, replacing or duplicating them, and so on; it is also called editing. All modern sound and video recordings are edited in one way or another.

2. Amplitude transformations. These are performed by various operations on the amplitude of the signal, which ultimately reduce to multiplying the sample values either by a constant coefficient (amplification/attenuation) or by a time-varying modulating function (amplitude modulation). A special case of amplitude modulation is the formation of an envelope, which gives a stationary sound some development in time.

Amplitude transformations are performed sample by sample, so they are simple to implement and do not require a large amount of computation.
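A minimal sketch of such a transformation (Python/NumPy; the piecewise-linear attack/sustain/release shape and the fractional parameters are my own illustrative choice): each sample is simply multiplied by the envelope value at its position.

```python
import numpy as np

def apply_envelope(x, attack_frac=0.1, release_frac=0.2):
    """Multiply each sample by a piecewise-linear envelope:
    linear attack, sustain at full level, linear release."""
    n = len(x)
    env = np.ones(n)
    a = int(n * attack_frac)
    r = int(n * release_frac)
    env[:a] = np.linspace(0.0, 1.0, a, endpoint=False)   # attack
    env[n - r:] = np.linspace(1.0, 0.0, r)               # release
    return x * env
```

Because the operation touches each sample once, its cost grows only linearly with the length of the recording.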

3. Frequency (spectral) transformations. These are performed on the frequency components of the sound. If the signal is represented by its spectrum - frequencies along the horizontal axis, intensities of the components at those frequencies along the vertical - then many frequency transformations become amplitude transformations applied to the spectrum. For example, filtering - amplifying or attenuating certain frequency bands - reduces to applying the corresponding amplitude envelope to the spectrum. Frequency modulation, however, cannot be represented this way: it looks like a displacement of the whole spectrum, or of individual sections of it, in time according to a specific law.

To implement frequency transformations, spectral decomposition by the Fourier method is usually applied, which requires significant resources. There is, however, the fast Fourier transform (FFT) algorithm, which works in integer arithmetic and allows even the junior models of the 486 to compute the spectrum of an average-quality signal in real time. Frequency transformations additionally require processing of the spectrum and the subsequent inverse transform, so real-time filtering has not yet been implemented on general-purpose processors. Instead, there is a large number of digital signal processors (DSPs), which perform these operations on the fly and in several channels at once.

4. Phase transformations. These mainly reduce to a constant shift of the signal's phase or to its modulation by some function or by another signal. Because the human hearing apparatus uses phase to determine the direction to a sound source, phase transformations of a stereo signal can produce the effect of rotating sound, a chorus-like sound, and the like.

5. Time-domain transformations. These consist in adding to the main signal copies of it shifted in time by various amounts. Small shifts (less than about 20 ms) give the effect of a multiplied sound source (the chorus effect); large shifts give the echo effect.

6. Formant transformations. These are a special case of frequency transformations that operate on formants - the characteristic frequency bands present in the sounds pronounced by a person. Each sound has its own ratio of the amplitudes and frequencies of several formants, which determines the timbre and intelligibility of the voice. By changing the parameters of the formants one can stress or blur individual sounds, change one vowel into another, shift the register of the voice, and so on.

Based on these methods, many hardware and software means of sound processing have been implemented. Some of them are described below.

1. Compressor (from the English compress - to squeeze) - an electronic device or computer program used to reduce the dynamic range of a signal. Downward compression reduces the amplitude of loud sounds that lie above a certain threshold, while sounds below this threshold remain unchanged. Upward compression, on the contrary, increases the volume of sounds below a certain threshold, while sounds exceeding the threshold remain unchanged. Both actions reduce the difference between quiet and loud sounds, narrowing the dynamic range.

Compressor Parameters:

Threshold - the level above which the signal begins to be suppressed. Usually specified in dB.

Ratio - determines the input/output ratio for signals exceeding the threshold. For example, a ratio of 4:1 means that a signal 4 dB above the threshold will be compressed to 1 dB above the threshold. The highest ratio, ∞:1, is usually achieved with ratios of 60:1 or greater, and effectively means that any signal exceeding the threshold is reduced to the threshold level (except for short, sharp volume changes called the "attack").
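The static threshold/ratio rule can be written out as a small function (Python; the function name and defaults are mine, and this ignores the attack and release timing discussed next - it is only the static curve):

```python
def compress_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Static compressor curve: levels above the threshold are
    scaled so that `ratio` dB of input yields 1 dB of output;
    levels below the threshold pass unchanged."""
    over = max(level_db - threshold_db, 0.0)
    return level_db - over * (1.0 - 1.0 / ratio)
```

With a -20 dB threshold and a 4:1 ratio, an input 4 dB over the threshold (-16 dB) comes out 1 dB over it (-19 dB), matching the example in the text.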

Attack and Release (Fig. 1.3). A compressor may provide some control over how quickly it acts. The "attack phase" is the period during which the compressor reduces the gain to the level determined by the ratio. The "release phase" is the period during which the compressor restores the gain, up to 0 dB, once the level falls below the threshold. The duration of each phase is set relative to the rate of change of the signal level.

Fig. 1.3. Compressor attack and release.

In many compressors the attack and release times are adjustable by the user. In some, however, they are fixed by the circuit design and cannot be changed. Sometimes the attack and release parameters are "automatic" or "program-dependent", meaning that their times vary with the incoming signal.

Knee - controls the bend of the compression curve at the threshold; it can be sharp or rounded (Fig. 1.4). A soft knee increases the compression ratio gradually until it reaches the value set by the user. With a hard knee, compression begins and ends abruptly, which makes it more noticeable.

Fig. 1.4. Soft and hard knee.

2. Expander. Where the compressor suppresses the sound once its level exceeds a certain value, the expander suppresses the sound once its level falls below a certain value. In everything else (its processing parameters) the expander is similar to the compressor.

3. Distortion (English distortion) - a deliberate, coarse narrowing of the dynamic range with the aim of enriching the sound with harmonics. As compression deepens, the waveform increasingly takes on not a sinusoidal but a square shape, owing to the artificial limiting of the level; square waves possess the largest number of harmonics.
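In its crudest form this is hard clipping, which can be sketched in one line (Python/NumPy; the clip level is an arbitrary illustrative value):

```python
import numpy as np

def distortion(x, clip_level=0.3):
    """Hard clipping: the tops of the waveform are cut off,
    pushing its shape toward a square wave and thereby adding
    odd harmonics to the spectrum."""
    return np.clip(x, -clip_level, clip_level)
```

Clipping a pure sine this way leaves its level bounded by the clip level and creates a strong third harmonic where the input had none.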

4. Delay (English delay) or Echo (English echo) - a sound effect, or the corresponding device, that simulates distinctly fading repetitions of the source signal. The effect is implemented by adding to the original signal one or several copies of it delayed in time. Delay usually means a single repetition, while the echo effect means multiple repetitions.

5. Reverberation - the process of gradual decrease in the intensity of a sound as it undergoes multiple reflections. Virtual reverberators offer many parameters that allow you to obtain the sound characteristic of almost any room.

6. Equalizer (English equalize - to align; common abbreviation EQ) - a device or computer program that allows you to change the amplitude-frequency response of an audio signal, that is, to adjust the signal's amplitude selectively, depending on frequency. Equalizers are characterized first of all by the number of adjustable frequency filters (bands).

There are two main types of multi-band equalizers: graphic and parametric. A graphic equalizer has a definite number of adjustable frequency bands, each characterized by a fixed operating frequency, a fixed bandwidth around that frequency, and a level-adjustment range (the same for all bands). As a rule, the extreme bands (the lowest and the highest) are shelf filters, and all the others have a bell-shaped characteristic. Graphic equalizers used in professional applications usually have 15 or 31 bands per channel and are often equipped with spectrum analyzers for convenience of adjustment.

A parametric equalizer gives much greater possibilities for adjusting the frequency response of the signal. Each of its bands has three main adjustable parameters:

central (or operating) frequency in hertz (Hz);

quality factor (the width of the operating band around the central frequency, denoted by the letter Q) - a dimensionless value;

gain or attenuation of the selected band in decibels (dB).
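As a crude sketch of band-wise gain adjustment (Python/NumPy; this FFT-based approach is closer in spirit to a graphic equalizer with rectangular bands than to the smooth bell and shelf filters real equalizers use, and all names are mine):

```python
import numpy as np

def graphic_eq(x, sr, bands):
    """bands: list of (low_hz, high_hz, gain_db).  Every FFT bin
    falling inside a band is scaled by that band's gain - a crude
    frequency-selective amplitude adjustment."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / sr)
    for lo, hi, gain_db in bands:
        spec[(freqs >= lo) & (freqs < hi)] *= 10 ** (gain_db / 20.0)
    return np.fft.irfft(spec, n=len(x))
```

Boosting a band by 6 dB multiplies the amplitude of any tone inside it by about 2 (more precisely, by 10^(6/20)).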

7. Chorus - a sound effect imitating the choral sound of musical instruments. It is implemented by adding to the original signal a copy or copies of it shifted in time by about 20-30 milliseconds, with the shift time changing continuously.

First the input signal is split into two independent signals, one of which remains unchanged while the other enters a delay line. In the delay line the signal is delayed by 20-30 ms, the delay time varying in accordance with the signal of a low-frequency oscillator. At the output the delayed signal is mixed with the source signal. The low-frequency oscillator modulates the delay time; it produces oscillations of a certain shape with a frequency of about 3 Hz and below. By changing the frequency, shape and amplitude of the oscillator's waveform, you can obtain a variety of output signals.
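The split/delay/modulate/mix scheme just described can be sketched as follows (Python/NumPy; names and defaults are illustrative, with a sinusoidal LFO and linear interpolation for fractional delays):

```python
import numpy as np

def chorus(x, sr, base_ms=25.0, depth_ms=5.0, lfo_hz=1.0):
    """Mix the input with a copy whose delay wanders around
    base_ms by +/-depth_ms under a slow sinusoidal LFO."""
    n = np.arange(len(x))
    delay = (base_ms + depth_ms * np.sin(2 * np.pi * lfo_hz * n / sr)) * sr / 1000.0
    pos = np.clip(n - delay, 0, len(x) - 1)    # modulated read position
    idx = pos.astype(int)
    idx2 = np.minimum(idx + 1, len(x) - 1)
    frac = pos - idx
    delayed = (1 - frac) * x[idx] + frac * x[idx2]
    return x + delayed
```

The continuously changing read position is what produces the slight detuning that makes one voice sound like several.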

Effect parameters:

Depth - characterizes the range of variation of the delay time.

Speed (Rate) - the rate of the "floating" of the sound; adjusted by the frequency of the low-frequency oscillator.

LFO Waveform - the shape of the low-frequency oscillation: sinusoidal (sin), triangular (triangle) or logarithmic (log).

Balance (Mix, Dry/Wet) - the ratio of the unprocessed and processed signals.

8. Phaser (English phaser), also often called phase vibrato - a sound effect achieved by filtering the audio signal so as to create a series of maxima and minima in its spectrum. The positions of these maxima and minima vary over the course of the sound, which creates a characteristic sweeping effect. The corresponding device is also called a phaser. In principle of operation it is similar to the chorus, differing from it in the delay time (1-5 ms). In addition, the delay in the phaser is not the same at different frequencies and changes according to a certain law.

The electronic phaser effect is created by splitting the audio signal into two streams. One stream is processed by a phase-shifting filter, which changes the phase of the signal while preserving its frequency; the amount of phase shift depends on the frequency. After the processed and unprocessed signals are mixed, frequencies that end up in antiphase cancel each other, creating the characteristic notches in the spectrum of the sound. Changing the proportion of the original and processed signals changes the depth of the effect; the maximum depth is achieved at a 50/50 ratio.
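A minimal sketch of this two-stream scheme (Python/NumPy; a first-order all-pass section is one standard choice of frequency-dependent phase shifter, though the text does not name a specific filter, and for brevity the coefficient here is fixed rather than swept by an LFO as a real phaser would do):

```python
import numpy as np

def allpass(x, coeff):
    """First-order all-pass filter: y[n] = -c*x[n] + x[n-1] + c*y[n-1].
    Passes every frequency at equal gain but shifts its phase by a
    frequency-dependent amount."""
    y = np.zeros_like(x)
    x1 = y1 = 0.0
    for n in range(len(x)):
        y[n] = -coeff * x[n] + x1 + coeff * y1
        x1, y1 = x[n], y[n]
    return y

def phaser(x, coeff=0.5, stages=4, mix=0.7):
    """Run the signal through several all-pass stages and mix with
    the dry input; frequencies shifted near 180 degrees cancel,
    carving the notches characteristic of a phaser."""
    wet = x.copy()
    for _ in range(stages):
        wet = allpass(wet, coeff)
    return x + mix * wet
```

The all-pass stage changes only phase, not level, which is exactly the property the paragraph relies on: the notches appear only after the shifted and unshifted streams are mixed.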

The phaser effect is similar to the flanger and chorus effects, which likewise add to the audio signal copies of it supplied with a certain delay (a so-called delay line). However, unlike the flanger and the chorus, where the delay can take an arbitrary value (usually from 0 to 20 ms), the delay in the phaser depends on the frequency of the signal and lies within one period of the oscillation. Thus the phaser can be viewed as a special case of the flanger.

9. Flanger (English flange - rim, comb) - a sound effect resembling a "flying" sound. In principle of operation it is similar to the chorus, differing from it in the delay time (5-15 ms) and in the presence of feedback. Part of the output signal is fed back to the input and into the delay line. As a result of the resonance of the signals the flanging effect is obtained: some frequencies in the spectrum of the signal are amplified and some are attenuated, so that the frequency response forms a series of maxima and minima resembling a comb - whence the name of the comb filter. The phase of the feedback signal is sometimes inverted, achieving additional variety in the sound.

10. Vocoder (English voice coder) - a device for synthesizing speech on the basis of an arbitrary signal with a rich spectrum. Vocoders were originally developed to save frequency resources in radio communication systems when transmitting speech messages. The saving is achieved because, instead of the actual speech signal, only the values of certain of its parameters are transmitted, and on the receiving side these parameters control a speech synthesizer.

The basis of the speech synthesizer is three elements: a tone generator for forming vowel sounds, a noise generator for forming consonants, and a system of formant filters for recreating the individual features of the voice. After all the transformations, a person's voice becomes like the voice of a robot, which is quite tolerable for communication and interesting for music. This was so only in the most primitive vocoders of the first half of the last century; modern communication vocoders provide the highest voice quality at a considerably stronger compression than mentioned above.

The vocoder as a musical effect allows you to transfer the properties of one (modulating) signal to another signal, which is called the carrier. A person's voice is used as a modulator signal, and as a carrier - a signal generated by a musical synthesizer or other musical instrument. This is how the effect of the "speaking" or "singing" musical instrument is achieved. In addition to the voice, the modulating signal can be both a guitar, keyboards, drums and in general any sound of synthetic and "living" origin. There are also no restrictions on the carrier signal. Experimenting with the modeling and carrier signal, you can get completely different effects - a speaking guitar, drums with a piano sound, guitar, sounding like xylophone.
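A toy channel vocoder illustrating this modulator/carrier idea (Python/NumPy; every name, the 8-band split, the non-overlapping 256-sample frames, and the FFT analysis/synthesis are my simplifying assumptions - real vocoders use filter banks or overlapping windows): the modulator's energy is measured per band and imposed on the same band of the carrier.

```python
import numpy as np

def vocoder(modulator, carrier, sr, n_bands=8, frame=256):
    """Frame by frame, measure the modulator's energy in each
    frequency band and scale the carrier's spectrum in that band
    accordingly, so the carrier 'speaks' with the modulator's
    envelope."""
    out = np.zeros(min(len(modulator), len(carrier)))
    edges = np.linspace(0, sr / 2, n_bands + 1)
    freqs = np.fft.rfftfreq(frame, 1.0 / sr)
    for start in range(0, len(out) - frame + 1, frame):
        m = np.fft.rfft(modulator[start:start + frame])
        c = np.fft.rfft(carrier[start:start + frame])
        shaped = np.zeros_like(c)
        for b in range(n_bands):
            sel = (freqs >= edges[b]) & (freqs < edges[b + 1])
            energy = np.sqrt(np.mean(np.abs(m[sel]) ** 2)) if sel.any() else 0.0
            shaped[sel] = c[sel] * energy
        out[start:start + frame] = np.fft.irfft(shaped, n=frame)
    return out
```

When the modulator is silent, every band energy is zero and the carrier is muted entirely, which is the defining behavior of the effect: the carrier sounds only where the modulator does.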


