Demystifying latency

tomperson
Posts: 1018
Joined: Fri Oct 08, 2004 11:55 am
Location: MVD, Uruguay, South America

Post by tomperson » Wed Sep 06, 2006 7:25 pm

Well,
I've read lots of posts here, and chatted with other musicians at music school, about audio latency. Some people seem obsessed with getting ever-lower latency figures, and I wonder: can we REALLY tell the difference between 1.5ms and 8ms of latency? Could anyone point to a reputable source that states the minimum latency humans can notice?

Everything I've read and experienced myself says that above 10ms it's noticeable, but what about below 10ms? Some people say they totally notice it, others don't... I wonder how much is hype and how much is real. This is important because lower latency settings put more stress on the system and probably mean running fewer instruments/effects, so why bother going lower if we don't really notice?

Below I'm posting some findings on the topic (latency in audio and as a general issue in interactive systems).

--------------------------------------------------------------------------------

http://www.whirlwindusa.com/wwlatart.html
Now, give this a try. Plug a microphone into a standard analog sound system and speak while standing 5 feet from the loudspeaker. You're now experiencing about 5 ms of latency. Step back another 5 feet, and you're experiencing about 10 ms of latency. Move the microphone a foot away from your mouth, and it adds another 1 ms.

Delays within an ensemble of musicians can be, and often are, relatively long. Think of a symphony where performers are located across a 40-foot stage. The conductor waves a baton to keep time. The percussion section might be 30 feet (and 30 ms) away, while the second violins are 5 feet (and 5 ms) away.

Does the conductor hear all of the notes attacking at different times? The harps might be 40 feet (and 40 ms) away from the timpani. Do they think they sound out of time with each other? How do the musicians stay in synch with each other?

Actually, research sponsored by the National Science Foundation, through the Stanford University Department of Music, has shown that performers in an ensemble have no problem synchronizing with each other while experiencing latencies as high as 40 ms and even greater. In fact, latencies in the 10 ms to 20 ms range actually have a stabilizing effect on tempo and are thought to be preferred over zero latency.
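
The "one foot is about one millisecond" figures in the excerpt above can be checked with a small sketch. The speed of sound used here (343 m/s, roughly room temperature) is an assumed round value, not something the article states, and the function name is just for illustration.

```python
# Assumed speed of sound in air at about 20 degrees C; the article gives no figure.
SPEED_OF_SOUND_M_PER_S = 343.0
METERS_PER_FOOT = 0.3048


def acoustic_delay_ms(distance_ft: float) -> float:
    """Time in milliseconds for sound to travel distance_ft through air."""
    distance_m = distance_ft * METERS_PER_FOOT
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


if __name__ == "__main__":
    # Distances mentioned in the excerpt: mic, loudspeaker, stage positions.
    for feet in (1, 5, 10, 30, 40):
        print(f"{feet:>2} ft -> {acoustic_delay_ms(feet):.1f} ms")
```

Under that assumption a foot works out to a little under 0.9 ms, so the article's round numbers slightly overstate the delay, but not by enough to change its argument.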

...
...
...

The Haas Effect (or Precedence Effect) is a principle first set forth in 1949 by Helmut Haas, which established that we humans localize sounds by identifying the difference in arrival time between our two ears. The same sound arriving within 25 ms to 35 ms of itself will be suppressed and not be heard as an echo.

Only if sounds are more than about 35 ms apart will the brain recognize them as separate sounds or echoes. We tested this, connecting a microphone through a 20 ms delay and monitored with Shure E3 ear buds.

The principle here: when you monitor your own voice, the delayed sound mixes in the ear with non-delayed sound that is conducted through your head via bones, cartilage and Eustachian tubes. This effect should be exaggerated when using headphones or in-ear monitors, because the listener is not hearing all of the other room reflections with various delays and volumes.

Several people were asked to read aloud from a magazine. The subjects included experienced sound professionals and musicians, amateur players and non-technical people with no monitoring experience. Everyone heard the initial 20 ms delay as a very short echo or "doubled" sound.

Then, the delay was gradually reduced from 20 ms. The subjects were told to stop us when the delay seemed to disappear. Then this was repeated while the person spoke short, sharp syllables like "check, check!"

Every person tested seemed to think the echo disappeared somewhere between 10 ms and 15 ms. I personally found it to be a rather dramatic change too - as if someone had suddenly bypassed the delay unit.

...
...
...

Next, we wanted to evaluate the situation where a guitarist is playing direct to the PA, or a drummer is using electronic drums. I played guitar into the delay and monitored through headphones while only listening to the delay.

...
...
...

On the other hand, a delay of a few milliseconds was imperceptible. In my guitar experiment, it seems that the delay isn't noticeable at all up to about 10 ms (again). It becomes slightly noticeable between 10 ms and 15 ms, almost like it's not really an echo - just "something's there," but I could still play in time. The delay started to get difficult to contend with somewhere around 15 ms to 20 ms, and above 20 ms I really struggled with timing.

Now, this is an admittedly small sample. However, after these tests, it appears that even with the subjects being told that it's there, they couldn't detect latency as echoes with less than about 10 ms to 15 ms of latency.
...
...
...
Any time a sound arrives at different times, the sound interacts with itself to affect the overall frequency response. This also affects response when a single sound reaches two microphones at slightly different locations at about the same level.
...
...
...
I'd venture that without this evidence, the general consensus would have been that less latency would always produce less comb effect and always be preferable. But a comparison of the waveforms from this section shows that the peaks are fewer and wider at 1 ms, and occur at frequencies that would affect the vocal range just as the latency at 5 ms or even 10 ms does.

In fact, if the comb filtering at 1 ms were producing unacceptable tone quality, a possible solution might be to actually add a few milliseconds of latency to shift the affected frequencies and move the peaks closer together.
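
The comb-filter point in the excerpt above can be illustrated with the textbook case of a signal summed with an equal-level copy of itself delayed by some time t: cancellations (notches) then fall at odd multiples of 1/(2t). This is a minimal sketch under that idealized assumption; it is not the measurement method the article used, and the function name and the 4 kHz upper limit are arbitrary choices for illustration.

```python
from typing import List


def comb_notches_hz(delay_ms: float, max_hz: float = 4000.0) -> List[float]:
    """Notch frequencies (Hz) up to max_hz for a signal mixed with an
    equal-level copy of itself delayed by delay_ms milliseconds."""
    notches = []
    k = 0
    while True:
        # Odd multiples of 1 / (2 * delay): (2k + 1) / (2 * delay_s)
        freq = (2 * k + 1) * 1000.0 / (2.0 * delay_ms)
        if freq > max_hz:
            break
        notches.append(freq)
        k += 1
    return notches


if __name__ == "__main__":
    for delay in (1.0, 5.0, 10.0):
        notches = comb_notches_hz(delay)
        print(f"{delay:>4} ms: {len(notches)} notches below 4 kHz, "
              f"first few at {notches[:3]} Hz")
```

With these assumptions a 1 ms delay gives only a handful of widely spaced notches below 4 kHz, while 5 ms and 10 ms give many closely spaced ones, which matches the article's description of "fewer and wider" peaks at 1 ms.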
...
...
...
When latency was 10ms - 15ms, although some people detected the presence of "something", they felt they could probably live with it. Others made faces. When the latency was 15ms - 20ms, more people heard an effect and felt that it was becoming distracting. The tolerance will vary from performer to performer and with the material performed as to when they will be unable to deal with it.
--------------------------------------------------------------------------------

From adobe: http://www.adobe.com/support/techdocs/331631.html
General guidelines that apply to latency times.

Less than 10ms - allows real-time monitoring of incoming tracks including effects.

At 10ms - latency can be detected but can still sound natural and is usable for monitoring.

11-20ms - monitoring starts to become unusable, smearing of the actual sound source and the monitored output is apparent.

20-30ms - delayed sound starts to sound like an actual delay rather than a component of the original signal.


Note: The human ear is accustomed to latency because it occurs naturally in the world around us. The frequency of a sound, distance from the sound's source, and the physical properties of the human ear all play a part in when we hear a sound. However, the amount of latency introduced in the recording and monitoring process is due to the physical properties and limitations of the sound card, device drivers and processing power of the computer CPU.
--------------------------------------------------------------------------------

Sound on Sound: http://www.soundonsound.com/sos/apr99/a ... etency.htm
Musicians will be most comfortable with a figure of 10ms or less - the equivalent latency value you find on MIDI gear between pressing a key on the keyboard and hearing the sound.
--------------------------------------------------------------------------------

General talk about interactivity and latency.

http://www.stuartcheshire.org/papers/LatencyQuest.html
Interactivity
Given that other human perception parameters are known so accurately, you might think that the threshold of interactive response would be another well-known, widely studied parameter. Unfortunately it is not. For a number of years I've been interested in this question and I've asked a number of experts in the field of computer graphics and virtual reality, but so far none has been able to give me an answer. There seems to be a widespread belief that whatever the threshold of interactive response is, it is definitely not less than 15ms. I fear this number is motivated by wishful thinking rather than by any scientific evidence. 15ms is the time it takes to draw a single frame on a 66Hz monitor. Should it turn out that the threshold of interactive response is less than 15ms, it would be dire news for the entire virtual reality community. That would mean that there would be no way to build a convincing immersive virtual reality system using today's monitor technology, because today's monitor technology simply cannot display an image in under 15ms. If we find that convincing immersive virtual reality requires custom monitor hardware that runs at say, 200 frames per second, then it is going to make effective virtual reality systems prohibitively expensive.
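
As a quick check of the frame-time figures quoted above, the time to draw one frame is simply the reciprocal of the refresh rate. The function name below is made up for illustration.

```python
def frame_time_ms(refresh_hz: float) -> float:
    """Time in milliseconds to draw a single frame at the given refresh rate."""
    return 1000.0 / refresh_hz


if __name__ == "__main__":
    # 66 Hz is the monitor rate from the article; 200 is its hypothetical
    # custom-hardware figure.
    for rate in (66, 200):
        print(f"{rate} Hz -> {frame_time_ms(rate):.2f} ms per frame")
```

So a 66Hz monitor needs a bit over 15ms per frame, as the article says, and the hypothetical 200 frames per second hardware would bring that down to 5ms.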

...
...
...

100ms
As another data point, the rule of thumb used by the telephone industry is that the round-trip delay over a telephone call should be less than 100ms. This is because if the delay is much more than 100ms, the normal unconscious etiquette of human conversation breaks down. Participants think they hear a slight pause in the conversation and take that as their cue to begin to speak, but by the time their words arrive at the other end the other speaker has already begun the next sentence and feels that he or she is being rudely interrupted. When telephone calls go over satellite links the round-trip delay is typically about 250ms, and conversations become very stilted, full of awkward silences and accidental interruptions.

For these reasons I'm going to suggest 100ms as a target round-trip time for general network interaction. Some interactions may require a faster response time than this, but these interactions should hopefully be ones that can be computed locally without resorting to network communication. Some styles of interaction, like playing a game of chess, may work effectively with delays longer than 100ms, but we don't want to limit ourselves to only being able to support these limited kinds of interaction.
Note: the article above is from 1996, but the idea remains the same.

--------------------------------------------------------------------------------

Let's discuss!!!
Turn up the radio. Turn up the tape machine. Look into the sunset up ahead. Roll the windows down for a better taste of the cool desert wind. Ah yes. This is what it's all about. Total control now.

rikhyray
Posts: 3645
Joined: Wed Aug 25, 2004 12:13 pm

Post by rikhyray » Wed Sep 06, 2006 7:58 pm

Audio latency is not much of an issue considering the current state of technology. I mean, any decent card (RME, MOTU, E-MU, even the cheap Indigo) does a fine job. It gets complicated with MIDI latency, which adds to it. The fastest FireWire interfaces are at 3ms+, twice as slow as any PCI/PCMCIA card at 1.5ms (talking MIDI latency here); check Martin Walker's tests in SOS.
This is the reason the new Mac notebooks will be out of the question for me unless they include PCMCIA or some manufacturers come out with ExpressCard interfaces.

tomperson
Posts: 1018
Joined: Fri Oct 08, 2004 11:55 am
Location: MVD, Uruguay, South America

Post by tomperson » Wed Sep 06, 2006 8:23 pm

Okay, let's say the fastest FireWire is effectively 3ms+. If we agree on the figures in the articles I posted, then we're still quite safe, well below the 10ms that seems to be the borderline for audible latency. So, is FireWire really a problem at all?

dj superflat
Posts: 1279
Joined: Wed Nov 02, 2005 5:31 pm
Location: leadville, CO

Post by dj superflat » Wed Sep 06, 2006 8:51 pm

i would think that latency may contribute to the "artificial" quality of some electronic music. as others have noted, there's always some latency due to (e.g.) the time it takes sound to travel from the bass amp to the other players. in the digital realm, there may not be the same "natural" latency, instead latency being determined by the sound card, the softsynths or other processing in use, etc. this may result in more precise music in some ways, which may feel artificial (though i love it when stevie sings ahead of the beat). or i may just be spouting nonsense.

tomperson
Posts: 1018
Joined: Fri Oct 08, 2004 11:55 am
Location: MVD, Uruguay, South America

Post by tomperson » Wed Sep 06, 2006 9:11 pm

Well, I think the public expects that kind of "mathematical precision" when listening to electronic music. And I think most electronic musicians really want to get the tightest sound possible. If we need "funkiness", then it's better to hire a guitar player and give him some beer than to try to get it out of loops :)

rikhyray
Posts: 3645
Joined: Wed Aug 25, 2004 12:13 pm

Post by rikhyray » Wed Sep 06, 2006 11:42 pm

tomperson wrote:Okay, let's say the fastest FireWire is effectively 3ms+. If we agree on the figures in the articles I posted, then we're still quite safe, well below the 10ms that seems to be the borderline for audible latency. So, is FireWire really a problem at all?
Note that MIDI latency is added to the audio latency, so if at 256 samples you have 6+6=12ms, you must add those 3-4ms to it. With 512 you will be around 20-30ms total.
The main problem is that anything other than PCI makes lots of things barely workable, at least for me. The exception was the combination of Cubase with a Midex: indeed as good as a hardware sequencer despite USB.
P.S.
Real funk does not come from beer or drugs but from intentional command over rhythm/time, so lousy digital timing won't make anything "funky" in the musical sense, but funky as in cheesy/stinking.
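
To put rough numbers on the buffer arithmetic in this post: one buffer of N samples takes N divided by the sample rate to play, the "6+6" is one buffer on the input plus one on the output, and the MIDI interface delay is added on top. This is only a sketch; the 44.1kHz sample rate is an assumption (the post doesn't state it), the function names are made up, and real drivers usually add some safety buffering of their own.

```python
SAMPLE_RATE_HZ = 44_100  # assumed; the post does not say which rate was used


def buffer_latency_ms(buffer_samples: int,
                      sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Latency in milliseconds of a single audio buffer (one direction)."""
    return buffer_samples / sample_rate_hz * 1000.0


def total_latency_ms(buffer_samples: int, midi_ms: float,
                     sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Input buffer + output buffer + MIDI interface latency, in milliseconds."""
    one_way = buffer_latency_ms(buffer_samples, sample_rate_hz)
    return 2 * one_way + midi_ms


if __name__ == "__main__":
    for size in (256, 512):
        one_way = buffer_latency_ms(size)
        low = total_latency_ms(size, 3.0)   # 3-4ms MIDI figure from the post
        high = total_latency_ms(size, 4.0)
        print(f"{size} samples: {one_way:.1f} ms each way, "
              f"{low:.1f}-{high:.1f} ms total with MIDI")
```

At 44.1kHz a 256-sample buffer comes to about 5.8ms each way, which is where the "6+6" comes from, and 512 samples plus MIDI lands in the mid-20s, inside the 20-30ms range mentioned.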

tomperson
Posts: 1018
Joined: Fri Oct 08, 2004 11:55 am
Location: MVD, Uruguay, South America

Post by tomperson » Thu Sep 07, 2006 1:02 pm

rikhyray wrote:Note that MIDI latency is added to the audio latency, so if at 256 samples you have 6+6=12ms, you must add those 3-4ms to it. With 512 you will be around 20-30ms total.
The main problem is that anything other than PCI makes lots of things barely workable, at least for me. The exception was the combination of Cubase with a Midex: indeed as good as a hardware sequencer despite USB.
I hadn't thought of the added latency due to MIDI. It certainly adds up. What's the average latency of a MIDI message?
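
One part of that question can be worked out directly: the time a message spends on a classic 5-pin MIDI cable, which runs at 31,250 bits per second with 10 bits on the wire per byte (start bit, 8 data bits, stop bit). This sketch covers only that wire time; it says nothing about the interface or driver delays measured in the SOS tests mentioned earlier, and the function name is made up for illustration.

```python
MIDI_BAUD = 31_250           # bits per second on a standard 5-pin DIN MIDI link
BITS_PER_BYTE_ON_WIRE = 10   # 1 start bit + 8 data bits + 1 stop bit


def midi_wire_time_ms(num_bytes: int) -> float:
    """Time in milliseconds to send num_bytes over a serial MIDI cable."""
    return num_bytes * BITS_PER_BYTE_ON_WIRE / MIDI_BAUD * 1000.0


if __name__ == "__main__":
    # A note-on message is 3 bytes: status, note number, velocity.
    print(f"one note-on: {midi_wire_time_ms(3):.2f} ms")
    # Ten note-ons sent back to back, ignoring running status.
    print(f"ten note-ons: {midi_wire_time_ms(30):.2f} ms")
```

So a single note-on takes just under 1ms on the cable, and a dense chord sent as separate messages is spread over several milliseconds.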

The other thing I'm not getting is the 6+6 you mention. Let's say 6ms comes from the audio driver (buffer samples); what's the other 6?

rikhyray wrote: P.S.
so lousy digital timing won't make anything "funky" in the musical sense, but funky as in cheesy/stinking.
I totally agree, that's why I said electronic musicians want the tightest timing possible.

rikhyray
Posts: 3645
Joined: Wed Aug 25, 2004 12:13 pm

Post by rikhyray » Thu Sep 07, 2006 1:09 pm

tomperson wrote:The other thing I'm not getting is the 6+6 you mention. Let's say 6ms comes from the audio driver (buffer samples); what's the other 6?
In/out. Open Preferences > Audio and you can see it there.

Ousy
Posts: 12
Joined: Fri Nov 27, 2009 3:41 am

Post by Ousy » Mon Aug 30, 2010 12:40 am

Thanks tomperson, good read.

bicarbone
Posts: 385
Joined: Tue May 09, 2006 6:31 pm
Location: Switzerland

Post by bicarbone » Mon Aug 30, 2010 1:11 am

I believe tomperson might have retired by now..
Talk about latency: that was 4 years ago! :wink:

MBP 2.2 GHz i7 quad 10.7.5 8GB ram | Live9suite | Reaper | Metric Halo ULN-2 + DSP | PSI A21-M active monitors | Littlepapercones passive speakers | Studer 169 analog mixer

Pitch Black
Posts: 6433
Joined: Sat Dec 21, 2002 2:18 am
Location: Auckland New Zealand

Post by Pitch Black » Mon Aug 30, 2010 1:23 am

Playing percussion parts on a MIDI kbd, I notice 10ms (audio buffer setting as reported by Live, not including any MIDI latency) as very sludgy. A 6ms setting is better, but still noticeable to me. That's unfortunately the lowest I can go, and only with certain less complex Live sets.
MBP i5 2.53GHz | OSX 10.12.6 | Live10.0.4 | Fireface800 | Push 2
https://soundcloud.com/paddyfree
