Hi Viktor,
Below, at the bottom, is more discussion of your remaining questions.
Again, excuse the formatting. I copied your last questions and moved them to the end in [[CAPS]] for clarity, since the post is so complex. I hope it is clear.
I apologize ahead of time if I have merely confused the issues even more.
I assume more dialog will be forthcoming…
All the best,
- James
luminous wrote:
Hi James,
Since your answer is quite long, I have quoted quite a lot of it below:
Ultimately, one must arrive at a power response emitted from the loudspeaker system (the amplitude/response curve “emitted IN each direction”) that creates the “head related power response” (amplitude at each frequency “received FROM” each direction) that is appropriate to create timbral and spatial neutrality.
How do you define spatial neutrality? The spatial aspects of reproduction (sensation of room, envelopment, liveliness etc.) ought to be rather subject to individual taste, and most recordings are not meant for "sound field reproduction" but to create a pleasurable experience...
There are well-defined vector/amplitude sets that provide the most neutral timbre, but they are inextricably related to the use configuration model of the loudspeaker device.
This includes the boundary relationship of the transducer, including 'global' listening environment boundaries and 'local' enclosure boundaries.
Whether the device is boundary-coupled to the front wall, floor, or sidewall, or is free-standing, each case requires a very different radiation pattern. And each of these placement models has a range of parameters to be defined, again related to the environment and enclosure relationships (front of enclosure to boundary, vs. back and sides of enclosure to boundaries).
Do you mean that the "reference sound field" at the listening position differs with the use model?
Of course, the loudspeaker design would differ according to the loudspeaker use model if the same reference sound field at the listening position is the goal.
And by "There are well-defined vector/amplitude sets that provide the most neutral timbre", I don't know if I interpret vector/amplitude sets correctly, but to define a sound field at the listener that provides the most neutral timbre, I guess one would need to know how to go from "sound amplitude versus frequency versus angle of incidence versus time of incidence at the head" to perceived timbre. That sounds like a very difficult problem to me.
The use model also includes the relationship of the listener’s upper body to the transducer/enclosure, such as angle (vertical and horizontal), distance, etc.
Example: Depending on wave launch source position relationship to listener, the optimal power response will differ.
I guess it depends on what the goal of the source is, but if it is to generate the same perceived timbre regardless of source angle to the listener, then why should the design goal be different for different source angles? After all our brain should be expecting all modifications on the sound field that our body gives for different sound angles.
This is just a small subset of the variations that must be taken into consideration.
Unfortunately, most loudspeaker manufacturers still don’t define a specific use model (because they want to let people use the speakers wherever it is convenient), so for most loudspeakers it is not possible to accurately define an effective power response.
I agree that it is a pity that not more manufacturers define a "use model" for their loudspeakers. They will never have full control over how the speakers will sound in people's homes.
Each use model, once clearly defined, has a unique optimal radiation pattern/power response.
Once the use model, boundary relationship, and listener relationship are defined, a specialized set of measurements must be taken to calibrate the power response.
To be truly effective, this is not defined by just measuring the radiation pattern outward from the loudspeaker.
One approach is to measure the polar response outward from the device under test, applying a more precise angle-dependent weighting function in which certain angles, those tangent to the listening position, are differentiated from all other angles with a specific non-linear priority.
Then, with a varying boundary set (multiple room sizes and forms) one measures a “reception” power response at the listener’s torso (upper body).
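To make the idea of an angle-weighted power response more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the calibration method described above: the function names, the data layout (a dict of angle to per-frequency levels in dB), and the Gaussian-shaped priority weighting are all hypothetical choices made for the example. The one firm point it shows is that the averaging is done in the power domain rather than on dB values.

```python
import math


def listener_priority_weight(angle_deg, listener_angle_deg=0.0, width_deg=30.0):
    """Hypothetical non-linear weighting (an assumption for this sketch).

    Angles close to the listener direction get the highest weight;
    all other angles keep a small constant floor weight of 0.2.
    """
    d = angle_deg - listener_angle_deg
    return 0.2 + math.exp(-((d / width_deg) ** 2))


def weighted_power_response_db(polar_db, weights):
    """Weighted power average of polar measurements.

    polar_db: dict mapping angle in degrees -> list of levels in dB,
              one value per frequency bin (same length for every angle).
    weights:  dict mapping angle in degrees -> non-negative weight.
    Returns a list with one averaged level in dB per frequency bin.
    """
    angles = sorted(polar_db)
    total_weight = sum(weights[a] for a in angles)
    n_bins = len(polar_db[angles[0]])
    result = []
    for k in range(n_bins):
        # Convert dB to relative power, average, then convert back to dB.
        power = sum(weights[a] * 10 ** (polar_db[a][k] / 10) for a in angles)
        result.append(10 * math.log10(power / total_weight))
    return result


# Illustrative (made-up) data: two angles, two frequency bins.
polar = {0: [80.0, 80.0], 90: [80.0, 70.0]}
equal = {0: 1.0, 90: 1.0}
prioritized = {a: listener_priority_weight(a) for a in polar}

flat_average = weighted_power_response_db(polar, equal)
listener_weighted = weighted_power_response_db(polar, prioritized)
```

With the prioritized weights, the 10 dB off-axis dip in the second bin pulls the average down less than with equal weights, which is the effect a listener-oriented weighting is meant to have.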
Do you take actual measurements at a listening position with a listener present? How do you know what such measurements should look like?
Ultimately, while measuring the power response “IN all directions” from the loudspeaker, one is really only providing a beginning step in defining what is more important, which is what is the amplitude response of arrivals at the listener “FROM all directions”.
By having a defined set of amplitude vs. frequency arrivals “from each direction” one can work back to the loudspeaker device being calibrated for a power response of how it should radiate “in all directions”.
This is combined with a few other adjustments and design processes that impact the sound field around the listener’s head.
I think the most interesting problem to discuss here is what kind of sound field you want to have at the listening location - if you know that, then it's a separate problem how you should design loudspeakers for a certain use model to achieve this.
To make the matter even more complex, the ideal power response is not a single channel, emission definition, but must be calibrated to the channel count employed to achieve an optimal spatial response while maintaining timbral neutrality. This requires the power response from each loudspeaker to be recalibrated differently for “stereo” if timbre is to be maintained and “multi-channel coloration” minimized.
This is an interesting subject as well… Multiple loudspeakers will of course add in a complex manner. I imagine that they will add differently at a microphone than at a listener's head. And it should also depend on the music signal - if the sound is correlated or uncorrelated between the channels.
As the next step, depending on the architecture of the loudspeaker, one ideally optimizes the polar response(s) to maintain neutrality across a listening window wide enough for at least three listeners seated beside each other. Some loudspeaker system topologies allow for this adaptation more than others.
Again, depending on the type of emission architecture (dipole, monopole, free-standing, ½ space, ¼ space..., etc.) and use model (listener/loudspeaker/environment relationship) the above stated calibration technique will result in a unique amplitude/vector set for each system type.
Hopefully, what I have written so far is of interest, even though it does not provide a quick and simple answer to the very important question that was asked.
I’ll see if there is feedback on what I’ve written so far, and if there is interest, we can explore further towards a more complete answer.
For my part there is a large interest in these questions. Thanks for the meaty post!

/Viktor
[[OF COURSE, THE LOUDSPEAKER DESIGN WOULD DIFFER ACCORDING TO THE LOUDSPEAKER USE MODEL IF THE SAME REFERENCE SOUND FIELD AT THE LISTENING POSITION IS THE GOAL.
AND BY "THERE ARE WELL-DEFINED VECTOR/AMPLITUDE SETS THAT PROVIDE THE MOST NEUTRAL TIMBRE", I DON'T KNOW IF I INTERPRET VECTOR/AMPLITUDE SETS CORRECTLY, BUT TO DEFINE A SOUND FIELD AT THE LISTENER THAT PROVIDES THE MOST NEUTRAL TIMBRE, I GUESS ONE WOULD NEED TO KNOW HOW TO GO FROM "SOUND AMPLITUDE VERSUS FREQUENCY VERSUS ANGLE OF INCIDENCE VERSUS TIME OF INCIDENCE AT THE HEAD" TO PERCEIVED TIMBRE. THAT SOUNDS LIKE A VERY DIFFICULT PROBLEM TO ME.
I GUESS IT DEPENDS ON WHAT THE GOAL OF THE SOURCE IS, BUT IF IT IS TO GENERATE THE SAME PERCEIVED TIMBRE REGARDLESS OF SOURCE ANGLE TO THE LISTENER, THEN WHY SHOULD THE DESIGN GOAL BE DIFFERENT FOR DIFFERENT SOURCE ANGLES? AFTER ALL OUR BRAIN SHOULD BE EXPECTING ALL MODIFICATIONS ON THE SOUND FIELD THAT OUR BODY GIVES FOR DIFFERENT SOUND ANGLES.]]
One example: if the vertical source angle is non-zero, the ears will receive a different frequency response due to angular pinnae interaction, and the perceived vertical location and size of the source will also be altered. So, for a non-zero vertical angle, the response from the loudspeaker may need to be altered to simulate a zero vertical angle arrival.
Besides vertical angle, different source types, such as point source or line source, may each need a unique amplitude-vector set to achieve a similar perceived sound field and natural timbre.
[[I AGREE THAT IT IS A PITY THAT NOT MORE MANUFACTURERS DEFINE A "USE MODEL" FOR THEIR LOUDSPEAKERS.
THEY WILL NEVER HAVE FULL CONTROL OVER HOW THE SPEAKERS WILL SOUND IN PEOPLE'S HOMES.]]
Never say never. ☺
[[DO YOU TAKE ACTUAL MEASUREMENTS AT A LISTENING POSITION WITH A LISTENER PRESENT? HOW DO YOU KNOW WHAT SUCH MEASUREMENTS SHOULD LOOK LIKE?]]
Yes, we take measurements, but what matters is difficult to capture and quantify with conventional techniques. It requires an unusual, proprietary technique to achieve meaningful results.
Many folks tend to measure loudspeakers and sound fields with a single microphone. Most engineers with psycho-acoustical backgrounds recognize that one microphone doesn’t give us an accurate picture of what an ear-brain system senses in an acoustic space where a head is placed, because we have two ears.
But the truth is that even a dual microphone arrangement or dummy head binaural system doesn’t begin to replicate the dual-ear/brain system. This is for a number of reasons, but a significant issue is that we don’t have just two ears.
When we listen to any sonic event, we don’t hold our heads still. Our two ears are constantly moving in space, and a lot of how our hearing characterizes the spatial aspect (and, to a degree, the tonal aspect) is related to the movements of our head and ears in the space around us. Our ear-brain system is constantly roaming, “sampling” the three-dimensional space, analyzing it, and recalculating the information to create a dynamic, head-related, multi-dimensional transfer function.
If you clamp someone’s head and hold it in a fixed position, not allowing them to move, they lose part of their ability to accurately characterize a spatial event. Even though we don’t realize it, we move our heads all the time as we listen, tilting and rotating the angles of our ears on a micro basis.
This movement can be captured with accelerometers and translated to dynamic ear positions.
(This is a method that can be applied to headphones to create more accurate, out-of-the-head spatial development of reproduced program material.)
[[I THINK THE MOST INTERESTING PROBLEM TO DISCUSS HERE IS WHAT KIND OF SOUND FIELD YOU WANT TO HAVE AT THE LISTENING LOCATION - IF YOU KNOW THAT, THEN IT'S A SEPARATE PROBLEM HOW YOU SHOULD DESIGN LOUDSPEAKERS FOR A CERTAIN USE MODEL TO ACHIEVE THIS.]]
The relationship can be defined, but, as I have hopefully conveyed in the previous discussion, it is not a simple, linear one. The multi-variable, dynamic, interactive nature of loudspeaker emission and near-head sound field requires a recursive method of matching the radiation of the loudspeaker to the desired sound field at the listener. As much as it would be nice to provide a simplified representation, it would be misleading to do so.
[[THIS IS AN INTERESTING SUBJECT AS WELL… MULTIPLE LOUDSPEAKERS WILL OF COURSE ADD IN A COMPLEX MANNER. I IMAGINE THAT THEY WILL ADD DIFFERENTLY AT A MICROPHONE THAN AT A LISTENER'S HEAD. AND IT SHOULD ALSO DEPEND ON THE MUSIC SIGNAL - IF THE SOUND IS CORRELATED OR UNCORRELATED BETWEEN THE CHANNELS.]]
Yes, you are correct in that the summation is complex and has very little correlation between a single microphone and the ear-brain system. (As we discussed above, even monophonic transmission is received in a manner that is much more complex than any current microphone can interpret).
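For reference, the idealized single-point case (what one microphone would report, not what the ear-brain system does with it) can be worked through in a few lines of Python. This is a textbook simplification under stated assumptions: fully correlated signals are taken to arrive exactly in phase so their pressures add, and uncorrelated signals are taken to add in power. The function name is hypothetical.

```python
import math


def summed_level_db(levels_db, correlated):
    """Combined level of several sources at a single measurement point.

    correlated=True:  identical, in-phase signals; pressures add.
    correlated=False: mutually uncorrelated signals; powers add.
    Both cases are idealizations that ignore phase differences,
    room interaction, and the presence of a head.
    """
    if correlated:
        pressure = sum(10 ** (level / 20) for level in levels_db)
        return 20 * math.log10(pressure)
    power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(power)


# Two loudspeakers, each producing 80 dB at the measurement point.
coherent_sum = summed_level_db([80.0, 80.0], correlated=True)
incoherent_sum = summed_level_db([80.0, 80.0], correlated=False)
```

Two equal sources gain about 6 dB when fully correlated and about 3 dB when uncorrelated, so the same pair of loudspeakers already measures differently depending on the program material, before any head-related effects are considered.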
The application of two-channel stereo must, of course, be integrated into the characterization from the beginning. It doesn’t work to start with optimization of mono channels and then just add two of them to have stereo. If one is to operate a two-channel stereo-based system, the system should be developed as a two-channel system starting with the most basic elements of design.
The geometric relationship between a single pair of loudspeakers (passive, without signal processing) and the listener has an optimal use model that begins with, and is set by, achieving the minimum inter-aural cross-correlation, which happens at horizontal angles of approximately +/- 21 degrees. At these angles there is an opportunity for correlated and uncorrelated signals to be sorted out more effectively around the headspace. Use models that don’t provide the +/- 21 degree angle are difficult to optimize without incorporating signal processing.
Correlated signals that are still correlated when they arrive after transit through the room must have their secondary arrivals either suppressed in level or decorrelated by way of boundary interaction. Central front wall reflections and floor and ceiling reflections all fall into this category. If these are not suppressed, spatial and tonal corruption will be difficult to overcome.
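For readers who want to experiment with inter-aural cross-correlation themselves, below is a minimal Python sketch of the commonly used coefficient: the maximum of the normalized cross-correlation between the two ear signals over lags of about +/- 1 ms. It is an assumption-laden illustration, not the measurement behind the +/- 21 degree figure: the function name, the plain-list inputs, and the 1 ms window default are choices made for the example, and real use would need binaural recordings at the listener's ears.

```python
import math


def interaural_cross_correlation(left, right, sample_rate, max_lag_ms=1.0):
    """Maximum absolute normalized cross-correlation within +/- max_lag_ms.

    left, right: equal-rate sample sequences for the two ear signals.
    Returns a value between 0.0 (no similarity found) and 1.0 (identical).
    """
    n = min(len(left), len(right))
    max_lag = int(round(sample_rate * max_lag_ms / 1000.0))
    energy = math.sqrt(
        sum(x * x for x in left[:n]) * sum(y * y for y in right[:n])
    )
    if energy == 0.0:
        return 0.0  # silent input: define the coefficient as zero
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        best = max(best, abs(acc) / energy)
    return best


# Synthetic test signals (made up for the example): a 1 kHz tone at 48 kHz,
# and the same tone delayed by 10 samples to mimic an inter-aural delay.
rate = 48000
tone = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(480)]
delayed = [0.0] * 10 + tone[:-10]

identical_value = interaural_cross_correlation(tone, tone, rate)
delayed_value = interaural_cross_correlation(tone, delayed, rate)
```

A lower value means the two ear signals are less alike within the lag window, which is the quantity being minimized when the text speaks of minimum inter-aural cross-correlation.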
Ultimately, throughout all the issues being discussed here, it is important to keep in mind that angular arrivals don’t just impact spatial perception, but they impact tonal reception as well. So, as one makes any angle changes (vertical or horizontal) one must monitor both the spatial and spectral impact.
The stereo realm is, by definition, one that doesn’t quite work from a mathematical transfer function standpoint. But by optimizing the system through a better understanding of how the ear-brain receives the signals, and by fortunately having the forgiveness of our ear-brain system’s powerful ability to adapt to the circumstances, we have the potential for a surprisingly good facsimile of the live event.
Again, I suspect these answers are probably of little help and/or interest, as they don’t give rule-of-thumb, application-specific information. But I hope they provide groundwork that establishes, at least partially, what some of the variables are and what must be dealt with before establishing a final calibration.
I will continue to attempt to provide useful answers to questions, but I am concerned that it will be hard to satisfy everyone. If there are too many questions in one post, or questions that are too complex, it is difficult to provide an answer that does the subject justice. On the other hand, if a question is simple and narrow, it most often has so many interactive variables that it can’t be answered as simply as it was asked.
I always found that I couldn’t get my complete answers from someone giving a lecture or writing an article, but what they could provide was a pointer to help me see where to look to find my answers. If what I have to say here is at least thought-provoking in that manner, I’ll be satisfied.
That said, I’ll do my best to work towards answers that are useful and maybe over time in our discussion, we can arrive at a more complete understanding of how to achieve the best solutions.
Cheers,
- James