baygeldin/hamony_explained.md

## hamony_explained.md

      
            
          
    Table of Contents


The physical science

How is a sound produced?
What is a sound?
How does it all relate to the volume of a sound?
How does it all relate to the pitch of a sound?
How do string instruments generate sound?
What properties of a string affect the pitch?
What is a tone?
Harmonics

What is a fundamental tone?
What is an overtone?
How do overtones sum up?
What are harmonics?
What are harmonic series?


What is a timbre?
What is a vibrato?


The computational science

Why is it important?
What is cartoon physics?
Why does the brain recognize timbre?
What is a virtual pitch?

How does the brain compute it?
Do animals compute it?


What is a relative pitch?

How does the brain compute it?
How does the brain normalize the tones?


What does a brain consider a sweet sound?
What's interesting to the brain?

How to achieve simplicity?
How to achieve complexity?


How does the brain recognize?


How it all applies to harmonic music

What is a major triad?
How is a major scale derived?
What about other triads (e.g. a D Major)?
What is a wolf interval?
What is the equal temperament (i.e. tuning)?
How is a minor scale derived?
Chords

What are chords?
What is the difference between simple and complex chords?
The standard chord dictionary (in the C key)
What is the chord voicing?
What standard chords construct the harmonic series?
What are unstable chords and why do they want to resolve into stable chords?
Why do ambiguous chords still sound good?


Miscellaneous objections

What is a circle of fifths?

Why are perfect fifths so important?
Is it somehow fundamental?
Why is the circle useful?
Why do chords that are close on the circle of fifths sound good together?


Is it really universal or just cultural?
Why may un-harmonic sounds still be pleasurable?
Why does Hermann von Helmholtz fail to explain the theory?
What about the rhythm?
What about the melody?
Why does music convey emotion?


This is all just a theory (yet a very convincing one). I've also added some additional information from the web (particularly in the physical science part).

The physical science


How is a sound produced?

Sound is produced when something vibrates. The vibrating body causes the medium (water, air, etc.) around it to vibrate. Vibrations in air are called traveling longitudinal waves, which we can hear.

What is a sound?

A sound is a wave which our ears know how to detect (i.e. to hear). Sound waves consist of areas of high and low pressure called compressions and rarefactions, respectively. A wave is usually represented as a sine wave (i.e. sinusoid) where the upper part of the curve represents compressions and the lower part represents rarefactions.

How does it all relate to the volume of a sound?

The height of the wave is the amplitude and it defines the loudness.

How does it all relate to the pitch of a sound?

The length of a full cycle of a wave is called the wavelength. Sound travels through air at about 343 meters per second (at roughly 20 °C), so for our purposes the speed is a constant. Thus, frequency is determined by speed / wavelength. The longer the wavelength, the lower the frequency and the lower the pitch.
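The relation above can be sketched in a few lines. This is only an illustration: the function name and the default speed (343 m/s, i.e. air at about 20 °C) are assumptions, not something from the original notes.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: air at about 20 °C


def frequency_hz(wavelength_m: float, speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Frequency of a wave from its wavelength: f = speed / wavelength."""
    if wavelength_m <= 0:
        raise ValueError("wavelength must be positive")
    return speed_m_per_s / wavelength_m


# Doubling the wavelength halves the frequency, i.e. lowers the pitch.
print(frequency_hz(0.5))  # 686.0
print(frequency_hz(1.0))  # 343.0
```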

How do string instruments generate sound?

Strings don't make much sound on their own; they need an amplifier (e.g. a sound box) made of something that vibrates easily. On the guitar, the string vibrations are picked up by the bridge, then by the sound box, and then by the air inside it.

What properties of a string affect the pitch?

The thinner, tighter and shorter the string, the faster it vibrates. The thicker, looser and longer the string, the slower it vibrates.
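For an ideal string, these three dependencies are usually summarized by a textbook formula (often called Mersenne's law), which is not part of the original notes: f = sqrt(T / mu) / (2 * L), where L is the length, T the tension and mu the mass per unit length. A minimal sketch, with made-up string values:

```python
import math


def string_frequency_hz(length_m: float, tension_n: float, mass_per_length_kg_m: float) -> float:
    """Fundamental frequency of an ideal string: f = sqrt(T / mu) / (2 * L)."""
    return math.sqrt(tension_n / mass_per_length_kg_m) / (2.0 * length_m)


# Hypothetical string: 0.65 m long, 70 N of tension, 1 g per metre.
base = string_frequency_hz(0.65, 70.0, 0.001)
shorter = string_frequency_hz(0.325, 70.0, 0.001)  # half the length
tighter = string_frequency_hz(0.65, 280.0, 0.001)  # four times the tension
thicker = string_frequency_hz(0.65, 70.0, 0.004)   # four times the mass per length

# Ratios to the base string: roughly 2, 2 and 0.5.
print(shorter / base, tighter / base, thicker / base)
```

Real strings are stiff, so this is only an approximation.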

What is a tone?

A tone can be expressed simply as a wave frequency in Hertz (Hz), the number of cycles per second. A sine wave is a pure tone. Oscillators can generate tones, and oscilloscopes can display them.

Harmonics


What is a fundamental tone?

The fundamental tone is the lowest frequency at which the string or air column vibrates as a whole.

What is an overtone?

The form of an actual sound wave is always more complex than a sine wave, because a vibrating body vibrates not only as a whole, but also in each of its parts, and thus generates additional sound waves which interfere with each other. These additional sound waves are called overtones.

How do overtones sum up?

When two waves meet, there can be two kinds of interference patterns: constructive and destructive. Constructive interference is when two waveforms in phase are added together. The peaks add with the peaks, and the troughs add with the troughs, creating a louder sound. Destructive interference occurs when two waves are out of phase (the peaks of one line up with the troughs of the other). In this case, the peaks cancel out the troughs, creating a diminished waveform.
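Both cases can be checked numerically by adding sampled sine waves. The sketch below is illustrative only; the helper names, the 100 Hz tone and the 8000 Hz sample rate are arbitrary assumptions.

```python
import math


def sine_samples(freq_hz: float, phase_rad: float = 0.0, n: int = 1000, sample_rate: int = 8000):
    """Sample a unit-amplitude sine wave at the given frequency and phase."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate + phase_rad) for i in range(n)]


def add_waves(a, b):
    """Superpose two sampled waves point by point."""
    return [x + y for x, y in zip(a, b)]


def peak(wave):
    """Largest absolute sample value, a rough stand-in for loudness."""
    return max(abs(x) for x in wave)


tone = sine_samples(100.0)
in_phase = add_waves(tone, sine_samples(100.0))               # peaks meet peaks
out_of_phase = add_waves(tone, sine_samples(100.0, math.pi))  # peaks meet troughs

print(peak(in_phase))      # about 2: twice the amplitude
print(peak(out_of_phase))  # about 0: the waves cancel
```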

What are harmonics?

Harmonics are harmonic overtones. Non-harmonic overtones result in noise (i.e. sounds with ambiguous pitch, like percussion). Harmonic overtones support the fundamental frequency. Harmonic overtones always equal the fundamental frequency multiplied by a whole number. Harmonic overtones start and finish at the same phase of the fundamental frequency.

What are harmonic series?

This sequence of tones forming a note is called the harmonic series or overtone series of the fundamental tone.
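Since harmonic overtones are whole-number multiples of the fundamental, the ideal series is easy to generate. A small sketch (the function name is an assumption):

```python
def harmonic_series(fundamental_hz: float, count: int):
    """First `count` tones of the ideal harmonic series: f, 2f, 3f, ..."""
    return [fundamental_hz * n for n in range(1, count + 1)]


print(harmonic_series(220.0, 7))
# [220.0, 440.0, 660.0, 880.0, 1100.0, 1320.0, 1540.0]
```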

What is a timbre?

Each instrument has its own overtone profile, which is like a fingerprint. It's called a timbre.

The timbre of a sound is the principal feature that distinguishes the growl of a lion from the purr of a cat, the crack of thunder from the crash of ocean waves. Timbral discrimination is so acute in humans that most of us can recognize hundreds of different voices. We can even tell whether someone close to us – our mother, our spouse – is happy or sad, healthy or coming down with a cold, based on the timbre of that voice.
Timbre is a consequence of the overtones. When you hear a saxophone playing a tone with a fundamental frequency of 220 Hz, you are actually hearing many tones, not just one. The other tones you hear are integer multiples of the fundamental: 440, 660, 880, 1100, 1320, 1540, etc. The different tones – the overtones – have different intensities, and so we hear them as having different loudnesses. The particular pattern of loudnesses for these tones is distinctive of the saxophone, and they are what give rise to its unique tonal color, its unique sound – its timbre. A violin playing the same written note (220 Hz) will have overtones at the same frequencies, but the pattern of how loud each one is with respect to the others will be different. Indeed, for each instrument, there exists a unique pattern of overtones. For one instrument, the second overtone might be louder than in another, while the fifth overtone might be softer. Virtually all of the tonal variation we hear – the quality that gives a trumpet its trumpetiness and that gives a piano its pianoness – comes from the unique way in which the loudnesses of the overtones are distributed.
– This is Your Brain on Music by Daniel J. Levitin

The distortions that the overtone series of a given instrument makes to the ideal Harmonic Series are a predictable, systematic function of the kind of instrument. This means that two notes (two series of overtones) made by the same kind of instrument will be distorted from the ideal Harmonic Series in the same (or a similar) way.

What is a vibrato?

Vibrato is a musical effect consisting of a regular, pulsating change of pitch.

The computational science


Why is it important?

It is as fundamental as physical science.

Computational laws/idioms/patterns/algorithms are universal: the brain works using a combination of simple computational algorithms of which we are likely already aware.

If we view our brain as a computational machine, this is an important conjecture, because it allows us to reason about why we perceive sound this way and not another.

What is cartoon physics?

To make an object look round, shade its surface more heavily the more it bends away from the viewer, and put highlights where the light source would reflect off of it. This is cartoon physics.

The brain uses cartoon physics, that is, physics that is easy to compute, but not necessarily faithfully accurate to reality.

Both the use of cartoon physics and the inaccuracy of cartoon physics are due to the simple fact that the brain is computationally limited.
Possibly, the brain is using "cartoon physics" when processing sounds as well.

Why does the brain recognize timbre?

Finding the difference between what we hear and the ideal harmonic series is a valuable tool for recognizing people and determining their emotional state.

Finding harmonics is a common and important problem, so the brain has hardware for recognizing the Harmonic Series.


What is a virtual pitch?

If the harmonic series is processed to remove the fundamental tone and then played to a person, that person will hear the note, including the fundamental tone, even though it is not played. This is called the virtual pitch. It is what allows engineers to fake bass notes on small speakers.

How does the brain compute it?

It searches for the greatest common divisor of overtones' frequencies.

A tone having strong partials with frequencies of 800, 1000, and 1200 Hz will have a virtual pitch corresponding to the 200 Hz missing fundamental, as in Demonstration 20. If each of these partials is shifted upward by 20 Hz, however, they are no longer exact harmonics of any fundamental frequency around 200 Hz. The auditory system will accept them as being "nearly harmonic" and identify a virtual pitch slightly above 200 Hz (approximately 1/3 * (820/4 + 1020/5 + 1220/6) = 204 Hz in this case). The auditory system appears to search for a "nearly common factor" in the frequencies of the partials.
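The two cases in the quote can be reproduced with a short sketch. The exact case uses a greatest common divisor; the "nearly harmonic" case averages each partial divided by its nearest harmonic number, as in the quoted arithmetic. The function names and the idea of passing in a rough guess of the fundamental (to pick the harmonic numbers) are assumptions made for illustration, not a claim about what the auditory system does.

```python
from functools import reduce
from math import gcd


def exact_virtual_pitch(partials_hz):
    """Greatest common divisor of integer partial frequencies."""
    return reduce(gcd, partials_hz)


def approximate_virtual_pitch(partials_hz, rough_fundamental_hz):
    """Average of partial / harmonic_number, with harmonic numbers chosen
    by rounding against a rough guess of the fundamental (an assumption)."""
    estimates = []
    for partial in partials_hz:
        harmonic_number = max(1, round(partial / rough_fundamental_hz))
        estimates.append(partial / harmonic_number)
    return sum(estimates) / len(estimates)


print(exact_virtual_pitch([800, 1000, 1200]))                         # 200
print(round(approximate_virtual_pitch([820, 1020, 1220], 200.0), 1))  # 204.1
```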


Do animals compute it?

Yes, so it seems that this algorithm is fundamental.

When I was in graduate school, my advisor, Mike Posner, told me about the work of a graduate student in biology, Petr Janata…. Peter [sic] placed electrodes in the inferior colliculus of the barn owl, part of its auditory system. Then, he played the owls a version of Strauss's "The Blue Danube Waltz" made up of tones [by "tones" here he means what we are calling "notes": each note is an entire series of overtones] from which the fundamental frequency [what we are calling the fundamental tone of the overtone series] had been removed. Petr hypothesized that if the missing fundamental is restored at the early levels of auditory processing, neurons in the owl's inferior colliculus should fire at the rate of the missing fundamental. This was exactly what he found. And because the electrodes put out a small electrical signal with each firing – and because the firing rate is the same as a frequency of firing – Petr sent the output of these electrodes to a small amplifier, and played back the sound of the owl's neurons through a loudspeaker. What he heard was astonishing; the melody of "The Blue Danube Waltz" sang clearly from the loudspeakers: ba da da da da, deet deet, deet deet. We were hearing the firing rates of the neurons and they were identical to the frequency of the missing fundamental. The harmonic series has an instantiation not just in the early levels of auditory processing, but in a completely different species.
– This is Your Brain on Music by Daniel J. Levitin


What is a relative pitch?

Relative pitch is the ability of a person to identify or re-create a given musical note by comparing it to a reference note and identifying the interval between those two notes.

How does the brain compute it?

Having separate hardware in the brain for recognizing each combination of tones that co-occur in nature is sub-optimal and it would just be an expensive way to use up neurons. The algorithm every engineer resorts to in this situation, and what I suspect the brain does also, is to find a way to re-use code. So, it probably computes ratios of tones.

The brain normalizes tones by dividing tones to get tone ratios.


How does the brain normalize the tones?

Processing sound requires operating on frequencies over several orders of magnitude. Thus, normalization is needed for the code re-use. So, let's consider that the brain is probably halving or doubling the frequency of a wave until it is within a particular range before passing it to the recognizer that calculates the ratios.

The brain normalizes tones by halving or doubling them until within a particular frequency range spanned by a factor of two.

If this were so, then tones (and notes) that differ from each other by a factor of two would sound very much alike. And indeed they do! The range of notes that are all within one factor of two is called in music an octave. The difference between the C3 note and the C4 note is that the C4 frequency equals the C3 frequency multiplied by 2.
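The conjectured normalization step can be sketched as a loop that halves or doubles a frequency until it lands in a range spanning a factor of two. The function name and the default range (starting near C4, about 261.63 Hz) are assumptions chosen for illustration.

```python
def normalize_to_octave(freq_hz: float, low_hz: float = 261.63) -> float:
    """Halve or double freq_hz until it lies in [low_hz, 2 * low_hz)."""
    if freq_hz <= 0 or low_hz <= 0:
        raise ValueError("frequencies must be positive")
    high_hz = 2 * low_hz
    while freq_hz >= high_hz:
        freq_hz /= 2
    while freq_hz < low_hz:
        freq_hz *= 2
    return freq_hz


# Notes an octave (or several) apart collapse onto the same value.
print(normalize_to_octave(880.0, low_hz=220.0))  # 220.0
print(normalize_to_octave(55.0, low_hz=220.0))   # 220.0
print(normalize_to_octave(330.0, low_hz=220.0))  # 330.0
```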

Here is a fundamental quality of music. Note names repeat because of a perceptual phenomenon that corresponds to the doubling and halving of frequencies. When we double or halve a frequency, we end up with a note that sounds remarkably similar to the one we started out with. This relationship, a frequency ratio of 2:1 or 1:2, is called the octave. It is so important that, in spite of the large differences that exist between musical cultures – between Indian, Balinese, European, Middle Eastern, Chinese, and so on – every culture we know of has the octave as the basis for its music, even if it has little else in common with other musical traditions.
– This is Your Brain on Music by Daniel J. Levitin


What does a brain consider a sweet sound?


Absence of distortion (or personality or timbre) is sweetness.

So, let's assume that the ideal harmonic series is sweetness. Instruments are not ideal, and neither are their harmonic series. Yet, when we play multiple notes we can produce a more pleasing sound than by playing a single note. Let's see how this conjecture relates to that.
Piano strings are not the strings of ideal physics and they don't make an ideal harmonic series: each tone in the series is moved by being multiplied by some factor. This factor should therefore be somewhat consistent across strings, because strings have the same physical properties.
If we play a single note, the ratio of an overtone to the fundamental has this factor in it. However, if we play two notes, we can now compute the ratio between corresponding overtones of each note, and that ratio won't have the factor in it (i.e. it will be pure). This doesn't mean, though, that pure intervals between corresponding overtones would save the harmony from being destroyed by the dissonance of the intervals between the overtones of a single note.

Therefore we see that note ratios induce a set of the same tone ratios. Further these tone ratios are pure, have balanced amplitudes, and are all of the same interval.

So, by playing multiple notes on instruments having the same (or similar) timbre, and relying on relative pitch to subtract the differences for us, we can magically recreate parts of the ideal harmonic series from distorted overtone series!
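Here is a toy model of that cancellation. It assumes that overtone n of a note sits at fundamental * n * stretch[n], where the stretch factors are the same for every note of the instrument; the factor values below are invented for illustration.

```python
import math


def overtones(fundamental_hz: float, stretch):
    """Distorted overtone series: tone n is fundamental * n * stretch factor n."""
    return [fundamental_hz * n * s for n, s in enumerate(stretch, start=1)]


STRETCH = [1.0, 1.001, 1.003, 1.006]  # hypothetical instrument "fingerprint"

low_note = overtones(200.0, STRETCH)
high_note = overtones(300.0, STRETCH)  # a perfect fifth (3/2) above

# Within one note, the ratios to the fundamental carry the distortion...
within = [tone / low_note[0] for tone in low_note]
# ...but between corresponding overtones of two notes it cancels out.
across = [high / low for high, low in zip(high_note, low_note)]

print(within)  # roughly 1, 2.002, 3.009, 4.024
print(across)  # every entry roughly 1.5
```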

What's interesting to the brain?

There is an art to balancing the simplicity and complexity: if understanding and predicting a storyline are too easy, then it is boring, and if too hard, then it is noise, but if just right, then it is interesting.

How to achieve simplicity?

Having a theme for all of the elements of a given situation reduces the surprise, but allows the brain to construct a whole from the parts.

The brain wants input to have a theme. That is, the brain both infers themes from input and uses themes as context when processing input.


How to achieve complexity?


Much of the brain is a massive disambiguation engine that is running all the time and is functioning at its computational limit.

Thus, by introducing ambiguity (like we do in punchlines and story plots) we can surprise the brain.

The brain enjoys having its disambiguation engine teased.


How does the brain recognize?

Humans naturally abstract; that is, they retain the features that are important for a given purpose and discard the rest. An abstraction is a reduced amount of information that still serves the purpose.

The brain uses feature vectors for recognition.


How it all applies to harmonic music


What is a major triad?

Let's take the middle C (i.e. C4) and try to find notes that the brain wants to hear together. We multiply the frequency by a whole number and divide by 2 as many times as needed to stay in the same octave (or near it, at least) for simplicity. Besides C4 itself, this process gives us G4 (from harmonic 3) and E4 (from harmonic 5). The ratios of C4, E4 and G4 frequencies to the fundamental note (i.e. C4) are 1, 5/4 and 3/2 respectively. This is called a major triad and it consists of a root, a major third and a perfect fifth.
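The "multiply by a whole number, then halve back into the octave" step can be written down directly with exact fractions. The helper name is an assumption; the sketch only reproduces the ratios stated above.

```python
from fractions import Fraction


def fold_into_octave(ratio: Fraction) -> Fraction:
    """Halve or double a ratio until it lies in [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


# Harmonics 1, 3 and 5 of the root, brought back into one octave.
triad = sorted(fold_into_octave(Fraction(n)) for n in (1, 3, 5))
print([str(r) for r in triad])  # ['1', '5/4', '3/2']
```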

How is a major scale derived?

A major triad sounds good, so let's make some more triads, for example by making the perfect fifth of one triad the root of the next and dividing by 2 when necessary to keep everything within the same octave. We get 7 notes with different ratios to the middle C. Let's plot them!
But our brain is listening to the multiplicative ratios of frequencies, not the additive distances. So, for this plot to mean something, we would want equal ratios to show up equally on the plot. How do we turn (multiplicative) ratios into (additive) distances?

After going through the logarithm, multiplicative factors turn into additive increments.

So, we apply the log2 to ratios. Now we sort the logarithms and remove duplicates. Let's give them letter names and for some strange reason let's start at C instead of A and wrap around.

C: log₂  1/1 ~ 0.000.
D: log₂  9/8 ~ 0.170.
E: log₂  5/4 ~ 0.322.
F: log₂  4/3 ~ 0.415.
G: log₂  3/2 ~ 0.585.
A: log₂  5/3 ~ 0.737.
B: log₂ 15/8 ~ 0.907.
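The list above can be regenerated with a short sketch. It assumes the three chained triads are rooted on F (a fifth below C), C, and G (a fifth above C); the variable and function names are made up for illustration.

```python
from fractions import Fraction
from math import log2


def fold_into_octave(ratio: Fraction) -> Fraction:
    """Halve or double a ratio until it lies in [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


MAJOR_TRIAD = (Fraction(1), Fraction(5, 4), Fraction(3, 2))
TRIAD_ROOTS = (Fraction(2, 3), Fraction(1), Fraction(3, 2))  # F, C, G relative to C

# Build all three triads, fold them into one octave, drop duplicates, sort.
scale = sorted({fold_into_octave(root * step) for root in TRIAD_ROOTS for step in MAJOR_TRIAD})

for name, ratio in zip("CDEFGAB", scale):
    print(f"{name}: log2 {str(ratio):>4} ~ {log2(ratio):.3f}")
```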

Now let's plot them on the unit interval to within 0.02 units.

C       D       E    F       G       A       B    C
+----+----+----+----+----+----+----+----+----+----+
0    1    2    3    4    5    6    7    8    9    0

Doesn't it look familiar?

C   #   D   #   E    F   #   G   #   A   #   B    C
+----+----+----+----+----+----+----+----+----+----+
0    1    2    3    4    5    6    7    8    9    0

How about now?
It seems like the major scale is somehow fundamental, and that's why it's almost built into the notation. If you play notes by going up the white keys of a piano keyboard one step at a time, which is the same as going up the alternating lines and spaces of an unadorned musical score, you are playing the C Major Scale.
Notice that there was no resorting to the following arguments:

"Because the Ancient Greeks did it this way."
"Because if the notes were equally spaced your ear would lose its place."
"Because your culture has trained these notes into your ear since you were a baby."


What about other triads (e.g. a D Major)?

Uh, oh, we don't have all the notes for the D Major Triad in our C Major Scale (the white keys). You can repeat the construction of the C Major Scale above and discover that the missing note is F#.
So if we measure carefully, we notice that the intervals are not all exactly right for playing another key, such as D Major. In fact, if we do more key changes, moving, say, repeatedly "up" by a Perfect Fifth again (beyond D), some of the other triads will be even less right and will start to sound really bad.
Let's compute the interval between A and D (i.e. the ratio of frequencies). First, we get the ratio of each note to the fundamental and then compute their ratio:

We got D as the Fifth above G, and G as the Fifth above C.
(We divide by 2 to keep it in the same Octave:)
D = (3/2) * (3/2) / 2 = (9/4) / 2     = 9/8.
We got A as the Third above F, and F as the Fifth below C.
(We multiply by 2 to keep it in the same Octave:)
A = (1/(3/2)) * (5/4) * 2 = (2/3) * (5/4) * 2 = 5/3.
A over D is therefore
A / D = (5/3) / (9/8) = (5*8)/(3*9) = 40/27 ~ 1.481.
Whereas a Perfect Fifth should be        3/2  = 1.5.
The error is therefore
(PerfectFifth - (A/D)) / PerfectFifth
= ((3/2) - (40/27)) / (3/2)
~ 0.0123456790123457
~ 1.2%.
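The same arithmetic can be checked with exact fractions. This is just a sketch of the computation above; the helper and constant names are assumptions.

```python
from fractions import Fraction


def fold_into_octave(ratio: Fraction) -> Fraction:
    """Halve or double a ratio until it lies in [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


FIFTH = Fraction(3, 2)
MAJOR_THIRD = Fraction(5, 4)

g = fold_into_octave(FIFTH)            # fifth above C: 3/2
d = fold_into_octave(g * FIFTH)        # fifth above G: 9/8
f = fold_into_octave(1 / FIFTH)        # fifth below C: 4/3
a = fold_into_octave(f * MAJOR_THIRD)  # third above F: 5/3

a_over_d = a / d
error = (FIFTH - a_over_d) / FIFTH
print(a_over_d, error)  # 40/27 1/81
```

The relative error of 1/81 is the roughly 1.2% found by hand.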

This is because a whole number of just perfect fifths will never add up to a whole number of octaves, because they are incommensurable. There are more constraints than variables to play with. Engineers call this situation "the problem is over-constrained".
So, if we want to change key and use D as the fundamental instead of C, should we buy a new piano or deal with the fact that the D Major Triad doesn't sound as good?
There's no easy solution for this and many were proposed. They all amounted to fudging the actual note values so that instead of three triads sounding really right and the rest pretty "off", some would sound less right and others less off. These different settings of the frequencies of the note values were called different tunings.

What is a wolf interval?

Wolf intervals are an artifact of keyboard design: they are forced on meantone tunings by the one-dimensional piano-style keyboard. If you go "up" by enough perfect fifths you come to a key that turns out to be maximally bad (it can't keep getting worse because there are only 12 keys), namely the key of F# (also known as Gb). This key sounded so bad it was called the Wolf Key.

What is the equal temperament (i.e. tuning)?

Someone realized that if we just make all the steps equally far apart, well, it sounds sort of ok once you get used to it. It's called equal temperament.
Notice that when all the notes are equally spaced, the semi-tone ratio is the number that, multiplied by itself 12 times, gives an Octave, or a factor of 2. Every time you go up a semi-tone, you add about six percent to the frequency of the note. Let's call this ratio of the interval of a semi-tone TwR2 (twelfth-root of 2).
Now none of the keys are tuned really right — now they all sound a little "off" — some of the sweetness is permanently gone.
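To see how far "a little off" is, one can compare equal-tempered intervals with the pure ratios used earlier. A rough sketch; the function name and the choice of intervals are assumptions.

```python
TWR2 = 2 ** (1 / 12)  # twelfth root of 2, about 1.0595


def equal_tempered_ratio(semitones: int) -> float:
    """Frequency ratio of an interval spanning the given number of semi-tones."""
    return TWR2 ** semitones


# (semi-tones, pure ratio) for a few intervals from the major scale.
PURE_INTERVALS = {
    "major third": (4, 5 / 4),
    "perfect fourth": (5, 4 / 3),
    "perfect fifth": (7, 3 / 2),
}

for name, (semitones, pure) in PURE_INTERVALS.items():
    tempered = equal_tempered_ratio(semitones)
    error_percent = 100 * (tempered - pure) / pure
    print(f"{name}: tempered {tempered:.4f}, pure {pure:.4f}, error {error_percent:+.2f}%")
```

The fifth comes out very close to pure, while the major third is noticeably sharp.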

In our insistence on symmetry we have lost both some sweetness and some richness – a common theme of Modernism.

But when the notes were all tuned to make one key sound perfect, other keys sounded "off" in different ways and this could be used for dramatic effect by the composer, so when we play that old music today on a modern keyboard, we no longer hear it as it was intended.

How is a minor scale derived?

There is only one major scale but many minor scales. The major scale sounds happy while minor scales sound sad, probably because the major scale is sweet due to the absence of distortion while minor scales are a little bit off.

There is one way for things to go right, but many ways for things to go wrong.

Here are the harmonics of the major triad:

Root:            1 = 1.0.
Major Third:   5/4 = 1.25.
Perfect Fifth: 3/2 = 1.5.

And here's a set of ratios of pairwise intervals of the major triad:

Major Third   / Root:        (5/4) / (1)   = 5/4.
Perfect Fifth / Root:        (3/2) / (1)   = 3/2.
Perfect Fifth / Major Third: (3/2) / (5/4) = 6/5.

Now let's tweak the harmonics a bit:

Root:            1 = 1.0.
Minor Third:   6/5 = 1.2 (new!).
Perfect Fifth: 3/2 = 1.5.

Now the triad is not as sweet, because it's a little bit off, yet look at the intervals:

Perfect Fifth / Minor Third: (3/2) / (6/5) = 5/4.
Perfect Fifth / Root:        (3/2) / (1)   = 3/2.
Minor Third   / Root:        (6/5) / (1)   = 6/5.

They are the same! Our brain will think that this is the same set of intervals due to the phenomenon of relative pitch. Now comes the art. If the brain is hearing the series but also the relative intervals, we can tease it, for artistic purposes, by giving it the right intervals but in the wrong order.
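The claim that both triads contain the same pairwise intervals is easy to verify with exact fractions. A small sketch (the function name is an assumption):

```python
from fractions import Fraction
from itertools import combinations


def pairwise_intervals(chord):
    """Sorted ratios between every pair of notes in the chord (higher / lower)."""
    return sorted(high / low for low, high in combinations(sorted(chord), 2))


major_triad = [Fraction(1), Fraction(5, 4), Fraction(3, 2)]
minor_triad = [Fraction(1), Fraction(6, 5), Fraction(3, 2)]

print([str(r) for r in pairwise_intervals(major_triad)])  # ['6/5', '5/4', '3/2']
print(pairwise_intervals(major_triad) == pairwise_intervals(minor_triad))  # True
```

The interval sets match; only the order in which the thirds are stacked differs.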
The way the Minor Triad will sound to the brain is:

I hear the pairwise intervals as the harmonic series
but I do not hear the harmonic series itself (something is missing)

The theory is that the minor scales are just games with intervals where we tease the harmonic series recognizer of the brain with partial recognition: we fire some of the intervals in the feature vector for the harmonic series, but not the whole series itself.

Chords


What are chords?

Let's call these groups of notes that sound like something chords.

What is the difference between simple and complex chords?

High-theme/low-complexity chords are the major triad and harmonic series chords. Low-theme/high-complexity chords are minor and ambiguous chords. As we progress, more and more of the theme of the harmonic series is lost and more and more complexity is introduced. Musically untrained listeners like major chords, while more musically trained listeners are more tolerant of loss of theme and more interested in complexity.

The standard chord dictionary (in the C key)


| Name | Notes (root C) | Semi-tones from fundamental |
| --- | --- | --- |
| C "Major Triad" | C-E-G | 0-4-7 |
| Cm "Minor Triad" | C-Eb-G | 0-3-7 |
| Cdim "Diminished" | C-Eb-Gb | 0-3-6 |
| C+ "Augmented" | C-E-G# | 0-4-8 |
| Csus "Suspended" | C-F-G | 0-5-7 |
| C6 "Sixth" | C-E-G-A | 0-4-7-9 |
| Cm6 "Minor Sixth" | C-Eb-G-A | 0-3-7-9 |
| C7 "Dominant Seventh" | C-E-G-Bb | 0-4-7-10 |
| Cmaj7 "Major Seventh" | C-E-G-B | 0-4-7-11 |
| Cm7 "Minor (Dominant) Seventh" | C-Eb-G-Bb | 0-3-7-10 |
| Cdim7 "Diminished Seventh" | C-Eb-Gb-A | 0-3-6-9 |
| C(add9) "Add 9" | C-E-G-D | 0-4-7-14 |
| C9 "Ninth" | C-E-G-Bb-D | 0-4-7-10-14 |
| Cmaj9 "Major Ninth" | C-E-G-B-D | 0-4-7-11-14 |
| Cm9 "Minor Ninth" | C-Eb-G-Bb-D | 0-3-7-10-14 |
| C11 "Eleventh" | C-E-G-Bb-D-F | 0-4-7-10-14-17 |
| C13 "Thirteenth" | C-E-G-Bb-D-A | 0-4-7-10-14-21 |
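The semi-tone column is what makes the dictionary portable to other keys: the note names follow from the offsets and the root. A minimal sketch with a handful of the entries above; the names `FLAT_NAMES`, `CHORDS` and `chord_notes` are made up, and the spelling always uses flats (so F# shows up as Gb).

```python
FLAT_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]

# Semi-tone offsets from the root for a few chord types in the table.
CHORDS = {
    "major": (0, 4, 7),
    "minor": (0, 3, 7),
    "dominant seventh": (0, 4, 7, 10),
    "diminished seventh": (0, 3, 6, 9),
    "minor ninth": (0, 3, 7, 10, 14),
}


def chord_notes(kind: str, root: int = 0) -> str:
    """Spell a chord; root is a semi-tone index (0 = C, 2 = D, 7 = G, ...)."""
    return "-".join(FLAT_NAMES[(root + offset) % 12] for offset in CHORDS[kind])


print(chord_notes("dominant seventh"))  # C-E-G-Bb
print(chord_notes("minor ninth"))       # C-Eb-G-Bb-D
print(chord_notes("major", root=7))     # G-B-D
```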


What is the chord voicing?


If you play all the notes of a chord within one Octave, it sounds like mud.

All of this dividing and multiplying of intervals by factors of two as we move across Octaves has to stop someplace. No such processing of signals can be taken to an extreme: the more you do it the more it corrupts the signal. In fact, selecting which octaves in which to play the notes of a chord is so important it has a term: voicing the chord.
Playing the notes of a chord in octaves that put them closer to the ideal harmonic series really sounds better which supports the theory that notes sound good together because they are all from one harmonic series, rather than because their intervals are just somehow special.

What standard chords construct the harmonic series?

The major triad's three notes correspond to harmonics 1, 3 and 5. What if we keep going? We skip even numbers, since dividing them by 2 gives harmonics we already have. The next number is 7. But no note in our Twelve-Semi-Tone Scale is terribly close to 7/4 (harmonic 7 brought down into the octave). This is just the way the math works out. In this situation we'll just pick a key on one side or the other of the real 7/4.

Ninth        9/8 = 1.125;  log 1.125 / log TwR2 ~ 2.039.
Eleventh    11/8 = 1.375;  log 1.375 / log TwR2 ~ 5.513.
Thirteenth  13/8 = 1.625;  log 1.625 / log TwR2 ~ 8.405.
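These positions can be recomputed by folding each harmonic into one octave and measuring it in equal-tempered semi-tones, which is the same as dividing its logarithm by log TwR2. A short sketch; the function name is an assumption, and harmonic 7 is included for comparison.

```python
from math import log2


def semitones_above_root(harmonic: int) -> float:
    """Position of a harmonic in semi-tones above the root, within one octave."""
    ratio = float(harmonic)
    while ratio >= 2:
        ratio /= 2
    return 12 * log2(ratio)  # same as log(ratio) / log(TwR2)


for harmonic in (3, 5, 7, 9, 11, 13):
    position = semitones_above_root(harmonic)
    nearest_key = round(position)
    print(f"harmonic {harmonic}: {position:.3f} semi-tones, off by {position - nearest_key:+.3f}")
```

Harmonics 3, 5 and 9 land within about 0.14 semi-tones of a key, while 7, 11 and 13 miss by roughly a third to a half of a semi-tone.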

Hmm, well those eleventh and thirteenth harmonics are pretty badly approximated by the piano keyboard! But let's play the higher harmonics in higher octaves to make them sound at least a little bit better. So, we get these chords that somehow approximate the harmonic series to one side or the other:


| Name | Notes (root C) | Semi-tones from fundamental |
| --- | --- | --- |
| C "Major Triad" | C-E-G | 0-4-7 |
| C7 "Dominant Seventh" | C-E-G-Bb | 0-4-7-10 |
| C(add9) "Add 9" | C-E-G-D | 0-4-7-14 |
| C9 "Ninth" | C-E-G-Bb-D | 0-4-7-10-14 |
| C11 "Eleventh" | C-E-G-Bb-D-F | 0-4-7-10-14-17 |
| C13 "Thirteenth" | C-E-G-Bb-D-A | 0-4-7-10-14-21 |


What are unstable chords and why do they want to resolve into stable chords?


The brain wants to hear one harmonic series.

If we drop a couple of notes, our brain will fill in those gaps (remember the virtual pitch?). But what if we play so few notes that the implied harmonic series is ambiguous, so that the missing Harmonic Series could be completed in more than one way?
Some chords are ambiguous and therefore unstable: if we give the brain more than one alternative then the sound is "unsettled" until the player provides enough notes to "break symmetry" and disambiguate the series.


What are suspended chords?
One ambiguity is to have two instances of the interval from root to harmonic 3, the Perfect Fifth.
If we play notes C, F and G, we create the possibility of either F or C being the Root (there are two perfect fifths: from F up to C and from C up to G). This chord is called the suspended chord and musicians say that it wants to resolve to a major triad at F or C.


What are augmented chords?
Another ambiguity is to have two instances of the interval from root to harmonic 5, the Major Third. For example, if we play notes C, E and G# we have this situation. Counting carefully, note that these three notes actually make three major thirds! Unsurprisingly, it doesn't sound very satisfying, but the brain does "recognize it as something", as opposed to sounding like noise. These are augmented chords.


What are diminished chords?
Another ambiguity we can create is to have two instances of the minor third. For example, if we play notes C, Eb and Gb we have this situation and if we add the note A, we have not only three but four (they wrap around) copies of this interval all at once! These are diminished chords.


Why do ambiguous chords still sound good?

While rare and ambiguous chords may sound strange in isolation, the theme created by the music preceding the chord may give them a certain sense. Think of one standard structure for a joke: a story (creating a theme) and then a punchline; the punchline would not be funny in isolation without the context provided by the story, and yet we attribute the funniness of the joke to the punchline and not the story which did the work.

Miscellaneous objections


What is a circle of fifths?

If you start at, say, middle C, and keep going up by a perfect fifth, occasionally stepping back an octave, eventually the notes will repeat and you will come back to the middle C. This is called the circle of fifths.

Why are perfect fifths so important?

Of all the intervals, the fifth is the most sweet (near the bottom of the harmonic series), and interesting (not so harmonic as to be boring; that is, not the octave interval).

Is it somehow fundamental?

No, the circle is just a combinatorial coincidence. You visit every note because a fifth is 7 half-steps, the octave is 12 half-steps, and 7 is "relatively prime" to 12: they have no common divisors other than 1. We went up by a factor of 3/2 twelve times and went down an octave 7 times:

(3/2)¹² / 2⁷ = 3¹² / 2¹⁹ = 1.0136432647705078125 (exact!)

When that small amount of error is spread out evenly over twelve fifth intervals, you can see how the equally tempered scale is rather appealing. It gets Fifths almost exactly right, to within almost a tenth of a percent.
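Both halves of the argument can be checked in a few lines: stepping by 7 half-steps modulo 12 visits every note exactly once, and twelve pure fifths overshoot seven octaves by the small factor computed above. The names below are assumptions, and sharps are used for all the black keys.

```python
from fractions import Fraction

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def circle_of_fifths(start: int = 0):
    """Twelve steps of 7 half-steps each, wrapped into one octave."""
    return [NOTE_NAMES[(start + 7 * step) % 12] for step in range(12)]


circle = circle_of_fifths()
print(circle)
# ['C', 'G', 'D', 'A', 'E', 'B', 'F#', 'C#', 'G#', 'D#', 'A#', 'F']

# Twelve pure fifths up, seven octaves down: not quite back where we started.
overshoot = Fraction(3, 2) ** 12 / 2 ** 7
print(overshoot, float(overshoot))  # 531441/524288, about 1.01364
```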

Why is the circle useful?

Musicians would often like to make transitions between chords that are not too jarring, or that sound natural. One really common way to do this is for the second chord to have as its root a note of the first chord, such as its fifth or third.
Notice that moving by either a fifth, a major third, or a minor third gets you to different places on the Circle, and therefore by combining these different transitions, you can go one direction in the "harmonic space", and come back by another direction!

Why do chords that are close on the circle of fifths sound good together?

It's possible that our brain really remembers the previous chord played and expects the next chord to share some overtones with it to maintain the harmony. This would explain why the closer two chords are on the circle of fifths, the better they sound together. Try combining C major with other chords on the circle and you'll notice that the best combinations are with F major, G major and A minor. These chords have the most similar harmonics to C major.

Is it really universal or just cultural?


Nearly all this variation in context and sound comes from different ways of dividing up the octave and, in virtually every case we know of, dividing it up into no more than twelve tones. Although it has been claimed that Indian and Arab-Persian music use "microtuning" – scales with intervals much smaller than a semitone – close analysis reveals that their scales also rely on twelve or fewer tones and the others are simply expressive variations, glissandos (continuous glides from one tone to another), and momentary passing tones, similar to the American blues tradition of sliding into a note for emotional purposes.

Many cultures use only five notes in their scale; any such scale is called pentatonic. Much African and Chinese music makes use of pentatonic scales and yet African and Chinese people seem to have no difficulty enjoying Western music. Their brains were always capable of hearing Western music, but their culture has simply never made use of the rest of the available parameter space. Just because the brain is capable of experiencing something does not mean that the art of that culture has taken advantage of that fact.

Why does un-harmonic still may be pleasurable?

Play C and F# together on a piano. It sounds awful. This interval is called the augmented fourth or diminished fifth (the tritone). Yet it's still used in music. Probably this is just a different sort of pleasure, not the natural one.

The Pleasure Artists feel in hearing much of that compos'd in the modern Taste, is not the natural Pleasure arising from Melody or Harmony of Sounds, but of the same kind with the Pleasure we feel on seeing the surprizing Feats of Tumblers and Rope Dancers, who execute difficult Things. For my part, I take this to be really the Case and suppose it the Reason why those who being unpractis'd in Music, and therefore unacquainted with those Difficulties, have little or no Pleasure in hearing this Music. Many Pieces of it are mere Compositions of Tricks. I have sometimes at a Concert attended by a common Audience plac'd myself so as to see all their Faces, and observ'd no Signs of Pleasure in them during the Performance of much that was admir'd by the Performers themselves; while a plain old Scottish Tune, which they disdain'd and could scarcely be prevail'd on to play, gave manifest and general Delight.
– Ben Franklin in a letter to Lord Kames, June 2, 1765


Why Hermann von Helmholtz fails to explain theory?

The 19th-century German physicist Hermann von Helmholtz tried to explain music theory via the notion of beats.

A beat is an interference between two sounds of slightly different frequencies, perceived as periodic variations in volume whose rate is the difference between the two frequencies…. Beating can also be heard between notes that are near to, but not exactly, a harmonic interval, due to some harmonic of the first note beating with a harmonic of the second note.

So, he says that intervals are dissonant when their overtones beat against each other, and consonant when they don't.
However, Helmholtz was a physicist and completely ignored the computational part of things. He's not wrong, but the theory is strikingly incomplete because of this. He doesn't take into account the vertical intervals (i.e. between overtones), the virtual pitch, etc. He explains the pleasure of harmony as the absence of pain (i.e. the absence of beats), but it is clearly something more: the presence of something.
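The beat rate from the definition quoted above is simply the difference between the two frequencies. The sketch below applies it first to two slightly mistuned tones and then to the overtones of an equal-tempered fifth. The 220 Hz starting note and the `beat_rate` helper are assumptions chosen for illustration.

```python
def beat_rate(f1, f2):
    """Beats per second heard between two tones of nearby frequency (in Hz)."""
    return abs(f1 - f2)

# Two tones two hertz apart beat twice per second.
print(beat_rate(440.0, 442.0))  # 2.0

# An equal-tempered fifth above 220 Hz is slightly flat of the just ratio 3/2.
low = 220.0
high = low * 2 ** (7 / 12)

# The 3rd harmonic of the lower note nearly coincides with the 2nd harmonic
# of the upper note, so these two overtones beat slowly against each other.
overtone_beats = beat_rate(3 * low, 2 * high)
print(f"{overtone_beats:.3f} beats per second")
```

With a just fifth (`high = low * 3 / 2`) the two overtones coincide exactly and the beat rate drops to zero, which is the case Helmholtz treats as consonant.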

What about the rhythm?

Since much human relationship is expressed through voice and movement, is music therefore recalling to us experiences of vocally and physically relating to others and therefore also recalling the associated emotions? What if harmony is an abstraction of human voice and rhythm is an abstraction of physical movement, such as walking or dance? If musical harmony is our way of gratifying our harmonic-series detectors, is it possible that rhythm is a way of gratifying our movement detectors?
Regular moments of contact occur throughout our lives while we use our bodies, for example in breathing, heartbeat, speech, and walking. Rhythm is an abstraction of coordinated body movement.
Just as melody plays with the expectation of harmony, rhythm plays with the expectation of movement.

What about the melody?

Does the brain somehow "keep" notes for a period after they have finished actually sounding? It seems likely. It's really easy to repeat a sound at the same pitch you just heard.
Since anticipation and prediction are among the fundamental operations of the brain, they are at the heart of what we call narrative and of why narrative can be so entertaining to the mind. Recall that there is an art to balancing the simplicity of theme and the complexity of ambiguity. Any kind of expectation is useful to the brain, not just those that are computable from direct harmonic relationships. So, maybe melody just plays with this expectation.
Melody is the narrative that unifies harmony and rhythm.

Why does music convey emotion?

The voice is of great importance to humans as it is a primary way of relating to each other. Emotion is our ancient, pre-intellectual way of understanding each other, and therefore much emotion is communicated in voice. It is likely that the brain has much hardware devoted to processing voice, both for finding signal and separating out noise, and this hardware is being repurposed when listening to music. Movements also convey emotions, and therefore so does rhythm.

| Name | Notes (with C as root) | Semitones from fundamental |
| --- | --- | --- |
| C "Major Triad" | C-E-G | 0-4-7 |
| Cm "Minor Triad" | C-Eb-G | 0-3-7 |
| Cdim "Diminished" | C-Eb-Gb | 0-3-6 |
| C+ "Augmented" | C-E-G# | 0-4-8 |
| Csus "Suspended" | C-F-G | 0-5-7 |
| C6 "Sixth" | C-E-G-A | 0-4-7-9 |
| Cm6 "Minor Sixth" | C-Eb-G-A | 0-3-7-9 |
| C7 "Dominant Seventh" | C-E-G-Bb | 0-4-7-10 |
| Cmaj7 "Major Seventh" | C-E-G-B | 0-4-7-11 |
| Cm7 "Minor (Dominant) Seventh" | C-Eb-G-Bb | 0-3-7-10 |
| Cdim7 "Diminished Seventh" | C-Eb-Gb-A | 0-3-6-9 |
| C(add9) "Add 9" | C-E-G-D | 0-4-7-14 |
| C9 "Ninth" | C-E-G-Bb-D | 0-4-7-10-14 |
| Cmaj9 "Major Ninth" | C-E-G-B-D | 0-4-7-11-14 |
| Cm9 "Minor Ninth" | C-Eb-G-Bb-D | 0-3-7-10-14 |
| C11 "Eleventh" | C-E-G-Bb-D-F | 0-4-7-10-14-17 |
| C13 "Thirteenth" | C-E-G-Bb-D-A | 0-4-7-10-14-21 |
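
The semitone column is what makes the table reusable in any key: the same offsets applied to a different root give the same kind of chord. A minimal sketch of that idea follows. `CHORD_SHAPES` and `chord_notes` are illustrative names, only a few rows of the table are included, and notes are spelled with sharps only, so a flat such as Eb would come out as D#.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# A few rows of the table, as semitone offsets from the root.
CHORD_SHAPES = {
    "major": (0, 4, 7),
    "minor": (0, 3, 7),
    "dim": (0, 3, 6),
    "7": (0, 4, 7, 10),
    "maj7": (0, 4, 7, 11),
}

def chord_notes(root, shape):
    """Spell a chord by adding each offset to the root, wrapping at the octave."""
    start = NOTE_NAMES.index(root)
    return [NOTE_NAMES[(start + step) % 12] for step in CHORD_SHAPES[shape]]

print(chord_notes("C", "major"))  # ['C', 'E', 'G']
print(chord_notes("D", "7"))      # ['D', 'F#', 'A', 'C']
```

Wrapping with `% 12` discards the octave, so extended chords such as the ninth (offset 14) would be reduced to pitch classes rather than kept in their voiced position.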