Jump to ContentJump to Main Navigation
From Strange Simplicity to Complex FamiliarityA Treatise on Matter, Information, Life and Thought$

Manfred Eigen

Print publication date: 2013

Print ISBN-13: 9780198570219

Published to British Academy Scholarship Online: May 2013

DOI: 10.1093/acprof:oso/9780198570219.001.0001

Show Summary Details

(p.651) Appendix A3 On the Nature of Mathematical Proof

(p.651) Appendix A3 On the Nature of Mathematical Proof

Source:
From Strange Simplicity to Complex Familiarity
Publisher:
Oxford University Press

(p.651) Appendix A3

On the Nature of Mathematical Proof

1. Mathematical induction

The two main methods of mathematical proof are called “constructive” and “indirect”. Let me start with the constructive method of “mathematical induction” or, as it is also called, “complete induction”. It is the prototype of a constructive proof and the main principle for investigating the qualities of natural numbers: if a certain property holds for n = 1, the smallest of all natural numbers, and if from its validity for some n a similar validity for (n + 1) follows, then this assertion is true for all natural numbers. The proof may alternatively be based on a hypothesis, as is stated in Richard Courant and Herbert Robbins' classic book What is Mathematics? 1 (newly revised by Ian Stewart), to which I shall refer repeatedly in this appendix. I quote: “The fact that the proof of a theorem consists in the application of certain simple rules of logic does not dispose of the creative element in mathematics, which lies in the choice of the possibilities to be examined. The question of the origin of the hypothesis belongs to a domain in which no very general rules can be given; experiment, analogy and constructive intuition play their part here. But once the correct hypothesis is formulated, the principle of mathematical induction is often sufficient to provide the proof.”

In the examples I am going to discuss, the results (some of which are also quoted in the above‐mentioned book) are all well known. Moreover, to prove their correctness by mathematical induction is a relatively easy job, so that the authors (in certain cases) left this to their readers. I felt sufficiently challenged by one of the problems considered, namely the sums of arbitrary powers of natural numbers. One rainy weekend in my Californian refuge – such days occur anyway too rarely in Southern California – I recalled Courant's words quoted above. My ambition did not admit to contenting myself with a mere verification of the known solutions. Rather I wanted to show how certain surprising results can be visualised by geometrisation. I think that some of the problems treated can be recommended for advanced courses in high‐school mathematics. The problem I chose was the general solution for the sum of (p.652) the q‐th power of the first n natural numbers q n = Σ k=1 n k q for any power q of k. The classical solution of this problem is well known. It requires a knowledge of certain finite sums such as those defining the Euler and Bernoulli numbers2 (see below). My real motivation was to learn more about n‐dimensional point spaces, which play a part in my thoughts on information as described in Chapters 4 and 5.

1.1 The sum of integers 1n

I start with the trivial case of qn for q = 1. There is an amusing story of a boy at elementary school. The teacher, tired from lecturing all morning, gave his charges the problem of adding up all the numbers from one to a hundred, hoping to get some time for a little nap. But after a few moments one of the youngsters arrived at the solution, 5050, and he had to enlighten his startled teacher as to how he had got the result that quickly. The boy's name was Carl Friedrich Gauss, who did in his head what is depicted in Figure A3.1. This graphical depiction of the trick used by Gauss is one‐dimensional in nature. I say this because the method I am going to describe is based on n‐dimensional geometry. The fact that two one‐dimensional lines are involved is solely due to the arithmetical trick that Gauss used, which in its generalisation is based on mathematical induction.

1.2 1.2 The sum of squares 2n

The problem for q = 2 is only slightly more complex if we use a suitable geometrical illustration, which in this case has to be two‐dimensional (see Figure A3.2).

Closer inspection of Figure A3.2 reveals:

  1. (1) The sum of each vertical column, and hence as well of the diagonal, is given by 1 Σ n = n(n + 1 ) / 2

  2. (2) All horizontal rows are n‐fold repeats of their respective positional number k in the vertical column. Their sum for (any) position k is n·k

    
                  Figure A3.1 Maxwell's demon
                  Graphical representation of what happened in the mind of the elementary‐school boy whose name was Carl Friedrich Gauss

    Figure A3.1 Maxwell's demon

    Graphical representation of what happened in the mind of the elementary‐school boy whose name was Carl Friedrich Gauss

    (p.653)
    
                  Figure A3.2 Maxwell's demon
                  Two‐dimensional illustration for the calculation 2∑n, including a geometrical visualisation of complete induction

    Figure A3.2 Maxwell's demon

    Two‐dimensional illustration for the calculation 2n, including a geometrical visualisation of complete induction

  3. (3) 3) The total sum of all numbers in the table is n 1n = n2(n + 1)/2.

  4. (4) The sum of all squares, i.e. Σ k=1 n k 2 , is then represented by all red colored numbers of the lower‐lef triangle plus the diagonal.

    These are the conditions on which the construction of the diagram is based. In order to calculate the sum of squares 2 Σ n = Σ k=1 n k 2 , we need a fifth non‐trivial statement that can easily be proven by mathematical induction. This statement reads:

  5. (5) The sum of all red numbers in the lower‐left triangle (plus diagonal) for any n is twice the sum of all black numbers in the upper‐right (shaded) triangle.

The proof by complete induction is greatly aided by the geometrical construction in Figure A3.2. First we compare the sums of each horizontal row of the lower‐left triangle with their complementary vertical sum in the upper‐right triangle. The ratio of horizontal to vertical sum starts with 2:1, followed by 6:3, 12:6 etc., the value of each ratio being two. The same holds for the ratios of the sums in corresponding triangles. The proof can now be completed by induction - in our diagram by transition from n to n + 1. Tis means for the lower‐left triangle the addition of one horizontal row, the sum of which is (n + 1)n. (Note that the diagonal does not belong to either of the two triangles.) For the upper‐right triangle the transition means the addition of one vertical column: Σ k=1 n k=n(n + 1 )/2 , which holds for any number n. It proves that the corresponding horizontal sums in the lower‐lef triangle (plus (p.654) diagonal) of all red numbers are twice as large as those in the upper‐right triangle. Tus, the “geometrical” construction in this case provides a direct visualisation of “complete induction”, yielding (2n+1)/3 · 1 Σ n  or ( n 2 + n)(2n+1)/6 2 Σ n .

However, we see clearly from this simplest case of a higher (i.e. two) dimensional representation that we are not dealing with straightforward squares (i.e. n2) but rather with “weighted” ones. The “value” of each number in the k‐th horizontal line is “k”. Tis means a kind of “stretching” of one dimension, which will also occur in all higher‐dimensional cases.

1.3 The sum of cubes 3n

The geometrical representation introduced with 2n can be generalised by any higher dimension q for calculating qn. It appears to be more involved because we now require some spatial imagination. For the sum of cubes 3n the procedure is illustrated in Figures A3.3a and b.

The cube consists of n planes, each containing n2 equal numbers k, where k runs from 1 to n. From an inspection of the geometrical model we can make the following statements:

  1. (1) The sum of the cubes of integers 3n is given by the sum of all numbers in the (red) shaded planes, which include the “diagonal plane” from the front upper‐left point to the lower‐right edge (Figure A3.3a).

  2. (2) The total sum of all (red and black) numbers in the cube is 3 Σ tot = n 2 · 1 Σ n = n 3 (n+1)/2 .

  3. (3) The diagonal, the sum of which again equals 1n, but which does not need to be considered separately, divides the cube into two parts, i.e. the numbers in the red shaded area, the sum of which is 3∑n, and the non‐shaded rest of the numbers, the sum of which we call the “rest sum” 3rest.

  4. (4) There is no fxed proportion of 3rest to 3n, as was the case for the sum of squares. It rather depends on n. We therefore proceed to calculate 3rest using mathematical induction, which is represented geometrically in Figure A3.3b.

Enlarging the cube from (k − 1)3 to k3 positions requires an addition of three faces, which for convenience I have called the xy, xz and yz faces. Adding the bottom face (xy) is equivalent to adding k3, i.e. k2 (shaded) positions each having the value k For the xz and yz faces we add (2k − 1) sums of integers running from 1 to (k − 1). Since the bottom face belongs to the shaded part, the sums run only from 1 to (k − 1). Furthermore, since the two faces (xz and yz) have one edge in common, there are only (2k − 1) sums running from 1 to (k − 1). Hence 3rest in each single step changes by: (p.655)


                  Figure A3.3 Maxwell's demon
                  The three‐dimensional representation shows the way to treat the n‐dimensional problem (see text). (a) The points in the red‐shaded area yield the sum of cubes: 3∑n (b) How to proceed by mathematical induction by enlarging the cube (see text).

Figure A3.3 Maxwell's demon

The three‐dimensional representation shows the way to treat the n‐dimensional problem (see text). (a) The points in the red‐shaded area yield the sum of cubes: 3n (b) How to proceed by mathematical induction by enlarging the cube (see text).

(p.656)
δ 3 Σ rest = (2k 1 ) Σ l=1 k 1 1 = k(k 1 )(2k 1 )/2= k 3 3 k 2 / 2 + k/2

Now we can add up for the rest sum of the whole cube as defined above:

3 Σ rest = Σ k = 1 n k 3 3 2 Σ k = 1 n k 2 + 1 2 Σ k = 1 n k

Knowing the sum of squares 2 Σ n = (2n+1) 3 · 1 Σ n we arrive at:

3 Σ rest = 3 Σ n 3 2   2 Σ n + 1 2   1 Σ n 1 = 3 Σ n n 1 Σ n

This indicates to us already a symmetry, resulting from the particular geometry. Since the sum 3n + 3rest yields the “volume” of the whole space 3tot= n2 1n the sum 3n comes out to be ( n 2 + 1 ) 2 1 Σ n = ( 1 Σ n ) 2 .

For n → ∞ the sums 3n and 3rest converge to the same value n 2 2 1 Σ n yielding 3 Σ tot = n 2 1 Σ n .

Let me quote once more from the book of Courant et al.1, who refer in particular to this formula: “It should be remarked that although the principle of mathematical induction sufces to prove the formula once this formula has been written down, the proof gives no indication of how this formula was arrived at in the first place; why precisely the expression [n(n + 1)/2]2 should be guessed as an expression for the sum of the first n cubes, rather than any of the infinitely many expressions of a similar type that could have been considered.”

My first reaction was, too: “Wonders will never cease”. However, geometry offers a suggestive explanation. The particular space I have called 3Stot (for which 3 Σ tot = n 2   1 Σ n = 3 Σ n + 3 Σ rest ) yields a special symmetry among the two sums 3n and 3rest that is not present for any other dimension. In order to explain this let us now consider first the generalisation to any arbitrary dimension q.

1.4 The general case qn

So far I have applied mathematical induction more in a “physical sense”, namely by deriving results simply from quantitative inspection of geometrical structures. Let me conclude now by using mathematical induction to generalise the rules for an application to structures of higher dimensionality that are harder to visualise because of our lack of imagination. Nevertheless, the geometrical interpretation with the help of the three sums qtot, qn and qrest will remain the basis of our more abstract deliberations.

The four‐dimensional hypercube in Figure A3.4a will be the model that applies in a formally analogous way to any dimension q. We define, again, one co‐ordinate where k runs from 1 to n and call it the “vertical” axis. For each value k of the (p.657)


                  Figure A3.4 Maxwell's demon
                  In this figure the four‐dimensional case (q = 4), which we still have some ability to imagine, is used to introduce the generalisation. The proof cannot gain much further advantage from a geometric visualisation. The abstraction takes us close to the classical proof, which can be found in textbooks. Nevertheless, the geometrical model still provides some general insight in the case of large dimensions. (a) The complete state model for q = 4. (b) The enlargement of the subspace for increasing n as described in the text. (c) and (d) Artistic models of the enlargement.

Figure A3.4 Maxwell's demon

In this figure the four‐dimensional case (q = 4), which we still have some ability to imagine, is used to introduce the generalisation. The proof cannot gain much further advantage from a geometric visualisation. The abstraction takes us close to the classical proof, which can be found in textbooks. Nevertheless, the geometrical model still provides some general insight in the case of large dimensions. (a) The complete state model for q = 4. (b) The enlargement of the subspace for increasing n as described in the text. (c) and (d) Artistic models of the enlargement.

(p.658) “vertical” axis there is a (q − 1)-dimensional hyperface in which all positions have the same value k, quite analogous to the two‐dimensional planes in Figure A3.3b. Each of these hyperfaces contains nq−1 positions. In Figure A3.4b the three‐dimensional hyperface, referring to position k on the “vertical” axis, is highlighted in red, in order to distinguish it clearly from hyperfaces referring to other positions and to stress the analogy to the planes in Figure A3.3a. For dimensions q 〉 4 the situation is similar, except that the hyperplanes are more complex, higher‐dimensional structures with various overlappings of common faces in their lower‐dimensional hyperplanes, as shown for the four‐dimensional example in Figure 1.2.5 in the text. We again compare the total volume qtotwith the two volumes qn and qrest. Hence, as in the three‐dimensional case, the following statements hold:
  1. (1) The total sum of numbers in the hypercube qtot is given by nq−1 ·5∑n = (nq+1 + nq)/2.

  2. (2) The total sum consist of two parts: q Σ n = Σ k=1 n k q , i.e. the sum of powers q of integers 1 to n (the “unknown” we want to determine) and the rest sum q Σ rest = q Σ t ot q Σ n = n q 1 · 1 Σ n Σ k=1 n k q .

  3. (3) In Figure A3.4b q∑n is given by all red‐coloured symbols, while qrest is represented by all black symbols. Note that in the fgure the colours red and black distinguish two diferent classes of number arrangement.

  4. (4) As introduced with the three‐dimensional case we use the above relation between qtot, qn and qrest in its variational form in order to obtain an explicit expression for qrest. Its geometrical interpretation, as given for 3rest, is still obvious for Erest, but becomes increasingly cumbersome for larger dimensions because of our inability to imagine high‐dimensional structures and their sharing of common faces with lower hyperplanes.

The variational increase of the total sum when going from (k − 1) to k is:

δ q Σ tot = k q  (k+1)/2 (k 1 ) q · k/2= k q {(k+1) k(1 k 1 ) q }/2

where (1 − k−1)q can be expressed by its (finite) binomial power series yielding:

δ q Σ tot = k q 2 { 1 + q q(q 1 ) 2 ! k 1 + q(q 1 ) ( q 2 ) 3! k 2  …  ± q(q 1 ) ( q 2 )  … (q p+1) p ! k q+ 2 }  (last term  ±   p = q

In order to obtain the variational expression for the rest sum we have to subtract the hyperplane kq from the above expression: (p.659)

δ q Σ rest = q 1 2 k q q(q 1 ) 2 · 2 ! k q 1 + q(q 1 ) ( q 2 ) 2 · 3! k q 2 +  … (last term   k

yielding the total rest sum by adding up all δ terms:

q Σ rest = q 1 2 Σ k=1 n k q q(q 1 ) 2 · 2 ! Σ k=1 n k q 1 + q(q 1 ) ( q 2 ) 2 · 3! Σ k=1 n k q 2 +  … (last term   Σ k=1 n k

The equation qn = qtotqrest then contains as the only unknown qn, all other terms referring to sums with powers 〈 q:

q Σ n = Σ k=1 n k q = n q+1 q+1 + 1 q+1 { n q + q(q 1 ) 2 ! Σ k=1 n k q 1 q(q 1 ) (q 2 ) 3 ! Σ k=1 n q q 2 +  …  }  (last term  ± Σ k=1 n k)

Now let us look at hidden symmetries among the various sums.

qtot, which is the sum of qn and qrest,has the simple form of (nq+1 + nq)/2. The sum qn, according to the expression given above, is dominated by its “leading term” n(q+1)/(q + 1), which represents its limiting value for large values of n. For n = 100 the leading term is already about hundred times larger than all of the rest. We therefore write:

q Σ n = n q+1 + n q q+1 + q Δ n q+1

where q Δn includes all other terms listed above. Then qrest can be written as

q Σ rest = q 1 2   n q+1 + n q q+1   q Δ n q+1

Here we see immediately the symmetry of qn and qrest, the leading terms of which become identical for q = 3, where the sum of both equals the leading term of qtot, which is given by (n(q+1) + nq)/2. This symmetry is unique forq = 3 because it does not exist for any other arbitrary q value. Moreover, for q = 3, the first term in the above expression for qn becomes equal to n2 ·1n/2 while qΔn/(q + 1) is about n··1n /2, so that the whole expression 3n equals ( 1n ) 2. For analogous (p.660) reasons the term n·1n /2 has to be subtracted from the leading term, so that qrest becomes equal to 1 1n−1, again a unique symmetry for q = 3.

Now assume that any symmetry exists that allows us to write:

q Σ n = i Σ n · k Σ n

Representation of the sums by their “leading terms” would yield

n q+1 q+1 = n i +1 ( i +1) · n k +1 ( k +1)

which means that

(i+1)+(k+1)=q+1=(i+1)(k+1)

having the only solution:

i=k=1; q+1=2 × 2 = 2 + 2 = 4  and q=0+1+2=3

With this, the uniqueness of q = 3 is reduced to the trivial facts of the theory of natural numbers, namely 3 being the only number that is the sum of all its precursors and 4 being the only number given by x + x = x · x = xx, for x = 2.

The above rules hold also for any higher‐order products q= i· k· 1 that can always be reduced to a product of two sums. The general validity depends essentially on the fact that the leading terms for any q 〉 3 are devoid of the above symmetry, since (q 1 ) 2 is always (increasingly) larger than 1 for increasing q 〉 3.

In our case the solution for qn builds up iteratively from the solutions for smaller q values. Hence it is expressed in terms of powers of nq, nq−1, nq−2 etc. The same result was achieved in the classic solution in terms of Bernoulli numbers Bn defined by the generating function2

te xt e t 1 = Σ n=0 B n (x) t n n!  for  | t |   2 π

with B0 = 1, B1 = −1/2, B2 = 1/6, B4 = −1/30,…

It is interesting to notice certain symmetries, although none of them reaches that of 3/n = (1n)2 with its much simpler geometry.

My main incentive for treating the above problem is not the reproduction of expressions already known for the sums of powers of integers, but rather stressing the geometrical visualisation of the problem. Higher‐dimensional spaces are of interest in several parts of this book, for example in Sections 1.2, 1.8, 1.9, 2.9, 3.6, 4.4, 4.5 and 4.8. They are a major tool in treating the problems involved in the “generation of information”.

The final part of the proof, in which we go beyond dimension q = 4 and where our spatial imagination begins to fail, shows how the geometrical depiction converges (p.661) with the known general proof in which the Bernoulli numbers appear. My mathematician colleague and friend Andreas Dress tried to find out whether the proof given here was known in the literature, so that I could quote some reference. However, he could not find any reference, although the general opinion of his colleagues was that “geometrisation” of the problem might be familiar and considered to be “folklore”. To be serious again, I did not expect to present new mathematics in this book and I would be truly surprised if those deliberations have not been made before. Let me come to some conclusion.

1.5 Summary and assessment

The method introduced above involves geometrical visualisation. It uses the word “space” in the mathematical sense as an ordered set with a mathematically well‐defined structure. It is a geometry of sequences of natural numbers dominated by the sequence Σ k =0 n k= ( n 2 + n)/2 , calculated by Carl Friedrich Gauss when he - as the story goes - was a youngster at elementary school (see Figure A3.1). Dealing with discrete numbers the geometrical representation has to use point spaces, which in general are hyperspaces of arbitrary dimensionality q, denoted qStot. The index “tot” = total refers to the fact that this space consists of two subspaces: qSn and qSrest, with qSn comprising the set of the first n natural numbers of power q, i.e. Σ k =1 n k q and qSrest representing the set of all points o q∑tot a do not belong to qn. These spaces have the “seemingly simple” form of hypercubes, e.g. qtot = q(n−1 ) 1n, with qn−1 being the ((n − 1)-dimensional) hyperplane of the hypercube nq.

Why then do I call q(n−1) 1n a “seemingly simple” form of a hypercube? A “true” hypercube should be an analogue of a cube in three‐dimensional space. The quantity qn−1 is indeed a “true” hyperplane of the “true” hyperspace qn, but q(n−1) 1n is not because 1n = (n2 + n)/2 yields a cube with numbers valued differently in each of the n dimensions. It certainly is a hyperspace with peculiar properties and the total space qStot could be interpreted as a “hypercube” that “changes” its size in some complicated manner. “How” is seen by the calculations presented above, which involve “mathematical induction” in building up the stretched form. The calculated sums are derived from the three hyperspaces qStot, qSn and qSrest, and therefore still represent a “geometrisation” of the problem.

This can be shown in a simple way if we look at the largely dominating leading terms of the sum expressions obtained by the detailed calculations. The leading term is the one with the highest power of k that represents a good approximation at large values of n (such as 100 or more). The maximum error then may be around 1%. The leading terms for all values of q have the general form for q∑n: nq+1/(q + 1), for qtot: nq+1/2 and for qrest: ((q − 1)/2)·(nq+1/(q + 1)). If we compare the expression for qn with that for a “true” hypercube, such as qn, we see that with (p.662) the increase in n the hyperspace qSn undergoes inflation of its size (or volume) in the “vertical” direction in proportion to n/2. This is pictured for a four‐dimensional space (q = 4) in Figure A3.4b, c and d, where the latter two fgures are more an artistic refection on this effect. Compared with a normal hypercube nq the “volume” of which is the sum of n of its hyperplanes, i.e. n × nq−1 = nq, the hyperspace qStot consists of n of its infating hyperplanes n(nq−1n/2) = nq+1/2. Going back to q∑n and qrest we see that for q 〉 3 the sum qrest infates faster than qn by the factor (q − 1)/2. This is the gross behaviour, neglecting all the details of the sums which in the geometrical picture are caused by overlapping of hyperplanes (of all dimensions) planes and edges. If we want to calculate these details precisely we have to apply “mathematical induction” in each step of the iteration because the changes occur by going from k to k + 1 for all k values from 1 to n. A geometrical visualisation of this procedure (for q = 2, 3 and 4) was shown in Figures A3.2, A3.3 and A3.4. The symmetries in qn found for those spaces can be traced back to the peculiarities that distinguish the numbers 2, 3 and 4 from larger numbers (see Figure A3.2). I am sure that the geometrisation will uncover further secrets at higher dimensions, but this is not the place to go into more detail.

In physics we do not yet have an answer to the questions “why strings” and “why do we live in a three‐dimensional world for which our brain developed some intuitive access”? Fundamental theories of physics (such as string theory) are based on higher‐dimensional spaces. Maybe the symmetries observed in our geometrical visualisation are of some relevance in this connection. At the “big bang” they started from a singularity that could “materialise” only in the form of a one‐dimensional string infating further to higher dimensions.

In conclusion let me return to Richard Courant1, who has reminded us that “constructive intuition” as much as the “rules of logic [such as mathematical induction] play their part” in the extraordinary efciency of mathematics for the problems of physics, which lef such a deep impression on Eugene Wigner. I met Richard Courant on several visits to the USA. He always was curious to hear about Göttingen, where he lived until 1933. Afer the war, the city of Göttingen made him an honorary citizen. Since the same honour was granted on me, many years later, our pictures now hang in the city hall opposite one another, allowing us to indulge forever in “virtual communication”.

2. Indirect proof*

Cantor's diagonal argument is a famous example of an indirect method of mathematical proof. Reasoning in this case is based on assuming tentatively the contrary (p.663) of the statement to be proven and then demonstrating some absurdity or antinomy in the resulting conclusions. Hence the truth of the original statement is proven as a consequence of the logical principle of the “excluded middle” (tertium non datur). Cantor's diagonal argument3, which I mentioned repeatedly in other sections, has found numerous applications. I choose it as an example in this appendix because of its relevance for an appreciation of biological complexity and its astonishing consequences, discussed in Chapters 4 and 5. The following presentation is based on both an early text by Cantor's contemporary Felix Klein4, one of the great representatives of the Göttingen mathematical school (see Section 3.6), and the book by Courant and Robbins quoted in the first part of this appendix.

Cantor's argument, at first sight, might sound simple. However, it is anything but simple or trivial, and accordingly it found several opponents among his contemporary colleagues, many of them highly distinguished mathematicians. I would rather call the concept ingenious. What is the diference? Something trivial must be both obvious and true, while an ingenious idea cannot be obvious — otherwise anyone could have seen it. Yes, it may even be strange but, nevertheless, it must be true.

Let me therefore go back somewhat further. Cantor3 is one of the founders of set theory and thereby provided a mathematical analysis of infinity. The most prominent example — even for non‐mathematicians — is that of the infinite set of the sequence of positive integers, 1, 2, 3, … referred to in Appendix 1. It has “no end”; that is what the Latin adjective “infinitus” means. Adding 1 to any number n in this sequence defines the next larger integer. And this never reaches an end; but it establishes one of its most important characteristics. The infinite set of all positive (as well as of all negative) integers is “denumerable”.

Infinity is usually expressed by the symbol 8, but this does not mean that ∞ can be thought of being something like an ordinary number. This leads us immediately to some important consequences regarding the power or potency (German: Mächtigkeit) of a set, which is called its “cardinality”. If a set were finite, e.g. if the sequence of positive integers were to stop at any finite natural number n, then the total set of n natural numbers could not be equivalent to any of its proper subsets. But for an infinite set, Cantor showed that this is possible. For instance, there is a one‐to‐one correspondence between the set of all natural numbers and its subset of even numbers:

1 2 3 4 5… 2 4 6 8 10…

The two are equivalent denumerable sets.

Cantor denoted the cardinality of a set by the first symbol of the Hebrew alphabet, called aleph (אּ). The “smallest” infinite set is that of natural numbers, being assigned (p.664) the cardinality אּ0. Many other, more complicated, sets have the same cardinality because one can establish a one‐to‐one correspondence with the set of integers, and it is the fact of countability or denumerability that is expressed in the cardinality of a set. Accordingly, אּ0 has all sorts of odd properties, such as

א 0 + 1 = א 0   o r   ×   א 0 = א 0   o r   א 0 n = א 0

where n is a finite number. The most surprising fact is that the set of all rational numbers also has the same cardinality אּ0. This is surprising because we are used to the fact that rationals distribute themselves (fairly) densely between any two subsequent integers. Yes, there are infinitely more rationals than integers, but they are denumerable because each is made up of a pair of integers (p, q), i.e. p/q, yielding the cardinality of אּ2 0 = אּ0. The denumerability of the set is explained in the scheme shown in Figure A3.5a and b (following Courant et al.1).

The sequence obtained (right‐hand figure) contains every rational number, showing that the total set is denumerable. The important point for the discussion below is that all points can be connected by a line which, however, covers a two‐dimensional plane — according to the unlimited variation of both p and q.

Does the countability of rational numbers mean that any infinite set is denumerable? The answer to this question is Cantor's legacy: There are infinities that are uncountable and therefore must be “bigger” than the infinity of natural numbers. And here is Cantor's proof in the words of his contemporary Felix Klein4: “Let us write the set C1, containing the real numbers 0 〈 x 〈 1, as decimal fractions, all being brought into a denumerable series where the a, b, c, … are numerals 0, 1, …, 9 of any possible choice and sequence” (Figure A3.5 is redrawn from Klein's monograph). Klein continues with a warning against the choice of the numerals 0 and 9


                  Figure A3.5 Maxwell's demon

                        
                           
                              a) If p and q are all natural numbers, each point in the diagram represents one rational number. The diagram shows clearly that all rationals can be connected by one continuous line. The total set of rationals is therefore denumerable and has the same cardinality אּ0 as all natural numbers.
                           
                           
                              b) The red diagonal connects all points p/q = 1. Above this line, p/q is larger than 1, below the line it is smaller than 1. Both sequences of natural numbers p and q are unlimited.

Figure A3.5 Maxwell's demon

  1. a) If p and q are all natural numbers, each point in the diagram represents one rational number. The diagram shows clearly that all rationals can be connected by one continuous line. The total set of rationals is therefore denumerable and has the same cardinality אּ0 as all natural numbers.

  2. b) The red diagonal connects all points p/q = 1. Above this line, p/q is larger than 1, below the line it is smaller than 1. Both sequences of natural numbers p and q are unlimited.

(p.665) because 0.999… and 1.000… (both continued infinitely) are defined as being equal to one another. He then also stresses the necessity of always using an infinite number of decimal places, which for all real numbers excludes those with a finite number of decimal places, and replacing them by infinite decimals that close with an unlimited number of nines. He then continues: “In order to obtain a decimal fraction x′ which difers from all sequences xi contained in the table we give prominence to the numbers a1, b2, c3,… which in the above representation occupy the diagonal of the table (red colour) and replace them by a′1, b′2, c3,…, each a′1, b′2, c′3 and so on being diferent from a1, b2, c3, etc”. In order to prevent x from terminating at any finite position one should avoid 0 and 9 in any replacement. Felix Klein then remarks that the emphasis upon the diagonal determines the name of the method (see graph in Figure A3.6).

Cantors proof is of an indirect nature. It is shown that by assuming C1 to be denumerable contradictions are introduced. In order to demonstrate this, Cantor considered the “diagonal” consisting of the numerals 0, a1, b2, c3,…, keeping in mind that he was dealing with an infinite number of those numerals. He replaced all the numerals appearing in the diagonal of x by diferent symbols x′ = 0, a′1, b′2,c′3, again continued for the whole infinite set. Two non‐terminating decimal fractions can only equal one another if all numerals agree. This is not the case for x1 and x′1, and neither is it the case for any xi in the infinite set. Hence the decimal fraction x difers from all numbers (x1, x2, x3,…) of the system which we had assumed to be denumerable, proving the non‐denumerability of the continuous set C1.

Klein also explains that the x′ obtained is a true decimal fraction, which excludes 0.999…= 1 or any decimal with finite number of places. He emphasises that x must difer from all decimal sequences xi contained in the original scheme. I do not want to go into further detail of his exposition, except to mention that he also generalised the sequence of numerals a, b, c,… by complexes of numerals that are surrounded by zeros. Following a suggestion by J. König4 5 he called those complexes “molecules”, perhaps showing a premonition of applications to macromolecular sequences as discussed in Chapters 4 and 5.


                  Figure A3.6 Maxwell's demon
                  Cantor's “diagonal method” for disproving the denumerability of all real numbers. C denotes the entirety of all real numbers in the form of decimal fractions in the interval 0 〈x 〈 1; a, b, c being the numerals 0, 1, …, 9 occuring in all possible combinations and sequences. The diagonal is highlighted in red. It is shown in the text that the numerals in all non‐terminating decimal fractions cannot be made to agree, and therefore the system is proven to be non‐denumerable.

Figure A3.6 Maxwell's demon

Cantor's “diagonal method” for disproving the denumerability of all real numbers. C denotes the entirety of all real numbers in the form of decimal fractions in the interval 0 〈x 〈 1; a, b, c being the numerals 0, 1, …, 9 occuring in all possible combinations and sequences. The diagonal is highlighted in red. It is shown in the text that the numerals in all non‐terminating decimal fractions cannot be made to agree, and therefore the system is proven to be non‐denumerable.

(p.666) As I mentioned earlier, Cantor's proof sounds simple but is not simple at all. Yes, the method can be called “diagonal”, but the proof is that of the “non‐existence of that diagonal”. Tis becomes apparent if we try to construct a regular scheme by starting with decimal fractions of finite length. Of course, we are then doing something that is not allowed for irrationals, which by definition have an unlimited number of places, producing a table containing an infinite number of decimal sequences. I use this scheme only in order to demonstrate that with every additional decimal the number of possible sequences increases by a factor of 10, ending up with a factor of 10n for n additional places. In the case of irrationals, n ultimately becomes infinite. The proof then shows that such a total set has no diagonal in the sense of that in quadratic matrices. In other words, there is no way to draw a line, as for the case of a quadratic matrix. This can easily be shown for binary sequences, which at each step double the number of equations. Since the number of steps is infinitely large, this exponential complexity causes the difculty of demonstrating non‐denumerable cases.

In my introduction to this appendix I said that it would not provide any new results to the mathematician. That is not quite true. If he is going to apply his thinking to biology, he should be aware of what we consider to be the true mathematical problems in biology.

Notes:

(1.) Courant, R. and Robbins, H. (1996). “What is Mathematics? An Elementary Approach to Ideas and Methods”. Revised by Ian Stewart. Oxford University Press, New York.

(2.) Gradshteyn, I. S. and Ryzhik, I. M. (1999). “Tables of Integrals, Series and Products”. Jefrey, A. (ed.). Academic Press, London.

(*) I think that this section should be included in view of the problems treated in this book, although it is a short summary of something treated in detail in the literature.3,4

(3.) Cantor, G. (1874). “Über eine Eigenschaf des Inbegrifes aller reellen algebraischen Zahlen” (“On a Characteristic Property of All Real Algebraic Numbers”). Crelles J. f. Mathematik 77: 258–262. Cantor's main work, in particular the papers on set theory between 1870 and 1900, are discussed in most major textbooks of mathematics.

(4.) Klein, F. (1924). “Elementarmathematik I” in Band XIV der “Grundlehren der Mathematischen Wissenschafen”. pp. 271–288.

(5.) König, J. Quoted in ref. 4, p. 279.