A Skeptical Look at The Torah Codes

Note: This is the text of an article that Dr. Simon published in the March 1998 Jewish Action. It is more of historical interest than anything else. For a more comprehensive discussion of the issue, you are strongly urged to read, instead or in addition, The Case Against The Codes on this website.


By Dr. Barry Simon

The Torah Codes have been much in the news lately because of the flurry of press notice that accompanied the release of Michael Drosnin's book [1] which at one point was on the top ten best sellers list simultaneously in New York, London, Paris and Rome. I was already contemplating a piece on questionable uses of science and pseudoscience in the Orthodox community when this explosion crystallized my decision to look closely at the codes and focus on them alone. I'll talk mainly about the more substantial work of the Israeli group which includes the mathematician Eliyahu Rips who is a professor at Hebrew University rather than the more shallow claims of Drosnin.

My goal is to explain in some detail the various examples of the codes and to discuss in layman's language both some of the general mathematical issues and some detailed analysis of the precise method. This analysis has led me to believe there is much reason to doubt the assertions that there are codes in the Torah of the type studied by these groups. Since some proponents have presented the codes to the general Orthodox public with greater weight than they deserve, I will use statements from one organization's website on the Internet to highlight what assertions need clarification.

The outreach phenomenon is one of the wonders of our era and my respect for outreach professionals is enormous. My complaints below about some of the statements used by one of those groups should not be interpreted as anything more than a commentary on those particular statements. I feel that certain ideas have been overstated both to potential ba'alei t'shuvah and to the Orthodox community as a whole and I think it important to set the record straight. Moreover, just as there have been reports that the codes have been an effective method for drawing some people toward Yiddishkeit, there have been reports that other people have been turned off because of what they expect to see when something is presented as science.

Torah Judaism has so much richness to offer that I hope those who use the codes in drawing people towards Torah will reconsider the wisdom of using in science's name, ideas whose scientific foundation is so shaky, especially since these ideas are so peripheral to Jewish values and practice. While I disagree with some of the tactics, I understand those using them are working l'shem shomayim.

Examples of the Codes

All the codes involve searching for so-called ELS - Equidistant Letter Sequences - words whose letters are separated by the same number of spaces. That is, one takes the entire Torah or a specific book, drops spaces between words and looks for new words in the resulting stream with, for example, every fourth letter rather than successive letters. The spacings considered can be quite large - for example, Drosnin's celebrated location of Rabin's name has a spacing of 4772 letters, that is, there are skips of 4771 "unused" letters between the letters that happen to spell out his name.
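As a rough illustration of the search just described, here is a minimal Python sketch. The function names normalize and find_els are invented for this example and are not taken from any software used by the codes researchers; the toy text stands in for the Torah.

```python
def normalize(text):
    """Drop spaces and punctuation so only the stream of letters remains."""
    return "".join(ch for ch in text if ch.isalpha())


def find_els(stream, word, max_skip):
    """Return (start, skip) pairs where `word` appears as an ELS in `stream`.

    `skip` is the distance between consecutive letters of the word.
    A negative skip means the word is read backwards through the text.
    Skip 1 is the ordinary running text, so the search starts at 2.
    """
    hits = []
    n, k = len(stream), len(word)
    for skip in range(2, max_skip + 1):
        for step in (skip, -skip):
            for start in range(n):
                end = start + (k - 1) * step
                if not 0 <= end < n:
                    continue  # the word would run off the text
                if all(stream[start + i * step] == word[i] for i in range(k)):
                    hits.append((start, step))
    return hits


# Toy example: "cat" hidden with a spacing of 3 (two unused letters per skip).
stream = normalize("cx xa xx t")
print(find_els(stream, "cat", 5))   # [(0, 3)]
```

A spacing of 3 here means two "unused" letters between hits, just as a spacing of 4772 means 4771 unused letters.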

Those writing about the subject often display ELS by writing down the section of Torah containing the ELS with lines as long as the spacing of the ELS or with a length a few letters longer or shorter than this spacing. That way, the ELS appears as a straight up and down vertical or as a neat diagonal (see Figure 1 which only shows part of the long lines used horizontally). Please don't lose sight of the fact that this method of display has no special significance although it could make you forget the large spacings sometimes involved. One is allowed to look for ELS going forwards or backwards; indeed, both ELS in Figure 1 run backwards.

Figure 1

You have to realize that the number of ELS is very, very large. The number of letters in the Chumash is 304,805 which means the number of ELS with spacings of 5000 or less, forwards or backwards, is about 3 billion! So when you search for an ELS of a relatively short word, you are far from searching for a needle in a haystack - rather you are searching for a blade of hay.
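The size of that count can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope calculation: rough_count multiplies starting positions by spacings by the two directions, and count_for_word (a name made up for this example) trims the placements that would run off the end of the text for a word of a given length.

```python
TORAH_LETTERS = 304_805   # letter count quoted in the text
MAX_SKIP = 5_000          # largest spacing considered


def rough_count(n_letters, max_skip):
    """Every start position, every spacing up to max_skip, both directions."""
    return n_letters * max_skip * 2


def count_for_word(n_letters, max_skip, word_len):
    """Same count, keeping only placements where a word_len-letter word fits."""
    total = 0
    for skip in range(1, max_skip + 1):
        span = (word_len - 1) * skip          # letters covered from first to last hit
        total += max(0, n_letters - span)     # start positions that leave room
    return 2 * total                          # forwards and backwards


print(rough_count(TORAH_LETTERS, MAX_SKIP))        # 3048050000
print(count_for_word(TORAH_LETTERS, MAX_SKIP, 5))  # 2948030000
```

Either way the number of candidate placements is on the order of three billion.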

I will discuss three kinds of examples: one whole class and two specific detailed analyses.

The simplest examples of the codes either rely on the fact that some word appears more often than you might expect or that two or more words appear near to one another. For example, Chanukah and Hashmonai (the name of the dynasty founded by the Macabees) are near one another as seen in Figure 1. Some of these examples are quite charming - for example, the names of many trees are found near the parsha where Avrohom is promised the land of Israel. I call these simple word pair examples.

Because many words will occur as ELS multiple times in Chumash, when looking for clusters of words the more responsible researchers normally restrict themselves to that ELS with the minimal skip between letters. This is something that Drosnin sometimes failed to do. It also explains why Chanukah in Figure 1 appears with the hay in front. The occurrence of the word without the hay is an ELS but not one with a minimal spacing.

Much attention has been paid to the second example - a sophisticated statistical analysis done on a list of rabbis correlated to their dates of birth or death. This is an involved attempt to try to show that the codes can't be explained by random chance. The analysis was published by D. Witztum, E. Rips, and Y. Rosenberg in the journal Statistical Science [2]. I call it the Famous Rabbis example.

Thirdly, there is a preprint given me by Professor Rips [3] that finds a correlation between the names of the seventy nations (goyim) in Parshat Noach and the locations of specific phrases containing these nations. That is, they look for correlations between, for example, the nation of Magog in the original text and four phrases: "the people of Magog", "the land of Magog", "the language of Magog" and "the script of Magog". The name of the nation is the actual text in Noach. The phrases are searched for as ELS with spacing two or more. The statistical method of [2] is used to measure randomness. Of course, Hebrew phrases (e.g. Am Magog for "people of Magog") are searched for. Following Witztum et al., I call it the Nations example.

To cut to the chase, I regard the simple word pair examples as an uncontrolled parlor game that I cannot take seriously for reasons I'll discuss below. I'll explain why I find the Nations example totally unconvincing. I find many reasons to be skeptical about the Famous Rabbis example and it is far from compelling.

Overstated Claims

The codes with special emphasis on the Famous Rabbis paper have been used by various organizations. In this connection, there have been certain assertions that depend on the general public's awe of science and that show a lack of understanding of the process of scientific research. To be specific (the quotes here are from the web site of one outreach organization as of 9/18/97), here are some of the claims made:

Publication of a paper in a scientific journal is a guarantee of its validity ("Professional scientific and mathematical journals consult with a cadre of world class experts for the purpose of insuring that an article containing a mathematical or scientific flaw is not published in their journal.").
That the codes have been scientifically proven to occur ("This can be scientifically demonstrated.").
That the scientific community generally supports the notion that Witztum et al have proven that the codes phenomenon is real ("Since its publication over two and half years ago, world class statisticians and Bible scholars have reproduced and verified these results." There is also a partial positive quote - "the Codes phenomenon is real" - from the former chairman of the Math Dept. at Harvard.).

Let's examine these, one at a time. First, consider the notion that publication in a scientific journal is a sort of Kashrut certificate attesting to the validity of a result. I have been an editor of the most distinguished journal in my specialty for roughly twenty years and I have pride in the high standards of my section but I wouldn't eat in a restaurant whose standards of kashrut were only as high as that of my journal. Referees (and editors!) make mistakes and usually place their own research and other concerns as a higher priority than refereeing. One distinguished mathematician I know says: "My job as a referee is not to check the author - that's his/her responsibility. I'm supposed to check that the result is believable and important enough to warrant publication."

This is true not only in general but also in this particular case. Robert Kass of Carnegie Mellon, the accepting editor for Statistical Science, is quoted in the New York Times (I confirmed this via email) as saying about papers that Statistical Science accepts: "We hope that the material in them is correct but we also try to publish pieces that are amusing to a wide variety of statisticians."

As for the claim that the Witztum et al paper is a scientific proof that the Torah has hidden codes, it isn't science. Normally, a scientific assertion can, at least in principle, be disproved. But it isn't clear what the codes proponents would consider a disproof. Certainly if I looked for say famous rebbetzins in the text and couldn't find them, proponents would claim that not everything was there. How many examples of codes not found would it take to invalidate the hypothesis? What kinds of examples?

I explicitly asked Professor Rips this question and he admitted it was an interesting question to which he didn't have an answer. If it isn't possible to disprove, then the hypothesis is not a scientific hypothesis. This is not to say that statistical analysis can't be a valid way to analyze what might be going on, but without the possibility of disproving a hypothesis, that hypothesis is outside the realm of science as we understand it.

Some presentations have made much of a letter of approbation signed by four distinguished mathematicians (Joseph Bernstein, then of Harvard, now of Tel Aviv University, Hillel Furstenberg of Hebrew University, David Kazhdan of Harvard and Ilya Piatetski-Shapiro of Tel Aviv University and Yale). Three of these are Orthodox and I'd be hard pressed to find a more distinguished group of Orthodox mathematicians.

But their letter is very carefully worded to state nothing more than that they find the Famous Rabbis experiment interesting and worthy of further study. Professor Rips himself told me that he didn't think any of them were convinced of its validity and that is certainly true of the two of them I know and consulted. Indeed, Kazhdan's response to the presentation of his position on the above website (Kazhdan was the chairman of the Harvard Mathematics Department at the time the page was prepared) was that "I am sorry to see my position in such a distorted form."

I have discussed the codes with a large number of the very best Orthodox mathematicians: the spectrum of responses varies. Shlomo Sternberg, an Orthodox mathematician at Harvard and a rav, has written an extremely negative view of the codes in Bible Review.[4] I've focused on the response of Orthodox mathematicians, not because of a different view in the community at large (on the contrary, the mathematical community at large is more negative) but because there is an assumption among frum laymen that I've talked to that it is only "atheistic scientists" who could possibly doubt the codes. Suffice it to say that not only isn't it true that there is general support in the scientific community for the validity of this research - the overwhelming majority of well meaning scientists who have looked at this have the opposite view.

Probabilities Before or After the Fact

Toss a coin 30 times and write down the exact sequence of heads and tails that you get. The probability of getting that precise sequence is less than one in a billion but you got it. If someone tries a number of different trials looking for something and only reports the successes, the calculation of the a priori probabilities of those successes is meaningless.
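The coin-toss figure is easy to verify. This short sketch (the function name sequence_probability is invented here) uses exact fractions so no rounding is involved; it assumes a fair coin and independent tosses.

```python
from fractions import Fraction


def sequence_probability(n_tosses):
    """Probability of one specific heads/tails sequence from a fair coin."""
    return Fraction(1, 2) ** n_tosses


p = sequence_probability(30)
print(p)                          # 1/1073741824
print(p < Fraction(1, 10**9))     # True
```

Every one of the 2^30 possible sequences has this same tiny probability, which is why the smallness of the number says nothing once the sequence has already been observed.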

In his autobiography [5], Eugene Wigner (one of the great theoretical physicists of the century) tells about a class he had with Einstein in 1928: "He told us once: 'Life is finite. Time is infinite. The probability that I am alive today is zero. In spite of this, I am now alive. Now how is that?' None of his students had an answer. After a pause, Einstein said, 'Well, after the fact, one should not ask for probabilities.'" While neither modern science nor Jewish tradition would agree with the assertion of Einstein's example that time is infinite, his basic point remains: it is dangerous to rely on probabilities after the fact.

The fact that you can't rely on such probabilities makes all the simple word pair examples suspect - much too susceptible to the analog of someone that shoots an arrow and then draws a target around the point that it hits. Not only are there lots of potential word combinations but, because of the nature of Hebrew, there are variants. For example, in the Grace After Meals, Chanukah is spelled chet-nun-chof-he with the "ooo" sound gotten from a quboots (three dots under the line) attached to the nun. You'll notice that Figure 1 uses the variant with the "ooo" obtained from the shuruq (the vowel that looks like a vav). There's the issue of the extra hay before Chanukah in Figure 1 that I discussed above. Both our tradition and the Hebrew language are so rich that one is bound to find lots of combinations in a text as long as a book of the Chumash. There is no doubt that similar combinations can be found in any other text of equal length although, of course, you shouldn't expect to find the exact same combinations in some other text that were found in Chumash.

This means that these simple pairs are basically textual witticisms. It should be emphasized that there is no well-established tradition for the codes analysis as there is for the established principles of halachic analysis. There are a very few isolated examples of Torah personalities who used ELS-like devices, but only in specific instances and without providing us with any guidance about their general use by us.

The gemara warns us that even traditional methods of extracting halachic inference from the text can go awry - unless we have a definite mesorah telling us how to employ them. For instance, we are not to use the method of gezerah shavah (inference made from the appearance of identical words embedded in two different texts) on our own. We only employ it when we have a specific mesorah to treat the given text through this method.

The use of a method with no firm guidelines is open to misuses such as various Christian missionary groups have used to proselytize among Jews. Anyone interested in the potential misuses of the codes should check out a few Internet sites that present them, for example http://www.grantjeffrey.com, http://home.cwnet.com/crm, http://www.yfiles.com/yeshuacodes.html. These sites illustrate the Pandora's box that is opened when using uncontrolled techniques of analysis of the Torah.

More on Probability Computations

There is a way to describe the issue of searching and only reporting successes that makes it clear that you can sometimes get what appear to be extremely improbable events this way. Suppose there is some test that has a success rate of one in a thousand. You do the test many times on a computer until you have three successes. You'd guess that this would take about 3000 tries and you'd be right. (If the tests are random, after 4000 tries, about three-quarters of the time, you'll have found at least three successes.) Now you think of the result not as three individual successes but as a success for the test of finding the precise three events you found - something with a probability of one in a billion (1000x1000x1000).
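The figures in this thought experiment can be checked with the binomial distribution. The sketch below assumes independent tests with the same success rate; prob_at_least is a helper written for this illustration, and the reader can run it to see the chance of three or more successes after 3000 or 4000 tries.

```python
from math import comb


def prob_at_least(k, n_tries, p_success):
    """Chance of k or more successes in n_tries independent tests."""
    fewer = sum(
        comb(n_tries, i) * p_success**i * (1 - p_success) ** (n_tries - i)
        for i in range(k)
    )
    return 1 - fewer


for tries in (3000, 4000):
    print(tries, round(prob_at_least(3, tries, 0.001), 3))

# Treating the three hits as one pre-specified joint event gives the
# misleading "one in a billion" figure.
print(1000 ** 3)
```

The contrast is the point: the same three hits look routine when counted against the number of tries and astronomically unlikely when the tries are left out of the calculation.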

The Nations example is one that is too susceptible to being produced by this method. When looking for "language of Magog", the authors used safat for "language" while they could have used lashon. Professor Rips told me if they had, the effect would have disappeared. Similarly, they used am for nation instead of bnei. In addition, while this choice of phrases is justified in [3] by appealing to some writing of the Vilna Gaon [6], there are other Torah personalities who have written about phrases applied to the Nations (for example, the Ramban [7]) and they used phrases other than these, so picking the Gaon's set represents a significant choice.

Repeated tests on a computer needn't be associated with someone deliberately setting out to fool us. The very fact that these phenomena are called codes means that a searcher has to experiment to find what might be encoded. Modern computers allow the trial of myriad possibilities so that even a well-meaning searcher can inadvertently produce what appear to be rare occurrences when doing multiple tests.

Description of the Famous Rabbis Example

The problems with the Famous Rabbis example are more subtle and, unfortunately for a non-technical discussion like this, involve some of the details of the analysis, which I'll therefore need to partially explain.

The authors took a list of 32 moderately famous rabbis. In their initial tests, they took 34 very famous rabbis, but after refining their methods of analysis, they say that they took the moderately famous rabbis to avoid any claims of having fitted the tests to the data. Moderately famous was defined as having an entry in the book Encyclopedia of Great Men of Israel [8] of between 1.5 and 3 columns of text.

The basic idea was to take the names of these 32 great men and their dates of birth and death and see how close the dates were to the names when they searched for the names and dates as ELS in Genesis. To do this, they invented a measure of closeness (that I'll return to soon) and in this measure of closeness they compared the correct sets of names and dates to some incorrect names and dates.

They got the correct sets of names and dates by placing the names in one column and the dates in a second. The incorrect sets were obtained by leaving the names column alone but rearranging the dates column so that the dates and names are no longer lined up correctly. There are an enormous number of ways to rearrange a column of 32 numbers (over 2 followed by 35 zeros!). The authors randomly picked 999,999 of these so that counting the correct way they had a million possibilities. They ranked these possibilities using the measure of closeness they'd invented.
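The shuffling procedure described above is what statisticians call a permutation test. The sketch below shows only its general shape: rank_of_true_pairing and the toy closeness function are invented for this illustration, and summing a per-pair score is a simplification that does not reproduce the actual measure used in [2].

```python
import math
import random


def rank_of_true_pairing(names, dates, closeness, n_shuffles, seed=0):
    """Rank the correct name/date pairing among random re-pairings.

    `closeness(name, date)` is a hypothetical score where smaller means
    closer. Rank 1 means no shuffled pairing scored better than the true one.
    """
    rng = random.Random(seed)

    def total(date_column):
        return sum(closeness(n, d) for n, d in zip(names, date_column))

    true_score = total(dates)
    rank = 1
    for _ in range(n_shuffles):
        shuffled = dates[:]          # leave the names column alone
        rng.shuffle(shuffled)        # rearrange only the dates column
        if total(shuffled) < true_score:
            rank += 1
    return rank


# Number of ways to rearrange a column of 32 entries: about 2.6 x 10^35.
print(math.factorial(32))

# Toy data in which the true pairing is perfectly "close" by construction.
names = list(range(8))
dates = list(range(8))
print(rank_of_true_pairing(names, dates, lambda n, d: abs(n - d), 999))  # 1
```

Because only a random sample of rearrangements is scored, the resulting rank is an estimate of where the true pairing would fall among all of them.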

They actually used four different measures of closeness and in those measures the rank of the correct pairing among the million possibilities was between 4 and 453, that is, the overwhelming majority of rearrangements were worse in their measure of closeness. For comparison they took the start of a Hebrew translation of Tolstoy's War and Peace of the exact same length as Genesis. The rankings in that case varied from 277,103 up to 748,183, that is, more or less consistent with what you expect if there were no special correlations between names and dates. These are, on the surface, indeed impressive numbers.

There are some complications of the analysis that are very significant. First, I talked as if each of the 32 rabbis had a single name and a single date but that is not the case. Not only are there two dates (of birth and death!) but there are variant spellings of the dates. Only month and day (not year) are used but three spellings of the Hebrew dates are used: with no leading bet (the preposition for "on the_"), with a bet before the day of the month and with a bet before the month. Multiple spellings are also used for the 15th and 16th of the month because of the two formats used for those numbers. The authors did not use dates found in the Encyclopedia but ones that they determined from their own research. Not all the rabbis have both birth and death dates listed and indeed, two have no dates listed at all!

The Hebrew names used for each of the rabbis also involve more than one possibility since rabbis are often known by names of their books (e.g. the Chofetz Chaim). Here they used name variants supplied by a Professor at Bar Ilan. The number of appellations for each rabbi varied from one to eleven so the 32 rabbis had a total of over 100 appellations.

For each rabbi, they take all possible pairs of one appellation and one date for that rabbi and so get several hundred pairs. For each pair, they assign a number between 1 and 125 which is supposed to measure closeness of the pair of words in the text under study. Smaller numbers mean closer pairs. They then look at closeness overall by a method that counts up whether there are among these several hundred pairs an anomalously large number of small closeness weightings. It needs to be emphasized that the effect they find is not due to all 32 rabbis but to a small fraction of them (between 5 and 10) whose word pairs are anomalously close.
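One simple way to ask whether a collection of closeness values contains "an anomalously large number of small" ones is a binomial tail calculation. The sketch below is an illustration of that idea only: the cutoff of 25, the assumption that values would otherwise be spread evenly over 1 to 125, and the name small_value_tail are all choices made for this example and are not taken from [2].

```python
from math import comb


def small_value_tail(values, cutoff=25, top=125):
    """Chance of seeing at least this many values <= cutoff by luck alone.

    Assumes, for illustration, that without any effect each value would be
    equally likely to be any whole number from 1 to `top`.
    """
    n = len(values)
    k = sum(1 for v in values if v <= cutoff)   # how many pairs look "close"
    p = cutoff / top                            # chance one value is that small
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


print(small_value_tail([125] * 10))   # nothing small: tail is essentially 1
print(small_value_tail([1] * 10))     # everything small: tail is 0.2 ** 10
```

A statistic of this kind is driven entirely by the handful of values below the cutoff, which is why a few unusually close pairs can dominate the outcome.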

It is important to note the number of choices in this procedure since each choice can be a source of inadvertent bias in the result. Not only are choices about what to include important - so are choices about what not to include. For the analysis counts up the number of anomalously close pairs so not including some not close pairs improves the result. In addition, some dropped pairs could make one of the permuted choices better. The authors chose which date forms to use (there are other ones besides the three they use) and the choice of which appellations to use for each rabbi was not based on a criterion that would allow an independent person to confirm the choices made.

Famous Rabbis and Potential Errors in the Method

There is a quote often attributed to Mark Twain: "there are lies, damned lies, and statistics". One statistician I know who consults for various companies to analyze their data told me that whenever he starts a project for a new client he warns them that he's glad to discuss the way he's going to analyze the data in detail before he does the analysis. But once he does the analysis, he's not going to go back in response to - "why don't you try looking at it in thus and such a way". Because he guarantees that if you reanalyze it often enough, you can make the analysis show whatever you want.

The point is that even well intentioned people if they keep reanalyzing data by changing the methods can produce results that aren't statistically valid.
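The statistician's warning can be put in numbers. Suppose, purely as an assumption for this sketch, that each reanalysis is independent and has a 5% chance of producing a "significant" result when nothing real is going on; chance_of_false_hit is a name invented here.

```python
def chance_of_false_hit(alpha, n_analyses):
    """Chance that at least one of n independent analyses looks significant."""
    return 1 - (1 - alpha) ** n_analyses


for n in (1, 5, 20, 100):
    print(n, round(chance_of_false_hit(0.05, n), 3))
```

Real reanalyses of the same data are not independent, so these values are only indicative, but the trend is the point: the more ways the data are examined, the more likely some way will appear to succeed.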

In the Famous Rabbis analysis, one is struck by a number of ad hoc aspects of the methods which both bear on this and on the claim that the tests show that the word pairs are close. As I explained above, the closeness is measured by assigning a number between 1 and 125. A low number doesn't necessarily mean the words are close in any sense that you or I would think of as close. Rather, one compares the closeness of ELS associated to the pair to certain non ELS associated to the pair.

Mathematicians often talk about natural methods and natural objects. One way of defining naturalness is that if some other mathematician studied the subject deeply, he/she would find a similar object. In that sense, I find the method of assigning a distance ranking unnatural - I refer here to the complete method which is too complex to describe in this note and not just the last step that assigns the number between 1 and 125. The entire procedure is such a complex notion of closeness that I think it highly unlikely that some other mathematicians trying to find a notion of closeness would use the one in the paper [2].

This very unnaturalness makes me uncomfortable and suggests that the authors were led to their metric by experimenting with a few pieces of data - perhaps for a few famous rabbis. Given that other aspects of their analysis give undue weight to a few select anomalously "close" pairs, a little unintended bias in the method can go a long way.

I want to emphasize that for this unease to be valid doesn't imply that anyone was trying to pull a hoax. Indeed, having talked with Professor Rips, it is clear that he sincerely believes in what he has written. But the very notion that one develops a method while trying to decode something suggests that of course you're likely to find things encoded there. In situations like this where the methods are ad hoc, a small built in, even inadvertent, bias in the method could have a dramatic effect.

Famous Rabbis and Potential Errors in the Data

There are a number of aspects of the analysis that make it extremely sensitive to the precise data that is used. There are two parts to the data: the text of Genesis and the precise list of appellations and dates used for the Rabbis.

With regard to the text of Genesis, we know that there are a few slight differences between today's accepted text, and the ones used in earlier times. Kiddushin 30A tells us that we are no longer expert in the various optional vowel letters, like yud and vav. While these differences do not affect the meaning of the text, they can change the codes significantly. Adding a single letter in the skips of an ELS kills the ELS!
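The fragility is easy to demonstrate on a toy text. In this sketch (the helper reads_as_els is invented for the illustration) a single extra letter placed inside the span of an ELS destroys it, while the same letter placed before the span merely shifts it.

```python
def reads_as_els(stream, word, start, skip):
    """True if `word` is spelled at positions start, start+skip, ..."""
    positions = [start + i * skip for i in range(len(word))]
    in_range = all(0 <= j < len(stream) for j in positions)
    return in_range and all(stream[j] == c for j, c in zip(positions, word))


stream = "cxxaxxt"                      # "cat" at spacing 3
print(reads_as_els(stream, "cat", 0, 3))            # True

inside = stream[:2] + "y" + stream[2:]  # one letter added within the span
print(reads_as_els(inside, "cat", 0, 3))            # False

before = "y" + stream                   # one letter added ahead of the span
print(reads_as_els(before, "cat", 1, 3))            # True
```

With spacings in the hundreds or thousands, the span of a single ELS covers a large stretch of text, so even a few uncertain vowel letters can affect many ELS.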

One rebuttal I've heard to this imprecise text issue is that the results would have been even better with the "real" text. This is nonsense. A ranking of 4 out of a million isn't merely good - it is fantastic. If it were the original text that was this good, random perturbations of the data would have been bound to make things worse than what is observed so we are forced to the second rebuttal.

This argues that if we are looking for proof that God placed codes in the text, we can certainly imagine that He placed them not in the initial text but into the text that He knew it would evolve to by the age of computers. Because of the ad hoc nature of the analysis of [2], we have to also suppose that God wrote the Torah with this specific analysis in mind and the specific lists of appellations that are used in [2]. We are led to such a complex edifice that it wouldn't convince a skeptic.

The issues of method chosen to fit the data and imprecise text produce unease about the analysis but I've saved the strongest reason for doubt until the last. It has to do with the sensitivity of the result to the precise variants of Rabbis' names that are used. Because the measure of closeness used is so sensitive to a few anomalously small values, the result depends heavily on the inclusion of those few names that produce these small values. It depends not only on what names one chooses to include but also what names one excludes since the excluded names could give a rearranged list a better ranking of closeness.

This issue is illustrated in a devastating way by some work of Bar-Natan and McKay, two mathematicians who decided to analyze the work of the Rips group to see if it stands up. A preliminary version of their paper on these names is posted on the web [9].

These code busters have produced a list of appellations for the moderately famous rabbis that differs somewhat from the list in [2]. First of all, they drop the two rabbis with no dates which one can argue only produce noise in the rearranged data. Secondly they have dropped one rabbi and added a different one on the basis that the authors of [2] appear to have miscounted their rabbis and included one Rabbi (the one dropped by Bar-Natan and McKay) whose entry in the encyclopedia is slightly less than 1.5 columns and not included another (the one added in [9]) whose entry is the right length!

Of the list of name spellings in [2], Bar-Natan and McKay have taken over 51 appellations, changed the spelling slightly of 4, dropped 15 appellations and added 24 alternate appellations. For each new appellation they added, they have an argument based on the historic figure why it is not unreasonable to make the choice.

After they make the change, they repeat the analysis of [2] on the Hebrew translation of War and Peace. And the results are an extremely low ranking for War and Peace, that is with this list it appears that the rabbis are encoded in Tolstoy!

I've no doubt there will be loud arguments on the Internet about the validity of each change they made, but to me the point is that the fact that they can make this list of simple changes and turn the results of [2] upside down shows that the Famous Rabbis example is totally dependent on the particular choice of names used in a way that makes me doubt the validity of the enterprise.

My conclusion is that there is much to be skeptical about in looking at the codes. Not only are some of the claims that have surrounded them unfounded, the various examples are far from convincing if analyzed carefully. Doubts that I may have had about publishing my conclusions are overridden, I feel, by a gemara in Shabbos, which says, "The seal of HaKadosh Baruch Hu is Truth."

Appreciation

In the preparation of this piece I've benefited from discussions with many individuals too numerous to thank, but a few stand out. Professor Ilya Rips, even knowing of my skepticism, generously spent several hours with me. While I have questions about the conclusions of his codes research, I was left with a tremendous admiration for his personality and sweet temperament. As always, my rebbi, Rabbi Yitzhak Adlerstein, was an invaluable resource. I appreciate discussions (live or via email) with Dr. Dror Bar-Natan, Professors Sylvain Cappell, Percy Deift, Persi Diaconis, Menachem Friedman, Hillel Furstenberg, Mr. Harold Gans, Mr. Alec Gindis, Professor David Kazhdan, Dr. Brendan McKay, and Professors Shlomo Sternberg and Larry Zalcman.

References

1. M. Drosnin, The Bible Code, Simon and Schuster, New York, 1997

2. D. Witztum, E. Rips and Y. Rosenberg, Equidistant Letter Sequences in the Book of Genesis, Statistical Science 9 (1994), 429-438; this article is available on the web at http://www.fortunecity.com/tattooine/delany/11/genesis.html.

3. D. Witztum, E. Rips and Y. Rosenberg, Equidistant Letter Sequences in the Book of Genesis: II. The Relation to the Text, 1997 Preprint.

4. S. Sternberg, Snake Oil for Sale, Bible Review (August 1997), pp. 24-25.

5. Eugene Paul Wigner, Andrew Szanton, The Recollections of Eugene P. Wigner, Plenum, New York, 1992

6. Hagra, A Commentary on the Book of Job, Jerusalem.

7. Ramban, Commentary on Torah, Vayikra, 18:25.

8. M. Margalioth, Encyclopedia of Great Men of Israel, Joshua Chachik, Tel Aviv, 1961.

9. D. Bar-Natan and B. McKay, Equidistant Letter Sequences in Tolstoy's "War and Peace", weblication posted at http://cs.anu.edu.au/~bdm/dilugim/WNP

Further details as well as links to other codes commentaries can be found on Dr. McKay's home page at http://cs.anu.edu.au/~bdm/dilugim/torah.html.

As usual, there is an enormous amount of information (also, as usual, some of it of dubious quality!) available on the Web. Anyone wanting to explore can find links to many commentaries on the codes at http://www.math.gatech.edu/~jkatz/Religions/Numerics/

Note: This article on the Torah Codes was prepared for the December, 1997 Jewish Action (the magazine of the Orthodox Union, a major American Orthodox Jewish organization). It was accepted for publication in this form but then a decision was made to postpone it one quarterly issue until March 1998 to allow Prof. Rips and/or Mr. Witztum to prepare a reply. It is possible that in connection with that reply the final version will change so this article should be viewed as a draft. Please note that the article is copyright, 1997 by Barry Simon, © all rights reserved.
