The question, then, is whether or not we can find an algorithm that would turn a computer into an entity that is equivalent to a human brain. A computer, after all, can run any algorithm that we might define because of its innate universality (subject only to its capacity). The human brain, on the other hand, is running a specific set of algorithms. Its methods are clever in that they allow for significant plasticity and the restructuring of its own connections based on experience, but these functions can be emulated in software.
The universality of computation (the concept that a general-purpose computer can implement any algorithm)—and the power of this idea—emerged at the same time as the first actual machines. There are four key concepts that underlie the universality and feasibility of computation and its applicability to our thinking. They are worth reviewing here, because the brain itself makes use of them. The first is the ability to communicate, remember, and compute information reliably. Around 1940, if you used the word “computer,” people assumed you were talking about an analog computer, in which numbers were represented by different levels of voltage, and specialized components could perform arithmetic functions such as addition and multiplication. A big limitation of analog computers, however, was that they were plagued by accuracy issues. Numbers could only be represented with an accuracy of about one part in a hundred, and as voltage levels representing them were processed by increasing numbers of arithmetic operators, errors would accumulate. If you wanted to perform more than a handful of computations, the results would become so inaccurate as to be meaningless.
Anyone who can remember the days of recording music with analog tape machines will recall this effect. There was noticeable degradation on the first copy, as it was a little noisier than the original. (Remember that “noise” represents random inaccuracies.) A copy of the copy was noisier still, and by the tenth generation the copy was almost entirely noise. It was assumed that the same problem would plague the emerging world of digital computers. We can understand such concerns if we consider the communication of digital information through a channel. No channel is perfect and each one will have some inherent error rate. Suppose we have a channel that has a .9 probability of correctly transmitting each bit. If I send a message that is one bit long, the probability of accurately transmitting it through that channel will be .9. Suppose I send two bits? Now the accuracy is .9² = .81. How about if I send one byte (eight bits)? I have less than an even chance (.43 to be exact) of sending it correctly. The probability of accurately sending five bytes is about 1 percent.
An obvious way to circumvent this problem is to make the channel more accurate. Suppose the channel makes only one error in a million bits. If I send a file consisting of a half million bytes (about the size of a modest program or database), the probability of correctly transmitting it is less than 2 percent, despite the very high inherent accuracy of the channel. Given that a single-bit error can completely invalidate a computer program and other forms of digital data, that is not a satisfactory situation. Regardless of the accuracy of the channel, since the likelihood of an error in a transmission grows rapidly with the size of the message, this would seem to be an intractable barrier.
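To make these figures concrete, here is a minimal sketch (in Python, chosen purely for illustration; the function name is mine) that reproduces the arithmetic above. The success probability of a whole message is just the per-bit success probability raised to the power of the number of bits, so it decays exponentially with message length.

```python
# Probability that an entire message arrives intact when each bit is
# transmitted correctly and independently with the given probability.
def prob_message_intact(p_bit_correct, num_bits):
    return p_bit_correct ** num_bits

# A channel that gets each bit right 90 percent of the time:
print(prob_message_intact(0.9, 8))             # one byte   -> ~0.43
print(prob_message_intact(0.9, 5 * 8))         # five bytes -> ~0.015

# A far better channel (one error per million bits) sending a
# half-million-byte file, which is 4 million bits:
print(prob_message_intact(1 - 1e-6, 500_000 * 8))   # -> ~0.018, under 2 percent
```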
Analog computers approached this problem through graceful degradation (meaning that users only presented problems in which they could tolerate small errors); as long as users limited themselves to that constrained set of calculations, analog computers did prove somewhat useful. Digital computers, on the other hand, require continual communication, not just from one computer to another, but within the computer itself. There is communication from its memory to and from the central processing unit. Within the central processing unit, there is communication from one register to another and back and forth to the arithmetic unit, and so forth. Even within the arithmetic unit, there is communication from one bit register to another. Communication is pervasive at every level. If we consider that error rates escalate rapidly with increased communication and that a single-bit error can destroy the integrity of a process, digital computation was doomed—or so it seemed at the time.
Remarkably, that was the common view until American mathematician Claude Shannon (1916–2001) came along and demonstrated how we can create arbitrarily accurate communication using even the most unreliable communication channels. What Shannon stated in his landmark paper “A Mathematical Theory of Communication,” published in the Bell System Technical Journal in July and October 1948, and in particular in his noisy-channel coding theorem, was that if you have available a channel with any error rate (except for exactly 50 percent per bit, which would mean that the channel was just transmitting pure noise), you are able to transmit a message with an error rate as low as you desire. In other words, the error rate of the transmission can be one bit out of n bits, where n can be as large as you like. So, for example, in the extreme, if you have a channel that correctly transmits bits of information only 51 percent of the time (that is, it transmits the correct bit just slightly more often than the wrong bit), you can nonetheless transmit messages such that only one bit out of a million is incorrect, or one bit out of a trillion or a trillion trillion.
How is this possible? The answer is through redundancy. That may seem obvious now, but it was not at the time. As a simple example, if I transmit each bit three times and take the majority vote, I will have substantially increased the reliability of the result. If that is not good enough, simply increase the redundancy until you get the reliability you need. Simply repeating information is the easiest way to achieve arbitrarily high accuracy rates from low-accuracy channels, but it is not the most efficient approach. Shannon’s paper, which established the field of information theory, presented optimal error detection and correction codes that can achieve any target accuracy through any nonrandom channel.
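To put a number on the majority-vote idea, here is a minimal sketch (Python, for illustration only; the function and its parameters are my own, not a construction from Shannon's paper) that computes the exact probability of the vote coming out wrong for an odd number of repetitions of each bit.

```python
from math import comb

def majority_error_rate(p_correct, repetitions):
    """Probability that a majority vote over an odd number of independently
    transmitted copies of a bit comes out wrong."""
    p_error = 1 - p_correct
    flips_needed = repetitions // 2 + 1      # wrong copies needed to flip the vote
    return sum(comb(repetitions, k) * p_error**k * p_correct**(repetitions - k)
               for k in range(flips_needed, repetitions + 1))

# A channel that delivers each bit correctly 90 percent of the time:
for reps in (1, 3, 5, 7, 9):
    print(reps, majority_error_rate(0.9, reps))
# per-bit error: 0.1, ~0.028, ~0.0086, ~0.0027, ~0.00089
```

Even a channel that is right only 51 percent of the time can be pushed to any target accuracy this way, but the number of repetitions per bit becomes enormous, which is one reason the more efficient codes of information theory matter in practice.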
Older readers will recall telephone modems, which transmitted information through noisy analog phone lines. These lines featured audibly obvious hisses and pops and many other forms of distortion, but nonetheless were able to transmit digital data with very high accuracy rates, thanks to Shannon’s noisy channel theorem. The same issue and the same solution exist for digital memory. Ever wonder how CDs, DVDs, and program disks continue to provide reliable results even after the disk has been dropped on the floor and scratched? Again, we can thank Shannon.
Computation consists of three elements: communication—which, as I mentioned, is pervasive both within and between computers—memory, and logic gates (which perform the arithmetic and logical functions). The accuracy of logic gates can also be made arbitrarily high by similarly using error detection and correction codes. It is due to Shannon’s theorem and theory that we can handle arbitrarily large and complex digital data and algorithms without the processes being disturbed or destroyed by errors. It is important to point out that the brain uses Shannon’s principle as well, although the evolution of the human brain clearly predates Shannon’s discovery of it! Most of the patterns or ideas (and an idea is also a pattern), as we have seen, are stored in the brain with a substantial amount of redundancy. A primary reason for the redundancy in the brain is the inherent unreliability of neural circuits.
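The brain's redundancy is biological, but the engineering flavor of error detection and correction codes is easy to show. Below is a minimal sketch of one classic example, the Hamming (7,4) code (my choice of example; the text does not single out a particular code), which protects four data bits with three parity bits and corrects any single flipped bit, at far lower cost than sending every bit three times.

```python
def hamming74_encode(d):
    """Encode 4 data bits (list of 0/1) into a 7-bit Hamming codeword.
    Codeword layout (positions 1-7): p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """Correct any single flipped bit and return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]      # parity check over positions 1,3,5,7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]      # parity check over positions 2,3,6,7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]      # parity check over positions 4,5,6,7
    error_pos = s1 + 2 * s2 + 4 * s3    # 0 means no single-bit error
    if error_pos:
        c[error_pos - 1] ^= 1           # flip the offending bit back
    return [c[2], c[4], c[5], c[6]]

# Flip any single bit of the codeword and the data still comes back intact:
data = [1, 0, 1, 1]
codeword = hamming74_encode(data)
for i in range(7):
    corrupted = list(codeword)
    corrupted[i] ^= 1
    assert hamming74_decode(corrupted) == data
print("all single-bit errors corrected")
```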
The second important idea on which the information age relies is the one I mentioned earlier: the universality of computation. In 1936 Alan Turing described his “Turing machine,” which was not an actual machine but another thought experiment. His theoretical computer consists of an infinitely long memory tape with a 1 or a 0 in each square. Input to the machine is presented on this tape, which the machine can read one square at a time. The machine also contains a table of rules—essentially a stored program—that consists of numbered states. Each rule specifies one action if the square currently being read is a 0, and a different action if the current square is a 1. Possible actions include writing a 0 or 1 on the tape, moving the tape one square to the right or left, or halting. Each state will then specify the number of the next state that the machine should be in.
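A small simulator makes this description concrete. The sketch below (Python, for illustration; the rule format, the blank symbol I use to mark the end of the input, and the example program are my own additions, not Turing's notation) stores the rule table as a mapping from (state, symbol read) to (symbol to write, head movement, next state) and runs an example program that inverts every bit of its input and then halts.

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=10_000):
    """Run a rule table of the form
       (state, symbol) -> (symbol_to_write, move, next_state)
    where move is -1 (left), +1 (right), or 0, and next_state 'halt' stops
    the machine. The tape is stored as a dict from position to symbol;
    squares that were never written hold the blank symbol."""
    tape = dict(enumerate(tape))
    pos = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = tape.get(pos, blank)
        write, move, state = rules[(state, symbol)]
        tape[pos] = write
        pos += move
    return [tape[i] for i in sorted(tape) if tape[i] != blank]

# Example rule table: invert every bit of the input, halt at the first blank.
invert = {
    ("start", 0): (1, +1, "start"),
    ("start", 1): (0, +1, "start"),
    ("start", "_"): ("_", 0, "halt"),
}
print(run_turing_machine(invert, [1, 0, 1, 1]))   # prints [0, 1, 0, 0]
```

Despite its simplicity, a rule table of this kind is, in principle, enough to carry out any computation, which is the universality referred to above.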