Summary: Hopfield nets are one of the easier neural net models to understand -- and they can be useful, too. The main ability of the Hopfield net is to undo noise and reconstruct known patterns. Python programmer Andrew Blais is your guide to learning more about Hopfield nets, and exploring his net.py application.
Hot things cool, obviously. The house gets messy, frustratingly. In much the same way, messages are distorted. Short-term strategies for reversing these things include, respectively, reheating, cleaning, and the Hopfield net. This article introduces you to the last of these three, an algorithm that can, within certain parameters, undo noise. A very simple Python implementation, net.py, will show you how its basic parts fit together, and why a Hopfield net can sometimes retrieve a pattern from its distortion. This implementation, while limited, will still enable you to perform a number of instructive and revealing experiments on the Hopfield net.
I assume that you are reading this because you have some computational problem. You have been advised that some neural net algorithm might provide a solution. Specifically, the suggestion is that you might employ a Hopfield net. I suppose further that you need a precis of the idea so you can decide whether the suggestion is feasible and warrants more exploration. The following broadly sketched application of the Hopfield net may get you started solving your problem.
Your problem begins with a base set of patterns that are encoded with -1s and +1s. They can be encoded in 0s and +1s, if necessary. These patterns could be normalized binary images of postage stamps (see Resources). The next element is a set of patterns that deviate from this base. Your quest is to create code that inputs a deviant pattern and outputs just one base pattern. Your quest then would be an algorithm that can input an encoded representation of a particular stamp, and output just one base stamp pattern. You're not searching for certainty. There is an acceptable failure rate that does not adversely affect your enterprise. For you, there is a rate at which stamps can be misrecognized that doesn't significantly interfere with your project.
If this reminds you of your problem, the following might be the beginning of the development of your solution. By the end, you should be better able to answer the basic questions. What is this Hopfield thing? How does it work? What are its limitations? What can it do for me? Do I want to spend more time investigating this?
Let's begin with five arbitrary patterns patterns that will be distorted and subsequently retrieved. Visually, they can be represented as ten-by-ten matrices of black and white squares. Figure 1 shows the first pattern, p1.
Figure 1. Visual representation of p1
Clicking on any of p2 through p5 in net.py displays the other patterns. For coding purposes, these five patterns are initially represented as Python lists. So, the first pattern, for example, is represented in Listing 1.
p1 = [ [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [-1, -1, -1, -1, -1, -1, -1, -1, 1, 1], [-1, -1, -1, -1, -1, -1, -1, -1, 1, 1], [-1, -1, -1, -1, -1, -1, -1, -1, 1, 1], [-1, -1, -1, -1, -1, -1, -1, -1, 1, 1], [-1, -1, -1, -1, -1, -1, -1, -1, 1, 1], [-1, -1, -1, -1, -1, -1, -1, -1, 1, 1], [-1, -1, -1, -1, -1, -1, -1, -1, 1, 1], [-1, -1, -1, -1, -1, -1, -1, -1, 1, 1], [-1, -1, -1, -1, -1, -1, -1, -1, 1, 1] ] |
Black and white squares correspond to -1 and +1, respectively. Such a list is subsequently converted into an array. (SeeResources for references to the Python libraries that I utilized.) For each element in such a pattern, a -1 or +1, there is a node object in an array of nodes.
A node object has three central attributes:
- A node object has a value that is an element in the pattern.
- A node also has an address, that is, the address of its position in an array.
- Each node also has a color that makes it possible to display it.
As noted, one function of a Hopfield net is to undo noise. To implement this, there needs to be a way to introduce patter into a pattern. Clicking Add Noise does precisely this. Adding noise to p1 produced Figure 2.
To introduce noise into a pattern, the Hopfield net visits each address in the array of nodes. It then picks a random number in [0,1), that is, in the interval from zero inclusive to one exclusive. If the number is below some fixed level, the net changes the node's value and color, else it leaves it undisturbed. By default, this level is set to 0.20, and so there is a twenty percent probability that any given node will change its value and color. You can use the slider to vary this probability. At zero, there is no noise, and at one hundred, the node array is simply inverted. In between, there are all the other usual suspects. Each introduces a specific degree of noise into a pattern. Because the Hopfield net is an algorithm for undoing noise, it is able to input a distorted pattern such as the one in Figure 2 and then output its original in Figure 1.
Although sometimes obscured by bad exposition, the pertinent algorithms are fairly simple to implement. In what follows, I'll walk you through an implementation of the algorithm, and then I'll explain briefly why these algorithms can undo noise.
As David Mertz and I wrote in a previous developerWorks article, An introduction to neural nets, the human brain consists of about one hundred billion neurons, and each is, on average, connected to several thousand other neurons. Neurons both send and receive varying quantities of energy. An important feature of neurons is that they don't react immediately to the reception of energy. Instead, they sum their received energies, and they send their own quantities of energy to other neurons only when this sum has reached a certain critical threshold.
When the brain is learning, it is tempting to say that it is adjusting the number and strength of these connections. There is no doubt that this is an extreme simplification of the biological facts. Simplifications can be useful, especially when they serve as a model, in this case, for implementing cybernetic neural nets. The transition from the biological to the algorithmic is achieved by translating connections into weights. (A perceptron utilizes weights in a different and perhaps more intuitive way. Before you read on here, you might want to review An introduction to neural nets.
A weight object primarily encapsulates a numerical value that represents the weight between one node and another. A weight object also possesses an address and a color. The address is its location in the weight array. The color is for visualization. Figure 3 is one possible visual representation of a weight array. Net.py (see Resources for a link) keeps track of the lowest and greatest weights, and it will display a key to the numerical values of the colors in the weight display.
Figure 3. One visual representation of a weight array
On each row of the weight array, there is a list of all the weights between one given node and all the others. The Hopfield net has two forms. One form of node has a weight to itself, and the other doesn't. Experimenting with this via net.py shows that when nodes aren't self-weighted, node arrays don't always reconstruct to themselves. Select the No Self Weight option, and try reconstructing p3 or p5. There are one hundred nodes, and so there are ten thousand often redundant weights. When nodes are self-weighted, which is the default, then there are 5050 non-redundant weights, else there are just 4950.
Figure 4. The genesis of a weight
PAT = { x: x is a RxC pattern } WA = { x: x is a (R*C)x(R*C) weight array } For all (i,j) and (a,b) in the range of R and C: SUM = 0 For p in PAT: SUM += p(i,j) * p(a,b) WA( (R*i)+j, (C*a)+b ) = SUM |
The biologically inspired concept that underlies the Hopfield net was explored in 1949 by Donald Hebb. He conjectured that if a pair of nodes are simultaneously sending their energy to one another, then the weight between them is greater than if just one sent its energy. He wrote, "When an axon of cell A is near enough to excite B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased" (see Resources for details). In terms of the Hopfield net, when a pair of nodes has the same value, in other words, -1 or +1, the weight between them is stronger. The sum of the products of the values of all possible node pairs determines the content of the weight array. When they are the same, their product is positive and the sum is increased. In the case of different values, this sum is decreased.
Where, more specifically, do weights come from? To begin with, the Hopfield net must have access to a library or set of basic patterns. Here, these are p1 through p5. A weight's genesis begins with the Hopfield net choosing a pair of coordinates within the bounds of a basic pattern matrix. It then visits the corresponding nodes in each pattern. At each stop, it augments an ongoing sum with the product of the values of the nodes (see Figure 4). When the net has visited each pattern, it sets the value of a weight object to this sum. Given a pair of nodes at (i,j) and (a,b), it sets the value of the weight object at (i*10+j,a*10+b) in the weight array.
That is how the array of weights is constructed, but how does it function in the larger Hopfield algorithm? How does it function in pattern reconstruction?
With a weight array and a distorted or noisy pattern in hand, the Hopfield net can sometimes output the original pattern. There are no guarantees, but the net is right a surprising percent of the time. It can do this either synchronously or asynchronously.
To do this asynchronously, the net traverses the distorted pattern, and at each node, N, it asks whether N's value should be set to -1 or +1.
To ascertain this, the net traverses the row in the weight array that contains all of the weights between N and the other nodes. Keep in mind that nodes may or may not be self-weighted.
At each stop in this second traversal, it forms the product of (1) the weight between N and the other node and (2) the value of the other node. As you'd expect, the net keeps an ongoing tally of these products.
Now, the net is in a position to decide. In the present implementation at least, if this sum is less than zero, the net sets the node to -1, and if it is greater than or equal to zero, the net sets the node to +1.
Figure 5. Reconstruction: No node left behind
For every node, N, in pattern P. SUM = 0 For every node, A, in P: W = weight between N and A V = value of A SUM += W * V if SUM < 0: set N's value to -1 else: set N's value to +1 |
The default update is asynchronous, because the net sets a node's value just after it decides what this value should be. It would be synchronous if the net set the values of the nodes after it had made all of its decisions. In this case, it would store its decisions, and then update the array of nodes after its last decision. In net.py (see Resources), reconstruction is, by default, asynchronous, but note the option for synchronous reconstruction.
When you experiment with net.py, and when reconstruction is successful, the Hopfield net's behavior startles. One such behavior is that even when the weight array is severely degraded, it can reconstruct patterns. My simple implementation's Degrade Weights traverses the weight array and randomly sets weights to zero. Displaying the weight array provides one view of the extent of the damage. Correct reconstruction illustrates here that a Hopfield net can be far more fault-tolerant than the brain. How does it work? The mathematical story is not short. Here instead is a plot summary.
The algorithmic details of the Hopfield net suggest why it can sometimes undo noise. As is often true in algorithm analysis, the devil is in the mathematical details. In the present case, these are difficult to picture or imagine. There are, fortunately, some closely related phenomena whose analysis sheds light on the workings of the Hopfield net.
When a marble drops into a simply curved bowl, it rolls to the lowest point. The bowl's curvature is like a rule that inputs the marble's entry point and returns the lowest point, the bottom of the bowl. A more complicated curvature would be like a function that inputs an entry point and returns one of several local lowest points.
Energy is an essential part of these simple phenomena. In both the simple and complicated cases, the entering marble has a certain degree of energy. Over time, this energy decreases. Eventually, it reaches a stable state in which it can't diminish any further. In the complicated case, there may be a lower energy level, but the marble can't reach it.
Analogously, a pattern, distorted or not, can be thought of as having a certain degree of energy. So, patterns p1 through p5 have energy levels.
Pattern | Energy | Rank |
p1 | -9600.0 | 1 |
p2 | -7104.0 | 3 |
p3 | -6848.0 | 4 |
p4 | -9472.0 | 2 |
p5 | -6408.0 | 5 |
The computation of a pattern's energy level is not complicated. For every possible pair of nodes, the Hopfield net calculates the product of their values and the weight between them. The pattern's energy level is the result of dividing the sum of these products by negative two. Net.py displays the energy level of any given pattern or node array. As you reconstruct a pattern, I expect and hope that you will see the pattern's energy level decrease.
In reconstruction, the net's decision to flip a node is based on the sum of the products of the values of the other nodes and the weights between them. When the sum is less than zero, the node is set to -1, else it is set to +1. When the product of value and weight is positive, it contributes to pushing the sum above zero. But this pushes the net in the direction of deciding to set the node's value to +1. When this product is negative, the sum is pushed to or below zero. So, the net is pushed in the direction of setting the node's value to -1. The variation in weights introduces variation into the degree to and direction in which the net is pushed in making its decision. A pattern can be so distorted that the net isn't pushed in the direction of correct decisions. Positively put, the net can be pushed to a correct decision an impressively large percent of the time.
If you reconstruct any of the five patterns, you will see that each reconstructs to itself. This is as it should be, since each pattern already occupies a local lowest energy point. There is no reconstructive process that can lower the pattern's energy level. If you succeed in reconstructing a distorted pattern, the Hopfield has reduced the pattern's energy level to the level of one of the patterns. When it fails, it has reduced the distorted pattern's energy level to a spurious local low point. In either case, the energy level cannot be reduced any further. In has, in other words, reached a stable state. Describing the Hopfield net in terms of energy has an interesting and important consequence. On this basis, it can be mathematically established that repeated application of the reconstruction algorithm will eventually result in a stable pattern. (See Resources for details.)
You should be aware of the limitations on Hopfield nets. An obvious already-cited limitation is that its patterns must be encodable in an array of either -1s and +1s, or 0s and +1s. You already know that a Hopfield can stabilize at a spurious local low. This more significant limitation is that the probability that the net will stabilize at some spurious local low increases when the number of patterns exceeds approximately 14 percent of the number of nodes in the node array. In other words, for each additional base pattern, there must be about 7 more nodes. Despite such limitations, the pattern reconstruction discussed here may serve as an illustrative guide to resolving your particular computing problem. You now have the precis of the Hopfield algorithm initially promised. Should it address your requirements, you now know the superstructure of forging your own implementations. This includes the algorithms for computing a weight array, the way to reconstruct distorted patterns, and the algorithm for computing a pattern's energy level.
- Download the Net.py source code from SourceForge.
- Andrew implemented the Hopfield net in Python using the Tkinter and Numeric libraries. Python lets you focus on algorithms by removing the need to focus on arcane and Byzantine syntax.
- Andrew Blais and David Mertz wrote an introductory article, An introduction to neural networks (developerWorks, July 2001), that focuses on the perceptron. David W. Clark's paper of the same name includes a table comparing neural networks with digital computers. See also the Wikipedia Neural network article.
- Consult Attrasoft on stamp classification and stamp recognition. For fingerprint recognition, see the design study by Nick D'Arcy. For the use of artificial intelligence in business (PDF document), see the study by Tomas Nordlander.
- Neural Networks Documentation from Wolfram Research gives background on Hopfield and Hopfield nets; clicking through to Section 9.2 at the end of the document (and then clicking the Next button) leads you to an interesting discussion of using Hopfield for character recognition. Or just see the Hopfield Net entry in An Idiot's Guide to Neural Networks.
- The biologically inspired concept that underlies the Hopfield net was explored in 1949 by Donald Hebb. See also The Hebb Legacy (Canadian Journal of Experimental Psychology). The study of Donald Hebb's The Organization of Behavior(Wiley, 1949) repays the required heavy lifting.
- You can read JJ Hopfield's 1982 paper, Neural networks and physical systems with emergent collective computational abilities (Proceedings of the National Academy of Sciences of the United States of America, 1982) online in PDF format. You can also see what interests Professor Hopfield today.
- Begin your excursion into the mathematics of the Hopfield net with these resources:
- Kevin Gurney's excellent An Introduction to Neural Networks (Routledge, 1997) focuses on building the intuitions that underlay the mathematics. See Chapter 7 for a discussion of the proof that iterated application of the reconstruction algorithm results in a stable pattern. Also see Neural Nets for an early online version.
- Neural Computing: An Introduction (Hilger, 1990), by R. Beal and T. Jackson, is also very good at developing your sense of what is going on in neural nets. In particular, see Chapter 6.
No comments:
Post a Comment