#006 PyTorch: Solving the Famous XOR Problem Using Linear Classifiers with PyTorch

The neural network architecture used to solve the XOR problem is shown below. In the figure, we can see that above the candidate linear separating line the red triangle overlaps with the pink dot, so linear separation of the XOR data points is not possible. This is where multiple neurons, arranged as a Multi-Layer Perceptron with a hidden layer, come in: the hidden layer learns an intermediate representation of the inputs (through weight updates) that makes the data points linearly separable. So now let us understand how to solve the XOR problem with neural networks.

The XOR operator outputs 1 only when the two inputs are exclusive, i.e., have different values, and outputs 0 when the two inputs are not exclusive, i.e., have the same values. For better understanding, let us assume that all our input values live in the 2-dimensional domain, and let us represent the points on a graph where the red ones signify Class 0 and the green points are for Class 1. If we were to use the decision boundary line for the OR operator here, it would classify only 3 out of the 4 points correctly, misclassifying (1, 1). Similarly, the decision boundary line for the NAND operator would also classify only 3 out of the 4 points correctly, misclassifying (0, 0).
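For reference, the full XOR truth table:

  • \(x_1 = 0, x_2 = 0\) → output 0
  • \(x_1 = 0, x_2 = 1\) → output 1
  • \(x_1 = 1, x_2 = 0\) → output 1
  • \(x_1 = 1, x_2 = 1\) → output 0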

An artificial neural network is made of layers, and a layer is made of many perceptrons (aka neurons). The perceptron is the basic computational unit of the neural network: it multiplies each input with a weight, adds a bias, and passes the result through an activation function to deliver the output to the next layer. Neural networks have revolutionized artificial intelligence and machine learning. These powerful algorithms can solve complex problems by mimicking the human brain’s ability to learn and make decisions. However, certain problems pose a challenge to neural networks, and one such problem is the XOR problem.
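As a minimal sketch of that computation (the sigmoid activation and the example weights here are assumptions for illustration, not values from the article):

```python
import torch

# A single perceptron: weighted sum of inputs plus a bias,
# passed through an activation function (sigmoid, as an assumption).
def perceptron(x, w, b):
    z = torch.dot(w, x) + b      # multiply inputs by weights, add bias
    return torch.sigmoid(z)      # activation delivers the output

x = torch.tensor([1.0, 0.0])     # two binary inputs
w = torch.tensor([0.5, -0.5])    # example weights (hypothetical)
b = torch.tensor(0.1)            # example bias (hypothetical)
print(perceptron(x, w, b))
```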

Like the biological kind, an artificial neural network has inputs, a processing area that transmits information, and outputs. However, these are much simpler, in both design and function, and nowhere near as powerful as the real kind. The XOR, or “exclusive or”, problem is a classic problem in ANN research: using a neural network to predict the outputs of the XOR logic gate given two binary inputs. An XOR function should return a true value if the two inputs are not equal and a false value if they are equal. All possible inputs and predicted outputs are shown in figure 1.

Representation Space Evolution

  • We will choose one extra hidden layer apart from the input and output layers.
  • Before we dive deeper into the XOR problem, let’s briefly understand how neural networks work.
  • A multi-layer neural network, also known as a feedforward neural network or multi-layer perceptron, is able to solve the XOR problem.
  • Each connection (similar to a synapse) between artificial neurons can transmit a signal from one to the other.
  • For example, in the case of a simple classifier, an output of say -2.5 or 8 doesn’t make much sense with regard to classification.

Backpropagation is a way to update the weights and biases of a model starting from the output layer all the way to the beginning. The main principle behind it is that each parameter changes in proportion to how much it affects the network’s output. A weight that has barely any effect on the output of the model will show a very small change, while one that has a large negative impact will change drastically to improve the model’s prediction power.
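In PyTorch, this update rule is handled by autograd. Below is a minimal sketch of one training step; the placeholder model, the MSE loss, and the learning rate are assumptions for illustration, not necessarily what the article uses later:

```python
import torch
import torch.nn as nn

model = nn.Linear(2, 1)                                   # placeholder model
criterion = nn.MSELoss()                                  # assumed loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)   # assumed optimizer

x = torch.tensor([[0.0, 1.0]])
y = torch.tensor([[1.0]])

optimizer.zero_grad()            # clear gradients from the previous step
loss = criterion(model(x), y)    # forward pass and loss
loss.backward()                  # backpropagation: gradient for every parameter
optimizer.step()                 # each parameter moves in proportion to its gradient
```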

XOR – Introduction to Neural Networks, Part 1

  • The input for the logic gate consists of two values (T, F).
  • This facilitates the task of understanding neural network training.
  • Let’s see what happens when we use such learning algorithms.
  • The XOR function, as an example, returns true if one of the two binary inputs is true and the other is false, and returns false if both inputs are true or both are false.

A single-layer perceptron can only learn linearly separable patterns, where a single straight line or hyperplane can separate the data points. The XOR inputs, however, require a non-linear decision boundary to be classified accurately. This means that a single-layer perceptron fails to solve the XOR problem, emphasizing the need for more complex neural networks.
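To see this failure concretely, here is a quick sketch (the loss, learning rate, and iteration count are arbitrary assumptions): a single linear neuron trained on the four XOR points never fits all of them.

```python
import torch
import torch.nn as nn

X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

# Single-layer perceptron: one linear unit with a sigmoid output.
model = nn.Sequential(nn.Linear(2, 1), nn.Sigmoid())
criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)

for _ in range(5000):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()

# The rounded predictions can never match [0, 1, 1, 0]:
# the decision boundary of this model is always a straight line.
print(model(X).round().T)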

Weights and Biases

The designing process will remain the same, with one change: we will choose one extra hidden layer apart from the input and output layers, placed in between the two. For that, we also need to define the activation and loss functions for the layers and update the parameters using the gradient descent optimization algorithm. The XOR problem can thus be solved with a Multi-Layer Perceptron: a neural network architecture with an input layer, a hidden layer, and an output layer. During training, forward propagation passes the inputs through these layers to produce a prediction, and backpropagation updates the weights of the corresponding layers until the network executes the XOR logic.
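A minimal sketch of such an architecture in PyTorch (the hidden-layer width of 2 and the sigmoid activations are assumptions; any small hidden layer with a non-linearity works):

```python
import torch.nn as nn

# Input layer (2 features) -> hidden layer (2 neurons) -> output layer (1 neuron).
# The non-linear activation between the linear layers is what lets the
# network represent the XOR decision boundary.
xor_model = nn.Sequential(
    nn.Linear(2, 2),   # input -> hidden
    nn.Sigmoid(),
    nn.Linear(2, 1),   # hidden -> output
    nn.Sigmoid(),
)
```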

XOR’s Drawbacks for Single-Layer Perceptrons:

In the weight notation \(w_1\) and \(w_2\), the subscript “1” means “this weight is going to multiply the first input” and “2” means “this weight is going to multiply the second input”. In the case of the OR operator, we have four possible inputs for the two elements of our input vector \(x\), namely \(x_1\) and \(x_2\) respectively. Let’s see how the OR operation outputs the result in the table below.
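The referenced table, filled in with the standard OR truth values:

  • \(x_1 = 0, x_2 = 0\) → output 0
  • \(x_1 = 0, x_2 = 1\) → output 1
  • \(x_1 = 1, x_2 = 0\) → output 1
  • \(x_1 = 1, x_2 = 1\) → output 1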

To be able to use it as a PyTorch model, we will subclass torch.nn.Module. Then, we will define the __init__() method, passing the parameter self. Now we can call the function create_dataset() and plot our data. In the 1950s and 1960s, linear classification was widely used and, in fact, linear classifiers such as the Perceptron showed significantly good results on simple image classification problems.
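Putting those pieces together, here is a sketch of what the class and the helper might look like. The layer sizes and the body of create_dataset() are assumptions made for illustration; the article’s own code may differ:

```python
import torch
import torch.nn as nn

class XORModel(nn.Module):                 # subclass torch.nn.Module
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(2, 2)      # input -> hidden
        self.layer2 = nn.Linear(2, 1)      # hidden -> output

    def forward(self, x):
        x = torch.sigmoid(self.layer1(x))
        return torch.sigmoid(self.layer2(x))

def create_dataset(n=100):
    # Hypothetical helper: noisy samples around the four XOR corners.
    corners = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    idx = torch.randint(0, 4, (n,))
    X = corners[idx] + 0.1 * torch.randn(n, 2)
    y = (corners[idx, 0] != corners[idx, 1]).float().unsqueeze(1)
    return X, y
```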

The perceptron basically works as a threshold function: non-negative outputs are put into one class, while negative ones are put into the other class. An L-layer XOR neural network can be built using only Python and NumPy and learn to predict the outputs of the XOR logic gate. A complete introduction to deep learning covers various architectures, with code samples for building them in Keras, alongside implementations of the logical functions AND, OR, and XOR.
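In code, that thresholding step is just a sign check (a sketch written for this article, not taken from any particular repo):

```python
import torch

def threshold(z):
    # Non-negative pre-activations -> class 1, negative -> class 0.
    return (z >= 0).float()

print(threshold(torch.tensor([-2.5, 0.0, 8.0])))  # tensor([0., 1., 1.])
```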

The Significance of the XOR Problem in Neural Networks

It sounds like we are making real improvements here, but a linear function of a linear function makes the whole thing still linear. Another way to think about it is to imagine the network trying to separate the points: the points labeled 1 must stay together on one side of the line, while the points labeled 0 stay on the other side. This example may look too simple to us because we already know how to tackle it, but in reality it stunned very good mathematicians and AI theorists some time ago.
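Concretely, for two stacked linear layers with weights \(W_1, W_2\) and biases \(b_1, b_2\) (symbols introduced here for illustration):

\[
W_2 (W_1 x + b_1) + b_2 = (W_2 W_1)\,x + (W_2 b_1 + b_2),
\]

which is just another linear function of \(x\). Without a non-linear activation in between, the extra layer adds no representational power.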

Now, remember, because we are using PyTorch we need to convert our data to tensors. We will create a variable X and apply the function torch.hstack() to stack the x1_torch and x2_torch tensors horizontally. By combining the two decision boundaries of the OR operator and the NAND operator with an AND operation, we can obtain the exact XOR classifier. Notice the left-hand-side image, which is based on Cartesian coordinates: there is no intuitive way to distinguish or separate the green and the blue points from each other such that they can be classified into their respective classes.
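A sketch of that conversion (the column shapes of x1_torch and x2_torch are assumptions about how the data was built earlier):

```python
import torch

# Assumed shapes: one input coordinate per sample, as column tensors.
x1_torch = torch.tensor([0., 0., 1., 1.]).reshape(-1, 1)
x2_torch = torch.tensor([0., 1., 0., 1.]).reshape(-1, 1)

X = torch.hstack((x1_torch, x2_torch))  # stack columns -> shape (4, 2)
print(X)
```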
