Artificial neural network

From Wikipedia, the free encyclopedia

A neural network (also called an ANN or an artificial neural network) is a kind of computer software inspired by the human brain.[1] Neural networks with many layers are studied in a field of machine learning called deep learning. The brain solves problems with large clusters of biological neurons connected by axons. Similarly, a neural network is made up of interconnected units (called nodes or artificial neurons) that work together to produce a result. This is one of the possible ways artificial intelligence may work. Many neural networks can still operate if one or more of their nodes fail.

Neural networks learn to do things from examples. Traditional software, in contrast, can only do what it is specifically written to do. Large neural networks often need millions or even billions of examples to do well.

Overview

There are two ways to think of a neural network. The first is as a model of the human brain. The second is as a mathematical equation.

In the brain, a signal starts at an input, such as a sensory organ. Information then flows through layers of neurons, where each neuron is connected to many other neurons. If a particular neuron receives enough stimuli, it sends a message through its axon to the neurons it is connected to. Similarly, an artificial neural network has an input layer of data, one or more hidden layers of nodes, and an output layer. Each node in a layer is connected to one or more nodes in the next layer. When a node receives information, it sends some amount of it along to the nodes it is connected to. The amount is determined by a mathematical function called an activation function, such as sigmoid or tanh.
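The two activation functions named above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample inputs are chosen here and do not come from any particular library. The sigmoid is written in two branches so that very large negative inputs do not overflow.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the range 0 to 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use an equivalent form that avoids a huge exp().
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x: float) -> float:
    """Squash any real number into the range -1 to 1."""
    return math.tanh(x)


# A node that receives a strongly negative, neutral, or strongly positive input.
for total_input in (-4.0, 0.0, 4.0):
    print(total_input, sigmoid(total_input), tanh(total_input))
```

A strongly negative input gives an output near the bottom of each range, and a strongly positive input gives an output near the top. In this way the function decides how much of the signal a node passes on.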

Thought of as a mathematical equation, a neural network is simply a list of mathematical operations applied to an input. The input and output of each operation is a tensor (more specifically, a vector or matrix). Each pair of layers is connected by a list of weights. Each layer stores a list of values, and each individual value in a layer is called a node. Each node is connected to some or all of the nodes in the next layer, and each connection has a number called a weight. Each node also has a value called a bias. The values of the next layer are then the output of the activation function applied to the values of the current layer (called X) multiplied by the weights, plus the biases.
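The calculation for one layer can be sketched as follows. This is a minimal example, not a full implementation. It assumes every node is connected to every node in the next layer, that sigmoid is the activation function, and that each node adds its bias after the weighted sum. The network size and all weight and bias values are made up for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Compute the values of the next layer.

    weights[j][i] is the weight on the connection from input node i
    to output node j; biases[j] is the bias of output node j.
    """
    outputs = []
    for node_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(sigmoid(total))
    return outputs


# Hypothetical network: 2 input nodes -> 3 hidden nodes -> 1 output node.
hidden_weights = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [[0.7, -0.5, 0.2]]
output_biases = [0.05]

x = [1.0, 2.0]  # the input, called X in the text
hidden = layer_forward(x, hidden_weights, hidden_biases)
output = layer_forward(hidden, output_weights, output_biases)
print(hidden, output)
```

Running the whole network means repeating the same step once per layer, with the output of one layer used as the input of the next.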

A cost function (also called a loss function) is defined for the network. It measures how well the neural network is doing at its assigned task: the lower the cost, the better the network is doing. Finally, an optimization technique, such as gradient descent, is applied to minimize the output of the cost function by changing the weights and biases of the network. This process is called training. Training is done one small step at a time. After thousands of steps, the network is typically able to do its assigned task well.
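Training can be sketched with a very small example. The code below assumes a single node with a sigmoid activation, mean squared error as the cost function, and plain gradient descent as the optimization technique. These are common choices, but they are assumptions made here, not the only options. The task (learning the logical OR of two inputs), the learning rate, and the number of steps are also chosen only for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Illustrative training examples: (inputs, correct answer) for logical OR.
examples = [
    ([0.0, 0.0], 0.0),
    ([0.0, 1.0], 1.0),
    ([1.0, 0.0], 1.0),
    ([1.0, 1.0], 1.0),
]


def predict(weights, bias, inputs):
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def cost(weights, bias):
    """Mean squared error over all examples; lower is better."""
    errors = [(predict(weights, bias, x) - y) ** 2 for x, y in examples]
    return sum(errors) / len(examples)


def train(steps=5000, learning_rate=1.0):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in examples:
            out = predict(weights, bias, x)
            # Chain rule: slope of the cost with respect to the node's total input.
            delta = 2.0 * (out - y) * out * (1.0 - out) / len(examples)
            for i in range(len(weights)):
                grad_w[i] += delta * x[i]
            grad_b += delta
        # One small step downhill on the cost function.
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


initial_cost = cost([0.0, 0.0], 0.0)
weights, bias = train()
final_cost = cost(weights, bias)
print(initial_cost, final_cost)
```

Each pass through the loop is one training step: it measures how the cost changes with each weight and the bias, then moves them a little in the direction that lowers the cost.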

Learning methods

There are three ways a neural network can learn: supervised learning, unsupervised learning and reinforcement learning. These methods all work by either minimizing or maximizing a cost function, but each one is better at certain tasks.

In 2012, a research team from the University of Hertfordshire, UK used reinforcement learning to make an iCub humanoid robot learn to say simple words by babbling.[2]

References

  1. McCulloch, Warren; Pitts, Walter (1943). "A Logical Calculus of the Ideas Immanent in Nervous Activity". Bulletin of Mathematical Biophysics 5 (4): 115–133. doi:10.1007/BF02478259.
  2. "Baby robot learns first words from human teacher". New Scientist. http://www.newscientist.com/article/dn21933-baby-robot-learns-first-words-from-human-teacher.html