Ever wonder how a character recognition application recognizes an image and converts it into characters or editable text? This first section discusses the basic theory behind it.
Imagine you are reading a long article in a magazine and want to e-mail its contents to a colleague. To make the article easier to work with, you decide to type it out so it can be sent as text. You start typing, and three hours later the article is finally done.
OCR (Optical Character Recognition) technology solves this problem. Scanning the article produces an image, but the text inside that image cannot be edited directly. To convert it into editable text, you need OCR software, which processes the image and produces text as output.
OCR software is actually built on Artificial Neural Network (ANN) technology, a branch of artificial intelligence. ANN implementations are common in intelligent computing systems, for example prediction (stock prices, weather, disease diagnosis), classification (recognizing fingerprints, faces, or particular shapes), and filtering (removing noise from telephone signals). What makes an ANN application interesting is its ability to learn: the more data it is given, the more intelligent (more accurate at recognition) it becomes.
In this article, we will learn the basic principles of using an ANN to perform character recognition as in the case above.
ANN Basic Concepts
The way an ANN works is based on how the human brain processes information. The human brain contains roughly 100 billion neurons, or nerve cells, connected to one another. Each neuron has parts called dendrites, a soma (cell body), and an axon.
Dendrites, shaped like tree branches, receive input signals, which accumulate in the soma. If the accumulated input signal is strong enough, the cell is activated and carries an output signal along the axon. The axon then passes the signal on to the dendrites of neighboring neurons across narrow gaps called synapses.
An ANN is a mathematical model of the neurons in the human brain. Its components are:
• Input: a single attribute or variable, encoded numerically so the computer can process it. Input data may be numeric, qualitative, true/false, and so on. Because neural computing can only process numbers, problems involving qualitative attributes or pictures must first be converted into numerical data the computer can read. In image processing, the input consists of the bits of the image to be processed.
• Output: a solution to the problem, such as a recommendation, a category, or an estimate. In an OCR application, the desired output is the letter corresponding to the input image, so if there are 26 possible outputs (A-Z), there can be as many as 26 output units.
• Weight: the relative strength of an input in determining the value of the output. An ANN system is said to be "learning" when its weights are adjusted so that the system's final results approach (or equal) the desired target.
• Summing Junction: computes the weighted sum of all input elements for each processing element. It multiplies each input value (X) by its weight (W) and sums the results to produce the weighted total Y. The formula for the output of the Summing Junction is:

Y = Σ (x_i × w_i) + b, for i = 1 to n

where n is the number of inputs, x_i and w_i are input i and its weight, and b is the bias. The result is then normalized with an activation function. An activation function determines whether a neuron produces output or not. Its uses are:
• To normalize data (example: 1 = true and 0 = false)
• To detect when a threshold is reached (example: if Y < 0.5 then set Y = 0, and if Y >= 0.5 then set Y = 1).
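The Summing Junction and threshold activation above can be sketched in a few lines of code. This is a minimal illustration; the input values, weights, and bias below are assumed for the example, not taken from the article.

```python
def summing_junction(inputs, weights, bias):
    """Weighted sum Y = sum(x_i * w_i) + b."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

def threshold_activation(y, threshold=0.5):
    """Set Y = 0 below the threshold, Y = 1 at or above it."""
    return 1 if y >= threshold else 0

x = [1, 0, 1]        # example inputs (e.g. bits of an image)
w = [0.4, 0.3, 0.2]  # example weights
b = 0.0              # bias

y = summing_junction(x, w, b)    # 0.4*1 + 0.3*0 + 0.2*1 = 0.6
print(threshold_activation(y))   # prints 1, since 0.6 >= 0.5
```

Note how the neuron "fires" (outputs 1) only when the weighted sum of its inputs reaches the threshold, mirroring the biological neuron described earlier.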
How Does an ANN Learn?
One of the most important concepts in ANNs is the learning process. Its main purpose is to adjust the weights of the ANN so that the final weights fit the patterns in the training data. The learning process usually involves three activities:
• Calculate output
• Compare the output with the desired target
• Adjust the weights and repeat the process
The learning process begins with a set of weights, assigned either by a specific rule or at random (since they will be adjusted anyway). The difference between the actual output (Y) and the desired output (Q) is called the delta. The goal is to minimize the delta (to zero, if possible), which is done by changing the weights. A key part of the learning process is changing the weights in the right direction, so that the delta keeps getting smaller. Once the delta for a given set of weights becomes very small (or zero), similar inputs are expected to produce results close to the true target.
For example, suppose the binary input 0101 0111 is trained to produce the output 1 (the target). After the weight adjustment process, we may end up with a combination of weights such that the output is 0.97 (in other words, a delta of 0.03). With such weights, any input close to 0101 0111, e.g. 0001 0011, should still produce an output close to 1. This is the principle behind character recognition applications.
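The three learning activities above can be sketched for this example with a single sigmoid neuron and a simple delta-rule weight update. This is a minimal sketch of the idea, not the backpropagation method the article uses later; the learning rate and iteration count are assumptions chosen for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

target = 1.0
x = [0, 1, 0, 1, 0, 1, 1, 1]   # the binary input 0101 0111
w = [0.0] * len(x)             # initial weights (arbitrary)
b = 0.0                        # bias
lr = 0.5                       # learning rate (assumed)

for _ in range(1000):
    # 1. Calculate the output
    y = sigmoid(sum(xi * wi for xi, wi in zip(x, w)) + b)
    # 2. Compare it with the desired target
    delta = target - y
    # 3. Adjust the weights in the direction that shrinks the delta
    for i in range(len(w)):
        w[i] += lr * delta * x[i]
    b += lr * delta

print(round(y, 2))  # close to 1 after training
```

After training, inputs that share most bits with 0101 0111 will also yield outputs near 1, which is exactly the generalization behavior the article describes.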
For our character recognition application, we will use the backpropagation method. Although it is not the best method available, it is powerful enough to be applied to character recognition.
To use the backpropagation method, an ANN architecture called a multilayer network is required: a network consisting of an input layer, a hidden layer, and an output layer. The difference from a single-layer network is the presence of the hidden layer between the input layer and the output layer.
In the example above there are 3 inputs, 5 units in the hidden layer, and 2 outputs in the output layer. Each connection from an input to a unit in the hidden layer has its own weight (which will be adjusted according to certain rules). If we call the weights connecting the inputs to the hidden layer units V, then v_ij is the weight connecting input i to hidden unit j. The weighted input of hidden unit j is then formulated as:

z_in_j = Σ (x_i × v_ij), for i = 1 to n

where x_i is input i and z_in_j is the weighted input to hidden unit j, which will then be passed through an activation function. As mentioned earlier, the activation function normalizes the data: the weighted sum of x_i and v_ij can range from -∞ to +∞, so the value must be normalized. Several activation functions can be used, but in this application we will use the sigmoid activation function, defined by:

f(x) = 1 / (1 + e^(-x))
Substituting z_in_j into the activation function gives the output of hidden unit j:

z_j = f(z_in_j) = 1 / (1 + e^(-z_in_j))
The final output in the output layer is formulated in the same way:

y_in_k = Σ (z_j × w_jk), for j = 1 to p, and Y_k = f(y_in_k)

where w_jk is the weight linking hidden unit j to output unit k, and Y_k is the final output after normalization by the activation function.
Note that the data from the input layer is summed, passed through an activation function, and carried forward as the hidden layer's output; the hidden layer's output is then processed in the same way to produce the final output at the output layer. This step is called the forward pass... (to be continued)
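The forward pass through the 3-5-2 network described above can be sketched as follows. The random initial weights are placeholders; in a real application, backpropagation would adjust them during training.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(0)
n_in, n_hidden, n_out = 3, 5, 2   # the 3-5-2 example network

# v[i][j]: weight from input i to hidden unit j
v = [[random.uniform(-1, 1) for _ in range(n_hidden)]
     for _ in range(n_in)]
# w[j][k]: weight from hidden unit j to output unit k
w = [[random.uniform(-1, 1) for _ in range(n_out)]
     for _ in range(n_hidden)]

def forward(x):
    # Hidden layer: z_in_j = sum_i x_i * v_ij, then z_j = f(z_in_j)
    z = [sigmoid(sum(x[i] * v[i][j] for i in range(n_in)))
         for j in range(n_hidden)]
    # Output layer: y_in_k = sum_j z_j * w_jk, then Y_k = f(y_in_k)
    return [sigmoid(sum(z[j] * w[j][k] for j in range(n_hidden)))
            for k in range(n_out)]

outputs = forward([1.0, 0.0, 1.0])
print(outputs)  # two values, each between 0 and 1
```

Because the sigmoid squashes every weighted sum into (0, 1), each layer's output is already normalized before being carried forward to the next layer, exactly as the formulas above require.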