# AI — Machine Learning — Learn It Visually

I created this tutorial as an **entry-level** piece on **Artificial Intelligence**.

Any new subject must be presented in language matching the learner’s level of skill at that time. So don’t expect crazy math formulas just yet.

In particular we’ll take a look at **Machine Learning** and its popular subset, **Deep Learning**.

The **depth** of a Neural Network is determined by the **number of its layers**.

**Machine Learning** algorithms **weigh the likelihood** that a particular data set matches a specific pattern.

# Thinking In Ranges

Neurons in your brain are definitely not digital, but they resemble binary logic with either an *on* or *off* state. In software, however, we use a **range of values** instead.

The result of a calculation cycle in an AI operation is a precision estimate in the range **0.0–1.0**. Ultimately, an *output* value is produced based on how well the input data matches a specific **pattern**, with **1.0** being a **100%** match. (You rarely reach that, but *0.95–0.97* is good.)

This **pattern** is usually trained before meaningful results can be produced. More on this a bit later in this tutorial. But first, here’s ML at its most basic.

It all begins with neural networks: **a software imitation** of the physical structure of the neurons in a brain.

# Simple Neural Network Structure

In this minimalist example, **1 input layer** consisting of *3 input nodes* is shown. A **multiple set of inputs per layer** is usually provided. Each input is gathered from some type of a source, like an *array of pixels from an image* used for *face recognition*, or any other data. It depends on what you’re trying to accomplish with your **AI** algorithm.

Both **input** and *output* values are floating-point numbers between **0.0** and **1.0**.

During network operation, data is fed from left to right. However, **back-propagation** is sometimes used to optimize the Neural Network; that’s when we travel the network in reverse. For now we don’t need to concern ourselves with that.

# Sum

The sum of several input nodes is just what it sounds like: the total of each previous-layer node’s value multiplied by the weight of its connection. After calculating the sum, it is passed into the **Activation Function** for processing.
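To make this concrete, here is a minimal sketch in JavaScript (the function name `weightedSum` and the sample numbers are my own illustration, not from any particular library): each previous-layer value is multiplied by its connection weight, and the products are totaled before being handed to the activation function.

```javascript
// Weighted sum of one node's incoming connections.
// inputs:  values from the previous layer (0.0–1.0)
// weights: one weight per incoming connection
function weightedSum(inputs, weights) {
  let sum = 0;
  for (let i = 0; i < inputs.length; i++) {
    sum += inputs[i] * weights[i];
  }
  return sum;
}

// Three input nodes feeding one node in the next layer:
const sum = weightedSum([0.5, 0.9, 0.1], [0.4, 0.7, -0.2]);
console.log(sum); // 0.5*0.4 + 0.9*0.7 + 0.1*(-0.2) ≈ 0.81
```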

# Activation Function

The **Activation Function** converts the sum of *input* values into an *output* value.

But how exactly does it work?

We need to take a look at another aspect of Machine Learning.

Remember those math equations from high school? **Parabolas**, anyone?

An **Activation Function** is literally just a *math equation*. So for those with a math background this might be a bit easier to grasp. If not, read on to the visual diagrams and the rest of this tutorial so it starts to sink in!

The reason **we can’t use simple linear equations** is their limitations.

They are not sufficient for creating *useful* neural networks.

Neural Networks are designed around more complex equations. For example, the **Sigmoid** (also known as **Logistic**) function is quite common. (*We’ll take a look at a few different ones in the section below.*)

They all take on the form of **f(x) = …** and then crunch the **x** value in a way unique to that function. Why this matters and why we have different activation functions will become more apparent a bit later.
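As a sketch of what **f(x) = …** looks like in code, here is the Sigmoid function written out in JavaScript (the helper name `sigmoid` is simply my own choice): it squashes any input, positive or negative, into the 0.0–1.0 range.

```javascript
// Sigmoid (logistic) activation: f(x) = 1 / (1 + e^(-x))
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

console.log(sigmoid(0));  // 0.5 (exactly halfway)
console.log(sigmoid(6));  // ≈ 0.9975 (large sums saturate toward 1.0)
console.log(sigmoid(-6)); // ≈ 0.0025 (saturates toward 0.0)
```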

What happens once we got our result?

**AF** passes the calculated value onto the next node, essentially as a partial input into one of the activation functions in a **node in the next input set**.

You can think of it as taking a set of multiple inputs. And passing the calculated value onto the next node. It’s the value gateway between input sets.

## Different Types Of Activation Functions

Just like there are different types of math equations…there are different types of activation functions.

Exactly how they crunch numbers to arrive at the final output value is tightly related to **training** an existing network first. So we can’t go that deep into the subject just yet, because overall, the system is not based on something as simple as calculating and returning a numeric result.

But what we can do to deepen our understanding thus far is take a look at the **visual representation** of each mathematical equation behind different activation functions!

This is a **visual tutorial**. To give you a basic idea of what you’ll be grappling with, here is a table of the classic set of math equations many classic Activation Functions can be based on.

The most basic **AF** is represented by **f(x)=x** or the **Identity Function**.

There are several others. But they are a bit more complex.

Essentially these functions are used to determine the resulting node value.
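To show how the classic equations differ, here are a few common activation functions written out in JavaScript (my own sketch for illustration; real AI libraries ship optimized versions of these):

```javascript
// Identity: f(x) = x (the simplest possible activation)
const identity = x => x;

// Sigmoid (logistic): squashes x into the (0, 1) range
const sigmoid = x => 1 / (1 + Math.exp(-x));

// Hyperbolic tangent: squashes x into the (-1, 1) range
const tanh = x => Math.tanh(x);

// ReLU (rectified linear unit): zero for negatives, linear for positives
const relu = x => Math.max(0, x);

console.log(identity(0.7)); // 0.7
console.log(relu(-3));      // 0
console.log(relu(2));       // 2
```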

## How Exactly Does An Activation Function Determine Its Value?

Well, that’s what an AF is. It takes an input in the form of a number and produces a return value between 0.0–1.0 (*sometimes the range is +/- infinity*). The actual formulas are described above. You can re-write these equations as functions in **Python**, *JavaScript* or any other programming language.

If you are into math and have a lot of time on your hands, you will love writing out these functions in code! But often you don’t have to, because existing **A.I. libraries** take care of that for you. This way you can focus on building your Neural Network and training it for a specific purpose.

# Each Node Carries A Calculated Weight

So these Activation Functions produce a value.

The most important thing to notice at this time: each point is a **weight**.

This weight measures the likelihood a certain pattern was matched.

But **multiple layers of input sets** are possible, as shown in the next example.

Each single node communicates with every single node in the next input layer making up this cross-connected communication highway.

The number of items in each layer is arbitrary; it doesn’t have to be the same number as shown in the diagram above. It depends on which problem you’re trying to solve.

It will take some intuition and creativity to determine the number of input nodes you want to use in each layer. But even solving the same problem can be accomplished by different neural network structures.

Due to the non-linear nature of the calculations, this process is ambiguous.
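The cross-connected structure described above can be sketched as a tiny forward pass in JavaScript (the layer sizes and weight values here are made-up illustration values, not a trained network): every node in one layer feeds every node in the next.

```javascript
const sigmoid = x => 1 / (1 + Math.exp(-x));

// One layer: each output node takes the weighted sum of ALL inputs,
// then runs it through the activation function.
// weights[j][i] connects input node i to output node j.
function forwardLayer(inputs, weights) {
  return weights.map(nodeWeights => {
    let sum = 0;
    for (let i = 0; i < inputs.length; i++) {
      sum += inputs[i] * nodeWeights[i];
    }
    return sigmoid(sum);
  });
}

// 3 input nodes -> 2 hidden nodes -> 1 output node
const inputs = [0.2, 0.8, 0.5];
const hidden = forwardLayer(inputs, [[0.4, -0.6, 0.9], [0.1, 0.3, -0.2]]);
const output = forwardLayer(hidden, [[0.7, -0.4]]);
console.log(output); // a single value between 0.0 and 1.0
```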

# Hidden Layers

We’ve just discussed how a Neural Network can have **multiple input layers**. They can be thought of as vertical rows of nodes.

All of the *inner layers* between the first input row and the output node are often referred to as **hidden layers**. That makes sense because this is where most of the gritty AI processing work is done. Basically it’s the AI mystery box.

# Different Types of Neural Network Patterns

At times **ML** may seem a lot like crafting a network pattern to *match* patterns.

Neural networks come in different shapes and forms.

Different types of neural network structures are better suited to solving particular types of problems associated with their structure.

# OK — But How Do We Write The Code?

That was a lot of theory.

But how do we actually implement it in code?

You can use a library like **Tensorflow.js** to get started.

But a library alone won’t do much good just yet, because there is still so much to cover.

# OK — But How Does It Produce Meaningful Results?

We’ve discussed the **structure** of a neural network up to this point.

We talked about **activation functions**, *data inputs* and *hidden layers*. We also talked about the **weights** passed to and fro across the simulated connections.

In order for a non-linear Machine Learning algorithm to produce any sensible outcome, it first **needs to be trained** on a set of pre-existing data.

You always start with *choosing* **data** to train your AI algorithm.

That depends on what **problem** you’re trying to solve.

If you want to **recognize numbers in an image**, you start with images of digits.

## Recognizing Numbers From A Screenshot

The classic AI example is to teach a neural network to recognize the digits **0–9**. In the same way, you can train a machine algorithm to recognize the letters **A**–**Z**, or even parts of a human face: an **eye** or a **mouth** on a photograph also represents a particular type of shape or pattern that is common to all humans but may appear slightly different.

Remember, all we are dealing with here is **patterns**.

When the algorithm recognizes a pattern it is never a 100% match. But the closer we can get to 1.0 (100%) the more likely the shape we’re looking for represents what it was trained to recognize.

If we used a standard font, we wouldn’t even have to do any AI work. We could simply scan each digit for an exact pixel pattern. But the key point of AI is to **recognize a pattern in obscurity**.

First, we need to have some type of a medium which will be used as a piece of training data. Each digit can be represented by an image:

You can easily recognize each digit by sight. But an AI algorithm needs to be trained to recognize similar patterns because while they are similar they are still not 100% identical.

In order to achieve this, we can break down the primary pattern into smaller blocks and implement something referred to as **feature extraction**.

# Feature Extraction

To identify a digit, the algorithm implements a **feature extraction** system which breaks down common patterns into counterparts relevant for constructing the complete digit / symbol / letter / etc.

The essence of a pattern remains the same. For example **0** is mostly a circle — you can break it down into smaller patterns with an arch on each of the sides:

If we can only **train our algorithm to recognize these 4 unique patterns** and check for their presence within a localized area of an image, we can calculate the *amount of certainty* with which it can be said that it might be a *zero*.

It’s the same for other digits. Digit **1**, for example, is a single *vertical bar*, perhaps with a smaller line at a slight angle at the top.

Number **2** is **half a circle** on top, a *diagonal line* and a *horizontal line*.

Number **3** can be broken into two **semi-arch** patterns.

Number **4** can be thought of as 3 lines: *vertical*, *horizontal* and *diagonal*.

…and so on.
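The idea of combining those smaller pattern checks into one confidence score can be sketched like this (the pattern scores and the simple averaging scheme are purely illustrative assumptions, not how a real trained network combines features):

```javascript
// Hypothetical match scores (0.0–1.0) for the four arch patterns
// that make up a "0": top-left, top-right, bottom-left, bottom-right.
function zeroCertainty(patternScores) {
  const total = patternScores.reduce((sum, score) => sum + score, 0);
  return total / patternScores.length; // simple average as the certainty
}

// All four arches matched strongly: high certainty it's a zero.
console.log(zeroCertainty([0.92, 0.88, 0.95, 0.9])); // ≈ 0.91

// Only the top arches matched: could be a 2 or 3 instead.
console.log(zeroCertainty([0.9, 0.85, 0.2, 0.15])); // ≈ 0.52
```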

What if it’s a **hand-written** digit? It still has the same properties of that digit: the same *edges*, the same *loops*.

*loops*What if the digit appears on a ** speed limit sign** out on the street from an indirect angle on a photograph? Much like our own vision AI should be able to accommodate for some type of error term.

Try out this **AI JavaScript demo** that allows you to draw something on the screen and have the pre-trained algorithm tell you what you just drew.

The algorithm will try to give you the best match even if what you draw isn’t really a number. Still you can see artificial intellect at work trying to provide the closest approximation it can muster.

# What Does The Trained Set Look Like?

Here is a snippet of the training data from the algorithm. It’s just a list of weights stored in a very long array (**thousands of values**):

```javascript
// The neural network's weights (unit-unit weights, and unit biases)
// training was done in Matlab with the MNIST dataset.
// this data is for a 784-200-10 unit, with logistic non-linearity
// in the hidden and softmax in the output layer. The input is a
// [-1;1] gray level image, background == 1, 28x28 pixels linearized
// in column order (i.e. column1(:); column2(:); ...) i-th output
// being the maximum means the network thinks the input encodes
// (i-1) the weights below showed a 1.92% error rate on the test
// data set (9808/10000 digits recognized correctly).
let w12 = [[-0.00718674, 0.00941102, -0.0310175, -0.00121102, -0.00978546,
  -4.65943e-05, 0.0150367, 0.0101846, 0.0482145, 0.00291535, -0.00172736,
  0.0234746, 0.0416268, 0.0315077, -0.00252011, 0.0163985, 0.00853601,
  0.00836308, 0.00692898, 0.0215552, 0.0540464, 0.0393167, 0.0668207,
  0.0232665, 0.031598, 0.0143047, 0.0156885, -0.0269579, -0.00777022,
  0.0397823, -0.00825727, 0.0212889, -0.00755215, 0.0353843, 0.0297246,
  ... /* ... thousands more weights follow ... */
```

The complete source code wouldn’t fit into this article. But the sets are usually pretty long even for what seems to be trivial tests.

# Painting Image Input Into Neural Net

This bit of code was taken from the **recognize()** function written in JavaScript.

It was taken from the demo at **http://myselph.de**

You can check out the **entire source code** here.

```javascript
// for visualization/debugging: paint the input to the neural net.
if (document.getElementById('preprocessing').checked == true) {
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  ctx.drawImage(copyCtx.canvas, 0, 0);
  for (var y = 0; y < 28; y++) {
    for (var x = 0; x < 28; x++) {
      var block = ctx.getImageData(x * 10, y * 10, 10, 10);
      var newVal = 255 * (0.5 - nnInput[x * 28 + y] / 2);
      for (var i = 0; i < 4 * 10 * 10; i += 4) {
        block.data[i] = newVal;     // red
        block.data[i + 1] = newVal; // green
        block.data[i + 2] = newVal; // blue
        block.data[i + 3] = 255;    // alpha (fully opaque)
      }
      ctx.putImageData(block, x * 10, y * 10);
    }
  }
}
```

This partial piece of code “pastes” the image input (a freehand drawing) that was previously divided into 10 x 10 blocks storing average grayscale values for that area of the image.

It will then check it against the trained set and, after crunching the sums and average comparisons, return the likelihood of the result: how closely your HTML canvas drawing matches a particular digit.
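That preprocessing step, averaging each region of the drawing into a single grayscale value, can be sketched like this (a simplified stand-alone version of my own; the real demo reads pixels from an HTML canvas rather than a plain array):

```javascript
// Average the grayscale values of one blockSize x blockSize region
// of a square image stored as a flat array (row-major order).
function blockAverage(pixels, imageWidth, blockX, blockY, blockSize) {
  let total = 0;
  for (let y = 0; y < blockSize; y++) {
    for (let x = 0; x < blockSize; x++) {
      const px = blockX * blockSize + x;
      const py = blockY * blockSize + y;
      total += pixels[py * imageWidth + px];
    }
  }
  return total / (blockSize * blockSize);
}

// A 4x4 image reduced to 2x2 blocks of 2x2 pixels each:
const img = [
  255, 255,   0,   0,
  255, 255,   0,   0,
    0,   0, 255, 255,
    0,   0, 255, 255,
];
console.log(blockAverage(img, 4, 0, 0, 2)); // 255 (top-left block is white)
console.log(blockAverage(img, 4, 1, 0, 2)); // 0   (top-right block is black)
```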

# Final Words

Artificial Intelligence is a vast subject. There are different types of machine learning patterns and tutorials coming out each day. This tutorial should serve only as an introduction for someone who is just starting out!

# Follow Me On Twitter For Free Book Giveaways

**Follow** me on **@js_tut** where I post free JavaScript tutorials and online CSS tools, and host free book giveaways!

The **Tidal Wave** account is the one that gives away my books for free.