<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="//www.moretticb.com/feed.xml" rel="self" type="application/atom+xml" /><link href="//www.moretticb.com/" rel="alternate" type="text/html" /><updated>2026-05-18T16:01:12+00:00</updated><id>//www.moretticb.com/feed.xml</id><title type="html">Caio Benatti Moretti</title><subtitle>Bits, bytes and stuff.</subtitle><entry><title type="html">From Perceptron to LSTM - An introduction to Artificial Neural Networks and applications</title><link href="//www.moretticb.com/blog/from-perceptron-to-lstm/" rel="alternate" type="text/html" title="From Perceptron to LSTM - An introduction to Artificial Neural Networks and applications" /><published>2019-11-26T03:59:55+00:00</published><updated>2019-11-26T03:59:55+00:00</updated><id>//www.moretticb.com/blog/from-perceptron-to-lstm</id><content type="html" xml:base="//www.moretticb.com/blog/from-perceptron-to-lstm/"><![CDATA[<p>After quite a while without a post here, I decided to come back. It has been hard to find some time for my personal projects, but my current situation led me to make this post. I’m currently a visiting PhD student at MIT and I was asked to talk about neural networks with the members of <a href="http://the77lab.mit.edu" target="_blank">The 77 Lab</a>, so I came up with this (long) material.</p>

<p>As an introduction to artificial neural networks and applications, this overview traces the whole path from one of the building blocks of artificial neural networks and then goes towards LSTMs. Check the video below.</p>

<iframe allowfullscreen="allowFullScreen" width="560" height="315" src="//www.youtube.com/embed/yKGm4yLuTkU" frameborder="0"> </iframe>

<p>It is interesting to see the discussions that took place throughout this talk with researchers from different areas and perspectives on this topic. I come from the Computer Science field, while the members of the lab are engineers. It was fun.</p>

<p>By the way, if you have any questions or comments, feel free to share in the comments section. If you find this video useful, please give a thumbs up, share with more people and subscribe to the channel, so I can continue posting more material here :)</p>]]></content><author><name>Caio Benatti Moretti</name><email>caiodba@gmail.com</email></author><category term="blog" /><category term="machine-learning" /><summary type="html"><![CDATA[This overview traces the whole path from one of the building blocks of artificial neural networks and then goes higher level towards LSTMs and some application examples.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="//www.moretticb.com/%7B%22feature%22=%3E%22nntalkCapa.png%22%7D" /><media:content medium="image" url="//www.moretticb.com/%7B%22feature%22=%3E%22nntalkCapa.png%22%7D" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">MLP Topology Workbench - A playground for Multi-Layer Perceptrons</title><link href="//www.moretticb.com/blog/mlp-topology-workbench-a-playground-for-multilayer-perceptrons/" rel="alternate" type="text/html" title="MLP Topology Workbench - A playground for Multi-Layer Perceptrons" /><published>2017-05-23T03:59:55+00:00</published><updated>2017-05-23T03:59:55+00:00</updated><id>//www.moretticb.com/blog/mlp-topology-workbench-a-playground-for-multilayer-perceptrons</id><content type="html" xml:base="//www.moretticb.com/blog/mlp-topology-workbench-a-playground-for-multilayer-perceptrons/"><![CDATA[<p>When it comes to neural networks, or more specifically multi-layer perceptrons, one can find in literature some contents which may be either complicated, demanding a bit of mathematical maturity, or too simplified, lacking details which would come in handy for implementation, usage, or even understanding of the mechanism as a whole.</p>

<p>For study purposes, I developed this tool as supplementary material to be used along with the literature. It makes it possible to recreate a particular state of the network in order to inspect its behavior as a function, as well as its convergence during training, and (almost) everything else that can be studied. See an overview of this project in the video below:</p>

<iframe width="560" height="315" src="//www.youtube.com/embed/PaF6JXLaHFE" frameborder="0"> </iframe>

<p>I took the idea of a playground from <a href="http://playground.tensorflow.org" target="_blank">Tensorflow’s playground</a>, since you are free to play with multilayer perceptrons from this tool. In fact, I encourage you to make your own! It is quite a good way to explore your expertise from different perspectives. Have fun!</p>

<p>This project was developed in JavaScript, HTML and CSS and encompasses the implementation of the classic backpropagation training algorithm. It is entirely available at <a href="http://www.github.com/moretticb/MTW" target="_blank">GitHub</a>. In case you feel like adding new features and adjusting it to your needs, <a href="http://www.github.com/moretticb/MTW" target="_blank">fork me on GitHub</a> :).</p>

<p>It is important to point out that this post is a full guide to the tool. I recommend using it for <strong>reference purposes</strong>, because reading it straight through may be tedious and of little use.</p>

<p>Open the <a href="http://www.moretticb.com/MTW" target="_blank">tool</a> in a new window or use the embedded version below for testing everything described in this guide. Below you can find a full guide of this tool divided into the following sections:</p>

<ul id="markdown-toc">
  <li><a href="#topology-tab" id="markdown-toc-topology-tab">Topology tab</a></li>
  <li><a href="#command-tab" id="markdown-toc-command-tab">Command tab</a></li>
  <li><a href="#train-tab" id="markdown-toc-train-tab">Train tab</a></li>
  <li><a href="#visualize-tab" id="markdown-toc-visualize-tab">Visualize tab</a></li>
  <li><a href="#sharing" id="markdown-toc-sharing">Sharing</a></li>
  <li><a href="#closing-remarks" id="markdown-toc-closing-remarks">Closing remarks</a></li>
</ul>

<iframe width="600" height="560" src="http://www.moretticb.com/MTW/Tool/embed.html" style="max-width: 600px; width: 100%; height: 568px;" frameborder="0"></iframe>

<h2 id="topology-tab">Topology tab</h2>
<p>Everything regarding structural configuration of a model is done in this tab (you can also use the shortcut <code class="language-plaintext highlighter-rouge">Shift+1</code>). Changes are performed through the interface at the bottom area, and inspected at the top area.</p>

<p>The numbers of inputs, hidden nodes and outputs are adjustable through the <img src="/images/MTWUpArrow.png" style="width: 35px; height: 35px;" /> and <img src="/images/MTWDownArrow.png" style="width: 35px; height: 35px;" /> buttons, or by replacing the current number with a new one typed on the keyboard. Layers of neurons can be removed with the <img src="/images/MTWTrash.png" style="width: 35px; height: 35px;" /> button, or added with the <img src="/images/MTWAdd.png" style="width: 35px; height: 35px;" /> button.</p>

<p>The activation function used in each layer is indicated by \( g(u) \) or \( u \) (click to toggle between them), being respectively a sigmoid or a linear function:</p>

<ul>
  <li>\( g(u) = \dfrac{1}{1+e^{-u}} \)</li>
  <li>\( u = u \)</li>
</ul>
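
<p>As a minimal sketch of the two activation functions listed above, the snippet below evaluates them for a single neuron. The helper name <code class="language-plaintext highlighter-rouge">neuronOutput</code> and its argument layout are illustrative assumptions, not the tool's own code.</p>

```javascript
// Sigmoid activation, matching g(u) = 1 / (1 + e^(-u)) from the list above.
function sigmoid(u) {
  return 1 / (1 + Math.exp(-u));
}

// Linear (identity) activation: the output is the weighted sum itself.
function linear(u) {
  return u;
}

// Hypothetical helper: output of one neuron given inputs, weights and a bias.
function neuronOutput(inputs, weights, bias, activation) {
  let u = bias;
  for (let i = 0; i < inputs.length; i++) {
    u += inputs[i] * weights[i];
  }
  return activation(u);
}

console.log(sigmoid(0)); // 0.5
console.log(neuronOutput([1, 2], [0.5, -0.25], 0, linear)); // 0
```

<p>The sigmoid squashes any weighted sum into the interval (0, 1), while the linear option leaves it unbounded, which is usually what a regression output needs.</p>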

<p>The configurations can be swapped between layers using the <img src="/images/MTWSwap.png" style="width: 35px; height: 35px;" /> button. When swapping with the input layer, only the number of nodes is swapped; everything else stays with the neural layer.</p>

<p>The Verbose area will be covered in the next section, since both topology and command tabs share the same interface.</p>

<h2 id="command-tab">Command tab</h2>

<p>This tool was initially developed to generate commands to run the <a href="/blog/multilayer-perceptron-implementation-in-c/" target="_blank">C implementation of multilayer perceptrons</a>. So this tab (<code class="language-plaintext highlighter-rouge">Shift+2</code>) encompasses the commands to perform training and operation of models configured in <a href="#topology-tab">topology tab</a> - even though the same task can be done in this tab without visual inspection.</p>

<p>Right below the generated commands, there is a text area which is related to the <a href="#load-and-save-states">state</a> of the current model and will be covered in the <a href="#train-tab">next section</a>.</p>

<p>In the verbose area, three options control what the generated training command prints when it runs:</p>

<figure>
	<a href="/images/MTWVerboseArea.png"><img src="/images/MTWVerboseArea.png" alt="image" /></a>
	<figcaption>Verbose area.</figcaption>
</figure>

<ul>
  <li><strong>Total epochs</strong>: the number of epochs after training</li>
  <li><strong>mean square error</strong>: the error at the end of each epoch (useful for plotting)</li>
  <li><strong>synaptic weights</strong>: the adjusted weights after training</li>
</ul>

<h2 id="train-tab">Train tab</h2>
<p>Given a model previously configured at <a href="#topology-tab">topology tab</a>, its weights - initially random - can be changed in this tab (<code class="language-plaintext highlighter-rouge">Shift+3</code>) either via training, or manually.</p>

<figure>
	<a href="/images/MTWTrainingArea.png"><img src="/images/MTWTrainingArea.png" alt="image" /></a>
	<figcaption>Interactive structure from training tab.</figcaption>
</figure>

<h3 id="interactive-structure">Interactive structure</h3>
<p>The structure displayed in this tab is interactive, so one can inspect the entire model, as well as perform changes manually.</p>

<p>As indicated in the image above, nodes - or neurons - are clickable. Clicking a node displays its weights, which make up its connections to the previous layer. When the model is idle (i.e., when training iterations are not being performed), weights can be changed manually by modifying the value in the corresponding text field and then pressing Return.</p>

<p>The network can be fed at any moment, whether it is idle or not. Once all text fields at the left of the structure (see image above) are filled with real-valued input, the structure is fed and the output appears at the right side. Tip: paste the whole input in comma-separated format into the first text field, and the values are distributed to the other fields.</p>

<h3 id="dataset">Dataset</h3>

<p>Before training, it is necessary to provide a dataset for the network to learn. Click <img src="/images/MTWDataset.png" style="width: 35px; height: 35px;" /> button to toggle to a text area to provide data - click <img src="/images/MTWStructure.png" style="width: 35px; height: 35px;" /> button to toggle back. Instances must be in <a href="https://pt.wikipedia.org/wiki/Comma-separated_values" target="_blank">CSV format</a> or a <a href="#data-functions">data function</a> can be called:</p>

<figure>
	<a href="/images/MTWCSVData.png"><img src="/images/MTWCSVData.png" alt="image" /></a>
	<figcaption>The means to input data.</figcaption>
</figure>

<p>Instances must be compatible with the model IO, i.e., the number of values per instance (input and expected output) must equal the total number of inputs and outputs of the model. Once a new dataset is inserted, hit the <strong>update parameters</strong> button.</p>
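
<p>The compatibility rule above can be sketched as a small parser. This is only an illustration: the function name <code class="language-plaintext highlighter-rouge">parseInstances</code> is hypothetical, and it assumes that each row lists the input values first, followed by the expected outputs.</p>

```javascript
// Parses CSV text into instances and checks each row against the model IO.
// numInputs and numOutputs are the sizes of the input and output layers.
// Assumption: every row holds the inputs first, then the expected outputs.
function parseInstances(csvText, numInputs, numOutputs) {
  const expected = numInputs + numOutputs;
  return csvText
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line, index) => {
      const values = line.split(",").map((v) => {
        const text = v.trim();
        return text === "" ? NaN : Number(text); // empty fields are invalid
      });
      if (values.length !== expected || values.some(Number.isNaN)) {
        throw new Error("Row " + (index + 1) + " does not fit the model IO");
      }
      return {
        inputs: values.slice(0, numInputs),
        outputs: values.slice(numInputs),
      };
    });
}

// XOR dataset for a model with 2 inputs and 1 output.
const xor = parseInstances("0,0,0\n0,1,1\n1,0,1\n1,1,0", 2, 1);
console.log(xor.length); // 4
```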

<h4 id="data-functions">Data functions</h4>
<p>For didactic purposes, there are some functions to generate data points which can be used in place of explicit instances, such as <code class="language-plaintext highlighter-rouge">ring()</code>, <code class="language-plaintext highlighter-rouge">spiral()</code> and <code class="language-plaintext highlighter-rouge">sin()</code>, detailed below:</p>

<p><code class="language-plaintext highlighter-rouge">ring()</code> function generates data points for <a href="https://en.wikipedia.org/wiki/Binary_classification" target="_blank">binary classification</a> in \( \mathbb{R}^3 \). Spatially, each ring of points from one class is surrounded by a bigger one from another class. Its usage is defined below:</p>

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">ring</span><span class="p">();</span>
<span class="c1">//or</span>
<span class="nx">ring</span><span class="p">(</span><span class="nx">points</span><span class="p">,</span> <span class="nx">pairs</span><span class="p">,</span> <span class="nx">noise</span><span class="p">,</span> <span class="nx">flipLabels</span><span class="p">);</span>
</code></pre></div></div>

<p>where:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">points:integer</code> number of points to be generated (both classes)</li>
  <li><code class="language-plaintext highlighter-rouge">pairs:integer</code> number of pairs of rings (one for each class)</li>
  <li><code class="language-plaintext highlighter-rouge">noise:decimal</code> the amount of noise to be added</li>
  <li><code class="language-plaintext highlighter-rouge">flipLabels:boolean</code> whether to invert labels of data points</li>
</ul>
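
<p>To give an intuition of what such a generator produces, here is a rough sketch of a ring-shaped dataset. It is not the tool's <code class="language-plaintext highlighter-rouge">ring()</code> implementation: the name <code class="language-plaintext highlighter-rouge">makeRings</code>, the ring radii and the uniform noise model are all assumptions made for illustration.</p>

```javascript
// Hypothetical generator of concentric rings for binary classification.
// Each row is [x, y, label]; the inner ring of each pair is class 0 and
// the outer ring is class 1 (inverted when flipLabels is true).
function makeRings(points, pairs, noise, flipLabels) {
  const rings = pairs * 2;
  const rows = [];
  for (let i = 0; i < points; i++) {
    const ring = i % rings; // 0 is the innermost ring
    const angle = 2 * Math.PI * Math.random();
    // Assumed radii 1, 2, 3, ... with uniform noise in [-noise, noise].
    const radius = ring + 1 + noise * (2 * Math.random() - 1);
    let label = ring % 2;
    if (flipLabels) label = 1 - label;
    rows.push([radius * Math.cos(angle), radius * Math.sin(angle), label]);
  }
  return rows;
}

// 200 points on one pair of rings, with a little noise.
const ringData = makeRings(200, 1, 0.1, false);
console.log(ringData.length); // 200
```

<p>Each row has two coordinates and one label, which is why the data lives in \( \mathbb{R}^3 \) and fits a model with two inputs and one output.</p>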

<p><code class="language-plaintext highlighter-rouge">spiral()</code> function generates data points for <a href="https://en.wikipedia.org/wiki/Binary_classification" target="_blank">binary classification</a> in \( \mathbb{R}^3 \). Spatially, points of both classes are arranged in a spiral pattern. Its usage is defined below:</p>

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">spiral</span><span class="p">();</span>
<span class="c1">//or</span>
<span class="nx">spiral</span><span class="p">(</span><span class="nx">points</span><span class="p">,</span> <span class="nx">twirl</span><span class="p">,</span> <span class="nx">noise</span><span class="p">,</span> <span class="nx">flipLabels</span><span class="p">);</span>
</code></pre></div></div>

<p>where:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">points:integer</code> number of points to be generated (both classes)</li>
  <li><code class="language-plaintext highlighter-rouge">twirl:decimal</code> the intensity of the twirl effect</li>
  <li><code class="language-plaintext highlighter-rouge">noise:decimal</code> the amount of noise to be added</li>
  <li><code class="language-plaintext highlighter-rouge">flipLabels:boolean</code> whether to invert labels of data points</li>
</ul>

<p><code class="language-plaintext highlighter-rouge">sin()</code> function generates data points for <a href="https://en.wikipedia.org/wiki/Regression_analysis" target="_blank">regression</a> in \( \mathbb{R}^2 \). Spatially, points describe the behavior of the <a href="https://en.wikipedia.org/wiki/Sine" target="_blank">sine</a> function. Its usage is defined below:</p>

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">sin</span><span class="p">();</span>
<span class="c1">//or</span>
<span class="nx">sin</span><span class="p">(</span><span class="nx">points</span><span class="p">,</span> <span class="nx">periods</span><span class="p">,</span> <span class="nx">noise</span><span class="p">);</span>
</code></pre></div></div>

<p>where:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">points:integer</code> number of points to be generated</li>
  <li><code class="language-plaintext highlighter-rouge">periods:integer</code> how many times the period is repeated</li>
  <li><code class="language-plaintext highlighter-rouge">noise:decimal</code> the amount of noise to be added</li>
</ul>
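
<p>For comparison with the classification generators, a regression dataset of this kind could be sketched as below. Again, this is not the tool's <code class="language-plaintext highlighter-rouge">sin()</code> code: the name <code class="language-plaintext highlighter-rouge">makeSine</code>, the sampling range and the noise model are assumptions, and the tool may scale the values differently.</p>

```javascript
// Hypothetical generator of noisy samples of the sine function.
// Each row is [x, y] with x evenly spaced over `periods` full periods.
function makeSine(points, periods, noise) {
  const rows = [];
  const span = 2 * Math.PI * periods;
  for (let i = 0; i < points; i++) {
    const x = points > 1 ? (span * i) / (points - 1) : 0;
    // Assumed noise model: uniform in [-noise, noise] added to y.
    const y = Math.sin(x) + noise * (2 * Math.random() - 1);
    rows.push([x, y]);
  }
  return rows;
}

// Five noiseless samples over one period: 0, pi/2, pi, 3pi/2, 2pi.
const sineData = makeSine(5, 1, 0);
console.log(sineData.length); // 5
```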

<h3 id="training-interface">Training interface</h3>
<p>At the bottom area, the adjustable parameters of the classic backpropagation algorithm are learning rate ( \( \eta \) ) and precision ( \( \epsilon \) ) as stopping criterion. Once parameters are changed, hit <strong>update parameters</strong> button.</p>

<p>To start training, hit the <strong>iterate</strong> button. Everything in the structure is updated at each iteration, and the behavior of the mean square error (MSE) over the epochs can be inspected in the vertical gauge at the right. As the MSE approaches \( \epsilon \), if convergence slows down and the difference becomes too small to inspect visually, click the gauge area to zoom in.</p>

<p>Training iterations can be interrupted any time by marking <strong>interrupt</strong> option, going back to idle state. Changes of weights can be inspected between epochs, once <strong>interrupt</strong> option is marked and <strong>iterate</strong> is hit.</p>

<p>It is also possible to inspect changes of weights within one epoch, as instances are presented to the algorithm, by hitting the <strong>next instance<sub>c/t</sub></strong> button, where <strong>c</strong> is the index of the <strong>next instance</strong> to be presented <strong>out of t</strong> instances. After iterating over all instances, an epoch is counted; if the <strong>iterate</strong> button is hit when \( 1 &lt; c &lt; t \), the remaining instances are presented, and an epoch is counted.</p>

<p>To restart training, hit the <strong>random</strong> button, which assigns small random values in \( [0,1] \) to all weights of the network.</p>
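
<p>To make the roles of \( \eta \) and \( \epsilon \) concrete, the sketch below trains a single sigmoid neuron, presenting one instance at a time and counting an epoch after all instances. It is a simplified illustration rather than the tool's code: the function name <code class="language-plaintext highlighter-rouge">trainNeuron</code> is hypothetical, weights start at zero instead of random values so the run is reproducible, and it assumes training stops once the MSE of an epoch is at or below \( \epsilon \).</p>

```javascript
function sigmoid(u) {
  return 1 / (1 + Math.exp(-u));
}

// Trains one sigmoid neuron with per-instance (online) gradient descent.
// eta is the learning rate; epsilon is the precision used to stop.
// Assumption: training stops once the epoch MSE is at or below epsilon,
// or after maxEpochs as a safety limit.
function trainNeuron(instances, eta, epsilon, maxEpochs) {
  const numInputs = instances[0].inputs.length;
  const weights = new Array(numInputs + 1).fill(0); // weights[0] is the bias
  let epochs = 0;
  let mse = Infinity;
  while (mse > epsilon && epochs < maxEpochs) {
    let squaredError = 0;
    for (const { inputs, target } of instances) {
      let u = weights[0];
      for (let i = 0; i < numInputs; i++) u += weights[i + 1] * inputs[i];
      const y = sigmoid(u);
      const error = target - y;
      // Gradient step: error times the sigmoid derivative y * (1 - y).
      const delta = eta * error * y * (1 - y);
      weights[0] += delta;
      for (let i = 0; i < numInputs; i++) weights[i + 1] += delta * inputs[i];
      squaredError += error * error;
    }
    mse = squaredError / instances.length; // error measured over this epoch
    epochs++;
  }
  return { weights, epochs, mse };
}

// Logical OR: linearly separable, so a single neuron is enough.
const orData = [
  { inputs: [0, 0], target: 0 },
  { inputs: [0, 1], target: 1 },
  { inputs: [1, 0], target: 1 },
  { inputs: [1, 1], target: 1 },
];
const result = trainNeuron(orData, 0.5, 0.05, 10000);
console.log(result.epochs, result.mse);
```

<p>A larger \( \eta \) takes bigger steps per instance, and a smaller \( \epsilon \) demands more epochs before training stops.</p>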

<h3 id="load-and-save-states">Load and save states</h3>
<p>Once a network is trained and ready to use, hit the <strong>save</strong> button. It takes you to the <a href="#command-tab">command tab</a>, where the text area below the generated commands displays the weights that make up the entire model, so it can be used in another piece of software - the <a href="/blog/color-sensor-prototype-using-neural-networks/" target="_blank">color sensor project</a> is a good example that benefits from these weights. The order of weights in relation to the nodes is defined as follows:</p>

<figure>
	<a href="/images/mlpWeightOrder.png"><img src="/images/mlpWeightOrder.png" alt="image" /></a>
	<figcaption>Order of weights as they appear in the topology.</figcaption>
</figure>

<p>The other way around is also possible. A list of weights obtained elsewhere - a <a href="https://www.mathworks.com/products/neural-network.html" target="_blank">MATLAB toolbox</a>, <a href="http://playground.tensorflow.org" target="_blank">Tensorflow playground</a>, or even my <a href="/blog/multilayer-perceptron-implementation-in-c/" target="_blank">implementation in C language</a> - can be applied to the network structure by pasting it in the same text area from command tab and hitting <strong>load</strong> button at train tab. Make sure to supply the exact number of weights, in the order illustrated above, necessary to make up a model with the configured structure.</p>
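
<p>As a quick sanity check before loading, the number of weights a given structure needs can be computed as sketched below. The function name <code class="language-plaintext highlighter-rouge">countWeights</code> is hypothetical, and whether a bias weight per neuron is part of the list is an assumption here, so both cases are shown.</p>

```javascript
// topology lists the layer sizes, starting with the number of inputs,
// e.g. [2, 3, 1] for 2 inputs, 3 hidden nodes and 1 output.
// Assumption: each neuron has one weight per node of the previous layer,
// plus one bias weight when includeBias is true.
function countWeights(topology, includeBias) {
  let total = 0;
  for (let layer = 1; layer < topology.length; layer++) {
    const fanIn = topology[layer - 1] + (includeBias ? 1 : 0);
    total += fanIn * topology[layer];
  }
  return total;
}

console.log(countWeights([2, 3, 1], true)); // 13
console.log(countWeights([2, 3, 1], false)); // 9
```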

<p>Use the <strong>feed forward</strong> button when the output or the activations of hidden neurons have not been refreshed.</p>

<p>Depending on the topological configuration, the <img src="/images/MTWNeuron.png" style="width: 35px; height: 35px;" /> button becomes available. More details of this feature are covered in the next section.</p>

<h2 id="visualize-tab">Visualize tab</h2>
<p>When the topology defines a function in \( \mathbb{R}^2 \) or \( \mathbb{R}^3 \), this tab (<code class="language-plaintext highlighter-rouge">Shift+4</code>) is enabled and ready to use. It shares the same <a href="#training-interface">training interface</a> as in training tab - so the result of iterations can be observed spatially - along with visualization parameters to customize what is shown in the top area.</p>

<p>The size of points (from <a href="#dataset">dataset</a>) to be plotted is defined by <strong>PT SIZE</strong> and <strong>ASPECT</strong> defines the <a href="https://en.wikipedia.org/wiki/Aspect_ratio_(image)" target="_blank">aspect ratio</a> of the plotting area. <strong>show points</strong> button flags whether to show the points along with the output of the model. Once a parameter is changed, <strong>plot</strong> button will be marked, meaning that there are changes to be done in the visualization. Hit <strong>plot</strong> button to refresh the plotting area with the new parameters.</p>

<p>The default node to be visualized is the model output (i.e., the first node of the output layer). To select another node for visualization, hit the <img src="/images/MTWNeuron.png" style="width: 35px; height: 35px;" /> button (also available in training tab, or <code class="language-plaintext highlighter-rouge">Shift+V</code>, regardless of the selected tab). It will take you to the <a href="#train-tab">training tab</a>, prompting you to select another node to visualize its output in space - pass the mouse cursor over a node to see a preview of the visualization, or click to be taken to the visualize tab.</p>

<p>A quick note: the image of the <img src="/images/MTWNeuron.png" style="width: 35px; height: 35px;" /> button is licensed under a Creative Commons license. So let me give due <a href="https://thenounproject.com/term/neuron/79860/" target="_blank">credit</a> to <a href="https://thenounproject.com/wattenberger/" target="_blank">Amelia Wattenberger</a>.</p>

<p>It is important to point out that whenever a weight is changed at training tab (via training or manually), the visualization, when available, is refreshed. That said, this tool is useful to study the role of each weight in the activation of nodes - performing a manual approximation of the <a href="#data-functions">sin function</a> is a good exercise - since it is possible to visualize and change the network at training tab.</p>

<h2 id="sharing">Sharing</h2>
<p>It is possible to embed this tool in an HTML page with a trained network, so the same model can be tested and used by others. Click the <strong><code class="language-plaintext highlighter-rouge">&lt;/&gt;</code></strong> button and copy the generated HTML code. Feel free to resize the frame according to your layout, but keep in mind that <strong>minimum and maximum widths are 455 and 700 pixels</strong>.</p>

<p>Everything shareable is stored within the <a href="https://en.wikipedia.org/wiki/Fragment_identifier" target="_blank">URL hash</a>. Only the dataset is not shareable - I chose to prioritize the weights in the URL, because large datasets might get truncated.</p>

<p>It is also possible not to embed the weights. To do so, uncheck the <strong>synaptic weights</strong> option under the <strong>verbose</strong> area in <a href="#topology-tab">topology tab</a>.</p>

<h3 id="url-parameters">URL parameters</h3>
<p>It is also possible to share just the URL (since network weights are carried in the hash) - simply ignore the HTML code around the URL. The hash contains comma-separated parameters in the following syntax:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#I, N1, ..., Nm [, s | l, ..., s | l] [, eE] [, rR] [, wLIST]
</code></pre></div></div>

<p>where:</p>

<ul>
  <li><strong>UPPERCASE</strong> are numbers and <strong>lowercase</strong> are static characters</li>
  <li><code class="language-plaintext highlighter-rouge">I</code> is the number of inputs</li>
  <li><code class="language-plaintext highlighter-rouge">N1, ..., Nm</code> are the numbers of nodes (from layer <code class="language-plaintext highlighter-rouge">1</code> to layer <code class="language-plaintext highlighter-rouge">m</code> (output))</li>
  <li><code class="language-plaintext highlighter-rouge">[, s | l, ..., s | l]</code> are the activation functions to use per neural layer: <code class="language-plaintext highlighter-rouge">s</code> for sigmoid and <code class="language-plaintext highlighter-rouge">l</code> for linear (Optional. Default is <code class="language-plaintext highlighter-rouge">s</code>).</li>
  <li><code class="language-plaintext highlighter-rouge">[, eE]</code> is the precision \( \epsilon \). (Optional. Default is <code class="language-plaintext highlighter-rouge">E</code> \( = 0.01 \) ).</li>
  <li><code class="language-plaintext highlighter-rouge">[, rR]</code> is the learning rate \( \eta \). (Optional. Default is <code class="language-plaintext highlighter-rouge">R</code> \( = 0.1 \) ).</li>
  <li><code class="language-plaintext highlighter-rouge">[, wLIST]</code> is the list of weights following the order described <a href="#load-and-save-states">here</a>. (Optional. Default is a random list).
    <ul>
      <li><code class="language-plaintext highlighter-rouge">LIST</code> separates weights with separator <code class="language-plaintext highlighter-rouge">|</code></li>
    </ul>
  </li>
</ul>
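
<p>The syntax above can be read with a few lines of code. The sketch below is one possible interpretation, not the tool's own parser: the name <code class="language-plaintext highlighter-rouge">parseHash</code> and the shape of the returned object are assumptions, and the example hash was written by hand for illustration.</p>

```javascript
// Reads a hash such as "#2,3,1,s,l,e0.05,r0.2,w0.1|-0.2|0.3".
// Defaults follow the list above: sigmoid layers, E = 0.01 and R = 0.1.
function parseHash(hash) {
  const tokens = hash.replace(/^#/, "").split(",").map((t) => t.trim());
  const config = {
    inputs: null,
    layers: [],
    activations: [],
    epsilon: 0.01,
    rate: 0.1,
    weights: null, // null stands for "use a random list"
  };
  for (const token of tokens) {
    if (/^\d+$/.test(token)) {
      // The first plain number is I; the following ones are N1 ... Nm.
      if (config.inputs === null) config.inputs = Number(token);
      else config.layers.push(Number(token));
    } else if (token === "s" || token === "l") {
      config.activations.push(token);
    } else if (token.startsWith("e")) {
      config.epsilon = Number(token.slice(1));
    } else if (token.startsWith("r")) {
      config.rate = Number(token.slice(1));
    } else if (token.startsWith("w")) {
      config.weights = token.slice(1).split("|").map(Number);
    }
  }
  // Layers without an explicit activation fall back to the sigmoid default.
  while (config.activations.length < config.layers.length) {
    config.activations.push("s");
  }
  return config;
}

console.log(parseHash("#2,3,1,s,l,e0.05,r0.2,w0.1|-0.2|0.3"));
```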

<h2 id="closing-remarks">Closing remarks</h2>
<p>The purpose of this tool is to study how this architecture of neural networks and the classic backpropagation algorithm work, as well as to share a trained model. It has a lot of limitations and therefore plenty of room for improvements. Again, <a href="http://www.github.com/moretticb/MTW" target="_blank">fork me on GitHub</a> if you are keen on neural networks and this tool.</p>

<p>This is <strong>not</strong> a data science tool - Cross-validation and preprocessing tools were intentionally not implemented, because they do not concern the main purpose. For real-life applications, <a href="https://www.tensorflow.org/" target="_blank">Tensorflow library</a>, <a href="http://scikit-learn.org/" target="_blank">Python + Scikit-learn</a> and <a href="https://www.r-project.org/" target="_blank">R</a> are the most suitable tools.</p>]]></content><author><name>Caio Benatti Moretti</name><email>caiodba@gmail.com</email></author><category term="blog" /><category term="machine-learning" /><summary type="html"><![CDATA[For study purposes, this tool provides an intuitive view over MLPs, from its structural form, as a function, to the backpropagation training algorithm with full control over iterations, for inspection purposes. It also enables to share a trained network in your project page.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="//www.moretticb.com/%7B%22feature%22=%3E%22mtwCapa.png%22%7D" /><media:content medium="image" url="//www.moretticb.com/%7B%22feature%22=%3E%22mtwCapa.png%22%7D" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">A game environment for the integration of complementary game-based projects</title><link href="//www.moretticb.com/blog/a-game-environment-for-complementary-project-applications/" rel="alternate" type="text/html" title="A game environment for the integration of complementary game-based projects" /><published>2017-01-01T03:59:55+00:00</published><updated>2017-01-01T03:59:55+00:00</updated><id>//www.moretticb.com/blog/a-game-environment-for-complementary-project-applications</id><content type="html" xml:base="//www.moretticb.com/blog/a-game-environment-for-complementary-project-applications/"><![CDATA[<p>With the purpose of simplifying the integration of projects applied to games, I developed a simplified and manipulable 2D game engine, enabling to focus 
essentially on project goals and avoiding having to deal with interfacing matters, which can eventually become a complicated task. At least in the initial stages of a project's development, this approach is useful to recreate games with the same underlying mechanics, until the stage (if necessary) where integration with the original game environment takes place.</p>

<p>Everything was developed in Java and the projects were created using <a href="http://www.eclipse.org/" target="_blank">Eclipse</a>. Since the <a href="http://www.github.com/moretticb/AStar" target="_blank">A*</a> algorithm can be used outside this <a href="http://www.github.com/moretticb/GameEnv" target="_blank">game environment</a> as well, I kept the two projects separate on GitHub. If you are using Eclipse as well, don’t forget to <a href="http://help.eclipse.org/luna/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Freference%2Fref-properties-build-path.htm" target="_blank">add</a> the A* project to the Java Build Path.</p>

<p>I would say the ability to manipulate the environment is the main feature that justifies - aside from the fun - developing this game engine from scratch. With appropriate adaptations, there are plenty of areas in which to develop and apply complementary projects to gaming scenarios, such as artificial intelligence (applicable to several parts of a game, not only giving intelligent behavior to characters), network communication (multiplayer), external user interfaces - from adaptations (e.g., robotic devices for motor rehabilitation) to some new concept of user interface - and so on, considering also combinations of these and other subjects.</p>

<p>Another benefit of this approach, as mentioned earlier, is avoiding the need to write low-level software for virtual interfacing purposes. Such an effort can get very complicated and sometimes infeasible for simpler projects. SethBling did outstanding work in <a href="https://www.youtube.com/watch?v=qv6UVOQ0F44" target="_blank">MarI/O</a>, applying evolutionary mechanisms to obtain a topology of an artificial neural network that controls Mario properly, being able to cross an entire level. However, it was necessary to deal with an interface in order to exchange IO with an emulator. As far as feasibility is concerned, getting familiar with a particular programming language and studying the architecture of an emulator are not trivial tasks, and they sometimes become more difficult than the project itself. On the other hand, it could be a nice excuse to learn more, if you have time :)</p>

<p>It is important to point out that this approach (using this game environment) is useful for recreating 2D games only. In case of a 3D game, <a href="http://www.unity3d.com/" target="_blank">Unity 3D</a> does the job very well - it also works with 2D, but I personally prefer to steer clear of the way it handles 2D.</p>

<p>This article covers some implementation details. The full UML class diagram can be seen <a href="/images/GameEnvClassDiagram.png" target="_blank">here</a>, but some information needs to be assimilated first in order to easily understand the diagram and the code itself. This information is presented as an intuitive notion in the following sections.</p>

<ul id="markdown-toc">
  <li><a href="#gui" id="markdown-toc-gui">GUI</a></li>
  <li><a href="#level" id="markdown-toc-level">Level</a></li>
  <li><a href="#animation" id="markdown-toc-animation">Animation</a></li>
  <li><a href="#characters" id="markdown-toc-characters">Characters</a></li>
  <li><a href="#game-loop" id="markdown-toc-game-loop">Game loop</a></li>
</ul>

<h2 id="gui">GUI</h2>

<p>The GUI was developed using Java’s <a href="https://pt.wikipedia.org/wiki/Swing_(Java)" target="_blank">Swing</a>. The figure below gives a good understanding of the structure of graphic components.</p>

<figure>
        <a href="/images/GUIDiagram.png"><img src="/images/GUIDiagram.png" alt="image" /></a>
        <figcaption>Composition of GUI</figcaption>
</figure>

<p>It actually has only two GUI components (a <em>JFrame</em> and a <em>JPanel</em>). Everything else is about drawing graphic primitives, images and calculating positions on canvas. Swing toolkit also provides a good support to listeners, which were also used.</p>

<p>Most of the graphical activity occurs in the double-buffered JPanel, which receives a tiled image and sprite frames of characters. Each iteration of the <a href="#game-loop">game loop</a> calculates the position of each animated element and updates the buffer to be displayed.</p>

<p>The mouse listener is used to click on a point of the map either to go to, or to perform an action in a character (in case the clicked place contains a character). The key listener is used to walk on the map (arrow keys) and to choose actions to perform: follow (F), stun (T), slow (S) and kill (K) - run the code and try it yourself. More about <a href="#actions">actions</a> is covered ahead.</p>

<h2 id="level">Level</h2>

<p>Given a <a href="https://en.wikipedia.org/wiki/Tile-based_video_game" target="_blank">tileset</a> image and a CSV file with comma-separated tileset coordinates, an image of the map can be built. As the figure below illustrates, each coordinate consists of two values and a special character (between the coordinate values) that indicates whether the player can walk on that tile (floor) or not (wall).</p>

<figure>
        <a href="/images/LevelDiagram.png"><img src="/images/LevelDiagram.png" alt="image" /></a>
        <figcaption>Graphical and structural level features</figcaption>
</figure>

<h3 id="tile-map">Tile map</h3>

<p>From the CSV file where a level is designed, the <em>TileMap</em> class builds the tilemap image. For each coordinate, the referenced area of the tileset is copied and drawn on the tilemap image. The tiles are arranged in the same order as the coordinates in the level file.</p>

<h3 id="pathfinder-algorithm">Pathfinder algorithm</h3>

<p>The <em>TileMap</em> class also converts the level file structure into a 0-1 matrix. This matrix, along with the origin and destination locations, is the input of the <a href="https://www.youtube.com/watch?v=KNXfSOx4eEE" target="_blank">A* pathfinder algorithm</a>, which finds the shortest paths used as character routes. Each location of this matrix and of the tilemap is called a cell.</p>
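
<p>As a rough illustration of that conversion, here is a minimal sketch in C (not the Java code of the project). The marker characters are assumptions, since the actual level format is only shown in the figure: a hypothetical <em>.</em> between the two coordinate values marks a floor and a hypothetical <em>#</em> marks a wall. Mapping walls to 1 and floors to 0 is also an assumption, and the function names are made up for this example.</p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical markers: '.' (floor) and '#' (wall) between the two
   coordinate values are assumptions made only for this sketch. */
#define FLOOR_MARK '.'
#define WALL_MARK  '#'

/* Returns 1 if the token (e.g. "3#1") marks a wall, 0 if it marks a
   floor (e.g. "0.0"), and -1 if no known marker is found. */
int cell_from_token(const char *token)
{
    if (strchr(token, WALL_MARK) != NULL)  return 1;
    if (strchr(token, FLOOR_MARK) != NULL) return 0;
    return -1;
}

/* Converts one CSV row such as "0.0,3#1,0.0" into a row of 0/1 cells.
   Returns the number of cells written, or -1 on a malformed token. */
int row_to_cells(const char *csv_row, int *cells, size_t max_cells)
{
    size_t count = 0;
    const char *start = csv_row;

    while (*start != '\0' && count < max_cells) {
        const char *end = strchr(start, ',');
        size_t len = end ? (size_t)(end - start) : strlen(start);
        char token[32];
        int cell;

        if (len == 0 || len >= sizeof token) return -1;
        memcpy(token, start, len);
        token[len] = '\0';

        cell = cell_from_token(token);
        if (cell < 0) return -1;
        cells[count++] = cell;

        if (end == NULL) break;   /* last token of the row */
        start = end + 1;
    }
    return (int)count;
}
```

<p>Calling <em>row_to_cells</em> once per line of the level file would fill the matrix row by row, which is the kind of grid a pathfinder can search.</p>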

<h2 id="animation">Animation</h2>

<p>Every element shown in <em>MapArea</em> that smoothly moves or changes its visual aspect undergoes an animation effect. Motion transition between current and next positions in the map, as well as changes in states of sprite sequences are covered below.</p>

<p>From a game mechanics perspective, a walking character, for instance, is just a sequence of current and next positions that is updated until the path is completely traversed. Graphically, positioning a character this way is an abrupt jump from one cell to another. Animation is what makes these abrupt transitions smoother, at a given frame update rate (FPS).</p>

<h3 id="spriter">Spriter</h3>

<p>Still on the walking character example, the creature’s own movements must be shown together with its motion along the map. These movements are represented as frame-by-frame animations in a <em>spritesheet</em>, or simply sprite, where each frame is an image of a character in a particular pose. The sequence of frames that composes a whole movement animation is what I am going to call a <em>State</em>.</p>

<figure>
        <a href="/images/SpriterDiagram.gif"><img src="/images/SpriterDiagram.gif" alt="image" /></a>
        <figcaption>Frame-by-frame animations from a spritesheet</figcaption>
</figure>

<p>As can be seen above, Ash walking down, Gary walking down, Ash turning around and Ash walking right are examples of states in a <em>spritesheet</em>, respectively indicated by A, B, C and D. Note the arrangement of frames in the sheet, as well as the possibilities to combine frames and come up with animations (the states). Due credit goes to those who did a good job gathering the GBC Pokemon tiles of the <a href="https://www.spriters-resource.com/game_boy_gbc/pokemonredblue/sheet/8728/" target="_blank">character sprites</a> and the <a href="https://www.spriters-resource.com/game_boy_gbc/pokemonredblue/sheet/63033/" target="_blank">map tileset</a>.</p>

<h3 id="animator">Animator</h3>

<p>When transitioning from the current to the next position, the <em>Animator</em> class produces a value within \( [0,1] \) that weights the length of the line segment \( l \) connecting the origin and destination points on the canvas. It basically interpolates values over time (frames, in this case), so that the successive points along \( l \) create motion.</p>

<p>Motion animation, in this project, is used only for walking characters. To keep the motion constant between the end of one animation (when a cell is reached) and the beginning of the next one (when walking towards another cell), only linear interpolation was considered:</p>

<figure>
        <a href="/images/mapEasingLinear.gif"><img src="/images/mapEasingLinear.gif" alt="image" /></a>
        <figcaption>Linear interpolation of values for motion animation</figcaption>
</figure>

<p>As shown above, given the total duration of the animation (i.e., the total number of frames \( f_{total} \)), linear interpolation splits the interval into \( f_{total} \) equal segments of length \( \dfrac{1}{f_{total}} \). As frames run over time, the interpolated value grows uniformly: it is the segment length multiplied by the current frame \( f_{current} \) (the elapsed time), that is, \( \dfrac{f_{current}}{f_{total}} \).</p>

<p>I also took this opportunity to implement - because it’s fun - some simplified easing in the interpolation behavior, giving a velocity boost either at the beginning or at the end:</p>
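
<p>The linear weight can be sketched in a few lines of C (again, an illustration rather than the project’s Java <em>Animator</em>). The function names and the clamping of out-of-range frames are assumptions made for this example.</p>

```c
/* Linear weight in [0,1]: f_current equal steps of size 1/f_total.
   Out-of-range frames are clamped (an assumption of this sketch). */
double linear_weight(int f_current, int f_total)
{
    if (f_total <= 0) return 1.0;              /* nothing to animate */
    if (f_current < 0) f_current = 0;
    if (f_current > f_total) f_current = f_total;
    return (double)f_current / (double)f_total;
}

/* Coordinate on the segment from origin to destination for a weight. */
double lerp(double origin, double destination, double weight)
{
    return origin + (destination - origin) * weight;
}
```

<p>Applying <em>lerp</em> to the x and y coordinates separately, with the same weight, gives the point on the segment for the current frame.</p>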

<figure>
        <a href="/images/mapEasingQuad.gif"><img src="/images/mapEasingQuad.gif" alt="image" /></a>
        <figcaption>Quadratic interpolation of values for motion animation</figcaption>
</figure>

<p><em>Easing in</em> gives a boost at the end of the interpolation (it begins with ease, hence the name). Its behavior is the linear behavior raised to some power, which defines the \( order \) of the polynomial and hence the boost intensity. As an example, above we consider \( order=2 \), so we have a quadratic function, or simply a quadratic ease in. <em>Easing out</em> plays the opposite role: its behavior is the inverse function of the <em>easing in</em> polynomial.</p>

<p>For more about easing, check out <a href="http://robertpenner.com/easing/" target="_blank">Robert Penner’s website</a>. Unlike here, where we deal only with the highest-degree term of the polynomial, it offers a deeper background on easing functions, as well as manipulation of polynomial coefficients and combinations of easing in and out.</p>
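
<p>Both behaviors fit in a short C sketch, assuming an integer order and hypothetical function names. To keep it free of math-library calls, the inverse function is found here by bisection, which is a choice of this example only; <em>pow(t, 1.0 / order)</em> from <em>math.h</em> computes the same value.</p>

```c
/* Ease in: linear weight t in [0,1] raised to an integer order >= 1
   (slow start, boost at the end). */
double ease_in(double t, int order)
{
    double result = 1.0;
    int i;
    for (i = 0; i < order; i++)
        result *= t;
    return result;
}

/* Ease out as described in the text: the inverse function of ease_in,
   i.e. the x in [0,1] such that ease_in(x, order) equals t
   (boost at the start, slow end). Solved by bisection. */
double ease_out(double t, int order)
{
    double lo = 0.0, hi = 1.0;
    int i;
    for (i = 0; i < 60; i++) {
        double mid = 0.5 * (lo + hi);
        if (ease_in(mid, order) < t)
            lo = mid;
        else
            hi = mid;
    }
    return 0.5 * (lo + hi);
}
```

<p>For the same linear weight, the ease-in value lies below the linear one and the ease-out value above it, which is what creates the late and early boosts.</p>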

<h2 id="characters">Characters</h2>

<p>Every character (player and non-player) we see on the map is essentially a <em>GameChar</em>. Looking at the <a href="/images/GameEnvClassDiagram.png">class diagram</a>, we can see that no specific behavior (functionality) exists in its actual implementations, <em>GamePlayer</em> (human players) and <em>GameComputer</em> (non-human players, a.k.a. NPCs). <em>GameChar</em> plays an abstract role here, even though technically it is not abstract. I left it like that on purpose for further implementations, because such an arrangement of entities would make it easy to separate different ways of performing the same behavior - polymorphism rocks - and then <em>GameChar</em> would truly become an abstract class.</p>

<h3 id="actions">Actions</h3>

<p>Interaction between game characters can be seen as actions that one character executes on another. Every action involves two people: the one who executes it and the one who undergoes it, whom I am respectively calling actor and actee. Both are instances of <em>GameChar</em>. Of course <em>actee</em> is a made-up word, inspired by a <a href="https://youtu.be/HS93nHdNzds?t=51" target="_blank">funny line of Chandler’s</a> in a Friends episode where he said “<em>… the messers become the messees!</em>”.</p>

<p>Given an actor and an actee, the behavior of an action can be expressed through <em>GameChar</em> (or even through <em>GamePlayer</em> or <em>GameComputer</em>) methods. I could drain health points (HP) from a character, for example, by simply giving characters an HP total; public methods could then remove some HP from the actee and add it to the actor, and so on. Currently, as stated in the <a href="#gui">GUI section</a>, the implemented actions are to follow, to stun, to slow and to kill.</p>

<p>Given two characters on a map and <a href="https://en.wikipedia.org/wiki/Melee" target="_blank"><em>melee</em></a> actions, the actor follows the actee until they are close enough, and then the action itself is executed. Closeness here is defined by Euclidean and Manhattan distances (\( d_{euclidean}(\cdot , \cdot ) \) and \( d_{manhattan}(\cdot , \cdot ) \)) - see their usage in the <a href="#game-loop">Game loop</a> section.</p>

<p>Another interesting detail: I just mentioned that the actor follows the actee before an action is executed, and I also mentioned the existence of a Follow action. If you check the code, you will notice that the <em>FollowAction</em> class has nothing to do, because following is, say, a primitive function in the game. So when the actor reaches the actee and the action is an instance of FollowAction, the actor must not stop following the actee - I just translated code to text, check it yourself :).</p>
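
<p>For reference, the two distances between cells can be sketched in C as below. The <em>Cell</em> struct and the function names are hypothetical, and returning the squared Euclidean distance (to stay in integer arithmetic) is a choice of this example; the thresholds that define "close enough" in the game are not reproduced here.</p>

```c
#include <stdlib.h>

/* A cell is a (column, row) location on the map; hypothetical type. */
typedef struct { int col; int row; } Cell;

/* Manhattan distance: horizontal plus vertical steps between cells. */
int d_manhattan(Cell a, Cell b)
{
    return abs(a.col - b.col) + abs(a.row - b.row);
}

/* Squared Euclidean distance. Comparing it against a squared threshold
   avoids a square root while preserving the ordering of distances. */
int d_euclidean_squared(Cell a, Cell b)
{
    int dc = a.col - b.col;
    int dr = a.row - b.row;
    return dc * dc + dr * dr;
}
```

<p>The two measures disagree on diagonals: a diagonal neighbour is at Manhattan distance 2 but at Euclidean distance \( \sqrt{2} \), so combining them can tell side-by-side cells apart from diagonal ones.</p>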

<p><em>ActionChar</em> also has a duration (again, in frames), so temporary effects can also exist. When this duration finishes, a method to revert the effect to initial conditions is triggered.</p>
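
<p>The idea of a temporary effect can be sketched in C as a frame countdown that restores the saved initial condition when it ends. Everything here is hypothetical (the <em>Character</em> and <em>SlowEffect</em> types, the <em>speed</em> field and the halving of the speed); it only illustrates the duration-then-revert pattern, not the actual <em>ActionChar</em> code.</p>

```c
#include <stddef.h>

/* Hypothetical minimal character: only the field this sketch needs. */
typedef struct { int speed; } Character;

/* Hypothetical timed effect: counts frames and reverts at the end. */
typedef struct {
    Character *actee;    /* NULL once the effect has been reverted */
    int frames_left;     /* remaining duration, in frames */
    int saved_speed;     /* initial condition to restore */
} SlowEffect;

/* Applies the effect and remembers the initial condition. */
void slow_apply(SlowEffect *e, Character *actee, int duration_frames)
{
    e->actee = actee;
    e->frames_left = duration_frames;
    e->saved_speed = actee->speed;
    actee->speed /= 2;   /* assumed effect: halve the speed */
}

/* Called once per frame; returns 1 while the effect is still active. */
int slow_tick(SlowEffect *e)
{
    if (e->actee == NULL) return 0;        /* already reverted */
    if (e->frames_left > 0) e->frames_left--;
    if (e->frames_left > 0) return 1;
    e->actee->speed = e->saved_speed;      /* revert to initial condition */
    e->actee = NULL;
    return 0;
}
```

<p>With a duration of 3 frames, the first two calls to <em>slow_tick</em> report the effect as active and the third one restores the original speed.</p>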

<h2 id="game-loop">Game loop</h2>

<p>If you have read this far - I tried to be brief, sorry! - I believe understanding the game loop will not be a problem. Below we have some definitions and the game loop algorithm.</p>

<ul>
  <li>\( players \) is a collection of <em>GameChar</em> instances (<em>GamePlayer</em> or <em>GameComputer</em>)</li>
  <li>\( p_{target} \) is the <em>GameChar</em> that \( p \) has to follow</li>
  <li>\( p_{waypoint} \) is a destination cell (the location on the map) to go to</li>
  <li>\( p_{position} \) is the current location of \( p \)</li>
  <li>\( p_{anim} \) is the <em>Animator</em> instance for animating motion of \( p \)</li>
  <li>\( p_{path} \) is a collection of cells, which creates a path from \( p_{position} \) to \( p_{waypoint} \)</li>
  <li>\( p_{busy} \) indicates whether \( p \) is busy traversing a path</li>
  <li>\( p_{spriter} \) is the <em>Spriter</em> instance for frame-by-frame animations of \( p \)</li>
  <li>\( p_{direction} \) is the current direction \( p \) is walking towards</li>
</ul>

<figure>
        <a href="/images/GameLoopAlgorithm.png"><img src="/images/GameLoopAlgorithm.png" alt="image" /></a>
        <figcaption>Game loop algorithm</figcaption>
</figure>

<p>It is a simplified version of the logic running in a separated Thread to update the screen, state of characters and everything else in the game. Click <a href="/images/GameLoopAlgorithm.png" target="_blank">here</a> to open the original image (if everything is too short to read).</p>]]></content><author><name>Caio Benatti Moretti</name><email>caiodba@gmail.com</email></author><category term="blog" /><category term="gaming" /><summary type="html"><![CDATA[With the purpose of simplifying the integration of projects applied to games, I developed a simplified and manipulable 2D game engine, enabling to focus essentially on project goals and avoiding to deal with interfacing matters, which can eventually become a complicated task.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="//www.moretticb.com/%7B%22feature%22=%3E%22GameEnvCapa.png%22%7D" /><media:content medium="image" url="//www.moretticb.com/%7B%22feature%22=%3E%22GameEnvCapa.png%22%7D" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Color sensor prototype using artificial neural networks</title><link href="//www.moretticb.com/blog/color-sensor-prototype-using-neural-networks/" rel="alternate" type="text/html" title="Color sensor prototype using artificial neural networks" /><published>2016-08-11T01:16:45+00:00</published><updated>2016-08-11T01:16:45+00:00</updated><id>//www.moretticb.com/blog/color-sensor-prototype-using-neural-networks</id><content type="html" xml:base="//www.moretticb.com/blog/color-sensor-prototype-using-neural-networks/"><![CDATA[<p>Colors play a very important role in human daily activities, being tied to several connotations, such as its functions in traffic lights, raiment of a particular profession, entity representation through geometric shapes, human-computer interaction concepts, and others. Such examples denote an universal language that does not sit well with idiomatic languages. 
Color blindness, however, impairs (or even eliminates) this sort of asset in communication. It can be <a href="http://www.ncbi.nlm.nih.gov/pubmedhealth/PMHT0024265/" target="_blank">explained</a> by abnormal photopigments located in cone-shaped cells within the retina, called <em>cone cells</em>.</p>

<ul id="markdown-toc">
  <li><a href="#project-overview" id="markdown-toc-project-overview">Project overview</a></li>
  <li><a href="#implementation-details" id="markdown-toc-implementation-details">Implementation details</a></li>
  <li><a href="#tests" id="markdown-toc-tests">Tests</a></li>
</ul>

<p>This article is divided into the sections listed above. If you wanna jump to the technical details, check <a href="#implementation-details">implementation details</a> or download the code at the <a href="http://www.github.com/moretticb/ColorSensor" target="_blank">GitHub</a> repository. Before continuing your reading, check the project video to see what this project is really about :)</p>

<iframe width="560" height="315" src="//www.youtube.com/embed/pc4IlCQ8TZM" frameborder="0"> </iframe>

<h2 id="project-overview">Project overview</h2>

<p>Similar to the color vision of the human eye, and likewise based on light, the RGB model comprises more than 16 million colors arranged in a 3D space, where the integer values of the components R (Red), G (Green) and B (Blue), ranging from 0 to 255, are the coordinates of this space. Based on this model, color detection and recognition were performed with light-related electronic components and machine learning mechanisms; the sensor is essentially the combination of an RGB LED and a CdS Cell (light sensor, or LDR), both isolated from ambient light. These components respectively emit and sense the intensity of each light (red, green and blue) reflected from an object of a particular color.</p>

<p>Color recognition can be performed with machine learning algorithms, such as <a href="/blog/multilayer-perceptron-implementation-in-c/">Multi-Layer Perceptron</a> (MLP) - an architecture of Artificial Neural Networks (ANN). It allows classification and recognition of spatially separable patterns - very useful in this case.</p>

<p>It is important to consider situations when the human eye unsuccessfully attempts to recognize colors due to poor ambient conditions, or even difficulties in distinguishing particular shades of colors. This is comparable to outliers or distorted patterns, which may affect precision in the recognition task; in the feature (RGB) space, a misclassified color should be interpreted as a coordinate located in a spatial region associated with another color (sometimes due to poor generalization, overfitting, and many other possibilities).</p>

<p>During training, an MLP should map the regions of the RGB color space illustrated below. Each region isolated by hyperplanes represents a color, so every new color pattern located in a particular region is classified as the respective color.</p>

<figure class="half">
	<a href="/images/colorSensorRgbCube1.png"><img src="/images/colorSensorRgbCube1.png" alt="image" /></a>
	<a href="/images/colorSensorRgbCube2.png"><img src="/images/colorSensorRgbCube2.png" alt="image" /></a>
	<figcaption>RGB color space.</figcaption>
</figure>

<h3 id="multi-layer-perceptron">Multi-Layer Perceptron</h3>

<p>The Multi-Layer Perceptron is a feedforward architecture of ANNs, with an input (non-neural) layer, hidden layers and an output layer. This network is trained by the backpropagation algorithm, performing supervised learning (learning from examples).</p>

<p>The topology configuration (i.e., the size of each layer) is defined according to the specific problem at hand. For this color sensor, as indicated in the <a href="/MTW" target="_blank">MLP Topology Workbench</a> below, the neural network receives 3 inputs (RGB values) and has one hidden layer with 6 neurons and an output layer with 10 neurons - just recalling: for a binarized output, the output layer must have as many neurons as there are classes (colors, in this case). Hidden-layer sizes are obtained empirically: ranges of values are established for each topological parameter, so that approaches for <a href="https://en.wikipedia.org/wiki/Hyperparameter_optimization">Hyperparameter optimization</a> may find potentially good results (this can be a bit difficult and slow sometimes). The network below is already trained and interactive, so it is possible to input values at the training tab and check the outputs.</p>

<iframe width="600" height="560" src="http://www.moretticb.com/MTW/Tool/embed.html#3,6,10,s,s,e1e-7,r0.1,w2.75309|-11.47226|-3.31174|16.48123|19.50701|20.83178|7.11333|-6.42349|1.90722|6.49539|-27.71213|26.22820|-0.20637|-5.72456|-22.27807|30.06561|6.13926|-10.81428|28.51313|-9.78495|6.46702|0.05500|3.73036|4.14509|2.47902|0.01300|-3.58242|-16.36439|14.13336|-5.08929|1.63749|5.89483|1.41576|-3.31553|14.81429|-20.90657|-1.56866|1.91766|4.91018|4.03942|-10.84847|-5.64168|-4.13243|10.71144|3.75994|19.50770|17.72872|-3.21024|-2.47699|8.98845|5.19683|2.63604|17.35721|2.00543|11.71339|-5.45325|-6.94032|10.75201|0.66661|-7.26608|-3.58712|-9.92182|-12.68206|-15.45614|-13.74093|0.50826|15.17941|-11.14318|-19.08512|1.25124|22.00649|-4.22733|-0.44452|3.58902|0.64966|13.67560|-13.02688|-11.22907|-15.30070|-1.71819|6.73797|-28.17680|-2.50547|5.19797|7.00798|-2.86927|3.65035|18.02920|4.09836|10.48119|-2.56631|9.92777|2.34494|4.52433|" style="max-width: 600px; width: 100%; height: 568px;" frameborder="0"></iframe>

<h3 id="color-recognition">Color recognition</h3>

<p>With the purpose of obtaining generalization with an MLP for a good recognition of RGB patterns, a training set (examples of colors with the desired output) must be presented to the network for the training step (<a href="https://github.com/moretticb/ML-Implementations/tree/master/MLP" target="_blank">download</a> the implemented architecture to perform training). The training set used in this project is available at the project’s <a href="http://www.github.com/moretticb/ColorSensor" target="_blank">GitHub</a> repository.</p>

<p>Generalization will happen within the domain that the training set comprises, so it is worth paying attention to the minimum and maximum values of each component of the space! Do not feed the network with patterns outside this domain, otherwise its output would be unpredictable.</p>

<p>The dataset (all examples) contains 75 instances of color patterns ranging from 0 to 1 (the logistic activation function was used). Initially ranging from 0 to 255, these instances were rescaled by simply dividing each value by 255, such that \( 0 \leq x_1, x_2, x_3 \leq 1 \). It is important to point out that only one neuron at the output layer must output 1, whereas the remaining ones must output zero. Of course this is not possible with a sigmoid activation function, whose outputs only approach 0 and 1 without reaching them; that’s when post-processing takes place:</p>

\[y_i^{post} = \begin{cases} 1 &amp; \text{, if }y_i=\max(y)\\ 0 &amp; \text{, otherwise} \end{cases}\]

<p>where \( y_i \) is the output of the \( i^{th} \) neuron and \( \max(y) \) is the greatest output value. In practical terms, the neuron with the greatest output gives 1 as output and the remaining ones give 0. Simple as that.</p>

<p>A trained MLP should create regions in the color space, separating color patterns, as shown in the <a href="https://p3d.in/e/7DJDC+spin+load" target="_blank">visualization</a> of the dataset instances below. Gray color instances are also included for another version of the trained network - but pretend gray instances are not there.</p>
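
<p>The rescaling and the post-processing step can be sketched in C as follows. The function names are made up for this example, and sending ties to the first maximum is an assumption, since the equation does not say how ties are handled.</p>

```c
#include <stddef.h>

/* Rescales an 8-bit colour component (0..255) to the [0,1] input range. */
double rescale_component(int value)
{
    return value / 255.0;
}

/* Post-processing: the neuron with the greatest output becomes 1 and
   all the others become 0. Ties go to the first maximum (assumption).
   Returns the index of the winning neuron. */
size_t winner_take_all(const double *y, int *y_post, size_t n)
{
    size_t winner = 0;
    size_t i;
    for (i = 1; i < n; i++)
        if (y[i] > y[winner])
            winner = i;
    for (i = 0; i < n; i++)
        y_post[i] = (i == winner) ? 1 : 0;
    return winner;
}
```

<p>The returned index is what would be mapped to a color name, since each output neuron stands for one class.</p>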

<iframe src="https://p3d.in/e/7DJDC+spin+load" width="100%" height="480" frameborder="0" seamless="" allowfullscreen="" webkitallowfullscreen=""></iframe>

<h2 id="implementation-details">Implementation details</h2>

<p>This section is divided into three parts: <a href="#electronic-circuit">electronic circuit</a>, <a href="#color-theory">color theory</a> and <a href="#programming">programming</a>. Click the links if you want to jump to a specific part.</p>

<h3 id="electronic-circuit">Electronic circuit</h3>

<p>The whole detection procedure happens in the electronic circuit, starting from the light reflected by objects and including the computation running in an ATmega328, which is mounted on an Arduino Uno. Check the schematic below to see the wiring.</p>

<figure class="half">
	<a href="/images/colorSensorSchemeAnode.png"><img src="/images/colorSensorSchemeAnode.png" alt="image" /></a>
	<a href="/images/colorSensorSchemeCathode.png"><img src="/images/colorSensorSchemeCathode.png" alt="image" /></a>
	<figcaption>Electronic circuit schemes using respectively common anode and cathode RGB LED.</figcaption>
</figure>

<p>The code follows the scheme that uses a common anode RGB LED, so check whether your RGB LED is also a common anode, otherwise just invert the logic in the code.</p>

<p>Another important detail is that I am using only one resistor with the RGB LED. Since only one color is lit at a time, I put the resistor on the common anode, with a resistance equal to the average of the resistors that <strong>should have been</strong> on the cathodes - it is lazy, I know, and I am sorry! When I went to buy project parts, they didn’t have everything I needed. It is, however, very important to use the correct resistors on the cathodes in order to have fidelity between the collected RGB values and the RGB values in the computer. The way I did it is not that bad, since the patterns are not distorted; they are just not the same colors we see on a computer screen.</p>

<p>The schematic shows the RGB LED and the CdS Cell next to each other. That is because both must be isolated from ambient light (an old black film canister is the perfect piece), so that calibration (explained in <a href="#programming">Programming</a>) and recognition can be performed.</p>

<h3 id="color-theory">Color theory</h3>

<p>The color perception performed by the electronic circuit is based on color theory concepts. Since there are no lenses (yet) involved, only objects made of opaque (and matte) material should be considered, which avoids dealing with specular reflection of the LED. Diffuse reflection, on the other hand, is the key to performing color detection with lights: incident light is reflected by irregular surfaces in many directions, without creating the glare that ruins the function of the CdS Cell.</p>

<p>Back to actual color theory, when light (of a certain color) reaches an object, it is reflected according to the properties of that object’s color. For example, a red light reaching a yellow object will be reflected according to how much red exists in the composition of that yellow color - remember, we are talking about lights! - so a lot of red light is expected to be reflected, which makes sense when we think of the RGB composition of yellow (essentially red and green). However, when a blue light reaches the yellow object, no strong reflection is expected, due to the low presence of blue in the color composition.</p>

<figure>
	<a href="/images/colorSensorDetection.png"><img src="/images/colorSensorDetection.png" alt="image" /></a>
	<figcaption>Acquisition of RGB values for detection and calibration.</figcaption>
</figure>

<p>Considering an additive color system, in which white and black are respectively the presence and absence of all colors (more details <a href="http://motion.kodak.com/KodakGCG/uploadedfiles/motion/US_plugins_acrobat_en_motion_education_colorTheory.pdf" target="_blank">here</a>), the CdS Cell can measure the maximum and minimum reflections of each light from the RGB LED that reaches colored objects. That said, it is possible to calibrate the electronic components involved in the circuit. This is another key to fidelity in detection, as well as to a stable detection of patterns (avoiding outliers) - here is a golden tip: after calibrating, try (hard!) not to move or touch either the electronic components (especially when they are placed on a breadboard) or the piece you are using (you must use one) to isolate the components from ambient light.</p>

<h3 id="programming">Programming</h3>

<p>For calibration and recognition, the color sensor executes three iterations once a colored object is exposed to the RGB LED and the CdS Cell. In the first iteration, red light hits the object and the program waits for the CdS Cell to stabilize its sensing; the analog input is then read and the reflection of the red light is stored. The program iterates twice more, for the green and blue colors. The figure shown in <a href="#color-theory">Color theory</a> gives a good visual explanation of this iterative process.</p>

<p>Concerning calibration, the iterative process mentioned above is performed twice: once for the black color and once for the white color. As explained in <a href="#color-theory">Color theory</a>, this detects the minimum and maximum reflections of the red, green and blue lights - initially from <em>near zero</em> to <em>near 1023</em>, according to the 10-bit reading resolution - obtaining the true range used to properly rescale readings to the intervals \( [0,255] \) (for informative purposes) and \( [0,1] \) (the actual input that feeds the neural network).</p>
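
<p>The rescaling itself is a min-max normalization per light, which can be sketched in plain C as below (hardware access is left out). The function names are hypothetical, and clamping readings that fall outside the calibrated range is an assumption of this example.</p>

```c
/* Rescales a raw sensor reading to [0,1] using the readings obtained
   for black (minimum reflection) and white (maximum reflection) during
   calibration. Works whether the raw reading rises or falls with
   reflected light; out-of-range readings are clamped (assumption). */
double rescale_reading(int reading, int black, int white)
{
    double x;
    if (white == black) return 0.0;        /* invalid calibration */
    x = (double)(reading - black) / (double)(white - black);
    if (x < 0.0) x = 0.0;
    if (x > 1.0) x = 1.0;
    return x;
}

/* Same value on the informative 0..255 scale, rounded to nearest. */
int to_rgb255(double x)
{
    return (int)(x * 255.0 + 0.5);
}
```

<p>One such pair of black and white readings would be kept for each of the three lights, so the red, green and blue reflections are rescaled independently.</p>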

<p>The waiting time needed for the light sensor reading to stabilize can vary with each electronic component, so it is good to use a generous delay to ensure a steady sensing. In my case, I used a 500-millisecond delay, but it is worth starting with a bigger value and then decreasing it until just before the readings become unsteady.</p>

<p>In detection, the collected RGB values - ranging from 0 to 1 - feed an MLP, which performs the actual color recognition. For the MLP running on the Arduino, I am using <a href="/Neurona" target="_blank">Neurona</a> - a library I wrote to easily use ANNs on Arduino. Check also <a href="/blog/neurona-neural-networks-for-arduino/">this post</a> for more details. For training MLPs, I am using <a href="/blog/multilayer-perceptron-implementation-in-c/">an implementation</a> in C language (the <a href="/MTW">MLP Topology Workbench</a> also does the job), which gives the adjusted weights to use with Neurona. Training used a momentum rate \( \alpha=0.8 \), a learning rate \( \eta=0.1 \) and a precision \( \epsilon=10^{-7} \).</p>

<p>Given the topology configuration and the dataset, five training runs were performed (with cross-validation), each one with a random initial state of the synaptic weights:</p>

<table class="table">
  <thead>
    <tr>
      <th style="text-align: center">Training #</th>
      <th style="text-align: center">Epochs</th>
      <th style="text-align: center">Accuracy</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: center">1</td>
      <td style="text-align: center">1565</td>
      <td style="text-align: center">67%</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2</strong></td>
      <td style="text-align: center"><strong>2353</strong></td>
      <td style="text-align: center"><strong>95%</strong></td>
    </tr>
    <tr>
      <td style="text-align: center">3</td>
      <td style="text-align: center">3315</td>
      <td style="text-align: center">80%</td>
    </tr>
    <tr>
      <td style="text-align: center">4</td>
      <td style="text-align: center">4239</td>
      <td style="text-align: center">93%</td>
    </tr>
    <tr>
      <td style="text-align: center">5</td>
      <td style="text-align: center">680</td>
      <td style="text-align: center">40%</td>
    </tr>
  </tbody>
</table>

<p>Since the dataset is balanced, accuracy was used to compare training results. Training #2 presented the best generalization, so its weights were the ones embedded along with Neurona and the Arduino program.</p>

<h2 id="tests">Tests</h2>

<p>Bringing a more practical perspective, some colors were extracted from the dataset to perform some recognition tests:</p>

<figure>
	<a href="/images/colorSensorTest.png"><img src="/images/colorSensorTest.png" alt="image" /></a>
	<figcaption>Printed samples for testing the color sensor.</figcaption>
</figure>

<p>Numbers outside the figure are used for identification and numbers inside the figure indicate misclassifications, referencing which colors were classified instead. These colors were printed in sulfite paper with an inkjet printer - check the tiny paper squares in the video in the beginning of this post - so the <em>objects</em> consist of opaque material, proper for color detection.</p>]]></content><author><name>Caio Benatti Moretti</name><email>caiodba@gmail.com</email></author><category term="blog" /><category term="machine-learning" /><category term="projects" /><summary type="html"><![CDATA[This project encompasses the development of an initial architecture of a color sensor for colorblind using artificial neural networks. A Multi-Layer Perceptron topology, properly trained with backpropagation algorithm performs mapping in the RGB color space and the recognition of 10 different colors.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="//www.moretticb.com/%7B%22feature%22=%3E%22colorSensorCapa.png%22%7D" /><media:content medium="image" url="//www.moretticb.com/%7B%22feature%22=%3E%22colorSensorCapa.png%22%7D" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Neurona - Artificial Neural Networks for Arduino</title><link href="//www.moretticb.com/blog/neurona-neural-networks-for-arduino/" rel="alternate" type="text/html" title="Neurona - Artificial Neural Networks for Arduino" /><published>2016-08-01T04:42:05+00:00</published><updated>2016-08-01T04:42:05+00:00</updated><id>//www.moretticb.com/blog/neurona-neural-networks-for-arduino</id><content type="html" xml:base="//www.moretticb.com/blog/neurona-neural-networks-for-arduino/"><![CDATA[<p>Neurona is an Arduino library which allows boards to feed Artificial Neural Network (ANN) structures in order to perform tasks such as pattern recognition (classification), non-linear regression, function approximation and time-series prediction from the implemented 
architectures:</p>

<ul>
  <li><a href="/blog/multilayer-perceptron-implementation-in-c/">Multi-Layer Perceptron (MLP)</a></li>
  <li><del>Learning Vector Quantization (LVQ-1)</del> (to be implemented)</li>
</ul>

<p>Since only the operation mode of these architectures is deployable in microcontrollers, you should also check the respective full implementations (training and operation modes) of the involved architectures; the links in the list above bring more details and the code available to download. With these programs, it is possible to train topologies of an architecture, so that the output (an array of adjusted weights) constitutes the trained network to be embedded along with an Arduino program.</p>

<h2 id="download-and-installation">Download and installation</h2>

<p>If you want to install Neurona, you can either search for Neurona in Arduino’s Library Manager or download by clicking the button below:</p>

<div><a target="_blank" href="http://www.github.com/MorettiCB/Neurona" class="btn">Download Neurona</a></div>

<p>Check also <a href="/Neurona" target="_blank">Neurona documentation</a> for code reference. <a href="http://www.arduino.cc/en/guide/libraries" target="_blank">Click here</a> if you need additional help with the installation step.</p>

<h2 id="projects">Projects</h2>

<p>An usage example of this library can be seen in the <strong><a href="/blog/color-sensor-prototype-using-neural-networks/"><u>Color sensor prototype</u></a></strong> project; Neurona performs classification of color patterns, given RGB values as input, retrieving the name of the recognized color.</p>]]></content><author><name>Caio Benatti Moretti</name><email>caiodba@gmail.com</email></author><category term="blog" /><category term="machine-learning" /><category term="projects" /><summary type="html"><![CDATA[Neurona is an Arduino library which allows boards to feed Artificial Neural Network (ANN) structures in order to perform tasks such as pattern recognition (classification), function approximation and time-series predictions.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="//www.moretticb.com/%7B%22feature%22=%3E%22neuronaCapa.png%22%7D" /><media:content medium="image" url="//www.moretticb.com/%7B%22feature%22=%3E%22neuronaCapa.png%22%7D" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Multi-Layer Perceptron - an implementation in C language</title><link href="//www.moretticb.com/blog/multilayer-perceptron-implementation-in-c/" rel="alternate" type="text/html" title="Multi-Layer Perceptron - an implementation in C language" /><published>2016-07-28T00:59:55+00:00</published><updated>2016-07-28T00:59:55+00:00</updated><id>//www.moretticb.com/blog/multilayer-perceptron-implementation-in-c</id><content type="html" xml:base="//www.moretticb.com/blog/multilayer-perceptron-implementation-in-c/"><![CDATA[<p>Artificial Neural Networks (ANNs) and the working principle of its architectures are not subjects commonly discussed (except if you are into machine learning fields) between programmers when it comes to appliable contexts, or at least not thoroughly exploited, for instance, through examples from practical perspectives.</p>

<p>Divided into three sections (<a href="#implementation-details">implementation details</a>, <a href="#usage">usage</a> and <a href="#improvements">improvements</a>), this article shares an implementation of the backpropagation algorithm for the Multi-Layer Perceptron (MLP) architecture in C, as a complement to the theory available in the literature. The implementation is available on <a href="http://www.github.com/moretticb/ML-Implementations" target="_blank">GitHub</a>.</p>

<p>Before going into the details of the implemented ANN architecture, it is important to point out that basic knowledge of MLPs and the backpropagation algorithm is required. If you are new to ANNs and MLPs, I recommend checking the <a href="https://www.amazon.com/Neural-Networks-Applications-Programming-Computation/dp/0201513765/" target="_blank">Freeman</a> and <a href="https://www.amazon.com/Neural-Networks-Learning-Machines-3rd/dp/0131471392" target="_blank">Haykin</a> references.</p>

<h2 id="implementation-details">Implementation details</h2>

<p>The implementation is based on <a href="http://laips.sel.eesc.usp.br/livrorna/" target="_blank">this</a> book (which is also a good reference, but only available in Portuguese). It is written in ANSI C and should be compiled with GCC.</p>

<p>Among the several variations of the backpropagation algorithm, this implementation uses the generalized delta rule with the momentum term in the adjustment of weights. Both training and operation modes are implemented in the same file (see the <a href="#usage">Usage</a> section for how to trigger each mode). The algorithm therefore has the following adjustable parameters:</p>

<ul>
  <li>\( \eta \) - Learning rate</li>
  <li>\( \epsilon \) - Precision (stopping criterion)</li>
  <li>\( \alpha \) - Momentum rate</li>
</ul>
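
<p>For reference, this is how the learning rate and the momentum rate enter the weight adjustment in the usual textbook statement of the generalized delta rule with momentum. The notation is an assumption made here for illustration and is not taken from the source code: \( w_{ji} \) is the weight connecting input \( i \) to neuron \( j \), \( \delta_j \) is the local gradient of neuron \( j \), \( y_i \) is the \( i \)-th input of that neuron and \( t \) counts the adjustments.</p>

<p>\[ w_{ji}(t+1) = w_{ji}(t) + \alpha \left( w_{ji}(t) - w_{ji}(t-1) \right) + \eta \, \delta_j \, y_i \]</p>

<p>With \( \alpha = 0 \) the momentum term vanishes and the expression reduces to the plain delta rule.</p>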

<p>Depending on the flags, the program prints two types of output: the adjusted weights and the mean square error (MSE) of each training epoch. The adjusted weights should be used as the input of the operation mode, where they constitute the trained MLP. The weights are ordered according to the position of the neurons in the topology (i.e., from the first neuron of the first hidden layer to the last neuron of the output layer), as indicated below:</p>

<figure>
	<a href="/images/mlpWeightOrder.png"><img src="/images/mlpWeightOrder.png" alt="image" /></a>
	<figcaption>Order of the synaptic weights printed as output.</figcaption>
</figure>
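
<p>To check that a pasted weight list has the expected length, the number of weights can be derived from the topology. The function below is only a sketch: the name <code class="language-plaintext highlighter-rouge">mlp_weight_count</code> is hypothetical, the layer sizes are assumed to list the hidden layers first and the output layer last, and whether each neuron carries an extra bias weight is an assumption exposed through the <code class="language-plaintext highlighter-rouge">bias</code> argument.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stddef.h&gt;

/* Number of weights of a fully connected MLP (hypothetical helper).
 * inputs:   number of network inputs
 * sizes:    neurons per layer, hidden layers first, output layer last (assumed)
 * n_layers: number of entries in sizes
 * bias:     1 if every neuron has an extra bias weight (assumed), 0 otherwise */
size_t mlp_weight_count(size_t inputs, const size_t *sizes,
                        size_t n_layers, int bias)
{
    size_t total = 0;
    size_t fan_in = inputs;
    size_t l;

    for (l = 0; l &lt; n_layers; l++) {
        /* every neuron of layer l receives fan_in weights, plus the bias */
        total += sizes[l] * (fan_in + (bias ? 1 : 0));
        fan_in = sizes[l];
    }
    return total;
}
</code></pre></div></div>

<p>For 2 inputs and layers of 4 and 3 neurons, this gives 27 weights with bias weights and 20 without them.</p>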

<p>The mean square error at each epoch can be plotted in order to inspect the behavior of gradient descent along the iterations of the backpropagation algorithm, as the plot below illustrates:</p>

<figure>
	<a href="/images/mlpMsePlot.png"><img src="/images/mlpMsePlot.png" alt="image" /></a>
	<figcaption>Examples of MSE behavior along iterations.</figcaption>
</figure>
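
<p>As a point of reference for the plotted quantity, the following sketch computes the mean square error of one epoch. It assumes the common convention of half the sum of squared output errors per instance, averaged over all instances; the function name <code class="language-plaintext highlighter-rouge">epoch_mse</code> and the flat row-by-row array layout are hypothetical and may differ from the actual implementation.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stddef.h&gt;

/* Mean square error over one epoch (hypothetical helper).
 * desired and actual hold n_samples * n_outputs values, row by row.
 * The 1/2 factor per instance is a common convention (assumed). */
double epoch_mse(const double *desired, const double *actual,
                 size_t n_samples, size_t n_outputs)
{
    double sum = 0.0;
    size_t k, j;

    if (n_samples == 0)
        return 0.0;
    for (k = 0; k &lt; n_samples; k++) {
        for (j = 0; j &lt; n_outputs; j++) {
            double e = desired[k * n_outputs + j] - actual[k * n_outputs + j];
            sum += 0.5 * e * e;
        }
    }
    return sum / (double)n_samples;
}
</code></pre></div></div>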

<h2 id="usage">Usage</h2>

<p>As mentioned earlier, the program has two modes: training and operation. Here is how to trigger the training mode:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">cat </span>inputFile | ./mlp <span class="nt">-i</span> INPUTS <span class="nt">-o</span> OUTPUTS <span class="nt">-l</span> LAYERS n1 n2 n3 <span class="o">[</span>-[e | E | W]]
</code></pre></div></div>

<p>where:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">inputFile</code> is the dataset to be used in training</li>
  <li><code class="language-plaintext highlighter-rouge">mlp</code> is the compiled program</li>
  <li><code class="language-plaintext highlighter-rouge">-i</code> indicates the number of <code class="language-plaintext highlighter-rouge">INPUTS</code></li>
  <li><code class="language-plaintext highlighter-rouge">-o</code> indicates the number of <code class="language-plaintext highlighter-rouge">OUTPUTS</code></li>
  <li><code class="language-plaintext highlighter-rouge">-l</code> indicates the number of <code class="language-plaintext highlighter-rouge">LAYERS</code> and their sizes</li>
  <li><code class="language-plaintext highlighter-rouge">-[e | E | W]</code>
    <ul>
      <li><code class="language-plaintext highlighter-rouge">e</code> prints the number of epochs</li>
      <li><code class="language-plaintext highlighter-rouge">E</code> prints the MSE at each epoch</li>
      <li><code class="language-plaintext highlighter-rouge">W</code> suppresses the printing of adjusted weights</li>
    </ul>
  </li>
</ul>

<p>Here is an example of a dataset to use as input in training mode, holding 6 instances; each instance consists of 2 inputs and 3 desired outputs:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>6
2.0000 1.0000 1 0 0
7.0050 0.7500 1 0 0
2.0001 0.3240 0 1 0
0.0040 0.2380 0 1 0
7.0050 0.7500 0 0 1
2.0003 2.0001 0 0 1
</code></pre></div></div>

<p>Note that, in this example, the numbers without a decimal point are the desired outputs for classification. It is also possible to use real-valued desired outputs in order to perform regression or time-series prediction.</p>
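
<p>The dataset layout shown above (an instance count followed by one instance per line) can be read with a few lines of C. This is a minimal sketch under that assumption, not the parser used by the program; <code class="language-plaintext highlighter-rouge">read_dataset</code> is a hypothetical name.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;

/* Reads a dataset: first the instance count, then one instance per line
 * with n_in inputs followed by n_out desired outputs.
 * Returns a malloc'd array of count * (n_in + n_out) doubles
 * (the caller frees it), or NULL on malformed input. */
double *read_dataset(FILE *fp, int n_in, int n_out, int *count)
{
    int width = n_in + n_out;
    int total, i;
    double *data;

    if (fscanf(fp, "%d", count) != 1 || *count &lt;= 0 || width &lt;= 0)
        return NULL;
    total = *count * width;
    data = malloc((size_t)total * sizeof *data);
    if (data == NULL)
        return NULL;
    for (i = 0; i &lt; total; i++) {
        if (fscanf(fp, "%lf", &amp;data[i]) != 1) {
            free(data);
            return NULL;
        }
    }
    return data;
}
</code></pre></div></div>

<p>For the example above, calling <code class="language-plaintext highlighter-rouge">read_dataset(stdin, 2, 3, &amp;count)</code> would yield 6 instances of 5 values each.</p>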

<p>Having the output of the training mode (the adjusted weights), here is how to trigger the operation mode:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./mlp <span class="nt">-i</span> INPUTS <span class="nt">-o</span> OUTPUTS <span class="nt">-l</span> LAYERS n1 n2 nLAYERS <span class="nt">-w</span>
</code></pre></div></div>

<p>where:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">-w</code> indicates the insertion of the adjusted weights</li>
</ul>

<p>Runtime messages guide the user through entering the weights and the network inputs in order to obtain the corresponding output.</p>

<p>You can also use the <a href="/MTW" target="_blank">MLP Topology Workbench</a> either to generate commands to run the compiled program in training and/or operation modes (command tab), or as an alternative tool for training and/or operation modes:</p>

<iframe width="600" height="560" src="http://www.moretticb.com/MTW/Tool/embed.html" style="max-width: 600px; width: 100%; height: 568px;" frameborder="0"></iframe>

<h2 id="improvements">Improvements</h2>

<p>This implementation focused only on algorithmic fidelity (for didactic purposes), so several improvements (of which I list a few) are still needed to turn it into an optimized and easy-to-use tool:</p>

<ul>
  <li>Better memory management: the number of dynamic memory allocations could be reduced (or even replaced with static allocation approaches). None of the allocated memory is freed in this program (sorry, I had a tight deadline to finish this implementation).
    <ul>
      <li>Operation mode (the forward step) was optimized with statically allocated memory in <a href="/blog/neurona-neural-networks-for-arduino/">Neurona</a> – a project involving MLP for AVR microcontrollers</li>
    </ul>
  </li>
  <li>Parameterize (at program execution) algorithm parameters such as \( \eta \), \( \epsilon \), \( \alpha \) and the activation function to be used.</li>
  <li>Implement stopping criterion by epochs (also parameterizable).</li>
  <li>Add a new parameter to set a random seed, so that outputs/outcomes become reproducible.</li>
  <li>Improve input buffer reading, enabling batch-like feeding of several instances in operation mode.</li>
</ul>
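
<p>As an illustration of the stopping criterion by epochs, the sketch below combines an epoch limit with a precision check. It is not code from the repository: <code class="language-plaintext highlighter-rouge">should_stop</code> and <code class="language-plaintext highlighter-rouge">max_epochs</code> are hypothetical names, and the precision check is assumed to compare the MSE of two consecutive epochs against \( \epsilon \).</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/* Returns 1 when training should stop, 0 otherwise (hypothetical helper).
 * Stops when the MSE changed by at most epsilon between two consecutive
 * epochs (assumed precision criterion) or when max_epochs was reached. */
int should_stop(double prev_mse, double cur_mse, double epsilon,
                unsigned long epoch, unsigned long max_epochs)
{
    double diff = cur_mse - prev_mse;

    if (diff &lt; 0.0)
        diff = -diff;
    if (diff &lt;= epsilon)
        return 1;
    return epoch &gt;= max_epochs;
}
</code></pre></div></div>

<p>The epoch limit guarantees termination even when the MSE keeps oscillating and never settles within the chosen precision.</p>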

<p><a href="http://www.github.com/moretticb/ML-Implementations" target="_blank">Fork me on GitHub</a> if you’re keen on MLPs and ANNs and liked this project :)</p>]]></content><author><name>Caio Benatti Moretti</name><email>caiodba@gmail.com</email></author><category term="blog" /><category term="machine-learning" /><summary type="html"><![CDATA[Divided in three sections (implementation details, usage and improvements), this article has the purpose of sharing an implementation of the backpropagation algorithm of a Multi-Layer Perceptron Artificial Neural Network as a complement to the theory available in the literature.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="//www.moretticb.com/%7B%22feature%22=%3E%22mlpCapa.png%22%7D" /><media:content medium="image" url="//www.moretticb.com/%7B%22feature%22=%3E%22mlpCapa.png%22%7D" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>