This demo shows the HTM temporal memory algorithm operating on a simple 1D domain containing only four possible input states, the letters A, C, G, and T. Three sensor patches move over the domain. Each sensor patch consists of five sensors, each of which directly encodes the input state by activating one of four neurons (bits) associated with that sensor. The last three neurons associated with each patch encode the patch's next movement in one of three bits: left, stay, or right. These inputs are then incorporated into three temporal memory modules with eight neurons per column.
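A minimal sketch of this patch encoding (the function name and bit layout are illustrative assumptions, not the demo's actual code) might look like:

```python
import numpy as np

STATES = "ACGT"
MOVES = ("left", "stay", "right")

def encode_patch(window, next_move):
    """window: the 5 letters currently under the patch; next_move: one of MOVES."""
    bits = np.zeros(5 * len(STATES) + len(MOVES), dtype=np.uint8)
    for i, letter in enumerate(window):
        bits[i * len(STATES) + STATES.index(letter)] = 1   # one of four bits per sensor
    bits[5 * len(STATES) + MOVES.index(next_move)] = 1     # one of three movement bits
    return bits

print(encode_patch("GATAC", "right"))   # 23-bit input vector, 6 bits active
```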
This demo is a variation on the GATACA example above. We take the same 1D domain with four possible input features at each location. The top line is the input domain. The three lines below show the target pattern and current input to each of the sensor patches. For each sensor patch, there are 20 pre-synaptic neurons (not shown); of these, 5 will be active during each cycle (corresponding to the currently active detector in each sensor). The graphs shown below are visualizations of dendrites for three post-synaptic neurons. Each dendrite has 20 synapses, one for each detector on each sensor. Each synapse has an associated weight and position on the dendrite. The synaptic weights are indicated by the vertical bars along the dendrite. Activated synapses generate a localized effect on the dendrite that falls off with distance from the synapse location (indicated by the Gaussian bump centered on each active synapse). The learning rule follows the one described in this paper by Toviah Moldwin et al. The dendrite activation is depicted by the thicker plot line and is then integrated into the bar on the far right. The white horizontal line on this bar is the post-synaptic neuron's firing threshold. These plots take on different colors depending on the current state: green for successful detection of the target pattern (true positive), cyan for failure to detect the target pattern (false negative), red for detection of the pattern when it is not present (false positive), and gray for successful non-detection of the pattern (true negative).
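The dendritic computation described above can be sketched roughly as follows; the function name, spread constant, and integration step are assumptions for illustration rather than the demo's actual implementation:

```python
import numpy as np

def dendrite_activation(positions, weights, active, x, sigma=0.05, threshold=1.0):
    """positions, weights: one entry per pre-synaptic detector (20 here);
    active: boolean mask of currently active detectors; x: points along the dendrite."""
    trace = np.zeros_like(x)
    for p, w in zip(positions[active], weights[active]):
        trace += w * np.exp(-((x - p) ** 2) / (2 * sigma ** 2))  # Gaussian bump per active synapse
    total = np.trapz(trace, x)                 # integrated value shown as the bar on the right
    return trace, total, total >= threshold    # thicker plot line, bar value, fired?

x = np.linspace(0.0, 1.0, 200)
rng = np.random.default_rng(0)
positions, weights = rng.random(20), rng.random(20)
active = rng.random(20) < 0.25                 # roughly 5 of 20 detectors active
trace, total, fired = dendrite_activation(positions, weights, active, x)
```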
Prototype visualization of a simple agent in a simple environment. The agent possesses two cameras for visual input. The RGB channels are digitized very coarsely and projected onto a pair of virtual retinas. This is, of course, not how processing actually occurs in the retina; a more realistic retinal model is currently on the TODO list. The purpose of this visualization was to prototype a potential interactive application that could show the initial stages of encoding and processing stereo vision.
Another candidate for a simple embodied agent: a spherical rat in a maze. This demo only got as far as implementing basic collision physics before getting bogged down in non-AI details. Going forward, I will probably utilize an existing physics engine and focus on how the agent generates movement and receives sensory feedback from its environment.
This demo is mostly pretty flashing lights, demonstrating one potential way to visualize the inner workings of a single cortical column. There was a half-hearted attempt to implement a temporal memory algorithm, and you can kind of see it working in the shifting of the neurons from red (active-bursting) to blue (predicted) and green (active-predicted). However, the proximal inputs at the lowest level are essentially random, so no meaningful learning is taking place.
Atoms in the dictionary are initialized by sub-sampling from a set of random images in the training set. Thereafter these atoms are used as an overcomplete basis set to encode portions of subsequent images. The encoding selects the best atom by direct projection (dot product of image and basis atom) to obtain a correlation coefficient. The product of this coefficient and the basis atom is subtracted from the image leaving a residual. This residual is then subjected to the same procedure to select the next atom that best captures the image features that were not present in the first atom. This continues until the atom limit is reached or the magnitude of the residual falls below a minimum threshold. The reconstructed image is then displayed along with the residual.
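The encoding loop described above is essentially matching pursuit. A rough sketch (the function name and stopping constants are illustrative assumptions, not the demo's actual code):

```python
import numpy as np

def encode(image, atoms, max_atoms=10, min_residual=1e-2):
    """image: flattened patch; atoms: (n_atoms, patch_size) overcomplete basis set."""
    residual = image.astype(float).copy()
    reconstruction = np.zeros_like(residual)
    for _ in range(max_atoms):
        coeffs = atoms @ residual                 # direct projection onto every atom
        best = np.argmax(np.abs(coeffs))          # atom that best matches the residual
        reconstruction += coeffs[best] * atoms[best]
        residual -= coeffs[best] * atoms[best]    # remove what this atom explains
        if np.linalg.norm(residual) < min_residual:
            break
    return reconstruction, residual

rng = np.random.default_rng(1)
atoms = rng.standard_normal((256, 64))
atoms /= np.linalg.norm(atoms, axis=1, keepdims=True)   # unit-norm basis atoms
recon, resid = encode(rng.standard_normal(64), atoms)
```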
NOTE: This demo is not currently learning or adapting the atoms after the initial sampling stage. This simple choice for the basis set yields some fairly impressive results, which can best be appreciated by comparing them to the reconstructions that result if you enable the "random atoms" checkbox in the menu.
Prototype of a visualization of a low-level visual encoding strategy. At the top of the window, a sequence of MNIST digits is displayed with an overlay of a stencil showing the proximal receptive fields for a set of cortical columns. The size of the stencil can be controlled through the colRadius parameter.
The primary visualization is a 3D depiction of the encoding of features associated with each column's receptive field. Each cortical column is composed of 19 mini-columns rendered as three concentric rings (1+6+12). The intensity of each mini-column corresponds to the strength with which the input field matches one of the nineteen Gabor filters (shown in the lower left corner).**
** If the Gabor field is unchecked, then a simpler set of 6 filters is used: centerOn, centerOff, xSobel, ySobel, xScharr, and yScharr.
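A rough sketch of this per-mini-column response computation (the particular Gabor bank construction and function names are assumptions chosen here to yield nineteen filters, not the demo's actual parameters):

```python
import numpy as np

def gabor(size, theta, freq, sigma):
    """Small Gabor kernel: isotropic Gaussian envelope times an oriented cosine carrier."""
    y, x = np.mgrid[-size // 2 + 1: size // 2 + 1, -size // 2 + 1: size // 2 + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * xr)

def minicolumn_intensities(field, filters):
    """field: a column's receptive-field patch; filters: the bank of 19 kernels."""
    flat = field.ravel()
    return np.array([abs(flat @ f.ravel()) for f in filters])   # dot product per mini-column

size = 9
bank = [gabor(size, theta, freq, sigma=3.0)
        for freq in (0.15, 0.3)                                   # two spatial frequencies
        for theta in np.linspace(0, np.pi, 9, endpoint=False)]    # nine orientations
bank.append(np.ones((size, size)) / size**2)                      # plus a DC filter -> 19 total
intensities = minicolumn_intensities(np.random.default_rng(2).random((size, size)), bank)
```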
A more detailed stereo vision prototype. This one is designed to explore how to create a sensor-motor feedback loop for aligning two independent retinal sensor patches on the same location in the input field.
In this demo, two retinal patches are overlaid on an input field consisting of a sequence of colored MNIST digits. Each patch consists of multiple retinal sensors covering a fixed spatial extent. The individual circles rendered on the sequence display the receptive field of each retinal sensor.
The main portion of the display window shows the cortical regions associated with the two retinal patches. Each region consists of a cortical column for each retinal sensor. Within each column are multiple minicolumns. Each minicolumn is proximally connected to the column's sensor via a log-Gabor convolutional filter. This filter is currently a stand-in for what will eventually become an adaptive (Hebbian) filter. The minicolumn with the greatest filter response (dot product of the receptive field with the log-Gabor filter) fires first and then decays over time. While it is fading, the next-best filter match has the opportunity to fire. This process continues until either no remaining filter exceeds the activation threshold or one of the previously activated filters completes its refractory period and is ready to fire again.
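A simplified sketch of this competition (the threshold, decay, and refractory constants are assumptions for illustration, not the demo's actual values):

```python
import numpy as np

def firing_order(responses, threshold=0.2, decay=0.8, refractory=5):
    """responses: per-minicolumn dot products with the log-Gabor filter bank."""
    level = np.asarray(responses, dtype=float).copy()
    cooldown = np.zeros(level.size, dtype=int)      # steps left in each refractory period
    fired_once = np.zeros(level.size, dtype=bool)
    order = []
    while True:
        if np.any(fired_once & (cooldown == 0)):    # an earlier winner is ready to fire again
            break
        eligible = (level >= threshold) & (cooldown == 0)
        if not eligible.any():                      # nothing left above the activation threshold
            break
        winner = int(np.argmax(np.where(eligible, level, -np.inf)))
        order.append(winner)
        fired_once[winner] = True
        cooldown[winner] = refractory
        level *= decay                              # earlier activations fade over time
        cooldown[cooldown > 0] -= 1
    return order

print(firing_order([0.9, 0.6, 0.35, 0.1]))          # e.g. [0, 1, 2]
```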
NOTE: Some of the controls are currently inactive as the backend functionality has not yet been completed.
Inspired by a question asked on the HTM Forum, this example seeks to answer: "Can HTM learn Conway's Game of Life?"
This is a work in progress. Check back soon for future updates.