Input

How Computers "See"

To a human, an image is a meaningful picture. To a computer, it is just a grid of numbers, one per pixel. In a black-and-white image, 0 = black and 1 = white.
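To make the "grid of numbers" idea concrete, here is a minimal sketch using plain Python lists. The 5x5 letter "L" and the variable names are illustrative assumptions, not data from the interactive demo.

```python
# A hypothetical 5x5 black-and-white image of the letter "L".
# 0 = black background, 1 = white stroke.
image = [
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
]

height = len(image)
width = len(image[0])
print(f"{height}x{width} grid, {height * width} pixels")
# Output: 5x5 grid, 25 pixels

# Render the numbers the way a human would see them.
for row in image:
    print("".join("#" if pixel == 1 else "." for pixel in row))
```

The computer only ever works with the nested list; the `#` rendering is there for human readers.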

What is a CNN?

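CNN stands for Convolutional Neural Network. A standard neural network requires flat inputs (vectors), and flattening an image right away destroys its shape: pixels that sat above and below each other end up far apart in the list. A CNN preserves the 2D structure. It scans the grid to find *spatial features* (lines, shapes) before compacting them into a vector.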

1. The Feature Extractor (Convolution)

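The CNN slides a small 'Filter' (like a flashlight) over the image. Each filter looks for one specific pattern, such as an edge, a curve, or a corner. At every position it multiplies its values by the pixels underneath and adds up the results. Where the patch matches the pattern, that sum is a high number in the 'Feature Map'.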
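The sliding-filter step can be sketched in a few lines of plain Python. This is a minimal illustration assuming no padding and a stride of 1; the `convolve2d` helper, the 4x4 image, and the `vertical_filter` values are hypothetical and chosen only to show a strong match next to a weak one.

```python
def convolve2d(image, kernel):
    """Slide kernel over image (no padding, stride 1) and return the feature map."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    feature_map = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            # Multiply the filter by the patch underneath it and sum the products.
            total = 0
            for i in range(k_h):
                for j in range(k_w):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 image: a vertical white stroke in the second column.
image = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

# Hypothetical vertical-line filter: rewards a bright center column,
# penalizes bright pixels on either side.
vertical_filter = [
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]

feature_map = convolve2d(image, vertical_filter)
print(feature_map)
# Output: [[6, -3], [6, -3]]
```

The filter scores 6 where it sits centered on the stroke and -3 where the stroke falls under its negative edge, so the feature map records both where the pattern is and where it is not.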

2. Removing the Darkness (ReLU)

The filter might return negative numbers (places where the image shows the opposite of the pattern). We don't care about those. We apply an Activation Function called ReLU, which computes max(0, x): it sets all negative values to zero and keeps the positive matches unchanged.
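As a minimal sketch, ReLU is a single `max(0, x)` applied to every cell. The `relu` helper and the sample values below are illustrative assumptions.

```python
def relu(feature_map):
    """Replace every negative value with 0; leave positive values untouched."""
    return [[max(0, value) for value in row] for row in feature_map]


# Hypothetical feature map containing positive and negative matches.
feature_map = [
    [6, -3],
    [-1, 4],
]

activated = relu(feature_map)
print(activated)
# Output: [[6, 0], [0, 4]]
```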

3. Summarizing (Pooling)

The feature map is still too big, so we shrink it. 'Max Pooling' splits the map into small areas (for example 2x2) and keeps only the biggest number from each one. The result summarizes the most important features (the loudest matches).
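Here is a rough sketch of max pooling, assuming non-overlapping 2x2 windows and a map whose sides divide evenly by the window size. The `max_pool` helper and the 4x4 values are hypothetical.

```python
def max_pool(feature_map, size=2):
    """Non-overlapping size x size max pooling.

    Assumes the map's height and width are multiples of size.
    """
    pooled = []
    for top in range(0, len(feature_map), size):
        row = []
        for left in range(0, len(feature_map[0]), size):
            # Collect every value in the current window and keep the largest.
            window = [
                feature_map[top + i][left + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map (already non-negative, as after ReLU).
feature_map = [
    [1,  3, 0, 0],
    [10, 4, 0, 0],
    [5,  0, 2, 1],
    [1,  2, 0, 0],
]

pooled = max_pool(feature_map)
print(pooled)
# Output: [[10, 0], [5, 2]]
```

Sixteen numbers shrink to four, and each survivor is the strongest match from its quarter of the map.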

4. Converting to Vector (Flatten)

Finally! We need to turn this 2D map into a 1D list of numbers (a Vector) so our classifier can read it. We essentially 'unroll' the grid row by row. Now the image is just a list of features: [Edge, Curve, Corner...].

5. The Prediction

The vector goes into a standard Neural Network (Fully Connected Layers). It weighs the features against patterns it learned during training and gives each possible answer a score. "Vertical Line + Horizontal Base = Letter L!"
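The scoring idea can be sketched as one tiny fully connected layer. Everything below is an assumption made for illustration: the three feature names, the hand-picked weights (a real network learns them during training), and the `dense_layer` helper.

```python
# Hypothetical feature vector: [vertical line, horizontal base, horizontal top].
# 1.0 means the feature was found, 0.0 means it was not.
vector = [1.0, 1.0, 0.0]

# Hypothetical hand-picked weights, one row per shape.
# A positive weight means "this feature is evidence for the shape".
weights = {
    "L":   [1.0,  1.0, -1.0],
    "T":   [1.0, -1.0,  1.0],
    "Box": [0.5,  0.5,  0.5],
}
biases = {"L": 0.0, "T": 0.0, "Box": 0.0}


def dense_layer(vector, weights, biases):
    """One fully connected layer: a weighted sum of the features per class, plus a bias."""
    return {
        label: sum(w * x for w, x in zip(class_weights, vector)) + biases[label]
        for label, class_weights in weights.items()
    }


scores = dense_layer(vector, weights, biases)
prediction = max(scores, key=scores.get)

print(scores)
# Output: {'L': 2.0, 'T': 0.0, 'Box': 1.0}
print(prediction)
# Output: L
```

With these weights, a vertical line plus a horizontal base scores highest for "L", while adding a horizontal top (`[1.0, 1.0, 1.0]`) would tip the result to "Box".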

Interactive Pipeline

You are now the AI Engineer. 1. Change the Input Image (L, T, Box). 2. Change the Filter. Watch how the Vector changes, and how the simplified AI predicts the shape!

The "Vector" Moment

Flattening is the bridge between Computer Vision and standard Classification.

flatten.py
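# Assumes NumPy: plain Python lists have no .flatten() method
import numpy as np

# Input: a 2D feature map (e.g., 2x2)
feature_map = np.array([
    [10, 0],
    [5,  2],
])

# Flattening: unroll the grid row by row into a 1D vector
vector = feature_map.flatten()

print(vector.tolist())
# Output: [10, 0, 5, 2]

# Placeholder dense layer with made-up weights (a real layer learns them)
def dense_layer(vector):
    weights = np.array([0.1, 0.2, 0.3, 0.4])
    return float(vector @ weights)

# Now we feed this to a dense layer!
output = dense_layer(vector)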
Flattening Operation
AlgoAnimator: Interactive Data Structures