Teaching Robots to "See" and Understand Images
Just like how your phone camera takes pictures, robots can use cameras to "see" their environment. Digital images are made up of tiny squares called pixels, and each pixel has a color value that the robot can read and analyze.
Every pixel in a color image has three values: Red, Green, and Blue (RGB). Each value ranges from 0 to 255. For example, pure red would be (255, 0, 0), pure white would be (255, 255, 255), and black would be (0, 0, 0).
Images use a coordinate system where (0,0) is typically at the top-left corner. The x-coordinate increases as you move right, and the y-coordinate increases as you move down. This is like a grid system for locating specific pixels.
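To make the grid idea concrete, here is a minimal sketch of how an (x, y) coordinate could map to a position in memory. It assumes the image is stored as one flat array with rows running top to bottom and 3 bytes (R, G, B) per pixel; that layout, and the names `RgbPixel`, `pixelIndex`, and `readPixel`, are illustrative assumptions rather than anything your camera library necessarily provides.

```cpp
#include <stdint.h>

// One color pixel: three values, each 0-255.
struct RgbPixel {
    uint8_t red;
    uint8_t green;
    uint8_t blue;
};

// Convert an (x, y) coordinate into the pixel's position in a flat array.
// y picks the row (counted from the top), x picks the column in that row.
long pixelIndex(int x, int y, int imageWidth) {
    return (long)y * imageWidth + x;
}

// Read one pixel from a flat buffer that stores 3 bytes per pixel (assumed layout).
// Returns false if (x, y) lies outside the image.
bool readPixel(const uint8_t *buffer, int imageWidth, int imageHeight,
               int x, int y, RgbPixel &out) {
    if (x < 0 || x >= imageWidth || y < 0 || y >= imageHeight) {
        return false;
    }
    long offset = pixelIndex(x, y, imageWidth) * 3;  // 3 bytes per pixel
    out.red = buffer[offset];
    out.green = buffer[offset + 1];
    out.blue = buffer[offset + 2];
    return true;
}
```

In a 320-pixel-wide image, the pixel at (5, 2) is number 2 * 320 + 5 = 645 in the array, because two full rows come before it.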
// Simple image processing with a camera
class SimpleVision {
private:
int imageWidth = 320;
int imageHeight = 240;
public:
// Get the brightness of a pixel (0-255)
int getPixelBrightness(int red, int green, int blue) {
// Simple average of RGB values
return (red + green + blue) / 3;
}
// Check if a pixel is mostly red
bool isRedPixel(int red, int green, int blue) {
return (red > 150 && red > green && red > blue);
}
// Find the center point of the image
void getImageCenter(int &centerX, int &centerY) {
centerX = imageWidth / 2;
centerY = imageHeight / 2;
}
// Simple brightness adjustment
int adjustBrightness(int pixelValue, int adjustment) {
int newValue = pixelValue + adjustment;
if (newValue > 255) newValue = 255;
if (newValue < 0) newValue = 0;
return newValue;
}
};
Raw camera images can be noisy or have poor lighting. We can process these images to make them easier for the robot to understand. This includes adjusting brightness, increasing contrast, and filtering out unwanted details.
Brightness makes the whole image lighter or darker by adding or subtracting the same amount from all pixel values. Contrast makes the difference between light and dark areas more pronounced, helping important features stand out.
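As a rough illustration of the difference, the sketch below applies each adjustment to a whole array of gray values instead of a single pixel. The function names (`clampToByte`, `applyBrightness`, `applyContrast`) and the one-byte-per-pixel buffer are assumptions made for this example.

```cpp
#include <stdint.h>

// Keep a value inside the valid 0-255 pixel range.
int clampToByte(int value) {
    if (value > 255) return 255;
    if (value < 0) return 0;
    return value;
}

// Brightness: add the same delta to every pixel,
// so the whole image gets lighter or darker.
void applyBrightness(uint8_t *pixels, int count, int delta) {
    for (int i = 0; i < count; i++) {
        pixels[i] = (uint8_t)clampToByte(pixels[i] + delta);
    }
}

// Contrast: scale each pixel's distance from middle gray (128).
// A factor above 1 pushes values apart; below 1 pulls them together.
void applyContrast(uint8_t *pixels, int count, float factor) {
    for (int i = 0; i < count; i++) {
        float stretched = 128.0f + (pixels[i] - 128) * factor;
        pixels[i] = (uint8_t)clampToByte((int)stretched);
    }
}
```

Take a dark pixel of 100 and a light pixel of 160. Adding 40 of brightness gives 140 and 200, so the gap between them stays at 60. A contrast factor of 1.5 gives 86 and 176, so the gap grows to 90.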
Sometimes it's easier to work with grayscale images, which store only how bright each pixel is instead of its color. Converting to grayscale simplifies processing and can make it easier to detect edges and shapes. A simple way to do this is to average the red, green, and blue values of each pixel.
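Here is one way the grayscale idea could feed into edge detection: convert a single row of the image to gray, then walk along it looking for a big jump in brightness. This is only a sketch; the 3-bytes-per-pixel row layout and the names `toGray`, `rowToGray`, and `findFirstEdge` are assumptions for the example.

```cpp
#include <stdint.h>
#include <stdlib.h>

// Simple-average grayscale: one brightness value from three color values.
uint8_t toGray(uint8_t red, uint8_t green, uint8_t blue) {
    return (uint8_t)((red + green + blue) / 3);
}

// Convert one row of RGB pixels (3 bytes each, assumed layout) to gray values.
void rowToGray(const uint8_t *rgbRow, uint8_t *grayRow, int width) {
    for (int x = 0; x < width; x++) {
        grayRow[x] = toGray(rgbRow[3 * x], rgbRow[3 * x + 1], rgbRow[3 * x + 2]);
    }
}

// Return the x position of the first pixel whose brightness differs from its
// left neighbor by more than threshold, or -1 if the row has no such jump.
int findFirstEdge(const uint8_t *grayRow, int width, int threshold) {
    for (int x = 1; x < width; x++) {
        if (abs(grayRow[x] - grayRow[x - 1]) > threshold) {
            return x;
        }
    }
    return -1;
}
```

With gray values 30, 30, 210, 255 and a threshold of 50, the first edge is reported at x = 2, where the brightness jumps from 30 to 210.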
// Simple image enhancement for robot vision
#include <stdlib.h>  // abs (already available in Arduino sketches)
class ImageProcessor {
private:
int clampValue(int value) {
if (value > 255) return 255;
if (value < 0) return 0;
return value;
}
public:
// Convert RGB to grayscale
int convertToGray(int red, int green, int blue) {
return (red + green + blue) / 3;
}
// Adjust image brightness
int adjustBrightness(int pixelValue, int brightnessDelta) {
return clampValue(pixelValue + brightnessDelta);
}
// Increase contrast
int adjustContrast(int pixelValue, float contrastFactor) {
// Contrast around middle gray (128)
int adjusted = (int)(128 + (pixelValue - 128) * contrastFactor);
return clampValue(adjusted);
}
// Simple noise reduction (average with neighbors)
int reduceNoise(int currentPixel, int neighbor1, int neighbor2) {
return (currentPixel + neighbor1 + neighbor2) / 3;
}
// Detect if there's a strong edge (big difference in brightness)
bool detectEdge(int pixel1, int pixel2, int threshold) {
int difference = abs(pixel1 - pixel2);
return difference > threshold;
}
};
One of the most useful computer vision skills is helping robots identify specific colors and shapes. This allows them to find objects, follow colored lines, or avoid obstacles of certain colors.
Instead of looking for exact colors, we usually look for ranges. For example, "red" might include any pixel where red is greater than 150, and green and blue are both less than 100. This accounts for lighting variations and camera differences.
We can detect simple shapes by looking for patterns in the edges and corners. A square has four corners and four equal sides, while a circle has no corners and a consistent curved edge. These patterns can be detected using basic counting and measurement techniques.
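One counting-and-measuring approach can be sketched as follows. Note that it swaps in a simpler technique than corner counting: it compares how many pixels an object covers with the size of the box drawn around it. A filled, upright square fills nearly all of its box, while a filled circle fills roughly π/4 (about 0.79) of it, and somewhat less when the circle is small and blocky. The thresholds, the `ShapeGuess` names, and the one-byte-per-pixel mask are all assumptions chosen for illustration, not tuned values.

```cpp
#include <stdint.h>

enum ShapeGuess { SHAPE_NONE, SHAPE_SQUARE, SHAPE_CIRCLE, SHAPE_OTHER };

// mask holds width * height bytes: nonzero where a pixel matched the target
// color, zero elsewhere. Assumes one filled object, not rotated.
ShapeGuess guessShape(const uint8_t *mask, int width, int height) {
    int minX = width, maxX = -1, minY = height, maxY = -1;
    long count = 0;

    // Count matching pixels and track the box that surrounds them.
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            if (mask[(long)y * width + x]) {
                count++;
                if (x < minX) minX = x;
                if (x > maxX) maxX = x;
                if (y < minY) minY = y;
                if (y > maxY) maxY = y;
            }
        }
    }
    if (count == 0) return SHAPE_NONE;

    int boxWidth = maxX - minX + 1;
    int boxHeight = maxY - minY + 1;
    float aspect = (float)boxWidth / boxHeight;
    float fill = (float)count / ((float)boxWidth * boxHeight);

    // Squares and circles both fit in a roughly square box.
    if (aspect < 0.8f || aspect > 1.25f) return SHAPE_OTHER;
    if (fill > 0.9f) return SHAPE_SQUARE;    // box almost completely filled
    if (fill > 0.65f) return SHAPE_CIRCLE;   // corners of the box are empty
    return SHAPE_OTHER;
}
```

A rotated square or a hollow outline would fool this check, which is one reason edge and corner patterns are still worth examining.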
// Simple color and object detection
#include <math.h>  // sqrt (already available in Arduino sketches)
class ObjectDetector {
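public:
// ColorRange and the preset ranges are public so that code outside the
// class can pass them to the methods below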
struct ColorRange {
int minRed, maxRed;
int minGreen, maxGreen;
int minBlue, maxBlue;
};
ColorRange redRange = {150, 255, 0, 100, 0, 100};
ColorRange blueRange = {0, 100, 0, 100, 150, 255};
ColorRange greenRange = {0, 100, 150, 255, 0, 100};
public:
// Check if a pixel matches a specific color range
bool isColorInRange(int red, int green, int blue, ColorRange range) {
return (red >= range.minRed && red <= range.maxRed &&
green >= range.minGreen && green <= range.maxGreen &&
blue >= range.minBlue && blue <= range.maxBlue);
}
// Find the center of a colored object
bool findColorCenter(int imageWidth, int imageHeight,
int &centerX, int &centerY, ColorRange targetColor) {
// long avoids overflow on boards where int is only 16 bits
long totalX = 0, totalY = 0, pixelCount = 0;
// This would scan through the actual image pixels
// For demo purposes, we'll show the logic
for (int y = 0; y < imageHeight; y++) {
for (int x = 0; x < imageWidth; x++) {
// Get pixel color at (x,y) - this would come from camera
// int red = getPixelRed(x, y);
// int green = getPixelGreen(x, y);
// int blue = getPixelBlue(x, y);
// if (isColorInRange(red, green, blue, targetColor)) {
// totalX += x;
// totalY += y;
// pixelCount++;
// }
}
}
if (pixelCount > 50) { // Need at least 50 pixels for valid object
centerX = totalX / pixelCount;
centerY = totalY / pixelCount;
return true;
}
return false;
}
// Simple object tracking
float calculateDistance(int x1, int y1, int x2, int y2) {
// Use float so the squares cannot overflow on boards where int is 16 bits
float deltaX = x2 - x1;
float deltaY = y2 - y1;
return sqrt(deltaX * deltaX + deltaY * deltaY);
}
};
Build a basic computer vision system that can detect and track colored objects using your miniAuto robot's camera.
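As a possible starting point, the center-finding logic can be tried on an image held in memory before connecting it to a real camera. The sketch below assumes the image is a flat array with 3 bytes (R, G, B) per pixel; how the miniAuto camera actually delivers pixels is not covered here, so `findColorCenterInBuffer`, `matchesRange`, and the `minPixels` parameter are hypothetical names for this example.

```cpp
#include <stdint.h>

struct ColorRange {
    int minRed, maxRed;
    int minGreen, maxGreen;
    int minBlue, maxBlue;
};

// True when all three channels fall inside the range.
bool matchesRange(int red, int green, int blue, const ColorRange &range) {
    return red >= range.minRed && red <= range.maxRed &&
           green >= range.minGreen && green <= range.maxGreen &&
           blue >= range.minBlue && blue <= range.maxBlue;
}

// Average the positions of all matching pixels to estimate the object's center.
// rgb is assumed to hold imageWidth * imageHeight pixels, 3 bytes each.
// Returns false when no more than minPixels pixels match.
bool findColorCenterInBuffer(const uint8_t *rgb, int imageWidth, int imageHeight,
                             const ColorRange &target, long minPixels,
                             int &centerX, int &centerY) {
    long totalX = 0, totalY = 0, pixelCount = 0;
    for (int y = 0; y < imageHeight; y++) {
        for (int x = 0; x < imageWidth; x++) {
            const uint8_t *pixel = rgb + ((long)y * imageWidth + x) * 3;
            if (matchesRange(pixel[0], pixel[1], pixel[2], target)) {
                totalX += x;
                totalY += y;
                pixelCount++;
            }
        }
    }
    if (pixelCount > minPixels) {
        centerX = (int)(totalX / pixelCount);
        centerY = (int)(totalY / pixelCount);
        return true;
    }
    return false;
}
```

To track an object, one option is to compare the returned centerX with the middle of the image (imageWidth / 2): a smaller value suggests the object is to the left, a larger value suggests it is to the right.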
Topic: Real-World Computer Vision Applications
Research and write a one-page report on how computer vision is used in one of these areas: