Through the eyes of a machine - Young World Club

Through the eyes of a machine

  • POSTED ON: 22 Feb, 2024
  • POSTED BY: Archana Subramanian | Text: Veena Prasad

Can you identify these pictures?

Of course you can. How would you describe them? Pictures of humans, perhaps? You might notice that some are smiling and some are not, and mentally classify them as cheerful people and grumpy people. You may, instead, focus on those with glasses and those without. Or you may notice that some people are wearing jackets and some are not.
There are so many ways of classifying a few pictures!

Let’s perform a thought experiment now. Let’s assume you are not of the human species. Maybe you’re an alien, maybe you’re a machine – it doesn’t matter as long as you’re not human – and you have no idea what a human is, what a smile is, what glasses and jackets are, what “classifying” means.

But I have to teach you all that. Where do I start? I start with what is known as “training data”. My training data for smile recognition would be a set of pictures (the more the better), each with a label that says whether or not it shows a smile. Each picture goes into the machine, along with its label. At the end of the training, I would give this machine a completely new image, and see whether it recognises a smile or not.
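To make the idea of labelled training data concrete, here is a minimal sketch in Python. It is not how a real image recogniser works: each "picture" is reduced to two invented numbers (`mouth_curve` and `eye_crinkle` are hypothetical features, and every value is made up), and the "training" simply averages the numbers seen for each label. A new picture then gets the label whose average it is closest to.

```python
# Toy smile recogniser: a sketch under invented assumptions, not a real vision system.
# Each "picture" is reduced to two made-up numbers between 0 and 1:
#   mouth_curve: how much the corners of the mouth turn up
#   eye_crinkle: how crinkled the eyes are

# Training data: (features, label) pairs. All values are invented.
training_data = [
    ((0.9, 0.8), "smile"),
    ((0.8, 0.6), "smile"),
    ((0.7, 0.9), "smile"),
    ((0.2, 0.1), "no smile"),
    ((0.1, 0.3), "no smile"),
    ((0.3, 0.2), "no smile"),
]


def train(examples):
    """Work out the average feature values for each label."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(model, features):
    """Return the label whose average is closest to the new picture."""
    def distance(centre):
        return sum((a - b) ** 2 for a, b in zip(features, centre))
    return min(model, key=lambda label: distance(model[label]))


model = train(training_data)
new_picture = (0.85, 0.7)  # a picture the machine never saw during training
print(predict(model, new_picture))
```

With only six examples and two numbers per picture, this toy is easy to fool. Real systems learn from vastly more pictures and work out for themselves which details matter.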

What about other expressions? And glasses and jackets? It’s a LOT more work, and a LOT more training data!

Specifically, I would need hundreds of thousands of images showing every possible human expression, all of them labelled correctly. I would need a variety of clothing examples, again labelled correctly. And then glasses. And every other element that I might expect to find in the environment I am training the machine for. Teaching a machine from labelled examples in this way is called machine learning.

What is machine intelligence?

If your machine can go beyond its training data, understand images that are not part of the original data set, and, in some way, extrapolate the learning, it can be called intelligent.
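One simple way to check whether learning goes "beyond the training data" is to keep some examples aside and test the machine only on those. The sketch below assumes a single invented number per picture (a hypothetical "mouth curve" score) and learns a cut-off halfway between the average smiling and non-smiling scores. All names and values are made up for illustration.

```python
# Sketch: does the learning carry over to pictures the machine has never seen?
# Each example is (mouth_curve_score, is_smile). All values are invented.

train_set = [(0.9, True), (0.8, True), (0.7, True),
             (0.2, False), (0.1, False), (0.3, False)]

# Held back during training and used only for checking.
unseen_set = [(0.85, True), (0.65, True), (0.25, False), (0.4, False)]


def learn_threshold(examples):
    """Place the cut-off halfway between the two groups' average scores."""
    smiles = [score for score, is_smile in examples if is_smile]
    others = [score for score, is_smile in examples if not is_smile]
    return (sum(smiles) / len(smiles) + sum(others) / len(others)) / 2


def accuracy(threshold, examples):
    """Fraction of examples where 'score above cut-off' matches the label."""
    correct = sum((score > threshold) == is_smile for score, is_smile in examples)
    return correct / len(examples)


threshold = learn_threshold(train_set)
print("accuracy on unseen pictures:", accuracy(threshold, unseen_set))
```

A high score on the unseen pictures suggests the machine has learnt something general rather than memorised its training set. A low score suggests it needs more, or more varied, training data.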

This extrapolation is built using statistics-based algorithms. They use a variety of techniques to fine-tune a model for different objectives, such as reading expressions, classifying images based on clothing, telling indoor pictures from outdoor ones, and interpreting X-rays, satellite images and anything else that humans can make sense of.

In a similar way, models are trained to understand written texts and their styles: formal writing, casual writing, funny writing, the complete works of Shakespeare, academic textbooks, medical diagnoses, and anything else that humans can read.

When we put all this learning together, we can come up with an AI app that can create a completely new image based on instructions such as “generate an image of a child smiling in a playground”. Or anything really, limited only by the human imagination and, of course, the training dataset.

Now, let’s pretend you are a machine that has just learnt to recognise smiling faces. Can you click on all of them?