#mlni01 – IMA Documentation

Google Translate with computer vision

Background

When you are in a foreign country, language is often the hardest problem, especially in everyday conversation. Translation tools can help us communicate with others.

The traditional way to translate is to type the text into translation software and press a button to get the result. This approach is slow, because you may need to spend several minutes typing, so it is not suitable for everyday communication. It would be far more convenient if you could immediately and directly translate the words you see, the words you say, or even what you want to say in your mind.

These capabilities are being implemented step by step through artificial intelligence technology. One of the more advanced functions uses a camera to capture text: the software recognizes the text in the image, and the user can select part of it on the screen and translate it in one click. Another is instant translation, where you only need to point the camera at the text and the translation appears on top of the live image.

Progression

Because computer vision has improved the ability of computers to understand and recognize images, Google’s photo translation function has been further improved. It is no longer limited to translating text but extends to the content of the image itself. The content of a picture can be automatically described or summarized in the target language, including what a person in it is wearing and their facial expression.

Comprehension

How is this effect achieved? From watching the video about computer vision, my understanding of the process is as follows. The program analyzes the image layer by layer, and each layer has a different focus. For example, differences in image brightness can be used to derive the outline of an object. Finally, the results from all the layers are combined to give the most probable result. Of course, all of this analysis depends on a huge dataset. Without a large set of data, the analysis has no basis or support.
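The brightness step can be illustrated with a small sketch. This is only my own minimal example, not how Google's software actually works: it assumes a grayscale image stored as a list of rows of brightness values (0 to 255), and the function name edge_map and the threshold of 50 are hypothetical choices made for illustration. A pixel is marked as part of an outline when its brightness differs sharply from the pixel to its right or the pixel below it.

```python
def edge_map(image, threshold=50):
    """Mark pixels whose brightness differs sharply from a neighbor.

    image: list of rows, each a list of brightness values (0-255).
    threshold: hypothetical cutoff for what counts as a sharp change.
    Returns a grid of the same size with 1 for outline pixels, else 0.
    """
    height = len(image)
    width = len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Brightness difference to the right and below (0 at the border)
            dx = image[y][x + 1] - image[y][x] if x + 1 < width else 0
            dy = image[y + 1][x] - image[y][x] if y + 1 < height else 0
            if abs(dx) + abs(dy) >= threshold:
                edges[y][x] = 1
    return edges


# A 5x5 test image: a bright square (200) on a dark background (10)
image = [
    [10, 10, 10, 10, 10],
    [10, 200, 200, 200, 10],
    [10, 200, 200, 200, 10],
    [10, 200, 200, 200, 10],
    [10, 10, 10, 10, 10],
]

for row in edge_map(image):
    print(row)
```

The printed grid has 1s only where dark and bright pixels meet, so the inside of the square and the plain background stay 0. As I understand it, a real system learns many detectors like this from data and stacks them in layers, rather than using one hand-written rule.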

Significance

According to the related materials, the goal of computer vision is to improve the computer’s ability to recognize and understand images until it comes as close as possible to the human visual system. The case of Google Translate shows that this goal is practical, and that improving and applying computer vision will make multilingual communication considerably more convenient.

Link to my presentation:

Tag: #mlni01

Assignment 01 Crystal