[Special article] What is machine learning?

"AI" is no longer just a word from science fiction; it has come to be recognized as a general-purpose technology.
The term AI (artificial intelligence) was first used at the Dartmouth Conference held in 1956.
In recent years, as AI technology has developed, related keywords such as "machine learning," "neural network," and "deep learning" have become increasingly common.
In this article, we introduce how these terms relate to one another and which technologies and products are associated with them.

 

 

What is machine learning?

AI is a term that broadly refers to artificial intelligence. Technologies such as machine learning and deep learning are categories within AI.
AI refers to attempts, and the technologies behind them, to realize human-like intelligence on a computer.
Machine learning is a mechanism that enables a computer to perform a specific task by training on accumulated data.
Deep learning is a type of machine learning in which, given sufficient data, a computer makes predictions and classifications autonomously. Its distinguishing feature is that the computer learns automatically and improves its accuracy without human instruction.

AI has a long history. In 1943, Warren McCulloch and Walter Pitts proposed the theory of neural networks (formal neurons).
A neural network is a mathematical model that imitates the characteristics of neural circuits in the brain.
In 1958, Frank Rosenblatt introduced the perceptron, which models visual and brain functions and performs pattern recognition, and research into learnable networks moved forward.
In 1986, David Rumelhart et al. presented backpropagation, an algorithm used to train neural networks.
About 20 years later, in 2006, Geoffrey Hinton and others proposed layer-wise pre-training, and around 2010 the field of deep learning took shape.
Today, AI, machine learning, and deep learning technologies are used in a wide range of fields.

 

Types of machine learning

As mentioned above, machine learning refers to technology that has a computer learn from a large amount of data and automatically build algorithms and models that perform tasks such as classification and prediction.
While machine learning is one of the categories within AI, it also occupies a central position in the concept of AI.

The term "machine learning" covers several approaches at once, which can be broadly divided into the following three learning methods.

1. Supervised learning

In supervised learning, questions (input data) are provided in advance together with answers prepared by humans (correct answer labels). The aim is to bring the model's predicted values for this dataset closer to the correct labels. The model analyzes the characteristics of the data to determine the probability that a given answer is correct.

In addition, the problems handled by supervised learning can be classified into "classification" and "regression".

■ Example of classification
The purpose is to predict the label to which the data belongs (e.g., YES / NO, Group A / Group B).
(1) Email spam filter
(2) Gesture recognition
(3) Document recognition
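
As a rough illustration of classification, the following is a minimal sketch of a nearest-centroid classifier written with only the Python standard library. The two features per email (number of links, number of "!" characters) and the sample values are hypothetical, chosen only to show how labeled data is used to predict a label; a real spam filter would use far richer features and a more capable model.

```python
from math import dist

def fit_centroids(samples, labels):
    """Average the feature vectors of each class (the "training" step)."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: dist(centroids[y], x))

# Hypothetical features per email: (number of links, number of "!" characters)
train_x = [(0, 0), (1, 0), (0, 1), (5, 4), (6, 3), (4, 5)]
# Correct answer labels prepared by a human
train_y = ["ham", "ham", "ham", "spam", "spam", "spam"]

centroids = fit_centroids(train_x, train_y)
print(predict(centroids, (5, 5)))  # close to the "spam" examples
print(predict(centroids, (0, 2)))  # close to the "ham" examples
```

The model here is nothing more than one average point per label, which keeps the link between "correct answer labels" and "prediction" easy to follow.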

■ Example of regression
The purpose is to predict a numerical value. The model infers the relationships in the data and predicts the correct answer.
(1) Stock price forecast
(2) Price forecast
(3) Temperature forecast
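
As a rough illustration of regression, the sketch below fits a straight line to a few points by ordinary least squares and uses it to predict a new value. The hour-of-day and temperature figures are made up for the example and are exactly linear so the result is easy to check by hand; the function name `fit_line` is our own.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hour of day and temperature in degrees Celsius
hours = [6, 8, 10, 12, 14]
temps = [12.0, 15.0, 18.0, 21.0, 24.0]

slope, intercept = fit_line(hours, temps)
predicted = slope * 16 + intercept  # predict the temperature at 16:00
print(slope, intercept, predicted)
```

Unlike classification, the output is a number rather than a label, which is the essential difference between the two problem types.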

2. Unsupervised learning

This is a method of learning without giving correct answers for the training data. It finds trends and regularities across multiple data points and groups them according to common features.
It is a learning method that is good at finding correlations in huge amounts of data.

Unsupervised learning includes methods such as "clustering" and "dimensionality reduction."

■ Clustering
This is a method for finding groups of data points with similar features.
Instead of judging against correct labels, it finds groups that have something in common based on the characteristics of the data.
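
One well-known clustering algorithm is k-means. The following is a minimal sketch of it in plain Python, assuming two-dimensional points and, for simplicity and reproducibility, using the first k points as the initial centroids (practical implementations usually choose the initial centroids more carefully). The sample points are hypothetical.

```python
from math import dist

def kmeans(points, k, iterations=10):
    """Group points into k clusters by alternating assignment and update steps."""
    # Assumption: the first k points serve as initial centroids.
    centroids = [list(p) for p in points[:k]]
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Hypothetical unlabeled data with two visible groups
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 1.5), (8.5, 9.0)]
assignment, centroids = kmeans(points, k=2)
print(assignment)
print(centroids)
```

No labels are passed in anywhere; the grouping comes only from the distances between the points.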

■ Dimensionality reduction
This is a method for reducing the number of dimensions of data while losing as little information as possible.
The data handled in machine learning is often high-dimensional.
As a result, it is difficult to visualize, its structure is hard to grasp intuitively, and the computational cost is high.
In practice, however, variables are often correlated with each other and only a few are important for the analysis, so the number of variables can be reduced without losing much information.
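
To make the idea concrete, here is a minimal sketch of principal component analysis (PCA), one common dimensionality reduction technique, reduced to the smallest possible case: two correlated variables compressed into one. It finds the main direction of variation by power iteration on the 2 x 2 covariance matrix. The data values and the function name `principal_axis` are assumptions for illustration only.

```python
def principal_axis(points, iterations=100):
    """Return the main direction of variation, the 1-D projection, and the
    share of total variance that the projection keeps."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2 x 2 covariance matrix
    cxx = sum(x * x for x, _ in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    cyy = sum(y * y for _, y in centered) / n

    # Power iteration: repeatedly multiply by the covariance matrix and normalize
    vx, vy = 1.0, 0.0
    for _ in range(iterations):
        nx = cxx * vx + cxy * vy
        ny = cxy * vx + cyy * vy
        norm = (nx * nx + ny * ny) ** 0.5
        vx, vy = nx / norm, ny / norm

    projected = [x * vx + y * vy for x, y in centered]
    kept = (sum(t * t for t in projected) / n) / (cxx + cyy)
    return (vx, vy), projected, kept

# Hypothetical data: two strongly correlated variables
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
axis, projected, kept = principal_axis(points)
print(axis)       # direction of the single remaining dimension
print(projected)  # each sample described by one number instead of two
print(kept)       # fraction of the original variance retained
```

Because the two variables move together, one number per sample retains almost all of the variance, which is the situation described in the paragraph above.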

3. Reinforcement learning

This is a learning method in which the computer itself proceeds by trial and error, choosing among the available options so as to maximize a reward (score) set as its objective.
The computer Go program "AlphaGo" is said to have learned winning patterns through reinforcement learning.
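
As a toy illustration of trial and error driven by a reward, the sketch below applies tabular Q-learning, one basic reinforcement learning algorithm, to a hypothetical five-cell corridor in which the agent earns a reward of 1 for reaching the rightmost cell. This is not how AlphaGo works; the environment, the parameter values, and the purely random exploration are all assumptions chosen to keep the example short.

```python
import random

N_STATES = 5          # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, +1)    # move left or move right
GOAL = N_STATES - 1

def step(state, action):
    """Apply an action and return (next_state, reward)."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

def train(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values by acting at random and updating a Q-table."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            a = rng.randrange(len(ACTIONS))  # trial and error: try any action
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward reward + discounted future value
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q = train()
# Greedy policy: in each non-goal cell, pick the action with the higher value
policy = [ACTIONS[row.index(max(row))] for row in q[:GOAL]]
print(policy)
```

After training, the greedy policy chooses "move right" in every cell, even though the agent was never told which direction leads to the reward.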

 

What you need for machine learning

Since the hurdles for beginners to build AI from scratch are quite high, developing with APIs and frameworks may be the realistic route.
Alternatively, you can use a free tool. The "Neural Network Console" provided by Sony allows you to perform deep learning without any programming knowledge. The Windows app version is provided free of charge, and the cloud version is paid. As an introduction to deep learning, using tools like these to get a feel for how it works is also an option.

On the other hand, when pursuing AI development with machine learning in earnest, learning models are developed using a programming language. Recently, development using Python and SQL is common.

1. Python

It has become a particularly popular programming language in recent years.
For more on Python, see also "[Feature Article] Programming language Python: Why is it so popular? Tools to accelerate Python programming."
It is widely favored as a programming language for machine learning because it is relatively easy to learn, it has a proven track record in familiar services such as YouTube and Instagram, and its many libraries and frameworks make development efficient.

2. SQL

SQL is a language for working with databases. It lets you efficiently manipulate large amounts of information stored in a database.
Preparing training data for AI requires a step called preprocessing, in which the data you want to use is corrected and reshaped so that it is easier to learn from. Preprocessing is very important work that is said to account for 70 to 80% of AI development, so knowledge of SQL is useful.
Knowledge of databases is also required to manipulate training data.
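
As a small example of how SQL can help with preprocessing, the sketch below uses Python's standard `sqlite3` module with an in-memory database. The table name `readings`, its columns, and the valid temperature range are hypothetical; the query simply drops missing and implausible values and then aggregates what remains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, temperature REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 20.0), ("a", 22.0), ("a", None), ("b", 30.0), ("b", 999.0)],
)

# Keep only rows with a value inside an assumed valid range,
# then compute the average and row count per sensor.
rows = conn.execute(
    """
    SELECT sensor, AVG(temperature), COUNT(*)
    FROM readings
    WHERE temperature IS NOT NULL
      AND temperature BETWEEN -50 AND 60
    GROUP BY sensor
    ORDER BY sensor
    """
).fetchall()
conn.close()

print(rows)
```

Filtering and aggregating inside the database like this means only the cleaned, reduced data has to be loaded into the training program.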

In addition to programming-related skills, you need a machine to prepare and process the data that will be the basis for learning. You may also need to install the CUDA Toolkit, which is commonly used in deep learning environments, so if you are unsure about setting up a computer, we hope you will take a look at TEGSYS's model PCs and case studies.

Machine learning flow

Let's look at the steps involved in machine learning. In actual work the details may differ depending on the data and methods involved, but the overall flow is as follows.

1. Data preparation

The first step is to acquire the data that will be the basis of learning.
The amount of data required depends on the learning method and what is being learned. According to the Ministry of Internal Affairs and Communications material "Status of data usage for AI," as of Heisei 31 (2019), cases using 10,000 data items or fewer accounted for nearly 20% of those covered. However, there are also many cases where more data is used. Processing so-called big data requires an environment that can handle huge amounts of data at high speed. Recently, though, methods that can learn from even a small amount of data, such as transfer learning, have been attracting attention.

* Transfer learning: A technology that applies knowledge in one area to learning in another area.

2. Method selection and learning model creation

Consider what method to use for machine learning.
It is necessary to decide the learning method and algorithm according to the scale and quality of the data, the urgency of the task, the purpose of use, the calculation environment, and so on.

3. Preprocessing

From the collected data, select the data required for learning.
In general, collected data is often difficult to use as it is, and it may contain missing or incorrect values, so it needs to be prepared before it can be used in machine learning. This is a process that requires skills related to SQL and databases.
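
Two typical preprocessing operations are filling in missing values and rescaling values to a common range. The following is a minimal sketch of both in plain Python, assuming that missing entries are represented as `None` and that the column contains at least two distinct values; the function names and sample data are our own.

```python
def impute_mean(values):
    """Replace missing entries (None) with the mean of the known entries."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]

def min_max_scale(values):
    """Rescale values linearly to the range 0..1.

    Assumes max(values) > min(values); a constant column would divide by zero.
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical raw column with one missing measurement
raw = [10.0, None, 30.0, 20.0]
filled = impute_mean(raw)
scaled = min_max_scale(filled)
print(filled)
print(scaled)
```

Which strategy is appropriate (mean imputation, dropping rows, a different scaling) depends on the data and the model, so this should be read as one possibility rather than a recommendation.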

4. Parameter tuning and training

This is the process of updating the weights of the network by applying the machine learning algorithm.
In a deep neural network, some parameters are set automatically during training, but others (hyperparameters) must be set by humans in advance. For example, the number of layers and the number of units in a neural network must be set by humans.
Model performance evaluation, training, and tuning need to be repeated until the hyperparameter values are optimal. Nowadays there are also tools that optimize hyperparameters automatically, which makes adjustment relatively easy.
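
The difference between a weight learned during training and a hyperparameter chosen beforehand can be shown with a deliberately tiny model. In the sketch below, the single weight `w` of the model y = w * x is learned by gradient descent, while the learning rate is a hyperparameter that we try at several candidate values and compare on separate validation data (a simple grid search). The data, the candidate values, and the number of epochs are all assumptions for illustration.

```python
def train(xs, ys, learning_rate, epochs=50):
    """Learn the weight w of the model y = w * x by gradient descent."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w

def mse(w, xs, ys):
    """Mean squared error of the model y = w * x on the given data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data following y = 2x, split into training and validation sets
train_x, train_y = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
val_x, val_y = [4.0, 5.0], [8.0, 10.0]

# Grid search: train once per candidate and score each on the validation set
results = {}
for lr in (0.001, 0.01, 0.1, 0.3):
    w = train(train_x, train_y, lr)
    results[lr] = mse(w, val_x, val_y)

best_lr = min(results, key=results.get)
print(results)
print(best_lr)
```

In this setup the smallest learning rates have not finished converging after 50 epochs and the largest one diverges, which is why evaluation on held-out data is needed to choose between them.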

In deep learning, it is common to use tools called libraries.
A library is created so that a specific process can be reused; with a library, you can easily perform that process without programming all of it yourself. Many libraries are distributed as open source, and if the license conditions are met, you can use modified source code in your own product.

■ Typical library
(1) Caffe / Caffe2
(2) Keras
(3) Chainer
(4) TensorFlow
(5) Torch / PyTorch

5. Model performance evaluation

The evaluation indices differ between classification models and regression models, so there is more than one way to evaluate a model.
The important point is deciding which of the various evaluation indices to use.

■ Example of evaluation index of classification model
A confusion matrix is often created when training and evaluating classification models.
For models that predict the class of each sample (such as Group A or Group B), it is a way to compare the predictions with the observations.
The confusion matrix evaluates performance by clearly showing how accurate the results of the classification model are.

 

- True Positive (TP): A sample that actually belongs to Group A is correctly predicted as Group A.
- True Negative (TN): A sample that does not actually belong to Group A is correctly predicted as not Group A.
- False Negative (FN): A sample that actually belongs to Group A is mistakenly predicted as not Group A.
- False Positive (FP): A sample that does not actually belong to Group A is mistakenly predicted as Group A.

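(1) Accuracy (correct answer rate)
Percentage of all samples that were predicted correctly: (TP + TN) / (TP + TN + FP + FN)

(2) Precision
Percentage of samples predicted as positive that are actually positive: TP / (TP + FP)

(3) Recall
Percentage of actually positive samples that were predicted as positive: TP / (TP + FN)

(4) F-measure (F value)
Evaluation index that summarizes precision and recall, typically as their harmonic mean (F1 score): 2 × Precision × Recall / (Precision + Recall)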
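
The four indices above can be computed directly from the confusion matrix counts. The following is a minimal sketch in plain Python, treating Group A as the positive class; the actual and predicted labels are hypothetical, and the F value is computed here as the F1 score (the harmonic mean of precision and recall).

```python
def confusion_counts(actual, predicted, positive="A"):
    """Count TP, TN, FP, and FN, treating `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn

# Hypothetical ground truth and model predictions for ten samples
actual    = ["A", "A", "A", "A", "B", "B", "B", "B", "B", "B"]
predicted = ["A", "A", "A", "B", "A", "B", "B", "B", "B", "B"]

tp, tn, fp, fn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(tp, tn, fp, fn)
print(accuracy, precision, recall, f1)
```

In this sample the accuracy is higher than the precision and recall because Group B is the majority class, which is one reason accuracy alone can give an overly favorable picture.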

■ Example of evaluation index of regression model
(1) Mean squared error (MSE): The mean of the squared errors. The smaller the MSE, the better the model performance.

(2) Root mean squared error (RMSE): The square root of MSE. The smaller the value, the better the model performance.

(3) Coefficient of determination (R2): The closer the value of R2 is to 1, the better the model performance.
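
As a worked example, the sketch below computes the three regression indices from a short list of actual and predicted values. The numbers are hypothetical, and R2 is computed with the usual definition of 1 minus the ratio of the residual sum of squares to the total sum of squares.

```python
def regression_metrics(actual, predicted):
    """Return (MSE, RMSE, R2) for paired actual and predicted values."""
    n = len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    mse = ss_res / n
    rmse = mse ** 0.5
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot
    return mse, rmse, r2

# Hypothetical actual values and model predictions
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]

mse, rmse, r2 = regression_metrics(actual, predicted)
print(mse, rmse, r2)
```

RMSE is in the same units as the predicted quantity, which often makes it easier to interpret than MSE, while R2 is unitless and convenient for comparing models on the same data.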

Repeat tuning > training > evaluation to improve the accuracy of the model.
If the accuracy does not improve even after repeating this cycle, go back to method selection and reconsider.

Through these steps, machine learning works toward a finished AI.
The above is only an example; in practice, different and more detailed processes are required, so please treat it as a rough outline.

Products related to machine learning

As introduced so far, adopting machine learning has become relatively easy, and improvements in related technologies have made it possible to use the results of machine learning in a variety of situations.
In response to this trend, PCs, hardware, and software for performing machine learning in a more comfortable environment are in demand and are being provided.
The number of products that make use of machine learning is also increasing, improving on the functions of conventional products and implementing functions that used to be difficult to realize. There is no doubt that this is further energizing research and development.

The following are examples of machine learning-related products among those we have a track record of handling.
If you are new to machine learning or are interested in specific uses of it, please take a look.

We can also be contacted about products not listed here, so if you are looking for a particular product, please feel free to get in touch.

Animal Behavior Analysis Machine Case No.PC-7984

This is an example of a workstation configuration for "DeepLabCut," software that analyzes animal behavior using deep learning.

Machine with 4 × RTX 3090 GPUs (for 100V power supply environment) Case No.PC-8351

This is an example of a machine configuration for deep learning using four RTX 3090 cards.

Coral Dev Board

It is a single board computer equipped with Google Edge TPU for edge devices such as IoT.
It supports TensorFlow Lite, and system-on-module allows you to implement on-device machine learning inference applications.

Orange Pi AI Stick

An AI stick from the Orange Pi series of small board computers.
Inference processing can be executed by connecting the USB stick-type module to a machine.

Portable FPGA Accelerator FLIK (FPGA Client Innovation Kit)

A compact portable FPGA accelerator device equipped with Intel's FPGA "Intel Arria 10 GX FPGA".
It accelerates important workflows such as data analysis and deep learning in AI (artificial intelligence) and other compute-intensive applications.

Duckietown

An open source AI car for learning robotics.
In the miniature model city "Duckietown," you can enjoy learning robotics and AI by running a self-driving vehicle (Duckiebot).

Donkey car

It is a self-driving platform kit for radio-controlled cars.
An open source self-driving platform for small cars such as radio-controlled cars, it is a solution for learning about AI and self-driving cars.

ZED 2 Stereo Camera

It is a 3D camera that can shoot high-resolution video.
It is a new stereo camera that uses a neural network for stereo matching to reproduce human vision and further improves 3D and depth perception.

MuJoCo

A physics engine developed at the University of Washington.
Used for model-based computational implementations such as parallel sampling in machine learning applications.
It is also suitable for articulated dynamics simulation and is also used for control simulation in the field of robotics.

Skeleton Tracking SDK License

Software designed to provide deep learning-based 2D / 3D skeleton tracking capabilities to applications for embedded hardware. It can be used for behavior discrimination.

Glasses-type eye tracking device Pupil Invisible

A glasses-type eye-tracking device made by Pupil Labs, the developer of Pupil Core.
It uses a deep learning-based gaze estimation pipeline, so no calibration is needed: simply put on the device and press the record button in the app to capture and record gaze data and the scene being viewed at the same time.

BOOM Library DEBIRD

An audio repair tool for automatically detecting and removing birdsong.
The neural network can automatically detect, remove, and extract the sound of birds that entered during recording.
It can also be used for research purposes such as studying bird calls and birdsong.

Summary

The use of big data is advocated in visions of an ideal society such as the SDGs and Society 5.0. It is expected to enrich people's lives across a wide range of fields and situations and to accelerate progress toward a society in which no one is left behind.
Compared with the past, the research environment for machine learning is richer, and machine learning is used in a wide range of research fields. In that sense, AI and machine learning technologies are indispensable for realizing a sustainable society. We will continue to publish and update information on related technologies and products, and we hope you will find it useful for your research and development.