Afrel: data analysis and AI learning materials for higher education, "Deep learning starting with robots TensorFlo ...
You can work on image identification and applied control in deep learning easily and inexpensively.
Afrel: AI robotics teaching material for universities, vocational schools, and technical colleges, "Deep learning starting with robots T ...
Related terms from Wikipedia
Deep learning (Japanese: shinsō gakushū) is a method of learning in which the concepts of an object at each level of granularity, from the overall picture down to the details, are related to one another as a hierarchical structure.[Annotation 1] The most widespread deep learning method is machine learning with a multi-layer artificial neural network (deep neural network, DNN; in the narrow sense, a network with four or more layers[Annotation 2]). The direct origin of multi-layer neural networks is the autoencoder devised in 2006 by Geoffrey Hinton's research team.
Backpropagation, one of the elemental technologies, was developed in the 20th century, but deep neural networks with four or more layers could not be trained adequately because of technical problems such as local optima and vanishing gradients, and their performance was poor. In the 21st century, however, research on training multi-layer neural networks by Hinton and others, beginning with the stacked autoencoder, together with improvements in the computing power needed for training and the wider availability of training data brought about by the Internet, made adequate training possible. As a result, deep learning showed performance that overwhelmed other methods on a variety of problems involving speech, images, and natural language, and it spread widely in the 2010s. In academia, deep learning based on more abstract mathematical concepts is also being studied.[Annotation 3]
Deep learning refers to a method of learning in which the concepts of an object at each level of granularity, from the overall picture down to the details, are related as a hierarchical structure, regardless of the concrete mathematical concept used for learning.[Annotation 1] The approach based on multi-layer neural networks was the first to be established. This happened in the 21st century through research on training multi-layer neural networks by Geoffrey Hinton and others, beginning with the autoencoder, together with improvements in the computing power needed for training and the wider availability of training data via the Internet. The approach then showed performance that overwhelmed other methods on problems involving speech, images, and natural language, and spread in the 2010s. As a result, machine learning with multi-layer artificial neural networks (deep neural networks, DNN; four or more layers in the narrow sense[Annotation 4]) became widely known. Deep learning can, however, also be built from components other than neural networks, and mathematical formulations of deep learning that are more abstract than neural networks are currently being explored. In business, applications of multi-layer neural networks dominate, and deep learning is often equated with neural networks, but in academia it is described as an abstract concept that includes methods other than neural networks.[Annotation 3]
The perceptron, a building block of neural networks, was invented in 1957, but research did not continue at scale, both because computer performance was severely lacking and because the two-layer simple perceptron had drawbacks such as being unable to represent exclusive OR (XOR). In the 1980s, backpropagation was developed, which made it possible to train three-layer multi-layer perceptrons that can handle XOR. However, it was inefficient and could not learn complex tasks such as the past tense of verbs (in addition, since a three-layer neural network can in principle approximate any function, it was unclear why the cerebral neocortex has more than three layers), and the research subsided in the late 1990s.
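The XOR limitation mentioned above can be checked with a minimal sketch in plain Python. The helper names, the weight grid, and the hand-picked weights are illustrative assumptions, not taken from the source; layers are counted including the input layer, so the "three-layer" network here has one hidden layer.

```python
from itertools import product

def step(x):
    """Threshold activation of the classic perceptron."""
    return 1 if x > 0 else 0

def unit(inputs, weights, bias):
    """One threshold unit: weighted sum plus bias, then a step."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

# Truth table of exclusive OR.
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def single_unit_solves_xor(grid):
    """Brute-force search: does any (w1, w2, bias) on the grid reproduce XOR?"""
    for w1, w2, b in product(grid, repeat=3):
        if all(unit(p, (w1, w2), b) == t for p, t in XOR.items()):
            return True
    return False

def xor_mlp(a, b):
    """Hidden layer of OR and NAND units feeding an AND output unit."""
    h_or = unit((a, b), (1.0, 1.0), -0.5)
    h_nand = unit((a, b), (-1.0, -1.0), 1.5)
    return unit((h_or, h_nand), (1.0, 1.0), -1.5)

grid = [i * 0.5 for i in range(-4, 5)]  # -2.0, -1.5, ..., 2.0
print("single unit solves XOR:", single_unit_solves_xor(grid))
print("hidden-layer network:", [xor_mlp(a, b) for (a, b) in XOR])
```

The grid search is only a finite check; the general result is that XOR is not linearly separable, so no single threshold unit can represent it.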
The neocognitron, published in 1979 by Kunihiko Fukushima (NHK Science & Technical Research Laboratories, then Department of Biological Engineering, Faculty of Basic Engineering, Osaka University), can be cited as pioneering research aimed at multi-layer neural networks of the kind used in deep learning. The neocognitron has a self-organizing capability and acquires pattern recognition ability (forms concepts) by learning on its own. As an application, Fukushima et al. demonstrated that the ability to recognize handwritten characters (the concept of each character) can be acquired by self-learning from a handwritten character database (big data). At the time, however, it was misunderstood as merely "one of the handwritten character recognition methods", and recognition of its importance did not spread. Digital computers of the time were too weak to verify the neocognitron in software, so it was implemented and verified by wiring circuit elements together. Apart from using add-if-silent learning instead of backpropagation, it was the same as a convolutional neural network (CNN), which was remarkably far-sighted for its time.
In 1998, LeNet-5 (the number at the end indicates five layers), the direct origin of the convolutional neural network (CNN), was proposed. The paper was the first to illustrate the layered structure of a neural network with plate-shaped figures.
Realization of multi-layer neural networks (2006-2012)
Because Geoffrey Hinton contributed greatly to early deep learning, this section describes how the theory was validated using neural networks.
The limitation that a single-layer perceptron cannot solve linearly inseparable problems was overcome to some extent once multi-layer perceptrons could be trained with backpropagation. However, multi-layer neural networks with more layers could not be trained adequately because of technical problems such as local optima and vanishing gradients; performance was poor and research was on the verge of being abandoned. Beyond these theoretical shortcomings, the computing performance needed for machine learning was severely lacking and large amounts of data were hard to obtain, both of which were major obstacles to research.
As the Internet spread and computer performance improved, however, the research team of Geoffrey Hinton, a leading neural network researcher, succeeded in 2006 in making autoencoders deep by using restricted Boltzmann machines,[Annotation 5] and the field attracted attention again. The method devised at this time was called the stacked autoencoder. The published paper also established the term "deep network" for network structures deeper than conventional multi-layer neural networks. The deep networks originally developed by Hinton et al. had a simple structure of layers connected in series, but current algorithms have complex graph structures with multiple branches and loops. Libraries that integrate the underlying techniques and make such complex graph structures easy to build have therefore been released publicly.
In 2012, at ILSVRC, a competition in object recognition, AlexNet from the University of Toronto team led by Geoffrey Hinton shocked machine learning researchers by achieving an error rate of 17%, a dramatic improvement over conventional methods (26% error rate). Since then, teams using deep learning have dominated ILSVRC every year, and the error rate had improved to about 10% as of 2014.
Against a background of rapid progress in computer hardware, easier data collection thanks to the spread of the Internet, and falling prices and growing computational resources of GPUs, which are better than CPUs at parallel processing of simple operations, the usefulness of deep learning for image processing was recognized worldwide at competitions. Research is said to have accelerated rapidly from around 2012, bringing on the third artificial intelligence boom. Since then, artificial intelligence has been built into a wide range of applications so that they can return the best answer to the user.
The age of increasingly complex learning models and mathematical abstraction (2012-present)
Deep learning is used in a wide range of fields, chiefly object recognition. Many IT companies, including Google, are also investing heavily in research and development. Because it is a technology that strongly influences national economic growth, it has set off an R&D race between nations that amounts to economic warfare.
Google's Android 4.3 uses deep learning for speech recognition, which improved accuracy by 25 to 50 percent. In 2012, it was announced that joint research with Stanford University had built a neural network that responds to images of cats by running 1,000 servers (16,000 cores) for 3 days, which attracted wide attention. The study analyzed 10 million images of 200 × 200 pixels. It has been pointed out, however, that this is still far from the human brain. In joint research with the University of Toronto, a team called GoogLeNet developed "Image to Text", a system that automatically generates a description of an image; it combines computer vision and natural language processing to recognize an image uploaded by the user and display a description. In March 2015, Schroff et al. classified 200 million images of 8 million people with 99.6% accuracy (22 layers).
In January 2016, it was announced that a system called AlphaGo had played the European Go champion Fan Hui, who is Chinese-born French, in October 2015 and won all five games. It was developed mainly by DeepMind, which Google acquired in 2014. Because the Go board is larger than a chessboard, the number of possible moves is incomparably greater, and the result overturned predictions that it would take another 10 years for a program to play on even terms with a human professional. Attention also focused on the fact that AlphaGo is not an expert machine specialized for Go but uses a system that can be applied to general purposes. In 2016 and 2017 it played two of the world's top Go players, Lee Sedol of South Korea and Ke Jie of China, winning the five-game match against Lee Sedol in 2016 by 4 games to 1 and the three-game match against Ke Jie in 2017 with 3 straight wins.
Facebook uses deep learning to recognize images uploaded by users and to improve the accuracy of determining what they show. It also launched an artificial intelligence research lab in 2013. As one result, it released a deep learning development environment as open source in January 2015. This is reported to be 23.5 times faster than conventional code in a GPU environment and is expected to accelerate research and development in deep learning.
It is also used as an obstacle sensor for self-driving cars.
Alongside its many benefits, deep learning has also given rise to ethical problems and crime. In China, for example, deep learning is spreading rapidly for the purpose of strengthening the authorities' surveillance of the public, as exemplified by Skynet, and China is said to account for three-quarters of the world's deep learning servers. According to the U.S. government, China has surpassed the U.S. in the number of papers on deep learning since 2013. Yoshua Bengio, who along with Hinton and others is called a "father of deep learning", has warned that China is using artificial intelligence to monitor its citizens and strengthen its dictatorship.
In addition, deepfakes, a technology for generating fake images that are indistinguishable from the real thing, have appeared. Large numbers of videos have circulated that use the face and voice of a particular celebrity to present statements contrary to fact or pornography (known as deepfake porn). Because these may constitute defamation or infringe moral rights, police have begun cracking down on their creators and distributors. Furthermore, because attacks that use fake images and audio to disrupt unmanned control systems are anticipated, countermeasures are being taken to prevent harm.
Network models are still being actively researched, and new ones are proposed every year.
Convolutional neural network
A convolutional neural network (CNN) is a type of feedforward neural network that is not fully connected. In particular, a two-dimensional convolutional neural network resembles the connectivity of neurons in the human visual cortex, and it is expected to learn in a way similar to human perception. Because its connections are sparse, it trains faster than a fully connected neural network.
The CNN developed from the neocognitron announced by Kunihiko Fukushima in 1979. It was used by Toshiteru Homma et al. in 1988 for phoneme recognition and by Yann LeCun et al. in 1989 for recognizing character images. LeNet-5 followed, announced by LeCun et al. in 1998, and AlexNet, which won the object category recognition task at ILSVRC in 2012, is also a deep convolutional neural network. These networks have been deep since the neocognitron, but in recent years "deep" is sometimes prefixed, as in "deep convolutional neural network", to emphasize the depth. Applications to natural language processing have also begun.
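The sparse, shared connectivity described above can be illustrated with a minimal sketch in plain Python. The function name `conv2d_valid`, the toy image, and the kernel are illustrative assumptions; like most deep learning libraries, the sketch computes cross-correlation (the kernel is not flipped) with no padding.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image; each output depends only on a small window."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Toy 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]
# 2x2 kernel that responds to a left-to-right increase in intensity.
kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d_valid(image, kernel)

# The same 4 weights are reused at all 9 output positions, whereas a fully
# connected layer from 16 inputs to 9 outputs would need 16 * 9 weights.
shared_weights = len(kernel) * len(kernel[0])
dense_weights = 16 * 9
print(feature_map, shared_weights, dense_weights)
```

The feature map is non-zero only in the column where the window straddles the edge, which shows how one small kernel acts as a position-independent feature detector.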
In a stacked autoencoder, the first three layers are trained as an autoencoder; once that training is complete, the next layer (the fourth) is trained as an autoencoder. This is repeated as many times as necessary, and finally all layers are trained together. This is also called pre-training. Similar techniques include the deep belief network and the deep Boltzmann machine.
A residual network learns the residual instead of the transformation that turns the input data into the output. Its gradients are less prone to vanishing than those of an ordinary multi-layer neural network, so it can be built with many layers; networks of up to 1,000 layers have been trained experimentally. A drawback is that the number of input dimensions and the number of output dimensions cannot be changed.
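The idea of learning a residual can be sketched as a block that adds its input back to the output of a small inner transformation. This is a minimal illustration in plain Python; the function names and the two-matrix form of the inner transformation are assumptions made for the example, not details from the source.

```python
def relu(v):
    """Element-wise rectifier."""
    return [max(0.0, x) for x in v]

def matvec(matrix, v):
    """Matrix-vector product for lists of lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in matrix]

def residual_block(x, w1, w2):
    """Compute y = x + F(x) with F(x) = w2 * relu(w1 * x)."""
    f = matvec(w2, relu(matvec(w1, x)))
    if len(f) != len(x):
        # The identity shortcut requires matching input and output sizes.
        raise ValueError("residual and input must have the same dimension")
    return [xi + fi for xi, fi in zip(x, f)]

identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [1.0, -2.0]

# With a zero residual the block is exactly the identity mapping.
y_zero = residual_block(x, identity, zeros)
# With identity weights the residual is relu(x), added on top of x.
y_relu = residual_block(x, identity, identity)
print(y_zero, y_relu)
```

Because the shortcut passes `x` through unchanged, the block defaults to the identity when the residual is small, which is one common explanation for why gradients survive many stacked blocks. The dimension check mirrors the drawback noted above.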
Generative adversarial network
A generative adversarial network (GAN) is a network model in which two networks are trained with opposing objectives. The discriminator acts as the loss function. Minimizing squared error and similar losses assumes a distribution with a single peak, but because the discriminator is a neural network, it can approximate probability distributions with multiple peaks and so handle more general probability distributions.
MLP-Mixer, unlike conventional neural networks, is an image recognition model built only from plain multi-layer perceptrons, which were not originally expected to be used for deep learning. The image is divided into a large number of patches, and accuracy is greatly improved by providing a layer whose parameters are shared across patches and a layer that transforms information between patches. A drawback is that only images of a fixed size can be input.
A Boltzmann machine is a kind of Hopfield network that uses statistical fluctuations.
Restricted Boltzmann machine
A Boltzmann machine that has no connections within the same layer.
Recurrent neural network
A recurrent neural network (RNN) is a neural network with directed cycles. It holds a state that changes according to the inputs received so far (like an automaton). This is effective when the output depends on the order of the input data, as with video, audio, and language. Furthermore, whereas in a feedforward neural network the number of peaks that can be approximated depends on the number of units in the hidden layer, a recurrent neural network can approximate functions with unbounded periodicity.
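The role of the carried state can be shown with a minimal scalar sketch in plain Python. The update rule h = tanh(w_x * x + w_h * h + b) is the common Elman-style form, and the weights are arbitrary illustrative values, not taken from the source.

```python
import math

def rnn_step(x, h, w_x, w_h, b):
    """One recurrent update: the new state depends on the input and the old state."""
    return math.tanh(w_x * x + w_h * h + b)

def run_rnn(inputs, w_x=1.0, w_h=0.5, b=0.0):
    """Feed a sequence through the cell and return the final state."""
    h = 0.0
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
    return h

# The same two inputs in a different order lead to different final states.
state_a = run_rnn([1.0, 0.0])
state_b = run_rnn([0.0, 1.0])
print(state_a, state_b)
```

A feedforward network that summed the two inputs would give the same result for both orders; the feedback through `w_h` is what makes the output order-dependent.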
Research began in the 1980s; the Hopfield network, published in 1982, is an early example. The Elman network and the Jordan network were announced afterwards, and in 1997 S. Hochreiter and J. Schmidhuber announced the LSTM (long short-term memory) network.
Vanishing gradient problem
Stochastic gradient descent calculates the gradient from the error and corrects the weights of the hidden layers, but as is immediately apparent for functions such as the sigmoid, there are regions where the gradient is close to 0. If training happens to enter such a region, the gradient becomes nearly 0 and the weights are hardly corrected. In a multi-layer neural network, if even one layer has a gradient close to 0, the gradients of all layers below it also become close to 0, so, probabilistically, training becomes more difficult as the number of layers increases. For details, see also backpropagation and activation function.
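The shrinking of the gradient with depth can be made concrete with a small sketch in plain Python. It assumes, for illustration only, a chain of sigmoid units with all weights equal to 1, so that the backpropagated factor is just the product of the sigmoid derivatives; the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid; its maximum is 0.25, reached at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)

def backprop_factor(pre_activations):
    """Product of per-layer sigmoid derivatives along one path (weights assumed 1)."""
    factor = 1.0
    for z in pre_activations:
        factor *= sigmoid_grad(z)
    return factor

# Best case: every unit sits at z = 0, yet the factor still shrinks as 0.25**depth.
for depth in (1, 5, 10, 20):
    print(depth, backprop_factor([0.0] * depth))

# One saturated layer (z = 10) is enough to make the whole product tiny.
print(backprop_factor([0.0, 10.0, 0.0]))
```

Even in the most favorable case each sigmoid layer multiplies the gradient by at most 0.25, so the factor decays geometrically with the number of layers.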
Overfitting is a phenomenon in which the classification rate on the test data is low even though a high classification rate is achieved on the training data. See also the entry on overfitting.
Trapping in local optima
Learning converges to a locally optimal solution rather than the globally optimal solution and cannot escape from it.
Techniques
Data augmentation is widely used outside deep learning as well. If some assumptions (a model) can be made in advance about what kind of test data will arrive, as with images, it has long been common practice to increase the amount of input data, for example by rotating or stretching the images.
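The practice described above, generating extra training inputs by transforming existing images, can be sketched in plain Python. Images are represented as lists of rows; the function names and the choice of a horizontal flip and a 90-degree rotation are illustrative assumptions.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]

def rotate90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]

def augment(image):
    """Return the original image plus simple transformed copies."""
    rotated = rotate90_clockwise(image)
    return [image, flip_horizontal(image), rotated, flip_horizontal(rotated)]

image = [[1, 2],
         [3, 4]]
augmented = augment(image)
for variant in augmented:
    print(variant)
```

Each variant keeps the label of the original image, which is only valid when the assumed transformations do not change the class (a flipped digit "6" is no longer a "6", for example).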
The sigmoid function was used in neural networks for a long time, but in recent years other functions have come into use because of the vanishing gradient problem. For details, see activation function.
ReLU (rectified linear unit): because the output is not normalized to the range 0.0-1.0, the vanishing gradient problem is less likely to occur, and because it is simpler than the sigmoid function, it requires less computation and training proceeds quickly.
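Assuming the function described above is the rectified linear unit (ReLU), a minimal sketch in plain Python shows both properties: the output is not squashed into a fixed range, and the derivative stays at 1 for positive inputs while the sigmoid derivative collapses. The function names are illustrative.

```python
import math

def relu(z):
    """Rectified linear unit: passes positive values, clips negatives to 0."""
    return max(0.0, z)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

def sigmoid_grad(z):
    """Derivative of the sigmoid, shown for comparison."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

for z in (0.5, 5.0, 50.0):
    print(z, relu(z), relu_grad(z), sigmoid_grad(z))
```

ReLU needs only a comparison, while the sigmoid needs an exponential, which is the source of the lower computational cost mentioned above.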
Maxout is a function that outputs the maximum value over multiple dimensions. As long as any one of the input values is large, the vanishing gradient problem is extremely unlikely to occur. The calculation is the same as pooling in a CNN. It is said to perform well, but by its nature it reduces the number of dimensions, so it can also be said to serve as feature selection.
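The maximum-over-dimensions operation can be sketched in a few lines of plain Python. The sketch assumes the inputs are already the outputs of linear units and that consecutive values form a group; the function name and group size are illustrative.

```python
def maxout_groups(values, group_size):
    """Keep only the largest value in each consecutive group of inputs."""
    if len(values) % group_size != 0:
        raise ValueError("number of inputs must be a multiple of group_size")
    return [max(values[i:i + group_size])
            for i in range(0, len(values), group_size)]

pre_activations = [0.1, -2.0, 3.0, 0.5, -1.0, -0.2]
outputs = maxout_groups(pre_activations, 2)
print(outputs)
```

Six inputs become three outputs, which is the dimension reduction noted above; the gradient flows only through the unit that attained the maximum in each group.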
Dropout is a technique that randomly ignores a certain proportion of neurons (dimensions). Even without increasing the input data, reducing the dimensions in this way can raise the significance of the solution. At test time, the learning results obtained with dropout are used simultaneously and their outputs are averaged. This is because, as with a random forest, even classifiers with a low detection rate can achieve a higher rate when combined in parallel.
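A minimal sketch of dropout in plain Python follows. It uses the "inverted dropout" variant, which is an assumption of this example rather than something stated in the source: surviving activations are scaled up during training so that the full network can be used unchanged at test time as an approximation of averaging over the thinned networks.

```python
import random

def dropout(activations, drop_prob, rng, training=True):
    """Randomly zero activations during training; pass them through at test time."""
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    # Scale survivors by 1 / keep_prob so the expected value is unchanged.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

rng = random.Random(0)  # fixed seed so the run is repeatable
train_out = dropout([1.0] * 1000, 0.5, rng)
test_out = dropout([1.0, 2.0], 0.5, rng, training=False)

print(sum(train_out) / len(train_out))  # close to 1.0 on average
print(test_out)
```

With a drop probability of 0.5, each training pass uses a different random half of the units, so many thinned networks share one set of weights.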
Sparse coding is also called Lasso regression. When the input data (a column vector) is approximated by the product (linear combination) of a dictionary matrix and a coefficient matrix, the coefficient matrix is made a sparse matrix (a matrix with only a few non-zero elements). This is L1 regularization.
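How an L1 penalty produces exact zeros can be illustrated with the soft-thresholding operator, the proximal step used by common Lasso solvers such as ISTA and coordinate descent. This is a sketch in plain Python; the function names, the sample coefficients, and the threshold value are illustrative assumptions, and the sketch is not a full sparse coding solver.

```python
def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero; values within the threshold become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def sparsify(coefficients, threshold):
    """Apply soft-thresholding to every coefficient."""
    return [soft_threshold(c, threshold) for c in coefficients]

coefficients = [3.0, -0.5, 0.2, -2.0]
sparse = sparsify(coefficients, 1.0)
nonzero = sum(1 for c in sparse if c != 0.0)
print(sparse, nonzero)
```

Small coefficients are set to exactly zero rather than merely made small, which is why L1 regularization yields the sparse coefficient matrix described above.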
Batch normalization: when training in batches, a batch normalization layer is inserted to whiten the data (normalize the input data to mean 0 and variance 1). It was formerly thought that training proceeds efficiently because this suppresses internal covariate shift, but it is now thought that the effect is not due to internal covariate shift alone.
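The normalization to mean 0 and variance 1 can be sketched for a single feature in plain Python. The scale `gamma`, shift `beta`, and stabilizer `eps` follow the usual formulation of batch normalization and are assumptions of this example; the sketch standardizes each feature independently and does not decorrelate features, so it is not full whitening.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

batch = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(batch)

out_mean = sum(normalized) / len(normalized)
out_var = sum((x - out_mean) ** 2 for x in normalized) / len(normalized)
print(normalized, out_mean, out_var)
```

At inference time, implementations typically replace the batch statistics with running averages collected during training; that step is omitted here.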
Mini batch method
- Caffe - Python, C++
- Torch - Lua
- Theano - Python. A functional language. Specialized for parallelization; GPU code is generated automatically.
- Keras - Python. A wrapper for TensorFlow. It can also work with Theano.
- deepy - Python
- cuDNN - A CUDA-based (GPU-based) library of DNN primitives provided by NVIDIA.
- Deeplearning4j - Java; Scala is also used.
- EBlearn - A library for CNNs written in C++.
- cuda-convnet - A C++/CUDA implementation of CNNs. Its basic functionality is the same as EBlearn's.
- Chainer - Python
- TensorFlow - Python, C++
- Microsoft Cognitive Toolkit - Python, C++, C#. Formerly called CNTK.
- DyNet - Python, C++
- ^ a b The definition of deep learning in the fourth paragraph (pp.4-1) of the introduction to the textbook "Deep Learning" by Ian Goodfellow, Yoshua Bengio, and Aaron Courville, who are world-renowned authorities on deep learning, makes no mention of neural networks. Instead it is defined in terms of a hierarchy of concepts: "The hierarchy of concepts enables the computer to learn complicated concepts by building them out of simpler ones. If we draw a graph showing how these concepts are built on top of each other, the graph is deep, with many layers. For this reason, we call this approach to AI deep learning."
- ^ A network with two layers is a simple perceptron, and one with three layers is a hierarchical neural network. Hierarchical neural networks deeper than these are called deep (hierarchical) neural networks.
- ^ a b In academia, it is accepted that any implementation method may be used as long as it is useful as artificial intelligence. Academia is therefore not simply aiming to reproduce the human brain in a computer. Neural networks, too, merely began from ideas about the structure of the network of nerves in the human brain; since then, apart from some research, the theory has continued to develop in various directions independently of the human brain.
- ^ A network with two layers is a simple perceptron, and one with three layers is a hierarchical neural network. Hierarchical neural networks deeper than these are called deep (hierarchical) neural networks.
- ^ A technique called the stacked autoencoder.