Compared to other fields, machine learning / artificial intelligence seems to produce exciting developments at a much higher frequency these days, the kind that make you say "wow" or even "what a time to be alive!" (as the creator of Two Minute Papers always says).

Disclaimer: I am not using any strict definition of "exciting" or "breakthrough"; this is an informal list. I may also use loose terminology to make this article more accessible.

Amazingly accurate estimates

Through-wall pose estimation

Website / video by MIT researchers, 2018

Researchers can accurately estimate how a person on the other side of a wall is standing/sitting/walking, purely from the disturbances that the human body causes in WiFi signals.

Measuring the physical properties of materials from video

Article / video by MIT researchers, 2015

In 2014, researchers first showed that they could reconstruct human speech from a silent video of a potato chip bag, based on the bag's tiny vibrations (no machine learning was involved in that part). In 2015, they used machine learning to show that a material's stiffness, elasticity, weight per unit area, and other properties can be estimated from video alone (in some cases, the vibrations caused by ordinary air circulation are enough).

Estimating keystrokes from a smartphone next to the keyboard

Paper, 2015


The researchers show that from audio recorded by a single off-the-shelf smartphone placed next to a keyboard, individual keystrokes can be recovered with 94% accuracy. Unlike previous methods, which used supervised deep learning and many microphones placed around the keyboard, this paper uses a relatively simple machine learning technique (K-means clustering) and unsupervised learning.
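To make the idea concrete, here is a minimal sketch of unsupervised keystroke clustering. It assumes keystroke audio segments have already been isolated by some detector (not shown); this is my simplification, not the paper's actual pipeline:

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_features(segments):
    """Turn each keystroke audio segment into a normalized magnitude spectrum."""
    feats = []
    for seg in segments:
        spectrum = np.abs(np.fft.rfft(seg))
        feats.append(spectrum / (np.linalg.norm(spectrum) + 1e-9))
    return np.array(feats)

segments = [np.random.randn(2048) for _ in range(300)]  # stand-in audio data
X = spectral_features(segments)

# Roughly one cluster per key: presses of the same key sound alike, so they
# should land in the same cluster, with no labels needed.
labels = KMeans(n_clusters=30, n_init=10, random_state=0).fit_predict(X)
# A dictionary / language-model constraint would then map clusters to letters.
```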

Generative models

Realistic face generation

Paper / video by NVIDIA researchers, 2018

[Image: the process of a machine generating faces]

Using a new architecture and a lot of GPUs, the researchers created extremely realistic artificial faces, interpolated between faces, and applied the "style" of one face to another. This work builds on earlier work on generative adversarial networks (GANs). GANs were invented in 2014, and research on them has exploded since then. The most basic concept of a GAN is two dueling neural networks: for example, a first neural network that classifies images as "real" or "fake", and a second neural network that generates images and tries to "fool" the first network into misclassifying its fake images as real, which makes the second network the first one's "adversary".
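As a concrete illustration, here is a minimal GAN training step in PyTorch; this is my toy example of the two-network duel described above, a far cry from NVIDIA's actual model:

```python
import torch
import torch.nn as nn

latent_dim, image_dim = 64, 28 * 28

# Generator: maps a random noise vector to a fake image.
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, image_dim), nn.Tanh())
# Discriminator: maps an image to the probability that it is real.
D = nn.Sequential(nn.Linear(image_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(real_images):
    batch = real_images.size(0)
    real, fake = torch.ones(batch, 1), torch.zeros(batch, 1)

    # 1) Train D to label real images "real" and generated images "fake".
    fakes = G(torch.randn(batch, latent_dim)).detach()
    loss_D = bce(D(real_images), real) + bce(D(fakes), fake)
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # 2) Train G to fool D: G wins when D labels its fakes "real".
    fakes = G(torch.randn(batch, latent_dim))
    loss_G = bce(D(fakes), real)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
```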

More generally, there has been wonderful research on adversarial machine learning for more than ten years, and it has many frightening implications for cybersecurity and so on, but I digress.

Teaching machines to draw

Blog post by Google Brain, 2017

[Images: a machine learning to draw; interpolation between two drawings]

David Ha of Google Brain used a generative recurrent neural network (RNN) to produce vector-based drawings (I think of it as Adobe Illustrator, except automated).
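Here is a toy sketch of the core idea: an RNN that emits a drawing one pen movement at a time. This is my simplified stand-in (the real model uses mixture density outputs and much more), so all names here are my own:

```python
import torch
import torch.nn as nn

class StrokeRNN(nn.Module):
    """Emits pen offsets (dx, dy) plus a pen-down flag, one step at a time."""
    def __init__(self, hidden=128):
        super().__init__()
        self.rnn = nn.LSTM(input_size=3, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 3)

    def forward(self, strokes, state=None):
        out, state = self.rnn(strokes, state)
        return self.head(out), state

model = StrokeRNN()  # untrained here; training would use real stroke sequences

# Sample a drawing by feeding the model's own output back in at each step.
point, state, path = torch.zeros(1, 1, 3), None, []
for _ in range(50):
    pred, state = model(point, state)
    path.append(pred[0, -1].tolist())  # (dx, dy, pen_state) for this step
    point = pred
```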

Teaching machines great dance moves

Website / video by UC Berkeley researchers, 2018

Think "Auto-Tune for dancing." Using pose estimation and generative adversarial training, the researchers can produce a fake video of any real person (the "target" person) dancing with great skill; a rough sketch of the pipeline follows the list below. The only required inputs are:

  • A short video of someone dancing with great skill
  • A few minutes of video of the target person dancing (usually badly, because most people dance badly)
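Here is the promised pipeline sketch; every helper in it (load_video, estimate_pose, PoseToImageGAN) is a hypothetical placeholder for a real component, not the authors' API:

```python
source_frames = load_video("skilled_dancer.mp4")
target_frames = load_video("target_person.mp4")

# 1) Extract stick-figure pose skeletons from both videos.
source_poses = [estimate_pose(f) for f in source_frames]
target_poses = [estimate_pose(f) for f in target_frames]

# 2) Train a GAN to map the target's own pose skeletons back to their
#    appearance, so it learns what the target looks like in any pose.
gan = PoseToImageGAN()
gan.train(inputs=target_poses, outputs=target_frames)

# 3) Drive the trained GAN with the skilled dancer's poses instead.
fake_video = [gan.generate(p) for p in source_poses]
```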

I even saw a video of NVIDIA CEO Jensen Huang dancing like Michael Jackson (made with this technique) at the GPU Technology Conference. I was very happy to have attended, haha.

Reinforcement learning

World Models - an AI learns in its own dreams

Website by Google Brain, 2018

[Image: never stop machine learning]

Humans do not actually understand or think about every detail of the world we live in. Our actions are based on an abstraction of the world inside our heads. For example, when I ride a bicycle, I do not think about the bicycle's gears/nuts/bolts; I just have a rough idea of where the wheels, seat, and handlebars are, and of how to interact with them. Why not use a similar approach for AI?

This "world model" approach (created again by David Ha et al.) allows "agents" (for example, controlling the car's AI in a racing game) to create a generation model around its world/environment, which is a simplification / Abstract real environment. Therefore, you can think of the world model as a dream in the field of artificial intelligence. Then AI can train in this “dream” through reinforcement learning to get better performance. Therefore, this approach actually combines generative ML with reinforcement learning. By doing so, researchers can achieve state-of-the-art performance in certain video game tasks.

[Update 2019/2/15] Building on the "world models" approach above, Google has just released PlaNet: A Deep Planning Network for Reinforcement Learning, which is 5000% more data-efficient than previous methods.

AlphaStar - a StarCraft II machine defeats top professional players

Blog post / e-sports-style video by DeepMind (Google), 2019

We have come a long way since the historic Go match between Lee Sedol and DeepMind's AlphaGo, which shocked the world just three years ago in 2016 (watch the Netflix documentary about it, which made some people cry). Then, in 2017, AlphaZero surpassed AlphaGo at Go without using any training data from human games (and also surpassed every other algorithm at chess, shogi, and more). But 2019's AlphaStar is even more amazing.

Having been a StarCraft fan since 1998, I can appreciate "...the need to balance short and long-term goals and adapt to unexpected situations... [which] poses a huge challenge." This really is a difficult, complex game that has to be understood and played on multiple levels. Research on StarCraft algorithms has been ongoing since 2009.

AlphaStar essentially uses a combination of supervised learning (from human matches) and reinforcement learning (playing against itself) to achieve its results.
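A very rough sketch of that two-stage recipe, to show how the pieces fit together; every name here is a hypothetical stand-in, not DeepMind's actual code:

```python
policy = PolicyNetwork()

# Stage 1: supervised learning, imitating actions from human replays.
for state, human_action in load_human_replays():
    policy.update_supervised(state, human_action)

# Stage 2: reinforcement learning, a league of agents playing against itself.
league = [policy.clone() for _ in range(8)]
for _ in range(10000):
    agent, opponent = sample_matchup(league)
    outcome = play_game(agent, opponent)  # +1 for a win, -1 for a loss
    agent.update_reinforcement(outcome)
    league.append(agent.clone())          # old versions stay on as opponents
```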

Humans training robots

Teaching machines tasks through a single human demonstration

Article / video by NVIDIA researchers, 2018

I can think of three typical ways to teach a robot to do something, all of which cost a lot of time/labor:

  • Manually programming the robot's joint rotations, etc., for each individual situation
  • Letting the robot attempt the task many times (reinforcement learning)
  • Demonstrating the task to the robot many times

A common criticism of deep learning is that generating the millions of examples (data points) a computer needs to perform well is very expensive. However, more and more methods are appearing that do not rely on such expensive data.

The researchers found a way for a robotic arm to successfully perform a task (such as "pick up the blocks and stack them in this order: red, blue, orange") from a single video of a single human demonstration (a real, physical hand moving the blocks), even when the video is shot from a different angle. The algorithm actually generates a human-readable description of the task it plans to perform, which is very useful for troubleshooting. The algorithm relies on object detection with pose estimation, synthetic training data generation, and simulation-to-reality transfer.
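A toy illustration of the "human-readable plan" idea; load_video and detect_objects_with_poses are hypothetical placeholders, not NVIDIA's system:

```python
frames = load_video("human_demo.mp4")
final_scene = detect_objects_with_poses(frames[-1])  # e.g. {"red": (x, y, z)}

# Sort blocks bottom-to-top to recover the demonstrated stacking order.
order = sorted(final_scene, key=lambda block: final_scene[block][2])

# Emit a human-readable plan, step by step; useful for troubleshooting.
plan = [f"place {block} on the table" if i == 0
        else f"place {block} on top of {order[i - 1]}"
        for i, block in enumerate(order)]
print(plan)
# Each step would then be handed to the arm's motion controller to execute.
```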

Unsupervised machine translation

Blog post by Facebook AI Research, 2018

In general, doing machine translation well requires a very large training dataset of translated documents (such as professionally translated United Nations proceedings), i.e., supervised learning. Of course, for many topics and language pairs there is no high-quality, abundant training data. In this work, the researchers showed that with unsupervised learning (i.e., using no translation data at all, only unrelated text corpora in each language), they can match the translation quality of state-of-the-art supervised learning methods. Wow.

The basic idea is that, in any language, certain words/concepts tend to occur near each other (such as "furry" and "cat"). The researchers describe this by saying that "word embeddings in different languages share similar neighborhood structure." I mean, I get it, but it is still astonishing that with this approach they can reach such high translation quality without training on any translation dataset.
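To illustrate the intuition, here is a minimal sketch that aligns two embedding spaces with an orthogonal (Procrustes) mapping. For simplicity I assume some known word pairs; the actual unsupervised method gets by without any, so treat this as a simplified cousin of the real approach:

```python
import numpy as np

def procrustes_align(X_src, Y_tgt):
    """X_src, Y_tgt: (n, d) embeddings of word pairs assumed to correspond.
    Returns the orthogonal map W such that W @ x_src is close to y_tgt."""
    U, _, Vt = np.linalg.svd(Y_tgt.T @ X_src)
    return U @ Vt

# Toy data: 100 "word pairs" in 300-dimensional embedding spaces, where the
# target space is a hidden rotation of the source space.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 300))
true_W, _ = np.linalg.qr(rng.normal(size=(300, 300)))
Y = X @ true_W.T

W = procrustes_align(X, Y)  # recovers (approximately) true_W
# "Translate" a word: map its source embedding into the target space, then
# look up the nearest target-language word.
translated = X[0] @ W.T  # close to Y[0]
```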

This article is from Noteworthy. Original address (accessing it may require a VPN).