One evening in 2014, Ian Goodfellow was out drinking to celebrate with a friend who had just finished his PhD. At a bar in Montreal, some friends asked for his help with a thorny project: getting a computer to generate images on its own.
Researchers were already using neural networks (algorithms loosely modeled on the brain's networks of neurons) as generative models to produce plausible new data, but the results were often disappointing: computer-generated faces tended to be blurry, or to come out missing an ear or a nose.
The plan Goodfellow's friends proposed was to run a complex statistical analysis of the elements that make up an image, so that the machine could use it to generate pictures itself. That would have required a massive amount of computation, and Goodfellow told them it simply would not work.
But as he pondered the problem over his beer, an idea struck him: what would happen if two neural networks were pitted against each other? His friends were skeptical.
When he got home, his girlfriend was already asleep, and he decided to try the idea immediately. He coded until the small hours and then ran the test. It worked on the first try.
The method he devised that night is now known as the generative adversarial network, or GAN. By pitting two neural networks against each other, Goodfellow had created a powerful AI tool. GANs have since had an enormous impact on machine learning, and made their creator a prominent figure in the AI world.
The story of the GAN's birth is by now well known in tech circles, but Goodfellow may not be the only person to have hit on this elegant adversarial idea.
Jürgen Schmidhuber, another leading figure in machine learning, has claimed that he did similar work years earlier.
There was a related dispute at NIPS 2016; the review record of the original GAN paper is here:
https://media.nips.cc/nipsbooks/nipspapers/paper_files/nips27/reviews/1384.html
And today a blog post from 2010 set off a lively discussion on Reddit. The post is very short, yet it states the basic idea of GANs precisely, and the figure it includes even lays out how such a system would be wired together.
https://web.archive.org/web/20120312111546/http://yehar.com:80/blog/?p=167
The thread drew a great deal of discussion, and many found the story a pity: if the author had taken his own idea more seriously, they wrote, "he might have been the one to change the world."
Others countered that while having the idea matters, it only counts if you actually carry it out, and that the hardware of 2010 might not have supported the applications that later made GANs famous. One commenter even reached for Columbus: "Columbus may have been the first to discover the New World, but plenty of people before him must have predicted that there might be some islands out in the Atlantic."
In fact, the post's author, Olli Niemitalo, an electrical engineer from Finland, took it all in better stride than the onlookers. In a 2017 post he described how he felt on first encountering GANs: "In May 2017 I saw Ian Goodfellow's tutorial on YouTube. It made my day! What I had written down was only the basic idea, and a lot of work has been done since to make it work well. The talk answered questions I had run into, and many more."
Olli's homepage shows an active mind fond of floating new ideas. Since 2007 his blog has recorded a long list of them, from "bicycle brakes that sing" to "a watch that never lets you be late", and, among them, this embryo of the GAN.
As Goodfellow himself put it: "Your idea is only really valuable if you have an idea you believe will work and the domain knowledge to recognize that it actually does. Coming up with GANs took me about an hour, and writing the paper took two weeks. It is very much a '99% inspiration, 1% perspiration' story, but before that I had spent four years of my PhD working on related topics."
Finally, here is the short write-up of the GAN idea, posted three years before Goodfellow's:
A method for training artificial neural networks to generate missing data within a variable context. As the idea is hard to put in a single sentence, I will use an example:
An image may have missing pixels (let's say, under a smudge). How can one restore the missing pixels, knowing only the surrounding pixels? One approach would be a "generator" neural network that, given the surrounding pixels as input, generates the missing pixels.
But how to train such a network? One can't expect the network to exactly produce the missing pixels. Imagine, for example, that the missing data is a patch of grass. One could teach the network with a bunch of images of lawns, with portions removed. The teacher knows the data that is missing, and could score the network according to the root mean square difference (RMSD) between the generated patch of grass and the original data. The problem is that if the generator encounters an image that is not part of the training set, it would be impossible for the neural network to put all the leaves, especially in the middle of the patch, in exactly the right places. The lowest RMSD error would probably be achieved by the network filling the middle area of the patch with a solid color that is the average of the color of pixels in typical images of grass. If the network tried to generate grass that looks convincing to a human and as such fulfills its purpose, there would be an unfortunate penalty by the RMSD metric.
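The RMSD failure mode described above is easy to demonstrate numerically. In this editorial toy (not from the post), the "grass texture" is just random noise: a fill that is statistically as realistic as the truth scores worse under RMSD than a flat, average-colored fill, because its individual "leaves" land in the wrong places.

```python
import numpy as np

# Toy illustration: why RMSD rewards blur over realism.
# The true missing patch is one draw of a noisy "grass" texture; a
# "convincing" fill is another independent draw of the same texture;
# the "blurry" fill is a solid patch of the average color.
rng = np.random.default_rng(42)
true_patch = rng.normal(0.5, 0.2, size=(8, 8))   # the missing pixels
convincing = rng.normal(0.5, 0.2, size=(8, 8))   # realistic but misaligned
blurry = np.full((8, 8), 0.5)                    # solid average color

def rmsd(a, b):
    """Root mean square difference between two patches."""
    return np.sqrt(np.mean((a - b) ** 2))

# The realistic fill is penalized more than the flat fill
# (in expectation, by a factor of sqrt(2)).
print(rmsd(convincing, true_patch))
print(rmsd(blurry, true_patch))
```

So a generator trained purely on RMSD is pushed toward the average-colored fill, exactly the "unfortunate penalty" the post complains about.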
My idea is this (see figure below): Train simultaneously with the generator a classifier network that is given, in random or alternating sequence, generated and original data. The classifier then has to guess, in the context of the surrounding image context, whether the input is original (1) or generated (0). The generator network is simultaneously trying to get a high score (1) from the classifier. The outcome, hopefully, is that both networks start out really simple, and progress towards generating and recognizing more and more advanced features, approaching and possibly defeating human's ability to discern between the generated data and the original. If multiple training samples are considered for each score, then RMSD is the correct error metric to use, as this will encourage the classifier network to output probabilities.
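The scheme in the post can be sketched as a toy one-dimensional GAN. This is an editorial illustration, not code from the post: the "original data" is a Gaussian, the generator is a linear map of noise, the classifier is a single logistic unit, and the two are trained with the opposing objectives described above (using the non-saturating generator loss from Goodfellow's 2014 paper rather than a raw score).

```python
import numpy as np

# Minimal 1-D GAN sketch. Real data ~ N(4, 1); generator g(z) = a*z + b
# with z ~ N(0, 1); classifier D(x) = sigmoid(w*x + c) guesses
# original (1) vs generated (0).
rng = np.random.default_rng(0)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

a, b = 1.0, 0.0   # generator parameters
w, c = 0.0, 0.0   # classifier parameters
lr = 0.01

for step in range(20000):
    x = rng.normal(4.0, 1.0)   # one original sample
    z = rng.normal(0.0, 1.0)   # latent noise
    g = a * z + b              # one generated sample

    # Classifier step: ascend log D(x) + log(1 - D(g)).
    d_real, d_fake = sigmoid(w * x + c), sigmoid(w * g + c)
    w += lr * ((1 - d_real) * x - d_fake * g)
    c += lr * ((1 - d_real) - d_fake)

    # Generator step: ascend log D(g), i.e. try to get a high score
    # (1) from the classifier.
    d_fake = sigmoid(w * g + c)
    a += lr * (1 - d_fake) * w * z
    b += lr * (1 - d_fake) * w

# After training, generated samples should cluster near the real mean (4).
samples = a * rng.normal(size=1000) + b
```

Both networks "start out really simple" here in the extreme, but the dynamic is the one the post describes: the generator drifts toward the real data until the classifier can no longer tell the two apart. A toy linear model like this, of course, says nothing about the stability problems real GANs had to overcome.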
If you are interested in the story of the GAN's birth, see Big Data Digest's earlier coverage: "Ian Goodfellow, father of GANs: the man who gave machines imagination".
Original title: "I thought of the GAN three years before Goodfellow"
Source: Big Data Digest (WeChat account: BigDataDigest). Please credit the source when reposting.