
AI Weekly | October 16, 2021


This week, Microsoft and Nvidia announced that they trained what they claim is one of the largest and most capable AI language models to date: Megatron-Turing Natural Language Generation (MT-NLG). MT-NLG contains 530 billion parameters — the parts of the model learned from historical data — and achieves leading accuracy in a broad set of tasks, including reading comprehension and natural language inference.

But building it didn’t come cheap. Training took place across 560 Nvidia DGX A100 servers, each containing eight Nvidia A100 80GB GPUs. Experts peg the cost in the millions of dollars.

Like other large AI systems, MT-NLG raises questions about the accessibility of cutting-edge research approaches in machine learning. AI training costs dropped 100-fold between 2017 and 2019, but the totals still exceed the compute budgets of most startups, governments, nonprofits, and colleges. The inequity favors corporations and world superpowers with extraordinary access to resources at the expense of smaller players, cementing incumbent advantages.

For example, in early October, researchers at Alibaba detailed M6-10T, a language model containing 10 trillion parameters (roughly 57 times the size of OpenAI’s GPT-3) trained across 512 Nvidia V100 GPUs for 10 days. The cheapest V100 plan available through Google Cloud Platform costs $2.28 per hour, which works out to roughly $300,000 for the full run.
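
As a sanity check on that figure, here is a back-of-envelope calculation in Python. The GPU count, duration, and hourly rate are the ones quoted above; everything else (no committed-use discounts, no storage or networking costs) is a simplifying assumption:

```python
# Back-of-envelope compute cost for M6-10T, using the figures cited above.
gpus = 512               # Nvidia V100s used for training
days = 10                # reported training duration
usd_per_gpu_hour = 2.28  # cheapest V100 plan cited for Google Cloud Platform

total_gpu_hours = gpus * days * 24
cost = total_gpu_hours * usd_per_gpu_hour
print(f"{total_gpu_hours:,} GPU-hours ~= ${cost:,.0f}")
# 122,880 GPU-hours ~= $280,166 -- on the order of the $300,000 cited above,
# before storage, networking, and failed-run overhead.
```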

Google subsidiary DeepMind is estimated to have spent $35 million training a system to learn the Chinese board game Go. And when the company’s researchers designed a model to play StarCraft II, they purposefully didn’t try multiple ways of architecting a key component because the training cost would have been too high. Similarly, OpenAI didn’t fix a mistake when it implemented GPT-3 because the cost of training made retraining the model infeasible.

Paths forward

It’s important to keep in mind that training costs can be inflated by factors other than an algorithm’s technical aspects. As Yoav Shoham, Stanford University professor emeritus and cofounder of AI startup AI21 Labs, recently told Synced, personal and organizational considerations often contribute to a model’s final price tag.

“[A] researcher might be impatient to wait three weeks to do a thorough analysis and their organization may not be able or wish to pay for it,” he said. “So for the same task, one could spend $100,000 or $1 million.”

Still, the increasing cost of training — and storing — algorithms like Huawei’s PanGu-Alpha, Naver’s HyperCLOVA, and the Beijing Academy of Artificial Intelligence’s Wu Dao 2.0 is giving rise to a cottage industry of startups aiming to “optimize” models without degrading accuracy. This week, former Intel exec Naveen Rao launched a new company, MosaicML, to offer tools, services, and training methods that improve AI system accuracy while lowering costs and saving time. MosaicML — which has raised $37 million in venture capital — competes with Codeplay Software, OctoML, Neural Magic, Deci, CoCoPie, and NeuReality in a market that’s expected to grow exponentially in the coming years.

In a sliver of good news, the cost of basic machine learning operations has been falling over the past few years. A 2020 OpenAI survey found that since 2012, the amount of compute needed to train a model to the same performance on classifying images in a popular benchmark — ImageNet — has been decreasing by a factor of two every 16 months.
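
A halving every 16 months compounds quickly. A minimal sketch of what that rate implies (the 16-month period is the survey’s figure; the eight-year window is illustrative):

```python
# Factor by which the compute needed for a fixed level of ImageNet accuracy
# shrinks, if requirements halve every 16 months (per the survey above).
def efficiency_gain(months: float, halving_period_months: float = 16.0) -> float:
    return 2.0 ** (months / halving_period_months)

print(efficiency_gain(96))  # 2012 -> 2020 is 96 months: a 64x reduction
```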

Approaches like network pruning prior to training could lead to further gains. Research has shown that parameters pruned after training, a process that decreases the model size, could have been pruned before training without any effect on the network’s ability to learn. Called the “lottery ticket hypothesis,” the idea is that the initial values parameters in a model receive are crucial for determining whether they’re important. Parameters kept after pruning receive “lucky” initial values; the network can train successfully with only those parameters present.
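
A minimal PyTorch sketch of that recipe, i.e. magnitude pruning with the surviving weights rewound to their initial values. The tiny model, random training data, and 90% sparsity level here are illustrative stand-ins, not the setup from the original lottery-ticket experiments:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))

# 1. Snapshot the initial parameter values before any training happens.
init_state = {k: v.clone() for k, v in model.state_dict().items()}

def train(model, steps=200):
    # Stand-in training loop on random data; real runs use MNIST/CIFAR etc.
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(steps):
        x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

# 2. Train the dense network.
train(model)

# 3. Build masks keeping only the largest-magnitude 10% of each weight matrix.
masks = {}
for name, p in model.named_parameters():
    if p.dim() > 1:  # prune weight matrices, not biases
        cutoff = p.abs().flatten().kthvalue(int(0.9 * p.numel())).values
        masks[name] = (p.abs() > cutoff).float()

# 4. Rewind the surviving weights to their *initial* values -- the crux of
#    the lottery ticket hypothesis -- and retrain the sparse subnetwork.
model.load_state_dict(init_state)
with torch.no_grad():
    for name, p in model.named_parameters():
        if name in masks:
            p.mul_(masks[name])
train(model)  # a faithful run re-applies the masks after every optimizer step
```

Note that the mask this produces is unstructured: the surviving weights are scattered through each matrix, which is exactly the hardware mismatch described in the next paragraph, since dense GPU kernels gain little from scattered zeros.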

Network pruning is far from a solved science, however. New ways of pruning that work before or in early training will have to be developed, as most current methods apply only retroactively. And when parameters are pruned, the resulting structures aren’t always a fit for the training hardware (e.g., GPUs), meaning that pruning 90% of parameters won’t necessarily reduce the cost of training a model by 90%.

Whether through pruning, novel AI accelerator hardware, or techniques like meta-learning and neural architecture search, the need for alternatives to unattainably large models is quickly becoming clear. A University of Massachusetts Amherst study showed that, using 2019-era approaches, training an image recognition model with a 5% error rate would cost $100 billion and produce as much carbon emissions as New York City does in a month. As IEEE Spectrum’s editorial team wrote in a recent piece: “we must either adapt how we do deep learning or face a future of much slower progress.”

Microsoft and Nvidia team up to train one of the world's largest language models

Microsoft and Nvidia claim to have trained one of the world’s largest natural language models, containing 530 billion parameters.

AI technology could reshape the U.S. government, but should it?

Federal spending on AI in the U.S. rose by 50% between 2018 and 2020, the fastest growth rate of any emerging technology.


AI lab DeepMind becomes profitable and bolsters relationship with Google

While this could be great news for DeepMind, which has always hemorrhaged money, the AI lab’s financial reports are also notably vague.

Facebook quietly acquires synthetic data startup AI.Reverie

Facebook has quietly acquired AI.Reverie, a startup that developed a platform and tools for synthetic data generation.

DeepMind is developing one algorithm to rule them all

Deep learning powers some of the most iconic AI apps, but deep learning models need retraining to be applied in new domains.

Google delivers collection of smart device 'essentials' for the enterprise

Google has announced Intelligent Product Essentials, an array of components for managing and leveraging data from smart devices.

Facebook introduces dataset and benchmarks to make AI more 'egocentric'

Facebook’s latest long-term research project, Ego4D, focuses on developing AI with an ‘egocentric,’ first-person perspective.

Americans Need a Bill of Rights for an AI-Powered World

The White House Office of Science and Technology Policy is developing principles to guard against powerful technologies—with input from the public. (via WIRED)

China has won AI battle with U.S., Pentagon's ex-software chief says

China has won the artificial intelligence battle with the United States and is heading towards global dominance because of its technological advances, the Pentagon’s former software chief told the Financial Times. (via Reuters)

These neural networks know what they’re doing

MIT researchers have demonstrated that a special class of deep learning neural networks is able to learn the true cause-and-effect structure of a navigation task during training. (via MIT News)

Duke Professor Wins $1 Million Artificial Intelligence Prize, A ‘New Nobel’

Cynthia Rudin becomes second recipient of AAAI Squirrel AI Award for pioneering socially responsible AI (via Duke Pratt School of Engineering)
