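DeepMind today released two libraries, Haiku and RLax, both built on JAX.
JAX, developed by Google, is a library for high-performance machine learning research. It combines XLA, a compiler for linear algebra, with Autograd, a library that automatically differentiates native Python and NumPy code.
The two newly released libraries target neural networks and reinforcement learning respectively, and considerably simplify working with JAX.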
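To make the description of JAX concrete, here is a minimal sketch of differentiating and compiling ordinary NumPy-style code. The function name mse and the toy data are hypothetical and chosen only for illustration; jax.grad and jax.jit are standard JAX functions.

```python
import jax
import jax.numpy as jnp


def mse(w, x, y):
    # Ordinary NumPy-style code: a linear model with a squared-error loss.
    return jnp.mean((x @ w - y) ** 2)


# jax.grad differentiates with respect to the first argument (w);
# jax.jit compiles the resulting function with XLA.
grad_mse = jax.jit(jax.grad(mse))

x = jnp.array([[1.0, 2.0], [3.0, 4.0]])
y = jnp.array([1.0, 2.0])
w = jnp.zeros(2)
g = grad_mse(w, x, y)  # gradient of the loss at w = 0
```

Both transformations take a pure function and return a new function, which is the style Haiku and RLax are built around.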
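Haiku is a neural network library built on JAX. It lets users work with a familiar object-oriented programming model while keeping full access to JAX's pure function transformations.
RLax is a library built on top of JAX that provides useful building blocks for implementing reinforcement learning agents.
Amusingly, Reddit users were surprised to find that the name Haiku does not end in "ax".
Of course, other users spoke approvingly of the two libraries:
Without a doubt, this gives JAX a boost.
So let's take a closer look at what Haiku and RLax actually are.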
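Haiku is a neural network library for JAX. It lets users work with a familiar object-oriented programming model while still allowing full access to JAX's pure function transformations.
It provides two core tools: a module abstraction, hk.Module, and a simple function transformation, hk.transform.
An hk.Module is a Python object that holds references to its own parameters, to other modules, and to methods that apply functions to user inputs.
hk.transform converts functions that use these object-oriented modules into pure functions, which is what gives full access to JAX's pure function transformations.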
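There are in fact many neural network libraries for JAX, so what makes Haiku special? Five things.
1. Haiku has been tested at scale by DeepMind researchers
DeepMind has reproduced a number of experiments in Haiku and JAX with relative ease, including large-scale results in image and language processing, generative models, and reinforcement learning.
2. Haiku is a library, not a framework
It is designed to simplify a few specific things, including managing model parameters and other model state. It can be composed with other libraries and works with the rest of JAX.
3. Haiku does not reinvent the wheel
It builds on the programming model and APIs of Sonnet, the neural network library with near-universal adoption at DeepMind. It preserves Sonnet's module-based programming model for state management while retaining access to JAX's function transformations.
4. Moving to Haiku is relatively easy
By design, moving from TensorFlow and Sonnet to JAX and Haiku is relatively easy. Apart from new functions such as hk.transform, Haiku aims to match the API of Sonnet 2.
5. Haiku simplifies JAX
It offers a simple model for working with random numbers: inside a transformed function, hk.next_rng_key() returns a unique RNG key.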
So how do you install Haiku?
Haiku is written in pure Python, but depends on C++ code through JAX.
First, follow the instructions at the link below to install JAX with support for the relevant accelerator.
https://github.com/google/jax#installation
Then a single pip command completes the installation.
$ pip install git+https://github.com/deepmind/haiku
Next is an example of a neural network and a loss function.
import haiku as hk
import jax
import jax.numpy as jnp

def softmax_cross_entropy(logits, labels):
    # Per-example cross-entropy between logits and integer class labels.
    # jax.nn.one_hot is used here; the original snippet called hk.one_hot.
    one_hot = jax.nn.one_hot(labels, logits.shape[-1])
    return -jnp.sum(jax.nn.log_softmax(logits) * one_hot, axis=-1)

def loss_fn(images, labels):
    # A three-layer MLP with 10 output classes.
    model = hk.Sequential([
        hk.Linear(1000),
        jax.nn.relu,
        hk.Linear(100),
        jax.nn.relu,
        hk.Linear(10),
    ])
    logits = model(images)
    return jnp.mean(softmax_cross_entropy(logits, labels))

# Turn the module-based loss into a pair of pure functions (init, apply).
loss_obj = hk.transform(loss_fn)
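RLax is a library built on top of JAX that provides useful building blocks for implementing reinforcement learning agents.
The operations and functions it provides are not complete algorithms, but implementations of mathematical operations specific to reinforcement learning.
Installing RLax is also very simple and takes a single pip command.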
$ pip install git+https://github.com/deepmind/rlax.git
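With JAX's jax.jit function, all RLax code can be compiled for different hardware, such as CPU, GPU, and TPU.
One thing to note about RLax is its naming convention.
Many functions consider policies, actions, rewards, and values at consecutive time steps in order to compute their outputs. In these cases the suffixes _t and _tm1 indicate the step at which each input was generated, for example: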
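q_tm1: the action values in the source state of a transition.
a_tm1: the action selected in the source state.
r_t: the resulting reward collected in the destination state.
q_t: the action values in the destination state.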
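To show how these names read in practice, here is a hand-written one-step Q-learning temporal-difference error that follows the same suffix convention. It is a sketch in plain jax.numpy, not RLax's own implementation. The function name q_learning_td_error and the discount_t argument (the discount applied at the destination step) are assumptions added for illustration.

```python
import jax
import jax.numpy as jnp


@jax.jit  # compiled by XLA for whatever backend is available
def q_learning_td_error(q_tm1, a_tm1, r_t, discount_t, q_t):
    """One-step Q-learning TD error, written by hand for illustration."""
    # Bootstrap target from the best action value in the destination state.
    target = r_t + discount_t * jnp.max(q_t)
    # Compare against the value of the action actually taken in the source state.
    return target - q_tm1[a_tm1]


q_tm1 = jnp.array([1.0, 2.0, 3.0])  # action values in the source state
a_tm1 = 1                            # action selected in the source state
r_t = 0.5                            # reward collected in the destination state
discount_t = 0.9                     # assumed discount at the destination step
q_t = jnp.array([0.0, 4.0, 2.0])     # action values in the destination state

td_error = q_learning_td_error(q_tm1, a_tm1, r_t, discount_t, q_t)
```

With these toy values the target is 0.5 + 0.9 * 4.0 = 4.1 and the chosen action's value is 2.0, so the TD error is 2.1.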
Both Haiku and RLax are open source on GitHub. Interested readers can reach them through the links below.
Links
Haiku:
https://github.com/deepmind/haiku
RLax:
https://github.com/deepmind/rlax
±¾ÎľAIÐÂýÌåÁ¿×Ó루¹«ÖÚºÅID:QbitAI£©ÊÚȨתÔØ£¬×ªÔØÇëÁªÏµ³ö´¦¡£
ÁìȡרÊô 10ÔªÎÞÃż÷ȯ
˽Ïí×îР¼¼Êõ¸É»õ