Transformers 4.43.40 is an important toolkit in the field of natural language processing. It gives developers a rich collection of pretrained models that can be applied to all kinds of text processing tasks. This release supports a large number of models, each with its own strengths and target domains. Below is a directory of every model supported by Transformers 4.43.40, to help you better understand what the toolkit can do and where it can be used.
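Whichever model you pick from the directory below, loading it follows the same pattern through the library's Auto classes. The following is a minimal sketch, assuming the transformers and torch packages are installed; bert-base-uncased is only an illustrative checkpoint name, and any Hub checkpoint for the architecture you choose works the same way:

```python
# Minimal sketch: load a pretrained checkpoint and run a single forward pass.
# Assumes `pip install transformers torch`; "bert-base-uncased" is only an example checkpoint.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Tokenize a sentence and obtain the model's hidden states.
inputs = tokenizer("Transformers provides many pretrained models.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```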
- ALBERT (from Google Research and the Toyota Technological Institute at Chicago) released with the paper ALBERT: A Lite BERT for Self-supervised Learning of Language Representations by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut.
- AltCLIP (from BAAI) released with the paper AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities by Chen, Zhongzhi and Liu, Guang and Zhang, Bo-Wen and Ye, Fulong and Yang, Qinghong and Wu, Ledell.
- Audio Spectrogram Transformer (from MIT) released with the paper AST: Audio Spectrogram Transformer by Yuan Gong, Yu-An Chung, James Glass.
- BART (from Facebook) released with the paper BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer.
- BARThez (from École polytechnique) released with the paper BARThez: a Skilled Pretrained French Sequence-to-Sequence Model by Moussa Kamal Eddine, Antoine J.-P. Tixier, Michalis Vazirgiannis.
- BARTpho (from VinAI Research) released with the paper BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese by Nguyen Luong Tran, Duong Minh Le and Dat Quoc Nguyen.
- BEiT (from Microsoft) released with the paper BEiT: BERT Pre-Training of Image Transformers by Hangbo Bao, Li Dong, Furu Wei.
- BERT (from Google) released with the paper BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.
- BERT For Sequence Generation (from Google) released with the paper Leveraging Pre-trained Checkpoints for Sequence Generation Tasks by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.
- BERTweet (from VinAI Research) released with the paper BERTweet: A pre-trained language model for English Tweets by Dat Quoc Nguyen, Thanh Vu and Anh Tuan Nguyen.
- BigBird-Pegasus (from Google Research) released with the paper Big Bird: Transformers for Longer Sequences by Manzil Zaheer, Guru Guruganesh, Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed.
- BigBird-RoBERTa (from Google Research) released with the paper Big Bird: Transformers for Longer Sequences by Manzil Zaheer, Guru Guruganesh, Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed.
- BioGpt (from Microsoft Research AI4Science) released with the paper BioGPT: generative pre-trained transformer for biomedical text generation and mining by Renqian Luo, Liai Sun, Yingce Xia, Tao Qin, Sheng Zhang, Hoifung Poon and Tie-Yan Liu.
- BiT (from Google AI) released with the paper Big Transfer (BiT): General Visual Representation Learning by Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby.
- Blenderbot (from Facebook) released with the paper Recipes for building an open-domain chatbot by Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston.
- BlenderbotSmall (from Facebook) released with the paper Recipes for building an open-domain chatbot by Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston.
- BLIP (from Salesforce) released with the paper BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation by Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi.
- BLOOM (from the BigScience workshop) released by the BigScience Workshop.
- BORT (from Alexa) released with the paper Optimal Subarchitecture Extraction For BERT by Adrian de Wynter and Daniel J. Perry.
- ByT5 (from Google Research) released with the paper ByT5: Towards a token-free future with pre-trained byte-to-byte models by Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang, Mihir Kale, Adam Roberts, Colin Raffel.
- CamemBERT (from Inria/Facebook/Sorbonne) released with the paper CamemBERT: a Tasty French Language Model by Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suárez, Yoann Dupont, Laurent Romary, Éric Villemonte de la Clergerie, Djamé Seddah and Benoît Sagot.
- CANINE (from Google Research) released with the paper CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation by Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting.
- Chinese-CLIP (from OFA-Sys) released with the paper Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese by An Yang, Junshu Pan, Junyang Lin, Rui Men, Yichang Zhang, Jingren Zhou and Chang Zhou.
- CLIP (from OpenAI) released with the paper Learning Transferable Visual Models From Natural Language Supervision by Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever.
- CLIPSeg (from the University of Göttingen) released with the paper Image Segmentation Using Text and Image Prompts by Timo Lüddecke and Alexander Ecker.
- CodeGen (from Salesforce) released with the paper A Conversational Paradigm for Program Synthesis by Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong.
- Conditional DETR (from Microsoft Research Asia) released with the paper Conditional DETR for Fast Training Convergence by Depu Meng, Xiaokang Chen, Zejia Fan, Gang Zeng, Houqiang Li, Yuhui Yuan, Lei Sun, Jingdong Wang.
- ConvBERT (from YituTech) released with the paper ConvBERT: Improving BERT with Span-based Dynamic Convolution by Zihang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan.
- ConvNeXT (from Facebook AI) released with the paper A ConvNet for the 2020s by Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie.
- ConvNeXTV2 (from Facebook AI) released with the paper ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders by Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, Saining Xie.
- CPM (from Tsinghua University) released with the paper CPM: A Large-scale Generative Chinese Pre-trained Language Model by Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun.
- CTRL (from Salesforce) released with the paper CTRL: A Conditional Transformer Language Model for Controllable Generation by Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, Caiming Xiong and Richard Socher.
- CvT (from Microsoft) released with the paper CvT: Introducing Convolutions to Vision Transformers by Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang.
- Data2Vec (from Facebook) released with the paper Data2Vec: A General Framework for Self-supervised Learning in Speech, Vision and Language by Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli.
- DeBERTa (from Microsoft) released with the paper DeBERTa: Decoding-enhanced BERT with Disentangled Attention by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
- DeBERTa-v2 (from Microsoft) released with the paper DeBERTa: Decoding-enhanced BERT with Disentangled Attention by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
- Decision Transformer (from Berkeley/Facebook/Google) released with the paper Decision Transformer: Reinforcement Learning via Sequence Modeling by Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch.
- Deformable DETR (from SenseTime Research) released with the paper Deformable DETR: Deformable Transformers for End-to-End Object Detection by Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, Jifeng Dai.
- DeiT (from Facebook) released with the paper Training data-efficient image transformers & distillation through attention by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.
- DETR (from Facebook) released with the paper End-to-End Object Detection with Transformers by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko.
- DialoGPT (from Microsoft Research) released with the paper DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation by Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan.
- DiNAT (from SHI Labs) released with the paper Dilated Neighborhood Attention Transformer by Ali Hassani and Humphrey Shi.
- DistilBERT (from HuggingFace) released with the paper DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter by Victor Sanh, Lysandre Debut and Thomas Wolf. The same method was applied to compress GPT2 into DistilGPT2, RoBERTa into DistilRoBERTa and Multilingual BERT into DistilmBERT, and to produce a German version of DistilBERT.
- DiT (from Microsoft Research) released with the paper DiT: Self-supervised Pre-training for Document Image Transformer by Junlong Li, Yiheng Xu, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei.
- Donut (from NAVER) released with the paper OCR-free Document Understanding Transformer by Geewook Kim, Teakgyu Hong, Moonbin Yim, Jeongyeon Nam, Jinyoung Park, Jinyeong Yim, Wonseok Hwang, Sangdoo Yun, Dongyoon Han, Seunghyun Park.
- DPR (from Facebook) released with the paper Dense Passage Retrieval for Open-Domain Question Answering by Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen and Wen-tau Yih.
- DPT (from Intel Labs) released with the paper Vision Transformers for Dense Prediction by René Ranftl, Alexey Bochkovskiy, Vladlen Koltun.
- ELECTRA (from Google Research/Stanford University) released with the paper ELECTRA: Pre-training text encoders as discriminators rather than generators by Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning.
- EncoderDecoder (from Google Research) released with the paper Leveraging Pre-trained Checkpoints for Sequence Generation Tasks by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.
- ERNIE (from Baidu) released with the paper ERNIE: Enhanced Representation through Knowledge Integration by Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, Hua Wu.
- ESM (from Meta AI) are transformer protein language models. ESM-1b was released with the paper Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences by Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma and Rob Fergus. ESM-1v was released with the paper Language models enable zero-shot prediction of the effects of mutations on protein function by Joshua Meier, Roshan Rao, Robert Verkuil, Jason Liu, Tom Sercu and Alexander Rives. ESM-2 and ESMFold were released with the paper Language models of protein sequences at the scale of evolution enable accurate structure prediction by Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Allan dos Santos Costa, Maryam Fazel-Zarandi, Tom Sercu, Sal Candido and Alexander Rives.
- FLAN-T5 (from Google AI) released in the repository google-research/t5x by Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le and Jason Wei.
- FlauBERT (from CNRS) released with the paper FlauBERT: Unsupervised Language Model Pre-training for French by Hang Le, Loïc Vial, Jibril Frej, Vincent Segonne, Maximin Coavoux, Benjamin Lecouteux, Alexandre Allauzen, Benoît Crabbé, Laurent Besacier and Didier Schwab.
- FLAVA (from Facebook AI) released with the paper FLAVA: A Foundational Language And Vision Alignment Model by Amanpreet Singh, Ronghang Hu, Vedanuj Goswami, Guillaume Couairon, Wojciech Galuba, Marcus Rohrbach and Douwe Kiela.
- FNet (from Google Research) released with the paper FNet: Mixing Tokens with Fourier Transforms by James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon.
- Funnel Transformer (from CMU/Google Brain) released with the paper Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing by Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le.
- GIT (from Microsoft Research) released with the paper GIT: A Generative Image-to-text Transformer for Vision and Language by Jianfeng Wang, Zhengyuan Yang, Xiaowei Hu, Linjie Li, Kevin Lin, Zhe Gan, Zicheng Liu, Ce Liu, Lijuan Wang.
- GLPN (from KAIST) released with the paper Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth by Doyeon Kim, Woonghyun Ga, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim.
- GPT (from OpenAI) released with the paper Improving Language Understanding by Generative Pre-Training by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever.
- GPT Neo (from EleutherAI) released in the repository EleutherAI/gpt-neo by Sid Black, Stella Biderman, Leo Gao, Phil Wang and Connor Leahy.
- GPT NeoX (from EleutherAI) released with the paper GPT-NeoX-20B: An Open-Source Autoregressive Language Model by Sid Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, Kyle McDonell, Jason Phang, Michael Pieler, USVSN Sai Prashanth, Shivanshu Purohit, Laria Reynolds, Jonathan Tow, Ben Wang and Samuel Weinbach.
- GPT NeoX Japanese (from ABEJA) released by Shinya Otani, Takayoshi Makabe, Anuj Arora and Kyo Hattori.
- GPT-2 (from OpenAI) released with the paper Language Models are Unsupervised Multitask Learners by Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei and Ilya Sutskever.
- GPT-J (from EleutherAI) released in the repository kingoflolz/mesh-transformer-jax by Ben Wang and Aran Komatsuzaki.
- GPT-Sw3 (from AI-Sweden) released with the paper Lessons Learned from GPT-SW3: Building the First Large-Scale Generative Language Model for Swedish by Ariel Ekgren, Amaru Cuba Gyllensten, Evangelia Gogoulou, Alice Heiman, Severine Verlinden, Joey Öhman, Fredrik Carlsson, Magnus Sahlgren.
- GroupViT (from UCSD, NVIDIA) released with the paper GroupViT: Semantic Segmentation Emerges from Text Supervision by Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, Xiaolong Wang.
- Hubert (from Facebook) released with the paper HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units by Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed.
- I-BERT (from Berkeley) released with the paper I-BERT: Integer-only BERT Quantization by Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer.
- ImageGPT (from OpenAI) released with the paper Generative Pretraining from Pixels by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.
- Jukebox (from OpenAI) released with the paper Jukebox: A Generative Model for Music by Prafulla Dhariwal, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford, Ilya Sutskever.
- LayoutLM (from Microsoft Research Asia) released with the paper LayoutLM: Pre-training of Text and Layout for Document Image Understanding by Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei and Ming Zhou.
- LayoutLMv2 (from Microsoft Research Asia) released with the paper LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding by Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Wanxiang Che, Min Zhang and Lidong Zhou.
- LayoutLMv3 (from Microsoft Research Asia) released with the paper LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking by Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu and Furu Wei.
- LayoutXLM (from Microsoft Research Asia) released with the paper LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding by Yiheng Xu, Tengchao Lv, Lei Cui, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang and Furu Wei.
- LED (from AllenAI) released with the paper Longformer: The Long-Document Transformer by Iz Beltagy, Matthew E. Peters and Arman Cohan.
- LeViT (from Meta AI) released with the paper LeViT: A Vision Transformer in ConvNet's Clothing for Faster Inference by Ben Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou and Matthijs Douze.
- LiLT (from South China University of Technology) released with the paper LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding by Jiapeng Wang, Lianwen Jin and Kai Ding.
- Longformer (from AllenAI) released with the paper Longformer: The Long-Document Transformer by Iz Beltagy, Matthew E. Peters and Arman Cohan.
- LongT5 (from Google AI) released with the paper LongT5: Efficient Text-To-Text Transformer for Long Sequences by Mandy Guo, Joshua Ainslie, David Uthus, Santiago Ontanon, Jianmo Ni, Yun-Hsuan Sung and Yinfei Yang.
- LUKE (from Studio Ousia) released with the paper LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention by Ikuya Yamada, Akari Asai, Hiroyuki Shindo, Hideaki Takeda and Yuji Matsumoto.
- LXMERT (from UNC Chapel Hill) released with the paper LXMERT: Learning Cross-Modality Encoder Representations from Transformers for Open-Domain Question Answering by Hao Tan and Mohit Bansal.
- M-CTC-T (from Facebook) released with the paper Pseudo-Labeling For Massively Multilingual Speech Recognition by Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve and Ronan Collobert.
- M2M100 (from Facebook) released with the paper Beyond English-Centric Multilingual Machine Translation by Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli and Armand Joulin.
- MarianMT: machine translation models trained on OPUS data by Jörg Tiedemann. The Marian framework is being developed by the Microsoft Translator team.
- MarkupLM (from Microsoft Research Asia) released with the paper MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding by Junlong Li, Yiheng Xu, Lei Cui and Furu Wei.
- Mask2Former (from FAIR and UIUC) released with the paper Masked-attention Mask Transformer for Universal Image Segmentation by Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov and Rohit Girdhar.
- MaskFormer (from Meta and UIUC) released with the paper Per-Pixel Classification is Not All You Need for Semantic Segmentation by Bowen Cheng, Alexander G. Schwing and Alexander Kirillov.
- mBART (from Facebook) released with the paper Multilingual Denoising Pre-training for Neural Machine Translation by Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer.
- mBART-50 (from Facebook) released with the paper Multilingual Translation with Extensible Multilingual Pretraining and Finetuning by Yuqing Tang, Chau Tran, Xian Li, Peng-Jen Chen, Naman Goyal, Vishrav Chaudhary, Jiatao Gu and Angela Fan.
- Megatron-BERT (from NVIDIA) released with the paper Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism by Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper and Bryan Catanzaro.
- Megatron-GPT2 (from NVIDIA) released with the paper Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism by Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper and Bryan Catanzaro.
- mLUKE (from Studio Ousia) released with the paper mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models by Ryokan Ri, Ikuya Yamada and Yoshimasa Tsuruoka.
- MobileBERT (from CMU/Google Brain) released with the paper MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices by Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang and Denny Zhou.
- MobileNetV1 (from Google Inc.) released with the paper MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications by Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto and Hartwig Adam.
- MobileNetV2 (from Google Inc.) released with the paper MobileNetV2: Inverted Residuals and Linear Bottlenecks by Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov and Liang-Chieh Chen.
- MobileViT (from Apple) released with the paper MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer by Sachin Mehta and Mohammad Rastegari.
- MPNet (from Microsoft Research) released with the paper MPNet: Masked and Permuted Pre-training for Language Understanding by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu and Tie-Yan Liu.
- MT5 (from Google AI) released with the paper mT5: A massively multilingual pre-trained text-to-text transformer by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua and Colin Raffel.
- MVP (from RUC AI Box) released with the paper MVP: Multi-task Supervised Pre-training for Natural Language Generation by Tianyi Tang, Junyi Li, Wayne Xin Zhao and Ji-Rong Wen.
- NAT (from SHI Labs) released with the paper Neighborhood Attention Transformer by Ali Hassani, Steven Walton, Jiachen Li, Shen Li and Humphrey Shi.
- Nezha (from Huawei Noah's Ark Lab) released with the paper NEZHA: Neural Contextualized Representation for Chinese Language Understanding by Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen and Qun Liu.
- NLLB (from Meta) released with the paper No Language Left Behind: Scaling Human-Centered Machine Translation by the NLLB team.
- Nyströmformer (from the University of Wisconsin-Madison) released with the paper Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention by Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li and Vikas Singh.
- OPT (from Meta AI) released with the paper OPT: Open Pre-trained Transformer Language Models by Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen et al.
- OWL-ViT (from Google AI) released with the paper Simple Open-Vocabulary Object Detection with Vision Transformers by Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf and Neil Houlsby.
- Pegasus (from Google) released with the paper PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization by Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu.
- PEGASUS-X (from Google) released with the paper Investigating Efficiently Extending Transformers for Long Input Summarization by Jason Phang, Yao Zhao and Peter J. Liu.
- Perceiver IO (from Deepmind) released with the paper Perceiver IO: A General Architecture for Structured Inputs & Outputs by Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals and João Carreira.
- PhoBERT (from VinAI Research) released with the paper PhoBERT: Pre-trained language models for Vietnamese by Dat Quoc Nguyen and Anh Tuan Nguyen.
- PLBart (from UCLA NLP) released with the paper Unified Pre-training for Program Understanding and Generation by Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray and Kai-Wei Chang.
- PoolFormer (from Sea AI Labs) released with the paper MetaFormer is Actually What You Need for Vision by Yu, Weihao and Luo, Mi and Zhou, Pan and Si, Chenyang and Zhou, Yichen and Wang, Xinchao and Feng, Jiashi and Yan, Shuicheng.
- ProphetNet (from Microsoft Research) released with the paper ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
- QDQBert (from NVIDIA) released with the paper Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation by Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius.
- RAG (from Facebook) released with the paper Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks by Patrick Lewis, Ethan Perez, Aleksandara Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel and Douwe Kiela.
- REALM (from Google Research) released with the paper REALM: Retrieval-Augmented Language Model Pre-Training by Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat and Ming-Wei Chang.
- Reformer (from Google Research) released with the paper Reformer: The Efficient Transformer by Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya.
- RegNet (from META Platforms) released with the paper Designing Network Design Spaces by Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He, Piotr Dollár.
- RemBERT (from Google Research) released with the paper Rethinking Embedding Coupling in Pre-trained Language Models by Hyung Won Chung, Thibault Févry, Henry Tsai, M. Johnson, Sebastian Ruder.
- ResNet (from Microsoft Research) released with the paper Deep Residual Learning for Image Recognition by Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.
- RoBERTa (from Facebook) released with the paper RoBERTa: A Robustly Optimized BERT Pretraining Approach by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.
- RoBERTa-PreLayerNorm (from Facebook) released with the paper fairseq: A Fast, Extensible Toolkit for Sequence Modeling by Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier and Michael Auli.
- RoCBert (from WeChatAI) released with the paper RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining by HuiSu, WeiweiShi, XiaoyuShen, XiaoZhou, TuoJi, JiaruiFang and JieZhou.
- RoFormer (from ZhuiyiTechnology) released with the paper RoFormer: Enhanced Transformer with Rotary Position Embedding by Jianlin Su, Yu Lu, Shengfeng Pan, Bo Wen and Yunfeng Liu.
- SegFormer (from NVIDIA) released with the paper SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers by Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez and Ping Luo.
- SEW (from ASAPP) released with the paper Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition by Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger and Yoav Artzi.
- SEW-D (from ASAPP) released with the paper Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition by Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger and Yoav Artzi.
- SpeechToTextTransformer (from Facebook) released with the paper fairseq S2T: Fast Speech-to-Text Modeling with fairseq by Changhan Wang, Yun Tang, Xutai Ma, Anne Wu, Dmytro Okhonko and Juan Pino.
- SpeechToTextTransformer2 (from Facebook) released with the paper Large-Scale Self- and Semi-Supervised Learning for Speech Translation by Changhan Wang, Anne Wu, Juan Pino, Alexei Baevski, Michael Auli and Alexis Conneau.
- Splinter (from Tel Aviv University) released with the paper Few-Shot Question Answering by Pretraining Span Selection by Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson and Omer Levy.
- SqueezeBERT (from Berkeley) released with the paper SqueezeBERT: What can computer vision teach NLP about efficient neural networks? by Forrest N. Iandola, Albert E. Shaw, Ravi Krishna and Kurt W. Keutzer.
- Swin Transformer (from Microsoft) released with the paper Swin Transformer: Hierarchical Vision Transformer using Shifted Windows by Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin and Baining Guo.
- Swin Transformer V2 (from Microsoft) released with the paper Swin Transformer V2: Scaling Up Capacity and Resolution by Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, Furu Wei and Baining Guo.
- Swin2SR (from the University of Würzburg) released with the paper Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration by Marcos V. Conde, Ui-Jin Choi, Maxime Burchi and Radu Timofte.
- SwitchTransformers (from Google) released with the paper Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity by William Fedus, Barret Zoph and Noam Shazeer.
- T5 (from Google AI) released with the paper Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li and Peter J. Liu.
- T5v1.1 (from Google AI) released in the repository google-research/text-to-text-transfer-transformer by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li and Peter J. Liu.
- Table Transformer (from Microsoft Research) released with the paper PubTables-1M: Towards Comprehensive Table Extraction From Unstructured Documents by Brandon Smock, Rohith Pesala and Robin Abraham.
- TAPAS (from Google AI) released with the paper TAPAS: Weakly Supervised Table Parsing via Pre-training by Jonathan Herzig, Paweł Krzysztof Nowak, Thomas Müller, Francesco Piccinno and Julian Martin Eisenschlos.
- TAPEX (from Microsoft Research) released with the paper TAPEX: Table Pre-training via Learning a Neural SQL Executor by Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen and Jian-Guang Lou.
- Time Series Transformer (from HuggingFace).
- TimeSformer (from Facebook) released with the paper Is Space-Time Attention All You Need for Video Understanding? by Gedas Bertasius, Heng Wang and Lorenzo Torresani.
- Trajectory Transformer (from UC Berkeley) released with the paper Offline Reinforcement Learning as One Big Sequence Modeling Problem by Michael Janner, Qiyang Li and Sergey Levine.
- Transformer-XL (from Google/CMU) released with the paper Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context by Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.
- TrOCR (from Microsoft) released with the paper TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models by Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li and Furu Wei.
- UL2 (from Google Research) released with the paper Unifying Language Learning Paradigms by Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Xavier Garcia, Dara Bahri, Tal Schuster, Huaixiu Steven Zheng, Neil Houlsby, Donald Metzler.
- UniSpeech (from Microsoft Research) released with the paper UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data by Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng and Xuedong Huang.
- UniSpeechSat (from Microsoft Research) released with the paper UNISPEECH-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training by Sanyuan Chen, Yu Wu, Chengyi Wang, Zhengyang Chen, Zhuo Chen, Shujie Liu, Jian Wu, Yao Qian, Furu Wei, Jinyu Li and Xiangzhan Yu.
- UPerNet (from Peking University) released with the paper Unified Perceptual Parsing for Scene Understanding by Tete Xiao, Yingcheng Liu, Bolei Zhou, Yuning Jiang and Jian Sun.
- VAN (from Tsinghua University and Nankai University) released with the paper Visual Attention Network by Meng-Hao Guo, Cheng-Ze Lu, Zheng-Ning Liu, Ming-Ming Cheng and Shi-Min Hu.
- VideoMAE (from the Multimedia Computing Group, Nanjing University) released with the paper VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training by Zhan Tong, Yibing Song, Jue Wang and Limin Wang.
- ViLT (from NAVER AI Lab/Kakao Enterprise/Kakao Brain) released with the paper ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision by Wonjae Kim, Bokyung Son and Ildoo Kim.
- Vision Transformer (ViT) (from Google AI) released with the paper An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit and Neil Houlsby.
- VisualBERT (from UCLA NLP) released with the paper VisualBERT: A Simple and Performant Baseline for Vision and Language by Liunian Harold Li, Mark Yatskar, Da Yin, Cho-Jui Hsieh and Kai-Wei Chang.
- ViT Hybrid (from Google AI) released with the paper An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit and Neil Houlsby.
- ViTMAE (from Meta AI) released with the paper Masked Autoencoders Are Scalable Vision Learners by Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár and Ross Girshick.
- ViTMSN (from Meta AI) released with the paper Masked Siamese Networks for Label-Efficient Learning by Mahmoud Assran, Mathilde Caron, Ishan Misra, Piotr Bojanowski, Florian Bordes, Pascal Vincent, Armand Joulin, Michael Rabbat and Nicolas Ballas.
- Wav2Vec2 (from Facebook AI) released with the paper wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations by Alexei Baevski, Henry Zhou, Abdelrahman Mohamed and Michael Auli.
- Wav2Vec2-Conformer (from Facebook AI) released with the paper FAIRSEQ S2T: Fast Speech-to-Text Modeling with FAIRSEQ by Changhan Wang, Yun Tang, Xutai Ma, Anne Wu, Sravya Popuri, Dmytro Okhonko and Juan Pino.
- Wav2Vec2Phoneme (from Facebook AI) released with the paper Simple and Effective Zero-shot Cross-lingual Phoneme Recognition by Qiantong Xu, Alexei Baevski and Michael Auli.
- WavLM (from Microsoft Research) released with the paper WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing by Sanyuan Chen, Chengyi Wang, Zhengyang Chen, Yu Wu, Shujie Liu, Zhuo Chen, Jinyu Li, Naoyuki Kanda, Takuya Yoshioka, Xiong Xiao, Jian Wu, Long Zhou, Shuo Ren, Yanmin Qian, Yao Qian, Jian Wu, Michael Zeng and Furu Wei.
- Whisper (from OpenAI) released with the paper Robust Speech Recognition via Large-Scale Weak Supervision by Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman and Christine McLeavey.
- X-CLIP (from Microsoft Research) released with the paper Expanding Language-Image Pretrained Models for General Video Recognition by Bolin Ni, Houwen Peng, Minghao Chen, Songyang Zhang, Gaofeng Meng, Jianlong Fu, Shiming Xiang and Haibin Ling.
- XGLM (from Facebook AI) released with the paper Few-shot Learning with Multilingual Language Models by Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona Diab and Veselin Stoyanov.
- XLM (from Facebook) released with the paper Cross-lingual Language Model Pretraining by Guillaume Lample and Alexis Conneau.
- XLM-ProphetNet (from Microsoft Research) released with the paper ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
- XLM-RoBERTa (from Facebook AI) released with the paper Unsupervised Cross-lingual Representation Learning at Scale by Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov.
- XLM-RoBERTa-XL (from Facebook AI) released with the paper Larger-Scale Transformers for Multilingual Masked Language Modeling by Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman and Alexis Conneau.
- XLNet (from Google/CMU) released with the paper XLNet: Generalized Autoregressive Pretraining for Language Understanding by Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov and Quoc V. Le.
- XLS-R (from Facebook AI) released with the paper XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale by Arun Babu, Changhan Wang, Andros Tjandra, Kushal Lakhotia, Qiantong Xu, Naman Goyal, Kritika Singh, Patrick von Platen, Yatharth Saraf, Juan Pino, Alexei Baevski, Alexis Conneau and Michael Auli.
- XLSR-Wav2Vec2 (from Facebook AI) released with the paper Unsupervised Cross-Lingual Representation Learning For Speech Recognition by Alexis Conneau, Alexei Baevski, Ronan Collobert, Abdelrahman Mohamed and Michael Auli.
- YOLOS (from Huazhong University of Science and Technology) released with the paper You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection by Yuxin Fang, Bencheng Liao, Xinggang Wang, Jiemin Fang, Jiyang Qi, Rui Wu, Jianwei Niu and Wenyu Liu.
- YOSO (from the University of Wisconsin-Madison) released with the paper You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling by Zhanpeng Zeng, Yunyang Xiong, Sathya N. Ravi, Shailesh Acharya, Glenn Fung and Vikas Singh.

The model directory supported by Transformers 4.43.40 covers a wide range of natural language processing tasks and applications, from text classification to machine translation and from named entity recognition to sentiment analysis, among many other areas. This broad support lets developers choose the most suitable model for each requirement, speeding up the development and deployment of NLP projects. Whether you are researching, developing or deploying natural language processing applications, Transformers 4.43.40 provides powerful tools and resources to help you achieve better results. The continually updated Transformers toolkit will keep contributing to innovation and progress in the field and supporting solutions to all kinds of complex text processing problems.
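As a final, hedged illustration of choosing a model by task rather than by name, the library's pipeline API maps a task identifier to a default checkpoint. The task strings below are standard pipeline names; which checkpoint actually gets downloaded depends on the installed version:

```python
# Sketch of task-oriented usage: each pipeline downloads a default checkpoint on first use.
from transformers import pipeline

# Text classification (sentiment analysis).
classifier = pipeline("sentiment-analysis")
print(classifier("This toolkit makes NLP development much easier."))

# Named entity recognition, with tokens aggregated into whole entity spans.
ner = pipeline("ner", aggregation_strategy="simple")
print(ner("Hugging Face is based in New York City."))
```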