Language models are few-shot learners T Brown, B Mann, N Ryder, M Subbiah, JD Kaplan, P Dhariwal, ... Advances in neural information processing systems 33, 1877-1901, 2020 | 57566 | 2020 |
Learning Transferable Visual Models From Natural Language Supervision A Radford, JW Kim, C Hallacy, A Ramesh, G Goh, S Agarwal, G Sastry, ... https://cdn.openai.com/papers …, 2021 | 45582 | 2021 |
Language Models are Unsupervised Multitask Learners A Radford, J Wu, R Child, D Luan, D Amodei, I Sutskever Technical report, OpenAI, 2019 | 33399* | 2019 |
Proximal policy optimization algorithms J Schulman, F Wolski, P Dhariwal, A Radford, O Klimov arXiv preprint arXiv:1707.06347, 2017 | 31382 | 2017 |
Unsupervised representation learning with deep convolutional generative adversarial networks A Radford, L Metz, S Chintala arXiv preprint arXiv:1511.06434, 2015 | 21174 | 2015 |
GPT-4 technical report J Achiam, S Adler, S Agarwal, L Ahmad, I Akkaya, FL Aleman, D Almeida, ... arXiv preprint arXiv:2303.08774, 2023 | 19304 | 2023 |
Improving language understanding by generative pre-training A Radford, K Narasimhan, T Salimans, I Sutskever | 17167 | 2018 |
Improved techniques for training GANs T Salimans, I Goodfellow, W Zaremba, V Cheung, A Radford, X Chen Advances in neural information processing systems 29, 2016 | 12823 | 2016 |
Zero-shot text-to-image generation A Ramesh, M Pavlov, G Goh, S Gray, C Voss, A Radford, M Chen, ... International conference on machine learning, 8821-8831, 2021 | 7376 | 2021 |
Robust speech recognition via large-scale weak supervision A Radford, JW Kim, T Xu, G Brockman, C McLeavey, I Sutskever International conference on machine learning, 28492-28518, 2023 | 6847 | 2023 |
Evaluating large language models trained on code M Chen, J Tworek, H Jun, Q Yuan, HPDO Pinto, J Kaplan, H Edwards, ... arXiv preprint arXiv:2107.03374, 2021 | 6643 | 2021 |
Scaling laws for neural language models J Kaplan, S McCandlish, T Henighan, TB Brown, B Chess, R Child, ... arXiv preprint arXiv:2001.08361, 2020 | 4994 | 2020 |
Learning to summarize with human feedback N Stiennon, L Ouyang, J Wu, D Ziegler, R Lowe, C Voss, A Radford, ... Advances in neural information processing systems 33, 3008-3021, 2020 | 2895 | 2020 |
Generating long sequences with sparse transformers R Child, S Gray, A Radford, I Sutskever arXiv preprint arXiv:1904.10509, 2019 | 2627 | 2019 |
GPT-4o system card A Hurst, A Lerer, AP Goucher, A Perelman, A Ramesh, A Clark, AJ Ostrow, ... arXiv preprint arXiv:2410.21276, 2024 | 2563 | 2024 |
Fine-tuning language models from human preferences DM Ziegler, N Stiennon, J Wu, TB Brown, A Radford, D Amodei, ... arXiv preprint arXiv:1909.08593, 2019 | 2482 | 2019 |
Generative pretraining from pixels M Chen, A Radford, R Child, J Wu, H Jun, D Luan, I Sutskever International conference on machine learning, 1691-1703, 2020 | 2172 | 2020 |
Language models are few-shot learners T Brown, B Mann, N Ryder, M Subbiah, JD Kaplan, P Dhariwal, ... Advances in neural information processing systems 33, 1877-1901, 2020 | 1305 | 2020 |
Jukebox: A generative model for music P Dhariwal, H Jun, C Payne, JW Kim, A Radford, I Sutskever arXiv preprint arXiv:2005.00341, 2020 | 1136 | 2020 |
OpenAI Baselines P Dhariwal, C Hesse, O Klimov, A Nichol, M Plappert, A Radford, ... | 1124 | 2017 |