ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code
Tang et al., 2023
- Document ID
- 15649746882063966615
- Author
- Tang X
- Liu Y
- Cai Z
- Shao Y
- Lu J
- Zhang Y
- Deng Z
- Hu H
- An K
- Huang R
- Si S
- Chen S
- Zhao H
- Chen L
- Wang Y
- Liu T
- Jiang Z
- Chang B
- Fang Y
- Qin Y
- Zhou W
- Zhao Y
- Cohan A
- Gerstein M
- Publication year
- 2023
- Publication venue
- arXiv preprint arXiv:2311.09835
Snippet
Despite Large Language Models (LLMs) like GPT-4 achieving impressive results in function-level code generation, they struggle with repository-scale code understanding (e.g., coming up with the right arguments for calling routines), requiring a deeper comprehension of …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/30—Creation or generation of source code
- G06F8/34—Graphical or visual programming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/30—Creation or generation of source code
- G06F8/36—Software reuse
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
- G06F8/76—Adapting program code to run in a different environment; Porting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/30—Creation or generation of source code
- G06F8/38—Implementation of user interfaces
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
- G06F8/71—Version control; Configuration management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/10—Requirements analysis; Specification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/20—Software design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation, e.g. computer aided management of electronic mail or groupware; Time management, e.g. calendars, reminders, meetings or time accounting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Du et al. | DependEval: Benchmarking LLMs for Repository Dependency Understanding | |
| US10353796B2 (en) | System and method for using development objectives to guide implementation of source code | |
| Arulmohan et al. | Extracting domain models from textual requirements in the era of large language models | |
| Iovino et al. | On the Impact Significance of Metamodel Evolution in MDE. | |
| Wilken | Angular in action | |
| Tang et al. | ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code | |
| Oumoussa et al. | Evolution of microservices identification in monolith decomposition: A systematic review | |
| US20180060779A1 (en) | Method of generating business process model and computerized system associated therewith | |
| Ray | A Review on Vibe Coding: Fundamentals, State-of-the-art, Challenges and Future Directions | |
| Hemmat et al. | Research directions for using LLM in software requirement engineering: A systematic review | |
| Ramackers et al. | From prose to prototype: synthesising executable UML models from natural language | |
| Lutalo | Software Language Engineering-Text Processing Language Design, Implementation, Evaluation Methods | |
| Kim | Comparing proficiency of ChatGPT and bard in software development | |
| Capdepon et al. | Migration Process from Monolithic to Micro Frontend Architecture in Mobile Applications. | |
| Khan et al. | Developing Multi-Platform Apps with Visual Studio Code | |
| Sänger et al. | Large language models to the rescue: Reducing the complexity in scientific workflow development using ChatGPT | |
| CN116107524B (en) | Low-code application log processing method, medium, device and computing equipment | |
| Flores | A Two-Level Model-Driven Engineering Approach for Reengineering CI/CD Pipelines | |
| Pedemonte et al. | Towards automatic functional test execution | |
| Miao et al. | Paper2agent: Reimagining research papers as interactive and reliable ai agents | |
| Zhou | Fine-Tuning Large Language Models for Practical Software Engineering: Case Studies in Automated Patch Generation | |
| Ignaim | EvoSPL: An evolutionary approach for adopting software product lines in the automotive industry | |
| Aigner et al. | Kotlin in action | |
| Tang et al. | ML-Bench: Evaluating Large Language Models for Code Generation in Repository-Level Machine Learning Tasks | |
| EP4629056A1 (en) | Video analytics pipeline development system with assistive feedback and annotation |