JP2022544842A - Sharding for synchronous processors - Google Patents
Sharding for synchronous processors
- Publication number
- JP2022544842A (application JP2022511309A)
- Authority
- JP
- Japan
- Prior art keywords
- tiles
- tile
- candidate
- layer
- different
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/82—Architectures of general purpose stored program computers data or demand driven
- G06F15/825—Dataflow computers
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/445—Exploiting fine grain parallelism, i.e. parallelism at instruction level
- G06F8/4451—Avoiding pipeline stalls
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
- G06F8/451—Code distribution
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5044—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5066—Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/48—Indexing scheme relating to G06F9/48
- G06F2209/483—Multiproc
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/48—Indexing scheme relating to G06F9/48
- G06F2209/485—Resource constraint
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5017—Task decomposition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/506—Constraint
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Computer Hardware Design (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Optimization (AREA)
- Computational Mathematics (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Analysis (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Algebra (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Neurology (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Devices For Executing Special Programs (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Multi Processors (AREA)
- Complex Calculations (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
Description
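Embodiment 1 is a method comprising:
receiving a representation of a dataflow graph comprising a plurality of nodes that each represent a respective matrix operation to be performed by a device having a plurality of synchronous tiles;
generating a plurality of candidate allocations of respective portions of the dataflow graph to each tile of the plurality of synchronous tiles;
evaluating each candidate allocation of the plurality of candidate allocations according to one or more resource constraints of the device; and
selecting one of the candidate allocations based on evaluating each candidate allocation.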
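The four steps of Embodiment 1 (receive a graph, generate candidate allocations, evaluate them against resource constraints, select one) can be sketched as follows. This is only an illustrative sketch, not the patented procedure: the exhaustive enumeration, the per-tile memory limit used as the resource constraint, the "busiest tile" score, and all names (`generate_candidates`, `evaluate`, `select_allocation`, `mem_need`, `work`, `tile_mem_limit`) are assumptions introduced here. The graph edges are carried along but not used in scoring; a fuller evaluation would presumably account for dependencies between nodes.

```python
from itertools import product


def generate_candidates(nodes, num_tiles):
    """Yield every mapping of node name -> tile index (exhaustive, toy sizes only)."""
    for tiles in product(range(num_tiles), repeat=len(nodes)):
        yield dict(zip(nodes, tiles))


def evaluate(allocation, mem_need, work, num_tiles, tile_mem_limit):
    """Return the busiest tile's total work, or None if any tile overflows memory."""
    mem = [0] * num_tiles
    load = [0] * num_tiles
    for node, tile in allocation.items():
        mem[tile] += mem_need[node]
        load[tile] += work[node]
    if any(m > tile_mem_limit for m in mem):
        return None  # violates the (assumed) per-tile memory constraint
    return max(load)


def select_allocation(graph, mem_need, work, num_tiles, tile_mem_limit):
    """Pick the feasible candidate whose busiest tile has the least work."""
    nodes = sorted(graph)
    best, best_score = None, None
    for candidate in generate_candidates(nodes, num_tiles):
        score = evaluate(candidate, mem_need, work, num_tiles, tile_mem_limit)
        if score is None:
            continue  # skip candidates that break the resource constraint
        if best_score is None or score < best_score:
            best, best_score = candidate, score
    return best, best_score


# Toy dataflow graph (hypothetical): two matrix operations feeding a third.
graph = {"matmul_a": ["matmul_c"], "matmul_b": ["matmul_c"], "matmul_c": []}
mem_need = {"matmul_a": 3, "matmul_b": 3, "matmul_c": 2}
work = {"matmul_a": 4, "matmul_b": 4, "matmul_c": 2}

best, score = select_allocation(graph, mem_need, work, num_tiles=2, tile_mem_limit=5)
print(best, score)
```

In this toy input the memory limit prevents `matmul_a` and `matmul_b` from sharing a tile, so every feasible candidate places them on different tiles. Exhaustive enumeration grows exponentially with the number of nodes, so a real compiler would need a far more selective way to generate candidates.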
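The method of embodiment 9, comprising:
computing a respective execution slope for each of two different paths;
determining that a first path has a smaller execution slope than a second path; and
in response, modifying the candidate allocation so that the first path has a shallower execution slope.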
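103 Second dimension
200 Matrix
301, 302, 303 Stalls
310, 320, 330, 340 Idle zones
360 Buffer
502 Tile
506 Partition
600 Tile
602 Local memory, physical memory
604 Computational array
606 Cell
610 General-purpose controllable bus lines
610a, 610b, 610c, 610d Bus lines
620 Computational array partial sum bus lines
620a, 620b Partial sums
621 Control element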
Claims (20)
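A system, comprising:
receiving a representation of a dataflow graph comprising a plurality of nodes that each represent a respective matrix operation to be performed by a device having a plurality of synchronous tiles;
generating a plurality of candidate allocations of respective portions of the dataflow graph to each tile of the plurality of synchronous tiles;
evaluating each candidate allocation of the plurality of candidate allocations according to one or more resource constraints of the device; and
selecting one of the candidate allocations based on evaluating each candidate allocation.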
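The system of claim 9, comprising:
computing a respective execution slope for each of two different paths;
determining that a first path has a smaller execution slope than a second path; and
in response, modifying the candidate allocation so that the first path has a shallower execution slope.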
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2024005431A JP2024040198A (ja) | 2019-08-22 | 2024-01-17 | Sharding for synchronous processors |
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201962890471P | 2019-08-22 | 2019-08-22 | |
| US62/890,471 | 2019-08-22 | ||
| PCT/US2020/047206 WO2021035055A1 (en) | 2019-08-22 | 2020-08-20 | Sharding for synchronous processors |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| JP2024005431A Division JP2024040198A (ja) | Sharding for synchronous processors | 2019-08-22 | 2024-01-17 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| JP2022544842A (ja) | 2022-10-21 |
| JP7423757B2 (ja) | 2024-01-29 |
Family
ID=72474370
Family Applications (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| JP2022511309A Active JP7423757B2 (ja) | Sharding for synchronous processors | 2019-08-22 | 2020-08-20 |
| JP2024005431A Pending JP2024040198A (ja) | Sharding for synchronous processors | 2019-08-22 | 2024-01-17 |
Family Applications After (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| JP2024005431A Pending JP2024040198A (ja) | Sharding for synchronous processors | 2019-08-22 | 2024-01-17 |
Country Status (7)
| Country | Link |
|---|---|
| US (2) | US12147793B2 (ja) |
| EP (1) | EP3987394A1 (ja) |
| JP (2) | JP7423757B2 (ja) |
| KR (2) | KR102819972B1 (ja) |
| CN (1) | CN114270307A (ja) |
| TW (1) | TWI776212B (ja) |
| WO (1) | WO2021035055A1 (ja) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12332836B2 (en) * | 2022-07-13 | 2025-06-17 | SambaNova Systems, Inc. | Estimating a scaled cost of implementing an operation unit graph on a reconfigurable processor |
| CN115796041B (zh) * | 2022-12-05 | 2025-09-19 | 杭州海康威视数字技术股份有限公司 | Neural network model deployment method, system, device, and storage medium |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
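| JP2009151645A (ja) * | 2007-12-21 | 2009-07-09 | Mitsubishi Electric Corp | Parallel processing device and program parallelization device |
| JP2012248114A (ja) * | 2011-05-30 | 2012-12-13 | Canon Inc | Information processing apparatus, control method for information processing apparatus, and program |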
| WO2018185765A1 (en) * | 2017-04-04 | 2018-10-11 | Hailo Technologies Ltd. | Neural network processor incorporating inter-device connectivity |
| WO2018193370A1 (en) * | 2017-04-17 | 2018-10-25 | Cerebras Systems Inc. | Task activating for accelerated deep learning |
Family Cites Families (16)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| AUPQ131399A0 (en) | 1999-06-30 | 1999-07-22 | Silverbrook Research Pty Ltd | A method and apparatus (NPAGE02) |
| JPH02190934A (ja) * | 1989-01-20 | 1990-07-26 | Hitachi Ltd | Program generation method for parallel computers |
| US5682107A (en) | 1994-04-01 | 1997-10-28 | Xilinx, Inc. | FPGA architecture with repeatable tiles including routing matrices and logic matrices |
| TWI353521B (en) | 2006-09-28 | 2011-12-01 | Sandisk Corp | Soft-input soft-output decoder for nonvolatile mem |
| US8862625B2 (en) | 2008-04-07 | 2014-10-14 | Teradata Us, Inc. | Accessing data in a column store database based on hardware compatible indexing and replicated reordered columns |
| KR101710910B1 (ko) * | 2010-09-27 | 2017-03-13 | 삼성전자 주식회사 | Method and apparatus for dynamic resource allocation of processing units |
| US9176794B2 (en) | 2010-12-13 | 2015-11-03 | Advanced Micro Devices, Inc. | Graphics compute process scheduling |
| US9336146B2 (en) | 2010-12-29 | 2016-05-10 | Empire Technology Development Llc | Accelerating cache state transfer on a directory-based multicore architecture |
| KR20120079498A (ko) * | 2011-01-05 | 2012-07-13 | 정동규 | Keyboard that does not need cleaning |
| US9229983B2 (en) * | 2012-11-30 | 2016-01-05 | Amazon Technologies, Inc. | System-wide query optimization |
| US9507563B2 (en) | 2013-08-30 | 2016-11-29 | Cavium, Inc. | System and method to traverse a non-deterministic finite automata (NFA) graph generated for regular expression patterns with advanced features |
| CN105630441B (zh) | 2015-12-11 | 2018-12-25 | 中国航空工业集团公司西安航空计算技术研究所 | A GPU system based on unified shading technology |
| US12118451B2 (en) | 2017-01-04 | 2024-10-15 | Stmicroelectronics S.R.L. | Deep convolutional network heterogeneous architecture |
| US10452452B2 (en) * | 2017-04-17 | 2019-10-22 | Wave Computing, Inc. | Reconfigurable processor fabric implementation using satisfiability analysis |
| KR102172866B1 (ko) * | 2017-04-17 | 2020-11-02 | 딥시그 인크. | Placement and scheduling of radio signal processing dataflow operations |
| US10380063B2 (en) * | 2017-09-30 | 2019-08-13 | Intel Corporation | Processors, methods, and systems with a configurable spatial accelerator having a sequencer dataflow operator |
-
2020
- 2020-08-20 JP JP2022511309A patent/JP7423757B2/ja active Active
- 2020-08-20 KR KR1020227004916A patent/KR102819972B1/ko active Active
- 2020-08-20 EP EP20771947.7A patent/EP3987394A1/en active Pending
- 2020-08-20 WO PCT/US2020/047206 patent/WO2021035055A1/en not_active Ceased
- 2020-08-20 CN CN202080058480.0A patent/CN114270307A/zh active Pending
- 2020-08-20 KR KR1020257018524A patent/KR20250086806A/ko active Pending
- 2020-08-20 US US17/636,805 patent/US12147793B2/en active Active
- 2020-08-21 TW TW109128609A patent/TWI776212B/zh active
-
2024
- 2024-01-17 JP JP2024005431A patent/JP2024040198A/ja active Pending
- 2024-10-18 US US18/920,341 patent/US20250045032A1/en active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| KR20250086806A (ko) | 2025-06-13 |
| US12147793B2 (en) | 2024-11-19 |
| EP3987394A1 (en) | 2022-04-27 |
| KR102819972B1 (ko) | 2025-06-13 |
| JP2024040198A (ja) | 2024-03-25 |
| JP7423757B2 (ja) | 2024-01-29 |
| US20250045032A1 (en) | 2025-02-06 |
| WO2021035055A1 (en) | 2021-02-25 |
| TW202111562A (zh) | 2021-03-16 |
| CN114270307A (zh) | 2022-04-01 |
| TWI776212B (zh) | 2022-09-01 |
| KR20220031717A (ko) | 2022-03-11 |
| US20220300450A1 (en) | 2022-09-22 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP7094262B2 (ja) | Modifying computational graphs | |
| JP6898496B2 (ja) | Processing computational graphs | |
| Yang et al. | UMR: A multi-round algorithm for scheduling divisible workloads | |
| Ni et al. | Resource allocation strategy in fog computing based on priced timed petri nets | |
| KR102081952B1 (ko) | Stream-based accelerator processing of computational graphs | |
| US11687771B2 (en) | Platform for concurrent execution of GPU operations | |
| Lou et al. | Cost-effective scheduling for dependent tasks with tight deadline constraints in mobile edge computing | |
| US20250045032A1 (en) | Sharding for synchronous processors | |
| Malyshkin et al. | Optimization methods of parallel execution of numerical programs in the LuNA fragmented programming system | |
| JP7708921B2 (ja) | Compilation for synchronous processors | |
| Beaumont et al. | Comparison of static and runtime resource allocation strategies for matrix multiplication | |
| WO2013058396A1 (ja) | Task placement device and task placement method | |
| JP2023145676A (ja) | Reducing propagation latency | |
| Herrmann et al. | Memory-aware list scheduling for hybrid platforms | |
| Song et al. | A game theory based mapreduce scheduling algorithm | |
| HK40071879A (en) | Sharding for synchronous processors | |
| Zhu et al. | High-Throughput Scientific Workflow Scheduling under Deadline Constraint in Clouds. | |
| Pascual et al. | Locality-aware policies to improve job scheduling on 3D tori | |
| CN113095474B (en) | Resource usage prediction for deep learning model | |
| Mendonca | Multi-Purpose Efficient Resource Allocation for Parallel Systems | |
| Wu et al. | The performance impact of different master nodes on parallel loop self-scheduling schemes for rule-based expert systems | |
| Lampsas et al. | Scheduling Independent Tasks in Heterogeneous Environments under Communication Constraints |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20220413 |
| | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20220413 |
| | A977 | Report on retrieval | Free format text: JAPANESE INTERMEDIATE CODE: A971007; Effective date: 20230322 |
| | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20230424 |
| | A601 | Written request for extension of time | Free format text: JAPANESE INTERMEDIATE CODE: A601; Effective date: 20230724 |
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20230919 |
| | TRDD | Decision of grant or rejection written | |
| | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20231218 |
| | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20240117 |
| | R150 | Certificate of patent or registration of utility model | Ref document number: 7423757; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |