JP2008077151A

JP2008077151A - Shared memory device

Info

Publication number: JP2008077151A
Application number: JP2006252389A
Authority: JP
Inventors: Mutsuhiro Omori; 睦弘大森
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2006-09-19
Filing date: 2006-09-19
Publication date: 2008-04-03
Anticipated expiration: 2026-09-19
Also published as: JP4811212B2

Abstract

PROBLEM TO BE SOLVED: To provide a shared memory device that can connect and expand a plurality of processors efficiently and scalably via memories and that achieves a simple redundant configuration.
SOLUTION: The shared memory device includes a plurality of processors 12-0 to 12-16, a plurality of memory modules 14-0 to 14-63 accessible from the processors, and a connection part 13 through which only a specified processor among the plurality of processors is connectable to a specified memory module. The plurality of processors can access memory systems M0 to M15, each formed by one or more memory modules, through the connection part. The memory systems accessible from different processors share a part of the memory modules accessed from the different processors, and the device has a redundancy function that provides redundancy for the plurality of processors. The processor 12-16 is a processor for redundancy.
COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a shared memory device in which a plurality of memory systems, each including a processing device such as a processor element (PE), are mounted together and the memory of each system is shared.

When an architecture that emphasizes parallel processing is adopted in a system that mounts a plurality of memory systems together, the configuration is, for example, as shown in FIG. 1.
In the configuration of FIG. 1, PEs (processor elements) 1-1 to 1-4 and memories 2-1 to 2-4 are connected one-to-one because priority is given to parallel processing.
Because of this one-to-one connection, a PE 1 must use a path through a higher-level device to refer to the data of an adjacent PE.

Therefore, a configuration is generally adopted in which each PE circuit 1 is connected directly to the adjacent memories by a crossbar (Xbar) 3, as shown in FIG. 2.
USP 5,471,592

In a system with a plurality of PEs as described above, when data is shared via memory and the connections are expanded efficiently and scalably as shown in FIG. 3, there is a problem that the number of connections between PEs and memories does not grow linearly with the number of PEs; in short, it grows rapidly.

Before Patent Document 1 (US 5,471,592; Multi-Processor with crossbar link of processors and memories), memory sharing systems were either SIMD or MIMD, but realizing today's complex applications requires a memory system that encompasses both functions. A basic method for this has therefore been proposed.
In that method, efficient multi-PE processing is realized not by transferring data but by changing which memories each PE is connected to, and the following three forms of connection are provided.

A global connection that can access the entire memory,
a local connection that can be connected to a specific PE, and
an instruction transfer path that transfers PE execution instructions.

Each column of the crossbar switch (the direction along which one memory is connected, that is, one column per memory) has a mechanism for assigning priority, and the priority is determined by a round-robin method.

However, the cited document does not address at all how the connections balloon when a very large number of PEs are crossbar-connected, so it offers no countermeasure against the rapid growth of crossbar connections as the number of PEs increases.

In addition, as shown in FIGS. 4(A) and 4(B), hierarchical data transfer paths have been proposed to suppress the growth of data transfer paths when the number of PEs is increased. In that case, however, building the hierarchy requires mechanisms that the data transfer itself does not need, such as connection ports 6, which is wasteful.

Furthermore, as in FIG. 5 (failure avoidance for an array structure connected to a crossbar), if a redundant portion is simply added to a crossbar-connected array structure so that a substitute array element can be used when any one array element has failed, the crossbar connections increase by the number of array elements.
This approach may be workable when there are few array elements, but as their number grows the crossbar connections needed for redundancy increase rapidly and become an obstacle to system implementation.

An object of the present invention is to provide a shared memory device that allows a plurality of processing devices to be connected and expanded efficiently and scalably via memory and that can realize a simple redundant configuration.

A shared memory device according to a first aspect of the present invention includes a plurality of processing devices, a plurality of memory modules accessible by the processing devices, and a connection unit through which only specific processing devices among the plurality of processing devices can be connected to specific memory modules. The plurality of processing devices can access, via the connection unit, memory systems each formed by one or more memory modules. The memory systems accessible by different processing devices partially share the memory modules accessed by the different processing devices, and the device has a redundancy function that can provide redundancy for the plurality of processing devices.

Preferably, the redundancy function is a shift redundancy function.

Preferably, the memory systems are formed so that each shared memory module is accessible by processing devices located close to each other.

Preferably, the device has an arbitration circuit that performs prioritization when a plurality of processing devices simultaneously request access to the same memory module, and controls access according to the resulting priorities.

Preferably, the device has a controller that can communicate with the outside and controls access to the plurality of memory modules, and the controller can access all of the memory modules via the connection unit.

A shared memory device according to a second aspect of the present invention has a plurality of shared memory devices, each including a plurality of processing devices, a plurality of memory modules, a connection unit through which only specific processing devices among the plurality of processing devices can be connected to specific memory modules, and a controller that can communicate with the outside and controls access to the plurality of memory modules. In each shared memory device, the plurality of processing devices can access, via the connection unit, memory systems each formed by one or more memory modules; the memory systems accessible by different processing devices partially share the memory modules accessed by the different processing devices; and a redundancy function that can provide redundancy for the plurality of processing devices is included. The controllers of the shared memory devices are connected by a bus.

According to the present invention, when, for example, one of the plurality of processing devices has failed, redundancy processing is applied to that processing device using the redundancy function, and a redundant configuration is formed by, for example, shift redundancy.
The plurality of processing devices access the memory modules of the memory systems via the connection unit. Here, the memory systems accessible by different processing devices partially share the memory modules accessed by those processing devices. That is, the memory is partially shared.

According to the present invention, a plurality of processing devices can be connected and expanded efficiently and scalably via memory, and a simple redundant configuration can be realized.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

FIG. 6 is a system configuration diagram of a shared memory device according to an embodiment of the present invention.

The shared memory device 10 of FIG. 6 includes a direct memory access controller (DMA controller) 11; a plurality of PE cores 12-0 to 12-16 (16 in FIG. 6, plus one redundant core); a partially overlapping multiport and shift redundancy circuit 13 serving as the connection unit (hereinafter referred to as the overlap multiport); a plurality of memory banks (for example, SRAM banks) 14-0 to 14-63 serving as the memory modules (64 in FIG. 6); and an arbitration circuit 15.
In the shared memory device 10 of this embodiment, the PE core 12-16 is provided as a redundant PE core, so that the shift redundancy described later can be performed when any PE core has failed.

In the shared memory device 10 of FIG. 6, the memory banks 14-0 to 14-63 are divided into a plurality of memory systems M0 to M15, each formed by eight adjacent banks.
For example, the memory system M0 is formed by the eight memory banks 14-0 to 14-7.
The memory system M1, adjacent to M0, is formed by the eight memory banks 14-4 to 14-11 and shares the four memory banks 14-4 to 14-7 with M0.
Similarly, the memory system M2, adjacent to M1, is formed by the eight memory banks 14-8 to 14-15 and shares the four memory banks 14-8 to 14-11 with M1.
The memory systems M3 to M15 are likewise each formed by eight memory banks, four of which are shared with the adjacent memory system.
However, the memory system M15 alone is formed by four memory banks.
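The bank layout just described follows a simple sliding-window rule, which can be sketched as below. This is only an illustration: the function name and its parameters (`window`, `stride`, `total_banks`) are hypothetical, and the values 8, 4, and 64 are taken from the FIG. 6 description.

```python
def memory_system_banks(k, total_banks=64, window=8, stride=4):
    """Return the bank indices that form memory system M<k>.

    Assumption: M<k> starts at bank k * stride and spans `window` banks,
    truncated where the bank array ends (which is why M15 has four banks).
    """
    start = k * stride
    end = min(start + window, total_banks)
    return list(range(start, end))


def shared_banks(k):
    """Banks that M<k> shares with its neighbor M<k+1>."""
    return sorted(set(memory_system_banks(k)) & set(memory_system_banks(k + 1)))


if __name__ == "__main__":
    for k in (0, 1, 2, 15):
        banks = memory_system_banks(k)
        print(f"M{k}: 14-{banks[0]} to 14-{banks[-1]} ({len(banks)} banks)")
```

With these values, M0 and M1 overlap in banks 14-4 to 14-7, which matches the text above.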

In the shared memory device 10 of FIG. 6, each of the PE cores 12-0 to 12-15 (or 12-16) can access, for example, eight banks (16 kByte), and the accessible banks of the PE cores overlap by 8 kByte between multiple PEs, such as neighbors.
This is not a full crossbar connection; some of the connections are omitted. Access contention for the overlapping SRAM banks is resolved by arbitration.
Efficiency drops when one PE core needs simultaneous access to SRAM banks beyond its directly connected region, but the number of shared banks can be chosen so that such cases are rare, so this drop in transfer efficiency contributes little to any drop in overall system efficiency.
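To give a rough sense of what omitting connections saves, the sketch below counts PE-to-bank links for a full crossbar and for the overlapped arrangement. It is an illustrative model only: it assumes PE core k is wired to the banks of memory system M<k> (8-bank window, 4-bank stride), and it ignores the extra wires for shift redundancy, the DMA controller, and the arbitration circuit. The function names are hypothetical.

```python
def full_crossbar_links(n_pe, n_banks):
    """Every PE can reach every bank."""
    return n_pe * n_banks


def overlapped_links(n_pe, n_banks, window=8, stride=4):
    """Each PE reaches only its own window of banks (assumed layout)."""
    total = 0
    for k in range(n_pe):
        start = k * stride
        end = min(start + window, n_banks)
        total += max(0, end - start)
    return total


if __name__ == "__main__":
    for n_pe, n_banks in ((16, 64), (32, 128)):
        print(n_pe, n_banks,
              full_crossbar_links(n_pe, n_banks),
              overlapped_links(n_pe, n_banks))
```

Under these assumptions the 16-PE, 64-bank case needs 124 links instead of 1024, and doubling both counts roughly doubles the overlapped total while quadrupling the crossbar total.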

When memory banks are partially shared in this way, a redundancy scheme is unacceptable if the failure of a PE core that is not at an end of the array changes how the memory banks are shared, because the executing program and the data hand-off would then have to change and efficiency would suffer. Therefore, whichever PE fails, the relationship among the PE cores as seen from the memory must remain unchanged. As a representative method, shift redundancy is applied to the PE cores.

FIG. 7 is a diagram for explaining how shift switches are inserted when a shift redundancy configuration is adopted.

FIG. 7 shows how the PE cores are made redundant by inserting shift switch circuits 21-0 to 21-N between the PE cores 12-0 to 12-N and 12-R (the redundant PE core) and a general logic circuit 20.
Each of the shift switch circuits 21-0 to 21-N has a multiplexer (mux1) 211 that selects a signal from the logic circuit 20 side and a multiplexer (mux2) 212 that selects a signal from the PE core side.

When a PE core fails, its signals are routed to a logically adjacent PE core, so that its function is handed over to that core. The core that has taken over the function in turn hands its own function to its neighbor on the opposite side, and the same hand-over is repeated until the redundant PE core 12-R is reached.
For example, when the PE core 12-1 fails, the input signals for the PE core 12-1 are also supplied to the PE core 12-2, and the PE core 12-2 performs its arithmetic processing using the input signals for the PE core 12-1 rather than the input signals originally connected to the PE core 12-2.
Further, the selection signal of the multiplexer 212 is controlled so that the output signal from the PE core 12-2 is delivered to the general logic circuit 20 in place of the output signal from the PE core 12-1.
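The hand-over chain described above amounts to a mapping from logical PE slots to physical cores. The following minimal sketch shows that mapping; the function name and the convention that the spare core is the last physical index are assumptions made for illustration.

```python
def shift_assignment(n_logical, failed=None):
    """Map each logical PE slot to the physical core that serves it.

    Physical cores are numbered 0..n_logical, where index n_logical is the
    spare (12-R in the text). Slots below the failed core keep their own
    core; slots at or above it shift to the next core, ending at the spare.
    """
    if failed is None:
        return list(range(n_logical))
    if not 0 <= failed <= n_logical:
        raise ValueError("failed core index out of range")
    return [slot if slot < failed else slot + 1 for slot in range(n_logical)]


if __name__ == "__main__":
    # Four working slots plus one spare; physical core 1 has failed.
    print(shift_assignment(4, failed=1))
```

Because every slot still appears exactly once and in the same order, the neighbor relationships seen from the memory side do not change, which is the property the text asks for.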

Power consumption is reduced by stopping input transitions to the defective PE core. This is unnecessary when the power supply of the defective PE core is cut off by a power gate or the like.
Because the clamp within the switch circuit is almost negligible in size, reducing the number of gates in the switch circuit would contribute little to reducing the overall circuit scale or power consumption.

FIG. 8 is a diagram showing an example of signal path connections in the shared memory device according to this embodiment.
In FIG. 8, each memory system is formed by four memory banks to make the figure easier to understand.

The memory system M0 is formed by the memory banks 14-0 to 14-3, the memory system M1 by the memory banks 14-2 to 14-5, the memory system M2 by the memory banks 14-4 to 14-7, and the memory system M3 by the memory banks 14-6 to 14-9.

In the shared memory device 10A of FIG. 8, each of the PE cores 12-0 to 12-3 has paths for accessing four memory banks.
In addition, a shift redundancy processing path unit 132 is provided between the PE cores 12-0 to 12-3 and the normal access paths 131, and a shift redundancy processing path unit 133 is provided between the PE cores 12-0 to 12-3 and the arbitration circuit 15.
In the shift redundancy processing path units 132 and 133, each circle (○) represents a switch mechanism between wires.

The PE core 12-0 is connected to the normal paths 131 via the redundant path 1321 and can access the memory modules 14-0 to 14-3.
The PE core 12-1 is connected to the normal paths 131 via the redundant path 1321 and can access the memory modules 14-0 to 14-3. It is also connected to the normal paths 131 via the redundant path 1322 and can access the memory modules 14-2 to 14-5.
The PE core 12-2 is connected to the normal paths 131 via the redundant path 1322 and can access the memory modules 14-2 to 14-5. It is also connected to the normal paths 131 via the redundant path 1323 and can access the memory modules 14-4 to 14-7.
The PE core 12-3 is connected to the normal paths 131 via the redundant path 1323 and can access the memory modules 14-4 to 14-7. It is also connected to the normal paths 131 via the redundant path 1324 and can access the memory modules 14-6 to 14-9.
The PE core 12-4 is connected to the normal paths 131 via the redundant path 1324 and can access the memory modules 14-6 to 14-9.

The PE core 12-0 can also send signals to the arbitration circuit 15 through the redundant path 1331, and the arbitration circuit 15 can send signals to the PE core 12-0 through the redundant path 1332.
The PE core 12-1 can send signals to the arbitration circuit 15 through the redundant path 1331 or 1333, and the arbitration circuit 15 can send signals to the PE core 12-1 through the redundant path 1332 or 1334.
The PE core 12-2 can send signals to the arbitration circuit 15 through the redundant path 1333 or 1335, and the arbitration circuit 15 can send signals to the PE core 12-2 through the redundant path 1334 or 1336.
The PE core 12-3 can send signals to the arbitration circuit 15 through the redundant path 1335 or 1337, and the arbitration circuit 15 can send signals to the PE core 12-3 through the redundant path 1336 or 1338.
The PE core 12-4 can send signals to the arbitration circuit 15 through the redundant path 1337, and the arbitration circuit 15 can send signals to the PE core 12-4 through the redundant path 1338.

In this embodiment, the transfer from the outside of the data that a PE core processes first is performed by the DMA controller 11.
A data transfer method using the DMA controller 11 is described below with reference to FIG. 6.

When data from the outside is to be transferred to a specific memory bank, or data in a specific memory bank is to be output to the outside, and a transfer request from one of the PE cores 12-0 to 12-3 reaches the DMA controller 11, the DMA controller 11 passes a transfer request for the specified address to the arbitration circuit 15 and waits for transfer permission.
Once the arbitration circuit 15 grants permission, the DMA controller 11 connects the external data bus to the specific memory and, while outputting the target addresses in sequence, performs transfer control for the external data bus, so that data is transferred between the external data bus and the memory. The shift redundancy mechanism can be realized by switch mechanisms between wires, in the same way as the partially shared memory connections.

Next, an example of data sharing and transfer between PEs is described.
In FIG. 8, the input data for the PE core 12-0 is placed in the memory bank 14-0; the PE core 12-0 reads and processes the contents of the memory bank 14-0 and writes its results to the memory banks 14-2 and 14-3.
Once it has written valid data to the memory bank 14-2 or 14-3, the PE core 12-0 turns on the validity bit at a specific address A-1 in the memory bank 14-2.
When the PE core 12-1 has finished its own processing, it checks whether the PE core 12-0 has turned on the bit at address A-1 and, if so, starts reading data from the memory bank 14-2 or 14-3 and performing its computation.
The PE core 12-1 processes the data placed in the memory banks 14-2 and 14-3 as its input and writes its output to the memory bank 14-4. When its processing is complete, the PE core 12-2 requests the DMA controller 11 to transfer data to the outside, and the DMA controller 11 outputs the valid data in the memory bank 14-4 via the external bus.
For data transfers between the PE cores 12-0 to 12-3 and the memory banks, each PE core sends the address of its data transfer request to the arbitration circuit 15, which decides the priority relative to the other PE cores and the DMA controller by a round-robin method and issues transfer permission to the PE core.
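The hand-off through a validity bit can be pictured with a small software model. This is a sketch under stated assumptions, not the hardware described in the text: banks are modeled as Python dictionaries, the key `"A-1"` stands in for the specific address A-1, the arithmetic performed by each PE is invented for illustration, and arbitration is left out.

```python
VALID_ADDR = "A-1"  # hypothetical stand-in for the validity-bit address


def make_banks(n_banks=10):
    """Model each memory bank 14-<i> as a dictionary."""
    return {i: {} for i in range(n_banks)}


def pe0_step(banks):
    """PE 12-0: read bank 14-0, write results to 14-2 and 14-3, then flag them."""
    data = banks[0]["input"]
    banks[2]["result_a"] = [x * 2 for x in data]      # illustrative computation
    banks[3]["result_b"] = [x * 2 + 1 for x in data]  # illustrative computation
    banks[2][VALID_ADDR] = True  # the flag is set last, after the data is in place


def pe1_step(banks):
    """PE 12-1: proceed only if PE 12-0 has flagged its output as valid."""
    if not banks[2].get(VALID_ADDR, False):
        return False  # nothing to consume yet
    a = banks[2]["result_a"]
    b = banks[3]["result_b"]
    banks[4]["output"] = [x + y for x, y in zip(a, b)]
    return True


if __name__ == "__main__":
    banks = make_banks()
    banks[0]["input"] = [1, 2, 3]
    print(pe1_step(banks))  # consumer runs before the producer
    pe0_step(banks)
    print(pe1_step(banks), banks[4]["output"])
```

The point of the design is that no data is copied between PEs: the producer and consumer both reach banks 14-2 and 14-3 directly, and only the flag is exchanged.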

FIG. 9 is a diagram showing an implementation example of the data transfer mechanism, in which the multiplexer for partial memory sharing and the multiplexer for shift redundancy are merged into a single multiplexer MUX.

Rather than implementing partial memory sharing and shift redundancy separately, choosing a circuit scheme that realizes both at once keeps down the circuitry added for redundancy.
The wires drawn with dotted lines are those added for shift redundancy. The example here shows the case of four PEs in actual operation, ten memory banks in total, and two memory banks partially shared between adjacent PEs. That is, it corresponds to the configuration of FIG. 8.

The input of the PE core 12-0 is selected by a 4:1 multiplexer MUX1 so that any one of the memory banks 14-0, 14-1, 14-2, and 14-3 can be chosen.
The output of the PE core 12-0 is connected to one input of the multiplexer MUX2 at the input of each of the memory banks 14-0, 14-1, 14-2, and 14-3, so that data can be transferred to any one of them.
The input of the PE core 12-1 is selected by a 6:1 multiplexer MUX1 that selectively takes in the outputs of the memory banks 14-0 and 14-1, which are the memory-bank inputs of the PE core 12-0 needed when the PE core 12-1 takes over the function of the PE core 12-0 in a shift operation, and the outputs of the memory banks 14-2, 14-3, 14-4, and 14-5, which are used in normal operation.
The output of the PE core 12-1 is connected to the input multiplexers MUX2 of the memory modules 14-0 and 14-1, as output destinations for shift redundancy when it takes over the function of the PE core 12-0, and of the memory modules 14-2, 14-3, 14-4, and 14-5, as output destinations for normal operation.
With these connections, the PE core 12-0 can input and output data to and from the memory modules 14-0, 14-1, 14-2, and 14-3, and the PE core 12-1 can normally input and output data to and from the memory modules 14-2, 14-3, 14-4, and 14-5.
When the PE core 12-0 fails, the PE core 12-1 functions as its substitute, and for this purpose the PE core 12-1 can also input and output data to and from the memory modules 14-0 and 14-1.
The other PE cores 12-2, 12-3, and 12-R likewise have multiplexers connected to their inputs to select the input data so that they can operate in the same way, which allows the partially shared memory and shift redundancy to be realized at the same time.
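The multiplexer widths quoted above (4:1 for the PE core 12-0, 6:1 for the PE core 12-1) follow from taking the union of a core's normal bank window and the window it inherits when shifted. The sketch below works this out for the FIG. 8 and FIG. 9 example; the function names are hypothetical, and the 4-bank window with a 2-bank stride over ten banks is taken from the text.

```python
def bank_window(k, n_banks=10, size=4, stride=2):
    """Banks of memory system M<k> in the FIG. 8 example."""
    start = k * stride
    return set(range(start, min(start + size, n_banks)))


def mux_inputs(core, n_active=4):
    """Banks that physical core `core` must be able to select from.

    Core indices 0..n_active-1 are the regular cores; index n_active is the
    spare (12-R). A regular core needs its own window; every core except
    core 0 also needs the window of the core it may replace after a shift.
    """
    banks = set()
    if core < n_active:
        banks |= bank_window(core)      # normal role
    if core >= 1:
        banks |= bank_window(core - 1)  # shifted role
    return sorted(banks)


if __name__ == "__main__":
    for core in range(5):
        inputs = mux_inputs(core)
        print(f"core {core}: {len(inputs)}:1 mux over banks {inputs}")
```

In this model the interior cores need 6:1 multiplexers and the two end cores need 4:1 multiplexers, so the cost added for redundancy is two extra inputs per interior core rather than a second switching stage.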

FIG. 10 is a flowchart of the arbitration of access to MEM (memory bank) (2n) between PE(n) and PE(n+1).
The arbitration procedure for access to MEM(2n) by PE(n) and PE(n+1) in FIG. 10 is described below. Here, a PE core is written simply as PE.

The flow begins at the start point immediately after the chip is reset. First, the access request from PE(n) to MEM(2n) is checked (ST1). If there is no request, the flow moves to phase (2), which checks for an access request from PE(n+1) to MEM(2n).
If PE(n) is requesting access to MEM(2n), access to MEM(2n) is granted to PE(n) and denied to PE(n+1) (ST2).
An initial value is set in a timer that counts a fixed period (ST3), and the timer starts counting down. The access request from PE(n) to MEM(2n) is checked again, and if there is no longer a request, the flow moves to (2). If the request is still present, the timer count is checked; if it has not timed out, the check of the access request from PE(n) to MEM(2n) is repeated, and if it has timed out, the flow moves to (2) (ST4, ST5).

Phase (2) performs the same processing. The access request from PE(n+1) to MEM(2n) is checked (ST6). If there is no request, the flow returns to the start point, which checks for an access request from PE(n) to MEM(2n).
If PE(n+1) is requesting access to MEM(2n), access to MEM(2n) is granted to PE(n+1) and denied to PE(n) (ST7).
An initial value is set in a timer that counts a fixed period (ST8), and the timer starts counting down. The access request from PE(n+1) to MEM(2n) is checked again, and if there is no longer a request, the flow returns to the start point. If the request is still present, the timer count is checked; if it has not timed out, the check of the access request from PE(n+1) to MEM(2n) is repeated, and if it has timed out, the flow returns to the start point (ST9, ST10).
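The two symmetric phases of the flowchart can be modeled as a small cycle-based state machine. This is a sketch under assumptions rather than the circuit itself: the class name, the per-cycle `step` interface, and the interpretation of the timer as a maximum number of consecutive grant cycles are all introduced here for illustration.

```python
class PairArbiter:
    """Arbitrates one bank, MEM(2n), between PE(n) (index 0) and PE(n+1) (index 1)."""

    def __init__(self, timeout=4):
        self.timeout = timeout  # assumed: max consecutive cycles per grant
        self.turn = 0           # which PE is checked first (0 after reset)
        self.owner = None       # PE currently holding the grant
        self.timer = 0

    def step(self, requests):
        """Advance one cycle. `requests` is (req_n, req_n_plus_1).

        Returns the index of the PE granted access this cycle, or None.
        """
        if self.owner is not None:
            if requests[self.owner] and self.timer > 0:
                self.timer -= 1          # still requesting, not timed out
                return self.owner
            self.turn = 1 - self.owner   # released or timed out: other phase
            self.owner = None
        for _ in range(2):               # check this phase, then the other
            if requests[self.turn]:
                self.owner = self.turn
                self.timer = self.timeout - 1
                return self.owner
            self.turn = 1 - self.turn
        return None


if __name__ == "__main__":
    arb = PairArbiter(timeout=3)
    print([arb.step((True, True)) for _ in range(7)])
```

When both PEs request continuously, the grant alternates in blocks of `timeout` cycles, so neither side can hold the shared bank indefinitely; when only one requests, it is re-granted without an idle cycle.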

FIG. 11 is a diagram for explaining a method of expanding the partially shared multiport PE mechanism in units of hierarchical layers. Next, the method of expansion in units of layers for the case where DMA transfer becomes a bottleneck is described.

For data transfer between PE cores, the performance loss caused by collisions between large data transfers can be greatly reduced. For data transfer between the outside and the memory, however, the probability of collision increases when the PE cores are processing several functions at the same time.
Such cases are handled by arranging the PE arrays hierarchically, as shown in FIG. 11.
As in the basic configuration of FIG. 6, a PE array of sixteen PEs and one DMA controller form one layer, and the layers are connected via an AXI (Advanced eXtensible Interface) bus 20 to form a memory system 100.
It is important to keep such AXI layers to a minimum, and the present invention contributes to reducing them as far as possible.

As described above, the shared memory device according to the present embodiment has a plurality of PEs (processing devices) 12-0 to 12-15, a plurality of memory modules 14-0 to 14-63 accessible by the processing devices, and a connection unit 13 through which only specific processing devices among the plurality of processing devices are connectable to specific memory modules. The processing devices can access, via the connection unit, memory systems M0 to M15, each formed by one or more memory modules, and the memory systems accessible by different processing devices share some of the memory modules accessed by those processing devices. The device further has an arbitration circuit 15 that, when a plurality of processing devices simultaneously request access to the same memory module, executes prioritization processing and controls access according to the resulting priority, and it has a shift redundant configuration. As a result, the following effects can be obtained.

Because the working memory modules used by each PE are used directly for data transfer between PEs, memory modules dedicated to communication can be eliminated.
A transfer only changes the direction of access to the memory, so the communication time approaches zero.
Even when the number of PEs increases, the amount of connection resources between the PEs and the memory increases only linearly with the number of PEs, so as many PEs as necessary can easily be added in a scalable manner.
Making every PE connectable to every memory module yields little benefit for the resources it consumes. In this embodiment, access arbitration is confined to a limited set of PEs, so arbitration of competing accesses to the same memory is simple.
In addition, a redundant structure is possible without changing the memory sharing relationship between PEs, which leads to a significant improvement in yield.
The manufacturing yield is markedly improved by the redundancy effect while the number of PEs is increased in a scalable manner.
The partial memory sharing function and the redundancy function have portions whose resources can be shared, so implementing them together rather than separately reduces the circuit scale.
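The linear-scaling effect can be illustrated with a short calculation comparing a full crossbar, in which every PE is connectable to every memory bank, with partial sharing, in which each PE is connectable to a fixed number of banks. This is a sketch only: the function names are hypothetical, and the fan-out of 8 banks per PE is an assumed value chosen for illustration, not one stated in this section. The counts of 16 PEs and 64 banks follow the embodiment.

```python
def crossbar_links(num_pe: int, num_banks: int) -> int:
    """PE-to-bank connections when every PE can reach every bank."""
    return num_pe * num_banks


def partial_share_links(num_pe: int, banks_per_pe: int) -> int:
    """PE-to-bank connections when each PE reaches a fixed number of banks."""
    return num_pe * banks_per_pe


BANKS_PER_PE = 8  # assumed fan-out per PE (hypothetical)

for num_pe in (16, 32, 64):
    num_banks = 4 * num_pe  # 64 banks for 16 PEs, scaled proportionally
    print(num_pe,
          crossbar_links(num_pe, num_banks),
          partial_share_links(num_pe, BANKS_PER_PE))
```

Under these assumptions, doubling the number of PEs quadruples the crossbar connection count but only doubles the count for partial sharing.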

FIG. 1 is a diagram showing a general architecture of a multiprocessor.
FIG. 2 is a diagram showing an architecture using a crossbar.
FIG. 3 is a diagram for explaining the problem of adding PEs.
FIG. 4 is a diagram showing a configuration example that uses ports to suppress the increase in data transfer paths when the number of PEs is increased.
FIG. 5 is a diagram showing a method of failure avoidance for an array structure connected to a crossbar.
FIG. 6 is a system configuration diagram of a shared memory device according to an embodiment of the present invention.
FIG. 7 is a diagram for explaining how shift switches are inserted when a shift redundant configuration is adopted.
FIG. 8 is a diagram showing a connection example of the signal paths of the shared memory device according to the embodiment.
FIG. 9 is a diagram showing an implementation example of the data transfer mechanism, in which the multiplexer for partial sharing of memory and the multiplexer for shift redundancy are combined into a single multiplexer MUX.
FIG. 10 is a flowchart of arbitration of access to MEM (memory bank) (2n) between PE(n) and PE(n+1).
FIG. 11 is a diagram for explaining a method of expanding the partially shared multiport PE mechanism in units of layers.

Explanation of symbols

10, 10A, 10B ... shared memory device, 11 ... DMA controller, 12-0 to 12-15 ... PE core (processing device), 13 ... partially overlapping multiport and shift redundancy circuit, 14-0 to 14-63 ... memory bank (memory module), 15 ... arbitration circuit, 20 ... AXI bus, M0 to M15 ... memory system.

Claims

1. A shared memory device comprising:
a plurality of processing devices;
a plurality of memory modules accessible by the processing devices; and
a connection unit through which only specific processing devices among the plurality of processing devices are connectable to specific memory modules,
wherein the plurality of processing devices can access, via the connection unit, memory systems each formed by one or more of the memory modules,
the memory systems accessible by different processing devices share some of the memory modules accessed by the different processing devices, and
the shared memory device has a redundancy function that provides redundancy for the plurality of processing devices.

2. The shared memory device according to claim 1, wherein the redundancy function is a shift redundancy function.

3. The shared memory device according to claim 1, wherein the memory system is formed such that the shared memory modules are accessible by processing devices located close to each other.

4. The shared memory device according to claim 1, further comprising an arbitration circuit that executes prioritization processing when a plurality of processing devices simultaneously request access to the same memory module, and performs access control according to the priority.

5. The shared memory device according to claim 1, further comprising a controller capable of communicating with the outside and controlling access to the plurality of memory modules,
wherein the controller can access all the memory modules via the connection unit.

6. A shared memory device comprising a plurality of shared memory devices, each including:
a plurality of processing devices;
a plurality of memory modules;
a connection unit through which only specific processing devices among the plurality of processing devices are connectable to specific memory modules; and
a controller capable of communicating with the outside and controlling access to the plurality of memory modules,
wherein the plurality of processing devices can access, via the connection unit, memory systems each formed by one or more of the memory modules,
the memory systems accessible by different processing devices share some of the memory modules accessed by the different processing devices,
each shared memory device has a redundancy function that provides redundancy for the plurality of processing devices, and
the controllers of the respective shared memory devices are connected by a bus.

7. The shared memory device according to claim 6, wherein the redundancy function is a shift redundancy function.

8. The shared memory device according to claim 6, wherein the memory system is formed such that the shared memory modules are accessible by processing devices located close to each other.

9. The shared memory device according to claim 6, further comprising an arbitration circuit that executes prioritization processing when a plurality of processing devices simultaneously request access to the same memory module, and performs access control according to the priority.

10. The shared memory device according to claim 6, wherein the controller can access all the memory modules via the connection unit.