JP2000020486A

JP2000020486A - SIMD type computing unit

Info

Publication number: JP2000020486A
Application number: JP10182105A
Authority: JP
Inventors: Yukio Kadowaki; 幸男門脇; Shinichi Yamaura; 慎一山浦; Sugitaka Otegi; 杉高樗木; Kazuhiko Hara; 和彦原
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1998-06-29
Filing date: 1998-06-29
Publication date: 2000-01-21

Abstract

(57) [Abstract] [Problem] In a conventional SIMD type arithmetic unit, three or more cycles of processing are required when an operation overflows or when a double-precision operation is performed; the invention aims to reduce this number of processing steps. [Solution] Each of the two data input means is provided with a plurality (2ⁿ) of data storage units, of which only half are treated as valid data storage units and the remainder as invalid data storage units. During an operation, only the data stored in the valid data storage units is used. Each such data item is extended to twice its length before it is passed to its operation, and the result is output to a data storage unit of double length, which solves the above problem. The solution is also effective when one of the data input means has only a single valid data storage unit. Sign extension of the data is also taken into account during the operation.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a SIMD (Single Instruction Multiple Data) type arithmetic unit in a microprocessor.

[0002]

[Prior Art] SIMD is one method by which a microprocessor processes operations on multiple data items in parallel. In general, under the SIMD method the arithmetic unit of a microprocessor processes in parallel, with a single instruction, a plurality of operations that have the same function but act on different data. This has the advantages that the instruction supply unit and the control unit can be shared and that execution time can be shortened.

[0003]

[Problems to Be Solved by the Invention] In a conventional SIMD type arithmetic unit, the data length of the data storage units of the input means is the same as the data length of the data storage units of the output means. Consequently, when an operation overflows or when a double-precision operation is performed, the operation producing the lower half of the output and the operation producing the upper half of the output must each be performed at least once, and a separate step is then needed to combine the upper and lower results.

[0004] For example, FIGS. 1(a), 1(b), and 1(c) show conventional SIMD operations in which the 64-bit input means 2 consists of eight 8-bit data items, four 16-bit data items, and two 32-bit data items, respectively. In a conventional SIMD operation, if the data length of each input data storage unit 4 is 8 bits, eight identical 8-bit operations are executed simultaneously and the results are output to the output means 6 [illustration of the ADD instruction, FIG. 1(a)]. To perform a double-precision operation with this configuration, the lower 8 bits of the output must first be computed and the result stored in memory; the upper 8 bits of the output are then computed, and the previously stored lower data is loaded to assemble the double-precision data.
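The limitation described in paragraph [0004] can be illustrated with a small software model. This is only a sketch, not circuitry from the patent: the helper names `split_lanes`, `join_lanes`, and `simd_add_same_width` are hypothetical, and a 64-bit register is modeled as a Python integer with lane 0 in the least significant bits.

```python
def split_lanes(word, lane_bits, total_bits=64):
    """Split a register value into lanes; lane 0 is the least significant."""
    mask = (1 << lane_bits) - 1
    return [(word >> (i * lane_bits)) & mask
            for i in range(total_bits // lane_bits)]


def join_lanes(lanes, lane_bits):
    """Pack lanes into one register value, truncating each to lane_bits."""
    mask = (1 << lane_bits) - 1
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & mask) << (i * lane_bits)
    return word


def simd_add_same_width(a, b, lane_bits=8):
    """Conventional SIMD ADD: output lanes are as wide as the input lanes."""
    sums = [x + y for x, y in zip(split_lanes(a, lane_bits),
                                  split_lanes(b, lane_bits))]
    return join_lanes(sums, lane_bits)  # the carry out of each lane is dropped


a = join_lanes([200, 1, 2, 3, 4, 5, 6, 7], 8)
b = join_lanes([100, 1, 1, 1, 1, 1, 1, 1], 8)
narrow = split_lanes(simd_add_same_width(a, b), 8)
print(narrow)  # lane 0 holds 300 mod 256 rather than 300
```

Because the carry out of lane 0 is lost, recovering the full 9-bit sum in this model would need a second pass for the upper part and a third step to combine the two halves, which is the three-step cost the paragraph describes.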

[0005] An object of the present invention is to provide a SIMD type arithmetic unit capable of double-precision operations in fewer processing cycles than before. A further object, for cases such as multiplication by a single fixed coefficient, is to provide a SIMD type arithmetic unit [FIG. 2(b)] that executes a SIMD operation with only one coefficient data item prepared [see FIG. 3(a)] and is capable of the double-precision operation described above, and a SIMD type arithmetic unit [FIG. 2(c)] that loads only one coefficient data item into a register, without preparing SIMD-format input data in which the same value is repeated, executes a SIMD operation [broadcast format, see FIG. 3(b)], and is capable of the double-precision operation described above.

[0006]

[Means for Solving the Problems] A first aspect of the present invention is a SIMD type arithmetic unit having two operation data input means and one operation result data output means. The first input means and the second input means each have a predetermined data length and each have 2ⁿ data storage units of equal length; the output means has the same length as the input means and has 2ⁿ⁻¹ data storage units of equal length. In a conventional SIMD type arithmetic unit, data (C), obtained by performing a predetermined operation on the data (A) stored in each data storage unit of the first input means and the data (B) stored in the corresponding data storage unit of the second input means, would be stored in the corresponding data storage unit of the output means. In the SIMD type arithmetic unit of the present invention, by contrast, the data storage units of each of the first and second input means consist of 2ⁿ⁻¹ valid data storage units and 2ⁿ⁻¹ invalid data storage units, and the result of an operation on valid data of the first input means and the corresponding valid data of the second input means is stored in the corresponding data storage unit of the output means.

[0007] In the above arithmetic unit, the 2ⁿ⁻¹ invalid data storage units and the 2ⁿ⁻¹ valid data storage units in each of the first and second input means may be arranged alternately.

[0008] The value of n may vary according to the instruction.

[0009] A second aspect of the present invention is a SIMD type arithmetic unit having two operation data input means and one operation result data output means. The first input means has a predetermined data length and has 2ⁿ data storage units of equal length; the second input means is at least as long as one data storage unit of the first input means and has one data storage unit equal in length to a data storage unit of the first input means; and the output means has the same length as the first input means and has 2ⁿ⁻¹ data storage units of equal length. In a conventional SIMD type arithmetic unit, data (C), obtained by performing a predetermined operation on the data (A) stored in each data storage unit of the first input means and the data (B) stored in the single data storage unit of the second input means, would be stored in the corresponding data storage unit of the output means. In the SIMD type arithmetic unit of the present invention, by contrast, the data storage units of the first input means consist of 2ⁿ⁻¹ valid data storage units and 2ⁿ⁻¹ invalid data storage units, and the result of an operation on each valid data item of the first input means and the data of the second input means is stored in the corresponding data storage unit of the output means.

[0010] In the above arithmetic unit, the 2ⁿ⁻¹ invalid data storage units and the 2ⁿ⁻¹ valid data storage units in the first input means may be arranged alternately.

[0011] The value of n may likewise vary according to the instruction.

[0012] In both the arithmetic unit of the first aspect and that of the second aspect, when the data stored in a valid data storage unit of the input means is extended to the data length of a data storage unit of the output means and passed to the predetermined operation, whether or not to perform sign extension can be selected.

[0013]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the accompanying drawings. FIG. 2(a) shows a SIMD type arithmetic unit (hereinafter, "arithmetic unit") 10 of a first form. The arithmetic unit 10 has a first input register 12, a second input register 14, and an output register 16. In this embodiment, the first input register 12 has a data length of 64 bits and is divided every 8 bits into eight input data storage units 18. These eight input data storage units 18 are grouped into pairs of adjacent units; in each pair the upper unit is an invalid data storage unit (D.C.: Don't Care) 20 and the lower unit is a valid data storage unit 22. The second input register 14 has the same configuration as the first input register 12 and comprises multiple pairs of an upper invalid data storage unit (D.C.: Don't Care) 24 and an adjacent lower valid data storage unit 26.

[0014] Like the input registers 12 and 14, the output register 16 has a data length of 64 bits, and it is divided every 16 bits into four output data storage units (C0 to C3) 28. Each data storage unit 28 is associated with a valid data storage unit 22 of the input register 12 and a valid data storage unit 26 of the input register 14; for example, the data storage unit that stores operation data C0 (hereinafter, "data storage unit C0") corresponds to the valid data storage units A0 and B0.

[0015] In the arithmetic unit 10 configured as above, a predetermined operation is performed on the operation data A0 to A3 stored in the valid data storage units 22 of the first input register 12 and the corresponding operation data B0 to B3 stored in the valid data storage units 26 of the second input register 14, and each result is stored in the corresponding output data storage unit C0 to C3.

[0016] In the arithmetic unit 10 configured in this way, the data length of each output data storage unit 28 is twice the data length of each input data storage unit, so a double-precision operation that conventionally required at least three operations (the upper-half operation, the lower-half operation, and the combining operation) can be performed in a single operation.
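A software model may help make the first form concrete. The sketch below assumes the FIG. 2(a) layout as described in paragraphs [0013] to [0016] (64-bit registers, valid data in the lower 8 bits of each 16-bit pair, four 16-bit outputs). The function names `pack_pairs`, `widening_simd`, and `output_lanes` and the filler value 0xAA for the don't-care units are illustrative assumptions, not part of the patent.

```python
import operator


def pack_pairs(valid, in_bits=8, dont_care=0xAA):
    """Put each valid value in the low unit of a pair; the high unit is filler."""
    word = 0
    for i, value in enumerate(valid):
        pair = (dont_care << in_bits) | (value & ((1 << in_bits) - 1))
        word |= pair << (i * 2 * in_bits)
    return word


def widening_simd(a, b, op, in_bits=8, total_bits=64):
    """Apply op to corresponding valid units, keeping 2*in_bits per result."""
    out_bits = 2 * in_bits
    in_mask = (1 << in_bits) - 1
    out_mask = (1 << out_bits) - 1
    result = 0
    for i in range(total_bits // out_bits):
        shift = i * out_bits
        x = (a >> shift) & in_mask  # the high (don't-care) unit is ignored
        y = (b >> shift) & in_mask
        result |= (op(x, y) & out_mask) << shift
    return result


def output_lanes(word, out_bits=16, total_bits=64):
    """Read the output register as lanes C0, C1, ... (C0 least significant)."""
    mask = (1 << out_bits) - 1
    return [(word >> (i * out_bits)) & mask
            for i in range(total_bits // out_bits)]


a = pack_pairs([200, 255, 16, 1])   # A0..A3
b = pack_pairs([100, 255, 16, 1])   # B0..B3
sums = output_lanes(widening_simd(a, b, operator.add))
products = output_lanes(widening_simd(a, b, operator.mul))
print(sums, products)
```

In this model every sum and product of two 8-bit values fits in its 16-bit output lane, so one pass yields the full result without a separate upper-half pass or combining step.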

[0017] Each valid data storage unit 22 of the first input register 12 must be associated with a different one of the valid data storage units 26 of the second input register 14, because the operation is performed on the basis of this association. Provided the association is established, however, the valid data storage units 22 and 26 and the invalid data storage units 20 and 24 may be arranged in any order within the input registers 12 and 14.
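The freedom of placement described in paragraph [0017] can be sketched by giving each register an explicit list of valid unit positions. This is an illustrative model only: the position lists `a_pos` and `b_pos`, the function name `widening_add_mapped`, and the 0xEE filler bytes are assumptions made for the example, and unit 0 is taken to be the least significant byte.

```python
def widening_add_mapped(a, b, a_pos, b_pos, in_bits=8):
    """Add the i-th valid unit of a to the i-th valid unit of b into lane i."""
    out_bits = 2 * in_bits
    in_mask = (1 << in_bits) - 1
    result = 0
    for lane, (pa, pb) in enumerate(zip(a_pos, b_pos)):
        x = (a >> (pa * in_bits)) & in_mask
        y = (b >> (pb * in_bits)) & in_mask
        result |= (x + y) << (lane * out_bits)  # x + y always fits in out_bits
    return result


# Valid units sit at different positions in each register; 0xEE is filler.
a = int.from_bytes(bytes([250, 0xEE, 0xEE, 6, 7, 0xEE, 8, 0xEE]), "little")
b = int.from_bytes(bytes([0xEE, 10, 3, 0xEE, 0xEE, 4, 0xEE, 5]), "little")
a_pos = [0, 3, 4, 6]  # positions of A0..A3
b_pos = [1, 2, 5, 7]  # positions of B0..B3

result = widening_add_mapped(a, b, a_pos, b_pos)
lanes = [(result >> (16 * i)) & 0xFFFF for i in range(4)]
print(lanes)
```

The only requirement carried over from the text is that the two position lists pair each valid unit of one register with a distinct valid unit of the other.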

[0018] The number 2ⁿ of data storage units 18 in the input registers 12 and 14 (that is, twice the number of data storage units 28 in the output register 16) may vary according to the instruction given to the arithmetic unit 10 (see FIGS. 2(a), (b), and (c)). Even when the value of n varies, a double-precision SIMD operation can be performed in a single operation as above.
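How the register layout follows from n can be worked out directly. The sketch below assumes a 64-bit register, as in the embodiments; the function name `layout` is hypothetical. It also checks that an output unit of twice the input width is wide enough for any unsigned sum or product of two input units.

```python
def layout(n, total_bits=64):
    """Return (input units, bits per input unit, output units, bits per output unit)."""
    input_units = 2 ** n
    input_unit_bits = total_bits // input_units
    output_units = input_units // 2          # 2^(n-1) valid units, one result each
    output_unit_bits = total_bits // output_units
    return input_units, input_unit_bits, output_units, output_unit_bits


for n in (3, 2, 1):
    units_in, bits_in, units_out, bits_out = layout(n)
    largest = (1 << bits_in) - 1
    # A double-width lane holds the largest possible unsigned sum and product.
    assert largest + largest < (1 << bits_out)
    assert largest * largest < (1 << bits_out)
    print(f"n={n}: {units_in} x {bits_in}-bit in, {units_out} x {bits_out}-bit out")
```

With n = 3 this gives the eight 8-bit units and four 16-bit outputs of FIG. 2(a); with n = 2 it gives the four 16-bit units and two 32-bit outputs used for the first input register and output register of FIG. 2(c).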

[0019] Next, FIG. 2(b) shows an arithmetic unit of a second form. Like the arithmetic unit of the first form, it consists broadly of a first input register 32, a second input register (or immediate value, etc.) 34, and an output register 36. The first input register 32 has the same configuration as the first input register 12 of the first embodiment. The second input register (or immediate value, etc.) 34 has a data length of 12 bits, made up of an 8-bit valid data storage unit 42 and a 4-bit invalid data storage unit 40.

[0020] The output register 36 likewise has the same configuration as the output register 16 of the first embodiment. Each data storage unit 44 is associated with the single valid data storage unit 42 of the second input register (or immediate value, etc.) 34 and with a respective valid data storage unit 38 of the first input register 32; for example, data storage unit C0 corresponds to valid data storage units A0 and B0, and C1 corresponds to A1 and B0.

[0021] In this arithmetic unit, a predetermined operation is performed on the operation data A0 to A3 stored in the valid data storage units 38 of the first input register 32 and the operation data B0 stored in the valid data storage unit 42 of the second input register (or immediate value, etc.) 34, and each result is stored in the corresponding output data storage unit C0 to C3.

[0022] In addition to the effect obtained with the first form, the arithmetic unit configured in this way has advantages of its own: there is no need to prepare a second input register holding four copies of the operation data B0 side by side, and the data length of the second input register (or immediate value, etc.) 34 can be shortened.
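The second form can be modeled in the same style, with a single coefficient applied to every valid unit of the first register. This sketch assumes that the 8-bit valid unit B0 occupies the low 8 bits of the 12-bit operand (the text allows other placements); the function name `widening_simd_scalar` and the example values are illustrative only.

```python
import operator


def widening_simd_scalar(a, b_operand, op, in_bits=8, total_bits=64):
    """Apply op(Ai, B0) for every valid unit Ai, keeping 2*in_bits per result."""
    out_bits = 2 * in_bits
    in_mask = (1 << in_bits) - 1
    out_mask = (1 << out_bits) - 1
    b0 = b_operand & in_mask  # assumption: valid unit is the low 8 bits
    result = 0
    for i in range(total_bits // out_bits):
        shift = i * out_bits
        x = (a >> shift) & in_mask  # low unit of each pair is valid
        result |= (op(x, b0) & out_mask) << shift
    return result


a = 0xAAFF_AA1E_AA14_AA0A  # valid bytes A3..A0 = 255, 30, 20, 10
b_operand = 0xF03          # 12-bit operand: upper 4 bits don't care, B0 = 3
result = widening_simd_scalar(a, b_operand, operator.mul)
lanes = [(result >> (16 * i)) & 0xFFFF for i in range(4)]
print(lanes)
```

The coefficient is supplied once rather than replicated four times, which mirrors the advantage stated in paragraph [0022].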

[0023] FIG. 2(c) shows an arithmetic unit of a third form. Like the arithmetic unit of the first form, it consists broadly of a first input register 52, a second input register 54, and an output register 56. The first input register 52 has the same configuration as the first input register 12 of the first embodiment except that it is divided every 16 bits into four input data storage units. The second input register 54 has the same 64-bit data length as the first input register 52 and is made up of a 16-bit valid data storage unit 62 and a 48-bit invalid data storage unit 60.

[0024] The output register 56 likewise has the same configuration as the output register 16 of the first embodiment except that it is divided every 32 bits into two output data storage units (C0, C1) 64. Each data storage unit 64 is associated with the single valid data storage unit 62 of the second input register 54 and with a respective valid data storage unit 58 of the first input register 52; for example, data storage unit C0 corresponds to valid data storage units A0 and B0, and C1 corresponds to A1 and B0.

[0025] In this arithmetic unit, a predetermined operation is performed on the operation data A0 and A1 stored in the valid data storage units 58 of the first input register 52 and the operation data B0 stored in the valid data storage unit 62 of the second input register 54, and each result is stored in the corresponding output data storage unit C0 or C1.

[0026] In addition to the effect obtained with the first form, the arithmetic unit configured in this way has advantages of its own, such as eliminating the need to prepare a second input register holding two copies of the operation data B0 side by side.

[0027] In the second and third forms, the valid data storage units 38 and 58 and the invalid data storage units 46 and 66 may be arranged in any order within the first input registers 32 and 52. In the second input registers 34 and 54, the valid data storage units 42 and 62 may be located anywhere in the register. In the third form, the second input register 54 may also have a longer data length than the first input register 52.

[0028] In the second and third forms, the number 2ⁿ of data storage units 48 and 68 in the first input registers 32 and 52 (that is, twice the number of data storage units 44 and 64 in the output registers 36 and 56) may vary according to the instruction given to the arithmetic unit (see FIGS. 2(a), (b), and (c)). Even when the value of n varies, a double-precision SIMD operation can be performed in a single operation as above.

[0029] In the above description, the data length of the first input register (and, in the first and third forms, of the second input register) and of the output register is 64 bits, but the data length is not limited to this and may be, for example, 128 bits.

[0030] In the arithmetic unit of the present invention, the input data is extended when a double-precision operation is performed. If the data is regarded as signed data (two's complement), the sign of the data is extended into the upper part; if the data is regarded as absolute-value data, "0" is placed in the upper part. This choice is made by using different operation instructions. Specifically, FIG. 4 shows how 8-bit data is extended to 16-bit data when a double-precision operation is performed on it. When the 8-bit input data 200 is regarded as signed data (SIGN), the sign bit X7 is extended into the upper bits before the data is passed to the operation. When the 8-bit input data 200 is regarded as absolute-value data (UNSIGN), "0" is assigned to the extension bits S.
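The two extension modes of paragraph [0030] can be sketched as follows. The function names `extend` and `widening_mul` and the `signed` flag are hypothetical stand-ins for the choice the patent makes through separate instructions; the 8-bit to 16-bit widths follow FIG. 4.

```python
def extend(value, signed, in_bits=8, out_bits=16):
    """Extend an in_bits value to out_bits by sign extension or zero fill."""
    value &= (1 << in_bits) - 1
    sign_bit = value >> (in_bits - 1)          # X7 for 8-bit data
    if signed and sign_bit:
        # Copy the sign bit into every extension bit S.
        value |= ((1 << out_bits) - 1) & ~((1 << in_bits) - 1)
    return value                               # unsigned: extension bits stay 0


def widening_mul(x, y, signed, in_bits=8, out_bits=16):
    """Multiply two extended operands and keep the out_bits-wide result."""
    product = extend(x, signed, in_bits, out_bits) * extend(y, signed, in_bits, out_bits)
    return product & ((1 << out_bits) - 1)


print(hex(extend(0x80, signed=True)))              # sign bit copied upward
print(hex(extend(0x80, signed=False)))             # upper bits filled with 0
print(hex(widening_mul(0xFE, 0x03, signed=True)))  # -2 * 3 in two's complement
print(hex(widening_mul(0xFE, 0x03, signed=False))) # 254 * 3
```

The same bit pattern 0xFE therefore yields 0xFFFA (that is, -6) under the signed interpretation and 0x02FA (762) under the unsigned one, which is why the extension mode has to be selectable.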

[0031]

[Effects of the Invention] As is clear from the above description, the SIMD type arithmetic unit of the present invention can perform, in fewer cycles, a double-precision operation that conventionally took at least three cycles. The operation can also take into account whether the data is signed or unsigned.

[0032] The above double-precision operation can also be realized when one of the two input means has only a single valid data storage unit. In that case too, the operation can take into account whether the data is signed or unsigned.

[Brief description of the drawings]

FIG. 1 is a configuration diagram of a conventional SIMD arithmetic unit (single precision).

FIG. 2 is a configuration diagram of the double-precision SIMD arithmetic unit of the present invention.

FIG. 3 is an explanatory diagram of conventional single-precision broadcast.

FIG. 4 is an explanatory diagram of SIGN extension.

[Explanation of symbols]

2 ... input means
4, 18, 48, 68 ... input data storage units
6 ... output means
10 ... SIMD type arithmetic unit of the first form
12, 32, 52 ... first input register
14, 54 ... second input register
34 ... second input register (or immediate value, etc.)
16, 36, 56 ... output register
20, 46, 66 ... invalid data storage units of the first register
22, 38, 58 ... valid data storage units of the first register
24, 40, 60 ... invalid data storage units of the second register
26, 42, 62 ... valid data storage units
28, 44, 64 ... output data storage units
A0, A1, A2, A3, B0, B1, B2, B3, C0, C1, C2, C3 ... operation data
200 ... 8-bit input operation data

Continuation of front page

(72) Inventor: Sugitaka Otegi, c/o Ricoh Co., Ltd., 1-3-6 Nakamagome, Ota-ku, Tokyo
(72) Inventor: Kazuhiko Hara, c/o Ricoh Co., Ltd., 1-3-6 Nakamagome, Ota-ku, Tokyo

Claims

[Claims]

1. A SIMD (Single Instruction Multiple Data) type arithmetic unit having two operation data input means and one operation result data output means, wherein a first input means and a second input means each have a predetermined data length and each have 2ⁿ data storage units of equal length; the output means has the same length as the input means and has 2ⁿ⁻¹ data storage units of equal length; and, in the arithmetic unit, in which data (C) obtained by performing a predetermined operation on data (A) stored in each data storage unit of the first input means and data (B) stored in the corresponding data storage unit of the second input means is stored in the corresponding data storage unit of the output means, the data storage units of each of the first input means and the second input means consist of 2ⁿ⁻¹ valid data storage units and 2ⁿ⁻¹ invalid data storage units, and the result of an operation on valid data of the first input means and the corresponding valid data of the second input means is stored in the corresponding data storage unit of the output means.

2. The arithmetic unit according to claim 1, wherein, in the data storage units of each of the first input means and the second input means, the 2ⁿ⁻¹ invalid data storage units and the 2ⁿ⁻¹ valid data storage units are arranged alternately.

3. The arithmetic unit according to any one of the preceding claims, wherein the first input means and the second input means have 2ⁿ data storage units and the output means has 2ⁿ⁻¹ data storage units, and the value of n may vary according to the instruction.

4. A SIMD type arithmetic unit having two operation data input means and one operation result data output means, wherein the first input means has a predetermined data length and has 2ⁿ data storage units of equal length; the second input means is at least as long as one data storage unit of the first input means and has one data storage unit equal in length to a data storage unit of the first input means; the output means has the same length as the first input means and has 2ⁿ⁻¹ data storage units of equal length; and, in the arithmetic unit, in which data (C) obtained by performing a predetermined operation on data (A) stored in each data storage unit of the first input means and data (B) stored in the one data storage unit of the second input means is stored in the corresponding data storage unit of the output means, the data storage units of the first input means consist of 2ⁿ⁻¹ valid data storage units and 2ⁿ⁻¹ invalid data storage units, and the result of an operation on each valid data item of the first input means and the data of the second input means is stored in the corresponding data storage unit of the output means.

5. The arithmetic unit according to claim 4, wherein, in the data storage units of the first input means, the 2ⁿ⁻¹ invalid data storage units and the 2ⁿ⁻¹ valid data storage units are arranged alternately.

6. The arithmetic unit according to claim 4 or claim 5, wherein the first input means has 2ⁿ data storage units and the output means has 2ⁿ⁻¹ data storage units, and the value of n may vary according to the instruction.

7. The arithmetic unit according to claim 1, wherein, when the data stored in a valid data storage unit of the input means is extended to the data length of a data storage unit of the output means and passed to the predetermined operation, whether or not to perform sign extension can be selected.