JP2005259120A

JP2005259120A - Signal processing device

Info

Publication number: JP2005259120A
Application number: JP2005025899A
Authority: JP
Inventors: Shuji Miyasaka; 修二宮阪
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2004-02-12
Filing date: 2005-02-02
Publication date: 2005-09-22

Abstract

【課題】キャッシュミスを少なくし、データ伝送時間及びデータの転送量を削減することが可能な信号処理装置を提供する。
【解決手段】
信号処理装置１は、主メモリ１０の第１メモリ領域１１及び第２メモリ領域１２の第１テーブル１３及び第２テーブル１４を用いてデコード処理を行う。信号処理装置１は、第１テーブル１３及び第２テーブル１４のデータを一時格納するためのキャッシュメモリ３４と、キャッシュメモリ３４を介して第１メモリ領域１１及び第２メモリ領域１２の少なくとも１つにアクセスすることによってデータを読み出すプロセッサ２０と、キャッシュメモリ３４に一時格納する空き領域が不足した場合に、キャッシュメモリ３４に空き領域を確保する制御部５１とを備え、制御部５１は、第１メモリ領域１１のデータが第２メモリ領域１２のデータよりも優先してキャッシュメモリ３４に保持されるように、空き領域を確保する。
【選択図】図１PROBLEM TO BE SOLVED: To provide a signal processing device capable of reducing a cache miss and reducing a data transmission time and a data transfer amount.
[Solution]
The signal processing device 1 performs a decoding process using the first table 13 and the second table 14 of the first memory area 11 and the second memory area 12 of the main memory 10. The signal processing apparatus 1 includes a cache memory 34 for temporarily storing data of the first table 13 and the second table 14, and at least one of the first memory area 11 and the second memory area 12 via the cache memory 34. A processor 20 that reads data by accessing and a control unit 51 that secures a free area in the cache memory 34 when a free area temporarily stored in the cache memory 34 is insufficient, the control unit 51 includes a first memory An empty area is secured so that the data in the area 11 is held in the cache memory 34 in preference to the data in the second memory area 12.
[Selection] Figure 1

Description

本発明は、第１及び第２メモリ領域のデータを用いて信号処理を行う信号処理装置に関し、特にキャッシュメモリを介して第１及び第２メモリ領域にアクセスし、第１及び第２メモリ領域のデータを読み出すことにより信号処理を行う信号処理装置に関する。 The present invention relates to a signal processing apparatus that performs signal processing using data in first and second memory areas, and in particular, accesses the first and second memory areas via a cache memory, and stores the first and second memory areas. The present invention relates to a signal processing device that performs signal processing by reading data.

例えば、音楽データについて例えばＭＰＥＧ−ＡＡＣ規格のハフマンコードブック（例えば、８）を用いてハフマン符号にエンコードし、再生の際にエンコードされた符号をデコードするような信号処理が行われる。 For example, signal processing is performed such that music data is encoded into a Huffman code using, for example, a Huffman codebook (for example, 8) of the MPEG-AAC standard, and the encoded code is decoded during reproduction.

図２２は、ＭＰＥＧ−ＡＡＣ規格のハフマンコードブック８を示している。
このコードブックを用いてどのようにハフマンデコードが行われるか以下説明する。 FIG. 22 shows the Huffman codebook 8 of the MPEG-AAC standard.
How Huffman decoding is performed using this code book will be described below.

例えば、ハフマン符号化された入力のビット列が、「０１１１０１１００００１１１１１１０１１０１０００１１１１ｘｘｘｘ」と続いているとする。 For example, it is assumed that the Huffman-encoded input bit string continues as “01110110000111111011010001111xxxx”.

その場合、最初の５ビットが、当該テーブルの最初のエントリ「０１１１０」に合致するので、デコード結果は「０」となり、当該符号長は５ビットであることがわかる。さらに、続くビットが、「１１００００１１１１１１０１１０１０００１１１１ｘｘｘｘ」となっているので、最初の６ビットが、当該テーブルの４番目のエントリ「１１００００」に合致するので、デコード結果は「３」となり、当該符号長は６ビットであることがわかる。さらに、続くビットが、「１１１１１１０１１０１０００１１１１ｘｘｘｘ」となっているので、最初の９ビットが、当該テーブルの最後から３番目のエントリ「１１１１１１０１１」に合致するので、デコード結果は「６１」となり、当該符号長は９ビットであることがわかる。さらに、続くビットが、「０１０００１１１１ｘｘｘｘ」となっているので、最初の４ビットが、当該テーブルの１１番目のエントリ「０１００」に合致するので、デコード結果は「１０」となり、当該符号長は４ビットであることがわかる。さらに、続くビットが、「０１１１１ｘｘｘｘ」となっているので、最初の５ビットが、当該テーブルの１７番目のエントリ「０１１１１」に合致するので、デコード結果は「１６」となり、当該符号長は５ビットであることがわかる。 In this case, since the first 5 bits match the first entry “01110” of the table, the decoding result is “0”, which indicates that the code length is 5 bits. Furthermore, since the subsequent bits are “110000111111011010001111xxxx”, the first 6 bits match the fourth entry “110000” in the table, so the decoding result is “3”, and the code length is 6 bits. It can be seen that it is. Furthermore, since the following bits are “111111011010001111xxxx”, the first 9 bits match the third entry “111111011” from the end of the table, so the decoding result is “61” and the code length is It can be seen that it is 9 bits. Furthermore, since the subsequent bits are “010001111xxxx”, the first 4 bits match the 11th entry “0100” of the table, so the decoding result is “10” and the code length is 4 bits. It can be seen that it is. Furthermore, since the subsequent bits are “01111xxxx”, the first 5 bits match the 17th entry “01111” of the table, so the decoding result is “16” and the code length is 5 bits. It can be seen that it is.

このように、ハフマンコードブック８の各行を総当たり方式でサーチすることによりハフマンデコードが行われる。 In this way, Huffman decoding is performed by searching each line of the Huffman codebook 8 in a brute force manner.

しかしながら、このデコード方法は、一致するハフマン符号が見つかるまでハフマンコードブック８を何度もサーチする必要があるため、非常に多くの処理時間を必要とするという問題を有している。 However, this decoding method has a problem that it requires a very long processing time because it is necessary to search the Huffman codebook 8 many times until a matching Huffman code is found.

このため、本願に係る発明者らは、ハフマンデコード処理を高速に行うための装置を発明した（例えば、特許文献１参照）。 For this reason, the inventors of the present application have invented an apparatus for performing Huffman decoding at high speed (see, for example, Patent Document 1).

この発明を適用した従来の技術を、ＭＰＥＧ−ＡＡＣ規格のハフマンコードブック８を用いてエンコードされた符号をデコードする過程を例にとり、以下説明する。 A conventional technique to which the present invention is applied will be described below by taking as an example a process of decoding a code encoded using the Huffman codebook 8 of the MPEG-AAC standard.

図２３は、ハフマンデコード処理を高速に行うための２つのテーブルを示す図である。特に図２３（ａ）は第１テーブル５０００を、図２３（ｂ）は第２テーブル６０００を、それぞれ示している。 FIG. 23 is a diagram showing two tables for performing the Huffman decoding process at high speed. In particular, FIG. 23A shows the first table 5000, and FIG. 23B shows the second table 6000.

これらの第１及び第２テーブル５０００，６０００は、ＳＲＡＭ等のメモリの所定のアドレスに記憶され、デコーダがハフマンコードのビット列に基づいて第１及び第２テーブル５０００，６０００を引くことによりデコード処理が行われる。 These first and second tables 5000 and 6000 are stored at predetermined addresses of a memory such as SRAM, and the decoder subtracts the first and second tables 5000 and 6000 based on the bit string of the Huffman code, thereby performing the decoding process. Done.

この２つの第１及び第２テーブル５０００，６０００を用いる場合、入力のビット列は、以下のようにデコードされる。 When these two first and second tables 5000 and 6000 are used, the input bit string is decoded as follows.

例えば、入力のビット列が、先の例と同様「０１１１０１１００００１１１１１１０１１０１０００１１１１ｘｘｘｘ」と続いている場合、デコーダは、まず最初の６ビットを取り出し、当該６ビットをアドレスとして、第１テーブル５０００を引く。この場合、「０１１１０１」をアドレスとして第１テーブル５０００を引くので、デコーダは、図２３（ａ）中に示される１回目のアクセスポイントα１をアクセスし、符号長が５ビットで、デコード結果が「０」であることがわかる。そして、デコーダは、符号長が５ビットであったので、上記ビット列を５ビット進めると、ビット列は「１１００００１１１１１１０１１０１０００１１１１ｘｘｘｘ」となる。 For example, when the input bit string continues as “01110110000111111011010001111xxxx” as in the previous example, the decoder first extracts the first 6 bits, and draws the first table 5000 using the 6 bits as an address. In this case, since the first table 5000 is drawn with “011101” as the address, the decoder accesses the first access point α1 shown in FIG. 23A, the code length is 5 bits, and the decoding result is “ It turns out that it is "0". Since the code length of the decoder is 5 bits, when the bit string is advanced by 5 bits, the bit string becomes “110000111111011010001111xxxx”.

次に、デコーダは、先ほどと同様に当該ビット列の最初の６ビットを取り出し、取り出した６ビットをアドレスとして、第１テーブル５０００を引く。この場合、「１１００００」をアドレスとして、第１テーブル５０００を引くので、デコーダは、図２３（ａ）中に示される２回目のアクセスポイントα２をアクセスし、符号長が６ビットで、デコード結果が「３」であることがわかる。そして、符号長が６ビットであったので、上記ビット列を６ビット進めると、ビット列は「１１１１１１０１１０１０００１１１１ｘｘｘｘ」となる。 Next, the decoder extracts the first 6 bits of the bit string in the same manner as described above, and draws the first table 5000 using the extracted 6 bits as an address. In this case, since the first table 5000 is drawn with “110000” as the address, the decoder accesses the second access point α2 shown in FIG. 23A, the code length is 6 bits, and the decoding result is It turns out that it is "3". Since the code length is 6 bits, when the bit string is advanced by 6 bits, the bit string becomes “111111011010001111xxxx”.

さらに、デコーダは、先ほどと同様に当該ビット列の最初の６ビットを取り出し、取り出した６ビットをアドレスとして、第１テーブル５０００を引く。この場合、「１１１１１１」をアドレスとして、第１テーブル５０００を引くので、図２３（ａ）中に示される３回目のアクセスポイントα３をアクセスし、未完了記号「１５」、次のアドレスポインタ「１１０００１０」、次に読み込むビット数４を得るので、図２３（ｂ）に示される第２テーブル６０００のアドレス「１１０００１０」にポインタを進め、上記ビット列を６ビット進めた後にさらに４ビットを取得し、当該４ビットをアドレス「１１０００１０」からのインデックスとし第２テーブル６０００を引く。この場合、当該４ビットの値が「０１１０」であるので、図２３（ｂ）中に示される４回目のアクセスポイントα４をアクセスすることになるので、デコーダは、符号長が３ビットで、デコード結果が「６１」であることがわかる。そして、上記符号長が３ビットであったので上記ビット列を３ビット進めると、ビット列は「０１０００１１１１ｘｘｘｘ」となる。 Further, the decoder extracts the first 6 bits of the bit string in the same manner as described above, and draws the first table 5000 using the extracted 6 bits as an address. In this case, since the first table 5000 is drawn with “111111” as the address, the third access point α3 shown in FIG. 23A is accessed, the incomplete symbol “15”, and the next address pointer “1100010”. In order to obtain the number of bits 4 to be read next, the pointer is advanced to the address “1100010” of the second table 6000 shown in FIG. 23B, and after the bit string is advanced by 6 bits, further 4 bits are obtained, The second table 6000 is subtracted using 4 bits as an index from the address “1100010”. In this case, since the 4-bit value is “0110”, the fourth access point α4 shown in FIG. 23B is accessed, so the decoder has a code length of 3 bits and decodes It can be seen that the result is “61”. Since the code length is 3 bits, when the bit string is advanced by 3 bits, the bit string becomes “010001111xxxx”.

次に、デコーダは、先ほどと同様に当該ビット列の最初の６ビットを取り出し、当該６ビットをアドレスとして、第１テーブル５０００を引く。この場合、「０１０００１」をアドレスとして第１テーブル５０００を引くので、デコーダは、図２３（ａ）中に示される５回目のアクセスポイントα５をアクセスし、符号長が４ビットで、デコード結果で「１０」であることがわかる。上記符号長が４ビットであったので上記ビット列を４ビット進めると、ビット列は「０１１１１ｘｘｘｘ」となる。 Next, the decoder takes out the first 6 bits of the bit string in the same manner as before, and draws the first table 5000 using the 6 bits as an address. In this case, since the first table 5000 is drawn with “010001” as the address, the decoder accesses the fifth access point α5 shown in FIG. 23A, the code length is 4 bits, and the decoding result is “ 10 ". Since the code length is 4 bits, if the bit string is advanced by 4 bits, the bit string becomes “01111xxxx”.

次に、デコーダは、先ほどと同様に当該ビット列の最初の６ビットを取り出し、当該６ビットをアドレスとして、第１テーブル５０００を引く。この場合、「０１１１１ｘ」をアドレスとして、第１テーブル５０００を引くので、図２３（ａ）中に示す、６回目のアクセスポイントα６をアクセスし、符号長が５ビットで、デコード結果が「１６」であることがわかる。 Next, the decoder takes out the first 6 bits of the bit string in the same manner as before, and draws the first table 5000 using the 6 bits as an address. In this case, since the first table 5000 is drawn using “01111x” as the address, the sixth access point α6 shown in FIG. 23A is accessed, the code length is 5 bits, and the decoding result is “16”. It can be seen that it is.

上記のような処理過程をみると明らかなように、符号長の短い符号（発生頻度の高い符号）に対しては、１回のテーブルアクセスでハフマンデコードが完了し、符号長の長い符号（発生頻度の低い符号）に対してのみ、２回のテーブルアクセスでハフマンデコードが完了するので、平均として高速な信号処理が実現できる。 As can be seen from the above process, Huffman decoding is completed with one table access for a code with a short code length (a code with a high frequency of occurrence). Since Huffman decoding is completed with only two table accesses only for a low frequency code), high-speed signal processing can be realized on average.

上記が、筆者らが発明した従来の技術である。
ところで、近年、携帯電話機などのようなコンピュータ装置においては、電話をしたりするだけでなく、オーディオの再生だけでなく、ビデオを再生したり、メールを送受信したりできるように、多機能化が図られている。 The above is the conventional technique invented by the authors.
By the way, in recent years, in a computer device such as a mobile phone, not only a phone call but also an audio playback, a video playback, and a mail transmission / reception can be performed. It is illustrated.

このような複数の機能を実現するために、各機能毎にその機能専用の回路を設けると、回路規模が膨大となる。このため、必要に応じてテーブル等のデータを外部メモリ等に格納し、信号処理に必要なときに外部メモリのデータをキャッシュメモリに格納することで、レスポンスを落とさずに、各機能を実現することが考えられている。 In order to realize such a plurality of functions, if a circuit dedicated to each function is provided for each function, the circuit scale becomes enormous. For this reason, each function is realized without dropping the response by storing data such as a table in an external memory as necessary, and storing data in the external memory in a cache memory when necessary for signal processing. It is considered.

このような、キャッシュメモリを搭載した演算装置（信号処理装置）で、上記のようなハフマンデコード処理を行う場合がある。この場合の従来の技術について述べるが、まず、キャッシュメモリの振る舞いについて、簡単に説明する。 Such an Huffman decoding process may be performed by an arithmetic unit (signal processing unit) equipped with a cache memory. The conventional technology in this case will be described. First, the behavior of the cache memory will be briefly described.

図２４は、主メモリ１０００と、プロセッサ２０００と、キャッシュメモリ装置３０００との関係を示した図である。 FIG. 24 is a diagram showing the relationship among the main memory 1000, the processor 2000, and the cache memory device 3000.

キャッシュメモリ装置３０００は、図２４に示されるようにデータを格納するためのキャッシュメモリ３１００と、リプレース制御するための制御部３２００とを備える。 As shown in FIG. 24, the cache memory device 3000 includes a cache memory 3100 for storing data and a control unit 3200 for replacement control.

キャッシュメモリ３１００は、ラインと呼ばれる単位でデータが管理される。通常、ラインのサイズは、３２バイトとか１２８バイトといった値であり、キャッシュメモリのサイズは、１６ｋバイトとか３２ｋバイトといった値であるので、キャッシュメモリ３１００は、数百から数千のラインで区切られているが、ここでは、説明を簡明にするために、４つのラインで区切られているものとして説明する。 The cache memory 3100 manages data in units called lines. Normally, the line size is a value such as 32 bytes or 128 bytes, and the cache memory size is a value such as 16 kbytes or 32 kbytes. Therefore, the cache memory 3100 is divided by several hundred to several thousand lines. However, here, in order to simplify the description, the description will be made assuming that the lines are divided by four lines.

主メモリ１０００の記憶領域も、当該ラインサイズ毎に区切られる。そして、主メモリ１０００の何番目のラインがキャッシュメモリ３１００のデータエリア３４００のどこに格納されているかということは、キャッシュメモリ３１００内のタグ情報のエリア（タグエリア）３３００で管理される。 The storage area of the main memory 1000 is also divided for each line size. The number line of the main memory 1000 and where it is stored in the data area 3400 of the cache memory 3100 is managed in the tag information area (tag area) 3300 in the cache memory 3100.

この図２４では、主メモリ１０００のＬｉｎｅ１，Ｌｉｎｅ３，Ｌｉｎｅ６，Ｌｉｎｅ８のデータがキャッシュメモリ３１００のデータエリア３４００に格納され、当該ラインを識別するための情報がタグエリア３３００に格納されている状態が示されている。 FIG. 24 shows a state in which data of Line 1, Line 3, Line 6, Line 8 of the main memory 1000 is stored in the data area 3400 of the cache memory 3100 and information for identifying the line is stored in the tag area 3300. Has been.

従って、プロセッサ２０００は、主メモリ１０００の当該ライン（Ｌｉｎｅ１，Ｌｉｎｅ３，Ｌｉｎｅ６，Ｌｉｎｅ８）のデータをアクセスする場合は、キャッシュメモリ３１００をアクセスすることで行える。 Therefore, the processor 2000 can access the cache memory 3100 when accessing data on the line (Line1, Line3, Line6, Line8) of the main memory 1000.

しかしながら、例えば主メモリ１０００のＬｉｎｅ９をアクセスする場合は、当該領域が今、キャッシュメモリ３１００に格納されておらず、かつ、既にキャッシュメモリ３１００内に空き領域がないので、キャッシュメモリ３１００に今格納されているいずれかのラインのデータを一旦、主メモリ１０００に書き戻すことによって、キャッシュメモリ３１００に空き領域を確保し、当該空いた領域にＬｉｎｅ９のデータを格納することになる。 However, for example, when accessing the Line 9 of the main memory 1000, the area is not currently stored in the cache memory 3100, and there is no empty area in the cache memory 3100, so the area is now stored in the cache memory 3100. By temporarily writing back the data of one of the lines to the main memory 1000, a free area is secured in the cache memory 3100, and the data of Line 9 is stored in the free area.

このようなキャッシュメモリ３１００を搭載した信号処理装置で、先に説明したハフマンデコード処理を行う場合、例えば、ＭＰＥＧオーディオのＡＡＣ方式の場合、大きなメモリ領域を必要とするので、主メモリ１０００にハフマンデコードに必要なすべてのテーブルデータを格納しておいて、逐次、キャッシュメモリ３１００にデータを移しながらハフマンデコード処理を行うことになる。 When the above-described Huffman decoding process is performed by a signal processing apparatus equipped with such a cache memory 3100, for example, in the case of the MPEG audio AAC method, a large memory area is required. All the necessary table data is stored, and the Huffman decoding process is performed while sequentially transferring the data to the cache memory 3100.

なお、ここで、アクセスとは、アドレスを指定してデータを読み出したり、書き込んだりすることを意味する。
特開２０００−２８６７１７号公報 Here, “access” means reading or writing data by designating an address.
JP 2000-286717 A

しかしながら、キャッシュメモリを用いた信号処理装置で上記に示したハフマンデコード処理を行うと、以下に示すような問題が生じる。 However, when the Huffman decoding process described above is performed by a signal processing apparatus using a cache memory, the following problems occur.

図２５は、キャッシュメモリ３１００を用いてハフマンデコードを行う場合の従来の信号処理装置の構成を示すブロック図である。なお、従来の信号処理装置と、本願発明に係る信号処理装置との相違をより明確にするために、ここでは第１及び第２テーブル５０００，６０００に代えて、後述する第１及び第２テーブル１３，１４を用いた場合について説明する。 FIG. 25 is a block diagram showing a configuration of a conventional signal processing apparatus when Huffman decoding is performed using the cache memory 3100. In order to clarify the difference between the conventional signal processing apparatus and the signal processing apparatus according to the present invention, here, instead of the first and second tables 5000 and 6000, first and second tables described later are used. The case where 13 and 14 are used will be described.

信号処理装置は、図２５に示されるように、主メモリ１０００、アクセス手段としてのプロセッサ２０００及びキャッシュメモリ３１００とを備え、主メモリ１０００の第１メモリ領域１００１に第１テーブル１３が格納され、主メモリ１０００の第２メモリ領域１００２に第２テーブル１４が格納されることにより構成される。 As shown in FIG. 25, the signal processing apparatus includes a main memory 1000, a processor 2000 as an access unit, and a cache memory 3100. The first table 13 is stored in the first memory area 1001 of the main memory 1000, and This is configured by storing the second table 14 in the second memory area 1002 of the memory 1000.

第１テーブル１３は、８つのライン（Ｌｉｎｅ０〜Ｌｉｎｅ７）から構成されており、第２テーブル１４は７つのライン（Ｌｉｎｅ８〜Ｌｉｎｅ１４）から構成されている。第１テーブル１３及び第２テーブル１４のそれぞれの具体的内容を、図２，図３にそれぞれ示す。第１テーブル１３及び第２テーブル１４は、従来の技術で先に示した図２３（ａ），図２３（ｂ）と同様のデータを格納したテーブルであるが、キャッシュメモリの構成に併せてラインサイズ毎に区切られている点が異なる。 The first table 13 includes eight lines (Line 0 to Line 7), and the second table 14 includes seven lines (Line 8 to Line 14). Specific contents of the first table 13 and the second table 14 are shown in FIGS. 2 and 3, respectively. The first table 13 and the second table 14 are tables in which data similar to those shown in FIGS. 23A and 23B described above in the prior art are stored. The difference is that each size is separated.

さて、今、入力のビット列が、先の例と同様に「０１１１０１１００００１１１１１１０１１０１０００１１１１ｘｘｘｘ」と続いている場合、プロセッサ２０００は、まず、当該ビット列の最初の６ビットをアドレスとして、第１テーブル１３を引く。この場合、「０１１１０１」をアドレスとして、第１テーブル１３を引くが、当該アドレスが図２に示されるようにＬｉｎｅ３にあたるので、当該Ｌｉｎｅ３のデータが主メモリ１０００からキャッシュメモリ３１００に移された後（図２６参照）、プロセッサ２０００は、当該データをアクセスし、符号長が５ビットで、デコード結果が「０」であることがわかる。上記符号長が５ビットであったので、プロセッサ２０００は、上記ビット列を５ビット進める。そうすると、次のビット列は、「１１００００１１１１１１０１１０１０００１１１１ｘｘｘｘ」となる。 Now, when the input bit string continues as “01110110000111111011010001111xxxx” as in the previous example, the processor 2000 first draws the first table 13 using the first 6 bits of the bit string as an address. In this case, the first table 13 is drawn with “011101” as the address, but since the address corresponds to Line 3 as shown in FIG. 2, the data of Line 3 is transferred from the main memory 1000 to the cache memory 3100 ( The processor 2000 accesses the data, and it can be seen that the code length is 5 bits and the decoding result is “0”. Since the code length is 5 bits, the processor 2000 advances the bit string by 5 bits. Then, the next bit string becomes “110000111111011010001111xxxx”.

次に、プロセッサ２０００は、先ほどと同様に、当該ビット列の最初の６ビットをアドレスとして、第１テーブル１３を引く。この場合、「１１００００」をアドレスとして、第１テーブル１３を引くが、当該アドレスは、図２に示されるようにＬｉｎｅ６にあたるので、当該Ｌｉｎｅ６のデータが主メモリ１０００からキャッシュメモリ３１００に移された後（図２７参照）、プロセッサ２０００は、当該データをアクセスし、符号長が６ビットで、デコード結果が「３」であることがわかる。上記符号長が６ビットであったので上記ビット列を６ビット進めると、ビット列は「１１１１１１０１１０１０００１１１１ｘｘｘｘ」となる。 Next, the processor 2000 draws the first table 13 using the first 6 bits of the bit string as an address, as before. In this case, the first table 13 is drawn with “110000” as the address. Since the address corresponds to Line 6 as shown in FIG. 2, the data of Line 6 is transferred from the main memory 1000 to the cache memory 3100. (See FIG. 27.) The processor 2000 accesses the data, and it can be seen that the code length is 6 bits and the decoding result is “3”. Since the code length is 6 bits, when the bit string is advanced by 6 bits, the bit string becomes “111111011010001111xxxx”.

さらに、先ほどと同様に、プロセッサ２０００は、当該ビット列の最初の６ビットをアドレスとして、第１テーブル１３を引く。この場合、「１１１１１１」をアドレスとして、第１テーブル１３を引くが、当該アドレスは、図２に示されるようにＬｉｎｅ７にあたるので、プロセッサ２０００は、当該Ｌｉｎｅ７のデータが主メモリ１０００からキャッシュメモリ３１００に移された後（図２８参照）、当該データにアクセスし、未完了記号「１５」、次のアドレスポインタ「１１０００１０」、次に読み込むビット数「４」を得る。 Further, as before, the processor 2000 draws the first table 13 using the first 6 bits of the bit string as an address. In this case, the first table 13 is drawn with “111111” as the address. Since the address corresponds to Line 7 as shown in FIG. 2, the processor 2000 causes the data of the Line 7 to be transferred from the main memory 1000 to the cache memory 3100. After the transfer (see FIG. 28), the data is accessed, and an incomplete symbol “15”, the next address pointer “1100010”, and the number of bits to be read “4” are obtained.

そこで、プロセッサ２０００は、第２テーブル１４のアドレス「１１０００１０」にポインタを進め、上記ビット列を６ビット進めた後に、続く４ビットの値をアドレス「１１０００１０」からのインデックスとし第２テーブル１４を引く。この場合、当該４ビットの値は、「０１１０」であるので、アドレス「１１０００１０」に「０１１０」を足して、次にアクセスするアドレスは「１１０１０００」となる。つまり、プロセッサ２０００は、次のアドレスポインタ「１１０００１０」に「０１１０」を足して、次にアクセスするアドレス「１１０１０００」を算出する。 Therefore, the processor 2000 advances the pointer to the address “1100010” of the second table 14, advances the bit string by 6 bits, and then subtracts the second table 14 using the subsequent 4-bit value as an index from the address “1100010”. In this case, since the 4-bit value is “0110”, “0110” is added to the address “1100010”, and the next access address is “1101000”. That is, the processor 2000 adds “0110” to the next address pointer “1100010” to calculate the next access address “1101000”.

このアドレスは図３に示されるように、Ｌｉｎｅ１３にあたるので、プロセッサ２０００は、当該Ｌｉｎｅ１３のデータが主メモリ１０００からキャッシュメモリ３１００に移された後（図２９参照）、当該データをアクセスし、符号長が３ビットで、デコード結果が「６１」であることがわかる。上記符号長が３ビットであったので上記ビット列を３ビット進めると、ビット列は「０１０００１１１１ｘｘｘｘ」となる。 Since this address corresponds to Line 13, as shown in FIG. 3, the processor 2000 accesses the data after the data of Line 13 is transferred from the main memory 1000 to the cache memory 3100 (see FIG. 29), and the code length Is 3 bits and the decoding result is “61”. Since the code length is 3 bits, if the bit string is advanced by 3 bits, the bit string becomes “010001111xxxx”.

次に、先ほどと同様に、プロセッサ２０００は、当該ビット列の最初の６ビットをアドレスとして、第１テーブル１３を引く。この場合、「０１０００１」をアドレスとして、第１テーブル１３を引くが、当該アドレスは、図２に示されるようにＬｉｎｅ２にあたるので、当該Ｌｉｎｅ２のデータが主メモリ１０００からキャッシュメモリ３１００に移された後、当該データにアクセスしようとする。 Next, as before, the processor 2000 draws the first table 13 using the first 6 bits of the bit string as an address. In this case, the first table 13 is drawn using “010001” as an address. Since the address corresponds to Line 2 as shown in FIG. 2, the data of Line 2 is transferred from the main memory 1000 to the cache memory 3100. Try to access the data.

ところが今、キャッシュメモリに空き領域はない。このようにキャッシュメモリ３１００を埋め尽くしたデータに必要なデータがない場合、すなわちキャッシュミスした場合、制御部３２００は、キャッシュメモリ３１００に格納されているいずれかのラインのデータを主メモリ１０に書き戻し、それによって生じた空き領域に当該Ｌｉｎｅ２のデータを移す必要がある。 However, there is no free space in the cache memory now. When there is no necessary data in the data in which the cache memory 3100 is completely filled, that is, when a cache miss occurs, the control unit 3200 writes the data of any line stored in the cache memory 3100 to the main memory 10. It is necessary to return and transfer the data of the Line 2 to the free space generated by that.

このようなキャッシュミスが発生した場合、通常、一番過去にアクセスされたＬｉｎｅが書き戻されるので、この場合、制御部３２００は、Ｌｉｎｅ３を書き戻し（図３０参照）、そこにＬｉｎｅ２のデータを移す（図３１参照）。その後、プロセッサ２０００は、当該Ｌｉｎｅ２のデータをアクセスし、符号長が４ビットで、デコード結果が「１０」であることがわかる。上記符号長が４ビットであったので上記ビット列を４ビット進めると、ビット列は「０１１１１ｘｘｘｘ」となる。 When such a cache miss occurs, the most recently accessed Line is usually written back. In this case, the control unit 3200 writes Line 3 back (see FIG. 30), and Line 2 data is written there. (See FIG. 31). After that, the processor 2000 accesses the data of Line 2 and finds that the code length is 4 bits and the decoding result is “10”. Since the code length is 4 bits, if the bit string is advanced by 4 bits, the bit string becomes “01111xxxx”.

さて次に、先ほどと同様に、プロセッサ２０００は当該ビット列の最初の６ビットをアドレスとして、第１テーブル１３を引く。この場合、「０１１１１ｘ」をアドレスとして、第１テーブル１３を引くが、当該アドレスは、図２に示されるようにＬｉｎｅ３にあたるので、当該Ｌｉｎｅ３のデータを主メモリ１０００からキャッシュメモリ３１００に移した後、当該データにアクセスしようとする。ところが今、キャッシュメモリに空き領域はないので、制御部３２００は、キャッシュメモリ３１００に格納されているいずれかのラインのデータを主メモリ１０００に書き戻し、それによって生じた空き領域に当該Ｌｉｎｅ３のデータを移す。 Next, as before, the processor 2000 draws the first table 13 using the first 6 bits of the bit string as an address. In this case, the first table 13 is drawn using “01111x” as an address. Since the address corresponds to Line 3 as shown in FIG. 2, the data of Line 3 is transferred from the main memory 1000 to the cache memory 3100. Try to access the data. However, since there is no free space in the cache memory now, the control unit 3200 writes back the data of any line stored in the cache memory 3100 to the main memory 1000, and the data of the Line 3 in the free space generated thereby. Move.

このようなキャッシュミスが発生した場合、通常、ＬＲＵ方式により一番過去にアクセスされたＬｉｎｅが書き戻されるので、この場合、制御部３２００は、Ｌｉｎｅ６を書き戻し（図３２参照）、そこにＬｉｎｅ３のデータを移す（図３３参照）。その後、プロセッサ２０００は、当該Ｌｉｎｅ３のデータをアクセスし、符号長が５ビットで、デコード結果１６であることがわかる。 When such a cache miss occurs, the most recently accessed Line is normally written back by the LRU method. In this case, the control unit 3200 writes Line 6 back (see FIG. 32), and Line 3 is written there. (See FIG. 33). Thereafter, the processor 2000 accesses the data of the Line 3 and finds that the code length is 5 bits and the decoding result is 16.

なお、上記過程において、上記説明では、Ｌｉｎｅ３のデータや、Ｌｉｎｅ６のデータがキャッシュメモリ３１００から主メモリ１０００に書き戻されるとしたが、本一連の動作の過程では、Ｌｉｎｅ３のデータや、Ｌｉｎｅ６のデータは、キャッシュメモリに格納されている間に値が変化していないので、つまりキャッシュメモリ上での値と主メモリ１０００上での値が同一であるので、従来の多くのキャッシュメモリの制御方法では、書き戻すという動作はなく、Ｌｉｎｅ２のデータをＬｉｎｅ３のデータが格納されていたキャッシュメモリ３１００上に上書きしたり、Ｌｉｎｅ３のデータをＬｉｎｅ６のデータが格納されていたキャッシュメモリ上に上書きする、という処理が行われることになる。 In the above process, in the above description, the Line 3 data and the Line 6 data are written back from the cache memory 3100 to the main memory 1000. However, in this series of operations, the Line 3 data and the Line 6 data are written. Since the value does not change while being stored in the cache memory, that is, the value on the cache memory is the same as the value on the main memory 1000, many conventional cache memory control methods There is no operation of writing back, and the process of overwriting the data of Line 2 on the cache memory 3100 in which the data of Line 3 is stored, or overwriting the data of Line 3 on the cache memory in which the data of Line 6 is stored Will be done.

しかしながら、上記過程から明らかなように、ＬＲＵ方式でリプレース対象を決定しているので、キャッシュミスが多発するという問題がある。このようなキャッシュミスの多発は、短い時間の間に、主メモリ１０００とキャッシュメモリ３１００の間でＬｉｎｅ３等のデータ転送が複数回（２回）発生し、データの転送に多大の時間を要し、消費電力も増加するという問題を招く。 However, as is clear from the above process, since the replacement target is determined by the LRU method, there is a problem of frequent cache misses. Such frequent cache misses cause multiple transfers (two times) of data such as Line 3 between the main memory 1000 and the cache memory 3100 in a short time, and it takes a lot of time to transfer the data. This causes the problem of increased power consumption.

このような問題は、デコード処理の場合に限られず、キャッシュメモリを介して第１及び第２メモリ領域にアクセスし、第１及び第２メモリ領域のデータを読み出すことにより信号処理を行う信号処理装置一般にいえることである。 Such a problem is not limited to the decoding process, and the signal processing apparatus performs signal processing by accessing the first and second memory areas via the cache memory and reading the data in the first and second memory areas. This is generally true.

本発明は、このような従来の問題点に鑑みてなされたものであって、キャッシュメモリを用いた信号処理装置において、キャッシュミスを少なくし、データ伝送時間及びデータの転送量を削減することが可能な信号処理装置を提供することを目的とする。 The present invention has been made in view of such conventional problems, and in a signal processing device using a cache memory, it is possible to reduce cache misses and reduce data transmission time and data transfer amount. An object of the present invention is to provide a possible signal processing device.

ところで、本願発明者は、キャッシュメモリを用いた従来の技術を解析したところ、デコード処理において第１テーブル１３へのアクセス頻度が８５％で、第２テーブル１４へのアクセス頻度が１５％である。これにも拘わらず、従来ではＬＲＵ方式により第２テーブルのデータをキャッシュメモリに残すアルゴリズムを採用していたため、入れ替えアルゴリズム（ＲｅｐｌａｃｅｍｅｎｔＡｌｇｏｒｉｔｈｍ）が不適切であることを発見し、本考案を案出するに至った。 By the way, the inventor of the present application has analyzed the conventional technique using the cache memory. As a result, the access frequency to the first table 13 is 85% and the access frequency to the second table 14 is 15% in the decoding process. In spite of this, since the algorithm that leaves the data of the second table in the cache memory by the LRU method has been adopted in the past, the replacement algorithm (Replacement Algorithm) is found to be inappropriate and the present invention is devised. It came to.

そこで、上記目的を達成するために、本発明に係る信号処理装置においては、第１及び第２メモリ領域のデータを用いて信号処理を行う信号処理装置であって、前記データを一時格納するためのキャッシュメモリと、前記キャッシュメモリを介して前記第１及び第２メモリ領域の少なくとも１つにアクセスすることによって前記データを読み出すアクセス手段と、前記キャッシュメモリに一時格納する空き領域が不足した場合に、前記キャッシュメモリに空き領域を確保する制御手段とを備え、前記制御手段は、前記第１メモリ領域のデータが前記第２メモリ領域のデータよりも優先して前記キャッシュメモリに保持されるように、前記空き領域を確保することを特徴とする。 Therefore, in order to achieve the above object, in the signal processing device according to the present invention, the signal processing device performs signal processing using data in the first and second memory areas, and temporarily stores the data. A cache memory, access means for reading out the data by accessing at least one of the first and second memory areas via the cache memory, and when a free area temporarily stored in the cache memory is insufficient And a control means for securing a free area in the cache memory, wherein the control means is such that the data in the first memory area is held in the cache memory in preference to the data in the second memory area. The free space is secured.

この構成により、第１メモリ領域にアクセス頻度の高いデータを格納しておけば、リプレースメントアルゴリズムが適切となる。したがって、キャッシュメモリを用いた信号処理装置において、キャッシュミスを少なくし、データ伝送時間及びデータの転送量を削減した信号処理装置を提供することが可能となる。 With this configuration, if the frequently accessed data is stored in the first memory area, the replacement algorithm becomes appropriate. Therefore, it is possible to provide a signal processing device that uses a cache memory to reduce cache misses and reduce data transmission time and data transfer amount.

また、本発明に係る信号処理装置においては、前記信号処理装置はさらに、前記第１メモリ領域が前記第２メモリ領域よりも強度が強いことを示す強度属性を格納する属性格納手段を備え、前記制御手段は、前記属性格納手段に格納された強度属性に応じて、前記キャッシュメモリに保持するデータを決定することを特徴とすることができる。 In the signal processing device according to the present invention, the signal processing device further includes attribute storage means for storing an intensity attribute indicating that the first memory area is stronger than the second memory area, The control means may determine data to be held in the cache memory according to the strength attribute stored in the attribute storage means.

この構成により、リプレースメントアルゴリズムが確実に適切となり、第２メモリ領域のデータに対して割り当てられるキャッシュメモリのデータ領域を最小限にすますことができる。 With this configuration, the replacement algorithm is surely appropriate, and the data area of the cache memory allocated to the data in the second memory area can be minimized.

また、本発明に係る信号処理装置においては、前記アクセス手段は、前記第１メモリ領域については、前記第１メモリ領域のデータを前記キャッシュメモリに移してからアクセスし、前記第２メモリ領域については、前記第２メモリ領域のデータを前記キャッシュメモリに移さないでアクセスすることを特徴としてもよい。 In the signal processing device according to the present invention, the access means accesses the first memory area after transferring the data in the first memory area to the cache memory, and accesses the second memory area. The data in the second memory area may be accessed without being transferred to the cache memory.

この構成により、リプレースメントアルゴリズムが確実に適切となり、第２メモリ領域のデータに対して割り当てられるキャッシュメモリのデータ領域をなくし、キャッシュメモリデータ領域をすべて第１メモリ領域のデータに対して割り当てることができる。 With this configuration, the replacement algorithm is surely appropriate, the cache memory data area allocated to the data in the second memory area can be eliminated, and the entire cache memory data area can be allocated to the data in the first memory area. .

また、本発明に係る信号処理装置においては、前記第１メモリ領域は、前記第２メモリ領域よりもアクセス頻度が高いデータを格納することを特徴とすることもできる。 In the signal processing device according to the present invention, the first memory area stores data having a higher access frequency than the second memory area.

この構成により、リプレースメントアルゴリズムを確実に適切とすることができる。
また、本発明に係る信号処理装置においては、前記第１及び第２メモリ領域は、それぞれ、入力データを一定規則に従って変換するための変換規則を示す第１及び第２テーブルを保持し、前記制御手段は、前記第２テーブルのデータが前記第１テーブルのデータよりも優先して前記キャッシュメモリから廃棄されるように、前記空き領域を確保することを特徴とすることができる。 This configuration ensures that the replacement algorithm is appropriate.
In the signal processing device according to the present invention, the first and second memory areas hold first and second tables indicating conversion rules for converting input data according to a predetermined rule, respectively, and the control The means may secure the free area so that the data of the second table is discarded from the cache memory in preference to the data of the first table.

この構成により、第１メモリ領域にアクセス頻度の高い第１テーブルを格納しておけば、リプレースメントアルゴリズムが適切となる。したがって、キャッシュメモリを用いた信号処理装置において、キャッシュミスを少なくし、データ伝送時間及びデータの転送量を削減しつつ、入力データを一定規則に従って変換する信号処理装置を提供することが可能となる。 With this configuration, if the first table with high access frequency is stored in the first memory area, the replacement algorithm is appropriate. Therefore, in a signal processing device using a cache memory, it is possible to provide a signal processing device that converts input data according to a certain rule while reducing cache misses and reducing data transmission time and data transfer amount. .

また、本発明に係る信号処理装置においては、前記信号処理装置はさらに、前記第１テーブルが前記第２テーブルよりも強度が強いことを示す強度属性を格納する属性格納手段を備え、前記制御手段は、前記属性格納手段に格納された強度属性に応じて、前記キャッシュメモリから廃棄するデータを決定することを特徴としてもよい。 In the signal processing device according to the present invention, the signal processing device further includes attribute storage means for storing an intensity attribute indicating that the first table is stronger than the second table, and the control means May determine the data to be discarded from the cache memory according to the strength attribute stored in the attribute storage means.

この構成により、第１メモリ領域にアクセス頻度の高い第１テーブルを格納しておけば、リプレースメントアルゴリズムが確実に適切となり、第２メモリ領域のデータに対して割り当てられるキャッシュメモリのデータ領域を最小限にすますことができる。 With this configuration, if the first table with high access frequency is stored in the first memory area, the replacement algorithm is surely appropriate, and the cache memory data area allocated to the data in the second memory area is minimized. Can be

また、本発明に係る信号処理装置においては、前記信号処理装置は、入力データであるハフマン符号をデコードする装置であり、前記ハフマン符号は、符号長の最長ビットがＮビットであり、前記第１テーブルは、ｎを１以上でＮ未満の整数としたときに、ｎビット以下の符号データをデコードするためのテーブルであり、前記第２テーブルは、前記ｎビットよりも長い符号データをデコードするためのテーブルであることを特徴とすることもできる。 Further, in the signal processing device according to the present invention, the signal processing device is a device that decodes a Huffman code that is input data, and the Huffman code has a longest code length of N bits, The table is a table for decoding code data of n bits or less when n is an integer of 1 to less than N, and the second table is for decoding code data longer than the n bits. It can also be characterized by being a table.

この構成により、第１メモリ領域にアクセス頻度の高い符号長の短いハフマン符号デコード用の第１テーブルが格納されるので、リプレースメントアルゴリズムが確実に適切となり、第２メモリ領域のデータに対して割り当てられるキャッシュメモリのデータ領域を最小限にすますことができる。 With this configuration, since the first table for decoding a short Huffman code having a high code frequency is stored in the first memory area, the replacement algorithm is surely appropriate and assigned to the data in the second memory area. The data area of the cache memory can be minimized.

なお、本発明は、このような信号処理装置として実現することができるだけでなく、このような信号処理装置が備える特徴的な手段をステップとする信号処理方法として実現したり、それらのステップをコンピュータに実行させるプログラムとして実現したりすることもできる。そして、そのようなプログラムは、ＣＤ−ＲＯＭ等の記録媒体やインターネット等の伝送媒体を介して配信することができるのはいうまでもない。 Note that the present invention can be realized not only as such a signal processing device but also as a signal processing method using steps characteristic of the signal processing device as a step. It can also be realized as a program to be executed. Needless to say, such a program can be distributed via a recording medium such as a CD-ROM or a transmission medium such as the Internet.

以上の説明から明らかなように、本発明に係る信号処理装置によれば、第１メモリ領域にアクセス頻度の高いデータを格納しておけば、リプレースメントアルゴリズムが適切となるので、キャッシュメモリを用いた信号処理装置において、キャッシュミスを少なくし、データ伝送時間及びデータの転送量を削減した信号処理装置を提供することが可能となる。 As is clear from the above description, according to the signal processing device according to the present invention, if the frequently accessed data is stored in the first memory area, the replacement algorithm becomes appropriate. In the signal processing device, it is possible to provide a signal processing device in which cache misses are reduced and data transmission time and data transfer amount are reduced.

よって、本発明により、キャッシュミスが少なく、データ伝送時間及びデータの転送量を削減した信号処理が可能となり、多機能の携帯電話機などのコンピュータ装置が普及してきた今日における本願発明の実用的価値は極めて高い。 Therefore, according to the present invention, signal processing with fewer cache misses and reduced data transmission time and data transfer amount is possible, and the practical value of the present invention in the present day when computer devices such as multifunctional mobile phones have become widespread. Extremely expensive.

以下、本発明の実施の形態に係る信号処理装置について図面を参照しながら説明する。
＜全体構成＞
（実施の形態１）
図１は、本発明の実施の形態１に係る信号処理装置１の全体構成を示すブロック図である。なお、図１では、信号処理装置１を、図２２のＭＰＥＧ−ＡＡＣ規格のハフマンコードブック（例えば、８）を用いてハフマン符号にエンコードされたビット列をデコードするデコーダとして機能させる場合が図示されている。 Hereinafter, a signal processing device according to an embodiment of the present invention will be described with reference to the drawings.
<Overall configuration>
(Embodiment 1)
FIG. 1 is a block diagram showing an overall configuration of a signal processing device 1 according to Embodiment 1 of the present invention. FIG. 1 illustrates a case where the signal processing apparatus 1 functions as a decoder that decodes a bit string encoded in a Huffman code using the MPEG-AAC standard Huffman codebook (for example, 8) in FIG. Yes.

図１に示されるように信号処理装置１は、主メモリ１０と、アクセス手段としてのプロセッサ２０と、キャッシュメモリ装置３０等とを備える。 As shown in FIG. 1, the signal processing device 1 includes a main memory 10, a processor 20 as an access means, a cache memory device 30 and the like.

主メモリ１０は、ＤＲＡＭ等により構成され、第１メモリ領域１１に１段階でデコード処理するための１段階用の第１テーブル１３を、第２の領域１２に２段階でデコード処理するための２段階用の第２テーブル１４を、それぞれ記憶する。 The main memory 10 is composed of a DRAM or the like, and the first table 13 for one stage for decoding in the first memory area 11 in one stage and 2 for decoding in the second area 12 in two stages. A second stage table 14 is stored for each.

キャッシュメモリ装置３０は、ＳＲＡＭ等により構成され、タグエリア、データエリア、データエリアに残る確率の高さである強度の属性を示すウィークフラグＷ、書き戻しの要否を表すダーティフラグＤ等を記憶するキャッシュメモリ３４と、リプレース制御を行う制御部５１と、キャッシュメモリ３４に残る確率の高さを設定するためデータ（属性データ）を格納する属性格納部５２等とを備え、いわゆるＬＲＵ方式によってアクセス順序の古いキャッシュエントリをリプレースするリプレース制御を前提とした上で、リプレース対象を決定するためのアクセス順序を示す順序データを、アクセス順序に反して改変することにより、アクセス頻度の低いデータを保持するキャッシュエントリをリプレース対象としてキャッシュメモリから追い出すよう構成されている。具体的には、アクセス順序が最古であるとみなすことを示すウィークフラグＷをキャッシュエントリに付加することによって、順序データを間接的に改変している。これにより順序データを直接改変する複雑な回路を不要としている。 The cache memory device 30 is configured by an SRAM or the like, and stores a tag area, a data area, a weak flag W indicating an attribute having a high probability of remaining in the data area, a dirty flag D indicating the necessity of writing back, and the like. A cache memory 34 that performs replacement control, a control unit 51 that performs replacement control, an attribute storage unit 52 that stores data (attribute data) for setting the probability of remaining in the cache memory 34, and the like, and is accessed by a so-called LRU method. Assuming replacement control that replaces cache entries that are out of order, data with low access frequency is retained by modifying the order data indicating the access order for determining the replacement target against the access order. Cache entry from cache memory for replacement It is configured to begin to have. Specifically, the order data is indirectly modified by adding a weak flag W indicating that the access order is considered to be the oldest to the cache entry. This eliminates the need for a complicated circuit for directly modifying the order data.

プロセッサ２０は、デコード処理を開始するにあたって、主メモリ１０の第１メモリ領域１１に第１テーブル１３を、第２メモリ領域１２に第２テーブル１４を書き込んだり、予め定められた属性値設定データに基づいて、属性格納部５２に強度属性を設定したりする。そして、プロセッサ２０は、デコード対象のハフマン符号列をアドレスとしてキャッシュメモリにアクセスすることにより、デコード結果を取得する。 When starting the decoding process, the processor 20 writes the first table 13 in the first memory area 11 of the main memory 10 and the second table 14 in the second memory area 12, or sets predetermined attribute value setting data. Based on this, an intensity attribute is set in the attribute storage unit 52. Then, the processor 20 acquires the decoding result by accessing the cache memory using the Huffman code string to be decoded as an address.

図２は第１テーブル１３の構成を示す図であり、図３は第２テーブル１４の構成を示す図である。 FIG. 2 is a diagram showing the configuration of the first table 13, and FIG. 3 is a diagram showing the configuration of the second table 14.

第１テーブル１３及び第２テーブル１４は、入力データを一定規則に従って変換するための変換規則を示すものであり、具体的には、エンコードの際に用いられたハフマンコードブックに対応し、第１メモリ領域１１と第２メモリ領域１２とにそれぞれマップされる。 The first table 13 and the second table 14 show conversion rules for converting the input data according to a certain rule. Specifically, the first table 13 and the second table 14 correspond to the Huffman codebook used for encoding, Maps to the memory area 11 and the second memory area 12, respectively.

第１テーブル１３は、ビットストリームから読み出した所定ビット数データに含まれ得る、所定数（６ビット）以下のビット数を有する比較的符号長の短いハフマン符号をデコード結果にデコードするためのテーブルである。また、第２テーブル１４は、短いハフマン符号でデコードが完了しなかった場合に第１テーブル１３により取得されるアドレスポインタと、次の取得点で示されるビット長のハフマン符号とに基づいて算出される所定数を超えるビット数を有する比較的符号長の長いハフマン符号をデコードするためのテーブルである。このように、比較的短いハフマン符号と比較的長いハフマン符号とは、第１テーブル１３及び第２テーブル１４を用いた処理によってデコードされる。 The first table 13 is a table for decoding a relatively short code length Huffman code having a bit number equal to or less than a predetermined number (6 bits), which can be included in the predetermined bit number data read from the bit stream, into a decoding result. is there. The second table 14 is calculated based on the address pointer acquired by the first table 13 when decoding is not completed with a short Huffman code and the bit length Huffman code indicated by the next acquisition point. 3 is a table for decoding a Huffman code having a relatively long code length having a number of bits exceeding a predetermined number. As described above, the relatively short Huffman code and the relatively long Huffman code are decoded by the processing using the first table 13 and the second table 14.

第１テーブル１３は、図２に示されるように、符号長又は未完了記号（ここでは、「１５」）のデータが格納される欄と、デコード結果又は次のアドレスのデータが格納される欄と、次の取得長のデータが格納される欄とから構成される。なお、この第１テーブル１３の左側には、説明の便宜のため、これらのデータが格納されるアドレスと、ラインとが示されている。 As shown in FIG. 2, the first table 13 includes a column in which data of a code length or an incomplete symbol (here, “15”) is stored, and a column in which data of a decoding result or a next address is stored. And a field for storing data of the next acquisition length. Note that, on the left side of the first table 13, for convenience of explanation, addresses and lines where these data are stored are shown.

第２テーブル１４は、図３に示されるように、符号長のデータが格納される欄と、デコード結果のデータが格納される欄とから構成される。この第２テーブル１４の左側には、説明の便宜のため、アドレスを求めるための次のインデックスデータと、これらのデータが格納されるアドレスと、ラインとが示されている。 As shown in FIG. 3, the second table 14 includes a column for storing code-length data and a column for storing decoding result data. On the left side of the second table 14, for the convenience of explanation, the next index data for obtaining an address, an address where these data are stored, and a line are shown.

なお、ハフマンエンコードにおいては、一般に出現頻度のより高い入力値に対して、より短いハフマン符号が割り当てられ、出現頻度のより低い入力値に対して、より長いハフマン符号が割り当てられる。したがって、デコード処理においては、第２テーブル１４よりも第１テーブル１３の方がアクセス頻度が高いことになる。 In Huffman encoding, a shorter Huffman code is generally assigned to an input value having a higher appearance frequency, and a longer Huffman code is assigned to an input value having a lower appearance frequency. Therefore, in the decoding process, the first table 13 has a higher access frequency than the second table 14.

図４は、属性格納部５２に格納される属性データの内容の一例を示した図である。
この場合、第１テーブル１３にあたるＬｉｎｅ０からＬｉｎｅ７は強度が大（ウィークフラグＷが「０」）で、第２テーブル１４にあたるＬｉｎｅ８からＬｉｎｅ１４は強度が小（ウィークフラグＷが「１」）であることが示されている。このような強度属性が設定されることにより、適切なリプレイスメントアルゴリズムが発揮され、キャッシュミスが減少されることになる。
＜キャッシュメモリ装置の具体的構成＞
図５は、キャッシュメモリ装置３０の具体的構成例を示すブロック図である。なお、ここではキャッシュメモリ装置３０の具体例として、４ウェイ・セット・アソシエイティブ方式のキャッシュメモリに本発明を適用した場合の構成について説明する。 FIG. 4 is a diagram illustrating an example of the contents of attribute data stored in the attribute storage unit 52.
In this case, the strength of Line 0 to Line 7 corresponding to the first table 13 is high (the weak flag W is “0”), and the strength of Line 8 to Line 14 corresponding to the second table 14 is low (the weak flag W is “1”). It is shown. By setting such a strength attribute, an appropriate replacement algorithm is exhibited and cache misses are reduced.
<Specific Configuration of Cache Memory Device>
FIG. 5 is a block diagram illustrating a specific configuration example of the cache memory device 30. Here, as a specific example of the cache memory device 30, a configuration when the present invention is applied to a 4-way set associative cache memory will be described.

図５に示されるように、キャッシュメモリ装置３０は、アドレスレジスタ３１、メモリＩ／Ｆ３２、デコーダ３３、４つのウェイ３４ａ〜３４ｄ（以下「ウェイ０〜３」とも略記する。）、４つの比較器３５ａ〜３５ｄ、４つのアンド回路３６ａ〜３６ｄ、セレクタ３８、セレクタ３９、デマルチプレクサ４１及びキャッシュ制御部４２を備える。キャッシュ制御部４２は、リプレース部４２ａと、Ｗフラグ設定部４２ｂとを含む。 As shown in FIG. 5, the cache memory device 30 includes an address register 31, a memory I / F 32, a decoder 33, four ways 34a to 34d (hereinafter also abbreviated as “way 0 to 3”), four comparators. 35a to 35d, four AND circuits 36a to 36d, a selector 38, a selector 39, a demultiplexer 41, and a cache control unit 42. The cache control unit 42 includes a replacement unit 42a and a W flag setting unit 42b.

なお、このキャッシュメモリ装置３０の具体例におけるリプレース部４２ａは図１の制御部５１に該当し、Ｗフラグ設定部４２ｂは図１の属性格納部５２に該当する。 In the specific example of the cache memory device 30, the replacement unit 42a corresponds to the control unit 51 in FIG. 1, and the W flag setting unit 42b corresponds to the attribute storage unit 52 in FIG.

アドレスレジスタ３１は、主メモリ１０へのアクセスアドレスを保持するレジスタである。このアクセスアドレスは３２ビットであるものとする。図５に示されるように、アクセスアドレスは、最上位ビットから順に、２１ビットのタグアドレス、４ビットのセットインデックス（図５中のＳＩ）、５ビットのワードインデックス（図５中のＷＩ）を含む。ここで、タグアドレスは、ウェイにマッピングされるメモリ中の領域（そのサイズはセット数×ブロックである）を指す。この領域のサイズは、タグアドレスよりも下位のアドレスビット（Ａ１０〜Ａ０）で定まるサイズつまり２ｋバイトであり、１つのウェイのサイズでもある。セットインデックス（ＳＩ）は、ウェイ０〜３に跨る複数セットの１つを指す。このセット数は、セットインデックスが４ビットなので１６セットある。タグアドレス及びセットインデックスで特定されるキャッシュエントリは、リプレース単位であり、キャッシュメモリに格納されている場合はラインデータ又はラインと呼ばれる。ラインデータのサイズは、セットインデックスよりも下位のアドレスビットで定まるサイズつまり１２８バイトである。１ワードを４バイトとすると、１ラインデータは３２ワードである。ワードインデックス（ＷＩ）は、ラインデータを構成する複数ワード中の１ワードを指す。アドレスレジスタ３１中の最下位２ビット（Ａ１、Ａ０）は、ワードアクセス時には無視される。 The address register 31 is a register that holds an access address to the main memory 10. This access address is assumed to be 32 bits. As shown in FIG. 5, the access address includes a 21-bit tag address, a 4-bit set index (SI in FIG. 5), a 5-bit word index (WI in FIG. 5) in order from the most significant bit. Including. Here, the tag address indicates an area in the memory that is mapped to the way (its size is the number of sets × block). The size of this area is a size determined by address bits (A10 to A0) lower than the tag address, that is, 2k bytes, and is also the size of one way. The set index (SI) indicates one of a plurality of sets extending over the ways 0 to 3. There are 16 sets of sets because the set index is 4 bits. The cache entry specified by the tag address and the set index is a replacement unit, and is called line data or a line when stored in the cache memory. The size of the line data is a size determined by address bits lower than the set index, that is, 128 bytes. If one word is 4 bytes, one line data is 32 words. The word index (WI) indicates one word among a plurality of words constituting the line data. The least significant 2 bits (A1, A0) in the address register 31 are ignored during word access.

メモリＩ／Ｆ３２は、主メモリ１０からキャッシュメモリ装置３０へのデータのロードや、キャッシュメモリ装置３０から主メモリ１０へのデータのライトバック等、キャッシュメモリ装置３０から主メモリ１０をアクセスするためのＩ／Ｆである。 The memory I / F 32 is used for accessing the main memory 10 from the cache memory device 30 such as loading data from the main memory 10 to the cache memory device 30 and writing back data from the cache memory device 30 to the main memory 10. I / F.

デコーダ３３は、セットインデックスの４ビットをデコードし、４つのウェイ０〜３に跨る１６セット中の１つを選択する。 The decoder 33 decodes the 4 bits of the set index and selects one of the 16 sets across the four ways 0 to 3.

４つのウェイ０〜３は、同じ構成を有する４つのウェイであり、４×２ｋバイトの容量を有する。各ウェイは、１６個のキャッシュエントリを有する。 The four ways 0 to 3 are four ways having the same configuration and have a capacity of 4 × 2 kbytes. Each way has 16 cache entries.

図６は、図５に示される１つのキャッシュエントリにおける詳細なビット構成を示す図である。 FIG. 6 is a diagram showing a detailed bit configuration in one cache entry shown in FIG.

図６に示されるように、１つのキャッシュエントリは、バリッドフラグＶ０〜Ｖ３、２１ビットのタグ、１２８バイトのラインデータ、ウィークフラグＷ、ダーティフラグＤ０〜Ｄ３を有する。 As shown in FIG. 6, one cache entry has valid flags V0 to V3, a 21-bit tag, 128-byte line data, a weak flag W, and dirty flags D0 to D3.

タグは、２１ビットのタグアドレスのコピーである。
ラインデータは、タグアドレス及びセットインデックスにより特定されるブロック中の１２８バイトデータのコピーであり、３２バイトの４つのサブラインからなる。 A tag is a copy of a 21-bit tag address.
The line data is a copy of 128-byte data in the block specified by the tag address and the set index, and consists of four 32-byte sub-lines.

バリッドフラグＶ０〜Ｖ３は、４つのサブラインに対応し、サブラインが有効か否かを示す。 Valid flags V0 to V3 correspond to four sublines and indicate whether or not the sublines are valid.

ウィークフラグＷは、当該キャッシュエントリのアクセス順序が最も古いものとみなすためのフラグである。つまり、Ｗ＝１のとき、プロセッサ２０からは当該キャッシュエントリにこれ以上読み出しも書き込みもなされないこと、あるいはアクセス頻度が低いことを意味する。また、Ｗ＝１のとき、キャッシュメモリ装置３０においては、リプレース制御に関してアクセス順序が最古であると取り扱うこと、つまり、最弱（ウィーク）のキャシュエントリーであることを意味する。Ｗ＝０のとき、そうでないことを意味する。 The weak flag W is a flag for regarding that the access order of the cache entry is the oldest. That is, when W = 1, it means that the processor 20 does not read or write any more to the cache entry, or the access frequency is low. Further, when W = 1, the cache memory device 30 means that the access order is the oldest regarding the replacement control, that is, the weakest (week) cache entry. When W = 0, it means not.

ダーティフラグＤ０〜Ｄ３は、４つのサブラインに対応し、そのサブラインにプロセッサ２０から書き込みがあったか否かを示す。つまり、サブライン中にキャッシュされたデータが存在するが、書き込みにより主メモリ１０中のデータと異なるため、主メモリ１０に書き戻すことが必要か否かを示す。 Dirty flags D0 to D3 correspond to four sublines and indicate whether or not the processor 20 has written to the sublines. That is, the cached data exists in the sub-line, but it is different from the data in the main memory 10 by writing, and therefore it is shown whether it is necessary to write back to the main memory 10.

比較器３５ａは、アドレスレジスタ３１中のタグアドレスと、セットインデックスにより選択されたセットに含まれる４つのタグ中のウェイ０のタグとが一致するか否かを比較する。比較器３５ｂ〜３５ｃについても、ウェイ３４ｂ〜３４ｄに対応すること以外は同様である。 The comparator 35a compares whether or not the tag address in the address register 31 matches the tag of the way 0 among the four tags included in the set selected by the set index. The comparators 35b to 35c are the same except that they correspond to the ways 34b to 34d.

アンド回路３６ａは、バリッドフラグと比較器３５ａの比較結果とが一致するか否かを比較する。比較結果が「１」である場合は、アドレスレジスタ３１中のタグアドレス及びセットインデックスに対応するラインデータが存在すること、つまりウェイ０においてヒットしたことを意味する。比較結果が「０」である場合は、キャッシュミスしたことを意味する。アンド回路３６ｂ〜３６ｄについても、ウェイ３４ｂ〜３４ｄに対応すること以外は同様である。その比較結果は、ウェイ１〜３でヒットしたかキャッシュミスしたかを、それぞれ意味する。 The AND circuit 36a compares whether or not the valid flag matches the comparison result of the comparator 35a. When the comparison result is “1”, it means that there is line data corresponding to the tag address and the set index in the address register 31, that is, a hit in way 0. If the comparison result is “0”, it means that a cache miss has occurred. The AND circuits 36b to 36d are the same except that they correspond to the ways 34b to 34d. The comparison result indicates whether the way 1 to 3 hit or a cache miss.

セレクタ３８は、選択されたセットにおけるウェイ０〜３のラインデータのうち、ヒットしたウェイのラインデータを選択する。 The selector 38 selects the line data of the hit way among the line data of the ways 0 to 3 in the selected set.

セレクタ３９は、セレクタ３８により選択された３２ワードのラインデータのうち、ワードインデックスに示される１ワードを選択する。 The selector 39 selects one word indicated by the word index from the 32-word line data selected by the selector 38.

デマルチプレクサ４１は、キャッシュエントリにデータを書き込む際に、ウェイ０〜３の１つに書き込みデータを出力する。この書き込みデータはワード単位でよい。 The demultiplexer 41 outputs write data to one of the ways 0 to 3 when writing data to the cache entry. This write data may be in units of words.

キャッシュ制御部４２は、キャッシュメモリ装置３０の全体の制御を行う。特に、Ｗフラグの設定とＷフラグに従うリプレース制御を行う。 The cache control unit 42 performs overall control of the cache memory device 30. In particular, setting of the W flag and replacement control according to the W flag are performed.

図７は、キャッシュ制御部４２の構成を示すブロック図である。
キャッシュ制御部４２は、上記したように図１の制御部５１に該当するリプレース部４２ａと、図１の属性格納部５２に該当するＷフラグ設定部４２ｂとを含む。 FIG. 7 is a block diagram illustrating a configuration of the cache control unit 42.
As described above, the cache control unit 42 includes the replacement unit 42a corresponding to the control unit 51 of FIG. 1 and the W flag setting unit 42b corresponding to the attribute storage unit 52 of FIG.

リプレース部４２ａは、キャッシュミスによるリプレースに際して、Ｗ＝１が設定されているキャッシュエントリが存在すれば、当該キャッシュエントリのアクセス順序が最も古いとみなしてリプレース対象として選択し、リプレースを行う。 In the case of replacement due to a cache miss, if there is a cache entry for which W = 1 is set, the replacement unit 42a considers that the access order of the cache entry is the oldest, selects it as a replacement target, and performs replacement.

Ｗフラグ設定部４２ｂは、プロセッサ２０からのコマンドに応じてウィークフラグＷを設定する。プロセッサ２０は、もはや書き込みも読み出しもしないキャッシュエントリについてウィークフラグＷの設定を指示するコマンドをキャッシュメモリ装置３０に対して発行する。 The W flag setting unit 42 b sets a weak flag W in accordance with a command from the processor 20. The processor 20 issues a command to the cache memory device 30 to instruct the setting of the weak flag W for a cache entry that is no longer being written or read.

図８は、Ｗフラグ設定部４２ｂの構成例を示すブロック図である。
図８に示されるように、Ｗフラグ設定部４２ｂは、コマンドレジスタ４２１、スタートアドレスレジスタ４２２、サイズレジスタ４２３、スタートアライナ４２４、加算器４２５、エンドアライナ４２６、フラグ書換部４２７を備える。 FIG. 8 is a block diagram illustrating a configuration example of the W flag setting unit 42b.
As shown in FIG. 8, the W flag setting unit 42b includes a command register 421, a start address register 422, a size register 423, a start aligner 424, an adder 425, an end aligner 426, and a flag rewriting unit 427.

コマンドレジスタ４２１は、プロセッサ２０から直接アクセス可能なレジスタであり、プロセッサ２０により書き込まれたＷフラグ設定コマンドを保持する。 The command register 421 is a register that is directly accessible from the processor 20 and holds the W flag setting command written by the processor 20.

図９（ａ）に、コマンドレジスタ４２１にコマンドを書き込む命令の一例を示す。
この命令は、通常の転送命令（ｍｏｖ命令）であり、ソースオペランドとしてコマンドを、デスティネーションオペランドとしてコマンドレジスタ（ＣＲ）を指定している。 FIG. 9A shows an example of an instruction for writing a command to the command register 421.
This instruction is a normal transfer instruction (mov instruction), and designates a command as a source operand and a command register (CR) as a destination operand.

図９（ｂ）に、コマンドの一例を示す。このコマンドは、Ｗフラグ設定コマンドを示す特定のコードである。Ｗフラグ設定コマンドは、スタートアドレスレジスタ４２２に保持されたスタートアドレスからサイズレジスタ４２３に保持されたサイズのアドレス範囲に対応するデータを保持するキャッシュエントリに対して、Ｗフラグを設定することを指示するコマンドである。 FIG. 9B shows an example of the command. This command is a specific code indicating a W flag setting command. The W flag setting command instructs the cache entry that sets data corresponding to the address range of the size held in the size register 423 from the start address held in the start address register 422 to set the W flag. It is a command.

スタートアドレスレジスタ４２２は、プロセッサ２０から直接アクセス可能なレジスタであり、プロセッサ２０により書き込まれたスタートアドレスを保持する。このスタートアドレスはＷフラグを設定すべきアドレス範囲の開始位置を示す。 The start address register 422 is a register that is directly accessible from the processor 20 and holds the start address written by the processor 20. This start address indicates the start position of the address range in which the W flag is to be set.

図９（ｃ）に、スタートアドレスレジスタ４２２にスタートアドレスを書き込む命令の一例を示す。この命令も、図９（ａ）と同様に通常の転送命令（ｍｏｖ命令）である。 FIG. 9C shows an example of an instruction for writing a start address in the start address register 422. This instruction is also a normal transfer instruction (mov instruction) as in FIG.

サイズレジスタ４２３は、プロセッサ２０から直接アクセス可能なレジスタであり、プロセッサ２０により書き込まれたサイズを保持する。このサイズは、スタートアドレスからのアドレス範囲を示す。 The size register 423 is a register that is directly accessible from the processor 20 and holds the size written by the processor 20. This size indicates an address range from the start address.

図９（ｄ）に、サイズレジスタ４２３にサイズを書き込む命令の一例を示す。この命令も、図９（ａ）と同様に通常の転送命令（ｍｏｖ命令）である。なお、サイズの単位は、バイト数であっても、ライン数（キャッシュエントリ数）であってもよく、予め定められた単位であればよい。 FIG. 9D shows an example of an instruction for writing a size in the size register 423. This instruction is also a normal transfer instruction (mov instruction) as in FIG. The unit of size may be the number of bytes or the number of lines (number of cache entries), and may be a predetermined unit.

スタートアライナ４２４は、スタートアドレスをライン境界の位置に調整する。この調整によりプロセッサ２０はラインサイズ及びライン境界とは無関係に任意のアドレスをスタートアドレスとして指定することができる。 The start aligner 424 adjusts the start address to the position of the line boundary. By this adjustment, the processor 20 can designate an arbitrary address as the start address regardless of the line size and the line boundary.

加算器４２５は、スタートアドレスレジスタ４２２に保持されたスタートアドレスとサイズレジスタ４２３に保持されたサイズとを加算する。加算結果は、アドレス範囲の終了位置を指すエンドアドレスである。加算器４２５は、サイズがバイト数指定の場合はバイトアドレスとして加算し、サイズがライン数指定の場合はラインアドレスとして加算すればよい。 The adder 425 adds the start address held in the start address register 422 and the size held in the size register 423. The addition result is an end address indicating the end position of the address range. The adder 425 may add as a byte address when the size designates the number of bytes, and add as a line address when the size designates the number of lines.

エンドアライナ４２６は、エンドアドレスをライン境界の位置に調整する。この調整によりプロセッサ２０はラインサイズ及びライン境界とは無関係に任意の大きさを上記サイズとして指定することができる。 The end aligner 426 adjusts the end address to the position of the line boundary. By this adjustment, the processor 20 can designate an arbitrary size as the size regardless of the line size and the line boundary.

図１０に、スタートアライナ４２４及びエンドアライナ４２６の説明図を示す。
図１０において、プロセッサ２０から指定されたスタートアドレスはラインＮの途中の任意の位置を指す。スタートアライナ４２４は、次のライン（Ｎ＋１）の先頭を指すよう調整し、調整後のアドレスをアラインスタートアドレスとして出力する。アラインスタートアドレスが指すラインをスタートラインと呼ぶ。 FIG. 10 is an explanatory diagram of the start aligner 424 and the end aligner 426.
In FIG. 10, the start address designated by the processor 20 indicates an arbitrary position in the middle of the line N. The start aligner 424 adjusts to indicate the head of the next line (N + 1), and outputs the adjusted address as the align start address. The line indicated by the align start address is called the start line.

また、エンドアドレスはラインＭの途中の任意の位置を指す。エンドアライナ４２６は、直前のライン（Ｍ−１）の先頭を指すよう調整し、調整後のアドレスをアラインエンドアドレスとして出力する。アラインエンドアドレスが指すラインをエンドラインと呼ぶ。 The end address indicates an arbitrary position in the middle of the line M. The end aligner 426 adjusts to indicate the head of the immediately preceding line (M-1), and outputs the adjusted address as the align end address. The line indicated by the align end address is called an end line.

この場合、スタートライン（ライン（Ｎ＋１））からエンドライン（ライン（Ｍ−１））までの各ライン（キャッシュエントリ）にＷフラグが設定されることになる。このように、スタートアライナ４２４及びエンドアライナ４２６がプロセッサ２０から指定されたスタートアドレスからエンドアドレスまでのアドレス範囲よりも内側にアラインしているのは、ラインＮとラインＭの外側の部分にはプロセッサ２０から読み出し又は書き込みが発生する可能性があるからである。 In this case, the W flag is set in each line (cache entry) from the start line (line (N + 1)) to the end line (line (M-1)). As described above, the start aligner 424 and the end aligner 426 are aligned on the inner side of the address range from the start address to the end address specified by the processor 20. This is because there is a possibility of reading or writing from 20.

フラグ書換部４２７は、アラインスタートアドレスが指すラインからアラインエンドアドレスが指すラインまで（図１０の例ではライン（Ｎ＋１）からライン（Ｍ−１）まで）、キャッシュメモリ装置３０にエントリされていればＷフラグを１に設定する。
＜Ｗフラグ設定処理＞
図１１は、フラグ書換部４２７におけるＷフラグ設定処理の一例を示すフローチャートである。 If the flag rewriting unit 427 is entered in the cache memory device 30 from the line indicated by the align start address to the line indicated by the align end address (from the line (N + 1) to the line (M−1) in the example of FIG. 10). Set the W flag to 1.
<W flag setting process>
FIG. 11 is a flowchart illustrating an example of the W flag setting process in the flag rewriting unit 427.

フラグ書換部４２７は、コマンドレジスタ４２１にＷフラグ設定コマンドが保持されている場合、スタートラインからエンドラインまでの各ラインアドレスを順に出力しながらループ１の処理（Ｓ８２〜Ｓ８６）を行う。フラグ書換部４２７は、各ラインについて同じ処理を行うので、ここでは１ライン分の処理について説明する。 When the W flag setting command is held in the command register 421, the flag rewriting unit 427 performs loop 1 processing (S82 to S86) while sequentially outputting each line address from the start line to the end line. Since the flag rewriting unit 427 performs the same process for each line, the process for one line will be described here.

すなわち、フラグ書換部４２７は、キャッシュメモリ装置３０がプロセッサ２０からアクセスされていない間に、ラインアドレスをアドレスレジスタ３１に出力し（Ｓ８３）、アドレスレジスタ３１のタグアドレスとキャッシュエントリのタグとを比較器３５ａ〜３５ｄに比較させ、ヒットするかどうかを判定する（Ｓ８４）。さらにフラグ書換部４２７は、ヒットした場合には、ヒットしたキャッシュエントリに対してＷフラグを１にセットし（Ｓ８５）、キャッシュミスした場合には、キャッシュメモリにエントリされていないので、なにもしない。 That is, the flag rewriting unit 427 outputs the line address to the address register 31 while the cache memory device 30 is not accessed from the processor 20 (S83), and compares the tag address of the address register 31 with the tag of the cache entry. The devices 35a to 35d are compared to determine whether or not they hit (S84). Further, the flag rewriting unit 427 sets the W flag to 1 for the hit cache entry when it hits (S85), and if there is a cache miss, it is not entered in the cache memory. do not do.

これにより、スタートラインからエンドラインまでの各ラインについて、キャッシュメモリ装置３０にエントリされている場合には、Ｗフラグが「１」に設定される。
＜リプレース処理＞
図１２は、リプレース部４２ａにおけるリプレース処理を示すフローチャートである。 Thus, if each line from the start line to the end line is entered in the cache memory device 30, the W flag is set to “1”.
<Replace processing>
FIG. 12 is a flowchart showing the replacement process in the replacement unit 42a.

同図１２においてリプレース部４２ａは、メモリアクセスがキャッシュミスしたとき（Ｓ９１）、セットインデックスにより選択されたセットにおける、４つウェイのウィークフラグＷを読み出し（Ｓ９２）、４つのウィークフラグの論理和が１であるか否か、つまりＷ＝１のウェイが存在するか否かを判定する（Ｓ９３）。Ｗ＝１のウェイが存在すると判定された場合、当該キャッシュエントリのアクセス順序が最も古いとみなしてＷ＝１のウェイを１つ選択し（Ｓ９４）、Ｗ＝１のウェイが存在しないと判定された場合、通常のＬＲＵ方式によりウェイを１つ選択する（Ｓ９５）。このとき、ウィークフラグＷが１になっているウェイが複数存在する場合は、リプレース部４２ａはランダムに１つを選択する。 In FIG. 12, when the memory access causes a cache miss (S91), the replacement unit 42a reads the 4-way weak flag W in the set selected by the set index (S92), and the logical sum of the four weak flags is It is determined whether or not 1, that is, whether there is a way of W = 1 (S93). If it is determined that there is a way with W = 1, the access order of the cache entry is regarded as the oldest, and one way with W = 1 is selected (S94), and it is determined that there is no way with W = 1. If one of the ways is selected, one way is selected by the normal LRU method (S95). At this time, if there are a plurality of ways having the weak flag W set to 1, the replacement unit 42a randomly selects one.

さらに、リプレース部４２ａは、当該セットにおける選択されたウェイのキャッシュエントリを対象にリプレースし（Ｓ９６）、リプレース後に当該キャッシュエントリのウィークフラグＷを０初期化する（Ｓ９７）。なお、このときバリッドフラグＶ、ダーティフラグＤは、それぞれ１、０に初期化される。 Further, the replacement unit 42a replaces the cache entry of the selected way in the set (S96), and initializes the weak flag W of the cache entry after replacement (S97). At this time, the valid flag V and the dirty flag D are initialized to 1 and 0, respectively.

このように、Ｗ＝１のウェイが存在しない場合、リプレース対象は、通常のＬＲＵ方式により選択される。また、Ｗ＝１のウェイが存在する場合、リプレース対象は、Ｗ＝１のウェイのアクセス順序が最も古いものと取り扱われる結果、Ｗ＝１のウェイのキャッシュエントリが選択される。これにより、アクセス頻度の低いＷ＝１のデータがキャッシュメモリに存在するために生じるキャッシュミスを低減することができる。 As described above, when there is no way of W = 1, the replacement target is selected by the normal LRU method. When there is a way of W = 1, the replacement target is treated as the oldest access order of the way of W = 1, and as a result, the cache entry of the way of W = 1 is selected. As a result, it is possible to reduce cache misses caused by the fact that W = 1 data with low access frequency exists in the cache memory.

以上説明してきたように、本実施の形態１におけるキャッシュメモリによれば、ウィークフラグＷ＝１のラインは、プロセッサからこれ以上読み出しも書き込みもなされないラインであり、そのアクセス順序が最古として取り扱われる結果、真っ先にリプレース対象として選択される。これによりアクセス頻度の低いデータによるキャッシュミスの誘発を低減することができる。 As described above, according to the cache memory of the first embodiment, the line with the weak flag W = 1 is a line on which no further reading or writing is performed from the processor, and the access order is treated as the oldest. As a result, it is selected as the replacement target first. As a result, it is possible to reduce the occurrence of a cache miss due to data with low access frequency.

また、従来のＬＲＵ方式におけるアクセス順序を示す順序データを直接改変することなく、間接的にＷフラグを付加することによってアクセス順序を改変しているので、複雑なハードウェア回路を追加することなく、実現することができる。 In addition, since the access order is modified by adding the W flag indirectly without directly modifying the order data indicating the access order in the conventional LRU method, without adding a complicated hardware circuit, Can be realized.

次いで、信号処理装置１の各部が実行する動作を説明する。
図１３から図１９は、キャッシュメモリ３４に格納されるデータの流れを示した図である。 Next, operations performed by each unit of the signal processing device 1 will be described.
13 to 19 are diagrams showing the flow of data stored in the cache memory 34.

本実施の形態１では、図２２のＭＰＥＧ−ＡＡＣ規格のハフマンコードブック８を用いてエンコードされた符号をデコードする信号処理装置１を例にとり説明する。 In the first embodiment, a signal processing apparatus 1 that decodes a code encoded using the Huffman codebook 8 of the MPEG-AAC standard in FIG. 22 will be described as an example.

なお、従来の信号処理装置との動作の差異を明らかにするために、キャッシュメモリ３４の１つのウェイの４セットだけを使用するものとし、また、入力のビット列が、従来と同様「０１１１０１１００００１１１１１１０１１０１０００１１１１ｘｘｘｘ」と続いているものとして説明する。 In order to clarify the difference in operation with the conventional signal processing apparatus, only four sets of one way of the cache memory 34 are used. It will be described as continuing.

図１３に示されるように、プロセッサ２０は、まず、当該ビット列の最初の６ビット「０１１１０１」をアドレスとして、第１テーブル１３を引く。この場合、当該アドレス「０１１１０１」がＬｉｎｅ３にあたるので（図２参照）、当該Ｌｉｎｅ３のデータが主メモリ１０からキャッシュメモリ３４に移された後、アドレス「０１１１０１」のデータをアクセスし、符号長５ビットと、デコード結果「０」とを取得する。 As shown in FIG. 13, the processor 20 first draws the first table 13 using the first 6 bits “011101” of the bit string as an address. In this case, since the address “011101” corresponds to Line 3 (see FIG. 2), after the data in Line 3 is transferred from the main memory 10 to the cache memory 34, the data at the address “011101” is accessed and the code length is 5 bits. Then, the decoding result “0” is acquired.

より詳しくは、プロセッサ２０は、キャッシュメモリ装置３０に対して、ビット列の最初の６ビット「０１１１０１」をアドレスとして、リード信号を出力する。「０１１１０１」をアドレスとするリード信号を受信したキャッシュメモリ装置３０は、キャッシュメモリ３４にＬｉｎｅ３のデータが存在しないので、Ｌｉｎｅ３の先頭から最終までをアドレス指定することによりＬｉｎｅ３のデータをキャッシュメモリ３４の１つのセットに書き込む。そして、タグにはＬｉｎｅ３が、Ｗフラグには「０」が、ダーティフラグＤには「０」が書き込まれる。これらの書き込みが終わると、キャッシュメモリ装置３０は、アドレス「０１１１０１」のデータ、符号長５ビットと、デコード結果「０」とをプロセッサ２０に出力する。 More specifically, the processor 20 outputs a read signal to the cache memory device 30 using the first 6 bits “011101” of the bit string as an address. The cache memory device 30 that has received the read signal whose address is “011101” does not have Line 3 data in the cache memory 34. Write to one set. Line 3 is written in the tag, “0” is written in the W flag, and “0” is written in the dirty flag D. When these writings are completed, the cache memory device 30 outputs the data of the address “011101”, the code length of 5 bits, and the decoding result “0” to the processor 20.

そして、符号長５ビットと、デコード結果「０」とを取得すると、プロセッサ２０は、後の信号処理（逆量子化処理、ＩＭＤＣＴ（ＩｎｖｅｒｓｅＭｏｄｉｆｉｅｄＤｉｓｃｒｅｔｅＣｏｓｉｎｅＴｒａｎｓｆｏｒｍ）処理等）を行うためにデコード結果「０」をバッファ等に出力した後、上記符号長が５ビットであったので、上記ビット列を５ビット進める。つまり、取得点Ｐを５ビット移動させる。そうすると、次のビット列は「１１００００１１１１１１０１１０１０００１１１１ｘｘｘｘ」となる。 When the code length of 5 bits and the decoding result “0” are acquired, the processor 20 performs the subsequent signal processing (such as inverse quantization processing, IMDCT (Inverse Modified Discrete Cosine Transform) processing), and the like. After the “0” is output to the buffer or the like, since the code length is 5 bits, the bit string is advanced by 5 bits. That is, the acquisition point P is moved by 5 bits. Then, the next bit string becomes “110000111111011010001111xxxx”.

次のビット列が「１１００００１１１１１１０１１０１０００１１１１ｘｘｘｘ」である場合、図１４に示されるように、先ほどと同様に、プロセッサ２０は、当該ビット列の最初の６ビット「１１００００」をアドレスとして、第１テーブル１３を引く。この場合、当該アドレス「１１００００」がＬｉｎｅ６にあたるので（図２参照）、当該Ｌｉｎｅ６のデータが主メモリ１０からキャッシュメモリ３４に移された後（図１４）、当該アドレス「１１００００」のデータをアクセスし、符号長６ビット、デコード結果「３」を取得する。 When the next bit string is “110000111111011010001111xxxx”, as shown in FIG. 14, the processor 20 draws the first table 13 using the first 6 bits “110000” of the bit string as an address, as described above. In this case, since the address “110000” corresponds to Line 6 (see FIG. 2), after the data of Line 6 is transferred from the main memory 10 to the cache memory 34 (FIG. 14), the data at the address “110000” is accessed. The code length of 6 bits and the decoding result “3” are acquired.

なお、タグにはＬｉｎｅ６が、Ｗフラグには「０」が、ダーティフラグＤには「０」が書き込まれる。 Note that Line 6 is written in the tag, “0” is written in the W flag, and “0” is written in the dirty flag D.

そして、プロセッサ２０は、デコード結果「３」を出力した後、上記符号長が６ビットであったので、上記ビット列を６ビット進める。つまり、取得点Ｐを６ビット移動させる。上記符号長が６ビットであったので、上記ビット列を６ビット進めると、次のビット列は「１１１１１１０１１０１０００１１１１ｘｘｘｘ」となる。 Then, after outputting the decoding result “3”, the processor 20 advances the bit string by 6 bits because the code length is 6 bits. That is, the acquisition point P is moved by 6 bits. Since the code length is 6 bits, when the bit string is advanced by 6 bits, the next bit string is “111111011010001111xxxx”.

次のビット列が「１１１１１１０１１０１０００１１１１ｘｘｘｘ」である場合、図１５に示されるように、プロセッサ２０は、先ほどと同様に、当該ビット列の最初の６ビット「１１１１１１」をアドレスとして、第１テーブル１３を引く。この場合、当該アドレス「１１１１１１」がＬｉｎｅ７にあたるので（図２参照）、当該Ｌｉｎｅ７のデータが主メモリ１０からキャッシュメモリ３４に移された後、当該アドレス「１１１１１１」のデータにアクセスし、未完了記号「１５」、次のアドレスポインタ「１１０００１０」、次に読み込むビット数「４」を取得する。 When the next bit string is “111111011010001111xxxx”, as shown in FIG. 15, the processor 20 draws the first table 13 using the first 6 bits “111111” of the bit string as an address, as described above. In this case, since the address “111111” corresponds to the Line 7 (see FIG. 2), after the data of the Line 7 is transferred from the main memory 10 to the cache memory 34, the data at the address “111111” is accessed, and the incomplete symbol “15”, the next address pointer “1100010”, and the number of bits to be read “4” are acquired.

なお、タグにはＬｉｎｅ７が、Ｗフラグには「０」が、ダーティフラグＤには「０」が書き込まれる。ここで、キャッシュメモリ装置３０のデータの取得動作については、上記の場合と同様であるので、その説明を以下省略する。 Note that Line 7 is written in the tag, “0” is written in the W flag, and “0” is written in the dirty flag D. Here, since the data acquisition operation of the cache memory device 30 is the same as that described above, the description thereof is omitted below.

そこで、図３に示した第２テーブル１４のアドレス「１１０００１０」にポインタを進め、上記ビット列を６ビット進めた後に、続く４ビットの値をアドレス「１１０００１０」からのインデックスとし、第２テーブル１４を引く。この場合、当該４ビットの値は、「０１１０」であるので、アドレス「１１０００１０」に「０１１０」を足して、次にアクセスするアドレスは「１１０１０００」となる。 Therefore, the pointer is advanced to the address “1100010” of the second table 14 shown in FIG. 3, the bit string is advanced by 6 bits, and the subsequent 4-bit value is used as an index from the address “1100010”. Pull. In this case, since the 4-bit value is “0110”, “0110” is added to the address “1100010”, and the next access address is “1101000”.

つまり、プロセッサ２０は、図１６に示されるように、次の４ビット「０１１０」を取得し、先に取得したアドレス「１１０００１０」に次の４ビット「０１１０」を足して、次にアクセスするアドレスを「１１０１０００」を計算する。 That is, as shown in FIG. 16, the processor 20 acquires the next 4 bits “0110”, adds the next 4 bits “0110” to the previously acquired address “1100010”, and then accesses the next address. "1101000" is calculated.

計算が終わると、プロセッサ２０は、計算により得られた「１１０１０００」をアドレスとして、第２テーブル１４を引く。このアドレス「１１０１０００」がＬｉｎｅ１３にあたるので（図３参照）、当該Ｌｉｎｅ１３のデータが主メモリ１０からキャッシュメモリ３４に移された後、プロセッサ２０は、当該データをアクセスし、符号長３ビット、デコード結果「６１」を取得する。 When the calculation is completed, the processor 20 draws the second table 14 using “1101000” obtained by the calculation as an address. Since this address “1101000” corresponds to the Line 13 (see FIG. 3), after the data of the Line 13 is transferred from the main memory 10 to the cache memory 34, the processor 20 accesses the data, the code length is 3 bits, and the decoding result “61” is acquired.

なお、タグにはＬｉｎｅ１３が、Ｗフラグには「１」が、ダーティフラグＤには「０」が書き込まれる。 Note that Line 13 is written in the tag, “1” is written in the W flag, and “0” is written in the dirty flag D.

そして、プロセッサ２０は、デコード結果「６１」を出力した後、上記符号長が３ビットであったので、上記ビット列を３ビット進める。つまり、取得点Ｐを６＋３＝９ビット移動させる。取得点Ｐを移動させると、次のビット列は「０１０００１１１１ｘｘｘｘ」となる。 Then, after outputting the decoding result “61”, the processor 20 advances the bit string by 3 bits because the code length is 3 bits. That is, the acquisition point P is moved 6 + 3 = 9 bits. When the acquisition point P is moved, the next bit string becomes “010001111xxxx”.

次のビット列が「０１０００１１１１ｘｘｘｘ」である場合、図１７に示されるように、プロセッサ２０は、先ほどと同様に、当該ビット列の最初の６ビット「０１０００１」をアドレスとして、第１テーブル１３を引く。この場合、当該アドレス「０１０００１」がＬｉｎｅ２にあたるので（図２参照）、当該Ｌｉｎｅ２のデータを主メモリ１０からキャッシュメモリ３４に移した後、当該データにアクセスする必要がある。 When the next bit string is “010001111xxxx”, as shown in FIG. 17, the processor 20 draws the first table 13 using the first 6 bits “010001” of the bit string as an address, as described above. In this case, since the address “010001” corresponds to Line 2 (see FIG. 2), it is necessary to access the data after moving the data of Line 2 from the main memory 10 to the cache memory 34.

ところで、今、キャッシュメモリ３４に空き領域はないので、キャッシュメモリ３４に格納されているいずれかのラインのデータを主メモリ１０に書き戻し、それによって生じた空き領域に当該Ｌｉｎｅ２のデータを移す必要がある。 Now, since there is no free area in the cache memory 34, it is necessary to write back the data of any line stored in the cache memory 34 to the main memory 10 and move the data of the Line 2 to the free area generated thereby. There is.

さてここで、制御部５１は、属性格納部５２で指定されている主メモリ１０の第１メモリ領域１１及び第２メモリ領域１２の強度情報に基づいて、キャッシュメモリ３４に格納されているいずれかのラインのデータを主メモリ１０に書き戻すかを決定する。この場合、第１テーブル１３にあたるＬｉｎｅ０からＬｉｎｅ７は強度が大で、第２テーブル１４にあたるＬｉｎｅ８からＬｉｎｅ１４は強度が小であることが示されている（図４参照）。このため、第２テーブル１４のデータで、現在、キャッシュメモリ３４上に格納されているデータを主メモリ１０に書き戻す。つまり現在Ｌｉｎｅ１３のデータがキャッシュメモリ３４に格納されているので、このデータが主メモリ１０に書き戻され（図１７）、それによって生じた空き領域にＬｉｎｅ２のデータが移される（図１８）。 Here, the control unit 51 is one of those stored in the cache memory 34 based on the strength information of the first memory area 11 and the second memory area 12 of the main memory 10 specified by the attribute storage unit 52. It is determined whether to write back the data of this line back to the main memory 10. In this case, it is indicated that Line 0 to Line 7 corresponding to the first table 13 have a high strength, and Line 8 to Line 14 corresponding to the second table 14 have a low strength (see FIG. 4). Therefore, the data currently stored in the cache memory 34 as the data of the second table 14 is written back to the main memory 10. That is, since the data of Line 13 is currently stored in the cache memory 34, this data is written back to the main memory 10 (FIG. 17), and the data of Line 2 is moved to the free space generated thereby (FIG. 18).

その後、プロセッサ２０は、当該Ｌｉｎｅ２のデータをアクセスし、符号長４ビット、デコード結果１０を取得する。上記符号長が４ビットであったので上記ビット列を４ビット進めると、ビット列は「０１１１１ｘｘｘｘ」となる。 After that, the processor 20 accesses the data of the Line 2 and acquires the decoding result 10 with a code length of 4 bits. Since the code length is 4 bits, if the bit string is advanced by 4 bits, the bit string becomes “01111xxxx”.

この過程において、上記説明では、Ｌｉｎｅ１３のデータがキャッシュメモリ３４から主メモリ１０に書き戻されるとしたが、本一連の動作の過程では、Ｌｉｎｅ１３のデータは、キャッシュメモリ３４に格納されている間に値が変化していないので、つまりキャッシュメモリ３４上での値と主メモリ１０上での値とが同一であるので、従来の多くのキャッシュメモリの制御方法では、書き戻すという動作はなく、Ｌｉｎｅ２のデータをＬｉｎｅ１３のデータ上に上書きする、ということになる。 In this process, in the above description, it is assumed that the data of Line 13 is written back from the cache memory 34 to the main memory 10. However, in the process of this series of operations, the data of Line 13 is stored in the cache memory 34. Since the value has not changed, that is, the value on the cache memory 34 and the value on the main memory 10 are the same, many conventional cache memory control methods do not have an operation of rewriting, and Line 2 Is overwritten on the data of Line13.

すなわち、キャッシュメモリ３４に空きがない場合、従来では、ＬＲＵ方式により、Ｌｉｎｅ３のデータをリプレース対象に決定し、そのライン上にＬｉｎｅ２のデータを上書きしている。これに対して、本実施の形態１の制御部５１は、キャッシュメモリ３４のウィークフラグＷに「１」がセットされているラインをサーチし、Ｌｉｎｅ１３のデータをリプレース対象に決定する。そして、制御部５１は、Ｌｉｎｅ１３のダーティフラグＤの値が「０」のままであるのを確認し、そのライン上にＬｉｎｅ２のデータを上書きするようにしている。 That is, when there is no free space in the cache memory 34, conventionally, the data of Line 3 is determined as a replacement target by the LRU method, and the data of Line 2 is overwritten on the line. On the other hand, the control unit 51 of the first embodiment searches for a line in which the weak flag W of the cache memory 34 is set to “1”, and determines the data of Line 13 as a replacement target. Then, the control unit 51 confirms that the value of the dirty flag D of Line 13 remains “0”, and overwrites the data of Line 2 on the line.

これにより、アクセス頻度の高いデータがキャッシュメモリ３４に残るので、ヒットの確率が高くなり、キャッシュミスが生じる確率が低くなることになる。 As a result, since frequently accessed data remains in the cache memory 34, the probability of hits increases and the probability of occurrence of cache misses decreases.

次のビット列が「０１１１１ｘｘｘｘ」である場合、プロセッサ２０は、先ほどと同様に、当該ビット列の最初の６ビットをアドレスとして、第１テーブル１３を引く。この場合、当該アドレス「０１１１１ｘ」は、Ｌｉｎｅ３にあたる（図２参照）。今、当該Ｌｉｎｅ３のデータは、キャッシュメモリ３４に格納されているので、ヒットし、プロセッサ２０は、直ちに当該アドレス「０１１１１ｘ」のデータをアクセスし、直ちに符号長５ビット、デコード結果１６を取得する。 When the next bit string is “01111xxxx”, the processor 20 draws the first table 13 using the first 6 bits of the bit string as an address, as described above. In this case, the address “01111x” corresponds to Line 3 (see FIG. 2). Now, since the data of Line 3 is stored in the cache memory 34, it hits and the processor 20 immediately accesses the data of the address “01111x”, and immediately acquires the decoding result 16 of 5 bits in code length.

この過程においては、先に説明した従来の技術では、キャッシュメモリ３４に空き領域はない場合、通常、一番過去にアクセスされたＬｉｎｅが書き戻されるので、Ｌｉｎｅ３のデータが複数回、主メモリ１０とキャッシュメモリ３４の間で移動を繰り返したが、上記制御部５１（リプレース部４２ａ）と上記属性格納部５２（Ｗフラグ設定部４２ｂ）との働きによって、アクセス頻度の高いデータが優先的にキャッシュメモリ３４のデータエリア上に残るように制御され、結果として、主メモリ１０とキャッシュメモリ３４の間でのデータの移動回数を削減できる。 In this process, in the conventional technique described above, when there is no free space in the cache memory 34, the most recently accessed Line is usually written back, so the data in Line 3 is stored in the main memory 10 a plurality of times. And the cache memory 34 are repeatedly moved, but the frequently accessed data is preferentially cached by the functions of the control unit 51 (replacement unit 42a) and the attribute storage unit 52 (W flag setting unit 42b). As a result, the number of times of data movement between the main memory 10 and the cache memory 34 can be reduced.

以上のように本実施の形態１によれば、主メモリ１０の第１のメモリ領域１１と、第２のメモリ領域１２と、当該メモリ領域１１，１２のデータを一時格納するキャッシュメモリ３４と、キャッシュメモリ３４を介して上記第１、第２のメモリ領域１１，１２にアクセスするためのプロセッサ２０と、上記キャッシュメモリ３４に空き領域が不足した場合、空き領域を確保する制御部５１とを用いて、上記制御部５１において、上記第１のメモリ領域１１のデータ（第１テーブル１３）を優先的にキャッシュメモリ３４に保持するようにしながら空き領域を確保するようにすることによって、主メモリ１０とキャッシュメモリ３４との間の転送回数を削減できることとなる。 As described above, according to the first embodiment, the first memory area 11, the second memory area 12 of the main memory 10, the cache memory 34 for temporarily storing data in the memory areas 11 and 12, A processor 20 for accessing the first and second memory areas 11 and 12 via the cache memory 34 and a control unit 51 for securing a free area when the cache memory 34 runs out of free area are used. Thus, the control unit 51 secures a free area while preferentially holding the data (first table 13) of the first memory area 11 in the cache memory 34, so that the main memory 10 And the number of transfers between the cache memory 34 and the cache memory 34 can be reduced.

なお、本実施の形態におけるハフマンデコード処理では、第１のメモリ領域１１に１回アクセスすることでデコードが完了するか、第１のメモリ領域１１に１回アクセスした後、第２のメモリ領域１２に１回アクセスすることでデコードが完了するか、の２種類の過程でデコードが完了するような処理について説明したが、必ずしもそのようなものである必要はない。 In the Huffman decoding process according to the present embodiment, the decoding is completed by accessing the first memory area 11 once, or the second memory area 12 is accessed after the first memory area 11 is accessed once. Although the process has been described in which the decoding is completed by two accesses, i.e., decoding is completed by accessing once in the above, it is not always necessary.

例えば、特開２０００−２８６７１７号公報（実施の形態１）で示されるように、１回メモリにアクセスすることでデコードが完了するか、そうでない場合は、さらに１回以上の複数回メモリにアクセスすることでデコードを完了させるという処理であってもよい。その場合、たとえば、１回目にアクセスされる可能性があるメモリ領域を本実施の形態でいうところの第１のメモリ領域１１に割り当て、２回目以降にアクセスされる可能性のあるメモリ領域を本実施の形態でいうところの第２のメモリ領域１２に割り当てればよい。 For example, as shown in Japanese Patent Laid-Open No. 2000-286717 (Embodiment 1), decoding is completed by accessing the memory once, or if not, the memory is accessed one or more times more than once. In this case, the decoding may be completed. In this case, for example, a memory area that may be accessed for the first time is assigned to the first memory area 11 in the present embodiment, and a memory area that may be accessed for the second time or later is assigned to the main memory area. What is necessary is just to allocate to the 2nd memory area 12 as used in the embodiment.

あるいは、比較的小さな整数をｍとしたとき、ｍ回目までにアクセスされる可能性があるメモリ領域を本実施の形態でいうところの第１のメモリ領域１１に割り当て、ｍ回目より後にアクセスされる可能性のあるメモリ領域を本実施の形態でいうところの第２のメモリ領域１２に割り当てるという方法もある。 Alternatively, when m is a relatively small integer, a memory area that may be accessed by the mth time is assigned to the first memory area 11 in the present embodiment, and accessed after the mth time. There is also a method of allocating a possible memory area to the second memory area 12 in this embodiment.

この場合についても、第１のメモリ領域１１のデータ（第１テーブル５０００）を優先的にキャッシュメモリ３４に保持するようにしながら空き領域を確保するようにすることによって、主メモリ１０とキャッシュメモリ３４との間の転送回数を従来より大幅に削減できることとなる。 Also in this case, the main memory 10 and the cache memory 34 are secured by securing a free area while preferentially holding the data (first table 5000) in the first memory area 11 in the cache memory 34. Thus, the number of transfers to and from can be greatly reduced as compared with the prior art.

（実施の形態２）
次いで、本発明の実施の形態２に係る信号処理装置２について説明する。 (Embodiment 2)
Next, the signal processing device 2 according to Embodiment 2 of the present invention will be described.

ところで、実施の形態１の信号処理装置１では、制御部５１を設けて、アクセス頻度の低いデータがキャッシュメモリに残らないようにしている。つまり、信号処理装置１においては、第２テーブル１４のデータをキャッシュメモリ３４に一旦格納し、空きがない場合には第２テーブル１４のデータをリプレースの対象にしている。 By the way, in the signal processing apparatus 1 according to the first embodiment, the control unit 51 is provided so that data with low access frequency does not remain in the cache memory. That is, in the signal processing device 1, the data in the second table 14 is temporarily stored in the cache memory 34, and when there is no free space, the data in the second table 14 is targeted for replacement.

しかしながらそうすると、その分第１テーブル１３のデータがキャッシュメモリ３４に存在する確率が低くなる。すなわち、第１テーブル１３のデータに対する割り当て領域が減ることになる。 However, if it does so, the probability that the data of the 1st table 13 will exist in the cache memory 34 will become low. That is, the allocation area for the data of the first table 13 is reduced.

そこで、そもそもアクセス頻度の低いデータがキャッシュメモリに格納されないように制御するようにしている。 Therefore, control is performed so that data with low access frequency is not stored in the cache memory.

その場合は、図２０に示すような構成になる。
すなわち、図２０に示されるように、信号処理装置２は、主メモリ１０と、プロセッサ２０と、キャッシュメモリを介して主メモリ１０にアクセスする機能と、キャッシュメモリを介さずに主メモリ１０にアクセスする機能とを備えるキャッシュメモリ装置６０とを備え、主メモリ１０の第１メモリ領域１１に格納され、キャッシュメモリを介してアクセスされる第１テーブル１３と、主メモリ１０の第２メモリ領域１２に格納され、キャッシュメモリを介さずにアクセスされる第２テーブル１４とで構成される。 In that case, the configuration is as shown in FIG.
That is, as shown in FIG. 20, the signal processing device 2 accesses the main memory 10 via the main memory 10, the processor 20, the function of accessing the main memory 10 via the cache memory, and the cache memory. A cache memory device 60 having a function to store the first table 13 stored in the first memory area 11 of the main memory 10 and accessed via the cache memory, and the second memory area 12 of the main memory 10. The second table 14 is stored and accessed without going through the cache memory.

キャッシュメモリ装置６０は、キャッシュメモリ６１と、制御部６２と、アクセス制御部６３と、属性格納部６４等とからなる。 The cache memory device 60 includes a cache memory 61, a control unit 62, an access control unit 63, an attribute storage unit 64, and the like.

キャッシュメモリ６１は、キャッシュメモリ３４からウィークフラグＷを削除した構成である。 The cache memory 61 has a configuration in which the weak flag W is deleted from the cache memory 34.

制御部６２は、通常のＬＲＵ方式でリプレース対象を決定する。
属性格納部６４は、プロセッサ２０から出力された属性を設定するコマンドに基づいて強度属性のデータを格納する。 The control unit 62 determines a replacement target by a normal LRU method.
The attribute storage unit 64 stores strength attribute data based on a command for setting an attribute output from the processor 20.

図２１は、属性格納部６４に格納される属性データの内容の一例を示した図である。
この場合、第１テーブル１３にあたるＬｉｎｅ０からＬｉｎｅ７は強度が大（キャッシュ）で、第２テーブル１４にあたるＬｉｎｅ８からＬｉｎｅ１４は強度が小（ノンキャッシュ）であることが示されている。このような強度属性が設定されることにより、適切なリプレイスメントアルゴリズムが発揮され、キャッシュミスが減少されることになる。 FIG. 21 is a diagram illustrating an example of the contents of attribute data stored in the attribute storage unit 64.
In this case, it is indicated that Line 0 to Line 7 corresponding to the first table 13 have a high strength (cache), and Line 8 to Line 14 corresponding to the second table 14 have a low strength (non-cache). By setting such a strength attribute, an appropriate replacement algorithm is exhibited and cache misses are reduced.

アクセス制御部６３は、属性格納部６４に格納されたデータに基づいて、キャッシュメモリ６１を介してアクセスするか、第２テーブル１４にアクセスするかを決定する。具体的には、プロセッサ２０がアクセスするアドレスがＬｉｎｅ０〜Ｌｉｎｅ７の範囲である場合には、アクセス制御部６３は、キャッシュメモリ６１をイネーブルにし、キャッシュメモリ６１を介するＬｉｎｅ０〜Ｌｉｎｅ７のデータの読み出しを仲介する。プロセッサ２０がアクセスするアドレスがＬｉｎｅ８〜Ｌｉｎｅ１４の範囲である場合には、アクセス制御部６３は、キャッシュメモリ６１をディスエーブルにし、主メモリ１０の第２メモリ領域１２に格納された第２テーブル１４の該当アドレスのデータに直接アクセスし、読み出したデータをプロセッサ２０に出力する。 The access control unit 63 determines whether to access via the cache memory 61 or to access the second table 14 based on the data stored in the attribute storage unit 64. Specifically, when the address accessed by the processor 20 is in the range of Line 0 to Line 7, the access control unit 63 enables the cache memory 61 and mediates reading of the data of Line 0 to Line 7 via the cache memory 61. To do. When the address accessed by the processor 20 is in the range of Line 8 to Line 14, the access control unit 63 disables the cache memory 61 and stores the second table 14 stored in the second memory area 12 of the main memory 10. The data at the corresponding address is directly accessed, and the read data is output to the processor 20.

このような構成により、プロセッサ２０は、第１テーブル１３にアクセスするときは上記キャッシュメモリ６１を介してアクセスし、第２テーブル１４にアクセスするときはキャッシュメモリ６１を介さないでアクセスすることができ、アクセス頻度の低い第２テーブル１４のデータがキャッシュメモリ６１に格納されないようにすることができる。 With such a configuration, the processor 20 can access the first table 13 through the cache memory 61 and can access the second table 14 without going through the cache memory 61. The data of the second table 14 with low access frequency can be prevented from being stored in the cache memory 61.

これにより、アクセス頻度の高い第１テーブル１３のデータが一度キャッシュメモリに格納されるとそのまま保持される可能が高まり、結果として、主メモリ１０とキャッシュメモリ６１との間の転送回数を削減できることとなる。
＜変形例＞
なお、本発明のキャッシュメモリは、上記の実施の形態１の構成に限るものではなく、種々の変形が可能である。以下、変形例のいくつかについて説明する。 As a result, once the frequently accessed data in the first table 13 is stored in the cache memory, it is more likely to be held as it is, and as a result, the number of transfers between the main memory 10 and the cache memory 61 can be reduced. Become.
<Modification>
The cache memory of the present invention is not limited to the configuration of the first embodiment, and various modifications can be made. Hereinafter, some modified examples will be described.

（１）上記実施の形態１では、ウィークフラグＷによりアクセス順序が最古であることを示しているが、アクセス順序が最新又は最古でないことを示すものとしてもよい。この場合、リプレース部４２ａは、Ｗ＝１のキャッシュエントリは最古でないとみなして、リプレース対象として選択せず、他のキャッシュエントリを選択する構成とすればよい。最古でないことを示すウィークフラグＷを、アクセス頻度の高いデータあるいはアクセス頻度が中くらいのデータを保持するキャッシュエントリに付加することにより、無駄リプレースを防止することができる。 (1) In the first embodiment, the weak flag W indicates that the access order is the oldest, but the access order may indicate that the access order is the latest or not the oldest. In this case, the replacement unit 42a may consider that the cache entry with W = 1 is not the oldest and select another cache entry without selecting it as a replacement target. By adding a weak flag W indicating that it is not the oldest to a cache entry that holds data having a high access frequency or data having a medium access frequency, it is possible to prevent unnecessary replacement.

（２）プロセッサ２０が、ウィークフラグＷ＝１の設定とデータの書き込みとを命令する特別なストア命令を実行し、キャッシュ制御部４２は、さらに、特別なストア命令を検出する命令検出部と、当該ストア命令による書き込みの際にＷ＝１に設定するフラグ設定部とを備える構成としてもよい。 (2) The processor 20 executes a special store instruction that instructs the setting of the weak flag W = 1 and the writing of data, and the cache control unit 42 further includes an instruction detection unit that detects the special store instruction; It is good also as a structure provided with the flag setting part which sets W = 1 at the time of the write by the said store instruction.

（３）上記実施の形態１では、４ウェイ・セット・アソシエイティブのキャッシュメモリを例に説明したが、ウェイ数は、いくつでもよい。また、上記実施の形態１では、セット数が１６である例を説明したが、セット数はいくつでもよい。 (3) In the first embodiment, a 4-way set associative cache memory has been described as an example. However, the number of ways may be any number. In the first embodiment, an example in which the number of sets is 16 has been described. However, any number of sets may be used.

（４）上記実施の形態１では、セット・アソシエイティブのキャッシュメモリを例に説明したが、フル・アソシエイティブ方式のキャッシュメモリであってもよい。 (4) In the first embodiment, a set-associative cache memory has been described as an example. However, a full-associative cache memory may be used.

（５）上記実施の形態１では、サブラインのサイズをラインのサイズの１／４としているが、１／２、１／８、１／１６等他のサイズでもよい。その場合、各キャッシュエントリは、サブラインと同数のバリッドフラグ及びダーティフラグをそれぞれ保持すればよい。 (5) In the first embodiment, the size of the sub-line is set to 1/4 of the size of the line, but other sizes such as 1/2, 1/8, and 1/16 may be used. In that case, each cache entry may hold the same number of valid flags and dirty flags as sublines.

（６）また、上記実施の形態では、音楽データのエンコードデータについて例えばＭＰＥＧ−ＡＡＣ規格のハフマンコードブック（例えば、８）に対応するテーブルを用いてデコードする場合について説明したが、他のコードブックに対応したテーブルを用いてデコードする場合に適用できる。また、ＭＰ３等のエンコードデータをデコードする場合にも適用できる。さらに、圧縮符号化された映像データ等をデコードするような種々の信号処理に適用できるのは、いうまでもない。 (6) In the above embodiment, the case where the encoded data of music data is decoded using a table corresponding to, for example, the Huffman codebook (for example, 8) of the MPEG-AAC standard has been described. This can be applied when decoding using a table corresponding to. Also, the present invention can be applied when decoding encoded data such as MP3. Furthermore, it goes without saying that the present invention can be applied to various signal processes such as decoding compressed and encoded video data.

本発明に係る信号処理装置は、キャッシュミスを低減し、キャッシュメモリと主メモリとの間のデータ転送時間・量を削減する効果を有し、プロセッサと記憶装置によって構成される機器、例えば、パーソナルコンピュータ等において有用である。また、デジタル化されたオーディオ信号やビデオ信号をプロセッサと記憶装置によって処理する機器、例えば、デジタルテレビ、ＤＶＤ機器、携帯電話機、携帯情報端末、ヘッドホンステレオ等の用途に応用できる。 The signal processing device according to the present invention has the effect of reducing cache misses and reducing the time and amount of data transfer between the cache memory and the main memory, and is composed of a device composed of a processor and a storage device, for example, a personal computer Useful in computers and the like. Further, the present invention can be applied to devices such as digital televisions, DVD devices, mobile phones, portable information terminals, and headphone stereos that process digitized audio signals and video signals by a processor and a storage device.

本発明の実施の形態１に係る信号処理装置１の構成を示す図である。It is a figure which shows the structure of the signal processing apparatus 1 which concerns on Embodiment 1 of this invention. 第１テーブル１３の内容の一例を示した図である。5 is a diagram showing an example of the contents of a first table 13. FIG. 第２テーブル１４の内容の一例を示した図である。It is the figure which showed an example of the content of the 2nd table. 属性格納部５２が格納するデータ内容の一例を示した図である。It is the figure which showed an example of the data content which the attribute storage part 52 stores. 図１に示されるキャッシュメモリ装置３０の構成例を示すブロック図である。FIG. 2 is a block diagram illustrating a configuration example of a cache memory device 30 illustrated in FIG. 1. キャッシュエントリの詳細なビット構成を示す。The detailed bit structure of a cache entry is shown. キャッシュ制御部４２の構成を示すブロック図である。3 is a block diagram showing a configuration of a cache control unit 42. FIG. Ｗフラグ設定部の構成例を示すブロック図である。It is a block diagram which shows the structural example of a W flag setting part. （ａ）コマンドレジスタにコマンドを書き込む命令の一例を示す図である。（ｂ）コマンドの一例を示す図である。（ｃ）スタートアドレスレジスタにスタートアドレスを書き込む命令の一例を示す図である。（ｄ）サイズレジスタにサイズを書き込む命令の一例を示す図である。(A) It is a figure which shows an example of the command which writes a command in a command register. (B) It is a figure which shows an example of a command. (C) It is a figure which shows an example of the command which writes a start address in a start address register. (D) It is a figure which shows an example of the command which writes size in a size register. スタートアライナ及びエンドアライナの説明図を示す図である。It is a figure which shows explanatory drawing of a start aligner and an end aligner. フラグ書換部におけるＷフラグ設定処理の示すフローチャートである。It is a flowchart which shows the W flag setting process in a flag rewriting part. リプレース部におけるリプレース処理を示すフローチャートである。It is a flowchart which shows the replacement process in a replacement part. キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory. キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory. キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory. キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory. キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory. キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory. キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory. 本発明の実施の形態２に係る信号処理装置２の構成を示すブロック図である。It is a block diagram which shows the structure of the signal processing apparatus 2 which concerns on Embodiment 2 of this invention. 属性格納部６４が格納する属性データの内容の一例を示した図である。It is the figure which showed an example of the content of the attribute data which the attribute storage part 64 stores. ＭＰＥＧ−ＡＡＣ規格のハフマンコードブック８を示す図である。It is a figure which shows the Huffman codebook 8 of an MPEG-AAC standard. （ａ）ハフマンデコード処理を高速に行うための第１テーブル５０００を示す図である。（ｂ）ハフマンデコード処理を高速に行うための第２テーブル６０００を示す図である。(A) It is a figure which shows the 1st table 5000 for performing a Huffman decoding process at high speed. (B) It is a figure which shows the 2nd table 6000 for performing a Huffman decoding process at high speed. キャッシュメモリと主メモリとプロセッサとの関係を示した図である。It is the figure which showed the relationship between a cache memory, main memory, and a processor. 従来のキャッシュメモリを用いてハフマンデコードを行う場合の構成図である。It is a block diagram in the case of performing Huffman decoding using a conventional cache memory. 従来の技術によって、キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory by the prior art. 従来の技術によって、キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory by the prior art. 従来の技術によって、キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory by the prior art. 従来の技術によって、キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory by the prior art. 従来の技術によって、キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory by the prior art. 従来の技術によって、キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory by the prior art. 従来の技術によって、キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory by the prior art. 従来の技術によって、キャッシュメモリに格納されるデータの流れを示した図である。It is the figure which showed the flow of the data stored in a cache memory by the prior art.

Explanation of symbols

１０主メモリ
１１第１メモリ領域
１２第２メモリ領域
１３第１テーブル
１４第２テーブル
２０プロセッサ
３０，６０キャッシュメモリ装置
３４，６１キャッシュメモリ
５１，６２制御部
５２，６４属性格納部 10 main memory 11 first memory area 12 second memory area 13 first table 14 second table 20 processor 30, 60 cache memory device 34, 61 cache memory 51, 62 control unit 52, 64 attribute storage unit

Claims

A signal processing device that performs signal processing using data in first and second memory areas,
A cache memory for temporarily storing the data;
Access means for reading the data by accessing at least one of the first and second memory areas via the cache memory;
Control means for securing free space in the cache memory when there is a shortage of free space to be temporarily stored in the cache memory;
The signal processing apparatus, wherein the control means reserves the free area so that the data in the first memory area is held in the cache memory in preference to the data in the second memory area.

The signal processing apparatus further includes attribute storage means for storing an intensity attribute indicating that the first memory area is stronger than the second memory area,
The signal processing apparatus according to claim 1, wherein the control unit determines data to be held in the cache memory according to an intensity attribute stored in the attribute storage unit.

The access means accesses the first memory area after transferring the data in the first memory area to the cache memory, and accesses the data in the second memory area for the second memory area. The signal processing apparatus according to claim 1, wherein the signal processing apparatus is accessed without moving to the process.

The signal processing apparatus according to claim 1, wherein the first memory area stores data having a higher access frequency than the second memory area.

The first and second memory areas respectively hold first and second tables indicating conversion rules for converting input data according to a certain rule,
2. The signal according to claim 1, wherein the control unit reserves the free space so that the data in the second table is discarded from the cache memory in preference to the data in the first table. Processing equipment.

The signal processing device further includes attribute storage means for storing an intensity attribute indicating that the first table is stronger than the second table,
The signal processing apparatus according to claim 5, wherein the control unit determines data to be discarded from the cache memory according to an intensity attribute stored in the attribute storage unit.

The signal processing device is a device for decoding a Huffman code that is input data,
In the Huffman code, the longest bit of the code length is N bits,
The first table is a table for decoding code data of n bits or less when n is an integer greater than or equal to 1 and less than N;
The signal processing apparatus according to claim 5, wherein the second table is a table for decoding code data longer than the n bits.

A cache comprising first and second memory areas, a cache memory for temporarily storing the data, and access means for reading the data by accessing the first and second memory areas via the cache memory A control method for a memory device, comprising:
The free space is secured so that the data in the first memory area is held in the cache memory in preference to the data in the second memory area when there is a shortage of free space to be temporarily stored in the cache memory. A cache memory control method comprising the steps of:

A program for a signal processing device that performs signal processing using data in first and second memory areas,
A program for causing a computer to execute the steps included in the control method according to claim 8.