CN115291838A - Multi-user electronic mailbox mail indexing method based on KV system - Google Patents

Info

Publication number
CN115291838A
Authority
CN
China
Prior art keywords
mail
information
data
key
folder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210872986.6A
Other languages
Chinese (zh)
Other versions
CN115291838B (en)
Inventor
佟路林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING EYOU INFORMATION TECHNOLOGY CO LTD
Original Assignee
BEIJING EYOU INFORMATION TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING EYOU INFORMATION TECHNOLOGY CO LTD
Priority to CN202210872986.6A
Publication of CN115291838A
Application granted
Publication of CN115291838B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/107 Computer-aided management of electronic mailing [e-mailing]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Information Transfer Between Computers (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a multi-user email mailbox indexing method based on a KV system, comprising the following steps: constructing a Server program architecture; constructing a Key-List mailbox structure based on the KV system according to the Server program architecture; constructing memory-based Map-structured thread lock buckets; and storing the multiple fields of mail information in a ProtoBuf structure. Adding a new mail information field is flexible and efficient, the mail information field structure is backward and forward compatible, and both the summary information of a user's mailbox folders and the mail information are indexed.

Description

A multi-user email mailbox indexing method based on a KV system

Technical Field

The present invention relates to the field of email servers, and in particular to a multi-user email mailbox indexing method based on a KV system.

Background

One important function of an email server is to store the mailbox information and original mail text of multiple users. Users access the email server through MUA clients (such as webmail, POP, or IMAP) to read the summary information of their mailboxes and the content of their emails. As the number of users and emails grows sharply, the email server becomes slower and slower when displaying mailbox summary information (folder information, email counts) and reading the mail list. Under high concurrency and heavy load, the server may even crash or become unable to serve requests.

For these reasons, a user's mailbox summary information and mail list information need to be cached or indexed so that both can be displayed quickly when the user retrieves mail through an MUA client. The speed at which users traverse the mail list also needs to improve, and when the number of emails in a mailbox changes (for example, when an email is received, deleted, or moved from one mail folder to another), the mailbox summary information and the mail list must be updated quickly.

Conventional email servers use the mbox or maildir mailbox format, which store original mail text as a single file or as a directory structure of files, respectively, with no cache or index of mailbox summary information. To display mailbox summary information, the mbox format requires tallying every email within a single file, and the maildir format requires counting the number and size of the files in the entire directory. Changing or moving emails is also complicated, so efficiency is generally low.

Most improved email servers still store original mail text in maildir format or in cloud storage, but add an indexing device that stores mailbox summary information and mail list information to speed up reads.

One such indexing device stores users' mail information in a binary structured index file. Each fixed-size block in the file records the information for one email (sender, recipients, subject, size, file ID, and so on). Summary information is computed by counting the blocks in the file and traversing them, and the mail list is displayed by reading the blocks directly, which improves speed. Because the block layout is fixed, however, adding a mail information field requires changing the entire structured file, which is a very large amount of work whether done online or offline.

Another indexing device stores users' mail information in MySQL database tables. A mail folder information table and a mail information table hold all users' folder information (folder name, email count, and so on) and mail information (sender, recipients, subject, size, mail folder ID, file ID, and so on) as table columns, and reads are performed with SQL queries. When emails are added or deleted, the two tables must be updated together, so database transactions are needed to keep the update atomic, and transactions perform relatively poorly. As the volume of user mail data grows, database query performance also becomes a bottleneck. Adding a mail information field requires locking the table, and with very large data volumes the table stays locked for a long time while the new column is added.

In summary, the indexing devices of the prior art are inefficient.

Summary of the Invention

In view of the above problems, the present invention is proposed to provide a multi-user email mailbox indexing method based on a KV system that overcomes, or at least partially solves, those problems.

According to one aspect of the present invention, a multi-user email mailbox indexing method based on a KV system is provided, comprising:

constructing a Server program architecture;

constructing a Key-List mailbox structure based on the KV system according to the Server program architecture;

constructing memory-based Map-structured thread lock buckets;

storing the multiple fields of mail information in a ProtoBuf structure.

Optionally, constructing the Server program architecture specifically comprises:

the Server program is a multi-threaded service program comprising a network IO thread pool, which provides RESTful interface access over HTTP, and a worker thread pool, which performs data query and update services against the KV storage system;

the network IO thread pool and the worker thread pool interact through a message queue.

Optionally, constructing the Key-List mailbox structure based on the KV system according to the Server program architecture specifically comprises:

the worker thread pool threads use multiple Key-Value records in the KV storage system to construct a Key-List structure of mail folder information and a mail information list, in which one Key-Value record holds the mail folder information and multiple Key-Value records hold mail information, each representing one email and storing its related information;

the content format of the Key in each Key-Value record is planned: the Key comprises a TAG field indicating the Key type, a mail folder ID field, and a mail ID field, and its prefix contains the same user ID;

the Value comprises multiple fields: the Value of the mail folder information comprises the folder update timestamp, the number of emails in the folder, the total Size of the emails in the folder, and the maximum and minimum mail IDs in the folder; the Value of the mail information comprises the email's timestamp, the email's Size, and the ProtoBuf-serialized summary information of the email.

Optionally, constructing the memory-based Map-structured thread lock buckets specifically comprises:

when a worker thread updates a user's mail folder information and mail information, a user-level thread mutex is used to keep the operation atomic;

memory-based Map-structured thread lock buckets are used;

KV data is updated through the KV system's batch operation interface.

Optionally, storing the multiple fields of mail information in a ProtoBuf structure specifically comprises:

the ProtoBuf structure holds the multiple fields of the mail information; when it is stored as a Key-Value record in the KV system, the whole structure is serialized into a string, and when it is read, the string is deserialized, the entire process being transparent to the user;

when a new mail information field is needed, only one field has to be added to the ProtoBuf structure; when old data is deserialized, the new field automatically takes its default value, and on update the new field is automatically serialized into the string, so the process does not affect the rest of the data, data conversion is flexible and efficient, and the data structure is backward and forward compatible.

The multi-user email mailbox indexing method based on a KV system provided by the present invention comprises: constructing a Server program architecture; constructing a Key-List mailbox structure based on the KV system according to the Server program architecture; constructing memory-based Map-structured thread lock buckets; and storing the multiple fields of mail information in a ProtoBuf structure. Adding a new mail information field is flexible and efficient, the mail information field structure is backward and forward compatible, and both the summary information of a user's mailbox folders and the mail information are indexed.

The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to this description, and so that the above and other objects, features, and advantages of the invention are easier to understand, specific embodiments of the invention are set out below.

Brief Description of the Drawings

To explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below are only some embodiments of the invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

Figure 1 is a schematic diagram of the Server program architecture provided by an embodiment of the present invention;

Figure 2 is a schematic diagram of the Key-List mailbox structure based on the KV system provided by an embodiment of the present invention;

Figure 3 is a schematic diagram of the memory-based Map-structured thread lock buckets provided by an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.

The terms "comprising" and "having", and any variations of them, in the description, embodiments, claims, and drawings of the present invention are intended to cover non-exclusive inclusion, for example, the inclusion of a series of steps or units.

The technical solutions of the present invention are described in further detail below with reference to the drawings and embodiments.

As shown in Figure 1, the Server program architecture is constructed as follows. The Server program is a multi-threaded service program made up mainly of two thread pools: a network IO thread pool, which provides RESTful interface access over HTTP, and a worker thread pool, which performs data query and update services against the KV storage system. The network IO threads and the worker thread pool interact through a message queue.

A KV storage engine is integrated into the Server program, and index data such as the mailbox summary information and mail information of multiple users is stored in it. The KV system is a very fast NoSQL storage system that can hold massive amounts of data and provide fast access to it.
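The two-pool arrangement described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the HTTP layer is omitted, a plain dictionary stands in for the embedded KV engine, and the names `kv_store`, `requests`, `worker`, and `submit` are hypothetical and do not come from the patent.

```python
import queue
import threading
from concurrent.futures import Future

# Hypothetical in-memory stand-in for the embedded KV storage engine.
kv_store = {}
kv_lock = threading.Lock()

# Message queue connecting network IO threads to worker threads.
requests = queue.Queue()


def worker():
    """Worker thread: take a request off the queue and run it against the KV store."""
    while True:
        item = requests.get()
        if item is None:  # sentinel value: shut this worker down
            break
        op, key, value, reply = item
        with kv_lock:
            if op == "put":
                kv_store[key] = value
                reply.set_result(True)
            else:  # any other op is treated as "get"
                reply.set_result(kv_store.get(key))


def submit(op, key, value=None):
    """Called from a network IO thread: enqueue a request and wait for the reply."""
    reply = Future()
    requests.put((op, key, value, reply))
    return reply.result(timeout=5)


# Start a small worker pool (the pool size of 4 is an arbitrary choice).
workers = [threading.Thread(target=worker) for _ in range(4)]
for t in workers:
    t.start()

submit("put", "u1:folder:inbox", {"count": 0})
result = submit("get", "u1:folder:inbox")

# Stop the pool: one sentinel per worker, then wait for all of them.
for _ in workers:
    requests.put(None)
for t in workers:
    t.join()
```

In a real server the caller of `submit` would be an HTTP request handler running on a network IO thread; the queue decouples slow storage work from connection handling.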

As shown in Figure 2, building the Key-List mailbox structure based on the KV system comprises the following.

The worker thread pool threads use multiple Key-Value records in the KV storage system to construct a Key-List structure of mail folder information and a mail information list. One KV record holds the mail folder information, and the remaining KV records hold mail information, each representing one email and storing its related information. To associate the folder information with the mail information, the content format of the Keys must be planned. A Key consists of multiple fields. Its prefix contains the user ID, so that all data of the same user shares the same prefix. The Key also contains a TAG field indicating the Key type (mail folder information or mail information), a mail folder ID field, and a mail ID field. A Value also consists of multiple fields. The Value of the mail folder information consists mainly of the folder update timestamp, the number of emails in the folder, the total Size of the emails in the folder, and the maximum and minimum mail IDs in the folder. The Value of the mail information consists mainly of the email's timestamp, the email's Size, and the ProtoBuf-serialized summary information of the email.

When an email is added or deleted, one mail Key-Value record is added or deleted, and the Count, Bytes, MaxID, and MinID fields in the Value of the mail folder Key-Value record are updated at the same time.

Reading mail folder information requires reading only the mail folder Key-Value record. This is a very fast read in a NoSQL system and returns the folder information directly.

Traversing the emails in a mail folder requires only reading the Key-Value records in the NoSQL system sequentially, ordered by UID prefix and MailID. With the KV traversal cursor of the NoSQL system, Key-Value records can be read quickly even in a massive data set.
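The key layout and ordered traversal described above can be illustrated with a small sketch. The patent does not specify separators, TAG values, or field widths, so the `:` separator, the `F`/`M` tags, the zero-padding widths, and the `OrderedKV` class below are all assumptions made for illustration; a sorted list stands in for the ordered key space of a real KV engine.

```python
import bisect

TAG_FOLDER = "F"  # hypothetical TAG value for mail folder information
TAG_MAIL = "M"    # hypothetical TAG value for mail information


def folder_key(uid, folder_id):
    """Key of the single mail folder record: UID prefix, TAG, folder ID."""
    return f"{uid}:{TAG_FOLDER}:{folder_id:08d}"


def mail_key(uid, folder_id, mail_id):
    """Key of one mail record. Zero-padding makes lexicographic key order
    match numeric MailID order."""
    return f"{uid}:{TAG_MAIL}:{folder_id:08d}:{mail_id:012d}"


class OrderedKV:
    """Toy ordered KV store: a sorted key list plus a dict of values."""

    def __init__(self):
        self._keys = []
        self._data = {}

    def put(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def scan(self, prefix):
        """Yield (key, value) pairs whose key starts with prefix, in key order."""
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i], self._data[self._keys[i]]
            i += 1


kv = OrderedKV()
kv.put(folder_key("u1", 1), {"count": 3})
for mid in (30, 4, 17):  # inserted out of order on purpose
    kv.put(mail_key("u1", 1, mid), {"size": mid * 10})
kv.put(mail_key("u2", 1, 5), {"size": 50})  # another user's mail

# Traverse folder 1 of user u1 in MailID order.
prefix = f"u1:{TAG_MAIL}:{1:08d}:"
mail_ids = [int(k.rsplit(":", 1)[1]) for k, _ in kv.scan(prefix)]
folder_info = kv.get(folder_key("u1", 1))
```

Reading the folder summary is a single point lookup, and listing the folder is a seek to the prefix followed by sequential reads, which is the access pattern the description relies on.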

The Key-List structure is built from multiple Key-Value records in the KV system: the Key is a single record holding the user's mail folder information, and the List is multiple records, each holding the information for one of the user's emails. Together they form the user's mailbox index structure. Reading mail folder information requires reading only one Key-Value record, and the NoSQL KV storage system can locate and read that record very quickly in a massive data set. Adding or deleting an email requires updating only two records at the same time, the mail folder information and the added or deleted mail information. No transaction is needed. Instead, a user-level in-memory Mutex thread lock is held while both records are written in a single call to the KV storage system's batch operation (BatchPut) interface, which is fast and efficient.
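A minimal sketch of the add-mail path follows, assuming a toy store whose `batch_put` method plays the role of the BatchPut interface named above. The class `BatchKV`, the function `add_mail`, the key strings, and the dictionary field names are hypothetical; only the idea of updating the folder record and the mail record together under a per-user lock comes from the description.

```python
import threading


class BatchKV:
    """Toy KV store; batch_put applies all records in one step."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def batch_put(self, records):
        with self._lock:
            self._data.update(records)


def add_mail(kv, user_lock, uid, folder_id, mail_id, size, timestamp, summary):
    """Add one mail record and update the folder summary in a single batch."""
    fkey = f"{uid}:F:{folder_id}"
    mkey = f"{uid}:M:{folder_id}:{mail_id}"
    with user_lock:  # user-level mutex guards the read-modify-write
        old = kv.get(fkey)
        if old is None:
            folder = {"ts": timestamp, "count": 1, "bytes": size,
                      "max_id": mail_id, "min_id": mail_id}
        else:
            folder = {
                "ts": timestamp,
                "count": old["count"] + 1,
                "bytes": old["bytes"] + size,
                "max_id": max(old["max_id"], mail_id),
                "min_id": min(old["min_id"], mail_id),
            }
        mail = {"ts": timestamp, "size": size, "summary": summary}
        # Both records are written together, so readers never observe a mail
        # record without the matching folder summary (or the reverse).
        kv.batch_put({fkey: folder, mkey: mail})


kv = BatchKV()
lock = threading.Lock()
add_mail(kv, lock, "u1", 1, 9, 250, 1700000000, b"summary-a")
add_mail(kv, lock, "u1", 1, 5, 100, 1700000060, b"summary-b")
folder_summary = kv.get("u1:F:1")
```

Deleting an email would follow the same pattern with a batch that removes the mail record and writes the adjusted folder summary.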

As shown in Figure 3, constructing the memory-based Map-structured thread lock buckets comprises the following. When a worker thread updates a user's mail folder information and mail information, a user-level (UID) thread mutex is used to keep the operation atomic. To support many users updating data concurrently, memory-based Map-structured thread lock buckets are used, with 10,000 buckets allocated in advance. Each Map element is keyed by the hash of a UID, and the data for that element is a thread lock (Mutex) object. Depending on the Map data structure, locating the lock for a UID takes O(1) or O(log n) time. KV data is updated through the KV system's batch operation interface (BatchPut), which keeps updates fast and atomic. Within the worker thread the Mutex is released automatically, which effectively avoids system deadlock.

To keep updates to the Key-List data atomic, the Server program uses memory-based Map-structured thread lock buckets: the Key is hashed to one of the buckets, and the thread lock in that bucket is acquired. Setting the number of Map elements (that is, the number of buckets) to 10,000 or more accommodates more concurrent users. Locating a Map element takes O(1) or O(log n) time, so the specific lock is found quickly and data can be updated quickly and concurrently.
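The lock-bucket idea can be sketched as a fixed map from bucket index to mutex. The patent does not name a hash function, so `zlib.crc32` is an assumption chosen because it is stable across runs; the class name `LockBuckets` and the counter workload are also hypothetical. The `with` statement gives the automatic release behaviour the description mentions.

```python
import threading
import zlib


class LockBuckets:
    """Pre-allocated map of bucket index to mutex, selected by hashing the UID."""

    def __init__(self, n_buckets=10000):
        self._n = n_buckets
        self._locks = {i: threading.Lock() for i in range(n_buckets)}

    def lock_for(self, uid):
        # crc32 is an assumed hash; a dict lookup is O(1) on average.
        bucket = zlib.crc32(uid.encode("utf-8")) % self._n
        return self._locks[bucket]


buckets = LockBuckets()
counters = {"alice": 0, "bob": 0}


def bump(uid, times):
    for _ in range(times):
        with buckets.lock_for(uid):  # released automatically on block exit
            counters[uid] += 1


threads = [
    threading.Thread(target=bump, args=(uid, 1000))
    for uid in ("alice", "bob")
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Two users that hash to the same bucket simply share a lock, which costs some concurrency but never correctness; more buckets make such collisions rarer.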

Storing the multiple fields of mail information in a ProtoBuf structure comprises the following. The ProtoBuf structure holds the multiple fields of the mail information. When it is stored as a Key-Value record in the KV system, the whole structure is serialized into a string, and when it is read, the string is deserialized. The entire process is transparent to the user. When a new mail information field is needed, only one field has to be added to the ProtoBuf structure. When old data is deserialized, the new field automatically takes its default value, which does not affect data integrity, and on update the new field is automatically serialized into the string. The process does not affect the rest of the data, data conversion is flexible and efficient, and the data structure is backward and forward compatible.

For one mail information Key-Value record, the multiple fields of the mail information are serialized with the ProtoBuf structure and stored in the Value, then deserialized on read. When a field is added, none of the mail information already stored needs to be updated, because the ProtoBuf structure supplies the default value for the new field. The stored ProtoBuf mail information is rewritten only when that Key-Value record itself needs updating, and records are updated one at a time without touching the rest of the stored data. This dynamic storage approach makes storing and updating mail information fields considerably more flexible.
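The compatibility behaviour described above can be illustrated without the ProtoBuf library. The sketch below is a stand-in, not ProtoBuf: it uses JSON keyed by field number to mimic how numbered fields let an old record be read with a newer schema (missing fields take defaults) and a new record be read with an older schema (unknown fields are skipped). The message classes, field names, and field numbers are hypothetical.

```python
import json
from dataclasses import dataclass, field, fields


@dataclass
class MailInfoV1:
    """Original schema: two numbered fields."""
    sender: str = field(default="", metadata={"num": 1})
    subject: str = field(default="", metadata={"num": 2})


@dataclass
class MailInfoV2:
    """Newer schema: field number 3 was added later."""
    sender: str = field(default="", metadata={"num": 1})
    subject: str = field(default="", metadata={"num": 2})
    has_attachment: bool = field(default=False, metadata={"num": 3})


def serialize(msg):
    """Encode a message as a JSON string keyed by field number."""
    return json.dumps(
        {str(f.metadata["num"]): getattr(msg, f.name) for f in fields(msg)}
    )


def deserialize(cls, data):
    """Decode into cls: absent fields keep their defaults, unknown ones are ignored."""
    raw = json.loads(data)
    kwargs = {
        f.name: raw[str(f.metadata["num"])]
        for f in fields(cls)
        if str(f.metadata["num"]) in raw
    }
    return cls(**kwargs)


# A record written with the old schema, read with the new one.
old_bytes = serialize(MailInfoV1(sender="a@example.com", subject="hi"))
upgraded = deserialize(MailInfoV2, old_bytes)

# A record written with the new schema, read with the old one.
new_bytes = serialize(MailInfoV2("b@example.com", "report", True))
downgraded = deserialize(MailInfoV1, new_bytes)
```

Because each stored Value is decoded independently, records written before the field was added stay as they are until they are next updated.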

Beneficial effects: multiple Key-Value records in the KV system are used to build a Key-List mailbox index structure for multiple users. This indexes the folder summary information and mail information of each user's mailbox, allowing mail folder information to be queried and updated quickly and mail information to be added, deleted, and updated quickly.

The in-memory Map-structured thread lock buckets provide fast atomic updates of the mail folder information and mail information in the Key-List structures of multiple users under high concurrency.

The multiple fields of the mail information are stored as ProtoBuf-structured data in the Value of a Key-Value record, so adding a mail information field is flexible and efficient, and the mail information field structure is backward and forward compatible.

The specific embodiments above describe the objects, technical solutions, and beneficial effects of the present invention in further detail. They are only specific embodiments of the invention and do not limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (5)

1. A multi-user email mailbox indexing method based on a KV system, characterized by comprising the following steps:
constructing a Server program architecture;
constructing a Key-List mailbox structure based on the KV system according to the Server program architecture;
constructing memory-based Map-structured thread lock buckets;
and storing the multiple fields of mail information in a ProtoBuf structure.
2. The method according to claim 1, wherein constructing the Server program architecture specifically comprises:
the Server program is a multi-threaded service program comprising a network IO thread pool, which provides RESTful interface access over HTTP, and a worker thread pool, which performs data query and update services against the KV storage system;
and the network IO thread pool and the worker thread pool interact through a message queue.
3. The method according to claim 2, wherein constructing the Key-List mailbox structure based on the KV system according to the Server program architecture specifically comprises:
a thread of the working thread pool uses a plurality of pieces of Key-Value data in the KV storage system to construct a Key-List structure of mail folder information and a mail information list, wherein one piece of Key-Value data is mail folder information data and a plurality of pieces of Key-Value data are mail information data, each of which represents one mail and stores the relevant information of that mail;
planning the content format of the Key in the Key-Value data, wherein the Key comprises a TAG identification field indicating the type of the Key, a mail folder ID field and a mail ID field, and the prefix portion of the Keys includes the same user ID;
the Value comprises a plurality of fields, wherein the Value of the mail folder information comprises a folder update timestamp, the number of mails in the folder, the total size of the mails in the folder, and the maximum and minimum mail IDs in the folder, and the Value of the mail information comprises the timestamp of the mail, the size of the mail, and the ProtoBuf-serialized summary information of the mail.
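One possible Key and Value layout for claim 3 is sketched below. The claim does not fix field widths, field order or TAG values, so everything concrete here is an assumption made for illustration: the user ID comes first as the shared prefix, integers are packed big-endian so that byte order matches numeric order, and `TAG_FOLDER`, `TAG_MAIL`, `folder_key`, `mail_key`, `folder_value` and `list_mail_ids` are hypothetical names. A plain dictionary stands in for the KV store.

```python
import struct

# Hypothetical TAG values distinguishing the two kinds of Key.
TAG_FOLDER = 1
TAG_MAIL = 2

def folder_key(user_id, folder_id):
    """Key of the single folder-information record: user ID, TAG, folder ID."""
    return struct.pack(">QBI", user_id, TAG_FOLDER, folder_id)

def mail_key(user_id, folder_id, mail_id):
    """Key of one mail-information record: user ID, TAG, folder ID, mail ID."""
    return struct.pack(">QBIQ", user_id, TAG_MAIL, folder_id, mail_id)

def folder_value(update_ts, mail_count, total_size, max_mail_id, min_mail_id):
    """Value of the folder record, carrying the fields listed in claim 3."""
    return struct.pack(">QIQQQ", update_ts, mail_count, total_size,
                       max_mail_id, min_mail_id)

def list_mail_ids(store, user_id, folder_id):
    """Return the mail IDs of one folder in ascending order via a prefix scan.

    A real ordered KV store would seek to the prefix instead of sorting
    every key; sorting a dict is only a stand-in for that behaviour.
    """
    prefix = struct.pack(">QBI", user_id, TAG_MAIL, folder_id)
    ids = []
    for key in sorted(store):
        if key.startswith(prefix):
            ids.append(struct.unpack(">Q", key[len(prefix):])[0])
    return ids
```

Because all Keys of one user share the same leading bytes, the folder record and the mail records of that user sit next to each other in an ordered store, which is what lets a set of Key-Value pairs behave like a Key-List.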
4. The method for indexing multi-user electronic mailbox mail based on a KV system as claimed in claim 1, wherein constructing the memory-based Map-structured thread lock bucket specifically comprises:
when a thread of the working thread pool updates the mail folder information and the mail information of a user, a user-level thread mutex is used to ensure that the data operation is atomic;
a memory-based Map-structured thread lock bucket is used;
when KV data is updated, the batch operation interface of the KV system is used.
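The lock bucket of claim 4 can be sketched as a map from user ID to mutex. This is one plausible reading, not the patented design: the reference counting used to discard idle locks is an added assumption, `LockBucket`, `write_batch` and `add_mail` are hypothetical names, and `write_batch` only imitates a KV batch interface on top of a dictionary.

```python
import threading
from contextlib import contextmanager

class LockBucket:
    """In-memory map from user ID to a mutex, created on demand."""

    def __init__(self):
        self._guard = threading.Lock()   # protects the map itself
        self._locks = {}                 # user_id -> [lock, reference count]

    @contextmanager
    def user_lock(self, user_id):
        # Register interest in this user's lock before waiting on it.
        with self._guard:
            entry = self._locks.setdefault(user_id, [threading.Lock(), 0])
            entry[1] += 1
        try:
            with entry[0]:
                yield
        finally:
            # Drop the entry once no thread holds or waits for it.
            with self._guard:
                entry[1] -= 1
                if entry[1] == 0:
                    del self._locks[user_id]

def write_batch(store, puts):
    """Stand-in for the KV system's batch interface: apply all writes together."""
    store.update(puts)

def add_mail(store, bucket, user_id, folder_id, mail_id, size):
    """Update the folder record and add the mail record under the user's lock."""
    fkey = (user_id, "folder", folder_id)
    with bucket.user_lock(user_id):
        count, total = store.get(fkey, (0, 0))
        write_batch(store, {
            fkey: (count + 1, total + size),
            (user_id, "mail", folder_id, mail_id): size,
        })
```

Locking per user rather than globally means updates for different users do not block each other, while the read-modify-write of one user's folder counters stays serialized.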
5. The method according to claim 1, wherein storing the plurality of fields of mail information using the ProtoBuf structure specifically comprises:
the ProtoBuf structure stores a plurality of fields of mail information; when stored in a Key-Value of the KV system, the whole structure is serialized into a string, and when read, the string is deserialized, so that the whole process is transparent to the user;
if a new mail information field is needed, only one field needs to be added to the ProtoBuf structure; when old data is deserialized, the new field automatically takes its default value, and on update it is automatically serialized into the string, so that existing data is unaffected, data conversion is flexible and efficient, and the data structure is backward and forward compatible.
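The compatibility property in claim 5 can be illustrated with a small stand-in. The code below is not the ProtoBuf library and does not use generated message classes; it is a simplified encoder and decoder that follows the Protocol Buffers wire-format idea of numbered fields (varint and length-delimited types only), so that the effect of adding a field can be seen without a `.proto` toolchain. The schemas `V1` and `V2` and the field names `subject`, `sender` and `flags` are hypothetical.

```python
def encode_varint(n):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_varint(buf, pos):
    """Decode a varint starting at pos; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def encode(fields, schema):
    """Serialize fields; schema maps name -> (field number, 'int' or 'str')."""
    out = bytearray()
    for name, (num, kind) in schema.items():
        if name not in fields:
            continue
        if kind == "int":
            out += encode_varint((num << 3) | 0) + encode_varint(fields[name])
        else:
            data = fields[name].encode("utf-8")
            out += encode_varint((num << 3) | 2) + encode_varint(len(data)) + data
    return bytes(out)

def decode(buf, schema):
    """Deserialize; absent fields get defaults, unknown field numbers are skipped."""
    by_num = {num: (name, kind) for name, (num, kind) in schema.items()}
    msg = {name: (0 if kind == "int" else "") for name, (_, kind) in schema.items()}
    pos = 0
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        num, wire = tag >> 3, tag & 7
        if wire == 0:
            value, pos = decode_varint(buf, pos)
        elif wire == 2:
            length, pos = decode_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        else:
            raise ValueError("unsupported wire type")
        if num in by_num:
            name, kind = by_num[num]
            msg[name] = value if kind == "int" else value.decode("utf-8")
    return msg

# Hypothetical schemas: V2 adds one field to V1.
V1 = {"subject": (1, "str"), "sender": (2, "str")}
V2 = {**V1, "flags": (3, "int")}
```

Decoding a record written with `V1` using `V2` yields the default for `flags`, and decoding a `V2` record with `V1` skips the field it does not know, which are the two directions of compatibility the claim describes. A production system would rely on the real ProtoBuf runtime, which also preserves unknown fields rather than dropping them.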
CN202210872986.6A 2022-07-22 2022-07-22 Multiuser email box mail indexing method based on KV system Active CN115291838B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210872986.6A CN115291838B (en) 2022-07-22 2022-07-22 Multiuser email box mail indexing method based on KV system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210872986.6A CN115291838B (en) 2022-07-22 2022-07-22 Multiuser email box mail indexing method based on KV system

Publications (2)

Publication Number Publication Date
CN115291838A (en) 2022-11-04
CN115291838B (en) 2025-07-18

Family

ID=83825141

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210872986.6A Active CN115291838B (en) 2022-07-22 2022-07-22 Multiuser email box mail indexing method based on KV system

Country Status (1)

Country Link
CN (1) CN115291838B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106161193A (en) * 2015-04-10 2016-11-23 腾讯科技(成都)有限公司 Email processing method, device and system
US20170139914A1 (en) * 2015-11-18 2017-05-18 Oracle International Corporation Electronic mail data modeling for efficient indexing
CN111367996A (en) * 2020-02-25 2020-07-03 深圳联友科技有限公司 KV index-based hot data incremental synchronization method and device
CN114329633A (en) * 2021-12-31 2022-04-12 深圳依时货拉拉科技有限公司 Data storage and access method and device and computer equipment

Also Published As

Publication number Publication date
CN115291838B (en) 2025-07-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant