
CN117938937A - Method and system for processing read-write request based on distributed storage cluster

Info

Publication number
CN117938937A
Authority
CN
China
Prior art keywords
osd
object storage
distributed
cluster
read
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311696286.7A
Other languages
Chinese (zh)
Inventor
陈二运 (Chen Eryun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Unicom Online Information Technology Co Ltd
Original Assignee
China Unicom Online Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Unicom Online Information Technology Co Ltd filed Critical China Unicom Online Information Technology Co Ltd
Priority to CN202311696286.7A
Publication of CN117938937A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/061 Improving I/O performance
    • G06F 3/0611 Improving I/O performance in relation to response time
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 63/00 Network architectures or network communication protocols for network security
    • H04L 63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L 63/0807 Network architectures or network communication protocols for network security for authentication of entities using tickets, e.g. Kerberos
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/104 Peer-to-peer [P2P] networks
    • H04L 67/1044 Group management mechanisms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/2866 Architectures; Arrangements
    • H04L 67/2895 Intermediate processing functionally located close to the data provider application, e.g. reverse proxies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L 9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L 9/321 Cryptographic mechanisms or cryptographic arrangements including means for verifying the identity or authority of a user, involving a third party or a trusted authority
    • H04L 9/3213 Cryptographic mechanisms or cryptographic arrangements including means for verifying the identity or authority of a user, involving a third party or a trusted authority using tickets or tokens, e.g. Kerberos
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L 9/40 Network security protocols

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computer Hardware Design (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a method and a system for processing read-write requests based on a distributed storage cluster, belonging to the technical field of distributed storage. The method comprises the following steps: initiating identity authentication to the distributed cluster, connecting to Rados in the distributed cluster based on the obtained token after the authentication is passed, and querying key configuration information of the distributed cluster after the connection is completed; the load balancing gateway parses the read-write request sent by the client to obtain the file stored in the Bucket, selects an optimal object storage gateway RGW based on the file stored in the Bucket and the key configuration information, and forwards the read-write request to the optimal object storage gateway RGW, which processes the read-write request; and periodically querying Rados of the distributed cluster to update the locally cached key configuration information of the distributed cluster to the latest version. The method and system provided by the application reduce network transmission between RGWs and OSDs (object storage daemons) across hosts, thereby reducing latency and improving efficiency.

Description

Method and system for processing read-write request based on distributed storage cluster
Technical Field
The present invention relates to the field of distributed storage technologies, and in particular, to a method and system for processing a read-write request based on a distributed storage cluster.
Background
Ceph, as a highly reliable and extensible distributed storage system, is widely used for large-scale data storage and processing. However, as storage scale and concurrent access increase, factors such as the network transport topology design, ingress bandwidth, and cluster host network card performance limit the throughput of the entire cluster.
Existing load balancing gateways distribute requests to object storage gateways (RGWs) using a round-robin load balancing algorithm, without considering the topological relation between the object storage gateways and the OSDs in the Ceph cluster. In a high-throughput Ceph cluster, a requested file is forwarded multiple times through switches, which heavily consumes the bandwidth of the host network cards and the switch service network, limiting the throughput of the entire Ceph cluster.
Disclosure of Invention
The invention aims to provide a method and a system for processing read-write requests based on a distributed storage cluster, which are used for solving the defects in the prior art.
The invention provides a method for processing read-write requests based on a distributed storage cluster, which comprises the following steps:
Initiating identity authentication to the distributed cluster based on configuration information of a reverse proxy corresponding to the load balancing gateway; after the authentication is passed, connecting to Rados in the distributed cluster based on the obtained token, and querying key configuration information of the distributed cluster after the connection is completed;
The load balancing gateway parses the read-write request sent by the client to obtain the file stored in the Bucket, selects an optimal object storage gateway RGW based on the file stored in the Bucket and the key configuration information, and forwards the read-write request to the optimal object storage gateway RGW, which processes the read-write request;
and periodically querying Rados of the distributed cluster, and updating the locally cached key configuration information of the distributed cluster to the latest version.
In the above scheme, the configuration information of the reverse proxy includes configuration information of a listening port of the proxy server and list information of the object storage gateway RGW in the distributed cluster.
In the above solution, the key configuration information of the distributed cluster includes: the distributed cluster ID, a topology information list of the object storage gateway RGW deployment, configuration information of the storage pools, the object placement policy, the object storage category, and metadata information of the Buckets in the Ceph cluster.
In the above scheme, selecting the optimal object storage gateway RGW based on the file stored in the Bucket and the key configuration information includes:
splitting the file stored in the Bucket to obtain a plurality of split files, acquiring the head object of each split file, and acquiring the head object ID of each split file according to the size of the split file;
acquiring the PG_ID based on the head object ID;
acquiring the OSD_ID corresponding to the PG_ID based on the PG_ID, and acquiring the HOST_ID corresponding to each OSD based on the OSD_ID;
and selecting the optimal object storage gateway RGW according to the connection state of the object storage gateway RGW on the host HOST_ID corresponding to each OSD.
In the above scheme, obtaining the head object ID of each split file according to the size of the split file includes:
when the size of the split file is <= 4M, the head object ID of the split file is the file name of the split file;
when the size of the split file is > 4M, the head object ID of the split file is the file name followed by ".0".
In the above scheme, acquiring the PG_ID based on the head object ID includes:
taking the head object ID as input and generating a fixed-length hash value through a hash function;
and performing a modulo operation on the hash value corresponding to the head object ID with the number of placement groups PG in the pool to obtain the PG_ID.
In the above scheme, acquiring the OSD_ID corresponding to the PG_ID based on the PG_ID and acquiring the HOST_ID corresponding to each OSD based on the OSD_ID includes:
traversing the PG_MAP data structure in the local cache of the client and searching with the PG_ID as the query key; when a matching PG_ID is found in the locally cached PG_MAP data structure, acquiring the OSD_ID corresponding to the PG_ID;
and querying the relevant information of the OSD_MAP corresponding to the OSD_ID in the local cache of the client, and acquiring the HOST_ID corresponding to each OSD based on the relevant information of the OSD_MAP.
In the above scheme, selecting the optimal object storage gateway RGW according to the connection state of the object storage gateway RGW on the host HOST_ID corresponding to each OSD includes:
calculating the connection weight value of the object storage gateway RGW on the host HOST_ID corresponding to each OSD, weighting the list information of the object storage gateways RGW in the distributed cluster with the connection weight values, sorting the weighted results of the object storage gateways RGW on the hosts corresponding to the OSDs in descending order, and taking the object storage gateway RGW corresponding to the weighted result ranked first as the optimal object storage gateway RGW.
In the above scheme, the OSDs include a master OSD and slave OSDs,
the connection weight value of the object storage gateway RGW on the host HOST_ID corresponding to the master OSD is calculated as:
100 - (connection count of the object storage gateway RGW on the corresponding HOST_ID) × 0.1;
and the connection weight value of the object storage gateway RGW on the host HOST_ID corresponding to a slave OSD is calculated as:
(100 - (connection count of the object storage gateway RGW on the corresponding HOST_ID) × 0.1) × 0.5.
The system for processing read-write requests based on a distributed storage cluster provided by the invention processes read-write requests using the above method for processing read-write requests based on a distributed storage cluster, and the system comprises:
The configuration information query module is used for initiating identity authentication to the distributed cluster based on the configuration information of the reverse proxy corresponding to the load balancing gateway, connecting to Rados in the distributed cluster based on the obtained token after the authentication is passed, and querying key configuration information of the distributed cluster after the connection is completed;
The read-write request parsing module is used for the load balancing gateway to parse the read-write request sent by the client, obtain the file stored in the Bucket, and select the optimal object storage gateway RGW based on the file stored in the Bucket and the key configuration information; the load balancing gateway forwards the read-write request to the optimal object storage gateway RGW, and the optimal object storage gateway RGW processes the read-write request;
And the information updating module is used for periodically querying Rados of the distributed cluster and updating the locally cached key configuration information of the distributed cluster to the latest version.
The embodiment of the invention has the following advantages:
According to the method and system for processing read-write requests based on a distributed storage cluster, the load balancing gateway parses the read-write request sent by the client, obtains the corresponding file stored in the Bucket based on the parsing result, and splits that file to obtain a plurality of split files. It acquires the PG_ID from the head object ID of each split file, acquires the OSD_ID and host ID of each OSD by querying the locally cached PG_MAP and OSD_MAP, selects the optimal object storage gateway RGW according to the connection state of the object storage gateway RGW on the host where each OSD resides, and forwards the read-write request to that optimal object storage gateway RGW. Routing each request to the RGW co-located with its target OSDs reduces cross-host network transmission between RGWs and OSDs, thereby reducing latency and improving efficiency.
Drawings
FIG. 1 is a step diagram of a method of processing read and write requests based on a distributed storage cluster in one embodiment of the invention;
FIG. 2 is a flow diagram of selecting an optimal object storage gateway RGW in accordance with an embodiment of the present invention;
FIG. 3 is a flow diagram of obtaining a header object ID of a split file in one embodiment of the invention;
FIG. 4 is a schematic diagram of a system for processing read and write requests based on a distributed storage cluster in accordance with one embodiment of the invention.
Detailed Description
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
As shown in fig. 1, the present invention provides a method for processing read-write requests based on a distributed storage cluster, which includes the following steps:
Step S1: initiating identity authentication to the distributed cluster based on configuration information of a reverse proxy corresponding to the load balancing gateway; after the authentication is passed, connecting to Rados in the distributed cluster based on the obtained token, and querying key configuration information of the distributed cluster after the connection is completed.
Specifically, step S1 includes:
Step S11: starting the load balancing gateway and loading the configuration information of the reverse proxy corresponding to the load balancing gateway, where the configuration information of the reverse proxy includes the listening port configuration information of the proxy server, the list information of the object storage gateways RGW in the distributed cluster, and the like;
The proxy server serves as the reverse proxy corresponding to the load balancing gateway. The reverse proxy sits between the client side and the distributed storage cluster: it receives requests from clients and then forwards them to the distributed storage cluster, which hides the real addresses of the distributed storage cluster and can, to a certain extent, reduce the load on the cluster.
In an embodiment of the present invention, the distributed storage cluster adopts a Ceph cluster, and the core components in the Ceph cluster include: Rados, the Monitor, the Librados library, the object storage gateway RGW, the OSDs, the storage pools, and the placement groups PG.
Rados is the core and foundation of the Ceph cluster. Rados is short for Reliable Autonomic Distributed Object Store, i.e., reliable, autonomic, distributed object storage. In the Ceph cluster, all data are stored in the form of objects; whatever the data type, Rados is responsible for storing these objects, and Rados ensures that the data always remain consistent.
The Monitor maintains the various maps of the cluster state, such as how many storage pools exist in the Ceph cluster, how many placement groups PG exist in each storage pool, and the mapping relation between the storage pools and the placement groups PG. The Monitor maintains the Monitor map, Manager map, OSD map, PG map, MDS map, CRUSH map, etc. as the key cluster state the daemons require to coordinate with each other. The Monitor is also responsible for identity verification between applications and clients, and additionally provides a logging service that records the Ceph cluster ID, etc.
The Librados library provides the API for Rados; because Rados is difficult to access directly, the object storage gateway RGW accesses Rados via the Librados library.
The object storage gateway RGW, short for RADOS Gateway, is an application of the Ceph cluster that provides object storage services to the outside of the Ceph cluster. The object storage gateway RGW provides an HTTP server through which users can access Rados in the Ceph cluster in a RESTful manner over HTTP; the object storage gateway RGW calls the API of the Librados library to store data into and read data from Rados, and provides object storage access interfaces compatible with S3 and OpenStack Swift.
The OSDs are the storage nodes in a Ceph cluster, responsible for storing data and providing storage services. Each OSD is an independent entity running on a specific OSD host and is responsible for storing and managing data objects. Each data object in the Ceph cluster is replicated onto multiple OSDs; the number of copies may be 3, 4, or another number, depending on the policy configured by the administrator. When an OSD fails, the Ceph cluster automatically detects this and removes it from the service list, and Ceph then automatically creates replicas of the failed OSD's data on other OSDs to maintain high availability and redundancy of data.
In the embodiment of the invention, when the object storage gateway RGW and the OSDs are deployed in the Ceph cluster, the OSDs in the Ceph cluster need to be initialized first, including adding the correspondence between OSD host names and OSD host IPs, disabling the firewall and SELinux, disabling NetworkManager, and the like; the object storage gateway RGW and the OSDs can then be deployed using the ceph-deploy tool, where one or more object storage gateways RGW may connect to multiple OSDs.
The storage pool is a logical concept for storing data in the Ceph cluster. Multiple storage pools together form the Ceph cluster. The definition and configuration of a storage pool are completed in the configuration file of the Ceph cluster, where parameters such as the pool's name, size, and replication level can be specified.
For example, the following is a pool definition in one example configuration file:
```
[global]
#...
# defines a pool named my_pool of 10GB size and replication level 3
pool my_pool size 10G replicated 3
```
In the above example, we define a pool named "my_pool" with a size of 10GB and a replication level of 3, which means the pool will use 10GB of storage space and data will be replicated onto 3 different nodes to guarantee the reliability and availability of the data.
The placement group PG is a collection of objects that all share the same placement policy, i.e., objects in the same placement group PG are placed on the same hard disks. This is similar to an index in a database when data is addressed: each object is fixedly mapped into one placement group PG, so to find an object, one only needs to first find the placement group PG to which the object belongs and then traverse that placement group PG, rather than traversing all objects. In addition, during data migration, the placement group PG is the basic unit of migration; Ceph does not operate on objects directly. In the Ceph cluster, a pool is a logical group composed of multiple placement groups PG, and the number of placement groups PG in a pool can be calculated through a formula, as sketched below.
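As an illustration only (the application does not state the formula), a commonly cited Ceph heuristic is pg_num = (number of OSDs × target PGs per OSD) / replica count, rounded up to the nearest power of two:

```python
def suggest_pg_num(osd_count: int, replica_count: int,
                   target_pgs_per_osd: int = 100) -> int:
    """Hedged sketch: the widely used Ceph heuristic for pg_num,
    not a formula taken from this application."""
    raw = osd_count * target_pgs_per_osd / replica_count
    # Round up to the nearest power of two, as Ceph documentation suggests.
    pg_num = 1
    while pg_num < raw:
        pg_num *= 2
    return pg_num

# Example: 20 OSDs with 3-way replication -> 1024 PGs
print(suggest_pg_num(20, 3))
```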
Specifically, the listening port configuration information of the proxy server includes the IP list information of the Monitors in the Ceph cluster, and the like.
Step S12: initiating CephX identity authentication to the Ceph cluster according to the Monitor IP list information in the listening port configuration information and the keyring, and obtaining a token after the authentication is passed;
Step S13: based on the obtained token, loading the Librados library of the Ceph cluster, using the Librados library to provide the API for Rados, and connecting to Rados in the Ceph cluster through the proxy server.
Step S14: after the connection to Rados in the Ceph cluster is completed, querying the key configuration information of the Ceph cluster, which includes: the Ceph cluster ID, a topology information list of the object storage gateway RGW deployment, configuration information of the pools, the object placement policy, the object storage category, metadata information of the Buckets in the Ceph cluster, and the like;
Wherein the object placement policy and the object storage category define the relationship between file objects and the data storage pools used by the Ceph cluster, for example: file data is stored in storage pool pool1, and index data is stored in storage pool pool2.
Step S15: calling the interface of the object storage gateway RGW to query the metadata information of the Buckets in the Ceph cluster, where the metadata information of a Bucket includes: the current Bucket list, the object placement policy associated with the Bucket, and the like;
In a Ceph cluster, a Bucket is an object used to store data, similar to a directory or folder in other distributed storage systems; it is an abstract concept used to organize and name the data objects stored in the Ceph cluster. The main purpose of a Bucket is to provide a simple way to organize and access data stored in the Ceph cluster; a Bucket can store various types of data, including files, pictures, videos, logs, and the like.
To query the Buckets in the Ceph cluster, the GET method of the object storage gateway RGW interface may be used; the following is an example request:
```bash
GET/buckets
```
This returns a list of Buckets, including the name, access rights, storage type, etc. of each Bucket.
When a particular Bucket needs to be queried, the Bucket's name may be specified in the request. For example, to query a bucket named "mybucket", the following request may be used:
```bash
GET/buckets/mybucket
```
This will return detailed information about the bucket named "mybucket", including the bucket's metadata, access rights, storage type, etc.
Step S16: querying the configuration information of the storage pools in the Ceph cluster, where the configuration information of a storage pool includes: the storage pool name, the storage pool ID, the number of placement groups PG in the storage pool, the size of the placement groups PG in the storage pool, the storage pool type, etc.
Specifically, when querying the configuration information of a pool in the Ceph cluster, the following command is executed: ceph osd pool ls detail
In one embodiment of the invention, running the above command returns, for example:
pool 70'default.rgw.buckets.data'erasure profile ec-profile-4x1 size 5 min_size 4 crush_rule 1 object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode off last_change 17085 flags hashpspool stripe_width 16384 application rgw;
Here, 70 is the storage pool ID, default.rgw.buckets.data is the storage pool name, erasure is the storage pool type, 1024 is the PG number, and so on.
Step S2: the load balancing gateway parses the read-write request sent by the client to obtain the file stored in the Bucket, selects an optimal object storage gateway RGW based on the file stored in the Bucket and the key configuration information, and forwards the read-write request to the optimal object storage gateway RGW, which processes the read-write request.
As shown in fig. 2, step S2 includes the steps of:
Step S21: the reverse proxy server receives and parses the read-write request sent by the client (the read-write request is an HTTP S3 request) and then forwards it to the Ceph cluster. The load balancing gateway parses the read-write request sent by the client to obtain, from the Bucket list in the Bucket metadata information, the name of the Bucket corresponding to the read-write request, the identifier of the file stored in the Bucket, the storage type of the Bucket, and the size of the file stored in the Bucket;
wherein, the S3 request format is as follows:
PUT /Key+ HTTP/1.1
Host:Bucket.s3.amazonaws.com
x-amz-acl:ACL
Cache-Control:CacheControl
Content-Disposition:ContentDisposition
Content-Encoding:ContentEncoding
Content-Language:ContentLanguage
Content-Length:ContentLength
Content-MD5:ContentMD5
Content-Type:ContentType
x-amz-sdk-checksum-algorithm:ChecksumAlgorithm
x-amz-checksum-crc32:ChecksumCRC32
x-amz-checksum-crc32c:ChecksumCRC32C
x-amz-checksum-sha1:ChecksumSHA1
x-amz-checksum-sha256:ChecksumSHA256
Expires:Expires
x-amz-server-side-encryption:ServerSideEncryption
x-amz-storage-class:StorageClass
Body
Specifically, the Bucket name, the identifier of the file stored in the Bucket, the storage type of the Bucket, and the size of the file stored in the Bucket can be obtained by parsing the URL and header information of the HTTP S3 request, as sketched below.
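A minimal sketch of the gateway-side parsing, assuming virtual-hosted addressing as in the request format above (Host: Bucket.s3...); the function and field names are illustrative, not taken from the application:

```python
def parse_s3_put(request_line: str, headers: dict) -> dict:
    """Hedged sketch: extract Bucket and object information from an
    S3-style PUT request line and headers."""
    method, path, _ = request_line.split(" ", 2)
    bucket = headers["Host"].split(".", 1)[0]  # "Bucket.s3..." -> "Bucket"
    return {
        "method": method,
        "bucket": bucket,
        "object_key": path.lstrip("/"),
        "size": int(headers.get("Content-Length", 0)),
        "storage_class": headers.get("x-amz-storage-class", "STANDARD"),
    }

info = parse_s3_put("PUT /videos/video.mp4 HTTP/1.1",
                    {"Host": "mybucket.s3.amazonaws.com",
                     "Content-Length": "10485760"})
print(info["bucket"], info["object_key"], info["size"])
# mybucket videos/video.mp4 10485760
```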
Step S22: querying the information of the pool in the storage pools according to the object placement policy associated with the Bucket in the Bucket metadata information;
specifically, the storage pool information of the Bucket's storage pool is acquired through a command;
Step S23: in order to improve the storage efficiency of large files, the RGW splits a large file into a plurality of Rados objects, so the load balancing gateway must split the file stored in the Bucket according to the same rule to obtain a plurality of split files, acquire the head object of each split file, and acquire the head object ID of each split file according to the size of the split file: when the size of the split file is <= 4M, the head object ID of the split file is the file name of the split file; when the size of the split file is > 4M, the head object ID of the split file is the file name followed by ".0" (for details, see fig. 3). A sketch of this rule follows;
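A minimal sketch of the head object ID rule above (the 4M threshold is from the application; the function and constant names are illustrative):

```python
SPLIT_THRESHOLD = 4 * 1024 * 1024  # 4M, per the rule above

def head_object_id(file_name: str, file_size: int) -> str:
    """Return the head object ID for a (possibly split) file."""
    if file_size <= SPLIT_THRESHOLD:
        return file_name          # small file: head object is the file itself
    return f"{file_name}.0"       # large file: head object is "<name>.0"

print(head_object_id("photo.jpg", 2 * 1024 * 1024))   # photo.jpg
print(head_object_id("video.mp4", 50 * 1024 * 1024))  # video.mp4.0
```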
Step S24: mapping the head object ID into a placement group PG using the Monitor in the Ceph cluster to obtain the PG_ID;
Specifically, step S24 includes: taking the head object ID as input and generating a fixed-length hash value through a hash function, which yields an approximately uniformly distributed pseudo-random value; then performing a modulo operation on the hash value corresponding to the head object ID with the number of placement groups PG in the pool to obtain the PG_ID, as sketched below.
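A minimal sketch of the hash-and-modulo mapping (Ceph itself uses the rjenkins hash, as the pool detail above shows; MD5 is used here purely for illustration):

```python
import hashlib

def pg_id_for(head_object_id: str, pg_num: int) -> int:
    """Hash the head object ID to a fixed-length value, then take it
    modulo the pool's PG count to get the PG_ID."""
    digest = hashlib.md5(head_object_id.encode()).digest()
    hash_value = int.from_bytes(digest[:4], "little")  # fixed-length hash
    return hash_value % pg_num

print(pg_id_for("video.mp4.0", 1024))  # PG_ID in [0, 1023]
```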
Step S25: traversing the PG_MAP data structure in the local cache of the client and searching with the PG_ID as the query key; when a PG_ID matching the PG_MAP data structure in the local cache is found, acquiring the OSD_ID corresponding to the PG_ID. Here, OSD_ID is the ID of an OSD; the placement groups and OSDs in a pool together identify the location and ownership of an object in the Ceph cluster. For example, an object may be assigned to one specific placement group PG, and the actual data is stored by one or more OSDs in that placement group PG: when the actual data is stored by one OSD in the PG, that OSD is the master OSD; when the actual data is stored by multiple OSDs in the PG, one OSD serves as the master OSD and the others serve as slave OSDs. In this way, the OSD_ID corresponding to the PG_ID can be obtained based on the PG_ID;
Step S26: querying the relevant information of the OSD_MAP corresponding to the OSD_ID in the local cache of the client. For example, osdmaptool can be used with the "get osdmap" command to obtain a JSON representation of the OSD_MAP; in that JSON representation, each OSD has a corresponding OSD_ID field, and the value of that field identifies the OSD host where the OSD is located, from which the HOST_ID is obtained;
For example, in the JSON representation of the OSD_MAP, if the OSD_ID of an OSD is "osd-1234", the ID of the OSD host where the OSD is located is "1234". The two lookups are sketched below.
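A minimal sketch of the two cached lookups (the dictionary shapes are illustrative assumptions; the real PG_MAP and OSD_MAP structures carry more fields):

```python
# Illustrative local caches: PG_ID -> ordered OSD list (master OSD first),
# and OSD_ID -> host ID. The real maps are maintained by the Monitors.
PG_MAP = {417: ["osd.12", "osd.3", "osd.7"]}
OSD_MAP = {"osd.12": "host-a", "osd.3": "host-b", "osd.7": "host-c"}

def hosts_for_pg(pg_id: int) -> list[tuple[str, str]]:
    """Return (OSD_ID, HOST_ID) pairs for a PG, master OSD first."""
    osd_ids = PG_MAP.get(pg_id, [])
    return [(osd_id, OSD_MAP[osd_id]) for osd_id in osd_ids]

print(hosts_for_pg(417))
# [('osd.12', 'host-a'), ('osd.3', 'host-b'), ('osd.7', 'host-c')]
```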
Step S27: selecting the optimal object storage gateway RGW according to the connection state of the object storage gateway RGW on the host HOST_ID corresponding to each OSD;
Specifically, the connection weight value of the object storage gateway RGW on the host HOST_ID corresponding to each OSD is calculated; the connection weight values are used to weight the list information of the object storage gateways RGW in the distributed cluster; the weighted results of the object storage gateways RGW on the hosts corresponding to the OSDs are sorted in descending order; and the object storage gateway RGW corresponding to the weighted result ranked first is taken as the optimal object storage gateway RGW.
Specifically, the connection weight value of the object storage gateway RGW on the host HOST_ID corresponding to the master OSD is calculated as:
100 - (connection count of the object storage gateway RGW on the corresponding HOST_ID) × 0.1
The connection weight value of the object storage gateway RGW on the host HOST_ID corresponding to a slave OSD is calculated as:
(100 - (connection count of the object storage gateway RGW on the corresponding HOST_ID) × 0.1) × 0.5
Specifically, the connection count of the object storage gateway RGW on each HOST_ID is obtained from the list information of the object storage gateways RGW in the distributed cluster in the configuration information of the reverse proxy corresponding to the load balancing gateway. The weighting and selection are sketched below.
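A minimal sketch of the weighting and selection using the formulas above (the candidate list shape is an illustrative assumption):

```python
def rgw_weight(connections: int, is_master: bool) -> float:
    """Connection weight per the formulas above: an RGW on a master OSD's
    host scores 100 - connections * 0.1; a slave OSD's host gets half."""
    weight = 100 - connections * 0.1
    return weight if is_master else weight * 0.5

def pick_optimal_rgw(candidates: list[dict]) -> str:
    """candidates: [{'rgw': ..., 'connections': ..., 'is_master': ...}]"""
    scored = [(rgw_weight(c["connections"], c["is_master"]), c["rgw"])
              for c in candidates]
    scored.sort(reverse=True)  # descending order by weight
    return scored[0][1]        # the RGW ranked first

print(pick_optimal_rgw([
    {"rgw": "rgw-on-host-a", "connections": 120, "is_master": True},
    {"rgw": "rgw-on-host-b", "connections": 40,  "is_master": False},
]))  # rgw-on-host-a: weight 88.0 vs 48.0
```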
Step S28: the load balancing gateway forwards the read-write request to the corresponding optimal object storage gateway RGW according to the IP and port information configured in the load balancing gateway, and the object storage gateway RGW processes the read-write request.
Step S3: periodically querying Rados and updating the locally cached key configuration information of the distributed cluster to the latest version.
Specifically, step S3 includes the steps of:
Step S31: updating the key configuration information of the Ceph cluster based on the acquired objects, where the key configuration information of the Ceph cluster includes: the Ceph cluster ID, the topology information list of the object storage gateway RGW deployment, the configuration information of the storage pools, the object placement policy, the object storage category, the metadata information of the Buckets in the Ceph cluster, and the like. The key configuration parameters of the Ceph cluster are usually stored in the cluster's configuration file and can be modified according to specific requirements to adjust the cluster's behavior;
Step S32: refreshing the PG_MAP and the OSD_MAP.
Specifically, the OSD_MAP is the information of all OSDs in the Ceph cluster, including the topological relations of the OSDs, the current update version, the OSD states, etc.; the PG_MAP stores the mapping relations of all PG lists and OSD lists in the current cluster, the master-slave information, the current PG states of the updated version, etc. The OSD_MAP and PG_MAP are maintained by the Monitor and cached in memory, and clients query them through an API interface. To improve the query efficiency of the load balancing gateway, they are also cached locally on the gateway side with a timed comparison-and-refresh mechanism, as sketched below.
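A minimal sketch of the timed comparison-and-refresh mechanism (the epoch field and the query helper are illustrative assumptions, not an API from the application):

```python
import threading

local_cache = {"osd_map": None, "pg_map": None, "epoch": -1}

def refresh_maps(query_cluster, interval_sec: float = 30.0) -> None:
    """Periodically compare the cluster's map epoch with the local one
    and refresh the local cache only when the cluster copy is newer."""
    epoch, osd_map, pg_map = query_cluster()  # assumed Rados query helper
    if epoch > local_cache["epoch"]:
        local_cache.update(osd_map=osd_map, pg_map=pg_map, epoch=epoch)
    timer = threading.Timer(interval_sec, refresh_maps,
                            args=(query_cluster, interval_sec))
    timer.daemon = True
    timer.start()
```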
As shown in fig. 4, the present invention provides a system for processing read-write requests based on a distributed storage cluster, which processes read-write requests using the above method for processing read-write requests based on a distributed storage cluster, and the system comprises:
The configuration information query module is used for initiating identity authentication to the distributed cluster based on the configuration information of the reverse proxy corresponding to the load balancing gateway, connecting to Rados in the distributed cluster based on the obtained token after the authentication is passed, and querying key configuration information of the distributed cluster after the connection is completed;
The read-write request parsing module is used for the load balancing gateway to parse the read-write request sent by the client, obtain the file stored in the Bucket, and select the optimal object storage gateway RGW based on the file stored in the Bucket and the key configuration information; the load balancing gateway forwards the read-write request to the optimal object storage gateway RGW, and the optimal object storage gateway RGW processes the read-write request;
And the information updating module is used for periodically querying Rados and updating the locally cached key configuration information of the distributed cluster to the latest version.
It should be noted that the foregoing detailed description is exemplary and is intended to provide further explanation of the application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present application. As used herein, the singular is intended to include the plural unless the context clearly indicates otherwise. Furthermore, it will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, steps, operations, devices, components, and/or groups thereof.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or otherwise described herein.
Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those elements but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Spatially relative terms, such as "above", "over", "on the upper surface of", and the like, may be used herein for ease of description to describe one device or feature's spatial location relative to another device or feature as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations in use or operation in addition to the orientation depicted in the figures. For example, if the device in the figures is turned over, elements described as "above" or "over" other devices or structures would then be oriented "below" or "beneath" the other devices or structures. Thus, the exemplary term "above" may include both the orientations "above" and "below". The device may also be positioned in other different ways, such as rotated 90 degrees or at other orientations, and the spatially relative descriptors used herein are interpreted accordingly.
In the above detailed description, reference is made to the accompanying drawings, which form a part hereof. In the drawings, like numerals typically identify like components unless context indicates otherwise. The illustrated embodiments described in the detailed description, drawings, and claims are not meant to be limiting. Other embodiments may be utilized, and other changes may be made, without departing from the spirit or scope of the subject matter presented herein.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for processing read-write requests based on a distributed storage cluster, the method comprising:
initiating identity authentication to the distributed cluster based on configuration information of a reverse proxy corresponding to the load balancing gateway; after the authentication is passed, connecting to Rados in the distributed cluster based on the obtained token, and querying key configuration information of the distributed cluster after the connection is completed;
the load balancing gateway parsing the read-write request sent by the client to obtain the file stored in the Bucket, selecting an optimal object storage gateway RGW based on the file stored in the Bucket and the key configuration information, and forwarding the read-write request to the optimal object storage gateway RGW, the optimal object storage gateway RGW processing the read-write request;
and periodically querying Rados of the distributed cluster, and updating the locally cached key configuration information of the distributed cluster to the latest version.
2. The method of claim 1, wherein the configuration information of the reverse proxy includes configuration information of listening ports of proxy servers and list information of object storage gateways RGWs in the distributed cluster.
3. The method for processing read-write requests based on a distributed storage cluster according to claim 1, wherein the key configuration information of the distributed cluster comprises: the distributed cluster ID, a topology information list of the object storage gateway RGW deployment, configuration information of the storage pools, the object placement policy, the object storage category, and metadata information of the Buckets in the Ceph cluster.
4. The method of claim 1, wherein selecting the optimal object storage gateway RGW based on the file stored in the Bucket and the key configuration information comprises:
splitting the file stored in the Bucket to obtain a plurality of split files, acquiring the head object of each split file, and acquiring the head object ID of each split file according to the size of the split file;
acquiring the PG_ID based on the head object ID;
acquiring the OSD_ID corresponding to the PG_ID based on the PG_ID, and acquiring the HOST_ID corresponding to each OSD based on the OSD_ID;
and selecting the optimal object storage gateway RGW according to the connection state of the object storage gateway RGW on the host HOST_ID corresponding to each OSD.
5. The method for processing read-write requests based on a distributed storage cluster according to claim 4, wherein obtaining the head object ID of each split file according to the size of the split file comprises:
when the size of the split file is <= 4M, the head object ID of the split file is the file name of the split file;
when the size of the split file is > 4M, the head object ID of the split file is the file name followed by ".0".
6. The method of claim 4, wherein acquiring the PG_ID based on the head object ID comprises:
taking the head object ID as input and generating a fixed-length hash value through a hash function;
and performing a modulo operation on the hash value corresponding to the head object ID with the number of placement groups PG in the pool to obtain the PG_ID.
7. The method of claim 4, wherein acquiring the OSD_ID corresponding to the PG_ID based on the PG_ID and acquiring the HOST_ID corresponding to each OSD based on the OSD_ID comprises:
traversing the PG_MAP data structure in the local cache of the client and searching with the PG_ID as the query key; when a matching PG_ID is found in the locally cached PG_MAP data structure, acquiring the OSD_ID corresponding to the PG_ID;
and querying the relevant information of the OSD_MAP corresponding to the OSD_ID in the local cache of the client, and acquiring the HOST_ID corresponding to each OSD based on the relevant information of the OSD_MAP.
8. The method for processing read-write requests based on a distributed storage cluster according to claim 4, wherein selecting the optimal object storage gateway RGW according to the connection state of the object storage gateway RGW on the host HOST_ID corresponding to each OSD comprises:
calculating the connection weight value of the object storage gateway RGW on the host HOST_ID corresponding to each OSD, weighting the list information of the object storage gateways RGW in the distributed cluster with the connection weight values, sorting the weighted results of the object storage gateways RGW on the hosts corresponding to the OSDs in descending order, and taking the object storage gateway RGW corresponding to the weighted result ranked first as the optimal object storage gateway RGW.
9. The method of claim 8, wherein the OSDs comprise a master OSD and slave OSDs,
the connection weight value of the object storage gateway RGW on the host HOST_ID corresponding to the master OSD is calculated as:
100 - (connection count of the object storage gateway RGW on the corresponding HOST_ID) × 0.1;
and the connection weight value of the object storage gateway RGW on the host HOST_ID corresponding to a slave OSD is calculated as:
(100 - (connection count of the object storage gateway RGW on the corresponding HOST_ID) × 0.1) × 0.5.
10. A system for processing read-write requests based on a distributed storage cluster, which processes read-write requests using the method for processing read-write requests based on a distributed storage cluster according to any one of claims 1-9, the system comprising:
The configuration information query module, used for initiating identity authentication to the distributed cluster based on the configuration information of the reverse proxy corresponding to the load balancing gateway, connecting to Rados in the distributed cluster based on the obtained token after the authentication is passed, and querying key configuration information of the distributed cluster after the connection is completed;
The read-write request parsing module, used for the load balancing gateway to parse the read-write request sent by the client, obtain the file stored in the Bucket, and select the optimal object storage gateway RGW based on the file stored in the Bucket and the key configuration information; the load balancing gateway forwards the read-write request to the optimal object storage gateway RGW, and the optimal object storage gateway RGW processes the read-write request;
And the information updating module, used for periodically querying Rados of the distributed cluster and updating the locally cached key configuration information of the distributed cluster to the latest version.
CN202311696286.7A 2023-12-11 2023-12-11 Method and system for processing read-write request based on distributed storage cluster Pending CN117938937A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311696286.7A CN117938937A (en) 2023-12-11 2023-12-11 Method and system for processing read-write request based on distributed storage cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311696286.7A CN117938937A (en) 2023-12-11 2023-12-11 Method and system for processing read-write request based on distributed storage cluster

Publications (1)

Publication Number Publication Date
CN117938937A 2024-04-26

Family

ID=90761918

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311696286.7A Pending CN117938937A (en) 2023-12-11 2023-12-11 Method and system for processing read-write request based on distributed storage cluster

Country Status (1)

Country Link
CN (1) CN117938937A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination