CN114328431A - Distributed file system - Google Patents
- Publication number
- CN114328431A
- Authority
- CN
- China
- Prior art keywords
- node
- namenode
- datanode
- nodes
- file system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Technical field
The invention belongs to the technical field of file storage systems and specifically relates to a distributed small-file system.
Background
Most existing distributed storage systems are implemented in C or Go. They do not integrate well with the large ecosystem of applications built on the Java technology stack, they are hard for Java developers to pick up, and they are difficult for small and medium-sized companies to maintain. Such systems show high latency and crash frequently. They cannot deliver high concurrency, high stability, and high availability, which raises maintenance costs for the enterprises that use them. File storage and transfer are bottlenecked, and the systems cannot scale out horizontally. On the usage side, obscure user manuals and missing documentation make the systems hard to use, and guidance from specialists is usually required.
Summary of the invention
In view of the above, and to overcome the defects of the prior art, the present invention provides a distributed storage system that effectively solves the problems described in the background.
To achieve this, the invention provides the following technical solution: a distributed small-file system comprising a client node, a NameNode node, a BackupNode node, a DataNode node, and a storage node, wherein
the NameNode node manages the metadata of the file system,
the BackupNode node serves as the backup node of the NameNode node,
the DataNode node stores the actual files,
the DataNode node is communicatively connected to the client node and to the NameNode node, the client node is communicatively connected to the NameNode node, and the NameNode node is also communicatively connected to the BackupNode node.
Preferably, the system further includes the Netty framework, and the NameNode node, BackupNode node, DataNode node, and storage node all communicate over the Netty framework.
Preferably, the system further uses ProtoBuf serialization.
Preferably, the client node, NameNode node, BackupNode node, DataNode node, and storage node are all developed in Java.
Compared with the prior art, the beneficial effects of the present invention are:
The invention combines three node types (NameNode, DataNode, and BackupNode) into one system. Responsibilities are clearly divided among the nodes and the code is easy to understand. The design achieves a highly available and highly extensible architecture and supports elastic scaling of cluster nodes. The NameNode node supports both standalone and cluster deployment modes. Standalone mode meets the needs of small and medium-sized companies that store large numbers of small files. Cluster mode supports medium and large companies that store massive numbers of small files.
The invention is developed in Java and is fully compatible with the large Java technology ecosystem in China. For Java developers the barrier to entry is very low, and because the system is written in Java, they can quickly read the source code to locate a problem when one occurs. The system supports basic functions such as uploading, storing, and downloading massive numbers of small files, can store small files on the order of hundreds of millions, and supports horizontal scaling, so that in theory system capacity has no upper limit. It uses Netty, a mature technology in the industry, as its network communication framework, which provides high concurrency, high performance, and high stability. An advanced active/standby switchover mechanism is also designed in, which gives the system high availability.
Description of the drawings
The accompanying drawings provide a further understanding of the invention and form part of the specification. Together with the embodiments they serve to explain the invention and do not limit it. In the drawings:
Fig. 1 is a schematic structural diagram of the prior art;
Fig. 2 is a schematic diagram of the system structure of the present invention.
Detailed description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the scope of protection of the invention.
As shown in Figs. 1 and 2, the invention discloses a distributed small-file system comprising a client node, a NameNode node, a BackupNode node, a DataNode node, and a storage node, wherein
the NameNode node manages the metadata of the file system,
the BackupNode node serves as the backup node of the NameNode node,
the DataNode node stores the actual files,
the DataNode node is communicatively connected to the client node and to the NameNode node, the client node is communicatively connected to the NameNode node, and the NameNode node is also communicatively connected to the BackupNode node.
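The connections listed above can be written down as a small adjacency table. The following is only a sketch: the names `NodeRole`, `Topology`, and `canTalk` are hypothetical and do not come from the patent, and the storage node is left out because the description does not say which nodes it connects to.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Node roles named in the description (storage node omitted, see lead-in). */
enum NodeRole { CLIENT, NAME_NODE, BACKUP_NODE, DATA_NODE }

/** Hypothetical helper that records which roles share a communication link. */
final class Topology {
    private static final Map<NodeRole, Set<NodeRole>> LINKS = new EnumMap<>(NodeRole.class);

    static {
        for (NodeRole role : NodeRole.values()) {
            LINKS.put(role, EnumSet.noneOf(NodeRole.class));
        }
        // The four links stated in the description, each treated as bidirectional.
        link(NodeRole.DATA_NODE, NodeRole.CLIENT);
        link(NodeRole.DATA_NODE, NodeRole.NAME_NODE);
        link(NodeRole.CLIENT, NodeRole.NAME_NODE);
        link(NodeRole.NAME_NODE, NodeRole.BACKUP_NODE);
    }

    private Topology() {}

    private static void link(NodeRole a, NodeRole b) {
        LINKS.get(a).add(b);
        LINKS.get(b).add(a);
    }

    /** Returns true if the description states a link between the two roles. */
    static boolean canTalk(NodeRole a, NodeRole b) {
        return LINKS.get(a).contains(b);
    }
}
```

Under this reading, `Topology.canTalk(NodeRole.CLIENT, NodeRole.BACKUP_NODE)` is false: the BackupNode is reachable only through the NameNode.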
The system further includes the Netty framework, and the NameNode node, BackupNode node, DataNode node, and storage node all communicate over the Netty framework.
The system uses Netty and Google Protobuf. It contains nodes in several roles. The NameNode node manages the metadata of the file system. The BackupNode node serves as the backup node of the NameNode node and provides active/standby switchover. The DataNode node stores the actual files on the machine's disk. The Netty framework is used to implement high-concurrency, high-performance file transfer and management. ProtoBuf serialization provides very high performance and a high compression ratio.
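The description does not say how the active/standby switchover is triggered. One common approach, shown here purely as an assumption, is for the standby to promote itself when heartbeats from the active NameNode stop arriving within a timeout. The class `BackupNodeMonitor`, its methods, and the timeout parameter are all hypothetical. Time is passed in explicitly so the logic can be exercised without a real clock or network.

```java
/**
 * Hypothetical standby monitor: promotes itself once the active NameNode's
 * heartbeat has been silent for longer than the timeout. This is an assumed
 * mechanism, not the one disclosed in the patent.
 */
final class BackupNodeMonitor {
    private final long timeoutMillis;   // assumed failure-detection window
    private long lastHeartbeatMillis;   // time of the most recent heartbeat
    private boolean active;             // false = standby, true = promoted

    BackupNodeMonitor(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastHeartbeatMillis = nowMillis;
    }

    /** Called whenever a heartbeat from the active NameNode arrives. */
    synchronized void onHeartbeat(long nowMillis) {
        lastHeartbeatMillis = nowMillis;
    }

    /** Periodic check; returns true once this node has taken over. */
    synchronized boolean check(long nowMillis) {
        if (!active && nowMillis - lastHeartbeatMillis > timeoutMillis) {
            active = true; // promotion is one-way in this sketch
        }
        return active;
    }
}
```

A real deployment would also need to keep the BackupNode's metadata in sync with the NameNode and to prevent both nodes from acting as active at once; neither concern is covered by this sketch.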
The client node, NameNode node, BackupNode node, DataNode node, and storage node are all developed in Java.
When the invention is implemented, a supporting tool set can also be used, including a WebUI and operations scripts, which provide a visual operating interface, data migration scripts, and similar utilities.
Working principle: the SDK integrated into the client sends a request over the network to the NameNode node, and the NameNode node allocates a list of DataNode nodes that will store the file. The client SDK then establishes connections to those DataNode nodes and transfers the file content. This architecture separates metadata management from file storage and yields a scalable system.
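The request flow in the working principle can be illustrated with an in-memory sketch. Everything here is hypothetical: the class names `SketchNameNode`, `SketchDataNode`, and `SketchClient`, the `replicas` parameter, and the round-robin placement policy are assumptions made for illustration, and plain method calls stand in for the Netty connections and ProtoBuf messages of the real system.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for a DataNode: holds file contents only. */
final class SketchDataNode {
    final String id;
    private final Map<String, byte[]> files = new HashMap<>();

    SketchDataNode(String id) { this.id = id; }

    void store(String name, byte[] content) { files.put(name, content.clone()); }

    byte[] read(String name) { return files.get(name); }
}

/** In-memory stand-in for the NameNode: holds metadata only, never file bytes. */
final class SketchNameNode {
    private final List<SketchDataNode> dataNodes = new ArrayList<>();
    private final Map<String, List<SketchDataNode>> locations = new HashMap<>();
    private int next; // round-robin cursor (an assumed placement policy)

    void register(SketchDataNode node) { dataNodes.add(node); }

    /** Picks up to `replicas` DataNodes for a file and records the placement. */
    List<SketchDataNode> allocate(String name, int replicas) {
        if (dataNodes.isEmpty()) {
            throw new IllegalStateException("no DataNode registered");
        }
        int count = Math.min(replicas, dataNodes.size());
        List<SketchDataNode> chosen = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            chosen.add(dataNodes.get((next + i) % dataNodes.size()));
        }
        next = (next + 1) % dataNodes.size();
        locations.put(name, chosen);
        return chosen;
    }

    List<SketchDataNode> locate(String name) { return locations.get(name); }
}

/** Client SDK flow: ask the NameNode first, then talk to the DataNodes directly. */
final class SketchClient {
    private final SketchNameNode nameNode;

    SketchClient(SketchNameNode nameNode) { this.nameNode = nameNode; }

    void upload(String name, byte[] content, int replicas) {
        for (SketchDataNode node : nameNode.allocate(name, replicas)) {
            node.store(name, content);
        }
    }

    byte[] download(String name) {
        List<SketchDataNode> nodes = nameNode.locate(name);
        return (nodes == null || nodes.isEmpty()) ? null : nodes.get(0).read(name);
    }
}
```

The point of the split is visible in the types: `SketchNameNode` never touches file bytes, so adding capacity amounts to registering more `SketchDataNode` instances without changing the client.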
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device.
Although embodiments of the invention have been shown and described, a person of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principle and spirit of the invention. The scope of the invention is defined by the appended claims and their equivalents.
Claims (4)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111108565.8A CN114328431A (en) | 2021-09-22 | 2021-09-22 | Distributed file system |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111108565.8A CN114328431A (en) | 2021-09-22 | 2021-09-22 | Distributed file system |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN114328431A (en) | 2022-04-12 |
Family
ID=81045068
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202111108565.8A Pending CN114328431A (en) | 2021-09-22 | 2021-09-22 | Distributed file system |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN114328431A (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7711711B1 (en) * | 2006-03-29 | 2010-05-04 | Emc Corporation | Networked storage system employing information lifecycle management in conjunction with a distributed global file system |
| CN103581332A (en) * | 2013-11-15 | 2014-02-12 | 武汉理工大学 | HDFS framework and pressure decomposition method for NameNodes in HDFS framework |
| US10298709B1 (en) * | 2014-12-31 | 2019-05-21 | EMC IP Holding Company LLC | Performance of Hadoop distributed file system operations in a non-native operating system |
| US20190347336A1 (en) * | 2018-05-10 | 2019-11-14 | Paypal, Inc. | System and method for scouting distributed file system metadata |
| CN111159132A (en) * | 2018-11-08 | 2020-05-15 | 北京航天长峰科技工业集团有限公司 | Batch small file processing system based on HDFS |
Similar Documents
| Publication | Title |
|---|---|
| CN111078121B (en) | Data migration method and system for distributed storage system and related components |
| CN111078120B (en) | A data migration method, system and related components for a distributed file system |
| US8285686B2 (en) | Executing prioritized replication requests for objects in a distributed storage system |
| TWI476610B (en) | Peer-to-peer redundant file server system and methods |
| CN104113597B (en) | The HDFS data read-write method of a kind of many Data centres |
| Deka | A survey of cloud database systems |
| CN112099918A (en) | Live migration of clusters in containerized environments |
| CN103002027B (en) | Data-storage system and the method for tree directory structure is realized based on key-value pair system |
| Lai et al. | Towards a framework for large-scale multimedia data storage and processing on Hadoop platform |
| CN104516967A (en) | Electric power system mass data management system and use method thereof |
| AU2009330067A1 (en) | Asynchronous distributed garbage collection for replicated storage clusters |
| US12079193B2 (en) | Distributed storage systems and methods to provide change tracking integrated with scalable databases |
| CN107343021A (en) | A kind of Log Administration System based on big data applied in state's net cloud |
| CN103942330B (en) | A kind of processing method of big data, system |
| US12032847B2 (en) | Cross-platform replication of logical units |
| US20180316756A1 (en) | Cross-platform replication of logical units |
| US20150046394A1 (en) | Storage system, storage control device, and storage medium storing control program |
| US20240104081A1 (en) | Integrating change tracking of storage objects of a distributed object storage database into a distributed storage system |
| US20190303035A1 (en) | Copying garbage collector for geographically distributed data storage environment |
| Chandra et al. | A study on cloud database |
| CN115878269A (en) | Cluster migration method, related device and storage medium |
| Gupta et al. | Hadoop-an open source framework for big data |
| CN103853612A (en) | Method for reading data based on digital family content under distributed storage |
| CN118916431A (en) | Data acquisition method, system, equipment and medium of GoldenDB database |
| CN114328431A (en) | Distributed file system |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20220412 |