
CN107577460A - Method for extracting structured data from unstructured data - Google Patents

Method for extracting structured data from unstructured data

Info

Publication number
CN107577460A
Authority
CN
China
Prior art keywords
parsing
data
node
parser
parse tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710757615.2A
Other languages
Chinese (zh)
Inventor
耐尔
屈朝晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Shengmei Intelligent System Co Ltd
Original Assignee
Suzhou Shengmei Intelligent System Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Shengmei Intelligent System Co Ltd
Priority to CN201710757615.2A
Publication of CN107577460A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention discloses a method for extracting structured data from unstructured data, comprising the following steps: building a pattern library that contains a number of patterns written as regular expressions; creating an empty parser through graphical operations; taking part of the data to be parsed as sample data and determining the parsing requirements; according to the parsing requirements, calling at least one pattern and editing, in turn, classification rules and parsing rules to classify and then parse the sample data, the classification rules and parsing rules being stored in the parser; displaying the field value names of the sample data extracted by the parsing rules; and deploying the parser in the actual environment. In the method provided by the invention, parsing is carried out through a graphical interface, so operation is flexible, no program files need to be edited directly, and development, updating, and maintenance are made easier.

Description

Method for extracting structured data from unstructured data
Technical field
The present invention relates to data parsing technology in the field of big data, and more particularly to a method for extracting structured data from unstructured data.
Background
Operations and maintenance (O&M) data is not only enormous in volume and varied in type but is also produced in many different places, which makes it very complex to process, and answers are often needed within seconds. We need a way to quickly locate and identify problems and find their root causes in massive amounts of data. At the same time, this large volume of data also contains much information that is valuable to enterprises, institutions, and the people responsible in every organization. We need a way to turn this data into meaningful, valuable information that can have an impact.
O&M data includes structured data, semi-structured data, and unstructured data. Unstructured data is produced by machines or by humans; human-generated unstructured data includes text, audio, images, and similar forms. Semi-structured data comes mainly from system logs, applications, servers, middleware, network devices, security devices, and databases. Structured data comes from databases, monitoring systems, log and network monitoring systems, system monitoring, and so on. Data is collected from different locations by means such as proxy servers, SYSLOG, TCP, UDP, FTP, and HTTP. After collection, the data is sent to a data processing system. During processing, the data must be parsed and the useful field information extracted from it. Parsing and field extraction must be completed in real time or near real time by matching patterns (Pattern).
A system that can save, store, load, edit, import, and export the parsing of a data structure and deploy it in a production environment is called a parser. The person who develops a parser is the developer. The developer inspects sample data, works out the types and fields of the sample data, develops a parser (Parser), and then deploys the parser in the actual production environment, where it parses data for use by a data analysis system. Specifically, the developer first sorts the data records in the sample data into categories and then, for each category of data record, extracts fields and names them; these fields take the form of numbers, strings, IP addresses, and so on. To successfully create a parser that can scan sample data, categorize it, and extract and output fields from it, the developer must carry out both of these steps. Traditionally, parsers are developed and deployed in one of three ways: 1. developing the parser in a high-level programming language; 2. having programmers write regular expressions (RegEx); 3. writing the parser definition in a simpler scripting format such as JSON or XML. Once these parsers have been developed, traditional systems deploy them to the production system by copying the parser files.
The main shortcomings of traditional parsing methods are:
1. Developers must be proficient in a pattern-matching language or a high-level programming language; when the volume of data to be parsed is large and its types are varied, parsers written in these languages are hard to maintain and hard to debug.
2. Developers must inspect the sample data, work out its record types and meaningful fields, and develop the parsing patterns according to their personal command of the pattern-matching language or high-level programming language.
3. Developers have to write parsers by hand. There is no convenient graphical development environment and no interactive tool for incremental additions, so developers must have the entire design in mind, or written down as a document, before development begins.
4. Developers have no quick way to check the correctness of a parser during development and can discover errors in the parser only after it has actually been deployed in production.
5. Because conflict-handling tools are lacking, hand-written parsers in traditional parsing systems cannot be reused.
6. If changes have to be made while a parser is deployed, the entire data processing system must be restarted.
Summary of the invention
To address the shortcomings of the above technology, the present invention provides a method for extracting structured data from unstructured data in which parsing is carried out through a graphical interface, so operation is flexible, no program files need to be edited directly, and development, updating, and maintenance are made easier.
To achieve these objects and other advantages of the present invention, the invention is implemented through the following technical solution:
The present invention provides a method for extracting structured data from unstructured data, comprising the following steps:
Building a pattern library that contains a number of patterns written as regular expressions;
Creating an empty parser through graphical operations;
Taking part of the data to be parsed as sample data and determining the parsing requirements;
According to the parsing requirements, calling at least one pattern and editing, in turn, classification rules and parsing rules to classify and parse the sample data, the classification rules and the parsing rules being stored in the parser;
Displaying the field value names of the sample data extracted by the parsing rules;
Deploying the parser to the actual use environment from the GUI, with the parsing results output to the next step of data analysis.
Creating the parser comprises the following steps:
Creating a main parse tree;
Adding at least one parsing node, side by side as tree nodes, under the root of the main parse tree, and automatically generating one "other" node alongside the parsing nodes;
Starting, on each parsing node, a classification unit that classifies data and a parsing unit that parses the classified data; starting the parsing unit on the other node;
In each classification unit, calling at least one pattern and editing it, based on at least one positive rule and/or negative rule, to form the classification rule; data that satisfies the classification rule of the classification unit on any parsing node is assigned to that node, and data that satisfies none is assigned to the other node;
In the parsing unit, calling at least one pattern and editing it to form the parsing rule, which is used to parse the classified data.
Preferably, creating the parser further comprises the steps of:
Adding, as tree nodes, at least one sub parse tree alongside the main parse tree;
Adding parsing nodes, classification units, and parsing units to each sub parse tree in the same way as for the main parse tree, with the other node generated automatically;
Starting the classification unit and the parsing unit in turn to classify and parse the data.
Preferably, adding the parsing node further comprises the following steps:
Adding at least one parsing child node, side by side as tree nodes, under the parsing node, and starting the classification unit on each parsing child node to classify data;
Moving the parsing unit of the parsing node to each of the added parsing child nodes, where parsing is performed;
Wherein, when a parsing child node is added, an other node alongside the parsing child nodes is generated automatically;
The parsing unit of a parsing child node moves in turn to the corresponding last-level child nodes, where parsing is performed.
Preferably, creating the parser further comprises the following steps:
On the main parse tree, the sub parse trees, the parsing nodes, the other nodes, the parsing child nodes, and their child nodes:
Editing the name;
Editing and displaying the data type currently being parsed;
Editing and displaying the creation time;
Editing and displaying the update time;
Editing labels added for identification; and
Modifying, editing, and deleting the parser.
Preferably, creating the parser further comprises the following steps:
On the main parse tree, the sub parse trees, the parsing nodes, and the parsing child nodes:
Copying a node from one parse tree and pasting it into another parse tree to create a new node, or copying a node and pasting it within the same parse tree to create a new node;
Adding nodes as tree nodes.
Preferably, editing the classification rule or the parsing rule comprises the following steps:
Setting up a rule editing bar;
When editing a classification rule, selecting at least one positive rule and/or negative rule, then calling at least one pattern and editing it to form the classification rule, and applying the classification rule to the corresponding data to classify it;
When editing a parsing rule, selecting at least one pattern, dragging it into the rule editing bar, and editing it to form the parsing rule, and applying the parsing rule to the corresponding data to parse it.
Preferably, after the field value names of the sample data extracted by the parsing rules are displayed, the method further comprises the step of:
Downloading, storing, deploying, and reusing the parser: from the GUI, the parser is sent to the actual use environment for deployment, and the parsing results are output to the next step of data analysis.
The present invention provides at least the following beneficial effects:
The method provided by the invention for extracting structured data from unstructured data is based on tree nodes and a graphical operation interface. After the sample data to be parsed is uploaded, at least one pattern in the parser database is called and edited to form parsing rules, which are saved in the parser to form the parser; the fields in the sample data that correspond to the parsing rules are then extracted and displayed. The whole parsing process is carried out through a graphical interface, so operation is flexible, no program files need to be edited directly, and development, updating, and maintenance are made easier.
Further advantages, objects, and features of the present invention will be set out in part in the following description, and in part will be understood by those skilled in the art through study and practice of the invention.
Brief description of the drawings
Fig. 1 is a flow diagram of the method of the present invention for extracting structured data from unstructured data;
Fig. 2 is a flow diagram of creating a parser that includes a main parse tree;
Fig. 3 is a flow diagram of creating a parser that includes sub parse trees;
Fig. 4 is a flow diagram of adding parsing child nodes;
Fig. 5 is a flow diagram of editing parsing rules;
Fig. 6 is a schematic diagram of classification rule and parsing rule editing;
Fig. 7 is a schematic diagram of the nodes of a parse tree;
In the figures:
10 - main parse tree;
11 - parsing node;
12 - other node;
111 - parsing child node;
20 - parse tree;
30 - parser rule editing GUI;
31 - rule editing bar;
32 - container pattern unit;
33 - normal pattern unit;
34 - field value name display unit.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings, so that those skilled in the art can implement it by following the text of the specification.
It should be understood that terms such as "having", "comprising", and "including" as used herein do not exclude the presence or addition of one or more other elements or combinations thereof.
As shown in Fig. 1, the method of the present invention for extracting structured data from unstructured data comprises the following steps:
S10, building a pattern library that contains a number of patterns written as regular expressions;
S20, creating an empty parser through graphical operations;
S30, taking part of the data to be parsed as sample data and determining the parsing requirements;
S40, according to the parsing requirements, calling at least one pattern and editing, in turn, classification rules and parsing rules to classify and parse the sample data, the classification rules and parsing rules being stored in the parser;
S50, displaying the field value names of the sample data extracted by the parsing rules;
S60, sending the resulting parser from the GUI to the actual environment for deployment, with the parsing results passed to the next step of data analysis.
In the above embodiment, patterns are written as regular expressions and represent common ways of splitting different log texts. Because the lines of a log are split in structurally similar ways, the developer can give the same split a single name and refer to it as a particular pattern. The present invention defines such a named split as a pattern. Patterns fall into at least two major classes: normal patterns (Normal Pattern) and container patterns (Container Pattern).
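A pattern library of this kind can be pictured roughly as follows. This is only a minimal sketch: the pattern names (`IP`, `INT`, `WORD`, `ENDPOINT`), the dictionary layout, and the `expand` helper are hypothetical, and representing a container pattern as a sequence of pattern names and literal separators is an assumption made for illustration, not the format used by the invention.

```python
import re

# Hypothetical normal patterns: each name maps to a regular expression fragment.
NORMAL_PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "INT": r"\d+",
    "WORD": r"\w+",
}

# Hypothetical container patterns: each name maps to a sequence of other
# pattern names and literal separators (an assumed representation).
CONTAINER_PATTERNS = {
    "ENDPOINT": ["IP", ":", "INT"],
}


def expand(name):
    """Return the regular expression text for a normal or container pattern."""
    if name in NORMAL_PATTERNS:
        return NORMAL_PATTERNS[name]
    pieces = []
    for item in CONTAINER_PATTERNS[name]:
        if item in NORMAL_PATTERNS or item in CONTAINER_PATTERNS:
            pieces.append(expand(item))     # nested pattern, expanded recursively
        else:
            pieces.append(re.escape(item))  # literal separator
    return "".join(pieces)


def matches(name, text):
    """True if the whole text is matched by the named pattern."""
    return re.fullmatch(expand(name), text) is not None
```

With these assumed definitions, `matches("ENDPOINT", "10.0.0.1:8080")` is true, while `matches("ENDPOINT", "10.0.0.1")` is false because the port part is missing.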
In the above embodiment, as shown in Fig. 2 and Fig. 7, creating the parser in step S20 comprises the following steps:
S21, creating a main parse tree;
S22, adding at least one parsing node, side by side as tree nodes, under the root of the main parse tree, and automatically generating one "other" node alongside the parsing nodes;
S23, starting, on each parsing node, a classification unit that classifies data and a parsing unit that parses the classified data; starting the parsing unit on the other node;
S24, in each classification unit, calling at least one pattern and editing it, based on at least one positive rule (Positive Rule) and/or negative rule (Negative Rule), to form the classification rule; data that satisfies the classification rule of the classification unit on any parsing node is assigned to that node, and data that satisfies none is assigned to the other node;
S25, in the parsing unit, calling at least one pattern and editing it to form the parsing rule, which is used to parse the classified data; when the user does not know which pattern to drag into the parsing rule, the user can select part of a sample log line, and the system automatically recommends one or more patterns that match the selected part for the user to choose from.
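The pattern recommendation mentioned in step S25 could, for instance, work by testing every pattern in the library against the selected text. The sketch below assumes that "matching" means the pattern covers the whole selection; the `PATTERNS` table and the `recommend` function are hypothetical names introduced only for illustration.

```python
import re

# Hypothetical pattern library (names and expressions are illustrative only).
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "INT": r"\d+",
    "WORD": r"\w+",
}


def recommend(selection, patterns=PATTERNS):
    """Return the names of all patterns that match the entire selected text."""
    return [name for name, regex in patterns.items()
            if re.fullmatch(regex, selection)]
```

Under these assumptions, selecting `10.0.0.1` in a sample log line would suggest only `IP`, while selecting `8080` would suggest both `INT` and `WORD`, leaving the final choice to the user.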
As a supplementary note to steps S22 to S25: after at least one parsing node has been added side by side under the root of the main parse tree, the classification rules of the parsing nodes differ from one another, so that the sample data is classified in different ways; the parsing rules of the parsing nodes may be the same or different. Within a single parsing node, however, the classification rule is edited first in the classification unit to perform a preliminary classification of the sample data, and the parsing rule is then edited in the parsing unit of that node to further parse the classified sample data.
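The classification behaviour of steps S22 to S24 can be sketched as follows. The text does not state how several positive and negative rules combine, so this sketch assumes that a line is accepted when every positive rule is found and no negative rule is found, and that nodes are tried in order; `ParseNode`, `classify`, and the example rules are hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ParseNode:
    name: str
    positive: list = field(default_factory=list)  # regexes that must all be found (assumed semantics)
    negative: list = field(default_factory=list)  # regexes that must not be found (assumed semantics)

    def accepts(self, line):
        """Apply the node's classification rule to one log line."""
        return (all(re.search(p, line) for p in self.positive)
                and not any(re.search(n, line) for n in self.negative))


def classify(line, nodes):
    """Return the name of the first node that accepts the line, else 'other'."""
    for node in nodes:
        if node.accepts(line):
            return node.name
    return "other"  # stands in for the automatically generated other node


# Hypothetical parsing nodes under the root of a main parse tree.
NODES = [
    ParseNode("ssh", positive=[r"sshd"], negative=[r"DEBUG"]),
    ParseNode("http", positive=[r"HTTP/1\.[01]"]),
]
```

For example, `classify("sshd[42]: Accepted password", NODES)` yields `"ssh"`, whereas a line containing both `sshd` and `DEBUG` is rejected by the negative rule and falls through to `"other"`.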
As a preferred form of the above embodiment, as shown in Fig. 3 and Fig. 7, creating the parser further comprises the steps of:
S26, adding, as tree nodes, at least one sub parse tree alongside the main parse tree;
S27, adding parsing nodes, classification units, and parsing units to each sub parse tree in the same way as for the main parse tree, with the other node generated automatically;
S28, starting the classification unit and the parsing unit in turn to classify and parse the data.
In this embodiment, sub parse trees are created mainly to keep parsing in the main parse tree from becoming too complex. As an example of how classification and parsing differ between a sub parse tree and the main parse tree: suppose the sample data contains at least one series of related data records that share the same features. After the data has been sent to the main parse tree for classification and parsing, the user decides, based on the parsed data, whether further classification is needed and creates at least one sub parse tree; the data that needs further parsing is then sent to the sub parse tree for further classification and parsing instead of continuing through the main parse tree. This lightens the parsing load on the main parse tree and also allows this series of features to be parsed separately, specifically, and accurately for subsequent analysis, without affecting the complete display of the parsing results.
As a preferred form of the above embodiment, as shown in Fig. 4 and Fig. 7, adding the parsing node in step S22 further comprises the following steps:
S221, adding at least one parsing child node, side by side as tree nodes, under the parsing node, and starting the classification unit on each parsing child node to classify data; when a parsing child node is added, an other node alongside the parsing child nodes is generated automatically;
S222, moving the parsing unit of the parsing node to each of the added parsing child nodes, where parsing is performed.
In this embodiment, different parsing nodes under the same parse tree (main parse tree or sub parse tree) can each perform a preliminary classification of the sample data based on a different classification rule and then parse it separately. Adding at least one parsing child node, side by side as tree nodes, under a parsing node therefore serves to further refine the classification of the sample data assigned to that parsing node and to parse it accordingly.
In addition, once parsing child nodes have been set up, the parsing node keeps only its classification unit, which performs the preliminary classification; the parsing unit of the parsing node is moved to each of the added parsing child nodes, so the parsing node itself no longer has a parsing unit. After the parsing child nodes have refined the classification of the preliminarily classified sample data, the parsing units that were moved to the parsing child nodes carry out the subsequent parsing. By analogy, if further child nodes are added, the parsing unit moves successively from the parsing node through the parsing child nodes to the child nodes at the last level.
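The effect of moving the parsing unit down to the last-level child nodes can be sketched as a tree walk in which inner nodes only classify and leaves parse. The sketch assumes that every list of children ends with a catch-all node standing in for the automatically generated other node; the `Node` class, the `route` and `parse` functions, and the example tree are hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    match: str = ""   # classification regex; empty means catch-all ("other" node)
    parse: str = ""   # parsing regex with named groups; only used on leaf nodes
    children: list = field(default_factory=list)


def route(node, line):
    """Descend by classification until a node without children is reached."""
    while node.children:
        # Assumes the last child is a catch-all, so a match is always found.
        node = next(c for c in node.children
                    if not c.match or re.search(c.match, line))
    return node


def parse(root, line):
    """Classify down to a leaf, then apply that leaf's parsing rule."""
    leaf = route(root, line)
    m = re.search(leaf.parse, line) if leaf.parse else None
    return leaf.name, (m.groupdict() if m else {})


# Hypothetical tree: the "ssh" node only classifies; its children parse.
ROOT = Node("root", children=[
    Node("ssh", match=r"sshd", children=[
        Node("ssh-login", match=r"Accepted",
             parse=r"for (?P<user>\w+) from (?P<ip>[\d.]+)"),
        Node("ssh-other"),
    ]),
    Node("other"),
])
```

With this tree, the line `sshd[1]: Accepted password for alice from 10.0.0.5` is routed through `ssh` to `ssh-login` and yields the fields `user` and `ip`, while any other `sshd` line ends up in `ssh-other` with no extracted fields.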
As a preferred form of the above embodiment, creating the parser further comprises the following steps:
On the main parse tree, the sub parse trees, the parsing nodes, the other nodes, the parsing child nodes, and their child nodes:
Editing the name;
Editing and displaying the data type currently being parsed;
Editing and displaying the creation time;
Editing and displaying the update time;
Editing labels added for identification; and
Modifying, editing, and deleting the parser.
In this embodiment, names are edited for identification and differentiation, and the named parsers are presented as a list. The form and content of the labels used for identification depend on individual needs and are not specifically limited by the present invention. As for modifying, editing, and deleting the parser: modifying means changing the name, data type, and labels of the corresponding parser; editing means jumping to the normal pattern unit, container pattern unit, and rule editing bar of the parser rule editing GUI 30, dragging at least one selected pattern into the rule editing bar and editing it to form a classification rule or parsing rule, and displaying the field value names in the sample data that correspond to the parsing rule; deleting means deleting sub parse trees, parsing nodes, parsing child nodes, child nodes, and so on.
As a preferred form of the above embodiment, creating the parser further comprises the following steps:
On the main parse tree, the sub parse trees, the parsing nodes, and the parsing child nodes:
Copying a node from one parse tree and pasting it into another parse tree to create a new node, or copying a node and pasting it within the same parse tree to create a new node;
Adding nodes as tree nodes.
In this embodiment, a node is copied from one parse tree and pasted into another parse tree to create a new node, or a node is copied and pasted within the same parse tree to create a new node; in this way, the relative positions of nodes can be changed.
To supplement the explanation above: the numbers of parse trees, parsing nodes, and parsing child nodes, as well as the number of levels of child nodes added step by step as tree nodes and the number of nodes at each level, depend on the classification requirements of the sample data and are not specifically limited by the present invention.
As a preferred form of the above embodiment, as shown in Fig. 5, editing the classification rule or parsing rule in step S40 comprises the following steps:
S41, setting up a rule editing bar;
S42, when editing a classification rule, selecting at least one positive rule and/or negative rule, then calling at least one pattern and editing it to form the classification rule, and applying it to the corresponding data to classify the data;
S43, when editing a parsing rule, selecting at least one pattern, dragging it into the rule editing bar, and editing it to form the parsing rule, and applying it to the corresponding data to parse the data.
In this embodiment, Fig. 6 is a schematic diagram of classification rule and parsing rule editing. In addition, step S40 preferably defines container patterns, which are used to edit complex rules for parsing complex logs; a container pattern is a rule element in which rules formed from container patterns themselves and/or normal patterns can be edited. After the sample data has been parsed by the parser described above, the fields corresponding to the classification rules and parsing rules are obtained, and the field value names of at least one corresponding data record are displayed. To distinguish the results of different parsing rules, the field value names extracted by different parsing rules are highlighted in different colors, so that every change in the parsed content of the sample data can be seen at a glance, giving the developer an interactive design experience.
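One way to picture how a parsing rule assembled from patterns yields named field values is to compile the sequence of patterns into a single regular expression with named groups. This is a sketch under assumptions: the rule representation (pairs of pattern name and field name, interleaved with literal text), the `PATTERNS` table, and the functions `compile_rule` and `extract` are all hypothetical and do not describe the invention's internal format.

```python
import re

# Hypothetical pattern library.
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "INT": r"\d+",
    "WORD": r"\w+",
}


def compile_rule(rule):
    """Build one regex from (pattern name, field name) pairs and literal strings."""
    parts = []
    for item in rule:
        if isinstance(item, tuple):
            pattern, field_name = item
            parts.append(f"(?P<{field_name}>{PATTERNS[pattern]})")
        else:
            parts.append(re.escape(item))  # literal text between fields
    return re.compile("".join(parts))


def extract(rule, line):
    """Return a dict of field name -> value, or None if the line does not match."""
    m = compile_rule(rule).fullmatch(line)
    return m.groupdict() if m else None


# Hypothetical parsing rule, as it might sit in the rule editing bar.
RULE = [("IP", "client"), " ", ("WORD", "method"), " ", ("INT", "status")]
```

Applied to the line `10.0.0.5 GET 200`, this rule gives the fields `client`, `method`, and `status`; a display layer could then color each named value differently, as the embodiment describes.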
As a preferred form of the above embodiment, after the field value names of the parsed sample data are displayed, the method further comprises the step of:
Downloading, storing, deploying, and reusing the parser: from the GUI, the parser is sent to the actual use environment for deployment, and the parsing results are passed to the next step of data analysis.
In this embodiment, after the field value names of at least one data record corresponding to the at least one pattern placed in the rule editing bar have been displayed, a click in the GUI sends the parser to the actual use environment for deployment, and the parsing results are passed to the next step of data analysis. Alternatively, a click stores the parser as a local file; if a similar parsing requirement arises later, the parser can be uploaded again for use.
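Storing a parser as a local file and loading it again later could look like the sketch below. The text does not specify a file format, so JSON is used here purely as an assumption, and the parser layout, `save_parser`, and `load_parser` are hypothetical.

```python
import json
import os
import tempfile


def save_parser(parser, path):
    """Write a parser definition (a plain dict) to a local JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(parser, f, ensure_ascii=False, indent=2)


def load_parser(path):
    """Read a parser definition back from a local JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def round_trip(parser):
    """Save to a temporary file and load again, returning the loaded copy."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "demo.parser.json")
        save_parser(parser, path)
        return load_parser(path)


# Hypothetical parser definition with one parsing node.
DEMO_PARSER = {
    "name": "demo",
    "nodes": [
        {"name": "ssh", "positive": ["sshd"], "negative": [],
         "parse": r"for (?P<user>\w+)"},
    ],
}
```

Because the definition is plain data, a reloaded parser compares equal to the original, which is what makes reuse for a similar parsing requirement straightforward.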
It should be added that applying a parsing rule requires the rule to match every line of data record in the sample data completely, which means that if the rule contains any error, the matching process fails. This is the main shortcoming of parsing systems based on regular expressions: a very long rule contains dozens of patterns (equivalent to a regular expression of several hundred to several thousand characters), and such a regular expression is hard to develop or debug effectively. The parser provided by the invention, by contrast, allows long rules to be developed quickly. When a long rule is needed, the developer does not have to write the whole rule in one go; it is enough to place a matching pattern at the start of the sample data, and all of the remaining sample data is automatically shown in grey, indicating that it still needs to be parsed. If there is an error along the way, the pattern and the text near the error are also automatically shown in grey. The data parser provided by the invention can therefore be used to parse structured, semi-structured, and unstructured data, for example log data obtained from computer servers, network devices, software applications, and database systems.
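The incremental behaviour described above, where the matched start of a line is accepted and the remainder is greyed out, can be sketched as prefix matching that stops at the first pattern that fails. The function name `match_prefix` and the example patterns are hypothetical; how the GUI actually renders the unmatched remainder is not shown.

```python
import re


def match_prefix(patterns, line):
    """Apply regexes in order from the start of the line.

    Returns (matched_pieces, remainder). Matching stops at the first pattern
    that fails, so the remainder is the text that still needs to be parsed
    (the part that would be shown in grey).
    """
    pos, pieces = 0, []
    for regex in patterns:
        m = re.compile(regex).match(line, pos)
        if m is None:
            break
        pieces.append(m.group())
        pos = m.end()
    return pieces, line[pos:]
```

For the line `2017-08-29 ERROR disk full`, a rule consisting of a date pattern, a space, and a word pattern matches `2017-08-29`, ` `, and `ERROR` and leaves ` disk full` as the remainder; if the second pattern were wrong, only the date would be matched and everything after it would remain to be parsed.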
The method provided by the invention for extracting structured data from unstructured data is based on tree nodes and a graphical user interface. After the sample data to be parsed is uploaded, at least one pattern in the parser database is called and edited to form parsing rules, which are saved in the parser to form the parser; the fields in the sample data that correspond to the parsing rules are then selected and displayed. The whole parsing process is carried out through a graphical interface, so operation is flexible, no program files need to be edited directly, and development, updating, and maintenance are made easier. Because operations such as creating parse trees, nodes of various types, and labels, editing the classification rules that classify data at each node, and deleting nodes are provided, staff can work out the data classification incrementally as development proceeds and can modify any part at any time during development without stalling the development process. Not every decision in the development process has to be made in advance, which benefits the development and maintenance of parsers, and especially their continued optimization and updating later on.
Although embodiments of the present invention have been disclosed above, the invention is not limited to the uses listed in the specification and the embodiments. It can be applied in any field to which it is suited, and further modifications can readily be made by those skilled in the art. Therefore, without departing from the general concept defined by the claims and their equivalents, the invention is not limited to the specific details or to the examples shown and described here.

Claims (8)

  1. A method for extracting structured data from unstructured data, characterized in that it comprises the following steps:
    Building a pattern library that contains a number of patterns written as regular expressions;
    Creating an empty parser through graphical operations;
    Taking part of the data to be parsed as sample data and determining the parsing requirements;
    According to the parsing requirements, calling at least one pattern and editing, in turn, classification rules and parsing rules to classify and parse the sample data, the classification rules and the parsing rules being stored in the parser;
    Displaying the field value names of the sample data extracted by the parsing rules;
    Deploying the parser to the actual use environment from the GUI, with the parsing results output to the next step of data analysis.
  2. The method for extracting structured data from unstructured data according to claim 1, characterized in that creating the parser comprises the following steps:
    Creating a main parse tree;
    Adding at least one parsing node, side by side as tree nodes, under the root of the main parse tree, and automatically generating one "other" node alongside the parsing nodes;
    Starting, on each parsing node, a classification unit that classifies data and a parsing unit that parses the classified data; starting the parsing unit on the other node;
    In each classification unit, calling at least one pattern and editing it, based on at least one positive rule and/or negative rule, to form the classification rule; data that satisfies the classification rule of the classification unit on any parsing node is assigned to that node, and data that satisfies none is assigned to the other node;
    In the parsing unit, calling at least one pattern and editing it to form the parsing rule, which is used to parse the classified data.
  3. The method for extracting structured data from unstructured data according to claim 2, characterized in that creating the parser further comprises the steps of:
    Adding, as tree nodes, at least one sub parse tree alongside the main parse tree;
    Adding parsing nodes, classification units, and parsing units to each sub parse tree in the same way as for the main parse tree, with the other node generated automatically;
    Starting the classification unit and the parsing unit in turn to classify and parse the data.
  4. The method for extracting structured data from unstructured data as claimed in claim 2 or 3, characterized in that adding the parsing node further comprises the following steps:
    adding, side by side as tree nodes on the parsing node, at least one parsing child node, and starting the classification unit on the parsing child node to classify data;
    moving the resolution unit of the parsing node to each correspondingly added parsing child node to perform parsing;
    wherein, when the parsing child node is added, an other node arranged side by side with the parsing child node is automatically generated;
    the resolution unit of the parsing child node is moved to the corresponding last-level child node to perform parsing.
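One way to picture the nesting in this claim is a recursive tree in which classification happens at every level and parsing happens only at the last-level nodes. The sketch below is an assumption-laden illustration: the `Node` class, its field names, and the choice to hand the parent's existing parsing rule to the generated "other" child are hypothetical readings, not details given in the claim.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    classify: str = ""   # regex used for classification ("" for an "other" node)
    parse: str = ""      # regex used for parsing; only meaningful on leaf nodes
    children: list = field(default_factory=list)

    def add_child(self, child):
        # Adding the first child also generates a sibling "other" node.
        # Assumption of this sketch: the parent's parsing rule moves down to it.
        if not self.children:
            self.children.append(Node("other", parse=self.parse))
            self.parse = ""
        # Keep the "other" node as the last sibling.
        self.children.insert(len(self.children) - 1, child)

    def dispatch(self, line, path=()):
        path = path + (self.name,)
        if not self.children:
            # Last-level node: this is where parsing actually runs.
            m = re.search(self.parse, line) if self.parse else None
            return path, (m.groupdict() if m else {})
        for child in self.children[:-1]:
            if re.search(child.classify, line):
                return child.dispatch(line, path)
        return self.children[-1].dispatch(line, path)  # fall back to "other"

web = Node("web", classify=r"nginx", parse=r"(?P<status>\d{3})$")
web.add_child(Node("errors", classify=r" 5\d\d$", parse=r"(?P<status>5\d\d)$"))
print(web.dispatch("nginx GET /x 502"))
print(web.dispatch("nginx GET /x 200"))
```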
  5. The method for extracting structured data from unstructured data as claimed in claim 4, characterized in that establishing the resolver further comprises the following steps:
    on the main analytic tree, the sub analytic tree, the parsing node, the other node, the parsing child node and the child nodes thereof:
    editing the name;
    editing and displaying the data type currently parsed;
    editing and displaying the creation time;
    editing and displaying the refresh time;
    editing labels added for identification; and
    performing modify, edit and delete operations on the resolver.
  6. The method for extracting structured data from unstructured data as claimed in claim 4, characterized in that establishing the resolver further comprises the following steps:
    on the main analytic tree, the sub analytic tree, the parsing node and the parsing child node:
    copying a node on one analytic tree and pasting it onto another analytic tree as a newly created node, or copying a node within an analytic tree and pasting it into the same analytic tree as a newly created node;
    adding nodes as tree nodes.
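The copy-and-paste operation in this claim amounts to cloning a node together with its subtree and rules, then attaching the clone as a new node. A minimal sketch, assuming nodes are plain dictionaries; the function name `paste_copy` and the dictionary keys are hypothetical.

```python
import copy

def paste_copy(source_node, target_tree, new_name):
    """Deep-copy a node (with its subtree and rules) and attach it as a new node."""
    clone = copy.deepcopy(source_node)
    clone["name"] = new_name
    target_tree["children"].append(clone)
    return clone

main_tree = {"name": "main", "children": [
    {"name": "login", "rules": ["user", "ipv4"], "children": []},
]}
sub_tree = {"name": "sub", "children": []}

# Paste onto another tree, then edit the copy independently of the original.
pasted = paste_copy(main_tree["children"][0], sub_tree, "login-copy")
pasted["rules"].append("level")
print(main_tree["children"][0]["rules"], pasted["rules"])
```

A deep copy is used so that later edits to the pasted node's rules do not leak back into the node it was copied from.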
  7. The method for extracting structured data from unstructured data as claimed in claim 1, characterized in that editing the classifying rules or the resolution rules comprises the following steps:
    setting up a rule editing bar;
    if classifying rules are edited, selecting at least one F-rule and/or B-rule, then calling at least one pattern and editing it to form the classifying rules, and applying the classifying rules to the corresponding data to classify the data;
    if resolution rules are edited, selecting at least one pattern, dragging it to the rule editing bar and editing it to form the resolution rules, and applying the resolution rules to the corresponding data to parse the data.
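Forming a resolution rule from selected patterns can be illustrated as joining regular-expression fragments in the order they were placed. This is only a sketch under assumptions: the `PATTERNS` entries, the `build_rule` helper, and the whitespace separator are hypothetical and stand in for whatever the graphical editor produces.

```python
import re

# Hypothetical pattern library entries, each written as a named regex group.
PATTERNS = {
    "date": r"(?P<date>\d{4}-\d{2}-\d{2})",
    "level": r"(?P<level>INFO|WARN|ERROR)",
    "message": r"(?P<message>.+)",
}

def build_rule(pattern_names, separator=r"\s+"):
    """Join the selected patterns, in order, into one compiled regular expression."""
    return re.compile(separator.join(PATTERNS[n] for n in pattern_names))

# Equivalent of placing three patterns in sequence and applying the rule to data.
rule = build_rule(["date", "level", "message"])
match = rule.search("2017-08-29 ERROR disk quota exceeded")
fields = match.groupdict() if match else {}
print(fields)
```

Named groups make each selected pattern map directly to one output field, which is what lets the parsed sample be shown as field names with values.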
  8. The method for extracting structured data from unstructured data as claimed in any one of claims 1-7, characterized in that displaying the field value names of the sample data parsed by the resolution rules further comprises the step of:
    downloading, storing, deploying and reusing the resolver, wherein the resolver is sent from the GUI to the actual use environment for deployment, and the parsing results are output to the next step of data analysis.
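Downloading, storing and redeploying a resolver implies that its rules can be serialized and rebuilt elsewhere. The sketch below assumes a JSON representation; the claim does not name a storage format, and the dictionary layout and the functions `export_resolver`, `load_resolver` and `run` are hypothetical.

```python
import json
import re

# Hypothetical resolver definition; the field names are assumptions of this sketch.
resolver = {
    "name": "auth-log",
    "nodes": [
        {"name": "login", "classify": r"user=", "parse": r"user=(?P<user>\w+)"},
    ],
    "other": {"name": "other", "parse": r"(?P<raw>.+)"},
}

def export_resolver(definition):
    """Serialize the resolver so it can be downloaded, stored, or shipped."""
    return json.dumps(definition, indent=2, sort_keys=True)

def load_resolver(text):
    """Rebuild the resolver definition in the target environment."""
    return json.loads(text)

def run(definition, line):
    """Classify the line, parse it, and return a record for the next analysis step."""
    for node in definition["nodes"]:
        if re.search(node["classify"], line):
            target = node
            break
    else:
        target = definition["other"]
    match = re.search(target["parse"], line)
    return {"node": target["name"], "fields": match.groupdict() if match else {}}

deployed = load_resolver(export_resolver(resolver))
print(run(deployed, "INFO user=alice logged in"))
```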
CN201710757615.2A 2017-08-29 2017-08-29 A kind of method from unstructured data extraction structural data Pending CN107577460A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710757615.2A CN107577460A (en) 2017-08-29 2017-08-29 A kind of method from unstructured data extraction structural data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710757615.2A CN107577460A (en) 2017-08-29 2017-08-29 A kind of method from unstructured data extraction structural data

Publications (1)

Publication Number Publication Date
CN107577460A true CN107577460A (en) 2018-01-12

Family

ID=61030481

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710757615.2A Pending CN107577460A (en) 2017-08-29 2017-08-29 A kind of method from unstructured data extraction structural data

Country Status (1)

Country Link
CN (1) CN107577460A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108255494A (en) * 2018-01-30 2018-07-06 平安科技(深圳)有限公司 A kind of XML file analytic method, device, computer equipment and storage medium
CN110532760A (en) * 2019-08-12 2019-12-03 广州海颐信息安全技术有限公司 Compatible structure and unstructured privilege threaten the method and device of behavioral data
CN112269818A (en) * 2020-11-25 2021-01-26 成都数之联科技有限公司 Method, system, device and medium for positioning device parameter root cause
CN113377419A (en) * 2021-05-31 2021-09-10 同盾科技有限公司 Business processing method and device, readable storage medium and electronic equipment
CN114862631A (en) * 2021-01-20 2022-08-05 台达电子国际(新加坡)私人有限公司 Interactive recording and analyzing method
CN117389569A (en) * 2023-10-26 2024-01-12 重庆猪哥亮科技有限责任公司 Program interpretation execution method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8635551B1 (en) * 2004-06-01 2014-01-21 The United States Of America, As Represented By The Secretary Of The Army Graphic user interface and software for processing large size signal data samples in a small buffer using automatically adjusted decimation ratio
CN106294673A (en) * 2016-08-08 2017-01-04 杭州玳数科技有限公司 A kind of method and system of User Defined rule real time parsing daily record data
CN106776995A (en) * 2016-12-06 2017-05-31 北京神舟航天软件技术有限公司 A kind of tree-like acquisition technique of structural data based on MDA

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Liu Yan (刘琰): "A tree-based Whois document parsing method" (一种基于树的Whois文档解析方法), Application Research of Computers (《计算机应用研究》) *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108255494A (en) * 2018-01-30 2018-07-06 平安科技(深圳)有限公司 A kind of XML file analytic method, device, computer equipment and storage medium
WO2019148671A1 (en) * 2018-01-30 2019-08-08 平安科技(深圳)有限公司 Xml file parsing method, device, computer apparatus, and storage medium
CN110532760A (en) * 2019-08-12 2019-12-03 广州海颐信息安全技术有限公司 Compatible structure and unstructured privilege threaten the method and device of behavioral data
CN112269818A (en) * 2020-11-25 2021-01-26 成都数之联科技有限公司 Method, system, device and medium for positioning device parameter root cause
CN112269818B (en) * 2020-11-25 2023-11-21 成都数之联科技股份有限公司 Equipment parameter root cause positioning method, system, device and medium
CN114862631A (en) * 2021-01-20 2022-08-05 台达电子国际(新加坡)私人有限公司 Interactive recording and analyzing method
CN113377419A (en) * 2021-05-31 2021-09-10 同盾科技有限公司 Business processing method and device, readable storage medium and electronic equipment
CN117389569A (en) * 2023-10-26 2024-01-12 重庆猪哥亮科技有限责任公司 Program interpretation execution method

Similar Documents

Publication Publication Date Title
CN107577460A (en) A kind of method from unstructured data extraction structural data
US12106095B2 (en) Deep learning-based java program internal annotation generation method and system
CN104199871B (en) A kind of high speed examination question introduction method for wisdom teaching
WO2005010727A2 (en) Extracting data from semi-structured text documents
Huang et al. Learning human-written commit messages to document code changes
CN107273117A (en) A kind of quick Code automatic build system for programming friendly
US20240086647A1 (en) Artificial intelligence-enabled system and method for authoring a scientific document
CN106648819A (en) Internationalized code conversion method based on editor
Almonaies et al. A framework for migrating web applications to web services
CN107622093A (en) A kind of system from unstructured data extraction structural data
Akundi et al. Text-to-model transformation: natural language-based model generation framework
Sajjad et al. NLP based verification of a UML class model
CN118966343A (en) Question and answer knowledge base construction method, device, equipment and storage medium
CN110533143B (en) Method and device for generating electronic card, storage medium and computer equipment
CN105930453A (en) Repeatability analyzing method and device
CN115358477A (en) Random generation system for battle scenario and application thereof
CN112540753A (en) Case feature analysis method
Sneed et al. Reverse engineering a visual age application
Nabais et al. SSD2 and FoodEx2 compliant real‐time registration and classification of food sampling data‐Improving Data Quality for Risk Assessment (IDRisk)
Afreen et al. An Intelligent Approach for CRC Models Based Agile Software Requirement Engineering Using SBVR
Thottempudi A visual narrative of ramayana using extractive summarization topic modeling and named entity recognition
Boronat A comparison of HTML-aware tools for Web Data extraction
Xia Construction of a Metal Defect Knowledge Graph Using Deep Learn-ing and Text Mining
Alvarez-Garcia et al. Streamlining Text Pre-Processing and Metrics Extraction
CN112445391B (en) Service data generation method, device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180112