Disclosure of Invention
The technical task of the invention is, in view of the above shortcomings, to provide a method and a system for generating a multimodal report based on a large language model and FREEMARKER. The invention addresses the single output form, poor presentation and shallow information hierarchy of reports generated by a traditional large language model, thereby improving the acceptance of the generated report in professional fields and meeting the demand of various industries for high-quality, diversified reports.
The technical solution adopted to solve this technical problem is as follows:
A method of generating a multimodal report based on a large language model and FREEMARKER, the method comprising the following steps:
Step 1, professional data preprocessing, which comprises collecting, cleaning and formatting the data to be analyzed;
Step 2, concatenating the preprocessed data into a prompt, and performing data analysis and text generation with a large language model;
Step 3, parsing the output of the model to obtain the corresponding analysis conclusions, data statistics, reference documents and the like, and, with the help of object storage, constructing multiple forms of presentation for the content, including tables, pictures, videos, hyperlinks and the like;
Step 4, designing a report template in FREEMARKER, wherein the report template comprises a preset dynamic catalog, table placeholders, picture and video embedding areas, hyperlink positions and the like;
Step 5, filling the content generated by the large language model into the FREEMARKER template, formatting and typesetting it, and outputting and displaying the report content.
The method uses a large language model for data analysis and text generation, and uses the FREEMARKER template function to organize and display the multimodal content of the report.
Further, the method for preprocessing professional data comprises the following steps:
Data preprocessing: cleaning the original professional data, converting it into Chinese and so on, so as to reduce the understanding deviation of the large language model, and converting the data set into a computer-processable form;
Keyword recognition: using natural language processing (NLP) technology to recognize the key values/keywords in the data set, including the relevant names, categories, degrees, areas and the like, and identifying the meaning corresponding to each data item to form a data dictionary;
Formatting: converting the preprocessed data set into a computer-processable format such as JSON, constructing a prompt, and inputting it into the LLM for analysis and processing.
Further, the FREEMARKER template design allows the user to customize the layout and style of the report to meet the requirements of different industries and application scenarios.
Further, the FREEMARKER template design further includes:
Designing the template layout, including defining the page size, margins, font style, font size and the like, and setting the title, subtitles and footer of the report;
Setting a dynamic catalog, including links to chapter titles and subtitles, so that a user can quickly jump to the corresponding part of the report by clicking a catalog item;
Configuring placeholders for multimodal content such as tables, pictures, videos and hyperlinks.
Further, the method for presetting the dynamic catalog is as follows:
The dynamic catalog algorithm automatically identifies the chapters and sub-chapters in the report and generates a structured catalog, so that a user can quickly navigate to different parts of the report;
WIN32 is used to update the catalog field of the generated report, ensuring that the page numbers in the catalog are consistent with the page numbers of the dynamic report content.
Furthermore, setting the dynamic catalog: in the FREEMARKER template, the dynamic catalog is loaded using specific FREEMARKER syntax, by means such as Servlet context loading or Spring Boot integration; the catalog is generated automatically from the content of the report and comprises links to chapter titles and subtitles, so that a user can quickly jump to the corresponding part of the report by clicking a catalog item;
Configuring table placeholders: reserving the positions of tables in the report template, generating the tables dynamically with the FREEMARKER list and loop directives, and filling in the table data automatically from the data analysis results generated by the large language model;
Embedding pictures and videos: reserving placeholders for pictures and videos in the template, storing the link addresses of the relevant pictures and videos output by the model in object storage, obtaining their remote URLs (uniform resource locators), and embedding these URLs into the report through the URL processing instructions of FREEMARKER;
Adding hyperlinks: hyperlinks are added to the text of the report using the FREEMARKER link instructions.
Further, the text content generated by the large language model is filled into the positions specified in the FREEMARKER template; the FREEMARKER template engine parses the template file, replaces the placeholders with the actual content, and applies the preset formatting rules to generate the final report document;
The method for outputting and displaying the report content is as follows:
Object storage such as Minio is used to store the pictures, videos, reference documents and the like in the report; it offers high scalability and reliability and ensures fast remote access to, and efficient storage of, the report.
The invention also claims a system for generating a multimodal report based on a large language model and FREEMARKER, comprising:
The professional data preprocessing module is used for collecting, cleaning and formatting data to be analyzed;
The large language model processing module concatenates the preprocessed data into a prompt template and uses a large language model to perform data analysis and text generation;
The content analysis module parses the output of the model to obtain the corresponding analysis conclusions, data statistics, reference documents and the like, and, with the help of object storage, constructs multiple forms of presentation for the content, including tables, pictures, videos, hyperlinks and the like;
The FREEMARKER template custom module designs a report template in FREEMARKER, wherein the report template comprises a preset dynamic catalog, a table placeholder, a picture and video embedded area, a hyperlink position and the like;
The output and display module is used for filling the content generated by the large language model into a FREEMARKER template, formatting and typesetting the content, and outputting and displaying the report content;
the system realizes the generation of the multi-modal report by the method for generating the multi-modal report based on the large language model and FREEMARKER.
The invention also claims an apparatus for generating a multimodal report based on a large language model and FREEMARKER, comprising at least one memory and at least one processor;
the at least one memory is configured to store a machine-readable program;
The at least one processor is configured to invoke the machine-readable program to implement the method described above.
The invention also claims a computer readable medium having stored thereon computer instructions which, when executed by a processor, cause the processor to perform the above-described method.
Compared with the prior art, the method and the system for generating a multimodal report based on a large language model and FREEMARKER have the following beneficial effects:
The invention provides a report generation method that combines a large language model with the flexible FREEMARKER template technology to make the report content multimodal. Compared with existing methods that generate a report with a large language model alone, the generated report contains not only text but also embedded content in several modalities, such as a dynamic catalog, tables, pictures, videos and hyperlinks, which enriches the presentation and the information hierarchy of the report. Compared with manual writing, the working time is greatly shortened. Through the dynamic analysis capability of the large language model and the fixed template structure of FREEMARKER, the invention achieves intelligent generation and customized display of report content, and is suitable for various industries and application scenarios.
Detailed Description
The invention will be further described with reference to the drawings and the specific examples.
The embodiment of the invention provides a method for generating a multi-modal report based on a large language model and FREEMARKER, which comprises the following steps:
Step 1, professional data preprocessing, which comprises collecting, cleaning and formatting the data to be analyzed;
Step 2, concatenating the preprocessed data into a prompt, and performing data analysis and text generation with a large language model;
Step 3, parsing the output of the model to obtain the corresponding analysis conclusions, data statistics, reference documents and the like, and, with the help of object storage, constructing multiple forms of presentation for the content, including tables, pictures, videos, hyperlinks and the like;
Step 4, designing a report template in FREEMARKER, wherein the report template comprises a preset dynamic catalog, table placeholders, picture and video embedding areas, hyperlink positions and the like;
Step 5, filling the content generated by the large language model into the FREEMARKER template, formatting and typesetting it, and outputting and displaying the report content.
The report template is designed in FREEMARKER, and FREEMARKER template design allows users to customize the layout and style of the report to meet the requirements of different industries and application scenarios. The FREEMARKER template design further includes:
Designing the template layout, including defining the page size, margins, font style, font size and the like, and setting the title, subtitles and footer of the report;
Setting a dynamic catalog, including links to chapter titles and subtitles, so that a user can quickly jump to the corresponding part of the report by clicking a catalog item;
Configuring placeholders for multimodal content such as tables, pictures, videos and hyperlinks.
The dynamic catalog algorithm automatically identifies the chapters and sub-chapters in the report and generates a structured catalog, so that a user can quickly navigate to different parts of the report;
WIN32 is used to update the catalog field of the generated report, ensuring that the page numbers in the catalog are consistent with the page numbers of the dynamic report content.
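The chapter identification described above can be sketched as follows. This is a minimal illustration rather than the algorithm of the invention: it assumes the report body is already available as a list of (level, title) headings, and the function name `build_catalog` and the `sec-` anchor format are hypothetical.

```python
from typing import Dict, List, Tuple


def build_catalog(headings: List[Tuple[int, str]]) -> List[Dict[str, str]]:
    """Number chapters and sub-chapters and give each one a link anchor.

    `headings` is an assumed input: (level, title) pairs in document order,
    where level 1 is a chapter and level 2 is a sub-chapter.
    """
    counters: List[int] = []
    catalog = []
    for level, title in headings:
        # Resize the counter stack so that it has exactly `level` entries.
        while len(counters) < level:
            counters.append(0)
        del counters[level:]
        counters[level - 1] += 1
        number = ".".join(str(n) for n in counters)
        catalog.append({
            "number": number,
            "title": title,
            "anchor": "sec-" + number.replace(".", "-"),
        })
    return catalog


# Hypothetical headings for a road disease report.
catalog = build_catalog([
    (1, "Overview"),
    (2, "Data sources"),
    (2, "Disease statistics"),
    (1, "Conclusions"),
])
```

Each entry carries the numbering, the title and an anchor, which is the information a template needs in order to render clickable catalog items.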
When the report content is output and displayed, object storage such as Minio is used to store the pictures, videos, reference documents and the like in the report; it offers high scalability and reliability and ensures fast remote access to, and efficient storage of, the report.
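As a rough illustration of how a stored object becomes a remote URL that a template can embed, the sketch below builds a path-style URL of the form endpoint/bucket/key. It assumes a publicly readable bucket; the endpoint, bucket and object names are hypothetical, and a private bucket would instead need a signed URL obtained from the storage client.

```python
from urllib.parse import quote


def object_url(endpoint: str, bucket: str, key: str) -> str:
    """Build a path-style URL for an object in an S3-compatible store."""
    # Percent-encode the key but keep "/" so nested object keys stay readable.
    return f"{endpoint.rstrip('/')}/{bucket}/{quote(key, safe='/')}"


# Hypothetical endpoint, bucket and object key.
picture_url = object_url(
    "http://minio.example.local:9000", "reports", "road/crack 01.jpg"
)
```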
The method uses a large language model for data analysis and text generation, and uses the FREEMARKER template function to organize and display the multimodal content of the report. The method is described in further detail below with reference to FIG. 1 and FIG. 2.
FIG. 1 shows the flow chart for generating a multimodal report based on a large language model and FREEMARKER:
S1, collecting the professional data to be analyzed and formatting it, ensuring that the data meets the input requirements of the large language model.
S2, performing prompt engineering, and carrying out data analysis and text generation on the preprocessed data with the large language model.
S3, parsing the output of the model to obtain the corresponding analysis conclusions, data statistics, reference documents and the like, and, with the help of object storage, constructing multiple forms of presentation for the content, including tables, pictures, videos, hyperlinks and the like.
S4, the user customizes the outline template of the report; the FREEMARKER template engine parses it and generates the report template, which presets the dynamic catalog, table placeholders, picture and video embedding areas, hyperlink positions and the like, ensuring that the content has a reasonable logical structure and layout.
S5, filling the content generated by the large language model into the structure of the FREEMARKER template, automatically generating report content containing multiple modalities, and displaying the final multimodal report.
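Step S3 can be sketched as below. The source does not specify the format of the model output, so this example assumes the prompt asked the model to reply with a JSON object; the key names `conclusion`, `statistics` and `references` and the function `parse_model_output` are hypothetical.

```python
import json
from typing import Any, Dict


def parse_model_output(raw: str) -> Dict[str, Any]:
    """Extract the JSON object from a model reply and normalise its keys."""
    # Models often wrap JSON in extra prose, so cut from the first "{"
    # to the last "}" before parsing.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(raw[start:end + 1])
    return {
        "conclusion": str(data.get("conclusion", "")),
        "statistics": list(data.get("statistics", [])),
        "references": list(data.get("references", [])),
    }


# Hypothetical model reply, for illustration only.
reply = (
    'Analysis result: {"conclusion": "Cracks dominate the section.", '
    '"statistics": [{"type": "crack", "area": 12.5}]}'
)
parsed = parse_model_output(reply)
```

The normalised dictionary always has the same three keys, which keeps the later template filling independent of what the model happened to omit.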
The method for formatting professional data comprises the following steps:
Data preprocessing: cleaning the original professional data, converting it into Chinese and so on, so as to reduce the understanding deviation of the large language model, and converting the data set into a computer-processable form.
Keyword recognition: using natural language processing (NLP) technology to recognize the key values/keywords in the data set, including the relevant names, categories, degrees, areas and the like, and identifying the meaning corresponding to each data item to form a data dictionary.
Formatting: converting the preprocessed data set into a computer-processable format such as JSON, constructing a prompt, and inputting it into the LLM for analysis and processing.
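The three steps above can be sketched as follows, under stated assumptions: the raw field names (`dis_type`, `dis_area`, `sec`) and the dictionary contents are hypothetical, and the field meanings are written in English here for readability, whereas the source maps field names to their Chinese meanings.

```python
import json
from typing import Any, Dict, List

# Hypothetical data dictionary: raw field name -> human-readable meaning.
DATA_DICTIONARY = {
    "dis_type": "disease type",
    "dis_area": "disease area (m2)",
    "sec": "road section",
}


def preprocess(rows: List[Dict[str, Any]]) -> str:
    """Clean raw records, rename fields via the data dictionary, return JSON."""
    cleaned = []
    for row in rows:
        # Cleaning: drop records in which a dictionary field is missing or empty.
        if any(row.get(field) in (None, "") for field in DATA_DICTIONARY):
            continue
        record = {}
        for field, meaning in DATA_DICTIONARY.items():
            value = row[field]
            # Trim stray whitespace from text values; keep numbers as they are.
            record[meaning] = value.strip() if isinstance(value, str) else value
        cleaned.append(record)
    # ensure_ascii=False keeps non-ASCII text (such as Chinese) readable.
    return json.dumps(cleaned, ensure_ascii=False)


# Hypothetical raw records; the second one is dropped by the cleaning step.
dataset_json = preprocess([
    {"dis_type": " crack ", "dis_area": 12.5, "sec": "K1+200"},
    {"dis_type": "", "dis_area": 3.0, "sec": "K1+300"},
])
```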
A prompt is an instruction given to an artificial intelligence (AI) model; it steers the output of a language model through explicit, specific directions. In prompt engineering, a prompt is defined by three main elements (task, instruction and role) so that the model generates text that meets the user's needs. Taking road disease professional data as an example, the following prompt is concatenated: "The data dictionary maps each data field name to the Chinese meaning of that field; analyze the following content based on the data below and describe it in a passage of natural language" (task) + data set + "What are all the disease types on the whole road section in the data, what is the total of all disease areas, and what is the typical structural disease type?" (instruction). This prompt standardizes the output of the large language model so that it generates relevant, accurate and high-quality text.
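The task + data set + instruction concatenation described above might look like the following sketch. The function name `build_prompt`, its parameters and the English wording of the task sentence are illustrative assumptions, not the exact prompt of the invention.

```python
import json
from typing import Dict, List


def build_prompt(
    data_dictionary: Dict[str, str], records: List[dict], questions: List[str]
) -> str:
    """Concatenate task, data set and instruction into one prompt string."""
    mapping = "; ".join(
        f"{name}: {meaning}" for name, meaning in data_dictionary.items()
    )
    task = (
        "The data dictionary maps each data field name to its meaning: "
        f"{mapping}. Analyze the data below and answer in a passage of "
        "natural language."
    )
    dataset = json.dumps(records, ensure_ascii=False)
    instruction = " ".join(questions)
    # One element per line: task, then data set, then instruction.
    return "\n".join([task, dataset, instruction])


# Hypothetical field name, record and questions.
prompt = build_prompt(
    {"dis_type": "disease type"},
    [{"dis_type": "crack"}],
    [
        "What disease types occur on the whole road section?",
        "What is the total of all disease areas?",
    ],
)
```

Keeping the three elements as separate arguments lets the same task wording be reused across data sets while only the instruction changes.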
The large language model described above may be a model deployed on a public cloud, such as the caretaker or the GPT series, or a professional model trained and deployed locally.
As a detailed embodiment of constructing the FREEMARKER template, the steps are shown in FIG. 2:
S41, designing the FREEMARKER template layout: the layout and structure of the report are designed with the FREEMARKER template engine according to the requirements of the report. This includes defining the page size, margins, font style, font size and the like, and setting the headings, subheadings and footer of the report.
S42, setting the dynamic catalog: in the FREEMARKER template, the dynamic catalog is loaded using specific FREEMARKER syntax, by means such as Servlet context loading or Spring Boot integration. The catalog is generated automatically from the content of the report and includes links to chapter titles and subtitles, enabling the user to quickly jump to the corresponding section of the report by clicking a catalog entry.
S43, configuring table placeholders: the positions of tables are reserved in the report template, and the tables are generated dynamically with the FREEMARKER list and loop directives. The table data, such as statistics on disease types, areas and frequencies, is filled in automatically from the data analysis results generated by the large language model.
S44, embedding pictures and videos: placeholders for pictures and videos are reserved in the template. The link addresses of the relevant pictures and videos output by the model are stored in object storage, their remote URLs are obtained, and these URLs are embedded into the report through the URL processing instructions of FREEMARKER. The media content may be on-site photographs of diseases, video recordings or other related visual material.
S45, adding hyperlinks: hyperlinks are added to the text of the report using the FREEMARKER link instructions. These hyperlinks may point to reference documents that the large model provides for some of the generated content, or to external reference documents, related studies or further resources, giving the user more information and context.
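The placeholders of S42 to S45 can be illustrated with a small template. The sketch below holds a FreeMarker (FTL) template as a plain Python string and checks that a data model supplies every top-level variable the template refers to. It uses only the `${...}` interpolation and the `<#list ... as ...>` directive; the HTML layout, the variable names and the helper `missing_variables` are assumptions, and actually rendering the template requires the Java FreeMarker engine, which is outside this example.

```python
import re
from typing import Any, Dict, Set

# Assumed HTML report layout; the source does not fix the output format.
REPORT_TEMPLATE = """\
<h1>${title}</h1>
<ul>
<#list catalog as entry>
  <li><a href="#${entry.anchor}">${entry.number} ${entry.title}</a></li>
</#list>
</ul>
<table>
<#list rows as row>
  <tr><td>${row.type}</td><td>${row.area}</td><td>${row.count}</td></tr>
</#list>
</table>
<#list pictures as pic>
  <img src="${pic.url}" alt="${pic.caption}">
</#list>
<#list videos as video>
  <video src="${video.url}" controls></video>
</#list>
<#list references as ref>
  <a href="${ref.url}">${ref.title}</a>
</#list>
"""


def missing_variables(template: str, model: Dict[str, Any]) -> Set[str]:
    """Return top-level template variables that the data model does not supply."""
    # Names at the start of ${...} interpolations and names iterated by <#list>.
    referenced = set(re.findall(r"\$\{\s*(\w+)", template))
    referenced |= set(re.findall(r"<#list\s+(\w+)\s+as", template))
    # Loop variables are defined by the template itself, so exclude them.
    loop_vars = set(re.findall(r"<#list\s+\w+\s+as\s+(\w+)", template))
    return referenced - loop_vars - set(model)


# Hypothetical, deliberately incomplete data model.
model = {"title": "Road disease report", "catalog": [], "rows": [], "pictures": []}
missing = missing_variables(REPORT_TEMPLATE, model)
```

Running such a check before rendering is one way to catch a model output that lacks, for example, videos or references, instead of failing inside the template engine.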
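As an optimized implementation, after the report content is generated, the page numbers of the dynamic catalog are updated automatically by means of WIN32. The code is as follows, where file_path is the path of the generated report:
import win32com.client  # pywin32; requires Windows with Microsoft Word installed
word = win32com.client.Dispatch("Word.Application")  # start or attach to Word through COM
doc = word.Documents.Open(file_path)  # open the generated report
doc.Fields.Update()  # refresh the fields of the document, including the catalog field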
The text content generated by the large language model is filled into the positions specified in the FREEMARKER template. The FREEMARKER template engine then parses the template file, replaces the placeholders with the actual content, and applies the preset formatting rules to generate the final report document.
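The placeholder replacement described above can be mimicked in a few lines. The sketch uses Python's `string.Template` purely as a stand-in, because it happens to share the `${name}` placeholder form for simple variables; it is not FreeMarker and supports none of its directives. The template text and values are hypothetical.

```python
from string import Template

# Stand-in only: string.Template performs simple ${name} substitution.
template = Template("Report: ${title}\nConclusion: ${conclusion}\n")

# Hypothetical content, as it might come from the parsed model output.
report = template.substitute(
    title="Road disease analysis",
    conclusion="Cracks dominate the section.",
)
```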
The method achieves the following:
Multimodal fusion: the text generation capability of the large language model is combined, for the first time, with the template design function of FREEMARKER.
Dynamic content generation: the report content is generated dynamically using the data analysis capability of the large language model, which greatly reduces manual intervention, reduces subjectivity and saves labor cost.
The embodiment of the invention also provides a system for generating the multi-modal report based on the large language model and FREEMARKER, which realizes the generation of the multi-modal report by the method for generating the multi-modal report based on the large language model and FREEMARKER.
The system comprises:
The professional data preprocessing module is used for collecting, cleaning and formatting data to be analyzed;
The large language model processing module concatenates the preprocessed data into a prompt template and uses a large language model to perform data analysis and text generation;
The content analysis module parses the output of the model to obtain the corresponding analysis conclusions, data statistics, reference documents and the like, and, with the help of object storage, constructs multiple forms of presentation for the content, including tables, pictures, videos, hyperlinks and the like;
The FREEMARKER template custom module designs a report template in FREEMARKER, wherein the report template comprises a preset dynamic catalog, a table placeholder, a picture and video embedded area, a hyperlink position and the like;
And the output and display module is used for filling the content generated by the large language model into the FREEMARKER template, formatting and typesetting the content, and outputting and displaying the report content.
The professional data preprocessing module preprocesses the professional data as follows:
Data preprocessing: cleaning the original professional data, converting it into Chinese and so on, so as to reduce the understanding deviation of the large language model, and converting the data set into a computer-processable form;
Keyword recognition: using natural language processing (NLP) technology to recognize the key values/keywords in the data set, including the relevant names, categories, degrees, areas and the like, and identifying the meaning corresponding to each data item to form a data dictionary;
Formatting: converting the preprocessed data set into a computer-processable format such as JSON, constructing a prompt, and inputting it into the LLM for analysis and processing.
The FREEMARKER template customization module allows a user to customize the layout and style of the report so as to meet the requirements of different industries and application scenarios. The FREEMARKER template design further includes:
Designing the template layout, including defining the page size, margins, font style, font size and the like, and setting the title, subtitles and footer of the report;
Setting a dynamic catalog, including links to chapter titles and subtitles, so that a user can quickly jump to the corresponding part of the report by clicking a catalog item;
Configuring placeholders for multimodal content such as tables, pictures, videos and hyperlinks.
The dynamic catalog algorithm automatically identifies the chapters and sub-chapters in the report and generates a structured catalog, so that a user can quickly navigate to different parts of the report;
WIN32 is used to update the catalog field of the generated report, ensuring that the page numbers in the catalog are consistent with the page numbers of the dynamic report content.
Setting the dynamic catalog: in the FREEMARKER template, the dynamic catalog is loaded using specific FREEMARKER syntax, by means such as Servlet context loading or Spring Boot integration; the catalog is generated automatically from the content of the report and comprises links to chapter titles and subtitles, so that a user can quickly jump to the corresponding part of the report by clicking a catalog item;
Configuring table placeholders: reserving the positions of tables in the report template, generating the tables dynamically with the FREEMARKER list and loop directives, and filling in the table data automatically from the data analysis results generated by the large language model;
Embedding pictures and videos: reserving placeholders for pictures and videos in the template, storing the link addresses of the relevant pictures and videos output by the model in object storage, obtaining their remote URLs (uniform resource locators), and embedding these URLs into the report through the URL processing instructions of FREEMARKER;
Adding hyperlinks: hyperlinks are added to the text of the report using the FREEMARKER link instructions.
The FREEMARKER template engine parses the template file, replaces the placeholders with the actual content, and applies the preset formatting rules to generate the final report document;
The output and display module uses object storage such as Minio to store the pictures, videos, reference documents and the like in the report; it offers high scalability and reliability and ensures fast remote access to, and efficient storage of, the report.
The embodiment of the invention also provides a device for generating the multi-modal report based on the large language model and FREEMARKER, which comprises at least one memory and at least one processor;
the at least one memory is configured to store a machine-readable program;
the at least one processor is configured to invoke the machine-readable program to implement the method for generating a multimodal report based on the large language model and FREEMARKER as described in the above embodiments.
Embodiments of the present invention also provide a computer readable medium having stored thereon computer instructions that, when executed by a processor, cause the processor to perform the method of generating a multimodal report based on a large language model and FREEMARKER as described in the above embodiments. Specifically, a system or apparatus may be provided with a storage medium on which software program code realizing the functions of any of the above embodiments is stored, and a computer (or CPU or MPU) of the system or apparatus may be caused to read out and execute the program code stored in the storage medium.
In this case, the program code itself read from the storage medium may realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code form part of the present invention.
Examples of storage media for providing program code include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs, DVD+RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer over a communication network.
Further, it should be apparent that the functions of any of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like operating on the computer to perform part or all of the actual operations based on the instructions of the program code.
Further, it is understood that the program code read out from the storage medium may be written into a memory provided on an expansion board inserted into the computer or into a memory provided in an expansion unit connected to the computer, and a CPU or the like mounted on the expansion board or expansion unit may then be caused to perform part or all of the actual operations based on the instructions of the program code, thereby realizing the functions of any of the above embodiments.
While the invention has been illustrated and described in detail in the drawings and in the preferred embodiments, the invention is not limited to the disclosed embodiments. Those skilled in the art will appreciate that the technical means of the various embodiments described above may be combined to produce further embodiments of the invention, which are also within the scope of the invention.