CN104219271B - Multi-server synchronization method based on multithreaded asynchronous page download - Google Patents

Info

Publication number
CN104219271B
Authority
CN
China
Prior art keywords
page
effective page
effective
download
download thread
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310220524.7A
Other languages
Chinese (zh)
Other versions
CN104219271A (en)
Inventor
夏乃琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Cheerbright Technologies Co Ltd
Original Assignee
Beijing Cheerbright Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Cheerbright Technologies Co Ltd
Priority to CN201310220524.7A
Publication of CN104219271A
Application granted
Publication of CN104219271B
Legal status: Active

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present invention provides a multi-server synchronization method based on multithreaded asynchronous page download, comprising the following steps: creating multiple download threads according to a thread creation rule; defining the correspondence between each download thread and an effective page address; each download thread reading the effective page at its corresponding effective page address and then downloading the effective page asynchronously; each download thread obtaining, by reading a configuration file, the synchronization server IP and path corresponding to the effective page address; each download thread synchronizing the effective page to the synchronization server identified by that synchronization server IP and path; and, when any download thread finishes, that download thread requesting the next effective page to process and being terminated when no effective page remains to be processed. The method preserves the speed at which users access the website while also keeping the web servers synchronized, thereby improving the user's access experience.

Description

Multi-server synchronization method based on multithreaded asynchronous page download
Technical field
The invention belongs to the field of communication technology, and in particular relates to a multi-server synchronization method based on multithreaded asynchronous page download.
Background
With the rapid development of IT, websites have grown quickly and website traffic has increased exponentially, so that a single server can no longer carry the load and website access slows down. To improve access speed, in the prior art a website is typically served by multiple web servers, each of which runs a scheduled page-staticization program. This program periodically converts key pages into static pages, which are then stored on each web server.
The main problem with this approach is that the clocks of the web servers are not consistent, so the static pages generated by different web servers differ in generation time. For example, at the same instant, the clock of web server A reads 9:00 while the clock of web server B reads 9:08. Web server A staticizes page A as of 9:00 and obtains the 9:00 version of page A, while web server B staticizes page A as of 9:08 and obtains the 9:08 version of page A, and the two versions differ. Consequently, if user A and user B access the website at the same moment, they may be served different pages, which degrades the user's access experience.
Summary of the invention
To address the defects of the prior art, the present invention provides a multi-server synchronization method based on multithreaded asynchronous page download that preserves the speed at which users access the website while also keeping the web servers synchronized, thereby improving the user's access experience.
The technical solution adopted by the present invention is as follows:
The present invention provides a multi-server synchronization method based on multithreaded asynchronous page download, comprising the following steps:
S1, receiving the storage location information of a configuration file;
S2, reading the configuration file according to the received storage location information; wherein the configuration file stores an original page address list consisting of one or more original page addresses to be downloaded, and also stores one or more synchronization server IPs and paths corresponding to each original page address;
S3, preprocessing the configuration file that was read to obtain a new, processed configuration file; wherein the new configuration file stores an effective page address list consisting of one or more effective page addresses to be downloaded, and also stores one or more synchronization server IPs and paths corresponding to each effective page address;
S4, counting the effective page addresses stored in the effective page address list, and then creating multiple download threads according to a thread creation rule;
S5, defining the correspondence between each download thread and an effective page address;
S6, each download thread reading the effective page at its corresponding effective page address and then downloading the effective page asynchronously;
S7, each download thread obtaining, by reading the configuration file, the synchronization server IP and path corresponding to the effective page address;
S8, each download thread synchronizing the effective page obtained in S6 to the synchronization server identified by the synchronization server IP and path obtained in S7;
when any download thread finishes S8, that download thread requests the next effective page to process; when no effective page remains to be processed, the download thread is terminated.
Preferably, in S3, preprocessing the configuration file that was read specifically comprises:
performing a filtering operation on the configuration file that was read.
Preferably, performing the filtering operation on the configuration file that was read specifically comprises:
judging whether each original page address is a legal address link and, if an illegal address link exists, deleting the illegal address link; and/or
judging whether explanatory text exists in the configuration file and, if so, deleting the explanatory text; and/or
judging whether the configuration file contains identical correspondences between an original page address and a synchronization server IP and path and, if so, deleting the duplicated correspondences.
Preferably, in S6, a specified download thread asynchronously downloading a specified effective page specifically comprises:
the specified download thread sending a download request message to the specified page server, and then judging whether a download response message returned by the specified page server is received within a predetermined time interval; if it is received, the subsequent flow continues; if it is not received, the specified download thread is terminated.
Preferably, in S8, a particular download thread synchronizing a particular effective page to a particular synchronization server specifically comprises:
at a previous moment, the particular download thread downloads the effective page P1 represented by the particular effective page address, and then saves effective page P1 on the particular synchronization server under file name X;
at the current moment, the particular download thread downloads the effective page P2 represented by the particular effective page address, first writes effective page P2 to the particular synchronization server under file name Y, and then replaces effective page P1 with effective page P2; wherein file name X and file name Y are different.
Preferably, in S8, a particular download thread synchronizing a particular effective page to a particular synchronization server specifically comprises:
the particular download thread first judges whether the particular effective page has the read-only attribute; if so, it changes the read-only attribute of the particular effective page to non-read-only and then synchronizes the non-read-only effective page to the particular synchronization server.
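The read-only check can be illustrated with a minimal Python sketch. It rests on one interpretation that the text does not state explicitly: the read-only attribute is assumed to belong to the existing copy of the page file that is about to be overwritten on the synchronization server. The function name `ensure_writable` is hypothetical.

```python
import os
import stat


def ensure_writable(path: str) -> bool:
    """Clear the read-only attribute of an existing file.

    Returns True if the file was read-only and has been made writable,
    False if it was already writable. (Hypothetical helper; the patent
    names no function for this step.)
    """
    mode = os.stat(path).st_mode
    if mode & stat.S_IWUSR:
        return False  # owner can already write; nothing to change
    # Add the owner-write bit; on Windows this clears the read-only flag.
    os.chmod(path, mode | stat.S_IWUSR)
    return True
```

A download thread would call such a helper on the target file before writing or replacing it, so that the synchronization does not fail on a read-only file.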
Beneficial effects of the present invention are as follows:
The multi-server synchronization method based on multithreaded asynchronous page download provided by the present invention preserves the speed at which users access the website while also keeping the web servers synchronized, thereby improving the user's access experience.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the multi-server synchronization method based on multithreaded asynchronous page download provided by the present invention.
Embodiment
The present invention is described in detail below with reference to the accompanying drawing:
As shown in Fig. 1, the present invention provides a multi-server synchronization method based on multithreaded asynchronous page download, comprising the following steps:
S1, receiving the storage location information of a configuration file;
S2, reading the configuration file according to the received storage location information; wherein the configuration file stores an original page address list consisting of one or more original page addresses to be downloaded, and also stores one or more synchronization server IPs and paths corresponding to each original page address;
For example, the original page address can be a page URL. The configuration file can be a txt file in which each line is one configuration entry. Within a line, the page URL is separated by a space from the synchronization server IPs and paths to be synchronized, and the individual synchronization server IPs and paths are separated from one another by an English (ASCII) comma. For example:
http://m.autohome.com.cn/ashx/baidu/BookList.ashxCount=200 \\10.168.0.120\d$\m.autohome.com.cn\includeFile\baidu\dealer\datalist.xml,\\10.168.0.166\d$\m.autohome.com.cn\includeFile\baidu\dealer\datalist.xml
Here, http://m.autohome.com.cn/ashx/baidu/BookList.ashxCount=200 is the page URL; \\10.168.0.120\d$\m.autohome.com.cn\includeFile\baidu\dealer\datalist.xml is the absolute path on one server that needs to be synchronized; and \\10.168.0.166\d$\m.autohome.com.cn\includeFile\baidu\dealer\datalist.xml is the absolute path on another server that needs to be synchronized.
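As an illustration of the line format just described, the following Python sketch splits one configuration line into the page URL and its list of synchronization targets. It is only a sketch under stated assumptions: the function name `parse_config_line` is hypothetical, the URL is assumed to contain no whitespace, and the targets are assumed to contain no commas.

```python
from typing import List, Tuple


def parse_config_line(line: str) -> Tuple[str, List[str]]:
    """Split one configuration line into (page_url, [sync_target, ...]).

    Hypothetical helper: the URL ends at the first run of whitespace,
    and everything after it is a comma-separated list of targets.
    """
    parts = line.strip().split(None, 1)
    if len(parts) != 2:
        raise ValueError("expected '<url> <target>[,<target>...]'")
    url, raw_targets = parts
    # Tolerate stray spaces around the commas and skip empty items.
    targets = [t.strip() for t in raw_targets.split(",") if t.strip()]
    return url, targets
```

Each returned target combines a synchronization server IP and a path, for example a UNC path of the kind shown in the sample line.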
S3, preprocessing the configuration file that was read to obtain a new, processed configuration file; wherein the new configuration file stores an effective page address list consisting of one or more effective page addresses to be downloaded, and also stores one or more synchronization server IPs and paths corresponding to each effective page address. Specifically, in the present invention, one effective page address can correspond to multiple synchronization server IPs and paths.
In practice, the .NET built-in object StreamReader can be used to read the contents of the configuration file; the contents that were read are then placed in a fixed container, and filtering and deduplication operations are performed on them.
The filtering operation specifically includes the following two modes: (1) judging whether each original page address is a legal address link and, if an illegal address link exists, deleting it; (2) judging whether explanatory text exists in the configuration file and, if so, deleting the explanatory text.
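The two filtering modes can be sketched in Python as follows. Several details here are assumptions rather than anything the text specifies: a "legal address link" is taken to mean an http or https URL with a host, explanatory text is taken to be blank lines and lines starting with `#`, and the names `is_legal_url` and `filter_lines` are hypothetical.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def is_legal_url(address: str) -> bool:
    """Assumed legality rule: http/https scheme plus a non-empty host."""
    try:
        parts = urlsplit(address)
    except ValueError:  # raised for some malformed addresses
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def filter_lines(lines: Iterable[str]) -> List[str]:
    """Drop explanatory lines and entries whose page address is not legal."""
    kept = []
    for line in lines:
        text = line.strip()
        # Mode (2): skip explanatory text (assumed: blank or '#' lines).
        if not text or text.startswith("#"):
            continue
        fields = text.split(None, 1)
        # Mode (1): skip entries without a legal page address and targets.
        if len(fields) != 2 or not is_legal_url(fields[0]):
            continue
        kept.append(text)
    return kept
```

The lines that survive correspond to the effective page addresses of the new configuration file.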
The deduplication operation specifically works as follows: judging whether the configuration file contains identical correspondences between an original page address and a synchronization server IP and path and, if so, deleting the duplicated correspondences.
Deduplication covers two cases. (1) The original page address and the synchronization server IPs and paths are completely identical. For example, the configuration file records the following two correspondences: first, page address http://autohome.com.cn corresponds to synchronization server 1, synchronization server 2 and synchronization server 3; second, page address http://autohome.com.cn corresponds to synchronization server 1, synchronization server 2 and synchronization server 3. These two correspondences are identical, so either one of them must be deleted. (2) The same original page address corresponds to partly identical synchronization server IPs and paths. For example, the configuration file records the following two correspondences. Correspondence 1: page address http://autohome.com.cn corresponds to synchronization server 1, synchronization server 2 and synchronization server 3. Correspondence 2: page address http://autohome.com.cn corresponds to synchronization server 2, synchronization server 3 and synchronization server 4. Both contain the same correspondence, namely page address http://autohome.com.cn to synchronization server 2 and synchronization server 3, so synchronization server 2 and synchronization server 3 must be deleted from either correspondence 1 or correspondence 2.
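Both deduplication cases can be handled by one pass that keeps each (page address, synchronization target) pair only once. The Python sketch below is an assumption about how this might be done, not the method prescribed by the text: instead of deleting targets from one of two correspondences, it merges all correspondences for the same page address into a single entry, which leaves the same set of pairs. The name `dedupe` is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def dedupe(entries: Iterable[Tuple[str, List[str]]]) -> Dict[str, List[str]]:
    """Merge entries by page address, keeping each target once, in order."""
    merged: Dict[str, List[str]] = {}
    for url, targets in entries:
        kept = merged.setdefault(url, [])
        for target in targets:
            if target not in kept:  # drops full and partial duplicates
                kept.append(target)
    return merged
```

With the partial-overlap example above, servers 1, 2, 3 and servers 2, 3, 4 for http://autohome.com.cn would be merged into servers 1, 2, 3, 4.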
This preprocessing of the configuration file reduces the complexity of the subsequent asynchronous download and synchronization, simplifies their steps, and improves their efficiency.
S4, counting the effective page addresses stored in the effective page address list, and then creating multiple download threads according to a thread creation rule. In this step, the thread creation rule can be as follows. A maximum thread creation number is set. If the maximum thread creation number is greater than the number of effective page addresses, for example a maximum of 20 with 15 effective page addresses, then as many download threads are created as there are effective page addresses; that is, 15 download threads are created and each download thread handles one effective page. If the maximum thread creation number is less than the number of effective page addresses, for example a maximum of 15 with 20 effective page addresses, then as many download threads are created as the maximum thread creation number; that is, 15 download threads are created and they first handle 15 effective pages, one each. When a download thread finishes processing an effective page, it requests an effective page that has not yet been processed; when no effective page remains to be processed, the download thread is terminated. When all download threads have been terminated, the whole process ends.
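The thread creation rule amounts to creating min(maximum thread creation number, number of effective page addresses) threads that draw work from a shared pool until it is empty. A minimal Python sketch, assuming a shared queue as the mechanism by which a thread "requests the next effective page" (the text does not name one); `thread_count`, `run_workers` and the `handle` callback are hypothetical names.

```python
import queue
import threading
from typing import Callable, Sequence


def thread_count(max_threads: int, page_count: int) -> int:
    """Thread creation rule: never more threads than pages or than the cap."""
    return min(max_threads, page_count)


def run_workers(pages: Sequence, max_threads: int,
                handle: Callable[[object], None]) -> int:
    """Process every page with a bounded set of download threads."""
    pending: "queue.Queue" = queue.Queue()
    for page in pages:
        pending.put(page)

    def worker() -> None:
        while True:
            try:
                page = pending.get_nowait()  # request the next page
            except queue.Empty:
                return  # nothing left to process: the thread ends
            handle(page)

    threads = [threading.Thread(target=worker)
               for _ in range(thread_count(max_threads, len(pages)))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # the whole process ends when all threads have ended
    return len(threads)
```

With a maximum of 20 and 15 pages, 15 threads are created; with a maximum of 15 and 20 pages, 15 threads are created and the remaining 5 pages are picked up as threads become free.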
S5, defining the correspondence between each download thread and an effective page address;
S6, each download thread reading the effective page at its corresponding effective page address and then downloading the effective page asynchronously. For example, if 3 effective pages need to be downloaded, 3 download threads are started at the same time; downloading pages with multiple threads effectively improves download efficiency.
In this step, taking a specified download thread asynchronously downloading a specified effective page as an example: the specified download thread sends a download request message to the specified page server and then judges whether a download response message returned by the specified page server is received within a predetermined time interval. If it is received, the subsequent flow continues. If it is not received, the specified page server is presumed to have failed, so the specified download thread is terminated and its resources are released.
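The timeout check in S6 might look like the following Python sketch, which uses the standard library's `urllib.request.urlopen` with its `timeout` argument. The function name `download_page`, the default of 10 seconds and the injectable `opener` parameter are assumptions added for illustration; note also that this timeout bounds each blocking network operation rather than the total download time.

```python
import urllib.request
from typing import Callable, Optional


def download_page(url: str, timeout_seconds: float = 10.0,
                  opener: Callable = urllib.request.urlopen
                  ) -> Optional[bytes]:
    """Return the page body, or None if the server fails to respond in time.

    A None result tells the calling download thread to stop and release
    its resources instead of continuing with synchronization.
    """
    try:
        with opener(url, timeout=timeout_seconds) as response:
            return response.read()
    except OSError:
        # URLError and socket timeouts are both subclasses of OSError.
        return None
```

A download thread would continue to S7 and S8 only when the result is not None.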
S7, each download thread obtaining, by reading the configuration file, the synchronization server IP and path corresponding to the effective page address;
S8, each download thread synchronizing the effective page obtained in S6 to the synchronization server identified by the synchronization server IP and path obtained in S7. When any download thread finishes S8, that download thread requests the next effective page to process; when no effective page remains to be processed, the download thread is terminated.
In this step, taking a particular download thread synchronizing a particular effective page to a particular synchronization server as an example, the procedure is as follows:
Because the page content at the same page address may differ at different times, for example because the page has been updated, the particular synchronization server needs to be refreshed at a certain frequency. The flow described below is one refresh.
At a previous moment, the particular download thread downloads the effective page P1 represented by the particular effective page address, and then saves effective page P1 on the particular synchronization server under file name X;
At the current moment, the particular download thread downloads the effective page P2 represented by the particular effective page address, first writes effective page P2 to the particular synchronization server under file name Y, where file name X and file name Y are different, and then replaces file X with file Y. Because writing a file is relatively slow while replacing a file is relatively fast, the present invention updates the page by writing first and replacing afterwards, which ensures that the page is updated without affecting normal viewing of the page.
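The write-then-replace update can be sketched in Python as follows. This is an illustration under assumptions, not the patented implementation: the function name `publish_page` is hypothetical, file name Y is assumed to be the live file name X plus a `.new` suffix, and both files are assumed to sit in the same directory so that the replacement is a quick rename.

```python
import os


def publish_page(content: bytes, live_path: str) -> None:
    """Write the new page under a temporary name, then swap it into place."""
    temp_path = live_path + ".new"  # file name Y (assumed naming scheme)
    with open(temp_path, "wb") as f:
        f.write(content)       # the slow step happens on file Y
        f.flush()
        os.fsync(f.fileno())   # make sure the bytes are on disk first
    # The fast step: file Y takes the place of file X (live_path).
    os.replace(temp_path, live_path)
```

Readers of the live file see either the complete old page or the complete new page, which matches the stated goal of updating without affecting normal viewing.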
In summary, the multi-server synchronization method based on multithreaded asynchronous page download provided by the present invention first downloads each page asynchronously with multiple threads and then synchronizes the downloaded pages to each server, thereby preserving the speed at which users access the website while also keeping the web servers synchronized and improving the user's access experience.
The above is only a preferred embodiment of the present invention. It should be noted that a person of ordinary skill in the art can make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (6)

1. A multi-server synchronization method based on multithreaded asynchronous page download, characterized by comprising the following steps:
S1, receiving the storage location information of a configuration file;
S2, reading the configuration file according to the received storage location information; wherein the configuration file stores an original page address list consisting of one or more original page addresses to be downloaded, and also stores one or more synchronization server IPs and paths corresponding to each original page address;
S3, preprocessing the configuration file that was read to obtain a new, processed configuration file; wherein the new configuration file stores an effective page address list consisting of one or more effective page addresses to be downloaded, and also stores one or more synchronization server IPs and paths corresponding to each effective page address;
S4, counting the effective page addresses stored in the effective page address list, and then creating multiple download threads according to a thread creation rule;
S5, defining the correspondence between each download thread and an effective page address;
S6, each download thread reading the effective page at its corresponding effective page address and then downloading the effective page asynchronously;
S7, each download thread obtaining, by reading the configuration file, the synchronization server IP and path corresponding to the effective page address;
S8, each download thread synchronizing the effective page obtained in S6 to the synchronization server identified by the synchronization server IP and path obtained in S7;
when any download thread finishes S8, that download thread requests the next effective page to process, until no effective page remains to be processed, at which point the download thread is terminated.
2. The multi-server synchronization method based on multithreaded asynchronous page download according to claim 1, characterized in that, in S3, preprocessing the configuration file that was read specifically comprises:
performing a filtering operation and a deduplication operation on the configuration file that was read.
3. The multi-server synchronization method based on multithreaded asynchronous page download according to claim 2, characterized in that performing the filtering operation and the deduplication operation on the configuration file that was read specifically comprises:
the filtering operation specifically includes: judging whether each original page address is a legal address link and, if an illegal address link exists, deleting the illegal address link; and/or judging whether explanatory text exists in the configuration file and, if so, deleting the explanatory text; and/or
the deduplication operation specifically works as follows: judging whether the configuration file contains identical correspondences between an original page address and a synchronization server IP and path and, if so, deleting the duplicated correspondences.
4. The multi-server synchronization method based on multithreaded asynchronous page download according to claim 1, characterized in that, in S6, a specified download thread asynchronously downloading a specified effective page specifically comprises:
the specified download thread sending a download request message to the specified page server, and then judging whether a download response message returned by the specified page server is received within a predetermined time interval; if it is received, the subsequent flow continues; if it is not received, the specified download thread is terminated.
5. The multi-server synchronization method based on multithreaded asynchronous page download according to claim 1, characterized in that, in S8, a particular download thread synchronizing a particular effective page to a particular synchronization server specifically comprises:
at a previous moment, the particular download thread downloads the effective page P1 represented by the particular effective page address, and then saves effective page P1 on the particular synchronization server under file name X;
at the current moment, the particular download thread downloads the effective page P2 represented by the particular effective page address, first writes effective page P2 to the particular synchronization server under file name Y, and then replaces effective page P1 with effective page P2; wherein file name X and file name Y are different.
6. The multi-server synchronization method based on multithreaded asynchronous page download according to claim 1, characterized in that, in S8, a particular download thread synchronizing a particular effective page to a particular synchronization server specifically comprises:
the particular download thread first judges whether the particular effective page has the read-only attribute; if so, it changes the read-only attribute of the particular effective page to non-read-only and then synchronizes the non-read-only effective page to the particular synchronization server.
CN201310220524.7A 2013-06-05 2013-06-05 Multi-server synchronization method based on multithreaded asynchronous page download Active CN104219271B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310220524.7A CN104219271B (en) 2013-06-05 2013-06-05 Multi-server synchronization method based on multithreaded asynchronous page download

Publications (2)

Publication Number Publication Date
CN104219271A CN104219271A (en) 2014-12-17
CN104219271B true CN104219271B (en) 2017-10-27

Family

ID=52100401

Country Status (1)

Country Link
CN (1) CN104219271B (en)

Also Published As

Publication number Publication date
CN104219271A (en) 2014-12-17

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant