CN107835191A - Method and apparatus for detecting malicious webpage tampering - Google Patents
Method and apparatus for detecting malicious webpage tampering
- Publication number
- CN107835191A
- Authority
- CN
- China
- Prior art keywords
- webpage
- hash
- cryptographic hash
- changed
- similar
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- 
        - H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
- H04L63/1466—Active attacks involving interception, injection, modification, spoofing of data unit addresses, e.g. hijacking, packet injection or TCP sequence number attacks
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
 
- 
        - H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
 
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- General Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computing Systems (AREA)
- Computer Networks & Wireless Communication (AREA)
- Computer Hardware Design (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Information Transfer Between Computers (AREA)
Abstract
A method for detecting webpage tampering, comprising: scanning the root directory of a website, calculating a hash value for each webpage using a similarity hash algorithm, and collecting the generated hash values to build a base hash library; monitoring write operations on the site directory and, for each page that has been changed, recalculating the hash value of the modified page using the similarity hash algorithm and retrieving the hash value of the corresponding file from the base hash library; comparing the two hash values generated before and after the change and, if the similarity of the comparison result is below a threshold, treating the webpage as modified; and performing feature detection on the modified webpage to determine whether it has been maliciously tampered with. Beneficial effects: the proposed detection method does not need to periodically recompute the fingerprints of the webpages under the site directory and can detect changes in real time when a modification occurs on the website, which simplifies the procedure and improves the efficiency of webpage tamper detection.
  Description
Technical field
      The present invention relates to the field of webpage security, and in particular to a method for detecting malicious tampering of webpages.
    Background art
      Webpage tampering is a common form of attack. After compromising a website, an attacker often modifies existing webpages, writing malicious code, spam or similar content into them. A tampered webpage not only disrupts the normal operation of the website but also spreads malicious code and illegal information to the users who browse it, so the harm is severe.
      The method currently used to detect webpage tampering is webpage fingerprint comparison. This method uses a hash function to compute, in advance, a digital fingerprint for every webpage of the website and collects the fingerprints into a fingerprint library. After a certain interval, the fingerprint of each webpage is recomputed and compared with the fingerprint in the library. If the two digital fingerprints of the same webpage differ, the webpage has been tampered with. However, this method requires the fingerprint library to be built before the website is tampered with, and the library must be updated every time a webpage is created or modified, which is cumbersome and inefficient.
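For illustration, the fingerprint comparison approach described above might be sketched as follows. This is a minimal sketch, not the implementation of any particular product: SHA-256 is an assumed choice of hash function, and the names `fingerprint` and `changed_pages` are hypothetical.

```python
import hashlib
from typing import Dict, List


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest used as the page's digital fingerprint."""
    return hashlib.sha256(content).hexdigest()


def changed_pages(library: Dict[str, str], pages: Dict[str, bytes]) -> List[str]:
    """Return the paths whose current fingerprint differs from the stored one.

    Pages that are missing from the fingerprint library are reported as well.
    """
    return sorted(
        path for path, content in pages.items()
        if library.get(path) != fingerprint(content)
    )
```

Because a cryptographic hash changes completely for any edit, this check can only say that a page differs, not by how much, and the library has to be refreshed after every legitimate edit.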
    Summary of the invention
      In view of the deficiencies of the prior art, the present invention proposes a method for detecting malicious webpage tampering. The method can quickly and effectively detect whether a webpage has been modified and offers higher security.
      A method for detecting webpage tampering, comprising:
      scanning the root directory of a website, calculating a hash value for each webpage using a similarity hash algorithm, and collecting the generated hash values to build a base hash library;
      monitoring write operations on the site directory and, for each page that has been changed, recalculating the hash value of the modified page using the similarity hash algorithm and retrieving the hash value of the corresponding file from the base hash library;
      comparing the two hash values generated before and after the change and, if the similarity of the comparison result is below a threshold, treating the webpage as modified;
      performing feature detection on the modified webpage and determining whether it has been maliciously tampered with.
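The first step above, scanning the site root and building the base hash store, might be sketched as follows. This is only a sketch under stated assumptions: the list of page extensions, the function name `build_base_store` and the injected `hash_fn` parameter are hypothetical and do not come from the source.

```python
import os
from typing import Callable, Dict

# Assumed set of file extensions treated as webpages; adjust per site.
PAGE_EXTENSIONS = (".html", ".htm", ".php", ".jsp", ".asp")


def build_base_store(root: str, hash_fn: Callable[[str], int]) -> Dict[str, int]:
    """Walk `root` and map each page's relative path to its hash value."""
    store: Dict[str, int] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(PAGE_EXTENSIONS):
                continue  # skip files that are not webpages
            full_path = os.path.join(dirpath, name)
            with open(full_path, encoding="utf-8", errors="replace") as handle:
                store[os.path.relpath(full_path, root)] = hash_fn(handle.read())
    return store
```

Any similarity hash function that maps page text to an integer can be passed as `hash_fn`.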
      The similarity hash algorithm is as follows: like other hash algorithms, it generates a unique, fixed-length hash value for a given object; the difference is that, for two objects, the more similar the objects are, the smaller the difference between the generated hash values.
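The source does not name a specific similarity hash algorithm. One well-known algorithm with the stated property is SimHash, and the following is a minimal sketch assuming that choice. The word tokenisation, the 64-bit width and the use of BLAKE2b as the per-token hash are all assumptions made for illustration.

```python
import hashlib
import re


def simhash(text: str, bits: int = 64) -> int:
    """Compute a SimHash-style fingerprint of `text` over its word tokens.

    `bits` must be a multiple of 8 and at most 512 (the BLAKE2b digest limit).
    """
    weights = [0] * bits
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=bits // 8).digest()
        token_hash = int.from_bytes(digest, "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(bits):
            weights[i] += 1 if (token_hash >> i) & 1 else -1
    # A bit of the fingerprint is set when the positive votes win.
    value = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            value |= 1 << i
    return value
```

Two texts that share most of their tokens cast mostly the same votes, so their fingerprints tend to differ in only a few bit positions.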
      Meanwhile the invention also provides a kind of device for detecting webpage tamper, the device includes web page crawl unit, calculated
Unit and detection unit;
      The web page crawl unit, for traveling through site listing, obtain all webpages under website, while monitoring station catalogue
Write operation, record the webpage changed;
      The computing unit, calculate the cryptographic Hash for the webpage that web page crawl unit obtains using similar hash algorithm and store to base
In plinth Hash storehouse;The webpage changed monitored simultaneously for web page crawl unit, is recalculated by the Hash of modification webpage
Value, and be compared with the corresponding cryptographic Hash in basic Hash storehouse, calculate the similarity of webpage before and after modification;
      The detection unit, obtain and the webpage changed that similarity is less than given threshold is calculated in computing unit, using spy
Whether the webpage that sign detection method detection is changed contains malicious code or harmful information.
      Further, the web page crawl unit can remove the html tag that webpage includes when crawling webpage, to obtain
The content of text of webpage.
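Removing the HTML tags to obtain the text content might be sketched with the Python standard library as follows. The class and function names are hypothetical. Note that this sketch keeps the bodies of `script` and `style` elements as text, since the source only speaks of removing tags.

```python
from html.parser import HTMLParser
from typing import List


class _TextExtractor(HTMLParser):
    """Collect the character data of a document and drop all tags."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._chunks: List[str] = []

    def handle_data(self, data: str) -> None:
        self._chunks.append(data)

    def text(self) -> str:
        # Join the chunks and collapse runs of whitespace into single spaces.
        return " ".join(" ".join(self._chunks).split())


def strip_html(html: str) -> str:
    """Return the text content of `html` with all tags removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Hashing the extracted text rather than the raw markup makes the similarity less sensitive to purely structural changes in the page.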
      The beneficial effects of the technical solution of the present invention are as follows. The proposed detection method uses a similarity hash algorithm to calculate the similarity of webpages and on that basis judges whether a webpage has been tampered with. Compared with the existing webpage fingerprint comparison method, the proposed method does not need to periodically recompute the fingerprints of the webpages under the site directory and can detect changes in real time when a modification occurs on the website, which simplifies the procedure and improves the efficiency of webpage tamper detection.
    Detailed description
      To help those skilled in the art better understand the technical solution, the present invention is described in further detail below with reference to a specific embodiment.
      A method for detecting webpage tampering, comprising:
      scanning the root directory of a website, calculating a hash value for each webpage using a similarity hash algorithm, and collecting the generated hash values to build a base hash library;
      monitoring write operations on the site directory and, for each page that has been changed, recalculating the hash value of the modified page using the similarity hash algorithm and retrieving the hash value of the corresponding file from the base hash library;
      comparing the two hash values generated before and after the change and, if the similarity of the comparison result is below a threshold, treating the webpage as modified;
      performing feature detection on the modified webpage and determining whether it has been maliciously tampered with.
      The similarity hash algorithm is as follows: like other hash algorithms, it generates a unique, fixed-length hash value for a given object; the difference is that, for two objects, the more similar the objects are, the smaller the difference between the generated hash values. The algorithm can therefore be used to compare the similarity of two objects quickly.
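The source does not specify how the similarity of two hash values is measured. A common choice for bit-vector fingerprints is the Hamming distance, and the sketch below assumes it. The 64-bit width and the example threshold of 0.95 are illustrative values, not figures from the source.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hash values differ."""
    return bin(a ^ b).count("1")


def similarity(a: int, b: int, bits: int = 64) -> float:
    """Map the Hamming distance to a similarity in the range 0.0 to 1.0."""
    return 1.0 - hamming_distance(a, b) / bits


def is_modified(old_hash: int, new_hash: int,
                threshold: float = 0.95, bits: int = 64) -> bool:
    """Treat the page as modified when the similarity falls below the threshold."""
    return similarity(old_hash, new_hash, bits) < threshold
```

A lower threshold tolerates larger edits before a page is passed on to feature detection, so the value would need tuning for each site.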
      Meanwhile the invention also provides a kind of device for detecting webpage tamper, the device includes web page crawl unit, calculated
Unit and detection unit;
      The web page crawl unit, for traveling through site listing, obtain all webpages under website, while monitoring station catalogue
Write operation, record the webpage changed;
      The computing unit, calculate the cryptographic Hash for the webpage that web page crawl unit obtains using similar hash algorithm and store to base
In plinth Hash storehouse;The webpage changed monitored simultaneously for web page crawl unit, is recalculated by the Hash of modification webpage
Value, and be compared with the corresponding cryptographic Hash in basic Hash storehouse, calculate the similarity of webpage before and after modification;
      The detection unit, obtain and the webpage changed that similarity is less than given threshold is calculated in computing unit, using spy
Whether the webpage that sign detection method detection is changed contains malicious code or harmful information.
      Further, the web page crawl unit can remove the html tag that webpage includes when crawling webpage, to obtain
The content of text of webpage.
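How the computing and detection steps fit together might be sketched as follows. Everything here is an assumption made for illustration: the class name `TamperDetector`, the injected `hash_fn` and `similarity_fn` callables, the treatment of pages missing from the base store as modified, and above all the two regular-expression signatures, which merely stand in for a real feature detection rule set. The caller is expected to invoke `on_write` from whatever mechanism monitors writes to the site directory.

```python
import re
from typing import Callable, Dict

# Illustrative signatures only; a real deployment would maintain its own rules.
SIGNATURES = [
    re.compile(r"<iframe[^>]*display\s*:\s*none", re.IGNORECASE),
    re.compile(r"eval\s*\(\s*unescape\s*\(", re.IGNORECASE),
]


class TamperDetector:
    """Compare a changed page against the base store, then run feature detection."""

    def __init__(self, hash_fn: Callable[[str], int],
                 similarity_fn: Callable[[int, int], float],
                 threshold: float = 0.95) -> None:
        self.hash_fn = hash_fn
        self.similarity_fn = similarity_fn
        self.threshold = threshold
        self.base: Dict[str, int] = {}

    def build_base(self, pages: Dict[str, str]) -> None:
        """Hash every known page and store the result as the baseline."""
        self.base = {path: self.hash_fn(content) for path, content in pages.items()}

    def on_write(self, path: str, content: str) -> str:
        """Return 'unchanged', 'modified' or 'tampered' for a written page."""
        old_hash = self.base.get(path)
        new_hash = self.hash_fn(content)
        if old_hash is not None and self.similarity_fn(old_hash, new_hash) >= self.threshold:
            return "unchanged"
        # Similarity is below the threshold (or the page is new): inspect content.
        if any(signature.search(content) for signature in SIGNATURES):
            return "tampered"
        return "modified"
```

Injecting the hash and similarity functions keeps the sketch independent of any particular similarity hash algorithm.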
      The method for detecting webpage tampering provided by the present invention has been described in detail above. A specific embodiment is used herein to explain the principles and implementation of the application, and the description of the embodiment is intended only to help in understanding the method of the application and its core idea. A person of ordinary skill in the art may, following the idea of the application, make changes to the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting the application.
    Claims (4)
- 1. A method for detecting webpage tampering, characterized by comprising: scanning the root directory of a website, calculating a hash value for each webpage using a similarity hash algorithm, and collecting the generated hash values to build a base hash library; monitoring write operations on the site directory and, for each page that has been changed, recalculating the hash value of the modified page using the similarity hash algorithm and retrieving the hash value of the corresponding file from the base hash library; comparing the two hash values generated before and after the change and, if the similarity of the comparison result is below a threshold, treating the webpage as modified; performing feature detection on the modified webpage and determining whether it has been maliciously tampered with.
- 2. The method for detecting webpage tampering according to claim 1, characterized in that the similarity hash algorithm is as follows: like other hash algorithms, it generates a unique, fixed-length hash value for a given object; the difference is that, for two objects, the more similar the objects are, the smaller the difference between the generated hash values.
- 3. A device for detecting webpage tampering, characterized in that the device comprises a web crawling unit, a computing unit and a detection unit; the web crawling unit traverses the site directory to obtain all webpages of the website, monitors write operations on the site directory, and records the webpages that have been changed; the computing unit uses a similarity hash algorithm to calculate the hash values of the webpages obtained by the web crawling unit and stores them in a base hash library, and, for each changed webpage detected by the web crawling unit, recalculates the hash value of the modified webpage, compares it with the corresponding hash value in the base hash library, and calculates the similarity of the webpage before and after modification; the detection unit obtains the changed webpages whose similarity, as calculated by the computing unit, is below the set threshold, and uses a feature detection method to detect whether each changed webpage contains malicious code or harmful information.
- 4. The device for detecting webpage tampering according to claim 3, characterized in that the web crawling unit may remove the HTML tags contained in a webpage when crawling it, so as to obtain the text content of the webpage.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN201711220764.1A CN107835191A (en) | 2017-11-29 | 2017-11-29 | Method and apparatus for detecting malicious webpage tampering | 
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN201711220764.1A CN107835191A (en) | 2017-11-29 | 2017-11-29 | Method and apparatus for detecting malicious webpage tampering | 
Publications (1)
| Publication Number | Publication Date | 
|---|---|
| CN107835191A | 2018-03-23 | 
Family
ID=61646360
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date | 
|---|---|---|---|
| CN201711220764.1A Pending CN107835191A (en) | Method and apparatus for detecting malicious webpage tampering | 2017-11-29 | 2017-11-29 | 
Country Status (1)
| Country | Link | 
|---|---|
| CN (1) | CN107835191A (en) | 
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| US20120284270A1 (en) * | 2011-05-04 | 2012-11-08 | Nhn Corporation | Method and device to detect similar documents | 
| CN102624713A (en) * | 2012-02-29 | 2012-08-01 | 深信服网络科技(深圳)有限公司 | Method and device for website tampering identification | 
| CN103281177A (en) * | 2013-04-10 | 2013-09-04 | 广东电网公司信息中心 | Method and system for detecting hostile attack on Internet information system | 
| CN106528508A (en) * | 2016-10-27 | 2017-03-22 | 乐视控股(北京)有限公司 | Repeated text judgment method and apparatus | 
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| CN108809943A (en) * | 2018-05-14 | 2018-11-13 | 苏州闻道网络科技股份有限公司 | Web publishing method and its device | 
| CN108809943B (en) * | 2018-05-14 | 2021-05-14 | 苏州闻道网络科技股份有限公司 | Website monitoring method and device | 
| CN109474587A (en) * | 2018-11-01 | 2019-03-15 | 北京亚鸿世纪科技发展有限公司 | The method that HTTP based on letter peace system kidnaps monitoring analysis and positioning | 
| CN110008392A (en) * | 2019-03-07 | 2019-07-12 | 北京华安普特网络科技有限公司 | A kind of webpage tamper detection method based on web crawlers technology | 
| CN109978626A (en) * | 2019-03-29 | 2019-07-05 | 上海幻电信息科技有限公司 | Web advertisement change monitoring method, apparatus and storage medium | 
| CN111967064A (en) * | 2020-09-05 | 2020-11-20 | 湖南西盈网络科技有限公司 | Webpage tamper-proofing method and system | 
| CN117056584A (en) * | 2023-10-08 | 2023-11-14 | 杭州海康威视数字技术股份有限公司 | Information system abnormal change monitoring method and equipment based on dynamic similarity threshold | 
| CN117056584B (en) * | 2023-10-08 | 2024-01-16 | 杭州海康威视数字技术股份有限公司 | Information system abnormal change monitoring method and equipment based on dynamic similarity threshold | 
Legal Events
| Date | Code | Title | Description | 
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20180323 |