CN113761401A - Method and device for determining website root domain name - Google Patents

Method and device for determining website root domain name

Info

Publication number
CN113761401A
Authority
CN
China
Prior art keywords
domain name
cookie
current domain
cookie file
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010685974.3A
Other languages
Chinese (zh)
Inventor
谢航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202010685974.3A
Publication of CN113761401A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566 URL specific, e.g. using aliases, detecting broken or misspelled links
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a method and a device for determining a website root domain name, and relates to the technical field of computers. One specific embodiment of the method includes: extracting the domain name part of a target URL; selecting the two rightmost elements of the domain name part to construct a current domain name; performing an operation of creating a cookie file under the current domain name; after the operation is completed, judging whether the cookie file exists under the current domain name: if so, determining the current domain name as the website root domain name corresponding to the target URL; otherwise, splicing the rightmost unselected element of the domain name part onto the current domain name to form a new current domain name, and performing the creating and judging steps again; wherein the elements of each current domain name are arranged in the same order as in the domain name part. This embodiment can accurately obtain the website root domain name in the target URL without introducing external files.

Figure 202010685974

Description

Method and device for determining website root domain name
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for determining a website root domain name.
Background
Internet technologies such as cross-domain tracking of user behavior and single sign-on depend on the website root domain name, so it is very important to accurately obtain the website root domain name from a URL (Uniform Resource Locator). Currently, the website root domain name is generally obtained through a regular expression based on top-level domain names such as "com", "net", and the like.
In the process of implementing the invention, the inventor found that the prior art has at least the following problem: the existing method based on regular expressions needs to introduce in advance a library file containing all current top-level domain names; this file occupies a large amount of space and can easily degrade the loading performance of a website.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for determining a website root domain name, which can accurately obtain the website root domain name in a target URL without introducing an external file.
To achieve the above object, according to one aspect of the present invention, a method for determining a root domain name of a website is provided.
The method for determining the website root domain name in the embodiment of the invention comprises the following steps: extracting a domain name part in the target URL; selecting two elements at the rightmost end of the domain name part to construct a current domain name; executing operation of creating cookie files under the current domain name; after the operation is finished, judging whether the cookie file exists under the current domain name: if yes, determining the current domain name as a website root domain name corresponding to the target URL; otherwise, splicing the right-most end element which is not selected in the domain name part with the current domain name to form a new current domain name, and executing the steps of creating and judging again; and the arrangement sequence of the elements in each current domain name is consistent with the domain name part.
Optionally, extracting the domain name part in the target URL includes: extracting the domain name part using a document domain name attribute of a browser script.
Optionally, performing the operation of creating a cookie file under the current domain name includes: assigning customized cookie file identification data and the current domain name to a cookie related attribute of the browser script.
Optionally, judging whether the cookie file exists under the current domain name includes: reading the cookie file information under the current domain name based on the cookie related attribute of the browser script, and judging whether the cookie file identification data exists in the returned data.
Optionally, the method further comprises: after determining the website root domain name corresponding to the target URL, assigning the cookie file identification data used when creating the cookie file, together with expiration time data with a value of zero, to the cookie related attribute of the browser script, so as to delete the created cookie file.
Optionally, the browser script is JavaScript; the document domain name attribute is document.domain; the cookie related attribute is document.cookie; and the cookie file identification data comprises the name of the cookie file and the value of the cookie file.
To achieve the above object, according to another aspect of the present invention, there is provided an apparatus for determining a root domain name of a website.
The device for determining the website root domain name in the embodiment of the invention can comprise: an extracting unit configured to extract a domain name part in the target URL; the current domain name constructing unit is used for selecting two elements at the rightmost end of the domain name part to construct a current domain name; the cookie creating unit is used for executing the operation of creating the cookie file under the current domain name; a determination unit configured to: after the operation is finished, judging whether the cookie file exists under the current domain name: if yes, determining the current domain name as a website root domain name corresponding to the target URL; otherwise, splicing the right-most end element which is not selected in the domain name part with the current domain name to form a new current domain name, and executing the steps of creating and judging again; and the arrangement sequence of the elements in each current domain name is consistent with the domain name part.
Optionally, the cookie creating unit may be further configured to: assign customized cookie file identification data and the current domain name to a cookie related attribute of the browser script; the determining unit may be further configured to: read the cookie file information under the current domain name based on the cookie related attribute of the browser script, and judge whether the cookie file identification data exists in the returned data.
To achieve the above object, according to still another aspect of the present invention, there is provided an electronic apparatus.
An electronic device of the present invention includes: one or more processors; the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors implement the method for determining the root domain name of the website provided by the invention.
To achieve the above object, according to still another aspect of the present invention, there is provided a computer-readable storage medium.
A computer-readable storage medium of the present invention has stored thereon a computer program which, when executed by a processor, implements the method of determining a root domain name of a web site provided by the present invention.
According to the technical scheme of the invention, the embodiment of the invention has the following advantages or beneficial effects: first, the domain name part of the target URL is extracted; then the two rightmost elements of the domain name part are selected to construct a current domain name, and an attempt is made to create a cookie under the current domain name. If the creation succeeds, the current domain name is the website root domain name corresponding to the target URL. If the creation fails, the current domain name is not legal; in this case, the rightmost unselected element of the domain name part can be spliced onto the current domain name to form a new current domain name, and the steps of creating the cookie file and judging whether the creation succeeded are repeated until the website root domain name is obtained. The method obtains the website root domain name by exploiting the fact that a browser can only write a cookie under a legal domain name, and thus accurately identifies the website root domain name without introducing an external library file.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram illustrating the main steps of a method for determining a root domain name of a website according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating an implementation of a method for determining a root domain name of a website according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a portion of an apparatus for determining a root domain name of a website according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device for implementing the method for determining a root domain name of a website in the embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that the embodiments of the present invention and the technical features of the embodiments may be combined with each other without conflict.
Fig. 1 is a schematic diagram illustrating the main steps of a method for determining a root domain name of a website according to an embodiment of the present invention.
As shown in fig. 1, the method for determining a root domain name of a website according to the embodiment of the present invention may be specifically executed according to the following steps:
step S101: the domain name portion in the target URL is extracted.
In the embodiment of the present invention, the website root domain name refers to the root domain name of a site, which is different from the "root domain name" in "the 13 root domain name servers around the world". For example, the domain name of a site search service is "www.test.com", the domain name of a translation service is "fanyi. [...]". It can be seen that any website domain name is composed of multiple elements (typically, one element is a word), and any two adjacent elements are separated by the separator ".". Generally, for any website domain name, the rightmost element may be referred to as the top-level domain name (also referred to as the first-level domain name), and the second-level domain name, the third-level domain name, the fourth-level domain name, and so on follow in sequence from right to left. At present, the domain name system on the internet has three types of top-level domain names, namely category top-level domain names (such as com, net, org, gov and the like), geographic top-level domain names (such as cn, uk and the like), and new top-level domain names (such as aero, biz, coop and the like).
In this step, the domain name part of the target URL may be extracted first. It will be appreciated that a web page URL generally consists of a protocol part (e.g., http:), a domain name part, a port part (e.g., 8080), a virtual directory part (e.g., /a/b), a parameter part, and the like. In some embodiments, the domain name part may be extracted using the document domain name attribute (i.e., document.domain) of a browser script (e.g., JavaScript); for example, the domain name part may be obtained by entering javascript:alert(document.domain) in the address bar of the page corresponding to the target URL. In front-end technology, alert is a method for displaying a specified message.
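As a minimal sketch of step S101, the domain name part can also be obtained from a URL string with the standard URL class, which is available in browsers and in Node.js. The helper name extractDomainPart is hypothetical and does not appear in the embodiment, which reads document.domain on the page itself.

```javascript
// Hypothetical helper: returns the domain name part of a URL string.
// Inside a page, the embodiment reads document.domain instead.
function extractDomainPart(targetUrl) {
  // URL parsing separates the protocol, host name, port, path and query.
  return new URL(targetUrl).hostname;
}

console.log(extractDomainPart("http://www.test.com.cn:8080/a/b?x=1")); // www.test.com.cn
```

The port, virtual directory, and parameter parts are discarded by the parser, so only the element sequence needed by the later steps remains.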
Step S102: and selecting two elements at the rightmost end of the domain name part to construct the current domain name.
After the domain name part of the target URL is obtained, the website root domain name can be obtained by exploiting the fact that a browser can only write cookies under a legal domain name. Because the website root domain name necessarily consists of two or more elements at the right end of the domain name part, this step selects the two rightmost elements of the domain name part to construct the current domain name. It will be appreciated that the current domain name must be constructed while maintaining the order of the elements in the domain name part and retaining the separator. For example, if the domain name part extracted from the target URL is "www.test.com.cn", the current domain name constructed in this step is "com.cn".
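The construction of the current domain name can be sketched as follows. The function name buildCurrentDomain and its count parameter are assumptions made for illustration; count is the number of rightmost elements to keep, which is 2 in step S102 and grows by one each time a new current domain name is formed.

```javascript
// Hypothetical helper: join the rightmost `count` elements of the domain
// name part, keeping their original order and the "." separator.
function buildCurrentDomain(domainPart, count) {
  const elements = domainPart.split(".");
  if (count < 2 || count > elements.length) {
    throw new RangeError("count must be between 2 and the number of elements");
  }
  return elements.slice(-count).join(".");
}

console.log(buildCurrentDomain("www.test.com.cn", 2)); // com.cn
console.log(buildCurrentDomain("www.test.com.cn", 3)); // test.com.cn
```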
Step S103: the operation of creating the cookie file is performed under the current domain name.
In the field of computer technology, a cookie refers to a text file stored on a user terminal and the data in it; a server can recognize the user state from the cookie information carried when the user terminal sends a request. Preferably, in the embodiment of the present invention, the cookie related attribute in JavaScript (i.e., document.cookie) may be used to attempt to create and save a cookie file. Specifically, custom cookie file identification data (e.g., the name of the cookie file and the value of the cookie file) and the current domain name may be assigned to the document.cookie attribute. For example, the code document.cookie = "_cookie_test=1; domain=com.cn" performs the above assignment. It is to be understood that in the above code "_cookie_test" is the name of the cookie file to be created, "1" is the value of the cookie file to be created, "domain" is the domain name attribute of the cookie file, and "com.cn" is the current domain name.
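A minimal sketch of step S103 follows. The helper tryCreateCookie and the recording object fakeDoc are hypothetical; fakeDoc only stands in for the browser's document so that the string being assigned can be inspected outside a page.

```javascript
// Hypothetical helper: try to create a probe cookie under `currentDomain`.
// `doc` is any object with a `cookie` property; in a page it is `document`.
function tryCreateCookie(doc, currentDomain, name = "_cookie_test", value = "1") {
  // Assigning to document.cookie adds or updates a single cookie. A browser
  // ignores the write without raising an error when it does not accept the
  // domain attribute, which is why a separate read-back step is needed.
  doc.cookie = `${name}=${value}; domain=${currentDomain}`;
}

// Recording stand-in used only to show what would be assigned.
const written = [];
const fakeDoc = {
  set cookie(text) { written.push(text); },
  get cookie() { return ""; },
};
tryCreateCookie(fakeDoc, "com.cn");
console.log(written[0]); // _cookie_test=1; domain=com.cn
```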
Step S104: and after the operation is finished, judging whether the cookie file exists under the current domain name.
In this step, after the operation of creating the cookie file is completed, the cookie file information under the current domain name may be read based on the document.cookie attribute. If the cookie file identification data exists in the returned data, the cookie file was created successfully and the corresponding current domain name is legal, so the current domain name is determined as the website root domain name corresponding to the target URL (i.e., step S105). Generally, all cookie file information under the current domain name can be obtained by reading the document.cookie attribute.
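The check in step S104 can be sketched as a search of the string returned by document.cookie, which lists cookies as name=value pairs separated by "; ". The helper name hasCookie is an assumption for illustration.

```javascript
// Hypothetical helper: report whether `name=value` appears in a
// document.cookie style string such as "sid=abc; _cookie_test=1".
function hasCookie(cookieString, name = "_cookie_test", value = "1") {
  return cookieString
    .split(";")
    .map((pair) => pair.trim())
    .includes(`${name}=${value}`);
}

console.log(hasCookie("sid=abc; _cookie_test=1")); // true
console.log(hasCookie("sid=abc")); // false
console.log(hasCookie("")); // false
```

Comparing whole pairs rather than using a substring search avoids a false match on a cookie whose name merely ends with the probe name.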
If the cookie file identification data does not exist in the returned data, indicating that the creation of the cookie file fails, at this time, the rightmost element that is not selected in the domain name part may be spliced to the current domain name to form a new current domain name, and the aforementioned steps of creating the cookie file and determining whether the creation of the cookie file is successful are performed again until the website root domain name is determined (i.e., step S106).
For example, under the current domain name "com.cn", document.cookie = "_cookie_test=1; domain=com.cn" is executed to try to create a cookie file, and document.cookie is then read to obtain the cookie file information under the current domain name "com.cn". In practical application, the corresponding cookie file identification data "_cookie_test=1" is not found in the returned data, which indicates that the preceding attempt to create the cookie file failed. In this case, the rightmost element "test" that has not been selected in the domain name part "www.test.com.cn" may be spliced onto the current domain name "com.cn" to form a new current domain name "test.com.cn", and the steps of creating a cookie file and judging whether the creation succeeded are executed again under the new current domain name "test.com.cn"; that is, document.cookie = "_cookie_test=1; domain=test.com.cn" is executed to try to create a cookie file, and document.cookie is then read to obtain the cookie file information under the current domain name "test.com.cn". In practical application, the cookie file identification data "_cookie_test=1" is found in the returned string data, which indicates that the current domain name is legal, and the current domain name "test.com.cn" is determined as the website root domain name.
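The loop of steps S102 to S106 can be sketched end to end as below. Everything here is illustrative: findRootDomain is a hypothetical name, and makeFakeDocument is a stand-in for the browser's document that silently drops writes for an assumed list of rejected domains, mimicking how a browser refuses a cookie for a suffix such as "com.cn". In a page, the real document object would be passed instead.

```javascript
// Stand-in for `document`, used only in this sketch: it accepts a cookie
// write unless the domain attribute is listed in `rejected`.
function makeFakeDocument(rejected) {
  const jar = new Map();
  return {
    get cookie() {
      return [...jar].map(([k, v]) => `${k}=${v}`).join("; ");
    },
    set cookie(text) {
      const [pair, ...attrs] = text.split(";").map((s) => s.trim());
      const domainAttr = attrs.find((a) => a.toLowerCase().startsWith("domain="));
      const domain = domainAttr ? domainAttr.slice("domain=".length) : "";
      if (rejected.includes(domain)) return; // write silently ignored
      const eq = pair.indexOf("=");
      jar.set(pair.slice(0, eq), pair.slice(eq + 1));
    },
  };
}

// Hypothetical implementation of steps S102 to S106.
function findRootDomain(domainPart, doc) {
  const elements = domainPart.split(".");
  const probe = "_cookie_test=1";
  // Start with the two rightmost elements and add one element per round.
  for (let count = 2; count <= elements.length; count++) {
    const current = elements.slice(-count).join(".");
    doc.cookie = `${probe}; domain=${current}`; // step S103
    const created = doc.cookie
      .split(";")
      .map((s) => s.trim())
      .includes(probe); // step S104
    if (created) return current; // step S105
  }
  return null; // no candidate accepted the cookie
}

const doc = makeFakeDocument(["com.cn"]);
console.log(findRootDomain("www.test.com.cn", doc)); // test.com.cn
```

One caveat of this sketch: if a probe cookie with the same name and value is already visible to the page from an earlier run, the first candidate would appear to succeed, so a unique probe value or a prior cleanup is advisable.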
In a specific application, if the current domain name is illegal, the cookie file cannot be created under the current domain name, and no cookie file exists under the current domain name; if the current domain name is legal, creating a cookie file under it generally succeeds, and other cookie files may exist under the current domain name in addition to the one just created. Thus, in some embodiments, whether the cookie file was created successfully may be determined as follows: if the data returned after reading the document.cookie attribute is empty, the creation failed; if the returned data is not empty, the creation succeeded.
Preferably, since the cookie file created in the above process does not serve the usual purpose of a cookie file, it may be deleted after the website root domain name corresponding to the target URL is determined. Specifically, the cookie file identification data used when creating the cookie file and expiration time data whose value is zero (maxAge set to 0) may be assigned to the document.cookie attribute to delete the created cookie file. It is understood that maxAge in the above instruction is the expiration time attribute of the cookie file.
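The cleanup can be sketched as follows. The helper deleteProbeCookie and the recording object are hypothetical. The text above names the expiration attribute maxAge; in the string assigned to document.cookie, browsers spell this attribute max-age, so the sketch uses that spelling.

```javascript
// Hypothetical helper: expire the probe cookie created under `rootDomain`.
function deleteProbeCookie(doc, rootDomain, name = "_cookie_test", value = "1") {
  // max-age is a lifetime in seconds; 0 expires the cookie immediately.
  // The domain attribute must match the one used when the cookie was created.
  doc.cookie = `${name}=${value}; domain=${rootDomain}; max-age=0`;
}

// Recording stand-in for `document`, used only to show the assigned string.
const assigned = [];
const recorder = {
  set cookie(text) { assigned.push(text); },
  get cookie() { return ""; },
};
deleteProbeCookie(recorder, "test.com.cn");
console.log(assigned[0]); // _cookie_test=1; domain=test.com.cn; max-age=0
```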
Fig. 2 is a schematic diagram of a specific implementation of the method for determining a website root domain name in the embodiment of the present invention, which specifically executes the following steps:
Step S201: after the domain name part is extracted from the target URL, a preset regular expression is matched against the domain name part. In this step, a small number of regular expressions can be written based on the most common top-level domain names com, cn, com. [...]
Step S202: judge whether the matching succeeds; if so, the website root domain name is obtained (i.e., step S205) and the process ends.
Step S203: if the matching fails, construct the current domain name according to the foregoing method and try to create a cookie file under the current domain name.
Step S204: judge whether the cookie file was created successfully using the foregoing method.
Step S205: if the creation succeeds, the website root domain name is obtained and the process ends.
Step S206: if the creation fails, construct a new current domain name using the foregoing method, and execute step S203 and step S204 again until the website root domain name is finally obtained.
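The flow of Fig. 2 can be sketched as a regular-expression shortcut followed by a fallback. The pattern COMMON_ROOT is an assumption made for illustration, since the text does not give the actual regular expressions; it covers only "com" and "com.cn" so that other suffixes fall through to the cookie-based steps. The callback probeWithCookies is a hypothetical placeholder for steps S203 to S206.

```javascript
// Assumed pattern, for illustration only: one label followed by "com.cn"
// or "com" at the end of the domain name part.
const COMMON_ROOT = /(?:^|\.)([a-z0-9-]+\.(?:com\.cn|com))$/i;

// `probeWithCookies` is a hypothetical callback for steps S203 to S206.
function rootDomainWithShortcut(domainPart, probeWithCookies) {
  const match = COMMON_ROOT.exec(domainPart); // step S201
  if (match) return match[1]; // steps S202 and S205
  return probeWithCookies(domainPart); // steps S203 to S206
}

// Placeholder fallback that only marks which path was taken.
const fallback = (d) => `cookie-probe:${d}`;
console.log(rootDomainWithShortcut("www.test.com.cn", fallback)); // test.com.cn
console.log(rootDomainWithShortcut("www.test.co.uk", fallback)); // cookie-probe:www.test.co.uk
```

Keeping the pattern narrow is a deliberate choice in this sketch: a shortcut that guessed at suffixes it does not know would return a wrong root domain name instead of deferring to the cookie check.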
According to the technical scheme of the embodiment of the invention, first, the domain name part of the target URL is extracted; then the two rightmost elements of the domain name part are selected to construct a current domain name, and an attempt is made to create a cookie under the current domain name. If the creation succeeds, the current domain name is the website root domain name corresponding to the target URL. If the creation fails, the current domain name is not legal; in this case, the rightmost unselected element of the domain name part can be spliced onto the current domain name to form a new current domain name, and the steps of creating the cookie file and judging whether the creation succeeded are repeated until the website root domain name is obtained. The method obtains the website root domain name by exploiting the fact that a browser can only write a cookie under a legal domain name, and thus accurately identifies the website root domain name without affecting the loading performance of the website.
It should be noted that, for convenience of description, the foregoing method embodiments are described as a series of acts, but those skilled in the art will appreciate that the present invention is not limited by the order of acts described, and that some steps may in fact be performed in other orders or concurrently. Moreover, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the acts and modules involved are not necessarily required to implement the invention.
To facilitate a better implementation of the above-described aspects of embodiments of the present invention, the following also provides relevant means for implementing the above-described aspects.
Referring to fig. 3, an apparatus 300 for determining a root domain name of a web site according to an embodiment of the present invention may include: an extracting unit 301, a current domain name constructing unit 302, a cookie creating unit 303, and a judging unit 304.
Wherein, the extracting unit 301 may be configured to extract a domain name part in the target URL; the current domain name constructing unit 302 may be configured to select two elements at the rightmost end of the domain name part to construct a current domain name; the cookie creating unit 303 may be configured to perform an operation of creating a cookie file under the current domain name; the determining unit 304 is operable to: after the operation is finished, judging whether the cookie file exists under the current domain name: if yes, determining the current domain name as a website root domain name corresponding to the target URL; otherwise, splicing the right-most end element which is not selected in the domain name part with the current domain name to form a new current domain name, and executing the steps of creating and judging again; and the arrangement sequence of the elements in each current domain name is consistent with the domain name part.
In an embodiment of the present invention, the cookie creating unit 303 may be further configured to: assign customized cookie file identification data and the current domain name to a cookie related attribute of the browser script; the determining unit 304 may be further configured to: read the cookie file information under the current domain name based on the cookie related attribute of the browser script, and judge whether the cookie file identification data exists in the returned data.
In a specific application, the extracting unit 301 may further be configured to: extract the domain name part using a document domain name attribute of the browser script.
As a preferred aspect, the apparatus 300 may further include a cookie deletion unit configured to: after the website root domain name corresponding to the target URL is determined, assign the cookie file identification data used when creating the cookie file, together with expiration time data with a value of zero, to the cookie related attribute of the browser script, so as to delete the created cookie file.
In addition, in the embodiment of the present invention, the browser script is JavaScript; the document domain name attribute is document.domain; the cookie related attribute is document.cookie; and the cookie file identification data comprises the name of the cookie file and the value of the cookie file.
The invention also provides the electronic equipment. The electronic device of the embodiment of the invention comprises: one or more processors; the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors implement the method for determining the root domain name of the website provided by the invention.
Referring now to FIG. 4, a block diagram of a computer system 400 suitable for use with the electronic device implementing an embodiment of the invention is shown. The electronic device shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in FIG. 4, the computer system 400 includes a Central Processing Unit (CPU) 401 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage section 408 into a Random Access Memory (RAM) 403. The RAM 403 also stores various programs and data necessary for the operation of the computer system 400. The CPU 401, ROM 402, and RAM 403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
The following components are connected to the I/O interface 405: an input section 406 including a keyboard, a mouse, and the like; an output section 407 including a display device such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 408 including a hard disk and the like; and a communication section 409 including a network interface card such as a LAN card, a modem, or the like. The communication section 409 performs communication processing via a network such as the internet. A drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 410 as necessary, so that a computer program read from it can be installed into the storage section 408 as necessary.
In particular, the processes described in the main step diagrams above may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the main step diagram. In the above-described embodiment, the computer program can be downloaded and installed from a network through the communication section 409, and/or installed from the removable medium 411. The computer program performs the above-described functions defined in the system of the present invention when executed by the central processing unit 401.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented in software or in hardware. The described units may also be provided in a processor, which may be described as follows: a processor includes an extracting unit, a current domain name constructing unit, a cookie creating unit, and a judging unit. The names of these units do not, in some cases, limit the units themselves; for example, the extracting unit may also be described as "a unit that provides a domain name part to the current domain name constructing unit".
According to the technical solution of the embodiment of the invention, the domain name part of the target URL is first extracted, and the two rightmost elements of the domain name part are selected to construct a current domain name. An attempt is then made to create a cookie under the current domain name. If the creation succeeds, the current domain name is the website root domain name corresponding to the target URL. If the creation fails, the current domain name is not a legal domain name for writing cookies (for example, it is only a public suffix such as "com.cn"); in that case, the rightmost element of the domain name part that has not yet been selected is spliced onto the left of the current domain name to form a new current domain name, and the steps of creating the cookie file and judging whether the creation succeeded are repeated until the website root domain name is obtained. The method exploits the fact that a browser only allows a cookie to be written under a legal domain name, and thereby identifies the website root domain name accurately without affecting the loading performance of the website.
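As a non-limiting illustration, the loop summarized above might be sketched in JavaScript as follows. The function name `findRootDomain`, the constants `PROBE_NAME` and `PROBE_VALUE`, and the `doc` parameter are hypothetical names introduced only for this example; the use of an `expires` date at the epoch to represent "expiration time data with a value of zero" and the `path=/` attribute are likewise assumptions rather than details taken from the embodiments.

```javascript
// Minimal sketch of the cookie-probing loop; not a definitive implementation.
// All names below are hypothetical and chosen for illustration only.
const PROBE_NAME = "__root_domain_probe";
const PROBE_VALUE = "1";

// `hostname` is the domain name part of the target URL.
// `doc` is any object exposing a browser-like `cookie` property,
// for example the global `document` when running in a browser.
function findRootDomain(hostname, doc) {
  const parts = hostname.split(".");
  const probe = PROBE_NAME + "=" + PROBE_VALUE;

  // Start with the two rightmost elements, then prepend one more
  // element per round, preserving the original order of the elements.
  for (let count = 2; count <= parts.length; count++) {
    const candidate = parts.slice(-count).join(".");

    // Attempt to create the cookie file under the current domain name.
    doc.cookie = probe + "; domain=" + candidate + "; path=/";

    // Read the cookies back and look for the identification data.
    const created = doc.cookie.split("; ").includes(probe);
    if (created) {
      // Assumption: an expiry at the epoch stands in for "value of zero".
      const expired = new Date(0).toUTCString();
      doc.cookie =
        probe + "; domain=" + candidate + "; path=/; expires=" + expired;
      return candidate;
    }
  }

  // No candidate accepted the cookie (e.g. a single-label host name).
  return null;
}
```

In a browser, such a function could be invoked as `findRootDomain(document.domain, document)`. Passing the document object as a parameter is a design choice made here so that the loop can also be exercised against a stand-in object outside a browser. The sketch does not handle a pre-existing cookie that happens to carry the same name and value as the probe.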
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for determining a root domain name of a web site, comprising:
extracting a domain name part from a target URL;
selecting the two rightmost elements of the domain name part to construct a current domain name;
performing an operation of creating a cookie file under the current domain name;
after the operation is finished, judging whether the cookie file exists under the current domain name:
if yes, determining the current domain name as the website root domain name corresponding to the target URL;
otherwise, splicing the rightmost element of the domain name part that has not yet been selected onto the current domain name to form a new current domain name, and performing the creating and judging steps again; wherein the elements in each current domain name are arranged in the same order as in the domain name part.
2. The method of claim 1, wherein extracting the domain name part from the target URL comprises:
extracting the domain name part using a document domain name attribute of a browser script.
3. The method of claim 2, wherein performing the operation of creating the cookie file under the current domain name comprises:
assigning customized cookie file identification data and the current domain name to a cookie related attribute of the browser script.
4. The method of claim 3, wherein judging whether the cookie file exists under the current domain name comprises:
reading cookie file information under the current domain name based on the cookie related attribute of the browser script, and judging whether the cookie file identification data exists in the returned data.
5. The method of claim 3, further comprising:
after determining the website root domain name corresponding to the target URL, assigning the cookie file identification data used when creating the cookie file, together with expiration time data having a value of zero, to the cookie related attribute of the browser script, so as to delete the created cookie file.
6. The method according to any one of claims 3 to 5, wherein:
the browser script is JavaScript;
the document domain name attribute is document.
the cookie related attribute is document.
the cookie file identification data comprises: the name of the cookie file and the value of the cookie file.
7. An apparatus for determining a root domain name of a web site, comprising:
an extracting unit configured to extract a domain name part from a target URL;
a current domain name constructing unit configured to select the two rightmost elements of the domain name part to construct a current domain name;
a cookie creating unit configured to perform an operation of creating a cookie file under the current domain name;
a judging unit configured to: after the operation is finished, judge whether the cookie file exists under the current domain name; if yes, determine the current domain name as the website root domain name corresponding to the target URL; otherwise, splice the rightmost element of the domain name part that has not yet been selected onto the current domain name to form a new current domain name, and perform the creating and judging steps again; wherein the elements in each current domain name are arranged in the same order as in the domain name part.
8. The apparatus of claim 7, wherein:
the cookie creating unit is further configured to: assign customized cookie file identification data and the current domain name to a cookie related attribute of a browser script;
the judging unit is further configured to: read cookie file information under the current domain name based on the cookie related attribute of the browser script, and judge whether the cookie file identification data exists in the returned data.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-6.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method of any one of claims 1-6.
CN202010685974.3A 2020-07-16 2020-07-16 Method and device for determining website root domain name Pending CN113761401A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010685974.3A CN113761401A (en) 2020-07-16 2020-07-16 Method and device for determining website root domain name

Publications (1)

Publication Number Publication Date
CN113761401A (en) 2021-12-07

Family

ID=78785520

Country Status (1)

Country Link
CN (1) CN113761401A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1781087A (en) * 2003-04-08 2006-05-31 丛林网络公司 Method and system for secure access to a private network with client reception
CN108833603A (en) * 2018-05-28 2018-11-16 北京奇虎科技有限公司 A method, server and system for implementing domain name resolution based on blockchain

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
文蔺: "笔记:如何获取网站根域名" (Notes: How to Obtain a Website's Root Domain Name), pages 1 - 5, Retrieved from the Internet <URL:https://segmentfault.com/a/1190000009211044> *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211207
