KR101822629B1

KR101822629B1 - Method and system for performing classification and archiving of emails based on ontology

Info

Publication number: KR101822629B1
Application number: KR1020160112236A
Authority: KR
Inventors: 송석일; 김응진; 문지혜; 전진환
Original assignee: 한국교통대학교산학협력단; 전진환
Priority date: 2016-08-31
Filing date: 2016-08-31
Publication date: 2018-01-26
Anticipated expiration: 2036-08-31

Abstract

The present invention provides a method for classifying and archiving an email based on ontology. The method for classifying and archiving an email based on ontology includes the steps of: determining whether an email is generated; extracting a feature word of the email; expanding the feature word by using the ontology related to the email; classifying the email by using the feature words; generating indexing information of the email according to a classification result; and storing the email according to the indexing information. Accordingly, the present invention can improve work efficiency.

Description

METHOD AND SYSTEM FOR PERFORMING CLASSIFICATION AND ARCHIVING OF EMAILS BASED ON ONTOLOGY

The present invention relates to an ontology-based email classification and archiving method and system.

In general, emails that arise in the course of business at a company have been managed by the individual employee who exchanged them with a business partner or client.

Techniques for managing such individual emails are known in the prior art. For example, Korean Patent Publication No. 10-2009-0118908, titled "Email Management System, Method, and Program" and published on November 18, 2009, discloses a technique for managing the email of a personal account. Korean Patent Publication No. 10-2014-0084316, titled "Email Tag" and published on July 4, 2014, discloses a technique that provides a suggestion of at least one proposed tag based on the content of an email, receives a selection of a selected tag, stores the email in a computer database, thereby creating a stored email, and associates the selected tag with the stored email in the computer database. Techniques by which each individual manages the email of his or her own account are thus known.

However, when an employee leaves a company, the emails that the employee exchanged in connection with a given task are in most cases not passed on to the person who takes over that task. A technique for managing emails at the level of the company or institution is therefore needed.

KR 10-2009-0118908 B (2009.11.18.)
KR 10-2014-0084316 B (2014.07.04.)

The present invention was conceived in view of the above. Its object is to provide an ontology-based email classification and archiving method and system that improve work efficiency by building an ontology of a company's business and using that ontology to classify emails.

Another object of the present invention is to provide an ontology-based email classification and archiving method and system that archive emails selectively by business field and allow archived emails to be searched by business field.

According to one embodiment of the present invention, an ontology-based email classification and archiving method includes: determining whether an email has been generated; extracting feature words of the email; expanding the feature words using an ontology associated with the email; classifying the email using the feature words; generating indexing information for the email according to the classification result; and storing the email according to the indexing information.

The ontology-based email classification and archiving method may further include building the ontology.

The ontology may be built using at least one of words describing a company's departments, a department's tasks or projects, and specific events.

The ontology-based email classification and archiving method may further include tagging the email.

The classification of the email may be performed using a support vector machine (SVM).

According to another embodiment of the present invention, an ontology-based email classification and archiving system includes: a feature word extraction unit that extracts feature words of an email when the email is generated; a feature word expansion unit that expands the feature words using an ontology associated with the email; an email classification unit that classifies the email using the feature words; and an archiving unit that generates indexing information for the email according to the classification result and stores the email according to the indexing information.

The ontology may be built based on at least one of words describing a company's departments, a department's tasks or projects, and specific events.

The ontology-based email classification and archiving system may further include an email tagging unit that tags the email.

The email classification unit may classify the email using a support vector machine (SVM).

According to embodiments of the present invention, work efficiency can be improved by building an ontology of a company's business and using it to classify emails. Work efficiency can also be improved by archiving emails selectively by business field and allowing archived emails to be searched by business field.

FIG. 1 is a block diagram of an ontology-based email classification and archiving system according to an embodiment of the present invention.
FIG. 2 is a diagram schematically showing an example of the ontology of a research department of a company according to the present invention.
FIG. 3 is a diagram showing an example of the ontology of a company according to the present invention.
FIG. 4 is a flowchart of an ontology-based email classification and archiving method according to an embodiment of the present invention.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail below. The effects and features of the present invention, and the ways of achieving them, will become clear from the embodiments described in detail below together with the drawings. The present invention is not, however, limited to the embodiments disclosed below and may be implemented in various forms.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. Identical or corresponding components are given the same reference numerals, and redundant descriptions of them are omitted.

FIG. 1 is a block diagram of an ontology-based email classification and archiving system according to an embodiment of the present invention.

Referring to FIG. 1, an ontology-based email classification and archiving system 100 according to an embodiment of the present invention is connected to a mail server 40 through a communication network 30. The mail server 40 may be, for example, the mail server of a company. Employees of that company can exchange the emails needed for company business with business partners or clients through the mail server 40. Employees can send and receive emails through their own user terminals.

A user terminal may be any terminal capable of data communication and of running applications. FIG. 1 shows a laptop computer 20 and a smartphone 22 as examples of user terminals, but user terminals are not limited to these. Other examples include, without limitation, a handheld computer, a personal digital assistant (PDA), a tablet computer, a desktop computer, a cellular telephone, a combination of any two or more of these data processing devices, or any other suitable data processing device.

Examples of the communication network 30 include a local area network ("LAN") and a wide area network ("WAN"), for example the Internet. The communication network 30 may be implemented using any known network protocol, including various wired or wireless protocols such as Ethernet, Universal Serial Bus (USB), FIREWIRE, Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), code division multiple access (CDMA), time division multiple access (TDMA), Bluetooth, Wi-Fi, voice over Internet Protocol (VoIP), Wi-MAX, or any other suitable communication protocol.

The ontology-based email classification and archiving system 100 includes an interface 110, a feature word extraction unit 120, a feature word expansion unit 130, an email classification unit 140, an email tagging unit 150, and an archiving unit 160. The system 100 may also include an ontology database (DB) 210 and an email database 220.

The interface 110 is connected to the mail server 40 and handles the interface with it. When the interface 110 detects, via the mail server 40, that an email has been generated, it can request that email from the mail server 40 and receive it.

The feature word extraction unit 120 extracts feature words of an email when the email is generated. The feature words may be extracted from the subject and body of the email, or from its sender and recipient information.
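The description does not specify how feature words are selected. As a minimal sketch, assuming English text and a simple frequency ranking (the stop-word list, the function name `extract_feature_words`, and the `top_k` parameter are all hypothetical and not part of the disclosure), extraction from the subject and body could look like this:

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real deployment would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "for", "on", "in", "is", "at"}

def extract_feature_words(subject, body, top_k=3):
    """Return up to top_k of the most frequent non-stop-word tokens."""
    # Lower-case the text and keep runs of letters and digits as tokens.
    tokens = re.findall(r"[a-z0-9]+", f"{subject} {body}".lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]

words = extract_feature_words(
    "Archiving meeting",
    "The archiving meeting date is Friday. Archiving server meeting.",
    top_k=2,
)
# words is ["archiving", "meeting"]
```

Sender and recipient fields could be fed through the same function by concatenating them to the input text; whether that is desirable depends on how addresses are tokenized, which this sketch does not address.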

The feature word expansion unit 130 expands the feature words using an ontology associated with the email. That is, it can obtain further feature words from the ontology associated with the email's feature words.

To this end, the feature word expansion unit 130 is connected to the ontology database 210. The term "ontology" generally refers to knowledge about data elements. A given set of data elements may have one or more associated ontologies. Ontologies have characteristics that make them poorly suited to storage in a database; for example, the knowledge in an ontology is typically less structured than the data elements themselves.

The ontology database 210 can store ontology information built by an operator. For example, the person in charge of each department of a company or institution can build an ontology of that department's work, and can later revise it whenever the department's work changes or is added to, so that the ontology reflects the change.

FIG. 2 is a diagram schematically showing an example of the ontology of a research department of a company according to the present invention. Each department's ontology has a tree-shaped hierarchical structure, as shown in FIG. 2. The research department ontology shown in FIG. 2 includes archiving server development and storage development as subclasses.

Archiving server development includes archiving, email, and Innobrain as subclasses. Storage development includes NVDIMN, storage, and kernel as subclasses. Multiple ontologies of this kind can be linked to one another through a higher-level ontology. The ontology words can be chosen, for example, by the person in charge of a department or project as the main keywords of the business area.

FIG. 3 is a diagram showing an example of the ontology of a company according to the present invention. As shown in FIG. 3, Company A has a sales department, a planning department, and a research department as subclasses. The sales department subclass has a sales department ontology tree, the planning department subclass has a planning department ontology tree, and the research department subclass has a research department ontology tree.
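The tree described for FIGS. 2 and 3 can be held in a very small data structure. The sketch below is only one possible representation, assuming a child-to-parent map; the names `PARENT` and `ancestors` are hypothetical, and the sales and planning subtrees are left empty because the description does not list their contents.

```python
# Child -> parent map for the classes named in the description of FIGS. 2 and 3.
PARENT = {
    "sales department": "Company A",
    "planning department": "Company A",
    "research department": "Company A",
    "archiving server development": "research department",
    "storage development": "research department",
    "archiving": "archiving server development",
    "email": "archiving server development",
    "Innobrain": "archiving server development",
    "NVDIMN": "storage development",
    "storage": "storage development",
    "kernel": "storage development",
}

def ancestors(term):
    """Return the chain of superclasses of term, nearest first."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain

kernel_chain = ancestors("kernel")
# kernel_chain is ["storage development", "research department", "Company A"]
```

A child-to-parent map makes upward walks cheap, which is the direction needed when a word found in an email has to be related to the department it belongs to.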

The ontology of FIG. 2 is built around departments, but the invention is not limited to this. For example, an ontology can be built based on at least one of words describing a company's departments, a department's tasks or projects, and specific events.

The feature word expansion unit 130 expands the feature words using an ontology as described above. Specifically, it extracts further feature words from the ontology to which a feature word extracted from the email belongs as a subclass.

For example, if the feature words of email 1 are archiving, meeting, and date, then expanding them with the ontology can yield the feature words archiving, server, development, project, research department, meeting, and date.
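A minimal sketch of this expansion, assuming the ontology is available as a child-to-parent map (the `PARENT` map and the function `expand_feature_words` are hypothetical). It appends the names of the ancestor classes as whole terms; the example in the description additionally breaks class names into individual words such as "server" and "development", which this sketch does not attempt.

```python
# Hypothetical fragment of the research department ontology (child -> parent).
PARENT = {
    "archiving": "archiving server development",
    "archiving server development": "research department",
}

def expand_feature_words(words, parent):
    """Append the ancestor classes of every word that appears in the ontology."""
    expanded = list(words)
    for word in words:
        node = word
        # Walk upward until the root of the tree is reached.
        while node in parent:
            node = parent[node]
            if node not in expanded:
                expanded.append(node)
    return expanded

expanded = expand_feature_words(["archiving", "meeting", "date"], PARENT)
# expanded is ["archiving", "meeting", "date",
#              "archiving server development", "research department"]
```

Words that do not occur in the ontology ("meeting" and "date" here) pass through unchanged, which matches the example above, where they remain in the expanded list.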

The email classification unit 140 classifies the email using the feature words. Specifically, it classifies the email using a support vector machine (SVM).

A support vector machine (SVM) is a supervised learning model from the field of machine learning, used for pattern recognition and data analysis, mainly for classification and regression. Given a set of data items each belonging to one of two categories, the SVM algorithm builds a non-probabilistic binary linear classification model that decides which category a new data item belongs to. The resulting model is expressed as a boundary in the space into which the data are mapped, and the SVM algorithm finds the boundary with the largest margin. SVMs can be used for nonlinear as well as linear classification.
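For reference, the "boundary with the largest margin" is usually written as the following optimization problem. This is the standard hard-margin formulation from the general SVM literature, not notation taken from this document: the training items are vectors x_i with labels y_i in {-1, +1}, and w and b define the separating boundary w · x + b = 0.

```latex
\min_{\mathbf{w},\,b}\ \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i\,(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1, \qquad i = 1,\dots,n
```

Under these constraints the width of the margin is 2 / ||w||, so minimizing ||w|| maximizes the margin. A new item x is assigned to the category given by the sign of w · x + b.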

For example, the email classification unit 140 can apply a support vector machine (SVM) to the expanded feature words to determine which class of the ontology the email should be assigned to. Email 1, as described above, has the feature words archiving, server, development, project, research department, meeting, and date. Based on these feature words, email 1 can be classified under research department and archiving server development.
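The description names the SVM but gives no training details. The sketch below is therefore an illustration under stated assumptions rather than the disclosed implementation: a two-class linear SVM without a bias term, trained by subgradient descent on the regularized hinge loss, over binary bag-of-words vectors. The vocabulary, the training emails, and all function names are hypothetical. Deciding among more than two ontology classes would need something like one classifier per class, which is not shown.

```python
def vectorize(words, vocabulary):
    """Binary bag-of-words vector over a fixed vocabulary."""
    present = set(words)
    return [1.0 if term in present else 0.0 for term in vocabulary]

def train_linear_svm(samples, labels, epochs=200, lr=0.1, lam=0.01):
    """Subgradient descent on lam/2*||w||^2 + hinge loss (no bias term)."""
    w = [0.0] * len(samples[0])
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            for i in range(len(w)):
                # The hinge term contributes only when the margin is violated.
                grad = lam * w[i] - (y * x[i] if margin < 1 else 0.0)
                w[i] -= lr * grad
    return w

def predict(w, x):
    """Return +1 or -1 according to the side of the boundary x falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

# Hypothetical vocabulary and training emails, already reduced to feature words.
VOCAB = ["archiving", "server", "email", "meeting", "kernel", "storage", "nvdimn"]
TRAIN = [
    (["archiving", "server", "meeting"], +1),  # archiving server development
    (["archiving", "email"], +1),
    (["kernel", "storage"], -1),               # storage development
    (["storage", "nvdimn", "meeting"], -1),
]
X = [vectorize(words, VOCAB) for words, _ in TRAIN]
Y = [label for _, label in TRAIN]
WEIGHTS = train_linear_svm(X, Y)

label = predict(WEIGHTS, vectorize(["archiving", "server", "date"], VOCAB))
# label is 1, i.e. the "archiving server development" side
```

Words outside the vocabulary, such as "date" here, are simply ignored by `vectorize`. In practice a library implementation would normally be preferred over a hand-written trainer.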

The email tagging unit 150 tags the email according to the classification result. For example, the tagging data (tagging information) for email 1 may be (research department, archiving server development). This tagging information is then used when the email is archived and when archived emails are searched.

The archiving unit 160 generates indexing information for the email according to the classification result and stores the email according to the indexing information. The term "archiving" generally means moving data from its original location to another location for safekeeping and, together with backup, is used as one method of preserving data.

Specifically, the archiving unit 160 stores email data in the email database 220, to which it is connected. The email database 220 stores emails according to the index information generated by the archiving unit 160. The index information may be determined from the ontology or the classification result, or it may be determined by a user. For example, if a specific time is chosen as the index information, the archiving unit 160 can store the email in association with that time. Similarly, if a specific project is chosen as the index information, the archiving unit 160 can store the email in association with that project. In other words, the classified emails can be organized according to the chosen index information.
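One way to picture storage "according to the index information" is an inverted index from index values to email identifiers. The sketch below is an in-memory stand-in for the email database, under the assumption that the index information is a set of field/value pairs; the class `EmailArchive`, its methods, and the field names are hypothetical.

```python
from collections import defaultdict

class EmailArchive:
    """In-memory stand-in for the email database and its index."""

    def __init__(self):
        self.emails = {}               # email id -> raw email text
        self.index = defaultdict(set)  # (field, value) -> set of email ids

    def store(self, email_id, text, indexing_info):
        """Store the email and register it under every index entry."""
        self.emails[email_id] = text
        for field, value in indexing_info.items():
            self.index[(field, value)].add(email_id)

    def search(self, field, value):
        """Return the ids of all emails stored under the given index entry."""
        return sorted(self.index.get((field, value), set()))

archive = EmailArchive()
archive.store("m1", "Archiving meeting on Friday",
              {"department": "research department",
               "project": "archiving server development"})
archive.store("m2", "Kernel patch review",
              {"department": "research department"})

hits = archive.search("department", "research department")
# hits is ["m1", "m2"]
```

The same structure accommodates a time-based index (for example a hypothetical field "month" with value "2016-08") without any change to the class, which is in keeping with the statement that the index information may be chosen by the user.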

FIG. 4 is a flowchart of an ontology-based email classification and archiving method according to an embodiment of the present invention.

Referring to FIG. 4, the ontology-based email classification and archiving system 100 determines in step 310 whether an email has been generated. The system 100 can make this determination through the mail server 40 to which it is connected.

When an email has been generated, the system 100 extracts feature words of the email in step 320. The feature words may be extracted from the subject and body of the email, or from its sender and recipient information.

In step 330, the system 100 expands the feature words using an ontology associated with the email. That is, the system 100 can obtain further feature words from the ontology associated with the email's feature words.

In step 340, the system 100 classifies the email using the feature words. For example, the system 100 can use a support vector machine (SVM) to determine which class of the ontology the email should be assigned to.

Next, in step 350, the system 100 tags the email according to the classification result. Tagging means attaching related words to an email; that is, it associates with the email at least one word that conveys information about it. This tagging information is later used for archiving and for searching archived emails.

In step 360, the system 100 generates indexing information for the email according to the classification result and stores the email according to the indexing information.

Specifically, the system 100 stores the email data in the email database 220. The index information may be determined from the ontology or the classification result, or it may be determined by a user. For example, if a specific time is chosen as the index information, the system 100 can store the email in association with that time.
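The order of steps 310 to 360 can be summarized in a few lines. The sketch below shows only the control flow, assuming each step is supplied as a callable; the function `run_pipeline`, its parameters, and the stand-in lambdas in the usage lines are hypothetical placeholders rather than the disclosed units.

```python
def run_pipeline(new_emails, extract, expand, classify, store):
    """Apply steps 320 to 360 to each email yielded by new_emails (step 310)."""
    results = []
    for email in new_emails:             # step 310: an email has been generated
        words = extract(email)           # step 320: extract feature words
        words = expand(words)            # step 330: expand them via the ontology
        classes = classify(words)        # step 340: classify the email
        tags = tuple(classes)            # step 350: tag with the class labels
        indexing_info = {"tags": tags}   # step 360: generate indexing info ...
        store(email, indexing_info)      # ... and store the email accordingly
        results.append(indexing_info)
    return results

# Usage with trivial stand-ins for each step.
stored = []
out = run_pipeline(
    ["archiving meeting"],
    extract=lambda e: e.split(),
    expand=lambda ws: ws + ["research department"],
    classify=lambda ws: ["research department"] if "research department" in ws else [],
    store=lambda e, info: stored.append((e, info)),
)
# out is [{"tags": ("research department",)}]
```

Passing the steps in as callables keeps the ordering visible while leaving each unit free to be replaced, for example by swapping the classifier without touching extraction or storage.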

Although the detailed description of the present invention has focused on preferred embodiments with reference to the accompanying drawings, various modifications are of course possible without departing from the scope of the invention. The scope of the present invention should therefore not be limited to the described embodiments, but should be determined by the claims below and their equivalents.

Claims

An ontology-based email classification and archiving method, comprising:
determining whether an email sent to or received from a mail server occurs;
extracting feature words of the email;
expanding the feature words using an ontology associated with the email;
classifying the email using the feature words;
generating indexing information of the email according to the classification result; and
storing the email according to the indexing information.

The ontology-based email classification and archiving method of claim 1, further comprising: constructing the ontology.

The ontology-based email classification and archiving method of claim 2, wherein the constructing of the ontology is performed using at least one of words describing a department of an enterprise, a task or project of the department, and a specific event.

The ontology-based email classification and archiving method of claim 1, further comprising performing tagging on the email.

The ontology-based email classification and archiving method of claim 1, wherein the step of classifying the e-mail is performed using a support vector machine (SVM).

An ontology-based email classification and archiving system, comprising:
a feature word extraction unit that extracts feature words of an email when an email transmitted to or received from a mail server occurs;
a feature word expansion unit that expands the feature words using an ontology associated with the email;
an email classification unit that classifies the email using the feature words; and
an archiving unit that generates indexing information of the email according to the classification result and stores the email according to the indexing information.

The ontology-based email classification and archiving system according to claim 6, wherein the ontology is constructed based on at least one of words describing a department of an enterprise, a task or project of the department, and a specific event.

The system of claim 6, further comprising an email tagging unit for performing tagging on the email.

The system of claim 6, wherein the email classifier classifies the email using a support vector machine (SVM).