KR102425013B1

KR102425013B1 - System for predicting carbon credit prices using search volume analysis and multiple regression analysis, and method performed by the same

Info

Publication number: KR102425013B1
Application number: KR1020200142026A
Authority: KR
Inventors: 한승우; 김현호; 김유진; 임기성; 이민우
Original assignee: 인하대학교 산학협력단
Priority date: 2020-10-29
Filing date: 2020-10-29
Publication date: 2022-07-25
Anticipated expiration: 2040-10-29
Also published as: KR20220057121A

Abstract

The present invention relates to a carbon credit price prediction system using search volume analysis and multiple regression analysis, and to a carbon credit price prediction method performed by the system.
According to the present invention, the method includes: extracting weekly search frequency data for a plurality of search terms related to carbon credits; collecting weekly closing price data of the carbon credits from the Korea Exchange; building a database by storing the weekly search frequency data and the weekly closing price data for each week; calculating a correlation index of the weekly search frequency for each week by cross-correlating the weekly search frequency data with the weekly closing price data stored in the database; when the calculated correlation index is smaller than a reference value, deleting the weekly search frequency data of the corresponding week from the database, and applying the weekly search frequency data and the weekly closing price data to multiple regression analysis to extract a plurality of regression models for each week; calculating a goodness of fit for each of the regression models extracted for each week and selecting, for each week, the regression model with the greatest goodness of fit; applying the regression model selected for each week to prediction error analysis to calculate a mean absolute percentage error, and selecting the regression model with the smallest mean absolute percentage error as the final prediction model; and receiving a plurality of search terms of interest related to carbon credits at the current time and applying them to the final prediction model to predict the final carbon credit price.
As described above, according to the present invention, the price of carbon credits can be predicted from search terms related to carbon credits, so that a budget reflecting accurate environmental charges at a construction site can be calculated in a way that tracks the changing carbon credit market.

Description

Carbon credit price prediction system using search volume analysis and multiple regression analysis, and carbon credit price prediction method performed by it

The present invention relates to a carbon credit price prediction system using search volume analysis and multiple regression analysis and to a carbon credit price prediction method performed by the system, and more particularly to a system and method that accurately predict the price of carbon credits using search term data and carbon credit price data.

Since the introduction of international carbon emissions trading, interest in carbon emissions has been growing worldwide. The Republic of Korea has operated a carbon credit trading system since 2015 and, in step with it, has allocated carbon credits to each industrial sector.

However, the carbon credit trading items currently considered in the construction industry are limited to carbon emitted from buildings after completion. To prepare for a carbon credit market that will expand to the construction phase, the first priority should be to establish an accurate budget based on the volume and price of carbon credits in the construction phase.

Since the introduction of the domestic carbon credit trading system, the factors that influence carbon credit prices have been analyzed, but the development of models that actually predict carbon credit prices has been slow.

Because there is no indicator for predicting the fluctuating domestic carbon credit price, it is difficult to estimate its exact trend. Budgets are therefore prepared with only a rough allowance for environmental charges, and it is hard to say that funds are allocated efficiently across the whole construction project.

Therefore, it is necessary to develop a technology for calculating a budget that reflects accurate environmental charges at the construction site.

The background technology of the present invention is disclosed in Korean Patent Laid-Open Publication No. 10-2008-0074753 (published on August 13, 2008).

The technical problem to be solved by the present invention is to provide a carbon credit price prediction system using search volume analysis and multiple regression analysis, which accurately predicts the price of carbon credits using search term data and carbon credit price data, and a carbon credit price prediction method performed by the system.

According to an embodiment of the present invention for solving this technical problem, a carbon credit price prediction method performed by a carbon credit price prediction system includes: extracting weekly search frequency data for a plurality of search terms related to carbon credits; collecting weekly closing price data of the carbon credits from the Korea Exchange; building a database by storing the weekly search frequency data and the weekly closing price data of the carbon credits for each week; calculating a correlation index of the weekly search frequency for each week by cross-correlating the weekly search frequency data with the weekly closing price data stored in the database; when the calculated correlation index is smaller than a reference value, deleting the weekly search frequency data of the corresponding week from the database, and applying the weekly search frequency data and the weekly closing price data to multiple regression analysis to extract a plurality of regression models for each week; calculating a goodness of fit for each of the regression models extracted for each week and selecting, for each week, the regression model with the greatest goodness of fit; applying the regression model selected for each week to prediction error analysis to calculate a mean absolute percentage error, and selecting the regression model with the smallest mean absolute percentage error as the final prediction model; and receiving search volumes for a plurality of search terms of interest used in the final prediction model and applying them to the final prediction model to predict the final carbon credit price.

In the calculating of the correlation index, the correlation index may be calculated through the following equation. The equation is reconstructed here from the term definitions as a lagged cross-correlation coefficient; the symbol names are assigned for readability.

$$r_i(k) = \frac{\sum_{t}\left(x_{i,t}-\bar{x}_i\right)\left(y_{t+k}-\bar{y}\right)}{\sqrt{\sum_{t}\left(x_{i,t}-\bar{x}_i\right)^2}\,\sqrt{\sum_{t}\left(y_{t+k}-\bar{y}\right)^2}}$$

Here, $i$ is the index of the search term, $t$ is the week, $x_{i,t}$ is the search frequency of the search term in that week, $\bar{x}_i$ is the mean of the search frequency data for the search term, $y_{t+k}$ is the carbon credit price in the $k$-th week after that week, and $\bar{y}$ is the mean of the carbon credit price data.

The goodness of fit may be calculated through the following equation.

Here, R is the correlation coefficient, n is the number of search terms whose correlation index is equal to or greater than the reference value, and p is the total number of search terms related to the carbon credits.

The correlation coefficient R may be calculated through the following equation.

Here, the terms of the equation are the fitted value of the regression model for the corresponding week, the mean of the fitted values of the regression model for the corresponding week, and the carbon credit price for the corresponding week.

In the predicting of the final carbon credit price, the regression model selected for each week may be applied to the following equation to calculate the mean absolute percentage error for each week. The equation is reconstructed here from the term definitions in the standard MAPE form; $n$, the number of weeks evaluated, is an assumed symbol.

$$\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{y_t-\hat{y}_t}{y_t}\right|$$

Here, MAPE is the mean absolute percentage error, $y_t$ is the carbon credit price in the corresponding week, and $\hat{y}_t$ is the fitted value of the regression model in the corresponding week.

The plurality of search terms of interest may include "탄소배출권거래" (carbon credit trading), "후성" (Foosung), "탄소배출권거래제" (carbon credit trading system), "이건산업" (Eagon Industrial), "탄소배출권가격" (carbon credit price), "홈데코" (home decor), "productive", "비유" (metaphor), and "포크레인" (excavator).

In the predicting of the final carbon credit price, the final carbon credit price may be predicted through the following equation.

Here, the variable in the equation is the search frequency of the search term of interest in the corresponding week.

According to another embodiment of the present invention, a carbon credit price prediction system for predicting the price of carbon credits includes: a data extraction unit that extracts weekly search frequency data for a plurality of search terms related to carbon credits; a data collection unit that collects weekly closing price data of the carbon credits from the Korea Exchange; a database unit that builds a database by storing the weekly search frequency data and the weekly closing price data of the carbon credits for each week; a calculation unit that calculates a correlation index of the weekly search frequency for each week by cross-correlating the weekly search frequency data with the weekly closing price data stored in the database; a regression model extraction unit that, when the calculated correlation index is smaller than a reference value, deletes the weekly search frequency data of the corresponding week from the database, and applies the weekly search frequency data and the weekly closing price data to multiple regression analysis to extract a plurality of regression models for each week; a regression model selection unit that calculates a goodness of fit for each of the regression models extracted for each week and selects, for each week, the regression model with the greatest goodness of fit; a final prediction model generation unit that applies the regression model selected for each week to prediction error analysis to calculate a mean absolute percentage error, and selects the regression model with the smallest mean absolute percentage error as the final prediction model; and a price prediction unit that receives search volumes for a plurality of search terms of interest used in the final prediction model and applies them to the final prediction model to predict the final carbon credit price.

As described above, according to the present invention, the price of carbon credits can be predicted from search terms related to carbon credits, so that a budget reflecting accurate environmental charges at a construction site can be calculated in a way that tracks the changing carbon credit market.

FIG. 1 is a configuration diagram of a carbon credit price prediction system according to an embodiment of the present invention.
FIG. 2 is a flowchart of a carbon credit price prediction method according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining step S230 of FIG. 2.
FIG. 4 is a graph for explaining a cross-correlation analysis method according to an embodiment of the present invention.
FIG. 5 is a diagram for explaining a cross-correlation analysis method according to an embodiment of the present invention.
FIG. 6 is a diagram for explaining step S270 of FIG. 2.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can easily carry them out. However, the present invention may be embodied in many different forms and is not limited to the embodiments described herein. In the drawings, parts irrelevant to the description are omitted in order to describe the present invention clearly, and similar reference numerals are attached to similar parts throughout the specification.

Throughout the specification, when a part is said to "include" a certain component, this means that it may further include other components, rather than excluding them, unless specifically stated otherwise.

Embodiments of the present invention will now be described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can easily carry them out.

FIG. 1 is a configuration diagram of a carbon credit price prediction system according to an embodiment of the present invention.

As shown in FIG. 1, the carbon credit price prediction system 100 according to an embodiment of the present invention includes a data extraction unit 110, a data collection unit 120, a database unit 130, a calculation unit 140, a regression model extraction unit 150, a regression model selection unit 160, a final prediction model generation unit 170, and a price prediction unit 180.

First, the data extraction unit 110 extracts weekly search frequency data for a plurality of search terms related to carbon credits.

The search terms related to carbon credits are selected on the basis of Naver related search terms, keywords of academic papers, or reference papers.

Next, the data collection unit 120 collects weekly closing price data of the carbon credits from the Korea Exchange.

Next, the database unit 130 builds a database by storing, for each week, the weekly search frequency data and the weekly closing price data of the carbon credits collected by the data extraction unit 110 and the data collection unit 120.

Next, the calculation unit 140 cross-correlates the weekly search frequency data with the weekly closing price data stored in the database to calculate a correlation index of the weekly search frequency for each week.

Next, when the calculated correlation index is smaller than the reference value, the regression model extraction unit 150 deletes the weekly search frequency data of the corresponding week from the database, and applies the weekly search frequency data and the weekly closing price data to multiple regression analysis to extract a plurality of regression models for each week.

Next, the regression model selection unit 160 calculates a goodness of fit for each of the regression models extracted for each week, and selects, for each week, the regression model with the greatest goodness of fit.

Next, the final prediction model generation unit 170 applies the regression model selected for each week to prediction error analysis to calculate a mean absolute percentage error, and selects the regression model with the smallest mean absolute percentage error as the final prediction model.

Next, the price prediction unit 180 receives search volumes for the plurality of search terms of interest used in the final prediction model and applies them to the final prediction model to predict the final carbon credit price.

Hereinafter, a carbon credit price prediction method according to an embodiment of the present invention will be described with reference to FIGS. 2 to 6.

For convenience of description, the final prediction model in the present invention is selected using data from the given week through the fourth week.

FIG. 2 is a flowchart of a carbon credit price prediction method according to an embodiment of the present invention.

As shown in FIG. 2, the data extraction unit 110 extracts weekly search frequency data for a plurality of search terms related to carbon credits (S210).

In the present invention, the search terms related to carbon credits are set to "탄소배출권 거래" (carbon credit trading), "후성" (Foosung), "탄소배출권 거래제" (carbon credit trading system), "이건산업" (Eagon Industrial), "탄소배출권 가격" (carbon credit price), "한솔홈데코" (Hansol Homedeco), "유니슨" (Unison), "홈데코" (home decor), "휴켐스" (Huchems), "productive", "productively", "productivity", "비유" (metaphor), "포크레인 emissions" (excavator emissions), "CO2 배출량" (CO2 emissions), "비교" (comparison), "correlation", "globalwarming", "NOx", "PEMS", "durable", "furniture", and "wakefulness". The number and type of search terms related to carbon credits may be changed.

Next, the data collection unit 120 collects weekly closing price data of the carbon credits from the Korea Exchange (S220).

Here, the Korea Exchange (KRX) is a national institution that manages stocks, bonds, and carbon credits, and in the present invention it is used to obtain data on the weekly closing price of the carbon credits.

Next, the database unit 130 builds a database by storing the weekly search frequency data and the weekly closing price data of the carbon credits for each week (S230).

FIG. 3 is a diagram for explaining step S230 of FIG. 2.

That is, as shown in FIG. 3, the database unit 130 stores the weekly search frequency data and the weekly closing price data of the carbon credits together for each week to form the database.
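The database-building step (S230) can be illustrated with a minimal sketch. It assumes the search frequencies and closing prices have already been obtained as dictionaries keyed by the start date of each week; the function name `build_weekly_database` and the row layout are hypothetical and are not taken from the patent.

```python
from datetime import date

def build_weekly_database(search_freq, closing_price):
    """Join per-keyword weekly search counts with weekly closing prices.

    search_freq:   {keyword: {week_start: count}}   (assumed input shape)
    closing_price: {week_start: price}              (assumed input shape)
    Returns one row per week, sorted by week, keeping only the weeks
    that are present in the price series and in every keyword series.
    """
    weeks = set(closing_price)
    for series in search_freq.values():
        weeks &= set(series)
    rows = []
    for week in sorted(weeks):
        row = {"week": week, "close": closing_price[week]}
        for keyword, series in search_freq.items():
            row[keyword] = series[week]
        rows.append(row)
    return rows
```

Keeping only weeks that appear in every series is a design choice of this sketch; the patent does not state how missing weeks are handled.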

Next, the calculation unit 140 cross-correlates the weekly search frequency data with the weekly closing price data stored in the database to calculate a correlation index of the weekly search frequency for each week (S240).

FIG. 4 is a graph for explaining a cross-correlation analysis method according to an embodiment of the present invention, and FIG. 5 is a diagram for explaining a cross-correlation analysis method according to an embodiment of the present invention.

As shown in FIG. 4, the cross-correlation analysis method according to an embodiment of the present invention keeps the carbon credit price data (red graph) fixed, shifts the search term data (blue graph) in time, and analyzes the correlation at each shift to calculate the correlation index.

For example, as shown in FIG. 5, the baseline is the alignment in which the search data collected on July 16 and the credit price for July 20 are in the same row. The search data is then paired in turn with the prices for July 23 to July 27 (week 1), July 30 to August 3 (week 2), and so on, and a correlation analysis is run on each pairing. Each search term is cross-correlated in this way.

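At this time, the calculation unit 140 calculates the correlation index through Equation 1 below. The equation is reconstructed here from the term definitions as a lagged cross-correlation coefficient; the symbol names are assigned for readability.

$$r_i(k) = \frac{\sum_{t}\left(x_{i,t}-\bar{x}_i\right)\left(y_{t+k}-\bar{y}\right)}{\sqrt{\sum_{t}\left(x_{i,t}-\bar{x}_i\right)^2}\,\sqrt{\sum_{t}\left(y_{t+k}-\bar{y}\right)^2}} \qquad (1)$$

Here, $i$ is the index of the search term, $t$ is the week, $x_{i,t}$ is the search frequency of the search term in that week, $\bar{x}_i$ is the mean of the search frequency data for the search term, $y_{t+k}$ is the carbon credit price in the $k$-th week after that week, and $\bar{y}$ is the mean of the carbon credit price data.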
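The time-shifted correlation described for step S240 can be sketched as follows. This is an illustrative implementation, not the patent's own: it computes a Pearson correlation between the search frequency in week t and the price in week t + lag, with the means taken over the overlapping window only (an assumption). The function name `lagged_correlation` is hypothetical.

```python
from math import sqrt

def lagged_correlation(freq, price, lag):
    """Pearson correlation between freq[t] and price[t + lag].

    freq and price are weekly series aligned at index 0; lag is the
    number of weeks by which the price series is shifted forward.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    n = min(len(freq), len(price) - lag)
    if n < 2:
        raise ValueError("not enough overlapping weeks")
    x = freq[:n]
    y = price[lag:lag + n]
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)
```

Calling the function once per search term and per lag (for example lags 1 to 4) would give one correlation index per term and week, which is the quantity compared against the reference value in the next step.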

Next, the regression model extraction unit 150 determines, for each week, whether each correlation index of the weekly search frequency for the plurality of search terms is smaller than the reference value (S250).

If the correlation index of the weekly search frequency for a search term in a given week is equal to or greater than the reference value, the search term for that week is not removed from the database.

On the other hand, if the correlation index of the weekly search frequency for a search term in a given week is smaller than the reference value, the regression model extraction unit 150 deletes the search term for that week from the database. It then applies the weekly search frequency data and the weekly closing price data remaining in the database to multiple regression analysis to extract a plurality of regression models for each week (S260).

Multiple regression analysis is a method of analyzing the relationship between a plurality of independent variables and one dependent variable, and in the embodiment of the present invention it is used to extract the plurality of regression models for each week.
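Steps S250 and S260 can be sketched as a filter followed by an ordinary least-squares fit. The sketch below is one possible implementation under stated assumptions: the patent does not say how the regression is solved, so the normal equations with Gauss-Jordan elimination are used here for self-containment, and the names `select_keywords` and `fit_ols` are hypothetical.

```python
def select_keywords(correlations, threshold):
    """Keep the keywords whose correlation index is at least the threshold."""
    return [kw for kw, r in correlations.items() if r >= threshold]

def fit_ols(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bp*xp.

    X is a list of rows (one row of search frequencies per week) and
    y is the list of matching closing prices. Returns [b0, b1, ..., bp].
    """
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    m = len(rows[0])
    # Augmented normal-equation matrix [A^T A | A^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(m)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(m)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda k: abs(aug[k][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(m):
            if k != col:
                f = aug[k][col] / aug[col][col]
                aug[k] = [a - f * b for a, b in zip(aug[k], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]
```

In practice a numerical library would normally be preferred over hand-written elimination; the pure-Python form is used only so the example runs without dependencies.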

Next, the regression model selection unit 160 calculates a goodness of fit for each of the regression models extracted for each week, and selects, for each week, the regression model with the greatest goodness of fit (S270).

The goodness of fit is calculated through Equation 2 below.

The correlation coefficient R is calculated through Equation 3 below.

Here, the terms of the equation are the fitted value of the regression model for the corresponding week, the mean of the fitted values of the regression model for the corresponding week, and the carbon credit price for the corresponding week.

FIG. 6 is a diagram for explaining step S270 of FIG. 2.

In FIG. 6, week 1 is shown as case 1, week 2 as case 2, week 3 as case 3, and week 4 as case 4.

NULL means that the search term was excluded for that week because its correlation coefficient fell below the reference value.

That is, as shown in FIG. 6, the regression model with the greatest goodness of fit is selected for each week.
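Selecting the best-fitting model for each week can be sketched as below. Because Equations 2 and 3 are not reproduced in this text, the sketch assumes the textbook coefficient of determination and adjusted R-squared, which may differ from the patent's exact definitions; all function names are hypothetical.

```python
def r_squared(actual, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - f) ** 2 for y, f in zip(actual, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    if ss_tot == 0:
        raise ValueError("actual values are constant")
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_obs, n_predictors):
    """Textbook adjusted R^2 (assumed form, not taken from the patent)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

def best_model(candidates):
    """candidates: list of (name, score); return the name with the largest score."""
    return max(candidates, key=lambda c: c[1])[0]
```

Running `best_model` once per week over that week's candidate models corresponds to the per-week selection illustrated by FIG. 6.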

Next, the final prediction model generation unit 170 applies the regression model selected for each week to prediction error analysis to calculate a mean absolute percentage error, and selects the regression model with the smallest mean absolute percentage error as the final prediction model (S280).

That is, the final prediction model generation unit 170 applies the regression model selected for each week to Equation 4 below to calculate the mean absolute percentage error for each week. The equation is reconstructed here from the term definitions in the standard MAPE form; $n$, the number of weeks evaluated, is an assumed symbol.

$$\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{y_t-\hat{y}_t}{y_t}\right| \qquad (4)$$

Here, MAPE is the mean absolute percentage error, $y_t$ is the carbon credit price in the corresponding week, and $\hat{y}_t$ is the fitted value of the regression model in the corresponding week.

The MAPE of each weekly regression model selected in FIG. 6 is calculated through Equation 4.

The MAPE value for each week is then obtained as shown in Table 1 below.

As shown in Table 1, the final prediction model generation unit 170 selects the regression model for week 4, which has the smallest MAPE, as the final prediction model.
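The MAPE-based choice of the final model can be sketched as follows. The sketch uses the standard definition of MAPE expressed in percent; the function names are hypothetical, and the values of Table 1 are not reproduced here, so any numbers passed in are illustrative only.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

def pick_final_model(mape_by_week):
    """Return the week whose selected model has the smallest MAPE.

    mape_by_week: {week_number: MAPE value}
    """
    return min(mape_by_week, key=mape_by_week.get)
```

With a dictionary holding one MAPE value per week, `pick_final_model` returns the key of the smallest entry, which plays the role of the week-4 model in the embodiment.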

The final prediction model generation unit 170 then selects the regression model given by Equation 5 below as the final prediction model.

Here, the variable in the equation is the search frequency of the search term of interest in the corresponding week.

The search terms of interest include "탄소배출권거래" (carbon credit trading), "후성" (Foosung), "탄소배출권거래제" (carbon credit trading system), "이건산업" (Eagon Industrial), "탄소배출권가격" (carbon credit price), "홈데코" (home decor), "productive", "비유" (metaphor), and "포크레인" (excavator).

That is, the search terms of interest are the search terms used in the regression model selected as the final prediction model.

The search terms of interest may therefore differ depending on which regression model is selected as the final prediction model.

Next, the price prediction unit 180 receives search volumes for the plurality of search terms of interest used in the final prediction model and applies them to the final prediction model to predict the final carbon credit price (S280).

That is, the price prediction unit 180 receives, from the data extraction unit 110, the search volumes for the search terms of interest used in the final prediction model and applies them to Equation 5 to predict the final carbon credit price.
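Applying the final model to current search volumes amounts to evaluating a linear combination. The sketch below assumes the final model is stored as an intercept plus one coefficient per search term of interest; the coefficients of Equation 5 are not reproduced in this text, so the function name `predict_price` and any coefficient values used with it are hypothetical.

```python
def predict_price(intercept, coefficients, search_volume):
    """Evaluate a linear regression model on current search volumes.

    coefficients:  {keyword: regression coefficient}
    search_volume: {keyword: current weekly search frequency}
    """
    missing = set(coefficients) - set(search_volume)
    if missing:
        raise KeyError(f"missing search volume for: {sorted(missing)}")
    return intercept + sum(c * search_volume[k] for k, c in coefficients.items())
```

Raising an error for a missing keyword, rather than treating it as zero, is a choice of this sketch so that an incomplete input does not silently distort the prediction.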

As described above, according to an embodiment of the present invention, the carbon credit price can be predicted from search terms related to carbon credits, so that a construction-site budget can be estimated that accounts for accurate environmental charges while reflecting changes in the carbon credit market.

Although the present invention has been described with reference to the embodiment shown in the drawings, this is merely exemplary, and those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible. Accordingly, the true technical scope of protection of the present invention should be defined by the technical spirit of the appended claims.

100: carbon credit price prediction system,
110: data extraction unit, 120: data collection unit,
130: database unit, 140: calculation unit,
150: regression model extraction unit, 160: regression model selection unit,
170: final prediction model generation unit, 180: price prediction unit

Claims

1. A carbon credit price prediction method performed by a carbon credit price prediction system, the method comprising:
extracting, by the carbon credit price prediction system, weekly search frequency data for a plurality of search terms related to carbon credits;
collecting, by the carbon credit price prediction system, weekly closing price data of the carbon credits from the Korea Exchange;
building, by the carbon credit price prediction system, a database by storing the weekly search frequency data and the weekly closing price data of the carbon credits for each week;
calculating, by the carbon credit price prediction system, a correlation index of the weekly search frequency for each week by cross-correlating each of the weekly search frequency data with the weekly closing price data stored in the database;
when the calculated correlation index is smaller than a reference value, deleting, by the carbon credit price prediction system, the weekly search frequency data of the corresponding week from the database, and extracting a plurality of regression models for each week by applying the weekly search frequency data and the weekly closing price data to a multiple regression analysis method;
calculating, by the carbon credit price prediction system, a goodness of fit for each of the plurality of regression models extracted for each week, and selecting, for each week, the regression model having the largest goodness of fit;
calculating, by the carbon credit price prediction system, a mean absolute percentage error by applying the regression model selected for each week to a prediction error analysis method, and selecting the regression model having the smallest mean absolute percentage error as a final prediction model; and
receiving, by the carbon credit price prediction system, search volumes for a plurality of search terms of interest used in the final prediction model and applying them to the final prediction model to predict a final carbon credit price,
wherein the calculating of the correlation index comprises calculating the correlation index through the following equation:

where i is the index of the corresponding search term, t is the corresponding week, and the remaining terms denote, in order, the search frequency of the search term in the corresponding week, the mean of the search frequency data, and the mean of the carbon credit price data.

2. (Deleted)

3. The method of claim 1,
wherein the goodness of fit is calculated through the following equation:

where R is a correlation coefficient, n is the number of search terms having a correlation index equal to or greater than the reference value, and p is the total number of search terms related to the carbon credits.

4. The method of claim 3,
wherein the correlation coefficient (R) is calculated through the following equation:

where the terms denote, in order, the value estimated by the regression model for the corresponding week and the carbon credit price in the corresponding week.

5. The method of claim 1,
wherein the predicting of the final carbon credit price comprises
calculating the mean absolute percentage error for each week by applying the regression model selected for each week to the following equation:

where MAPE is the mean absolute percentage error, and the remaining terms denote, in order, the carbon credit price in the corresponding week and the value estimated by the regression model for the corresponding week.

6. The method of claim 1,
wherein the plurality of search terms of interest include
"탄소배출권거래" (carbon credit trading), "후성" (Foosung), "탄소배출권거래제" (carbon emissions trading scheme), "이건산업" (Eagon Industrial), "탄소배출권가격" (carbon credit price), "홈데코" (Home Deco), "productive", "비유" (analogy), and "포크레인" (excavator).

7. The method of claim 6,
wherein the predicting of the final carbon credit price comprises
predicting the final carbon credit price through the following equation:

here,

8. A carbon credit price prediction system for predicting a carbon credit price, the system comprising:
a data extraction unit that extracts weekly search frequency data for a plurality of search terms related to carbon credits;
a data collection unit that collects weekly closing price data of the carbon credits from the Korea Exchange;
a database unit that builds a database by storing the weekly search frequency data and the weekly closing price data of the carbon credits for each week;
a calculation unit that calculates a correlation index of the weekly search frequency for each week by cross-correlating each of the weekly search frequency data with the weekly closing price data stored in the database;
a regression model extraction unit that, when the calculated correlation index is smaller than a reference value, deletes the weekly search frequency data of the corresponding week from the database, and extracts a plurality of regression models for each week by applying the weekly search frequency data and the weekly closing price data to a multiple regression analysis method;
a regression model selection unit that calculates a goodness of fit for each of the plurality of regression models extracted for each week, and selects, for each week, the regression model having the largest goodness of fit;
a final prediction model generation unit that calculates a mean absolute percentage error by applying the regression model selected for each week to a prediction error analysis method, and selects the regression model having the smallest mean absolute percentage error as a final prediction model; and
a price prediction unit that receives search volumes for a plurality of search terms of interest used in the final prediction model and predicts a final carbon credit price by applying them to the final prediction model,
wherein the calculation unit
calculates the correlation index through the following equation:

where the terms denote, in order, the search frequency of the search term in the corresponding week, the mean of the search frequency data, and the mean of the carbon credit price data.

9. (Deleted)

10. The system of claim 8,
wherein the goodness of fit is
calculated through the following equation:

11. The system of claim 10,
wherein the correlation coefficient (R) is calculated through the following equation:

where the terms denote, in order, the value estimated by the regression model for the corresponding week and the carbon credit price in the corresponding week.

12. The system of claim 8,
wherein the price prediction unit
calculates the mean absolute percentage error for each week by applying the regression model selected for each week to the following equation:

where MAPE is the mean absolute percentage error, and the remaining terms denote, in order, the carbon credit price in the corresponding week and the value estimated by the regression model for the corresponding week.

13. The system of claim 8,
wherein the plurality of search terms of interest include
"탄소배출권거래" (carbon credit trading), "후성" (Foosung), "탄소배출권거래제" (carbon emissions trading scheme), "이건산업" (Eagon Industrial), "탄소배출권가격" (carbon credit price), "홈데코" (Home Deco), "productive", "비유" (analogy), and "포크레인" (excavator).

14. The system of claim 13,
wherein the price prediction unit
predicts the final carbon credit price through the following equation:

here,