Division of Synergetic Information Science, Master's Thesis (February 14, 2007)




         A Study on Integration of Facility Information on Websites
                       Research Group of Complex Systems Engineering
                        Laboratory of Harmonious Systems Engineering
                                         MC2 Yasunao Mori

Abstract: The volume of tourism information on the WWW has grown rapidly along with the number of Web
pages. This information is scattered across many websites and often overlaps, and the same attribute
names and attribute values are expressed differently on each site. Integrating the information would
enable a variety of uses. The purpose of this paper is to extract the attribute names and attribute
values of facility information described in HTML on websites and to integrate them. Two approaches are
taken. The first aims to improve the accuracy with which attribute values are extracted from a site.
Focusing on series of HTML documents that contain a large amount of facility information, we propose a
case-based method that uses data types to transform such a series of HTML documents into XML documents.
Experimental results show that the proposed method transforms HTML documents into XML documents with a
high degree of precision. The second approach constructs integration rules for attribute values:
differing expressions are integrated by composing rules that convert several expressions into a single
expression. Facility attribute information from four accommodation reservation sites was integrated
using the proposed attribute information integration technique.
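
The idea of a rule that converts several expressions into one can be illustrated with a minimal Python sketch. It is not the method from the thesis: the attribute (check-in time), the synonym table ATTRIBUTE_NAMES, the patterns in TIME_PATTERNS, and the functions normalize_time and integrate are all hypothetical, chosen only to show how differing site-specific notations might be mapped to a single canonical form.

```python
import re

# Hypothetical synonym table: site-specific attribute names -> one canonical name.
ATTRIBUTE_NAMES = {
    "check-in": "check_in",
    "check in time": "check_in",
    "チェックイン": "check_in",
}

# Hypothetical value rules: each pattern recognizes one notation for a time of day.
TIME_PATTERNS = [
    re.compile(r"^(?P<h>\d{1,2}):(?P<m>\d{2})\s*(?P<ap>am|pm)?$", re.IGNORECASE),
    re.compile(r"^(?P<h>\d{1,2})\s*(?P<ap>am|pm)$", re.IGNORECASE),
    re.compile(r"^(?P<h>\d{1,2})時(?:(?P<m>\d{1,2})分)?$"),
]


def normalize_time(value):
    """Convert a time written in any known notation to 24-hour HH:MM, else None."""
    text = value.strip()
    for pattern in TIME_PATTERNS:
        match = pattern.match(text)
        if match is None:
            continue
        groups = match.groupdict()
        hour = int(groups["h"])
        minute = int(groups.get("m") or 0)          # minutes default to 0
        meridiem = (groups.get("ap") or "").lower()  # "" when no am/pm given
        if meridiem == "pm" and hour < 12:
            hour += 12
        elif meridiem == "am" and hour == 12:
            hour = 0
        return f"{hour:02d}:{minute:02d}"
    return None


def integrate(record):
    """Map one site's attribute names and values onto the canonical schema."""
    merged = {}
    for name, value in record.items():
        canonical = ATTRIBUTE_NAMES.get(name.strip().lower())
        if canonical is None:
            continue  # attribute not covered by the rule set
        normalized = normalize_time(value)
        if normalized is not None:
            merged[canonical] = normalized
    return merged
```

Under these assumptions, integrate({"Check-In": "3:00 PM"}) and integrate({"チェックイン": "15時"}) yield the same record, so values taken from different sites can be compared directly.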




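Research achievements (refereed journal papers, international conference papers, domestic conference
papers, etc.):
1.   Yasunao Mori, Masahito Yamamoto, Azuma Ohuchi, “Wrapping from HTML Documents to XML Documents Using
     Multiple Cases” (in Japanese), FIT2006, CD-ROM, pp.85-86, (2006. 9).

2.   Yasunao Mori, Masahito Yamamoto, Azuma Ohuchi, “Cases-based Information Extraction from HTML using
     Data Type”, ENTER2007, CD-ROM, pp.163-171, (2007. 1).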
