
The Design of Heteromerous Data Integration Based on Web Ser

Source: http://myeducs.cn Author: user submission Origin: web Published: 13/05/06


2.2 Data analysis
2.2.1 Overview
The sources to be integrated differ in structure, semantics, and presentation. The goal of data analysis is to eliminate these differences by identifying the similarities and conflicts among the data sources, building metadata, and then defining transformation rules or functions. Data analysis must also cleanse data that violate integrity constraints or are factually inconsistent [3].
There are two approaches to data analysis: data anatomy and data mining. Data anatomy analyzes the actual data to discover semantic information such as data types, units, and data domains. Data mining aims to find the relationships among attributes, the constraints among data, and so on; it should be performed on a large-scale data store.
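As an illustration of data anatomy, the following minimal Python sketch inspects the actual values of one column and reports an inferred type, the length range, and the value domain. It assumes the values arrive as strings (or None); the function name `profile_column` and the sample data are hypothetical and do not come from the source systems.

```python
from collections import Counter

def profile_column(values):
    """Summarize raw string values: inferred type, null count, length range, domain."""
    non_empty = [v.strip() for v in values if v is not None and v.strip() != ""]

    def looks_like(cast):
        # True when there is at least one value and every value converts with `cast`.
        try:
            for v in non_empty:
                cast(v)
            return bool(non_empty)
        except ValueError:
            return False

    if looks_like(int):
        inferred = "integer"
    elif looks_like(float):
        inferred = "decimal"
    else:
        inferred = "text"

    lengths = [len(v) for v in non_empty]
    return {
        "inferred_type": inferred,
        "null_count": len(values) - len(non_empty),
        "min_length": min(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
        "domain": Counter(non_empty),  # distinct values with their frequencies
    }

# Hypothetical sample: heights as recorded in one source system.
heights = ["172", "165", "", "180", None, "165"]
report = profile_column(heights)
```

A report like this gives the analyzer a starting point for the metadata: a column whose values are all three-digit integers, for instance, is a hint (not a proof) about the unit in use.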
Once this information is available, the next step for the analyzers is to resolve the conflicts among the heteromerous data sources. At a high level, there are two kinds of conflicts:
1. Schema Conflicts
They can be divided into two kinds: similar schema conflicts and heteromerous schema conflicts.
(1) Similar schema conflicts. For example, an attribute of table A is composed of several attributes of table B. Another example, at the table level, is that the data in table C is the union of tables A and B. In the sample systems, Dormitory MIS has additional attributes that should be added to the data warehouse during integration.
(2) Heteromerous schema conflicts. The sample systems illustrate this kind. Students' basic information includes an attribute called politic status. In Student MIS, the value is stored directly in a field. In Dormitory MIS, the values are kept in a separate table, and the key of each record is used to refer to the politic status value.
2. Semantic Conflicts
They include two types: schemas' semantic conflicts and attributes' semantic conflicts.
A schema's semantic conflict is another name for a schema conflict.
Attributes' semantic conflicts can be classified into the following kinds:
(1) Type conflict, which means that the type or length of the fields differs;
(2) Naming conflict. Even a simple field can have several names, depending on the vocabulary and habits of the database designers. For example, to hold the name of an enterprise, a designer may choose "company" or "corporation";
(3) Data unit conflict. For example, the height of a person can be recorded in meters or in centimeters;
(4) Data precision conflict. Different systems do not require the same precision for the information they manage. For numeric information, two, three, or even four decimal digits are all reasonable;
(5) Format conflict. The classic example is the date format, which can be represented as "YYMMDD", "YYYYMMDD", and so on.
Other specific categories are omitted here and are left to the reader.
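The attribute-level conflicts listed above are the kind of differences that transformation functions are meant to remove. The sketch below shows one small function per conflict kind in Python. All function names are hypothetical, and the target conventions (meters, two decimal digits, "YYYYMMDD") are assumptions chosen for illustration, not the conventions of the sample systems.

```python
from datetime import datetime
from decimal import Decimal, ROUND_HALF_UP

def centimeters_to_meters(value_cm):
    """Data unit conflict: convert a height given in centimeters to meters."""
    return Decimal(value_cm) / Decimal(100)

def to_precision(value, digits=2):
    """Data precision conflict: round to a fixed number of decimal digits."""
    quantum = Decimal(f"1e-{digits}")  # Decimal('0.01') when digits == 2
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)

def normalize_date(text):
    """Format conflict: accept 'YYMMDD' or 'YYYYMMDD' and return 'YYYYMMDD'.

    Two-digit years are ambiguous; Python's %y maps 69-99 to 1969-1999
    and 00-68 to 2000-2068.
    """
    fmt = "%y%m%d" if len(text) == 6 else "%Y%m%d"
    return datetime.strptime(text, fmt).strftime("%Y%m%d")

def fits_length(value, max_length):
    """Type conflict: check whether a string fits the target field length."""
    return len(value) <= max_length
```

Values are passed as strings so that `Decimal` does not inherit binary floating-point error; for example, `centimeters_to_meters("172")` yields `Decimal("1.72")`.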
2.2.2 Database design of source applications
In the design, the first step of data analysis is to examine the current design of the source applications, Student MIS and Dormitory MIS. The most important materials are the SRS and the Data Model Design Document. If the final implementation matches the design and the design meets the requirements, this process does not take much time. The work is to understand the current design and to identify the similarities and differences between the two systems, in order to make clear which entities are common, what can be reused, and what needs to be changed for better support.
The following entities are common to both systems and can be grouped into four:
1. Student politic information, native place information, nation information;
2. Student kind information, state information;
3. Academic basic information, major basic information, class basic information;
4. Student basic information, campus information, family information, the change histories of major student information.
More detailed comparison results are shown in the following tables. Table 2.1 presents the differences between the tables in Student MIS and Dormitory MIS, while Table 2.2 and Table 2.3 present the differences between the fields of student basic information and of the change histories of major student information, respectively.
Table 2.1 Schema Conflicts

| Entity Name | Dormitory MIS | Student MIS |
|---|---|---|
| Nation information, Student state information, Student kind information, Academic basic information, Major basic information | Same in both systems | Same in both systems |
| Politic information | Politic_ID char(2), Politic_Name varchar(20) | None |
| Native place information | Stu_Place_ID char(2), Stu_Place_Name varchar(20) | None |
| Class basic information, Student campus information, Student family information | None | Refer to data model design document |
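The politic information row is the heteromerous schema conflict described in 2.2.1: Dormitory MIS keeps a lookup table (Politic_ID, Politic_Name), while Student MIS stores the value in a field. A minimal Python sketch of resolving the key to its value during integration could look like this. The field names come from the tables in this section; the function name, the lookup contents, and the sample record are hypothetical placeholders.

```python
def resolve_politic(dorm_record, politic_lookup):
    """Return a copy of a Dormitory MIS record with the politic key replaced by its name."""
    resolved = dict(dorm_record)          # do not modify the caller's record
    key = resolved.pop("Stu_Politic")
    # Unknown keys become None so they can be reported and cleansed later.
    resolved["Stu_Politics"] = politic_lookup.get(key)
    return resolved

# Placeholder rows of the lookup table: Politic_ID -> Politic_Name.
politic_lookup = {"01": "status A", "02": "status B"}

dorm_student = {"Stu_Name": "Example Student", "Stu_Politic": "02"}
student_view = resolve_politic(dorm_student, politic_lookup)
```

Returning None for an unknown key, rather than raising an error, is a design choice made here so that a batch can finish and the bad rows can be collected afterwards.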

Table 2.2 Semantic conflicts of student basic information

| Dormitory MIS | Student MIS | Conflict kind |
|---|---|---|
| Stu_Name varchar(30) | Stu_Name varchar(20) | Type conflict |
| Stu_Nation char(2) | Stu_Nation_ID char(2) | Naming conflict |
| Stu_Politic char(2) | Stu_Politics varchar(10) | Type conflict |
| Stu_Biogenesis varchar(50) | Stu_NativePlace varchar(10) | Type conflict |
| Stu_Aca char(2) | Stu_Aca_ID char(2) | Naming conflict |
| Stu_Major char(4) | Stu_Major_ID char(4) | Naming conflict |
| Stu_Grade char(4) | Stu_Grade_ID char(4) | Naming conflict |
| Stu_Class char(2) | Stu_Class_ID char(10) | Type conflict |
| Stu_Remark varchar(1000) | Stu_Remark varchar(100) | Type conflict |
| None | Stu_Brithday varchar(8) | Missed attribute |
| Stu_Picture bit | None | Additional attribute |
| Stu_Project bit | None | Additional attribute |
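The "Conflict kind" column appears to follow a simple rule: a field absent from Dormitory MIS is a missed attribute, a field absent from Student MIS is an additional attribute, differing types give a type conflict, and otherwise differing names give a naming conflict. The Python sketch below encodes that rule. The precedence of type over naming is an assumption inferred from the rows (for example, Stu_Politic and Stu_Politics differ in both name and type and are labeled as a type conflict), and the function name is hypothetical.

```python
def classify_conflict(dorm_field, student_field):
    """Classify a pair of (name, type) definitions; None means the field is absent."""
    if dorm_field is None and student_field is None:
        raise ValueError("at least one field definition is required")
    if dorm_field is None:
        return "Missed attribute"
    if student_field is None:
        return "Additional attribute"
    dorm_name, dorm_type = dorm_field
    student_name, student_type = student_field
    if dorm_type != student_type:       # assumed to take precedence over naming
        return "Type conflict"
    if dorm_name != student_name:
        return "Naming conflict"
    return "No conflict"

# A few rows taken from the table.
pairs = [
    (("Stu_Name", "varchar(30)"), ("Stu_Name", "varchar(20)")),
    (("Stu_Nation", "char(2)"), ("Stu_Nation_ID", "char(2)")),
    (None, ("Stu_Brithday", "varchar(8)")),
    (("Stu_Picture", "bit"), None),
]
kinds = [classify_conflict(dorm, student) for dorm, student in pairs]
```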

 
Table 2.3 Semantic conflicts of change histories of major student information

| Dormitory MIS | Student MIS | Conflict kind |
|---|---|---|
| None | Stu_Name varchar(20) | Missed attribute |
| None | Stu_Sex char(2) | Missed attribute |
| None | Stu_Length char(1) | Missed attribute |
| Spe_Politic char(2) | None | Additional attribute |
| Spe_State char(2) | Stu_State char(2) | Naming conflict |
| Spe_Kind char(2) | Stu_Kind char(2) | Naming conflict |
| Spe_Aca char(2) | Stu_Aca_ID char(2) | Naming conflict |
| Spe_Major char(4) | Stu_Major_ID char(4) | Naming conflict |
| Spe_Grade char(4) | Stu_Grade_ID char(4) | Naming conflict |
| Spe_Class char(2) | Stu_Class_ID char(10) | Type conflict |
| Spe_Change_Date DateTime | Stu_Change_Date DateTime | Naming conflict |
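Most rows of this table pair a Spe_ field with a Stu_ field, so one possible transformation rule is a plain rename map. The Python sketch below assumes, for illustration only, that the Student MIS names are the target names and that fields without a counterpart (such as Spe_Politic) are carried over unchanged; the names `HISTORY_RENAMES` and `rename_history_record` and the sample record are hypothetical.

```python
# Field renames read from the table: Dormitory MIS name -> Student MIS name.
HISTORY_RENAMES = {
    "Spe_State": "Stu_State",
    "Spe_Kind": "Stu_Kind",
    "Spe_Aca": "Stu_Aca_ID",
    "Spe_Major": "Stu_Major_ID",
    "Spe_Grade": "Stu_Grade_ID",
    "Spe_Class": "Stu_Class_ID",
    "Spe_Change_Date": "Stu_Change_Date",
}

def rename_history_record(record, renames=HISTORY_RENAMES):
    """Rename known fields; keep unknown fields under their original name."""
    return {renames.get(name, name): value for name, value in record.items()}

# Hypothetical change-history record from Dormitory MIS.
sample = {"Spe_Aca": "01", "Spe_Class": "03", "Spe_Politic": "02"}
renamed = rename_history_record(sample)
```

A rename alone does not settle the Spe_Class row, which is a type conflict (char(2) against char(10)); a two-character value fits the wider field, but the reverse direction would need a length check.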
