DATA WAREHOUSE AND DATA MINING (DWDM) MAY 2010 COMPUTER SCIENCE SEMESTER 6
Con. 3881-10. (REVISED COURSE) AN-4472 (3 Hours) [Total Marks: 100]
N.B. (1) Question No. 1 is compulsory.
(2) Attempt any four questions out of the remaining six questions.
1. (a) Define Data Warehouse. Explain the architecture of a data warehouse with a suitable block
diagram. [10 Marks]
(b) Explain data mining as a step in KDD. Give the architecture of a typical DM system. [10 Marks]
2. (a) How do the top-down and bottom-up approaches to building a data warehouse differ? Discuss the
merits and limitations of each approach. [10 Marks]
(b) What is K-means clustering? Apply the K-means algorithm to the following data for two
clusters. Data set {10, 4, 2, 12, 3, 20, 30, 11, 25, 31} [10 Marks]
3. (a) Give an information package for recording the information requirements for "Hotel Occupancy",
considering dimensions like Time, Hotel, etc. Design a star schema from the information
package. [10 Marks]
(b) Explain HITS algorithm. [10 Marks]
4. (a) What is classification? What are the issues in classification? Apply a statistical-based algorithm
to obtain the actual probabilities of each event to classify the new tuple as tall. Use the
following data. [10 Marks]
(b) Define Metadata. What are the different types of metadata stored in a data warehouse?
Illustrate with a simple customer sales data warehouse. [10 Marks]
5. (a) What are clustering techniques? Discuss the agglomerative algorithm using the following data and
plot a dendrogram using the single-link approach. The following figure contains sample data
items and the distances between the elements: [10 Marks]
(b) The All Electronics company has a sales department. Consider sales with three dimensions,
namely: [10 Marks]
(i) Time (ii) Product (iii) Store
The schema contains a central fact table, sales, with two measures:
(i) dollars-cost and (ii) units-sold
Using the above example, describe the following OLAP operations:
(i) Dice (ii) Slice (iii) Roll-up (iv) Drill-down
6. (a) Explain ETL of data warehousing in detail. [10 Marks]
(b) Consider the following transactions:
Apply the Apriori algorithm with a minimum support of 30% and a minimum confidence of 75%
and find the large itemset L. [10 Marks]
7. Write short notes on any four: [20 Marks]
(a) Trends in data warehousing
(b) Decision tree based classification approach
(c) Key restructuring
(d) Crawlers
(e) Web personalization
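
Worked sketch for Question 2(b): the K-means steps (assign each point to its nearest centroid, recompute each centroid as the mean of its cluster, repeat until nothing changes) can be traced on the given data set with a few lines of Python. This is only an illustration under stated assumptions, not a model answer: the paper does not specify the initial centroids, so the first two values (10 and 4) are assumed here, distance is taken as absolute difference, and the function name kmeans_1d is hypothetical.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Plain K-means on one-dimensional data (illustrative helper)."""
    centroids = list(centroids)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [sum(c) / len(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break  # centroids unchanged, so the assignment is stable
        centroids = new_centroids
    return centroids, clusters


data = [10, 4, 2, 12, 3, 20, 30, 11, 25, 31]
# Assumption: initial centroids are the first two data values.
centroids, clusters = kmeans_1d(data, centroids=[10, 4])
print(centroids)
print(clusters)
```

With these assumed starting centroids, the run settles on the clusters {2, 3, 4, 10, 11, 12} (centroid 7) and {20, 25, 30, 31} (centroid 26.5). Other initial centroids may take a different number of iterations to get there.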