Tuesday, July 1, 2014

DATA WAREHOUSE AND DATA MINING (DWDM) MAY 2013 COMPUTER SCIENCE SEMESTER 6

Con. 9998-13.                    GS-1369
(3 Hours)                    [Total Marks : 100]
Note: 1. Question 1 is compulsory
         2. Answer any 4 out of the remaining questions.
         3. Answers to sub questions must be written together.

Q1.  (a) What are the differences between a Data Warehouse and a Data Mart? (05)
        (b) For a supermarket chain, consider the following dimensions: product, store, time and promotion. The schema contains a central fact table, sales facts, with three measures: unit_sales, dollars_sales and dollar_cost. Design a star schema for this application. (05)
        (c) Calculate the maximum number of base fact table records for a warehouse with the values given below: (05)
            Time period: 5 years
            Store: 300 stores reporting daily sales
            Product: 40,000 products in each store (about 4000 sell in each store daily)
        (d) Illustrate how the supermarket can use clustering methods to improve sales. (05)
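
The arithmetic behind Q1(c) can be sketched as below. This is only one reading of the question, not an official solution: it assumes 365 reporting days per year (leap days ignored) and one fact record per product sold per store per day, so the 4000 products that actually sell each day drive the count. The variable names are illustrative. The looser bound using all 40,000 stocked products is shown for comparison.

```python
# Sketch of the base fact table size estimate for Q1(c).
# Assumptions (not stated in the paper): 365 days per year, and one
# fact row per (day, store, product sold) combination.
YEARS = 5
DAYS_PER_YEAR = 365          # assumption: leap days ignored
STORES = 300
PRODUCTS_SOLD_DAILY = 4_000  # products that actually sell per store per day
PRODUCTS_STOCKED = 40_000    # products carried in each store

days = YEARS * DAYS_PER_YEAR

# Rows if only products that sell on a given day generate a fact record.
max_rows = days * STORES * PRODUCTS_SOLD_DAILY

# Looser upper bound if every stocked product produced a row every day.
upper_bound_rows = days * STORES * PRODUCTS_STOCKED

print(f"days covered: {days}")
print(f"rows (products sold daily): {max_rows:,}")
print(f"rows (all stocked products): {upper_bound_rows:,}")
```

Under these assumptions the estimate is about 2.19 billion records; the alternative bound is ten times larger.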

Q2.  Define the following terms, giving examples: (20)
        (a) Factless fact tables
        (b) Snowflake Schema
        (c) Web Structure Mining
        (d) Concept Hierarchy

Q3.  (a) Apply Agglomerative Hierarchical Clustering and draw single link and average link dendrograms for the following distance matrix. (10)
        (b) Explain the Page Rank technique with algorithm. (10)

Q4.  (a) Consider a data warehouse for a hospital, where there are three dimensions: (1) Doctor (2) Patient (3) Time; and two measures: (1) Count and (2) Fees. For this example, create an OLAP cube and describe the following OLAP operations: (10)
        (b) Consider the following transaction database: (10)

TID        Items
01         A, B, C, D
02         A, B, C, D, E, G
03         A, C, G, H, K
04         B, C, D, E, K
05         D, E, F, H, L
06         A, B, C, D, L
07         B, I, E, K, L
08         A, B, D, E, K
09         A, E, F, H, L
10         B, C, D, F

Apply the Apriori algorithm with a minimum support of 30% and a minimum confidence of
70%, and find all the association rules in the data set.
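
The mechanics of Q4(b) can be sketched in a few lines of Python. This is a minimal illustration rather than a model answer: it assumes TIDs 05 and 06 are separate transactions and that TID 07 appears once, giving ten transactions, so 30% support corresponds to a count of 3. The function and variable names are hypothetical.

```python
import math
from itertools import combinations

# Transaction database from Q4(b); assumes ten distinct transactions.
TRANSACTIONS = {
    "01": set("ABCD"),  "02": set("ABCDEG"), "03": set("ACGHK"),
    "04": set("BCDEK"), "05": set("DEFHL"),  "06": set("ABCDL"),
    "07": set("BIEKL"), "08": set("ABDEK"),  "09": set("AEFHL"),
    "10": set("BCDF"),
}
MIN_SUPPORT_PCT = 30
MIN_CONFIDENCE = 0.70

def support_count(itemset):
    """Number of transactions containing every item of itemset."""
    return sum(1 for items in TRANSACTIONS.values() if itemset <= items)

def frequent_itemsets():
    """Level-wise Apriori search: returns {itemset: support count}."""
    min_count = math.ceil(MIN_SUPPORT_PCT * len(TRANSACTIONS) / 100)
    all_items = sorted(set().union(*TRANSACTIONS.values()))
    candidates = [frozenset([item]) for item in all_items]
    frequent, k = {}, 1
    while candidates:
        counts = {c: support_count(c) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = [c for c in joined
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))]
    return frequent

def association_rules(frequent):
    """Rules (antecedent, consequent, confidence) meeting MIN_CONFIDENCE."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), r):
                antecedent = frozenset(combo)
                confidence = count / frequent[antecedent]
                if confidence >= MIN_CONFIDENCE:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules

frequent = frequent_itemsets()
rules = association_rules(frequent)
for lhs, rhs, conf in sorted(rules, key=lambda r: -r[2]):
    print(f"{sorted(lhs)} -> {sorted(rhs)}  confidence={conf:.2f}")
```

Looking up `frequent[antecedent]` is safe because every subset of a frequent itemset is itself frequent, which is the property the prune step relies on.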

Q5.  (a) A simple example from the stock market involving only discrete ranges has
Profit as the categorical attribute, with values {up, down}, and the training data is:


Apply the decision tree algorithm and show the generated rules. (10)
        (b) Describe the steps of the ETL (Extract - Transform - Load) cycle. (10)

Q6.  (a) Define multidimensional and multilevel association mining. (10)
        (b) Explain the role of metadata in a data warehouse. (10)

Q7.  Write detailed notes on: (20)
        (a) Data Warehouse Architecture
        (b) K-Means Clustering
