Friday, June 27, 2014

DATA WAREHOUSE AND DATA MINING (DWDM) MAY 2010 COMPUTER SCIENCE SEMESTER 6

Con. 3881-10.                                  (REVISED COURSE)                            AN-4472

                                                            (3 Hours)                                     [Total Marks :- 100]


N.B. (1) Question No.1 is compulsory.
        (2) Attempt any four questions out of remaining six questions.

1. (a) Define Data Warehouse. Explain the architecture of data warehouse with suitable block
         diagram. [10 Marks]
    (b) Explain data mining as a step in KDD. Give the architecture of typical DM system. [10 Marks]

2. (a) How do the top-down and bottom-up approaches of building a data warehouse differ? Discuss the
         merits and limitations of each approach. [10 Marks]
    (b) What is K-means clustering? Apply the K-means algorithm to the following data for two
          clusters. Data set {10, 4, 2, 12, 3, 20, 30, 11, 25, 31} [10 Marks]
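
One way to work through the data set in 2(b) is sketched below in Python. It is a minimal illustration, not a model answer: the question does not say how to pick the starting centroids, so the sketch assumes the first two values (10 and 4) as initial means, and the function name kmeans_1d is made up for this example. A different starting choice may take a different number of passes.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Plain K-means for 1-D data; returns (centroids, clusters)."""
    centroids = list(centroids)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        # Stop once the means no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


data = [10, 4, 2, 12, 3, 20, 30, 11, 25, 31]
# Assumption: the first two data values serve as the initial means.
centroids, clusters = kmeans_1d(data, centroids=[10, 4])
print(centroids)  # [26.5, 7.0]
print(clusters)   # [[20, 30, 25, 31], [10, 4, 2, 12, 3, 11]]
```

With these starting means the assignments stop changing on the fourth pass, giving the clusters {2, 3, 4, 10, 11, 12} (mean 7) and {20, 25, 30, 31} (mean 26.5).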

3. (a) Give information package for recording information requirement for "Hotel Occupancy"
          considering dimensions like time, Hotel etc. Design star schema from the information
          package.   [10 Marks]
    (b) Explain HITS algorithm. [10 Marks]

4. (a) What is classification? What are the issues in classification? Apply a statistical-based algorithm
          to obtain the actual probabilities of each event to classify the new tuple as tall. Use the
          following data. [10 Marks]
       
    (b) Define Metadata. What are the different types of metadata stored in a data warehouse?
         Illustrate with a simple customer sales data warehouse. [10 Marks]

5. (a) What are clustering techniques? Discuss the Agglomerative algorithm using the following data and
          plot a Dendrogram using the single link approach. The following figure contains sample data
          items and the distances between the elements:- [10 Marks]


    (b) The All Electronics company has a sales department. Sales considers three dimensions,
         namely: [10 Marks]
             (i) Time        (ii) Product           (iii) Store
         The schema contains a central fact table, sales, with two measures:
             (i) dollars-cost and  (ii) units-sold
         Using the above example, describe the following OLAP operations: -
             (i) Dice       (ii) Slice        (iii) Roll-up         (iv) Drill-down
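
The four operations in 5(b) can be tried out on a toy version of the sales fact table. The Python sketch below is illustrative only: every row, city and product name is invented for this example, the time dimension is assumed to have a quarter-to-year hierarchy, and the helper names aggregate and select are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows of the sales fact table, already joined to its dimensions:
# (year, quarter, product, store, dollars_cost, units_sold)
SALES = [
    (2009, "Q1", "TV",    "Mumbai", 500.0, 5),
    (2009, "Q1", "Phone", "Mumbai", 200.0, 4),
    (2009, "Q2", "TV",    "Pune",   300.0, 3),
    (2009, "Q2", "Phone", "Pune",   100.0, 2),
    (2010, "Q1", "TV",    "Mumbai", 400.0, 4),
    (2010, "Q1", "Phone", "Pune",   150.0, 3),
]
COLS = ("year", "quarter", "product", "store")


def aggregate(rows, group_by):
    """Sum both measures for each combination of the group_by columns."""
    idx = [COLS.index(c) for c in group_by]
    totals = defaultdict(lambda: [0.0, 0])
    for row in rows:
        key = tuple(row[i] for i in idx)
        totals[key][0] += row[4]  # dollars_cost
        totals[key][1] += row[5]  # units_sold
    return {k: tuple(v) for k, v in totals.items()}


def select(rows, **conditions):
    """Keep rows whose dimension values fall in the allowed sets."""
    return [
        r for r in rows
        if all(r[COLS.index(c)] in allowed for c, allowed in conditions.items())
    ]


# Slice: fix one dimension to a single value (time = 2009).
slice_2009 = select(SALES, year={2009})
# Dice: restrict two or more dimensions to chosen values.
dice = select(SALES, product={"TV"}, store={"Mumbai"})
# Roll-up: climb the time hierarchy from quarter to year.
by_year = aggregate(SALES, ["year"])
# Drill-down: descend from year back to (year, quarter).
by_quarter = aggregate(SALES, ["year", "quarter"])

print(by_year)
print(by_quarter)
```

Roll-up and drill-down use the same aggregation at a coarser or finer level of the time hierarchy, while slice and dice only filter the cube without changing its level of detail.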

6. (a) Explain ETL of data warehousing in detail. [10 Marks]
    (b) Consider the following transactions:-

         Apply the Apriori Algorithm with minimum support of 30% and minimum confidence of 75%
         and find the large item set L. [10 Marks]

7. Write short notes on any four:- [20 Marks]
    (a) Trends in data warehousing
    (b) Decision Tree based classification approach.
    (c) Key restructuring
    (d) Crawlers
    (e) Web personalization.
         
