DATA WAREHOUSE AND DATA MINING (DWDM) MAY 2012 COMPUTER SCIENCE SEMESTER 6
(3 Hours) [Total Marks : 100]
N.B. : (1) Question No. 1 is compulsory.
(2) Solve any four out of the remaining.
(3) Draw suitable diagrams wherever necessary.
(4) Assume suitable data (if required).
| 1. (a) | Define a data warehouse. Explain the need for developing a data | 10 |
| warehouse and hence explain its architecture. | ||
| (b) | Compare OLTP and OLAP systems. Explain the steps in KDD with a suitable | 10 |
| block diagram. | ||
| 2. (a) | What is meant by ETL? Explain the ETL process in detail. | 10 |
| (b) | State and explain the various schemas used in data warehousing, with examples | 10 |
| for each of them. | ||
| 3. (a) | Differentiate between the top-down and bottom-up approaches for building a data | 10 |
| warehouse. Explain the advantages and disadvantages of each. | ||
| (b) | Define what is meant by an information package diagram. For recording the | 10 |
| information requirements for "hotel occupancy", with dimensions such as time, | ||
| hotel, etc., give the information package diagram. Also draw the | ||
| star schema and snowflake schema. | ||
| 4. (a) | What is meant by metadata? Explain with an example. Explain the different | 10 |
| types of metadata stored in a data warehouse. | ||
| (b) | Explain what is meant by association rule mining. For the table given below, | 10 |
| apply the Apriori algorithm. Also: | ||
| (i) Determine the frequent k-itemsets obtained. | ||
| (ii) Justify the strong association rules that have been determined, i.e. | ||
| specify which is the strongest rule obtained. | ||
| The table is as follows: | ||
| TID Items | ||
| 01 1, 3, 4, 6 | ||
| 02 2, 3, 5, 7 | ||
| 03 1, 2, 3, 5, 8 | ||
| 04 2, 5, 9, 10 | ||
| 05 1, 4 | ||
| Assume a minimum support of 30% and a minimum confidence of 75%. | ||
| 5. (a) | Explain dimensional modelling in detail. | 10 |
| (b) | Explain what is meant by clustering. State and explain the various types of | 10 |
| clustering, with a suitable example for each. | ||
| 6. (a) | What is meant by classification? Justify why classification is said to be supervised | 10 |
| learning. How is classifier accuracy determined? Also explain its various | ||
| types. | ||
| (b) | What is meant by market-basket analysis? Explain with an example. State and | 10 |
| explain, with formulae, the meaning of the terms: | ||
| (i) Support | ||
| (ii) Confidence | ||
| (iii) Iceberg queries. | ||
| Hence explain how to mine multilevel association rules from transaction | ||
| databases, with an example for each. | ||
| 7. | Write short notes on any two of the following: | 20 |
| (a) OLAP operations | ||
| (b) Data warehouse deployment and maintenance | ||
| (c) Attribute oriented induction | ||
| (d) Web mining. |
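
The Apriori task in Question 4(b) can be checked with a short script. The following is only a minimal sketch, not a model answer: the function names (`support`, `apriori`, `strong_rules`) are hypothetical, and it assumes the usual definitions, where the support of an itemset is the fraction of transactions containing it and the confidence of a rule X -> Y is support(X ∪ Y) / support(X). The transactions and thresholds are copied from the question.

```python
from itertools import combinations

# Transactions copied from the table in Question 4(b), TID 01 to 05.
TRANSACTIONS = [
    {1, 3, 4, 6},
    {2, 3, 5, 7},
    {1, 2, 3, 5, 8},
    {2, 5, 9, 10},
    {1, 4},
]
MIN_SUPPORT = 0.30      # 30%, as stated in the question
MIN_CONFIDENCE = 0.75   # 75%, as stated in the question


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def apriori(transactions, min_support):
    """Return {frequent itemset: support}, built level by level."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]   # candidate 1-itemsets
    frequent = {}
    k = 1
    while level:
        # Keep only the candidates that meet the minimum support.
        survivors = {}
        for cand in level:
            s = support(cand, transactions)
            if s >= min_support:
                survivors[cand] = s
        frequent.update(survivors)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in survivors for b in survivors if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = [
            c for c in joined
            if all(frozenset(sub) in survivors for sub in combinations(c, k - 1))
        ]
    return frequent


def strong_rules(frequent, min_confidence):
    """Return (lhs, rhs, support, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                conf = sup / frequent[lhs]   # subsets of a frequent set are frequent
                if conf >= min_confidence:
                    rules.append((lhs, itemset - lhs, sup, conf))
    return rules


frequent_itemsets = apriori(TRANSACTIONS, MIN_SUPPORT)
for itemset, sup in sorted(frequent_itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"support={sup:.0%}")

# List rules with the highest confidence and support first.
for lhs, rhs, sup, conf in sorted(strong_rules(frequent_itemsets, MIN_CONFIDENCE),
                                  key=lambda r: (-r[3], -r[2])):
    print(sorted(lhs), "->", sorted(rhs), f"support={sup:.0%}", f"confidence={conf:.0%}")
```

The sketch recounts support by rescanning all transactions for each candidate, which is acceptable for five transactions but would be replaced by a single counting pass per level on a real database.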