Intro to data mining | Information Systems homework help

Chapter 3, exercises in 3.11

5. Consider the following data set for a binary class problem.

A   B   Class Label
T   F   +
T   T   +
T   T   +
T   F   −
T   T   +
F   F   −
F   F   −
F   F   −
T   T   −
T   F   −

a. Calculate the information gain when splitting on A and B. Which attribute would the decision tree induction algorithm choose?

b. Calculate the gain in the Gini index when splitting on A and B. Which attribute would the decision tree induction algorithm choose?

c. Figure 3.11 shows that entropy and the Gini index are both monotonically increasing on the range [0, 0.5] and they are both monotonically decreasing on the range [0.5, 1]. Is it possible that information gain and the gain in the Gini index favor different attributes? Explain.
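The two gains asked for in parts (a) and (b) can be checked numerically. The following is a minimal sketch, assuming the ten rows are transcribed correctly from the table above and that gain is defined as parent impurity minus the size-weighted impurity of the children. The names `ROWS`, `entropy`, `gini`, and `gain` are illustrative and do not come from the textbook.

```python
from collections import Counter
from math import log2

# Exercise 5 data as (A, B, class) rows, transcribed from the table above.
ROWS = [
    ("T", "F", "+"), ("T", "T", "+"), ("T", "T", "+"), ("T", "F", "-"),
    ("T", "T", "+"), ("F", "F", "-"), ("F", "F", "-"), ("F", "F", "-"),
    ("T", "T", "-"), ("T", "F", "-"),
]

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gain(rows, attr_index, impurity):
    """Parent impurity minus the size-weighted impurity of the child nodes."""
    parent = impurity([r[-1] for r in rows])
    children = 0.0
    for value in {r[attr_index] for r in rows}:
        subset = [r[-1] for r in rows if r[attr_index] == value]
        children += len(subset) / len(rows) * impurity(subset)
    return parent - children

# Print information gain and Gini gain for each attribute.
for name, idx in (("A", 0), ("B", 1)):
    print(name, round(gain(ROWS, idx, entropy), 4), round(gain(ROWS, idx, gini), 4))
```

Comparing the two printed columns attribute by attribute shows which split each criterion prefers, which bears directly on part (c).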

7. Consider the following set of training examples.

X   Y   Z   No. of Class C1 Examples   No. of Class C2 Examples
0   0   0    5                         40
0   0   1    0                         15
0   1   0   10                          5
0   1   1   45                          0
1   0   0   10                          5
1   0   1   25                          0
1   1   0    5                         20
1   1   1    0                         15

a. Compute a two-level decision tree using the greedy approach described in this chapter. Use the classification error rate as the criterion for splitting. What is the overall error rate of the induced tree?

b. Repeat part (a) using X as the first splitting attribute and then choose the best remaining attribute for splitting at each of the two successor nodes. What is the error rate of the induced tree?

c. Compare the results of parts (a) and (b). Comment on the suitability of the greedy heuristic used for splitting attribute selection.
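One way to verify a hand computation for parts (a) and (b) is to count misclassified examples directly from the table. This is a rough sketch, assuming each leaf predicts its majority class and that the second-level attribute is chosen independently in each branch. `COUNTS`, `split_errors`, and `two_level_errors` are hypothetical names introduced here for illustration.

```python
# Exercise 7 counts: (X, Y, Z) -> (class C1 count, class C2 count).
COUNTS = {
    (0, 0, 0): (5, 40), (0, 0, 1): (0, 15), (0, 1, 0): (10, 5), (0, 1, 1): (45, 0),
    (1, 0, 0): (10, 5), (1, 0, 1): (25, 0), (1, 1, 0): (5, 20), (1, 1, 1): (0, 15),
}
ATTRS = {"X": 0, "Y": 1, "Z": 2}

def split_errors(counts, attr):
    """Misclassified examples if each branch of `attr` predicts its majority class."""
    errors = 0
    for value in (0, 1):
        c1 = sum(n[0] for k, n in counts.items() if k[ATTRS[attr]] == value)
        c2 = sum(n[1] for k, n in counts.items() if k[ATTRS[attr]] == value)
        errors += min(c1, c2)  # the minority class in a branch is misclassified
    return errors

def two_level_errors(counts, root):
    """Errors of a two-level tree with a fixed root and the best second split per branch."""
    total = 0
    for value in (0, 1):
        branch = {k: n for k, n in counts.items() if k[ATTRS[root]] == value}
        total += min(split_errors(branch, a) for a in ATTRS if a != root)
    return total

total = sum(c1 + c2 for c1, c2 in COUNTS.values())
# Greedy choice of root: the attribute with the fewest errors after one split.
greedy_root = min(ATTRS, key=lambda a: split_errors(COUNTS, a))
print("greedy root:", greedy_root, "error rate:", two_level_errors(COUNTS, greedy_root) / total)
print("root X error rate:", two_level_errors(COUNTS, "X") / total)
```

Running both calls side by side gives the two error rates that part (c) asks you to compare.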

8. The following table summarizes a data set with three attributes A, B, C and two class labels +, −. Build a two-level decision tree.

            Number of Instances
A   B   C    +     −
T   T   T    5     0
F   T   T    0    20
T   F   T   20     0
F   F   T    0     5
T   T   F    0     0
F   T   F   25     0
T   F   F    0     0
F   F   F    0    25

a. According to the classification error rate, which attribute would be chosen as the first splitting attribute? For each attribute, show the contingency table and the gains in classification error rate.

b. Repeat for the two children of the root node.

c. How many instances are misclassified by the resulting decision tree?

d. Repeat parts (a), (b), and (c) using C as the splitting attribute.

e. Use the results in parts (c) and (d) to conclude about the greedy nature of the decision tree induction algorithm.
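The contingency tables and error-rate gains requested in parts (a) through (d) can be tabulated with a short script. The sketch below assumes the counts are transcribed correctly from the table above, that gain is the error rate before the split minus the error rate after it, and that ties between attributes are broken by taking the first one in A, B, C order. All function names are illustrative.

```python
# Exercise 8 counts: (A, B, C) -> (number of +, number of -).
COUNTS = {
    ("T", "T", "T"): (5, 0), ("F", "T", "T"): (0, 20),
    ("T", "F", "T"): (20, 0), ("F", "F", "T"): (0, 5),
    ("T", "T", "F"): (0, 0), ("F", "T", "F"): (25, 0),
    ("T", "F", "F"): (0, 0), ("F", "F", "F"): (0, 25),
}
ATTRS = ("A", "B", "C")

def contingency(counts, attr):
    """Return {attribute value: [number of +, number of -]} for one attribute."""
    i = ATTRS.index(attr)
    table = {"T": [0, 0], "F": [0, 0]}
    for key, (pos, neg) in counts.items():
        table[key[i]][0] += pos
        table[key[i]][1] += neg
    return table

def error_gain(counts, attr):
    """Classification error rate before the split minus the rate after it."""
    total = sum(p + n for p, n in counts.values())
    if total == 0:
        return 0.0
    pos = sum(p for p, _ in counts.values())
    before = min(pos, total - pos) / total
    after = sum(min(row) for row in contingency(counts, attr).values()) / total
    return before - after

def misclassified(counts, root):
    """Errors of a two-level tree: fixed root, best remaining attribute per child."""
    i = ATTRS.index(root)
    errors = 0
    for value in ("T", "F"):
        branch = {k: v for k, v in counts.items() if k[i] == value}
        best = max((a for a in ATTRS if a != root), key=lambda a: error_gain(branch, a))
        errors += sum(min(row) for row in contingency(branch, best).values())
    return errors

for attr in ATTRS:
    print(attr, contingency(COUNTS, attr), round(error_gain(COUNTS, attr), 2))
print("misclassified with root A:", misclassified(COUNTS, "A"))
print("misclassified with root C:", misclassified(COUNTS, "C"))
```

The last two lines put the tree rooted at A next to the tree rooted at C, which is the comparison part (e) relies on.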
