Introduction to Data Mining


Lesson 2.3.2.2

Instance Selection

 

Instance selection can be done in two ways: by sampling or by search-based methods.

Examples of sampling methods are:

• Random Sampling - randomly select "m" instances from the "n" initial instances.

• Stratified Sampling - randomly select "m" instances from the "n" initial instances, such that the class distribution of the original data is maintained in the selected sample.
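The two sampling methods above can be illustrated with a minimal Python sketch. The function names `random_sample` and `stratified_sample`, the toy data, and the rounding rule used to allocate slots per class are assumptions made for illustration, not part of the lesson.

```python
import random
from collections import defaultdict


def random_sample(instances, m, seed=None):
    """Select m of the n instances uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(instances, m)


def stratified_sample(instances, labels, m, seed=None):
    """Select about m instances while keeping class proportions close to the original.

    Each class receives round(m * class_size / n) slots, so the total can
    differ slightly from m because of rounding (an assumed allocation rule).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for instance, label in zip(instances, labels):
        by_class[label].append(instance)

    n = len(instances)
    sample = []
    for label, members in by_class.items():
        k = min(round(m * len(members) / n), len(members))
        # Sample within the class, and keep the label with each instance.
        sample.extend((x, label) for x in rng.sample(members, k))
    return sample


# Toy data (made up): 80 instances of class "a" and 20 of class "b".
data = list(range(100))
classes = ["a"] * 80 + ["b"] * 20

simple = random_sample(data, 10, seed=0)
stratified = stratified_sample(data, classes, 10, seed=0)
```

With this toy data the stratified sample holds 8 instances of class "a" and 2 of class "b", matching the 80/20 split of the original set, whereas the plain random sample gives no such guarantee.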

Examples of search-based methods are:

• Search for representative instances in the data based on some criterion, and remove the remaining instances.

• Form prototype instances from the actual instances so that they mimic the performance of those instances, and then use only the prototype instances.

• Use statistical measures (number of instances, mean, or standard deviation) to replace redundant instances with their representative pseudo-instances.

• Use support vectors to represent the entire set of instances in the data set.
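As a rough illustration of replacing instances with statistical pseudo-instances, the sketch below summarises each class by its instance count, per-feature mean, and per-feature standard deviation. Grouping by class label, the function name `class_pseudo_instances`, and the toy data are assumptions; the lesson does not specify how redundant instances are identified.

```python
from collections import defaultdict
from statistics import mean, pstdev


def class_pseudo_instances(instances, labels):
    """Summarise each class by count, per-feature mean and standard deviation."""
    by_class = defaultdict(list)
    for features, label in zip(instances, labels):
        by_class[label].append(features)

    summary = {}
    for label, rows in by_class.items():
        columns = list(zip(*rows))  # one tuple of values per feature
        summary[label] = {
            "count": len(rows),
            "mean": [mean(col) for col in columns],
            "std": [pstdev(col) for col in columns],  # population std. dev.
        }
    return summary


# Toy data (made up): two numeric features, two classes.
points = [(1.0, 2.0), (3.0, 4.0), (10.0, 10.0), (12.0, 14.0)]
classes = ["a", "a", "b", "b"]
prototypes = class_pseudo_instances(points, classes)
```

Here four instances are reduced to two pseudo-instances, one per class. A real data set would usually need a finer grouping than one summary per class, for example one per cluster.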

 
