Showing posts with label Ordinal. Show all posts
Showing posts with label Ordinal. Show all posts

Friday, 12 May 2017

#3 - Measuring Data Similarity or Dissimilarity


Continue from -
 'Measuring Data Similarity or Dissimilarity #1'
 'Measuring Data Similarity or Dissimilarity #2',


3. For Ordinal Attributes:

Ordinal attribute is an attribute with possible values that have a meaningful order or ranking among them but the magnitude between successive values is not known. Ordinal values are same as Categorical Values but with the Order.

Such as, For "Performance" columns Values are - Best, Better, Good, Average, Below Average, Bad

These values are Categorical values with order or rank so called Ordinal Values. Ordinal attributes can also be derived from discretization of numeric attributes by splitting the value range into finite number of ordered categories.

We assign rank to these categories to calculate the similarity or dissimilarity, i.e. - There is an attribute f having N possible state can have `1, 2, 3........f_N` ranking.


Measuring Data Similarity or Dissimilarity for Ordinal Attributes


How to Calculate Similarity or Dissimilarity: 

1, Assign the Rank `R_if`to each category of attribute f having N possible states.
2. Normalize the Rank between [0.0, 1.0] so that each attribute have equal weight.
Can be calculated as

`R_in = \frac{R_if - 1}{N - 1}`

3. Now Similarity or Dissimilarity can be calculated with any distance measuring techniques. ( 'Measuring Data Similarity or Dissimilarity #2)






Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/