My e-Notes about DataScience, Machine Learning, Python, Data Analytics, DataStage, DWH and ETL Concepts


Monday, 29 July 2019

Frequency Distribution #1 - #UnlockStats

Raw Data:

Raw data are collected data that have not been organized in any way.


An array is a list of raw numerical data in ascending or descending order of magnitude.
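
As a quick illustration, here is a minimal Python sketch of turning raw data into an array. The heights in raw_heights are made-up values used only for this example; they are not taken from the table later in this post.

```python
# Hypothetical raw heights (in inches), invented purely for illustration
raw_heights = [67, 62, 70, 65, 73, 68]

# An array is the same data arranged in order of magnitude
ascending = sorted(raw_heights)
descending = sorted(raw_heights, reverse=True)

print(ascending)   # [62, 65, 67, 68, 70, 73]
print(descending)  # [73, 70, 68, 67, 65, 62]
```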

Frequency Distribution:

When summarizing a large amount of data, we group the values into classes (or categories). The number of individuals belonging to each class is called the Class Frequency.
A tabular arrangement of data by classes, together with the corresponding class frequencies, is called a Frequency Distribution.


The table below shows the heights of 100 students grouped into classes -

Height (in)    No. of Students
60-62          5
63-65          18
66-68          38
69-71          31
72-74          8
Total          100
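
To see how such a table could be produced, here is a rough Python sketch that tallies heights into the five classes above. The function name class_frequencies and the short sample list are my own assumptions for illustration (the original 100 raw heights are not given in this post), and the sketch assumes heights are recorded in whole inches.

```python
# Class limits from the table above: (lower, upper), both ends inclusive
CLASSES = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]

def class_frequencies(heights, classes=CLASSES):
    """Count how many heights fall into each class (hypothetical helper)."""
    counts = {c: 0 for c in classes}
    for h in heights:
        for lower, upper in classes:
            if lower <= h <= upper:
                counts[(lower, upper)] += 1
                break  # each value belongs to one class only
    return counts

# Hypothetical sample of whole-inch heights, not the real data set
sample = [61, 64, 64, 67, 67, 67, 70, 73]
frequencies = class_frequencies(sample)

for (lower, upper), freq in frequencies.items():
    print(f"{lower}-{upper}: {freq}")
```

The class frequencies always add up to the number of observations, which is a handy check on any frequency table.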

Class Intervals and Class Limits:

A symbol defining a class, such as 63-65, is called a Class Interval. Because it has both end numbers, it is also called a Closed Class Interval.
The end numbers of a class are called Class Limits. For 66-68, 66 is the Lower Class Limit and 68 is the Upper Class Limit. A class interval that has no upper class limit or no lower class limit is called an Open Class Interval, such as the category 65+ years.

Class Boundaries:

Class Boundaries are obtained by adding the upper class limit of one category to the lower class limit of the next category and dividing the sum by 2.

Upper Class Boundary (n) = { UCL(n) + LCL(n+1) } / 2
Lower Class Boundary (n) = { UCL(n-1) + LCL(n) } / 2

For the 63-65 category, 65.5 { (65+66)/2 } is the upper class boundary and 62.5 { (62+63)/2 } is the lower class boundary.
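
The boundary calculation can be sketched in Python as below. The function names are my own, and the sketch only handles classes that have a neighbouring class on the relevant side (the first class has no class before it, and the last class has none after it).

```python
# Class limits from the height table: (lower limit, upper limit)
CLASS_LIMITS = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]

def upper_boundary(limits, n):
    """Upper boundary of class n = (UCL(n) + LCL(n+1)) / 2."""
    return (limits[n][1] + limits[n + 1][0]) / 2

def lower_boundary(limits, n):
    """Lower boundary of class n = (UCL(n-1) + LCL(n)) / 2."""
    return (limits[n - 1][1] + limits[n][0]) / 2

# The 63-65 category is at index 1 in the list
print(lower_boundary(CLASS_LIMITS, 1))  # 62.5
print(upper_boundary(CLASS_LIMITS, 1))  # 65.5
```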

Size/Width of a Class Interval:

The difference between the lower and upper class boundaries is called the size or width of a Class Interval,
such as -
For the 63-65 category, Width = 65.5 - 62.5 = 3
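
The same width can be checked with a few lines of Python. This is only a sketch: class_width is a hypothetical helper that takes the limits of the class and of its two neighbouring classes.

```python
def class_width(prev_upper_limit, lower_limit, upper_limit, next_lower_limit):
    """Width = upper class boundary - lower class boundary."""
    lower_boundary = (prev_upper_limit + lower_limit) / 2
    upper_boundary = (upper_limit + next_lower_limit) / 2
    return upper_boundary - lower_boundary

# 63-65 category: previous class ends at 62, next class starts at 66
width = class_width(62, 63, 65, 66)
print(width)  # 3.0
```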

The Class Mark:

The Class Mark is the midpoint of a Class Interval and can be calculated as below -

Class Mark (n) = { UCL(n) + LCL(n) } / 2

For 63-65 category, Class Mark is - (63 + 65) / 2 = 64
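
Applying the same formula to every class in the height table gives all the class marks at once. A small Python sketch (the variable names are my own):

```python
# Class limits from the height table: (lower limit, upper limit)
CLASS_LIMITS = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]

# Class mark = (lower class limit + upper class limit) / 2
class_marks = [(lower + upper) / 2 for lower, upper in CLASS_LIMITS]

print(class_marks)  # [61.0, 64.0, 67.0, 70.0, 73.0]
```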

Histograms and Frequency Polygons are two graphical representations of a frequency distribution. We will discuss these more in the next post.

Till then, Happy Learning.........


