# HDip Data Analytics


DATA ANALYTICS REFERENCE DOCUMENT

Document Title: Descriptive Statistics
Document No.: 1540829111
Author(s): Rita Raher
Contributor(s):

REVISION HISTORY

| Revision | Details of Modification(s) | Reason for modification | Date | By |
|---|---|---|---|---|
| 0 | Draft release | Document description here | 2018/10/29 16:05 | Rita Raher |

# Descriptive Statistics

Descriptive statistics summarizes the features of a data set. It differs from inferential statistics in that it aims to summarize a sample rather than use the data to learn about the population that the sample is thought to represent. As a result, descriptive statistics, unlike inferential statistics, is generally not developed on the basis of probability theory, and its measures are frequently nonparametric.

Measures commonly used to describe a data set fall into two main groups: measures of central tendency and measures of variability (dispersion). Measures of central tendency include the mean, median and mode. Measures of variability include the standard deviation (or variance) and the minimum and maximum values of the variables. Skewness and kurtosis are often reported alongside them; they describe the shape of the distribution.
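The dispersion and shape measures named above can be sketched with pandas as follows. This is a minimal illustration rather than a prescribed method: the `scores` values are hypothetical and were chosen only so the numbers are easy to check by hand.

```python
import pandas as pd

# Hypothetical sample values, used only for illustration
scores = pd.Series([2, 4, 4, 4, 5, 5, 7, 9])

# pandas uses the sample formulas (ddof=1) by default
print("standard deviation:", scores.std())
print("variance:", scores.var())

# Range of the data
print("min:", scores.min(), "max:", scores.max())

# Shape of the distribution
print("skewness:", scores.skew())
print("kurtosis:", scores.kurt())
```

Passing `ddof=0` to `std()` or `var()` gives the population versions instead, which is worth checking when comparing results against another tool.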

Methods for visually presenting summary statistics include tables, charts, and graphical plots.

### Descriptive statistics using Pandas

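```python
import pandas as pd

# df is assumed to be an existing pandas DataFrame
# Summary statistics (count, mean, std, min, quartiles, max) for the numeric columns
df.describe()
```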
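As a self-contained sketch of what `describe()` returns, the example below builds a small DataFrame first. The column names and values are assumptions made for this illustration and do not come from any course data set.

```python
import pandas as pd

# Hypothetical data; column names are assumptions made for this example
df = pd.DataFrame({
    "height_cm": [150, 160, 170, 180, 190],
    "weight_kg": [50, 60, 70, 80, 90],
})

# One row per statistic, one column per numeric variable
summary = df.describe()
print(summary)

# Individual statistics can be read from the summary by row label
print(summary.loc["mean", "height_cm"])
print(summary.loc["50%", "weight_kg"])
```

The `50%` row is the median, so `describe()` already covers one measure of central tendency and several measures of variability in a single call.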

## Central Tendency

In statistics, a central tendency (or measure of central tendency) is a central or typical value for a probability distribution. It may also be called a center or location of the distribution.

The most common measures of central tendency are:

• mean - the arithmetic average: the sum of the values divided by the number of values
• median - the value separating the higher half from the lower half of a data sample
• mode - the value that appears most often in a set of data values
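
```python
import numpy as np

# df is assumed to be an existing pandas DataFrame with a numeric column 'col'

# mean of each numeric column
df.mean(numeric_only=True)

# mean of a single column
np.mean(df['col'])

# median of each numeric column
df.median(numeric_only=True)
```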
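The three measures can be compared on one small sample, as in the sketch below. The `values` series is hypothetical and deliberately includes one extreme value to show how differently the mean and the median react to it.

```python
import pandas as pd

# Hypothetical sample with one extreme value (100)
values = pd.Series([1, 2, 2, 3, 4, 100])

mean_value = values.mean()      # pulled upward by the extreme value
median_value = values.median()  # middle of the sorted values
modes = values.mode()           # returned as a Series, because ties are possible

print("mean:", mean_value)
print("median:", median_value)
print("mode(s):", modes.tolist())
```

Here the median and mode stay close to the bulk of the data while the mean does not, which is one reason the median is often preferred for skewed data.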

### Graphical Plots

Graphical plots are useful because they convey a large amount of information concisely, allowing quick interpretation and understanding of the data.

The Box Plot