Chapter 1: Understanding Statistics

1.2: How Data are Classified: Categorical Data

JoVE Core
Statistics

Data, the collection of observations and measurements, form the basis of all statistical analyses and inferences.

Data can be classified based on whether they can be measured. For example, consider hair color. One cannot measure hair color in liters or kilometers, but one can group hair colors into categories such as black, brunette, or red.

Such data are called categorical data or qualitative data. The values themselves are labels rather than measured quantities, although the number of observations in each category can be counted.

Another example is human blood, which is grouped into four different types: A, B, O, or AB.
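As a minimal sketch of how categorical values are typically summarized, the Python snippet below tallies blood types with the standard library. The list of ten blood types is a made-up illustration, not real data, and the variable names are assumptions chosen for this example.

```python
from collections import Counter

# Hypothetical blood types for ten people (illustrative values, not real data).
blood_types = ["A", "O", "B", "O", "AB", "A", "O", "A", "B", "O"]

# Categorical values are labels, so the natural summary is a frequency count.
counts = Counter(blood_types)

# The most frequent category (the mode) is meaningful for categorical data;
# an arithmetic mean of the labels would not be.
mode_label, mode_count = counts.most_common(1)[0]

print(dict(counts))
print("Most common type:", mode_label, "with", mode_count, "people")
```

The point of the sketch is that the labels are counted, never added or averaged.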

In certain cases, categorical data can be ordered in a meaningful way; such data are called ordinal data. For example, coffee cup sizes (small, medium, large) or the heights of trees in a forest recorded as short, medium, or tall can be arranged in order of increasing size.
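One way to represent an ordinal category in code is to attach an explicit rank to each label. The sketch below does this for the coffee cup sizes; the list of orders is hypothetical, and the names SIZE_ORDER and size_rank are assumptions introduced for illustration.

```python
# Assumed ordering of the categories, from smallest to largest.
SIZE_ORDER = ["small", "medium", "large"]

# Map each label to its position in the ordering.
size_rank = {label: position for position, label in enumerate(SIZE_ORDER)}

# Hypothetical coffee orders (illustrative values only).
orders = ["large", "small", "medium", "small", "large"]

# Ordinal labels can be sorted and compared by rank ...
sorted_orders = sorted(orders, key=lambda label: size_rank[label])
small_before_large = size_rank["small"] < size_rank["large"]

# ... but the ranks are only positions: they do not say how much
# bigger a "large" cup is than a "medium" one.
print(sorted_orders)
print(small_before_large)
```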

A variable, usually denoted by a capital letter such as X or Y, is a characteristic or measurement that can be determined for each member of a population. Data are the actual values of a variable. They may be numbers, or they may be words. A datum is a single value.

Data are classified based on whether they are measurable. Categorical data cannot be measured; instead, they are divided into categories. For example, if Y denotes a person's party affiliation, possible values of Y include Republican, Democrat, and Independent. Y is a categorical variable. Categorizing a population based on hair color, age group, sex, or blood group also produces categorical data.

In some cases, categorical data can be ordered in a particular fashion; such data are called ordinal data. Consider a list of the top five national parks in the United States. The parks can be ranked from one to five, but the differences between the ranks are not measurable. Another example is a cruise survey in which the responses to questions about the cruise are "excellent," "good," "satisfactory," and "unsatisfactory." These responses are ordered from the most desired to the least desired. However, the difference between any two responses cannot be measured.
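Because ordinal responses have an order but no measurable spacing, a position-based summary such as the middle response is reasonable, while an average is not. The sketch below illustrates this for the cruise survey categories; the seven responses are invented for the example, and RESPONSE_ORDER and response_rank are hypothetical names.

```python
# Response categories in the order given in the text: most desired first.
RESPONSE_ORDER = ["excellent", "good", "satisfactory", "unsatisfactory"]
response_rank = {label: position for position, label in enumerate(RESPONSE_ORDER)}

# Hypothetical survey responses (illustrative values, not real data).
survey = ["good", "excellent", "satisfactory", "good",
          "unsatisfactory", "excellent", "good"]

# Sort the responses by rank and take the middle one (odd number of responses).
ordered = sorted(survey, key=lambda label: response_rank[label])
median_response = ordered[len(ordered) // 2]

# The median depends only on order, so it is meaningful for ordinal data.
# Averaging the ranks would assume equal spacing between categories,
# which ordinal data do not guarantee.
print(ordered)
print("Middle response:", median_response)
```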

This text is adapted from OpenStax, Introductory Statistics, Section 1.1: Definitions of Statistics, Probability, and Key Terms.
