1.2: How Data are Classified: Categorical Data
A variable, usually denoted by a capital letter such as X or Y, is a characteristic or measurement that can be determined for each member of a population. Data are the actual values of variables. They may be numbers, or they may be words. A datum is a single value.
Data are classified based on whether they are measurable or not. Categorical data cannot be measured; instead, they can be divided into categories. For example, if Y denotes a person's party affiliation, possible values of Y include Republican, Democrat, and Independent. Y is a categorical variable, and its values are categorical data. Classifying a population based on hair color, age group, sex, or blood group also produces categorical data.
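As a minimal sketch of what can be done with categorical data, the Python snippet below tallies party affiliations for a small sample. The sample values and the variable names are made up for illustration and are not data from the text. It shows that categories can be counted and the most common one reported, even though the values themselves cannot be measured or averaged.

```python
from collections import Counter

# Hypothetical sample of Y (party affiliation); values are made up for illustration.
affiliations = [
    "Republican", "Democrat", "Independent",
    "Democrat", "Republican", "Democrat",
]

# Categorical data can be grouped into categories and counted.
counts = Counter(affiliations)

# The most frequent category (the mode) is a meaningful summary.
mode_category, mode_count = counts.most_common(1)[0]

print(counts)
print(mode_category, mode_count)
```

Counting and finding the mode use only category membership, which is all that categorical data provide.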
In some cases, categorical data can be ordered in a meaningful way; such data are called ordinal. Consider a list of the top five national parks in the United States. The parks can be ranked from one to five, but the differences between the ranks are not measurable. Another example is a cruise survey where the responses to questions about the cruise are "excellent," "good," "satisfactory," and "unsatisfactory." These responses are ordered from the most desired response to the least desired. However, the difference between any two responses cannot be measured.
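The cruise survey example can be sketched in Python as follows. The list of answers and all variable names are hypothetical, chosen only for illustration. The sketch assigns each response a rank that encodes its position in the ordering, then sorts and compares responses by rank. The ranks carry order only; subtracting them would not yield a meaningful quantity.

```python
# Cruise survey responses from most to least desired; the order is meaningful.
RESPONSE_ORDER = ["excellent", "good", "satisfactory", "unsatisfactory"]

# Map each response to its rank (position in the ordering). Ranks label
# order only; the gaps between them are not measurable quantities.
rank = {response: position
        for position, response in enumerate(RESPONSE_ORDER, start=1)}

# Hypothetical survey answers, made up for illustration.
answers = ["good", "unsatisfactory", "excellent", "satisfactory", "good"]

# Ordinal data can be sorted by rank.
sorted_answers = sorted(answers, key=rank.__getitem__)

# Ordinal data can be compared: "excellent" is more desired than "good".
excellent_beats_good = rank["excellent"] < rank["good"]

# The median response depends only on order, so it is meaningful here.
median_answer = sorted_answers[len(sorted_answers) // 2]

print(sorted_answers)
print(median_answer)
```

Sorting, comparing, and taking the median rely only on the ordering, which is why they suit ordinal data, whereas a mean of the ranks would treat the gaps between responses as equal and measurable.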
This text is adapted from OpenStax, Introductory Statistics, Section 1.1: Definitions of Statistics, Probability, and Key Terms.