Data are individual items of information obtained from a population or sample. Data may be classified as qualitative (categorical), quantitative continuous, or quantitative discrete. Because it is often not practical to measure the entire population in a study, researchers use samples to represent the population. A random sample is a representative group from the population chosen by using a method that gives each individual in the population an equal chance of being included in the sample. Random sampling methods include simple random sampling, stratified sampling, cluster sampling, and systematic sampling. Convenience sampling is a nonrandom method of choosing a sample that often produces biased data.
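As a rough illustration of three of the random sampling methods named above, the following Python sketch draws samples from a made-up population. The function names, the population of ID numbers, and the two strata are assumptions introduced for this example only, and the stratified version assumes proportional allocation, which is one common choice rather than the only one.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals, each equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Pick a random starting point, then take every step-th individual.
    Assumes n is no larger than the population size."""
    rng = random.Random(seed)
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]

def stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction from each stratum (proportional allocation)."""
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 100 individuals identified by the numbers 1..100.
population = list(range(1, 101))

# Hypothetical strata: 60 first-year and 40 second-year students.
strata = {
    "first_year": list(range(1, 61)),
    "second_year": list(range(61, 101)),
}

srs = simple_random_sample(population, 10, seed=1)
systematic = systematic_sample(population, 10, seed=1)
stratified = stratified_sample(strata, 0.1, seed=1)

print("Simple random:", sorted(srs))
print("Systematic:   ", systematic)
print("Stratified:   ", sorted(stratified))
```

Passing a seed makes a run repeatable, which is convenient for teaching; a real study would normally not fix the seed in this way.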
Once data are collected, they can be described and presented in many different formats. For example, suppose a person is interested in buying a house in a particular area. Not having much information about house prices there, the buyer might ask the real estate agent for a sample data set of prices. Reading through all the prices in the sample can be a little overwhelming. A better way might be to look at the median price and the variation in the prices. The median and the variation are just two of the measures one can use to describe data. The agent might also provide a graph of the data, which could be a more convenient way to understand the house prices.
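The house-price example can be sketched in a few lines of Python using the standard `statistics` module. The prices below are invented for illustration, and the range and the sample standard deviation are used here as two possible measures of variation; the text does not say which measure the buyer would use.

```python
import statistics

# Hypothetical asking prices in dollars; invented for illustration only.
prices = [210_000, 225_000, 240_000, 250_000, 265_000, 280_000, 950_000]

median_price = statistics.median(prices)   # middle value of the sorted prices
mean_price = statistics.mean(prices)       # arithmetic average
price_range = max(prices) - min(prices)    # simplest measure of variation
price_stdev = statistics.stdev(prices)     # sample standard deviation

print(f"Median price:       {median_price:,.0f}")
print(f"Mean price:         {mean_price:,.0f}")
print(f"Range:              {price_range:,.0f}")
print(f"Standard deviation: {price_stdev:,.0f}")
```

In this made-up sample, one very expensive house pulls the mean well above the median, which is one reason the median is often quoted for house prices.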
The area of statistics that details the numerical and graphical ways to describe and display the sample data is called "Descriptive Statistics." A statistical graph is a tool that helps one learn about the shape or distribution of a sample or a population. A graph can be a more effective way of presenting data than a stack of numbers because it is easy to observe data clusters and identify positions where there are only a few data values. Newspapers and the Internet use graphs to show trends and to enable readers to compare facts and figures quickly. Some types of graphs that are used to summarize and organize data are the dot plot, the bar graph, the histogram, the stem-and-leaf plot, the frequency polygon (a type of broken line graph), the pie chart, and the box plot.
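One of the graphs listed above, the stem-and-leaf plot, is simple enough to build as plain text. The sketch below is a minimal version that assumes non-negative whole-number data, with the tens digit as the stem and the ones digit as the leaf; the function name and the exam scores are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf plot.
    Assumes non-negative integers; stem = tens, leaf = ones digit."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    # Include empty stems so that gaps in the data remain visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

# Hypothetical exam scores, invented for illustration.
scores = [62, 67, 71, 74, 74, 78, 83, 85, 85, 85, 91, 98]

plot = stem_and_leaf(scores)
for line in plot:
    print(line)
# 6 | 2 7
# 7 | 1 4 4 8
# 8 | 3 5 5 5
# 9 | 1 8
```

Each row shows how many values fall in that decade, so clusters and sparse regions can be read directly from the row lengths.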
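Recall that data are broadly classified as quantitative or qualitative.
Quantitative data represent numerical measurements or counts, such as the different heights of students in a class.
In contrast, qualitative data (also called categorical data) represent non-numerical variables, such as different hair colors.
For effective statistical analysis, large unorganized data sets are summarized and presented numerically in tables or visually in graphs.
For example, the temperature changes measured over the course of a day can be summarized in a table.
These data can also be represented graphically. Here, time is given along the horizontal axis and temperature along the vertical axis.
The points on the graph are joined to form a pattern, giving a visual sense of how the temperature varies with time during the day.
The graph also makes outliers stand out from the other data values; these indicate the extreme temperatures observed that day.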
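The temperature example can be mimicked with a short Python sketch that prints the readings as a table and flags unusually extreme values. The readings are invented, and the 1.5 × IQR rule used to flag outliers is a common convention chosen for this illustration, not a method stated in the text.

```python
import statistics

# Hypothetical readings: (hour of day, temperature in degrees Celsius).
readings = [(0, 12), (3, 11), (6, 11), (9, 14),
            (12, 18), (15, 29), (18, 16), (21, 13)]

# Tabular summary of the day's temperatures.
print("Hour  Temp (C)")
for hour, temp in readings:
    print(f"{hour:>4}  {temp:>8}")

# Flag outliers with the 1.5 * IQR rule (an assumed convention).
temps = [temp for _, temp in readings]
q1, _, q3 = statistics.quantiles(temps, n=4, method="inclusive")
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [(hour, temp) for hour, temp in readings
            if temp < lower_fence or temp > upper_fence]
print("Outliers:", outliers)
```

With these made-up readings only the 15:00 value lies beyond the fences, which matches what a reader would see as an isolated spike on the line graph.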