Untitled Studyset

Term	Definition
Frequency distribution	A table that groups observations into categories or intervals and records the number of observations in each group.
Relative frequency	The proportion of observations in each category, calculated as the category's frequency divided by the total number of observations.
Categorical variable	A variable consisting of labels or categories rather than numerical values.
Numerical variable	A variable representing measurable quantities or counts with meaningful numerical values.
Bar chart	A graph that uses rectangular bars to represent frequency or relative frequency of categorical data.
Histogram	A graph of adjacent rectangles showing frequency or relative frequency of numerical data intervals.
Interval	A range of values used to group numerical data in a frequency distribution.
Symmetric distribution	A distribution where the left and right sides are mirror images around the center.
Positively skewed distribution	A distribution with a longer tail extending to the right.
Negatively skewed distribution	A distribution with a longer tail extending to the left.
Contingency table	A table summarizing the relationship between two categorical variables using frequencies.
Stacked column chart	A chart that displays two categorical variables together by stacking segments for one variable within the bars for the other.
Scatterplot	A graph showing the relationship between two numerical variables using plotted points.
Linear relationship	A relationship between variables that forms a straight-line pattern.
Nonlinear relationship	A relationship between variables that does not follow a straight line.
Line chart	A graph connecting data points with lines to show trends over time.
Data visualization	The graphical or tabular presentation of data to help understand patterns and relationships.
Regression analysis	A statistical method used to model the relationship between a response variable and predictor variables.
Response variable	The variable being predicted or explained in a regression model.
Predictor variable	A variable used to explain or predict the response variable.
Simple linear regression	A regression model with one predictor variable.
Multiple linear regression	A regression model with two or more predictor variables.
Regression model	A mathematical equation describing the relationship between response and predictor variables.
Intercept	The predicted value of the response variable when all predictors equal zero.
Slope coefficient	The change in the predicted response associated with a one-unit increase in a predictor variable, holding the other predictors constant.
Residual	The difference between the observed value and predicted value of the response variable.
Predicted value (y-hat)	The estimated value of the response variable from the regression equation.
Ordinary Least Squares (OLS)	A method that estimates regression coefficients by minimizing the sum of squared errors.
Sum of Squared Errors (SSE)	The sum of squared differences between observed and predicted values.
Standard error of the estimate	The standard deviation of residuals measuring the typical prediction error.
Coefficient of determination (R squared)	The proportion of variation in the response variable explained by the regression model.
Adjusted R squared	A version of R squared that adjusts for the number of predictors and penalizes unnecessary variables.
Dummy variable	A binary variable coded as 0 or 1 used to represent categorical data in regression.
Reference category	The omitted category used as the baseline for comparison in dummy variable regression.
Multicollinearity	A condition where two or more predictor variables are highly linearly related, which inflates standard errors and makes individual coefficient estimates unreliable.
Test of joint significance (F test)	A test used to determine whether predictors jointly influence the response variable.
Test of individual significance (t test)	A test used to determine whether an individual predictor significantly affects the response variable.
P value	The probability of observing a sample result at least as extreme as the one obtained, assuming the null hypothesis is true.
Significance level	The threshold probability used to decide whether to reject the null hypothesis.
Residual plot	A graph of residuals against predicted values or a predictor variable, used to examine regression assumptions and detect patterns or outliers.
Outlier	An observation significantly different from the rest of the data.
Linearity assumption	The assumption that the relationship between predictors and response is linear in parameters.
Degrees of freedom	The number of observations minus the number of estimated parameters (n - k - 1 for a regression with k predictors and an intercept).
Sample regression equation	The equation using estimated coefficients to predict the response variable.
Goodness of fit	A measure of how well the regression model explains the observed data.
Model selection	The process of choosing the best regression model using measures like standard error and adjusted R squared.
Prediction error	The difference between actual and predicted values.
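
Worked example (a minimal sketch, not part of the original study set): the regression terms above can be tied together with a small simple linear regression computed by hand. The function name `simple_ols` and the five data points are made up for illustration only; the formulas are the standard OLS ones for a single predictor.

```python
import math

def simple_ols(x, y):
    """Fit y = b0 + b1 * x by ordinary least squares (one predictor)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Slope coefficient and intercept that minimize the sum of squared errors.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Predicted values (y-hat) and residuals (observed minus predicted).
    y_hat = [b0 + b1 * xi for xi in x]
    residuals = [yi - yh for yi, yh in zip(y, y_hat)]

    # Sum of squared errors and total variation in the response.
    sse = sum(e ** 2 for e in residuals)
    sst = sum((yi - y_bar) ** 2 for yi in y)

    k = 1                # number of predictors
    df = n - k - 1       # degrees of freedom
    se = math.sqrt(sse / df)                   # standard error of the estimate
    r2 = 1 - sse / sst                         # coefficient of determination
    adj_r2 = 1 - (1 - r2) * (n - 1) / df       # adjusted R squared

    return {"b0": b0, "b1": b1, "sse": sse, "se": se,
            "r2": r2, "adj_r2": adj_r2, "residuals": residuals}

# Illustrative data only (hypothetical values).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
fit = simple_ols(x, y)
print(fit)
```

For this toy data the sample regression equation works out to roughly y-hat = 2.2 + 0.6x, with SSE of about 2.4 and R squared of about 0.6. Adjusted R squared is lower (about 0.47) because it penalizes the degree of freedom used by the predictor.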