The project’s Key Performance Indicators (KPIs) include a data analysis expectation of at least 95% (>=95%) for the Classification Prediction Model. According to the Work Group for Community Health and Development (2018), collecting and analyzing data (qualitative, quantitative, or both) reveals “relationships, patterns, trends” (para. 6) among the variables and helps determine whether the independent (intervention) variables produced a significant change in the dependent variable. Significance is judged against a chosen level: at the .05 significance level, there is a 5% chance of concluding that an effect exists when it does not, which corresponds to a 95% confidence level.
The following are project KPIs that help to measure the success of the final data set and the predictive model:
- Total Frequency or Distribution Counts
- Sum Aggregate Amounts
- Average Aggregate Amounts
- Percentage Aggregate Amounts
- >=95% Confidence Level (0.05 Significance Level)
- Predicted Target Value vs. Actual Target Value Accuracy
- Validation Average Squared Error (lowest is best)
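As a rough illustration of how several of these KPIs could be computed, the following minimal Python sketch uses only the standard library. The data values and the function names (frequency_counts, percentage_breakdown, accuracy, average_squared_error) are hypothetical and are not taken from the project; the sketch also assumes a binary target coded as 0 or 1.

```python
from collections import Counter
from statistics import mean


def frequency_counts(values):
    """Total frequency (distribution) count for each distinct value."""
    return Counter(values)


def percentage_breakdown(values):
    """Share of each distinct value, expressed as a percentage of all rows."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: 100.0 * count / total for value, count in counts.items()}


def accuracy(actual, predicted):
    """Fraction of rows where the predicted target equals the actual target."""
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


def average_squared_error(actual, predicted_prob):
    """Mean of squared differences between actual targets and predicted probabilities."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted_prob))


# Hypothetical example data (not from the project's data set).
actual = [1, 0, 1, 1, 0, 1, 0, 0]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
predicted_prob = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7, 0.3, 0.6]
amounts = [120.0, 80.0, 100.0, 60.0]

kpis = {
    "frequency": frequency_counts(actual),
    "sum_amount": sum(amounts),
    "average_amount": mean(amounts),
    "percentage": percentage_breakdown(actual),
    "accuracy": accuracy(actual, predicted),
    "average_squared_error": average_squared_error(actual, predicted_prob),
}

# Compare the accuracy KPI against the >=95% expectation.
meets_target = kpis["accuracy"] >= 0.95
```

With this made-up data, six of the eight predictions match the actual values, so the accuracy KPI falls short of the >=95% expectation and meets_target is False.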
According to Shuttleworth and Wilson (2008), an alpha of 5% (0.05), which corresponds to a 95% confidence level, is an acceptable statistical significance criterion for most research papers; a lower alpha can be chosen when more precision is needed. P-values range from 0 to 1. When a variable’s p-value is below the cut-off, the null hypothesis (H0) of no relationship with the target variable is rejected and the variable is kept; when the p-value is at or above the cut-off, the variable “should” be rejected from the analysis because the evidence for its relationship with the target variable is too weak.
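The cut-off rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable names and p-values are hypothetical, the p-values are assumed to have been produced by an earlier statistical test, and reject_null is an illustrative helper rather than part of any library.

```python
ALPHA = 0.05  # significance level, corresponding to a 95% confidence level


def reject_null(p_value, alpha=ALPHA):
    """Return True when the p-value is below the cut-off, so H0 is rejected."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    return p_value < alpha


# Hypothetical p-values for three candidate input variables.
p_values = {"age": 0.003, "region": 0.20, "income": 0.049}

# Keep variables whose relationship with the target is statistically significant.
retained = [name for name, p in p_values.items() if reject_null(p)]
# Variables that fail the criterion are candidates for removal from the analysis.
dropped = [name for name, p in p_values.items() if not reject_null(p)]
```

Lowering ALPHA (for example, to 0.01) makes the criterion stricter, so fewer variables would be retained.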
Correlation Significance Criteria
The strength of a variable’s relationship with the target variable can be assessed with the Correlation KPI. According to Wilson (2009), variables are correlated when a change in one quantifiable variable is accompanied by a change in another quantifiable variable, and vice versa. The correlation value (r) indicates the strength of the relationship and its direction: in a positive correlation the variables move in the same direction, and in a negative correlation they move in opposing directions (i.e., +1 indicates a perfect positive correlation, -1 indicates a perfect negative correlation, and r = 0 indicates no correlation). The following is a useful rule of thumb provided by the author for determining the strength of the correlation (para. 10):
Value of r = Strength of the Relationship
- If r = -1.0 to -0.5 or 0.5 to 1.0 = Strong
- If r = -0.5 to -0.3 or 0.3 to 0.5 = Moderate
- If r = -0.3 to -0.1 or 0.1 to 0.3 = Weak
- If r = -0.1 to 0.1 = None or very weak
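The rule of thumb above can be expressed as a short Python sketch. The function names pearson_r and correlation_strength are hypothetical, and the example data is made up. Because the listed ranges share their end points (for example, 0.5 appears in both Strong and Moderate), the sketch assumes that a boundary value belongs to the stronger category.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def correlation_strength(r):
    """Map r to the rule-of-thumb label; the sign of r only gives the direction."""
    magnitude = abs(r)
    if magnitude >= 0.5:  # assumption: boundary values go to the stronger label
        return "Strong"
    if magnitude >= 0.3:
        return "Moderate"
    if magnitude >= 0.1:
        return "Weak"
    return "None or very weak"


# Hypothetical example: y rises in step with x, so r is close to +1.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
r = pearson_r(x, y)
label = correlation_strength(r)
```

Note that pearson_r divides by the spread of each variable, so it fails for a constant column; a real data set would need that case handled before the rule of thumb is applied.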
References
Shuttleworth, M., & Wilson, L. T. (2008, March 17). Research Hypothesis. Retrieved from https://explorable.com/research-hypothesis
Wilson, L. T. (2009, May 2). Statistical Correlation. Retrieved from https://explorable.com/statistical-correlation
Work Group for Community Health and Development. (2018). Chapter 37: Section 5. Collecting and Analyzing Data. Retrieved from http://ctb.ku.edu/en/table-of-contents/evaluate/evaluate-community-interventions/collect-analyze-data/main