The Predictive System Resilience Assessment Tool (PSRAT) is an open source application that applies regression methods to accurately predict system resilience. The primary functions of the PSRAT include:
- Calculating three different types of regression:
  - Multiple linear regression
  - Interaction regression
  - Polynomial regression
- Performing Forward Stepwise Regression (FSR) to determine the best covariates to use for each regression model
- Prediction of future covariate models
- Displaying model fit and confidence interval plots
- Comparison of regression models
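
To make the three model families more concrete, the sketch below builds the row of regression terms each one might use for a single observation. It illustrates the general techniques only and is not taken from PSRAT's code: the function names, the use of pairwise products for the interaction model, and the default polynomial degree of 2 are all assumptions.

```python
from itertools import combinations

def linear_terms(x):
    """Intercept plus one term per covariate: [1, x1, x2, ...]."""
    return [1.0] + list(x)

def interaction_terms(x):
    """Linear terms plus every pairwise product xi * xj (i < j). Assumed form."""
    return linear_terms(x) + [a * b for a, b in combinations(x, 2)]

def polynomial_terms(x, degree=2):
    """Linear terms plus powers 2..degree of each covariate. Assumed form."""
    powers = [v ** d for d in range(2, degree + 1) for v in x]
    return linear_terms(x) + powers

row = [2.0, 3.0, 5.0]  # one observation of three covariates
print(linear_terms(row))       # [1.0, 2.0, 3.0, 5.0]
print(interaction_terms(row))  # [1.0, 2.0, 3.0, 5.0, 6.0, 10.0, 15.0]
print(polynomial_terms(row))   # [1.0, 2.0, 3.0, 5.0, 4.0, 9.0, 25.0]
```

Each model is then an ordinary least-squares fit over its own set of terms, which is why the same covariates can produce three different fits.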
There are two ways to run PSRAT: as a Python application or as a Docker image. Running through Docker works on any system, but requires installing additional software (Docker itself). Running directly as a Python app will not work on Windows, but it may be preferable on other systems because it is more lightweight.
Setup instructions for Docker are available at DockerInstructions.md and instructions for the Python app are available at PythonInstructions.md.
Select a dataset using the "Select file" button. When running the Python app, this will open a file picker in the datasets folder. When running PSRAT with Docker, the datasets folder is the only location you can import from or export to. Datasets are expected to be in xlsx, csv, or txt format, as shown in the examples in the attached "datasets" folder. Once a dataset is selected, the covariates box will fill in with the covariates provided in the dataset, and the Input Data box will fill with the graph of the Y Vector.
Then select the covariates you wish to calculate a regression fit for by clicking on them in the covariates box. The order of covariates matters. If you wish to perform FSR, don't select any covariates.
- Cutoff is the fraction of the dataset that the model is fit to. The remainder is used for validation. The nominal value is 0.9.
- Confidence - often notated z or labeled confidence level - is used to produce a confidence interval. The nominal value is 0.975.
- Predictions is the number of intervals forward the model should attempt to predict.
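
As a rough illustration of what the nominal Cutoff and Confidence values mean, the sketch below splits a series at a cutoff fraction and converts a confidence value into a standard normal quantile. This is not PSRAT's code. The helper names are hypothetical, and both the way the split index is rounded and the use of a normal quantile for the interval are assumptions.

```python
from statistics import NormalDist

def split_at_cutoff(values, cutoff=0.9):
    """Return (fit_part, validation_part) for a cutoff fraction in 0..1."""
    n_fit = int(len(values) * cutoff)  # assumed rounding: truncate
    return values[:n_fit], values[n_fit:]

def z_value(confidence=0.975):
    """Standard normal quantile for the given confidence value."""
    return NormalDist().inv_cdf(confidence)

y = list(range(20))                 # stand-in for a Y Vector of 20 intervals
fit, validation = split_at_cutoff(y, 0.9)
print(len(fit), len(validation))    # 18 2
print(round(z_value(0.975), 2))     # 1.96
```

Under these assumptions, a confidence of 0.975 corresponds to the familiar multiplier of about 1.96, which gives a two-sided 95% interval.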
When ready to calculate, press the solve button. If you input a predictions value greater than 0, a box will appear prompting you to input the values for each selected covariate for that number of intervals. Once complete, a green bar will begin filling indicating the number of model fits that have been calculated. Once the green bar is full, you can view the Model Results and Model Comparison tabs.
This tab shows a visual comparison between the model fits and the Y Vector. The x-axis represents the interval, the y-axis represents the value at that interval, and the dotted line is the validation cutoff. By default, this tab will show the best fit out of the three models, but you can choose to display any other model, or all models at once. When viewing a specific model, the confidence intervals will be displayed. To recalculate the confidence intervals with another confidence level, type in the new confidence value, and click "Resolve".
On this tab, the raw numerical data is displayed in tabular format. At the top is a text box that displays the equations created from the chosen covariates. Below that is a list of the chosen covariates for the model fits. If you have chosen to run FSR, this is where the best covariates, chosen separately for each model, are listed. At the bottom are three tables that show different values relating to the fit. The left table displays the discrete points of the fit line for the chosen regression model(s). The middle table shows several goodness-of-fit measures calculated by the program. The rightmost table displays the metrics for discerning the best fit for your needs. All values are rounded to 6 decimal places, but the full values can be obtained by downloading the complete results, which are offered as Excel, PDF, and comma-separated text files.
To run the PSRAT CLI, first navigate to the PSRAT root directory in a terminal/command prompt window. The typical command structure to run the CLI is `python cli.py {dataset location} {regression | predict | fsr} {process options}`
To perform regression, the full command is `python cli.py {dataset location} regression {regression model} {covariates} {validation cutoff} {confidence} {plot}`
- dataset location: A path to the dataset file to be analyzed
- regression model:
  - `mlr` (Multiple Linear Regression)
  - `polynomial` (Polynomial Regression)
  - `interaction` (Multiple Linear Regression with Interaction)
- covariates: A list of the used covariates as indices in the set of all covariates. For example, if you wish to solve a system using X3, X6, and X4 in that order, covariates would be `"3, 6, 4"`
- validation cutoff: A value between 0 and 1 that describes what portion of the dataset the model is fit to, and which portion the model is validated against. Nominal value is `0.9`
- confidence: A value between 0 and 1 that represents the confidence level (commonly notated "z") for calculating confidence intervals. Nominal value is `0.975`
- plot: A boolean value (`True` or `False`) to determine whether the data is plotted on screen once the calculations are finished executing.
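
If you script many runs, it can help to assemble the regression command programmatically. The sketch below follows the argument order documented above; the helper name `regression_command` and the dataset path `datasets/example.csv` are hypothetical and not part of PSRAT.

```python
import shlex

def regression_command(dataset, model, covariates, cutoff=0.9,
                       confidence=0.975, plot=False):
    """Build the argument list for the documented `regression` command."""
    if model not in ("mlr", "polynomial", "interaction"):
        raise ValueError(f"unknown regression model: {model}")
    return [
        "python", "cli.py", dataset, "regression", model,
        ", ".join(str(i) for i in covariates),  # e.g. "3, 6, 4"
        str(cutoff), str(confidence), str(plot),
    ]

# Hypothetical dataset path, used only for illustration.
cmd = regression_command("datasets/example.csv", "mlr", [3, 6, 4], plot=True)
print(shlex.join(cmd))
# python cli.py datasets/example.csv regression mlr '3, 6, 4' 0.9 0.975 True
```

The resulting list can be passed to `subprocess.run(cmd)` from the PSRAT root directory, which avoids shell quoting issues with the covariate list.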
To perform prediction, the full command is `python cli.py {dataset location} predict {regression model} {covariates} {validation cutoff} {confidence} {future intervals} {plot}`
- dataset location: A path to the dataset file to be analyzed
- regression model:
  - `mlr` (Multiple Linear Regression)
  - `polynomial` (Polynomial Regression)
  - `interaction` (Multiple Linear Regression with Interaction)
- covariates: A list of the used covariates as indices in the set of all covariates. For example, if you wish to solve a system using X3, X6, and X4 in that order, covariates would be `"3, 6, 4"`
- validation cutoff: A value between 0 and 1 that describes what portion of the dataset the model is fit to, and which portion the model is validated against. Nominal value is `0.9`
- confidence: A value between 0 and 1 that represents the confidence level (commonly notated "z") for calculating confidence intervals. Nominal value is `0.975`
- future intervals: A list of the future interval values for each of the covariates listed. For example, if using `"3, 6, 4"` for covariates, and you wish to predict 2 intervals in the future, future intervals would be `"3.1, 3.2" "6.1, 6.2" "4.1, 4.2"`.
- plot: A boolean value (`True` or `False`) to determine whether the data is plotted on screen once the calculations are finished executing.
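
The future intervals argument is the easiest one to get wrong, since it needs one quoted list per covariate, in the same order as the covariates. The sketch below formats both arguments from a dictionary and checks that every covariate has the same number of future values. The helper name `predict_arguments` is hypothetical, and the equal-length check is an assumption about what the CLI expects.

```python
def predict_arguments(covariates, future_values):
    """Format covariate indices and their future values as CLI arguments.

    future_values maps each covariate index to its list of future values.
    """
    lengths = {len(future_values[c]) for c in covariates}
    if len(lengths) != 1:
        raise ValueError("every covariate needs the same number of intervals")
    covariate_arg = ", ".join(str(c) for c in covariates)
    # One argument per covariate, kept in covariate order.
    interval_args = [", ".join(str(v) for v in future_values[c])
                     for c in covariates]
    return covariate_arg, interval_args

cov, intervals = predict_arguments(
    [3, 6, 4], {3: [3.1, 3.2], 6: [6.1, 6.2], 4: [4.1, 4.2]})
print(cov)        # 3, 6, 4
print(intervals)  # ['3.1, 3.2', '6.1, 6.2', '4.1, 4.2']
```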
To perform FSR (Forward Stepwise Regression), the full command is `python cli.py {dataset location} fsr {validation cutoff} {confidence} {plot}`
- dataset location: A path to the dataset file to be analyzed.
- validation cutoff: A value between 0 and 1 that describes what portion of the dataset the model is fit to, and which portion the model is validated against. Nominal value is `0.9`
- confidence: A value between 0 and 1 that represents the confidence level (commonly notated "z") for calculating confidence intervals. Nominal value is `0.975`
- plot: A boolean value (`True` or `False`) to determine whether the data is plotted on screen once the calculations are finished executing.
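
For readers unfamiliar with forward stepwise selection, the sketch below shows the general idea: start with no covariates and repeatedly add the one that lowers the fit error the most, stopping when the improvement becomes negligible. It is a generic illustration, not PSRAT's implementation. The selection criterion (sum of squared errors), the `min_gain` threshold, and the function names are assumptions, and the small least-squares solver does not handle collinear covariates.

```python
def least_squares_sse(columns, y):
    """Fit y on an intercept plus the given columns; return the SSE."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix.
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
         + [sum(X[k][i] * y[k] for k in range(n))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum((y[k] - sum(b * x for b, x in zip(beta, X[k]))) ** 2
               for k in range(n))

def forward_stepwise(covariates, y, min_gain=1e-6):
    """Greedily add the covariate that lowers the SSE the most."""
    selected, remaining = [], list(covariates)
    best = least_squares_sse([], y)  # intercept-only model
    while remaining:
        scores = {name: least_squares_sse(
            [covariates[s] for s in selected + [name]], y)
            for name in remaining}
        name = min(scores, key=scores.get)
        if best - scores[name] <= min_gain:
            break  # no remaining covariate improves the fit enough
        selected.append(name)
        remaining.remove(name)
        best = scores[name]
    return selected

# Made-up data: y depends on X1 and X3 only (y = 2*X1 + 5*X3 + 1).
covariates = {"X1": [1, 2, 3, 4, 5, 6],
              "X2": [5, 3, 6, 2, 4, 1],
              "X3": [1, 0, 1, 0, 1, 0]}
y = [8, 5, 12, 9, 16, 13]
print(forward_stepwise(covariates, y))  # ['X1', 'X3']
```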
The main folders of PSRAT are:
- datasets - contains example data sets
The covariate model that PSRAT applies was presented in:
P. Silva, Predictive Resilience Modeling. In Proc. 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2022), Baltimore, MD, Jun 2022.
This material is based upon work supported by the National Science Foundation under Grant Numbers 2050972 and 1749635.
Code released under the MIT License.