DTREG is the ideal tool for modeling business and medical data with categorical variables such as sex, race and marital status.
The process of extracting useful information from a set of data values is called “data mining”. This data can be used to create models to make predictions. Many techniques have been developed for predictive modeling, and there is an art to selecting and applying the best method for a particular situation. DTREG implements the most powerful predictive modeling methods that have been developed. You can use decision tree based methods including TreeBoost and Decision Tree Forests as well as Neural Networks, Support Vector Machine, Gene Expression Programming and Symbolic Regression, K-Means Clustering, Linear Discriminant Analysis, Linear Regression models and Logistic Regression models.
Ease of use. DTREG is a robust application that is installed easily on any Windows system. DTREG reads Comma Separated Value (CSV) data files that are easily created from almost any data source. Once you create your data file, just feed it into DTREG, and let DTREG do all of the work of creating a decision tree, Support Vector Machine, K-Means clustering, Linear Discriminant Function, Linear Regression or Logistic Regression model. Even complex analyses can be set up in minutes.
Classification and Regression Trees. DTREG can build Classification Trees where the target variable being predicted is categorical and Regression Trees where the target variable is continuous like income or sales volume.
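As a rough illustration of the difference between the two tree types, the sketch below shows how a terminal node might turn the target values of the rows that reach it into a prediction: the most common category for a classification tree, and the mean for a regression tree. The function name `leaf_prediction` is hypothetical, and this is a generic textbook simplification rather than DTREG's actual implementation.

```python
from collections import Counter
from statistics import mean


def leaf_prediction(target_values):
    """Return the value a tree leaf would predict for the rows that reach it.

    Assumption for this sketch: a target is treated as continuous when every
    value is a number, and as categorical otherwise.
    """
    is_continuous = all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in target_values
    )
    if is_continuous:
        # Regression tree: predict the mean of the target values.
        return mean(target_values)
    # Classification tree: predict the most frequent category.
    return Counter(target_values).most_common(1)[0][0]


# Continuous target such as sales volume.
sales_prediction = leaf_prediction([10, 20, 30])
# Categorical target such as marital status.
status_prediction = leaf_prediction(["Married", "Single", "Married"])
```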
Single-tree, TreeBoost, Decision Tree Forests, Support Vector Machine, K-Means clustering, Linear Discriminant Analysis, Linear Regression and Logistic Regression. By simply checking a button, you can direct DTREG to build a classic single-tree model, a TreeBoost model consisting of a series of trees, a Decision Tree Forest, a Neural Network, a Support Vector Machine, a Gene Expression Programming model, a K-Means Clustering model, a Linear Discriminant Analysis function, a Linear Regression model or a Logistic Regression model.
Automatic tree pruning. DTREG uses V-fold cross-validation to determine the optimal tree size. This procedure avoids the problem of "overfitting" where the generated tree fits the training data well but does not provide accurate predictions of new data.
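The idea behind V-fold cross-validation can be sketched as follows: the rows are divided into V groups, each group is held out once while a tree is built from the remaining rows, and the tree size with the lowest average held-out error is chosen. The helper names `v_fold_splits` and `pick_tree_size` and the error figures are hypothetical, and the sketch leaves out the tree building and pruning steps themselves.

```python
def v_fold_splits(n_rows, v):
    """Yield (train_indices, test_indices) for each of the v folds."""
    # Assign row i to fold i % v so the folds are nearly equal in size.
    folds = [list(range(i, n_rows, v)) for i in range(v)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def pick_tree_size(cv_error_by_size):
    """Return the tree size with the lowest mean cross-validated error.

    cv_error_by_size maps a candidate tree size (number of terminal nodes)
    to the list of held-out error rates measured on each fold.
    """
    def mean_error(size):
        errors = cv_error_by_size[size]
        return sum(errors) / len(errors)

    return min(cv_error_by_size, key=mean_error)


# Made-up error rates for three candidate sizes over two folds.
# The 9-node tree would fit the training data best, but the held-out
# error shows that the 5-node tree generalizes better.
best_size = pick_tree_size({2: [0.30, 0.40], 5: [0.20, 0.20], 9: [0.25, 0.35]})
```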
Surrogate variables for missing data. DTREG uses a sophisticated technique involving "surrogate variables" to handle cases with missing values. This allows cases with some available values and some missing values to be utilized to the maximum extent when building the model. It also enables DTREG to predict the values of cases that have missing values.
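A minimal sketch of how surrogate splits are commonly used, assuming each node holds a primary split plus an ordered list of backup splits on other variables: when the primary variable is missing for a case, the first surrogate with an available value decides the branch. The function `route`, the variable names and the thresholds are all hypothetical. How DTREG selects and ranks its surrogates is not described here, so that part is omitted.

```python
def route(row, primary, surrogates, default="left"):
    """Decide which branch a row takes at one tree node.

    primary and each surrogate are (variable_name, threshold) pairs.
    A missing value is represented by None or by an absent key.
    """
    for name, threshold in [primary] + list(surrogates):
        value = row.get(name)
        if value is not None:
            return "left" if value <= threshold else "right"
    # Every candidate variable is missing: fall back to a default branch.
    return default


primary_split = ("age", 40)
surrogate_splits = [("income", 50000)]

# Age is present, so the primary split is used.
branch_a = route({"age": 55, "income": 20000}, primary_split, surrogate_splits)
# Age is missing, so the income surrogate decides.
branch_b = route({"age": None, "income": 20000}, primary_split, surrogate_splits)
```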
Visual display of the tree. DTREG can display the generated decision tree on the screen, write it to a .jpg or .png disk file or print it. When printing, DTREG uses a sophisticated technique for paginating trees that span multiple pages.
DTREG accepts text data as well as numeric data. If you have categorical variables with data values such as “Male”, “Female”, “Married”, “Protestant”, etc., there is no need to code them as numeric values.
Data Transformation Language (DTL). DTREG includes a full Data Transformation Language (DTL) for transforming variables, creating new variables and selecting which cases are to be included in the analysis.
Project files for saving analyses. DTREG saves all of the information about variables and analysis parameters, as well as the generated report and tree, in a project file. You can later open the project file, alter parameters or rerun it with a different dataset.
Scoring to predict values. Once a decision tree has been built, you can use DTREG to "score" a new dataset and predict values for the target variable.
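Conceptually, scoring a case means walking it down the tree from the root until it reaches a terminal node, then reporting that node's prediction. The sketch below illustrates this with a small hand-written tree stored as nested dictionaries. The tree layout, the `score` function and the example variables are assumptions made for illustration and do not reflect DTREG's project file format or scoring interface.

```python
def score(tree, row):
    """Walk one row down the tree and return the leaf's predicted value."""
    node = tree
    while "prediction" not in node:
        # Internal node: compare the row's value with the split threshold.
        go_left = row[node["variable"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["prediction"]


# Hypothetical tree: split on age first, then on income for older cases.
example_tree = {
    "variable": "age",
    "threshold": 40,
    "left": {"prediction": "low"},
    "right": {
        "variable": "income",
        "threshold": 50000,
        "left": {"prediction": "medium"},
        "right": {"prediction": "high"},
    },
}

new_cases = [
    {"age": 30, "income": 80000},
    {"age": 55, "income": 42000},
    {"age": 55, "income": 90000},
]
predictions = [score(example_tree, case) for case in new_cases]
```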
Generated scoring source code. The "Translate" function in DTREG generates C, C++ and SAS® source code to compute predicted values. This source code can be included in application programs to perform high performance scoring of large volumes of data.
Heavy duty capability. The Enterprise Version of DTREG can handle an unlimited number of data rows -- hundreds of thousands or millions are no problem. DTREG can build classification trees with predictor variables that have hundreds of categories by using an efficient clustering algorithm. Many other decision tree programs limit predictor variables to 16 or fewer categories.
DTREG COM Library. The DTREG COM Library can be called from application programs to compute predicted target values using a decision tree generated by DTREG.