This paper evaluates the predictive ability of four machine learning models (logit, decision tree, random forest and two-class support vector machine) and identifies the key predictors of default. The models are applied to a dataset of 57 companies under the Insolvency and Bankruptcy Code (IBC) in India and a matched sample of 55 solvent companies, spanning ten years from FY2006 to FY2016. The solvent companies are matched on size (log of total assets) and sector and are rated ‘AAA’ or ‘AA’. We identify 31 explanatory variables, comprising (i) financial ratios, (ii) size and age of the company, (iii) ownership pattern and (iv) market ratios. The empirical findings reveal that random forest strongly outperforms the other models in predictive ability, followed by the support vector machine (SVM), decision tree (DT) and logit models. The findings also confirm the relevance of firm size and age, market ratios and ownership pattern as predictors of default, in addition to financial ratios. We conclude that both parametric (logit) and non-parametric models are useful for default prediction, as reflected in the robustness of all models, each with accuracy above 75 percent. These models can help banks shape their lending decisions based on the credit quality of borrower firms. To the best of our knowledge, this is the first paper to use a database of companies that are legally defined as insolvent and bankrupt while also taking a balanced sample to avoid the bias and inaccuracy that arise from data imbalance. The study also goes beyond traditional financial statements in identifying key default drivers.
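The matching of solvent to insolvent companies on sector and log of total assets can be illustrated with a minimal sketch. The abstract does not specify the matching algorithm, so the greedy nearest-neighbour rule without replacement used here is an assumption, and the `Firm` class, the `match_controls` helper and all firm data are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Firm:
    name: str
    sector: str
    total_assets: float  # any consistent currency unit; must be positive


def match_controls(insolvent, solvent_pool):
    """Pair each insolvent firm with the closest-sized unused solvent firm
    in the same sector, where size is the log of total assets.

    Assumption: greedy matching without replacement; the paper may have
    used a different procedure.
    """
    available = list(solvent_pool)
    pairs = {}
    for firm in insolvent:
        candidates = [s for s in available if s.sector == firm.sector]
        if not candidates:
            continue  # no same-sector control left; the firm stays unmatched
        best = min(
            candidates,
            key=lambda s: abs(math.log(s.total_assets) - math.log(firm.total_assets)),
        )
        pairs[firm.name] = best.name
        available.remove(best)  # each solvent firm is used at most once
    return pairs


# Hypothetical firms, for illustration only
insolvent = [
    Firm("I1", "steel", 1200.0),
    Firm("I2", "textiles", 300.0),
    Firm("I3", "steel", 90.0),
]
solvent_pool = [
    Firm("S1", "steel", 1000.0),
    Firm("S2", "steel", 100.0),
    Firm("S3", "textiles", 250.0),
    Firm("S4", "power", 1100.0),
]

pairs = match_controls(insolvent, solvent_pool)
print(pairs)
```

In this sketch an insolvent firm with no remaining same-sector candidate is simply left unmatched; a real study would need an explicit rule for such cases.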