The Evolution Of Loan-Level Data In Analyzing RMBS Investments

REQUIRED READING: There is more than enough blame to go around for the meltdown in the residential mortgage markets. Each constituent in the value chain can turn to the next in an attempt to ferret out the root causes. One thing most participants can agree on is that more thorough and accurate assessments of the risks associated with residential mortgage-backed securities (RMBS) are essential to preventing a future collapse.

Advancements in loan-level data – from collection and cleansing through to distribution – are helping issuers, investors, brokers, modelers, researchers and others achieve a higher standard of risk assessment and giving them a more comprehensive look at the underlying collateral in RMBS. The data are more reliable and complete, available to more people and delivered with unprecedented speed and efficiency. Market participants who have learned their lesson about the dangers of relying too heavily on summary data are discovering a new richness of data at their fingertips.

Advancements in loan-level data facilitate better macroeconomic analysis (which forms the basis of mortgage prepayment and default models), enhance credit surveillance for individual deals and generate more efficient relative value analysis for traders who compare one trust to another. Chief among the uses of these data is improving the accuracy and reliability of future cashflow predictions, the central element of RMBS risk management.

During the height of the mortgage market in the mid-2000s, market participants had only one commercial loan-level data supplier from which to choose. With the exception of the largest players, the data set was beyond the reach of most because of cost constraints, the complexity of the time series and the programming expertise required to use the data.

The Securities and Exchange Commission's Regulation AB and other regulations passed in recent years opened the door to wider access to loan-level data, allowing more companies to pursue advancements in the way data are collected, distributed and used. New cleansing logic and normalization routines are making data more accurate; new methods for identifying loan attributes and integration with third-party data sets are making data richer; and new data distribution methods are allowing for quicker and easier access to more participants.

Advancements have occurred across all three main categories of loan-level data: setup data describing loan and borrower characteristics at origination, periodic data describing a borrower's payment history and summary data for an entire trust.

Data doings

Setup data describe the loan at origination, including the attributes at loan closing, such as the original term, coupon rate, balance and adjustable-rate mortgage (ARM) reset periodicities. Complementary attributes that can be used by predictive models are also captured, such as the initial purpose of the loan (refinance or new purchase); the documentation type (full doc set or some flavor of a reduced doc package); the borrower's propensity to repay the loan (a FICO and/or Vantage score); the occupancy type (e.g., owner-occupied or investor-owned); and the lien position.

Traditionally, the setup data were static. The user received a point-in-time view of these attributes at loan origination. In one of the major changes in loan-level data collection and delivery, loans can now be matched to other data sets. The integrated loan data can then be used to construct a time series view of attributes that can help a user track changes over time. Static data become dynamic, offering a real-time look at key borrower and property attributes.

Two major developments in third-party data integration are the addition of borrower characteristics, such as credit scores, and the addition of property and title information. Current credit scores help analysts assess a borrower's likelihood of repaying a loan, and updated credit data offer insight into the borrower's debt-to-income ratio, which indicates the borrower's ability to repay. Credit bureaus have long been in the business of assisting investors in this type of analysis.

Property and title data provide an updated view of properties. Using property-specific information to update each home's current combined loan-to-value ratio can lead to more accurate loss-severity projections. Automated valuation models and granular home-price indices can be used to generate estimates of the current value of the home.
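As a rough illustration of that update, the sketch below scales the value at origination by the change in a home-price index and divides the combined lien balances by the result. The function name, its parameters and the index-ratio approach are assumptions made for this example; an automated valuation model could supply the current value directly instead.

```python
def current_cltv(first_lien_balance, junior_lien_balance,
                 original_value, hpi_at_origination, hpi_current):
    """Estimate a current combined loan-to-value ratio.

    Assumption: the current home value is approximated by scaling the
    origination value with the ratio of a home-price index (HPI) today
    to the same index at origination.
    """
    if original_value <= 0 or hpi_at_origination <= 0:
        raise ValueError("original value and origination HPI must be positive")
    estimated_value = original_value * (hpi_current / hpi_at_origination)
    return (first_lien_balance + junior_lien_balance) / estimated_value


# Hypothetical loan: $250,000 home at origination, index down 20% since then.
cltv = current_cltv(180_000, 20_000, 250_000,
                    hpi_at_origination=200.0, hpi_current=160.0)
print(f"Estimated current CLTV: {cltv:.0%}")
```

Here a loan that started at 80% combined LTV ends up at roughly 100% once the estimated value falls, which is the kind of shift a static origination record would never show.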

The second major category of loan-level data is a set of attributes that documents the borrower's loan payments. There is one payment, or periodic, row for each month a loan is outstanding, and these rows can be used to calculate both scheduled payments and unscheduled payments (i.e., voluntary prepayments or delinquencies and losses). These monthly updates are made available on trustee websites. Certain pieces of data, including loss and modification data, are sometimes augmented by the servicer or special servicer.
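To make the scheduled/unscheduled split concrete, here is a minimal sketch for one periodic row. It assumes a level-pay loan with interest accrued at one-twelfth of the annual rate on the beginning balance; the function and argument names are hypothetical and do not correspond to any trustee's file layout.

```python
def split_principal(beginning_balance, ending_balance,
                    annual_rate, scheduled_payment):
    """Split one month's principal reduction into scheduled and unscheduled parts.

    Assumptions: level-pay loan, monthly interest = balance * annual_rate / 12.
    A negative unscheduled amount means less principal arrived than scheduled,
    which would point to a delinquency or a reporting gap.
    """
    interest_due = beginning_balance * annual_rate / 12.0
    scheduled_principal = scheduled_payment - interest_due
    total_principal = beginning_balance - ending_balance
    unscheduled_principal = total_principal - scheduled_principal
    return scheduled_principal, unscheduled_principal


# Hypothetical row: 6% loan, $599.55 scheduled payment, $1,000 curtailment.
scheduled, unscheduled = split_principal(
    beginning_balance=100_000.00,
    ending_balance=98_900.45,
    annual_rate=0.06,
    scheduled_payment=599.55,
)
print(f"scheduled: {scheduled:.2f}, unscheduled: {unscheduled:.2f}")
```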

Due to payment reporting gaps and timing issues, much work is required to line up the data used to accurately calculate scheduled payments, prepayments, delinquencies, losses and loan modifications. One effective method to line up the payment data, impute missing values or generate value-added information is to walk the loan forward each month. This involves comparing not only data attributes within an activity period (e.g., comparing the ending scheduled balance to the beginning balance to ensure the accuracy of the payment information for that period), but also each new activity period to the prior period when relevant.
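A walk-forward pass of this kind might look like the sketch below. It is only an illustration under simple assumptions: rows arrive sorted by period, each row is a dictionary with hypothetical `period`, `beginning_balance` and `ending_balance` keys, and a missing beginning balance is filled from the prior period's ending balance.

```python
def walk_forward(rows, tolerance=0.01):
    """Walk a loan's periodic rows forward, filling gaps and flagging breaks.

    Returns (cleaned_rows, issues). The input rows are not modified.
    """
    cleaned, issues = [], []
    prior_ending = None
    for row in rows:
        row = dict(row)  # work on a copy
        begin = row.get("beginning_balance")
        if begin is None and prior_ending is not None:
            # Fill the gap from the prior period's ending balance.
            row["beginning_balance"] = prior_ending
            issues.append((row["period"], "imputed beginning balance"))
        elif (begin is not None and prior_ending is not None
              and abs(begin - prior_ending) > tolerance):
            # This month does not start where last month ended.
            issues.append((row["period"], "balance break vs prior period"))
        prior_ending = row.get("ending_balance")
        cleaned.append(row)
    return cleaned, issues


rows = [
    {"period": "2007-01", "beginning_balance": 100000.0, "ending_balance": 99900.0},
    {"period": "2007-02", "beginning_balance": None, "ending_balance": 99799.0},
    {"period": "2007-03", "beginning_balance": 99500.0, "ending_balance": 99400.0},
]
cleaned, issues = walk_forward(rows)
for period, message in issues:
    print(period, message)
```

The flagged periods are the ones an analyst would then investigate for late-reported losses, curtailments or modifications.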

This can help to correct the timing issues related to loss reporting, generate accurate loan modification identification algorithms and calculate payoff reason codes – i.e., voluntary prepayment, a forced sale, a real estate owned (REO) sale, a loan substitution or maturity. This methodology is the next step in the cleansing of loan-level data. For instance, algorithms have been used to identify 45% more loan modifications than were reported by trustees starting in 1999.
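The payoff reason codes listed above could be assigned with a small rule set once a loan's balance goes to zero. The rules, status labels and argument names below are assumptions chosen for illustration; production logic would depend on the fields a given trustee or servicer actually reports.

```python
def payoff_reason(prior_status, loss_amount, period, maturity,
                  substituted=False):
    """Assign a payoff reason code to a loan that has just left the pool.

    Assumptions: `period` and `maturity` are "YYYY-MM" strings (so they
    compare correctly as text), and `prior_status` is the loan's status in
    the month before payoff.
    """
    if substituted:
        return "substitution"
    if prior_status == "REO":
        return "REO sale"
    if loss_amount > 0 or prior_status == "foreclosure":
        return "forced sale"
    if period >= maturity:
        return "maturity"
    return "voluntary prepayment"


print(payoff_reason("current", 0.0, "2009-03", "2036-05"))
print(payoff_reason("REO", 45_000.0, "2009-03", "2036-05"))
```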

Trustees have vastly improved the reporting of loan modifications since 2008. For older vintages, users need to apply rules to identify most modifications. Similar issues sometimes exist for loan losses related to both short sales and REO sales.
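Rules of this sort typically compare consecutive periodic rows for changes that a performing loan should not show. The sketch below is one hypothetical rule set, not the algorithm referred to above: it flags a rate change on a fixed-rate loan, an increase in balance (for example, capitalized arrears) and an extended maturity date. Field names are invented for the example, and loans that allow negative amortization would need the balance rule relaxed.

```python
def flag_modification(prior, current, is_arm=False, rate_tolerance=1e-6):
    """Return the reasons a month-over-month change looks like a modification.

    Assumption: `maturity` is a "YYYY-MM" string, so later dates compare
    greater as text.
    """
    reasons = []
    if not is_arm and abs(current["rate"] - prior["rate"]) > rate_tolerance:
        reasons.append("rate change on fixed-rate loan")
    if current["balance"] > prior["balance"]:
        reasons.append("balance increase")
    if current["maturity"] > prior["maturity"]:
        reasons.append("maturity extension")
    return reasons


prior = {"rate": 0.075, "balance": 200000.0, "maturity": "2036-05"}
current = {"rate": 0.05, "balance": 206500.0, "maturity": "2046-05"}
reasons = flag_modification(prior, current)
unchanged = flag_modification(prior, dict(prior))
print(reasons)
```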

This type of walk-forward methodology also can help correct a myriad of reporting issues related to current period and cumulative losses. These timing issues become apparent when reconciling loan-level data to the trustee remittance reports.

The third main category of RMBS data is summary data describing the entire trust. This includes attributes such as the total balance of all active loans, cumulative losses incurred and aggregate performance statistics.

A significant advancement in this area is the integration of the remittance reports published by the trustee and the loan-level data describing the underlying collateral. This enables the user to acquire complex summary reporting that can then be broken into its component parts using the individual assets in the trust. This allows a user to identify which loans caused the losses and which loans are close to triggering additional losses.
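A simple version of that decomposition is sketched below: loan-level losses are totaled by period, compared with the figure reported for the trust, and the contributing loans are listed alongside any difference. The record layout and names are assumptions for illustration only.

```python
from collections import defaultdict


def reconcile_losses(loan_rows, trust_reported, tolerance=0.01):
    """Compare trust-level reported losses with the sum of loan-level losses.

    `loan_rows` is a list of dicts with hypothetical keys `loan_id`,
    `period` and `loss`; `trust_reported` maps period -> reported loss.
    """
    by_period = defaultdict(list)
    for row in loan_rows:
        if row["loss"] != 0:
            by_period[row["period"]].append((row["loan_id"], row["loss"]))

    report = {}
    for period, reported in trust_reported.items():
        loans = by_period.get(period, [])
        loan_total = sum(loss for _, loss in loans)
        report[period] = {
            "reported": reported,
            "loan_total": loan_total,
            "difference": reported - loan_total,
            "matches": abs(reported - loan_total) <= tolerance,
            "loans": loans,  # the individual loans behind the total
        }
    return report


loan_rows = [
    {"loan_id": "A1", "period": "2008-06", "loss": 30000.0},
    {"loan_id": "B2", "period": "2008-06", "loss": 12500.0},
    {"loan_id": "C3", "period": "2008-07", "loss": 8000.0},
]
trust_reported = {"2008-06": 42500.0, "2008-07": 20000.0}
report = reconcile_losses(loan_rows, trust_reported)
for period, line in report.items():
    print(period, "matches" if line["matches"] else f"off by {line['difference']:.2f}")
```

A period that fails to reconcile is often a sign of the loss-timing issues described earlier rather than of a true discrepancy.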

Flexible delivery

In addition to being more accurate and comprehensive, loan-level data are also more readily available to market participants than ever before. Anyone can now access data on only the deals they want on a variety of platforms, including several online options. This is a major departure from the height of the market, when access to data was more cost-prohibitive and cumbersome. Users now have a myriad of alternative methods for selecting the data they want and the way they want it delivered.

Delivery innovations allow equal access to premium loan-level data, helping to level the playing field for market participants of all sizes to more accurately price and assess risk on RMBS investments. Users can utilize trading front ends to conduct detailed bond surveillance on the entire RMBS market or to obtain detailed data on just the bonds in their inventory.

Improvements in loan-level data are leading to better uses of data down the analytical chain and, in turn, to more accurate assessments of the inherent risks in trading, buying and holding securities. More comprehensive, cleaner RMBS loan-level data have led to improvements in base-case data sets and more reliable predictions of defaults and losses in changing economic scenarios.

Improved model results provide better inputs for waterfall models, which have led to more precise bond cashflow predictions and more reliable valuations. Continued improvement in data quality and availability will have a major impact on restarting the mortgage securitization market.

Furthermore, higher-quality loan data will enable analysts to more accurately predict aggregate cashflows that can then be used to dissect trust structures. Specifically, the over-collateralization estimates, excess spread assumptions and home-price predictions that once gave bond investors a false sense of comfort can now be more reliably assessed. This will enable bond investors or portfolio analysts to do a better job of credit surveillance for individual holdings or make better judgments on whether to acquire bonds that are part of complex structures.

Larry Barnett is CEO of BlackBox Logic, based in Denver.

