The Next Frontier: Expanding Credit Inclusion with New Data and Analytical Techniques

Volume 15, Issue 2 | August 19, 2021

Gaps and weaknesses in the traditional credit reporting and scoring system have been recognized for decades as creating barriers to accessing affordable credit, yet recent efforts to harness new data and analytical techniques have produced relatively limited improvements to date. In 2020, racial justice protests and the COVID-19 crisis renewed interest in such initiatives, as stakeholders asked what benefits they may have for addressing racial equity issues, what risks they pose, and how to accelerate progress for historically underserved populations.

Analysis of these questions does not yield easy answers, given the magnitude of underlying disparities in income and assets, continuing market obstacles, and other challenges. But with sustained engagement from all stakeholders—multiple industry segments, advocates, academics, and policymakers—there is reason to hope that the credit system can expand affordable access and reinforce efforts to increase economic inclusion in other sectors.

The importance of these efforts has never been clearer. Credit information and models directly affect households’ ability to both bridge temporary financial disruptions and make long-term investments in education, housing, and small business formation. Credit reports and scores are also frequently used in insurance, employment, and tenant screening, which can further magnify their effects on households’ financial health and resiliency. Thus, the credit system can substantially support or impede broader initiatives to recover from the pandemic and address longstanding societal inequities in the months and years to come.

Limitations and Gaps in Traditional Credit Scoring and Underwriting

The initial movement to increase use of data and automated underwriting in consumer credit markets began more than 50 years ago. Early changes were spurred by several factors, including the emergence of three nationwide consumer reporting agencies (NCRAs) with payment history data on millions of consumers, the development of third-party scoring models that group consumers based on their relative default risk, and lenders’ shift from subjective decision-making toward algorithmic underwriting models.1 Small business lending has not become as standardized as consumer credit, but many lenders use owners’ personal credit scores as well as small business credit reports, third-party scoring models, and proprietary algorithms to help evaluate commercial applications.2

Research suggests that these changes have tended to lower underwriting costs and default losses, improve consistency of treatment, and increase competition for borrowers.3 Yet for all of these benefits, traditional data and models are subject to significant limitations and create their own dependencies. Because credit reporting is voluntary and most information comes from particular categories of lenders, NCRA reports contain relatively little data about applicants who do not already have those types of credit products. There are incentives for companies to withhold information, and accuracy has been a substantial concern historically. More fundamentally, even for applicants with relatively robust, accurate credit files, traditional credit reports cannot provide a complete assessment of their finances because they do not provide direct information on incomes, balance sheets, or even a complete picture of recurring expenses.4

Lenders can fill these gaps by collecting information from applicants and other third-party sources. But gathering, verifying, and analyzing a detailed picture of applicants’ financial situations can take substantial time and labor, and investors and secondary market actors often prefer relying on data and models that are easy to compare across portfolios. Thus, where underwriting information is not sufficiently easy to access, lenders may reject applicants not because they in fact pose too much default risk, but rather because operational obstacles complicate their assessment.

These dynamics are particularly likely to impact communities of color and low- to moderate-income borrowers. In particular, research has identified three specific groups of applicants who have an especially difficult time accessing credit due to information barriers:

Thin- and no-file applicants: About 50 million U.S. adults (20 percent of the population) lack sufficient credit history with the NCRAs to be scored using the most widely used third-party models. African Americans, Hispanics, recent immigrants, young borrowers, and lower-income consumers are particularly likely to be “thin file” or “no file.” For example, studies indicate that nearly 30 percent of African Americans and Hispanics cannot be scored by certain models, compared with about 16 percent of whites and Asians.5
Non-prime applicants: Credit scoring models group borrowers into bands based on their relative default risk, but without additional data, lenders cannot differentiate within those bands to determine which individual applicants are higher-risk. Even if most applicants within a particular band are likely to repay their loans, lenders may choose not to lend to that cohort or may impose higher prices because default risks for the group as a whole are relatively high.6 For instance, depending on interest rates, consumers with scores near the typical minimums for approval may pay $7,500 more over the life of a $20,000 auto loan and $86,000 more over the life of a $250,000 mortgage than peers with high scores.7 About 80 million consumers had “non-prime” scores prior to the pandemic.8 Although recent demographic data are not publicly available, nationally representative samples from the early 2000s indicate that about two-thirds of African American and almost one-half of Hispanic consumers had scores in the lowest three deciles overall, compared with about one-quarter of whites.9
Small business owners: Like young borrowers, start-up companies do not have credit histories in their own right. As a result, many owners are forced to rely on their personal scores and on consumer credit products to finance their businesses. This problem is most severe with start-ups, but various other factors have also made traditional lenders reluctant to provide business credit to companies that fall below certain sales and/or maturity thresholds. Businesses owned by minorities, recent immigrants, and women tend to have particular challenges obtaining credit.10
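
The cost figures cited for non-prime applicants follow from ordinary loan amortization: the same principal repaid at a higher interest rate accrues more total interest. The sketch below is a minimal illustration of that arithmetic, not a reproduction of the cited estimates. The interest rates and loan terms are hypothetical assumptions chosen for illustration, and the function names are not drawn from any cited source.

```python
def monthly_payment(principal, annual_rate, months):
    """Fixed monthly payment on a fully amortizing loan."""
    r = annual_rate / 12  # periodic (monthly) rate
    if r == 0:
        return principal / months
    return principal * r / (1 - (1 + r) ** -months)


def total_interest(principal, annual_rate, months):
    """Total interest paid over the life of the loan."""
    return monthly_payment(principal, annual_rate, months) * months - principal


def extra_cost(principal, low_rate, high_rate, months):
    """Additional lifetime interest paid at high_rate versus low_rate."""
    return (total_interest(principal, high_rate, months)
            - total_interest(principal, low_rate, months))


# Hypothetical rates and terms (assumptions, not figures from the article):
# a 60-month auto loan and a 360-month mortgage.
auto_gap = extra_cost(20_000, low_rate=0.04, high_rate=0.15, months=60)
mortgage_gap = extra_cost(250_000, low_rate=0.03, high_rate=0.045, months=360)

print(f"Extra interest on $20,000 auto loan: ${auto_gap:,.0f}")
print(f"Extra interest on $250,000 mortgage: ${mortgage_gap:,.0f}")
```

Because the gap scales with both the rate difference and the loan term, even a modest rate premium on a long-dated mortgage can exceed the premium on a shorter, higher-rate auto loan.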

These examples contribute to broader concerns that disparities in conventional credit reports and scoring models both reflect and perpetuate previous inequities created by historical discrimination in such fields as employment, education, housing, and lending. These historical factors have produced substantial disparities in income—where median levels for African American and Hispanic households are about 60 percent and 74 percent that of white households, respectively—and even more dramatic gaps in wealth—where median net worth for African American and Hispanic households is about 13 percent and 19 percent that of white households, respectively.11 In light of these gaps, it is not surprising that African American and Hispanic households are more likely than white households to experience payment delinquencies and bankruptcies.12 Although Federal Reserve Board research has found that traditional credit scoring models have substantial value in predicting loan defaults across different demographic groups, studies have struggled to disentangle the relationships between scores, income, and wealth due to data limitations.13

Concerns have also been raised that African American and Hispanic borrowers’ credit histories have been disproportionately affected by lack of geographic access to banks, banks’ prioritization of larger loan sizes and wealthier customers, and targeting by lenders who offer higher prices and riskier structures.14 For instance, African American and Hispanic households and neighborhoods were far more likely than white counterparts to end up with subprime loans prior to the 2008 financial crisis.15 In the years since, declines in homeownership among African American and Hispanic households were also more severe than among white households; in fact, disparities between white and African American homeownership rates reached their highest level in more than five decades.16 Recent research has also documented the severe effects of large amounts of high-cost debt among African American households.17 These examples help to illustrate the strong feedback loops between credit and broader financial and economic equity, and the need for better information and tools to increase credit access and long-term financial stability.

The State of New Initiatives

In light of these serious concerns, multiple initiatives have been launched over the past 20 years to harness digital information and other technology advances to improve credit access for underserved populations. Some efforts have refined traditional credit reports and models to distill additional insights and try to address concerns about disproportionate impacts on particular populations.18 Others have worked to incorporate additional data sources and use new modeling techniques to improve predictiveness and inclusion:

Tapping new data sources: Early efforts to tap nontraditional data sources focused on rental, utility, and telecom payment history, given that 80 million U.S. adults live in rental housing and that an estimated 97 percent of adults own cell phones of some kind.19 Newer FICO and VantageScore models will consider such data, where available, but only about 5 percent of consumers’ NCRA files were estimated to include it.20 Although efforts to convince landlords and other companies to report the information directly to NCRAs have not made substantial progress and some consumer groups have opposed routine reporting of certain utility information, access to the data through consumer-permissioned channels is slowly increasing.21 For example, some intermediary companies will report the data to NCRAs where consumers are willing to pay a fee, and some NCRAs have launched partnerships with a new group of intermediaries called data aggregators to provide lenders with telecom or utility data where consumers specifically authorize the information to be shared. A few specialty scoring products have also been introduced that draw on other sources of telecom and utility payment history and on payment history from payday loans and other credit products that are not typically reported to NCRAs.22
Other recent initiatives are also relying on data aggregators to access information from bank and prepaid accounts, which are owned by about 96 percent of U.S. households and can provide information about income and reserves in addition to expenditures.23 Some fintechs and other lenders are building proprietary underwriting models based on such cash-flow information, and NCRAs and scoring model developers are also launching cash-flow based products and services.24 FinRegLab’s independent evaluation of cash-flow information used by several companies to underwrite consumers and small businesses prior to the pandemic suggests that it has substantial potential to increase access to credit.25

Although research in other countries suggests that patterns in mobile phone and computer usage could be predictive of credit risk,26 U.S. model builders have been reluctant to use such data for underwriting purposes, partly due to concerns that it would have a disparate impact on communities of color because of underlying differences in technology adoption and usage, education, and social networks.
Adopting machine learning techniques: Model builders have also begun to evaluate the potential for machine learning techniques to improve predictiveness and inclusion relative to traditional scoring and underwriting algorithms. Machine learning techniques often rely on less direction from programmers than traditional models and use more complex mathematical analyses to identify data patterns. Such techniques can be applied to traditional data but can be particularly useful in analyzing large and diverse data sets as they evolve over time. Although machine learning has been used for years in fraud detection, its application to credit scoring and underwriting has been limited to date due to concerns about how to manage models that are more complex and less transparent than traditional algorithms.27
For instance, some companies are using machine learning models to identify new predictive variables, but then build them into traditional models that are viewed as easier to manage for purposes of model risk governance, fair lending, and disclosure compliance. Other companies restrict the operation of machine learning models and underlying data to make them easier to understand and explain and reduce risks related to their use. Both approaches may sacrifice some predictiveness, although stakeholders are debating the relative tradeoffs.28

Data limitations make it difficult to measure the scale and effect of these various efforts to tap new data and modeling techniques and of other recent changes in credit information markets. However, there is some evidence of increasing momentum within the past few years. Following the release of FinRegLab’s empirical analysis of cash-flow data, regulators signaled increased openness to the use of such information in late 2019 and followed up with a number of statements and initiatives relating to credit underwriting data and models in 2020.29 After the onset of the pandemic, racial justice protests and increasing concerns about traditional data and models’ performance during the downturn accelerated industry interest in piloting data and modeling innovations.30 In May 2021, news broke of pilots by several large banks to use bank account data to underwrite consumers without credit scores, building out of an Office of the Comptroller of the Currency initiative that had launched the year before.31 The Biden administration has also signaled interest in the use of data to increase credit access, potentially following up on campaign plans calling for the creation of a public credit reporting and scoring division within the Consumer Financial Protection Bureau (CFPB) that would create a government option that seeks to minimize racial disparities—for example, by accepting nontraditional data sources, such as rental and utility payment history.32

Yet although stakeholder interest in using alternative data and modeling techniques to expand access to credit is increasing, so too is concern that innovations could exacerbate inequalities or lead to other unintended consequences. The next section provides an overview of potential pitfalls and barriers to adoption as private stakeholders and policymakers work to ensure that recent commitments to addressing longstanding disparities lead to concrete improvements for communities of color and other underserved populations.

Potential Pitfalls and Challenges

Simply finding that new data or models appear to be predictive of credit risk is only the beginning of a complicated process that may ultimately lead to widescale adoption of changes by lenders and other model builders over time. At the outset, model builders must grapple with a number of potential pitfalls and uncertainties in deciding whether particular data or model changes are sufficiently promising to warrant making changes to existing practices:

Data bias concerns: For both traditional and machine learning models, bias can occur due to a number of flaws in the underlying data, such as a lack of information about key subgroups, use of noisy or flawed measurements, and use of training data that were affected by historical discrimination or bias.33 These problems are particularly important to the extent that they impact racial equity and inclusion, but they can also affect other aspects of model performance. Thus, in considering adoption of new data sources, it is important to vet the information with regard to accuracy, reliability, potential gaps in coverage, and similar issues.34
Other fairness, privacy, and transparency issues with new data sources: For applicants, use of new data sources can raise other types of concerns about fairness, privacy, and transparency.35 For example, lenders who have experimented with using information about where customers shop, whether they spend money on particular activities, and other behavioral patterns have faced substantial criticism on privacy and transparency grounds, particularly if consumers are not aware that such factors could affect credit availability and pricing. Some consumer advocates oppose broad use of energy utility data for credit underwriting, arguing that it is unfair to penalize consumers who fall moderately behind on payments during peak months, given state protections against service cutoffs. Use of payment history on payday and other high-cost loans is also controversial because of concerns about the terms of such products, marketing practices, and disparities in access to various types of financial services providers.36
Concerns about management of machine learning models in particular: The fact that machine learning models can be substantially more complex and less transparent than current automated underwriting models also creates particular concerns about managing general performance and fair lending risks. For instance, concerns about performance deterioration due to data drift and “overfitting” to training data are heightened with machine learning models.37 Stakeholders have also raised concerns that machine learning models could heighten fair lending risks—for instance, by more closely mapping disparities in traditional data or by reverse-engineering race, gender, or similar characteristics based on correlations in underlying data sources, even though federal law prohibits such characteristics from being considered in credit underwriting.38 At the same time, others have argued that use of adversarial models and other machine learning techniques could help lenders identify alternative models that maintain similar levels of predictiveness while producing fewer disparities among demographic groups.39

These factors underscore the importance of rigorous data and model governance practices, diverse teams, and public research in helping both individual firms and the broader marketplace to evaluate which specific data and modeling innovations are worth substantial investments for implementation. Particularly given continuing evolution in data, modeling techniques, and economic circumstances, robust procedures are needed at each step of model development, vetting, and ongoing monitoring to increase understanding of performance over time. Teams that are diverse both in demographics and disciplines are better able to spot and manage potential problems, not only with regard to technical data science issues but also to broader legal and policy questions. Public research by academics, independent research organizations, and government agencies is also critically important to educating regulators, secondary market investors, and other stakeholders about the usefulness of particular data, models, and compliance tools. Toward that end, FinRegLab has recently announced a new project with economists at the Stanford Graduate School of Business to evaluate tools and techniques that are available for diagnosing and managing concerns about the explainability and fairness of machine learning models for credit underwriting.40

But answering baseline questions about the value of particular data or model changes on predictiveness, inclusion, and racial disparities is not sufficient in itself to ensure that beneficial changes can, in fact, reach substantial scale. Business model factors also play an important role in individual firms’ decisions about whether they are willing to purchase specialty scores or data or make other investments to change current procedures and practices.41 Investors’ and secondary market actors’ demand for consistent benchmarks that can be used across companies and portfolios can also complicate the adoption of innovations; in mortgage markets, it has slowed the adoption even of conventional scoring model updates.42 And both market and regulatory factors have substantially complicated access to particular data sources, such as rental history, telecom payments, and bank and prepaid account records.43

Customer-side considerations can also affect the utility of particular data types and the adoption of particular innovations. Although credit applicants are not always aware of the factors that influence their credit scores or the ways that creditors make underwriting decisions, some nontraditional data sources require express consumer permission to access, and some lenders are attempting to differentiate themselves by emphasizing the fact that they look beyond traditional credit reports.44 As a result, such factors as digital access and consumers’ attitudes toward privacy and machine learning could potentially affect the extent to which borrowers of color seek credit from lenders who use new data or models.45 To the extent that regulators are slow to clarify the application of existing regulatory protections to new data and modeling applications, this may further increase customer hesitation. Outreach, marketing, and relationship-building with potential customers who are relatively disconnected from and often distrust financial and data companies based on past experience are also critical to broader usage.46

Questions for Private and Public Market Participants Going Forward

These pitfalls and challenges help to illustrate why it is so complicated to assess the potential benefits and risks of data and technology innovations for addressing racial equity issues. Reducing information barriers that make it difficult to assess default risk for particular populations is possible if stakeholders are willing to make sufficient investments and to solve market and regulatory issues concerning model governance and data flows, but such initiatives will take significant effort from a broad array of actors. And reducing information barriers will not be sufficient by itself to generate rapid improvements in underlying disparities in income and assets, which are likely to affect other sources of financial data as well. Addressing these interwoven disparities thus requires both sustained work to address gaps and weaknesses in the traditional credit system and parallel efforts to bolster the economic resources of historically distressed populations.

With regard to the first component, a critical question is the extent to which scoring and reporting companies, lenders, and other firms will continue and expand recent work to develop, vet, and adopt more inclusive credit models. As portions of the economy start to revive from the pandemic, the business incentives that have encouraged industry interest in alternative data during the period of intense uncertainty may lessen somewhat. But simply returning to pre-pandemic practices and concentrating on lending to populations that were relatively unaffected by the pandemic risks further excluding low-income consumers, communities of color, and small business owners that have been particularly hard hit by COVID-19’s health and economic impacts. Thus, the extent to which alternative data and modeling techniques will be able to improve credit access will depend, in the first instance, on whether multiple industry segments are willing to make sustained investments in working to reach historically underserved populations and to develop more nuanced mechanisms for evaluating applicants who have experienced previous periods of financial distress.47

Regulatory actions will also be critical to facilitating the adoption of more modern and inclusive credit models and clarifying and strengthening protections for the underlying data flows. For instance, the Federal Housing Finance Agency is currently overseeing a process to approve more modern conventional credit scoring models for use in mortgage securitizations, including ones that will take rental information into account when it appears in credit reports. The Federal Trade Commission is in the process of updating and strengthening information security requirements that apply to a wide variety of nonbank financial institutions, including lenders, consumer reporting agencies, and other data intermediaries. A CFPB rulemaking to develop standards for consumer-permissioned transfers of financial data could have even broader effects, given the credit system’s increasing reliance on data aggregators to access both utility and telecom data and cash-flow information from transaction accounts. However, the agency has not yet signaled how it will prioritize the project relative to other potential initiatives. And congressional action would be needed to address more fundamental aspects of the current ecosystem, such as data protections for small business owners and proposals to create a public credit bureau.48

Given that communities of color are disproportionately affected by credit information barriers, such efforts to improve data flows and credit models could make it easier for a number of households and small business owners to access more affordable credit for activities that can boost their income and assets over time. Yet it is important to recognize that the existing disparities in income and assets and the recent hardships imposed by the pandemic are also likely to result in many individual applicants being assessed as presenting relatively high levels of default risk. These factors increase the chance that progress, particularly in early stages, may be mixed and incremental. For example, when credit scores were first adopted in small business lending, research found that more applications were approved because lenders were more confident in their ability to predict default. However, because many of the new borrowers were assessed, at least initially, as being relatively high-risk, pricing disparities increased. Due to data limitations and market developments, it is unclear how pricing changed as these new borrowers built payment history over time. One research paper evaluating potential machine learning models for mortgage lending also found that increased predictiveness could lead to some improvement in approval rates but bigger differentials in pricing.49

Such dilemmas underscore the importance of using other initiatives to address the deep racial disparities in income and assets at the same time that stakeholders in the credit system continue to explore and implement promising credit and modeling technique innovations.50 Although there is reason to believe that the credit system can play an important role in helping to magnify broader initiatives to address economic disparities, relying solely on that system to address these cumulative, structural issues would produce too little change too slowly. In much the same way, the credit system has a critical role to play in pandemic recovery, but relying solely on it would be inadequate to ensure a rapid and broad-based rebound, particularly for populations that have been hardest hit by COVID-19’s health and economic impacts.

But these challenges also underscore the urgency of making deeper improvements within the credit system. Given the risks and obstacles outlined above, it is not surprising that adoption of data and modeling innovations has been relatively small-scale to date—for instance, through creating pilots, using machine learning only in limited ways, as described above, and using alternative data or scoring models only in “second look” situations where an applicant would otherwise be turned down based on a conventional analysis of traditional data sources. Such approaches can be helpful first steps to gain experience with innovations in circumstances that are the most likely to have beneficial outcomes for both borrowers and lenders. Yet there may also be tradeoffs to focusing too narrowly over time. For example, restricting data usage solely to second-look and conventionally unscoreable applicants may exclude other borrowers who might benefit from a particular change, and may also affect the economics of implementation for lenders because costs are spread across a smaller population. Such approaches may also tend to create less general urgency to design safeguards because the number of affected applicants is relatively small, even though there are important equity issues to consider if some applicants are effectively facing a substantial “privacy tax” that others are not required to pay.

Taken together, these factors emphasize that there is no one silver bullet with regard to increasing access to credit through data and model innovations. Although adopting new data sources or other innovations without sufficient vetting raises substantial risks, there are also potential downsides to moving so cautiously that innovations that would have substantial net benefits cannot reach scale. The depth of underlying disparities and complexity of financial and economic interactions also increase the chance that early results may be mixed and that iterative market and policy adjustments will be needed over time. Thus, sustained effort is needed both inside and outside of the credit system to ensure broader and faster progress toward meaningful change.

The Community Development Innovation Review focuses on bridging the gap between theory and practice, from as many viewpoints as possible. The goal of this journal is to promote cross-sector dialogue around a range of emerging issues and related investments that advance economic resilience and mobility for low- and moderate-income communities and communities of color. The views expressed are those of the authors and do not necessarily represent the views of the Federal Reserve Bank of San Francisco or the Federal Reserve System.

Kelly Thompson Cochran is Deputy Director at FinRegLab. Prior to joining FinRegLab, Kelly helped to stand up the Consumer Financial Protection Bureau, where she served most recently as the Assistant Director for Regulations. In that capacity, she oversaw rulemaking and guidance activities under the Dodd-Frank Act, Electronic Fund Transfer Act, and various other federal consumer financial laws. Kelly previously was counsel at WilmerHale, where she advised financial institutions on a wide range of legal and regulatory matters including product development, compliance, enforcement, and litigation. Kelly also conducted research on financial services innovation, community reinvestment, and other topics at the University of North Carolina at Chapel Hill.

End Notes

1. FinRegLab, “The Use of Cash-Flow Data in Underwriting Credit: Market Context & Policy Analysis” § 2.1 (2020). The three NCRAs are Equifax, Experian, and TransUnion. The Fair Isaac Corporation (FICO) and VantageScore, which is a joint venture by the three NCRAs, are the largest providers of third-party scoring models. Id.

2. FinRegLab, “The Use of Cash-Flow Data in Underwriting Credit: Small Business Spotlight” § 2.1 (2019).

3. Board of Governors of the Federal Reserve System, “Report to Congress on Credit Scoring and Its Effects on the Availability and Affordability of Credit” pp. S-2 to S-4, O-2 to O-4, 32-49 (2007) (hereinafter FRB, Credit Scoring Report); Susan Wharton Gates et al., “Automated Underwriting in Mortgage Lending: Good News for the Underserved?” 13 Housing Policy Debate 369 (2002); FinRegLab, “Market Context & Policy Analysis,” p. 11, n. 16.

4. FinRegLab, “Market Context & Policy Analysis” §§ 2.1‒2.2. The most extensive federal study of accuracy issues predates the beginning of the Consumer Financial Protection Bureau’s program to examine critical actors and several other market developments. The Bureau announced in 2020 that it was planning a new study on accuracy issues. Consumer Financial Protection Bureau, “Director Kraninger’s Remarks During the November 2020 Academic Research Council Meeting” (Nov. 23, 2020).

5. Consumer Financial Protection Bureau, “Data Point, Credit Invisibles,” pp. 4‒6, 17 (2015); FinRegLab, “Market Context & Policy Analysis” § 2.2.

6. FinRegLab, “Market Context & Policy Analysis” §§ 2.1, 2.2. Borrower advocates particularly criticize so-called risk-based pricing, arguing that higher prices may themselves increase the risk of default and that some lenders charge more than necessary to cover losses. Id. p. 9, n. 12.

7. Lyle Daly, “Here’s How Much Money Bad Credit Will Really Cost You,” The Ascent (Apr. 8, 2019).

8. FinRegLab, “Market Context & Policy Analysis” § 2.2. In the past year, average scores have risen several points in response to household efforts to shore up their finances, spending constraints created by business lockdowns, governmental relief efforts, and temporary accommodations by lenders and others. The number of consumers in the nonprime category shrank by about 3 percent overall in 2020, but impacts among different populations have been uneven, and there are concerns that delinquencies will rise rapidly as assistance programs end. FinRegLab, Research Brief, “Covid-19 Credit Reporting and Scoring Update” p. 2, nn. 7, 10 (2020); Stefan Lembo Stolba, Blog, “Experian 2020 Consumer Credit Review,” www.experian.com (Jan. 4, 2021); Elisabeth Buchwald, “A Pandemic Paradox: Americans’ Credit Scores Continue to Rise as Economy Struggles—Here’s Why,” MarketWatch (updated Feb. 20, 2021).

9. FRB, “Credit Scoring Report,” pp. 150‒53. For more recent work examining credit score gaps in the mortgage context and among consumers living in majority Hispanic and African American zip codes, see Jaya Dey and Lariece M. Brown, “The Role of Credit Attributes in Explaining the Homeownership Gap Between Whites and Minorities Since the Financial Crisis, 2012–2018,” Housing Policy Debate (2020); Urban Institute, “Credit Health During the COVID-19 Pandemic” (Feb. 25, 2021).

10. FinRegLab, “Small Business Spotlight” §§ 2.1, 2.2.

11. Jessica Semega et al., “Income and Poverty in the United States: 2019,” U.S. Census Bureau (2020); Neil Bhutta et al., “Disparities in Wealth by Race and Ethnicity in the 2019 Survey of Consumer Finances,” FEDS Notes (Sept. 28, 2020); Kriston McIntosh et al., “Examining the Black-White Wealth Gap,” Urban Institute (Feb. 27, 2020).

12. See, for example, Brown and Dey; Jonathan D. Fisher, “Who Files for Personal Bankruptcy in the United States?” 53 Journal of Consumer Affairs 2003 (2019); Okechukwu D. Anyamele, “Racial Ethnic Differences in Household Loan Delinquency Rate in Recent Financial Crisis: Evidence from 2007 and 2010 Survey of Consumer Finances,” 8 Journal of Applied Finance & Banking 49 (2018).

13. FRB, “Credit Scoring Report”; FinRegLab, “Market Context & Policy Analysis” § 2.3.

14. See, for example, National Consumer Law Center, “Past Imperfect: How Credit Scores and Other Analytics ‘Bake In’ and Perpetuate Past Discrimination” (2016); Lisa Rice and Deidre Swesnik, “Discriminatory Effects of Credit Scoring on Communities of Color,” 46 Suffolk Law Review 935 (2013).

15. See, for example, Patrick Bayer et al., “What Drives Racial and Ethnic Differences in High-Cost Mortgages? The Role of High-Risk Lenders,” 31 Review of Financial Studies 175 (2018); Jacob S. Rugh et al., “Race, Space, and Cumulative Disadvantage: A Case Study of the Subprime Lending Collapse,” 62 Social Problems 186 (2015); Derek S. Hyra, “Metropolitan Segregation and the Subprime Lending Crisis,” 23 Housing Policy Debate 177 (2013).

16. U.S. Census Bureau, “Homeownership Rate for the United States,” retrieved from FRED, Federal Reserve Bank of St. Louis (Feb. 16, 2021); Jung Hyun Choi, Blog, “Breaking Down the Black-White Homeownership Gap,” Urban Institute (Feb. 21, 2020).

17. Prosperity Now, “Addressing Debt in Black Communities: A Comprehensive Report Exploring the Potential and Limitations of Services in the Realm of Financial Coaching” (2020); Prosperity Now, “Forced to Walk a Dangerous Line: The Causes and Consequences of Debt in Black Communities” (2018).

18. For instance, VantageScore estimates that its 4.0 model can score about 40 million additional consumers relative to other third-party models, including applicants who are not scored by other models because they have not had credit activity within the past six months. Barrett Burns, Blog, “You Are Not Invisible to Us,” VantageScore (visited Jan. 25, 2021); VantageScore, “The Credit Card Industry & Vantage Score” (visited Feb. 16, 2021). Adjustments have also been made in the treatment of medical debt and public records, given particular concerns about those data sources, and several initiatives have been launched to identify more insights during economic downturns. FinRegLab, “Market Context & Policy Analysis” § 2.3; FinRegLab, Research Brief, “Data Diversification in Credit Underwriting,” pp. 8‒9 (2020).

19. Pew Research Center, “Mobile Fact Sheet” (Apr. 7, 2021); FinRegLab, “Market Context & Policy Analysis” § 2.3. See also Amy Hou, Blog, “The Growing Interest in Alternative Data Sharing,” Urjanet (Sept. 13, 2019) (reporting the results of a survey of U.S. adults with household incomes of at least $25,000 in which 91 percent of respondents reported having at least one utility or telecom account in their name).

20. FICO, “Expanding Credit Access with Alternative Data” p. 6 (2021). Much of the available information concerns severe delinquencies rather than routine payment history. Id.

21. For discussions of the market barriers and policy debates, see, for example, FinRegLab, “Market Context & Policy Analysis” § 2.3; FinRegLab, Research Brief, “Utility, Telecom, and Rental Data in Underwriting Credit” (forthcoming 2021); FinRegLab, Research Brief, “Covid-19 Credit Reporting and Scoring Update,” pp. 9‒11.

22. FinRegLab, “Data Diversification in Credit Underwriting,” p. 5.

23. Federal Deposit Insurance Corporation, “How America Banks: Household Use of Banking and Financial Services, 2019 FDIC Survey,” pp. 1, 6 (2020); FinRegLab, “Market Context & Policy Analysis” § 4. Data aggregators are also being used to access information from telecom and utility companies. FinRegLab, “Data Diversification in Credit Underwriting,” p. 6.

24. FinRegLab, “Market Context & Policy Analysis” § 4; FinRegLab, “Small Business Spotlight” § 4; FinRegLab, “Data Diversification in Credit Underwriting,” pp. 5‒8.

25. FinRegLab, “The Use of Cash-Flow Data in Underwriting Credit: Empirical Research Findings” (2019). The study analyzed data from six companies to evaluate the potential effects of cash-flow information on predictiveness, inclusion, and fair lending. The results suggested not only that cash-flow information could be used to predict default risk in situations in which traditional credit report information is not available, but also that it added somewhat different insights with regard to borrowers who did have traditional credit reports and scores. The analysis also found evidence that the participating companies were extending credit to applicants who may have faced constraints in accessing credit historically, and that the degree to which the information was predictive of credit risk appeared to be relatively consistent across borrowers who likely belong to different demographic groups.

26. See, for example, Henri Ots et al., “Mobile Phone Usage Data for Credit Scoring” (Feb. 2020); Tobias Berg et al., “On the Rise of the FinTechs: Credit Scoring Using Digital Footprints,” Review of Financial Studies (2019); Alain Shema, “Effective Credit Scoring Using Limited Mobile Phone Data” (Jan. 2019).

27. FinRegLab, Frequently Asked Questions, “AI in Financial Services: Key Concepts” (2020); FinRegLab, Frequently Asked Questions, “AI in Financial Services: Explainability in Credit Underwriting” (2020).

28. See, for example, Cynthia Rudin and Joanna Radin, “Why Are We Using Black Box Models in AI When We Don’t Need To? A Lesson from An Explainable AI Competition,” Harvard Data Science Review (Fall 2019).

29. Board of Governors of the Federal Reserve System, Consumer Financial Protection Bureau, Federal Deposit Insurance Corporation, National Credit Union Administration, and Office of the Comptroller of the Currency, “Interagency Statement on the Use of Alternative Data in Credit Underwriting” (Dec. 3, 2019); FinRegLab, “Data Diversification in Credit Underwriting,” pp. 9‒10.

30. FinRegLab, “Data Diversification in Credit Underwriting”; FinRegLab, “Covid-19 Credit Reporting and Scoring Update”; FinRegLab, Research Brief, “Disaster-Related Credit Reporting Options” (2020).

31. The OCC started Project REACh (Roundtable for Economic Access and Change) in 2020 to convene national banks, civil rights organizations, fintechs, and other stakeholders in the aftermath of racial justice protests. Peter Rudegeair and AnnaMaria Andriotis, “JPMorgan, Others Plan to Issue Credit Cards to People with No Credit Scores,” Wall Street Journal (May 13, 2021).

32. “The Biden Plan to Build Back Better by Advancing Racial Equity Across the American Economy,” www.joebiden.com (undated); “The Biden Plan for Investing in Our Communities Through Housing,” www.joebiden.com (undated).

33. FinRegLab, Frequently Asked Questions, “AI in Financial Services: Explainability in Credit Underwriting.”

34. For analyses of such issues with regard to cash-flow information from transaction accounts and other sources, see FinRegLab, “Market Context & Policy Analysis” §§ 5.1, 6.1.

35. Federal law does not specifically require disclosure of scoring or underwriting criteria to applicants, but lenders must provide certain disclosures in connection with “adverse actions” and have sometimes been subject to enforcement activity for unfair and deceptive practices where they have failed to disclose particular information. FinRegLab, “Market Context & Policy Analysis” § 6.1.1.4. More broadly, there is a desire to educate and empower consumers and small business owners so that they can manage their finances in ways that improve their ability to access affordable credit over time. Id. § 6.2.

36. FinRegLab, “Market Context & Policy Analysis” §§ 2.3, 6.1.1.1; National Consumer Law Center, “Credit Invisibility and Alternative Data: Promises and Perils” (2019).

37. FinRegLab, Frequently Asked Questions, “AI in Financial Services: Explainability in Credit Underwriting.”

38. See, for example, Talia B. Gillis, “The Input Fallacy,” Minnesota Law Review (forthcoming 2022); Laura Blattner and Scott Nelson, “How Costly Is Noise? Data and Disparities in the U.S. Mortgage Market” (January 2021); Andreas Fuster et al., “Predictably Unequal? The Effects of Machine Learning on Credit Markets” (October 2020); Anya E.R. Prince and Daniel Schwarcz, “Proxy Discrimination in the Age of Artificial Intelligence and Big Data,” 105 Iowa Law Review 1257 (2020); Mark MacCarthy, “Fairness in Algorithmic Decisionmaking,” Brookings Institution (Dec. 6, 2019).

39. See, for example, Sian Townson, “AI Can Make Bank Loans More Fair,” Harvard Business Review (Nov. 6, 2020). For a study of adversarial models to address racial bias concerns in other contexts, see Christina Wadsworth et al., “Achieving Fairness through Adversarial Learning: An Application to Recidivism Prediction” (2018).

40. FinRegLab, “FinRegLab to Evaluate the Explainability and Fairness of Machine Learning in Credit Underwriting” (April 14, 2021). As discussed above, concerns about managing the complexity and fairness risks of machine learning models are one of the primary reasons that lenders have been slow to adopt such models in credit underwriting. The project will evaluate the ability of both proprietary and open-source model diagnostic and management tools that use different explainability approaches to support lenders in three critical areas: model risk management, fair lending, and adverse action reporting.

41. For instance, large banks have been inconsistent in their willingness to serve applicants who seek smaller loans and/or are considered higher-risk due to a range of business and regulatory considerations, while smaller banks often face challenges in adopting technology changes due to resource constraints and other factors. Fintech lenders are often first adopters of data and technology innovations but face business model constraints on their access to capital that affect their pricing and ability to withstand economic downturns. FinRegLab, “Market Context & Policy Analysis” §§ 5.2.1.2, 5.2.1.3.

42. Credit scoring models that are widely used for mortgage securitization purposes are so old that they do not consider rental payment history even when it is included in consumer reports. As discussed below, the Federal Housing Finance Agency is overseeing a process to approve more recent models for use, though implementation is expected to take multiple years. Id. §§ 2.3, 5.2.1.4.

43. Id. § 5.2.2; FinRegLab, “Utility, Telecom, and Rental Data in Underwriting Credit”; FinRegLab, “Data Diversification in Credit Underwriting,” p. 4, n. 23; FinRegLab, “Covid-19 Credit Reporting and Scoring Update,” pp. 9‒10, n. 70.

44. FinRegLab, “Market Context & Policy Analysis” § 6.2.

45. Id. §§ 5.1, 6.2.2.

46. The importance of building stronger connections was underscored by recent experiences with the Paycheck Protection Program. FinRegLab, Research Brief, “Technology Solutions for PPP and Beyond” (2020). See also Kedra Newsom Reeves et al., “Racial Equity in Banking Starts with Busting the Myths,” BCG (Feb. 2, 2021); Robert Hackett, “Banking While Black: How a New Generation of Leaders Is Overcoming a Legacy of Discrimination and Mistrust,” Fortune (June 19, 2020); Aria Florant et al., “The Case for Accelerating Financial Inclusion in Black Communities,” McKinsey & Co. (2020).

47. The latter issue is critical when using both traditional and nontraditional sources of financial data, since both are likely to reflect evidence of pandemic-related hardships. Indeed, distress may actually be more evident in nontraditional data sources than in the information that is typically reported to the NCRAs. For example, Congress provided more short-term relief to homeowners with federally related mortgages than it did to renters, which may cause their consumer reports and scores to deteriorate more rapidly. National Consumer Law Center, “The Credit Score Pandemic Paradox and Credit Invisibility” (2021); FinRegLab, “Covid-19 Credit Reporting & Scoring Update,” pp. 9‒11. Relief programs for utility payments and small businesses were also complicated by program variations and structure issues at the local, state, and federal levels. See, for example, FinRegLab, “Utility, Telecom, and Rental Data in Underwriting Credit”; Joseph Parilla, “Washington Has Supplied the Dollars to Save Small Businesses, But Local Leaders Need to Supply the Strategy,” Brookings Institution (April 5, 2021); National Governors Association, “State Initiatives for Small Business Recovery During The COVID-19 Pandemic Economic Crisis” (Dec. 16, 2020); FinRegLab, “Technology Solutions for PPP and Beyond.” Thus, model builders will have to think carefully with regard to the predictive value of evidence of financial distress during the pandemic era.

48. For further discussions of specific market and policy initiatives to leverage particular types of data, see FinRegLab, “Utility, Telecom, and Rental Data in Underwriting Credit”; FinRegLab, “Market Context & Policy Analysis” § 6.

49. Fuster; Allen N. Berger et al., “Credit Scoring and the Availability, Price, and Risk of Small Business Credit,” 37 Journal of Money, Credit & Banking 191 (2005); W. Scott Frame et al., “Credit Scoring and the Availability of Small Business Credit in Low- and Moderate-Income Areas,” 39 Financial Review 35‒54 (2004).

50. And even within the credit system, data and modeling innovations are not the only improvements that could be helpful. Special-purpose credit programs, down-payment assistance for first-time homebuyers, enhanced credit guarantees and insurance, and better tools to help borrowers during short-term income and expense shocks are just some of the strategies that have been suggested to reduce racial disparities.