In analyzing a “disparate impact” claim brought under Title VII of the Civil Rights Act and Massachusetts law, a federal court has cast considerable doubt on the reliability of statistical tools commonly used in “disparate impact” analysis. In Pedro Lopez, et al. v. City of Lawrence, et al., the district court ruled in favor of multiple police departments after considering whether police promotional exams resulted in a statistically significant “disparate impact” for Title VII purposes. No. 07-11693-GAO (D. Mass. Sept. 5, 2014). While the ruling does not change federal “disparate impact” analysis or the rules employers must follow, the decision helpfully illustrates the problems of small sample sizes, aggregation across data samples, and aggregation across multiple time periods, and it lends support to employers defending against “disparate impact” claims. Because the OFCCP looks to Title VII for guidance in assessing discrimination, this decision could affect how the OFCCP reviews and evaluates compensation data.
Lopez concerned whether police departments’ promotional exams, although facially neutral, resulted in a disparate adverse impact on minority candidates. There is no “single test” to establish disparate impact; instead, “courts appear generally to have judged the ‘significance’ or ‘substantiality’ of numerical disparities on a case-by-case basis.” Courts also apply a burden-shifting analysis. Once an adverse impact is shown, the practice is nonetheless permitted if the employer can show that it is “job-related and consistent with business necessity.” An employee can still prevail by demonstrating that there is “another selection device without a similar discriminatory effect that would also serve the employer’s legitimate interest.”
The Court’s decision discussed three statistical issues that arise in disparate impact claims. First, the Court pointed out that “[o]ne problem is the case of small statistical samples . . . [w]ith respect to small data sets, adverse impact ratios, standing alone, can be misleading.” The Court also emphasized that “proposed conclusions from small data sets can lack statistical significance,” “small datasets can be statistically unstable,” and “even where it is assumed that there are no performance differences between different groups, statistical analysis may yet indicate that there are by showing ‘false positive’ adverse impact.”
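To see why an adverse impact ratio drawn from a small sample can mislead, consider the following minimal Python sketch. The candidate counts are hypothetical and are not taken from the Lopez record. The 0.8 threshold is the EEOC "four-fifths" rule of thumb, which is an assumption here because the passages quoted above do not mention it, and Fisher's exact test is assumed as one common significance test for small two-by-two tables. The sketch applies the same selection rates at two different sample sizes.

```python
from math import comb


def adverse_impact_ratio(sel_a, n_a, sel_b, n_b):
    """Selection rate of group A divided by the selection rate of group B."""
    return (sel_a / n_a) / (sel_b / n_b)


def fisher_exact_two_sided(sel_a, n_a, sel_b, n_b):
    """Two-sided Fisher exact p-value for a 2x2 selection table.

    Holds the group sizes and the total number selected fixed, then adds up
    the probability of every outcome no more likely than the observed one.
    """
    total_selected = sel_a + sel_b
    lowest = max(0, total_selected - n_b)
    highest = min(n_a, total_selected)
    # Integer hypergeometric weights, kept exact to avoid rounding issues.
    weights = {
        k: comb(n_a, k) * comb(n_b, total_selected - k)
        for k in range(lowest, highest + 1)
    }
    observed = weights[sel_a]
    as_extreme = sum(w for w in weights.values() if w <= observed)
    return as_extreme / comb(n_a + n_b, total_selected)


# Hypothetical small exam: 3 of 10 minority and 5 of 10 other candidates promoted.
small_ratio = adverse_impact_ratio(3, 10, 5, 10)
small_p = fisher_exact_two_sided(3, 10, 5, 10)

# Same selection rates with ten times as many candidates in each group.
large_ratio = adverse_impact_ratio(30, 100, 50, 100)
large_p = fisher_exact_two_sided(30, 100, 50, 100)

print(f"small sample: ratio={small_ratio:.2f}, p={small_p:.3f}")
print(f"large sample: ratio={large_ratio:.2f}, p={large_p:.4f}")
```

In both cases the ratio is 0.6, which falls below four-fifths. With 10 candidates per group, however, the p-value is roughly 0.65, so a gap of that size is well within what chance alone would produce. Only with 100 candidates per group does the same ratio become statistically significant at the conventional 0.05 level.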
Second, the Court described the problem of data aggregation. A municipality (or another subject group) may try to correct for the problem of small data sets by aggregating data across other municipalities (or subject groups). Such aggregation, however, may simply be inappropriate. In Lopez, the Court ruled such aggregation inappropriate because under Massachusetts law, a municipality may promote officers to sergeant only if they are already employed in its own police department. As a result, promotion schemes should only be evaluated relative to the pool of candidates capable of promotion, and disparate impact analyses should not entail aggregation across multiple municipalities.
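A separate, purely statistical reason to be cautious about pooling can be sketched in Python. This is not the Court's stated rationale, which rested on the Massachusetts rule limiting each municipality to its own candidates. It is only an illustration, using hypothetical counts for two invented towns, of how combining distinct candidate pools can produce an apparent disparity that exists in neither pool on its own.

```python
def selection_rate(promoted, candidates):
    """Fraction of candidates who were promoted."""
    return promoted / candidates


def impact_ratio(minority, other):
    """Minority selection rate divided by the other group's selection rate.

    Each argument is a (promoted, candidates) pair.
    """
    return selection_rate(*minority) / selection_rate(*other)


# Hypothetical data: (promoted, candidates) per group in each town.
towns = {
    "Town A": {"minority": (8, 40), "other": (2, 10)},   # both groups at 20%
    "Town B": {"minority": (5, 10), "other": (20, 40)},  # both groups at 50%
}

# Within each town the two groups are promoted at identical rates.
per_town_ratios = {
    name: impact_ratio(groups["minority"], groups["other"])
    for name, groups in towns.items()
}

# Pooling simply sums promotions and candidates across towns.
pooled_minority = tuple(
    sum(groups["minority"][i] for groups in towns.values()) for i in range(2)
)
pooled_other = tuple(
    sum(groups["other"][i] for groups in towns.values()) for i in range(2)
)
pooled_ratio = impact_ratio(pooled_minority, pooled_other)

print(per_town_ratios)
print(f"pooled ratio: {pooled_ratio:.2f}")
```

Each town's ratio is exactly 1.0, yet the pooled ratio is about 0.59, because most minority candidates in this hypothetical sit in the town with the lower overall promotion rate. The pooled figure reflects differences between the towns, not any difference in how either town treated its own candidates.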
Third, the Court addressed multi-year aggregation. While acknowledging that the method has superficial appeal, the Court stated that “it has not been shown to be a reliable analysis technique” and pointed out that the same candidates likely took the test in more than one year, so the aggregated samples overlap. This overlap, the Court noted, calls the method’s statistical validity into question. The Court dismissed all claims against the employers after finding that the statistical analyses did not support the employees’ claims.
The decision shows that disparate impact analysis can be undermined by small samples and by inappropriate aggregation of data across jurisdictions or time periods. As previously reported, the OFCCP is making a renewed effort to analyze the compensation practices of federal contractors. Cases like Lopez should give both the OFCCP and employers guidance on disparate impact analysis. Government contractors should carefully evaluate their compensation policies, structures and equity with these principles in mind.