Meetup Recap: Provider Data Quality Must Be Fixed

A few weeks ago we hosted a meetup at our office about provider data and its quality. Around 50 healthcare data experts, developers, and data scientists joined to hear three amazing speakers: Abe Gong from Superconductive Health, Morgan Templar from Get Governed, and our very own Andrew Kobylinski.

Currently, the industry standard for provider data accuracy is between 70% and 80% for provider directories published on health plan websites. This means that roughly one out of four listings you find online has a wrong doctor name, address, or phone number. In the latest CMS provider directory audit results, 12 health plans fared even worse, with error rates between 60.9% and 97.2%. As an industry, we can do better.

Here are a few perspectives, discussed at our meetup, on how healthcare (provider) data can be improved.


Andrew Kobylinski, Head of Platform at BetterDoctor, shared mistakes that typically occur in the complicated world of provider data management. One of the biggest issues with provider data comes from its dynamic nature: about 20% of basic demographic attribute combinations, such as name, address, and phone number, change every year. As such, the information must be verified several times a year, preferably at the primary source, such as the doctor’s office.
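To see why a 20% yearly change rate calls for several verifications a year, here is a rough back-of-the-envelope sketch. It assumes changes happen at a constant rate throughout the year, which is a simplification of ours and not something presented at the meetup; the function name is hypothetical.

```python
def stale_fraction(annual_change_rate, checks_per_year):
    """Estimate the share of records that have changed since the last check.

    Assumes a constant change rate over the year (a simplifying assumption),
    so the share still unchanged after t years is (1 - rate) ** t.
    """
    years_between_checks = 1 / checks_per_year
    return 1 - (1 - annual_change_rate) ** years_between_checks


# Compare verifying once, twice, and four times a year at a 20% annual rate.
for checks in (1, 2, 4):
    share = stale_fraction(0.20, checks)
    print(f"{checks} check(s) per year: about {share:.1%} stale just before the next check")
```

Under this assumption, quarterly verification keeps the share of outdated records just before each check to roughly 5%, compared with 20% for a single yearly check.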

To be efficient and get a high response rate from providers, the outreach needs to be performed on behalf of multiple health plans. When BetterDoctor reaches out to 517,000 providers every quarter, we verify provider demographic information for multiple organizations at once, using a prefilled online form based on aggregated data. To improve the odds of successful data collection, we use each provider’s preferred outreach method, which is typically an email directing them to our online form. We don’t spam providers. Instead, we want to give them an easy way to make sure patients can find them in online provider directories.


Morgan Templar, who has worked with healthcare provider data for the last 20 years, believes fixing provider data must be made a priority. At the moment, the healthcare industry is not incentivized to fix this issue because the cost of fixing it is incorrectly perceived to be higher than the benefits. Health plans, for example, would benefit greatly from improving data quality and giving members a top-notch shopping experience.

Provider data management is becoming more important as consumers increasingly self-serve during the health plan shopping process. However, as Templar pointed out, the lack of data governance has led to inaccurate directories becoming the norm, and to health plans wasting millions of dollars to keep track of their provider networks. Regulators have started to penalize health plans whose provider directories are not compliant or whose provider accessibility is not reasonable.

Templar believes data governance is key to making sense of the provider data issue: we need to motivate health plans to prioritize this problem in a way that drives value for the organization. The value needs to be aligned for all the stakeholders in the process.


Abe Gong brought a data scientist’s perspective to the discussion. According to him, data management in general lacks best practices for testing data quality, and a lot of this testing could even be automated. For certain types of data attributes, you should be able to define rules for what to expect in the data fields. For example, this could mean flagging unexpected changes in attributes that typically stay constant, such as a provider’s NPI number, degree, or first name. It could also mean checking the format of an attribute: the NPI number should always be 10 digits, names should begin with a capital letter, and so on.
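The kinds of rules described above can be sketched in a few lines of Python. This is only an illustration and not the tool presented at the meetup: the record layout (plain dictionaries), the field names, and both function names are assumptions made for the example.

```python
import re

# Format rule from the text: an NPI number is always exactly 10 digits.
NPI_PATTERN = re.compile(r"[0-9]{10}")

# Attributes that typically stay constant (field names are hypothetical).
STABLE_FIELDS = ("npi", "degree", "first_name")


def format_issues(record):
    """Return a list of format problems found in one provider record."""
    issues = []
    if not NPI_PATTERN.fullmatch(str(record.get("npi", ""))):
        issues.append("npi: expected exactly 10 digits")
    for field in ("first_name", "last_name"):
        value = record.get(field, "")
        if not value or not value[0].isupper():
            issues.append(f"{field}: expected to start with a capital letter")
    return issues


def unexpected_changes(previous, current):
    """Return the stable fields whose values differ between two snapshots."""
    return [f for f in STABLE_FIELDS if previous.get(f) != current.get(f)]


last_quarter = {"npi": "1234567890", "first_name": "Jane", "last_name": "Doe", "degree": "MD"}
this_quarter = {"npi": "123456789", "first_name": "jane", "last_name": "Doe", "degree": "MD"}

print(format_issues(this_quarter))
print(unexpected_changes(last_quarter, this_quarter))
```

Here the second snapshot fails both format rules (a nine-digit NPI and a lowercase first name), and the same two fields are flagged as unexpected changes. Checks like these are cheap to run on every data load, which is what makes them good candidates for automation.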

As Abe explained, “your data pipelines wants to become a hairball”, meaning that data flows tend to grow into a complex, interconnected tangle. Managing this can easily become tricky. To address the issue, his company Superconductive Health has created a test automation tool that helps with the work data engineers typically do not have time to focus on.



In the era of the Internet, consumers should be able to find the correct information they are looking for. It all starts with clean data and efficient data management. We need to prioritize the provider data issue if we want to give patients the right tools to access the care they need.