Data integration is the process of retrieving data from multiple systems or sources and transforming it to conform to a unified data model that better represents the industry. Structured or unstructured data is transformed through technically sound, practical processes so that it can be presented as comprehensive, valuable information.
In very large data sets, the variation in organisation names is extreme enough that defining a complete set of matching rules is not possible, which leads to the creation of multiple entities for the same organisation. As part of the solution, we performed an entity curation process on US national claims data, merging multiple organisation records under a single organisation using different sets of conditions.
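To illustrate the general idea of entity curation, here is a minimal Python sketch that groups name variants under one normalised key. It is not the actual process used on the claims data: the suffix list, the function names `normalise_name` and `curate_entities`, and the sample records are all hypothetical, and a real pipeline would apply several further conditions beyond name normalisation.

```python
import re
from collections import defaultdict

# Assumption: a small, hypothetical list of legal suffixes to ignore
# when comparing organisation names.
LEGAL_SUFFIXES = {"inc", "llc", "corp", "corporation", "co", "ltd"}


def normalise_name(name: str) -> str:
    """Lowercase, replace punctuation with spaces, drop trailing legal suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    tokens = cleaned.split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def curate_entities(records):
    """Group (record_id, name) pairs whose normalised names are identical."""
    groups = defaultdict(list)
    for record_id, name in records:
        groups[normalise_name(name)].append(record_id)
    return dict(groups)


# Made-up sample records, for illustration only.
records = [
    (1, "Mercy Hospital, Inc."),
    (2, "MERCY HOSPITAL"),
    (3, "Mercy Hospital LLC"),
    (4, "St. Luke's Clinic"),
]
curated = curate_entities(records)
```

In this sketch the first three records collapse into a single group because they share the same normalised name, while the fourth stays separate. Exact matching on a normalised key is only one possible condition; fuzzy matching or address comparison could be layered on top where names vary more widely.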
For data pulled from different industry sources, we have solved the problem of associating each facility with its correct entity key through a detailed matching process.
We also collect and sort physician license information for further use in reporting and analysis.
Some of the platforms and tools we use for the data integration process are Salesforce, Tableau, and Hive Query Language (HiveQL).