1. Overall DCR Process Overview

  • Purpose of DCR Process: The Data Change Request (DCR) process is designed to improve and validate the quality of existing data in source systems. It provides a formal mechanism for proposing, validating, and applying data changes.

  • General DCR Process Flow:

    1. A source system creates a proposal for a data change, known as a DCR or Validation Request (VR).
    2. The MDM HUB routes this request to the appropriate validation channel.
    3. Validation is performed either by internal Data Stewards (DS) within Reltio or by external, third-party validator services like OneKey or Veeva OpenData.
    4. A response is sent back, which includes metadata about the DCR's status (e.g., accepted, rejected) and the actual data profile update (payload) resulting from the processed DCR.
  • High-Level Solution Architecture: The architecture involves source systems initiating DCRs through the MDM HUB. The HUB acts as a central router, directing requests to either Reltio for internal data steward review or to Third-Party Validators (OneKey, Veeva). The HUB is also responsible for receiving responses and facilitating the update of data in Reltio, often through ETL processes that handle payload delivery (e.g., via S3).


2. System-Specific DCR Implementations

Here are the details for each system involved in the DCR process.


OneKey (OK)

  • Role in DCR Process: Functions as a third-party validator for DCRs. It receives validation requests, processes them, and returns the results.
  • Key Components: DCR Service 2, OK DCR Service, OneKey Adapter, Publisher, Hub Store (MongoDB), Manager (Reltio Adapter).
  • Actors Involved: PforceRX, Data Stewards (in Reltio), Reltio, HUB, OneKey.
  • Core Process Details:
    • PforceRX-initiated: DCRs are created via the HUB's API. The HUB integrates with OneKey's API to submit requests (/vr/submit) and periodically checks for status updates (/vr/trace).
    • Reltio-initiated: Data Stewards can suggest changes in Reltio and use the "Send to Third Party Validation" feature, which triggers a flow to submit a validation request to OneKey. Singleton entities created in Reltio can also trigger an automatic validation request to OneKey.
  • Integration Methods:
    • API: Real-time integration with OneKey via REST APIs (/vr/submit, /vr/trace).
    • File Transfer: Data profile updates (payload) are delivered back to Reltio via CSV files on an S3 bucket, which are then processed by an ETL job.
  • DCR Types Handled: Create, update, delete operations for HCP/HCO profiles; validation of newly created singleton entities; changes suggested by Data Stewards.

Veeva OpenData (VOD)

  • Role in DCR Process: Functions as a third-party validator, primarily handling DCRs initiated by Data Stewards from within Reltio.
  • Key Components: DCR Service 2, Veeva DCR Service, Veeva Adapter, GMTF (Global Master Template & Foundation) jobs.
  • Actors Involved: Data Stewards (in Reltio), HUB, Veeva OpenData.
  • Core Process Details:
    1. Data Stewards in Reltio create DCRs using the "Suggest / Send to 3rd Party Validation" functionality.
    2. The HUB stores these requests in a Mongo collection (DCRRegistryVeeva).
    3. A scheduled job gathers these DCRs, packages them into ZIP files containing multiple CSVs, and places them on an S3 bucket.
    4. Files are synchronized from S3 to Veeva's SFTP server in batches (typically every 24 hours).
    5. Veeva processes the files and returns response files to an inbound S3 directory, which the HUB traces to update DCR statuses.
  • Integration Methods:
    • File Transfer: Asynchronous, batch-based communication via ZIP/CSV files exchanged through S3 and SFTP.
  • DCR Types Handled: Primarily handles changes suggested by Data Stewards for existing profiles that need external validation from Veeva.
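
The batch packaging in step 3 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the HUB's actual job: the function name `build_dcr_zip` and the column names are hypothetical, and only the CSV file names come from the Veeva file specification in this document.

```python
import csv
import io
import zipfile


def build_dcr_zip(files: dict[str, list[dict[str, str]]]) -> bytes:
    """Pack several CSV files into one in-memory ZIP archive.

    `files` maps a CSV file name to its rows (one dict per row).
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, rows in files.items():
            if not rows:
                continue  # assumption: empty CSVs are simply left out
            text = io.StringIO()
            writer = csv.DictWriter(text, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
            archive.writestr(name, text.getvalue())
    return buffer.getvalue()


# Hypothetical columns, for illustration only.
payload = build_dcr_zip({
    "change_request.csv": [{"dcr_id": "DCR-1", "country": "IN"}],
    "change_request_hcp.csv": [{"dcr_id": "DCR-1", "last_name": "Rao"}],
})
```

The resulting bytes could then be uploaded to the outbound S3 location by whatever client the job uses; that upload is outside this sketch.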

IQVIA Highlander (HL) - Decommissioned April 2025

  • Role in DCR Process: Acted as a wrapper to translate DCRs from a Veeva format into a format that could be loaded into Reltio for Data Steward review.
  • Key Components: DCR Service (first version), IQVIA DCR Wrapper.
  • Actors Involved: Veeva (on behalf of PforceRX), Reltio, HUB, IQVIA wrapper, Data Stewards.
  • Core Process Details:
    1. Veeva uploaded DCR requests as CSV files to an FTP location.
    2. The HUB translated the Veeva CSV format into the IQVIA wrapper's CSV format.
    3. The IQVIA wrapper processed this file and created DCRs directly in Reltio.
    4. Data Stewards would then review, approve, or reject these DCRs within Reltio.
  • Integration Methods:
    • File Transfer: Communication was entirely file-based via S3 and SFTP.
  • DCR Types Handled: Aggregated 21 specific use cases into six generic types: NEW_HCP_GENERIC, UPDATE_HCP_GENERIC, DELETE_HCP_GENERIC, NEW_HCO_GENERIC, UPDATE_HCO_GENERIC, DELETE_HCO_GENERIC.

3. Key DCR Operations and Workflows


Create DCR

  • Description: This is the main entry point for clients like PforceRX to create DCRs. The process validates the request, routes it to the correct target system (Reltio, OneKey, or Veeva), and creates the DCR.
  • Triggers: An API call to POST /dcr.
  • Detailed Steps:
    1. The DCR service receives and validates the request (e.g., checks for duplicate IDs, existence of referenced objects for updates).
    2. It uses a decision table to determine the target system based on attributes like country, source, and operation type.
    3. It calls the appropriate internal method to create the DCR in the target system (Reltio, OneKey, or Veeva).
    4. A corresponding DCR tracking entity is created in Reltio, and the state is saved in the Mongo DCR Registry.
    5. For Reltio-targeted DCRs, a workflow is initiated for Data Steward review.
    6. Pre-close logic may be applied to auto-accept or auto-reject the DCR based on the country.
  • Decision Logic/Rules: A configurable decision table routes DCRs based on userName, sourceName, country, operationType, affectedAttributes, and affectedObjects.
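
A decision table of this kind can be modelled as an ordered list of rules in which an unset field is a wildcard. The sketch below assumes first-match-wins evaluation and uses made-up rows; the real table contents and evaluation order are not given here, and `affectedAttributes` and `affectedObjects` are left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One decision-table row; a field left as None acts as a wildcard."""
    target: str
    user_name: Optional[str] = None
    source_name: Optional[str] = None
    country: Optional[str] = None
    operation_type: Optional[str] = None

    def matches(self, request: dict) -> bool:
        pairs = [
            (self.user_name, request.get("userName")),
            (self.source_name, request.get("sourceName")),
            (self.country, request.get("country")),
            (self.operation_type, request.get("operationType")),
        ]
        return all(want is None or want == got for want, got in pairs)


def route(request: dict, rules: list[Rule]) -> Optional[str]:
    """Return the target of the first matching rule, or None if none match."""
    for rule in rules:
        if rule.matches(request):
            return rule.target
    return None


# Hypothetical rows; not the HUB's real configuration.
RULES = [
    Rule(target="OneKey", source_name="PforceRX", country="FR"),
    Rule(target="Veeva", country="IN", operation_type="update"),
    Rule(target="Reltio"),  # catch-all row
]

target = route({"sourceName": "PforceRX", "country": "FR"}, RULES)
```

Keeping the catch-all row last means every request resolves to some target, which is one simple way to avoid unrouted DCRs.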

Submit Validation Request

  • Description: This process submits validation requests for newly created "singleton" entities in Reltio to the OneKey service.
  • Triggers: Reltio events (e.g., HCP_CREATED, HCO_CREATED) are aggregated in a time window (e.g., 4 hours).
  • Detailed Steps:
    1. After an event aggregation window closes, the process performs several checks (e.g., entity is active, no existing OneKey crosswalk, no potential matches found via Reltio's getMatches API).
    2. If all checks pass, the entity data is mapped to a OneKey submitVR request.
    3. The request is sent to OneKey via POST /vr/submit.
    4. A DCR entity is created in Reltio to track the status, and the request is logged in Mongo.
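
The pre-submission checks in step 1 amount to a short guard function. The sketch below is illustrative only: `EntitySnapshot`, its fields, and the crosswalk source label `"ONEKEY"` are hypothetical, and the potential matches are assumed to have been fetched already (for example via the getMatches API).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EntitySnapshot:
    """Hypothetical, simplified view of a Reltio entity."""
    uri: str
    active: bool
    crosswalk_sources: list[str] = field(default_factory=list)
    potential_match_uris: list[str] = field(default_factory=list)


def vr_skip_reason(entity: EntitySnapshot) -> Optional[str]:
    """Return why no validation request should be sent, or None if all checks pass."""
    if not entity.active:
        return "entity is not active"
    if "ONEKEY" in entity.crosswalk_sources:  # assumed source label
        return "OneKey crosswalk already exists"
    if entity.potential_match_uris:
        return "potential matches found"
    return None


singleton = EntitySnapshot(uri="entities/abc", active=True, crosswalk_sources=["CRM"])
reason = vr_skip_reason(singleton)  # None, so the entity qualifies
```

Returning a reason instead of a boolean makes it easy to log why an entity was skipped.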

Trace Validation Request

  • Description: This scheduled process checks the status of pending validation requests that have been sent to an external validator like OneKey or Veeva.
  • Triggers: A timed scheduler (cron job) that runs every N hours.
  • Detailed Steps (OneKey Example):
    1. The process queries the Mongo DCR cache for requests with a SENT status.
    2. For each request, it calls the OneKey POST /vr/trace API.
    3. It evaluates the processStatus and responseStatus from the OneKey response.
    4. If the request is resolved (VAS_FOUND, VAS_NOT_FOUND, etc.), the DCR status in Reltio and Mongo is updated to ACCEPTED or REJECTED.
    5. If the response indicates a match was found but an OK crosswalk doesn't yet exist in Reltio, a new workflow is triggered for Data Steward manual review (DS_ACTION_REQUIRED).
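
Steps 3 to 5 can be read as a small decision function. The sketch below rests on assumptions that are not stated in this document: that VAS_FOUND resolves to ACCEPTED and VAS_NOT_FOUND to REJECTED, and that any other value means the request is still pending. It also looks only at responseStatus, whereas the real process evaluates processStatus as well.

```python
from typing import Optional


def resolve_trace(response_status: str, has_ok_crosswalk: bool) -> Optional[str]:
    """Return the new DCR status, or None if the request is still pending."""
    if response_status == "VAS_FOUND":
        # A match without a OneKey crosswalk in Reltio needs manual review.
        return "ACCEPTED" if has_ok_crosswalk else "DS_ACTION_REQUIRED"
    if response_status == "VAS_NOT_FOUND":
        return "REJECTED"  # assumed mapping
    return None  # unresolved: leave the DCR as it is and trace again later


new_status = resolve_trace("VAS_FOUND", has_ok_crosswalk=False)
```

A caller would update Reltio and the Mongo record only when the function returns a status.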

Data Steward Response

  • Description: This process handles the final outcome of a DCR that was reviewed internally by a Data Steward in Reltio.
  • Triggers: Reltio change request events (CHANGE_REQUEST_CHANGED, CHANGE_REQUEST_REMOVED) that do not have the ThirdPartyValidation flag.
  • Detailed Steps:
    1. The process consumes the event from Reltio.
    2. It checks the state of the change request.
    3. If the state is APPLIED or REJECTED, the corresponding DCR entity in Reltio and the record in Mongo are updated to a final status of ACCEPTED or REJECTED.
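
The handling above can be sketched as a single function over the incoming event. The event keys used here (`type`, `state`, `thirdPartyValidation`) are hypothetical stand-ins for whatever the Reltio event actually carries.

```python
from typing import Optional

HANDLED_TYPES = {"CHANGE_REQUEST_CHANGED", "CHANGE_REQUEST_REMOVED"}
FINAL_STATUS = {"APPLIED": "ACCEPTED", "REJECTED": "REJECTED"}


def final_status_for(event: dict) -> Optional[str]:
    """Return the final DCR status for an event, or None if nothing should change."""
    if event.get("thirdPartyValidation"):
        return None  # belongs to the third-party validation flow
    if event.get("type") not in HANDLED_TYPES:
        return None
    return FINAL_STATUS.get(event.get("state"))


status = final_status_for({"type": "CHANGE_REQUEST_CHANGED", "state": "APPLIED"})
```

Change requests in any other state fall through with `None`, so the DCR stays open until a later event arrives.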

Data Steward OK Validation Request

  • Description: This process handles DCRs created by a Data Steward in Reltio using the "Suggest" and "Send to Third Party Validation" features, routing them to an external validator like OneKey.
  • Triggers: Reltio change request events that do have the ThirdPartyValidation flag set to true.
  • Detailed Steps:
    1. The HUB retrieves the "preview" state of the entity from Reltio to see the suggested changes.
    2. It compares the current entity with the preview to calculate the delta.
    3. It maps these changes to a OneKey submitVR request. Attribute removals are sent as a comment due to API limitations.
    4. The request is sent to OneKey.
    5. Upon successful submission, the original change request in Reltio is programmatically rejected (since the validation is now happening externally), and a new DCR entity is created for tracking the OneKey validation.
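
Steps 2 and 3 can be illustrated for simple (flat) attributes. This is a sketch under assumptions: entities are reduced to plain dicts, nested attributes are ignored, and the exact wording of the removal comment only approximates the format quoted in this document.

```python
def diff_simple_attributes(current: dict, preview: dict) -> tuple[dict, dict]:
    """Split the delta into changed-or-added values and removed values."""
    changed = {k: v for k, v in preview.items() if current.get(k) != v}
    removed = {k: v for k, v in current.items() if k not in preview}
    return changed, removed


def removal_comment(removed: dict) -> str:
    """Render removals as a free-text comment, since they cannot be sent directly."""
    if not removed:
        return ""
    parts = ", ".join(f"{name}: {value}" for name, value in sorted(removed.items()))
    return f"Please remove attributes: [{parts}]"


current = {"FirstName": "Ana", "LastName": "Silva", "Title": "Dr"}
preview = {"FirstName": "Anna", "LastName": "Silva"}
changed, removed = diff_simple_attributes(current, preview)
comment = removal_comment(removed)
```

Here `changed` would feed the mapped submitVR request, while `comment` carries the removal that the API cannot express.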

4. Data Comparison and Mapping Details

  • OneKey Comparator: When a Data Steward suggests changes, the HUB compares the current Reltio entity with the "preview" state to determine which changes to send to OneKey.

    • Simple Attributes (e.g., FirstName): Values are compared for equality. The suggested value is taken if different.
    • Complex Attributes (e.g., Addresses, Specialties): Nested attributes are matched using their Reltio URI. New nested objects are added, and changes to existing ones are applied.
    • Mandatory Fields: For HCP, LastName and Country are mandatory. For HCO, Country and Addresses are mandatory.
    • Attribute Removal: Due to API limitations, removing an attribute is not done directly but by generating a comment in the request, e.g., "Please remove attributes: [Address: ...]".
  • Veeva Mapping: The process of mapping Reltio canonical codes to Veeva's source-specific codes is multi-layered.

    1. Veeva Defaults: The system first checks custom CSV mapping files stored in configuration (mdm-veeva-dcr-service/defaults). These files define direct mappings for a specific country and canonical code (e.g., IN;SP.PD;PD).
    2. RDM Lookups: If no default is found, it queries RDM (via a Mongo LookupValues collection) for the canonical code and looks for a sourceMapping where the source is "VOD".
    3. Veeva Fallback: If no mapping is found, it consults fallback CSV files (mdm-veeva-dcr-service/fallback) for certain attributes (e.g., hco-specialty.csv). A regular expression is often used to extract the correct code. If all else fails, a question mark (?) is used as the default fallback.
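
The three lookup layers can be sketched as a chain that returns on the first hit. Several details here are assumptions: the defaults row is read as `country;canonicalCode;veevaCode`, the RDM data is modelled as a dict of source mappings with hypothetical `source` and `code` keys, and the fallback is reduced to a single regular expression with one capture group rather than the per-attribute CSV files.

```python
import csv
import re
from typing import Optional


def load_defaults(lines: list[str]) -> dict[tuple[str, str], str]:
    """Parse semicolon-separated default rows into a (country, code) lookup."""
    table = {}
    for row in csv.reader(lines, delimiter=";"):
        if len(row) == 3:
            country, canonical, veeva = row
            table[(country, canonical)] = veeva
    return table


def map_to_veeva_code(
    country: str,
    canonical: str,
    defaults: dict[tuple[str, str], str],
    rdm: dict[str, list[dict]],
    fallback_pattern: Optional[re.Pattern] = None,
) -> str:
    # 1. Veeva defaults for this country and canonical code.
    if (country, canonical) in defaults:
        return defaults[(country, canonical)]
    # 2. RDM lookup: first source mapping whose source is "VOD".
    for mapping in rdm.get(canonical, []):
        if mapping.get("source") == "VOD":
            return mapping["code"]
    # 3. Fallback: extract a code with a regular expression, if one is configured.
    if fallback_pattern is not None:
        match = fallback_pattern.search(canonical)
        if match:
            return match.group(1)
    return "?"  # last-resort default


defaults = load_defaults(["IN;SP.PD;PD"])
rdm = {"SP.CA": [{"source": "OK", "code": "X1"}, {"source": "VOD", "code": "CA"}]}
code = map_to_veeva_code("IN", "SP.PD", defaults, rdm)
```

Because each layer returns as soon as it finds a value, a country-specific default always overrides RDM, and RDM always overrides the fallback.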

5. Status Management and Error Handling

  • DCR Statuses: The system uses a combination of statuses to track the DCR lifecycle.

| RequestStatus | DCRStatus | Internal Cache Status | Description |
|---|---|---|---|
| REQUEST_ACCEPTED | CREATED | SENT_TO_OK | DCR sent to OneKey, pending DS review. |
| REQUEST_ACCEPTED | CREATED | SENT_TO_VEEVA | DCR sent to Veeva, pending DS review. |
| REQUEST_ACCEPTED | CREATED | DS_ACTION_REQUIRED | DCR pending internal DS validation in Reltio. |
| REQUEST_ACCEPTED | ACCEPTED | ACCEPTED | DS accepted the DCR; changes were applied. |
| REQUEST_ACCEPTED | ACCEPTED | PRE_ACCEPTED | Pre-close logic automatically accepted the DCR. |
| REQUEST_REJECTED | REJECTED | REJECTED | DS rejected the DCR. |
| REQUEST_REJECTED | REJECTED | PRE_REJECTED | Pre-close logic automatically rejected the DCR. |
| REQUEST_FAILED | - | FAILED | DCR failed due to a validation or system error. |

  • Error Codes:

| Error Code | Description | HTTP Code |
|---|---|---|
| DUPLICATE_REQUEST | The extDCRRequestId has already been registered. | 403 |
| NO_CHANGES_DETECTED | The request contained no changes compared to the existing entity. | 400 |
| VALIDATION_ERROR | A referenced object (HCP/HCO) or attribute could not be found. | 404 / 400 |

  • DCR State Changes: A DCR begins in an OPEN state. It is then sent to a target system, moving to states like SENT_TO_OK, SENT_TO_VEEVA, or DS_ACTION_REQUIRED. Pre-close logic can immediately move it to PRE_ACCEPTED or PRE_REJECTED. Following data steward review (either internal or external), the DCR reaches a terminal state of ACCEPTED or REJECTED. If an error occurs, it moves to FAILED.
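
The status table and the state changes described above can be captured in two small lookup structures. The `PUBLIC_STATUS` mapping is taken directly from the table; the `TRANSITIONS` set is only one reading of the paragraph above (for example, it treats PRE_ACCEPTED and PRE_REJECTED as terminal) and is not a confirmed specification.

```python
# Internal cache status -> (RequestStatus, DCRStatus), as listed in the status table.
PUBLIC_STATUS = {
    "SENT_TO_OK": ("REQUEST_ACCEPTED", "CREATED"),
    "SENT_TO_VEEVA": ("REQUEST_ACCEPTED", "CREATED"),
    "DS_ACTION_REQUIRED": ("REQUEST_ACCEPTED", "CREATED"),
    "ACCEPTED": ("REQUEST_ACCEPTED", "ACCEPTED"),
    "PRE_ACCEPTED": ("REQUEST_ACCEPTED", "ACCEPTED"),
    "REJECTED": ("REQUEST_REJECTED", "REJECTED"),
    "PRE_REJECTED": ("REQUEST_REJECTED", "REJECTED"),
    "FAILED": ("REQUEST_FAILED", None),
}

# Assumed allowed moves; states missing from this dict are treated as terminal.
TRANSITIONS = {
    "OPEN": {"SENT_TO_OK", "SENT_TO_VEEVA", "DS_ACTION_REQUIRED",
             "PRE_ACCEPTED", "PRE_REJECTED", "FAILED"},
    "SENT_TO_OK": {"ACCEPTED", "REJECTED", "DS_ACTION_REQUIRED", "FAILED"},
    "SENT_TO_VEEVA": {"ACCEPTED", "REJECTED", "FAILED"},
    "DS_ACTION_REQUIRED": {"ACCEPTED", "REJECTED", "FAILED"},
}


def advance(current: str, new: str) -> str:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {new}")
    return new


state = advance("OPEN", "SENT_TO_OK")
state = advance(state, "ACCEPTED")
request_status, dcr_status = PUBLIC_STATUS[state]
```

Rejecting unknown moves early makes it harder for a late or duplicate event to reopen a DCR that has already been closed.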

6. Technical Artifacts and Infrastructure

  • Mongo Collections:

    • DCRRegistryONEKEY / DCRRegistryVeeva: System-specific collections that store and track the state of DCRs sent to OneKey and Veeva, respectively. They hold the mapped request data and the traced response. DCRRegistryVeeva also stores the raw CSV line content of each Veeva-bound DCR before it is submitted in a batch.
    • DCRRequest / DCRRegistry: General-purpose collections for tracking DCRs, their metadata, and overall status within the HUB.
  • File Specifications:

    • Veeva Integration: Uses ZIP files containing multiple CSV files (change_request.csv, change_request_hcp.csv, change_request_address.csv, etc.). Response files follow a similar pattern and are named with the convention <country>_DCR_Response_<Date>.zip.
    • Highlander Integration: Used CSV files for requests.
  • Event Models: The system uses internal Kafka events to communicate DCR status changes between components.

    • OneKeyDCREvent: Published by the trace process after receiving a response from OneKey. It contains the DCR ID and the OneKeyChangeRequest details (vrStatus, vrStatusDetail, comments, validated IDs).
    • VeevaDCREvent: Published by the Veeva trace process. It contains the DCR ID and VeevaChangeRequestDetails (vrStatus, vrStatusDetail, comments, new Veeva IDs).
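
As an illustration of the event shape, the OneKey event could be modelled with dataclasses and serialized to JSON before being published. Only `vrStatus`, `vrStatusDetail`, and the event and detail type names come from this document; the remaining field names and the sample values are hypothetical, and the real Kafka payload schema may differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class OneKeyChangeRequest:
    vrStatus: str
    vrStatusDetail: str
    comments: list[str] = field(default_factory=list)
    validatedIds: list[str] = field(default_factory=list)  # hypothetical field name


@dataclass
class OneKeyDCREvent:
    dcrId: str  # hypothetical field name
    changeRequest: OneKeyChangeRequest

    def to_json(self) -> str:
        """Serialize the event, including the nested change request, to JSON."""
        return json.dumps(asdict(self), sort_keys=True)


# Sample values are made up for the example.
event = OneKeyDCREvent(
    dcrId="DCR-1",
    changeRequest=OneKeyChangeRequest(vrStatus="CLOSED", vrStatusDetail="ACCEPTED"),
)
message = event.to_json()
```

A VeevaDCREvent could follow the same pattern, with the Veeva-specific detail type in place of `OneKeyChangeRequest`.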