Skip to Main Content
Feature Request FR-4247
Product Area Developer Experience
Status CLOSED

8 Voters

Enhanced Data Load Validations in Data Load Definitions

martin l Public
· Jan 30 2025

Idea Summary:
Enhance APEX Data Load Definitions by introducing declarative validation rules. This would allow developers to enforce data integrity by configuring validations such as:

  • Required Fields: Ensure critical columns are non-empty.
  • Value Ranges: Define minimum/maximum values for numeric/date columns.
  • Lookup Constraints: Restrict values to existing entries in lookup tables.
  • Regex Patterns: Validate text formats (e.g., emails, phone numbers).
  • Custom Error Messages: Provide clear feedback for validation failures.

Use Case:
A hospital imports patient records via CSV. Validations ensure:

  1. Required Fields: “Patient ID” and “Admission Date” are mandatory.
  2. Range Checks: “Age” must be 0-120.
  3. Lookups: “Department Code” must exist in the system’s departments table.
  4. Formatting: “Email” must match a regex pattern.

During upload, rows violating these rules are flagged with actionable errors (e.g., “Row 5: Department Code ‘XYZ’ not found”), allowing users to correct the file without compromising data quality.

Preferred Solution:

  1. UI Enhancements:
    • Add a Validations section under “Advanced Column Settings” in Data Load Definitions.
    • Include checkboxes for Required, fields for Min/Max values, Regex Pattern, and a Lookup Validation toggle (fail if no match).
  2. Validation Engine:
    • Run checks during the staging phase, pre-commit.
    • Generate an error report highlighting invalid rows/columns with custom messages.
  3. Lookup Handling:
    • Replace silent empty values with hard failures or configurable defaults (e.g., “Unknown” or NULL).
  4. Error Reporting:
    • Provide a downloadable summary with row numbers, columns, and failure reasons.
    • Optionally, allow partial uploads for valid rows (configurable by developer).

Example Configuration:

  • Column: “Discount Percentage”
    • Validation: Range (0–100)
    • Error Message: “Discount must be between 0% and 100%.”
  • Column: “Country Code”
    • Validation: Lookup (must exist in COUNTRIES table)
    • Error Action: Reject row and log “Invalid Country Code: XYZ”.

Benefits:

  • Data Integrity: Prevent invalid or incomplete data at the source.
  • User Productivity: Reduce back-and-forth corrections with instant feedback.
  • Low-Code: No PL/SQL required—configure validations declaratively.

Technical Considerations:

  • Ensure validation logic respects column data types (e.g., date ranges vs. numeric ranges).

This is something that you ALWAYS have to do anyway, and right now we have to code this all by hand.

We reviewed this idea carefully, and while it was interesting, we concluded that it is unlikely to make its way into APEX in the foreseeable future.

Comments

Comments

  • carsten.czarski APEX Team OP 13 days ago

    I have a few questions if you don't mind please:

    • Data Loading typically targets a table. Why not adding check and foreign key constraints to these tables, which would automatically lead to the rows being rejected, and also improve the quality of your data model?
    • If a row fails validation, what do you expect to happen?

    regards

    -Carsten

  • martin l OP 12 days ago

    Hi Carsten, 

    I agree with your first comment.

    For the second comment, I created an errors region on the data load screen to tell the user which line fails and which column it was. You can see an example below. 
    The column errors is doing basic check like value is required and sometimes I'm calling custom procedure to make sure that the data can be loaded like if a date is in a closed period, you cannot load this entry.

    I hope this help clarifying my idea. Let me know if you need more details. Thanks.

    Regards,
    Martin

  • carsten.czarski APEX Team OP 12 days ago

    I think, all in all such things can be partly solved with having proper constraints on the underlying tables. Other cases can be resolved by first loading into a Collection (Data Loading supports that); from there rows can be validated, notifications can be sent and application UI can be built to bring data in order - you can even allow your end users to fix wrong rows in-place. 

    That all is much more flexible than a declarative validation functionality in data loading, which probably would not meet requirements from all sorts of customers. Given the implementation impact which that idea would have on how Data Loading works, I'm afraid I don't see that being implemented in the near future.

  • martin l OP 11 days ago

    ok, thank you for looking into it.

    Regards,
    Martin