dwh_document
Description
This table contains unstructured clinical documents and reports associated with a patient’s healthcare encounter. Each record corresponds to a single document instance, such as a clinical report, observation note, or questionnaire form. The table captures the document content as free text together with key metadata including the author, the date of creation, the type of document, and the healthcare context in which it was generated (e.g., visit, service, or care unit).
Columns
| Field | User Guide | ETL Conventions | Datatype | Required | Primary Key | Foreign Key | FK Table |
|---|---|---|---|---|---|---|---|
| document_num | Unique identifier | | bigint(64) | Yes | Yes | No | |
| patient_num | | | bigint(64) | Yes | No | Yes | dwh_patient |
| title | The title of the document. | | varchar(400) | No | No | No | |
| document_date | The date the document was recorded. | | timestamptz | Yes | No | No | |
| document_type | The type of the document (e.g. CR, observation, formulaire, ...) | | varchar(100) | No | No | No | |
| author | The person who created or authored the document | | varchar(200) | No | No | No | |
| displayed_text | The content of the document. | Convert to html | text | Yes | No | No | |
| stay_num | The visit during which the document was created. | | bigint(64) | No | No | Yes | dwh_patient_stay |
| department_num | The service associated with the document. | | bigint(64) | No | No | Yes | dwh_thesaurus_department |
| unit_num | The care unit associated with the document. | | bigint(64) | No | No | Yes | dwh_thesaurus_unit |
| instance_document_id | Code of the healthcare center; see hospital_instance for more information. | | varchar(40) | No | No | No | |
| document_origin_code | Indicates the source software for this document. | | varchar(60) | No | No | No | |
| id_doc_source | Unique identifier of the document in the source software. | | varchar(300) | No | No | No | |
| document_pid | Optional pseudo-identifier for a document. Mainly included for structural consistency; not required for standard analytical use. | Generated as a hash of id_doc_source combined with document_salt. | varchar(300) | No | No | No | |
| document_salt | Optional random salt used in the hash algorithm to generate document_pid. | | varchar(300) | No | No | No | |
| upload_id | Identifier of the pipeline integration run, used to differentiate each batch of integrated data. | Defined at the start of the pipeline as datetime.now().strftime("%Y%m%d%H%M%S"). For example a batch integrated on 15/09/2025 at 00:00:00 has upload_id = 20250915000000. | bigint(64) | No | No | No | |
| update_date | Date and time of the record’s last update. | | timestamptz | No | No | No | |
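
The ETL conventions for upload_id, document_salt and document_pid can be illustrated with a minimal Python sketch. The upload_id format comes from the table above. Everything else is an assumption made for illustration: the helper names (`make_upload_id`, `make_document_salt`, `make_document_pid`) are hypothetical, and the table does not state which hash function is used or how id_doc_source and document_salt are combined, so SHA-256 over the salt followed by the identifier stands in for the real algorithm.

```python
import hashlib
import secrets
from datetime import datetime
from typing import Optional


def make_upload_id(now: Optional[datetime] = None) -> int:
    """Build the batch identifier from the pipeline start time (YYYYMMDDHHMMSS)."""
    started_at = now if now is not None else datetime.now()
    # The column is a bigint, so the formatted timestamp is converted to int.
    return int(started_at.strftime("%Y%m%d%H%M%S"))


def make_document_salt(n_bytes: int = 16) -> str:
    """Return a random hex salt. Hypothetical helper; the salt length is an assumption."""
    return secrets.token_hex(n_bytes)


def make_document_pid(id_doc_source: str, document_salt: str) -> str:
    """Hash id_doc_source combined with document_salt.

    Assumption: SHA-256 over salt + identifier, hex-encoded. The actual hash
    function and concatenation order are not specified in this documentation.
    """
    payload = (document_salt + id_doc_source).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Example from the table: a batch integrated on 15/09/2025 at 00:00:00.
upload_id = make_upload_id(datetime(2025, 9, 15, 0, 0, 0))
print(upload_id)  # 20250915000000

# "DOC-0001" is a made-up source identifier used only for this example.
salt = make_document_salt()
print(make_document_pid("DOC-0001", salt))
```

Under these assumptions the pseudo-identifier is a 64-character hexadecimal string, which fits the varchar(300) column, and the same id_doc_source and document_salt always produce the same document_pid.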