dwh_thesaurus_data

Description¶

This table stores the standardized set of concepts used to represent medical data within the data warehouse. It acts as a controlled vocabulary for coding clinical information, laboratory tests, procedures, diagnoses, and medications, covering multiple classification systems (e.g., CIM10, CCAM, ATC, local lab codes). Each record represents a unique concept with associated metadata to describe its meaning, structure, permissible values, and usage in the dataset.

Columns¶

Field	User Guide	ETL Conventions	Datatype	Required	Primary Key	Foreign Key	FK Table
thesaurus_data_num	Unique identifier		bigint(64)	Yes	Yes	No
thesaurus_code	Identifier of the vocabulary or coding system (e.g. CIM10, CCAM, ATC).		varchar(30)	Yes	No	No
concept_code	The concept code represents the identifier of the concept in the source vocabulary. Note that concept codes are not unique across vocabularies.		varchar(100)	Yes	No	No
concept_str	Human-readable name or label of the concept.		varchar(2000)	Yes	No	No
description	Extended description or definition of the concept.		varchar(4000)	No	No	No
measuring_unit	Measurement unit for quantitative concepts (e.g., mg, mmHg).		varchar(50)	No	No	No
value_type	Type of value expected	numeric : quantitative value text: free-text value present : presence/absence indicator liste : enumerated values	varchar(50)	Yes	No	No
list_values	List of values when type is liste else null	Values should be split by ; (e.g. principal;associé;relié)	varchar(4000)	No	No	No
thesaurus_parent_num	Parent concept in the hierarchy.		bigint(64)	No	No	Yes	dwh_thesaurus_data
count_data_used	Count of dwh_data associated with this concept		integer(32)	No	No	No
update_date	Date and time of the record’s last update.		timestamptz	No	No	No
upload_id	Identifier of the pipeline integration run, used to differentiate each batch of integrated data.	Defined at the start of the pipeline as `datetime.now().strftime("%Y%m%d%H%M%S")`. For example a batch integrated on 15/09/2025 at 00:00:00 has `upload_id = 20250915000000`.	bigint(64)	No	No	No