Metadata description table


Main Attributes

Attribute Description
barcode A unique identifier for each sample used in the study, often a combination of alphanumeric characters.
patient Identifier for the individual from whom the sample was obtained.
sample Identifier for the specific biological sample from a patient.
shortLetterCode Abbreviated code for the sample type (e.g., TP for primary solid tumor, NT for solid tissue normal).
definition Full-text definition of the sample type that corresponds to shortLetterCode (e.g., Primary solid Tumor).
sample_submitter_id Unique ID assigned by the submitter to the sample.
sample_type_id A numerical ID that corresponds to the type of sample collected, such as tumor or normal tissue.
tumor_descriptor Descriptive term that specifies attributes of the tumor tissue in the sample.
sample_id Unique identifier assigned to the sample for tracking and reference.
sample_type Classification of the sample based on its origin (e.g., primary tumor, normal tissue).
composition Description of the physical composition of the sample (e.g., solid, blood).
days_to_collection Number of days from the patient’s initial diagnosis to the date the sample was collected.
state The processing or preservation state of the sample.
initial_weight The weight of the sample when initially collected.
preservation_method Method used to preserve the sample, such as freezing or embedding in paraffin.
pathology_report_uuid Unique identifier for the pathology report associated with the sample.
submitter_id Identifier assigned by the submitter to the record (typically the case-level ID); it does not identify the submitting person or institution.
oct_embedded Indicates whether the sample was embedded in OCT (optimal cutting temperature compound) for preservation.
specimen_type Broad classification of the specimen type, like tumor or normal.
is_ffpe Indicates if the sample is fixed in formalin and embedded in paraffin (FFPE).
tissue_type Type of tissue from which the sample was derived (e.g., breast, lung).
synchronous_malignancy Indicates presence of a malignancy in the patient synchronous with the primary cancer under study.
ajcc_pathologic_stage Stage of cancer as defined by the American Joint Committee on Cancer’s pathologic classification system.
days_to_diagnosis Number of days from the index date (the reference date for the days_to_* fields) to the patient’s cancer diagnosis.
treatments Descriptions of treatments the patient has undergone, potentially affecting the sample.
last_known_disease_status Most recent known status of the patient’s disease at the time of data collection.
tissue_or_organ_of_origin The original tissue or organ from which the primary tumor developed.
days_to_last_follow_up Number of days from initial cancer diagnosis to the last follow-up with the patient.
age_at_diagnosis Age of the patient at the time of cancer diagnosis, expressed in days since birth.
primary_diagnosis The initial diagnosis of cancer type made by examining the tissue sample.
prior_malignancy Indicates if the patient had a malignancy prior to the current diagnosis.
year_of_diagnosis The year in which the cancer was diagnosed.
prior_treatment Indicates whether the patient received any treatment for cancer prior to the current diagnosis.
ajcc_staging_system_edition Edition of the AJCC staging system used to classify the cancer.
ajcc_pathologic_t Tumor size and extent of invasion in the AJCC pathologic classification.
morphology ICD-O-3 morphology code describing the tumor’s histology (microscopic characteristics) and behavior.
ajcc_pathologic_n Involvement of regional lymph nodes in the AJCC pathologic classification.
ajcc_pathologic_m Presence of distant metastasis as per the AJCC pathologic classification.
classification_of_tumor Notes whether the tumor is primary, recurrent, or metastatic.
diagnosis_id A unique identifier for the diagnosis linked to this sample.
icd_10_code The International Classification of Diseases (ICD-10) code related to the primary diagnosis.
site_of_resection_or_biopsy The anatomical site where the resection or biopsy for the sample was performed.
tumor_grade Grade of the tumor indicating its aggressiveness or likelihood of growth and spread.
progression_or_recurrence Indicates whether the cancer has progressed or recurred after initial treatment.
alcohol_history Information regarding the patient’s history of alcohol use.
exposure_id Identifier for any exposure information linked to the patient’s medical or environmental history.
race Race of the patient as reported.
gender Gender of the patient.
ethnicity Ethnic background of the patient.
vital_status Current vital status of the patient (e.g., alive, deceased).
age_at_index Age of the patient, in years, at the index date (the reference date for the days_to_* fields).
days_to_birth Number of days from the index date (typically the diagnosis date) to the patient’s birth, usually recorded as a negative number. Used for calculating age.
year_of_birth Year the patient was born.
demographic_id A unique identifier for the demographic information of the patient.
days_to_death Number of days from the index date (typically the diagnosis date) to the patient’s death.
year_of_death Year the patient died.
bcr_patient_barcode Barcode assigned to the patient’s data for tracking and reference in the BCR (Biospecimen Core Resource).
primary_site The primary location of the original tumor.
project_id Identifier for the research project or study this data is part of.
disease_type Type of disease diagnosed in the patient (e.g., ductal and lobular neoplasms). Study abbreviations such as BRCA appear in project_id instead.
name General identifier or name label for the sample or data set.
releasable Indicates whether the data is approved for release.
released Indicates whether the data has been released publicly.
days_to_sample_procurement Number of days from the patient’s diagnosis to when the sample was procured.
paper_* Attributes with the paper_ prefix (e.g., paper_patient) hold values used or reported in published papers related to the study; they are listed under Paper Attributes.
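
The attributes vital_status, days_to_death, and days_to_last_follow_up are commonly combined into an overall survival time with a censoring flag. The following is a minimal sketch of that derivation, not a method prescribed by this table. It assumes that days_to_death and days_to_last_follow_up are counted from the same index date, that vital_status contains values such as "Alive", "Dead", or "Deceased", and that each row is available as a plain dictionary. The function name overall_survival and the sample rows are hypothetical.

```python
from typing import Mapping, Optional, Tuple


def overall_survival(record: Mapping[str, object]) -> Optional[Tuple[float, bool]]:
    """Return (time_in_days, event_observed) for one metadata row, or None.

    Assumption: days_to_death and days_to_last_follow_up share one index date.
    """
    status = str(record.get("vital_status") or "").strip().lower()
    if status in {"dead", "deceased"}:
        # Death observed: survival time runs to the date of death.
        days = record.get("days_to_death")
        event = True
    elif status == "alive":
        # Censored: the patient was last known to be alive at follow-up.
        days = record.get("days_to_last_follow_up")
        event = False
    else:
        # Unknown or unreported status cannot be used.
        return None
    if days is None:
        return None
    try:
        return float(days), event
    except (TypeError, ValueError):
        # Non-numeric placeholders such as "NA" are treated as missing.
        return None


# Hypothetical rows for illustration only.
rows = [
    {"barcode": "sample-A", "vital_status": "Alive",
     "days_to_last_follow_up": 1200, "days_to_death": None},
    {"barcode": "sample-B", "vital_status": "Dead",
     "days_to_last_follow_up": 300, "days_to_death": 450},
    {"barcode": "sample-C", "vital_status": "Not Reported"},
]
survival = {row["barcode"]: overall_survival(row) for row in rows}
```

Rows with an unknown status or a missing day count map to None, so they can be dropped explicitly before any survival analysis rather than being silently treated as censored.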

Paper Attributes


Attribute Description
paper_patient Refers to patient data as used or cited in specific research papers.
paper_Tumor.Type Type of tumor as classified in the paper; may differ from clinical classifications.
paper_Included_in_previous_marker_papers Indicates whether data from this patient was included in previous studies focusing on specific biomarkers.
paper_vital_status The vital status (alive or deceased) of the patient as reported in the paper.
paper_days_to_birth Number of days from the index date to the patient’s birth, as used in the paper for age-related analyses.
paper_days_to_death Number of days from the index date to the patient’s death, as used in the paper.
paper_days_to_last_followup Number of days from initial diagnosis to the last follow-up as reported in the paper.
paper_age_at_initial_pathologic_diagnosis Patient’s age at the time of initial pathological diagnosis as cited in the paper.
paper_pathologic_stage Pathologic stage of the cancer as used in the paper, based on tumor size, node involvement, and metastasis presence.
paper_Tumor_Grade The grade of the tumor indicating its aggressiveness or likelihood of growth and spread as used in the paper.
paper_BRCA_Pathology Specific pathological findings related to breast cancer (BRCA) as discussed in the paper.
paper_BRCA_Subtype_PAM50 Classification of BRCA subtypes according to the PAM50 gene expression profiling as used in the paper.
paper_MSI_status Microsatellite instability (MSI) status of the cancer as used in the paper.
paper_HPV_Status Human Papillomavirus (HPV) status of the patient as discussed in the paper, relevant to certain cancer types.
paper_tobacco_smoking_history Tobacco smoking history of the patient as reported in the paper, which may influence cancer risk assessment.
paper_CNV Clusters Clusters based on copy number variations (CNVs) as identified and discussed in the paper.
paper_Mutation Clusters Mutation clusters as identified in the genomic data and used for analysis in the paper.
paper_DNA.Methylation Clusters Clusters based on DNA methylation patterns as discussed in the paper, relevant for understanding epigenetic modifications.
paper_mRNA Clusters mRNA expression clusters as analyzed and reported in the paper, used for gene expression profiling.
paper_miRNA Clusters MicroRNA (miRNA) expression clusters as used in the paper for studying post-transcriptional regulation.
paper_lncRNA Clusters Long non-coding RNA (lncRNA) clusters as discussed in the paper, relevant for understanding their roles in cancer.
paper_Protein Clusters Protein expression clusters as analyzed in the paper, important for proteomics studies.
paper_PARADIGM Clusters Integrated clusters based on the PARADIGM pathway analysis as used in the paper.
paper_Pan-Gyn Clusters Clusters that encompass various gynecological cancer types as used in the paper for broader comparative studies.
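
Because every paper attribute shares the paper_ prefix, the paper-derived columns can be separated from the main attributes by name alone. The sketch below illustrates this under the assumption that the column names are available as a list of strings; the helper name paper_columns and the example column list are hypothetical. Several names contain spaces or dots (e.g., paper_CNV Clusters, paper_DNA.Methylation Clusters), so they are matched as whole strings rather than split on whitespace.

```python
from typing import Dict, Iterable

PAPER_PREFIX = "paper_"


def paper_columns(columns: Iterable[str]) -> Dict[str, str]:
    """Map each paper_* column name to the same name without the prefix."""
    return {
        name: name[len(PAPER_PREFIX):]
        for name in columns
        if name.startswith(PAPER_PREFIX)
    }


# Example column names taken from the tables above.
columns = [
    "barcode",
    "vital_status",
    "paper_vital_status",
    "paper_CNV Clusters",
    "paper_DNA.Methylation Clusters",
]
selected = paper_columns(columns)
```

The stripped names make it easy to spot pairs such as vital_status and paper_vital_status, whose values may differ because the paper reports the status as of its own analysis.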