Latest improvements for QIAGEN OmicSoft Lands

QIAGEN OmicSoft Lands

Release date: 2024-02-01

OmicSoft Lands Release 2024R1

Highlights

  • New breast cancer proteomics studies added to ClinicalProteomicTumor
  • CCLE Land updated with new variant information and metadata
  • Over 8000 new samples added to OncoLand
  • Over 5000 samples added to DiseaseLand

OncoLand updates

ClinicalProteomicTumor

ClinicalProteomicTumor integrates studies focused on cancer proteomics from CPTAC and other repositories, including additional data such as transcriptomics and somatic variation.

This release adds 155 samples and 96 comparisons from PDC000120, which covers multiple subtypes of breast cancer. The new data include MS proteomics, RNA-seq, miRNA-seq, and somatic mutation profiling.

A horizontal bar graph showing the number of samples per category.

Figure 1. New Samples in ClinicalProteomicTumor from PDC000120, grouped by GeneticSubtype and colored by OncoSampleType.

With this new dataset, as with other datasets in ClinicalProteomicTumor, you can mine the collection of pre-computed comparisons to reveal differentially regulated genes and proteins that can be evaluated as candidate targets or biomarkers, and then confirm the findings at the sample level.

Scatter plots showing differential expression of genes.

Figure 2. Differential expression of genes between triple-receptor negative breast cancer (TNBC) and non-TNBC samples in PDC000120 at the protein and gene levels. (A) Comparison of differential expression at the RNA-seq and protein levels reveals multiple candidate markers of TNBC. (B) Sample-level expression of PPP1R14C at the RNA and protein levels confirms increased levels in TNBC samples.

OncoHuman

OncoHuman is the unified repository of oncology transcriptomics projects from thousands of studies requested by OmicSoft users.

A horizontal bar graph showing the number of samples per category.

Figure 3. New samples in OncoHuman, grouped by DiseaseState and colored by OncoSampleType.

This release adds 7066 samples and 1187 comparisons from 81 datasets on the following topics:

  • Colorectal cancer: GSE100179, GSE113513, GSE131353, GSE133057, GSE14095, GSE140973, GSE161158, GSE164191, GSE193814, GSE200129, GSE216455, GSE37175, GSE37178, GSE64857, GSE71187, GSE73255, GSE75315, GSE81653, and GSE97689
  • Stomach cancer: GSE115637, GSE116167, GSE118916, GSE125177, GSE128459, GSE130823, GSE160116, GSE96667, GSE96668, and GSE98708
  • Pancreas cancer and esophagus cancer: GSE157096, GSE161533, and GSE221250
  • Cutaneous T-cell lymphoma (CTCL): GSE180574, GSE181117, and GSE181118
  • Other datasets: GSE107170, GSE117970, GSE117970, GSE123285, GSE126464, GSE131592, GSE132707, GSE132966, GSE132966, GSE140186, GSE142720, GSE145148, GSE147745, GSE151423, GSE151825, GSE162669, GSE16757, GSE172153, GSE173771, GSE178998, GSE179443, GSE185824, GSE19977, GSE200146, GSE20017, GSE204862, GSE210274, GSE212248, GSE214846, GSE222334, GSE223655, GSE226448, GSE230453, GSE39791, GSE43362, GSE45267, GSE45434, GSE45435, GSE46581, GSE51697, GSE62743, GSE64041, GSE78806, GSE80774, GSE80774, and GSE89377

Removed/reprocessed datasets or comparisons

SRP017465, ERP003613 GPL11154, E-MTAB-2836 GPL16791, and GSE5057 GPL96 were removed due to redundancy with other lands (HumanDisease and HPA).

As part of our standard review process, comparisons for the following already landed projects were revised and can be found with an updated “OSModifiedDate”: E-MTAB-3610, E-MTAB-62, E-MTAB-783, E-MTAB-8412, GSE100025, GSE10021, GSE100705, GSE101833, GSE103340, GSE104922, GSE105402, GSE108088, GSE108286, GSE108345, GSE10843, GSE112282, GSE112369, GSE1133, GSE113970, GSE114012, GSE114564, GSE115544, GSE116305, GSE116437, GSE116438, GSE116439, GSE116440, GSE116441, GSE116442, GSE116443, GSE116444, GSE116445, GSE116446, GSE116447, GSE116448, GSE116449, GSE116450, GSE116451, GSE118171, GSE126109, GSE129696, GSE1323, GSE134147, GSE146361, GSE146687, GSE1474, GSE147971, GSE155343, GSE165914, GSE166716, GSE170999, GSE175787, GSE17714, GSE180440, GSE18088, GSE183202, GSE183777, GSE184398, GSE19114, GSE19188, GSE195984, GSE19860, GSE20124, GSE202434, GSE20462, GSE209746, GSE22821, GSE22984, GSE27157, GSE28567, GSE28645, GSE28709, GSE29288, GSE30543, GSE32036, GSE32323, GSE32474, GSE32989, GSE35159, GSE35896, GSE36552, GSE41035, GSE41445, GSE42937, GSE4342, GSE45052, GSE47992, GSE48213, GSE48276, GSE48433, GSE51447, GSE52219, GSE52329, GSE55624, GSE57083, GSE58326, GSE62080, GSE66514, GSE69795, GSE70691, GSE73318, GSE73360, GSE73526, GSE76402, GSE80606, GSE81089, GSE81980, GSE83129, GSE85465, GSE8596, GSE87419, GSE89127, GSE9031, GSE90592, GSE90681, GSE94304, GSE94669, GSE95499, GSE9677, GSE97023, GSE98383, PRJEB25780, and PRJNA816986.

OncoMouse

A horizontal bar graph showing the number of samples per category.

Figure 4. New samples in OncoMouse, grouped by DiseaseState and colored by TissueCategory.

This release adds 1135 samples and 470 comparisons from 20 datasets, including GSE112585, GSE122774, GSE143253, GSE145573, GSE149175, GSE149178, GSE168846, GSE173107, GSE184599, GSE202940, GSE203260, GSE205644, GSE218161, GSE235599, GSE237098, GSE242835, GSE25671, GSE85385, and GSE85507.

Removed/reprocessed datasets or comparisons

No datasets were removed for this release.

As part of our standard review process, metadata (and comparisons, where applicable) for the following already landed projects were revised and can be found with an updated “OSModifiedDate”: GSE102416, GSE103712, GSE106683, GSE112174, GSE112973, GSE126080, GSE135691, GSE135785, GSE26410, GSE30865, GSE42708, GSE43803, GSE56252, GSE65503, GSE67497, GSE68162, GSE69290, GSE69544, GSE69688, GSE71908, GSE83915, GSE89077, GSE89823, GSE94133, GSE97133, and GSE97452.

CCLE/DepMap

The Cancer Cell Line Encyclopedia (CCLE) project is an effort to conduct a detailed genetic characterization of a large panel of human cancer cell lines. OmicSoft's CCLE Land provides analysis and visualization of DNA copy number, mRNA expression, mutation data, and more, for 1879 cancer cell lines.

A horizontal bar graph showing the number of samples per category.

Figure 5. CCLE Land cell line distribution, grouped by DiseaseCategory and colored by TissueCategory.

With this release, new samples were added (based on DepMap 2023Q4 release) and new pharmacological drug response profiling data were added to metadata.

In addition, cell line descriptions were updated to the OmicSoft curation standard for DiseaseState, TissueCategory, and OncoSampleType, so that they are consistent with cell lines in OncoHuman and other Lands.

New data

  • A total of 48 new samples were added.
  • CRISPR gene dependency experiments were updated to CHRONOS data.
  • A total of 49 new DNA-seq somatic mutation samples were added.
  • A total of 61 new CNV samples were added, and all CNV data were updated with the current data as inferred from WGS, WES, or SNP array data.

Key metadata changes

  • Histology and DiseaseLocation[PrimarySite] were recurated entirely from literature.
  • DiseaseState, Tissue, and OncoSampleType for each cell line were updated according to the current OmicSoft standards.
  • Fields were renamed to be consistent with other OmicSoft Lands.
    Old Field Name                          New Field Name
    --------------------------------------  ---------------------------------------------
    (new field)                             CatalogNumber
    (new field)                             TumorType[DepMap]
    (new field)                             TreatmentHistory
    Lineage[DepMap]                         OncoTreeLineage
    DiseaseState[Cellosaurus]               OncoTreeDisease
    DiseaseState[Cellosaurus][NCItCode]     OncoTreeCode
    LineageSubtype[DepMap];DiseaseSubtype   OncoTreeDiseaseSubtype
    LineageMolecularSubtype[DepMap]         GeneticSubtype[DepMap][Legacy]
    LineageSubSubtype[DepMap]               LineageSubSubtype[DepMap][Legacy]
    Age[years]                              AgeAtSampling[years]
    AgeCategory                             AgeCategoryAtSampling
    MicrosatelliteInstability[MSI][CCLE]    MicrosatelliteInstability[MSI][Status][CCLE]
    MicrosatelliteInstability[MSI][GDSC]    MicrosatelliteInstability[MSI][Status][GDSC]
    GeneDependency[XPR1][PMID:35437317]     GeneDependency[XPR1][PMID35437317]
    CCLEName                                CellLineName[CCLE]
    CellLineSource                          BiomaterialProvider
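
For scripts that still refer to the old CCLE metadata field names, the renames listed above can be applied with a simple lookup. The following is a minimal sketch in Python, not part of any OmicSoft software: the `FIELD_RENAMES` dictionary and the `rename_fields` helper are hypothetical names, and the sample record is invented for illustration. The row for `LineageSubtype[DepMap];DiseaseSubtype` is left out because the table does not make clear whether it names one old field or two.

```python
# Old-to-new CCLE metadata field names, copied from the rename table above.
# The "LineageSubtype[DepMap];DiseaseSubtype" row is intentionally omitted
# (unclear whether it is one old field or two).
FIELD_RENAMES = {
    "Lineage[DepMap]": "OncoTreeLineage",
    "DiseaseState[Cellosaurus]": "OncoTreeDisease",
    "DiseaseState[Cellosaurus][NCItCode]": "OncoTreeCode",
    "LineageMolecularSubtype[DepMap]": "GeneticSubtype[DepMap][Legacy]",
    "LineageSubSubtype[DepMap]": "LineageSubSubtype[DepMap][Legacy]",
    "Age[years]": "AgeAtSampling[years]",
    "AgeCategory": "AgeCategoryAtSampling",
    "MicrosatelliteInstability[MSI][CCLE]": "MicrosatelliteInstability[MSI][Status][CCLE]",
    "MicrosatelliteInstability[MSI][GDSC]": "MicrosatelliteInstability[MSI][Status][GDSC]",
    "GeneDependency[XPR1][PMID:35437317]": "GeneDependency[XPR1][PMID35437317]",
    "CCLEName": "CellLineName[CCLE]",
    "CellLineSource": "BiomaterialProvider",
}


def rename_fields(record):
    """Return a copy of a metadata record with old field names replaced.

    Fields that are not in FIELD_RENAMES are kept under their current name.
    """
    return {FIELD_RENAMES.get(name, name): value for name, value in record.items()}


# Hypothetical record using pre-2024R1 field names (values are made up).
old_record = {
    "CCLEName": "EXAMPLE_LUNG",
    "Age[years]": "58",
    "DiseaseState": "lung carcinoma",
}
new_record = rename_fields(old_record)
print(new_record)
```

The same lookup can be applied to the column headers of an exported metadata table before joining it with data from this release.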

Known issues

  • Cancerous and normal/non-tumor cell lines originating from the same individual have the same SubjectID, but different DiseaseState values.
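
To see which cell lines are affected by this issue in an exported metadata table, samples can be grouped by SubjectID and checked for more than one DiseaseState value. The following is a minimal sketch in Python, assuming the metadata rows are available as dictionaries keyed by field name; the function name and the sample rows are hypothetical and do not come from CCLE Land.

```python
from collections import defaultdict


def subjects_with_mixed_disease_state(samples):
    """Map each SubjectID to its DiseaseState values, keeping only
    subjects that have more than one distinct DiseaseState."""
    states_by_subject = defaultdict(set)
    for sample in samples:
        states_by_subject[sample["SubjectID"]].add(sample["DiseaseState"])
    return {
        subject: states
        for subject, states in states_by_subject.items()
        if len(states) > 1
    }


# Hypothetical metadata rows (identifiers and values are made up).
rows = [
    {"SampleID": "cell_line_A", "SubjectID": "S1", "DiseaseState": "breast carcinoma"},
    {"SampleID": "cell_line_B", "SubjectID": "S1", "DiseaseState": "normal control"},
    {"SampleID": "cell_line_C", "SubjectID": "S2", "DiseaseState": "lung carcinoma"},
]
mixed = subjects_with_mixed_disease_state(rows)
print(mixed)
```

Subjects returned by this check are the ones where filtering on SubjectID alone would mix cancerous and normal/non-tumor cell lines.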

DiseaseLand updates

HumanDisease

HumanDisease is the unified repository of non-oncology disease omics projects from thousands of studies requested by OmicSoft users.

A horizontal bar graph showing the number of samples per category.

Figure 6. New samples in HumanDisease (excluding control samples), grouped by DiseaseState and colored by TissueCategory.

This release adds 4982 samples and 1108 comparisons from 72 datasets, including studies on:

  • Schizophrenia: GSE202537, GSE235055, GSE226233, GSE206720, GSE184102, GSE182370, GSE155067, GSE132689, and GSE118941
  • Depressive disorder: GSE178071, GSE178071, GSE193417, GSE99725, GSE135524, GSE128387, GSE85333, and GSE17440
  • Eye disease, retinal degeneration, and retina profiling: GSE102485, GSE131877, GSE132828, GSE142333, GSE144785, GSE151610, GSE154684, GSE164884, GSE176513, GSE180705, GSE186751, GSE201219, GSE201219, GSE227975, GSE75990, GSE94437, and GSE98370
  • CRISPR KO: GSE132704, GSE141171, GSE143371, GSE221916, GSE221916, GSE232818, GSE239367, and GSE246263
  • Atopic dermatitis: GSE137430, GSE141570, GSE141571, GSE185764, GSE208405, GSE224783, and GSE237920
  • Other topics: GSE99454, GSE198449, GSE155700, GSE162955, GSE24265, GSE209552, GSE137856, GSE19205, GSE206213, GSE48761, GSE52285, E-MTAB-12067, GSE124197, GSE141910, PXD038846, and GSE219278

Removed/reprocessed datasets or comparisons

The following datasets were removed from DiseaseLand, as they are duplicated in OncoHuman: GSE48953 GPL9115, GSE63816 GPL11154, GSE65185 GPL11154, GSE67501 GPL14951, GSE76340 GPL10558, GSE76340 GPL6947, and GSE79338 GPL11154.

As part of our standard review process, comparisons for the following already landed projects were revised and can be found with an updated “OSModifiedDate”: E-MTAB-1895, GSE100261, GSE101126, GSE102293, GSE102498, GSE103060, GSE109140, GSE11227, GSE117469, GSE118882, GSE120396, GSE12161, GSE12261, GSE124173, GSE124392, GSE12815, GSE129247, GSE130737, GSE13139, GSE137338, GSE13736, GSE143453, GSE144108, GSE144274, GSE144715, GSE145303, GSE145898, GSE147404, GSE150540, GSE151924, GSE154613, GSE155326, GSE159676, GSE164457, GSE16706, GSE17482, GSE177029, GSE17814, GSE194086, GSE205976, GSE206088, GSE206529, GSE20739, GSE216997, GSE21980, GSE22956, GSE23289, GSE24345, GSE26295, GSE27507, GSE28786, GSE29903, GSE30780, GSE32443, GSE34074, GSE37147, GSE37693, GSE39180, GSE40281, GSE41861, GSE43692, GSE44037, GSE45133, GSE45357, GSE4635, GSE50892, GSE51392, GSE53201, GSE54937, GSE57148, GSE57893, GSE60217, GSE6092, GSE60937, GSE62253, GSE6280, GSE62974, GSE64605, GSE65561, GSE65790, GSE66597, GSE66785, GSE67596, GSE71216, GSE71831, GSE71862, GSE72633, GSE73650, GSE75362, GSE75363, GSE75886, GSE75940, GSE83476, GSE85799, GSE86884, GSE87534, GSE87554, GSE90028, GSE92354, GSE92724, GSE93902, GSE95038, GSE95431, GSE96962, GSE97469, GSE994, and GSE99999.

MouseDisease

MouseDisease is the unified repository comprising thousands of studies exploring mouse models of human disease, requested by OmicSoft users.

A horizontal bar graph showing the number of samples per category.

Figure 7. New samples in MouseDisease, grouped by DiseaseCategory and colored by TissueCategory.

This release adds 642 samples and 374 comparisons from 31 datasets, including studies on:

  • Sleep disorder: GSE166831, GSE211088, and GSE211301
  • Schizophrenia: GSE218742, GSE207669, GSE209673, GSE197888, GSE181522, and GSE181285
  • Depressive disorder: GSE218742, GSE207669, GSE209673, GSE197888, GSE181522, and GSE181285
  • Hemophilia: GSE106436
  • Other topics: GSE173926, GSE182698, GSE211982, GSE137595, GSE196266, GSE124197, GSE5296, GSE95653, GSE96055, ERP112950, GSE185476, GSE179802, GSE158777, GSE166412, GSE171852, GSE104036, GSE112348, GSE200575, GSE205958, GSE214701, and GSE221379

Removed/reprocessed datasets or comparisons

No datasets were removed for this release.

As part of our standard review process, comparisons for the following already landed projects were revised and can be found with an updated “OSModifiedDate”: E-MTAB-5326, GSE100635, GSE106463, GSE107655, GSE109055, GSE109329, GSE112116, GSE114838, GSE118628, GSE126454, GSE132040, GSE134226, GSE134659, GSE135442, GSE146074, GSE147034, GSE148084, GSE160020, GSE1623, GSE180493, GSE19286, GSE25765, GSE25766, GSE25767, GSE25890, GSE25926, GSE27382, GSE31928, GSE32078, GSE32936, GSE34889, GSE37746, GSE41044, GSE42813, GSE48200, GSE48217, GSE51969, GSE60413, GSE63062, GSE65094, GSE71379, GSE72069, GSE75000, GSE76811, GSE76812, GSE85409, GSE87212, GSE87317, GSE89412, GSE95401, GSE96694, GSE97353, GSE97806, GSE98423, PRJNA556537, and SRP100399.

ATCC Land updates

ATCC Human

This release adds 215 samples, bringing the total to 1568 samples from 341 unique cell lines.

A horizontal bar graph showing the number of samples per category.

Figure 8. Distribution of samples in ATCC_Human_B38_GC33, grouped by DiseaseCategory and colored by TissueCategory.

ATCC Mouse

This release adds 21 samples, bringing the total to 198 samples from 49 unique cell lines.

A horizontal bar graph showing the number of samples per category.

Figure 9. Distribution of samples in ATCC_Mouse_B38, grouped by DiseaseState and colored by TissueCategory.

ATCC update highlights

With this latest release, you can quickly mine statistical comparisons to reveal differentially expressed genes between pairs of cell lines from the same tissue.

Figure 10. Comparison bubble plot displaying fold change (x-axis) and significance (size of bubble) for the expression of DNMT3A.

These new comparison data can be combined with RNA-seq expression data and mutation data to quickly identify the best cell line for your research.

Figure 11. RNA-Seq Mutation Genome Browser View for Flt3 in a subset of cell lines from hematologic samples from ATCC_Mouse_B38. Click on the interactive plot to highlight mutations of interest and explore the underlying sample metadata.

General updates

Updates to OmicSoft Lands flat file schemas

With this latest release, several improvements have been made to the flat file exports of the Lands and to data queries via the OmicSoft Lands API. Improvements include unification of the project_id field name across tables, consistent use of snake_case across all clinical_triplets attributes, availability of a persistent comparison_index, and unification of field types across databases.

Land Database Version Cleanup

We recommend that customers with dedicated installations review the list of available databases and remove any legacy versions that are not being used. Removing unused versions reduces confusion for users who are unsure which database to search for relevant information.

In most cases, the recommended version is Human Genome version 38 and gene model GenCode.V33 (“B38_GC33 Lands”).

Attend live and on-demand webinars

The expert Field Application Scientists of QIAGEN® routinely hold online trainings for new and advanced users of OmicSoft Lands data, showcasing the use of these resources to answer scientific questions. Upcoming webinars and recordings of previous webinars are available here: https://digitalinsights.qiagen.com/webinars-and-events/

Update to the latest OmicSoft Suite version to access the latest features

OmicSoft Suite updates significantly reduce the loading time and memory footprint of Single Cell Lands. Updates include new visualizations and features that cannot be accessed in earlier versions. Contact ts-bioinformatics@qiagen.com to learn more.