Annotate an h5ad file based on CELLxGENE schema
This guide shows how to validate and curate an AnnData object using the metadata registries of laminlabs/cellxgene, based on the CELLxGENE schema version 5.1.0.
The validated object can be subsequently registered as an artifact in your LaminDB instance.
Note
The Annotate class is primarily designed to validate that all metadata adheres to the ontologies. It does not reimplement all rules of the CELLxGENE schema, so we recommend running cellxgene-schema if full adherence beyond metadata is required.
Set up
Initialize an instance in which to register the annotated AnnData:
!lamin init --storage ./test-cellxgene-annotate --schema bionty
💡 connected lamindb: testuser1/test-cellxgene-annotate
import lamindb as ln
import lnschema_bionty as lb
from cellxgene_lamin import Annotate, datasets, CellxGeneFields
ln.settings.verbosity = "hint"
lb.settings.organism = "human"
💡 connected lamindb: testuser1/test-cellxgene-annotate
❗ Full backed capabilities are not available for this version of anndata, please install anndata>=0.9.1.
An h5ad file
Let's start with an AnnData object that we'd like to inspect and curate:
adata = datasets.anndata_human_immune_cells(populate_registries=True)
adata
AnnData object with n_obs × n_vars = 1626 × 36503
    obs: 'donor', 'tissue', 'cell_type', 'assay', 'sex_ontology_term_id'
    var: 'feature_is_filtered'
    uns: 'default_embedding'
    obsm: 'X_umap'
adata.write_h5ad("anndata_human_immune_cells.h5ad")
!cellxgene-schema validate anndata_human_immune_cells.h5ad
Loading dependencies
Loading validator modules
Starting validation...
WARNING: Validation of raw layer was not performed due to current errors, try again after fixing current errors.
ERROR: Add labels error: Column 'cell_type' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: Add labels error: Column 'assay' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: Add labels error: Column 'tissue' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: 'title' in 'uns' is not present.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'raw.var'.
ERROR: Dataframe 'obs' is missing column 'cell_type_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'assay_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'disease_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'organism_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'tissue_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'self_reported_ethnicity_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'development_stage_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'is_primary_data'.
ERROR: Dataframe 'obs' is missing column 'donor_id'.
ERROR: Dataframe 'obs' is missing column 'suspension_type'.
ERROR: Dataframe 'obs' is missing column 'tissue_type'.
Validation complete in 0:00:00.317133 with status is_valid=False
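The validator output above repeats a handful of message types hundreds of times. As a rough sketch (not part of cellxgene-schema or cellxgene_lamin), the log can be condensed by replacing the names that vary per line with placeholders and counting what remains. The `summarize_errors` helper and the placeholder strings are hypothetical, and the sample `log` holds only a few lines copied from the output above.

```python
import re
from collections import Counter

# A few lines copied from the validator output above (not the full log).
log = """\
ERROR: Add labels error: Column 'cell_type' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: 'title' in 'uns' is not present.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
ERROR: Dataframe 'obs' is missing column 'donor_id'.
"""


def summarize_errors(text: str) -> Counter:
    """Count ERROR lines after masking gene IDs and column names (hypothetical helper)."""
    counts = Counter()
    for line in text.splitlines():
        if not line.startswith("ERROR: "):
            continue  # skip WARNING and status lines
        message = line[len("ERROR: "):]
        # Mask Ensembl gene IDs so all invalid-feature errors collapse into one key.
        message = re.sub(r"'ENSG\d+'", "'<gene>'", message)
        # Mask the quoted name that directly follows the word "column"/"Column".
        message = re.sub(r"(column|Column) '[^']+'", r"\1 '<column>'", message)
        counts[message] += 1
    return counts


summary = summarize_errors(log)
for message, n in summary.most_common():
    print(f"{n:>4}  {message}")
```

Run against the full log, this would group the output into three families: reserved column names, invalid feature IDs in `var` and `raw.var`, and missing `obs` columns.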
Validate and curate metadata
Validate the AnnData object:
try:
    annotate = Annotate(adata)
except Exception as e:
    print(e)
Columns {'tissue_type', 'development_stage', 'self_reported_ethnicity', 'suspension_type', 'donor_id', 'organism', 'disease'} are not found in the data object!
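The message above amounts to a set difference between the expected column names and the columns present in `adata.obs`. A minimal sketch in plain Python, assuming the expected names are the eleven keys that `annotate.categoricals` reports later in this guide; the variable names are illustrative and this is not how the library itself is implemented:

```python
# Columns present in adata.obs, taken from the AnnData repr above.
obs_columns = ["donor", "tissue", "cell_type", "assay", "sex_ontology_term_id"]

# Expected column names (assumption: the keys listed by annotate.categoricals).
expected_columns = {
    "assay",
    "cell_type",
    "development_stage",
    "disease",
    "donor_id",
    "self_reported_ethnicity",
    "sex_ontology_term_id",
    "suspension_type",
    "tissue",
    "tissue_type",
    "organism",
}

# Expected but absent: these trigger the error.
missing = sorted(expected_columns - set(obs_columns))
# Present but not expected: candidates for renaming.
unexpected = sorted(set(obs_columns) - expected_columns)

print("missing:", missing)
print("unexpected:", unexpected)
```

The single unexpected column, `donor`, sits next to the missing `donor_id`, which suggests a rename rather than a default value.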
Let's rename the 'donor' column to 'donor_id':
adata.obs.rename(columns={"donor": "donor_id"}, inplace=True)
For the remaining missing columns, we can pass the default values suggested by CELLxGENE:
CellxGeneFields.OBS_FIELD_DEFAULTS
{'disease': 'normal',
'development_stage': 'unknown',
'self_reported_ethnicity': 'unknown',
'suspension_type': 'cell',
'donor_id': 'na',
'tissue_type': 'tissue',
'cell_type': 'native_cell',
'sex': 'unknown'}
annotate = Annotate(adata, organism="human", **CellxGeneFields.OBS_FIELD_DEFAULTS)
💡 added defaults to the AnnData object: {'organism': 'human', 'disease': 'normal', 'development_stage': 'unknown', 'self_reported_ethnicity': 'unknown', 'suspension_type': 'cell', 'tissue_type': 'tissue'}
✅ added 1 record with Feature.name for columns: ['sex_ontology_term_id']
✅ added 10 records from laminlabs/cellxgene with Feature.name for columns: ['assay', 'cell_type', 'development_stage', 'disease', 'donor_id', 'self_reported_ethnicity', 'tissue', 'organism', 'tissue_type', 'suspension_type']
✅ added 123 records from laminlabs/cellxgene with Gene.ensembl_gene_id for var_index: ['ENSG00000112096', 'ENSG00000182230', 'ENSG00000203812', 'ENSG00000204092', 'ENSG00000215271', 'ENSG00000221995', 'ENSG00000224739', 'ENSG00000224745', 'ENSG00000225932', 'ENSG00000226377', 'ENSG00000226380', 'ENSG00000226403', 'ENSG00000227021', 'ENSG00000227220', 'ENSG00000227902', 'ENSG00000228139', 'ENSG00000228906', 'ENSG00000229352', 'ENSG00000231575', 'ENSG00000232196', 'ENSG00000232295', 'ENSG00000233776', 'ENSG00000236166', 'ENSG00000236673', 'ENSG00000236740', 'ENSG00000236886', 'ENSG00000236996', 'ENSG00000237133', 'ENSG00000237513', 'ENSG00000237548', 'ENSG00000237838', 'ENSG00000239446', 'ENSG00000239467', 'ENSG00000239665', 'ENSG00000244693', 'ENSG00000244952', 'ENSG00000249860', 'ENSG00000251044', 'ENSG00000253878', 'ENSG00000254561', 'ENSG00000254740', 'ENSG00000255823', 'ENSG00000256045', 'ENSG00000256222', 'ENSG00000256374', 'ENSG00000256427', 'ENSG00000256618', 'ENSG00000256892', 'ENSG00000258414', 'ENSG00000258808', 'ENSG00000258861', 'ENSG00000259444', 'ENSG00000259820', 'ENSG00000259834', 'ENSG00000259855', 'ENSG00000260461', 'ENSG00000261068', 'ENSG00000261438', 'ENSG00000261490', 'ENSG00000261534', 'ENSG00000261737', 'ENSG00000261773', 'ENSG00000262668', 'ENSG00000263464', 'ENSG00000267637', 'ENSG00000268955', 'ENSG00000269028', 'ENSG00000269900', 'ENSG00000269933', 'ENSG00000270188', 'ENSG00000270394', 'ENSG00000270672', 'ENSG00000271409', 'ENSG00000271734', 'ENSG00000271870', 'ENSG00000272040', 'ENSG00000272196', 'ENSG00000272267', 'ENSG00000272354', 'ENSG00000272370', 'ENSG00000272551', 'ENSG00000272567', 'ENSG00000272880', 'ENSG00000273301', 'ENSG00000273370', 'ENSG00000273496', 'ENSG00000273554', 'ENSG00000273576', 'ENSG00000273837', 'ENSG00000273888', 'ENSG00000273923', 'ENSG00000274175', 'ENSG00000274792', 'ENSG00000275249', 'ENSG00000275869', 'ENSG00000276017', 'ENSG00000276814', 'ENSG00000277050', 'ENSG00000277196', 'ENSG00000277352', 'ENSG00000277666', 'ENSG00000277761', 'ENSG00000277836', 'ENSG00000278198', 'ENSG00000278633', 'ENSG00000278782', 'ENSG00000278817', 'ENSG00000278927', 'ENSG00000278955', 'ENSG00000280095', 'ENSG00000280374', 'ENSG00000280710', 'ENSG00000282080', 'ENSG00000282965', 'ENSG00000285106', 'ENSG00000285162', 'ENSG00000286228', 'ENSG00000286601', 'ENSG00000286699', 'ENSG00000286949', 'ENSG00000286996', 'ENSG00000287116', 'ENSG00000287388']
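Only six of the nine supplied defaults appear in the "added defaults" message: `donor_id`, `cell_type` and `sex` were passed but not added. One rule that reproduces this outcome is sketched below. It assumes a default is skipped when the column, or its `<name>_ontology_term_id` counterpart, already exists in `obs`. That rule is inferred from the output in this guide and is not confirmed against the library's source; `defaults_to_add` is a hypothetical helper.

```python
# Columns in adata.obs after renaming 'donor' to 'donor_id'.
obs_columns = ["donor_id", "tissue", "cell_type", "assay", "sex_ontology_term_id"]

# organism="human" plus the values of CellxGeneFields.OBS_FIELD_DEFAULTS shown above.
defaults = {
    "organism": "human",
    "disease": "normal",
    "development_stage": "unknown",
    "self_reported_ethnicity": "unknown",
    "suspension_type": "cell",
    "donor_id": "na",
    "tissue_type": "tissue",
    "cell_type": "native_cell",
    "sex": "unknown",
}


def defaults_to_add(columns, defaults):
    """Return the defaults whose column is not yet covered (assumed rule, see lead-in)."""
    present = set(columns)
    return {
        name: value
        for name, value in defaults.items()
        # Skip a default if the column itself or its ontology-ID variant exists.
        if name not in present and f"{name}_ontology_term_id" not in present
    }


added = defaults_to_add(obs_columns, defaults)
print(added)
```

Under this assumption, existing annotations are never overwritten: `donor_id` and `cell_type` keep their values, and `sex` is skipped because `sex_ontology_term_id` is already present.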
annotate.categoricals
{'assay': FieldAttr(ExperimentalFactor.name),
'cell_type': FieldAttr(CellType.name),
'development_stage': FieldAttr(DevelopmentalStage.name),
'disease': FieldAttr(Disease.name),
'donor_id': FieldAttr(ULabel.name),
'self_reported_ethnicity': FieldAttr(Ethnicity.name),
'sex_ontology_term_id': FieldAttr(Phenotype.ontology_id),
'suspension_type': FieldAttr(ULabel.name),
'tissue': FieldAttr(Tissue.name),
'tissue_type': FieldAttr(ULabel.name),
'organism': FieldAttr(Organism.name)}
validated = annotate.validate()
💡 validating metadata using registries of instance laminlabs/cellxgene
✅ var_index is validated against Gene.ensembl_gene_id
💡 mapping assay on ExperimentalFactor.name
❗ found 3 terms validated terms: ["10x 3' v3", "10x 5' v2", "10x 5' v1"]
    → save terms via .add_validated_from('assay')
✅ assay is validated against ExperimentalFactor.name
✅ cell_type is validated against CellType.name
💡 mapping development_stage on DevelopmentalStage.name
❗ found 1 terms validated terms: ['unknown']
    → save terms via .add_validated_from('development_stage')
✅ development_stage is validated against DevelopmentalStage.name
💡 mapping disease on Disease.name
❗ found 1 terms validated terms: ['normal']
    → save terms via .add_validated_from('disease')
✅ disease is validated against Disease.name
💡 mapping donor_id on ULabel.name
❗ 12 terms are not validated: 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1'
    → save terms via .add_new_from('donor_id')
💡 mapping self_reported_ethnicity on Ethnicity.name
❗ found 1 terms validated terms: ['unknown']
    → save terms via .add_validated_from('self_reported_ethnicity')
✅ self_reported_ethnicity is validated against Ethnicity.name
💡 mapping sex_ontology_term_id on Phenotype.ontology_id
❗ found 1 terms validated terms: ['PATO:0000384']
    → save terms via .add_validated_from('sex_ontology_term_id')
✅ sex_ontology_term_id is validated against Phenotype.ontology_id
💡 mapping suspension_type on ULabel.name
❗ found 1 terms validated terms: ['cell']
    → save terms via .add_validated_from('suspension_type')
✅ suspension_type is validated against ULabel.name
💡 mapping tissue on Tissue.name
❗ found 16 terms validated terms: ['blood', 'thoracic lymph node', 'spleen', 'mesenteric lymph node', 'lamina propria', 'liver', 'jejunal epithelium', 'omentum', 'bone marrow', 'ileum', 'caecum', 'thymus', 'skeletal muscle tissue', 'duodenum', 'sigmoid colon', 'transverse colon']
    → save terms via .add_validated_from('tissue')
❗ 1 terms is not validated: 'lungg'
    → save terms via .add_new_from('tissue')
💡 mapping tissue_type on ULabel.name
❗ found 1 terms validated terms: ['tissue']
    → save terms via .add_validated_from('tissue_type')
✅ tissue_type is validated against ULabel.name
✅ organism is validated against Organism.name
validated
False
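Conceptually, each column is checked by splitting its values into those found in a registry and those that are not. The overall result is `True` only when nothing is left over. The sketch below illustrates that idea with plain Python lists. It is not the library's implementation, and `split_validated` and the shortened term lists are hypothetical.

```python
def split_validated(values, registry_names):
    """Split values into (validated, not_validated) against a set of known names."""
    registry = set(registry_names)
    validated = [v for v in values if v in registry]
    not_validated = [v for v in values if v not in registry]
    return validated, not_validated


# A shortened stand-in for the Tissue registry (illustrative only).
known_tissues = ["blood", "spleen", "liver", "lung", "thymus"]
# A few values as they might occur in adata.obs["tissue"], including the typo.
observed = ["blood", "spleen", "lungg", "liver"]

validated, not_validated = split_validated(observed, known_tissues)
is_valid = not not_validated  # valid only if no term is left unvalidated

print("validated:", validated)
print("not validated:", not_validated)
print("is_valid:", is_valid)
```

In the run above, `donor_id` and `tissue` each leave non-validated terms behind, which is why `validated` is `False`.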
Register new metadata labels
Follow the suggestions above to register the genes and labels that aren't present in the current instance.
(Note that our instance is rather empty. Once you have filled up the registries, registering new labels won't be needed as often.)
annotate.add_validated_from("all")
💡 saving labels for 'assay'
✅ added 3 records from laminlabs/cellxgene with ExperimentalFactor.name for assay: ["10x 5' v1", "10x 5' v2", "10x 3' v3"]
💡 saving labels for 'cell_type'
💡 saving labels for 'development_stage'
✅ added 1 record from laminlabs/cellxgene with DevelopmentalStage.name for development_stage: ['unknown']
💡 saving labels for 'disease'
✅ added 1 record from laminlabs/cellxgene with Disease.name for disease: ['normal']
💡 saving labels for 'donor_id'
❗ 12 non-validated categories are not saved in ULabel.name: ['D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1']!
    → to lookup categories, use lookup().donor_id
    → to save, run .add_new_from('donor_id')
💡 saving labels for 'self_reported_ethnicity'
✅ added 1 record from laminlabs/cellxgene with Ethnicity.name for self_reported_ethnicity: ['unknown']
💡 saving labels for 'sex_ontology_term_id'
✅ added 1 record from laminlabs/cellxgene with Phenotype.ontology_id for sex_ontology_term_id: ['PATO:0000384']
💡 saving labels for 'suspension_type'
✅ added 1 record from laminlabs/cellxgene with ULabel.name for suspension_type: ['cell']
💡 saving labels for 'tissue'
❗ 1 non-validated categories are not saved in Tissue.name: ['lungg']!
    → to lookup categories, use lookup().tissue
    → to save, run .add_new_from('tissue')
✅ added 16 records from laminlabs/cellxgene with Tissue.name for tissue: ['spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria', 'mesenteric lymph node', 'caecum', 'omentum', 'blood', 'ileum', 'thoracic lymph node']
💡 saving labels for 'tissue_type'
✅ added 1 record from laminlabs/cellxgene with ULabel.name for tissue_type: ['tissue']
💡 saving labels for 'organism'
For donors, we register the new labels:
annotate.add_new_from("donor_id")
✅ added 12 records with ULabel.name for donor_id: ['D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1']
The tissue label "lungg" was flagged as not validated. It is a typo and should be "lung". Let's fix it:
tissues = annotate.lookup().tissue
# using a lookup object to find the correct term
tissues.lung
Tissue(updated_at=2024-01-08 15:22:49 UTC, uid='7Tt4iEKc', name='lung', ontology_id='UBERON:0002048', synonyms='pulmo', description='Respiration Organ That Develops As An Outpocketing Of The Esophagus.', created_by_id=1, public_source_id=47)
adata.obs["tissue"] = adata.obs["tissue"].cat.rename_categories(
    {"lungg": tissues.lung.name}
)
annotate.add_validated_from("tissue")
✅ added 1 record from laminlabs/cellxgene with Tissue.name for tissue: ['lung']
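When a non-validated term looks like a typo, a candidate correction can also be found programmatically instead of by browsing the lookup object. The following sketch uses `difflib` from the Python standard library against a short, illustrative list of tissue names. The `suggest` helper and the 0.8 cutoff are assumptions made for this example and are not part of cellxgene_lamin.

```python
import difflib

# A shortened stand-in for the names in the Tissue registry (illustrative only).
known_tissues = ["lung", "liver", "blood", "spleen", "thymus", "ileum"]


def suggest(term, candidates, n=3, cutoff=0.8):
    """Return up to n registry names that closely resemble the given term."""
    return difflib.get_close_matches(term, candidates, n=n, cutoff=cutoff)


suggestions = suggest("lungg", known_tissues)
print(suggestions)

# A donor identifier has no close match among tissue names and yields no suggestion.
print(suggest("D496-1", known_tissues))
```

A fairly strict cutoff keeps genuinely new labels, such as the donor IDs, from being mapped to unrelated registry terms. Any suggestion should still be reviewed before renaming categories.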
Let's validate the object again:
validated = annotate.validate()
💡 validating metadata using registries of instance laminlabs/cellxgene
✅ var_index is validated against Gene.ensembl_gene_id
✅ assay is validated against ExperimentalFactor.name
✅ cell_type is validated against CellType.name
✅ development_stage is validated against DevelopmentalStage.name
✅ disease is validated against Disease.name
✅ donor_id is validated against ULabel.name
✅ self_reported_ethnicity is validated against Ethnicity.name
✅ sex_ontology_term_id is validated against Phenotype.ontology_id
✅ suspension_type is validated against ULabel.name
✅ tissue is validated against Tissue.name
✅ tissue_type is validated against ULabel.name
✅ organism is validated against Organism.name
validated
True
adata.obs.head()
| | donor_id | tissue | cell_type | assay | sex_ontology_term_id | organism | disease | development_stage | self_reported_ethnicity | suspension_type | tissue_type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CZINY-0109_CTGGTCTAGTCTGTAC | D496-1 | blood | classical monocyte | 10x 3' v3 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| CZI-IA10244332+CZI-IA10244434_CCTTCGACATACTCTT | 621B-1 | thoracic lymph node | T follicular helper cell | 10x 5' v2 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| Pan_T7935491_CTGGTCTGTACATGTC | A29-1 | spleen | memory B cell | 10x 5' v1 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| Pan_T7980367_GGGCATCCAGGTGGAT | A36-1 | lung | alveolar macrophage | 10x 5' v1 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| Pan_T7935494_ATCATGGTCTACCTGC | A29-1 | mesenteric lymph node | naive thymus-derived CD4-positive, alpha-beta ... | 10x 5' v1 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
Register file
Now we are ready to register the artifact to the working instance:
# track the current notebook
ln.transform.stem_uid = "WOK3vP0bNGLx"
ln.transform.version = "0"
ln.track()
💡 notebook imports: cellxgene_lamin==0.2.1 lamindb==0.72a1 lnschema_bionty==0.42.0
💡 saved: Transform(version='0', uid='WOK3vP0bNGLx6K79', name='Annotate an h5ad file based on CELLxGENE schema', key='cellxgene-annotate', type='notebook', updated_at=2024-05-19 21:59:50 UTC, created_by_id=1)
💡 saved: Run(uid='xuwb7Y45ZYLKrK41NLTt', transform_id=1, created_by_id=1)
💡 tracked pip freeze > /home/runner/.cache/lamindb/run_env_pip_xuwb7Y45ZYLKrK41NLTt.txt
# this will modify the AnnData object by adding required columns and categories
artifact = annotate.save_artifact(description="test h5ad file")
💡 path content will be copied to default storage upon `save()` with key `None` ('.lamindb/o4MFKKv0yJwnXpSn16cY.h5ad')
✅ storing artifact 'o4MFKKv0yJwnXpSn16cY' at '/home/runner/work/cellxgene-lamin/cellxgene-lamin/docs/test-cellxgene-annotate/.lamindb/o4MFKKv0yJwnXpSn16cY.h5ad'
💡 parsing feature names of X stored in slot 'var'
✅ 36503 terms (100.00%) are validated for ensembl_gene_id
✅ linked: FeatureSet(uid='BPRiNXBxXgf29wPegB3r', n=36503, dtype='float', registry='bionty.Gene', hash='xtVNbbhs3ty63qs-rwKZ', created_by_id=1)
💡 parsing feature names of slot 'obs'
✅ 11 terms (100.00%) are validated for name
✅ linked: FeatureSet(uid='1Dr3JoKqg2BBlOP5ajQl', n=11, registry='Feature', hash='5EjZAKLhWtufR2roYcq1', created_by_id=1)
✅ saved 2 feature sets for slots: 'var','obs'
✅ linked feature 'sex_ontology_term_id' to registry 'bionty.Phenotype'
View the registered artifact with metadata:
artifact.describe()
Artifact(updated_at=2024-05-19 21:59:54 UTC, uid='o4MFKKv0yJwnXpSn16cY', suffix='.h5ad', accessor='AnnData', description='test h5ad file', size=54727155, hash='5esmrdu-DFv9nKyK4ZFA0G', hash_type='sha1-fl', n_observations=1626, visibility=1, key_is_virtual=True)
Provenance:
created_by: User(uid='DzTjkKse', handle='testuser1', name='Test User1')
storage: uid='M5FiaJrAnyJj', root='/home/runner/work/cellxgene-lamin/cellxgene-lamin/docs/test-cellxgene-annotate', type='local', instance_uid='1Dd1nk1DP8Uy')
transform: Transform(version='0', uid='WOK3vP0bNGLx6K79', name='Annotate an h5ad file based on CELLxGENE schema', key='cellxgene-annotate', type='notebook')
run: Run(uid='xuwb7Y45ZYLKrK41NLTt', started_at=2024-05-19 21:59:50 UTC, is_consecutive=True)
Features:
var: FeatureSet(uid='BPRiNXBxXgf29wPegB3r', n=36503, dtype='float', registry='bionty.Gene')
'MIR1302-2HG', 'FAM138A', 'OR4F5', 'None', 'OR4F29', 'OR4F16', 'LINC01409', 'FAM87B', 'LINC01128', 'LINC00115', 'FAM41C', 'LINC02593', 'SAMD11', 'NOC2L', 'KLHL17', 'PLEKHN1', 'PERM1', 'HES4'
obs: FeatureSet(uid='1Dr3JoKqg2BBlOP5ajQl', n=11, registry='Feature')
assay (11, cat[bionty.ExperimentalFactor]): '10x 5' v1', '10x 5' v2', '10x 3' v3'
cell_type (11, cat[bionty.CellType]): 'classical monocyte', 'T follicular helper cell', 'memory B cell', 'alveolar macrophage', 'naive thymus-derived CD4-positive, alpha-beta T cell', 'effector memory CD8-positive, alpha-beta T cell, terminally differentiated', 'alpha-beta T cell', 'CD4-positive helper T cell', 'naive thymus-derived CD8-positive, alpha-beta T cell', 'macrophage'
development_stage (11, cat[bionty.DevelopmentalStage]): 'unknown'
disease (11, cat[bionty.Disease]): 'normal'
donor_id (11, cat[ULabel]): 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1'
self_reported_ethnicity (11, cat[bionty.Ethnicity]): 'unknown'
tissue (11, cat[bionty.Tissue]): 'spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria'
organism (11, cat[bionty.Organism]): 'human'
tissue_type (11, cat[ULabel]): 'tissue'
suspension_type (11, cat[ULabel]): 'cell'
sex_ontology_term_id (11, cat[bionty.Phenotype]): 'male'
Labels:
organisms (1, bionty.Organism): 'human'
tissues (17, bionty.Tissue): 'spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria'
cell_types (31, bionty.CellType): 'classical monocyte', 'T follicular helper cell', 'memory B cell', 'alveolar macrophage', 'naive thymus-derived CD4-positive, alpha-beta T cell', 'effector memory CD8-positive, alpha-beta T cell, terminally differentiated', 'alpha-beta T cell', 'CD4-positive helper T cell', 'naive thymus-derived CD8-positive, alpha-beta T cell', 'macrophage'
diseases (1, bionty.Disease): 'normal'
phenotypes (1, bionty.Phenotype): 'male'
experimental_factors (3, bionty.ExperimentalFactor): '10x 5' v1', '10x 5' v2', '10x 3' v3'
developmental_stages (1, bionty.DevelopmentalStage): 'unknown'
ethnicities (1, bionty.Ethnicity): 'unknown'
ulabels (14, ULabel): 'cell', 'tissue', 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1'
Register collection
Register a new collection for the registered artifact:
# register a new collection
collection = annotate.save_collection(
    artifact,  # registered artifact above, can also pass a list of artifacts
    name=(  # title of the publication
        "Cross-tissue immune cell analysis reveals tissue-specific features in humans"
        " (for test demo only)"
    ),
    description="10.1126/science.abl5197",  # DOI of the publication
    reference="E-MTAB-11536",  # accession number (e.g. GSE#, E-MTAB#, etc.)
    reference_type="ArrayExpress",  # source type (e.g. GEO, ArrayExpress, SRA, etc.)
)
loaded: FeatureSet(uid='BPRiNXBxXgf29wPegB3r', n=36503, dtype='float', registry='bionty.Gene', hash='xtVNbbhs3ty63qs-rwKZ', created_by_id=1)
loaded: FeatureSet(uid='1Dr3JoKqg2BBlOP5ajQl', n=11, registry='Feature', hash='5EjZAKLhWtufR2roYcq1', created_by_id=1)
collection.artifact
Return an input h5ad file for cellxgene-schema
adata_cxg = annotate.to_cellxgene(is_primary_data=True)
adata_cxg
AnnData object with n_obs × n_vars = 1626 × 36503
obs: 'donor_id', 'sex_ontology_term_id', 'suspension_type', 'tissue_type', 'tissue_ontology_term_id', 'cell_type_ontology_term_id', 'assay_ontology_term_id', 'organism_ontology_term_id', 'disease_ontology_term_id', 'development_stage_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'is_primary_data'
var: 'feature_is_filtered'
uns: 'default_embedding', 'title', 'cxg_lamin_schema_reference', 'cxg_lamin_schema_version'
obsm: 'X_umap'
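Before running the validator, it can help to confirm that the converted object carries the obs columns you expect. The sketch below is only an illustration: the expected names are copied from the `adata_cxg` output above rather than from the CELLxGENE schema itself, and `missing_obs_columns` is a hypothetical helper, not part of `cellxgene_lamin`.

```python
# Expected obs columns, copied from the `adata_cxg` output above.
# Assumption: this list is illustrative; the CELLxGENE schema is the authority.
EXPECTED_OBS_COLUMNS = [
    "donor_id",
    "sex_ontology_term_id",
    "suspension_type",
    "tissue_type",
    "tissue_ontology_term_id",
    "cell_type_ontology_term_id",
    "assay_ontology_term_id",
    "organism_ontology_term_id",
    "disease_ontology_term_id",
    "development_stage_ontology_term_id",
    "self_reported_ethnicity_ontology_term_id",
    "is_primary_data",
]


def missing_obs_columns(obs_columns, expected=EXPECTED_OBS_COLUMNS):
    """Return the expected column names that are absent from `obs_columns`."""
    present = set(obs_columns)
    return [name for name in expected if name not in present]


# obs columns of the uncurated object shown at the start of this guide
original_columns = ["donor", "tissue", "cell_type", "assay", "sex_ontology_term_id"]
print(missing_obs_columns(original_columns))
```

With a real object, `missing_obs_columns(adata_cxg.obs.columns)` should return an empty list when every expected column is present.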
adata_cxg.write_h5ad("anndata_human_immune_cells_cxg.h5ad")
!cellxgene-schema validate anndata_human_immune_cells_cxg.h5ad
Loading dependencies
Loading validator modules
Starting validation...
WARNING: Validation of raw layer was not performed due to current errors, try again after fixing current errors.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'raw.var'.
Validation complete in 0:00:03.106616 with status is_valid=False
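The feature ID errors above all follow one pattern, so the offending IDs can be collected programmatically for inspection. The following is a minimal sketch, assuming the validator's log format stays exactly as printed above; `collect_invalid_feature_ids` and `sample_log` are hypothetical names introduced here, and the sample contains only a few lines copied from the output.

```python
import re
from collections import defaultdict

# Matches lines such as:
# ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
INVALID_ID_PATTERN = re.compile(
    r"^ERROR: '(?P<feature_id>[^']+)' is not a valid feature ID in '(?P<slot>[^']+)'\.$"
)


def collect_invalid_feature_ids(log_text):
    """Group the invalid feature IDs found in a validator log by slot name."""
    invalid = defaultdict(list)
    for line in log_text.splitlines():
        match = INVALID_ID_PATTERN.match(line.strip())
        if match:
            invalid[match["slot"]].append(match["feature_id"])
    return dict(invalid)


# A few lines copied from the validator output above (not the full log).
sample_log = """\
Starting validation...
WARNING: Validation of raw layer was not performed due to current errors, try again after fixing current errors.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
"""

invalid_ids = collect_invalid_feature_ids(sample_log)
print(invalid_ids)
```

Whether to drop or remap the reported IDs is a curation decision that this guide does not prescribe; the grouping only makes the list easier to review.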