Data Archiving & Permissions

Policies, best practices, and resources for responsibly sharing cancer genomics and biomarker data with the Journal of Cancer Genetics and Biomarkers (JCGB).

Championing Open, Ethical, and Reusable Data

JCGB is committed to accelerating precision oncology through transparent, FAIR-compliant data sharing. This page outlines how to prepare, deposit, and license the datasets, code, and metadata that underpin your manuscript. Following these guidelines maximises reproducibility, fosters collaboration, and honours the trust patients and communities place in biomedical research.

82% of JCGB articles provide direct repository links

68% of submissions include open-source code packages

5 days on average from data submission to repository approval when prepared correctly

Zero tolerance for data misuse or privacy breaches

1. JCGB Data Sharing Principles

Transparency: Readers must be able to trace results back to raw or processed data and analytical code.
Reusability: Data should be deposited with clear licenses, metadata, and formats that enable integration and re-analysis.
Respect: Sensitive datasets require governance aligned with participant consent, community expectations, and legal frameworks.
Longevity: Use stable repositories that offer persistent identifiers (DOIs, accession numbers) and long-term preservation guarantees.

2. Selecting the Right Repository

Choose repositories according to data type, access requirements, and compliance mandates.

Genomic & Transcriptomic Data

European Genome-phenome Archive (EGA)
NCBI Sequence Read Archive (SRA)
Gene Expression Omnibus (GEO)
dbGaP for controlled-access human genetics

Proteomics & Metabolomics

PRIDE (Proteomics Identifications Database)
MassIVE
Metabolomics Workbench

Imaging & Radiomics

The Cancer Imaging Archive (TCIA)
BioImage Archive
PhysioNet for physiological signals

Clinical & Real-World Data

Vivli or Project Data Sphere for clinical trials
HealthData.gov for U.S. public datasets
Institutional data enclaves with controlled access

Multi-Omics & Integrative Studies

Zenodo or Figshare for curated multi-modal packages
Synapse for collaborative consortia projects
Open Science Framework (OSF) for protocol registries

Code & Computational Pipelines

GitHub/GitLab with version tagging
Zenodo integration to mint DOIs
Dockstore for containerised workflows

Need guidance? Contact [email protected] with your dataset description. JCGB’s data editors can recommend repositories, licensing options, and metadata standards tailored to your project.

3. Preparing Data for Deposit

3.1 Documentation Checklist

Study overview (objectives, design, cohort characteristics)
Data dictionary describing variables, units, and codes
File structure map indicating directories, naming conventions, and dependencies
Detailed methods or laboratory protocols (include reagents, instruments, software)
Quality control procedures and filtering rules
Version history documenting updates or corrections

3.2 Metadata Standards

Use community schemas (MIAME for microarrays, MINSEQE for sequencing experiments such as RNA-seq, HUPO-PSI standards such as MIAPE for proteomics).
Include controlled vocabulary terms (MeSH, HPO, ICD-10) for disease classification.
Provide sample-level metadata (age, sex, ethnicity, tumour type, staging) with anonymisation.
Document data processing pipelines, software versions, and parameter settings.
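One way to document software versions and parameter settings is to write them to a small machine-readable record stored alongside the data. The following is a minimal sketch, not a JCGB requirement: the function name `build_provenance`, the package list, and the `min_read_depth` parameter are all hypothetical placeholders for your own pipeline.

```python
import json
import platform
from importlib import metadata


def build_provenance(packages, parameters):
    """Collect interpreter, platform, package versions, and pipeline parameters."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the gap explicitly rather than failing silently.
            versions[name] = "not installed"
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
        "parameters": parameters,
    }


if __name__ == "__main__":
    record = build_provenance(
        packages=["numpy", "pandas"],        # hypothetical dependency list
        parameters={"min_read_depth": 20},   # hypothetical pipeline setting
    )
    print(json.dumps(record, indent=2, sort_keys=True))
```

Saving the printed JSON next to the deposited files gives reviewers a fixed reference for the environment used in the analysis.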

4. Licensing & Permissions

JCGB encourages open licenses while recognising the need for controlled access in sensitive contexts.

License	Use Case	Notes
CC0 / Public Domain	Non-identifiable datasets, benchmark resources	Maximises reuse; attribution is not legally required, but citing the original creators remains good scholarly practice
CC BY 4.0	General-purpose sharing with attribution	Recommended for most JCGB datasets
CC BY-NC	Restrict commercial use of sensitive datasets	Ensure “non-commercial” aligns with funder policies
DUO (Data Use Ontology)	Controlled-access human genomic data	Not a license itself; machine-readable terms that specify consent-based restrictions (e.g., disease-specific research)

5. Handling Sensitive & Restricted Data

Controlled Access: Deposit in repositories offering access committees (EGA, dbGaP, controlled Synapse projects). Provide Data Use Agreements (DUAs).
De-identification: Remove direct identifiers, convert dates to offsets relative to diagnosis, and aggregate geolocation data.
Genomic Sovereignty: For indigenous or marginalised communities, align with community-specific governance, benefit-sharing agreements, and indigenous data frameworks (CARE principles).
Third-Party Data: Obtain permissions from original custodians; document agreement terms in the manuscript.
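The de-identification steps above (removing direct identifiers, converting dates to offsets from diagnosis, and coarsening location) can be sketched as follows. This is an illustration under stated assumptions only: the field names (`name`, `mrn`, `postcode`, `sample_date`, and so on) are hypothetical, truncating a postcode to two characters is just one possible aggregation, and genomic data generally needs safeguards beyond field-level scrubbing.

```python
from datetime import date

# Hypothetical field names; adapt to your own data dictionary.
DIRECT_IDENTIFIERS = {"name", "mrn", "email", "address"}
DATE_FIELDS = {"sample_date", "last_followup"}


def deidentify(record):
    """Return a copy of `record` with identifiers removed and dates made relative."""
    diagnosis = date.fromisoformat(record["diagnosis_date"])
    cleaned = {}
    for key, value in record.items():
        if key in DIRECT_IDENTIFIERS or key == "diagnosis_date":
            continue  # drop direct identifiers and the absolute anchor date
        if key in DATE_FIELDS:
            # Replace the calendar date with days elapsed since diagnosis.
            offset = (date.fromisoformat(value) - diagnosis).days
            cleaned[f"{key}_days_from_diagnosis"] = offset
        elif key == "postcode":
            # Aggregate geolocation to a coarse region (assumed rule).
            cleaned["region"] = value[:2].upper()
        else:
            cleaned[key] = value
    return cleaned


if __name__ == "__main__":
    example = {
        "name": "Jane Doe",
        "mrn": "12345",
        "diagnosis_date": "2021-03-10",
        "sample_date": "2021-03-24",
        "postcode": "SW1A 1AA",
        "tumour_type": "NSCLC",
    }
    print(deidentify(example))
```

Keeping the offset anchored to diagnosis preserves the intervals needed for survival or treatment-timing analyses while removing calendar dates from the shared file.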

Patient privacy: JCGB will decline submissions lacking evidence of ethical approvals, consent for data sharing, or appropriate de-identification protocols.

6. Data Availability Statements

Include a dedicated section in the manuscript after acknowledgments. Examples:

“Whole-genome sequencing data are available in the European Genome-phenome Archive (EGA) under accession EGAS00001008721. Access requests can be submitted through the EGA Data Access Committee.”
“Metabolomic peak tables associated with this study are deposited in Metabolomics Workbench (ST002345) under a CC BY 4.0 license.”
“Due to participant confidentiality agreements, de-identified clinical data are available upon request to the corresponding author and subject to institutional review.”

7. Code & Workflow Sharing

Host code in public repositories with version control.
Tag releases corresponding to the manuscript and archive them via Zenodo for DOI generation.
Provide README files with setup instructions, dependencies, and example data.
Use containerisation (Docker, Singularity) or workflow description languages (CWL, Nextflow, WDL) for complex pipelines.
Indicate licensing (MIT, Apache 2.0, GPL) to clarify reuse rights.

8. Managing Large Files & Complex Data

Compress large datasets (tar.gz) and split into manageable segments (≤5 GB) where repository limits apply.
Provide checksum files (MD5, SHA256) to verify integrity.
For extremely large data (petabyte-scale imaging), coordinate with JCGB to arrange cloud-based sharing or institutional hosting.
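Splitting an archive into segments and publishing checksums can be done with the Python standard library alone. The sketch below is one possible approach rather than a prescribed tool: the function names, the `.partNNN` naming scheme, and the manifest filename in the usage note are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, block_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file without loading it into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def split_file(path, max_bytes, block_size=1024 * 1024):
    """Split `path` into numbered parts of at most `max_bytes`; return the part paths."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    path = Path(path)
    parts = []
    with open(path, "rb") as source:
        while True:
            remaining = max_bytes
            block = source.read(min(block_size, remaining))
            if not block:
                break  # source exhausted, no further part needed
            part = path.with_name(f"{path.name}.part{len(parts):03d}")
            with open(part, "wb") as target:
                while block:
                    target.write(block)
                    remaining -= len(block)
                    if remaining == 0:
                        break  # this part is full
                    block = source.read(min(block_size, remaining))
            parts.append(part)
    return parts


def write_manifest(paths, manifest_path):
    """Write one '<digest>  <filename>' line per file."""
    lines = [f"{sha256_of(p)}  {Path(p).name}" for p in paths]
    Path(manifest_path).write_text("\n".join(lines) + "\n")
```

For example, `split_file("cohort.tar.gz", max_bytes=5 * 1024**3)` followed by `write_manifest(parts, "SHA256SUMS")` would produce segments of at most 5 GiB plus a manifest. The two-space line format is the one `sha256sum -c` reads, so recipients can verify downloads with standard tools; the parts are reassembled by concatenating them in order.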

9. Embargoes & Prepublication Sharing

JCGB permits repository embargoes aligned with journal publication. Specify the embargo end date in the repository metadata.
If presenting at conferences, ensure repository links honour embargo deadlines and note them in the cover letter.
JCGB supports preprints (bioRxiv, medRxiv). Link datasets upon preprint release for transparency.

10. Compliance & Verification

During peer review, JCGB may request reviewer access to datasets or code. Provide temporary credentials where necessary.
Post-acceptance, JCGB verifies repository accessibility, metadata completeness, and licensing.
Failure to provide verifiable data may delay publication or result in rejection.

11. Updating Data Post-Publication

Use repository versioning to issue updates or corrections.
Notify JCGB of significant changes so errata or addenda can be published.
Document version history in README files and repository metadata.

12. Resources & Toolkits

JCGB Submission Guide

Review manuscript submission steps and requirements.

Go to Submit Page

FAIR Principles

Understand frameworks for Findable, Accessible, Interoperable, Reusable data.

Visit GO FAIR

CARE Principles

Respect collective benefit, authority, responsibility, and ethics for indigenous data.

Explore CARE

Data Management Plan Templates

Download DMP templates aligned with NIH, Horizon Europe, and Wellcome requirements.

Access DMPTool

13. Frequently Asked Questions

What if my dataset exceeds repository limits?

Contact the repository support team for expansion options or use institutional cloud services. Inform JCGB so we can document the approach in your data availability statement.

Can I restrict commercial reuse?

Yes. Use licenses such as CC BY-NC when consistent with funder mandates. Clarify restrictions in metadata and the manuscript.

How do I handle multi-institutional permissions?

Coordinate Data Transfer Agreements (DTAs) among partner institutions early. Provide JCGB with copies or summaries in supplementary materials.

What about legacy data without consent for sharing?

Describe the limitations and provide summary statistics. Seek IRB guidance on re-consenting or de-identifying. JCGB will consider justified exceptions with clear explanations.

14. Data Governance Checklist

Complete this checklist before final submission to ensure your archiving plan meets JCGB’s standards and legal obligations:

Consent Alignment: Confirm that participant information sheets and consent forms explicitly cover the level of data sharing you intend (open, controlled, summary-only). Document any deviations and secure re-consent or waivers when necessary.
Institutional Approvals: Obtain a letter or email confirmation from your institutional data governance office or privacy board approving the external release of data. Store the approval in project records and reference it in the manuscript if required.
Data Transfer Agreements: Execute DTAs among collaborating institutions, specifying permitted uses, security obligations, breach response procedures, and data retention timelines.
Security Architecture: Encrypt datasets prior to upload, enforce multi-factor authentication, and maintain access logs for controlled repositories. For cloud-hosted data, comply with ISO 27001 or equivalent standards.
Retention & Destruction Policy: Define how long raw and processed data will be retained post-publication and outline procedures for secure destruction when retention periods end.
Community Engagement: For indigenous or community-led research, document engagement sessions, co-authorship agreements, and benefit-sharing commitments in alignment with the CARE principles.
Machine-Readable Metadata: Validate JSON, XML, or CSV metadata files against repository schemas. Ensure ontology tags, units, and identifiers are accurate and consistent across files.
Emergency Contacts: Provide repositories with a generic institutional email address (e.g., [email protected]) rather than a personal one, to ensure continuity if personnel change.
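For the machine-readable metadata item in the checklist above, a quick pre-submission consistency check can catch missing fields, wrong types, and duplicate identifiers before upload. The sketch below is a stand-in only: the required fields and their types are hypothetical, and it does not replace validation against the schema published by your chosen repository.

```python
import json

# Hypothetical required fields and types; replace with your repository's schema.
REQUIRED_FIELDS = {
    "sample_id": str,
    "tumour_type": str,
    "age_years": int,
    "icd10_code": str,
}


def validate_records(records, required=REQUIRED_FIELDS):
    """Return a list of human-readable problems found in sample-level metadata."""
    problems = []
    seen_ids = set()
    for index, record in enumerate(records):
        for field, expected in required.items():
            if field not in record:
                problems.append(f"record {index}: missing '{field}'")
            elif not isinstance(record[field], expected):
                problems.append(
                    f"record {index}: '{field}' should be {expected.__name__}"
                )
        sample_id = record.get("sample_id")
        if sample_id is not None:
            if sample_id in seen_ids:
                problems.append(f"record {index}: duplicate sample_id '{sample_id}'")
            seen_ids.add(sample_id)
    return problems


if __name__ == "__main__":
    text = """[
      {"sample_id": "S1", "tumour_type": "NSCLC", "age_years": 61, "icd10_code": "C34.1"},
      {"sample_id": "S1", "tumour_type": "NSCLC", "age_years": "58"}
    ]"""
    for problem in validate_records(json.loads(text)):
        print(problem)
```

An empty result means the records passed these basic checks. Ontology terms and units still need review against the controlled vocabularies named in your metadata.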

Need Help with Data Archiving?

Email [email protected] for repository recommendations, metadata templates, or compliance checklists. JCGB’s data editors are ready to support you.


Questions or clarifications? Reach us at [email protected]. Provide a brief description of your dataset, repository plans, and any consent considerations so we can deliver tailored advice.

Last updated: September 2025. JCGB reviews data policies annually to align with evolving legal, ethical, and technological standards.
