
@METASPACE2020

METASPACE training guide
OurCon’V imzML workshop, 2017

Theodore Alexandrov (EMBL, UCSD)
Andy Palmer (EMBL)
Vitaly Kovalev (EMBL)
Artem Tarasov (EMBL)

Contributors
Adam Pruska, Andreas Roempp, Annabelle Fülöp, Anne Mette Handler, Benedikt Geier,
Berin Boughton, Bernhard Spengler, Buck Achim, Carina Ramallo-Guevara, Charles
Pineau, Chris Anderton, Christian Janfelt, Christina Burr, Claire Carter, Corinna Henkel,
Cristina Gonzalez Lopez, Cristine Quiason, David Muddiman, Denis Sammour, Dhaka
Bhandari, Dinaiz Thinagaran, Dirk Hoelscher, Don Nguyen, Dušan Velickovic, Eike Ulrich
Brockmann, Emilia Sogin, Emrys Jones, Eric Weaver, Erin Gemperline, Guanshi Zhang,
Gus Grey, Heath Patterson, Hidenobu Miyazawa, József Pánczél, James Langridge,
James McKenzie, Jan-Hinrich Rabe, Janfelt Christian, Jens Soltwisch, Jialing Zhang,
Josephine Bunch, Julian Griffin, Julien Delecolle, Kaija Schaepe, Klaus Dreisewerd,
Konstantin Nagornov, Ksenija Radic, Kumar Sharma, Kyana Garza, Lavigne Regis,
Lennart Huizing, Liebeke Manuel, Lingjun Li, Livia Eberlin, Logan Mackay, Luca Rappez,
Marina Reuter, Mario Kompauer, Mark Bokhart, Marta Sans, Marty Paine, Mathieu
Gaudin, Maureen Kane, Max Müller, Michael Becker, Michael Linscheid, Mikhail Belov,
Na Sun, Neha Garg, Nicolas Desbenoit, Nicole Strittmatter, Oliver Lechtenfeld, Pegah
Khamehgir-Silz, Renata Soares, Richard Caprioli, Richard Goodwin, Rima Ait-Belkacem,
Ron Heeren, Samantha Walker, Sandra Schulz, Sarah Aboulmagd, Sergio Triana, Shane
Ellis, Sheerin Latham, Sophie Jacobsen, Spencer Thomas, Stefanie Gerbig, Steve
Castellino, Veronika Saharuka, Vitaly Kovalev, Yury Tsybin, Zoe Hall, Zoltan Takats
Welcome everyone!
Theodore Alexandrov, “leader”
Andy Palmer, “scientist”
Vitaly Kovalev, “developer”
Artem Tarasov, “hacker”
Training Overview
Part 1: Introduction
● Learning outcomes
● METASPACE project
● Bioinformatics for metabolite annotation
● Engine
● Knowledgebase
Part 2: Tutorial
● Learning outcomes
● Data requirements
● Data submission
● Annotation browsing
● Interpretation
Part 3: Export to imzML
● FTICR (Bruker)
● Orbitrap (Thermo)
● Other vendors
Introduction
What we hope you will learn today
● Ins and outs of metabolite annotation in HR imaging MS
● Bioinformatics we developed
○ Metabolite Signal Match (MSM) score
○ False Discovery Rate estimation
○ FDR-controlled annotation
● Our METASPACE platform
○ How to prepare data for submission to our service
○ How to submit your data
○ How to view molecular annotations in our webapp
Project overview: slides on slideshare
Bioinformatics: slides on slideshare
Overview of the annotation engine
Outline
● Input (data and metadata)
● Online Software
● Data Submission
● Annotation Browsing
● Use Cases
a. mouse brain, MALDI-FTICR (U Rennes 1)
b. human colorectal tumor, DESI-Orbitrap (ICL)
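The METASPACE platform
[Diagram: imzML data and metadata are uploaded to http://annotate.metaspace2020.eu; a task scheduler passes each dataset to the annotation engine (about 10 minutes); the resulting annotations are stored in a database and can then be explored. The platform runs on the Amazon Cloud.]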
Upload: Data
centroided imzML

http://ms-imaging.org/
Upload: Metadata
sample information
acquisition details
Upload: Interface annotate.metaspace2020.eu/#/upload

data upload

metadata collection
Example 1
Mouse brain
(MALDI-FTICR)

Select an annotation
See the molecular distribution

Data provided by
Regis Lavigne, Charles Pineau,
University of Rennes 1
Example 2
Human colorectal tumor
(DESI-Orbitrap)

Filter different datasets


See data details
View metadata

Data provided by
James McKenzie, Zoltan Takats,
Imperial College London
Tutorial on how to use the METASPACE engine and molecular knowledgebase
Learning Outcomes
1. Preparing data for submission
2. Submitting data
3. Browsing results
Data Requirements
Imaging mass spectrometry data
- Any ionisation source
- Any spatial resolution
- Any tissue
- One section per dataset
Data Requirements
High resolving power
- R (FWHM) at m/z 400 > 90,000
Well-calibrated
- mass error ideally < 3 ppm
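The two requirements above can be checked with simple arithmetic. The following is a minimal sketch, assuming the usual definitions (mass error in ppm relative to the theoretical m/z, and resolving power R = m / FWHM); the function names and the example values are hypothetical and not part of METASPACE.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million, relative to the theoretical m/z."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def fwhm_from_resolving_power(mz: float, resolving_power: float) -> float:
    """Peak width (FWHM, in m/z units) implied by R = m / FWHM."""
    return mz / resolving_power


# Hypothetical values, for illustration only.
error = ppm_error(measured_mz=400.0010, theoretical_mz=400.0000)
width = fwhm_from_resolving_power(mz=400.0, resolving_power=90_000)

print(f"mass error: {error:.2f} ppm (requirement: ideally below 3 ppm)")
print(f"FWHM at m/z 400 for R = 90K: {width:.4f}")
```

A peak that is 0.001 m/z units off at m/z 400 is already at 2.5 ppm, which shows how little room the calibration requirement leaves.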


Data Requirements
Data Format (http://imzml.org/wp/introduction/)
- imzML
- Centroided
  - vendor preferred
  - http://metaspace2020.eu/imzml
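Before uploading, it can be useful to confirm that an imzML file declares itself as centroided. The sketch below is not a METASPACE tool; it assumes only that the imzML header is XML and uses the PSI-MS controlled-vocabulary accessions MS:1000127 ("centroid spectrum") and MS:1000128 ("profile spectrum"). The function name is hypothetical, and the demo input is a tiny stand-in for a real header.

```python
import io
import xml.etree.ElementTree as ET

CENTROID = "MS:1000127"  # PSI-MS term "centroid spectrum"
PROFILE = "MS:1000128"   # PSI-MS term "profile spectrum"


def spectrum_types(imzml_source) -> set:
    """Return which of 'centroid' / 'profile' are declared anywhere in the XML.

    imzml_source may be a file path or a binary file-like object.
    """
    found = set()
    for _, elem in ET.iterparse(imzml_source, events=("end",)):
        accession = elem.get("accession")
        if accession == CENTROID:
            found.add("centroid")
        elif accession == PROFILE:
            found.add("profile")
        elem.clear()  # free memory; real imzML headers can be large
    return found


# Minimal stand-in for an imzML header (hypothetical content).
demo = io.BytesIO(
    b'<mzML><fileDescription><fileContent>'
    b'<cvParam cvRef="MS" accession="MS:1000127" name="centroid spectrum"/>'
    b'</fileContent></fileDescription></mzML>'
)
print(spectrum_types(demo))
```

With a real file, pass the path to the .imzML file instead of the in-memory demo. A result containing 'profile' suggests the data still needs centroiding before submission.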
Customised Processing
Processing is tailored to your data!
- Technical metadata
  - Resolving power (used for isotope prediction)
  - Polarity (used to select adducts)
[Figure: predicted isotope pattern of [C41H78NO7P+K]+ at R200 = 70K and at R200 = 280K]
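To make the role of the adduct concrete, here is a minimal sketch that computes the monoisotopic m/z of a singly charged positive ion such as [C41H78NO7P+K]+. It is an illustration only, not the engine's isotope-pattern code: the function names are hypothetical, only a handful of elements are included, and only single-atom adducts with charge +1 are handled.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376163,
    "Na": 22.98976928,
    "K": 38.96370668,
}
ELECTRON = 0.00054858  # electron mass (u), removed once for a +1 ion


def parse_formula(formula: str) -> dict:
    """Turn a sum formula such as 'C41H78NO7P' into {'C': 41, 'H': 78, ...}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def ion_mz(formula: str, adduct: str) -> float:
    """Monoisotopic m/z of [M + adduct]+ for a single-atom adduct (H, Na, K)."""
    neutral = sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())
    return neutral + MONOISOTOPIC[adduct] - ELECTRON


for adduct in ("H", "Na", "K"):
    print(f"[C41H78NO7P+{adduct}]+  m/z {ion_mz('C41H78NO7P', adduct):.4f}")
```

The same sum formula gives a different principal peak for each adduct, which is why polarity and adduct choice have to be recorded correctly in the metadata.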
Data Requirements
Your responsibility:
- Data is processed ‘as is’
- Check metadata is correct
  - Report resolving power accurately (check within data-set)
- Low numbers of annotations often correspond to poor quality mass spectra
  - Calibration inaccuracy
  - Lock-mass errors
Data Submission
Data upload
1. Follow conversion instructions for your instrument
2. Select the centroided files, .imzML and .ibd
3. The dataset will be copied to the cloud storage (accessible only to our team)
Metadata form
● Start typing to see suggestions
● Please fill in truthfully
○ Don’t want to disclose? Just put ‘N/A’
● Click (top right)
○ Enabled once the files have finished uploading
Browsing Results
Annotation knowledgebase web app: http://annotate.metaspace2020.eu
Results and metadata are public
Datasets are not

Dataset list
● the list can be filtered and exported to CSV
● dataset status: processing is in progress, queued, or finished
● clicking on fields limits the list to datasets with the same value
Annotation table
● Currently selected molecule (click to select)
● principal peak m/z
● MSM score
Sorting/filtering annotations
● Click on column headers to sort
● Add as many filters as you need
● Quickly add a filter by hovering over a cell and clicking the icon
Quick search
● search across all fields
● works in both ‘Datasets’ and ‘Annotations’ tabs
● supports* prefix match, OR operator, negation, e.g. Rennes -(rat | mouse | human)
(*) ElasticSearch Simple Query
Dynamic Summaries
● Automatically generated figures summarising filtered datasets
Molecule search
● Click to edit
● Search by name (partial name search)
● Or by molecular formula (exact match only)
Details for highlighted annotation
● molecule distribution (principal isotope)
● Putative metabolite entries from the database
Visual insight into MSM score assignment
● Exact m/z of each ion image
● Ion images for each isotope peak
● Isotopic patterns (click and drag to zoom)
○ Blue: theoretical abundance (at instrument resolving power)
○ Red: measured image intensity
Step-by-step search
1. Choose molecular formula database
2. Add dataset filter, then choose dataset(s) (up to 10)
3. Type molecule name or molecular formula. The annotations table will dynamically update
4. Export to CSV will save the current annotations table. Changing the filters will change which annotations are exported
Always export annotations for comparison together
(so they are at the same FDR)
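Once the annotations for several datasets have been exported together, the comparison itself is a few lines of scripting. The sketch below assumes a CSV with columns named dataset, formula, adduct, msm and fdr; these column names and the rows are hypothetical, and a real export may use different headers or include extra header lines that need skipping.

```python
import csv
import io


def annotations_by_dataset(csv_file, max_fdr: float) -> dict:
    """Group (formula, adduct) pairs by dataset, keeping rows with fdr <= max_fdr."""
    result = {}
    for row in csv.DictReader(csv_file):
        if float(row["fdr"]) <= max_fdr:
            result.setdefault(row["dataset"], set()).add((row["formula"], row["adduct"]))
    return result


# Hypothetical export covering two datasets.
demo = io.StringIO(
    "dataset,formula,adduct,msm,fdr\n"
    "brain_A,C41H78NO7P,+K,0.92,0.05\n"
    "brain_A,C42H82NO8P,+K,0.60,0.20\n"
    "brain_B,C41H78NO7P,+K,0.88,0.10\n"
)

groups = annotations_by_dataset(demo, max_fdr=0.1)
shared = groups["brain_A"] & groups["brain_B"]
print("annotated in both datasets at FDR <= 0.1:", shared)
```

Applying one threshold to one export keeps both datasets at the same FDR, which is the point of exporting them together.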
Results Browsing Summary
1. Choose database
2. Choose data-set
3. Add molecule filter and type ‘PC’
a. molecular class filter
4. Type ‘PC(16:0/18:0)’
a. single metabolite filter
5. Select row of table
a. single ion filter
6. Simple comparison of spatial distributions between adducts
7. Export of annotations to CSV

Also possible
● Filter by m/z
● Formula search
● Comparison across datasets
Interpretation
FDR Controlled Annotation
[Figure: MSM score distributions of true annotations and false discoveries, with thresholds marked at FDR = 0.1 and FDR = 0.2]

False Discovery Rate - the fraction of incorrect annotations
FDR = nFalse / (nFalse + nTrue)

Control - request a set of annotations at a fixed estimated FDR

Setting the level:
- Adjust the number of molecules for follow-up analysis
  - When only limited numbers of molecules can be reviewed, adjust the FDR so that fewer/greater numbers of molecules are annotated
- Compare annotations between datasets
  - A principled way of selecting molecules to compare between datasets
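As a worked illustration of the definition (the fraction of incorrect annotations among all reported annotations), the sketch below computes the FDR from counts and finds how many top-ranked annotations can be reported at a chosen level. It is not the METASPACE estimation procedure: in practice the false discoveries are unknown and have to be estimated, whereas here they are given as hypothetical labels, and both function names are made up for this example.

```python
def fdr(n_false: int, n_true: int) -> float:
    """Fraction of incorrect annotations among all reported annotations."""
    total = n_false + n_true
    return n_false / total if total else 0.0


def annotations_at_fdr(ranked_is_false: list, level: float) -> int:
    """Largest number of top-ranked annotations whose FDR stays <= level.

    ranked_is_false is ordered by decreasing MSM score; True marks an
    annotation counted as a false discovery (hypothetical labels).
    """
    best = 0
    n_false = 0
    for rank, is_false in enumerate(ranked_is_false, start=1):
        n_false += is_false
        if fdr(n_false, rank - n_false) <= level:
            best = rank
    return best


# Ten hypothetical annotations, best MSM score first.
ranked = [False, False, False, False, True, False, False, False, False, True]
print("annotations at FDR 0.1:", annotations_at_fdr(ranked, 0.1))
print("annotations at FDR 0.2:", annotations_at_fdr(ranked, 0.2))
```

Relaxing the level from 0.1 to 0.2 admits more annotations at the cost of more expected false discoveries, which is the trade-off described under "Setting the level".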
Choice of metabolite database
● 88M CAS registry: synthesized/recorded
● 50M PubChem compounds: biologically occurring/active
● 40K HMDB: single biological system
● 1K LC-MS: sample specific
Choice of metabolite database
Impacts search and False Discovery Rate estimation
● Use one that’s relevant
● Larger database
○ more false hits, so fewer annotations at a fixed FDR
● Different databases give different annotations
○ even for molecules in both databases, due to FDR control
○ for data-set comparison, use the same database
Annotating at level of molecular formula
● Possibility of multiple metabolites per sum formula
○ webapp shows all hits from the database search (learn the ambiguity!)
○ other databases can be searched (e.g. PubChem)
○ use enrichment analysis to get biological leads
● Use an orthogonal technique for reporting individual metabolites
○ not directly integrated (yet)
○ use web-app results to help target MS/MS studies (e.g. purchase of standards)
Reporting Results
● The METASPACE platform putatively annotates* molecular formulas along with several candidate metabolites
● A set of annotations should be reported along with the FDR threshold selected.
○ e.g. “Molecular annotation was performed using the METASPACE annotation engine (Palmer et al, Nature Methods 2017). 150 molecules were annotated against the LipidMAPS database at 10% FDR. Results are publicly available at annotate.metaspace2020.eu”
● The export function of the website delivers a spreadsheet that can be included as supporting information for any publication.

(*) Metabolomics Standards Initiative identification levels: Sumner et al, 2007, Metabolomics


Learning Summary
● Preparing data for submission
○ imzML export
○ metadata
● Submitting data
○ web-app, upload
● Browsing knowledgebase
○ web-app, annotations

How to get help?
● METASPACE team:
○ web: metaspace2020.eu
○ email: contact@metaspace2020.eu
○ twitter: @metaspace2020
○ source code: https://github.com/METASPACE2020/
● FTICR data conversion
○ SCiLS: support@scils.de
● Orbitrap data conversion
○ Thermo Fisher Scientific: kerstin.strupat@thermofisher.com
imzML Export
Export into imzML: FT-ICR data
Using SCiLS Lab’s METASPACE export
Export to METASPACE
● Export your centroided high-resolution spectra in the imzML format
● Available for “FT-ICR type” SCiLS Lab files from SCiLS Lab 2016b
● Performance in version 2018b significantly increased (4x speedup, batch export)
● Best results in METASPACE if peak list is required for centroiding
● Two different Bruker data formats
○ SQLite peak list data: Peak list provided during import
○ FT-ICR profile data: Generate a peak list after import
Create imzML file for METASPACE
● In the objects tab, click the export symbol of the region to be exported and select “Export to METASPACE”
● The Export Spectra dialog opens
● Set your normalization of choice
● Select your peak list of choice, for example “Imported Peaks” in case of SQLite
● Provide your scan polarity
● Click OK to save imzML file
SQLite peak list data
● Data must have been acquired with on-the-fly centroid detection, i.e. there is a file called ‘peaks.sqlite’ within the .d folder of the data-set
● In SCiLS Lab a peak list “Imported peaks” is available, selecting most frequent peaks
○ By default all peaks appearing more frequently than 1% of spectra
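The 1% rule above amounts to a frequency filter over the per-spectrum peak lists. The sketch below illustrates the idea only and is not how SCiLS Lab implements it: peaks are matched across spectra by rounding m/z to a fixed number of decimals, which is a simplifying assumption, and the function name and data are hypothetical.

```python
from collections import Counter


def frequent_peaks(spectra: list, min_fraction: float = 0.01, decimals: int = 3) -> list:
    """Return rounded m/z values that occur in more than min_fraction of spectra."""
    counts = Counter()
    for peaks in spectra:
        # A set ensures each rounded m/z is counted once per spectrum.
        counts.update({round(mz, decimals) for mz in peaks})
    threshold = min_fraction * len(spectra)
    return sorted(mz for mz, n in counts.items() if n > threshold)


# 200 hypothetical spectra: one peak in every spectrum, one peak in a single spectrum.
spectra = [[766.5147, 184.0733]] + [[766.5147]] * 199
print(frequent_peaks(spectra))
```

The peak seen in only one of 200 spectra (0.5%) falls below the threshold and is dropped, while the peak present everywhere is kept.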
FT-ICR profile data
● Older Solarix files do not directly contain a peak list to perform centroiding
● Create peak list with Data Analysis (SCiLS Lab Help Section 7.4)
● Use METASPACE tool for peak finding: https://spatialmetabolomics.github.io/centroidize/
● Use other external tools (mMass, …)
● Import the external peak list into SCiLS Lab: File > Import > m/z intervals from CSV or Clipboard
Use METASPACE tool for peak finding
● Select the overview spectrum CSV exported from SCiLS
● Upload CSV file to METASPACE tool
● Copy values to clipboard
● Use File > Import > m/z intervals from CSV
Upload imzML files to METASPACE
● Go to http://annotate.metaspace2020.eu/#/upload
Export into imzML: Orbitrap data (.raw)
Instructions: metaspace2020.eu/imzML
Software tools:

imageQuest / raw-converter
- Recommended for: MALDI images (Thermo MALDI- / TransMIT AP-S-MALDI-)

imzmlConverter
- Recommended for: DESI/flowProbe with separate files per row

Recommended for bioinformaticians: pyimzML (Python parser)


imageQuest
.raw -> imzML
● Commercial
○ Thermo Scientific
Raw-imzml converter
.raw -> imzML
● Free
● http://ms-imaging.org/wp/raw-to-imzml-converter/
Export into imzML: Generic

.raw -> mzML -> imzML


● MSConvert
○ free (link)

● imzMLConverter
○ free
○ requires registration
○ http://www.cs.bham.ac.uk/~ibs/imzMLConverter
Acknowledgments
METASPACE R&D team at EMBL
Theodore Alexandrov, Vitaly Kovalev, Artem Tarasov, Andrew Palmer, Dominik Fay

SCiLS
Dennis Trede, Jan Hendrik Kobarg

METASPACE data contributors
Achim Buck, Adam Pruska, Andreas Roempp, Andrew Palmer, Annabelle Fülöp, Anne Mette Handler,
Benedikt Geier, Berin Boughton, Bernhard Spengler, Carina Ramallo-Guevara, Charles Pineau, Chris
Anderton, Christian Janfelt, Christina Burr, Claire Carter, Corinna Henkel, Cristina Gonzalez Lopez,
Cristine Quiason, David Muddiman, Denis Sammour, Dhaka Bhandari, Dinaiz Thinagaran, Dirk Hoelscher,
Don Nguyen, Dušan Velickovic, Eike Ulrich Brockmann, Emilia Sogin, Emrys Jones, Eric Weaver, Erin
Gemperline, Guanshi Zhang, Gus Grey, Heath Patterson, Hidenobu Miyazawa, József Pánczél, James
Langridge, James McKenzie, Jan-Hinrich Rabe, Janfelt Christian, Jens Soltwisch, Jialing Zhang,
Josephine Bunch, Julian Griffin, Julien Delecolle, Kaija Schaepe, Klaus Dreisewerd, Konstantin Nagornov,
Ksenija Radic, Kumar Sharma, Kyana Garza, Lavigne Regis, Lennart Huizing, Lingjun Li, Livia Eberlin,
Logan Mackay, Luca Rappez, Manuel Liebeke, Marina Reuter, Mario Kompauer, Mark Bokhart, Marta
Sans, Marty Paine, Mathieu Gaudin, Maureen Kane, Max Müller, Michael Becker, Michael Linscheid,
Mikhail Belov, Na Sun, Neha Garg, Nicolas Desbenoit, Nicole Strittmatter, Oliver Lechtenfeld, Pegah
Khamehgir-Silz, Rappez Luca, Regis Lavigne, Renata Soares, Richard Caprioli, Richard Goodwin, Rima
Ait-Belkacem, Ron Heeren, Samantha Walker, Sandra Schulz, Sarah Aboulmagd, Sergio Triana, Shane
Ellis, Sheerin Latham, Sophie Jacobsen, Spencer Thomas, Stefanie Gerbig, Steve Castellino, Theodore
Alexandrov, Veronika Saharuka, Vitaly Kovalev, Yury Tsybin, Zoe Hall, Zoltan Takats

This project has received funding from the European Union’s Horizon 2020 research and
innovation programme under grant agreement № 634402.
