Job Description
Job Summary:
- We are looking for a data scientist/curator who will help annotate proteomics data at the MLCS.
- The contractor will apply proteomics knowledge and bioinformatics skills to help curate and register high-dimensional proteomics data in the internal data registry.
- The deliverables will facilitate data processing, analysis, and machine learning with high reproducibility and scalability, as well as data management and visualization.
Duties and Responsibilities:
- Annotate internal mass spec proteomics datasets to be compliant.
- Register the datasets in the internal database.
- Transform and engineer datasets so they are ready for downstream analysis.
- Extract, transform, and load high-value external datasets.
- Document workflows and ensure data ingestion, metadata capture, version control, and curation.
Organizational Relationship:
- Works mostly with the hiring manager.
- Communicates with the wet lab scientists who generate the data.
Education and Experience:
- B.S. or advanced degree in Bioinformatics, Biology, Molecular Cell Biology, Biochemistry, Analytical Chemistry, or a related field.
Technical Requirements:
- Familiarity with mass spectrometry-based proteomics data types is strongly preferred.
- Wet-lab experience in Biology, Biochemistry, Molecular Biology, Analytical Chemistry, or a related field is strongly preferred.
- Knowledge of relational or graph databases is preferred.
- Experience with scripting languages such as R or Python is desirable.
- Familiarity with workflow languages (Nextflow, CWL, etc.) is desirable.
- Familiarity with Client scripting and with cluster or cloud computing infrastructure (AWS, GCP) is desirable.