Overview

We are seeking a highly skilled Data Scientist specializing in mass spectrometry-based proteomics to join our data science team. In this role, you will apply advanced analytical techniques to extract meaningful insights from complex proteomics datasets generated by state-of-the-art mass spectrometry. You'll play a crucial role in accelerating our molecular glue discovery platform by developing robust data science workflows that bridge high-throughput screening data with biological understanding.

Responsibilities

- Proteomics Data Analysis: Analyze large-scale DIA and DDA shotgun-proteomics datasets to identify differential expression patterns and elucidate molecular mechanisms
- Algorithm Development: Design and implement algorithms and statistical models to process, quality-control, and interpret complex proteomics data
- High-Throughput Screening Support: Develop automated pipelines for analyzing LC-MS data from high-throughput screening campaigns to identify novel molecular glue targets and mechanisms
- Data Integration: Integrate proteomics data with other omics datasets, chemical structure data, and biological pathway information to generate actionable insights
- Visualization & Reporting: Create data visualizations and comprehensive reports for cross-functional teams including medicinal chemistry, biology, and clinical development
- Method Development: Collaborate with analytical chemistry teams to optimize data acquisition and develop computational approaches for proteomics data analysis
- Platform Enhancement: Contribute to the continuous improvement of our molecular glue discovery platform through innovative data science methodologies

Qualifications

- PhD or MS in Data Science, Computational Biology, Bioinformatics, Physics, Chemistry, or a related quantitative field, with publications in computational proteomics, chemoproteomics, or chemical biology
- At least 2 years of hands-on industry experience in proteomics data analysis
- Knowledge of SQL and R, or other data analysis tools
- Proficiency in Python programming with experience in data science libraries (pandas/polars, numpy, scipy, scikit-learn, matplotlib/seaborn/plotnine/ggplot)
- Experience with cloud computing platforms (AWS, GCP) and containerization technologies
- Demonstrated experience working with LC-MS or other omics data, particularly in high-throughput screening (HTS) environments
- Strong foundation in core data science concepts including statistical analysis, machine learning, data visualization, and experimental design
- Strong foundation in the algorithms behind mass spectrometry data processing software (identification, quantification, missing value imputation, differential expression)