Based on process problems I've seen at hackathons to date, my "hackathon goal" was to map Pete Kane's objectives [1] to a plan containing activities that we could execute and improve at TRI-con. The subsequent "2020 Kidney Cancer Hackathon: Results" post describes what actually happened. Of course, I expect the plan to be iterated upon and improved as time goes on.
After speaking to RTTP's Pete Kane, here is what I believed to be the union of our objectives:
- Do deep biological profiling for patients before the hackathon.
- Reveal new treatment paths during the hackathon.
- Help set up new drug n=1 trial cases after the hackathon.
- Create a journal to disseminate results.
I maintained that the first three objectives could be supported by the following activities:
- If the patient support team (which may be, but doesn't have to be, the patient or the patient's physician)
- can provide complete original data (e.g. WES/WGS, RNA-seq, EMR),
- which has been verified (e.g. the Mike D'Amour fastq validation process described in the Appendix, which produces a report)
- and has undergone "standard" pre-processing (e.g. alignment),
- which has been verified (e.g. multiple aligners)
- and if the researchers have done much of their investigation before the hackathon (e.g. I reached out to Clemson before they showed up),
- then the hackathon can be used to review, discuss and tweak researchers' results,
- identify potential targets
- and discuss potential therapeutics
- which can be proposed to the FDA for approval under their n=1 protocol.
- Documentation of results after step 5 can be put in the journal, listing sources from steps 1-5 as generously donated & properly credited contributions.
My thesis was that the further down the plan the patients are, the more "FDA ready" they are likely to be. For example, if the researchers have done the research before the hackathon, then the hackathon becomes a conference where attendees discuss n=1 research results. If they only have unprocessed patient data, then much of the hackathon is spent on processing, leaving very little time for research. Finally, if the data is not verified, there is a chance that the data quality is poor, in which case the results are likely to be poor as well. From a patient perspective, lists like this give potential attendees a realistic expectation of what can be done based on where they are. They also show attendees how to get the most "bang for the buck" by doing a fair amount of work ahead of time.
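To make the "further down the plan" idea concrete, here is a minimal Python sketch that reports how far along a patient case is. The stage identifiers are my own hypothetical paraphrases of the activities listed above, and the function is illustrative only, not a tool that was used at the hackathon.

```python
from typing import Optional, Set, Tuple

# Hypothetical stage names, paraphrasing the plan's activities in order.
STAGES = [
    "original_data_provided",
    "original_data_verified",
    "preprocessing_done",
    "preprocessing_verified",
    "research_done_before_hackathon",
    "results_reviewed_at_hackathon",
    "targets_identified",
    "therapeutics_discussed",
    "proposed_to_fda",
]


def readiness(completed: Set[str]) -> Tuple[int, Optional[str]]:
    """Return (consecutive stages completed from the top, next stage to do).

    A stage only counts if every stage before it is also complete, which
    mirrors the idea that unverified data undermines everything after it.
    """
    unknown = completed - set(STAGES)
    if unknown:
        raise ValueError(f"unknown stages: {sorted(unknown)}")
    count = 0
    for stage in STAGES:
        if stage not in completed:
            return count, stage
        count += 1
    return count, None  # every stage is complete
```

For example, a case with pre-processed but unverified original data counts as only one stage along, and the sketch points at verification as the next thing to do before the hackathon.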
Appendix: Mike D'Amour fastq Validation Process
Read data (fastq format) was aligned with the BWA-MEM [2] aligner used in the Broad Institute GATK suite.
Images are produced with IGV, the Integrative Genomics Viewer from the Broad Institute [3], accessing the dbSNP database [4].
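Before spending time on alignment, a cheap first check is whether the fastq file is structurally well formed. The Python sketch below is not part of the validation process described in this appendix; it is a simpler, separate check that assumes the common four-line record layout (header starting with `@`, sequence, separator starting with `+`, quality string of the same length as the sequence). The function names are hypothetical.

```python
import gzip
from typing import Iterable, List


def open_fastq(path: str):
    """Open a plain or gzip-compressed fastq file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def check_fastq_records(lines: Iterable[str], max_errors: int = 10) -> List[str]:
    """Return a list of structural problems found (empty if none).

    Assumes four lines per record; multi-line (wrapped) fastq is not handled.
    """
    errors: List[str] = []
    record: List[str] = []
    n = 0  # number of complete records seen so far
    for raw in lines:
        record.append(raw.rstrip("\r\n"))
        if len(record) < 4:
            continue
        n += 1
        header, seq, plus, qual = record
        record = []
        if not header.startswith("@"):
            errors.append(f"record {n}: header does not start with '@'")
        elif not plus.startswith("+"):
            errors.append(f"record {n}: separator does not start with '+'")
        elif len(seq) != len(qual):
            errors.append(f"record {n}: sequence and quality lengths differ")
        if len(errors) >= max_errors:
            break
    else:
        # Loop finished without hitting max_errors: flag leftover lines.
        if any(record):
            errors.append(f"incomplete record after record {n}")
    return errors
```

A typical use would be `check_fastq_records(open_fastq(path))`, where `path` is whatever file the patient support team provided. Passing this check says nothing about biological quality; it only catches truncated or corrupted files early.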
1. This mapping loosely follows a business methodology I've used in the past. Pete runs "Research To the People" (RTTP), which does these hackathons.
2. Li H. (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997v2 [q-bio.GN].
3. James T. Robinson, Helga Thorvaldsdóttir, Wendy Winckler, Mitchell Guttman, Eric S. Lander, Gad Getz, Jill P. Mesirov. Integrative Genomics Viewer. Nature Biotechnology 29, 24–26 (2011).
4. Kitts A, Sherry S. The Single Nucleotide Polymorphism Database (dbSNP) of Nucleotide Sequence Variation. 2002 Oct 9 [Updated 2011 Feb 2]. In: McEntyre J, Ostell J, editors. The NCBI Handbook [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2002-. Chapter 5. Available from: https://www.ncbi.nlm.