BNFO 620 – Bioinformatics Practicum

Spring 2023

This course provides students with a bioinformatics research project experience, usually with a substantial amount of coding (e.g. in Python). The goal is to produce useful data or analysis, and ideally a publishable manuscript by the end of the semester (and eventually a published paper). We will also write or update Wikipedia pages related to these projects, so in any case students will produce something for the general public :)

Two sessions per week: one general session (topics listed below), one project team meeting (usually 2 students per team).

Syllabus (2023); updated version will be available in January 2024

Projects in Spring 2024 include, but are not limited to, the following (please inquire if you are interested in a particular topic):

Main projects:

  • Text-mining of reptile species descriptions (more coding)
  • Collection, extraction and processing of images and their meta-data

Alternative projects:

  • Metabolic networks and their regulation by protein-protein interactions
  • Snake genomes and limb development
  • Snake color pattern collection and analysis / development of a snake ID tool
  • and possibly others (negotiable)

The schedule below is optional and negotiable, but we should cover at least some of these topics:

Week 1. Brief introduction to science and research. What is known and what is not? Applied vs. basic research. Good science vs. bad science (and pseudoscience). Projects intro.

Reading list (not including project-specific literature):

Week 2. Wikipedia, Wikimedia, Wiki other things. Articles, user pages, talk pages. Source vs. visual editor, citations, linking. What is a good article?

    • Logan et al. 2010 Ten Simple Rules for Editing Wikipedia. PLoS Comput Biol 6(9): e1000941
    • Copyright, finding images, original research, Wikimedia Commons
    • Exercise: edit your first Wikipedia page

Week 3. The scientific literature: past, present and future. Literature vs. databases vs. aggregators and meta-databases.

Week 4. Project presentations. Students present the project they have chosen: the problem, the approach to solve it, expected outcomes.

Week 5. Biomedical databases and data sources. (NAR) data(bases) and others. Where to find the database (or dataset) you need:

    • Nucleic Acids Research Database issue 2022
    • Supplementary files, Dryad, Figshare, Compass & Co.
    • Exercise: pick a database paper from the NAR issue and update (or create) the corresponding Wikipedia entry on that database

Week 6. Finding scientific literature. PubMed, Google Scholar, Web of Science (may be merged with Week 3).
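
Since most projects involve some Python, here is a minimal sketch of how a PubMed search could be scripted instead of done by hand. It assumes the public NCBI E-utilities `esearch` endpoint with JSON output; the function names `build_pubmed_query` and `extract_pmids` are made up for illustration, and the search term is only an example.

```python
import json
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint (assumed to be the service used here).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(term, max_results=20):
    """Return an esearch URL that asks PubMed for IDs matching `term`."""
    params = {
        "db": "pubmed",         # search the PubMed database
        "term": term,           # free-text query, same syntax as the web form
        "retmax": max_results,  # maximum number of IDs to return
        "retmode": "json",      # ask for JSON instead of the default XML
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


def extract_pmids(response_text):
    """Pull the list of PubMed IDs out of an esearch JSON response."""
    data = json.loads(response_text)
    return data["esearchresult"]["idlist"]


if __name__ == "__main__":
    # Hypothetical example query; paste the printed URL into a browser
    # or fetch it with urllib.request.urlopen to see the response.
    print(build_pubmed_query("snake limb development", max_results=5))
```

To actually run the search, the URL can be fetched with `urllib.request.urlopen(url).read().decode()` and the result passed to `extract_pmids`. NCBI asks users to keep request rates low, so a script like this should not be run in a tight loop.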

Week 7. Project presentations.

Week 8. Science and social media: mailing lists, Twitter, Facebook, ResearchGate, Academia.edu, etc.

Week 9. Computational tools. Automation vs. programming, specialized tools.

    • Programming, scripting, and automating
    • How to automate your Mac, PC, or phone
    • Learn new tricks using newsletters, podcasts, meetups, and blogs
    • Exercises: set up a bunch of text expansions on your device; automate email processing and calendaring, etc.
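
To make the text-expansion exercise concrete, here is a minimal sketch of what a text expander does under the hood: it replaces short abbreviations with longer snippets. This is only an illustration, not how any particular tool works; the `expand` function and the abbreviations in `SNIPPETS` are hypothetical.

```python
import re

# Hypothetical abbreviations; pick triggers that never occur in normal text.
SNIPPETS = {
    ";nar": "Nucleic Acids Research",
    ";wp": "Wikipedia",
}


def expand(text, snippets):
    """Replace every abbreviation found in `text` with its expansion."""
    if not snippets:
        return text
    # Try longer abbreviations first so ";wpx" would win over ";wp".
    keys = sorted(snippets, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda match: snippets[match.group(0)], text)


if __name__ == "__main__":
    print(expand("Pick a database paper from ;nar and update ;wp.", SNIPPETS))
```

Real text expanders hook into the keyboard and expand as you type, but the core idea is the same lookup table.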

Week 10. Writing papers.

Week 11. Peer review.

Week 12. Final Project presentations.

    • Presentation
    • Exercise: finalize draft of paper; circulate for peer review

Week 13. Wrap up and reflections: What’s this all good for? How does it fit into the big picture? The meaning of life and all that …

Further reading (beyond this class): 10 simple rules ad nauseam