“My son is in computers”, my father used to say. “In computers” was a catch-all term of the time that described everything from sales to repairs to programming to systems design. Similarly, bioinformatics is growing into a field whose practitioners include clinicians, software engineers, life science researchers and more. Given this diversity of roles, what does a bioinformatics curriculum look like? This is a question that the International Society for Computational Biology (ISCB)’s Education Committee (EC) has been puzzling over for some years.
In 2018 members of the ISCB EC published a resource that described core competencies and model personas of bioinformatics practitioners. The model personas were further subdivided into three types: bioinformatics researchers, users and engineers. In May of this year, ISCB, H3ABioNet, GOBLET and ELIXIR organised a four-day Bioinformatics Education Summit in Cape Town, South Africa, to refine the competencies and personas framework and to model its application to bioinformatics curricula and short courses. The revised framework is due for publication in July 2019, in time for the ISMB meeting in Basel, Switzerland.
Existing or new curricula and courses can be mapped to the competencies framework, noting for each course module which competencies the module addresses (for example, general biology or statistical research methods), and whether this is a core focus of the module. Bloom’s taxonomy, a hierarchical model that describes the level of complexity of learning objectives, can be used to further describe the level at which certain competencies are addressed. For example, a bioinformatics engineer might need to be able to evaluate the construction of software systems, whereas a life science researcher might only need to be able to apply such software systems.
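This kind of mapping can be pictured as a small data structure. The sketch below is purely illustrative — the competency names, module names and Bloom level labels are my own examples, not taken from the framework itself:

```python
# Hypothetical sketch of mapping course modules to competencies at Bloom's
# levels. Competency and module names are illustrative examples only.

# Bloom's taxonomy levels, ordered from lowest to highest complexity
BLOOM = ["remember", "understand", "apply", "analyse", "evaluate", "create"]

def bloom_at_least(level, required):
    """True if `level` meets or exceeds `required` in Bloom's hierarchy."""
    return BLOOM.index(level) >= BLOOM.index(required)

# Each module maps a competency to a (Bloom level, core focus?) pair
modules = {
    "Intro to Sequence Analysis": {
        "general biology": ("understand", False),
        "statistical research methods": ("apply", True),
    },
    "Software Systems for Bioinformatics": {
        "software engineering": ("evaluate", True),
    },
}

def covers(curriculum, competency, required_level):
    """Check whether any module addresses a competency at the required level."""
    return any(
        comp == competency and bloom_at_least(level, required_level)
        for module in curriculum.values()
        for comp, (level, _core) in module.items()
    )

# A bioinformatics engineer persona needs to *evaluate* software systems;
# a life science researcher may only need to *apply* them.
print(covers(modules, "software engineering", "evaluate"))  # True
print(covers(modules, "statistical research methods", "create"))  # False
```

The ordering of Bloom levels is what lets a persona's requirement ("evaluate") be compared against what a module actually delivers.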
In addition to helping educators design and refine their courses, the competencies framework will be part of a planned future ISCB programme to endorse curricula and courses. Such endorsements could contribute to the international development of bioinformatics education by making qualifications portable across institutions and countries.
Moving on from the work on the bioinformatics core competencies framework, the Bioinformatics Education Summit considered the development of an online Train-the-Trainer course and the GOBLET bioinformatics trainer resources portal. This portal will contain content on the skills required for training, how to develop training materials, how to organise and deliver training, how to assess trainees and evaluate a course, and finally how to move towards endorsement and accreditation of a course. In addition to the higher-education-oriented training resources there is a collection of resources on introducing high school learners to bioinformatics.
Finally, the Bioschemas markup specifications were introduced. Bioschemas is a development from the schema.org project that allows websites and online resources to be annotated in a way that computers can understand. In the training context there are draft schemas that describe training materials, courses and course instances. Once created and exported as JSON-LD scripts, these can be embedded in web pages. The ELIXIR TeSS web crawler can then read these descriptions and automatically add information on materials and courses to its search portal. While the schemas are still in draft, and the tools for creating Bioschemas JSON-LD are still in their infancy, the project has great promise in alerting students to training opportunities and letting trainers find and share training materials.
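To make the embedding step concrete, here is a minimal sketch of what such a JSON-LD script block might look like, generated in Python. The property names (`name`, `description`, `provider`) follow schema.org's Course type; the exact properties the draft Bioschemas training profiles require may differ, and the course details are invented:

```python
# Sketch: building a JSON-LD course description and wrapping it in the
# <script> tag that a crawler such as ELIXIR TeSS could read from a web page.
# Course details are invented; property names follow schema.org's Course type.
import json

course = {
    "@context": "https://schema.org",
    "@type": "Course",
    "name": "Introduction to Bioinformatics",
    "description": "A short course covering core bioinformatics competencies.",
    "provider": {"@type": "Organization", "name": "Example Institute"},
}

# JSON-LD is embedded in a page as a script block of type application/ld+json
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(course, indent=2)
    + "\n</script>"
)
print(script_tag)
```

A page carrying such a block is still rendered normally for human visitors; the structured description is only visible to crawlers.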
The US NIH’s BD2K (Big Data To Knowledge) programme also has a training resources discovery portal, called ERuDite. In its current incarnation, ERuDite is rather similar to TeSS, but the ambition of the project is to become a tool for exploring a path through the data science knowledge landscape. Learners might start with a certain set of knowledge and skills and use the platform to identify training resources (such as MOOCs) and opportunities that will lead them towards their final knowledge goal. An earlier paper describes how knowledge representation and machine learning are used in this design.
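One simple way to picture the path-finding idea is as a search over a prerequisite graph of resources. The toy sketch below is entirely my own illustration — the resources, skills and breadth-first search are stand-ins for ERuDite's much richer knowledge representation and machine learning machinery:

```python
# Illustrative sketch only: finding a sequence of training resources that
# takes a learner from their current skills to a goal skill, modelled as a
# breadth-first search over a hypothetical prerequisite graph. The actual
# ERuDite design is far richer than this.
from collections import deque

# Hypothetical resources: each teaches one skill given prerequisite skills
resources = {
    "Stats MOOC": ({"maths"}, "statistics"),
    "Python MOOC": (set(), "python"),
    "ML course": ({"statistics", "python"}, "machine learning"),
}

def learning_path(known, goal):
    """Shortest sequence of resources from `known` skills to the `goal` skill."""
    start = frozenset(known)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        skills, path = queue.popleft()
        if goal in skills:
            return path
        for name, (prereqs, taught) in resources.items():
            if prereqs <= skills and taught not in skills:
                nxt = frozenset(skills | {taught})
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [name]))
    return None  # goal unreachable from the starting skills

print(learning_path({"maths"}, "machine learning"))
```

Starting from only "maths", the search routes the learner through the statistics and Python resources before the machine learning course becomes accessible.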