Chapter 1 Summary Statement

As of 2020, 89% of all hospitals had implemented an electronic health record (EHR) system, creating 2,314 exabytes of new medical data since the Health Information Technology for Economic and Clinical Health (HITECH) Act’s Electronic Health Records - Meaningful Use (EHR-MU) clause of 2009 (Health and Human Services, 2017; Moriarty, 2020; Office for Civil Rights (OCR), 2009; Office of the National Coordinator for Health Information Technology (ONC), 2020; Stewart, 2020). This sheer volume of health data requires researchers, clinicians, and patients to understand, access, manage, and interpret data (Institute of Medicine (US) Roundtable on Value & Science-Driven Health Care, 2010). While EHR systems have their own data challenges, the influx of electronic data has called for changes in how clinicians are trained to use these data to meet the challenges of evidence-based medicine (American Medical Association, 2021; Bresnick, 2015). By contextualizing and democratizing data science skills for clinicians, we can give them more capacity to explore and make better use of the data (Kross et al., 2020). Additionally, by empowering those in or interested in a biomedical profession with better data literacy and data science skills, we can expand the workforce needed to better collect and use the data we need to innovate and advance health care. We can accomplish this by teaching learners the programming tools used for data analytics (Farrell and Carey, 2018).

This proposal seeks to address the following knowledge gaps in the literature and needs in the field of training biomedical professionals: (1) There are no formal learner personas for the biomedical community, and the assessment tools to identify and create learner personas do not exist. (2) Data science learning materials for the biomedical sciences lack community-oriented, open, and maintained lessons that target learner persona needs and are grounded in pedagogical practices and theory. (3) While we know a lot about the teaching and pedagogy of computer science education, less is known about data literacy education, and almost nothing is known about data science education in an applied domain (e.g., biomedical sciences).

We hypothesize that learning materials designed around tidy data principles are an effective way to teach the data science and data literacy skills that will help learners build programming and data science skills from their existing spreadsheet workflows. To test this hypothesis, we will use a series of longitudinal surveys along with a data science curriculum that we will create. This work will bridge the skills gap between medical practitioners and domain experts in the biomedical sciences on one side and analysts, researchers, and data scientists on the other, so that data science teams can make better use of data (storage, FAIR, stewardship). It will do so by creating a computational community of practice that can enhance workforce development, modernize the data ecosystem, and work with data science tools for sustainable and open science.
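As a rough illustration of what moving from a spreadsheet layout to a tidy layout can look like, the sketch below reshapes a small "wide" table (one column per visit) into tidy form (one observation per row). The table, the column names, and the `to_tidy` helper are all hypothetical and invented for this example; they are not drawn from the proposed curriculum or from any real data set.

```python
# Hypothetical spreadsheet-style table: one row per patient, one column per visit.
# All identifiers and values are made up for illustration only.
wide_rows = [
    {"patient_id": "P01", "visit_1": 120, "visit_2": 118},
    {"patient_id": "P02", "visit_1": 135, "visit_2": 130},
]


def to_tidy(rows, id_column, variable_name, value_name):
    """Reshape wide rows into tidy rows: one observation per row.

    Every column other than `id_column` is treated as a measurement
    column and becomes its own row in the output.
    """
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on each tidy row instead
            tidy.append(
                {
                    id_column: row[id_column],
                    variable_name: column,
                    value_name: value,
                }
            )
    return tidy


tidy_rows = to_tidy(wide_rows, "patient_id", "visit", "systolic_bp")
for tidy_row in tidy_rows:
    print(tidy_row)
```

In the tidy form each row answers a single question (which patient, which visit, what value), which is the structure that most data analysis tools expect as input.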

Specific Aims

  • Aim 1: Identify learner personas in the biomedical sciences by creating and validating learner self-assessment surveys.
    • 1.1: A learner self-assessment survey asking questions about prior programming, statistics, and data knowledge will be used to create learner personas.
    • 1.2: Validate the learner self-assessment survey.
    • 1.3: Personas will encompass a student’s prior knowledge using survey data. General background, perception of needs, and special considerations will be added to make each learner persona a complete character.
  • Aim 2: Create an effective data science for biomedical science curriculum based on best education and pedagogy practices.
    • 2.1: Learning objectives focused on core data literacy principles in the data science pipeline will be used for each lesson module.
    • 2.2: Lesson content follows best educational and pedagogical practices.
    • 2.3: Assess the effectiveness of learning materials.
  • Aim 3: Assess the effectiveness of formative assessments in achieving learning objectives.
    • 3.1: Implement an experiment for conducting formative and summative assessment question types.
    • 3.2: Assess the effectiveness of targeted feedback in auto-grading systems used in formative and summative feedback.

References

American Medical Association. (2021). Accelerating Change in Medical Education. American Medical Association. https://www.ama-assn.org/education/accelerating-change-medical-education

Bresnick, J. (2015, November 2). Healthcare Big Data Analytics in Med School Marks Turning Point. HealthITAnalytics. https://healthitanalytics.com/news/healthcare-big-data-analytics-in-med-school-marks-turning-point

Farrell, K. J., and Carey, C. C. (2018). Power, pitfalls, and potential for integrating computational literacy into undergraduate ecology courses. Ecology and Evolution, 8(16), 7744–7751. https://doi.org/10.1002/ece3.4363

Health and Human Services, U. S. D. of. (2017). HITECH Act Summary. https://www.hipaasurvivalguide.com/hitech-act-summary.php

Institute of Medicine (US) Roundtable on Value & Science-Driven Health Care. (2010). Clinical Data as the Basic Staple of the Learning Health System. In Clinical Data as the Basic Staple of Health Learning: Creating and Protecting a Public Good: Workshop Summary. National Academies Press (US). https://www.ncbi.nlm.nih.gov/books/NBK54306/

Kross, S., Peng, R. D., Caffo, B. S., Gooding, I., and Leek, J. T. (2020). The Democratization of Data Science Education. The American Statistician, 74(1), 1–7. https://doi.org/10.1080/00031305.2019.1668849

Moriarty, A. (2020, May 26). Does Hospital EHR Adoption Actually Improve Data Sharing? Definitive Healthcare. https://blog.definitivehc.com/hospital-ehr-adoption

Office for Civil Rights (OCR). (2009, October 28). HITECH Act Enforcement Interim Final Rule [Text]. HHS.gov. https://www.hhs.gov/hipaa/for-professionals/special-topics/HITECH-act-enforcement-interim-final-rule/index.html

Office of the National Coordinator for Health Information Technology (ONC). (2020, May 19). Health IT Legislation. https://www.healthit.gov/topic/laws-regulation-and-policy/health-it-legislation

Stewart, C. (2020, September 24). Healthcare data volume globally 2020 forecast. Statista. https://www.statista.com/statistics/1037970/global-healthcare-data-volume/