Technical notes: Research paper on learning/teaching data science

Title: Navigating Diverse Data Science Learning: Critical Reflections Towards Future Practice

Author: Yehia Elkhatib


These are my notes on the above paper, which mainly details the methods explored and implemented to deliver high-quality education in data science. The paper also provides an interesting breakdown of the different roles in data science workflows.

  • The paper highlights the importance of being able to work in a team. A data scientist working in isolation produces results that are almost meaningless.

  • Given the typically diverse backgrounds of DS practitioners, it is difficult to devise a curriculum that caters to everybody. This is worth weighing before taking up any formal university course: I would not want to spend a great deal of time and money learning obsolete techniques or technologies.

  • Learning rates differ with students' backgrounds and past academic environments. In particular, most students do not seem to realize that the best learning takes place in a ‘social’ manner. Beyond these points, the paper explores several aspects of effective learning, and of aligning the curriculum and teaching methodology with typical industrial workflows.

  • The past studies and research cited in the paper would certainly make interesting reading, though they are more relevant to those in the teaching profession. One approach for students would be to read between the lines and extract best practices for learning rapidly and effectively. However, there are many more direct resources and techniques for that purpose.

  • Core DS roles:

    • Janitor
      • data cleaning, pre-processing
    • Scout
      • EDA, early insights
    • Analyst
      • identifying patterns, initial hypothesis, evidence of unforeseen narratives
    • Decision Builder
      • automate decision making, ML, DL
    • Curator
      • storage formats across interfaces, data governance
    • Engineer
      • Manage the interface between development and production products, efficiency and reliability of data interaction.
  • Auxiliary roles: these come into the picture as the DS team grows.

    • Domain specialist
      • data significance, sources of bias
    • Infrastructure manager
      • support to build and operate, beyond the data engineer
    • Communicator
      • Communicating explanatory and confirmatory analyses, setting up systems to interact with the audiences outside the DS team
    • Facilitator
      • A/B experiments, additional support to the communicator.
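
To make the first three core roles more concrete, here is a minimal sketch of my own (not from the paper) that treats the Janitor, Scout and Analyst as successive stages over a tiny dataset. The records, the field names `hours` and `score`, and the three function names are all hypothetical and chosen purely for illustration. A real workflow would use far richer tooling at every stage.

```python
from statistics import mean

# Hypothetical toy records; None marks a missing value.
raw = [
    {"hours": 2.0, "score": 55.0},
    {"hours": None, "score": 60.0},
    {"hours": 4.0, "score": 70.0},
    {"hours": 6.0, "score": 85.0},
]


def janitor(records):
    """Janitor: data cleaning. Here, drop records with any missing field."""
    return [r for r in records if all(v is not None for v in r.values())]


def scout(records):
    """Scout: EDA and early insights. Here, the mean of each column."""
    keys = records[0].keys()
    return {k: mean(r[k] for r in records) for k in keys}


def analyst(records, x="hours", y="score"):
    """Analyst: initial hypothesis. Here, the sign of the covariance of x and y."""
    mx = mean(r[x] for r in records)
    my = mean(r[y] for r in records)
    cov = sum((r[x] - mx) * (r[y] - my) for r in records) / len(records)
    return "positive" if cov > 0 else "non-positive"


clean = janitor(raw)      # one record with a missing value is dropped
summary = scout(clean)    # per-column means of the cleaned data
trend = analyst(clean)    # direction of the hours/score relationship
print(len(clean), summary, trend)
```

Each function consumes the previous stage's output, which mirrors the earlier point about teamwork: the Analyst's hypothesis is only as good as the Janitor's cleaning and the Scout's summary.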