Specialty Area

Data Science

Content Progression

Data science is the practice of collecting, analyzing, and interpreting data sets to solve complex, real-life problems using statistical and computational methods (DS4E). The foundational content for all students includes some learning outcomes related to data science. For continued learning beyond the foundation, we have defined the following content progression that includes two additional levels (fundamentals and specialty) that progressively build on this content. This progression may lead to a data science major and a career as, for example, a data scientist, data engineer, data modeler, statistician, or data ethicist.

Foundation

Prioritized Foundational Content Specific to Data Science:

  • Programming basics
  • Cleaning and using data
  • Social and ethical implications
  • Data bias
  • Testing and debugging
  • Inclusive collaboration on data projects

Fundamentals

  • Data science tools
  • Transform and prepare data
  • Data validity (clean and accurate)
  • Statistics (e.g., normal distribution, descriptive statistics, regression analysis)
  • Data visualization 
  • Extract meaning from tabular data using a function
  • Query formation (prompt engineering; Structured Query Language (SQL); Elasticsearch)
  • Make predictions and determine generalizability
  • Data forms and bias (ethics)
  • Data fairness and bias (mitigating bias)
  • Data privacy, security, bias, missing data, ethics
  • Legal and ethical implications
  • Structured problem-solving (case studies; case analysis)
  • IDEs for data science (e.g., PyCharm, RStudio, Azure, JupyterLab)
  • Intersection of data science and other fields
  • Careers in data science
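
As one illustration of the fundamentals item "Extract meaning from tabular data using a function," the sketch below reads a small table and computes descriptive statistics for one column. It is a minimal example, not part of the defined progression: the sample data, the column names, and the helper name summarize_column are all hypothetical, and only the Python standard library is assumed.

```python
import csv
import io
import statistics

# Hypothetical sample data: hours studied and quiz scores for five students.
SAMPLE_CSV = """student,hours,score
A,1,62
B,2,70
C,3,75
D,4,83
E,5,90
"""

def summarize_column(rows, column):
    """Return descriptive statistics for one numeric column of a table."""
    values = [float(row[column]) for row in rows]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

# Parse the table into a list of dictionaries, one per row.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = summarize_column(rows, "score")
print(summary)
```

The same function can be reused on any numeric column (for example, "hours"), which is the point of wrapping the calculation in a function rather than repeating it.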

Specialty

  • Distributed cloud-based systems
  • Data pipelines and transfer
  • Data modeling 
  • Machine learning basics
  • Data validity, credibility, and reliability (data consciousness)
  • Advanced data visualizations
  • Data from wearables and its implications
  • Evaluating statistical conclusions (e.g., effect size)
  • Data privacy and security 
  • Interface development for data analysis (e.g., business intelligence (BI) tools, such as Power BI, Tableau)
  • Common algorithms for data science (e.g., linear regression, k-nearest neighbors (KNN))
  • Designing, imagining, and critiquing new ways to get, use, and restrict data
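
To make the specialty item "Common algorithms for data science" more concrete, the following sketch shows one possible form of a k-nearest neighbors (KNN) classifier. It is an illustrative assumption rather than prescribed course material: the function name knn_predict, the wearable-style readings, and the labels are hypothetical, and the features are deliberately left unscaled to keep the example short.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # train is a list of (features, label) pairs; distance is Euclidean.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical readings: (resting heart rate, daily steps in thousands).
train = [
    ((58, 12.0), "active"),
    ((60, 10.5), "active"),
    ((62, 11.0), "active"),
    ((78, 3.0), "sedentary"),
    ((80, 2.5), "sedentary"),
    ((75, 4.0), "sedentary"),
]

prediction = knn_predict(train, (61, 9.0), k=3)
print(prediction)
```

In a real analysis, features on different scales would usually be normalized before computing distances, and the choice of k would be checked against held-out data.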

Example Course Pathway

The data science content progression can be packaged in a variety of ways to meet the local context and needs of individual schools and districts. This data science course pathway serves as an example of how content in this specialty can be implemented in high schools. Each level of the pathway below corresponds to one course and is followed by that course's description.

Foundation

Computer Science Foundations supports all high school students, regardless of postsecondary goals, in developing the knowledge, skills, and dispositions necessary to navigate and understand the technology-driven world in which they live. Course content, organized into five Topic Areas (Algorithms, Programming, Data and Analysis, Computing Systems and Security, and Preparing for the Future), rests upon four Key Pillars (Computational Thinking, Inclusive Collaboration, Human-Centered Design, and Impacts and Ethics). Topic Areas and Pillars are essential components of this course and the student experience (see Section 2 of this report for more details).

Fundamentals

Programming the Future provides students who have a foundational understanding of computer science with an opportunity to explore various topics such as cybersecurity, artificial intelligence, and data science. While developing their programming skills, students will apply fundamental ideas in these areas to solve meaningful and interesting problems. Content covered in this course aligns with fundamentals content from the Programming, Cybersecurity, Artificial Intelligence, and Data Science content progressions as defined in Sections 3.1, 3.2, 3.3, and 3.5.

Specialty

In a world that is increasingly informed and driven by data, it is necessary to understand data: where it comes from, how it is leveraged, and how it can impact life and work. Data Science and Analytics is a first in-depth course in which students investigate the various ways that data can be stored, accessed, modified, and visualized. Students will examine the impacts and ethical issues related to ownership and bias in data, as well as how data visualizations can be misleading. While this course focuses on the computer science context, data science is increasingly interdisciplinary, and students will be afforded opportunities to apply analysis and visualization techniques in fields and topics of personal interest. Content covered in this course aligns with specialty content from the Data Science content progression as defined in Section 3.5.

Advanced Application

The Pathway Capstone Course is an opportunity for students to apply advanced computer science knowledge and problem-solving, communication, and collaboration skills to tackle a personally meaningful computing project. Students will design innovative solutions and present them to authentic audiences, preparing them for future academic and professional pursuits. This course is designed to inspire creativity, foster collaboration, and demonstrate proficiency in real-world application of the knowledge, skills, and dispositions developed during prior coursework and experiences.

View the Implementation and Integrating CS pages to learn more about how to teach foundational and specialty content to students.

Possible Careers:

Data Scientist, Data Security Analyst, Data Privacy Specialist, Data Ethicist, Data Modeler, Statistician
Reimagining CS Pathways: High School and Beyond