Repository: ossu/data-science
Branch: master
Commit: ccb52fe046fe
Files: 10
Total size: 29.7 KB
Directory structure:
gitextract_cch89h3k/
├── .github/
│   ├── ISSUE_TEMPLATE/
│   │   └── request-for-comment-template.md
│   └── workflows/
│       └── delete-empty-issues.yml
├── CNAME
├── LICENSE.md
├── README.md
├── coursepages/
│   ├── intro-cs/
│   │   └── README.md
│   └── intro-programming/
│       └── README.md
└── extras/
    ├── books.md
    ├── courses.md
    └── specializations.md
================================================
FILE CONTENTS
================================================
================================================
FILE: .github/ISSUE_TEMPLATE/request-for-comment-template.md
================================================
---
name: Request for Comment Template
about: Template for creating an RFC to modify the curriculum
title: 'RFC: '
labels: ''
assignees: ''
---
**Problem:**
Give a 1 sentence description of a problem with the current OSSU Curriculum. Successful critiques of the curriculum will point out ways that OSSU is failing to uphold [our curricular guidelines](https://github.com/ossu/data-science#curricular-guideline). Examples are:
* OSSU lists course X as required when the course's topics are elective in our curricular guidelines.
* OSSU does not have a course to cover required topic X from our curricular guidelines.
* OSSU lists courses X, Y and Z that cover the same topics when fewer courses could suffice.
* OSSU recommends course X to teach a topic, but there exists a higher quality course that covers the same material.
**Duration:**
This should most often be 1 month from the date of posting.
**Background:**
Give an in depth description of the problem. Describe a solution to the problem. Describe the advantages and disadvantages of this solution. This section should be a few paragraphs.
**Proposal:**
Give a bullet point list of changes that are being proposed. These can link to a Pull Request.
**Alternatives:**
Give a bullet point list of alternative ways to address the problem.
================================================
FILE: .github/workflows/delete-empty-issues.yml
================================================
name: Delete empty issues
on:
  issues:
    types:
      - opened
jobs:
  label_issues:
    runs-on: ubuntu-latest
    permissions:
      issues: write
    if: github.event.issue.body == '' || contains(github.event.issue.body, 'Give a 1 sentence description of a problem with the current OSSU Curriculum. Successful critiques of the curriculum will point out ways that OSSU is failing to uphold')
    steps:
      - name: Create comment
        uses: actions-cool/issues-helper@v3
        with:
          actions: 'create-comment'
          token: ${{ secrets.GITHUB_TOKEN }}
          issue-number: ${{ github.event.issue.number }}
          body: |
            Hello @${{ github.event.issue.user.login }}.
            It looks like you've opened an empty issue or one without a unique problem description.
            Please understand that this is a popular project, useful to many learners, and empty issues distract maintainers that are trying to help others.
            If you would like practice with issues, you can follow github documentation to create your own repo:
            https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-new-repository
            And then in that repo practice creating and editing issues:
            https://docs.github.com/en/issues/tracking-your-work-with-issues/configuring-issues/quickstart
            We look forward to your future contributions to OSSU, when you are contributing to improve computer science education for learners all over the world!
      - name: Close issue
        uses: actions-cool/issues-helper@v3
        with:
          actions: 'close-issue'
          token: ${{ secrets.GITHUB_TOKEN }}
          issue-number: ${{ github.event.issue.number }}
================================================
FILE: CNAME
================================================
ds.ossu.dev
================================================
FILE: LICENSE.md
================================================
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.
================================================
FILE: README.md
================================================
<div align="center" style="text-align: center">
<img src="http://i.imgur.com/kYYCXtC.png" alt="Open Source Society logo"/>
<h3>Open Source Society University</h3>
<p>
:bar_chart: Path to a free self-taught education in <strong>Data Science</strong>!
</p>
<p>
<a href="https://github.com/open-source-society/data-science">
<img alt="Open Source Society University - Data Science" src="https://img.shields.io/badge/OSSU-data--science-blue.svg">
</a>
</p>
</div>
## Contents
- [About](#about)
- [Curricular Guideline](#curricular-guideline)
- [How to use this guide](#how-to-use-this-guide)
- [Community](#community)
- [Prerequisites](#prerequisites)
- [Curriculum](#curriculum)
- [How to contribute](#how-to-contribute)
- [Code of conduct](#code-of-conduct)
- [Team](#team)
## About
This is a path for those of you who want to complete the Data Science undergraduate curriculum on your own time, **for free**, with courses from the **best universities** in the world.
In our curriculum, we give preference to MOOC (Massive Open Online Course) style courses because these courses were created with our style of learning in mind.
## Curricular Guideline
OSSU Data Science uses the report [Curriculum Guidelines for Undergraduate Programs in Data Science](https://www.amstat.org/asa/files/pdfs/EDU-DataScienceGuidelines.pdf) as our guide for course recommendation.
## How to use this guide
### Duration
It is possible to finish within about 2 years if you plan carefully and devote roughly 20 hours/week to your studies. Learners can use [this spreadsheet](https://docs.google.com/spreadsheets/d/1TEGSUQDFuWL3TYNjiM8G3esly-tKOcgHSDABt92mzdA/copy) to estimate their end date. Make a copy and input your start date and expected hours per week in the `Timeline` sheet. As you work through courses you can enter your actual course completion dates in the Curriculum Data sheet and get updated completion estimates.
> **Warning:** While the spreadsheet is a useful tool to estimate the time you need to complete this curriculum, it may not be up-to-date with the curriculum. Use the spreadsheet only to estimate the time you need. Use [the GitHub repo](https://github.com/ossu/data-science) to see which courses to take.
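
For readers who prefer code to a spreadsheet, the same arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_end_date` and the total of 2080 hours are assumptions (the total is derived from the "about 2 years at roughly 20 hours/week" figure above) and are not taken from the spreadsheet itself.

```python
import math
from datetime import date, timedelta


def estimate_end_date(start: date, total_hours: float, hours_per_week: float) -> date:
    """Estimate a completion date from a start date and a weekly study pace."""
    if hours_per_week <= 0:
        raise ValueError("hours_per_week must be positive")
    # Round up so that a partially used final week still counts as a week.
    weeks_needed = math.ceil(total_hours / hours_per_week)
    return start + timedelta(weeks=weeks_needed)


# Assumption: 2 years x 52 weeks x 20 hours/week, per the estimate above.
ASSUMED_TOTAL_HOURS = 2 * 52 * 20

if __name__ == "__main__":
    start = date(2025, 1, 6)  # hypothetical start date
    for pace in (20, 10):
        end = estimate_end_date(start, ASSUMED_TOTAL_HOURS, pace)
        print(f"{pace} hours/week -> estimated finish on {end.isoformat()}")
```

Halving the weekly hours doubles the number of weeks, so it is worth re-running the estimate whenever your schedule changes.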
### Order of the classes
Some courses can be taken in parallel, while others must be taken sequentially. All of the courses within a topic should be taken in the order listed in the curriculum. The graph below demonstrates how topics should be ordered.
<img src="./topic_progression_graph.jpg" width="300" alt="Topic Progression Graph" />
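
One way to turn a prerequisite graph into a study plan is a topological sort, which groups topics into stages whose members can be taken in parallel. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The prerequisite edges in it are illustrative assumptions and are not a transcription of the progression graph, so replace them with the edges from the graph before relying on the result.

```python
from graphlib import TopologicalSorter

# Hypothetical prerequisite map: topic -> topics that must be finished first.
# These edges are assumptions for illustration only, NOT the official graph.
prerequisites = {
    "Multivariable Calculus": {"Single Variable Calculus"},
    "Statistics & Probability": {"Single Variable Calculus"},
    "Data Structures and Algorithms": {"Introduction to Computer Science"},
    "Machine Learning/Data Mining": {"Linear Algebra", "Statistics & Probability"},
}


def study_stages(prereqs):
    """Group topics into stages; topics within a stage can be taken in parallel."""
    sorter = TopologicalSorter(prereqs)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # topics whose prerequisites are done
        stages.append(ready)
        sorter.done(*ready)
    return stages


if __name__ == "__main__":
    for number, stage in enumerate(study_stages(prerequisites), start=1):
        print(f"Stage {number}: {', '.join(stage)}")
```

Within a topic, the courses should still be taken in the order listed in the curriculum.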
### Track your progress
[Fork](https://www.freecodecamp.org/news/how-to-fork-a-github-repository/) the [GitHub repo](https://github.com/ossu/data-science) into your own GitHub account and put ✅ next to the stuff you've completed as you complete it. This can serve as your [kanban board](https://en.wikipedia.org/wiki/Kanban_board) and will be faster to implement than any other solution (giving you time to spend on the courses).
### Which programming languages should I use?
Python and R are heavily used in the Data Science community, and our courses teach you both. Remember, the important thing for each course is to internalize the core concepts and to be able to use them with whatever tool (programming language) you wish.
### Content Policy
Share only files that you are allowed to share. **Do NOT disrespect the code of conduct** that you sign at the beginning of your courses.
## Community
We have a Discord server! This should be your first stop to talk with other OSSU students. [Why don't you introduce yourself right now?](https://discord.gg/wuytwK5s9h)
You can also interact through [GitHub issues](https://github.com/open-source-society/data-science/issues).
Add **Open Source Society University** to your [LinkedIn](https://www.linkedin.com/school/11272443/) profile!
> **Warning:** You may find third-party, deprecated, or outdated materials when searching for OSSU. We recommend that you ignore them and use only the [OSSU Data Science GitHub Repo](https://github.com/ossu/data-science). Some known outdated materials are:
> - An unmaintained and deprecated trello board
> - Third-party notion templates
## Prerequisites
The Data Science curriculum assumes the student has taken [high school math](https://ossu.dev/precollege-math) and [statistics](https://www.khanacademy.org/math/probability).
## Curriculum
- [Introduction to Data Science](#introduction-to-data-science)
- [Introduction to Computer Science](#introduction-to-computer-science)
- [Data Structures and Algorithms](#data-structures-and-algorithms)
- [Databases](#databases)
- [Single Variable Calculus](#single-variable-calculus)
- [Linear Algebra](#linear-algebra)
- [Multivariable Calculus](#multivariable-calculus)
- [Statistics & Probability](#statistics--probability)
- [Data Science Tools & Methods](#data-science-tools--methods)
- [Machine Learning/Data Mining](#machine-learningdata-mining)
- [Final project](#final-project)
### Introduction to Data Science
[What is Data Science](https://www.coursera.org/learn/what-is-datascience)
### Introduction to Computer Science
_Students who already know basic programming in any language can skip this first course_
[Introduction to programming](coursepages/intro-programming/README.md)
[Introduction to Computer Science and Programming Using Python](coursepages/intro-cs/README.md)
[Introduction to Computational Thinking and Data Science](https://ocw.mit.edu/courses/6-0002-introduction-to-computational-thinking-and-data-science-fall-2016/)
### Data Structures and Algorithms
_The Algorithms courses are taught in Java. If students need to learn Java, they should take this course first_
[Java Programming](https://java-programming.mooc.fi/)
[Algorithms I: ArrayLists, LinkedLists, Stacks and Queues](https://www.edx.org/learn/data-structures/the-georgia-institute-of-technology-data-structures-algorithms-i-arraylists-linkedlists-stacks-and-queues)
[Algorithms II: Binary Trees, Heaps, SkipLists and HashMaps](https://www.edx.org/learn/data-structures/the-georgia-institute-of-technology-data-structures-algorithms-ii-binary-trees-heaps-skiplists-and-hashmaps)
[Algorithms III: AVL and 2-4 Trees, Divide and Conquer Algorithms](https://www.edx.org/learn/data-structures/the-georgia-institute-of-technology-data-structures-algorithms-iii-avl-and-2-4-trees-divide-and-conquer-algorithms)
[Algorithms IV: Pattern Matching, Dijkstra’s, MST, and Dynamic Programming Algorithms](https://www.edx.org/learn/data-structures/the-georgia-institute-of-technology-data-structures-algorithms-iv-pattern-matching-dijkstras-mst-and-dynamic-programming-algorithms)
### Databases
[Database Management Essentials](https://www.coursera.org/learn/database-management)
[Data Warehouse Concepts, Design, and Data Integration](https://www.coursera.org/learn/dwdesign)
[Relational Database Support for Data Warehouses](https://www.coursera.org/learn/dwrelational)
[Business Intelligence Concepts, Tools, and Applications](https://www.coursera.org/learn/business-intelligence-tools)
[Design and Build a Data Warehouse for Business Intelligence Implementation](https://www.coursera.org/learn/data-warehouse-bi-building)
[MongoDB for Developers Learning Path](https://learn.mongodb.com/pages/mongodb-developer-learning-paths)
### Single Variable Calculus
[Calculus 1A: Differentiation](https://mitxonline.mit.edu/courses/course-v1:MITxT+18.01.1x/)
[Calculus 1B: Integration](https://mitxonline.mit.edu/courses/course-v1:MITxT+18.01.2x/)
[Calculus 1C: Coordinate Systems & Infinite Series](https://mitxonline.mit.edu/courses/course-v1:MITxT+18.01.3x/)
### Linear Algebra
[Essence of Linear Algebra](https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab)
[Linear Algebra](https://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/)
### Multivariable Calculus
[Multivariable Calculus](http://ocw.mit.edu/courses/mathematics/18-02sc-multivariable-calculus-fall-2010/index.htm)
### Statistics & Probability
[Introduction to Probability](https://projects.iq.harvard.edu/stat110/home)
[Intro to Descriptive Statistics](https://www.udacity.com/course/intro-to-descriptive-statistics--ud827)
[Intro to Inferential Statistics](https://www.udacity.com/course/intro-to-inferential-statistics--ud201)
[Statistical Learning with Python by Stanford University on EdX](https://www.edx.org/learn/python/stanford-university-statistical-learning-with-python) ([Textbook](https://hastie.su.domains/ISLP/ISLP_website.pdf.download.html), [Textbook resources](https://www.statlearning.com/resources-python)) or [Statistical Learning With R by Stanford University on EdX](https://www.edx.org/learn/statistics/stanford-university-statistical-learning) ([Textbook](https://hastie.su.domains/ISLR2/ISLRv2_corrected_June_2023.pdf.download.html), [Textbook resources](https://www.statlearning.com/resources-second-edition))
### Data Science Tools & Methods
[Tools for Data Science](https://www.coursera.org/learn/open-source-tools-for-data-science)
[Data Science Methodology](https://www.coursera.org/learn/data-science-methodology)
[Data Science: Wrangling](https://www.edx.org/course/data-science-wrangling)
### Machine Learning/Data Mining
[Supervised Machine Learning: Regression and Classification](https://www.coursera.org/learn/machine-learning)
[Advanced Learning Algorithms](https://www.coursera.org/learn/advanced-learning-algorithms)
[Unsupervised Learning, Recommenders, Reinforcement Learning](https://www.coursera.org/learn/unsupervised-learning-recommenders-reinforcement-learning)
[Intro to Machine Learning](https://www.udacity.com/course/intro-to-machine-learning--ud120)
[Mining Massive Datasets](https://www.edx.org/course/mining-massive-datasets)
[Process Mining](https://www.coursera.org/learn/process-mining)
### Final project
Part of learning is doing.
The assignments and exams for each course are to prepare you to use your knowledge to solve real-world problems.
After you've completed the curriculum,
you should identify a problem that you can solve using the knowledge you've acquired.
You can create something entirely new, or you can improve some tool/program that you use and wish were better.
Students who would like more guidance in creating a project may choose to use a series of project oriented courses.
A sample of options
(many more are available, at this point you should be capable of identifying a series that is interesting and relevant to you)
are available on [this page](extras/specializations.md).
### Congratulations
After completing the requirements of the curriculum above,
you will have completed the equivalent of a full bachelor's degree in Data Science.
Congratulations!
What is next for you? The possibilities are boundless and overlapping:
- Look for a job as a data scientist!
- Check out the [readings](extras/books.md) for classic books you can read that will sharpen your skills and expand your knowledge.
- Join a local data science meetup (e.g. via [meetup.com](https://www.meetup.com/)).
- Pay attention to emerging technologies in the world of data science.

## How to contribute
You can [open an issue](https://help.github.com/articles/creating-an-issue/) and give us your suggestions as to how we can improve this guide, or what we can do to improve the learning experience.
You can also [fork this project](https://help.github.com/articles/fork-a-repo/) and send a [pull request](https://help.github.com/articles/using-pull-requests/) to fix any mistakes that you have found.
If you want to suggest a new resource, send a pull request adding that resource to the [extras](https://github.com/open-source-society/data-science/tree/master/extras) section. The **extras** section is a place where all of us can submit interesting additional articles, books, courses and specializations.
## Code of Conduct
[OSSU's code of conduct](https://github.com/ossu/code-of-conduct).
## Team
* **Curriculum Maintainer**: [Waciuma Wanjohi](https://github.com/waciumawanjohi)
* **Contributors**: [contributors](https://github.com/open-source-society/data-science/graphs/contributors)
================================================
FILE: coursepages/intro-cs/README.md
================================================
# Introduction to Computer Science
This course will introduce you to the world of computer science. Students who have been introduced to programming, either from the courses above or through study elsewhere, should take this course for a flavor of the material to come. If you finish the course wanting more, Computer Science is likely for you!
This course has been developed by MIT and is available from three different places. We recommend taking the archived version on edX.
> 6.0001 Introduction to Computer Science and Programming in Python is intended for students with little or no programming experience. It aims to provide students with an understanding of the role computation can play in solving problems and to help students, regardless of their major, feel justifiably confident of their ability to write small programs that allow them to accomplish useful goals. The class uses the Python 3.5 programming language.
**Course Link:** <https://learning.edx.org/course/course-v1:MITx+6.00.1x+2T2018/home>
Alternative Links:
- <https://ocw.mit.edu/courses/6-0001-introduction-to-computer-science-and-programming-in-python-fall-2016/>
- <https://www.edx.org/course/introduction-to-computer-science-and-programming-7> (instructor-paced version, runs three times a year)
## Instructions
**Note:** These instructions are for the archived version of the course on Edx, which we recommend. They don't apply to other versions of the course.
- The course does not have a homepage on Edx, but don't worry about it. Open the [link](https://learning.edx.org/course/course-v1:MITx+6.00.1x+2T2018/home) given above, log in (if you are not logged in) and then enroll in the course.
- Work through the course as given in the course overview. Watch the videos, do the finger exercises, and then solve the problem sets.
- You won't be able to submit your responses for the finger exercises, but you can see their answers by clicking on "Show Answer". Check your answers honestly.
- You won't be able to submit the problem sets on their own page. To submit them, go to the "Sandbox" section (It is the last section. You can find it on the course overview). There, you will be able to submit your work and get it graded.
- You don't need to install the full Anaconda distribution to do this course. See the notes section below for more information.
- If you are stuck somewhere, feel free to ask questions. You can join the OSSU chat for this course here: <https://discord.gg/jvchSm9>.
## Notes
- You don't need to install the full Anaconda distribution to do this course. You can just download the Spyder IDE from here: <https://github.com/spyder-ide/spyder/releases/latest>. It comes bundled with Python as well as some popular scientific Python libraries (all the libraries that this course uses are included), but it is not as large or complex as the full Anaconda distribution. You don't need to set up Python separately.
- The community has found this resource useful: <https://www.youtube.com/playlist?list=PL4e66Kzl1JCFPVBa7gBzWJF_FDF3KBf-2>
- You won't get a certificate for doing this course. If you really want one, you need to do the [instructor-paced version of this course](https://www.edx.org/course/introduction-to-computer-science-and-programming-7) on edX. A certificate for an introductory course like this is not very valuable, so unless you are absolutely sure you need it, we recommend the archived version of this course instead.
- If for some reason you want to do the OCW version of the course, you will find many useful notes and fixes of various problems in our [discord server](https://discord.gg/jvchSm9).
================================================
FILE: coursepages/intro-programming/README.md
================================================
# Introduction to Programming
If you've never written a for-loop, or don't know what a string is in programming, start here. These courses are self-paced, allowing you to adjust the number of hours you spend per week to meet your needs.
We are currently looking for volunteers to try out both of the following courses and analyze them in different ways to determine which one is better suited to our curriculum. We suggest that you flip a coin to decide which one to take first, so that you avoid an ordering bias. Once you have completed both courses, please post your analysis in [this RFC](https://github.com/ossu/computer-science/issues/1164) (the RFC is from the Computer Science curriculum, but it also applies to this one).
If you don't have time or do not want to volunteer, you are required to do **only ONE** of the following courses.
## CS50P: Introduction to Programming with Python
This course has been developed by the CS50 team at Harvard University.
> An introduction to programming using a language called Python. Learn how to read and write code as well as how to test and "debug" it. Designed for students with or without prior programming experience who'd like to learn Python specifically. Learn about functions, arguments, and return values (oh my!); variables and types; conditionals and Boolean expressions; and loops. Learn how to handle exceptions, find and fix bugs, and write unit tests; use third-party libraries; validate and extract data with regular expressions; model real-world entities with classes, objects, methods, and properties; and read and write files. Hands-on opportunities for lots of practice. Exercises inspired by real-world programming problems. No software required except for a web browser, or you can write code on your own PC or Mac.
**Link**: <https://cs50.harvard.edu/python/>
**Note**: This course is *different* from CS50 or CS50x. CS50 is not part of the OSSU curriculum. That being said, if you have completed CS50, you can skip this course and move on to the next one.
### Instructions
- If you want to follow along with the instructor, log in to the [CS50 "codespace"](https://cs50.dev) and watch [this video](https://cs50.harvard.edu/python/2022/shorts/visual_studio_code_for_cs50/) to get started.
- Watch each lecture and complete the respective problem set. Read the lecture notes to revise things.
- If you are stuck somewhere, feel free to ask questions. You can join the OSSU chat for this course here: <https://discord.gg/cBkssaJy5g>.
- You can also join the CS50 discord server and ask questions there: <https://discord.gg/cs50>, but note that it is not affiliated with or maintained by OSSU.
### Course Materials
0. [Functions, Variables](https://cs50.harvard.edu/python/2022/weeks/0/) — [Notes](https://cs50.harvard.edu/python/2022/notes/0/) — [Problem Set](https://cs50.harvard.edu/python/2022/psets/0/)
1. [Conditionals](https://cs50.harvard.edu/python/2022/weeks/1/) — [Notes](https://cs50.harvard.edu/python/2022/notes/1/) — [Problem Set](https://cs50.harvard.edu/python/2022/psets/1/)
2. [Loops](https://cs50.harvard.edu/python/2022/weeks/2/) — [Notes](https://cs50.harvard.edu/python/2022/notes/2/) — [Problem Set](https://cs50.harvard.edu/python/2022/psets/2/)
3. [Exceptions](https://cs50.harvard.edu/python/2022/weeks/3/) — [Notes](https://cs50.harvard.edu/python/2022/notes/3/) — [Problem Set](https://cs50.harvard.edu/python/2022/psets/3/)
4. [Libraries](https://cs50.harvard.edu/python/2022/weeks/4/) — [Notes](https://cs50.harvard.edu/python/2022/notes/4/) — [Problem Set](https://cs50.harvard.edu/python/2022/psets/4/)
5. [Unit Tests](https://cs50.harvard.edu/python/2022/weeks/5/) — [Notes](https://cs50.harvard.edu/python/2022/notes/5/) — [Problem Set](https://cs50.harvard.edu/python/2022/psets/5/)
6. [File I/O](https://cs50.harvard.edu/python/2022/weeks/6/) — [Notes](https://cs50.harvard.edu/python/2022/notes/6/) — [Problem Set](https://cs50.harvard.edu/python/2022/psets/6/)
7. [Regular Expressions](https://cs50.harvard.edu/python/2022/weeks/7/) — [Notes](https://cs50.harvard.edu/python/2022/notes/7/) — [Problem Set](https://cs50.harvard.edu/python/2022/psets/7/)
8. [Object-Oriented Programming](https://cs50.harvard.edu/python/2022/weeks/8/) — [Notes](https://cs50.harvard.edu/python/2022/notes/8/) — [Problem Set](https://cs50.harvard.edu/python/2022/psets/8/)
9. [Et Cetera](https://cs50.harvard.edu/python/2022/weeks/9/) — [Notes](https://cs50.harvard.edu/python/2022/notes/9/) — [Final Project](https://cs50.harvard.edu/python/2022/project/)
## Python for Everybody
This course has been created by Professor Charles Severance from the University of Michigan.
> Learn to Program and Analyze Data with Python. Develop programs to gather, clean, analyze, and visualize data.
**Link**: <https://www.py4e.com/lessons>
**Textbook**: [PDF](http://do1.dr-chuck.com/pythonlearn/EN_us/pythonlearn.pdf) / [EPUB](http://do1.dr-chuck.com/pythonlearn/EN_us/pythonlearn.epub) / [HTML](https://www.py4e.com/html3) / [Buy hardcopy](https://www.py4e.com/book)
**Note**: This course is also offered on Coursera and edX. Those versions require you to pay to get the full version of the course. We suggest doing the course on its website, which is completely free.
### Instructions
- You need to [sign in](https://www.py4e.com/) to the course website using your Google account to access the assignments.
- Watch all the videos of a lesson and then do its assignments.
- If you prefer reading books, you can read the HTML version of the chapter related to the lesson linked on the lesson's page, or you can download the whole book in different formats from [this page](https://www.py4e.com/book).
- If you face any problems, feel free to ask questions. You can join the OSSU chat for this course here: <https://discord.gg/syA242Z>.
- You only need to complete the course up to the Regular Expressions lesson. The rest of the course is optional.
### Course Materials
1. [Installing Python](https://www.py4e.com/lessons/install)
2. [Why Program?](https://www.py4e.com/lessons/intro)
3. [Variables, expressions and statements](https://www.py4e.com/lessons/memory)
4. [Conditional Execution](https://www.py4e.com/lessons/logic)
5. [Functions](https://www.py4e.com/lessons/functions)
6. [Loops and Iterations](https://www.py4e.com/lessons/loops)
7. [Strings](https://www.py4e.com/lessons/strings)
8. [Files](https://www.py4e.com/lessons/files)
9. [Lists](https://www.py4e.com/lessons/lists)
10. [Dictionaries](https://www.py4e.com/lessons/dictionary)
11. [Tuples](https://www.py4e.com/lessons/tuples)
12. [Regular Expressions](https://www.py4e.com/lessons/regex)
13. [Network Programming](https://www.py4e.com/lessons/network) (Optional)
14. [Using Web Services](https://www.py4e.com/lessons/servces) (Optional)
15. [Object-Oriented Programming](https://www.py4e.com/lessons/Objects) (Optional)
16. [Databases](https://www.py4e.com/lessons/database) (Optional)
17. [Data Visualization](https://www.py4e.com/lessons/dataviz) (Optional)
### Fixes
1. If you're doing the BeautifulSoup4 lesson, there is an issue with Python 3.10+ that will give you an error referencing the Collections library. We have a fix for you. We don't expect you to understand it, just put this in front of your code in the imports block:
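```python
import collections
import collections.abc  # make sure the abc submodule is loaded

# Restore the alias that Python 3.10 removed, so the lesson's code still runs
collections.Callable = collections.abc.Callable
from bs4 import BeautifulSoup
```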
Doing this should fix the compatibility issue and allow your code to run.
================================================
FILE: extras/books.md
================================================
# Data Science - Extra Resources
## Books
- [Python](#python)
- [Data Analysis](#data-analysis)
- [Data Visualization](#data-visualization)
- [Web Scraping](#web-scraping)
- [Databases and SQL](#databases-and-sql)
- [Statistics](#statistics)
- [Linear Algebra](#linear-algebra)
- [Machine Learning](#machine-learning)
- [Data Science](#data-science)
- [Big Data](#big-data)
---
### Python
Name | Author | ISBN
:-- | :--: | :--:
### Data Analysis
Name | Author | ISBN
:-- | :--: | :--:
### Data Visualization
Name | Author | ISBN
:-- | :--: | :--:
### Web Scraping
Name | Author | ISBN
:-- | :--: | :--:
### Databases and SQL
Name | Author | ISBN
:-- | :--: | :--:
### Statistics
Name | Author | ISBN
:-- | :--: | :--:
### Linear Algebra
Name | Author | ISBN
:-- | :--: | :--:
================================================
FILE: extras/courses.md
================================================
# Data Science - Extra Resources
## Courses
- [Statistics](#statistics)
---
### Statistics
Courses | Duration | Effort
:-- | :--: | :--:
[Intro to Statistics](https://www.udacity.com/course/intro-to-statistics--st101)| 8 weeks | 6 hours/week
[Basic Statistics](https://www.coursera.org/learn/basic-statistics)| 8 weeks | 3 hours/week
[Bayesian Statistics](https://www.coursera.org/learn/bayesian)| 5 weeks | 5-7 hours/week
================================================
FILE: extras/specializations.md
================================================
# Data Science - Specializations
## Specializations
* [Udacity](#udacity)
  * [Machine Learning Nanodegree by Google](#machine-learning-nanodegree-by-google)
  * [Data Scientist Nanodegree](#data-scientist-nanodegree)
* [edX](#edx)
  * [Data Science and Engineering with Apache Spark](#data-science-and-engineering-with-apache-spark)
* [Coursera](#coursera)
  * [Data Mining Specialization](#data-mining-specialization)
  * [Machine Learning Specialization](#machine-learning-specialization)
  * [Data Science Specialization](#data-science-specialization)
* [FutureLearn](#futurelearn)
---
### Udacity
#### Machine Learning Nanodegree by Google
Course | Duration | Effort
:-- | :--: | :--:
[Machine Learning Engineer Nanodegree](https://www.udacity.com/course/machine-learning-engineer-nanodegree--nd009)| - weeks | 10 hours/week
#### Data Scientist Nanodegree
Course | Duration | Effort
:-- | :--: | :--:
[Data Analyst Nanodegree](https://www.udacity.com/course/data-analyst-nanodegree--nd002)| - weeks | 10 hours/week
### edX
#### Data Science and Engineering with Apache Spark
Course | Duration | Effort
:-- | :--: | :--:
[Data Science and Engineering with Apache Spark XSeries](https://www.edx.org/xseries/data-science-engineering-apache-spark)| - weeks | 10 hours/week
### Coursera
#### Data Mining Specialization
Course | Duration | Effort
:-- | :--: | :--:
[Data Mining](https://www.coursera.org/specializations/data-mining)| - weeks | 8-12 hours/week
#### Machine Learning Specialization
Course | Duration | Effort
:-- | :--: | :--:
[Machine Learning](https://www.coursera.org/specializations/machine-learning)| - weeks | 8-12 hours/week
#### Data Science Specialization
Course | Duration | Effort
:-- | :--: | :--:
[Statistics with R](https://www.coursera.org/specializations/statistics)| - weeks | - hours/week
[Data Science at Scale](https://www.coursera.org/specializations/data-science)| 17 weeks | 6-8 hours/week
[Data Science](https://www.coursera.org/specializations/jhu-data-science) | - weeks | 4-9 hours/week
### FutureLearn
#### Big Data
Course | Duration | Effort
:-- | :--: | :--:
[Big Data Analytics](https://www.futurelearn.com/programs/big-data-analytics)| 8 weeks | - hours/week