Data Science with Python

Last updated on Oct 1, 2021

⚠️ WORK IN PROGRESS! ⚠️

Welcome to my Data Science with Python course!

You can find all the Jupyter notebooks on my GitHub page here.

If you find any typos or mistakes, please open a new issue or, even better, fork the repo and submit a pull request. I am happy to share my work, and even happier if it is useful.

Content

  1. Data Structures
    • Lists
    • Tuples
    • Sets
    • Dictionaries
    • NumPy arrays
    • Pandas DataFrames
    • PySpark DataFrames
  2. Data Exploration
    • Import and export data
    • Descriptive and summary statistics
    • Pivot tables and aggregation
  3. Data Types
    • Numerical data
    • String data
    • Time data
    • Missing data
  4. Data Wrangling
    • Rows: sorting, indexing, …
    • Columns: renaming, ordering, …
    • Collapse and aggregate
    • Reshape
    • Concatenate and merge
  5. Plotting
    • Distributions
    • Time Series
    • Correlations
    • Regression
    • Geographical data
  6. Machine Learning Pipeline
    • Data exploration
    • Encoding and normalization
    • Missing values
    • Weighting
    • Prediction
    • Cross-validation
  7. Web Scraping
    • Pandas
    • APIs
    • Static web scraping
    • Dynamic web scraping
  8. TBD
    • What is missing? Let me know!

Contacts

All feedback is greatly appreciated!