Python Libraries for Data Science

Objectives

Be able to choose the appropriate Python libraries for data projects

Master NumPy, Pandas, Matplotlib, Seaborn and Plotly

Work independently on data analysis, data cleaning and visualization


Target Audience

Data analysts
Python developers

Prerequisites

Knowledge of Python fundamentals (variables, types, loops, conditions, functions, files)

Teaching Methods

Theory: interactive slide presentations
Hands-on practice: individual labs and progressive exercises using real financial datasets
Active learning: collaborative problem-solving
Practice-oriented approach: 30% theory / 70% practice
Course materials provided to participants
Mid-course quiz (20 questions) and final quiz (30 questions) to validate acquired skills

Target Certification: RS6701 — Manipulating, analyzing and visualizing data using Python Data Science modules — CPF eligible

Detailed Program

Day 1 — The Scientific Python Ecosystem & NumPy

Overview of Python Data Science packages
Installing scientific libraries: pip, venv, miniconda, mamba, miniforge, WinPython
Development environments: IPython, Jupyter Notebook, JupyterLab, Spyder, VS Code
Introduction to the NumPy library
Advantages of arrays (performance, data representation)
Creating arrays with array(), zeros(), ones(), full(), arange(), linspace(), logspace()
Matrix multiplication with np.dot and the @ operator
Identity matrix with identity() and eye(), diagonal matrix with diag()
Random initialization using NumPy's random module
Data types and attributes ndim, shape, size, dtype, itemsize, nbytes
Indexing, slicing, advanced indexing and broadcasting
Transposing and reshaping arrays (transpose(), reshape(), np.newaxis)
Concatenating and splitting arrays (concatenate(), vstack(), hstack(), split())
Aggregation and statistical functions: sum(), min(), max(), median(), percentile(), cumsum(), var(), argmin(), argmax()
Boolean masks for extracting information
Loading and saving arrays: loadtxt(), save(), load()
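
To give a flavour of the Day 1 topics, here is a minimal sketch combining array creation, slicing, broadcasting, boolean masks, np.newaxis and the @ operator. The price values, the weights and all variable names are made up for illustration and are not taken from the course materials.

```python
import numpy as np

# Hypothetical closing prices for two assets over four days (made-up values).
prices = np.array([[100.0, 50.0],
                   [102.0, 49.0],
                   [101.0, 51.0],
                   [105.0, 52.0]])

# Basic attributes of the array.
print(prices.ndim, prices.shape, prices.dtype)

# Slicing: daily returns computed as (p[t] - p[t-1]) / p[t-1].
returns = (prices[1:] - prices[:-1]) / prices[:-1]

# Broadcasting: the per-column mean (shape (2,)) is subtracted from each row.
centered = returns - returns.mean(axis=0)

# Boolean mask: days on which the first asset went up.
up_days = returns[:, 0] > 0
n_up = int(up_days.sum())

# np.newaxis is an indexing constant (not a function) that adds an axis.
weights = np.array([0.6, 0.4])
weights_col = weights[:, np.newaxis]   # shape (2, 1)

# Matrix product with the @ operator (equivalent to np.dot here).
portfolio_returns = returns @ weights  # shape (3,)
```

Note that np.newaxis is used inside the square brackets of an index expression, which is why it takes no parentheses.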

Day 2 — Data Manipulation with Pandas

Introduction to the Pandas library
Creating a Series and a DataFrame
Extracting row and column indices (index and columns attributes)
Importing and exporting data (CSV, Excel…)
Data exploration: head(), tail(), info(), describe(), dtypes
Explicit (label-based) and implicit (position-based) indexing with loc and iloc
Advanced selection: boolean expressions, query() method
Concatenating data with concat(), merging and joining with merge() and join()
Missing values: isna(), dropna(), fillna(), interpolate()
Sorting data: sort_index(), sort_values()
Removing data and duplicates: drop(), drop_duplicates()
Aggregation functions: sum(), cumsum(), min(), max(), mean(), median(), var(), std(), quantile()
Grouping and analysis: groupby(), aggregate(), apply(), filter(), transform()
Pivot tables: pivot_table()
Moving averages: rolling(), expanding(), ewm()
Multi-indexing: MultiIndex.from_product(), from_tuples(), from_arrays()
String processing and regular expressions with Pandas
Time series data: to_datetime(), date_range(), asfreq(), resample()
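
As an illustration of the Day 2 topics, the following sketch chains a few of the methods listed above: missing-value handling, groupby(), pivot_table(), loc/iloc and rolling(). The tickers, dates and prices are hypothetical values chosen only to keep the example small.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format price table (made-up tickers and values).
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"] * 2),
    "ticker": ["AAA"] * 3 + ["BBB"] * 3,
    "close": [10.0, np.nan, 12.0, 20.0, 22.0, 21.0],
})

# Missing values: count them, then interpolate linearly within each ticker.
n_missing = int(df["close"].isna().sum())
df["close"] = df.groupby("ticker")["close"].transform(lambda s: s.interpolate())

# Grouping and aggregation: mean and max close per ticker.
stats = df.groupby("ticker")["close"].agg(["mean", "max"])

# Pivot table: one row per date, one column per ticker.
wide = df.pivot_table(index="date", columns="ticker", values="close")

# loc selects by label, iloc selects by integer position.
by_label = wide.loc[pd.Timestamp("2024-01-02"), "AAA"]
by_position = wide.iloc[0, 0]

# Moving average over a 2-row window (the first value is NaN).
rolling_mean = wide["AAA"].rolling(window=2).mean()
```

Interpolating inside groupby().transform() keeps each ticker's series separate, so a gap in one ticker is never filled using another ticker's prices.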

Day 3 — Visualization with Matplotlib & Seaborn

Introduction to Matplotlib: MATLAB-style vs object-oriented approach
Figure and Axes objects
Plotting curves with plot(): color, style, width, title, legend
Scatter plots with scatter()
Error bars with errorbar()
Area filling with fill_between()
Histograms with hist()
Multiple charts with subplots() and 3D plots with mplot3d
Pandas plotting: plot(), bar(), barh(), hist(), box(), scatter(), pie()
Introduction to Seaborn: Figure-level API and Axes-level API
Relational plots: relplot(), lineplot(), scatterplot()
Distributions: displot(), histplot(), jointplot(), pairplot()
Categorical data: catplot(), barplot(), countplot(), boxplot(), violinplot()
Heatmaps: heatmap()
Linear regression models: lmplot()
Customization: set_theme(), set_style(), set_context(), despine()
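
The object-oriented Matplotlib approach mentioned at the start of Day 3 can be sketched as follows, using Figure and Axes objects with subplots(), plot(), fill_between() and hist(). The sine data and the variable names are illustrative assumptions; the Agg backend is selected only so the sketch runs without a display, and is not needed in a notebook.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here for headless runs
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data: one period of a sine curve.
x = np.linspace(0, 2 * np.pi, 50)
y = np.sin(x)

# One Figure holding two Axes side by side.
fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

# Line plot with color, style, width, title and legend.
ax_line.plot(x, y, color="tab:blue", linestyle="--", linewidth=2, label="sin(x)")
ax_line.fill_between(x, y, 0, alpha=0.2)  # shade the area between the curve and 0
ax_line.set_title("Line plot")
ax_line.legend()

# Histogram of the same values.
ax_hist.hist(y, bins=10)
ax_hist.set_title("Histogram")

fig.tight_layout()

# Render to an in-memory PNG instead of writing a file to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

Every call above goes through an explicit Axes object, which is what distinguishes this style from the MATLAB-style interface where plt.plot() acts on an implicit current axes.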

Day 4 — Interactive Visualization with Plotly

Introduction to the Plotly library and Kaleido: exploring Plotly Express
Plotting curves with line(): customizing figures with title, width, height, marker, labels
Adding information: hover_data, hover_name, text
Multiple charts: facet_row, facet_col
Style customization: template option and default themes
Area charts with area(): adding patterns with pattern_shape
Scatter plots with scatter(): using size, size_max, opacity, symbol
Color bars: color_continuous_scale, update_layout(), update_coloraxes()
Formatting bar charts with bar() and histograms with histogram()
3D charts with scatter_3d() and line_3d()
Mapping data with line_map(), scatter_map(), line_geo(), scatter_geo(), and choropleth()
