Python for Data Science

About this Course

Course Overview:

This course is designed to provide participants with the fundamental skills and knowledge required to start a career in data science using Python. Participants will learn to manipulate and analyze data using Python libraries such as NumPy, Pandas, and Matplotlib. By the end of the course, participants will be able to perform basic data analysis tasks and visualize data effectively.

Prerequisites:

  • Basic programming knowledge
  • Familiarity with mathematics and statistics concepts (desirable but not mandatory)

Evaluation:

  • Weekly assignments and quizzes
  • Mid-term project (data analysis and visualization)
  • Final capstone project (data analysis, visualization, and presentation)

Resources:

  • Textbook: "Python for Data Analysis" by Wes McKinney
  • Online resources: Documentation of NumPy, Pandas, Matplotlib, and scikit-learn
  • Additional readings and tutorials provided by the instructor

Course Outline

  • Introduction to Data Science and Python
  • Setting up the Python environment (Anaconda, Jupyter Notebook)
  • Python basics: variables, data types, operators, control flow
  • Introduction to lists and tuples
  • Using sets
  • Using dictionaries
  • Introduction to NumPy arrays
  • Array creation and manipulation
  • Array indexing and slicing
  • Array operations and functions
  • Introduction to Pandas DataFrame and Series
  • Reading and writing data using Pandas
  • Data manipulation with Pandas
  • Data cleaning and preprocessing
  • Introduction to Matplotlib
  • Basic plots: line plot, scatter plot, bar plot
  • Customizing plots: labels, titles, colors
  • Multiple plots and subplots
  • Understanding the dataset
  • Descriptive statistics
  • Data summarization and aggregation
  • Data visualization for exploratory data analysis (EDA)
  • Handling missing values
  • Data transformation: normalization, scaling
  • Data merging and joining
  • Data reshaping
  • Basics of machine learning
  • Supervised vs. unsupervised learning
  • Introduction to the scikit-learn library
  • Applying learned concepts to a real-world dataset
  • Data analysis and visualization
  • Presenting findings and insights
  • Presentation and evaluation of a project that solves a real-life problem