stefmolin/Hands-On-Data-Analysis-with-Pandas

Hands-On Data Analysis with Pandas

This is the code repository for my book Hands-On Data Analysis with Pandas, published by Packt.

Book Description

Data analysis has become a necessary skill in a variety of positions where knowing how to work with data and extract insights can generate great value.

This book will show you how to analyze your data and get started with machine learning in Python using the powerful pandas library. We will extend what pandas offers with other Python libraries, such as matplotlib, NumPy, and scikit-learn, to perform each phase of a data analysis task. Using real-world examples, you will learn how to wrangle, manipulate, clean, and visualize your data, find patterns in it, and make predictions based on past data. You will learn how to conduct data analysis, and then take your analyses a step further as we explore some applications of anomaly detection, regression, clustering, and classification.

By the end of the book, you will be able to use pandas to ensure the veracity of your data, visualize it for effective decision-making, and reliably reproduce analyses across multiple datasets.

What You Will Learn

  • Understand how data analysts and scientists think about gathering and understanding data
  • Perform data analysis and data wrangling in Python
  • Combine, group, and aggregate data from multiple sources
  • Create data visualizations with pandas, matplotlib, and seaborn
  • Learn how to apply machine learning algorithms with scikit-learn to make predictions and look for patterns
  • Use Python data science libraries to analyze real-world datasets
  • Use pandas to solve several common data representation and analysis problems
  • Learn how to collect data from APIs
  • Utilize computer science concepts and algorithms to write more efficient code for data analysis
  • Learn how to build your own Python scripts, modules, and packages
  • Build and run simulations
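
As a rough illustration of the "combine, group, and aggregate" item above, here is a minimal sketch in pandas. The tables, column names, and values are hypothetical and are not taken from the book's datasets; the sketch only shows the general shape of a merge followed by a grouped aggregation.

```python
import pandas as pd

# Hypothetical data: two small tables standing in for separate sources.
stations = pd.DataFrame({
    "station_id": ["A", "B"],
    "city": ["NYC", "Boston"],
})
readings = pd.DataFrame({
    "station_id": ["A", "A", "B", "B"],
    "temp_c": [10.0, 14.0, 8.0, 9.0],
})

# Combine: attach the city name to each reading with a left join on the key.
combined = readings.merge(stations, on="station_id", how="left")

# Group and aggregate: mean and max temperature per city.
summary = combined.groupby("city")["temp_c"].agg(["mean", "max"])
print(summary)
```

A left join keeps every reading even if its station is missing from the lookup table, which is often a reasonable default when the readings are the data being analyzed.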

Table of Contents

TBD

About the Author

Stefanie Molin (@stefmolin) is a Data Scientist and Software Engineer at Bloomberg LP in NYC (and hacker in training), tackling tough problems in Information Security, particularly those revolving around anomaly detection, building tools for gathering data, and knowledge sharing. She has extensive experience in data science, designing anomaly detection solutions, and machine learning in both R and Python in the AdTech and FinTech industries. She holds a B.S. in Operations Research from Columbia University’s engineering school, with minors in Economics and in Entrepreneurship and Innovation.

In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages, both those spoken among people and those used with computers.