2016-01-29 (Last Updated: 2024-04-11)

NYC Taxi

There are many approaches to plotting large datasets, but most of them are unsatisfactory. This example shows some of the issues, then demonstrates how Datashader helps make large datasets truly practical. We will use part of the well-studied NYC Taxi trip database, with the locations of all NYC taxi pickups and dropoffs from the month of January 2015. Although we know what the data is, let’s approach it as if we are doing data mining, and see what it takes to understand the dataset from scratch.

The example project consists of three notebooks, one of which is also a deployable Panel dashboard:

Geographic analysis: This notebook demonstrates the issues with plotting a large number of points. It focuses on plotting points on a map, and shows how Datashader can help analyze a large amount of data.
Non-geographic analysis: This notebook explores different non-geographical dimensions of the dataset, again showing the capabilities offered by Datashader.
Panel Dashboard: This notebook shows how to build a simple dashboard for exploring 10 million taxi trips in a Jupyter notebook using Datashader, then deploy it as a standalone dashboard using Panel.
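
To give a rough sense of why plotting many points is hard and why aggregation helps, here is a minimal sketch of the general idea of binning points into a fixed-size grid before rendering. It uses only NumPy and synthetic coordinates, not the taxi data and not Datashader's API; the point counts, cluster shape, and 300x300 grid size are all assumptions chosen for illustration.

```python
import numpy as np

# Synthetic stand-in for pickup coordinates (NOT the real taxi data):
# one dense cluster plus a sparse uniform background, in the unit square.
rng = np.random.default_rng(seed=0)
dense = rng.normal(loc=0.5, scale=0.02, size=(900_000, 2))
sparse = rng.uniform(0.0, 1.0, size=(100_000, 2))
points = np.vstack([dense, sparse])

# Aggregate: count how many points fall into each cell of a fixed grid.
# The grid size is tied to the output image, not to the number of points,
# so the cost of rendering does not grow with the dataset.
width, height = 300, 300
counts, _, _ = np.histogram2d(
    points[:, 0], points[:, 1],
    bins=(width, height),
    range=[[0.0, 1.0], [0.0, 1.0]],
)

# Map counts to brightness. A log scale keeps the sparse background visible
# next to the dense cluster; a linear scale would show only the cluster.
image = np.log1p(counts)
image /= image.max()

print(counts.shape, int(counts.max()))
```

Drawing each of the million points as an opaque marker would saturate the dense cluster and hide its internal structure, while the aggregated grid keeps the per-cell counts available for any choice of color mapping. The notebooks apply this kind of aggregation to the real pickup and dropoff locations.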