Webinar 2019 Introduction to scalable computing with Dask in Python

From SHARCNETHelp
Revision as of 17:02, 9 October 2019 by Syam

Common Python libraries such as NumPy, pandas, and scikit-learn usually work well when the dataset fits into the RAM of a single machine. With larger datasets, working around that memory limit becomes a challenge. This is where Dask can help. Dask is a Python framework that parallelizes operations by representing them as a task graph. It provides Python APIs and libraries that can handle large datasets on a single multi-core machine or on a cluster. This webinar provides an introduction to Dask.
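As a rough illustration of the task-graph idea, the sketch below uses dask.array to split an array into chunks, build a lazy computation, and run it only when compute() is called. The array size and chunk shape are arbitrary values chosen for this example and are not taken from the webinar; a real workload would use an array too large to hold comfortably in memory.

```python
import dask.array as da

# Illustrative sizes (assumptions for this sketch): a 4000 x 4000 array of ones,
# stored as a 4 x 4 grid of 1000 x 1000 chunks instead of one block in RAM.
x = da.ones((4000, 4000), chunks=(1000, 1000))

# These operations only add nodes to the task graph; nothing is computed yet.
total = (x + x.T).sum()

# compute() executes the graph, processing chunks in parallel where possible.
result = total.compute()

print(x.numblocks)  # number of chunks along each axis
print(result)       # sum over all elements of x + x.T
```

Because each chunk is handled as a separate task, only a few chunks need to be in memory at once. The same code can also run on a cluster once a distributed scheduler is configured.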