
Pandas in Python

Getting started


What is it?

Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with "relational" and "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. It is already well on its way towards this goal.
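
To make "labeled" data concrete, here is a minimal sketch of the two core structures, Series and DataFrame. The fruit names and numbers are made up for illustration only.

```python
import pandas as pd

# A Series is a one-dimensional labeled array (illustrative values).
prices = pd.Series([1.5, 2.0, 0.75], index=["apple", "pear", "plum"])

# A DataFrame is a table of labeled columns that share one row index.
fruit = pd.DataFrame(
    {"price": prices, "stock": pd.Series([10, 0, 25], index=prices.index)}
)

# Values are looked up by row and column label rather than by position.
pear_price = fruit.loc["pear", "price"]
print(pear_price)
```

Looking values up by label means the code keeps working if rows are reordered or filtered.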

Main Features

Here are just a few of the things that pandas does well:

  • Easy handling of missing data (represented as NaN, NA, or NaT) in floating point as well as non-floating point data.
  • Size mutability: columns can be inserted and deleted from DataFrame and higher-dimensional objects.
  • Automatic and explicit data alignment: objects can be explicitly aligned to a set of labels, or the user can simply ignore the labels and let Series, DataFrame, etc. automatically align the data for you in computations.
  • Powerful, flexible group by functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data.
  • Easy conversion of ragged, differently-indexed data in other Python and NumPy data structures into DataFrame objects.
  • Intelligent label-based slicing, fancy indexing, and subsetting of large data sets.
  • Intuitive merging and joining of data sets.
  • Flexible reshaping and pivoting of data sets.
  • Hierarchical labeling of axes (possible to have multiple labels per tick).
  • Robust IO tools for loading data from flat files (CSV and delimited), Excel files, databases, and saving/loading data from the ultrafast HDF5 format.
  • Time series-specific functionality: date range generation and frequency conversion, moving window statistics, date shifting and lagging.
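
A few of the features above (automatic alignment, missing data handling, and group by) can be sketched together in a short example. The labels, team names, and scores are invented for illustration; this is only a sketch, not a tour of the full API.

```python
import numpy as np
import pandas as pd

# Two Series with partially overlapping labels (illustrative data).
a = pd.Series([1.0, 2.0, 3.0], index=["x", "y", "z"])
b = pd.Series([10.0, 20.0], index=["y", "z"])

# Automatic alignment: values are matched by label, not by position.
# "x" has no partner in b, so the result there is NaN (missing).
total = a + b

# Missing data handling: replace the gap with a default value.
filled = total.fillna(0.0)

# Split-apply-combine: group rows by team, then sum each group's scores.
df = pd.DataFrame(
    {"team": ["red", "blue", "red", "blue"], "score": [1, 2, 3, 4]}
)
per_team = df.groupby("team")["score"].sum()

print(total)
print(filled)
print(per_team)
```

Because alignment works on labels, neither input needs to be reordered or padded before the addition.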

Installation

The source code is currently hosted on GitHub at: https://github.com/pandas-dev/pandas

Working with Conda?
Conda is part of the Anaconda distribution and can be installed with Anaconda or Miniconda:

conda install pandas

Prefer Pip?
Pandas can be installed via pip from PyPI:

pip install pandas
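
After either install, one quick way to check that the package is usable is to import it and print its version. This is a minimal sketch; the version printed depends on your environment.

```python
import pandas as pd

# If the import succeeds, pandas is installed in the active environment.
# The version string shows which release was picked up.
print(pd.__version__)
```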

Installation from source

To install pandas from source, you need Cython in addition to the normal dependencies. Cython can be installed from PyPI:

pip install cython

In the pandas directory (same one where you found this file after cloning the git repo), execute:

python setup.py install

or, to install in development mode:

python -m pip install -e . --no-build-isolation --no-use-pep517

or alternatively:

python setup.py develop

See the full instructions for installing from source.

Documentation

The official documentation is hosted on PyData.org: https://pandas.pydata.org/pandas-docs/stable/

