Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.


pip3 install pandas


import pandas as pd


Load a CSV file

data = pd.read_csv("filename.csv")

To parse the start column as dates, pass parse_dates=['start'] to read_csv.
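
A minimal sketch of the parse_dates argument. The CSV text and the value column are made up for illustration, and io.StringIO stands in for a real file such as filename.csv:

```python
import io
import pandas as pd

# Hypothetical CSV contents standing in for filename.csv
csv_text = "start,value\n2021-01-05,10\n2021-02-17,20\n"

data = pd.read_csv(io.StringIO(csv_text), parse_dates=["start"])

# The start column is now a datetime column rather than plain strings,
# so the .dt accessor is available
start_is_datetime = pd.api.types.is_datetime64_any_dtype(data["start"])
first_month = data["start"].dt.month[0]
```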

Apply an operation to column data and save the result in another column

# make a simple dataframe
df = pd.DataFrame({'a':[1,2], 'b':[3,4]})
#    a  b
# 0  1  3
# 1  2  4

# compute a new Series (not attached to df) that shares df's index
df.apply(lambda row: row.a + row.b, axis=1)
# 0    4
# 1    6

# do the same, but attach the result to the dataframe as column c
df['c'] = df.apply(lambda row: row.a + row.b, axis=1)
#    a  b  c
# 0  1  3  4
# 1  2  4  6
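
For simple arithmetic like this, the same column can usually be computed without apply by operating on whole columns at once. A short sketch using the same dataframe:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# Vectorized addition: adds the two columns element by element
df['c'] = df['a'] + df['b']
```

Row-wise apply is still useful when the operation cannot be expressed with column arithmetic.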

Get unique values of a column

To get the unique values of the name column, call unique() on it: df['name'].unique()
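
A minimal sketch, assuming a dataframe with a name column (the people frame and its values are hypothetical):

```python
import pandas as pd

# Hypothetical frame with a name column
people = pd.DataFrame({'name': ['ann', 'bob', 'ann', 'cy']})

# Distinct values, in order of first appearance
unique_names = people['name'].unique()

# Number of distinct values
n_unique = people['name'].nunique()
```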

Extract columns from a dataframe

df1 = df[['a','b']]

Remove duplicate rows

df = df.drop_duplicates()
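
By default drop_duplicates compares entire rows. A short sketch of restricting the comparison to some columns with subset and choosing which duplicate to keep with keep; the data here is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 2], 'b': [3, 3, 4], 'c': [7, 8, 9]})

# Rows 0 and 1 differ in column c, so a full-row comparison removes nothing
all_cols = df.drop_duplicates()

# Compare only columns a and b, keeping the last occurrence of each duplicate
by_ab = df.drop_duplicates(subset=['a', 'b'], keep='last')
```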

Remove a column from a dataframe

del df['name']

Count unique combinations of values in selected columns


     A    B  count
0   no   no      1
1   no  yes      2
2  yes   no      4
3  yes  yes      3
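
The snippet that produced this table is missing from the notes. One common way to get such a result is groupby followed by size; the sketch below is an assumption, with input data invented to reproduce the counts shown above:

```python
import pandas as pd

# Hypothetical data chosen to reproduce the counts in the table
df = pd.DataFrame({
    'A': ['no', 'no', 'no', 'yes', 'yes', 'yes', 'yes', 'yes', 'yes', 'yes'],
    'B': ['no', 'yes', 'yes', 'no', 'no', 'no', 'no', 'yes', 'yes', 'yes'],
})

# Count the rows in each (A, B) group and turn the result back into a dataframe
counts = df.groupby(['A', 'B']).size().reset_index(name='count')
```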

Get the row that contains the maximum value of a column
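
No snippet survives for this entry. A minimal sketch of one way to do it, using idxmax on the small dataframe from earlier (the choice of column a is an assumption):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# idxmax returns the index label of the largest value in column a;
# loc then selects that row as a Series
max_row = df.loc[df['a'].idxmax()]

# Alternative that keeps every row tied for the maximum
max_rows = df[df['a'] == df['a'].max()]
```

idxmax returns only the first label when several rows share the maximum, which is why the boolean-mask form is shown as well.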