import pandas as pd
x = pd.Series([1,2,3])
x
index   values
--------------
0       1
1       2
2       3
x[0]
1
Define new index
x = pd.Series([1,2,3], index=['a','b','c'])
x
index   values
--------------
a       1
b       2
c       3
x['a']
1
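Once a Series has custom labels, it can help to state explicitly whether a lookup is by label or by position. The following is a minimal sketch using the `.loc` and `.iloc` accessors; the variable names `by_label` and `by_position` are illustrative and not from the original text.

```python
import pandas as pd

x = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

# .loc looks a value up by its index label.
by_label = x.loc['a']

# .iloc looks a value up by its integer position, regardless of the labels.
by_position = x.iloc[0]

print(by_label, by_position)
```

Both lookups return the first element here, because label 'a' sits at position 0.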
Dictionary type series
data = {'abc':1,'def':2,'xyz':3}
x = pd.Series(data)
x
index   values
--------------
abc     1
def     2
xyz     3
x['abc']
1
Scalar value series
The scalar value is repeated for each index label that is defined.
x = pd.Series(1, index=['a','b','c'])
x
index   values
--------------
a       1
b       1
c       1
Dataframe
A DataFrame is a two-dimensional labeled object whose columns can hold different types
It can be built from dictionaries, lists, or Series
It is the most commonly used pandas object
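As a rough illustration of the point about dictionaries, lists, and Series, the sketch below builds the same small table three ways. The column names, values, and variable names (`df_from_dict`, `df_from_list`, `df_from_series`) are made up for this example.

```python
import pandas as pd

# From a dictionary of lists: the keys become column names.
df_from_dict = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# From a list of rows, with the column names supplied separately.
df_from_list = pd.DataFrame([[1, 4], [2, 5], [3, 6]], columns=['A', 'B'])

# From a dictionary of Series: rows are aligned on the Series index labels.
df_from_series = pd.DataFrame({
    'A': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
    'B': pd.Series([4, 5, 6], index=['a', 'b', 'c']),
})

print(df_from_dict)
print(df_from_series)
```

The first two frames get a default integer index, while the third keeps the labels 'a', 'b', 'c' from the Series.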
Dataframe with Numpy
import numpy as np
import pandas as pd
dates = pd.date_range('20200226', periods=3)
# DatetimeIndex(['2020-02-26', '2020-02-27', '2020-02-28'], dtype='datetime64[ns]', freq='D')
columns = list('ABC')
# ['A', 'B', 'C']
data = np.random.randn(3,3)
# array([[-0.10914734, -0.75659384, -0.06899813],
#        [ 1.37714538,  2.09279708, -0.05586049],
#        [ 0.2282605 , -1.54231927, -0.34941844]])
df = pd.DataFrame(data, index=dates, columns=columns)
df
index         A          B          C
--------------------------------------------
2020-02-26    0.109953  -0.134801   0.023890
2020-02-27    1.417591   0.800834   0.145955
2020-02-28   -1.428734   0.438276   0.422585
Get summary of our data
df.describe()
         A          B          C
-----------------------------------------
count    3.000000   3.000000   3.000000
mean    -0.297529  -0.695912   0.751660
std      0.966246   1.045112   0.839380
min     -1.155043  -1.624613  -0.167104
25%     -0.820992  -1.261776   0.388307
50%     -0.486941  -0.898940   0.943717
75%      0.131229  -0.231561   1.211043
max      0.749398   0.435818   1.478368
Get cumulative sum
df.apply(np.cumsum)
              A          B          C
---------------------------------------------
2020-02-26   -0.076596  -0.470757   0.109522
2020-02-27    1.390033  -0.875267  -1.120344
2020-02-28    1.504389  -0.522906  -3.327057
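Because the random values above change on every run, a cumulative sum is easier to check by hand on fixed numbers. The sketch below uses made-up data (not the values from the original example) and also shows the built-in `DataFrame.cumsum` method, which should give the same column-wise running totals as `df.apply(np.cumsum)`.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20200226', periods=3)

# Fixed illustrative values so the running totals are easy to verify.
df = pd.DataFrame(
    [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]],
    index=dates,
    columns=list('ABC'),
)

# Apply NumPy's cumulative sum to each column in turn.
via_apply = df.apply(np.cumsum)

# The DataFrame method computes the running total down each column directly.
via_method = df.cumsum()

print(via_method)
```

Each row of the result holds the sum of that row and every row above it, column by column.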