Extend Pandas DataFrame with custom functions and attributes

May 5, 2015 07:34

At Quantego.com we love working with pandas DataFrames. We use them to store and analyze results from simulation runs. On top of the data matrix and a multi-level index, we also need to accommodate custom plotting functions and attributes from the previous simulation run.

Subclassing pandas.DataFrame for this task was a no-brainer. The new version 0.16.1 (to be released in the next few days) includes some fixes that make working with subclasses of complex DataFrames (DF) easier. Here is an example of what can be done. First, define two new classes, one for pandas.Series (a single-column DF) and one for pandas.DataFrame. You can define new functions or attributes as needed.

import pandas

class CustomSeries(pandas.Series):
    @property
    def _constructor(self):
        # pandas reads _constructor as a property, so it must not be a plain method
        return CustomSeries

    def custom_series_function(self):
        return 'OK'

class CustomDataFrame(pandas.DataFrame):
    "My custom dataframe"
    def __init__(self, *args, **kw):
        super(CustomDataFrame, self).__init__(*args, **kw)

    @property
    def _constructor(self):
        return CustomDataFrame

    # class returned when a single column or row is extracted
    _constructor_sliced = CustomSeries

    def custom_frame_function(self):
        return 'OK'

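Notice _constructor and _constructor_sliced. They make sure you get the correct class back when slicing the DF: _constructor is used when the result has the same dimensionality as the original, and _constructor_sliced when the result has one dimension less, such as a single column.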
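To see the effect, here is a minimal, self-contained sketch. It repeats trimmed versions of the two classes, with _constructor exposed as a property, and checks which type comes back from a row filter and from a column selection. The column names and values are made up for illustration.

```python
import pandas as pd


class CustomSeries(pd.Series):
    @property
    def _constructor(self):
        return CustomSeries

    def custom_series_function(self):
        return 'OK'


class CustomDataFrame(pd.DataFrame):
    @property
    def _constructor(self):
        return CustomDataFrame

    _constructor_sliced = CustomSeries

    def custom_frame_function(self):
        return 'OK'


# Illustrative data, not taken from a real simulation run
df = CustomDataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

rows = df[df['a'] > 1]  # still two-dimensional, built via _constructor
col = df['a']           # one-dimensional, built via _constructor_sliced

print(type(rows).__name__)           # CustomDataFrame
print(type(col).__name__)            # CustomSeries
print(rows.custom_frame_function())  # OK
print(col.custom_series_function())  # OK
```

Without the two constructor hooks, both results would fall back to plain pandas objects and the custom functions would no longer be available on them.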

Via self you have convenient access to all pandas functions inside your custom methods, and you can even roll your own.
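As a rough sketch of both ideas, the class below adds one custom method built on pandas functions reached through self, and one custom attribute. The names SimulationFrame, column_ranges and run_label are hypothetical and chosen only for this example. The attribute is listed in _metadata, the pandas hook for naming attributes that should be carried over to results; propagation works for many operations but is not guaranteed for all of them.

```python
import pandas as pd


class SimulationFrame(pd.DataFrame):
    """Hypothetical subclass with one custom method and one custom attribute."""

    # Attribute names pandas should try to copy onto derived objects
    _metadata = ['run_label']

    @property
    def _constructor(self):
        return SimulationFrame

    def column_ranges(self):
        # self gives access to the regular DataFrame API
        return self.max() - self.min()


# Illustrative values only
results = SimulationFrame({'cost': [10.0, 12.5, 9.0],
                           'demand': [100, 80, 120]})
results.run_label = 'baseline'

ranges = results.column_ranges()
print(ranges['cost'])  # 3.5

# The attribute is expected to survive a simple row filter
filtered = results[results['cost'] > 9.5]
print(filtered.run_label)
```

Declaring the attribute in _metadata, rather than just assigning it on the instance, is what lets pandas know it should be copied when a new object is derived from the original.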