EzPyZ

DataFrame

class EzPyZ.DataFrame(data, columns=None, subset=None)

Bases: object

All other functionality in this package is accessed through a DataFrame object.

To pass a pandas DataFrame directly to the class:

>>> import EzPyZ as ez
>>> import pandas as pd
>>> raw_data = {
...     'height_cm': [134, 168, 149, 201, 177, 168],
...     'weight_kg': [32.2, 64.3, 59.9, 95.4, 104.2, 63.1]
... }
>>> pandas_df = pd.DataFrame(raw_data)
>>> df = ez.DataFrame(data=pandas_df)

Or if you’d like to provide the data in a more raw format (similar to what would be passed to a pandas dataframe):

>>> import EzPyZ as ez
>>> raw_data = {
...     'height_cm': [134, 168, 149, 201, 177, 168],
...     'weight_kg': [32.2, 64.3, 59.9, 95.4, 104.2, 63.1]
... }
>>> df = ez.DataFrame(data=raw_data)

Or if you’d like to provide the data directly from an Excel or CSV file:

>>> import EzPyZ as ez
>>> from EzPyZ.tools import read_file
>>> df = ez.DataFrame(data=read_file("bmi_data.csv")) # A bmi_data.xlsx would also work here.
__init__(data, columns=None, subset=None)

Constructs a DataFrame object.

Parameters
  • data (Union[pd.DataFrame, Dict[str, List[Any]]]) – Either a pandas DataFrame object, or a dictionary where the keys are column titles and the values are lists of associated values (in order).

  • columns (List[str]) – (optional) A list of strings containing the titles of the columns to include in the dataframe. All other columns are excluded. If this option is omitted or set to None, all columns are included.

  • subset (str) – (optional) A string containing rules to exclude certain rows from the DataFrame. The string must be composed of comparisons that use the standard comparison operators (‘==’, ‘!=’, ‘<’, ‘>’, ‘<=’, ‘>=’). “And” conditions must be joined by the word ‘and’, and “or” conditions must be joined by the word ‘or’. Parentheses are allowed as well. Defaults to None.
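
The columns and subset parameters can be pictured with a short plain-Python sketch. This is not EzPyZ's implementation: select_columns and filter_rows are hypothetical helpers, the rule string is assumed to refer to columns by title, rows that satisfy the rule are assumed to be the ones kept, and eval is used only because the sketch handles trusted input.

```python
def select_columns(data, columns=None):
    """Keep only the listed column titles (hypothetical helper, not EzPyZ code)."""
    if columns is None:
        return dict(data)
    return {title: data[title] for title in columns}


def filter_rows(data, rule):
    """Keep rows for which `rule` is true (hypothetical helper, not EzPyZ code).

    Assumes column titles are valid Python identifiers and that `rule` comes
    from a trusted source: eval() with empty builtins is NOT a security boundary.
    """
    titles = list(data)
    n_rows = len(data[titles[0]]) if titles else 0
    keep = []
    for i in range(n_rows):
        row = {title: data[title][i] for title in titles}
        # The row's column titles are the only names visible to the rule.
        if eval(rule, {"__builtins__": {}}, row):
            keep.append(i)
    return {title: [data[title][i] for i in keep] for title in titles}


raw_data = {
    'height_cm': [134, 168, 149, 201, 177, 168],
    'weight_kg': [32.2, 64.3, 59.9, 95.4, 104.2, 63.1]
}
rule = "height_cm >= 168 and (weight_kg < 100 or height_cm > 200)"
filtered = filter_rows(raw_data, rule)
heights_only = select_columns(raw_data, columns=['height_cm'])
```

The rule combines comparison operators, the words and/or, and parentheses in the same way the subset description allows.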

Returns

A new EzPyZ.DataFrame object.

Return type

EzPyZ.DataFrame

__repr__()

Returns basic DataFrame information.

Returns

Basic DataFrame information for debugging.

Return type

str

Usage:

>>> import EzPyZ as ez
>>> data = ez.tools.read_file("bmi_data.csv")  # A bmi_data.xlsx would also work here.
>>> df = ez.DataFrame(data=data)
>>> print(repr(df))
DataFrame(df=Column(title=height_cm, values=[134, 168, 149, 201, 177, ...]),
          Column(title=weight_kg, values=[32.2, 64.3, 59.9, 95.4, 104.2, ...]))
__str__()

Returns the DataFrame as a string.

Returns

A print-friendly string representing the DataFrame object.

Return type

str

Usage:

>>> import EzPyZ as ez
>>> data = ez.tools.read_file("bmi_data.csv")  # A bmi_data.xlsx would also work here.
>>> df = ez.DataFrame(data=data)
>>> print(df)
    height_cm      weight_kg
1   134            32.2
2   168            64.3
3   149            59.9
4   201            95.4
5   177            104.2
6   168            63.1
get_columns()

Returns columns as a list.

Returns

Columns as a list.

Return type

List[EzPyZ.Column]

Usage:

>>> import EzPyZ as ez
>>> data = ez.tools.read_file("bmi_data.csv")  # A bmi_data.xlsx would also work here.
>>> df = ez.DataFrame(data=data)
>>> print(df.get_columns())
[Column(title=height_cm, values=[134, 168, 149, 201, 177, ...]),
 Column(title=weight_kg, values=[32.2, 64.3, 59.9, 95.4, 104.2, ...])]
get_titles()

Returns a list of all column titles.

Returns

A list of all column titles.

Return type

List[str]

Usage:

>>> import EzPyZ as ez
>>> data = ez.tools.read_file("bmi_data.csv")  # A bmi_data.xlsx would also work here.
>>> df = ez.DataFrame(data=data)
>>> print(df.get_titles())
['height_cm', 'weight_kg']
head(count=5)

Returns the first count rows of the dataframe.

Parameters

count (int) – (optional) The number of rows to return. Defaults to 5.

Returns

The first count rows of the dataframe.

Return type

str

Usage:

>>> import EzPyZ as ez
>>> data = ez.tools.read_file("bmi_data.csv")  # A bmi_data.xlsx would also work here.
>>> df = ez.DataFrame(data=data)
>>> print(df.head())
    height_cm      weight_kg
0   134            32.2
1   168            64.3
2   149            59.9
3   201            95.4
4   177            104.2
length_columns()

Returns the number of columns in the DataFrame.

Returns

Number of columns.

Return type

int

Usage:

>>> import EzPyZ as ez
>>> data = ez.tools.read_file("bmi_data.csv")  # A bmi_data.xlsx would also work here.
>>> df = ez.DataFrame(data=data)
>>> df.length_columns()
2
length_rows()

Returns the number of rows in the DataFrame.

Returns

Number of rows.

Return type

int

Usage:

>>> import EzPyZ as ez
>>> data = ez.tools.read_file("bmi_data.csv")  # A bmi_data.xlsx would also work here.
>>> df = ez.DataFrame(data=data)
>>> df.length_rows()
6
subset(criterion)

Returns a new DataFrame object that meets the filter criterion provided.

Returns

A new, filtered DataFrame object.

Return type

EzPyZ.DataFrame

write_csv(filename='out.csv', header=True)

Writes the dataframe to a CSV file.

Parameters
  • filename (str) – (optional) The qualified name of the file to write to. Defaults to out.csv.

  • header (bool) – (optional) Whether the column titles should be written to the CSV as a header row. Defaults to True.

Returns

Nothing.

Return type

NoneType

Usage:

>>> import EzPyZ as ez
>>> raw_data = {
...     'height (cm)': [134, 168, 149, 201, 177, 168],
...     'weight (kg)': [32.2, 64.3, 59.9, 95.4, 104.2, 63.1]
... }
>>> df = ez.DataFrame(data=raw_data)
>>> df.write_csv("bmi_data.csv")
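
For readers who want to see what the header option changes, the following is a minimal sketch of a comparable writer built on Python's standard csv module. It is an assumption about the behavior, not EzPyZ's code: write_columns_csv is a hypothetical helper that takes the same dictionary-of-lists layout as raw_data above, and the exact quoting or line endings EzPyZ produces may differ.

```python
import csv
import os
import tempfile


def write_columns_csv(data, filename="out.csv", header=True):
    """Write a dict of equal-length column lists to a CSV file (hypothetical helper)."""
    titles = list(data)
    # newline="" lets the csv module control line endings itself.
    with open(filename, "w", newline="") as f:
        writer = csv.writer(f)
        if header:
            writer.writerow(titles)
        # zip(*columns) turns column lists into row tuples.
        writer.writerows(zip(*(data[title] for title in titles)))


raw_data = {
    'height (cm)': [134, 168, 149],
    'weight (kg)': [32.2, 64.3, 59.9]
}

# Write into a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    with_header = os.path.join(tmp, "with_header.csv")
    without_header = os.path.join(tmp, "without_header.csv")
    write_columns_csv(raw_data, with_header)
    write_columns_csv(raw_data, without_header, header=False)
    with open(with_header, newline="") as f:
        rows_with_header = list(csv.reader(f))
    with open(without_header, newline="") as f:
        rows_without_header = list(csv.reader(f))
```

With header=True the first row read back holds the column titles; with header=False the file starts directly with the data rows.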

Column

class EzPyZ.Column(title, values)

Bases: object

A Column object. EzPyZ.DataFrame objects in this module are made up of Column objects. This class is NOT intended for external use!

__init__(title, values)

Constructs a Column object.

Parameters
  • title (str) – A string containing the title of the column.

  • values (List[Any]) – A list containing the values in the column, in order.

Returns

Nothing.

Return type

NoneType

__repr__()

Returns basic Column information.

Returns

Basic Column information for debugging.

Return type

str

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(repr(col))
Column(title=height_cm, values=[134, 168, 149, 201, 177, ...])
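
The output above suggests that the debugging representation lists only the first five values and marks any remainder with an ellipsis. A minimal sketch of a __repr__ with that behavior follows. ColumnSketch is a hypothetical stand-in, and the cut-off of five is inferred from the example output rather than from the EzPyZ source.

```python
class ColumnSketch:
    """Hypothetical stand-in for a column type; not the EzPyZ class."""

    PREVIEW = 5  # number of values shown, inferred from the example output

    def __init__(self, title, values):
        self.col_title = title
        self.values = list(values)

    def __repr__(self):
        shown = ", ".join(str(v) for v in self.values[:self.PREVIEW])
        if len(self.values) > self.PREVIEW:
            shown += ", ..."  # signal that values were left out
        return f"Column(title={self.col_title}, values=[{shown}])"


col = ColumnSketch("height_cm", [134, 168, 149, 201, 177, 168])
text = repr(col)
short_text = repr(ColumnSketch("weight_kg", [32.2, 64.3]))
```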
__str__()

Returns the Column as a string.

Returns

The Column as a string.

Return type

str

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col)
height_cm
134
168
149
201
177
168
get_values()

Returns self.values.

Returns

The values in the column.

Return type

List[Any]

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.get_values())
[134, 168, 149, 201, 177, 168]
length()

Returns the length of self.values.

Returns

The number of values in the column.

Return type

int

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.length())
6
mean()

Returns the mean of self.values.

Returns

The mean of the values in the column.

Return type

float

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.mean())
166.16666666666666
median()

Returns the median of self.values.

Returns

The median of the values in the column.

Return type

float

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.median())
168.0
mode()

Returns the mode of self.values.

Returns

The mode of the values in the column.

Return type

float

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.mode())
168
set_values(values)

Sets self.values.

Parameters

values (List[Any]) – A list containing the values in the column, in order.

Returns

Nothing.

Return type

NoneType

stdev()

Returns the standard deviation of self.values.

Returns

The standard deviation of the values in the column.

Return type

float

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.stdev())
23.094732444145496
title()

Returns self.col_title.

Returns

The title of the column.

Return type

str

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.title())
height_cm
variance()

Returns the variance of self.values.

Returns

The variance of the values in the column.

Return type

float

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.variance())
533.3666666666667
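
The figures shown for mean(), median(), mode(), stdev(), and variance() can be cross-checked with Python's standard statistics module. The sketch below does not show how EzPyZ computes them, which this page does not state. It only shows that the documented outputs match the sample statistics (n - 1 denominator) rather than the population ones.

```python
import math
import statistics

heights = [134, 168, 149, 201, 177, 168]

mean = statistics.mean(heights)            # about 166.17
median = statistics.median(heights)        # average of the two middle values
mode = statistics.mode(heights)            # most frequent value
sample_var = statistics.variance(heights)  # divides by n - 1, about 533.37
sample_sd = statistics.stdev(heights)      # square root of the sample variance
pop_var = statistics.pvariance(heights)    # divides by n, about 444.47

# The documented variance and standard deviation agree with the sample versions.
matches_sample = (
    math.isclose(sample_var, 533.3666666666667)
    and math.isclose(sample_sd, 23.094732444145496)
)
```

The population variance for the same data is noticeably smaller, so the distinction matters when comparing these results with other tools.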