EzPyZ

DataFrame

class EzPyZ.DataFrame(data, columns=None, subset=None)

Bases: object

A DataFrame object is required to use all other functionality in this package.

If you would prefer to pass a pandas dataframe directly to the class:

>>> import EzPyZ as ez
>>> import pandas as pd
>>> raw_data = {
...     'height_cm': [134, 168, 149, 201, 177, 168],
...     'weight_kg': [32.2, 64.3, 59.9, 95.4, 104.2, 63.1]
... }
>>> pandas_df = pd.DataFrame(raw_data)
>>> df = ez.DataFrame(data=pandas_df)

Or if you’d like to provide the data in a more raw format (similar to what would be passed to a pandas dataframe):

>>> import EzPyZ as ez
>>> raw_data = {
...     'height_cm': [134, 168, 149, 201, 177, 168],
...     'weight_kg': [32.2, 64.3, 59.9, 95.4, 104.2, 63.1]
... }
>>> df = ez.DataFrame(data=raw_data)

Or if you’d like to provide the data directly from an Excel or CSV file:

>>> import EzPyZ as ez
>>> from EzPyZ.tools import read_file
>>> df = ez.DataFrame(data=read_file("bmi_data.csv")) # A bmi_data.xlsx would also work here.

__init__(data, columns=None, subset=None)

Constructs a DataFrame object.

Parameters

data (Union[pd.DataFrame, Dict[str, List[Any]]]) – Either a pandas DataFrame object, or a dictionary where the keys are column titles and the values are lists of associated values (in order).
columns (List[str]) – (optional) A list of strings containing the titles of the columns to include in the dataframe. All others will be excluded. If this option is omitted or set to None, all columns will be included.
subset (str) – (optional) String containing rules to exclude certain rows from the DataFrame. The string must be composed of standard comparison operators (‘==’, ‘!=’, ‘<’, ‘>’, ‘<=’, ‘>=’). “And” conditions must be joined by the word ‘and’, and “or” conditions by the word ‘or’. Parentheses are allowed as well. Defaults to None.

Returns

A new EzPyZ.DataFrame object.

Return type

EzPyZ.DataFrame

__repr__()

Returns basic DataFrame information.

Returns: Basic DataFrame information for debugging.
Return type: str

Usage:

>>> import EzPyZ as ez
>>> data = ez.tools.read_file("bmi_data.csv")  # A bmi_data.xlsx would also work here.
>>> df = ez.DataFrame(data=data)
>>> print(repr(df))
DataFrame(df=Column(title=height_cm, values=[134, 168, 149, 201, 177, ...]),
          Column(title=weight_kg, values=[32.2, 64.3, 59.9, 95.4, 104.2, ...]))

__str__()

Returns the DataFrame as a string.

Returns: A print-friendly string representing the DataFrame object.
Return type: str

Usage:

>>> import EzPyZ as ez
>>> data = ez.tools.read_file("bmi_data.csv")  # A bmi_data.xlsx would also work here.
>>> df = ez.DataFrame(data=data)
>>> print(df)
    height_cm      weight_kg
1   134            32.2
2   168            64.3
3   149            59.9
4   201            95.4
5   177            104.2
6   168            63.1

get_columns()

Returns columns as a list.

Returns: Columns as a list.
Return type: List[EzPyZ.Column]

Usage:

>>> import EzPyZ as ez
>>> data = ez.tools.read_file("bmi_data.csv")  # A bmi_data.xlsx would also work here.
>>> df = ez.DataFrame(data=data)
>>> print(df.get_columns())
[Column(title=height_cm, values=[134, 168, 149, 201, 177, ...]),
 Column(title=weight_kg, values=[32.2, 64.3, 59.9, 95.4, 104.2, ...])]

get_titles()

Returns a list of all column titles.

Returns: A list of all column titles.
Return type: List[str]

Usage:

>>> import EzPyZ as ez
>>> data = ez.tools.read_file("bmi_data.csv")  # A bmi_data.xlsx would also work here.
>>> df = ez.DataFrame(data=data)
>>> print(df.get_titles())
['height_cm', 'weight_kg']

head(count=5)

Returns the first count rows of the dataframe.

Parameters: count (int) – (optional) The number of rows to return. Defaults to 5.
Returns: The first count rows of the dataframe.
Return type: str

Usage:

>>> import EzPyZ as ez
>>> data = ez.tools.read_file("bmi_data.csv")  # A bmi_data.xlsx would also work here.
>>> df = ez.DataFrame(data=data)
>>> print(df.head())
    height_cm      weight_kg
0   134            32.2
1   168            64.3
2   149            59.9
3   201            95.4
4   177            104.2

length_columns()

Returns the number of columns in the DataFrame.

Returns: Number of columns.
Return type: int

Usage:

>>> import EzPyZ as ez
>>> data = ez.tools.read_file("bmi_data.csv")  # A bmi_data.xlsx would also work here.
>>> df = ez.DataFrame(data=data)
>>> df.length_columns()
2

length_rows()

Returns the number of rows in the DataFrame.

Returns: Number of rows.
Return type: int

Usage:

>>> import EzPyZ as ez
>>> data = ez.tools.read_file("bmi_data.csv")  # A bmi_data.xlsx would also work here.
>>> df = ez.DataFrame(data=data)
>>> df.length_rows()
6

subset(criterion)

Returns a new DataFrame object that meets the filter criterion provided.

Returns: A new, filtered DataFrame object.
Return type: EzPyZ.DataFrame
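
The rule format described for the subset argument of __init__ (comparison operators joined by ‘and’/‘or’, with optional parentheses) happens to be valid Python expression syntax. The following is a minimal sketch of how such a criterion could be applied to a dictionary of column lists. It is not EzPyZ's implementation: filter_rows is a hypothetical helper, and it assumes that subset() accepts the same rule format and keeps the rows that satisfy the criterion.

```python
import ast

# Syntax permitted in a criterion: comparisons of column names and
# constants, combined with `and` / `or` (parentheses need no node type).
_ALLOWED = (
    ast.Expression, ast.BoolOp, ast.And, ast.Or, ast.Compare,
    ast.Eq, ast.NotEq, ast.Lt, ast.Gt, ast.LtE, ast.GtE,
    ast.Name, ast.Load, ast.Constant, ast.UnaryOp, ast.USub,
)


def filter_rows(data, criterion):
    """Hypothetical helper: keep rows of a dict-of-lists that satisfy criterion."""
    tree = ast.parse(criterion, mode="eval")
    # Reject anything other than plain comparisons before evaluating.
    for node in ast.walk(tree):
        if not isinstance(node, _ALLOWED):
            raise ValueError(
                f"unsupported syntax in criterion: {type(node).__name__}"
            )
    code = compile(tree, "<criterion>", "eval")

    titles = list(data)
    n_rows = len(data[titles[0]]) if titles else 0
    kept = {title: [] for title in titles}
    for i in range(n_rows):
        # Expose the current row's values under their column titles.
        row = {title: data[title][i] for title in titles}
        if eval(code, {"__builtins__": {}}, row):
            for title in titles:
                kept[title].append(row[title])
    return kept


raw_data = {
    "height_cm": [134, 168, 149, 201, 177, 168],
    "weight_kg": [32.2, 64.3, 59.9, 95.4, 104.2, 63.1],
}
filtered = filter_rows(raw_data, "height_cm > 150 and weight_kg < 100")
print(filtered)
# {'height_cm': [168, 201, 168], 'weight_kg': [64.3, 95.4, 63.1]}
```

This sketch only works when column titles are valid Python identifiers; a title such as 'height (cm)' would need a different parsing strategy.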

write_csv(filename='out.csv', header=True)

Writes the dataframe to a CSV file.

Parameters

filename (str) – (optional) The qualified name of the file to write to. Defaults to out.csv.
header (bool) – (optional) Boolean. Specifies whether or not the column titles should be written to the CSV. Defaults to True.

Returns

Nothing.

Return type

NoneType

Usage:

>>> import EzPyZ as ez
>>> raw_data = {
...     'height (cm)': [134, 168, 149, 201, 177, 168],
...     'weight (kg)': [32.2, 64.3, 59.9, 95.4, 104.2, 63.1]
... }
>>> df = ez.DataFrame(data=raw_data)
>>> df.write_csv("bmi_data.csv")
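
To illustrate what the header option controls, here is a minimal sketch that writes a dictionary of column lists as CSV using Python's standard csv module. It is not EzPyZ's implementation: write_columns is a hypothetical helper, and the exact formatting that write_csv produces may differ.

```python
import csv
import io


def write_columns(stream, data, header=True):
    """Hypothetical helper: write a dict of equal-length lists as CSV rows."""
    writer = csv.writer(stream)
    if header:
        # First line holds the column titles.
        writer.writerow(data.keys())
    # zip(*columns) turns the per-column lists into per-row tuples.
    writer.writerows(zip(*data.values()))


raw_data = {
    "height (cm)": [134, 168, 149],
    "weight (kg)": [32.2, 64.3, 59.9],
}

with_header = io.StringIO()
write_columns(with_header, raw_data)

without_header = io.StringIO()
write_columns(without_header, raw_data, header=False)

print(with_header.getvalue())
```

When writing to a real file rather than an in-memory buffer, the csv module documentation recommends opening it with newline="" so that line endings are not translated twice.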

Column

class EzPyZ.Column(title, values)

Bases: object

A Column object. Column objects make up EzPyZ.DataFrame objects in this module. This class is NOT intended for external use!

__init__(title, values)

Constructs a Column object.

Parameters

title (str) – A string containing the title of the column.
values (List[Any]) – A list containing the values in the column, in order.

Returns

Nothing.

Return type

NoneType

__repr__()

Returns basic Column information.

Returns: Basic Column information for debugging.
Return type: str

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(repr(col))
Column(title=height_cm, values=[134, 168, 149, 201, 177, ...])

__str__()

Returns the Column as a string.

Returns: The Column as a string.
Return type: str

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col)
height_cm
134
168
149
201
177
168

get_values()

Returns self.values.

Returns: The values in the column.
Return type: List[Any]

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.get_values())
[134, 168, 149, 201, 177, 168]

length()

Returns the length of self.values.

Returns: The number of values in the column.
Return type: int

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.length())
6

mean()

Returns the mean of self.values.

Returns: The mean of the values in the column.
Return type: float

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.mean())
166.16666666666666

median()

Returns the median of self.values.

Returns: The median of the values in the column.
Return type: float

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.median())
168.0

mode()

Returns the mode of self.values.

Returns: The mode of the values in the column.
Return type: float

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.mode())
168
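
The outputs shown for mean(), median() and mode() can be reproduced with Python's standard statistics module. The sketch below is for cross-checking only and makes no claim about how EzPyZ computes these values internally.

```python
import statistics

heights = [134, 168, 149, 201, 177, 168]

# Arithmetic mean: sum of the values divided by their count.
mean_height = statistics.mean(heights)

# With an even number of values, the median is the average of the
# two middle values of the sorted list (here 168 and 168).
median_height = statistics.median(heights)

# The most frequently occurring value.
mode_height = statistics.mode(heights)

print(mean_height, median_height, mode_height)
```

Because the median averages two values here, it comes back as the float 168.0, whereas the mode is returned as the original integer 168.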

set_values(values)

Sets self.values.

Parameters: values (List[Any]) – A list containing the values in the column, in order.
Returns: Nothing.
Return type: NoneType

stdev()

Returns the standard deviation of self.values.

Returns: The standard deviation of the values in the column.
Return type: float

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.stdev())
23.094732444145496

title()

Returns self.col_title.

Returns: The title of the column.
Return type: str

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.title())
height_cm

variance()

Returns the variance of self.values.

Returns: The variance of the values in the column.
Return type: float

Usage:

>>> import EzPyZ as ez
>>> col = ez.column.Column("height_cm", [134, 168, 149, 201, 177, 168])
>>> print(col.variance())
533.3666666666667
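
The outputs shown for stdev() and variance() are consistent with the sample formulas, which divide by n - 1, and not with the population formulas, which divide by n. The sketch below uses Python's standard statistics module to show the difference; it is a cross-check and not EzPyZ's own implementation.

```python
import statistics

heights = [134, 168, 149, 201, 177, 168]

# Sample variance: sum of squared deviations divided by n - 1.
sample_variance = statistics.variance(heights)

# Population variance: the same sum divided by n.
population_variance = statistics.pvariance(heights)

# Sample standard deviation: square root of the sample variance.
sample_stdev = statistics.stdev(heights)

print(sample_variance, population_variance, sample_stdev)
```

For this data the sample variance is about 533.37 and the sample standard deviation about 23.09, in line with the documented outputs, while the population variance would be about 444.47.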