Without classes, extracting a header, grabbing all the columns, and counting items in a dataset would mean writing each of those functions in a module and importing them for every new dataset we use. That gets tedious over time, because we work with datasets all the time and want a consistent set of behaviors and attributes to use with them.
With classes, we bundle all of that data and behavior together in one location. An instance of the Dataset class is all we need to count unique terms in a dataset or get a file's header. Once we add behavior to a class, every instance of the class will be able to perform that behavior. As we develop our application, we can add more properties to classes to extend their functionality. Using classes and instances helps organize our code, and allows us to represent real-world concepts in well-defined code constructs.
Create a class called Dataset.
- Inside the class, create a type attribute. Assign the value "csv" to it.
Create an instance of the Dataset class, and assign it to the variable dataset.
Print the type attribute of the dataset instance.
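One way these steps might look is sketched below. It assumes the type attribute is set inside __init__(), so that each instance carries its own copy; a class-level attribute would also satisfy the instructions.

```python
class Dataset:
    def __init__(self):
        # Each new instance gets its own type attribute, set to "csv".
        self.type = "csv"


# Calling the class creates an instance and runs __init__().
dataset = Dataset()
print(dataset.type)
```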
Add a data parameter to the __init__() method, and assign its value to the self.data attribute.
Read the data from nfl.csv and assign it to the variable nfl_data.
Make an instance of the class, passing in nfl_data to the __init__() method (when you call Dataset(...)).
- Assign the result to the variable nfl_dataset.
Use the data attribute to access the underlying data for nfl_dataset and assign the result to the variable dataset_data.
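A minimal sketch of these steps follows. Because nfl.csv is not reproduced here, the example reads from SAMPLE_CSV, a made-up in-memory stand-in whose rows and values are purely illustrative; only the column names year and player come from the exercises.

```python
import csv
import io

# Hypothetical stand-in for the contents of nfl.csv (illustrative values only).
SAMPLE_CSV = "year,player\n2012,Player A\n2012,Player B\n2013,Player A\n"


class Dataset:
    def __init__(self, data):
        self.type = "csv"
        # Store whatever was passed in so every method can reach it later.
        self.data = data


# With the real file, open("nfl.csv", newline="") would replace io.StringIO(...).
with io.StringIO(SAMPLE_CSV) as f:
    nfl_data = list(csv.reader(f))

# The argument to Dataset(...) is handed to __init__() as the data parameter.
nfl_dataset = Dataset(nfl_data)
dataset_data = nfl_dataset.data
```

Note that dataset_data and nfl_data refer to the same list object; the class stores a reference rather than a copy.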
Add an instance method print_data() that takes in a num_rows argument.
- This method should print out the data up to the given number of rows.
Create an instance of the Dataset class and initialize it with nfl_data. nfl_data is already loaded for you.
- Assign it to the variable nfl_dataset.
Call the print_data() method, setting the num_rows parameter to 5.
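The sketch below shows one possible print_data() implementation. The nfl_data list is a hypothetical placeholder for the preloaded data, and printing the sliced list in a single call is an assumption; printing row by row would work equally well.

```python
class Dataset:
    def __init__(self, data):
        self.data = data

    def print_data(self, num_rows):
        # Slicing never raises, even if num_rows exceeds the number of rows.
        print(self.data[:num_rows])


# Hypothetical rows standing in for the preloaded nfl_data.
nfl_data = [
    ["year", "player"],
    ["2012", "Player A"],
    ["2012", "Player B"],
    ["2013", "Player A"],
    ["2013", "Player C"],
    ["2014", "Player B"],
]

nfl_dataset = Dataset(nfl_data)
nfl_dataset.print_data(5)
```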
Add the extract_header() code to the initializer and set the header data to self.header.
Create a variable called nfl_header and set it to the header attribute.
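The body of extract_header() is not shown in this section, so the sketch below assumes it simply takes the first row as the header. It also assumes the remaining rows are kept in self.data, so that later column operations do not pick up the header; the nfl_data rows are hypothetical.

```python
class Dataset:
    def __init__(self, data):
        # Assumption: the first row is the header, the rest are records.
        self.header = data[0]
        self.data = data[1:]


# Hypothetical rows standing in for the preloaded nfl_data.
nfl_data = [
    ["year", "player"],
    ["2012", "Player A"],
    ["2012", "Player B"],
    ["2013", "Player A"],
    ["2013", "Player C"],
    ["2014", "Player B"],
]

nfl_dataset = Dataset(nfl_data)
nfl_header = nfl_dataset.header
```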
Add a method named column that takes in a label argument, finds the index of that label in the header, and returns a list of the column data.
- If the label is not in the header, you should return None.
Create a variable called year_column and set it to the return value of column('year').
Create a variable called player_column and set it to the return value of column('player').
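A possible column() method is sketched below. It assumes the initializer has already split the first row into self.header and stored the remaining rows in self.data; the nfl_data rows are hypothetical.

```python
class Dataset:
    def __init__(self, data):
        # Assumption: the first row is the header, the rest are records.
        self.header = data[0]
        self.data = data[1:]

    def column(self, label):
        # An unknown label has no position in the header, so bail out early.
        if label not in self.header:
            return None
        index = self.header.index(label)
        # Pick the value at that position from every row.
        return [row[index] for row in self.data]


# Hypothetical rows standing in for the preloaded nfl_data.
nfl_data = [
    ["year", "player"],
    ["2012", "Player A"],
    ["2012", "Player B"],
    ["2013", "Player A"],
    ["2013", "Player C"],
    ["2014", "Player B"],
]

nfl_dataset = Dataset(nfl_data)
year_column = nfl_dataset.column('year')
player_column = nfl_dataset.column('player')
```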
Add a method to the Dataset class called count_unique() that takes in a label argument.
Get the unique set of items from the column() method and return the total count.
Use the instance method to assign the number of unique values in the year column to total_years.
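The following sketch builds count_unique() on top of column(). Returning None for an unknown label is an assumption that mirrors how column() behaves; the nfl_data rows are hypothetical.

```python
class Dataset:
    def __init__(self, data):
        # Assumption: the first row is the header, the rest are records.
        self.header = data[0]
        self.data = data[1:]

    def column(self, label):
        if label not in self.header:
            return None
        index = self.header.index(label)
        return [row[index] for row in self.data]

    def count_unique(self, label):
        values = self.column(label)
        # Assumption: propagate None when the label is not in the header.
        if values is None:
            return None
        # A set drops duplicates, so its length is the unique count.
        return len(set(values))


# Hypothetical rows standing in for the preloaded nfl_data.
nfl_data = [
    ["year", "player"],
    ["2012", "Player A"],
    ["2012", "Player B"],
    ["2013", "Player A"],
    ["2013", "Player C"],
    ["2014", "Player B"],
]

nfl_dataset = Dataset(nfl_data)
total_years = nfl_dataset.count_unique('year')
```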
Add a method to the Dataset class called __str__().
- Convert the first 10 rows of self.data to a string and set it as the return value.
Create an instance of the class called nfl_dataset and call print() on it.
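A minimal __str__() sketch is shown below. It assumes self.data holds the rows after the header, and the nfl_data rows are hypothetical. Python calls __str__() automatically whenever an instance is passed to print() or str().

```python
class Dataset:
    def __init__(self, data):
        # Assumption: the first row is the header, the rest are records.
        self.header = data[0]
        self.data = data[1:]

    def __str__(self):
        # str() on the sliced list gives a readable preview of up to 10 rows.
        return str(self.data[:10])


# Hypothetical rows standing in for the preloaded nfl_data.
nfl_data = [
    ["year", "player"],
    ["2012", "Player A"],
    ["2012", "Player B"],
    ["2013", "Player A"],
    ["2013", "Player C"],
    ["2014", "Player B"],
]

nfl_dataset = Dataset(nfl_data)
print(nfl_dataset)
```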