1. Shared Indexes
import pandas as pd
fandango = pd.read_csv('fandango_score_comparison.csv')
print(fandango.index)
RangeIndex(start=0, stop=146, step=1)
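A minimal sketch of what "shared" means here, using a made-up three-row dataframe rather than the Fandango file (the film names and scores are hypothetical): every column pulled out of a dataframe carries the same index as the dataframe itself, and filtering keeps the original labels.

```python
import pandas as pd

# Hypothetical stand-in data; not taken from fandango_score_comparison.csv.
df = pd.DataFrame({
    "FILM": ["Film A", "Film B", "Film C"],
    "score": [4.5, 3.0, 2.5],
})

# Without an explicit index, pandas assigns a RangeIndex starting at 0.
print(df.index)                             # RangeIndex(start=0, stop=3, step=1)

# Each column is a Series that carries the same index as the dataframe.
print(df["score"].index.equals(df.index))   # True

# Filtering rows keeps their original index labels.
low = df[df["score"] < 4]
print(list(low.index))                      # [1, 2]
```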
2. Using Integer Indexes to Select Rows
fandango = pd.read_csv('fandango_score_comparison.csv')
last_row = fandango.shape[0] - 1
first_last = fandango.iloc[[0, last_row]]
print(first_last)
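As a possible shortcut, iloc also accepts negative positions, so the last row can be reached without computing it from shape. A small sketch on made-up data (the scores are hypothetical):

```python
import pandas as pd

# Hypothetical stand-in data.
df = pd.DataFrame({"score": [4.5, 3.0, 2.5, 4.0]})

# Negative positions count from the end, so -1 is the last row.
first_last = df.iloc[[0, -1]]
print(first_last["score"].tolist())                 # [4.5, 4.0]

# This matches computing the last position from the row count.
last_row = df.shape[0] - 1
print(first_last.equals(df.iloc[[0, last_row]]))    # True
```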
3. Using Custom Indexes
The dataframe object has a set_index() method that takes the name of the column we want pandas to use as the dataframe index. By default, it returns a new dataframe and removes that column from the regular columns. Two optional parameters change this behavior:
- inplace: If set to True, this parameter will set the index on the current, "live" dataframe, instead of returning a new dataframe.
- drop: If set to False, this parameter will keep the column we specified as the index among the regular columns, instead of dropping it.
fandango = pd.read_csv('fandango_score_comparison.csv')
fandango_films = fandango.set_index('FILM', drop=False)
print(fandango_films.index[0:5])
Index(['Avengers: Age of Ultron (2015)', 'Cinderella (2015)', 'Ant-Man (2015)', 'Do You Believe? (2015)', 'Hot Tub Time Machine 2 (2015)'], dtype='object', name='FILM')
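The effect of drop and inplace is easier to see on a tiny dataframe. The sketch below uses made-up film names and scores (hypothetical, not from the CSV) and only the set_index() parameters described above:

```python
import pandas as pd

# Hypothetical three-row stand-in for the Fandango data.
films = pd.DataFrame({
    "FILM": ["Film A (2015)", "Film B (2015)", "Film C (2015)"],
    "score": [4.5, 3.0, 2.5],
})

# Default (drop=True): FILM becomes the index and leaves the columns.
dropped = films.set_index("FILM")
print(list(dropped.columns))    # ['score']

# drop=False: FILM is the index and also stays as a regular column.
kept = films.set_index("FILM", drop=False)
print(list(kept.columns))       # ['FILM', 'score']

# inplace=True changes the dataframe itself and returns None.
in_place = films.copy()
result = in_place.set_index("FILM", inplace=True)
print(result)                   # None
print(in_place.index.name)      # FILM
```

Note that films itself is unchanged by the first two calls, since each returns a new dataframe.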
4. Using a Custom Index for Selection
movies = ["The Lazarus Effect (2015)", "Gett: The Trial of Viviane Amsalem (2015)", "Mr. Holmes (2015)"]
best_movies_ever = fandango_films.loc[movies]
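A short sketch of the main ways loc can select by label once a custom index is set. The film labels and scores are hypothetical:

```python
import pandas as pd

# Hypothetical stand-in data with FILM as the index.
films = pd.DataFrame(
    {"score": [4.5, 3.0, 2.5, 4.0]},
    index=pd.Index(["Film A", "Film B", "Film C", "Film D"], name="FILM"),
)

# A list of labels returns those rows in the order given.
picked = films.loc[["Film C", "Film A"]]
print(list(picked.index))       # ['Film C', 'Film A']

# A label slice includes both endpoints, unlike an integer slice.
sliced = films.loc["Film B":"Film D"]
print(list(sliced.index))       # ['Film B', 'Film C', 'Film D']

# A single label returns that row as a Series.
one = films.loc["Film A"]
print(one["score"])             # 4.5
```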
5. Apply() Logic Over Columns: Practice
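# float_df is not defined in these notes. Assumption: it holds the float columns of fandango_films.
float_df = fandango_films.select_dtypes(include=['float64'])
# apply() passes each column to the lambda as a Series.
double_df = float_df.apply(lambda x: x*2)
print(double_df.head(1))
print('------------------------')
halved_df = float_df.apply(lambda x: x/2)
print(halved_df.head(1))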
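To make the column-wise behavior concrete, here is a minimal sketch on a made-up two-column dataframe (column names and values are hypothetical). It shows what the function receives and what apply() returns in two common cases:

```python
import pandas as pd

# Hypothetical stand-in data.
scores = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# By default (axis=0), the function is called once per column,
# and each column arrives as a Series.
doubled = scores.apply(lambda col: col * 2)
print(doubled["b"].tolist())            # [8.0, 10.0, 12.0]

# If the function returns a scalar, the result is a Series
# indexed by column name.
col_ranges = scores.apply(lambda col: col.max() - col.min())
print(col_ranges.to_dict())             # {'a': 2.0, 'b': 2.0}

# For plain arithmetic, operating on the dataframe directly
# gives the same result as apply().
print(doubled.equals(scores * 2))       # True
```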
6. Apply() Over Dataframe Rows
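import numpy as np
# Note: 'Metacritic_user_nom' is how this column is spelled in the dataset.
rt_mt_user = float_df[['RT_user_norm', 'Metacritic_user_nom']]
# axis=1 applies the function to each row instead of each column.
rt_mt_deviations = rt_mt_user.apply(lambda x: np.std(x), axis=1)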
print(rt_mt_deviations[0:5])
FILM
Avengers: Age of Ultron (2015) 0.375
Cinderella (2015) 0.125
Ant-Man (2015) 0.225
Do You Believe? (2015) 0.925
Hot Tub Time Machine 2 (2015) 0.150
dtype: float64
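One detail worth checking when reading these numbers: np.std() computes the population standard deviation (ddof=0) by default, while the pandas std() method defaults to the sample standard deviation (ddof=1). The sketch below illustrates the difference on made-up ratings (the column names and values are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data with FILM as the index.
ratings = pd.DataFrame(
    {"user_x": [4.0, 3.0], "user_y": [3.0, 3.0]},
    index=pd.Index(["Film A", "Film B"], name="FILM"),
)

# axis=1: the function is called once per row, and each row
# arrives as a Series indexed by column name.
deviations = ratings.apply(lambda row: np.std(row), axis=1)
print(deviations.tolist())              # [0.5, 0.0]

# The pandas std() method uses ddof=1 unless told otherwise,
# so it only matches np.std() when ddof=0 is passed.
sample_std = ratings.std(axis=1)
population_std = ratings.std(axis=1, ddof=0)
print(np.allclose(deviations, population_std))   # True
print(np.allclose(deviations, sample_std))       # False
```

With only two values per row, the population standard deviation is simply half the absolute difference between them.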