What is a DataFrame in the pandas library
In Python, a DataFrame is a two-dimensional tabular data structure provided by the pandas library that consists of rows and columns. It can be thought of as a spreadsheet or a SQL table.
Each column in a DataFrame represents a variable, and each row represents an observation. You can perform various operations on a DataFrame, such as filtering, sorting, grouping, and aggregating data.
Here’s an example of creating a DataFrame with the pandas library:
import pandas as pd
data = {'Name': ['John', 'Alice', 'Bob'],
        'Age': [25, 30, 35],
        'Country': ['USA', 'Canada', 'UK']}
df = pd.DataFrame(data)
print(df)
Output:
    Name  Age Country
0   John   25     USA
1  Alice   30  Canada
2    Bob   35      UK
In this example, we created a DataFrame ‘df’ from a dictionary ‘data’ with three columns, ‘Name’, ‘Age’, and ‘Country’, and three rows of data. The DataFrame is printed using the ‘print()’ function.
You can access specific columns or rows of a DataFrame using indexing, slicing, or filtering operations. You can also perform statistical and mathematical operations on the data using DataFrame methods such as ‘mean()’ and ‘sum()’. Overall, a DataFrame is a useful data structure for working with tabular data in Python.
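The access patterns mentioned above can be sketched as follows. This is a minimal illustration that reuses the same sample data; the variable names (‘ages’, ‘first_two’, ‘older’, ‘mean_age’) are chosen here for the example only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 35],
    'Country': ['USA', 'Canada', 'UK'],
})

# Indexing: a single column label returns a Series
ages = df['Age']

# Slicing by position: the first two rows
first_two = df.iloc[0:2]

# Filtering with a boolean mask: rows where Age is greater than 28
older = df[df['Age'] > 28]

# A simple statistic computed on one column
mean_age = df['Age'].mean()

print(older)
print(mean_age)  # 30.0
```

Filtering returns a new DataFrame and leaves ‘df’ unchanged, so the result has to be assigned to a name if you want to keep it.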
Change the order of DataFrame columns in pandas
You can change the order of DataFrame columns using the DataFrame’s ‘reindex()’ method. Here’s an example:
Suppose you have a DataFrame ‘df’ with columns ‘A’, ‘B’, ‘C’ and ‘D’, in that order:
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9],
    'D': [10, 11, 12]
})
print(df)
Output:
   A  B  C   D
0  1  4  7  10
1  2  5  8  11
2  3  6  9  12
Now, let’s say you want to change the order of the columns to ‘B’, ‘A’, ‘D’, ‘C’. You can achieve this using the ‘reindex()’ method like this:
df = df.reindex(columns=['B', 'A', 'D', 'C'])
print(df)
Output:
   B  A   D  C
0  4  1  10  7
1  5  2  11  8
2  6  3  12  9
As you can see, the columns now follow the order given in the list. The ‘reindex()’ method also accepts other parameters. ‘fill_value’ sets the value used for labels that are not present in the original DataFrame (the default is NaN). ‘method’ forward-fills or backward-fills such gaps instead, and it only works when the existing labels are monotonically increasing or decreasing.
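To make the role of ‘fill_value’ concrete, here is a small sketch that asks for a column that does not exist yet. The column name ‘E’ and the variable names are made up for this illustration. It also shows plain list selection as an alternative way to reorder existing columns.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
})

# 'E' does not exist in df, so reindex() creates it;
# fill_value=0 is used instead of the default NaN
with_new = df.reindex(columns=['B', 'A', 'E'], fill_value=0)
print(with_new)

# Without fill_value, the new column is filled with NaN
with_nan = df.reindex(columns=['B', 'A', 'E'])
print(with_nan)

# Alternative: selecting with a list of labels also reorders columns,
# but it raises a KeyError if any label is missing
reordered = df[['B', 'A']]
print(reordered)
```

The practical difference is that ‘reindex()’ silently creates missing columns, while list selection fails loudly, which is often the safer choice when a misspelled column name should be treated as an error.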
Change the order of DataFrame rows
To change the order of rows in a DataFrame in Python, you can use the ‘reindex()’ method with the new order of the row index.
Here’s an example:
import pandas as pd
# create a sample DataFrame
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [25, 30, 35, 40],
                   'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']})
# print the original DataFrame
print('Original DataFrame:')
print(df)
# define the new row order as a list of existing index labels
new_order = [2, 0, 3, 1]
# change the order of rows
df = df.reindex(new_order)
# print the updated DataFrame
print('Updated DataFrame:')
print(df)
Output:
Original DataFrame:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston
Updated DataFrame:
      Name  Age         City
2  Charlie   35      Chicago
0    Alice   25     New York
3    David   40      Houston
1      Bob   30  Los Angeles
In this example, we created a sample DataFrame and then defined a new order for the rows using the ‘new_order’ list. We then used the ‘reindex()’ method to reorder the rows according to that list of index labels. The resulting DataFrame has the same columns as the original, but the rows are in the order specified by ‘new_order’. Note that each row keeps its original index label.
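A few related behaviors are worth seeing side by side. The sketch below uses a shortened version of the sample data, and the variable names (‘renumbered’, ‘with_missing’, ‘via_loc’) and the label 9 are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [25, 30, 35, 40]})

new_order = [2, 0, 3, 1]

# reindex() keeps the original labels, so the index reads 2, 0, 3, 1;
# reset_index(drop=True) renumbers the rows 0..3 in their new order
renumbered = df.reindex(new_order).reset_index(drop=True)
print(renumbered)

# A label that is not in the index (here 9) produces a row of NaN
with_missing = df.reindex([2, 9])
print(with_missing)

# .loc with a list of labels reorders rows too,
# but it raises a KeyError if any label is missing
via_loc = df.loc[new_order]
print(via_loc)
```

If the reordered rows should be numbered from zero again, ‘reset_index(drop=True)’ is the usual follow-up step; without ‘drop=True’ the old labels are kept as an extra column.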