Pandas SQL Operations: Worked Examples
Since many potential Pandas users are already familiar with SQL, this page provides examples of how to perform various SQL operations using Pandas.
Example
import pandas as pd

url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
tips = pd.read_csv(url)  # load the tips dataset into a DataFrame
print(tips.head())
The output is as follows:
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
In SQL, selection is done with a comma-separated list of the columns you want (or * to select all columns):
SELECT total_bill, tip, smoker, time FROM tips LIMIT 5;
With Pandas, column selection is done by passing a list of column names to the DataFrame:
tips[['total_bill', 'tip', 'smoker', 'time']].head(5)
Let's look at a complete example:
Example
import pandas as pd

url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
tips = pd.read_csv(url)  # load the tips dataset into a DataFrame
print(tips[['total_bill', 'tip', 'smoker', 'time']].head(5))
The output is as follows:
   total_bill   tip smoker    time
0       16.99  1.01     No  Dinner
1       10.34  1.66     No  Dinner
2       21.01  3.50     No  Dinner
3       23.68  3.31     No  Dinner
4       24.59  3.61     No  Dinner
Calling a DataFrame without a column name list will display all columns (similar to SQL's *).
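The difference between passing a list of names and passing a single name is easy to miss, so here is a minimal sketch. It assumes a hand-built copy of the first five rows of the tips data printed earlier (so it runs without downloading the CSV); the variable names `subset` and `tip_column` are illustrative only.

```python
import pandas as pd

# Hand-built copy of the first five rows of the tips data shown above,
# used here so the sketch runs without a network download.
tips = pd.DataFrame({
    'total_bill': [16.99, 10.34, 21.01, 23.68, 24.59],
    'tip': [1.01, 1.66, 3.50, 3.31, 3.61],
    'sex': ['Female', 'Male', 'Male', 'Male', 'Female'],
    'smoker': ['No', 'No', 'No', 'No', 'No'],
    'day': ['Sun', 'Sun', 'Sun', 'Sun', 'Sun'],
    'time': ['Dinner', 'Dinner', 'Dinner', 'Dinner', 'Dinner'],
    'size': [2, 3, 3, 2, 4],
})

# A list of names returns a DataFrame with just those columns, in that order
# (comparable to SELECT total_bill, tip, smoker, time FROM tips).
subset = tips[['total_bill', 'tip', 'smoker', 'time']]

# A single name (no list) returns a Series rather than a DataFrame.
tip_column = tips['tip']

print(subset.columns.tolist())    # ['total_bill', 'tip', 'smoker', 'time']
print(type(tip_column).__name__)  # Series
```

The double brackets are therefore not decoration: the outer pair indexes the DataFrame and the inner pair is the Python list of column names.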
Filtering is performed using the WHERE clause in SQL.
SELECT * FROM tips WHERE time = 'Dinner' LIMIT 5;
The DataFrame can be filtered in many ways. The most intuitive method is to use boolean indexing.
tips[tips['time'] == 'Dinner'].head(5)
Let's look at a complete example:
Example
import pandas as pd

url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
tips = pd.read_csv(url)  # load the tips dataset into a DataFrame
print(tips[tips['time'] == 'Dinner'].head(5))
The output is as follows:
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
The statement above passes a Series of True/False values to the DataFrame, which returns every row where the value is True.
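To make the intermediate True/False Series visible, the sketch below builds the mask as a separate step. It assumes a hand-built copy of the first five rows of the tips data. Because the time column is Dinner in all five of those rows, the sketch filters on the sex column instead so that the mask contains both values; the names `is_female` and `females` are illustrative only.

```python
import pandas as pd

# Hand-built copy of three columns from the first five rows of the tips data.
tips = pd.DataFrame({
    'total_bill': [16.99, 10.34, 21.01, 23.68, 24.59],
    'sex': ['Female', 'Male', 'Male', 'Male', 'Female'],
    'time': ['Dinner', 'Dinner', 'Dinner', 'Dinner', 'Dinner'],
})

# Step 1: the comparison yields a boolean Series aligned with the rows.
is_female = tips['sex'] == 'Female'
print(is_female.tolist())      # [True, False, False, False, True]

# Step 2: indexing with that Series keeps only the rows marked True
# (comparable to SELECT * FROM tips WHERE sex = 'Female').
females = tips[is_female]
print(females.index.tolist())  # [0, 4]
```

Note that the filtered result keeps the original row labels (0 and 4) rather than renumbering from zero.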
Grouping retrieves the number of records in each group across the entire dataset. For example, the following query groups the rows by sex and counts them:
SELECT sex, COUNT(*) FROM tips GROUP BY sex;
In Pandas, the operation is as follows:
tips.groupby('sex').size()
Let's look at a complete example:
Example
import pandas as pd

url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
tips = pd.read_csv(url)  # load the tips dataset into a DataFrame
print(tips.groupby('sex').size())
The output is as follows:
sex
Female     87
Male      157
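One way to check that groupby('sex').size() really matches the SQL query is to run both against the same rows. The following is a sketch under stated assumptions: it uses a hand-built copy of two columns from the first five rows of the tips data (so the counts are 2 and 3, not the full-dataset 87 and 157), and it uses an in-memory SQLite database from Python's standard library as the SQL engine, which the original page does not prescribe.

```python
import sqlite3

import pandas as pd

# Hand-built copy of two columns from the first five rows of the tips data.
tips = pd.DataFrame({
    'sex': ['Female', 'Male', 'Male', 'Male', 'Female'],
    'tip': [1.01, 1.66, 3.50, 3.31, 3.61],
})

# Pandas side: number of rows in each group.
pandas_counts = tips.groupby('sex').size()

# SQL side: load the same rows into an in-memory SQLite table and count.
conn = sqlite3.connect(':memory:')
tips.to_sql('tips', conn, index=False)
sql_counts = pd.read_sql_query(
    'SELECT sex, COUNT(*) AS n FROM tips GROUP BY sex;', conn
)
conn.close()

# Convert both results to plain dicts so they can be compared directly.
from_pandas = {sex: int(n) for sex, n in pandas_counts.items()}
from_sql = {sex: int(n) for sex, n in zip(sql_counts['sex'], sql_counts['n'])}

print(from_pandas)              # {'Female': 2, 'Male': 3}
print(from_pandas == from_sql)  # True
```

size() counts rows per group, including rows with missing values in other columns, which is why it corresponds to COUNT(*) rather than to a count of one specific column.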
Retrieving the Top N Rows
SQL uses LIMIT to return the first N rows:
SELECT * FROM tips LIMIT 5;
In Pandas, the operation is as follows:
tips.head(5)
Example
import pandas as pd

url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
tips = pd.read_csv(url)  # load the tips dataset into a DataFrame
tips = tips[['smoker', 'day', 'time']].head(5)
print(tips)
The output is as follows:
  smoker  day    time
0     No  Sun  Dinner
1     No  Sun  Dinner
2     No  Sun  Dinner
3     No  Sun  Dinner
4     No  Sun  Dinner
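A few properties of head() are worth seeing in isolation. The sketch below assumes a hand-built copy of three columns from the first five rows of the tips data, and stores the result under a separate, illustrative name (`top3`) so the full table stays available.

```python
import pandas as pd

# Hand-built copy of three columns from the first five rows of the tips data.
tips = pd.DataFrame({
    'smoker': ['No', 'No', 'No', 'No', 'No'],
    'day': ['Sun', 'Sun', 'Sun', 'Sun', 'Sun'],
    'time': ['Dinner', 'Dinner', 'Dinner', 'Dinner', 'Dinner'],
})

# head(n) returns the first n rows in the DataFrame's current order
# (comparable to SELECT * FROM tips LIMIT 3).
top3 = tips.head(3)
print(len(top3))  # 3

# head() returns a new object; the original DataFrame keeps all its rows.
print(len(tips))  # 5

# Called without an argument, head() uses n=5.
print(tips.head().equals(tips.head(5)))  # True
```

Assigning the result back to `tips`, as the complete example does, discards the remaining rows, so a separate variable is usually the safer choice when the full data is still needed.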