
Pandas SQL Operations

Worked examples of SQL operations in Pandas

Since many potential Pandas users are already familiar with SQL, this page provides examples of how to perform various SQL operations using Pandas.

Example:

 import pandas as pd

 # Load the tips dataset used throughout this page
 url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
 tips = pd.read_csv(url)
 print(tips.head())

The output is as follows:

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

SELECT query

In SQL, selection is done with a comma-separated list of the columns you want (or * to select all columns):

 SELECT total_bill, tip, smoker, time
 FROM tips
 LIMIT 5;

In Pandas, column selection is done by passing a list of column names to the DataFrame:

 tips[['total_bill', 'tip', 'smoker', 'time']].head(5)

Let's look at a complete example:

Example:

 import pandas as pd

 url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
 tips = pd.read_csv(url)
 # Equivalent of: SELECT total_bill, tip, smoker, time FROM tips LIMIT 5;
 print(tips[['total_bill', 'tip', 'smoker', 'time']].head(5))

The output is as follows:

   total_bill   tip smoker    time
0       16.99  1.01     No  Dinner
1       10.34  1.66     No  Dinner
2       21.01  3.50     No  Dinner
3       23.68  3.31     No  Dinner
4       24.59  3.61     No  Dinner

Referencing the DataFrame without a list of column names displays all columns (similar to SQL's *).
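The difference between selecting some columns and selecting all of them can be sketched as follows. To keep the sketch runnable offline, it rebuilds the first five rows shown earlier as an inline DataFrame instead of downloading tips.csv; the variable names `subset`, `all_columns`, and `tip_series` are illustrative choices, not part of the original page.

```python
import pandas as pd

# Inline stand-in for the first five rows of tips.csv (assumption: used here
# only so the example runs without network access).
tips = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
    "smoker": ["No", "No", "No", "No", "No"],
    "day": ["Sun", "Sun", "Sun", "Sun", "Sun"],
    "time": ["Dinner", "Dinner", "Dinner", "Dinner", "Dinner"],
    "size": [2, 3, 3, 2, 4],
})

# SELECT total_bill, tip FROM tips LIMIT 3;
subset = tips[["total_bill", "tip"]].head(3)

# SELECT * FROM tips LIMIT 3;  (no column list, so every column is kept)
all_columns = tips.head(3)

# A single column name without the inner list returns a Series,
# not a DataFrame.
tip_series = tips["tip"]

print(subset)
print(list(all_columns.columns))
print(type(tip_series).__name__)
```

Note the double brackets in `tips[["total_bill", "tip"]]`: the outer pair indexes the DataFrame and the inner pair is the Python list of column names.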

WHERE clause query

Filtering is performed using the WHERE clause in SQL.

 SELECT * FROM tips WHERE time = 'Dinner' LIMIT 5;

The DataFrame can be filtered in many ways. The most intuitive method is to use boolean indexing.

 tips[tips['time'] == 'Dinner'].head(5)

Example:

 import pandas as pd

 url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
 tips = pd.read_csv(url)
 # Equivalent of: SELECT * FROM tips WHERE time = 'Dinner' LIMIT 5;
 print(tips[tips['time'] == 'Dinner'].head(5))

The output is as follows:

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

The statement above passes a Series of True/False values to the DataFrame, which returns every row where the value is True.
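To make the two steps of boolean indexing visible, the sketch below builds the True/False Series first and then applies it. The sample rows are made up for illustration and are not taken from tips.csv; the names `is_dinner`, `dinner`, and `big_dinner_tips` are likewise illustrative. The last statement shows one way a compound WHERE condition might be written, using the `&` operator.

```python
import pandas as pd

# Small illustrative sample; these rows are invented for this sketch and
# are not taken from tips.csv.
tips = pd.DataFrame({
    "total_bill": [16.99, 27.20, 10.34, 13.42],
    "tip": [1.01, 4.00, 1.66, 1.68],
    "time": ["Dinner", "Lunch", "Dinner", "Lunch"],
})

# Step 1: the comparison yields a boolean Series aligned with the rows.
is_dinner = tips["time"] == "Dinner"
print(is_dinner.tolist())

# Step 2: indexing with that Series keeps only the rows marked True.
# SQL equivalent: SELECT * FROM tips WHERE time = 'Dinner';
dinner = tips[is_dinner]
print(dinner)

# Conditions combine with & (AND) and | (OR); each condition needs its own
# parentheses because & and | bind more tightly than == and >.
# SQL equivalent: SELECT * FROM tips WHERE time = 'Dinner' AND tip > 1.50;
big_dinner_tips = tips[(tips["time"] == "Dinner") & (tips["tip"] > 1.50)]
print(big_dinner_tips)
```

The filtered frames keep the original row labels, so `dinner` is indexed 0 and 2 rather than 0 and 1.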

GroupBy grouping

This operation retrieves the number of records in each group across the entire dataset. For example, the following query groups the records by sex and counts each group:

 SELECT sex, count(*)
 FROM tips
 GROUP BY sex;

In Pandas, the operation is as follows:

 tips.groupby('sex').size()

Example:

 import pandas as pd

 url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
 tips = pd.read_csv(url)
 # Equivalent of: SELECT sex, count(*) FROM tips GROUP BY sex;
 print(tips.groupby('sex').size())

The output is as follows:

sex
Female     87
Male      157
dtype: int64
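One detail worth seeing in code is how `size()` differs from `count()`: `size()` counts rows per group, like `count(*)`, while `count()` counts only non-null values in each column. The sketch below uses a made-up five-row sample with one missing tip; the data and the names `rows_per_group` and `non_null_tips` are illustrative assumptions.

```python
import pandas as pd

# Illustrative sample with one missing tip value; not taken from tips.csv.
tips = pd.DataFrame({
    "sex": ["Female", "Male", "Male", "Female", "Male"],
    "tip": [1.01, 1.66, None, 3.61, 3.50],
})

# SELECT sex, count(*) FROM tips GROUP BY sex;
# size() counts every row in the group, including rows with missing values.
rows_per_group = tips.groupby("sex").size()
print(rows_per_group)

# SELECT sex, count(tip) FROM tips GROUP BY sex;
# count() skips missing values, so the Male group reports one fewer.
non_null_tips = tips.groupby("sex")["tip"].count()
print(non_null_tips)
```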

Retrieving the first N rows

SQL uses LIMIT to return N rows:

 SELECT * FROM tips
 LIMIT 5;

In Pandas, the operation is as follows:

 tips.head(5)

Example:

 import pandas as pd

 url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
 tips = pd.read_csv(url)
 # Keep three columns and return the first five rows
 tips = tips[['smoker', 'day', 'time']].head(5)
 print(tips)

The output is as follows:

  smoker  day    time
0     No  Sun  Dinner
1     No  Sun  Dinner
2     No  Sun  Dinner
3     No  Sun  Dinner
4     No  Sun  Dinner
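As a short sketch of how `head()` relates to LIMIT, the example below uses a made-up seven-row sample (not taken from tips.csv). It also shows one possible way to express LIMIT with OFFSET using positional slicing with `iloc`; that part goes slightly beyond the page's example and is offered only as an illustration.

```python
import pandas as pd

# Illustrative seven-row sample; the values are invented for this sketch.
tips = pd.DataFrame({
    "smoker": ["No", "No", "Yes", "No", "Yes", "No", "No"],
    "day": ["Sun", "Sun", "Sat", "Sat", "Thur", "Fri", "Sun"],
    "time": ["Dinner", "Dinner", "Dinner", "Dinner", "Lunch", "Lunch", "Dinner"],
})

# SELECT * FROM tips LIMIT 3;
first_three = tips.head(3)
print(first_three)

# head() with no argument returns the first 5 rows by default.
default_head = tips.head()
print(len(default_head))

# SELECT * FROM tips LIMIT 3 OFFSET 2;
# iloc slices by position: rows at positions 2, 3, and 4.
page = tips.iloc[2:5]
print(page)
```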