Processing and visualizing data with Python
by Viraj Parekh | April 6, 2017
This is a basic tutorial using pandas and a few other packages to build a simple data pipeline for getting NBA data. Even though this tutorial uses NBA data, you don’t need to be an NBA fan to follow along. The same concepts and techniques can be applied to any project of your choosing.
This is meant to be used as a general tutorial for beginners with some experience in Python or R.
Step One: What data do we need?
The first step in any data project is getting an idea of what you want. We’re going to focus on getting team-level NBA data on a game-by-game basis. In my experience, these team-level stats usually live in different places, which makes them harder to compare across games.
Our goal is to build team-level box scores that are easy to compare against each other. Hopefully this will give some insight into how a team’s play has changed over the course of the season, or make it easier to do any other type of analysis.
On a high level, this might look something like:
Game | Days Rest | Total Passes | Total Assists | Passes/Assist | EFG | Outcome
Next step: Where is the data coming from?
stats.nba has all the NBA data that’s out there. The harder part, and what most of this tutorial is about, is finding a quick way to fetch that data and manipulate it into the form we need.
Analytics is fun, but everything around it can be tough.
We’re going to use the nba_py package.
Huge shoutout to https://github/seemethere for putting this together.
This is going to focus on team stats, so let’s play around a little bit to get a sense of what we’re working with.
Start by importing the packages we’ll need:
    import pandas as pd
    from nba_py import team
If you’re using Jupyter notebooks, you can pip-install any packages you don’t have straight from the notebook by running a cell such as: !pip install nba_py
If you’re using Yhat’s Python IDE, Rodeo, you can install nba_py in the Packages tab.
Install packages in the Packages tab. No surprises here.
Referring to the docs, it looks like we’ll need some sort of roster ID to get data for each team. This API hits an endpoint on the NBA’s website, so the IDs are most likely in the URL:
(Unapologetic Knicks bias) Looking at the team page for the Knicks on stats.nba, here’s the URL: http://stats.nba/team/#!/1610612752/
That number at the end looks like a team ID. Let’s see how the passing data works:
class nba_py.team.TeamPassTracking(team_id, measure_type='Base', per_mode='PerGame', plus_minus='N', pace_adjust='N', rank='N', league_id='00', season='2016-17', season_type='Regular Season', po_round='0', outcome='', location='', month='0', season_segment='', date_from='', date_to='', opponent_team_id='0', vs_conference='', vs_division='', game_segment='', period='0', shot_clock_range='', last_n_games='0')
    passes_made()
    passes_recieved()
    knicks = team.TeamPassTracking(1610612752)
All the info is stored in the knicks object:
| | TEAM_ID | TEAM_NAME | PASS_TYPE | G | PASS_FROM | PASS_TEAMMATE_PLAYER_ID | FREQUENCY | PASS | AST | FGM | FGA | FG_PCT | FG2M | FG2A | FG2_PCT | FG3M | FG3A | FG3_PCT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1610612752 | New York Knicks | made | 64 | Rose, Derrick | 201565 | 0.144 | 56.73 | 4.42 | 6.30 | 14.02 | 0.449 | 4.34 | 8.64 | 0.503 | 1.95 | 5.38 | 0.363 |
| 1 | 1610612752 | New York Knicks | made | 58 | Jennings, Brandon | 201943 | 0.111 | 48.22 | 4.93 | 7.09 | 15.47 | 0.458 | 5.31 | 10.50 | 0.506 | 1.78 | 4.97 | 0.358 |
| 2 | 1610612752 | New York Knicks | made | 66 | Porzingis, Kristaps | 204001 | 0.106 | 40.61 | 1.47 | 3.29 | 7.65 | 0.430 | 2.56 | 5.50 | 0.466 | 0.73 | 2.15 | 0.338 |
| 3 | 1610612752 | New York Knicks | made | 46 | Noah, Joakim | 201149 | 0.073 | 40.20 | 2.24 | 4.17 | 8.85 | 0.472 | 3.43 | 6.93 | 0.495 | 0.74 | 1.91 | 0.386 |
| 4 | 1610612752 | New York Knicks | made | 72 | Anthony, Carmelo | 2546 | 0.102 | 35.83 | 2.88 | 4.18 | 9.65 | 0.433 | 3.13 | 6.99 | 0.447 | 1.06 | 2.67 | 0.396 |
| 5 | 1610612752 | New York Knicks | made | 73 | Lee, Courtney | 201584 | 0.090 | 30.92 | 2.33 | 3.92 | 8.42 | 0.465 | 3.01 | 5.97 | 0.505 | 0.90 | 2.45 | 0.369 |
| 6 | 1610612752 | New York Knicks | made | 68 | Hernangomez, Willy | 1626195 | 0.076 | 28.26 | 1.25 | 2.32 | 5.50 | 0.422 | 1.74 | 3.93 | 0.442 | 0.59 | 1.57 | 0.374 |
| 7 | 1610612752 | New York Knicks | made | 46 | Baker, Ron | 1627758 | 0.045 | 24.93 | 1.87 | 2.61 | 5.72 | 0.456 | 1.93 | 3.80 | 0.509 | 0.67 | 1.91 | 0.352 |
| 8 | 1610612752 | New York Knicks | made | 46 | Thomas, Lance | 202498 | 0.042 | 23.24 | 0.76 | 1.93 | 4.67 | 0.414 | 1.70 | 3.78 | 0.448 | 0.24 | 0.89 | 0.268 |
| 9 | 1610612752 | New York Knicks | made | 75 | O’Quinn, Kyle | 203124 | 0.068 | 22.93 | 1.49 | 2.35 | 4.87 | 0.482 | 1.93 | 3.63 | 0.533 | 0.41 | 1.24 | 0.333 |
Did you know you can inspect, copy and save data frames in the History tab in Rodeo?
Referring back to the docs, these look like per-game averages for passes. There’s definitely a lot that can be done with this, but let’s try to get the same data for a specific game. Referring to the docs:
    knicks_last_game = team.TeamPassTracking(1610612752, last_n_games=1)
    knicks_last_game.passes_made().head(10)
| | TEAM_ID | TEAM_NAME | PASS_TYPE | G | PASS_FROM | PASS_TEAMMATE_PLAYER_ID | FREQUENCY | PASS | AST | FGM | FGA | FG_PCT | FG2M | FG2A | FG2_PCT | FG3M | FG3A | FG3_PCT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1610612752 | New York Knicks | made | 1 | Baker, Ron | 1627758 | 0.212 | 72.0 | 6.0 | 7.0 | 15.0 | 0.467 | 7.0 | 11.0 | 0.636 | 0.0 | 4.0 | 0.000 |
| 1 | 1610612752 | New York Knicks | made | 1 | Ndour, Maurice | 1626254 | 0.135 | 46.0 | 1.0 | 3.0 | 9.0 | 0.333 | 3.0 | 4.0 | 0.750 | 0.0 | 5.0 | 0.000 |
| 2 | 1610612752 | New York Knicks | made | 1 | Anthony, Carmelo | 2546 | 0.126 | 43.0 | 2.0 | 5.0 | 16.0 | 0.313 | 4.0 | 13.0 | 0.308 | 1.0 | 3.0 | 0.333 |
| 3 | 1610612752 | New York Knicks | made | 1 | O’Quinn, Kyle | 203124 | 0.118 | 40.0 | 5.0 | 5.0 | 6.0 | 0.833 | 4.0 | 4.0 | 1.000 | 1.0 | 2.0 | 0.500 |
| 4 | 1610612752 | New York Knicks | made | 1 | Lee, Courtney | 201584 | 0.118 | 40.0 | 3.0 | 6.0 | 8.0 | 0.750 | 2.0 | 4.0 | 0.500 | 4.0 | 4.0 | 1.000 |
| 5 | 1610612752 | New York Knicks | made | 1 | Hernangomez, Willy | 1626195 | 0.082 | 28.0 | 3.0 | 4.0 | 8.0 | 0.500 | 4.0 | 6.0 | 0.667 | 0.0 | 2.0 | 0.000 |
| 6 | 1610612752 | New York Knicks | made | 1 | Holiday, Justin | 203200 | 0.071 | 24.0 | 3.0 | 4.0 | 7.0 | 0.571 | 4.0 | 6.0 | 0.667 | 0.0 | 1.0 | 0.000 |
| 7 | 1610612752 | New York Knicks | made | 1 | Kuzminskas, Mindaugas | 1627851 | 0.059 | 20.0 | 2.0 | 2.0 | 6.0 | 0.333 | 2.0 | 5.0 | 0.400 | 0.0 | 1.0 | 0.000 |
| 8 | 1610612752 | New York Knicks | made | 1 | Randle, Chasson | 1626184 | 0.044 | 15.0 | 0.0 | 0.0 | 1.0 | 0.000 | 0.0 | 1.0 | 0.000 | 0.0 | 0.0 | NaN |
| 9 | 1610612752 | New York Knicks | made | 1 | Vujacic, Sasha | 2756 | 0.035 | 12.0 | 1.0 | 2.0 | 3.0 | 0.667 | 2.0 | 2.0 | 1.000 | 0.0 | 1.0 | 0.000 |
This looks clean enough to be wrangled into a form that can be worked with.
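As a rough sketch of that wrangling, the per-player rows can be collapsed into the team-level "Total Passes", "Total Assists" and "Passes/Assist" columns from the target table. To stay runnable without hitting the API, the snippet below rebuilds a small DataFrame by hand from the PASS_FROM, PASS and AST columns of the single-game output above; with a live connection, the same sums would presumably be applied to the frame returned by passes_made().

```python
import pandas as pd

# Rows copied by hand from the single-game passes_made() output above:
# (passer, passes made, assists).
passes = pd.DataFrame(
    [
        ("Baker, Ron", 72.0, 6.0),
        ("Ndour, Maurice", 46.0, 1.0),
        ("Anthony, Carmelo", 43.0, 2.0),
        ("O'Quinn, Kyle", 40.0, 5.0),
        ("Lee, Courtney", 40.0, 3.0),
        ("Hernangomez, Willy", 28.0, 3.0),
        ("Holiday, Justin", 24.0, 3.0),
        ("Kuzminskas, Mindaugas", 20.0, 2.0),
        ("Randle, Chasson", 15.0, 0.0),
        ("Vujacic, Sasha", 12.0, 1.0),
    ],
    columns=["PASS_FROM", "PASS", "AST"],
)

# Collapse the player rows into team totals for the game.
total_passes = passes["PASS"].sum()
total_assists = passes["AST"].sum()
passes_per_assist = total_passes / total_assists

print(total_passes, total_assists, round(passes_per_assist, 2))
```

For the ten rows shown, that works out to 340 passes and 26 assists, or roughly 13 passes per assist.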
If we’re trying to create a team-level box score, we’ll more than likely need to join tables together down the line. That’s something to keep in mind.
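To make the idea of joining tables concrete, here is a minimal sketch with pandas' merge. The GAME_ID key, the column names and every value in it are hypothetical, made up purely for illustration; the real tables returned by nba_py may need a different key.

```python
import pandas as pd

# Hypothetical per-game passing summary (values are made up).
passing = pd.DataFrame(
    {
        "GAME_ID": [1, 2, 3],
        "TOTAL_PASSES": [340, 310, 295],
        "TOTAL_ASSISTS": [26, 21, 19],
    }
)

# Hypothetical per-game shooting summary (values are made up).
shooting = pd.DataFrame({"GAME_ID": [1, 2, 3], "EFG": [0.51, 0.47, 0.55]})

# Join the two tables on the shared game key to get one row per game.
box_scores = passing.merge(shooting, on="GAME_ID", how="inner")
box_scores["PASSES_PER_ASSIST"] = (
    box_scores["TOTAL_PASSES"] / box_scores["TOTAL_ASSISTS"]
)

print(box_scores)
```

An inner join keeps only games present in both tables, which is a reasonable default when a missing stat would make the row hard to compare anyway.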
Hitting the ShotTracking endpoint looks interesting:
| | TEAM_ID | TEAM_NAME | SORT_ORDER | G | CLOSE_DEF_DIST_RANGE | FGA_FREQUENCY | FGM | FGA | FG_PCT | EFG_PCT | FG2A_FREQUENCY | FG2M | FG2A | FG2_PCT | FG3A_FREQUENCY | FG3M | FG3A | FG3_PCT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1610612752 | New York Knicks | 1 | 1 | 0-2 Feet – Very Tight | 0.091 | 4.0 | 8.0 | 0.500 | 0.500 | 0.091 | 4.0 | 8.0 | 0.500 | 0.000 | 0.0 | 0.0 | NaN |
| 1 | 1610612752 | New York Knicks | 2 | 1 | 2-4 Feet – Tight | 0.318 | 15.0 | 28.0 | 0.536 | 0.536 | 0.295 | 15.0 | 26.0 | 0.577 | 0.023 | 0.0 | 2.0 | 0.000 |
| 2 | 1610612752 | New York Knicks | 3 | 1 | 4-6 Feet – Open | 0.409 | 16.0 | 36.0 | 0.444 | 0.500 | 0.250 | 12.0 | 22.0 | 0.545 | 0.159 | 4.0 | 14.0 | 0.286 |
| 3 | 1610612752 | New York Knicks | 4 | 1 | 6+ Feet – Wide Open | 0.182 | 7.0 | 16.0 | 0.438 | 0.500 | 0.102 | 5.0 | 9.0 | 0.556 | 0.080 | 2.0 | 7.0 | 0.286 |
Following along in Rodeo? Your view should look something like this.
This looks interesting! We wanted EFG% (effective field goal percentage) in our original table, but it looks like we can get EFG% for open and covered shots. Let’s group ‘Open’ and ‘Wide Open’ together, along with ‘Tight’ and ‘Very Tight.’
Effective field goal percentage is a statistic that adjusts field goal percentage to account for the fact that three-point field goals are worth three points while other field goals are worth only two.
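As a quick worked example of that adjustment, here is a minimal sketch using the usual definition, EFG = (FGM + 0.5 * 3PM) / FGA, with the numbers from the "4-6 Feet – Open" row of the table above (16 makes on 36 attempts, 4 of them threes). The function name effective_fg_pct is made up for this illustration and is not part of nba_py.

```python
def effective_fg_pct(fgm, fg3m, fga):
    """Return (FGM + 0.5 * 3PM) / FGA, or None when no shots were attempted."""
    if fga == 0:
        return None
    return (fgm + 0.5 * fg3m) / fga


# "4-6 Feet – Open" row: FGM = 16, FG3M = 4, FGA = 36.
plain_fg = 16 / 36
efg = effective_fg_pct(16, 4, 36)

print(round(plain_fg, 3), round(efg, 3))  # 0.444 0.5
```

The four made threes lift the plain 0.444 to 0.500, which matches the FG_PCT and EFG_PCT values the endpoint reports for that row.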
This might help answer questions like “Do teams hit more open shots when they win?”
    df_grouped = knicks_shots.closest_defender_shooting()
    # Create a new column OPEN, mapped from the 'CLOSE_DEF_DIST_RANGE' column.
    # http://pandas.pydata/pandas-docs/stable/generated/pandas.Series.map.html
    df_grouped['OPEN'] = df_grouped['CLOSE_DEF_DIST_RANGE'].map(lambda x: 'Open' in x)
    df_grouped
| | TEAM_ID | TEAM_NAME | SORT_ORDER | G | CLOSE_DEF_DIST_RANGE | FGA_FREQUENCY | FGM | FGA | FG_PCT | EFG_PCT | FG2A_FREQUENCY | FG2M | FG2A | FG2_PCT | FG3A_FREQUENCY | FG3M | FG3A | FG3_PCT | OPEN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1610612752 | New York Knicks | 1 | 1 | 0-2 Feet – Very Tight | 0.091 | 4.0 | 8.0 | 0.500 | 0.500 | 0.091 | 4.0 | 8.0 | 0.500 | 0.000 | 0.0 | 0.0 | NaN | False |
| 1 | 1610612752 | New York Knicks | 2 | 1 | 2-4 Feet – Tight | 0.318 | 15.0 | 28.0 | 0.536 | 0.536 | 0.295 | 15.0 | 26.0 | 0.577 | 0.023 | 0.0 | 2.0 | 0.000 | False |
| 2 | 1610612752 | New York Knicks | 3 | 1 | 4-6 Feet – Open | 0.409 | 16.0 | 36.0 | 0.444 | 0.500 | 0.250 | 12.0 | 22.0 | 0.545 | 0.159 | 4.0 | 14.0 | 0.286 | True |
| 3 | 1610612752 | New York Knicks | 4 | 1 | 6+ Feet – Wide Open | 0.182 | 7.0 | 16.0 | 0.438 | 0.500 | 0.102 | 5.0 | 9.0 | 0.556 | 0.080 | 2.0 | 7.0 | 0.286 | True |
The last column, ‘OPEN’, gives us the information we need. Now we can aggregate based on it. Let’s get the total number of open shots.
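One way to do that aggregation is a boolean mask with .loc, the same pattern used for the EFG calculation later on. This is only a sketch: the small DataFrame is rebuilt by hand from the CLOSE_DEF_DIST_RANGE and FGA columns of the table above so it runs without the API, and the names shots and open_shots are chosen here for illustration.

```python
import pandas as pd

# CLOSE_DEF_DIST_RANGE and FGA copied by hand from the table above.
shots = pd.DataFrame(
    {
        "CLOSE_DEF_DIST_RANGE": [
            "0-2 Feet – Very Tight",
            "2-4 Feet – Tight",
            "4-6 Feet – Open",
            "6+ Feet – Wide Open",
        ],
        "FGA": [8.0, 28.0, 36.0, 16.0],
    }
)

# Flag the rows whose defender-distance label mentions "Open".
shots["OPEN"] = shots["CLOSE_DEF_DIST_RANGE"].map(lambda x: "Open" in x)

# Keep only the open rows and add up their field goal attempts.
open_shots = shots.loc[shots["OPEN"], "FGA"].sum()

print(open_shots)
```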
That looks like it worked. Similarly, we can get the total number of “covered” shots taken (looks like it’s a lot higher…nothing surprising there.)
Keep in mind, this is a bit misleading, as layups and other shots near the basket are more likely to have a nearby defender.
Referring to the definition for EFG%:
$$EFG = \frac{FGM + 0.5 \times 3PM}{FGA}$$
We definitely have all the information we need to compute this for open and covered shots:
    # Apply the formula above to the open and the covered shots.
    open_efg = (
        df_grouped.loc[df_grouped['OPEN'] == True, 'FGM'].sum()
        + .5 * df_grouped.loc[df_grouped['OPEN'] == True, 'FG3M'].sum()
    ) / df_grouped.loc[df_grouped['OPEN'] == True, 'FGA'].sum()
    covered_efg = (
        df_grouped.loc[df_grouped['OPEN'] == False, 'FGM'].sum()
        + .5 * df_grouped.loc[df_grouped['OPEN'] == False, 'FG3M'].sum()
    ) / df_grouped.loc[df_grouped['OPEN'] == False, 'FGA'].sum()
    print(open_efg)
    print(covered_efg)

    0.5
    0.527777777778
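The same calculation can be written more compactly with a groupby, which avoids repeating the boolean mask six times. This is a sketch rather than the original approach: the DataFrame is rebuilt by hand from the OPEN, FGM, FG3M and FGA columns of the table above, and the SHOT_TYPE labels "open" and "covered" are introduced here only for readability.

```python
import pandas as pd

# OPEN, FGM, FG3M and FGA copied by hand from the table above.
df = pd.DataFrame(
    {
        "OPEN": [False, False, True, True],
        "FGM": [4.0, 15.0, 16.0, 7.0],
        "FG3M": [0.0, 0.0, 4.0, 2.0],
        "FGA": [8.0, 28.0, 36.0, 16.0],
    }
)

# Readable group labels instead of True/False.
df["SHOT_TYPE"] = df["OPEN"].map({True: "open", False: "covered"})

# Sum makes, made threes and attempts per group, then apply the EFG formula.
totals = df.groupby("SHOT_TYPE")[["FGM", "FG3M", "FGA"]].sum()
efg_by_group = (totals["FGM"] + 0.5 * totals["FG3M"]) / totals["FGA"]

print(efg_by_group.round(3))
```

Summing before dividing matters here: averaging the per-row EFG_PCT values would weight an 8-attempt row the same as a 36-attempt row.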
Interesting… shooting better when there’s a defender nearby makes it look like there’s more to the story. Then again, nothing about the Knicks ever s