pandas
------
Pandas is a module for Python to manipulate and analyze tabular data. It
allows easy reading and writing to different file formats (including MS
Excel).
.. code:: python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
working with Excel data
~~~~~~~~~~~~~~~~~~~~~~~
.. code:: python
myexcel = pd.read_excel("pandas_test.xlsx", sheetname="mydata", skiprows=2)
.. code:: python
myexcel
.. raw:: html
|
sample |
weight |
volume |
magnetic moment |
| 0 |
A |
20.1 |
10.3 |
2.000000e-08 |
| 1 |
B |
20.4 |
10.2 |
2.000000e-08 |
| 2 |
C |
19.8 |
9.8 |
2.000000e-08 |
| 3 |
D |
17.2 |
11.0 |
2.000000e-08 |
| 4 |
E |
24.3 |
10.3 |
2.000000e-08 |
| 5 |
F |
18.5 |
10.3 |
2.000000e-08 |
.. code:: python
myexcel[['sample','weight']]
.. raw:: html
|
sample |
weight |
| 0 |
A |
20.1 |
| 1 |
B |
20.4 |
| 2 |
C |
19.8 |
| 3 |
D |
17.2 |
| 4 |
E |
24.3 |
| 5 |
F |
18.5 |
calculations with rows:
.. code:: python
myexcel['density'] = myexcel['weight'] / myexcel['volume']
save pandas dataframe back to an excel file
.. code:: python
myexcel.to_excel('new_excel_file.xlsx')
and a text file
.. code:: python
myexcel.to_csv('new_excel_file.csv', sep='\t')
intermagnet data
~~~~~~~~~~~~~~~~
Data from magnetic observatories all over the world can be downloaded
here:
http://www.intermagnet.org/data-donnee/download-eng.php
we start by reading in the file downloaded from intermagnet in IAGA2002
format
.. code:: python
clf = pd.read_csv("clf20160202vsec.sec", skiprows=22, sep='\s+')
.. code:: python
clf
.. raw:: html
|
DATE |
TIME |
DOY |
CLFX |
CLFY |
CLFZ |
CLFF |
| |
| 0 |
2016-02-02 |
00:00:00.000 |
33 |
21201.36 |
120.82 |
42892.47 |
47846.22 |
NaN |
| 1 |
2016-02-02 |
00:00:01.000 |
33 |
21201.35 |
120.80 |
42892.49 |
47846.20 |
NaN |
| 2 |
2016-02-02 |
00:00:02.000 |
33 |
21201.35 |
120.80 |
42892.47 |
47846.20 |
NaN |
| 3 |
2016-02-02 |
00:00:03.000 |
33 |
21201.38 |
120.79 |
42892.49 |
47846.21 |
NaN |
| 4 |
2016-02-02 |
00:00:04.000 |
33 |
21201.37 |
120.80 |
42892.49 |
47846.20 |
NaN |
| 5 |
2016-02-02 |
00:00:05.000 |
33 |
21201.36 |
120.83 |
42892.46 |
47846.18 |
NaN |
| 6 |
2016-02-02 |
00:00:06.000 |
33 |
21201.34 |
120.86 |
42892.45 |
47846.19 |
NaN |
| 7 |
2016-02-02 |
00:00:07.000 |
33 |
21201.36 |
120.83 |
42892.45 |
47846.19 |
NaN |
| 8 |
2016-02-02 |
00:00:08.000 |
33 |
21201.35 |
120.83 |
42892.45 |
47846.17 |
NaN |
| 9 |
2016-02-02 |
00:00:09.000 |
33 |
21201.35 |
120.81 |
42892.47 |
47846.17 |
NaN |
| 10 |
2016-02-02 |
00:00:10.000 |
33 |
21201.33 |
120.80 |
42892.44 |
47846.15 |
NaN |
| 11 |
2016-02-02 |
00:00:11.000 |
33 |
21201.33 |
120.82 |
42892.43 |
47846.17 |
NaN |
| 12 |
2016-02-02 |
00:00:12.000 |
33 |
21201.33 |
120.85 |
42892.41 |
47846.14 |
NaN |
| 13 |
2016-02-02 |
00:00:13.000 |
33 |
21201.35 |
120.84 |
42892.43 |
47846.15 |
NaN |
| 14 |
2016-02-02 |
00:00:14.000 |
33 |
21201.34 |
120.83 |
42892.45 |
47846.14 |
NaN |
| 15 |
2016-02-02 |
00:00:15.000 |
33 |
21201.31 |
120.84 |
42892.43 |
47846.15 |
NaN |
| 16 |
2016-02-02 |
00:00:16.000 |
33 |
21201.34 |
120.84 |
42892.41 |
47846.16 |
NaN |
| 17 |
2016-02-02 |
00:00:17.000 |
33 |
21201.35 |
120.85 |
42892.40 |
47846.13 |
NaN |
| 18 |
2016-02-02 |
00:00:18.000 |
33 |
21201.36 |
120.82 |
42892.44 |
47846.14 |
NaN |
| 19 |
2016-02-02 |
00:00:19.000 |
33 |
21201.36 |
120.81 |
42892.45 |
47846.16 |
NaN |
| 20 |
2016-02-02 |
00:00:20.000 |
33 |
21201.36 |
120.82 |
42892.44 |
47846.15 |
NaN |
| 21 |
2016-02-02 |
00:00:21.000 |
33 |
21201.36 |
120.81 |
42892.44 |
47846.16 |
NaN |
| 22 |
2016-02-02 |
00:00:22.000 |
33 |
21201.35 |
120.82 |
42892.46 |
47846.16 |
NaN |
| 23 |
2016-02-02 |
00:00:23.000 |
33 |
21201.35 |
120.81 |
42892.48 |
47846.15 |
NaN |
| 24 |
2016-02-02 |
00:00:24.000 |
33 |
21201.33 |
120.82 |
42892.47 |
47846.16 |
NaN |
| 25 |
2016-02-02 |
00:00:25.000 |
33 |
21201.37 |
120.77 |
42892.46 |
47846.18 |
NaN |
| 26 |
2016-02-02 |
00:00:26.000 |
33 |
21201.41 |
120.76 |
42892.48 |
47846.18 |
NaN |
| 27 |
2016-02-02 |
00:00:27.000 |
33 |
21201.42 |
120.74 |
42892.47 |
47846.20 |
NaN |
| 28 |
2016-02-02 |
00:00:28.000 |
33 |
21201.41 |
120.75 |
42892.47 |
47846.21 |
NaN |
| 29 |
2016-02-02 |
00:00:29.000 |
33 |
21201.41 |
120.76 |
42892.50 |
47846.20 |
NaN |
| ... |
... |
... |
... |
... |
... |
... |
... |
... |
| 86370 |
2016-02-02 |
23:59:30.000 |
33 |
21191.18 |
158.23 |
42897.90 |
47846.59 |
NaN |
| 86371 |
2016-02-02 |
23:59:31.000 |
33 |
21191.18 |
158.24 |
42897.90 |
47846.60 |
NaN |
| 86372 |
2016-02-02 |
23:59:32.000 |
33 |
21191.18 |
158.23 |
42897.91 |
47846.60 |
NaN |
| 86373 |
2016-02-02 |
23:59:33.000 |
33 |
21191.17 |
158.27 |
42897.91 |
47846.58 |
NaN |
| 86374 |
2016-02-02 |
23:59:34.000 |
33 |
21191.19 |
158.28 |
42897.90 |
47846.59 |
NaN |
| 86375 |
2016-02-02 |
23:59:35.000 |
33 |
21191.21 |
158.31 |
42897.91 |
47846.60 |
NaN |
| 86376 |
2016-02-02 |
23:59:36.000 |
33 |
21191.25 |
158.30 |
42897.90 |
47846.60 |
NaN |
| 86377 |
2016-02-02 |
23:59:37.000 |
33 |
21191.23 |
158.30 |
42897.93 |
47846.61 |
NaN |
| 86378 |
2016-02-02 |
23:59:38.000 |
33 |
21191.21 |
158.34 |
42897.93 |
47846.60 |
NaN |
| 86379 |
2016-02-02 |
23:59:39.000 |
33 |
21191.20 |
158.41 |
42897.91 |
47846.59 |
NaN |
| 86380 |
2016-02-02 |
23:59:40.000 |
33 |
21191.19 |
158.46 |
42897.90 |
47846.56 |
NaN |
| 86381 |
2016-02-02 |
23:59:41.000 |
33 |
21191.18 |
158.48 |
42897.86 |
47846.53 |
NaN |
| 86382 |
2016-02-02 |
23:59:42.000 |
33 |
21191.18 |
158.48 |
42897.86 |
47846.50 |
NaN |
| 86383 |
2016-02-02 |
23:59:43.000 |
33 |
21191.21 |
158.45 |
42897.85 |
47846.51 |
NaN |
| 86384 |
2016-02-02 |
23:59:44.000 |
33 |
21191.21 |
158.45 |
42897.86 |
47846.53 |
NaN |
| 86385 |
2016-02-02 |
23:59:45.000 |
33 |
21191.23 |
158.46 |
42897.84 |
47846.53 |
NaN |
| 86386 |
2016-02-02 |
23:59:46.000 |
33 |
21191.25 |
158.45 |
42897.83 |
47846.55 |
NaN |
| 86387 |
2016-02-02 |
23:59:47.000 |
33 |
21191.26 |
158.44 |
42897.84 |
47846.53 |
NaN |
| 86388 |
2016-02-02 |
23:59:48.000 |
33 |
21191.26 |
158.46 |
42897.82 |
47846.56 |
NaN |
| 86389 |
2016-02-02 |
23:59:49.000 |
33 |
21191.25 |
158.49 |
42897.83 |
47846.54 |
NaN |
| 86390 |
2016-02-02 |
23:59:50.000 |
33 |
21191.21 |
158.49 |
42897.84 |
47846.54 |
NaN |
| 86391 |
2016-02-02 |
23:59:51.000 |
33 |
21191.22 |
158.49 |
42897.82 |
47846.52 |
NaN |
| 86392 |
2016-02-02 |
23:59:52.000 |
33 |
21191.22 |
158.49 |
42897.85 |
47846.53 |
NaN |
| 86393 |
2016-02-02 |
23:59:53.000 |
33 |
21191.21 |
158.51 |
42897.85 |
47846.51 |
NaN |
| 86394 |
2016-02-02 |
23:59:54.000 |
33 |
21191.24 |
158.48 |
42897.84 |
47846.52 |
NaN |
| 86395 |
2016-02-02 |
23:59:55.000 |
33 |
21191.23 |
158.46 |
42897.83 |
47846.54 |
NaN |
| 86396 |
2016-02-02 |
23:59:56.000 |
33 |
21191.23 |
158.47 |
42897.83 |
47846.53 |
NaN |
| 86397 |
2016-02-02 |
23:59:57.000 |
33 |
21191.23 |
158.48 |
42897.86 |
47846.55 |
NaN |
| 86398 |
2016-02-02 |
23:59:58.000 |
33 |
21191.23 |
158.48 |
42897.86 |
47846.52 |
NaN |
| 86399 |
2016-02-02 |
23:59:59.000 |
33 |
21191.24 |
158.50 |
42897.88 |
47846.53 |
NaN |
86400 rows × 8 columns
.. code:: python
clf.dtypes
.. parsed-literal::
DATE object
TIME object
DOY int64
CLFX float64
CLFY float64
CLFZ float64
CLFF float64
| float64
dtype: object
plot single column
.. code:: python
clf['CLFF'].plot()
.. parsed-literal::
.. image:: 10_pandas_files/10_pandas_21_1.png
plot specific columns
.. code:: python
clf.plot(x='TIME', y='CLFF', figsize=(12,8))
.. parsed-literal::
.. image:: 10_pandas_files/10_pandas_23_1.png
add a rolling mean
.. code:: python
clf['CLFF_rolling'] = pd.rolling_mean( clf['CLFF'], window=1000, center=True)
.. code:: python
clf.loc[490:530]
.. raw:: html
|
DATE |
TIME |
DOY |
CLFX |
CLFY |
CLFZ |
CLFF |
| |
CLFF_rolling |
| 490 |
2016-02-02 |
00:08:10.000 |
33 |
21202.41 |
121.06 |
42892.77 |
47846.95 |
NaN |
NaN |
| 491 |
2016-02-02 |
00:08:11.000 |
33 |
21202.40 |
121.07 |
42892.73 |
47846.95 |
NaN |
NaN |
| 492 |
2016-02-02 |
00:08:12.000 |
33 |
21202.41 |
121.10 |
42892.73 |
47846.93 |
NaN |
NaN |
| 493 |
2016-02-02 |
00:08:13.000 |
33 |
21202.39 |
121.11 |
42892.72 |
47846.91 |
NaN |
NaN |
| 494 |
2016-02-02 |
00:08:14.000 |
33 |
21202.40 |
121.09 |
42892.70 |
47846.90 |
NaN |
NaN |
| 495 |
2016-02-02 |
00:08:15.000 |
33 |
21202.40 |
121.07 |
42892.70 |
47846.91 |
NaN |
NaN |
| 496 |
2016-02-02 |
00:08:16.000 |
33 |
21202.38 |
121.08 |
42892.72 |
47846.86 |
NaN |
NaN |
| 497 |
2016-02-02 |
00:08:17.000 |
33 |
21202.39 |
121.06 |
42892.70 |
47846.89 |
NaN |
NaN |
| 498 |
2016-02-02 |
00:08:18.000 |
33 |
21202.40 |
121.06 |
42892.67 |
47846.88 |
NaN |
NaN |
| 499 |
2016-02-02 |
00:08:19.000 |
33 |
21202.37 |
121.06 |
42892.68 |
47846.86 |
NaN |
NaN |
| 500 |
2016-02-02 |
00:08:20.000 |
33 |
21202.37 |
121.03 |
42892.70 |
47846.87 |
NaN |
47846.59858 |
| 501 |
2016-02-02 |
00:08:21.000 |
33 |
21202.37 |
121.03 |
42892.71 |
47846.86 |
NaN |
47846.59846 |
| 502 |
2016-02-02 |
00:08:22.000 |
33 |
21202.39 |
121.06 |
42892.68 |
47846.88 |
NaN |
47846.59836 |
| 503 |
2016-02-02 |
00:08:23.000 |
33 |
21202.39 |
121.02 |
42892.67 |
47846.88 |
NaN |
47846.59832 |
| 504 |
2016-02-02 |
00:08:24.000 |
33 |
21202.39 |
121.05 |
42892.70 |
47846.86 |
NaN |
47846.59829 |
| 505 |
2016-02-02 |
00:08:25.000 |
33 |
21202.39 |
121.04 |
42892.73 |
47846.89 |
NaN |
47846.59832 |
| 506 |
2016-02-02 |
00:08:26.000 |
33 |
21202.42 |
121.04 |
42892.72 |
47846.89 |
NaN |
47846.59837 |
| 507 |
2016-02-02 |
00:08:27.000 |
33 |
21202.41 |
121.03 |
42892.73 |
47846.89 |
NaN |
47846.59846 |
| 508 |
2016-02-02 |
00:08:28.000 |
33 |
21202.38 |
121.06 |
42892.75 |
47846.87 |
NaN |
47846.59851 |
| 509 |
2016-02-02 |
00:08:29.000 |
33 |
21202.37 |
121.08 |
42892.70 |
47846.85 |
NaN |
47846.59863 |
| 510 |
2016-02-02 |
00:08:30.000 |
33 |
21202.36 |
121.11 |
42892.69 |
47846.85 |
NaN |
47846.59873 |
| 511 |
2016-02-02 |
00:08:31.000 |
33 |
21202.36 |
121.12 |
42892.65 |
47846.83 |
NaN |
47846.59888 |
| 512 |
2016-02-02 |
00:08:32.000 |
33 |
21202.35 |
121.11 |
42892.65 |
47846.82 |
NaN |
47846.59899 |
| 513 |
2016-02-02 |
00:08:33.000 |
33 |
21202.33 |
121.12 |
42892.65 |
47846.83 |
NaN |
47846.59914 |
| 514 |
2016-02-02 |
00:08:34.000 |
33 |
21202.36 |
121.16 |
42892.62 |
47846.81 |
NaN |
47846.59927 |
| 515 |
2016-02-02 |
00:08:35.000 |
33 |
21202.37 |
121.15 |
42892.59 |
47846.79 |
NaN |
47846.59941 |
| 516 |
2016-02-02 |
00:08:36.000 |
33 |
21202.38 |
121.16 |
42892.61 |
47846.79 |
NaN |
47846.59953 |
| 517 |
2016-02-02 |
00:08:37.000 |
33 |
21202.36 |
121.17 |
42892.61 |
47846.78 |
NaN |
47846.59964 |
| 518 |
2016-02-02 |
00:08:38.000 |
33 |
21202.37 |
121.16 |
42892.59 |
47846.79 |
NaN |
47846.59977 |
| 519 |
2016-02-02 |
00:08:39.000 |
33 |
21202.36 |
121.19 |
42892.57 |
47846.79 |
NaN |
47846.59990 |
| 520 |
2016-02-02 |
00:08:40.000 |
33 |
21202.35 |
121.22 |
42892.55 |
47846.78 |
NaN |
47846.59999 |
| 521 |
2016-02-02 |
00:08:41.000 |
33 |
21202.34 |
121.22 |
42892.57 |
47846.77 |
NaN |
47846.60009 |
| 522 |
2016-02-02 |
00:08:42.000 |
33 |
21202.39 |
121.23 |
42892.59 |
47846.80 |
NaN |
47846.60017 |
| 523 |
2016-02-02 |
00:08:43.000 |
33 |
21202.38 |
121.25 |
42892.55 |
47846.75 |
NaN |
47846.60024 |
| 524 |
2016-02-02 |
00:08:44.000 |
33 |
21202.38 |
121.29 |
42892.54 |
47846.75 |
NaN |
47846.60031 |
| 525 |
2016-02-02 |
00:08:45.000 |
33 |
21202.36 |
121.29 |
42892.56 |
47846.71 |
NaN |
47846.60036 |
| 526 |
2016-02-02 |
00:08:46.000 |
33 |
21202.33 |
121.30 |
42892.58 |
47846.72 |
NaN |
47846.60039 |
| 527 |
2016-02-02 |
00:08:47.000 |
33 |
21202.34 |
121.30 |
42892.57 |
47846.70 |
NaN |
47846.60041 |
| 528 |
2016-02-02 |
00:08:48.000 |
33 |
21202.33 |
121.29 |
42892.57 |
47846.70 |
NaN |
47846.60039 |
| 529 |
2016-02-02 |
00:08:49.000 |
33 |
21202.32 |
121.28 |
42892.53 |
47846.71 |
NaN |
47846.60036 |
| 530 |
2016-02-02 |
00:08:50.000 |
33 |
21202.30 |
121.28 |
42892.51 |
47846.71 |
NaN |
47846.60035 |
.. code:: python
ax = clf.plot(x='TIME', y='CLFF', figsize=(12,8))
clf.plot(ax=ax, x='TIME', y='CLFF_rolling', color='r', linewidth=3)
plt.show()
.. image:: 10_pandas_files/10_pandas_27_0.png