pandas
------
Pandas is a module for Python to manipulate and analyze tabular data. It
allows easy reading and writing to different file formats (including MS
Excel).
.. code:: python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
working with Excel data
~~~~~~~~~~~~~~~~~~~~~~~
.. code:: python
myexcel = pd.read_excel("pandas_test.xlsx", sheetname="mydata", skiprows=2)
.. code:: python
myexcel
.. raw:: html
|
sample |
weight |
volume |
magnetic moment |
0 |
A |
20.1 |
10.3 |
2.000000e-08 |
1 |
B |
20.4 |
10.2 |
2.000000e-08 |
2 |
C |
19.8 |
9.8 |
2.000000e-08 |
3 |
D |
17.2 |
11.0 |
2.000000e-08 |
4 |
E |
24.3 |
10.3 |
2.000000e-08 |
5 |
F |
18.5 |
10.3 |
2.000000e-08 |
.. code:: python
myexcel[['sample','weight']]
.. raw:: html
|
sample |
weight |
0 |
A |
20.1 |
1 |
B |
20.4 |
2 |
C |
19.8 |
3 |
D |
17.2 |
4 |
E |
24.3 |
5 |
F |
18.5 |
calculations with rows:
.. code:: python
myexcel['density'] = myexcel['weight'] / myexcel['volume']
save pandas dataframe back to an excel file
.. code:: python
myexcel.to_excel('new_excel_file.xlsx')
and a text file
.. code:: python
myexcel.to_csv('new_excel_file.csv', sep='\t')
intermagnet data
~~~~~~~~~~~~~~~~
Data from magnetic observatories all over the world can be downloaded
here:
http://www.intermagnet.org/data-donnee/download-eng.php
we start by reading in the file downloaded from intermagnet in IAGA2002
format
.. code:: python
clf = pd.read_csv("clf20160202vsec.sec", skiprows=22, sep='\s+')
.. code:: python
clf
.. raw:: html
|
DATE |
TIME |
DOY |
CLFX |
CLFY |
CLFZ |
CLFF |
| |
0 |
2016-02-02 |
00:00:00.000 |
33 |
21201.36 |
120.82 |
42892.47 |
47846.22 |
NaN |
1 |
2016-02-02 |
00:00:01.000 |
33 |
21201.35 |
120.80 |
42892.49 |
47846.20 |
NaN |
2 |
2016-02-02 |
00:00:02.000 |
33 |
21201.35 |
120.80 |
42892.47 |
47846.20 |
NaN |
3 |
2016-02-02 |
00:00:03.000 |
33 |
21201.38 |
120.79 |
42892.49 |
47846.21 |
NaN |
4 |
2016-02-02 |
00:00:04.000 |
33 |
21201.37 |
120.80 |
42892.49 |
47846.20 |
NaN |
5 |
2016-02-02 |
00:00:05.000 |
33 |
21201.36 |
120.83 |
42892.46 |
47846.18 |
NaN |
6 |
2016-02-02 |
00:00:06.000 |
33 |
21201.34 |
120.86 |
42892.45 |
47846.19 |
NaN |
7 |
2016-02-02 |
00:00:07.000 |
33 |
21201.36 |
120.83 |
42892.45 |
47846.19 |
NaN |
8 |
2016-02-02 |
00:00:08.000 |
33 |
21201.35 |
120.83 |
42892.45 |
47846.17 |
NaN |
9 |
2016-02-02 |
00:00:09.000 |
33 |
21201.35 |
120.81 |
42892.47 |
47846.17 |
NaN |
10 |
2016-02-02 |
00:00:10.000 |
33 |
21201.33 |
120.80 |
42892.44 |
47846.15 |
NaN |
11 |
2016-02-02 |
00:00:11.000 |
33 |
21201.33 |
120.82 |
42892.43 |
47846.17 |
NaN |
12 |
2016-02-02 |
00:00:12.000 |
33 |
21201.33 |
120.85 |
42892.41 |
47846.14 |
NaN |
13 |
2016-02-02 |
00:00:13.000 |
33 |
21201.35 |
120.84 |
42892.43 |
47846.15 |
NaN |
14 |
2016-02-02 |
00:00:14.000 |
33 |
21201.34 |
120.83 |
42892.45 |
47846.14 |
NaN |
15 |
2016-02-02 |
00:00:15.000 |
33 |
21201.31 |
120.84 |
42892.43 |
47846.15 |
NaN |
16 |
2016-02-02 |
00:00:16.000 |
33 |
21201.34 |
120.84 |
42892.41 |
47846.16 |
NaN |
17 |
2016-02-02 |
00:00:17.000 |
33 |
21201.35 |
120.85 |
42892.40 |
47846.13 |
NaN |
18 |
2016-02-02 |
00:00:18.000 |
33 |
21201.36 |
120.82 |
42892.44 |
47846.14 |
NaN |
19 |
2016-02-02 |
00:00:19.000 |
33 |
21201.36 |
120.81 |
42892.45 |
47846.16 |
NaN |
20 |
2016-02-02 |
00:00:20.000 |
33 |
21201.36 |
120.82 |
42892.44 |
47846.15 |
NaN |
21 |
2016-02-02 |
00:00:21.000 |
33 |
21201.36 |
120.81 |
42892.44 |
47846.16 |
NaN |
22 |
2016-02-02 |
00:00:22.000 |
33 |
21201.35 |
120.82 |
42892.46 |
47846.16 |
NaN |
23 |
2016-02-02 |
00:00:23.000 |
33 |
21201.35 |
120.81 |
42892.48 |
47846.15 |
NaN |
24 |
2016-02-02 |
00:00:24.000 |
33 |
21201.33 |
120.82 |
42892.47 |
47846.16 |
NaN |
25 |
2016-02-02 |
00:00:25.000 |
33 |
21201.37 |
120.77 |
42892.46 |
47846.18 |
NaN |
26 |
2016-02-02 |
00:00:26.000 |
33 |
21201.41 |
120.76 |
42892.48 |
47846.18 |
NaN |
27 |
2016-02-02 |
00:00:27.000 |
33 |
21201.42 |
120.74 |
42892.47 |
47846.20 |
NaN |
28 |
2016-02-02 |
00:00:28.000 |
33 |
21201.41 |
120.75 |
42892.47 |
47846.21 |
NaN |
29 |
2016-02-02 |
00:00:29.000 |
33 |
21201.41 |
120.76 |
42892.50 |
47846.20 |
NaN |
... |
... |
... |
... |
... |
... |
... |
... |
... |
86370 |
2016-02-02 |
23:59:30.000 |
33 |
21191.18 |
158.23 |
42897.90 |
47846.59 |
NaN |
86371 |
2016-02-02 |
23:59:31.000 |
33 |
21191.18 |
158.24 |
42897.90 |
47846.60 |
NaN |
86372 |
2016-02-02 |
23:59:32.000 |
33 |
21191.18 |
158.23 |
42897.91 |
47846.60 |
NaN |
86373 |
2016-02-02 |
23:59:33.000 |
33 |
21191.17 |
158.27 |
42897.91 |
47846.58 |
NaN |
86374 |
2016-02-02 |
23:59:34.000 |
33 |
21191.19 |
158.28 |
42897.90 |
47846.59 |
NaN |
86375 |
2016-02-02 |
23:59:35.000 |
33 |
21191.21 |
158.31 |
42897.91 |
47846.60 |
NaN |
86376 |
2016-02-02 |
23:59:36.000 |
33 |
21191.25 |
158.30 |
42897.90 |
47846.60 |
NaN |
86377 |
2016-02-02 |
23:59:37.000 |
33 |
21191.23 |
158.30 |
42897.93 |
47846.61 |
NaN |
86378 |
2016-02-02 |
23:59:38.000 |
33 |
21191.21 |
158.34 |
42897.93 |
47846.60 |
NaN |
86379 |
2016-02-02 |
23:59:39.000 |
33 |
21191.20 |
158.41 |
42897.91 |
47846.59 |
NaN |
86380 |
2016-02-02 |
23:59:40.000 |
33 |
21191.19 |
158.46 |
42897.90 |
47846.56 |
NaN |
86381 |
2016-02-02 |
23:59:41.000 |
33 |
21191.18 |
158.48 |
42897.86 |
47846.53 |
NaN |
86382 |
2016-02-02 |
23:59:42.000 |
33 |
21191.18 |
158.48 |
42897.86 |
47846.50 |
NaN |
86383 |
2016-02-02 |
23:59:43.000 |
33 |
21191.21 |
158.45 |
42897.85 |
47846.51 |
NaN |
86384 |
2016-02-02 |
23:59:44.000 |
33 |
21191.21 |
158.45 |
42897.86 |
47846.53 |
NaN |
86385 |
2016-02-02 |
23:59:45.000 |
33 |
21191.23 |
158.46 |
42897.84 |
47846.53 |
NaN |
86386 |
2016-02-02 |
23:59:46.000 |
33 |
21191.25 |
158.45 |
42897.83 |
47846.55 |
NaN |
86387 |
2016-02-02 |
23:59:47.000 |
33 |
21191.26 |
158.44 |
42897.84 |
47846.53 |
NaN |
86388 |
2016-02-02 |
23:59:48.000 |
33 |
21191.26 |
158.46 |
42897.82 |
47846.56 |
NaN |
86389 |
2016-02-02 |
23:59:49.000 |
33 |
21191.25 |
158.49 |
42897.83 |
47846.54 |
NaN |
86390 |
2016-02-02 |
23:59:50.000 |
33 |
21191.21 |
158.49 |
42897.84 |
47846.54 |
NaN |
86391 |
2016-02-02 |
23:59:51.000 |
33 |
21191.22 |
158.49 |
42897.82 |
47846.52 |
NaN |
86392 |
2016-02-02 |
23:59:52.000 |
33 |
21191.22 |
158.49 |
42897.85 |
47846.53 |
NaN |
86393 |
2016-02-02 |
23:59:53.000 |
33 |
21191.21 |
158.51 |
42897.85 |
47846.51 |
NaN |
86394 |
2016-02-02 |
23:59:54.000 |
33 |
21191.24 |
158.48 |
42897.84 |
47846.52 |
NaN |
86395 |
2016-02-02 |
23:59:55.000 |
33 |
21191.23 |
158.46 |
42897.83 |
47846.54 |
NaN |
86396 |
2016-02-02 |
23:59:56.000 |
33 |
21191.23 |
158.47 |
42897.83 |
47846.53 |
NaN |
86397 |
2016-02-02 |
23:59:57.000 |
33 |
21191.23 |
158.48 |
42897.86 |
47846.55 |
NaN |
86398 |
2016-02-02 |
23:59:58.000 |
33 |
21191.23 |
158.48 |
42897.86 |
47846.52 |
NaN |
86399 |
2016-02-02 |
23:59:59.000 |
33 |
21191.24 |
158.50 |
42897.88 |
47846.53 |
NaN |
86400 rows × 8 columns
.. code:: python
clf.dtypes
.. parsed-literal::
DATE object
TIME object
DOY int64
CLFX float64
CLFY float64
CLFZ float64
CLFF float64
| float64
dtype: object
plot single column
.. code:: python
clf['CLFF'].plot()
.. parsed-literal::
.. image:: 10_pandas_files/10_pandas_21_1.png
plot specific columns
.. code:: python
clf.plot(x='TIME', y='CLFF', figsize=(12,8))
.. parsed-literal::
.. image:: 10_pandas_files/10_pandas_23_1.png
add a rolling mean
.. code:: python
clf['CLFF_rolling'] = pd.rolling_mean( clf['CLFF'], window=1000, center=True)
.. code:: python
clf.loc[490:530]
.. raw:: html
|
DATE |
TIME |
DOY |
CLFX |
CLFY |
CLFZ |
CLFF |
| |
CLFF_rolling |
490 |
2016-02-02 |
00:08:10.000 |
33 |
21202.41 |
121.06 |
42892.77 |
47846.95 |
NaN |
NaN |
491 |
2016-02-02 |
00:08:11.000 |
33 |
21202.40 |
121.07 |
42892.73 |
47846.95 |
NaN |
NaN |
492 |
2016-02-02 |
00:08:12.000 |
33 |
21202.41 |
121.10 |
42892.73 |
47846.93 |
NaN |
NaN |
493 |
2016-02-02 |
00:08:13.000 |
33 |
21202.39 |
121.11 |
42892.72 |
47846.91 |
NaN |
NaN |
494 |
2016-02-02 |
00:08:14.000 |
33 |
21202.40 |
121.09 |
42892.70 |
47846.90 |
NaN |
NaN |
495 |
2016-02-02 |
00:08:15.000 |
33 |
21202.40 |
121.07 |
42892.70 |
47846.91 |
NaN |
NaN |
496 |
2016-02-02 |
00:08:16.000 |
33 |
21202.38 |
121.08 |
42892.72 |
47846.86 |
NaN |
NaN |
497 |
2016-02-02 |
00:08:17.000 |
33 |
21202.39 |
121.06 |
42892.70 |
47846.89 |
NaN |
NaN |
498 |
2016-02-02 |
00:08:18.000 |
33 |
21202.40 |
121.06 |
42892.67 |
47846.88 |
NaN |
NaN |
499 |
2016-02-02 |
00:08:19.000 |
33 |
21202.37 |
121.06 |
42892.68 |
47846.86 |
NaN |
NaN |
500 |
2016-02-02 |
00:08:20.000 |
33 |
21202.37 |
121.03 |
42892.70 |
47846.87 |
NaN |
47846.59858 |
501 |
2016-02-02 |
00:08:21.000 |
33 |
21202.37 |
121.03 |
42892.71 |
47846.86 |
NaN |
47846.59846 |
502 |
2016-02-02 |
00:08:22.000 |
33 |
21202.39 |
121.06 |
42892.68 |
47846.88 |
NaN |
47846.59836 |
503 |
2016-02-02 |
00:08:23.000 |
33 |
21202.39 |
121.02 |
42892.67 |
47846.88 |
NaN |
47846.59832 |
504 |
2016-02-02 |
00:08:24.000 |
33 |
21202.39 |
121.05 |
42892.70 |
47846.86 |
NaN |
47846.59829 |
505 |
2016-02-02 |
00:08:25.000 |
33 |
21202.39 |
121.04 |
42892.73 |
47846.89 |
NaN |
47846.59832 |
506 |
2016-02-02 |
00:08:26.000 |
33 |
21202.42 |
121.04 |
42892.72 |
47846.89 |
NaN |
47846.59837 |
507 |
2016-02-02 |
00:08:27.000 |
33 |
21202.41 |
121.03 |
42892.73 |
47846.89 |
NaN |
47846.59846 |
508 |
2016-02-02 |
00:08:28.000 |
33 |
21202.38 |
121.06 |
42892.75 |
47846.87 |
NaN |
47846.59851 |
509 |
2016-02-02 |
00:08:29.000 |
33 |
21202.37 |
121.08 |
42892.70 |
47846.85 |
NaN |
47846.59863 |
510 |
2016-02-02 |
00:08:30.000 |
33 |
21202.36 |
121.11 |
42892.69 |
47846.85 |
NaN |
47846.59873 |
511 |
2016-02-02 |
00:08:31.000 |
33 |
21202.36 |
121.12 |
42892.65 |
47846.83 |
NaN |
47846.59888 |
512 |
2016-02-02 |
00:08:32.000 |
33 |
21202.35 |
121.11 |
42892.65 |
47846.82 |
NaN |
47846.59899 |
513 |
2016-02-02 |
00:08:33.000 |
33 |
21202.33 |
121.12 |
42892.65 |
47846.83 |
NaN |
47846.59914 |
514 |
2016-02-02 |
00:08:34.000 |
33 |
21202.36 |
121.16 |
42892.62 |
47846.81 |
NaN |
47846.59927 |
515 |
2016-02-02 |
00:08:35.000 |
33 |
21202.37 |
121.15 |
42892.59 |
47846.79 |
NaN |
47846.59941 |
516 |
2016-02-02 |
00:08:36.000 |
33 |
21202.38 |
121.16 |
42892.61 |
47846.79 |
NaN |
47846.59953 |
517 |
2016-02-02 |
00:08:37.000 |
33 |
21202.36 |
121.17 |
42892.61 |
47846.78 |
NaN |
47846.59964 |
518 |
2016-02-02 |
00:08:38.000 |
33 |
21202.37 |
121.16 |
42892.59 |
47846.79 |
NaN |
47846.59977 |
519 |
2016-02-02 |
00:08:39.000 |
33 |
21202.36 |
121.19 |
42892.57 |
47846.79 |
NaN |
47846.59990 |
520 |
2016-02-02 |
00:08:40.000 |
33 |
21202.35 |
121.22 |
42892.55 |
47846.78 |
NaN |
47846.59999 |
521 |
2016-02-02 |
00:08:41.000 |
33 |
21202.34 |
121.22 |
42892.57 |
47846.77 |
NaN |
47846.60009 |
522 |
2016-02-02 |
00:08:42.000 |
33 |
21202.39 |
121.23 |
42892.59 |
47846.80 |
NaN |
47846.60017 |
523 |
2016-02-02 |
00:08:43.000 |
33 |
21202.38 |
121.25 |
42892.55 |
47846.75 |
NaN |
47846.60024 |
524 |
2016-02-02 |
00:08:44.000 |
33 |
21202.38 |
121.29 |
42892.54 |
47846.75 |
NaN |
47846.60031 |
525 |
2016-02-02 |
00:08:45.000 |
33 |
21202.36 |
121.29 |
42892.56 |
47846.71 |
NaN |
47846.60036 |
526 |
2016-02-02 |
00:08:46.000 |
33 |
21202.33 |
121.30 |
42892.58 |
47846.72 |
NaN |
47846.60039 |
527 |
2016-02-02 |
00:08:47.000 |
33 |
21202.34 |
121.30 |
42892.57 |
47846.70 |
NaN |
47846.60041 |
528 |
2016-02-02 |
00:08:48.000 |
33 |
21202.33 |
121.29 |
42892.57 |
47846.70 |
NaN |
47846.60039 |
529 |
2016-02-02 |
00:08:49.000 |
33 |
21202.32 |
121.28 |
42892.53 |
47846.71 |
NaN |
47846.60036 |
530 |
2016-02-02 |
00:08:50.000 |
33 |
21202.30 |
121.28 |
42892.51 |
47846.71 |
NaN |
47846.60035 |
.. code:: python
ax = clf.plot(x='TIME', y='CLFF', figsize=(12,8))
clf.plot(ax=ax, x='TIME', y='CLFF_rolling', color='r', linewidth=3)
plt.show()
.. image:: 10_pandas_files/10_pandas_27_0.png