pandas

Pandas is a module for Python to manipulate and analyze tabular data. It allows easy reading and writing to different file formats (including MS Excel).

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

working with Excel data

myexcel = pd.read_excel("pandas_test.xlsx", sheetname="mydata", skiprows=2)
myexcel
sample weight volume magnetic moment
0 A 20.1 10.3 2.000000e-08
1 B 20.4 10.2 2.000000e-08
2 C 19.8 9.8 2.000000e-08
3 D 17.2 11.0 2.000000e-08
4 E 24.3 10.3 2.000000e-08
5 F 18.5 10.3 2.000000e-08
myexcel[['sample','weight']]
sample weight
0 A 20.1
1 B 20.4
2 C 19.8
3 D 17.2
4 E 24.3
5 F 18.5

calculations with rows:

myexcel['density'] = myexcel['weight'] / myexcel['volume']

save pandas dataframe back to an excel file

myexcel.to_excel('new_excel_file.xlsx')

and a text file

myexcel.to_csv('new_excel_file.csv', sep='\t')

intermagnet data

Data from magnetic observatories all over the world can be downloaded here:

http://www.intermagnet.org/data-donnee/download-eng.php

we start by reading in the file downloaded from intermagnet in IAGA2002 format

clf = pd.read_csv("clf20160202vsec.sec", skiprows=22, sep='\s+')
clf
DATE TIME DOY CLFX CLFY CLFZ CLFF |
0 2016-02-02 00:00:00.000 33 21201.36 120.82 42892.47 47846.22 NaN
1 2016-02-02 00:00:01.000 33 21201.35 120.80 42892.49 47846.20 NaN
2 2016-02-02 00:00:02.000 33 21201.35 120.80 42892.47 47846.20 NaN
3 2016-02-02 00:00:03.000 33 21201.38 120.79 42892.49 47846.21 NaN
4 2016-02-02 00:00:04.000 33 21201.37 120.80 42892.49 47846.20 NaN
5 2016-02-02 00:00:05.000 33 21201.36 120.83 42892.46 47846.18 NaN
6 2016-02-02 00:00:06.000 33 21201.34 120.86 42892.45 47846.19 NaN
7 2016-02-02 00:00:07.000 33 21201.36 120.83 42892.45 47846.19 NaN
8 2016-02-02 00:00:08.000 33 21201.35 120.83 42892.45 47846.17 NaN
9 2016-02-02 00:00:09.000 33 21201.35 120.81 42892.47 47846.17 NaN
10 2016-02-02 00:00:10.000 33 21201.33 120.80 42892.44 47846.15 NaN
11 2016-02-02 00:00:11.000 33 21201.33 120.82 42892.43 47846.17 NaN
12 2016-02-02 00:00:12.000 33 21201.33 120.85 42892.41 47846.14 NaN
13 2016-02-02 00:00:13.000 33 21201.35 120.84 42892.43 47846.15 NaN
14 2016-02-02 00:00:14.000 33 21201.34 120.83 42892.45 47846.14 NaN
15 2016-02-02 00:00:15.000 33 21201.31 120.84 42892.43 47846.15 NaN
16 2016-02-02 00:00:16.000 33 21201.34 120.84 42892.41 47846.16 NaN
17 2016-02-02 00:00:17.000 33 21201.35 120.85 42892.40 47846.13 NaN
18 2016-02-02 00:00:18.000 33 21201.36 120.82 42892.44 47846.14 NaN
19 2016-02-02 00:00:19.000 33 21201.36 120.81 42892.45 47846.16 NaN
20 2016-02-02 00:00:20.000 33 21201.36 120.82 42892.44 47846.15 NaN
21 2016-02-02 00:00:21.000 33 21201.36 120.81 42892.44 47846.16 NaN
22 2016-02-02 00:00:22.000 33 21201.35 120.82 42892.46 47846.16 NaN
23 2016-02-02 00:00:23.000 33 21201.35 120.81 42892.48 47846.15 NaN
24 2016-02-02 00:00:24.000 33 21201.33 120.82 42892.47 47846.16 NaN
25 2016-02-02 00:00:25.000 33 21201.37 120.77 42892.46 47846.18 NaN
26 2016-02-02 00:00:26.000 33 21201.41 120.76 42892.48 47846.18 NaN
27 2016-02-02 00:00:27.000 33 21201.42 120.74 42892.47 47846.20 NaN
28 2016-02-02 00:00:28.000 33 21201.41 120.75 42892.47 47846.21 NaN
29 2016-02-02 00:00:29.000 33 21201.41 120.76 42892.50 47846.20 NaN
... ... ... ... ... ... ... ... ...
86370 2016-02-02 23:59:30.000 33 21191.18 158.23 42897.90 47846.59 NaN
86371 2016-02-02 23:59:31.000 33 21191.18 158.24 42897.90 47846.60 NaN
86372 2016-02-02 23:59:32.000 33 21191.18 158.23 42897.91 47846.60 NaN
86373 2016-02-02 23:59:33.000 33 21191.17 158.27 42897.91 47846.58 NaN
86374 2016-02-02 23:59:34.000 33 21191.19 158.28 42897.90 47846.59 NaN
86375 2016-02-02 23:59:35.000 33 21191.21 158.31 42897.91 47846.60 NaN
86376 2016-02-02 23:59:36.000 33 21191.25 158.30 42897.90 47846.60 NaN
86377 2016-02-02 23:59:37.000 33 21191.23 158.30 42897.93 47846.61 NaN
86378 2016-02-02 23:59:38.000 33 21191.21 158.34 42897.93 47846.60 NaN
86379 2016-02-02 23:59:39.000 33 21191.20 158.41 42897.91 47846.59 NaN
86380 2016-02-02 23:59:40.000 33 21191.19 158.46 42897.90 47846.56 NaN
86381 2016-02-02 23:59:41.000 33 21191.18 158.48 42897.86 47846.53 NaN
86382 2016-02-02 23:59:42.000 33 21191.18 158.48 42897.86 47846.50 NaN
86383 2016-02-02 23:59:43.000 33 21191.21 158.45 42897.85 47846.51 NaN
86384 2016-02-02 23:59:44.000 33 21191.21 158.45 42897.86 47846.53 NaN
86385 2016-02-02 23:59:45.000 33 21191.23 158.46 42897.84 47846.53 NaN
86386 2016-02-02 23:59:46.000 33 21191.25 158.45 42897.83 47846.55 NaN
86387 2016-02-02 23:59:47.000 33 21191.26 158.44 42897.84 47846.53 NaN
86388 2016-02-02 23:59:48.000 33 21191.26 158.46 42897.82 47846.56 NaN
86389 2016-02-02 23:59:49.000 33 21191.25 158.49 42897.83 47846.54 NaN
86390 2016-02-02 23:59:50.000 33 21191.21 158.49 42897.84 47846.54 NaN
86391 2016-02-02 23:59:51.000 33 21191.22 158.49 42897.82 47846.52 NaN
86392 2016-02-02 23:59:52.000 33 21191.22 158.49 42897.85 47846.53 NaN
86393 2016-02-02 23:59:53.000 33 21191.21 158.51 42897.85 47846.51 NaN
86394 2016-02-02 23:59:54.000 33 21191.24 158.48 42897.84 47846.52 NaN
86395 2016-02-02 23:59:55.000 33 21191.23 158.46 42897.83 47846.54 NaN
86396 2016-02-02 23:59:56.000 33 21191.23 158.47 42897.83 47846.53 NaN
86397 2016-02-02 23:59:57.000 33 21191.23 158.48 42897.86 47846.55 NaN
86398 2016-02-02 23:59:58.000 33 21191.23 158.48 42897.86 47846.52 NaN
86399 2016-02-02 23:59:59.000 33 21191.24 158.50 42897.88 47846.53 NaN

86400 rows × 8 columns

clf.dtypes
DATE     object
TIME     object
DOY       int64
CLFX    float64
CLFY    float64
CLFZ    float64
CLFF    float64
|       float64
dtype: object

plot single column

clf['CLFF'].plot()
<matplotlib.axes._subplots.AxesSubplot at 0x97f59b0>
_images/10_pandas_21_1.png

plot specific columns

clf.plot(x='TIME', y='CLFF', figsize=(12,8))
<matplotlib.axes._subplots.AxesSubplot at 0x93d2710>
_images/10_pandas_23_1.png

add a rolling mean

clf['CLFF_rolling'] = pd.rolling_mean( clf['CLFF'], window=1000, center=True)
clf.loc[490:530]
DATE TIME DOY CLFX CLFY CLFZ CLFF | CLFF_rolling
490 2016-02-02 00:08:10.000 33 21202.41 121.06 42892.77 47846.95 NaN NaN
491 2016-02-02 00:08:11.000 33 21202.40 121.07 42892.73 47846.95 NaN NaN
492 2016-02-02 00:08:12.000 33 21202.41 121.10 42892.73 47846.93 NaN NaN
493 2016-02-02 00:08:13.000 33 21202.39 121.11 42892.72 47846.91 NaN NaN
494 2016-02-02 00:08:14.000 33 21202.40 121.09 42892.70 47846.90 NaN NaN
495 2016-02-02 00:08:15.000 33 21202.40 121.07 42892.70 47846.91 NaN NaN
496 2016-02-02 00:08:16.000 33 21202.38 121.08 42892.72 47846.86 NaN NaN
497 2016-02-02 00:08:17.000 33 21202.39 121.06 42892.70 47846.89 NaN NaN
498 2016-02-02 00:08:18.000 33 21202.40 121.06 42892.67 47846.88 NaN NaN
499 2016-02-02 00:08:19.000 33 21202.37 121.06 42892.68 47846.86 NaN NaN
500 2016-02-02 00:08:20.000 33 21202.37 121.03 42892.70 47846.87 NaN 47846.59858
501 2016-02-02 00:08:21.000 33 21202.37 121.03 42892.71 47846.86 NaN 47846.59846
502 2016-02-02 00:08:22.000 33 21202.39 121.06 42892.68 47846.88 NaN 47846.59836
503 2016-02-02 00:08:23.000 33 21202.39 121.02 42892.67 47846.88 NaN 47846.59832
504 2016-02-02 00:08:24.000 33 21202.39 121.05 42892.70 47846.86 NaN 47846.59829
505 2016-02-02 00:08:25.000 33 21202.39 121.04 42892.73 47846.89 NaN 47846.59832
506 2016-02-02 00:08:26.000 33 21202.42 121.04 42892.72 47846.89 NaN 47846.59837
507 2016-02-02 00:08:27.000 33 21202.41 121.03 42892.73 47846.89 NaN 47846.59846
508 2016-02-02 00:08:28.000 33 21202.38 121.06 42892.75 47846.87 NaN 47846.59851
509 2016-02-02 00:08:29.000 33 21202.37 121.08 42892.70 47846.85 NaN 47846.59863
510 2016-02-02 00:08:30.000 33 21202.36 121.11 42892.69 47846.85 NaN 47846.59873
511 2016-02-02 00:08:31.000 33 21202.36 121.12 42892.65 47846.83 NaN 47846.59888
512 2016-02-02 00:08:32.000 33 21202.35 121.11 42892.65 47846.82 NaN 47846.59899
513 2016-02-02 00:08:33.000 33 21202.33 121.12 42892.65 47846.83 NaN 47846.59914
514 2016-02-02 00:08:34.000 33 21202.36 121.16 42892.62 47846.81 NaN 47846.59927
515 2016-02-02 00:08:35.000 33 21202.37 121.15 42892.59 47846.79 NaN 47846.59941
516 2016-02-02 00:08:36.000 33 21202.38 121.16 42892.61 47846.79 NaN 47846.59953
517 2016-02-02 00:08:37.000 33 21202.36 121.17 42892.61 47846.78 NaN 47846.59964
518 2016-02-02 00:08:38.000 33 21202.37 121.16 42892.59 47846.79 NaN 47846.59977
519 2016-02-02 00:08:39.000 33 21202.36 121.19 42892.57 47846.79 NaN 47846.59990
520 2016-02-02 00:08:40.000 33 21202.35 121.22 42892.55 47846.78 NaN 47846.59999
521 2016-02-02 00:08:41.000 33 21202.34 121.22 42892.57 47846.77 NaN 47846.60009
522 2016-02-02 00:08:42.000 33 21202.39 121.23 42892.59 47846.80 NaN 47846.60017
523 2016-02-02 00:08:43.000 33 21202.38 121.25 42892.55 47846.75 NaN 47846.60024
524 2016-02-02 00:08:44.000 33 21202.38 121.29 42892.54 47846.75 NaN 47846.60031
525 2016-02-02 00:08:45.000 33 21202.36 121.29 42892.56 47846.71 NaN 47846.60036
526 2016-02-02 00:08:46.000 33 21202.33 121.30 42892.58 47846.72 NaN 47846.60039
527 2016-02-02 00:08:47.000 33 21202.34 121.30 42892.57 47846.70 NaN 47846.60041
528 2016-02-02 00:08:48.000 33 21202.33 121.29 42892.57 47846.70 NaN 47846.60039
529 2016-02-02 00:08:49.000 33 21202.32 121.28 42892.53 47846.71 NaN 47846.60036
530 2016-02-02 00:08:50.000 33 21202.30 121.28 42892.51 47846.71 NaN 47846.60035
ax = clf.plot(x='TIME', y='CLFF', figsize=(12,8))
clf.plot(ax=ax, x='TIME', y='CLFF_rolling', color='r', linewidth=3)
plt.show()
_images/10_pandas_27_0.png