pandas ------ Pandas is a module for Python to manipulate and analyze tabular data. It allows easy reading and writing to different file formats (including MS Excel). .. code:: python import pandas as pd import numpy as np import matplotlib.pyplot as plt %matplotlib inline working with Excel data ~~~~~~~~~~~~~~~~~~~~~~~ .. code:: python myexcel = pd.read_excel("pandas_test.xlsx", sheetname="mydata", skiprows=2) .. code:: python myexcel .. raw:: html
sample weight volume magnetic moment
0 A 20.1 10.3 2.000000e-08
1 B 20.4 10.2 2.000000e-08
2 C 19.8 9.8 2.000000e-08
3 D 17.2 11.0 2.000000e-08
4 E 24.3 10.3 2.000000e-08
5 F 18.5 10.3 2.000000e-08
.. code:: python myexcel[['sample','weight']] .. raw:: html
sample weight
0 A 20.1
1 B 20.4
2 C 19.8
3 D 17.2
4 E 24.3
5 F 18.5
calculations with rows: .. code:: python myexcel['density'] = myexcel['weight'] / myexcel['volume'] save pandas dataframe back to an excel file .. code:: python myexcel.to_excel('new_excel_file.xlsx') and a text file .. code:: python myexcel.to_csv('new_excel_file.csv', sep='\t') intermagnet data ~~~~~~~~~~~~~~~~ Data from magnetic observatories all over the world can be downloaded here: http://www.intermagnet.org/data-donnee/download-eng.php we start by reading in the file downloaded from intermagnet in IAGA2002 format .. code:: python clf = pd.read_csv("clf20160202vsec.sec", skiprows=22, sep='\s+') .. code:: python clf .. raw:: html
DATE TIME DOY CLFX CLFY CLFZ CLFF |
0 2016-02-02 00:00:00.000 33 21201.36 120.82 42892.47 47846.22 NaN
1 2016-02-02 00:00:01.000 33 21201.35 120.80 42892.49 47846.20 NaN
2 2016-02-02 00:00:02.000 33 21201.35 120.80 42892.47 47846.20 NaN
3 2016-02-02 00:00:03.000 33 21201.38 120.79 42892.49 47846.21 NaN
4 2016-02-02 00:00:04.000 33 21201.37 120.80 42892.49 47846.20 NaN
5 2016-02-02 00:00:05.000 33 21201.36 120.83 42892.46 47846.18 NaN
6 2016-02-02 00:00:06.000 33 21201.34 120.86 42892.45 47846.19 NaN
7 2016-02-02 00:00:07.000 33 21201.36 120.83 42892.45 47846.19 NaN
8 2016-02-02 00:00:08.000 33 21201.35 120.83 42892.45 47846.17 NaN
9 2016-02-02 00:00:09.000 33 21201.35 120.81 42892.47 47846.17 NaN
10 2016-02-02 00:00:10.000 33 21201.33 120.80 42892.44 47846.15 NaN
11 2016-02-02 00:00:11.000 33 21201.33 120.82 42892.43 47846.17 NaN
12 2016-02-02 00:00:12.000 33 21201.33 120.85 42892.41 47846.14 NaN
13 2016-02-02 00:00:13.000 33 21201.35 120.84 42892.43 47846.15 NaN
14 2016-02-02 00:00:14.000 33 21201.34 120.83 42892.45 47846.14 NaN
15 2016-02-02 00:00:15.000 33 21201.31 120.84 42892.43 47846.15 NaN
16 2016-02-02 00:00:16.000 33 21201.34 120.84 42892.41 47846.16 NaN
17 2016-02-02 00:00:17.000 33 21201.35 120.85 42892.40 47846.13 NaN
18 2016-02-02 00:00:18.000 33 21201.36 120.82 42892.44 47846.14 NaN
19 2016-02-02 00:00:19.000 33 21201.36 120.81 42892.45 47846.16 NaN
20 2016-02-02 00:00:20.000 33 21201.36 120.82 42892.44 47846.15 NaN
21 2016-02-02 00:00:21.000 33 21201.36 120.81 42892.44 47846.16 NaN
22 2016-02-02 00:00:22.000 33 21201.35 120.82 42892.46 47846.16 NaN
23 2016-02-02 00:00:23.000 33 21201.35 120.81 42892.48 47846.15 NaN
24 2016-02-02 00:00:24.000 33 21201.33 120.82 42892.47 47846.16 NaN
25 2016-02-02 00:00:25.000 33 21201.37 120.77 42892.46 47846.18 NaN
26 2016-02-02 00:00:26.000 33 21201.41 120.76 42892.48 47846.18 NaN
27 2016-02-02 00:00:27.000 33 21201.42 120.74 42892.47 47846.20 NaN
28 2016-02-02 00:00:28.000 33 21201.41 120.75 42892.47 47846.21 NaN
29 2016-02-02 00:00:29.000 33 21201.41 120.76 42892.50 47846.20 NaN
... ... ... ... ... ... ... ... ...
86370 2016-02-02 23:59:30.000 33 21191.18 158.23 42897.90 47846.59 NaN
86371 2016-02-02 23:59:31.000 33 21191.18 158.24 42897.90 47846.60 NaN
86372 2016-02-02 23:59:32.000 33 21191.18 158.23 42897.91 47846.60 NaN
86373 2016-02-02 23:59:33.000 33 21191.17 158.27 42897.91 47846.58 NaN
86374 2016-02-02 23:59:34.000 33 21191.19 158.28 42897.90 47846.59 NaN
86375 2016-02-02 23:59:35.000 33 21191.21 158.31 42897.91 47846.60 NaN
86376 2016-02-02 23:59:36.000 33 21191.25 158.30 42897.90 47846.60 NaN
86377 2016-02-02 23:59:37.000 33 21191.23 158.30 42897.93 47846.61 NaN
86378 2016-02-02 23:59:38.000 33 21191.21 158.34 42897.93 47846.60 NaN
86379 2016-02-02 23:59:39.000 33 21191.20 158.41 42897.91 47846.59 NaN
86380 2016-02-02 23:59:40.000 33 21191.19 158.46 42897.90 47846.56 NaN
86381 2016-02-02 23:59:41.000 33 21191.18 158.48 42897.86 47846.53 NaN
86382 2016-02-02 23:59:42.000 33 21191.18 158.48 42897.86 47846.50 NaN
86383 2016-02-02 23:59:43.000 33 21191.21 158.45 42897.85 47846.51 NaN
86384 2016-02-02 23:59:44.000 33 21191.21 158.45 42897.86 47846.53 NaN
86385 2016-02-02 23:59:45.000 33 21191.23 158.46 42897.84 47846.53 NaN
86386 2016-02-02 23:59:46.000 33 21191.25 158.45 42897.83 47846.55 NaN
86387 2016-02-02 23:59:47.000 33 21191.26 158.44 42897.84 47846.53 NaN
86388 2016-02-02 23:59:48.000 33 21191.26 158.46 42897.82 47846.56 NaN
86389 2016-02-02 23:59:49.000 33 21191.25 158.49 42897.83 47846.54 NaN
86390 2016-02-02 23:59:50.000 33 21191.21 158.49 42897.84 47846.54 NaN
86391 2016-02-02 23:59:51.000 33 21191.22 158.49 42897.82 47846.52 NaN
86392 2016-02-02 23:59:52.000 33 21191.22 158.49 42897.85 47846.53 NaN
86393 2016-02-02 23:59:53.000 33 21191.21 158.51 42897.85 47846.51 NaN
86394 2016-02-02 23:59:54.000 33 21191.24 158.48 42897.84 47846.52 NaN
86395 2016-02-02 23:59:55.000 33 21191.23 158.46 42897.83 47846.54 NaN
86396 2016-02-02 23:59:56.000 33 21191.23 158.47 42897.83 47846.53 NaN
86397 2016-02-02 23:59:57.000 33 21191.23 158.48 42897.86 47846.55 NaN
86398 2016-02-02 23:59:58.000 33 21191.23 158.48 42897.86 47846.52 NaN
86399 2016-02-02 23:59:59.000 33 21191.24 158.50 42897.88 47846.53 NaN

86400 rows × 8 columns

.. code:: python clf.dtypes .. parsed-literal:: DATE object TIME object DOY int64 CLFX float64 CLFY float64 CLFZ float64 CLFF float64 | float64 dtype: object plot single column .. code:: python clf['CLFF'].plot() .. parsed-literal:: .. image:: 10_pandas_files/10_pandas_21_1.png plot specific columns .. code:: python clf.plot(x='TIME', y='CLFF', figsize=(12,8)) .. parsed-literal:: .. image:: 10_pandas_files/10_pandas_23_1.png add a rolling mean .. code:: python clf['CLFF_rolling'] = pd.rolling_mean( clf['CLFF'], window=1000, center=True) .. code:: python clf.loc[490:530] .. raw:: html
DATE TIME DOY CLFX CLFY CLFZ CLFF | CLFF_rolling
490 2016-02-02 00:08:10.000 33 21202.41 121.06 42892.77 47846.95 NaN NaN
491 2016-02-02 00:08:11.000 33 21202.40 121.07 42892.73 47846.95 NaN NaN
492 2016-02-02 00:08:12.000 33 21202.41 121.10 42892.73 47846.93 NaN NaN
493 2016-02-02 00:08:13.000 33 21202.39 121.11 42892.72 47846.91 NaN NaN
494 2016-02-02 00:08:14.000 33 21202.40 121.09 42892.70 47846.90 NaN NaN
495 2016-02-02 00:08:15.000 33 21202.40 121.07 42892.70 47846.91 NaN NaN
496 2016-02-02 00:08:16.000 33 21202.38 121.08 42892.72 47846.86 NaN NaN
497 2016-02-02 00:08:17.000 33 21202.39 121.06 42892.70 47846.89 NaN NaN
498 2016-02-02 00:08:18.000 33 21202.40 121.06 42892.67 47846.88 NaN NaN
499 2016-02-02 00:08:19.000 33 21202.37 121.06 42892.68 47846.86 NaN NaN
500 2016-02-02 00:08:20.000 33 21202.37 121.03 42892.70 47846.87 NaN 47846.59858
501 2016-02-02 00:08:21.000 33 21202.37 121.03 42892.71 47846.86 NaN 47846.59846
502 2016-02-02 00:08:22.000 33 21202.39 121.06 42892.68 47846.88 NaN 47846.59836
503 2016-02-02 00:08:23.000 33 21202.39 121.02 42892.67 47846.88 NaN 47846.59832
504 2016-02-02 00:08:24.000 33 21202.39 121.05 42892.70 47846.86 NaN 47846.59829
505 2016-02-02 00:08:25.000 33 21202.39 121.04 42892.73 47846.89 NaN 47846.59832
506 2016-02-02 00:08:26.000 33 21202.42 121.04 42892.72 47846.89 NaN 47846.59837
507 2016-02-02 00:08:27.000 33 21202.41 121.03 42892.73 47846.89 NaN 47846.59846
508 2016-02-02 00:08:28.000 33 21202.38 121.06 42892.75 47846.87 NaN 47846.59851
509 2016-02-02 00:08:29.000 33 21202.37 121.08 42892.70 47846.85 NaN 47846.59863
510 2016-02-02 00:08:30.000 33 21202.36 121.11 42892.69 47846.85 NaN 47846.59873
511 2016-02-02 00:08:31.000 33 21202.36 121.12 42892.65 47846.83 NaN 47846.59888
512 2016-02-02 00:08:32.000 33 21202.35 121.11 42892.65 47846.82 NaN 47846.59899
513 2016-02-02 00:08:33.000 33 21202.33 121.12 42892.65 47846.83 NaN 47846.59914
514 2016-02-02 00:08:34.000 33 21202.36 121.16 42892.62 47846.81 NaN 47846.59927
515 2016-02-02 00:08:35.000 33 21202.37 121.15 42892.59 47846.79 NaN 47846.59941
516 2016-02-02 00:08:36.000 33 21202.38 121.16 42892.61 47846.79 NaN 47846.59953
517 2016-02-02 00:08:37.000 33 21202.36 121.17 42892.61 47846.78 NaN 47846.59964
518 2016-02-02 00:08:38.000 33 21202.37 121.16 42892.59 47846.79 NaN 47846.59977
519 2016-02-02 00:08:39.000 33 21202.36 121.19 42892.57 47846.79 NaN 47846.59990
520 2016-02-02 00:08:40.000 33 21202.35 121.22 42892.55 47846.78 NaN 47846.59999
521 2016-02-02 00:08:41.000 33 21202.34 121.22 42892.57 47846.77 NaN 47846.60009
522 2016-02-02 00:08:42.000 33 21202.39 121.23 42892.59 47846.80 NaN 47846.60017
523 2016-02-02 00:08:43.000 33 21202.38 121.25 42892.55 47846.75 NaN 47846.60024
524 2016-02-02 00:08:44.000 33 21202.38 121.29 42892.54 47846.75 NaN 47846.60031
525 2016-02-02 00:08:45.000 33 21202.36 121.29 42892.56 47846.71 NaN 47846.60036
526 2016-02-02 00:08:46.000 33 21202.33 121.30 42892.58 47846.72 NaN 47846.60039
527 2016-02-02 00:08:47.000 33 21202.34 121.30 42892.57 47846.70 NaN 47846.60041
528 2016-02-02 00:08:48.000 33 21202.33 121.29 42892.57 47846.70 NaN 47846.60039
529 2016-02-02 00:08:49.000 33 21202.32 121.28 42892.53 47846.71 NaN 47846.60036
530 2016-02-02 00:08:50.000 33 21202.30 121.28 42892.51 47846.71 NaN 47846.60035
.. code:: python ax = clf.plot(x='TIME', y='CLFF', figsize=(12,8)) clf.plot(ax=ax, x='TIME', y='CLFF_rolling', color='r', linewidth=3) plt.show() .. image:: 10_pandas_files/10_pandas_27_0.png