Tuesday, August 3, 2021

Writing Data From the Pandas Data Analysis Library To MySQL

    In the last installment the flexible and easy to use open source data analysis and manipulation tool     Pandas, built on top of the Python programming language, was used with the MySQL Shell, aka mysqlsh.  Pandas is a very popular open source project that features data structures and tools to manipulate numerical tables.  And the name is derived from Panel Data and not the bamboo chewing critters. 

    Tasks like generating a series is simple:

dstokes@localhost pandas]$ python
Python 3.9.6 (default, Jul 16 2021, 00:00:00) 
[GCC 11.1.1 20210531 (Red Hat 11.1.1-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> s = pd.Series([1, 3, 5, 11, 29, 42])
>>> s
0     1
1     3
2     5
3    11
4    29
5    42
dtype: int64
>>> 

    Data frame is a table presentation concept that is easy to use.  Below some data on various guitars is presented:

>>> df = pd.DataFrame(
...  { "Brand": 
    [ "Fender Telecaster", "Gretsch 5420", "Gibson Flying-V"],
...    "color": [ "blonde", "black", "natural"],
... }
... )
>>> df
               Brand    color
0  Fender Telecaster   blonde
1       Gretsch 5420    black
2    Gibson Flying-V  natural
>>> 

    Hmm, does the above data frame definition look a bit like JSON to anyone besides me?  I will have to experiment with loading and saving from MySQL's JSON data type.  In a future installment I will dig a bit more deeply.

But How Do You Save Data??

    The last blog on Pandas generated some questions about saving data from Pandas into MySQL.  

    In the example below the data from a series is saved to a table.  This time we are not using mysqlsh and instead uses a overly simple script that uses the MySQL Python connector.   

    Numpy's random number generator is combined with Panda's one dimensional array class  s = pd.Series(np.random.randn(1)) to produce a numeric value.

#!/usr/bin/python
import pandas as pd
import numpy as np
import mysql.connector
import random
import string

#Generate a random letter
randomLetter = random.choice(string.ascii_letters)

#Generate a random number
s = pd.Series(np.random.randn(1))

try:
  mydb = mysql.connector.connect(host="localhost", 
         database="bear", user="panda", passwd="123Bear!")
  mycursor = mydb.cursor()
  sql = "INSERT INTO t1 (myletter, myvalue) VALUES (%s, %s)"
  val = [randomLetter ,  str(s[0])]
  mycursor.execute(sql,val)
  mydb.commit()

except Exception as e:
  mydb.close()
  print(str(e))

Now there are other connectors but my preference is the MySQL Python Connector for the reasons that it works exceedingly well and in simple to use.  Now this is a very simple example and yes the code works within a script and within mysqlsh.