Introduction
This post on how to Install psycopg2, create connection to Postgres and createdb is not needed if you clone or download Karttur´s GeoImagine Framework from the GitHub.com. This post, however, contains the details on how to create a more secure database connection. And it also covers both general and detailed instructions that are useful for other setups with psycopg2 as well as when updates are required.
In earlier posts I described how to install Eclipse IDE for Python development after installing Anaconda Python as the Python interpreter, and then I installed PostgreSQL and PostGIS. This post describes how to connect Python with the Postgres server and create a new database in Postgres using Python.
Environments and packages
The Anaconda Python distribution contains a lot of Python packages (a package is a self contained collection of python modules [.py files] that performs given tasks). When working with Python you can find packages for almost all your needs, either that can be used out of the box, or after some modification. There is of course a package for connecting Python to Postgres server - psycopg2.
Psycopg2 as a package in a virtual environments
If you followed the post on Conda virtual environments you should have a virtual environment for Python (e.g. geoimagine001) setup on your local machine. And it should include psycopg2 as this package was added to the list of default packages to install with any new virtual environment.
Add psycopg2 to a virtual environment
if you did setup a virtual python environment as described in the previous section, you can just skip this section.
You can install new packages into your environment in the usual way that conda packages are installed. Make sure that the environment into which you want to install a package (psycopg2 in this case) is the active environment:
$ conda activate geoimagine001
The terminal prompt should now point at your environment. and you can enter the install command:
(geoimagine001) … $ conda install -c anaconda psycopg2
or tell conda under which environment to install the packages:
(base) …$ conda install –name geoimagine001 -c anaconda psycopg2
Once the installation is finished you should see the installed packages under the site-packages path. On my machine that is:
/Applications/anaconda3/envs/geoimagine001/lib/python3.6/site-packages/psycopg2
Add psycopg2 to the Anaconda base
This section describes how to add psycopg2 to your conda base environment. If you have a virtual environment as describes above, you can also skip this section.
If you installed Anaconda and set up Eclipse as described in the earlier posts, the Python distribution that Eclipse uses is under:
‘path’/anaconda3/lib/python3.7
where ‘path’ is the path you choose for installing Anaconda (if you use another python version then 3.7, the version number is different). Your Python path contains a folder called site-packages. The packages in that folder are available for direct use in the Eclipse IDE.
In my Anaconda installation, psycopg2 was not installed under site-packages, but included as an .egg file - a kind of package installer. To install psycopg2, with Anaconda set up as described in the earlier post, all you have to do to add psycopg2 to your site-packages is to execute the Terminal command:
$ conda install psycopg2
To check if psycopg2 is in place, list the package content in the Terminal:
$ ls ‘path’/anaconda2/lib/python2.7/site-packages/psycopg2
You can also open the folder in Finder. Copy the full path to the site-packages folder, and when in Finder, simultaneously press the keys for ‘command(cmd)’+’Shift’+’G’, in the Go to the Folder that opens, just paste the full path and click .
Security
If you are only going to use your Postgres database as localhost (on your own machine), security is less important. But if you want to protect your data you must set some level of security. The solution I use is primarily for macOS and UNIX/Linux systems, and is not very advanced. I use a combination of storing my password in my home directory (~) combined with a simple encryption.
Create a file in your home directory (~) called .netrc that defines your credentials. An earlier post describes how to use the Terminal for creating and editing files in detail. In the Terminal go to your home directory:
$ cd ~
Then start the Terminal text editor pico for editing/creating the file:
$ pico .netrc
Enter the two lines below (but with your role/user and password), one for the default user (if you installed Postgres with Homebrew the default user is the same as your user on the local machine - ‘yourUser’), and one for the production user (‘prodUser’) if you followed my suggestions in the previous post. If you only have the default user, enter the same login and password in both lines.
machine localhost0 login yourUser password yourPassword
machine localhost1 login prodUser password prodPassword
Exit pico (ctrl-X) and save the file (click Y when asked to save). You probably have to change the read and write permissions for .netrc, which you do by executing the following Terminal command:
$ chmod og-rw .netrc
With this solution your credentials will only be explicitly written out in a hidden file.
Set Postgres connection in Python (Eclipse)
Start Eclipse, and you should come back to the workbench that you created in a previous post. Repeat the steps outlined in that post to create a new PyDev project, with a PyDev package and a sub-package.
Create a project (call it what you want):
Create a new PyDev Package and call it ‘geoimagine’:
Create a new sub-package and call it ‘db_setup’:
In the dialog window you should see the main package (‘geoimagine’) already filled (geoimagine). Use Python syntax and add a dot (.) followed by the name of the sub-package (‘db_setup’) (geoimagine.db_setup).
PyDev Class module
In the sub-package create a Python module called ‘db_setup_class’ (set type to Class in the Template) window that appears after you click click :
In the db_setup_class.py module enter (or copy and paste) the following code (replace the default Class that was created with the module):
import psycopg2
from base64 import b64decode
class PGsession:
"""Connect to postgres server"""
def __init__(self, query):
"""Connect to selected database"""
query['pswd'] = b64decode(query['pswd'])
conn_string = "host='localhost' dbname='%(db)s' user='%(user)s' password='%(pswd)s'" %query
self.conn = psycopg2.connect(conn_string)
self.cursor = self.conn.cursor()
The import psycopg2 you installed above, whereas the package base64 is a Python core package. You will use it to send your password in encrypted form. When you call the module class PGsession it expects a variabe called ‘query’. This variable should be in the form of a dictionary, with pairs of keys and values:
query = {key0:value0, key1:value1 ...}
You are going to use the query dictionary for sending the user and (encrypted) password to PGsession. But you must first create the Main PyDev module from which you will do that.
Connect and create a new DB
Create a second PyDev module, called db_setup_main in the db_setup package. Below are two versions, but the second version is just in case your system can not handle the default .netrc file using the netrc package (the second version instead includes an explicit file reader, but you should not need to worry about that).
from db_setup_class import PGsession
from base64 import b64encode
import netrc
def SetUpProdDb(prodDB):
'''
Create production database(db)
'''
HOST = 'localhost1'
secrets = netrc.netrc()
username, account, password = secrets.authenticators( HOST )
#encrypt the password
pswd = b64encode(password)
#create a query dictionary for connecting to the Postgres server
query = {'db':'postgres','user':username,'pswd':pswd}
#Connect to the Postgres Server
iniSession = PGsession(query)
#Set the name of your production database(db)
prodDbD = {'dbname':prodDB}
#Select the current (cluster) db
iniSession.cursor.execute("SELECT current_database()")
#Get the results from the SELECT statement
record = iniSession.cursor.fetchone()
#Print Current (cluster) db
print 'Current database',record[0]
#Select the logged in user
iniSession.cursor.execute("SELECT user")
#Get the results from the SELECT statement
record = iniSession.cursor.fetchone()
#Print Current user
print 'User',record[0]
#Select all databases in the cluster db
iniSession.cursor.execute("SELECT datname FROM pg_database;")
#Get the results from the SELECT statement
records = iniSession.cursor.fetchall()
#Convert the retrieved records to a simple list
dbL = [item[0] for item in records]
#Print the list of all databases in the cluster
print 'Databases', dbL
#Check if your required production db exists
if not prodDbD['dbname'] in dbL:
#Your production db does not exists
printstr = 'Creating database: %s' %( prodDbD['dbname'])
print printstr
#Import the psycopg extension that allows you to create a new db
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
#Invoke the connection with the extension
iniSession.conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
#Create your production db
iniSession.cursor.execute("CREATE DATABASE %(dbname)s;" %prodDbD)
if __name__ == "__main__":
prodDB = 'YourProdDB'
SetUpProdDb(prodDB)
The alternative version (with an explicit reader for the .netrc) is only a backup if the above version does not work properly. Click the button to see the code.
Run your package
With the main module (db_setup_main.py) open in the Eclipse main pane, kick off your code from the Eclipse menu:
If everything worked out, the Console pane should return:
Current database postgres
User YourUser
Databases ['postgres', 'template1', 'template0']
Creating database: 'prodDB'
If you run it again, the database ‘prodDB’ will be listed in ‘Databases []’ in the Console, and not be created again.
You can look at the details for ‘prodDB’ in pgAdmin (installing and setting up pgAdmin is described in this post). Connect pgAdmin to your Postgres server and expand the Browser menu by clicking on the small ‘+’ signs:
If you click ‘prodDB’ (you might also need to click the tab SQL in the right pane of pgAdmin) you will see how ‘prodDB’ was defined:
CREATE DATABASE 'prodDB'
WITH
OWNER = 'youruser'
ENCODING = 'UTF8'
LC_COLLATE = 'en_US.UTF-8'
LC_CTYPE = 'en_US.UTF-8'
TABLESPACE = pg_default
CONNECTION LIMIT = -1;
Encoding, collation, and Ctype are set to UTF8 (the most universal set of characters), and tablespace is the to pg_default. The ‘-1’ for connection limit means unlimited. You can set all this parameters in the SQL for CREATEDB if you want, but I prefer the most universal, which is also the postgres default.