
It is a tremendously fast NewSQL database which is avaiable as Open source software under Apache 2.0 license. It is basically like a superset of a SQL database (with added features and support for time-varying data or event data). It is made with the prime focus on performance.
Step 1: Installing Docker
Follow the instructions here and choose the docker installer based on your Operating system: https://docs.docker.com/engine/install/
Step 2: Pulling the QuestDB image and creating a docker container
docker run -p 9000:9000 -p 8812:8812 questdb/questdbThis parameter will publish a port to the host, you can specify:
-p 9000:9000 for the REST API and the Web Console. The web console is available on http://localhost:9000-p 8812:8812 for the Postgres wire protocol-p 9009:9009 InfluxDB line protocolStep 3: Checking if the QuestDB container is running or not
Use docker ps to check container status.
Step 4: Python and Jupyter installation
pip3 install requests urlib matplotlib pandas to install requried modulespip3 install --upgrade ipython jupyterjupyter notebook which will start jupyter, once it is up create a new notebookWe would generate some random data and store that data into a test database named weather which we would create. The create statement in QuestDB pushes the data into the bottom of the table. Our data is comprised of:
temp is the temperature in Celcius.rain24H is the amount of precipitation in the last 24 hours.thunder is a boolean returning True if thunder is present.timestamp is the date and time. If the below code block prints 200, it means the database was created successfully. If it prints 400 then it means that the database already exists.Code:
import requests
import urllib.parse as par
q = 'create table weather'\
'(temp int,'\
'rain24H double,'\
'thunder boolean,'\
'timestamp timestamp)'\
'timestamp(timestamp)'
r = requests.get("http://localhost:9000/exec?query=" + q)
print(r.status_code)Output: 400
In the next code cell, we generate and add 1000 entries of data to our database.
import requests
import random
from datetime import datetime
success = 0
fail = 0
random.seed()
for x in range(1000):
temp = random.randint(-40, 55)
rain24H = round(random.uniform(10.45, 235.15), 2)
thunder = bool(random.getrandbits(1))
query = "insert into weather values("\
+ str(temp) + ","\
+ str(rain24H) + "," \
+ str(thunder) +",systimestamp())"
r = requests.get("http://localhost:9000/exec?query=" + query)
if r.status_code == 200:
success += 1
else:
fail += 1
print("Rows inserted: " + str(success))
if fail > 0:
print("Rows Failed: " + str(fail))Output: Rows inserted: 1000
import requests
import io
r = requests.get("http://localhost:9000/exp?query=select * from weather")
rawData = r.text
print(rawData)Output:
"tempF","rain24H","thunder","timestamp"
-37,234.0,false,"2020-08-24T11:41:37.191530Z"
-11,165.52,false,"2020-08-24T11:41:37.195714Z"
-31,178.13,true,"2020-08-24T11:41:37.202453Z"
14,31.470000000000,true,"2020-08-24T11:41:37.208365Z"
33,83.31,true,"2020-08-24T11:41:37.214347Z"import pandas as pd
pData = pd.read_csv(io.StringIO(rawData), parse_dates=['timestamp'])
print(pData)Output:
tempF rain24H thunder timestamp
0 -37 234.00 False 2020-08-24 11:41:37.191530+00:00
1 -11 165.52 False 2020-08-24 11:41:37.195714+00:00
2 -31 178.13 True 2020-08-24 11:41:37.202453+00:00
3 14 31.47 True 2020-08-24 11:41:37.208365+00:00
4 33 83.31 True 2020-08-24 11:41:37.214347+00:00
... ... ... ... ...
3995 -24 183.22 False 2020-08-24 11:58:32.164254+00:00
3996 1 151.96 True 2020-08-24 11:58:32.166610+00:00
3997 -40 213.86 True 2020-08-24 11:58:32.170839+00:00
3998 -33 101.91 True 2020-08-24 11:58:32.173131+00:00
3999 -38 182.98 False 2020-08-24 11:58:32.177191+00:00
[4000 rows x 4 columns]Note: The query string must be URL-encoded before it is sent.
import urllib.parse
q = "select tempF,"\
" rain24H,"\
" timestamp"\
" from weather"\
query = urllib.parse.quote(q)
r = requests.get("http://localhost:9000/exp?query=" + query)
queryData = r.content
rawData = pd.read_csv(io.StringIO(queryData.decode('utf-8')))
print(rawData)Output:
tempF rain24H timestamp
0 -37 234.00 2020-08-24T11:41:37.191530Z
1 -11 165.52 2020-08-24T11:41:37.195714Z
2 -31 178.13 2020-08-24T11:41:37.202453Z
3 14 31.47 2020-08-24T11:41:37.208365Z
4 33 83.31 2020-08-24T11:41:37.214347Z
... ... ... ...
3995 -24 183.22 2020-08-24T11:58:32.164254Z
3996 1 151.96 2020-08-24T11:58:32.166610Z
3997 -40 213.86 2020-08-24T11:58:32.170839Z
3998 -33 101.91 2020-08-24T11:58:32.173131Z
3999 -38 182.98 2020-08-24T11:58:32.177191Z
[4000 rows x 3 columns]from matplotlib import pyplot as plt
plt.bar(rawData['timestamp'], rawData['rain24H'])
from matplotlib import pyplot as plt
plt.bar(rawData['timestamp'], rawData['tempF'])
Check out the complete notebook at questdb.ipynb
Help us improve this content by editing this page on GitHub