It is a tremendously fast NewSQL database which is avaiable as Open source software under Apache 2.0 license. It is basically like a superset of a SQL database (with added features and support for time-varying data
or event data). It is made with the prime focus on performance.
Step 1: Installing Docker
Follow the instructions here and choose the docker installer based on your Operating system: https://docs.docker.com/engine/install/
Step 2: Pulling the QuestDB image and creating a docker container
docker run -p 9000:9000 -p 8812:8812 questdb/questdb
This parameter will publish a port to the host, you can specify:
-p 9000:9000
for the REST API and the Web Console. The web console is available on http://localhost:9000-p 8812:8812
for the Postgres wire protocol-p 9009:9009
InfluxDB line protocolStep 3: Checking if the QuestDB container is running or not
Use docker ps
to check container status.
Step 4: Python and Jupyter installation
pip3 install requests urlib matplotlib pandas
to install requried modulespip3 install --upgrade ipython jupyter
jupyter notebook
which will start jupyter, once it is up create a new notebookWe would generate some random data and store that data into a test database named weather which we would create. The create statement in QuestDB pushes the data into the bottom of the table. Our data is comprised of:
temp
is the temperature in Celcius.rain24H
is the amount of precipitation in the last 24 hours.thunder
is a boolean returning True if thunder is present.timestamp
is the date and time. If the below code block prints 200, it means the database was created successfully. If it prints 400 then it means that the database already exists.Code:
import requests
import urllib.parse as par
q = 'create table weather'\
'(temp int,'\
'rain24H double,'\
'thunder boolean,'\
'timestamp timestamp)'\
'timestamp(timestamp)'
r = requests.get("http://localhost:9000/exec?query=" + q)
print(r.status_code)
Output: 400
In the next code cell, we generate and add 1000 entries of data to our database.
import requests
import random
from datetime import datetime
success = 0
fail = 0
random.seed()
for x in range(1000):
temp = random.randint(-40, 55)
rain24H = round(random.uniform(10.45, 235.15), 2)
thunder = bool(random.getrandbits(1))
query = "insert into weather values("\
+ str(temp) + ","\
+ str(rain24H) + "," \
+ str(thunder) +",systimestamp())"
r = requests.get("http://localhost:9000/exec?query=" + query)
if r.status_code == 200:
success += 1
else:
fail += 1
print("Rows inserted: " + str(success))
if fail > 0:
print("Rows Failed: " + str(fail))
Output: Rows inserted: 1000
import requests
import io
r = requests.get("http://localhost:9000/exp?query=select * from weather")
rawData = r.text
print(rawData)
Output:
"tempF","rain24H","thunder","timestamp"
-37,234.0,false,"2020-08-24T11:41:37.191530Z"
-11,165.52,false,"2020-08-24T11:41:37.195714Z"
-31,178.13,true,"2020-08-24T11:41:37.202453Z"
14,31.470000000000,true,"2020-08-24T11:41:37.208365Z"
33,83.31,true,"2020-08-24T11:41:37.214347Z"
import pandas as pd
pData = pd.read_csv(io.StringIO(rawData), parse_dates=['timestamp'])
print(pData)
Output:
tempF rain24H thunder timestamp
0 -37 234.00 False 2020-08-24 11:41:37.191530+00:00
1 -11 165.52 False 2020-08-24 11:41:37.195714+00:00
2 -31 178.13 True 2020-08-24 11:41:37.202453+00:00
3 14 31.47 True 2020-08-24 11:41:37.208365+00:00
4 33 83.31 True 2020-08-24 11:41:37.214347+00:00
... ... ... ... ...
3995 -24 183.22 False 2020-08-24 11:58:32.164254+00:00
3996 1 151.96 True 2020-08-24 11:58:32.166610+00:00
3997 -40 213.86 True 2020-08-24 11:58:32.170839+00:00
3998 -33 101.91 True 2020-08-24 11:58:32.173131+00:00
3999 -38 182.98 False 2020-08-24 11:58:32.177191+00:00
[4000 rows x 4 columns]
Note: The query string must be URL-encoded before it is sent.
import urllib.parse
q = "select tempF,"\
" rain24H,"\
" timestamp"\
" from weather"\
query = urllib.parse.quote(q)
r = requests.get("http://localhost:9000/exp?query=" + query)
queryData = r.content
rawData = pd.read_csv(io.StringIO(queryData.decode('utf-8')))
print(rawData)
Output:
tempF rain24H timestamp
0 -37 234.00 2020-08-24T11:41:37.191530Z
1 -11 165.52 2020-08-24T11:41:37.195714Z
2 -31 178.13 2020-08-24T11:41:37.202453Z
3 14 31.47 2020-08-24T11:41:37.208365Z
4 33 83.31 2020-08-24T11:41:37.214347Z
... ... ... ...
3995 -24 183.22 2020-08-24T11:58:32.164254Z
3996 1 151.96 2020-08-24T11:58:32.166610Z
3997 -40 213.86 2020-08-24T11:58:32.170839Z
3998 -33 101.91 2020-08-24T11:58:32.173131Z
3999 -38 182.98 2020-08-24T11:58:32.177191Z
[4000 rows x 3 columns]
from matplotlib import pyplot as plt
plt.bar(rawData['timestamp'], rawData['rain24H'])
from matplotlib import pyplot as plt
plt.bar(rawData['timestamp'], rawData['tempF'])
Check out the complete notebook at questdb.ipynb
Help us improve this content by editing this page on GitHub