Concept for Web Interface
MetaInfo
derived from requirements, concept_shepherd and concept_meeting_notes
Design-Features
Privacy is a feature - avoid storing personal data (use OAuth, ActiveDirectory)
usage per website & API
1 experiment at a time, no concurrency
Functions
user-management
permission
role management
groups (share datapool)
Quota (data is linked to user and deleted with the account)
authentication via external service like OAuth (e.g. GitHub) (if possible)
Upload user provided data
Harvesting- / Energy-Traces, -traces, IV-Curves
regulator / converter-definition
Target-Firmware
user-scripts (sandboxed, limited resources (cpu, ram, storage), only limited libs allowed, limited user, lowest nice-level)
Scratch-Area for experimentation-results (or DB)
integrated pre-tests for measurement: test for plausibility, pre-run in software (abstract virtual hardware)
link in python code of shepherd project
scheduler for measurement, time to prepare and follow up (data transfer, conversion, compression, …)
results downloadable by user, for a certain time (no hoarding)
feedback via e-mail - measurement start, data available, error, shortly before deletion
(optional) grafana-visualization
documentation and instructions
landing page
“what is that thing in my office”
clarify use and capability of the boxes (so people sharing an office with the testbed don’t freak out)
where to find what
FAQ
Testbed-Status
structure
last seen
handle lockups (power-cycle)
keep testbed alive for a limited time (otherwise it will go to sleep / power down)
collecting of logs (temp, ram-usage, cpu-usage, time-sync, …)
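The resource limits listed for user-scripts under Functions (limited CPU, RAM, storage, lowest nice-level) could be applied roughly as sketched here. This is a minimal, Unix-only sketch using the standard library; the function names and the concrete limit values are assumptions for illustration, and it does not cover the "limited libs" or "limited user" points, which would need further isolation.

```python
import os
import resource
import subprocess
import sys


def _cap(kind, value):
    """Lower a resource limit to `value`, never above the existing hard limit."""
    _, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, value))


def limit_resources(cpu_seconds=5, ram_bytes=1024**3, file_bytes=10 * 1024**2):
    """Return a function that applies the limits inside the child process."""

    def apply():
        os.nice(19)                                # lowest scheduling priority
        _cap(resource.RLIMIT_CPU, cpu_seconds)     # CPU time in seconds
        _cap(resource.RLIMIT_AS, ram_bytes)        # address space (RAM)
        _cap(resource.RLIMIT_FSIZE, file_bytes)    # largest file the script may write

    return apply


def run_user_script(code, timeout_s=10):
    """Run untrusted Python code in a separate, resource-limited process."""
    return subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores user site-packages
        preexec_fn=limit_resources(),
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )


result = run_user_script("print(sum(range(10)))")
print(result.stdout.strip())
```

The limits are set in the child only, so the web server process itself stays unrestricted.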
Web-Framework (some points are redundant)
user-management (roles for admins and users)
experiment-management, configure and control, add data (see below)
experiment-scheduling, calendar (set active, start-time, duration)
data management / quota (retrieve / delete recordings)
authentication via external services
E-Mail notifications
testbed status, topology
(optional) grafana visualization of recorded data
(admins) server status, quota, testbed control
documentation and instructions
target-management (specify slots of nodes)
(optional) benchmark-management (post-scripts)
live-view ⇾ lower prio
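The scheduling points above (one experiment at a time, start-time and duration, plus time to prepare and follow up) amount to an overlap check on blocked time windows. The following is a sketch only; the class name, field names and the default preparation and follow-up times are assumptions, not part of the concept.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Slot:
    start: datetime
    duration: timedelta
    prepare: timedelta = timedelta(minutes=5)     # assumed setup time
    follow_up: timedelta = timedelta(minutes=10)  # assumed transfer / conversion time

    @property
    def blocked_from(self):
        return self.start - self.prepare

    @property
    def blocked_until(self):
        return self.start + self.duration + self.follow_up


def conflicts(a, b):
    """True if the blocked windows of two slots overlap."""
    return a.blocked_from < b.blocked_until and b.blocked_from < a.blocked_until


def can_schedule(new, booked):
    """True if `new` does not overlap any already booked slot."""
    return not any(conflicts(new, old) for old in booked)


booked = [Slot(datetime(2024, 1, 1, 12, 0), timedelta(minutes=30))]
too_close = Slot(datetime(2024, 1, 1, 12, 40), timedelta(minutes=10))
later = Slot(datetime(2024, 1, 1, 12, 45), timedelta(minutes=10))
print(can_schedule(too_close, booked), can_schedule(later, booked))
```

The first request is rejected because its preparation window would start while the previous experiment is still in follow-up.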
Implementation
possibly python based (django vs flask, big vs small)
django offers the most, is flexible, modular and easy to extend
admin-interface
authentication (also oauth client), accounts
session-management
sub-websites with html-templates
⇾ security seems to be OK; the attack surface is big, but versions > 3.0 seem to be mature
flask is a microframework, offers minimal attack surface and seems perfect for the python REST API
basic implementation could be similar to https://github.com/fkt/36c3-junior-ctf-pub (web-interface for a ctf, that didn’t get compromised)
API: rest
(prio) allow scripted workflow ⇾ yaml ⇾ rest ⇾ server internals
this could also be the base for the web-page-interface
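In the scripted workflow (yaml ⇾ rest ⇾ server internals), the server would have to check each experiment description before passing it on. The sketch below assumes the description has already been parsed into a dict (JSON stands in for YAML to stay within the standard library); all field names are hypothetical.

```python
import json
from datetime import datetime

# hypothetical required fields and their expected types
REQUIRED = {"start_time": str, "duration_s": (int, float), "targets": list}


def validate_experiment(payload):
    """Check a parsed experiment description and return a normalized dict."""
    errors = []
    for key, expected in REQUIRED.items():
        if key not in payload:
            errors.append(f"missing field: {key}")
        elif not isinstance(payload[key], expected):
            errors.append(f"wrong type for field: {key}")
    if errors:
        raise ValueError("; ".join(errors))
    if payload["duration_s"] <= 0:
        raise ValueError("duration_s must be positive")
    return {
        "start_time": datetime.fromisoformat(payload["start_time"]),
        "duration_s": float(payload["duration_s"]),
        "targets": sorted(set(payload["targets"])),  # drop duplicate target ids
    }


body = json.dumps({"start_time": "2024-01-01T12:00:00", "duration_s": 60, "targets": [3, 1, 3]})
experiment = validate_experiment(json.loads(body))
print(experiment["targets"])
```

The same function could serve both the REST endpoint and the web form, in line with the idea of using the API as the base for the web-page-interface.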
Database
influxDB or
timescaleDB
postgreSQL
MessageSystem
Protobuf
RabbitMQ
RPC
visualization, analysis ⇾ dash?
option to sample down
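One simple way to sample down is block averaging. The sketch below uses only the standard library; the function name is an assumption, and the factor of 100 assumes a 100 kHz source rate reduced to 1 kHz. For real traces a vectorized library such as numpy would probably be the better fit.

```python
from statistics import fmean


def downsample(samples, factor):
    """Average consecutive blocks of `factor` samples; a trailing partial block is dropped."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    usable = len(samples) - len(samples) % factor
    return [fmean(samples[i:i + factor]) for i in range(0, usable, factor)]


# synthetic stand-in for a recorded trace: the values 0..99 repeated ten times
raw = [float(i % 100) for i in range(1000)]
reduced = downsample(raw, 100)
print(len(reduced), reduced[0])
```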
DB Decision
needed:
30x 100k Inserts per second of timestamp (8B), node_id (1B), voltage (4B), current (4B)
⇾ 3M Inserts/s of >=17 Bytes ⇾ ~51 MB/s ⇾ ~3 GB / minute
can happen locally or remote, concurrently is fine
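The required rate can be checked with a few lines of arithmetic. The constant names are illustrative, and the "100k" is read here as samples per second per node.

```python
NODES = 30
SAMPLES_PER_SECOND = 100_000      # per node, assumed meaning of "100k"
BYTES_PER_SAMPLE = 8 + 1 + 4 + 4  # timestamp, node_id, voltage, current

inserts_per_second = NODES * SAMPLES_PER_SECOND
bytes_per_second = inserts_per_second * BYTES_PER_SAMPLE
gb_per_minute = bytes_per_second * 60 / 1e9

print(inserts_per_second)       # 3000000
print(bytes_per_second / 1e6)   # 51.0 (MB/s)
print(round(gb_per_minute, 2))  # 3.06
```

These figures cover the raw payload only; any ASCII encoding or protocol overhead comes on top.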
main bottleneck:
databases with timeseries do not seem to have a low level api for insertions, interface is ascii and needs parsing
(solution) low level api (raw data, shared mem, …) ⇾ there are possible formats like:
BSON ⇾ MongoDB
UBJSON ⇾ TeradataDB, Wolfram (no use for us)
apache avro ⇾ Apache Spark SQL
JSONB ⇾ Postgresql, but they say: “JSON is faster to ingest vs. JSONB”
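To illustrate why a raw binary interface would help, the 17-byte sample from the "needed" list can be packed with Python's `struct` module and compared with an ASCII line. This is a sketch only: the source gives field sizes but not types, so unsigned integers and little-endian byte order are assumptions, as are the function names.

```python
import struct

# little-endian, no padding: uint64 timestamp (ns), uint8 node_id,
# uint32 voltage, uint32 current (integer types are an assumption)
RECORD = struct.Struct("<QBII")


def pack_sample(timestamp_ns, node_id, voltage_raw, current_raw):
    """Encode one sample as a fixed-size binary record."""
    return RECORD.pack(timestamp_ns, node_id, voltage_raw, current_raw)


def unpack_sample(blob):
    """Decode a record back into (timestamp_ns, node_id, voltage_raw, current_raw)."""
    return RECORD.unpack(blob)


sample = (1_600_000_000_000_000_000, 7, 3_300_000, 12_345)
blob = pack_sample(*sample)
ascii_line = ",".join(str(v) for v in sample)

print(RECORD.size)                 # 17
print(len(ascii_line) > len(blob)) # the text form is longer and still needs parsing
```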
Timescale DB vs Influx ⇾ influx seems to dominate with fewer devices <= 100
timescale: SQL, robust, based on postgreSQL, time series, relational, various datatypes
looks more professional, but like influx they want to sell
initial setup is hard to script
1M inserts/s are considered excellent; I reached ~60k with one remote connection
no low level api available, but some SQLs allow to load from file (csv)
influxDB2:
inserting 200s data takes ~ 190s (1 node), with almost no load on VM or system
⇾ makes 108k/s inserts from one node
marketing documents say the insert-rate of free-database is good for ~ 250k/s
is it artificially limited or is it another invisible bottleneck?
ram usage seems to be ok << 1 GB
queries with ns resolution can get very slow: ~3s for averaging windows
influx can almost naturally import hdf5, numpy-arrays, pandas DataFrames
dataexplorer shows plots only windowed, smallest window is 1s (may be unimportant)
elastic + logstash, search engine,
redis, key-value store
mongoDB
allows usage of BSON instead of JSON
DB-Bypass
measurement-data could be stored directly on the server
each measurement is stored in a separate folder, named by hash or timestamp
it contains config data, logs and results
file-references are inserted into a DB
metrics for benchmarks or competitions could be generated by a user-script
pandas or numpy, as fast CPython-based conveniences, allow powerful one-liners
these metrics could be displayed on the website as a condensed result
these metrics could also be used for some kind of leaderboard
a downsampled dataset (1 kHz?) could be fed into a database for semi-live analysis / observation
a full-res dataset could be fed into the database afterward but would mean a longer blocking time between measurements
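The DB-Bypass idea (one folder per measurement, only file references in the DB) might look roughly like this. SQLite stands in for whichever database is chosen; the table layout, file names and function name are assumptions for illustration.

```python
import json
import sqlite3
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def store_measurement(root, db, config, results):
    """Write one measurement into its own folder and reference it in the DB."""
    name = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")  # folder named by timestamp
    folder = Path(root) / name
    folder.mkdir(parents=True)
    (folder / "config.json").write_text(json.dumps(config))
    (folder / "results.bin").write_bytes(results)
    (folder / "run.log").write_text("measurement stored\n")
    db.execute(
        "INSERT INTO measurements (name, path, size_bytes) VALUES (?, ?, ?)",
        (name, str(folder), len(results)),
    )
    db.commit()
    return folder


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE measurements (name TEXT PRIMARY KEY, path TEXT, size_bytes INTEGER)")

with tempfile.TemporaryDirectory() as root:
    folder = store_measurement(root, db, {"duration_s": 60}, b"\x00" * 17)
    files = sorted(p.name for p in folder.iterdir())
    row = db.execute("SELECT path, size_bytes FROM measurements").fetchone()

print(files, row[1])
```

The database only carries small rows, so the insert-rate bottleneck discussed under DB Decision does not apply to the raw measurement data.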
TODO
try paid db-vm (influx)
compare elastic against influx, no support for nanosec?
benchmark server (disks / ram)
offer predefined energy-patterns (on/off, diode, different converters (boost, buck/boost))
design-choices for later
does shepherd need databases for immediate (deep) analysis of results?
alternative: provide post-scripts that filter data for key-parameters (benchmark-management)
data hoarding or economical use of space?
what else?