Infrastructure¶
In general we want:
- Open Source solutions over proprietary
- Decentralized solutions over centralized
- Good integration and extensibility (we're hackers after all)
- Good usability (can also be improved with custom modifications)
- Good compatibility (at least 3 desktop and 2 mobile platforms)
- Easy maintainability
Servers¶
h1¶
This is my private Hetzner server; it's however trivial to migrate containers once we have dedicated lab10 servers. The host has the name h1.d10r.net; d10r.net and *.d10r.net also point to it.
I have several years of experience with root servers at Hetzner and am quite satisfied, especially regarding performance and cost, but also service.
The server was installed using installimage, with the attached installimage.cfg as config.
The Admin interface for the server is located at https://robot.your-server.de/server.
Here it can be rebooted (in case that's not possible otherwise), put into rescue mode etc.
Here it's also possible to order additional IPs or IP blocks, set reverse DNS, traffic warnings etc.
Administration at the OS level and above can be done via ssh.
Hardware
Currently one server is used: Hetzner EX4S: i7-2600 (quad-core), 32 GB DDR3 RAM, 2x 3 TB HDD, 1 Gbit/s NIC.
Traffic is unlimited, but throttled to 10 Mbit/s after 20 TB/month.
It currently has 3 IPv4 addresses and a /64 IPv6 subnet.
Storage
Most of the storage is allocated for LVM, using a single volume group vg0, with the logical volume /dev/vg0/root being used for the root fs and /dev/vg0/data (2.7 TB) for the rest. Logical volumes on vg0: root with 80 GB, data with 1.76 TB, pvepool with 870 GB.
The LXC containers use thinly provisioned logical volumes which aren't shown by the df tool. Use lvdisplay /dev/vg0/pvepool in order to check usage.
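As a complement to lvdisplay, the fill level of the thin pool and of the individual thin volumes can also be read in a script-friendly way via lvs. The following is only a minimal sketch: the function names are made up for illustration, and it assumes that lvs is called with the data_percent report field and "|" as separator. Regular (non-thin) volumes such as root report an empty value and are skipped.

```python
import subprocess


def parse_lvs(text):
    """Parse the output of
    `lvs --noheadings --separator '|' -o lv_name,data_percent <vg>`
    into a dict {lv_name: percent_used}.
    """
    usage = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, percent = line.partition("|")
        # Non-thin volumes have no data_percent value; skip them.
        if percent.strip():
            usage[name.strip()] = float(percent)
    return usage


def thin_usage(vg="vg0"):
    """Run lvs on the host (needs root) and return the usage per thin LV/pool."""
    result = subprocess.run(
        ["lvs", "--noheadings", "--separator", "|",
         "-o", "lv_name,data_percent", vg],
        check=True, capture_output=True, text=True,
    )
    return parse_lvs(result.stdout)
```

On the host, thin_usage() would then return something like a mapping from pvepool and the container volumes to their usage in percent; the actual volume names depend on what Proxmox created.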
OS
It runs Debian 8 with Proxmox VE 4.4.
Proxmox is basically a convenience wrapper around KVM and LXC; I currently use only LXC.
I have a handmade bash script which allows me to get an LXC container based on an existing template (e.g. Ubuntu 16.04) running and configured in a few minutes.
With LXC, containers can be over-provisioned. That means I can in total assign more CPU cores, RAM and storage than is actually available. This makes it easier to configure containers for good peak performance.
Currently, most containers run Ubuntu (all 14.04 or 16.04 LTS). In general I avoid distros with short support lifecycles, as OS upgrades always require manual work and carry a risk of breakage.
Networking
Containers not having their own public IP have an internal IP of the 192.168.2.0/24 subnet, with the last octet corresponding to the container id. For example: 192.168.2.101.
Ports are forwarded from the host via iptables. The container setup script contains a forwarding rule for the ssh port, where the public port is the container id followed by 22, e.g. 10122 for the container with id 101.
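The addressing scheme can be sketched in a few lines. This is an illustration only, not the actual setup script: the function names are hypothetical, the public ssh port is assumed to be the container id followed by 22 (as in the 10122 example), and the iptables rule is one plausible DNAT form, which may differ from what the script really writes.

```python
import ipaddress

# Internal subnet used for containers without their own public IP.
SUBNET = ipaddress.ip_network("192.168.2.0/24")


def container_ip(ctid):
    """Internal IP: the last octet equals the container id."""
    if not 1 <= ctid <= 254:
        raise ValueError("container id must fit into the last octet (1-254)")
    return str(SUBNET.network_address + ctid)


def ssh_port(ctid):
    """Public ssh port: container id followed by 22 (assumed scheme)."""
    port = int(f"{ctid}22")
    if port > 65535:
        raise ValueError("resulting port exceeds 65535")
    return port


def dnat_rule(ctid):
    """Build an iptables DNAT rule forwarding the public port to the container."""
    return [
        "iptables", "-t", "nat", "-A", "PREROUTING",
        "-p", "tcp", "--dport", str(ssh_port(ctid)),
        "-j", "DNAT", "--to-destination", f"{container_ip(ctid)}:22",
    ]


print(" ".join(dnat_rule(101)))
```

With this scheme, container 101 would be reachable from outside via ssh on port 10122 of the host.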
Security
I use ssh only with keys, not passwords. Additional user accounts are created with password disabled.
The container config script (a copy is located in every container at /root/guest_setup.sh) also sets up automatic security updates via UnattendedUpgrades. I try to minimize installing software from outside the official repositories in order to have everything covered by these automatic updates.
LXC containers are not great when it comes to security. KVM offers better isolation. Thus if we run something with increased risk of being owned, it should probably get a KVM instance in order to keep the host safe. Dedicated HW is of course always the better option security-wise.
Monitoring
Currently there's only basic monitoring set up. Notifications go to admin@d10r.net. This includes:
- Output generated by cron scripts (e.g. automatic letsencrypt renewal)
- Automatic updates
- LVM errors of the host
- SMART errors of the host
Munin is running on h1: link (I haven't yet put any love into configuring it).
If you wonder about the CPU usage: that's mining processes (I don't fully saturate the CPUs in order to keep stuff like Redmine responsive).
Backup¶
h1 currently has no automatic backup set up. It is however using RAID 1 (btw. both HDDs were recently replaced due to SMART errors, resync worked fine).
Proxmox / LXC supports live container backups via LVM snapshotting. I can look up how exactly to do it on a server I was responsible for until recently.
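Until the exact procedure is looked up, here is a rough sketch of how such a snapshot backup could be triggered with Proxmox's vzdump tool. Everything specific here is an assumption: the helper name is hypothetical, the dump directory is just an example path, and lzo compression is merely one of the available choices.

```python
def vzdump_command(ctid, dumpdir="/var/lib/vz/dump"):
    """Build a vzdump call that backs up a running container in snapshot mode.

    dumpdir is an example path, not necessarily what h1 would use.
    """
    return [
        "vzdump", str(ctid),
        "--mode", "snapshot",   # back up without stopping the container
        "--compress", "lzo",    # assumed choice; gzip is also available
        "--dumpdir", dumpdir,
    ]


# To actually run it on the host (as root), something like:
#   import subprocess
#   subprocess.run(vzdump_command(101), check=True)
print(" ".join(vzdump_command(101)))
```

The resulting archive only lands on the host itself, so it would still have to be copied elsewhere.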
In order for backups to really make sense, they need to be stored on and fetched from a different server.
We should also have a backup on a server in the lab10 office, but I'm not sure if that's a good place for the primary backup. A restore could take quite a while due to the limited bandwidth. Probably ok for some things, but not for all. Minebox may not be a bad option.
A dedicated backup server could also be configured for quick failover. If it runs Proxmox, container backups can easily be deployed. Outage time could basically come down to reaction time + DNS TTL. Even real HA wouldn't be much more difficult with the Proxmox Cluster option.
We need to find the right balance between resources invested and risks at any given time.
Missing¶
- E-Mail: Where do we host E-Mail for @lab10.coop (and do we also want @lab10.io mails / aliases)? I ran an E-Mail server once long ago and am not eager to do that again (no, my name is not Hillary). It needs to be 100% reliable, secure and spam resistant. It should ideally allow programmatic account creation, management and backup. It should offer well-configurable server-side filtering rules and the option to edit rules for multiple accounts (for all of this we should check whether there's an API). It needs to be perfectly configured (including stuff like reverse DNS and SPF) so that outgoing mail doesn't risk being spam filtered. Additional mail addresses should be cheap (address != account). It should not annoy with mailbox size or attachment size limitations.
- Mailing lists: Easy to create and manage, with archiving, archive website, searchable.
- SSO Account. Something not reliant on a central, proprietary provider and which integrates with a lot of services (e.g. Redmine). What's a good option for that? LDAP seems to be it, I've no experience with it yet.
- Shared Storage: Something similar to Dropbox, but not central and locked. Should allow granular syncing. Should ideally integrate well with our PM and Messaging. Should support full text search. I have experience with ownCloud and mixed feelings (Nextcloud supposedly does better, however). Last time we tried, our attempt to integrate it well was of limited success. My experience is that for many things simple Redmine attachments work best, because that's where the context is and they are easy to find thanks to full text search. It's however not for everything. Redmine also has Files and Documents modules, but as long as these aren't connected to a syncable backend, they don't add any value imo. We should probably do some brainstorming and analysis about what such a storage would actually be needed for (after subtracting content residing in git or as Redmine attachment or E-Mail attachment or Chat attachment or shared document or backup...). A possible alternative is probably a system collecting / mapping content from all those other locations instead of an additional location.
- Shared Calendar: Should be easy to access via Desktop and Mobile. Should allow easy importing of events. Nextcloud?
- Shared documents: Something like Google Docs, but not by Google? For text and spreadsheets there's e.g. Etherpad and Ethercalc. But what about docs? Maybe Nextcloud has something?
- CRM: Redmine has this plugin installed (free version). I can't judge how good / useful it is. In any case it's extensible and open to whatever integration, since even with the commercial version we get the source code. The commercial version does however require manual intervention for updating. Thomas mentioned pipedrive as a positive example. Maybe this can be tailored with Redmine (example agile board). I think what the Redmine CRM solution is still lacking is activity planning (at least that was an issue at my previous team). For somebody with Rails skills (mine are very limited) it shouldn't be too difficult to get it where we want it.
- VCS: git. But where? I have experience with standalone git repos. Setting up repos is convenient with scripts, but I didn't have fine grained access control (I never bothered much and am not sure if we need that at lab10, considering that ultimately most code is going to be open sourced anyway). Currently gitolite is set up and integrated with Redmine (see #25-5); due to the very limited experience so far I can't however judge how reliable and good a solution that is. What about GitLab? (I haven't tried it yet, but have heard good things.)
- CI. I've used Jenkins and found it quite ok (once it's set up). Nowadays I'd probably first take a closer look at GitLab (again).
- Analytics: Piwik works well imo, I have used it for ~5 years without many issues.
- Messaging: Slack is nice, but ... centralized and proprietary. While there's Ethereum/Whisper based messaging in the making (see #26), it's far from stable and so far very platform limited. The hottest candidate imo is Matrix with its Riot clients. It can integrate with Slack and whatever else offers an API. Looks like heaven for hacking. This should also be a great basis for hand crafted chat bots.
- Chat bots: I don't yet have much experience with them (except bots reporting about CI tasks), but do believe that something like Hubot can be a great tool for integrating various services and boosting productivity (I see a bot enhanced messaging platform as a kind of shared terminal which is also approachable for non-devs). Matthias knows more.
What else?
We may take a look at Sandstorm for some Apps. I've already installed it (see #3), but it's not yet working due to an issue with the virtual environment (I'm basically running a VM inside a VM here).