Welcome to End Point’s blog

Ongoing observations by End Point people

Wrocoverb part 2 - The Elixir Hype

One of the main reasons I attend Wrocloverb almost every year, is that it's a great forum for confronting ideas. It's almost a tradition to have at least 2 very enjoyful discussion panels during this conference. One of them was devoted to Elixir and why Ruby [1] community is so hyping about it.

Why Elixir is “sold” to us as “new better Ruby” while its underlying principles are totally different? Won’t it result in Elixir programmers that do not understand Elixir (like Rails programmers that do not know Ruby)?

Panelists discussed briefly the history of Elixir:

Jose Valim (who created Elixir) was working on threading in Rails and he was searching for better approaches for threading in web frameworks. He felt like lots of things were lacking in Erlang and Elixir is his approach for better Exceptions, better developer experience.

Then they've jumped to Elixir's main goals which are:
  • Compatibility with Erlang (all datatypes)
  • Better tooling
  • Improving developer's experience

After that, they've started speculating about problems that Elixir solves and RoR doesn't:

Ruby on Rails addresses many problems in ways that may be somehow archaic to us in the ever-scaling world of 2017. There are many approaches to it - e.g. "actor model" which is implemented in Ruby by Celluloid, in Scala with Akka and also Elixir and Phoenix (Elixir's rails-like framework) has it's own actor model.

Phoenix ("Rails for Elixir") is just an Elixir app - unlike Rails, it is not separate from Elixir. Moreover Elixir is exactly the same language as Erlang so:

Erlang = Elixir = Phoenix

Great comment:

Then a discussion followed during which panelists speculated about the price of the jump from Rails to Elixir:

Java to Rails jump was caused by business/productivity. There's no such jump for Phoenix/Elixir. Elixir code is more verbose (less instance variables, all args are passed explicitly to all functions)

My conclusions

A reason why this discussion was somehow shallow and pointless was that those two world have different problems. Great comment:

Elixir solves a lot of technical problems with scaling thanks to Erlang's virtual machine. Such problems are definitely only a small part of what typical ruby problems solvers deal with on a daily basis. Hearing Elixir and Ruby on Rails developers talk past each other was probably the first sign of the fact that there's no hyping technology right now. Each problem can be addressed by many tech tools and frameworks.

[1] Wrocloverb describes itself as a "best Java conference in Ruby world". It's deceiving:

Wrocloverb 2017 part 1

Wrocloverb is a single-track 3-day conference that takes place in Wrocław (Poland) every year in March.

Here's a subjective list of most interesting talks from the first day

# Kafka / Karafka (by Maciej Mensfeld)

Karafka is another library that simplifies Apache Kafka usage in Ruby. It lets Ruby on Rails apps benefit from horizontally scalable message busses in a pub-sub (or publisher/consumer) type of network.

Why Kafka is (probably) better message/task broker for your app:
- broadcasting is a real power feature of kafka (http lacks that)
- author claims that its easier to support it  rather than ZeroMQ/RabbitMQ
- it's namespaced with topics (similar to Robot Operating System)
- great replacement for ruby-kafka and Poseidon

# Machine Learning to the Rescue (Mariusz Gil)

This talk was devoted to Machine Learning success (and failure) story of the author.

Author underlined that Machine Learning is a process and proposed following workflow:

  1. define a problem
  2. gather you data
  3. understand your data
  4. prepare and condition the data
  5. select & run your algorithms
  6. tune algorithms parameters
  7. select final model
  8. validate final model (test using production data)
Mariusz described few ML problems that he has dealt with in the past. One of them was a project designed to estimate cost of a code review. He outlined the process of tuning the input data. Here's a list of what comprised the input for a code review estimation cost:
  • number of lines changed
  • number of files changed
  • efferent coupling
  • afferent coupling
  • number of classes
  • number of interfaces
  • inheritance level
  • number of method calls
  • lloc metric
  • lcom metric (whether single responsibility pattern is followed or not)

# Spree lightning talk by

One of the lightning talks was devoted to Spree. Here's some interesting latest data from spree world:

  • number of contributors of spree - 700
  • it's very modular modular
  • it's api driven
  • it's one of the biggest repos on github
  • very large number of extensions
  • it drives thousands of stores worldwide
  • Spark Solutions is a maintainer
  • Popular companies that use spree: Go Daddy, Goop, Casper, Bonobos, Littlebits, Greetabl
  • it support rails 5, rails 4.2 and rails 3.x
Author also released newest 3.2.0 stable version during the talk:

Half day GlusterFS training in Selangor, Malaysia

On January 21, 2017, I had an opportunity to join a community-organized training on storage focused on GlusterFS. GlusterFS is an open source cloud-based filesharing network. The training was not a strictly structured training as the topic approached knowledge sharing from various experts and introduced GlusterFS to the ones who were new to it. The first session was delivered by Mr Adzmely Mansor from NexoPrima. He shared a bit of his view on GlusterFS and technologies that are related to it.

Mr Haris, a freelance Linux expert, later lead a GlusterFS technical class. Here we created two virtual machines (we used Virtualbox) to understand how GlusterFS works in a hands-on scenario. We used Ubuntu 16.04 as the guest OS during technical training. We used Digital Ocean's GlusterFS settings as a base of reference. The below commands detail roughly what we did during the training.

In GlusterFS the data section is called as "brick". Hence we could have a lot of "bricks" if we have it more than once :) . As Ubuntu already had the related packages in its repository, we could simply run apt-get for the package installation. Our class notes were loosely based from Digital Ocean's GlusterFS article here. (Note: the article was based on Ubuntu 12.04 so some of the steps could be omitted).

The GlusterFS packages could be installed as a superuser with the following command:

apt-get install glusterfs-server

Since we were using a bridged VM during the demo, we simply edited the /etc/hosts in the each VM so they could communicate between each other by using hostname instead of using typing the IP manually.

root@gluster2:~# cat /etc/hosts|grep gluster	gluster1	gluster2

Here we will try to probe the remote host whether it is reachable:

root@gluster2:~# gluster peer probe gluster1
peer probe: success. Host gluster1 port 24007 already in peer list

The following commands create the storage volume. Later, whatever we put in the /data partition will be reachable on the other gluster node.

gluster volume create datastore1 replica 2 transport tcp gluster1:/data gluster2:/data
gluster volume create datastore1 replica 2 transport tcp gluster1:/data gluster2:/data force
gluster volume start datastore1

Most of the parts here could be retrieved from the link that I gave above. But let's see what will happen later on when the mounting part is done.

cd /datastore1/
root@gluster2:/datastore1# touch blog
root@gluster2:/datastore1# ls -lth
total 512
-rw-r--r-- 1 root root  0 Mar 14 21:33 blog
-rw-r--r-- 1 root root 28 Jan 21 12:15 ujian.txt

The same output could be retrieved from gluster1

root@gluster1:/datastore1# ls -lth
total 512
-rw-r--r-- 1 root root  0 Mar 14 21:33 blog
-rw-r--r-- 1 root root 28 Jan 21 12:15 ujian.txt

Mr. Adzmely gave explanation on the overall picture of GlusterFS

Mr. Haris explained on the technical implementation of GlusterFS

In terms of the application, the redundancy based storage is good for situations where you have a file being updated on several servers and you need to ensure the file is there for retrieval even if one of the servers is down. One audience member shared his experience deploying GlusterFS in his workplace (a university) for the purpose of new intake of student's registration. If anyone ever uses Samba filesystem or NFS, this kind of similar, but GlusterFS is much more advanced. I recommend additional reading here.

Shell efficiency: mkdir and mv

Little tools can be a nice improvement. Not everything needs to be thought-leaderish.

For example, once upon a time in my Unix infancy I didn't know that mkdir has the -p option to make intervening directories automatically. So back then, in order to create the path a/b/c/ I would've run: mkdir a; mkdir b; mkdir c when I could instead have simply run: mkdir -p a/b/c.

In working at the shell, particularly on my own local machine, I often find myself wanting to move one or several files into a different location, to file them away. For example:

mv -i ~/Downloads/Some\ Long\ File\ Name.pdf ~/some-other-long-file-name.tar.xz ~/archive/new...

at which point I realize that the subdirectory of ~/archive that I want to move those files into does not yet exist.

I can't simply move to the beginning of the line and change mv to mkdir -p without removing my partially-typed ~/archive/new....

I can go ahead and remove that, and then after I run the command I have to change the mkdir back to mv and add back the ~/archive/new....

In one single day I found I was doing that so often that it became tedious, so I re-read the GNU coreutils manpage for mv to see if there was a relevant option I had missed or a new one that would help. And I searched the web to see if a prebuilt tool is out there, or if anyone had any nice solutions.

To my surprise I found nothing suitable, but I did find some discussion forums full of various suggestions and many brushoffs and ill-conceived suggestions that either didn't work for me or seemed much overengineered.

The solution I came up with was very simple. I've been using it for a few months and am happy enough with it to share it and see if it helps anyone else.

In zsh (my main local shell) add to ~/.zshrc:

mkmv() {
    mkdir -p -- "$argv[-1]"
    mv "$@"

And in bash (which I use on most of the many servers I access remotely) add to ~/.bashrc:

mkmv() {
    mkdir -p -- "${!#}"
    mv "$@"

To use: Once you realize you're about to try to move files or directories into a nonexistent directory, simply go to the beginning of the line (^A in standard emacs keybindings) and type mk in front of the mv that was already there:

mkmv -i ~/Downloads/Some\ Long\ File\ Name.pdf ~/some-other-long-file-name.tar.xz ~/archive/new...

It makes the directory (or directories) and then completes the move.

There are a few important considerations that I didn't foresee in my initial naive implementation:

  • Having the name be somethingmv meant less typing than something requiring me to remove the mv.
  • For me, it needs to support not just moving one thing to one destination, but rather a whole list of things. That meant accessing the last argument (the destination) for the mkdir.
  • I also needed to allow through arguments to mv such as -i, -v, and -n, which I often use.
  • The -- argument to mkdir ensures that we don't accidentally end up with any other options and that we can handle a destination with a leading - (however unlikely that is).
  • The mv command needs to have a double-quoted "$@" so that the original parameters are each expanded into double-quoted arguments, allowing for spaces and other shell metacharacters in the paths. (See the zsh and bash manpages for details on the important difference in behavior of "$@" compared to "$*" and either of them unquoted.)

This doesn't support GNU extensions to mv such as a --target-directory that precedes the source paths. I don't use that interactively, so I don't mind.

Because this is such a small thing, I avoided for years bothering to set it up. But now that I use it all the time I'm glad I have it!

360° Panoramic Video on Liquid Galaxy

End Point’s Liquid Galaxy recently had an exciting breakthrough! We can now show 360° panoramic video effectively and seamlessly on our Liquid Galaxy systems.

Since our integration of seamless 360° panovideo, our marketing and content teams have been loading up our system with the best 360° panoramic videos. The Liquid Galaxy system at our headquarter office has 360° panoramic videos from National Geographic, The New York Times, Google Spotlight, gaming videos, and more.

The technical challenge with panoramic video is to get multiple instances of the video player app to work in sync. On the Liquid Galaxy, you're actually seeing 7 different video player apps all running at the same time and playing the same video file, but with each app showing a slightly adjusted slice of the 360° panoramic video. This synchronization and angle-adjustment is at the heart of the Liquid Galaxy platform, and allows a high-resolution experience and surrounding immersion that doesn't happen on any other video wall system.

We anticipate that our flawless 360° panoramic video will resonate with many industries, one of whom is the gaming industry. Below we’ve included a video of 360° gaming videos, and how they appear on Liquid Galaxy. Mixed with all the other exciting capabilities on Liquid Galaxy, we anticipate the ability to view angle-adjusted and 3D-immersive 360° video on the system will be a huge crowd-pleaser.

If you are interested in learning more about 360° panoramic video on Liquid Galaxy, don’t hesitate to contact us or visit our Liquid Galaxy website for more information.

Become A File Spy With This One Easy Trick! Sys Admins Love This!

We had an interesting problem to track down. (Though I suppose I wouldn't be writing about it if it weren't, yes?) Over the years a client had built up quite the collection of scripts executed by cron to maintain some files on their site. Some of these were fairly complex, taking a long while to run, and overlapping with each other.

One day, the backup churn hit a tipping point and we took notice. Some process, we found, seemed to be touching an increasing number of image files: The contents were almost always the same, but the modification timestamps were updated. But digging through the myriad of code to figure out what was doing that was proving to be somewhat troublesome.

Enter auditd, already present on the RHEL host. This allows us to attach a watch on the directory in question, and track down exactly what was performing the events. -- Note, other flavors of Linux, such as Ubuntu, may not have it out of the box. But you can usually install it via the the auditd package.
(output from a test system for demonstration purposes)
# auditctl -w /root/output
# tail /var/log/audit/audit.log
type=SYSCALL msg=audit(1487974252.630:311): arch=c000003e syscall=2 success=yes exit=3 a0=b51cf0 a1=241 a2=1b6 a3=2 items=2 ppid=30272 pid=30316 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts2 ses=1 comm="" exe="/usr/bin/bash" key=(null)
type=CWD msg=audit(1487974252.630:311):  cwd="/root"
type=PATH msg=audit(1487974252.630:311): item=0 name="output/files/" inode=519034 dev=fd:01 mode=040755 ouid=0 ogid=0 rdev=00:00 objtype=PARENT
type=PATH msg=audit(1487974252.630:311): item=1 name="output/files/1.txt" inode=519035 dev=fd:01 mode=0100644 ouid=0 ogid=0 rdev=00:00 objtype=CREATE
The most helpful logged items include the executing process's name and path, the file's path, operation, pid and parent pid. But there's a good bit of data there per syscall.

Don't forget to auditctl -W /root/output to remove watch. auditctl -l will list what's currently out there:
# auditctl -l
-w /root/output -p rwxa
That's the short version. auditctl has a different set of parameters that are a little bit more verbose, but have more options. The equivalent of the above would be: auditctl -a always,exit -F dir=/root/output -F perm=rwxa ... with options for additional rule fields/filters on uid, gid, pid, whether or not the action was successful, and so on.

FOSDEM 2017: experience, community and good talks

In case you happen to be short on time: my final overall perspective about FOSDEM 2017 is that it was awesome... with very few downsides.

If you want the longer version, keep reading cause there's a lot to know and do at FOSDEM and never enough time, sadly.

This year I actually took a different approach than last time and decided to concentrate on one main track per day, instead of (literally) jumping from one to the other. While I think that overall this may be a good approach if most of the topics covered in a track are of your interest, that comes at the cost of missing one of the best aspects of FOSDEM which is "variety" in contents and presenters.

Day 1: Backup & Recovery

For the first day I chose the Backup & Recovery track which hosted talks revolving around three interesting and useful projects: namely REAR (Relax and Recovery), DRLM, a wrapper and backup management tool based on REAR and Bareos, which is a backup solution forked from Bacula in 2010 and steadily proceeding and improving since then. Both REAR and DLRM were explained and showcased by some of the respective projects main contributors and creators. As a long time system administrator, I particularly appreciated the pride in using Bash as the main "development platform" for both projects. As Johannes Meixner correctly mentioned, using bash facilitates introduces these tools into your normal workflow with knowledge that you'll most likely already have as a System Administrator or DevOps, thus allowing you to easily "mold" these scripts to your specific needs without spending weeks to learn how to interact with them.

During the Day 1 Backup & Recovery track there were also a few speeches from two Bareos developers (Jörg Steffens and Stephan Dühr) that presented many aspects of their great project, ranging from very introductory topics, to providing a common knowledge ground for the audience, up to more in depth topics like software capabilities extension through Python Plugins, or a handful of best practices and common usage scenarios. I also enjoyed the speech about automated testing in REAR, presented by Gratien D'haese, which showed how to leverage common testing paradigms and ideas to double-check a REAR setup for potential unexpected behaviors after updates or on new installations or simply as a fully automated monitoring tool to do sanity checks on the backup data flow. While this testing project was new, it's already functional and impressive to see at work.

Day 2: Cloud Microservices

On the second day I moved in a more "cloudy" section of the FOSDEM where most of the conferences revolved around Kubernetes, Docker and more in general the microservices landscape. CoreOS (the company behind the open source distribution) was a major contributor and I liked their Kubernetes presentation by Josh Wood and Luca Bruno which respectively explained the new Kubernetes Operators feature and how containers work under the hood in Kubernetes.

Around lunch time there was a "nice storm of lightning talks" which kept most of the audience firmly on their seats, especially since the Microservices track room didn't have a free seat for the entire day. I especially liked the talk from Spyros Trigazis about how CERN created and is maintaining a big OpenStack Magnum (the container integrated version of OpenStack) cloud installation for their internal use.

Then it was Chris Down's turn and, while he's a developer from Facebook, his talk gave the audience a good perspective on the future of CGROUPs in the Linux kernel and how they are already relatively safe and usable, even if not yet officially marked as production ready. While I already knew and used "sysdig" in past as a troubleshooting and investigation tool, it was nice to see one of the main developers, Jorge Salamero, using it and showing alternative approaches such as investigating timeout issues between Kubernetes Docker containers by just sysdig and its many modules and filters. It was really impressive seeing how easy it is to identify cross-containers issues and data flow.


There were a lot of Open Source communities with "advertising desks" and I had a nice talk with a few interesting developers from the CoreOS team or from FSFE (Free Software Foundation Europe). Grabbing as many computer stickers as possible is also mandatory at FOSDEM, so I took my share and my new Thinkpad is way more colorful now. In fact, on a more trivial note, this year the FOSDEM staff decided to sell on sale all the laptops that were used during the video encoding phase for the streaming videos before the upload. These laptops were all IBM Thinkpad X220 and there were only a handful of them (~30) at a very appealing price. In fact, this article is being written from one of those very laptops now as I was one of the lucky few which managed to grab one before they were all gone within an hour or so. So if you're short of a laptop and happen to be at FOSDEM next year, keep your eyes open cause I think they'll do it again!

So what's not to like in such a wonderful scenario? While I admit that there was a lot to be seen and listened to, I sadly didn't see any "ground-shaking" innovation this year at FOSDEM. I did see many quality talks and I want to send a special huge "thank you" to all the speakers for the effort and high quality standards that they keep for their FOSDEM talks - but I didn't see anything extraordinarily new from what I can remember.

Bottom line is that I still have yet to find someone who was ever disappointed at FOSDEM, but the content quality varies from presenter to presenter and from year to year, so be sure to check the presentations you want to attend carefully before hand.

I think that the most fascinating part of FOSDEM is meeting interesting, smart, and like-minded people that would be difficult to reach otherwise.

In fact, while a good share of the merit should be attributed to the quality of the content presented, I firmly believe that the community feeling that you get at FOSDEM is hard to beat and easy to miss when skipped even for one year.

I'll see you all next year at FOSDEM then.