News

Welcome to End Point’s blog

Ongoing observations by End Point people

FOSDEM 2017: experience, community and good talks

In case you happen to be short on time: my final overall perspective about FOSDEM 2017 is that it was awesome... with very few downsides.

If you want the longer version, keep reading, because there's a lot to know and do at FOSDEM and, sadly, never enough time.

This year I took a different approach than last time and decided to concentrate on one main track per day, instead of (literally) jumping from one to the other. While I think this may be a good approach if most of the topics covered in a track interest you, it comes at the cost of missing one of the best aspects of FOSDEM, which is the "variety" of content and presenters.

Day 1: Backup & Recovery

For the first day I chose the Backup & Recovery track, which hosted talks revolving around three interesting and useful projects: REAR (Relax-and-Recover); DRLM, a wrapper and backup management tool based on REAR; and Bareos, a backup solution forked from Bacula in 2010 that has been steadily improving since then. Both REAR and DRLM were explained and showcased by some of the respective projects' main contributors and creators. As a long-time system administrator, I particularly appreciated the pride in using Bash as the main "development platform" for both projects. As Johannes Meixner correctly mentioned, using Bash makes it easier to introduce these tools into your normal workflow with knowledge that you most likely already have as a system administrator or DevOps engineer, thus allowing you to easily "mold" these scripts to your specific needs without spending weeks learning how to interact with them.

During the Day 1 Backup & Recovery track there were also a few talks from two Bareos developers (Jörg Steffens and Stephan Dühr), who presented many aspects of their great project. These ranged from very introductory topics, which gave the audience a common knowledge ground, up to more in-depth topics like extending the software's capabilities through Python plugins, plus a handful of best practices and common usage scenarios. I also enjoyed the talk about automated testing in REAR, presented by Gratien D'haese, which showed how to leverage common testing paradigms and ideas to double-check a REAR setup for unexpected behavior after updates or on new installations, or simply as a fully automated monitoring tool that runs sanity checks on the backup data flow. While this testing project is new, it's already functional and impressive to see at work.

Day 2: Cloud Microservices

On the second day I moved to a more "cloudy" section of FOSDEM, where most of the talks revolved around Kubernetes, Docker and, more generally, the microservices landscape. CoreOS (the company behind the open source distribution) was a major contributor, and I liked their Kubernetes presentations by Josh Wood and Luca Bruno, which respectively explained the new Kubernetes Operators feature and how containers work under the hood in Kubernetes.

Around lunch time there was a "nice storm of lightning talks" which kept most of the audience firmly in their seats, especially since the Microservices track room didn't have a free seat for the entire day. I especially liked the talk from Spyros Trigazis about how CERN created and is maintaining a big OpenStack Magnum (the container-integrated version of OpenStack) cloud installation for their internal use.

Then it was Chris Down's turn and, while he's a developer from Facebook, his talk gave the audience a good perspective on the future of cgroups in the Linux kernel and how they are already relatively safe and usable, even if not yet officially marked as production ready. While I already knew and had used "sysdig" in the past as a troubleshooting and investigation tool, it was nice to see one of the main developers, Jorge Salamero, using it and showing alternative approaches, such as investigating timeout issues between Kubernetes Docker containers using just sysdig and its many modules and filters. It was really impressive to see how easy it is to identify cross-container issues and data flow.

Atmosphere

There were a lot of open source communities with "advertising desks" and I had a nice talk with a few interesting developers from the CoreOS team and from FSFE (Free Software Foundation Europe). Grabbing as many computer stickers as possible is also mandatory at FOSDEM, so I took my share and my new ThinkPad is way more colorful now. In fact, on a more trivial note, this year the FOSDEM staff decided to sell all the laptops that were used to encode the streaming videos before upload. These laptops were all Lenovo ThinkPad X220s and there were only a handful of them (~30) at a very appealing price. In fact, this article is being written on one of those very laptops, as I was one of the lucky few who managed to grab one before they were all gone within an hour or so. So if you're short of a laptop and happen to be at FOSDEM next year, keep your eyes open because I think they'll do it again!

So what's not to like in such a wonderful scenario? While I admit that there was a lot to be seen and listened to, I sadly didn't see any "ground-shaking" innovation this year at FOSDEM. I did see many quality talks and I want to send a special huge "thank you" to all the speakers for the effort and high quality standards that they keep for their FOSDEM talks - but I didn't see anything extraordinarily new from what I can remember.

The bottom line is that I have yet to find someone who was ever disappointed at FOSDEM, but the content quality varies from presenter to presenter and from year to year, so be sure to check the presentations you want to attend carefully beforehand.

I think that the most fascinating part of FOSDEM is meeting interesting, smart, and like-minded people that would be difficult to reach otherwise.

In fact, while a good share of the merit should be attributed to the quality of the content presented, I firmly believe that the community feeling that you get at FOSDEM is hard to beat and easy to miss when skipped even for one year.




I'll see you all next year at FOSDEM then.

Full Cesium Mapping on the Liquid Galaxy

A few months ago, we shared a video and some early work we had done with bringing the Cesium open source mapping application to the Liquid Galaxy. We've now completed a full deployment for Smartrac, a retail tracking analytics provider, using Cesium in a production environment! This project presented a number of technical challenges beyond the early prototype work, but also brought great results for the client and garnered a fair amount of attention in the press, to everyone's benefit.

Cesium is an open source mapping application that separates out the tile sets, elevation, and markup language. This separation allows for flexibility at each major element:

  • We can use a specific terrain elevation data set while substituting any one of several map "skins" to drape on that elevation: a simple color coded map, a nighttime illumination map, even a water-colored "pirate map" look.
  • For the terrain, we can download as much or as little as is needed: As the Cesium viewer zooms in on a given spot, Cesium uses a sort of fractal method to download finer and finer resolution terrains in just the surrounding area, eventually getting to the data limit of the set. This gradual approach balances download requirements with viewable accuracy. In our case, we downloaded an entire terrain set up to level 14 (Earth from high in space is level 1, then zooms in to levels 2, 3, 4, etc.) which gave us a pretty good resolution while conserving disk space. (The data up to level 14 totaled about 250 GB.)
  • Using some KML tools we have developed for past projects and adapting to CZML ("cesium language", get it?), we were able to take Smartrac's supply chain data and show a comprehensive overview of the product flow from factories in southeast Asia through a distribution center in Seattle and on to retail stores throughout the Western United States.

The debut for this project was the National Retail Federation convention at the Javits Center in New York City. Both Smartrac and we wanted to avoid any show-stoppers that might come from a sketchy internet connection. So we downloaded the map tiles and a terrain set, built our visualizations, and saved the whole thing locally on the head node of the Liquid Galaxy server stack, which sat in the back of the booth behind the screens.

The show was a great success, with visitors running through the visualizations almost non-stop for 3 days. The client is now taking the Liquid Galaxy and the Cesium visualizations on to another convention in Europe next month. The NRF, IBM, and several other ecommerce bloggers wrote up the platform, which brings good press for Smartrac, Cesium, and the Liquid Galaxy.

Liquid Galaxy Success at U.S. Embassy’s Cultural Center

The U.S. Embassy in Jakarta features a high-tech cultural center called “@america”. @america’s mission is to provide a space for young Indonesians to learn more about the United States through discussions, cultural performances, debates, competitions and exhibitions.

Since Google generously donated it six years ago, @america has had a Liquid Galaxy deployed for use at the center. Not until recently, however, has @america taken advantage of our Content Management System. This past year, End Point developed and rolled out a revamped and powerful Content Management System for the fleet of Liquid Galaxies we support. With the updated Content Management System, End Point’s Content Team created a specialized Interactive Education Portal on @america's Liquid Galaxy. The Education Portal featured over 50 high quality, interactive university experiences. Thanks to the CMS, the Liquid Galaxy now shows campus videos, university statistics, and fly-tos and orbits around the schools. The campus videos included both recruitment videos and student-created videos on topics like housing, campus sports, and religion. These university experiences give young Indonesians the opportunity to learn more about U.S. universities and culture.

@america and the US Embassy report that from December through the end of January, more than 16,500 Indonesians have already had the opportunity to engage with the Education Portal while visiting @america. We are thankful to have had the opportunity to help the US Embassy use their Liquid Galaxy for such a positive educational cause.


Liquid Galaxy systems are installed at educational institutions, from embassies to research libraries, around the world. If you’d like to learn more about Liquid Galaxy, please visit our Liquid Galaxy website or contact us here.

Smartrac's Liquid Galaxy at National Retail Federation

Last week, Smartrac exhibited at the retail industry’s BIG Show, NRF 2017, using a Liquid Galaxy with custom animations to showcase their technology.

Smartrac provides data analytics to retail chains by tracking physical goods with NFC and Bluetooth tags that combine to track goods all the way from the factory to the distribution center to retail stores. It's a complex but complete solution. How best to visualize all that data and show the incredible value that Smartrac brings? Seven screens with real time maps in Cesium, 3D store models in Unity, and browser-based dashboards, of course. End Point has been working with Smartrac for a number of years as a development resource on their SmartCosmos platform, helping them with UX and back-end database interfaces. This work included development of REST-based APIs for data handling, as well as a Virtual Reality project utilizing the Unity game engine to visualize data and marketing materials directly on several platforms including the Oculus Rift, the Samsung Gear 7 VR, and WebGL. Bringing that back-end work forward in a highly visible platform for the retail conference was a natural extension for them, and the Liquid Galaxy fulfilled that role perfectly. The large Liquid Galaxy display allowed Smartrac to showcase some of their tools on a much larger scale.

For this project, End Point deployed two new technologies for the Liquid Galaxy:
  • Cesium Maps - Smartrac had two major requirements for their data visualizations: show the complexity of the solution and global reach, while also making the map data offline wherever possible to avoid the risk of sketchy Internet connections at the convention center (a constant risk). For this, we deployed Cesium instead of Google Earth, as it allowed for a fully offline tileset that we could store locally on the server, as well as providing a rich data visualization set (we've shown other examples before).
  • Unity3D Models - Smartrac also wanted to show how their product tracking works in a typical retail store. Rather than trying to wire a whole solution during the short period for a convention, however, they made the call to visualize everything with Unity, a very popular 3D rendering engine. Given the multiple screens of the Liquid Galaxy, and our ability to adjust the view angle for each screen in the arch around the viewers, this Unity solution would be very immersive and able to tell their story quite well.

Smartrac showcased multiple scenes that incorporated 3D content with live data, labels superimposed on maps, and a multitude of supporting metrics. End Point developers worked on custom animation to show their tools in an engaging demo. During the convention, Smartrac had representatives leading attendees through the Liquid Galaxy presentations to show their data. Video of these presentations can be viewed below.



Smartrac’s Liquid Galaxy received positive feedback from everyone who saw it, exhibitors and attendees alike. Smartrac felt it was a great way to walk through their content, and attendees both enjoyed the content and were intrigued by the display on which they were seeing the content. Many attendees who had never seen Liquid Galaxy inquired about it.

If you’d like to learn more about Liquid Galaxy or new projects we are working on or having custom content developed, please visit our Liquid Galaxy website or contact us here.

TriSano Case Study


Overview

End Point has been working with state and local health agencies since 2008. We host disease outbreak surveillance and management systems and have expertise providing clients with the sophisticated case management tools they need to deliver in-house analysis, visualization, and reporting, combined with the flexibility to comply with changing state and federal requirements. End Point provides the hosting infrastructure, database, reporting systems, and customizations that agencies need in order to serve their populations.

Our work with health agencies is a great example of End Point’s ability to draw on our experience with open source technology and Ruby on Rails, manage and back up large secure datasets, and integrate reporting systems to build and support a full-stack application. We will discuss one such client in this case study.

Why End Point?

End Point is a good fit for this project because of our expertise in several areas including reporting and our hosting capabilities. End Point has had a long history of consultant experts in PostgreSQL and Ruby on Rails, which are the core software behind this application.

Also, End Point specializes in customizing open-source software, which can save not-for-profit and state agencies valuable budget dollars they can invest in other social programs.

Due to the secure nature of the medical data in these databases, we and our clients must adhere to all HIPAA and CDC policies regarding data handling, server hosting, and staff authorization and access auditing.




Team

Steve Yoman

Steve serves as the project manager for both communication and internal development for End Point’s relationship with the client. Steve brings many years in project management to the table for this job and does a great job keeping track of every last detail, quote, and contract item.


Selvakumar Arumugam

Selva is one of those rare engineers who is gifted with both development and DevOps expertise. He is the main developer on daily tasks related to the disease tracking system. He also does a great job navigating a complex hosting environment and has helped the client make strides towards their future goals.


Josh Tolley

Josh is one of End Point’s most knowledgeable database and reporting experts. Josh’s knowledge of PostgreSQL is extremely helpful to make sure that the data is secure and stable. He built and maintains a standalone reporting application based on Pentaho.




Application

The disease tracking system consists of several applications including a web application, reporting application, two messaging areas, and SOAP services that relay data between internal and external systems.

TriSano: The disease tracking web app is an open source Ruby on Rails application based on the TriSano product, originally built at the Collaborative Software Initiative. This is a role-based web application where large amounts of epidemiological data can be entered manually or by data transfer.

Pentaho: Pentaho is a PostgreSQL reporting application that allows you to run a separate reporting service or embed reports into your website. Pentaho has a community version and an enterprise version, which is what is used on this particular project. This reporting application provides OLAP services, dashboarding, and generates ad hoc and static reports. Josh Tolley customized Pentaho so that the client can download or create custom reports depending on their needs.

Two Messaging Area applications: The TriSano system also serves as the central repository for messaging feeds used to collect data from local health care providers, laboratories throughout the state, and the CDC.

SOAP services: These run between the TriSano web app, the Pentaho reporting application, and the client’s data systems. They translate messages into the correct formats and relay the information to each application.

Into the Future

Based on the success over 9+ years working on this project, the client continues to work with their End Point team to manage their few non open-source software licenses, create long term security strategies, and plan and implement all of their needs related to the continuous improvement and changes in epidemiology tracking. We partner with the client so they can focus their efforts on reading results and planning for the future health of their citizens. This ongoing partnership is something End Point is very proud to be a part of and we hope to continue our work in this field well into the future.

End Point Rings the Morning Bell for Small Business



Recently Chase unveiled a digital campaign for Chase for Business, asking small businesses to submit videos of themselves ringing their own morning bells when they open for business each day. Chase would select one video every day to post on their website and to play on their big screen in Times Square.

A few months back, Chase chose to feature End Point for their competition! They sent a full production team to our office to film us and how we ring the morning bell.

In preparation for Chase's visit, we built a Liquid Galaxy presentation for them on our content management system. The presentation consisted of two scenes. In scene 1, we had “Welcome to Liquid Galaxy” written out across the outside four screens. We displayed the End Point Liquid Galaxy logo on the center screen, and set the system to orbit around the globe. In scene 2, the Liquid Galaxy flies to Chase’s headquarters in New York City, and orbits around their office. Two bells ring, each shown across two screens. The bell videos used were courtesy of Rayden Mizzi and St Gabriel's Church. Our logo continues to display on the center screen, and the Chase for Business website is shown on a screen as well.

The video that Chase created (shown above) features our CEO Rick giving an introduction of our company and then clicking on the Liquid Galaxy’s touchscreen to launch into the presentation.

We had a great time working with Chase, and were thrilled that they chose to showcase our company as part of their work to promote small businesses! To learn more about the Liquid Galaxy, you can visit our Liquid Galaxy website or contact us here.

Using Awk to beautify grep searches

Recently we've seen a wave of re-implementations of many popular Unix tools. With the expansion of communities built around new languages or platforms, it seems that, apart from the novelties in technologies, the ideas on how to use them stay the same. There are more and more solutions to the same kinds of problems:

  • text editors
  • CSS pre-processors
  • find-in-files tools
  • screen scraping tools
  • ... many more ...

In this blog post I'd like to tackle the problem from yet another perspective. Instead of resorting to "new and cool" libraries and languages (grep implemented in X language), I'd like to use what's out there already in terms of tooling to build a nice search-in-files tool for myself.

Search in files tools

It seems that for many people it's very important to have a "search in files" tool that they really like. Some of the nice work we've seen so far includes ripgrep and the_silver_searcher, among others.

These are certainly very nice. However, as the goal of this post is to build something out of the tooling found in any minimal Unix-like installation, they won't work here. They either need to be compiled or require Perl, which isn't installed everywhere (e.g. FreeBSD by default, though it is obviously available via ports).

What I really need from the tool

I do understand that for some developers, waiting 100 ms longer for the search results might be too long. I'm not like that though. Personally, all I care about when searching is how the results are presented. I also like the consistency of using the same approach across the many machines I work on. We're often working on remote machines at End Point. Installing e.g. the Rust compiler just to get the ripgrep tool is too time-consuming and hence doesn't contribute to getting things done faster. The same goes for e.g. the_silver_searcher, which needs to be compiled too. What options do I have then?

Using good old Unix tools

The "find in files" functionality is covered fully by the Unix grep tool. It allows searching for a given substring but also for regular expression matches. The output can contain not only the lines with matches, but also the lines before and after them to give some context. The tool can provide line numbers and also search recursively within directories.
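To make the context and line-number options concrete, here is a minimal sketch on a made-up three-line input (the input and the function name `context_demo` are illustrative assumptions, not part of the original post). It uses the standard -n, -B and -A options; note that grep marks matching lines with ':' after the line number and context lines with '-'.

```shell
# Hypothetical three-line input; only the middle line matches.
context_demo() {
  printf 'alpha\nOption\nomega\n' | grep -n -B1 -A1 'Option'
}

context_demo
# 1-alpha
# 2:Option
# 3-omega
```

That difference between ':' and '-' matters later, once the output is split into columns.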

While I'm not into speeding it up, I'd certainly love to play with its output because I do care about my brain's ability to parse text and hence: be more productive.

The usual output of grep:

$ # searching inside of the ripgrep repo sources:
$ egrep -nR Option src
(...)
src/search_stream.rs:46:    fn cause(&self) -> Option<&StdError> {
src/search_stream.rs:64:    opts: Options,
src/search_stream.rs:71:    line_count: Option<u64>,
src/search_stream.rs:78:/// Options for configuring search.
src/search_stream.rs:80:pub struct Options {
src/search_stream.rs:89:    pub max_count: Option<u64>,
src/search_stream.rs:94:impl Default for Options {
src/search_stream.rs:95:    fn default() -> Options {
src/search_stream.rs:96:        Options {
src/search_stream.rs:113:impl Options {
src/search_stream.rs:160:            opts: Options::default(),
src/search_stream.rs:236:    pub fn max_count(mut self, count: Option<u64>) -> Self {
src/search_stream.rs:674:    pub fn next(&mut self, buf: &[u8]) -> Option<(usize, usize)> {
src/worker.rs:24:    opts: Options,
src/worker.rs:28:struct Options {
src/worker.rs:38:    max_count: Option<u64>,
src/worker.rs:44:impl Default for Options {
src/worker.rs:45:    fn default() -> Options {
src/worker.rs:46:        Options {
src/worker.rs:72:            opts: Options::default(),
src/worker.rs:148:    pub fn max_count(mut self, count: Option<u64>) -> Self {
src/worker.rs:186:    opts: Options,
(...)

What my eyes would like to see is more like the following:

$ mygrep Option src
(...)
src/search_stream.rs:
 46        fn cause(&self) -> Option<&StdError> {
 ⁞    
 64        opts: Options,
 ⁞    
 71        line_count: Option<u64>,
 ⁞    
 78    /// Options for configuring search.
 ⁞    
 80    pub struct Options {
 ⁞    
 89        pub max_count: Option<u64>,
 ⁞    
 94    impl Default for Options {
 95        fn default() -> Options {
 96            Options {
 ⁞    
 113   impl Options {
 ⁞    
 160               opts: Options::default(),
 ⁞    
 236       pub fn max_count(mut self, count: Option<u64>) -> Self {
 ⁞    
 674       pub fn next(&mut self, buf: &[u8]) -> Option<(usize, usize)> {

src/worker.rs:
 24        opts: Options,
 ⁞    
 28    struct Options {
 ⁞    
 38        max_count: Option<u64>,
 ⁞    
 44    impl Default for Options {
 45        fn default() -> Options {
 46            Options {
 ⁞    
 72                opts: Options::default(),
 ⁞    
 148       pub fn max_count(mut self, count: Option<u64>) -> Self {
 ⁞    
 186       opts: Options,
(...)

Fortunately, even the tiniest Unix-like system installation already has all we need to make it happen, without installing anything else. Let's take a look at how we can modify the output of grep with awk to achieve what we need.

Piping into awk

Awk has been in Unix systems for many years — it's older than me! It is a programming language interpreter designed specifically to work with text. In Unix, we can use pipes to direct output of one program to be the standard input of another in the following way:

$ oneapp | secondapp

The idea with our searching tool is to use what we already have and pipe it between the programs to format the output as we'd like:

$ egrep -nR Option src | awk -f script.awk

Notice that we used egrep when in this simple case we didn't need to. It was sufficient to use fgrep or just grep.
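For reference, egrep and fgrep behave like grep -E (extended regular expressions) and grep -F (fixed strings). The sketch below, on a made-up two-line input with hypothetical function names, shows the practical difference: with -F the dot is a literal character, with -E it matches any character.

```shell
# Count matching lines with the pattern taken as a fixed string.
fixed_count() {
  printf 'a.b\naxb\n' | grep -c -F 'a.b'
}

# Count matching lines with the pattern taken as an extended regex.
regex_count() {
  printf 'a.b\naxb\n' | grep -c -E 'a.b'
}

fixed_count   # prints 1
regex_count   # prints 2
```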

Very quick introduction to coding with Awk

Awk is one of the forefathers of languages like Perl and Ruby. In fact some of the ideas I'll show you here exist in them as well.

The structure of awk programs can be summarized as follows:

BEGIN {
  # init code goes here
}

# "body" of the script follows:

/pattern-1/ {
  # what to do with the line matching the pattern?
}

/pattern-n/ {
  # ...
}

END {
  # finalizing
}

The interpreter provides defaults for all three parts: BEGIN and END are no-ops when omitted, and a pattern in the "body" given without an action prints each matching line unmodified.
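A small sketch of these defaults, using made-up input and hypothetical function names: a pattern given without an action block prints the matching lines, and an END block on its own runs once after all input has been read.

```shell
# Hypothetical sample input.
sample() {
  printf 'one\ntwo\nthree\n'
}

# Pattern with no action: matching lines are printed as-is.
lines_with_t() {
  sample | awk '/t/'
}

# Only an END block: NR holds the number of lines read.
line_count() {
  sample | awk 'END { print NR }'
}

lines_with_t   # prints "two" and "three"
line_count     # prints 3
```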

Each line is exploded into columns based on the "separator", which by default is any run of consecutive whitespace characters. One can change it via the -F switch or by assigning the FS variable inside the BEGIN area. We'll do just that in our example.

The "columns" that lines are being exploded into can be accessed via the special variables:

$0 # the whole line
$1 # first column
$2 # second column
# etc

The FS variable can contain a pattern too. So for example if we'd have a file with the following contents:

One | Two | Three | Four
Eins | Zwei | Drei | Vier
One | Zwei | Three | Vier

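The following assignment would make Awk explode lines into proper columns, with the spaces around each "|" treated as part of the separator:

BEGIN {
  # a regex separator: a literal "|" with one space on each side
  FS=" [|] "
}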

# the ~ operator gives true if left side matches
# the regex denoted by the right side:
$1 ~ "One" {
  print $2
}

Running the above script would result in:

$ cat file.txt | awk -f script.awk
Two
Zwei

Simple Awk coding to format the search results

Armed with this simple knowledge, we can tackle the problem we stated in the earlier part of this post:

BEGIN {
  # the output of grep in the simple case
  # contains:
  # <file-name>:<line-number>:<file-fragment>
  # let's capture these parts into columns:
  FS=":"
  
  # we are going to need to "remember" if the <file-name>
  # changes to print its name and to do that only
  # once per file:
  file=""
  
  # we'll be printing line numbers too; the non-consecutive
  # ones will be marked with the special line with vertical
  # dots; let's have a variable to keep track of the last
  # line number:
  ln=0
  
  # we also need to know we've just encountered a new file
  # not to print these vertical dots in such case:
  filestarted=0
}

# let's process every line except the "--" separators and the
# "Binary file ... matches" notices that grep itself prints:
!/^(--$|Binary file )/ {

  # remember: $1 is the first column which in our case is
  # the <file-name> part; The file variable is used to
  # store the file name recently processed; if the ones 
  # don't match up - then we know we encountered a new
  # file name:
  if($1 != file && $1 != "")
  {
    file=$1
    print "\n" $1 ":"
    ln = $2
    filestarted=0
  }

  # if the line number isn't greater than the last one by
  # one then we're dealing with the result from non-consecutive
  # line; let's mark it with vertical dots:
  if($2 > ln + 1 && filestarted != 0)
  {
    print "⁞"
  }

  # the substr function returns a substring of a given one
  # starting at a given index; we need to print out the
  # search result found in a file; here's a gotcha: the results
  # may contain the ':' character as well! simply printing
  # $3 could potentially leave out some portions of it;
  # this is why we're using the whole line, cutting off the
  # "<file-name>:<line-number>:" prefix we know we don't need:
  out=substr($0, length($1 ":" $2 ":") + 1)

  # let's deal with only the lines that make sense:
  if($2 >= ln && $2 != "")
  {
    # sprintf function matches the one found in C lang;
    # here we're making sure the line numbers are properly
    # spaced:
    linum=sprintf("%-4s", $2)
    
    # print <line-number> <found-string>
    print linum " " out
    
    # assign last line number for later use
    ln=$2
    
    # ensure that we know that we "started" current file:
    filestarted=1
  }
}

Notice that the "middle" part of the script (the one with the patterns and actions) gets run in an implicit loop, once for each input line.

To use the above awk script you could wrap it up with the following shell script:

#!/bin/bash

egrep -nR "$@" | awk -f script.awk

Here we're very trivially (and somewhat naively) passing all the arguments given to the script on to egrep with the use of "$@". The quotes keep arguments that contain spaces intact.

This of course is a simple solution. Some care needs to be applied when trying to make it work with the -A, -B and -C switches, though that's not difficult either. With those switches grep prints context lines as [filename]-[line-number]-[text] rather than with colons. All it takes is to e.g. pipe the output through sed (another great Unix tool, the "stream editor") to replace the '-' separators in the [filename]-[line-number]- part with ':', matching our assumption of having ":" as the separator in the awk script.
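One way this normalization could look is sketched below. It is a sketch under a loud assumption: file paths contain neither ':' nor '-', otherwise the expression would misfire. The function name `normalize_context` and the sample lines are made up for the example.

```shell
# Rewrite grep context lines of the form <file>-<line>-<text> into
# <file>:<line>:<text>. ASSUMPTION: paths contain no ':' or '-'.
normalize_context() {
  sed -E 's/^([^:-]+)-([0-9]+)-/\1:\2:/'
}

# Hypothetical grep -n -B1 output: a context line, a match, a separator.
printf '%s\n' \
  'src/worker.rs-23-    // setup' \
  'src/worker.rs:24:    opts: Options,' \
  '--' | normalize_context
# src/worker.rs:23:    // setup
# src/worker.rs:24:    opts: Options,
# --
```

Matching lines and the "--" group separators pass through unchanged, so the filter can sit between egrep and awk in the wrapper script.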

In praise of "what-already-works"

A simple script like the one shown above could easily be placed in your GitHub, BitBucket or GitLab account and fetched with curl on whichever machine you're working on. With one call to curl and maybe another one to put the scripts somewhere in the local PATH, you'd gain a productivity-enhancing tool that requires nothing beyond what you already have.

I'll keep learning "what we already have" to not fall too much into "what's hot and new" unnecessarily.