News

YAPC::NA 2006 Conference Report

End Pointers Jon Jensen and I, along with 450-500 other Perl enthusiasts, descended on the campus of IIT, the Illinois Institute of Technology in Chicago, for three days at the end of June for the annual North American edition of what is affectionately known by the Perl community as Yet Another Perl Conference (YAPC).

This year's conference had three main focuses covered by four tracks of talks: Web 2.0 (as the hypesters like to call it), software development methodology improvements, and Perl 6 -- the future of the language. Participants and speakers had a range of experience with Perl and varied backgrounds, from experts working for the Perl Foundation on the forefront of Perl 6 development to beginners finding out how best to implement their first assignment.

Perl 6, while not yet practical for daily use, is advancing the language at the heart of most of End Point's development. It represents advancements in language design that will likely bring Perl back to the forefront of dynamic language syntax and research. But Perl 5 remains in the practical lead it has long held, with the amazingly wide-ranging and useful CPAN, and it is still progressing in research as well. One talk covered the upcoming release of Perl 5.10, which incorporates some of the much-anticipated improvements from Perl 6. The Moose object system now brings much of the Perl 6 object design philosophy to Perl 5 developers as well.
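
For readers who haven't seen it, here is a rough sketch of what a Moose-based class looks like in Perl 5; the class, attributes, and method below are invented for illustration:

    # Minimal Moose sketch: declarative attributes with types and defaults.
    package BankAccount;
    use Moose;

    has 'owner'   => (is => 'ro', isa => 'Str', required => 1);
    has 'balance' => (is => 'rw', isa => 'Num', default  => 0);

    sub deposit {
        my ($self, $amount) = @_;
        $self->balance($self->balance + $amount);
    }

    package main;
    my $account = BankAccount->new(owner => 'Alice');
    $account->deposit(25);
    print $account->balance, "\n";   # prints 25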

"Web 2.0", on the other hand, has fully arrived, and at End Point we have already deployed sites using several of these techniques to improve interactivity and user experience. Doing so does not require a fundamental shift to a new framework or interface design methodology; we have been incorporating many newer techniques into the already mature code base of Interchange. Talks on JavaScript, AJAX, JSON, and JSAN (the JavaScript equivalent of CPAN), provided a solid foundation for user experience improvements. Talks about Catalyst, CGI::Application, Solstice, Jifty, and POE, showed how varied server-side application frameworks can help developers achieve their goals more quickly and with higher quality. Additional talks about technologies underlying these frameworks rounded out the details, particularly along these lines: MVC (model-view-controller) web development and templating systems such as Mason, object implementations such as the aforementioned Moose and inside-out objects, object/relational database mappers such as UR, and direct object storage with Presto.

Running through all of the talks was a central emphasis on design quality, methodology improvement, and enterprise-class application development. Several talks showed how source code could be better managed using the increasingly popular Subversion source code control system, with svk providing offline repository management on the client side. Others detailed how to test applications for improved quality assurance using automated testing platforms such as Test::WWW::Simple and Selenium, a browser-based framework for testing web applications on multiple platforms in multiple browsers. Still others were less technical in nature and were tailored toward improving the developer's general workflow through all phases of project implementation. One of these was a talk about getting out of "technical debt", which compared what happens when a software project ignores crucial elements such as testing, documentation, and backups to personal financial bankruptcy, with the so-called interest accumulating on top of the work already needed to meet the stated goal. Another focused on common mistakes made when trying to scale systems to an arbitrarily large size.
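
To give a flavor of the automated-testing talks, here is a small sketch of a web test written with Test::More and LWP::UserAgent rather than the specific frameworks named above; the URL and expected text are placeholders:

    #!/usr/bin/perl
    # Illustrative automated web test: fetch a page and assert on the response.
    use strict;
    use warnings;
    use Test::More tests => 3;
    use LWP::UserAgent;

    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get('http://www.example.com/');

    ok($res->is_success,                    'home page fetched');
    is($res->content_type, 'text/html',     'served as HTML');
    like($res->content, qr/Example Domain/, 'expected text present');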

But YAPC isn't just about the technical talks. It is also about building and improving the community surrounding the language, which is always so fundamental to the success of open source software. That community grows through informal Birds of a Feather gatherings, whether a social trip to see the sights of the city from the top of the Sears Tower or an evening at scenic Millennium Park hearing the amazing sounds of musicians brought there by Yo-Yo Ma and the Silk Road Ensemble. It also grows through a job fair open to companies of all sizes, where Perl developers can seek out new mediums of expression and companies can find just the right person to take their next project to completion. The sponsors of, presenters at, and attendees of YAPC know that it is the experience as a whole that is the real reason to go.

More information can be found at the conference website, yapcchicago.org.

Review: Practices of an Agile Developer

Practices of an Agile Developer
Venkat Subramaniam and Andy Hunt
Pragmatic Bookshelf, 2006

Practices of an Agile Developer is a fast, enjoyable, often-amusing read. It's also full of wisdom and excellent advice for engineers of all levels, technical managers, and just about anyone with an interest in the software development process. The book makes for a great companion to The Pragmatic Programmer (also co-authored by Andy Hunt), but it is more concise and focuses more on bigger-picture development practices than that often implementation-oriented book does. The authors clearly outline a number of agile methodologies, with attention given to tasks ranging from managing teams, managing time, and keeping estimates realistic, to integration automation, release management, and test-oriented development.

Of particular interest are topics that relate to developer/client interaction, which could apply to a variety of professional relationships (consultant and client; development team and sales team; etc.). The practices outlined present a path to greater success for all parties, with a greater sense of ownership for the client and a greater sense of satisfaction for the developer. Sounds pretty good, doesn't it?

A common scenario in developer/client requirements gathering and project planning involves the production of elaborate specifications of enormous scope; the client is supposed to approve the requirements, effectively saying "yes, this represents everything I want, the way I want it", while the developer must then both estimate the time for completion and, assuming the estimate is accepted, move forward with implementation. In a consulting situation, the client may require a fixed price for the work, rather than just an estimate. This scenario seems perfectly rational, as the requirements represent a point of consensus around which work will be done, bills paid, and so on.

Or so it may seem, until one has been down that road.

Software development is really the business of change; the authors take great pains to illustrate this repeatedly, and they make their point effectively. While requirements and project scope are established, neither the client nor the developer possesses perfect information or a complete understanding of the possibilities. Usability studies may reveal a better way to approach certain problems; market research may indicate more profitable ways of selling goods; any such discovery may necessitate a requirements change on the client's part. Furthermore, during project development, all parties may observe new possibilities implicit in the growing system -- possibilities that would never occur to developer or client without the benefit of actual working examples. Given tightly specified requirements and strict budgetary constraints, these new discoveries and possibilities cannot be pursued or easily integrated into the project. Software, and its theoretical ability to accommodate real-world change, is stifled by this development approach.

The Agile Developer authors instead favor a model that could be described as a "feedback loop"; the development team works in short, manageable iterations lasting from a few days to perhaps a couple of weeks, and at the end of each iteration provides tangible deliverables to the client. This provides the client with the opportunity to review the results and play a part in steering where the next iteration will go. While a master set of goals or desirables may guide the entire process, the individual iteration is the largest chunk of development work ever pursued at any one time; estimates can be based on each iteration, and each successful iteration brings the software closer to the final goals. Most critically, the process allows all parties to account for change as new information and experience inevitably bring new desires and ideas.

Such an approach fits nicely into other development methodologies advocated by the authors, guided by axioms like "keep it releasable" (indicating that a developer should never go home for the day leaving code in a broken state), "integrate early; integrate often" (effectively requiring developers to work in small chunks that continuously get merged back into the main project code base), and a variety of suggestions regarding test-driven development practices (develop test cases before implementing actual systems as a means of establishing implementation/interface requirements, automate execution of test cases as part of the continuous build process, treat failed tests as bugs, etc.).
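
As a minimal sketch of the test-first idea, a developer might write a test like the following before the code it exercises even exists; the Order module and its total function are hypothetical:

    #!/usr/bin/perl
    # This test fails until Order::total is implemented, and afterward guards
    # against regressions when run automatically as part of the build.
    use strict;
    use warnings;
    use Test::More tests => 2;

    use_ok('Order');
    is(Order::total({ subtotal => 100, tax_rate => 0.08 }), 108, 'total adds tax');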

Ultimately, Practices of an Agile Developer illustrates just how subject to change all aspects of the software development process can be, and offers compelling ways to build the management of change into your systems and methodologies. Such methods become all the more valuable as companies and individuals increasingly look to the web as a means of delivering a service, and services increasingly appear as unique web-based applications which need to organically grow with their user community, rather than in discrete, versioned releases to be consumed and re-consumed by users over time. Check out this book if you haven't already; it highlights a great approach to which anyone involved in the software development cycle can aspire.

Trouble with MySQL 4.1 under heavy load

Two of our customers running high-traffic websites backed by MySQL (one using PHP, the other using Perl and Interchange) recently ran into serious problems with their MySQL server getting extremely slow or ceasing to respond altogether. In both cases the problem happened under heavy load, with MySQL 4.1 running on Red Hat Enterprise Linux 4 (i386) with the MySQL RPMs built by Red Hat, installed from Red Hat Network. In both cases, we were unable to find any log output or traffic patterns that indicated the cause of the problem.

When this happened on the first server, we tried numerous MySQL configuration changes, and wrote scripts to monitor the MySQL daemon and restart it when it failed, to give us time to investigate the problem fully. But eventually, out of expediency, we simply upgraded to MySQL 5.0 with RPMs provided by the creators of MySQL. Doing so immediately fixed the problem. About a month later, when another client encountered the same problem, we went straight for the upgrade path, and it fixed things there too.
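
A watchdog of that sort can be as simple as the sketch below; the DSN, credentials, timeout, and restart command are placeholders rather than the scripts we actually ran:

    #!/usr/bin/perl
    # Sketch of a MySQL watchdog: ping the server, restart it if unresponsive.
    use strict;
    use warnings;
    use DBI;

    my $alive = eval {
        local $SIG{ALRM} = sub { die "timed out\n" };
        alarm 30;
        my $dbh = DBI->connect('dbi:mysql:database=test;host=localhost',
                               'monitor', 'secret',
                               { RaiseError => 1, PrintError => 0 });
        my $ok = $dbh->ping;
        $dbh->disconnect;
        alarm 0;
        $ok;
    };
    alarm 0;

    unless ($alive) {
        warn 'MySQL unresponsive: ' . ($@ || 'ping failed') . "\n";
        system('/sbin/service', 'mysqld', 'restart');
    }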

We haven't had trouble like this with MySQL before, from any source. I filed a bug report in Red Hat's bug tracker and Tom Lane quickly pointed me to another similar bug.

Apparently the latest RHN update of MySQL 4.1.20 came just a little too late for our first encounter with this problem, and the MySQL 5.0.21 release we upgraded to already had the fix in it. It sounds like using the latest MySQL for RHEL 4 from RHN should now work. If you're seeing similar problems, we hope our experience will be of use to you.

At least we can report that the upgrades to MySQL 5.0 were trouble-free. Using nonstandard MySQL client libraries requires you to build new php-mysql, perl-DBD-MySQL, and other dependent RPMs to match, but that's not too hard and is worth the effort if you want to use features from the newer version.

Interchange 5.4.1 released

Interchange 5.4.1 was released today. This is a maintenance update of the web application server End Point has long supported.

There were a few important bugfixes:

  • Dan Collis-Puro fixed bugs in the ITL (Interchange Tag Language) parser that can cause an infinite loop when malformed ITL opening tags are encountered.
  • Stefan Hornburg fixed a regression in the htmlarea widget (word processor-style editing in web browsers) which kept it from working with Microsoft Internet Explorer and browsers claiming to be compatible.
  • Brian Miller fixed an obscure profile parsing bug that kicked in when a comment immediately precedes the __NAME__ identifier.
  • Kevin Walsh changed the default robot detection settings to check for "GoogleBot", rather than just "Google", to prevent false positive matches with other user agent values such as "GoogleToolbar".
  • Josh Lavin and Mike Heins made improvements to the Linkpoint and VeriSign payment modules.
  • Ryan Perry added support for recent versions of mod_perl 2 to Interchange::Link.
  • UPS shipping rates were updated in the Standard demo.
  • There were also numerous other minor bugfixes in core code, tags, the admin, and the Standard demo.

We invite you to learn more about Interchange if you're not familiar with it.

PostgreSQL Supports Two-Phase Commit

The recent release of the 8.1 version of PostgreSQL provides a new feature that offers even more flexibility and scaling for End Point clients. That new feature is the support for "two-phase commit". But what is this feature, and why would you use it?

First, let's explore how database management systems (DBMS) do their jobs. Consider a simple example of editing a file. You open a file with a text editor or word processor. You make some changes. You save those changes. You close the file.

A little more complexity enters the picture if you make some changes, change your mind, and close the file without saving it. Now the file is in the original state. Even more complexity: what if the file is shared between two (or more) users, for instance, by use of a network system with shared drives?

Now we have to consider how the operations on the file occur over time. If Alan opens the file at 9:00 and starts making changes, but Becky opens the same file at 9:05, what file will Becky see? It depends on when Alan saves his changes. If he saves them at 9:04, then Becky sees those changes. If he saves at 9:06, Becky sees the original file.

If Becky is allowed to make changes to that file, what condition will it be in when she is finished? Again, it depends on when Alan saved his changes. If Alan saves at 9:04, Becky's changes will include Alan's. If Alan saves at 9:06, and Becky saves before that, then Becky's changes are lost. In fact, under these circumstances, unless Alan saves before Becky opens, whoever saves first loses: the file will reflect only the changes applied by the last user.

For this reason, modern network systems allow for the locking of files, so that the first user to open the file for editing gains a virtual lock, preventing later users from opening the file (to make changes; most modern network systems allow viewing the file in its pre-locked condition).

The same kind of approach is used by the modern DBMS, but the locking is more complex, as it can affect an entire table, or just one or more records (or rows) in that table. At the risk of over-simplifying: reading a row (or table) locks that row (or table) against alteration, until the reader's transaction (a collection of database operations intended to be performed as a unit) completes. The successful end of a database transaction is called a "commit". It's at this point that changes are "committed" to the database; not only are they no longer under the control of the transaction's owner, but the changes are now "visible" to other users of the DBMS.
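
In Perl, for example, a transaction against a single database might be sketched like this (the DSN and the accounts table are invented for illustration):

    #!/usr/bin/perl
    # Sketch of an ordinary single-database transaction with DBI.
    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=bank', 'appuser', '',
                           { RaiseError => 1, AutoCommit => 0 });

    eval {
        $dbh->do('UPDATE accounts SET balance = balance - 25 WHERE owner = ?',
                 undef, 'alan');
        $dbh->do('UPDATE accounts SET balance = balance + 25 WHERE owner = ?',
                 undef, 'becky');
        $dbh->commit;    # both changes become visible to other users at once
    };
    if ($@) {
        warn "transaction failed, rolling back: $@";
        $dbh->rollback;  # neither change is applied
    }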

Even more complexity enters the picture when two DBMSs must communicate and maintain data consistency between themselves. This kind of operation is known as a "distributed transaction". The classic example is a transfer of money between two accounts in two different banks. For instance:

  • My account at First Bank contains $100.
  • Your account at Second Bank contains $50.
  • I wish to transfer $25 from my account to yours.

The balance at First Bank is altered by subtracting $25. Then the balance at Second Bank is altered by adding $25. At the end of the transaction, both accounts should contain $75. If the transaction fails at either bank, then my account should contain $100, and yours $50. Any other condition is an error, and gets the bank examiners very excited.

Transactions (in both the database and banking sense of the word!) can be interrupted by many things, but let's take the real-world example of a power interruption. If the power goes off while money is being removed from my account, I'd really like to have that money back in the account when the power goes on. Likewise, if the power goes off while money is being put into your account, that money shouldn't be there when the power is restored.

If the database transaction at First Bank is completed, so the money is out of my account, and the transaction at Second Bank is interrupted, then my money has "vanished", and you didn't get it. If the transaction at First Bank is made to wait until Second's transaction completes, and THEN the power at First goes off, then I'll have an "extra" $25 in my account that shouldn't be there (my balance would still read $100 when it should read $75).

All this leads us to conclude that a plain "commit" at one database followed by a plain "commit" at the other is just not sufficient. For this, the concept of a "two-phase commit" evolved.

In a two-phase commit protocol, one of the databases acts as the coordinator for the distributed transaction. It starts the transaction at its side, then sends out a "prepare" message to the other database. The message usually contains a unique "transaction ID" which will appear in all subsequent messages, so the two databases know which distributed transaction to synchronize.

Both the sending and the receiving of a "prepare" message mean that the associated transaction is "almost committed": in most DBMSs (and PostgreSQL is consistent here), the transaction is written out to disk, available to be re-applied after a database interruption.

A two-phase commit allows both databases to keep the details of the transaction on disk, but not yet committed (and thus invisible to other users of the database). When the two databases are each satisfied that the other has a permanent record of the transaction, then they can commit their part. If either side crashes, that database can look through its stored history of "almost committed" data: a sort of "while you were out" record.
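
Putting the pieces together, the bank transfer above might be sketched from Perl using PostgreSQL 8.1's two-phase commit like this. The database names, the accounts table, and the global transaction ID are invented, and real code would also need error handling that issues ROLLBACK PREPARED when the prepare phase fails on either side:

    #!/usr/bin/perl
    # Sketch only: a money transfer using two-phase commit via DBD::Pg.
    use strict;
    use warnings;
    use DBI;

    my $gid = 'xfer-2006-0001';    # global transaction ID shared by both sides

    my %dsn = (
        first  => 'dbi:Pg:dbname=first_bank',
        second => 'dbi:Pg:dbname=second_bank',
    );

    # Phase one: do the work on each database, then PREPARE instead of COMMIT.
    for my $bank (qw(first second)) {
        my $dbh = DBI->connect($dsn{$bank}, 'appuser', '',
                               { RaiseError => 1, AutoCommit => 0 });
        my ($owner, $amount) = $bank eq 'first' ? ('me', -25) : ('you', 25);
        $dbh->do('UPDATE accounts SET balance = balance + ? WHERE owner = ?',
                 undef, $amount, $owner);
        $dbh->do("PREPARE TRANSACTION '$gid'");  # durable on disk, not yet visible
        $dbh->disconnect;                        # the prepared transaction outlives the session
    }

    # Phase two: both sides prepared successfully, so commit each prepared
    # transaction. This could even run from a new session after a crash.
    for my $bank (qw(first second)) {
        my $dbh = DBI->connect($dsn{$bank}, 'appuser', '',
                               { RaiseError => 1, AutoCommit => 1 });
        $dbh->do("COMMIT PREPARED '$gid'");
        $dbh->disconnect;
    }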

For most PostgreSQL installations, in which a single database instance holds all the users' data, the two-phase commit isn't critical. But in larger installations, where data may be distributed (and partially replicated) on two or more platforms, the two-phase commit supplied by PostgreSQL 8.1 keeps everything in sync.

PostgreSQL Master Joins Crew

Greg Sabino Mullane joined End Point in October and brings with him a broad expanse of knowledge regarding databases of all kinds. But while Greg is highly proficient in working with MySQL, Oracle, and Microsoft SQL Server, his greatest database passion is for PostgreSQL. According to Greg, "PostgreSQL is a true open-source alternative to Oracle. It's powerful, easy to use, highly extensible, very standards-compliant, and has a support community Oracle and the others can't match."

Greg is one of just a few "major developers" in the PostgreSQL community and has contributed substantial code to the current PostgreSQL product. He's also the primary developer of the DBD::Pg module, which bridges PostgreSQL and his favorite coding language, Perl. Says Greg, "It's great working for a company like End Point that recognizes the advantages of PostgreSQL and actively encourages its use."

Greg is also skilled in developing compact, efficient code. He recently rewrote and expanded database update scripts to connect to a remote SOAP server for an End Point client, improving the project's efficiency by refactoring the existing code into a single object-oriented module.

Before coming to End Point, Greg worked at the National Human Genome Research Institute, a part of the National Institutes of Health, in Bethesda, Maryland. While there, he wrote complex Perl scripts and worked with very large datasets. His work contributed to the institute's efforts to better understand the human genome and to such aims as developing gene therapy treatments to counter rare serious diseases.

End Point is excited about the kind of work that Greg has already proven himself capable of since joining the company.

(Author: Brian Dunn)

End Point Launches New Website

End Point rang in the new year with a brand new corporate website at endpoint.com!

This new site replaces a design that served End Point through five years of rapid growth. The new site will help propel the company into its next stage by highlighting End Point's depth and breadth of expertise, the company's dedication to providing excellent customer service, and our belief in innovative technological solutions.

Visitors to the site can now get a better, more-telling picture of End Point as a company, can learn the company's background, and can get to know the world-class developers who make up our engineering team.

As part of our devotion to technology, the site now includes regularly appearing Technology News updates that investigate new technologies and assess their usefulness and potential benefit for End Point clients.

We're excited about the way the new site represents our company and are proud of the positive message of broad capability and client satisfaction it conveys.

(Author: Brian Dunn)

End Point's New HQ

We've moved! To help accommodate our company's growth and progress, we've moved our headquarters to roomier offices in Manhattan's Flatiron District at 920 Broadway.

The new office provides End Point with a congenial meeting space as well as state-of-the-art work facilities for the engineers and other personnel working there. It also provides End Point with room for expansion that addresses the company's intent to hire additional developers in the New York area.

Our new full address is End Point Corporation, 920 Broadway Suite 701, New York, NY 10010. Our phone number remains (212) 929-6923.

(Author: Brian Dunn)

Interchange 5.4 Released

At the end of December 2005, a new major version of Interchange was released, making widely available the improvements developed over the previous year and a half.

While many of the hundreds of important changes are small and incremental, Interchange 5.4 offers a number of larger improvements as well:

  • Improved pre-fork server model supports higher traffic.
  • Extensible architecture improvements allow more customization (Feature, AccumulateCode, TagRepository, DispatchRoutines).
  • Shopping cart triggers have been added, for easier control over complex shopping cart behaviors.
  • Multiple "discount spaces" may be defined, for complex discounting schemes.
  • The "permanent more" facility allows shared pageable searches, for reduced database load and paging disk space.
  • The email interception feature reroutes outgoing email to one or more developer addresses, stopping email from accidentally going to real users during testing.
  • Quicker development of email functions using HTML parts or attached files.
  • A new demo application, called "Standard", was added.
  • Access to loop data in embedded Perl is now easier with the new $Row object.
  • User-defined subroutines can be accessed in more ways with the new $Sub object.
  • More payment gateways are supported, including an interface to CPAN's Business::OnlinePayment.
  • More languages are supported in the admin area.
  • ... and many other feature enhancements and bugfixes.

Ethan Rowe and Jon Jensen, two End Point engineers and members of the Interchange Development Group, added several of these new features based on work done earlier for our clients. We highly value the whole Interchange team's commitment to stability and reliability in the code, and to cooperation and ongoing improvement. In particular we appreciate the efforts of Mike Heins, Stefan Hornburg, and Davor Ocelic, whose regular contributions make Interchange's progress impressive. And Interchange would be weaker without the valuable work of Kevin Walsh, Ton Verhagen, Jonathan Clark, Dan Browning, Paul Vinciguerra, Ed LaFrance, and others.

We look forward to seeing this latest and greatest version of Interchange being used by the wider Interchange community.