The technical road to Journey Analytics is itself a fascinating and ongoing journey. When we talk about customer experience and other journeys, we talk about the series of events, or touchpoints, that occur over time. ClickFox is on a similar journey. Our technical journey is dotted with landmarks: critical points in time where decisions were made that continue to drive the evolution of the ClickFox platform. As with all journeys, we are always looking for opportunities to improve; in our case, that means performance and functionality. Just as we might tell a customer applying Journey Analytics to a digital experience, ClickFox’s own evolution is the product of iterative fine-tuning. With that in mind, we thought we’d reveal some of the milestones along the ClickFox technical journey to give you the inside scoop.
ClickFox got its start on Java, and that early decision has had important implications for our ability to be cross-platform: we’ve been able to migrate our core engine to new platforms over the years. Using Greenplum, a multi-node SQL database engine, together with user-defined functions written in Java and running on each node, we were able to parallelize massive workloads by pushing the processing to the data before this became a common practice.
It wasn’t long before we found that our original architecture had limitations in terms of scalability, the amount of hardware, the quantity of RAM needed, and other critical factors. With the industry moving away from more traditional SQL engines, we looked at migrating our code and our Journey Analytics engines to Hadoop.
A Hadoop of Our Own
The Hadoop ecosystem is relatively new in tech terms and is still evolving. New technologies and configurations for big data processing continue to emerge. Rather than being a burden, tuning such an environment to ClickFox’s needs created a unique opportunity for us. As our product has evolved, we’ve been able to identify the specific levers we need to pull to optimize performance.
Our singular approach to Hadoop benefits our clients. Enterprises have staff that read Hadoop books and take training on MapReduce, Spark, and the like. The application of that knowledge is something entirely different though. There are so many options to consider: size and content of datasets, data formats, cluster configurations, RAM, the number of nodes, the number of cores, the specific Hadoop distribution, and dozens of others.
Like many, we started our journey thinking Hadoop was a panacea for massive data processing. We’ve since learned its important limitations and designed for them. For example, we developed techniques for pre-caching data that allow us to prepare certain datasets in advance and optimize their use, providing performance gains. Unlike a production IT environment in an enterprise, the ClickFox team has had a driving mandate to figure out what works well for the very specific task of Journey Analytics.
Closing the Gap between Data and Understanding
Overcoming the data processing complexities of merging and transforming an enormous amount of data in our journey engine was a long and challenging road, but it has now become a rewarding game of fine-tuning. This is not an undertaking based on everyday data cleansing or ETL processes. Journey Analytics relies on a very complex series of merges and transformations organized as directed acyclic graphs (DAGs). ClickFox made the decision to have our platform build these DAGs on the fly, based on a user request, to do journey data processing. This enables ClickFox to merge many data sources, across channels, into a massive, combined dataset. The transformations then allow us to create a common language between those channels.
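To make the idea of building a DAG on the fly a little more concrete, here is a minimal sketch in Python. It is not ClickFox’s implementation; the step names (`load`, `normalize`, `merge`, `sessionize`) and the helper functions are hypothetical and chosen only for illustration. The sketch builds a dependency graph from the channels named in a request, then uses the standard library’s `graphlib.TopologicalSorter` (Python 3.9+) to find an order in which every step runs after the steps it depends on.

```python
from graphlib import TopologicalSorter


def build_plan(channels):
    """Build a dependency graph for one request.

    Keys are step names; values are the set of steps that must finish first.
    The step names are hypothetical, for illustration only.
    """
    graph = {}
    for channel in channels:
        graph[f"load:{channel}"] = set()
        graph[f"normalize:{channel}"] = {f"load:{channel}"}
    # The merge step waits for every channel to be normalized.
    graph["merge"] = {f"normalize:{channel}" for channel in channels}
    # A final step that depends only on the merged dataset.
    graph["sessionize"] = {"merge"}
    return graph


def execution_order(graph):
    """Return the steps in an order that respects every dependency.

    TopologicalSorter raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(graph).static_order())


# A request that names two channels yields a six-step plan.
plan = build_plan(["web", "ivr"])
order = execution_order(plan)
for step in order:
    print(step)
```

Because the graph is generated from the request, adding a channel adds its own load and normalize steps without touching the rest of the plan, and steps with no dependency on each other are free to run in parallel.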
The process is a lot like writing. You have words, which can be assembled into phrases, then sentences and paragraphs until you have a coherent narrative. That’s what we do: take small elements of data, merge and transform them, and in the process turn them from individual events and attributes into a journey with context that clearly shows meaning. That unique foundation is ultimately what enables ClickFox to deliver meaningful results to a business analyst or executive who has to make a decision.
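As a toy illustration of turning individual events into a journey, the Python sketch below groups events by customer, orders them in time, and maps channel-specific event names onto a shared vocabulary. Everything in it is an assumption made for the example: the event tuple layout, the `COMMON_NAMES` mapping, and the `build_journeys` helper are hypothetical and do not describe ClickFox’s actual data model.

```python
from collections import defaultdict

# Hypothetical mapping from (channel, raw event name) to a shared vocabulary.
COMMON_NAMES = {
    ("web", "page_view:billing"): "VIEW_BILL",
    ("ivr", "menu_3"): "VIEW_BILL",
    ("ivr", "agent_transfer"): "TALK_TO_AGENT",
}


def build_journeys(events):
    """Group (customer_id, timestamp, channel, raw_event) tuples into journeys.

    Each journey is a time-ordered list of (timestamp, channel, common_name).
    Events with no entry in COMMON_NAMES are labeled "OTHER".
    """
    journeys = defaultdict(list)
    for customer_id, ts, channel, raw in sorted(events, key=lambda e: (e[0], e[1])):
        name = COMMON_NAMES.get((channel, raw), "OTHER")
        journeys[customer_id].append((ts, channel, name))
    return dict(journeys)


# Events arrive out of order and from different channels.
events = [
    ("c1", 20, "ivr", "menu_3"),
    ("c1", 10, "web", "page_view:billing"),
    ("c1", 30, "ivr", "agent_transfer"),
    ("c2", 5, "web", "page_view:home"),
]
journeys = build_journeys(events)
for customer_id, steps in journeys.items():
    print(customer_id, [name for _, _, name in steps])
```

In this example, customer `c1` looks at a bill on the web, checks it again through the phone menu, and then asks for an agent; read together, the three events tell a story that none of them tells alone.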
Our technical journey never ends. We are transitioning many key capabilities from MapReduce to Spark to benefit from the performance gains of in-memory processing. We push every “off-the-shelf” technology we use to its limits and, in the process, discover algorithms, strategies, and combinations of techniques and technologies that make performance even better. With a great foundation beneath our Journey Analytics, we are able to make fine adjustments as part of our own journey to best serve our clients’ needs. We’ve done the work to build a platform designed for Journey Analytics so you don’t have to.