
A Digital Ecommerce Transformation – Front End Madness – Part II

Part II – Start at the beginning with Part I

In 2010, the cloud was not new but it was ignored by large companies, particularly in non-technology focused segments such as retail. While it is clear now, at the time large retailers had not yet awakened to the new reality that a company’s prowess in software might decide its future outcomes.

As an example, in my first year at TWLER (The World’s Largest Electronics Retailer), store sales during December were not going well. Since many retailers make 50% or more of their annual sales in November and December, this spelled impending disaster. Traffic was down in stores and the company’s reaction was to propose that the digital channel stop its free shipping offer. This clearly showed the company leadership’s inability to fathom online shopping. The logic was that if a customer could buy it online and have it shipped for free, they would not go to a TWLER store. That seems reasonable, but if the customer was shopping online and we did not offer free shipping, they would simply click to the next store, probably Amazon, and buy their electronics there with free shipping. This customer was not going to a TWLER store that holiday season, ever. We were actually saving sales for the company; it just wasn’t understood.

A minor digression on the state of the ATG system is necessary to understand what we were dealing with. The original ATG system was built in 2003. At the time, it was an excellent decision for a mid-sized retailer building its first ecommerce engine. But over time, and through numerous one-off projects, the codebase had morphed, become intertwined and been generally neglected and abused. As an example, one ongoing project when I arrived was to widen the product detail page (PDP) and move the Add To Cart button from the left side of the page to the right side. This seemed fairly innocuous, but it took six months and well over one million dollars to accomplish. That seemed ridiculous to me, so as an architect I dug into why this was happening.

It turns out there were multiple reasons why this project was practically impossible to complete. To start with, there were nine separate versions of the PDP, each made for a different category such as TVs, Music, Computers, etc. The nine PDPs all had common origins in some ancestral PDP, but after years of projects aimed at the individual categories, they had all strayed in different ways, including using different JavaScript frameworks and versions for dynamic page elements. These PDPs were written in JSP/JavaScript and were each well over 10,000 lines long, with actual Java intermixed into the JSPs themselves. Imagine trying to figure out how to change nine different pages, all implemented slightly differently in monstrous JSP files, with no test automation to determine if you broke anything in the process.

This sounds bad enough, but the executable for ATG was in the GB range, built as an EAR with a special Ant file that only one or two people understood (more on this later). It was necessary to build and run the entire EAR to determine if the page changes worked, since the JSP code was so intertwined with the server-side code. However, it was impossible to actually run this EAR on a developer’s machine because it also required a full working copy of the Oracle database. No one had yet figured out how to make all these things work on a single desktop or laptop machine.

Instead, there was one shared development server for the entire Dotcom division. This server was a large Unix box but still too small to serve thousands of developers trying to build and run the ATG codebase. This server alone routinely failed due to lack of disk space and not enough CPU. But, it was the only place to build and run code you had worked on in your IDE, so everyone had to deal with it.

If you weren’t crying yet, the next step was to actually deploy the code to the staging environment (skipping the integration environment altogether) because the reality was the front end code only worked if it could access the Internet as there were so many externally downloaded components to the page. Even though you deployed the code to the shared developer environment, you couldn’t actually run it there. The staging build happened once every night.

To sum this up, the normal front end development cycle is: change some code, save it locally, have it automatically picked up by your running app server, and test it. This cycle time should be in the seconds range so you can work quickly and efficiently through all the little tweaks necessary to make a UI page look good and work as expected. The cycle at TWLER was: change some code, save it locally, do your best to make sure the page compiled, check in the code, push to your developer environment, do your best to check it compiled, wait for the overnight stage push (go home), then come back the next morning and see if the change worked in stage (assuming the push didn’t fail, which it often did). Instead of a cycle time of seconds for each code change, the cycle time was one day. One entire 24-hour day!
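To put rough numbers on that difference, here is a minimal back-of-the-envelope sketch. The 8-hour working day and the 30-second cycle are my assumptions for illustration, not measurements from TWLER; the one-push-per-day figure comes from the nightly staging build described above.

```python
# Assumption: an 8-hour working day, expressed in seconds.
WORKDAY_SECONDS = 8 * 60 * 60


def iterations_per_day(cycle_seconds):
    """How many change-and-test cycles fit into one working day."""
    return WORKDAY_SECONDS // cycle_seconds


# Assumption: a healthy local edit/reload/test loop takes about 30 seconds.
fast_loop = iterations_per_day(30)

# From the text: the staging build ran once every night, so one try per day.
overnight_loop = 1

print(f"local hot-reload loop: about {fast_loop} tries per day")
print(f"overnight stage push:  {overnight_loop} try per day")
```

Under those assumptions, a developer with a working local loop gets hundreds of attempts at a UI tweak in the time the TWLER process allowed one.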

Did I mention zero automated regression tests?

Now I bet you think that $1M was cheap. In fact, I still don’t know how anyone actually got any work done in these conditions, but I do know that the churn in the front end development team was enormous.

GOTO Part III

A Digital eCommerce Transformation – A Multipart Series

It took six long years, but we successfully transformed a monolithic 10-year-old ATG-based commerce system into a completely distributed, infinitely scaling eCommerce platform. I’ll call the place TWLER, or The World’s Largest Electronics Retailer.

I joined TWLER in 2010 as a Hadoop Architect on the small team that was tasked with rewriting TWLER.com. I interviewed in May and was offered the role, but didn’t start until July. About two weeks before I started, the VP called me and let me know that the Hadoop project was cancelled and that if I declined the offer they would understand. I had spent the last two years learning about Hadoop and installing one of the first Hadoop clusters in the Twin Cities on a bunch of old Dell towers for a company called Peoplenet. The VP there had the idea, back in 2008, that they would build a remote data collection system that could take in one million messages per second. I tried to start up a Hadoop consulting arm for the consulting company I was working for at the time, Object Partners. However, it was a bit too early in the Twin Cities for companies to be interested in BigData and NoSQL. We spent a lot of time talking to companies about the technology, but no contracts were forthcoming. I put together an Introduction to Hadoop presentation and gave it numerous times, all to no avail. So when TWLER came calling with a Hadoop Architect position, I jumped at the chance to actually use Hadoop at a large company. All that to say I was quite disappointed when the role fell through. I seriously considered declining the position, as I had not yet resigned, and after two years of pushing a technology I did want to see Hadoop in action. But I was quite tired of my second stint in consulting and ready to move on.

So I arrived at TWLER with no assigned role. At the time we were five architects under an Operations Vice President, and I had no idea of the organizational politics that were hindering a major system rewrite. The cancelling of one project should have been a clue to the future: funds were already being removed from this team.

We started out under the tutelage of Michael N., a renowned architect who had worked on the initial version of TWLER.com. Our first task was to set up a modern development pipeline using Chef in AWS. A reminder that this was in 2010, when a large retailer doing anything in AWS was highly uncommon, and Chef was practically brand new.

The pipeline we stood up at the time was fairly standard: Atlassian Stash for Git, Crowd for user management, Confluence for knowledge management, Jira for issue tracking, Bamboo for continuous integration, and Artifactory for artifact storage and a Maven repo. We used Chef to completely automate the deployment of these tools into AWS. While none of these was new to TWLER, they had not been combined, made externally accessible or offered to anyone in the company who wanted to use them.

We used this infrastructure to start a small AWS-based project: a small site that could be used during outages. The site would allow customers to look up locations, products and prices, but full commerce capabilities would not be available. It would reside in the cloud, ready to be deployed within minutes. It says something that the first thing we built was an outage site: TWLER.com was not a stable platform at the time.

GOTO Part II

What Do Architects Do? – Part 2

Governance or Strategy

I’ve fielded many questions lately on how I am governing and reviewing the architectures of Target to make sure they conform to enterprise standards. This is a common question asked of Architecture teams. After all, many people believe the main responsibility of Enterprise Architecture is governance.

But governance is the last thing I like to discuss about what Architects actually do. Governance, or discussions about being governed, means that I’ve actually failed to deliver a simple, clear and implementable technology strategy. If I’ve clearly communicated a technology strategy with desired, but not necessarily mandated, system structure, implementation stacks and supporting platforms, then engineers will happily fit into the architecture to deliver their systems.

Let’s unpack that last statement and define point #2 of what architects do:

Answer #2: Architects define the types of systems and their boundaries that are possible within an organization.

When we discuss technology strategy, my goal is to deliver composable systems which meet current and future business demands. The key to this statement is that we will meet unknown future demands without rebuilding our current systems, or having to build large new systems. There’s an assumption here that your future unknown demands are extensions of your current capabilities. If you’re headed into new business models, you’ll likely need a few new systems.

In today’s world, unknown future demands are presented every day, and the expectation is that they can be delivered in days, weeks or months, not years. For an organization to stay relevant with their consumers, a technology strategy must meet this demand for speed above all else. To achieve speed at scale the systems are constrained to deliver one thing only. A system is generally composed of services (or microservices if you like) that deliver either data or process through an API. The only governance that architects do is to ensure that data or process is comprehensive and unique. Data systems are comprehensive when they are complete for an organization, and take all changes generated throughout the organization. Process systems are comprehensive when they contain the basic services necessary to complete a process, and can be flexibly orchestrated by any entity within the organization.

Done correctly, an organization is composed of hundreds of APIs that each implement a narrow set of functionality. Systems don’t continue to expand to take on more business demands and processes. Actually, systems never want to do this; people expand systems to become more important in their organizations. Governance falls out of strategy: the only governance necessary is ensuring teams don’t construct competing data systems or alternative processes. If the current systems don’t meet their needs, fix them by doing the work and submitting a pull request. That’s also much easier than standing up a new system, and engineers generally like this solution.
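As a minimal sketch of what that one governance check could look like, assume a simple catalog where each API declares the single data set or process it delivers. The `Api` record, the `find_competing` helper and every service name below are hypothetical, invented for illustration; they do not describe any real catalog.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Api:
    name: str        # hypothetical service name
    kind: str        # "data" or "process"
    capability: str  # the one thing this API delivers


def find_competing(apis):
    """Return every capability that more than one API claims to deliver."""
    owners = defaultdict(list)
    for api in apis:
        owners[(api.kind, api.capability)].append(api.name)
    return {key: names for key, names in owners.items() if len(names) > 1}


# Hypothetical catalog: the last entry duplicates the prices data system.
catalog = [
    Api("item-api", "data", "items"),
    Api("price-api", "data", "prices"),
    Api("checkout-api", "process", "checkout"),
    Api("promo-price-api", "data", "prices"),
]

conflicts = find_competing(catalog)
for (kind, capability), names in conflicts.items():
    print(f"competing {kind} systems for '{capability}': {', '.join(names)}")
```

The check only covers uniqueness. Whether a data or process system is comprehensive is a judgment call that a script like this cannot make.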

What Do Architects Do? – Part 1

Having been Chief Architect for Best Buy and currently Chief Architect for Target, I get this question frequently.  The question occasionally comes at your neighborhood party but generally it’s from people in Engineering or Marketing.  At the neighborhood party you attempt to answer this question at risk of becoming known as the “boring tech guy.”

What do architects do?

People ask this question because they truly don’t understand.  They are really asking “Are architects necessary?”  They ask this question because they have rarely seen value from the architects they’ve seen in the past.  This is the sad reality of much of the architecture world, enterprise architecture in particular.  My opinion on why architecture has devolved to a place where many companies are eliminating the practice altogether is simple.  Most architects in the upper echelons of companies were never software engineers.

This is important, and is a point I harp on continuously, because if you haven’t spent your 10 years writing code and building and running systems with higher and higher business complexity, you cannot do the first thing architects should do.

Answer #1:  Architects create the environment for engineering culture to thrive.

To create an engineering-centric culture, you have to have been an engineer.  You have to have a few large scale Agile/DevOps systems under your belt.  You have to understand what drives and motivates engineers that want to work on six person teams tackling the toughest problems facing enterprises.  You need to feel it in your gut when one person on the team is a hack and can’t pull their weight and your management won’t address the problem.  You must have found the rock and through sheer force of will, pushed it up the hill, leading in such a way that the rest of the team helps push it along with you.   If you haven’t done these things over the course of years, you haven’t been a software engineer.

Architects who understand how engineers want to work spend all of their time and energy creating an environment where hard problems are solved, new solutions are found, and everyone sleeps well at night.  Architects make decisions based on whether engineers will understand them and choose them regardless of what else is available.

Architects also create the constraints that allow engineers to solve their problems quickly.  With the near-infinite array of tools, frameworks and packages available, enterprises need to limit the scope of technology in some way to remain economically competitive.  Make no mistake, architects select and limit technologies, but only those that involve large expenditures.  If a technology selection will cost a company in excess of $1M for licenses, subscriptions or maintenance over a five-year lifespan, architects should be involved.
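That rule of thumb is simple enough to write down.  The sketch below assumes flat yearly costs, and the dollar figures in the two examples are made up for illustration; only the $1M threshold and the five-year lifespan come from the paragraph above.

```python
REVIEW_THRESHOLD_USD = 1_000_000  # from the text: "in excess of $1M"
LIFESPAN_YEARS = 5                # from the text: a five-year lifespan


def lifespan_cost(licenses, subscriptions, maintenance, years=LIFESPAN_YEARS):
    """Total cost over the lifespan, assuming flat yearly costs."""
    return (licenses + subscriptions + maintenance) * years


def needs_architect_review(total_cost):
    """True when the total strictly exceeds the review threshold."""
    return total_cost > REVIEW_THRESHOLD_USD


# Hypothetical yearly costs in USD, for illustration only.
big_platform = lifespan_cost(licenses=150_000, subscriptions=40_000, maintenance=30_000)
small_tool = lifespan_cost(licenses=50_000, subscriptions=0, maintenance=10_000)

print(big_platform, needs_architect_review(big_platform))
print(small_tool, needs_architect_review(small_tool))
```

Anything under the threshold stays a team decision, which is the point: the constraint applies only where the expenditure is large.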

Architects essentially act as the aggregated will of the engineers.  Architects are there to make engineers’ lives easier.  Architects in effect are servants of both the engineers and the enterprise, and walk the fine line that brings the maximum value to both.