Category Archives: Twin Cities

A Digital Ecommerce Transformation – How to Avoid IT Integrators – Part XIX

Part XIX of a multipart series.  Start at the beginning with Part I.

As we eased into 2012 and we continued to build up our teams at TWLER.com, the IT leadership decided that they needed to get their integrators more information on the architecture and technology we were pursuing. Over the course of a few weeks, I proceeded to have 2+ hour meetings with all the major IT integrators, C* G*, I**, Acc*, T*T*, W*P* and a few others. I think they thought I didn’t understand the value brought by these integrators, and that talking to them about my architecture and direction for TWLER.com might somehow win me over. I proceeded to give my well-oiled architecture presentation to each of the integration teams. At this point I’d probably given this presentation a hundred times. The integrators had sent their top technical architects and salespeople to these meetings. They would have been better off without the salespeople waylaying the conversation, but the tech people tried to tone them down.

They were all very excited to hear about what we were building and eager to bring in consultants to help me build the systems. It was just a matter of signing off on the work orders and people would start showing up next week. That’s not how I wanted to operate; here’s how you manage integrators:

  1. If they wanted to be part of the project, they needed to send me their three best candidates
  2. I would interview each of the candidates, and if they passed the screening, they could start as individual contributors on one of my projects
  3. I expected that they would stay at least one year
  4. If they performed well, in a couple months we would interview more of their candidates
  5. If they continued to perform well, and their resources were not removed from our account, we would continue to ramp up individuals
  6. At this rate, you might have 20 people on the account in a year’s time

For some reason, this set of criteria did not sit well with the integrators or the candidates. If you understand the incentive system of the integrators, then constructing a hiring process to make your project unappealing is fairly easy. The lead salespeople are only interested in placing consultants onsite. They are incentivized by the number of people placed and number of hours billed. Since I was only looking for top tech talent, the individuals that show up for the interviews are some of the best technical minds at their companies. Their job is to start up tech teams, pile on a hundred people in 3-6 months, and move on to the next company.

Probably half the candidates that were sent to interview were good enough to join the team and act as one of the coders. However, during the interview process I made it clear that I expected them to stay one year, and there would be no account building during that time. They were expected to code 100% of their billable hours. If they spent any time wandering the company looking for new projects, I would kick them out.

Given all those criteria, not a single consultant that came through was placed on our teams. The technical folks thought the project would be a great experience, but they were all in career building mode in their firms. My project would set them back a year against their peers, as they would only account for their own billable hours.

The salespeople weren’t interested either; a ramp up of 20 people in a year is nothing for them and not worth their time.

After about two months of this charade, our IT team finally got angry. They decided to sit in on one of the interviews to determine why we either kept rejecting candidates, or the ones we liked declined to show up. I thought that was a great idea, and said they should get one of the integrators to send their best candidate to the interview.

So we scheduled an interview with T*T* later that week, and I was sent the resume of a lead engineer that actually looked pretty good. When the day arrived and the interview was about to start, I met one of our IT Senior Directors at the interview room. The candidate was escorted in by the T*T* representative and we started introductions. It turned out the candidate wasn’t the person from the resume.

Well, that was a surprise and the TWLER Senior Director was ticked. But I said, give me this new person’s resume and let’s do the interview. So we all sat down with the new person they had sent in as their “best candidate” and started into the questions. This all turned out rather badly because the new candidate had no web background and rather limited technology skills. The interview consisted of repeated failures to answer easy, junior-level software engineering questions, like “What are the four basic SQL commands?” It was an unmitigated disaster driven home by my unmerciful continuation of questions well beyond the point where everyone knew the interview was over.

Afterwards, the TWLER Senior Director was actually abashed and said he’d pull back on the interview process with the integrators. Again, I thought I had won, but I continued to underestimate the persistence of the TWLER IT management team to instill amateur engineering hour on the TWLER.com project.

Goto Part XX

A Digital Ecommerce Transformation – Holiday – This Year is Always the Most Important One Ever – Part XII

Part XII of a multipart story, to start at the beginning goto Part 1.

Since we’re in the middle of Holiday 2017, I thought a digression on Holiday was in order.

If you have never worked in retail, then you’ve missed out on the grand experience we call “Holiday”. On the other hand, you’ve probably actually enjoyed the time of year from mid-November to Christmas while you celebrate with your friends and family, and take advantage of thousands of days of deals from the many retailers trying to get their share of wallet from you.

Holiday, with a capital H, is something that has to be experienced to be believed. In my first Holiday at TWLER in 2010, I was on a team that had just started writing code and had very little in production leading into Thanksgiving. The only offering we supported was the failure site: if the main TWLER.com went down, we would quickly spin up the browse-only site so consumers would be able to at least see what products we sold, and where our stores were located. In 2010, this was actually a pretty good thing since the ecommerce site was still less than 5% of revenue.

When you work in IT in a retailer, your entire year is judged on whether or not the systems you support survive the shopping onslaught of Holiday. In the online space, an ecommerce site might make 30% of its revenue in the five days from Thanksgiving to Cyber Monday. TWLER.com also experienced the third highest traffic of North American retailers during that time. This massive scale up to 20X normal daily traffic was largely accomplished without clouds in the 2000s. You had to take a really good guess as to how much infrastructure was needed, build it all out over the course of the year, and hope you weren’t overwhelmed by consumer behavior. You could easily receive 1M requests per second at the edge, and 100,000+ requests per second to your actual systems. If those requests were concentrated on the wrong systems, you could easily take down your site.

TWLER counts how long you’ve been at a company by the number of Holidays you’ve experienced. If someone asks how long you’ve worked there, you might say “four Holidays.” And every Holiday is the most important one yet, because those six weeks account for 50% or more of yearly revenue.

After a few Holidays, you realize the second the current year’s Holiday is over, you are immediately planning for the next one. There is no break. It’s like a giant tsunami that is slowly approaching, day by day. You can look over your shoulder and it’s always there, waiting to crash down on you and ruin your day. Once this year’s tsunami passes, you turn around and can see next year’s on the horizon.

In my six Holidays at TWLER, we experienced numerous outages, usually caused by either internal stupidity, or unexpected consumer behavior. In our first few years, we would purposely force our ecommerce site to use “enterprise” services because they were the “single source” for things like taxes, or inventory. This is a great notion, but only if the “enterprise” services were actually built to support the entire Enterprise. Since TWLER was store focused, this meant the “enterprise” services were often down at night for maintenance, or were not built to withstand massive surges in traffic. One million people refreshing a PDP to check for inventory on a big sale every few seconds quickly overwhelmed these services. So we often turned these services off and flew semi-blind, rather than have the site completely fail.
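The "turn it off and fly semi-blind" approach can be sketched in a few lines. This is only an illustration, not the code we ran: `inventory_status`, the `lookup` callable, the `enabled` switch and the fallback text are all hypothetical names made up for the example.

```python
FALLBACK = "check store for availability"  # hypothetical degraded message


def inventory_status(sku, lookup, enabled=True):
    """Return a display status for a PDP without letting a slow or
    offline "enterprise" service take the whole page down.

    lookup  -- hypothetical callable that asks the inventory service for a count
    enabled -- hypothetical kill switch an operator can flip during a surge
    """
    if not enabled:
        # Service switched off on purpose: fly semi-blind.
        return FALLBACK
    try:
        count = lookup(sku)
    except Exception:
        # Service down (maintenance window, overload): degrade, don't fail.
        return FALLBACK
    return "in stock" if count > 0 else "out of stock"
```

The page still renders in every case; the customer just sees less precise information while the backing service is off or down.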

In other instances we tried to use various promotion functions embedded in our ATG commerce server. These seemed like useful things to easily set up a promotion like buy one get one. But when millions of people come looking for the sale, the vendor-built commerce engines go down quickly by destroying their own database with the same exact calls, over and over again. They hadn’t heard of caching yet, I guess.
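For anyone who hasn't seen it, the kind of caching that was missing looks roughly like this. It is a minimal sketch, not ATG code: `TTLCache`, its `loader` argument and the time-to-live value are assumptions chosen for illustration.

```python
import time


class TTLCache:
    """Remember the result of an expensive lookup for a short time so that
    identical requests do not all reach the database."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # hypothetical function that hits the database
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the cache can be tested
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # still fresh: no database call
        value = self._loader(key)  # missing or expired: load once and store
        self._entries[key] = (now + self._ttl, value)
        return value
```

With even a few seconds of time-to-live, a million identical promotion lookups collapse into a handful of database calls. A production version would also need locking and a size bound, which this sketch leaves out.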

We would sometimes publish our starting times for various sales, saying a big sale is starting at 11AM and send out millions of customer emails. The marketing teams loved the starting times and the technology teams hated them. We warned that setting a hard start time is a sure route to failure. Yet we did it multiple times and incurred multiple failures as the traffic surge brought down the site. There are physical limits even in clouds: you can only spin things up so fast, and 10M rqs will bring down most sites. After a few of these episodes, we did convince the marketing teams that it wasn’t the way to go and learned how to have sales with gradual ramp-ups in requests rather than massive surges.

Around 2013, the Black Friday shopping was so intense in the evening across the nation that the credit card networks themselves slowed down. Instead of taking a few seconds to auth a credit card, it started taking one or two minutes. This was across all retailers. However, the change in time caused threads to hang up inside our ecommerce systems and all of a sudden we ran out of threads as they were all tied up waiting for payments to happen. For the next year, we changed our payment process to go asynchronous so that would never happen again.
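To show the general shape of "going asynchronous", here is a small sketch. It is not our actual payment system: `PaymentQueue`, `slow_authorize` and the status strings are hypothetical. The point is that the request thread hands the order to a queue and returns immediately, while a small fixed pool of workers waits on the card network.

```python
import queue
import threading


def slow_authorize(order_id):
    """Stand-in for a card-network call that might take a minute or two."""
    return "approved"


class PaymentQueue:
    def __init__(self, authorize, workers=2):
        self._authorize = authorize
        self._jobs = queue.Queue()
        self._lock = threading.Lock()
        self.status = {}  # order_id -> "pending" / "approved" / "failed"
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def submit(self, order_id):
        """Called from the request thread; never waits on the card network."""
        with self._lock:
            self.status[order_id] = "pending"
        self._jobs.put(order_id)
        return "pending"

    def _run(self):
        # Only these worker threads ever block on authorization.
        while True:
            order_id = self._jobs.get()
            try:
                result = self._authorize(order_id)
            except Exception:
                result = "failed"
            with self._lock:
                self.status[order_id] = result
            self._jobs.task_done()

    def wait(self):
        """Block until every submitted order has been processed."""
        self._jobs.join()
```

If the card networks slow down, the queue gets longer but the web threads stay free to serve pages. The customer sees a "pending" order and is told the outcome later.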

There are many more stories of failure, but from every failure we learned something and implemented fixes for the next year’s wave. This is why Holiday in retail is such fun, every year you get to test your mettle against the highest traffic the world can generate. You planned all year, you implemented new technologies and new solutions, but sometimes the consumer confounds you and does something totally unexpected.

The last story is one where the consumer behavior combined with new features took us down unexpectedly. In 2014 we implemented “Save for Later” lists where you could put your items on a list that you could access later and add them to your cart. As Thanksgiving rolled around and the Black Friday sale went out at around 2AM, our Add to Cart function started getting pounded at a rate far higher than we had tested it for. We were seeing 100K rqs in the first few minutes of the sale. It rapidly brought the Add to Cart function to its knees and we had to take an outage immediately to get systems back together and increase capacity.

This was completely unexpected consumer behavior, so what happened? It turned out that customers used the Save for Later lists to pre-shop the Black Friday sale and add all the things they wanted to buy into the lists. Then when 2AM rolled around, they opened their Save for Later lists and started clicking the Add to Cart buttons one after the other. A single customer might click 5-10 Add to Cart buttons in a few seconds. With hundreds of thousands of customers figuring out the same method independently, it led to a massive spike in Add to Cart requests. We effectively DDOSed our Add to Cart function with simultaneous collective human behavior.
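The arithmetic behind a spike like that is simple enough to sketch. The inputs below are illustrative assumptions, not measured figures: 200,000 customers, 5 clicks each, spread over 10 seconds. `burst_rate` is a made-up helper name.

```python
def burst_rate(customers, clicks_per_customer, window_seconds):
    """Average requests per second when many customers each fire several
    Add to Cart clicks inside the same short window."""
    return customers * clicks_per_customer / window_seconds


# Assumed inputs: 200,000 customers x 5 clicks over 10 seconds.
print(burst_rate(200_000, 5, 10))  # 100000.0 requests per second
```

Even the low end of "hundreds of thousands" of customers and "5-10" clicks gets you to six figures of requests per second, which is why a load test modeled on one customer adding one item missed it.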

I feel like I could keep going on Holiday for another two pages, but that’s enough for this year. Maybe we’ll do it again in the all-important next year.

Goto Part XIII

A Digital Ecommerce Transformation – Making the New Mission: A Whole New Architecture – Part VIII

Part VIII – To start at the beginning goto Part I.

I’ll admit it, the deck I made was terrible. I’m not a master of PowerPoint, and the color scheme left a lot to be desired. I had crude animations showing how we would shift our monolithic application into the cloud, while retaining the customer data and checkout processes in the datacenter.

For about three weeks I worked mainly with another architect to take the many ideas we had discussed over the last year, and what we’d learned about operating in a cloud, and turn that into an architecture vision and implementation plan. We settled on three years to transform the ATG system to a distributed service oriented layered cloud architecture. The deck outlined the current issues with the ATG system, the future state architecture and how we would get there, and the cost of the first year of development.

My colleague urged me to begin presenting the deck to interested parties to get feedback and learn what resonated with the various digital teams. He was instrumental in networking across the organization and arranging meetings with Directors, Senior Directors and VPs in Digital and Business teams.

The first presentations did not go well. The business leaders didn’t get much from a highly technical deck with $13M of capital tied to it in the first year. Mostly the feedback was, “We’ve heard this pitch multiple times over the last ten years, why should we believe you?” They had a point: numerous consulting firms had been through with grand plans to rewrite TWLER.com. It had already been attempted twice, the last attempt a failed implementation of the Microsoft Commerce system that was relegated to powering the Canadian site and failing miserably even at that effort.

We regrouped and tried to determine what would make this a better presentation. We knew many of the core problems with the site and that the business teams had been unable to make changes in the homepage or product detail pages (PDPs) for years. There were a few decks kicking around that defined the UX driven future of TWLER.com that would never be implemented due to technology failure. We decided to modify the deck and highlight that in the first year we would transform the homepage and PDPs into a new architecture that would allow fast changes and high scale utilizing the CDN for more caching and isolating all calls to the cloud layer. In that way we would severely limit the number of calls making it back to the ATG commerce system running in the datacenter allowing it to scale by relegating it to the Cart and Checkout functions.

There wasn’t anything we could find that outlined a similar architecture so, as far as we knew, we were embarking on a bold new way to use clouds at scale.

GOTO Part IX

A Digital Ecommerce Transformation – The Business Viewpoint in 2010 – Part V

Part V – Start at the beginning with Part I

When I arrived at TWLER (The World’s Largest Electronics Retailer), it was clear that the digital business teams were sad, sad, sad, sad, sad, sad, sad, sad and very sad. After a couple weeks of reading through the ATG codebase I was also sad. Sad enough that I seriously considered searching for a new job because the code was such a mess. A massive mess. We were supposed to fix this?

The business teams were in charge of managing the site, adding new items, changing pricing, removing items from the site, fixing orders, making content, creating sale landing pages, emails, etc. Everything that keeps a large ecommerce site moving, usually referred to as site operations or business ops. The business teams had created a small shadow IT organization to try and maintain stability and make changes in the only way they could with IT controlling the ATG codebase. The slogans for the shadow IT teams were things like “Do more with less!” and “Any way to get it done!” The only recourse they had was to make changes to the UI via JavaScript and use a bypass of the deployment systems to post new JavaScript files directly onto the production servers. Since this was the largest known ATG cluster at more than 400 servers, this procedure was fraught with danger. Appalling, yes, but if that’s the only way to get something done then it falls within creative license.

The process to start a new project went something like this:

  1. Write an RFP for a new thing such as adding a marketplace to the browse and commerce portions of the site.
  2. Seek bids from the three IT integrators that were approved by IT.
  3. Receive bids back with one IT integrator, we’ll call them A, as the project manager and the other two IT integrators vying for delivery.
  4. Bids start at $1M and only go up. For something like a marketplace, $27M was closer to the mark.
  5. Sign the contract to start the work.
  6. Within a week, 20 onshore coordinators and 100 offshore developers magically appear and start wreaking havoc on the shared codebase.
  7. 9-19 months later, severely over budget, something resembling a marketplace appears and an attempt is made to merge it with the existing headstream, using a branch that started 9-19 months ago.
  8. Chaos ensues as every other project delivered between that time is broken and the IT integrator’s teams start fighting amongst themselves.
  9. After another two months, victory is declared, something buggy and barely working is deployed, the contract is finished and the 120 people disappear within a week.
  10. Bug fixes are now the responsibility of the shadow IT team mentioned above. Re-engaging the IT integrators to fix all the problems they created needs a new RFP.
  11. Repeat this process until spirit is broken.

Surprisingly (sarcasm) the business teams were not very receptive to a new IT-like team coming in and telling them they were going to fix everything with Agile, DevOps, Cloud and really small engineering teams. As I was told numerous times when trying to engage the business to act as the SME for the Agile teams, “heard it before, new process, SOA architecture, will be able to work magic two years from now.” “Not buying it this time!”

There’s really only one solution to this problem (besides hiring a whole new business team) and that is to start delivering on your promises. That’s what we set out to do, but the environment made it unduly difficult for us.

GOTO Part VI

Tech Cities 2016 – Quick Fun at the Carlson School

Tech Cities 2016 turned out to be a quick half-day conference, high on fun and low on pretense. I mistakenly thought we had an hour to play the Agile Architecture game, which was already a short time for explaining the rules and playing once through, but it turned out we only had 45 minutes.
Generally we like to have 90 minutes, and 2 hours works best. Kevin Matheny and I cranked through our short presentation on Agile Architecture and the rules of the game and jumped right into playing it with ~40 of our new friends.

We always played the game with software people in the past, so there were a few more questions about how to play than we’ve seen in previous games. But with a lot of individualized attention from ourselves and our three additional proctors, everyone was able to get through the game without being fired! Some even earned some gold pirate coins for completing their objectives!

If you have an interest in learning what life is like for an Agile Architect, let me know. Kevin is putting together these workshops going forward and I’ll be helping him out when I have the time.

Tech Cities 2016 – The Agile Architecture Game

Coming up in February 2016 I’ll be facilitating the Agile Architecture game with my former colleague and game inventor Kevin Matheny.  We’ve used the Agile Architecture game within BestBuy.com to help project managers, business analysts, product managers, engineers, and others learn about the tradeoffs involved with long term software architecture choices.  It’s a fast way to learn about the hard choices that architects make every day.

One of the comments from a person at Best Buy that played the game was “It felt like work.”   This person was an architect so we felt like we got the game right.

Tech Cities 2016 is a conference sponsored by the Carlson School of Business to foster the vision of Minneapolis being the tech center of the North.

It should be a fun conference!

Open Source North, A Great Start for a Tech Conference in MN

The OpenSourceNorth conference put on this weekend by Solution Design Group was a rocking success.  Local design, local speakers and just a little local beer made for a good time.

Best Buy had a recruiting booth at the conference and hopefully you had a chance to stop by and chat either with a recruiter or an engineer.  If not, check out the BestBuy.com jobs located here.

I had a chance to give a new presentation on how and why engineers should be engineering managers, at least a few of them.  If we ever want more great places for engineers to live and work, someone needs to create them.  Only individuals that understand how software engineering really works can create those places.  Here’s a picture from the conference.

If you couldn’t make it this year, put it on your calendar for next year.

What’s Wrong with the Twin Cities’ Developer Career Path

I’ve hired over 100 developers and architects throughout the years.  I’ve worked at around 20 companies, some very briefly, as both an employee and consultant.  I like fast development teams that ride the fine line between excellence and chaos.  But I’ve recently come to understand that a lot of my attitudes around being a developer in the Twin Cities are vastly different than those of our West Coast brethren.  This post is full of vast generalizations so your particular situation may not apply; bear with me.

Twin Cities Career Path

As a software developer in the Twin Cities, the general path is to get your foot in the door anywhere that will hire you.  Small companies, large companies, whoever will pay you to learn enterprise scale, team oriented development.  Honestly, your first 5 years as a developer probably cost the company you work for more money to have you there than you’re worth in business value.  Even so, you’ll add the most value at a small Agile shop where you get to do everything from development to deployment to maintaining the Cloud infrastructure.  Your learning experience at these places will set you up for your next 10 years of work.

Career Choice

Anywhere after your 5 year mark you get the choice to become a career developer.  Around here is where all the people that can’t really cut it become PMs or BAs or some other manager.  This is a fine choice as the life of development can be tough.  But the choice for developers is to continue to work as a cog in the corporate machine or to become an independent or sponsored (employee at a body shop) consultant.

Employee Route

When you go the employee route you get perceived stability in exchange for little control of your own destiny and work.  Due to the nature of financial funding of development projects, employees are often relegated to maintaining existing systems or as the token employee or two on the new development projects staffed by contractors.  You’re only on these projects because they want someone to maintain the system after all the consultants leave.

Why is this the case?

Budgets at corporations split development into two buckets, capital and expense.  Projects that can be capitalized allow the company to depreciate the cost of the development which saves them money on taxes.  Projects that are expense go right to the bottom line of the budget and are a drain on that division’s profitability. From a bean counter’s perspective, adding flexible staff to write new development projects makes the most sense as the cost of consultants is higher than the cost of full time staff.  That higher cost then gets depreciated and the ongoing maintenance expense goes to cheaper full time or offshore resources.

Why does this limit the career path in Minneapolis?

What this means to you as a developer is that, since your value is lower to the company as a pure expense, the amount companies are willing to pay you is limited.  Thus the salary of a full time developer in the Twin Cities is capped not far north of $100,000.

Consultant Route

As a consultant, you can make significantly more with a similar amount of experience.  The rates vary greatly depending on your skill set and experience level but rates to the consultant between $80 and $120 per hour are fairly common.  Also, when a company pays this much for development resources, you are more likely to land a new green field project as the project is capitalized (and can be depreciated).

Career Results

Given the general climate of Twin Cities development, the more aggressive and generally higher quality developers go into consulting once they realize the limitations of working in the Twin Cities.  But while the pay is better, the career path is over at this point, and you risk being a developer hopping from contract to contract for your whole career or until your skillset is no longer valued.

West Coast Difference

Having not lived there, I’m now going on conversations with the west coast engineers that I know.  In general, they’ve described the opposite of the situation in the Twin Cities: the vast majority of developers stay full time employees and the number of contractors is limited.  Also, the best development talent is always scooped up by the numerous tech companies in the area at high salary plus stock options.  Plus, the culture of the startup breaking out is always there and quite tangible.  The latest story is Instagram.  The Instagram writeup in the New York Times drives home the point of what is wrong with the Twin Cities.  This excerpt sums it up:

The extraordinary success of Instagram is a tale about the culture of the Bay Area tech scene, driven by a tightly woven web of entrepreneurs and investors who nurture one another’s projects with money, advice and introductions to the right people. By and large, it is a network of young men, many who attended Stanford and had the attention of the world’s biggest venture capitalists before they even left campus.

What is wrong with the culture of the Twin Cities tech scene?  These structures just plain do not exist here and that is the main culprit behind the exodus of top talent in the Twin Cities to mercenarial endeavors.  With all those smart minds going to short term financial reward, there will be no culture of risk taking, the financial availability won’t appear and the notion that ideas for the next billion dollar startup are ubiquitous will never materialize.

Starting this in the Twin Cities will be difficult.  It involves getting the venture capitalists and angels around here to start taking chances on smart college kids and catching them before they hit the Twin Cities career path.  Once they are in, our culture will take over and they’ll be consultants before you know it.  And the next generation of risk takers will be lost.